Consumer Rebalancing in Kafka: An Overview

09 / Nov / 2023 by honey.arora

Introduction

Apache Kafka is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol. We have components generating events (Producers) and components that consume those events (Consumers). Consumers label themselves with a consumer group name so that each record published on a topic will be delivered to one and only one consumer in the consumer group. We can use this feature to implement load balancing between consumer nodes.

Consumer rebalancing in Kafka is the process by which partitions are reassigned among the consumers in a group so that the load is spread as evenly as possible across them.

This keeps throughput steady and prevents any single consumer from being overloaded or underutilized.

What triggers consumer rebalancing?

Consumer rebalancing in Kafka can happen under the following conditions:

1. A consumer leaves the group

2. A consumer joins a group

3. Partitions are added to a topic

When a consumer experiences a temporary network failure, or goes too long without sending heartbeats or polling for records, Kafka may consider it failed and remove it from the group. Kafka then initiates a rebalance to redistribute its partitions among the remaining active consumers in the group.
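The failure detection described above can be sketched as a small model. This is only an illustration, not Kafka's actual implementation: the ConsumerState class and the leaves_group function are hypothetical names, and the two thresholds merely stand in for the consumer settings session.timeout.ms and max.poll.interval.ms, with values chosen for the example.

```python
from dataclasses import dataclass

# Illustrative values only; check the defaults for your client version.
SESSION_TIMEOUT_MS = 45_000      # stands in for session.timeout.ms
MAX_POLL_INTERVAL_MS = 300_000   # stands in for max.poll.interval.ms


@dataclass
class ConsumerState:
    """Hypothetical snapshot of one consumer's activity (not a Kafka type)."""
    last_heartbeat_ms: int  # when the last heartbeat was seen
    last_poll_ms: int       # when the application last polled for records


def leaves_group(state: ConsumerState, now_ms: int) -> bool:
    """Return True if this simplified model would drop the consumer."""
    missed_heartbeats = now_ms - state.last_heartbeat_ms > SESSION_TIMEOUT_MS
    stalled_processing = now_ms - state.last_poll_ms > MAX_POLL_INTERVAL_MS
    return missed_heartbeats or stalled_processing


healthy = ConsumerState(last_heartbeat_ms=98_000, last_poll_ms=95_000)
partitioned = ConsumerState(last_heartbeat_ms=40_000, last_poll_ms=95_000)

print(leaves_group(healthy, now_ms=100_000))      # False
print(leaves_group(partitioned, now_ms=100_000))  # True
```

In this model, either condition on its own is enough for the consumer to drop out of the group, and its departure is what triggers the rebalance.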

Let’s take a topic T1 with three partitions and a consumer C1, the only member of group G1, that subscribes to T1. Consumer C1 will get all messages from all three partitions.

If we add two consumers, C2 and C3, to group G1, each consumer will only get messages from a single partition.

If the group has more consumers than the topic has partitions, some of the consumers will be idle and get no messages at all.
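The three scenarios above can be reproduced with a toy assignment function. This is a minimal sketch that deals partitions out one at a time; it is not one of Kafka's real assignors, and the function name assign_round_robin and the partition labels are made up for illustration.

```python
def assign_round_robin(partitions, consumers):
    """Deal partitions out to consumers one at a time (simplified model)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = ["T1-0", "T1-1", "T1-2"]

# One consumer: C1 owns every partition.
print(assign_round_robin(partitions, ["C1"]))
# {'C1': ['T1-0', 'T1-1', 'T1-2']}

# Three consumers: one partition each.
print(assign_round_robin(partitions, ["C1", "C2", "C3"]))
# {'C1': ['T1-0'], 'C2': ['T1-1'], 'C3': ['T1-2']}

# Four consumers: C4 has nothing to read and stays idle.
print(assign_round_robin(partitions, ["C1", "C2", "C3", "C4"]))
# {'C1': ['T1-0'], 'C2': ['T1-1'], 'C3': ['T1-2'], 'C4': []}
```

The last case shows why adding consumers beyond the partition count does not increase throughput: a partition is owned by only one consumer in the group at a time.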

How does Rebalancing work?

There are two types of rebalancing, depending on the partition assignment strategy that the consumer group uses.

Eager Rebalancing: Kafka performs eager rebalancing by default. All consumers stop consuming and give up ownership of their partitions for a short period. This is also called a “stop the world” event.

Afterward, the consumers rejoin the group and the partitions are reassigned. There is no guarantee that a consumer will get the same partitions as before.

Cooperative Rebalancing: Also referred to as incremental rebalancing, this approach reassigns only a small subset of the partitions from one consumer to another and allows consumers to continue processing records from all the partitions that are not reassigned.
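The difference between the two approaches comes down to which partitions each consumer has to give up. The sketch below compares them on a hypothetical group in which C3 joins while C1 and C2 own four partitions. The function names and the before/after assignments are invented for illustration, and the model ignores the actual group protocol messages.

```python
def eager_revocations(old):
    """Eager: every consumer gives up all of its partitions first."""
    return {consumer: sorted(parts) for consumer, parts in old.items()}


def cooperative_revocations(old, new):
    """Cooperative: a consumer only gives up partitions that move elsewhere."""
    return {
        consumer: sorted(set(parts) - set(new.get(consumer, [])))
        for consumer, parts in old.items()
    }


# Hypothetical assignments before and after C3 joins the group.
before = {"C1": ["P0", "P1"], "C2": ["P2", "P3"]}
after = {"C1": ["P0", "P1"], "C2": ["P2"], "C3": ["P3"]}

print(eager_revocations(before))
# {'C1': ['P0', 'P1'], 'C2': ['P2', 'P3']}
print(cooperative_revocations(before, after))
# {'C1': [], 'C2': ['P3']}
```

In the cooperative case only P3 changes hands, so C1 is never interrupted and C2 keeps reading P2 throughout.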

Conclusion

Compared to eager rebalancing, cooperative rebalancing has clear advantages. Consumers whose partitions are not affected by the rebalance can keep processing data without interruption. Fortunately, this algorithm is the default strategy for new Kafka Streams applications.
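For a plain consumer, cooperative rebalancing is typically opted into through the partition.assignment.strategy setting. The fragment below is a sketch that assumes a librdkafka-based Python client such as confluent-kafka, where the value is cooperative-sticky; the Java client instead takes the class name org.apache.kafka.clients.consumer.CooperativeStickyAssignor. The broker address is a placeholder, and the group name G1 is taken from the earlier example.

```python
# Consumer settings as a plain dictionary; nothing here connects to Kafka.
consumer_config = {
    "bootstrap.servers": "localhost:9092",  # placeholder address
    "group.id": "G1",                       # group name from the example above
    "partition.assignment.strategy": "cooperative-sticky",
}

print(consumer_config["partition.assignment.strategy"])  # cooperative-sticky
```

All members of a group need a compatible assignment strategy, so switching an existing group over is usually done as a rolling change rather than one consumer at a time in isolation.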
