Kafka Consumer Group: A Brief Explanation

A Kafka Consumer Group is, at its core, a way of allowing a pool of consumers to divide the processing of data over multiple consumer instances, each of which is called a consumer. It might be easier to visualize this.

Consumer Group

As depicted, a consumer group consists of several consumers. Here are the key aspects:

  1. Each consumer within a group will read from a unique set of partitions of the topics they are subscribed to. This way, the records within these partitions can be consumed in parallel, thus providing scalability and fault tolerance.
  2. Kafka ensures that a partition is only consumed by a single consumer within the group to maintain the order of the records. If a consumer fails, Kafka will reassign its partitions to other consumers in the group.

But, what happens if the number of consumers changes, you might wonder? Let’s see.

If a new consumer joins a group, Kafka will trigger a rebalance. This implies that the assignment of partitions to consumers might change to accommodate the new consumer.

Let’s illustrate this with another image:

Consumer Group Rebalance

Similarly, if a consumer leaves a group (either because of failure or shutdown), another rebalance will take place to redistribute the partitions of the leaving consumer to the remaining ones.

Kafka’s Replication Factor: An Overview

Kafka’s replication factor refers to the number of replicas of a Kafka partition within a Kafka cluster. It’s a form of redundancy intended to provide resilience and fault-tolerance. To better illustrate this, let’s consider an example.

Replication Factor

In this example, each partition of the Kafka Topic has three replicas. Now, let’s delve into some key points:

  1. Among the replicas, one is designated as the “leader,” while the others are “followers.” All read and write operations for a partition go through the leader, and then the changes get propagated to the followers.
  2. The leader and follower roles are not static. In case the leader fails, one of the followers will be elected as the new leader, which ensures high availability of the Kafka Topic.

Now, what does this mean in terms of resilience and fault-tolerance?

Essentially, a replication factor of N allows the system to tolerate up to N-1 broker failures without losing any data. However, a higher replication factor comes at the cost of more storage and network traffic due to the increased number of replicas.

Let’s consider another illustration:

Replication Factor

This image illustrates how, despite a broker failure, the Kafka Topic remains available thanks to its replicas.