Performance in Kafka: Throughput, Latency, and Scaling

Kafka is designed for high performance and scalability. Understanding the factors that impact throughput, latency, and scaling is crucial for optimizing Kafka’s performance. In this article, we’ll explore the key properties and fields related to these aspects.


Throughput

Throughput refers to the number of messages that can be processed per unit of time. Several configuration properties affect it:

  1. batch.size: The maximum size, in bytes, of a batch of messages that the producer will send to a single partition. Increasing the batch size can improve throughput by reducing the number of requests sent to the broker, at the cost of more producer memory.

  2. linger.ms: The time the producer waits for additional messages before sending a batch. Increasing it allows the producer to accumulate more messages into a single batch, improving throughput at the cost of added latency.

  3. compression.type: The compression algorithm used for compressing messages. Compressing messages reduces network bandwidth usage and improves throughput. Supported compression types include gzip, snappy, lz4, and zstd.

  4. num.io.threads (broker): The number of I/O threads the broker uses for processing requests. Increasing this value can improve throughput by allowing the broker to handle more concurrent requests.
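As a rough illustration of the producer-side batching settings above, the sketch below collects them in a plain Python dictionary and estimates how the batch size changes the number of batches sent to one partition. The values, the broker address, and the helper name batches_needed are assumptions for illustration, not recommendations. The estimate ignores per-record overhead, compression, and batches that are sent early because the linger time expired.

```python
import math

# Throughput-oriented producer settings (illustrative values only).
# The keys are Kafka producer property names; a client library would
# typically accept a mapping like this when creating a producer.
throughput_producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "batch.size": 65536,        # upper bound on batch size, in bytes
    "linger.ms": 20,            # wait up to 20 ms for more messages
    "compression.type": "lz4",  # compress each batch before sending
}


def batches_needed(num_messages: int, message_bytes: int, batch_size_bytes: int) -> int:
    """Rough number of batches needed to send num_messages to one partition."""
    # At least one message always fits in a batch, even if it is oversized.
    messages_per_batch = max(1, batch_size_bytes // message_bytes)
    return math.ceil(num_messages / messages_per_batch)


if __name__ == "__main__":
    for size in (16384, 65536):
        print(size, batches_needed(10_000, 1024, size))
```

Under these assumptions, quadrupling the batch size cuts the number of batches to roughly a quarter, which is where the reduction in broker requests comes from.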


Latency

Latency refers to the time it takes for a message to travel from the producer, through the broker, to the consumer. Several configuration properties affect it:

  1. acks: The number of acknowledgments the producer requires from the broker before considering a message as sent. With acks set to 0 the producer does not wait for any acknowledgment, and with acks set to 1 it waits only for the partition leader; both reduce latency but risk message loss. Setting acks to all waits for all in-sync replicas, which gives the highest durability but increases latency.

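  2. fetch.min.bytes (consumer): The minimum amount of data the broker should return for a fetch request. A higher value reduces the number of fetch requests and improves throughput, but it increases latency because the broker holds the request until enough data has accumulated or the fetch wait timeout expires. Keep it low when latency matters.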

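  3. replica.lag.time.max.ms (broker): The maximum time a follower replica can lag behind the leader before it is removed from the set of in-sync replicas. A lower value drops slow followers sooner, so producers using acks set to all and consumers spend less time waiting on a lagging replica, but replicas may fall out of sync more often.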
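The trade-off behind fetch.min.bytes can be sketched with a simplified model: the broker answers a fetch once enough bytes have arrived or once the consumer's maximum wait has passed, whichever comes first. The function name fetch_wait_ms is hypothetical, and fetch.max.wait.ms is a consumer property that the list above does not mention; the model assumes a steady incoming byte rate and an empty partition at the time of the request.

```python
def fetch_wait_ms(fetch_min_bytes: int, fetch_max_wait_ms: int,
                  incoming_bytes_per_sec: float) -> float:
    """Approximate time a fetch request waits on the broker before it returns.

    Simplified model: data arrives at a constant rate, and the request is
    answered when fetch_min_bytes are available or the wait limit is hit.
    """
    if incoming_bytes_per_sec <= 0:
        # No data arriving: the request can only return on timeout.
        return float(fetch_max_wait_ms)
    time_to_fill_ms = fetch_min_bytes * 1000 / incoming_bytes_per_sec
    return min(float(fetch_max_wait_ms), time_to_fill_ms)


if __name__ == "__main__":
    # Same traffic (1 MB/s), different fetch.min.bytes values.
    for min_bytes in (1, 50_000, 1_000_000):
        print(min_bytes, fetch_wait_ms(min_bytes, 500, 1_000_000))
```

In this model a larger fetch.min.bytes means fewer, fuller responses but a longer wait per response, bounded by the maximum wait.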


Scaling

Kafka is designed to scale horizontally by adding more brokers to the cluster. Scaling considerations include:

  1. num.partitions (broker): The default number of partitions for automatically created topics; the partition count can also be set per topic when the topic is created. More partitions allow higher parallelism and throughput. Choose the number based on the expected throughput and number of consumers, keeping in mind that partitions can be added to a topic later but not removed.

  2. replication.factor: The number of replicas for each partition. Increasing the replication factor improves fault tolerance and availability but requires more storage and network bandwidth.

  3. num.network.threads (broker): The number of network threads the broker uses for handling network requests. Increasing this value can improve the broker’s ability to handle a higher number of concurrent connections.

  4. Consumer Scaling: Kafka allows multiple consumers in the same consumer group to read from a topic in parallel. Each consumer is assigned a subset of the partitions, and each partition is read by only one consumer in the group, which enables horizontal scaling of consumption. Keep the number of consumers in a group at or below the number of partitions, because any additional consumers receive no partitions and sit idle.
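The partition and consumer sizing points above can be made concrete with a small sketch. The first function applies a commonly cited rule of thumb (not taken from this article): size the topic for whichever side, producing or consuming, needs more partitions to reach the target throughput. The second spreads partitions over the consumers of one group in round-robin order; it is a simplified model, not Kafka's actual assignor. All function names and numbers are assumptions for illustration.

```python
import math


def estimate_partitions(target_mb_s: float,
                        producer_mb_s_per_partition: float,
                        consumer_mb_s_per_partition: float) -> int:
    """Rule-of-thumb partition count for a target throughput."""
    needed_for_producers = math.ceil(target_mb_s / producer_mb_s_per_partition)
    needed_for_consumers = math.ceil(target_mb_s / consumer_mb_s_per_partition)
    return max(needed_for_producers, needed_for_consumers)


def assign_round_robin(num_partitions: int, consumers: list) -> dict:
    """Spread partition ids over the consumers of a single group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    print(estimate_partitions(100, 10, 20))
    # Six consumers on a four-partition topic: two consumers get nothing.
    print(assign_round_robin(4, ["c1", "c2", "c3", "c4", "c5", "c6"]))
```

The per-partition throughput figures have to be measured on your own hardware and workload; the sketch only shows how they combine.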

Best Practices

Here are some best practices for optimizing Kafka’s performance:

  1. Monitor key metrics: Monitor metrics such as producer/consumer throughput, latency, and consumer lag to identify performance bottlenecks and optimize accordingly.

  2. Tune configurations: Adjust the configuration properties based on your specific use case and performance requirements. Change one setting at a time and measure the effect under a realistic load before settling on a value.

  3. Partition and replication strategy: Choose an appropriate number of partitions and replication factor based on your throughput and fault tolerance needs. Consider factors such as the number of consumers, expected message rates, and data retention requirements.

  4. Scaling considerations: Plan for horizontal scaling by adding more brokers to the cluster as the throughput and data volume grow. Ensure that the number of partitions and the replication factor are aligned with your scaling strategy.

  5. Compression: Enable compression to reduce network bandwidth usage and improve throughput. Choose a compression algorithm that provides a good balance between compression ratio and CPU overhead.
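Of the metrics mentioned above, consumer lag is simple to compute by hand: for each partition it is the difference between the latest offset in the log and the offset the consumer group has committed. The sketch below assumes those offsets have already been collected into dictionaries keyed by partition id; the function name and the sample numbers are hypothetical, and treating a missing committed offset as 0 is an assumption of this example.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: latest log offset minus committed offset."""
    lag = {}
    for partition, end_offset in end_offsets.items():
        # Assumption: a partition with no committed offset starts from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(0, end_offset - committed)
    return lag


if __name__ == "__main__":
    end = {0: 1500, 1: 900, 2: 300}
    committed = {0: 1450, 1: 900}
    per_partition = consumer_lag(end, committed)
    print(per_partition, sum(per_partition.values()))
```

A lag that keeps growing on some partitions suggests the consumers cannot keep up with the producers, which is usually the signal to add consumers or partitions.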