In this article, we will learn about Kafka consumers, consumer groups, and the offsets that consumers track while reading data. We will also see how to start a consumer from the Kafka console.

 

Kafka consumers

Consumers read messages from topics. A consumer only has to provide the topic name and one broker to connect to; the Kafka client then discovers the rest of the cluster and fetches the data from the brokers that hold the topic's partitions. Data is read in parallel across the partitions of a topic, but within a single partition it is read sequentially, in offset order. This matters for performance, because more partitions allow more parallel processing.
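To make the "parallel across partitions, sequential within a partition" idea concrete, here is a minimal sketch in plain Python. It does not talk to Kafka at all: the `topic` dictionary and the `read_partition` function are hypothetical stand-ins, used only to illustrate the ordering guarantee.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a topic: partition id -> ordered messages.
topic = {
    0: ["a0", "a1", "a2"],
    1: ["b0", "b1"],
    2: ["c0", "c1", "c2", "c3"],
}


def read_partition(partition_id, messages):
    """Read one partition strictly in offset order."""
    return [(partition_id, offset, value) for offset, value in enumerate(messages)]


# One worker per partition: partitions are read in parallel,
# while each individual partition is still read sequentially.
with ThreadPoolExecutor(max_workers=len(topic)) as pool:
    futures = {pid: pool.submit(read_partition, pid, msgs) for pid, msgs in topic.items()}
    records = {pid: future.result() for pid, future in futures.items()}

for pid in sorted(records):
    print(pid, [value for _, _, value in records[pid]])
```

There is no ordering guarantee between partitions in this sketch, only within each one, which mirrors the behaviour described above.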

 

Consumer groups

Kafka consumers organize themselves into consumer groups. Each consumer within a group reads messages from one or more partitions, and no partition is read by two consumers of the same group at the same time. That means having more consumers in a group than the topic has partitions is not useful, as the extra consumers will sit idle.

[Image: Kafka consumers in two consumer groups reading a topic with four partitions]

In the image above, the topic has four partitions. Consumer group A has only two consumers, so each consumer reads from two partitions. Consumer group B has four consumers, so each consumer reads from one partition. Adding a fifth consumer to group B would not help, as there is no extra partition for it to read from.
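The scenario above can be sketched with a simple round-robin assignment. This is only an illustration: `assign_partitions` is a hypothetical helper, and Kafka's real assignment strategies differ in detail, but the outcome for this example (two partitions each, one partition each, one idle consumer) is the same idea.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin.

    Each partition gets exactly one owner within the group, so consumers
    beyond the number of partitions end up with an empty list (idle).
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Group A: two consumers share four partitions.
group_a = assign_partitions(partitions, ["a1", "a2"])
# Group B: four consumers, one partition each.
group_b = assign_partitions(partitions, ["b1", "b2", "b3", "b4"])
# Group B with a fifth consumer: the extra consumer has nothing to read.
group_b_plus = assign_partitions(partitions, ["b1", "b2", "b3", "b4", "b5"])

print(group_a)       # {'a1': [0, 2], 'a2': [1, 3]}
print(group_b)       # {'b1': [0], 'b2': [1], 'b3': [2], 'b4': [3]}
print(group_b_plus)  # same as group_b, plus 'b5': []
```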

 

Consumer offsets

Kafka keeps track of the offset up to which each consumer group has read in each partition. As a consumer processes data from a topic, it is expected to commit its read position to an internal topic named __consumer_offsets. If the consumer process suddenly dies, it can resume reading from where it left off by using the committed offset.

Because offsets are controlled by the consumer, it can consume records in any order it likes. For example, the consumer can reset its offset to the beginning of the partition and reread all messages, or it can skip old messages and start reading from the most recent ones.
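The commit-and-resume behaviour can be illustrated with a toy, in-memory consumer. Everything here is hypothetical: the `OffsetTrackingConsumer` class, its method names, and the `committed` dictionary (a stand-in for __consumer_offsets) are assumptions made for this sketch, not the Kafka client API. Starting at offset 0 when nothing has been committed is also just a simplification.

```python
class OffsetTrackingConsumer:
    """Toy consumer over a single in-memory partition (not the Kafka API)."""

    def __init__(self, log, committed_offsets, group):
        self.log = log                      # list of messages; index == offset
        self.committed = committed_offsets  # stand-in for __consumer_offsets
        self.group = group
        # Resume from the last committed offset; assume 0 if none exists.
        self.position = committed_offsets.get(group, 0)

    def poll(self, max_records=2):
        """Return the next batch and advance the read position."""
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self):
        """Persist the current read position for this group."""
        self.committed[self.group] = self.position

    def seek_to_beginning(self):
        self.position = 0

    def seek_to_end(self):
        self.position = len(self.log)


log = ["m0", "m1", "m2", "m3", "m4"]
committed = {}

first = OffsetTrackingConsumer(log, committed, "group-a")
first.poll()    # processes m0 and m1
first.commit()  # committed offset is now 2

# Simulate a crash: a new consumer in the same group picks up the committed offset.
restarted = OffsetTrackingConsumer(log, committed, "group-a")
resumed = restarted.poll()
print(resumed)  # ['m2', 'm3']

# The consumer can also move its own position.
restarted.seek_to_beginning()
replayed = restarted.poll()
restarted.seek_to_end()
nothing_new = restarted.poll()
```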

 

Starting Kafka consumer

Let's see how to start a Kafka consumer from the Kafka console.

By default, the console consumer starts at the end of the topic and reads only the messages that arrive after it has started. If you want to read messages from the beginning of the topic, you can pass the '--from-beginning' argument to the console command.

[Image: Kafka consumers]

We can place multiple consumers in a Kafka consumer group, and they will read messages from the topic's partitions in parallel. Let us start a group with two consumers.

We can run the same console consumer command, with the same group name, in multiple terminals at the same time. If we start two consumers this way, each of them processes part of the messages in parallel, as seen in the following image.

[Image: consumer groups]

It is also possible to read messages from a particular partition. For that, we can pass the '--partition' argument to the console command.

For example, passing '--partition 0' together with '--from-beginning' reads the data in partition 0 from the beginning.
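The console commands discussed in this section can be sketched with a small Python helper that builds the command line. The helper `console_consumer_command` is hypothetical, and so are the topic name `my-topic`, the group name `my-group`, and the broker address `localhost:9092`. The script name `kafka-console-consumer.sh` assumes the Apache Kafka distribution with its `bin` directory on the path; other distributions may name it differently.

```python
import shlex


def console_consumer_command(topic, bootstrap_server="localhost:9092",
                             group=None, partition=None, from_beginning=False):
    """Build a kafka-console-consumer command line as a string.

    Only assembles the arguments; it does not run anything.
    """
    args = [
        "kafka-console-consumer.sh",
        "--bootstrap-server", bootstrap_server,
        "--topic", topic,
    ]
    if group is not None:
        args += ["--group", group]               # join a consumer group
    if partition is not None:
        args += ["--partition", str(partition)]  # read a single partition
    if from_beginning:
        args.append("--from-beginning")          # start at the earliest offset
    return shlex.join(args)


# Read the whole topic from the beginning.
print(console_consumer_command("my-topic", from_beginning=True))

# Run this same command in two terminals to get two consumers in one group.
print(console_consumer_command("my-topic", group="my-group"))

# Read only partition 0, from the beginning.
print(console_consumer_command("my-topic", partition=0, from_beginning=True))
```

The printed strings can be pasted into a terminal once the topic name and broker address are replaced with real values.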

These are some of the basics of Kafka consumers. We will see how to implement Kafka producers and consumers using the Java and Python APIs in the next few articles.

 

