In this article, we will look at Kafka producers, how they publish data to topics, and some advanced concepts such as topic partitions and message ordering.

 

Kafka Producers

Any source system that publishes messages to Kafka is known as a Kafka producer. Producers publish data to the topics of their choice. A producer only has to specify the name of the topic and at least one broker to connect to; Kafka takes care of replicating those messages and, by default, of distributing them across the partitions of the topic.

It is the job of the producer to assign messages to partitions of the topic. Messages without a key can be spread in a round-robin way to balance the load across all partitions of the topic. If we want to control which messages go to which partition, we send a key with each message, and the producer applies a partition function (typically a hash of the key) to pick the partition. Messages with the same key then go to the same partition, as long as the number of partitions does not change. Because Kafka preserves order within a partition, this guarantees ordering among messages that share a key. Kafka does not guarantee ordering across different partitions of a topic.
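The two strategies described above can be sketched in a few lines of Python. This is only an illustration, not the Kafka client's implementation: the function names and the partition count are hypothetical, and zlib.crc32 stands in for the murmur2 hash that the Java client uses for keyed messages.

```python
import itertools
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count for the topic


def partition_for_key(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition using a deterministic hash.

    zlib.crc32 is a stand-in hash chosen for this sketch; it is not
    the hash function used by the real Kafka client.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def round_robin(num_partitions: int = NUM_PARTITIONS):
    """Return an iterator that cycles 0, 1, ..., n-1, 0, 1, ... for keyless messages."""
    return itertools.cycle(range(num_partitions))


if __name__ == "__main__":
    # The same key always lands on the same partition.
    for key in ["user-1", "user-2", "user-1"]:
        print(key, "->", partition_for_key(key))

    # Keyless messages are spread evenly over the partitions.
    rr = round_robin()
    print([next(rr) for _ in range(5)])
```

Note that the mapping depends on the number of partitions, which is why adding partitions to a topic can move a key to a different partition.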

When producers send data to a Kafka server, they can choose whether to wait for an acknowledgement of each write. Kafka offers three acknowledgement levels, selected through the producer's acks setting.

acks = 0: In this case, the producer does not wait for an acknowledgement and sends one message after another. This is the most efficient option in terms of performance, but messages can be lost without the producer noticing.

acks = all: In this case, the producer waits until the leader and all in-sync replica brokers have acknowledged that the write is complete. This is the least efficient option in terms of performance, but it gives the strongest protection against data loss.

acks = 1: In this case, the producer waits for an acknowledgement from the leader only. This performs better than acks = all, and there is a limited chance of data loss: a message can be lost if the leader fails before the followers have copied it. acks = 1 is a compromise between acks = 0 and acks = all.
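To make the difference between the three levels concrete, here is a toy model in Python. It does not talk to Kafka; the function name and its arguments are made up for this sketch, and it only captures when the producer would treat a write as successful.

```python
from typing import Sequence


def is_acknowledged(acks: str, leader_ok: bool, followers_ok: Sequence[bool]) -> bool:
    """Return True if the producer would treat the write as successful.

    leader_ok    -- whether the leader stored the message
    followers_ok -- one flag per in-sync follower, True if it stored the message
    """
    if acks == "0":
        # The producer does not wait, so it never learns about a failure.
        return True
    if acks == "1":
        # Only the leader's confirmation matters.
        return leader_ok
    if acks == "all":
        # The leader and every in-sync follower must confirm.
        return leader_ok and all(followers_ok)
    raise ValueError(f"unsupported acks value: {acks!r}")


if __name__ == "__main__":
    # The leader stored the message, but one follower did not.
    for level in ("0", "1", "all"):
        print(level, is_acknowledged(level, True, [True, False]))
```

The case where the leader itself fails shows the risk of acks = 0: the function still reports success, which corresponds to a silently lost message.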

 

Starting a producer

Let us start our first Kafka producer using the kafka-console-producer command. We need to provide the topic name and the address of at least one broker to start sending messages to the Kafka server. The following command starts a Kafka producer that publishes messages to the topic named first_topic.

[Screenshot: Kafka producers]

This will start publishing messages to first_topic. If the topic does not exist, automatic topic creation is enabled on the broker, and you have permission to create topics, this command will create a new topic with that name and start publishing messages to it.

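We can pass properties to this command using the --property argument. Producer configuration values such as retries are passed with the --producer-property argument.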

For example, setting retries=3 means that if sending a message fails, the producer will retry sending that message up to 3 times.
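The retry behaviour can be pictured with a short Python sketch. The real producer handles retries internally; send_with_retries and the send callback below are hypothetical names used only to illustrate the counting, under the assumption that retries=3 means one initial attempt plus at most three retries.

```python
from typing import Callable


def send_with_retries(send: Callable[[str], None], message: str, retries: int = 3) -> int:
    """Send a message, retrying up to `retries` times after a failed attempt.

    Returns the total number of attempts used. Re-raises the last
    ConnectionError once the retries are exhausted.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            send(message)
            return attempts
        except ConnectionError:
            # The first attempt is not a retry, so allow `retries` more tries.
            if attempts > retries:
                raise


if __name__ == "__main__":
    failures_left = [2]  # simulate a broker that fails twice, then recovers

    def flaky_send(message: str) -> None:
        if failures_left[0] > 0:
            failures_left[0] -= 1
            raise ConnectionError("simulated send failure")

    print("attempts used:", send_with_retries(flaky_send, "hello", retries=3))
```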

 

Sending keys with messages

If you send messages from the console, Kafka will attach a null key to them by default. These messages are spread across partitions without regard to their content. We can send our own keys by setting the parse.key and key.separator properties, so that each input line is split into a key and a value. In this case, messages with the same key will go to the same partition of the topic.

[Screenshot: Kafka producers with key]
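The effect of parse.key and key.separator on each input line can be sketched in Python. This is a simplified model rather than the console producer's code: parse_line is a hypothetical helper, and the tab default and the error on a missing separator are assumptions about the tool's behaviour.

```python
from typing import Optional, Tuple


def parse_line(line: str, parse_key: bool = False, key_separator: str = "\t") -> Tuple[Optional[str], str]:
    """Split one console input line into a (key, value) pair.

    With parse_key disabled, the whole line is the value and the key is None.
    With parse_key enabled, the line is split at the first separator.
    """
    if not parse_key:
        return None, line
    key, sep, value = line.partition(key_separator)
    if not sep:
        raise ValueError(f"no key separator {key_separator!r} found in line: {line!r}")
    return key, value


if __name__ == "__main__":
    print(parse_line("hello world"))
    print(parse_line("user-1:logged in", parse_key=True, key_separator=":"))
    print(parse_line("user-1:page:home", parse_key=True, key_separator=":"))
```

Only the first separator splits the line, so the value itself may contain the separator character.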

These are the basics of Kafka producers. In the next article on Kafka, we will learn about consumers.
