Dev Genius

Coding, Tutorials, News, UX, UI and much more related to development

Diving Into Kafka Partitioning By Building a Custom Partition Assignor

Kyle Carter
Published in Dev Genius · 10 min read · Mar 7, 2022

Photo by Will Francis on Unsplash

Building a distributed system is not easy. A distributed system developer must keep many concerns in mind at all times, and the more concerns there are to track, the more likely it is that one slips through the cracks. Many of these concerns are not core to the problem being solved; they are often labeled “undifferentiated heavy lifting.” This is why developers look to tools and frameworks to offload some of those tasks so they can focus on the core, unique business problem they are trying to solve. For a developer using Kafka, one of those concerns is deciding which instances of an application are in charge of which partitions of a particular Kafka topic. Thankfully, Kafka consumers handle this transparently for the developer. Let’s pull back the curtain a little and see how that works, and then get our hands dirty building our own partitioning scheme.

Let’s start by reminding ourselves a bit about the internals of Kafka. The data in Kafka is divided into topics. A topic is a logical grouping of data. Topics are further subdivided into one or more partitions. Partitions are the unit of scalability for a Kafka topic. While one consumer can handle all partitions in a topic, no two cooperating consumers can operate on the same partition at the same time. This means that if I have a topic with 20 partitions, running more than 20 consumers against it will not gain me any throughput; the extra consumers simply sit idle. Finally, consumers indicate that they want to work together to process a topic by being part of the same consumer group. It is within the scope of a consumer group that the partitions of a topic are divided.
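To make the 20-partition example concrete, here is a minimal sketch of dividing a topic's partitions among the members of one consumer group. It is only an illustration of the idea, not Kafka's actual assignment code: the function name `divide_partitions` and the consumer names are made up for this example, and the contiguous-range split is just one simple way to do the division.

```python
from typing import Dict, List


def divide_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Split partitions 0..num_partitions-1 across the consumers of one group.

    Hypothetical helper for illustration: each consumer gets a contiguous
    range, and the first few consumers (in sorted order) take one extra
    partition when the split is uneven.
    """
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {member: [] for member in members}
    if not members:
        return assignment

    base, extra = divmod(num_partitions, len(members))
    start = 0
    for index, member in enumerate(members):
        count = base + (1 if index < extra else 0)
        # Every partition is handed to exactly one member of the group.
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


if __name__ == "__main__":
    # Three consumers sharing a 20-partition topic.
    for member, partitions in divide_partitions(20, ["a", "b", "c"]).items():
        print(member, partitions)

    # 25 consumers for the same topic: some receive no partitions at all.
    crowded = divide_partitions(20, [f"consumer-{i:02d}" for i in range(25)])
    idle = [member for member, partitions in crowded.items() if not partitions]
    print(len(idle), "consumers have nothing to read")
```

With more consumers than partitions, the leftover consumers end up with an empty list, which is why adding consumers beyond the partition count does not help throughput.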

To understand how partition balancing is accomplished, it helps to understand a bit about group management in Kafka: what the broker is responsible for and what the client is responsible for. When a consumer comes online, it notifies the broker of which topics it wants to subscribe to and which consumer group it is part of. Each consumer group is assigned one of the broker instances as its group coordinator. The group coordinator is in charge of tracking the partitions of the subscribed topics as well as the members of the group. Any changes to either of these items will require a…

Written by Kyle Carter

I'm a software architect that has a passion for software design and sharing with those around me.
