DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Enterprise AI Trend Report: Gain insights on ethical AI, MLOps, generative AI, large language models, and much more.

2024 Cloud survey: Share your insights on microservices, containers, K8s, CI/CD, and DevOps (+ enter a $750 raffle!) for our Trend Reports.

PostgreSQL: Learn about the open-source RDBMS' advanced capabilities, core components, common commands and functions, and general DBA tasks.

AI Automation Essentials. Check out the latest Refcard on all things AI automation, including model training, data security, and more.

Related

  • Automated Application Integration With Flask, Kakfa, and API Logic Server
  • Resilient Kafka Consumers With Reactor Kafka
  • Event Mesh: Point-to-Point EDA
  • Real-Time Edge Application With Apache Pulsar

Trending

  • Build Your Own Programming Language
  • Elevate Your Terminal Game: Hacks for a Productive Workspace
  • Enhancing Performance With Amazon Elasticache Redis: In-Depth Insights Into Cluster and Non-Cluster Modes
  • Automated Data Extraction Using ChatGPT AI: Benefits, Examples
  1. DZone
  2. Data Engineering
  3. Big Data
  4. Kafka Consumer Overview

Kafka Consumer Overview

A developer gives a technical overview of consumers in the Apache Kafka platform, and how they allow devs to work with concurrent data consumption applications.

By 
Sylvester Daniel user avatar
Sylvester Daniel
·
Jul. 02, 19 · Tutorial
Like (14)
Save
Tweet
Share
24.9K Views

Join the DZone community and get the full member experience.

Join For Free

This article is a continuation of Part 1 - Kafka Technical Overview, Part 2 - Kafka Producer Overview and Part 3 - Kafka Producer Delivery Semantics articles. Let's look into Kafka consumer group, consumer, and protocol used in detail.

Consumer Role

Like a Kafka Producer that optimizes writes to Kafka, a Consumer is used for optimal consumption of Kafka data. The primary role of a Kafka consumer is to take Kafka connection and consumer properties to read records from the appropriate Kafka broker. Complexities of concurrent application consumption, offset management, delivery semantics, and a lot more are taken care of by Consumer APIs.

Properties

Some of the consumer properties in the bootstrap servers are: fetch.min.bytes, max.partition.fetch.bytes, fetch.max.bytes, enable.auto.commit, and many more. We will discuss some of these properties later in the next part of the article series.

Role of Kafka Consumers
Role of Kafka consumer

Multi-App Consumption

Multiple applications can consume records from the same Kafka topic, as shown in the diagram below. Each application that consumes data from Kafka gets it’s own copy and can read at its own speed. In other words, offsets consumed by one application could be different from another application. Kafka keeps tracks of the offsets consumed by each application in an internal__consumer_offset topic.

Kafka multi app consumption

Consumer Group and Consumer

Each application consuming data from Kafka is treated as a consumer group. For example, if two applications are consuming the same topic from Kafka, then, internally, Kafka creates two consumer groups. Each consumer group can have one or more consumers. If a topic has three partitions and an application consumes it, then a consumer group would be created and a consumer in the consumer group will consume all partitions of the topic. The diagram below depicts a consumer group with a single consumer.

Kafka multi partition single consumer

Kafka multi-partition single consumer

When an application wants to increase the speed of processing and process partitions in parallel then it can add more consumers to the consumer group. Kafka takes care of keeping track of offsets consumed per consumer in a consumer group, rebalancing consumers in the consumer group when a consumer is added or removed and lot more.

Kafka multi partition multi consumer

Kafka multi-partition multi-consumer

When there are multiple consumers in a consumer group, each consumer in the group is assigned one or more partitions. Each consumer in the group will process records in parallel from each leader partition of the brokers. A consumer can read from more than one partition.

Kafka consumer and multi partition consumption


It’s very important to understand that no single partition will be assigned to two consumers in the same consumer group; in other words, the same partition will not be processed by two consumers as shown in the diagram below.

Kafka same partition multiple consumer

Kafka same partition multiple-consumer

When consumers in a consumer group are more than partitions in a topic then over-allocated consumers in the consumer group will be unused.

Kafka unused consumer

Kafka unused consumer

When you have multiple topics and multiple applications consuming the data, consumer groups and consumers of Kafka will look similar to the diagram shown below.

Multiple application and multiple kafka topic

Multiple application and multiple Kafka topic

Coordinator and Leader Discovery

In order to manage the handshake between Kafka and the application that forms the consumer group and consumer, a coordinator on the Kafka side and a leader (one of the consumers in the consumer group) is elected. The first consumer that initiates the process is automatically elected as leader in the consumer group. As explained in the diagram below, for a consumer to join a consumer group, the following handshake processes take place:

  • Find coordinator
  • Join group
  • Sync group
  • Heartbeat
  • Leave group

Kafka consumer and coordinator protocol

Kafka consumer and coordinator protocol

Coordinator

In order to create or join a group, a consumer has to first find the coordinator on the Kafka side that manages the consumer group. The consumer makes a “find coordinator” request to one of the bootstrap servers. If a coordinator already doesn’t exist it’s identified based on a hashing formula and returned as a response to “find coordinator” request.

Join Group

Once the coordinator is identified, the consumer makes a “join group” request to the coordinator. The coordinator returns the consumer group leader and metadata details. If a leader already doesn’t exist then the first consumer of the group is elected as leader. Consuming application can also control the leader elected by the coordinator node.

Kafka consumer join group

Kafka consumer join group

Sync Group

After leader details are received for the join group request, the consumer makes a “Sync group” request to the coordinator. This request triggers the rebalancing process across consumers in the consumer group, as the partitions assigned to the consumers will change after the “sync group” request.

Kafka consumer sync group

Kafka consumer sync group


Rebalance

All consumers in the consumer group will receive updated partition assignments that they need to consume when a consumer is added/removed or “sync group” request is sent. Data consumption by all consumers in the consumer group will be halted until the rebalance process is complete.

Kafka consumer rebalance group

Kafka consumer rebalance group

Heartbeat

Each consumer in the consumer group periodically sends a heartbeat signal to its group coordinator. In the case of heartbeat timeout, the consumer is considered lost and rebalancing is initiated by the coordinator.

Kafka consumer heartbeat

Kafka consumer heartbeat

Leave Group

A consumer can choose to leave the group anytime by sending a “leave group” request. The coordinator will acknowledge the request and initiate a rebalance. In case the leader node leaves the group, a new leader is elected from the group and a rebalance is initiated.

Kafka consumer leave group

Kafka consumer leave group

Summary

As explained in Part 1of this series, “partitions” are units of parallelism. As consumers in a consumer group are limited by the partition in a topic, it’s very important to decide you partitions based on the SLA and scale your consumers accordingly. Consumer offsets are managed and stored by Kafka in an internal __consumer_offset topic. Each consumer in a consumer group follows the find coordinator, join group, sync group, heartbeat, and leave group protocols. In the next article in this series, we'll look into Kafka consumer properties and delivery semantics.

kafka application

Opinions expressed by DZone contributors are their own.

Related

  • Automated Application Integration With Flask, Kakfa, and API Logic Server
  • Resilient Kafka Consumers With Reactor Kafka
  • Event Mesh: Point-to-Point EDA
  • Real-Time Edge Application With Apache Pulsar

Partner Resources


Comments

ABOUT US

  • About DZone
  • Send feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: