
Gain hands-on mastery of Apache Kafka through lessons on producers, consumers, cluster architectures, monitoring, schema registry, streams, connectors, and securing Kafka, plus integrations with Storm, Spark, and Flume.
Understand Kafka terminology: a message is a unit of data with a key-value pair, a batch groups messages, topics are divided into partitions, producers publish to topics, and consumers subscribe to them.
Explore Kafka components including topics, partitions, offsets, producers, and consumers, and learn how batching, compression, and round-robin distribution affect latency, throughput, and data retention across a distributed cluster.
Install Kafka on a Windows system, start ZooKeeper on the same machine, configure the server properties and log path, then run the Kafka server from the bin\windows folder and create topics.
Explore how the Kafka producer sends messages to the broker via fire-and-forget, synchronous, and asynchronous sends, with topic, key, and value serialization, and error handling.
Learn how Avro enables schema evolution and compatibility in Kafka by standardizing serialization between producers and consumers, using a schema registry to store schemas and enable binary, compressed records.
Learn how Kafka topics contain multiple partitions and how keys map to partitions, ensuring the same key goes to the same partition, with round-robin fallback when keys are absent.
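The key-to-partition rule can be sketched in a few lines. This is a minimal illustration, not Kafka's own partitioner: the class name `SketchPartitioner` is hypothetical, and CRC32 is an assumed stand-in for the hash function the real client uses.

```python
import zlib
from itertools import count


class SketchPartitioner:
    """Illustrative only: keyed messages hash to a fixed partition,
    keyless messages rotate round-robin across partitions."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = count()  # advances once per keyless message

    def partition(self, key=None):
        if key is None:
            # No key: fall back to round-robin distribution.
            return next(self._counter) % self.num_partitions
        # Same key -> same hash -> same partition (CRC32 is an assumption).
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=3)
same_key = [partitioner.partition("order-42") for _ in range(3)]
no_key = [partitioner.partition() for _ in range(4)]
```

Here `same_key` holds one repeated partition number, while `no_key` cycles through the partitions in order. The mapping stays stable only while the partition count is unchanged.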
Explore Kafka consumer basics by examining consumer groups and partition rebalance, understand how they work, and configure consumers with rebalance listeners to consume records at specific offsets.
Explore partition rebalance dynamics and how to create a Kafka consumer with bootstrap servers, key/value deserializers, and a group id, then subscribe to topics.
Configure Kafka consumers by tuning fetch.min.bytes, fetch.max.wait.ms, and max.partition.fetch.bytes, while managing session.timeout.ms to balance failure detection and rebalance.
Learn how to commit specified offsets mid-batch using commitSync or commitAsync with a map of partitions and offsets to handle rebalances.
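The map of partitions and offsets can be sketched without a Kafka client. This is a minimal illustration under assumptions: the `Record` tuple and `offsets_to_commit` helper are hypothetical names, and the map is a plain dictionary rather than the client's own types. The committed value is the offset of the next record to read, which is the last processed offset plus one.

```python
from collections import namedtuple

# Hypothetical stand-in for a consumed record.
Record = namedtuple("Record", ["topic", "partition", "offset", "value"])


def offsets_to_commit(processed_records):
    """Build {(topic, partition): next_offset} from records already processed."""
    offsets = {}
    for r in processed_records:
        tp = (r.topic, r.partition)
        # Commit the position of the next record, not the last one processed.
        offsets[tp] = max(offsets.get(tp, 0), r.offset + 1)
    return offsets


batch = [
    Record("orders", 0, 10, "a"),
    Record("orders", 0, 11, "b"),
    Record("orders", 1, 7, "c"),
]
mid_batch_map = offsets_to_commit(batch[:2])  # commit after two records
full_batch_map = offsets_to_commit(batch)     # commit after the whole batch
```

Committing `mid_batch_map` partway through a batch limits how many records are reprocessed if a rebalance happens before the batch finishes.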
Explore the overview of consumer groups and partitions, understand rebalance and its functioning, and learn configuring consumers, commit and offsets, rebalance listeners, and consuming records with specific offsets.
Kafka replication ensures durability and high availability by maintaining multiple topic-partition replicas across brokers, with leaders handling requests and followers staying in sync.
Explore how proper producer configuration, acks, retries, and error handling ensure reliability in Kafka, including handling leader crashes, replicas, and strategies to detect and avoid duplicates.
Understand cross-cluster mirroring in Kafka with MirrorMaker to copy data between regional and central clusters, covering use cases, data redundancy, cloud migration, and best practices.
Explore active-standby and stretch cluster architectures in Apache Kafka to achieve data center redundancy, synchronous replication, and efficient resource use across three data centers.
Explore Apache Kafka consumer group operations, including list group, describe group, delete group, and offset management, using ZooKeeper for old clients or bootstrap server for new clients.
Master dynamic configuration changes by overriding cluster and topic defaults at runtime with add-config parameters, and manage cleanup, retention, and flush-to-disk settings.
Explore Kafka monitoring and Schema Registry concepts, mastering the Schema Registry architecture, components, metrics, and practical use of the Kafka Schema Registry.
Learn the most critical metrics to monitor in Kafka and how to respond to them, with a focus on debugging metrics and overall performance.
Learn how to monitor Kafka health through targeted logging, enable loggers at the appropriate levels, and track broker status, topic creation and modification, and producer and consumer activities.
Gain practical insight into monitoring metrics, the architecture and components of Kafka, and how the Kafka schema registry works.
Discover Kafka Streams, immutable key-value records, and real-time processing of data in motion, with end-to-end exactly-once delivery, replay from requested positions, and practical use cases.
Explore the Kafka Streams architecture, including producers, consumers, partitions, and streaming topology, with local state stores and fault tolerance for scalable, stateful stream processing.
Understand the Kafka Streams architecture via the record buffer, a per-thread record cache that speeds reads from the state store, batches writes, and tracks changelog activity.
Run Kafka Connect using standalone mode with source and sink connector properties, streaming JSON records from logs.txt into a connect topic via the console producer and consumer.
Gain a solid understanding of the Kafka Streams architecture, stream concepts, and how processors and topology work. Explore Kafka connectors and learn how to configure them.
Understand the Storm architecture with a single Nimbus master, multiple supervisors, and ZooKeeper coordinating a stateless, distributed cluster with failover to keep processing running.
Explore Apache Storm components, including Nimbus, supervisors, and workers, and how ZooKeeper coordinates heartbeats, failover, and task distribution for Storm topologies.
Develop a word count app by streaming sentences from Kafka into Storm, using a Kafka spout, split line bolt, and count bolt to tally words in a topology.
Explore how the resilient distributed dataset underpins Spark applications with partitioned in-memory processing, transformations and actions, and fault-tolerant execution via Spark context and executors.
Understand Datasets and the Spark session, including the Dataset API in Spark 2.0, the shift from DataFrames to typed Datasets, and metastore access through Hive and Impala.
Learn to integrate Kafka with Spark Streaming to build real-time applications by consuming from Kafka and counting word frequencies over the last 60 seconds, with configurable duration and direct streams.
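The windowed count can be sketched in plain Python before bringing in Spark. This is a minimal illustration and uses no Spark or Kafka APIs; the function name `windowed_word_counts` and the `(timestamp, line)` event shape are assumptions made for the example.

```python
from collections import Counter


def windowed_word_counts(events, now, window_seconds=60):
    """Count words in events whose timestamp falls within the last
    window_seconds. events: iterable of (timestamp_seconds, line) pairs."""
    counts = Counter()
    for ts, line in events:
        # Keep only events inside the half-open window (now - window, now].
        if now - window_seconds < ts <= now:
            counts.update(line.lower().split())
    return counts


events = [
    (0, "kafka spark"),         # older than 60 seconds at now=100, ignored
    (50, "kafka streams"),
    (100, "spark kafka kafka"),
]
recent = windowed_word_counts(events, now=100)
```

Changing `window_seconds` corresponds to the configurable duration mentioned above. A streaming engine keeps this window up to date incrementally instead of rescanning all events.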
Explore Apache Flume as a simple, scalable stream capture tool for Hadoop, detailing its events, sources, sinks, and channels, and learn to configure it with a config file.
[4-Sep-21 Update] Added demo code as a downloadable resource
Apache Kafka is an open-source distributed stream processing platform that provides high-throughput, low-latency real-time messaging. More than 80% of all Fortune 100 companies trust and use Kafka. Companies such as Airbnb, Netflix, Microsoft, Intuit, and Target use Kafka extensively.
This course has been aligned with industry best practices and has been created by industry leaders.
This is an exhaustive course covering Kafka from A to Z:
- Basic concepts and architecture of Kafka
- Kafka producer and consumer
- Serializer/deserializer
- Kafka Streams
- Kafka Connect
- Cluster setup and administering Kafka
- Kafka monitoring and Schema Registry
- Integration of Kafka with Storm
- Integration of Kafka with Spark and Flume
- Kafka security
- And many more concepts in detail
The course contains:
- 9.5 hours of high-quality, engaging videos
- 18 demos
- Quizzes for each lesson
- 1 project
The course will help you design Apache Kafka deployments and learn how Kafka is used to store and process multiple continuous streams of information faster and more efficiently.
Learn how to design and set up Kafka clusters with simple, step-by-step guidelines.
This Apache Kafka course will help students:
- Gain the knowledge required to take responsibility for their organization's Kafka cluster by configuring Kafka producers, consumers, streams, and connectors
- Describe the design of Kafka and explain its business use cases
- Start their journey with Kafka