Apache Kafka - Real-time Stream Processing (Master Class)
4.6 (519 ratings)
3,182 students enrolled

Processing Real-time Streams using Apache Kafka and Kafka Streams API - Start as Beginner to Finish as PRO
Last updated 4/2020
English
Current price: $13.99 Original price: $19.99 Discount: 30% off
30-Day Money-Back Guarantee
This course includes
  • 11 hours on-demand video
  • 2 articles
  • 126 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Apache Kafka Foundation and Kafka Architecture
  • Creating Streams using Kafka Producer APIs
  • Designing, Developing and Testing Real-time Stream Processing Applications using Kafka Streams Library
  • Kafka Streams Architecture, Streams DSL, Processor API and Exactly Once Processing in Apache Kafka
  • Auto-generating Java Objects from JSON Schema definition, Serializing, Deserializing and working with JSON messages without Schema Registry.
  • Auto-generating Java Objects from AVRO Schema definition, Serializing, Deserializing and working with AVRO messages using Confluent Schema Registry.
  • Unit Testing and Integration Testing your Kafka Streams Application.
  • Supporting Microservices architecture and implementing Kafka Streams Interactive Query.
Course content
75 lectures 10:57:29
+ Introduction to Real-time Streams
6 lectures 59:40

This is the first lecture of the course. It looks at some history: where we started with data processing and where we are heading.

In this lecture, I will talk about the Big Data problem and how it started. I will also introduce the three whitepapers Google published that started the big data movement and led to the development of Hadoop. Then, we will discuss some shortcomings of the approach that Hadoop takes to processing large volumes of data.

This lecture will help you understand how the expectations from a big data processing system are growing, and the need to handle time-sensitive data at speed.

See you in the lecture.

Keep Learning and Keep Growing.

Emergence of Bigdata - A Quick Recap
10:44

In this lecture, I will talk about the first concept of real-time stream processing.

The idea of real-time stream processing starts with looking at day-to-day business activities as a sequence of events. This notion of business activities as events is the foundation of real-time stream processing. Applying the idea in practice is not too difficult, but it often starts with four main questions.

  1. How to identify and model events?

  2. How to create a stream of events?

  3. How to transport events?

  4. What do you mean by processing events?

This lecture aims to answer these questions and give you an initial sense of what we mean by real-time stream processing.

See you in the lecture.

Keep Learning and Keep Growing.

Conception of Event Streams
11:27

In this lecture, I will talk about the categories of real-time stream processing use cases.

This video aims to cement the idea and strengthen your understanding of how real-time stream processing is applied. I will be talking about five technical categories where stream processing is used. I will also give you some specific business examples, but the main idea is to broadly classify the technical categories where stream processing is being applied.

While the categories that I am going to talk about are not exhaustive, I have tried to arrange them in increasing order of complexity. A typical business may want to start with the first category and progress towards the more complex use cases.

See you in the lecture.

Keep Learning and Keep Growing.

Real-time Streaming - Use Cases
15:23

In this lecture, I will help you to understand the main challenge that forces us to take a different approach to real-time stream processing.

I will pick a real-life problem and talk about the complexity and the challenges that you need to solve to process events in real time. We will also see how traditional systems tried to tackle the problem in the past, and how we want to progress towards designing a modern solution.

See you in the lecture.

Keep Learning and Keep Growing.

Real-time Streaming Challenges
06:21

In this lecture, I will help you to formulate some critical decision criteria that we can use to evaluate the right solution for streaming data.

Once you have the evaluation criteria, the next question is obvious. What options do we have? There are many ways of integrating applications for data exchange. However, all those approaches can be broadly grouped into four main patterns. I will talk about the four alternative solutions for data exchange, and we will evaluate all of them against the decision criteria.

See you in the lecture.

Keep Learning and Keep Growing.

Real-time Streaming Design Consideration
13:58

You have successfully finished the first section of this training. In this section, I purposefully avoided any technically intense discussions. The main objective was to help you become familiar with the real-time stream processing paradigm.

This summary lecture will quickly recap some of the main things that I talked about in this section.

See you in the summary lecture.

Keep Learning and Keep Growing.

Section Summary
01:47
+ Enter the world of Apache Kafka
6 lectures 01:02:49

This video is a section introduction. In this lecture, I will define Apache Kafka and set the stage for the rest of the lectures in this section.

See you in the lecture.

Keep Learning and Keep Growing.

What is Apache Kafka?
03:42

Apache Kafka organizes the messages in Topics, and the broker creates a log file for each Topic to store these messages. However, these log files are partitioned, replicated, and segmented. In this lecture, I will help you understand the Kafka Topics in depth and also show you the Kafka log file organization. This lecture is going to be a working session.

See you in the lecture.

Keep Learning and Keep Growing.

Preview 19:49

In this lecture, you will learn how Kafka Cluster is formed and who is responsible for managing the Kafka Cluster and coordinating work in the cluster. I will talk about the role of Zookeeper in the Kafka Cluster and explain the details of the Cluster Controller.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Cluster Architecture
10:08

In this lecture, I will tie up the relationship between Kafka partitions and Kafka brokers. This lecture will help you understand how work is distributed among the brokers in a Kafka cluster. The primary focus of this lecture is to explain the fault tolerance and scalability of Apache Kafka.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Work Distribution Architecture - Part 1
10:11

In this lecture, I will talk about the responsibilities of the leader and the followers. You will learn the key responsibilities performed by the leader broker and also understand what the job of a follower is. This lecture also explains the Kafka ISR list in detail.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Work Distribution Architecture - Part 2
15:04

You have successfully finished the second section of this training. This summary lecture will point you to some key areas of the Kafka documentation which you should start exploring at this stage.

See you in the summary lecture.

Keep Learning and Keep Growing.

Section Summary
03:55
+ Creating Real-time Streams
8 lectures 01:35:58

Bringing data streams into Kafka is your first step towards developing real-time stream processing applications. Hence, we dedicate one full section to bringing data into a Kafka cluster as a stream. In this introductory lecture, I will talk about alternatives for streaming your business events into Apache Kafka.

See you in the lecture.

Keep Learning and Keep Growing.

Streaming into Kafka
05:06

In this lecture, I will help you understand the producer API structure and create an example to demonstrate its usage. To keep things simple, I will create the simplest possible Kafka producer that sends one million string messages to a Kafka topic.
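
As a taste of what this quick-start covers, here is a minimal sketch of such a producer. The broker address and topic name are placeholders, not the course's actual example, and running it requires a live Kafka broker:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class HelloProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 1; i <= 1_000_000; i++) {
                // send() is asynchronous; records are batched behind the scenes
                producer.send(new ProducerRecord<>("hello-producer-topic", null, "Simple Message-" + i));
            }
        } // close() flushes any buffered records before exiting
    }
}
```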

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Producers - Quick Start
11:50

In this lecture, I will talk about the internals of the Kafka producer API. While the API methods are straightforward, a lot of things happen behind the scenes. This lecture will help you understand what happens under the hood and build a solid conceptual foundation.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Producer Internals
17:48

Kafka is all about scalability. How can we scale our application and send hundreds of thousands of events per second? In this lecture, we will explore some details about scaling up the producers.

See you in the lecture.

Keep Learning and Keep Growing.

Scaling Kafka Producer
16:50

By now, you have learned enough to meet most of the basic requirements for streaming events to a Kafka cluster. However, some specific and intricate scenarios require extra attention. In this lecture, I will talk about how you can achieve exactly-once semantics with Apache Kafka producers.
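
On the producer side, the key switch is idempotence. As a hedged configuration fragment (broker address is a placeholder): with idempotence enabled, the broker deduplicates retried sends using a producer id plus per-partition sequence numbers, so each message is written at most once per partition.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

// Configuration fragment for an idempotent producer
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Broker deduplicates retries; also constrains acks and in-flight requests
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
```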

See you in the lecture.

Keep Learning and Keep Growing.

Advanced Kafka Producers (Exactly Once)
07:01

The Kafka producer API allows you to implement atomic transactions across topics and partitions. Atomicity has the same meaning as in databases: either all messages within the same transaction are committed, or none of them is saved. In this lecture, I will explain how to implement atomic transactions using the Kafka producer API and create an example to help you understand the implementation details.
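
The shape of the transactional API can be sketched as follows. The transactional id, topic names, and broker address are placeholders, and a live broker is required to run it:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // A stable transactional.id enables transactions (and implies idempotence)
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-producer-tx"); // placeholder id

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();
        try {
            producer.beginTransaction();
            // Both sends commit or abort together, even across different topics
            producer.send(new ProducerRecord<>("topic-a", "key", "value-1"));
            producer.send(new ProducerRecord<>("topic-b", "key", "value-2"));
            producer.commitTransaction();
        } catch (Exception e) {
            // Neither message becomes visible to read_committed consumers
            producer.abortTransaction();
        } finally {
            producer.close();
        }
    }
}
```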

See you in the lecture.

Keep Learning and Keep Growing.

Advanced Kafka Producer (Implementing Transaction)
12:44

In this session, we are going to create a point of sale simulator that generates an infinite number of random but realistic invoices and sends them to the Kafka broker.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Producer - Micro Project
10:16

In this lecture, I am going to talk about some corner cases and close my discussion on producer APIs. This is the last lecture on producer APIs, and it covers the following.

  1. Synchronous Send

  2. Producer Callback

  3. Custom Partitioner

  4. Avro Serializer and Schema Registry

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Producer - Final Note and References
14:23
+ Enter the Stream Processing
8 lectures 01:27:20

In this lecture, I will introduce you to the three methods of stream processing using Kafka. I will briefly explain the differences between these three methods and help you understand who offers what.

See you in the lecture.

Keep Learning and Keep Growing.

Stream Processing in Apache Kafka
05:07

In this lecture, I will give you a practical introduction to the Kafka consumer API, and you will create a typical consume-transform-produce pipeline using consumer APIs.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Consumer - Practical Introduction
11:59

In this lecture, we will take our discussion on Kafka Consumers further, and talk about consumer scalability, fault tolerance, and offset management. I will also walk you through some complex scenarios to help you understand why using Kafka Consumers would be a challenge for stream processing.

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Consumer - Scalability, Fault tolerance and Missing Features
13:45

In this lecture, I will help you create your first Kafka Streams application. The example is the simplest possible streaming application. However, it helps you understand the basic API structure and its usage.
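
A first Kafka Streams application typically looks like this minimal sketch: build a topology with the DSL, then start it. The application id, topic name, and broker address are placeholders, and a live broker is required:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;

public class HelloStreams {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "hello-streams"); // placeholder names
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        // Step 1: define the processing topology using the Streams DSL
        StreamsBuilder builder = new StreamsBuilder();
        KStream<Integer, String> source =
            builder.stream("hello-streams-topic", Consumed.with(Serdes.Integer(), Serdes.String()));
        source.foreach((key, value) -> System.out.println("Key: " + key + ", Value: " + value));

        // Step 2: start the topology; it keeps running until the JVM shuts down
        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```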

See you in the lecture.

Keep Learning and Keep Growing.

Preview 13:11

In this lecture, you will learn the notion of the Topology in more detail. We will also explore some available transformation methods that we can use to create a processor topology. Further, we will take a fairly complex business requirement and create a topology DAG for the same.

The primary objective of this lecture is to help you understand the most critical ingredient of a Kafka Streams application: how to create a topology. Once you get this part right, the rest of the course will go smoothly, and learning will be fun.

See you in the lecture.

Keep Learning and Keep Growing.

Creating Streams Topology
16:03

In this lecture, we will implement a complex Streams Topology that we designed in the earlier lecture.

See you in the lecture.

Keep Learning and Keep Growing.

Implementing Streams Topology
14:44

In this lecture, we will try to understand how Kafka Streams works under the covers to provide parallel processing, and talk about the following items.

  1. Multi-threading in Kafka Streams.

  2. Multiple Instances of Streams Application.

  3. Streams Topology

  4. Streams Task

See you in the lecture.

Keep Learning and Keep Growing.

Kafka Streams Architecture
09:30

This is a summary lecture. In this lecture we will summarize everything that we learned in this section and I will also give you some further reference links to learn more.

See you in the lecture.

Keep Learning and Keep Growing.

Section Summary and References
03:01
+ Foundation for Real Life Implementations
7 lectures 54:48

This is a section introduction lecture. In this lecture, I will give you an overview of what you are going to learn in this section and why it is important.

See you in the lecture.

Keep Learning and Keep Growing.

Introduction to Types and Serialization in Kafka
04:01

In this lecture, we will learn to define a schema and auto-generate POJOs from the schema definition. These auto-generated Java classes are well annotated, which makes them serializable using the JSON serializer and deserializer.

See you in the lecture.

Keep Learning and Keep Growing.

JSON Schema to POJO for JSON Serdes
09:05

In this lecture, we will create JSON based custom Serdes for the POJOs that we generated in the earlier lecture.

See you in the lecture.

Keep Learning and Keep Growing.

Creating and Using JSON Serdes
08:32

In this lecture, you will learn to create Avro schema definitions and auto-generate Java class definitions from them.

See you in the lecture.

Keep Learning and Keep Growing.

AVRO Schema to POJO for AVRO Serdes
07:47

In this lecture, we are going to recreate our POS simulator example to utilize Avro schema and send Avro serialized messages to Confluent Kafka Platform.

See you in the lecture.

Keep Learning and Keep Growing.


Creating and using AVRO schema in Producers
11:17

In this lecture, we will learn how to use AVRO serialization in Kafka Streams.

See you in the lecture.

Keep Learning and Keep Growing.

Creating and using AVRO schema in Kafka Streams
12:27

This is a summary lecture. In this lecture we will summarize everything that we learned in this section and I will also give you some further reference links to learn more.

See you in the lecture.

Keep Learning and Keep Growing.

Section Summary and References
01:39
+ States and Stores
5 lectures 43:08

In this lecture, I will introduce you to the concept of states and state store.

See you in the lecture.

Keep Learning and Keep Growing.

Understanding States and State Stores
07:26

In this lecture, we will create a state store, and you will learn to create state stores manually and use them in your programs.

See you in the lecture.

Keep Learning and Keep Growing.

Creating your First State Store
20:05

In this lecture, we will try to understand the need for repartitioning your Kafka Stream and also learn a method for doing the same.

See you in the lecture.

Keep Learning and Keep Growing.

Caution with States
09:32

In this lecture, we will learn about the fault tolerance capability of the local state store.

See you in the lecture.

Keep Learning and Keep Growing.

State Store Fault Tolerance
04:35

This lecture summarizes the state stores section.

See you in the lecture.

Keep Learning and Keep Growing.

Section Summary and References
01:30
+ KTable - An Update Stream
4 lectures 29:50

In this lecture, I will introduce you to the notion of KTable.

See you in the lecture.

Keep Learning and Keep Growing.

Introducing KTable
05:48

In this lecture, we are going to create a super simple example to understand some details of using KTable.

See you in the lecture.

Keep Learning and Keep Growing.

Creating your First Update Stream - KTable
13:16

In this lecture, I will talk about the KTable Caching and emit rates.

See you in the lecture.

Keep Learning and Keep Growing.

Table Caching and Emit Rates
06:40

In this lecture, I will introduce you to GlobalKTables in Kafka.

See you in the lecture.

Keep Learning and Keep Growing.

Introducing GlobalKTable
04:06
+ Real-time Aggregates
7 lectures 47:48

In this lecture, we will create a word count example on a real-time stream.
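
A streaming word count usually follows the shape sketched below: split lines into words, regroup by the word, and count. The application id, topic name, and broker address are placeholders, and a live broker is required:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;

public class WordCountDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-demo"); // placeholder names
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        KTable<String, Long> counts = builder
            .stream("lines-topic", Consumed.with(Serdes.String(), Serdes.String()))
            // one record per word
            .flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\s+")))
            // regroup by the word itself so all counts for a word land on one task
            .groupBy((key, word) -> word, Grouped.with(Serdes.String(), Serdes.String()))
            .count();

        counts.toStream().foreach((word, count) -> System.out.println(word + " => " + count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```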

See you in the lecture.

Keep Learning and Keep Growing.

Computing Your First Aggregate - Real-time Streaming Word Count
08:15

In this lecture, I will talk about some core considerations for implementing streaming aggregates.

See you in the lecture.

Keep Learning and Keep Growing.

Streaming Aggregates - Core Concept
07:44

In this lecture, we will learn to use the reduce() method for computing real-time aggregates.
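
The reduce() method keeps the value type unchanged while folding records per key. A minimal sketch (application id, topic name, and broker address are placeholders; a live broker is required) that maintains a running total per key:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;

public class ReduceDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "reduce-demo"); // placeholder names
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        // Running total per key: reduce() combines the previous aggregate with each new value
        KTable<String, Double> totals = builder
            .stream("amounts-topic", Consumed.with(Serdes.String(), Serdes.Double()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
            .reduce(Double::sum);

        totals.toStream().foreach((key, total) -> System.out.println(key + " => " + total));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```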

See you in the lecture.

Keep Learning and Keep Growing.

KStream Aggregation using Reduce()
06:41

In this lecture, you will learn to use the aggregate() method for real-time stream aggregation.
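
Unlike reduce(), aggregate() lets the result type differ from the input type. As a hedged sketch (names are placeholders; a live broker is required), this counts total characters of String values per key, producing a Long from String inputs:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class AggregateDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "aggregate-demo"); // placeholder names
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();
        KTable<String, Long> charCounts = builder
            .stream("words-topic", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .aggregate(
                () -> 0L,                                      // initializer: start at zero
                (key, value, agg) -> agg + value.length(),     // adder: fold each record in
                Materialized.with(Serdes.String(), Serdes.Long())); // serdes for the Long result

        charCounts.toStream().foreach((key, count) -> System.out.println(key + " => " + count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```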

See you in the lecture.

Keep Learning and Keep Growing.

KStream Aggregation using Aggregate()
10:16

In this lecture, I am going to help you understand the difference between KStream aggregation and KTable aggregation.

See you in the lecture.

Keep Learning and Keep Growing.

Common Mistakes in Aggregation
04:42

In this lecture, we will learn about the count() method on KTable.

See you in the lecture.

Keep Learning and Keep Growing.

Count on KTable
04:34

In this lecture, you will learn the mechanics of using the KTable aggregate() method.

See you in the lecture.

Keep Learning and Keep Growing.

KTable Aggregation using Aggregate()
05:36
+ Timestamps and Windows
6 lectures 50:21

In this lecture, I will help you understand the three timestamp semantics in Apache Kafka.
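
The three semantics commonly discussed are event time (set by the producer or carried in the payload), ingestion time (LogAppendTime stamped by the broker), and processing time (the wall clock when a record is processed). As a hedged sketch, a custom TimestampExtractor can pull event time out of the message payload; the Invoice type here is a hypothetical stand-in for the course's actual message class:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.processor.TimestampExtractor;

public class InvoiceTimestampExtractor implements TimestampExtractor {
    // Minimal stand-in for a payload type carrying its own event time (hypothetical)
    public static class Invoice {
        public long createdTime; // epoch millis when the invoice was created
    }

    @Override
    public long extract(ConsumerRecord<Object, Object> record, long partitionTime) {
        if (record.value() instanceof Invoice) {
            // Event time taken from the payload itself
            return ((Invoice) record.value()).createdTime;
        }
        // Fall back to the timestamp already on the record (CreateTime or LogAppendTime)
        return record.timestamp();
    }
}
```

It would be plugged in via the `default.timestamp.extractor` Streams configuration (StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG).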

See you in the lecture.

Keep Learning and Keep Growing.

Timestamps and Timestamp Extractors
09:44
Creating Tumbling Windows
10:53
Stream Time and Grace Period
07:39
Suppressing Intermediate Results
07:46
Creating Hopping Windows
03:27
Creating Session Windows
10:52
Requirements
  • Programming Knowledge Using Java Programming Language
  • Familiarity with Java 8 Lambda
  • A Recent 64-bit Windows/Mac/Linux Machine with 4 GB RAM (8 GB Recommended)
Description

This course does not require any prior knowledge of Apache Kafka. We have taken enough care to explain all necessary and complex Kafka Architecture concepts to help you come up to speed and grasp the content of this course.


About the Course

I am creating Kafka Streams - Real-time Stream Processing to help you understand stream processing in general and apply that knowledge to Kafka Streams programming. This course is based on my book of the same title, which is already published and is available from all major online retailers as an eBook and paperback.

My approach to creating this course is a progressive, common-sense approach to teaching a complex subject. Using this approach, I will help you apply your general ability to perceive, understand, and reason about the concepts as I explain them progressively throughout the course.

Who should take this Course?

The Kafka Streams - Real-time Stream Processing course is designed for software engineers willing to develop a stream processing application using the Kafka Streams library. I am also creating this course for data architects and data engineers who are responsible for designing and building their organization's data-centric infrastructure. A third group is managers and architects who do not directly work on the Kafka implementation but work with the people who implement Kafka Streams at the ground level.

Kafka Version used in the Course

This course is using the Kafka Streams library available in Apache Kafka 2.x. I have tested all the source code and examples used in this course on Apache Kafka 2.3 open source distribution. Some examples of this course also make use of the Confluent Community Version of Kafka. We will be using Confluent Community Version to explain and demonstrate functionalities that are only available in the Confluent Platform, such as Schema Registry and Avro Serdes.

Source Code, Development IDE, Build Tool, Logging, and Testing Tools

This course is fully example-driven, and I will be creating many examples in the class. The source code files for all the examples are included in your study material.

This course will be making extensive use of IntelliJ IDEA as the preferred development IDE. However, based on your prior experience, you should be able to work with any other IDE designed for Java application development.

This course will be using Apache Maven as the preferred build tool. However, based on your prior experience, you should be able to use any other build tool designed for Java applications.

This course also makes use of Log4J2 to teach you industry-standard log implementation in your application.

We will be using JUnit5, which is the latest version of JUnit for implementing Unit Test Cases.

Example and Exercises

Working examples and exercises are the most critical tools for converting your knowledge into a skill. I have already included many examples in the course. The course also includes objective questions and programming assignments where appropriate. These exercises will help you validate your concepts and apply your learning to solve programming problems.

Who this course is for:
  • Software Engineers and Architects who are willing to design and develop a Stream Processing Application using Kafka Streams Library.
  • Java Programmers aspiring to learn everything necessary to start implementing real-time streaming applications using Apache Kafka