Hands-on Kafka Connect: Source to Sink in S3, GCS & Beyond
Rating: 3.5 out of 5 (8 ratings)
143 students

Master Kafka Connect with Hands-On experience: S3 Sink, Debezium MySQL CDC Source Connectors, and Connect Cluster Setup
Last updated 1/2025
English

What you'll learn

  • In-depth knowledge of Kafka Connect and its architecture
  • In-depth practical knowledge of running the S3 sink connector in distributed mode
  • Setting up a Kafka Connect cluster
  • Complete understanding of the Debezium MySQL CDC source connector
  • Why Schema Registry is needed and how to integrate it with sink and source connectors
  • Schema evolution in sink and source connectors

Course content

4 sections, 21 lectures, 2h 51m total length
  • Course Introduction (3:21)
  • Kafka Connect Architecture (8:01)

    Explore how Kafka Connect architecture orchestrates source and sink connectors with workers and tasks in standalone or distributed modes, enabling fault-tolerant, scalable data movement between Kafka and external systems.

Requirements

  • Working understanding of Apache Kafka
  • Docker and Docker Compose

Description

This course is dedicated entirely to Kafka Connect and its open-source connectors. Many connectors are available for Kafka Connect; to begin with, this course covers one sink connector and one source connector.

We start the course by learning what Kafka Connect is and how it is architected. In the second module, we study the S3 sink connector in detail. First, we learn what the S3 sink connector is and run it in standalone mode. Next, we run the same configuration in distributed mode so that the difference between the two modes is clear.
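As a rough illustration of the difference between the two modes, the sketch below renders one connector configuration in both shapes: a properties file, which a standalone worker reads from the command line, and a JSON body, which a distributed worker accepts through the Kafka Connect REST API (a POST to /connectors). The connector name, topic, bucket, and region are hypothetical placeholders and are not taken from the course material.

```python
import json

# Hypothetical values for illustration only: the topic, bucket, and region
# are assumptions, not settings used in the course.
S3_SINK_CONFIG = {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "s3.bucket.name": "my-example-bucket",
    "s3.region": "us-east-1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "flush.size": "3",
}


def to_standalone_properties(name, config):
    """Render the config as key=value lines for a standalone properties file."""
    lines = [f"name={name}"]
    lines += [f"{key}={value}" for key, value in config.items()]
    return "\n".join(lines)


def to_distributed_request(name, config):
    """Render the same config as the JSON body for POST /connectors."""
    return json.dumps({"name": name, "config": config}, indent=2)


if __name__ == "__main__":
    print(to_standalone_properties("s3-sink", S3_SINK_CONFIG))
    print(to_distributed_request("s3-sink", S3_SINK_CONFIG))
```

The settings are identical in both cases; only the delivery mechanism changes, which is why the same configuration can be reused when moving from standalone to distributed mode.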

We explore the following partitioner classes with examples:

  1. Default Partitioner

  2. Time Based Partitioner

  3. Field Partitioner
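The three partitioners listed above mainly differ in the directory they place each S3 object under. The sketch below pairs the configuration keys each partitioner expects with a simplified imitation of the resulting directory name. It is not the connector's own code, and the field name `country` and the hourly path format are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Configuration overrides per partitioner. The field name and the hourly
# granularity are illustrative assumptions.
PARTITIONER_CONFIGS = {
    "default": {
        "partitioner.class": "io.confluent.connect.storage.partitioner.DefaultPartitioner",
    },
    "time": {
        "partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
        "partition.duration.ms": "3600000",
        "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
        "locale": "en-US",
        "timezone": "UTC",
        "timestamp.extractor": "Record",
    },
    "field": {
        "partitioner.class": "io.confluent.connect.storage.partitioner.FieldPartitioner",
        "partition.field.name": "country",
    },
}


def encoded_partition(kind, kafka_partition, record, timestamp_ms):
    """Simplified imitation of the directory each partitioner would choose."""
    if kind == "default":
        # One directory per Kafka partition.
        return f"partition={kafka_partition}"
    if kind == "time":
        # One directory per time bucket, mirroring the path.format above.
        ts = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
        return ts.strftime("year=%Y/month=%m/day=%d/hour=%H")
    if kind == "field":
        # One directory per distinct value of the configured record field.
        field = PARTITIONER_CONFIGS["field"]["partition.field.name"]
        return f"{field}={record[field]}"
    raise ValueError(f"unknown partitioner kind: {kind}")
```

For example, `encoded_partition("field", 0, {"country": "IN"}, 0)` yields `country=IN`, while the default partitioner ignores the record contents and uses only the Kafka partition number.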

After that, we learn how to integrate Kafka Connect with Schema Registry and test schema evolution in BACKWARD compatibility mode.
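To make the BACKWARD rule concrete: a new schema version is backward compatible when consumers using it can still read data written with the previous version. For Avro records, that broadly means any field added in the new version needs a default value, while removing a field is allowed. The sketch below shows the converter settings that point a connector at Schema Registry, plus a deliberately simplified check that only looks at added fields and ignores type changes. The registry URL and the example schemas are hypothetical.

```python
# Converter settings that make a connector use Avro with Schema Registry.
# The URL is a placeholder assumption.
AVRO_CONVERTER_CONFIG = {
    "value.converter": "io.confluent.connect.avro.AvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
}


def is_backward_compatible(old_schema, new_schema):
    """Simplified check: every field added in new_schema must have a default.

    Schema Registry's real check also covers type promotion, unions, and
    nested records; this sketch handles flat records only.
    """
    old_fields = {field["name"] for field in old_schema["fields"]}
    return all(
        field["name"] in old_fields or "default" in field
        for field in new_schema["fields"]
    )


# Hypothetical schema versions for illustration.
V1 = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
]}
V2_WITH_DEFAULT = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
    {"name": "note", "type": "string", "default": ""},
]}
V2_WITHOUT_DEFAULT = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
    {"name": "note", "type": "string"},
]}
```

Under this simplified rule, `V2_WITH_DEFAULT` is accepted as an evolution of `V1`, whereas `V2_WITHOUT_DEFAULT` is rejected because older records carry no value for `note`.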

Next, we learn what a dead letter queue (DLQ) is and test it by producing invalid records to Kafka. Lastly, we automate the creation of the S3 sink connector with a single command using Docker Compose.
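A dead letter queue is enabled on a sink connector through Kafka Connect's error-handling settings. The sketch below layers those settings on top of an existing sink configuration; the helper name `with_dead_letter_queue` and the topic name are hypothetical, and a replication factor of 1 is assumed to suit a single-broker test setup.

```python
def with_dead_letter_queue(sink_config, dlq_topic, replication_factor=1):
    """Return a copy of a sink connector config with DLQ routing enabled."""
    dlq_settings = {
        # Keep the task running when a record fails conversion or transformation.
        "errors.tolerance": "all",
        # Topic that receives the records that could not be processed.
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.topic.replication.factor": str(replication_factor),
        # Attach headers describing why each record failed.
        "errors.deadletterqueue.context.headers.enable": "true",
    }
    return {**sink_config, **dlq_settings}


if __name__ == "__main__":
    base = {"connector.class": "io.confluent.connect.s3.S3SinkConnector"}
    print(with_dead_letter_queue(base, "dlq-s3-sink"))
```

With `errors.tolerance` left at its default of `none`, a single invalid record would fail the task instead of being routed to the DLQ topic.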


Module 3 is dedicated to setting up a Kafka Connect cluster.

Here, we provision two machines on AWS and start a worker process for the S3 sink connector on both machines. We thoroughly test the load balancing and fault tolerance behaviour of our Kafka Connect cluster.


In Module 4, we explore a popular source connector: the Debezium MySQL CDC source connector.

First, we learn how the Debezium CDC connector works internally. Then we start the Debezium MySQL connector in distributed mode using Docker commands. After that, we run DML statements such as INSERT, UPDATE, and DELETE and examine how the event structure changes for each. Similarly, we run DDL statements, such as dropping a table, and observe how the schema history Kafka topic captures the changes. Lastly, we integrate the connector with Schema Registry and test the setup by running DDL and DML statements.
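As a sketch of what those DML events look like, a Debezium change event carries an envelope with `before`, `after`, and `op` fields: an insert has no `before` state, a delete has no `after` state, and an update has both. The helper below interprets such an envelope, assuming it has already been extracted from the message payload. The function name `classify_change` and the sample rows are hypothetical.

```python
def classify_change(envelope):
    """Interpret a Debezium change event envelope.

    Returns a (kind, data) tuple, where data is the new row, the removed row,
    or a mapping of changed columns to (old, new) value pairs.
    """
    op = envelope["op"]
    before = envelope.get("before") or {}
    after = envelope.get("after") or {}
    if op == "c":  # create: row inserted
        return ("insert", after)
    if op == "u":  # update: compare the two row images
        changed = {
            column: (before.get(column), value)
            for column, value in after.items()
            if before.get(column) != value
        }
        return ("update", changed)
    if op == "d":  # delete: only the previous row image is present
        return ("delete", before)
    if op == "r":  # read: row captured during the initial snapshot
        return ("snapshot", after)
    return ("unknown", {})


if __name__ == "__main__":
    event = {
        "op": "u",
        "before": {"id": 1, "name": "old"},
        "after": {"id": 1, "name": "new"},
    }
    print(classify_change(event))
```

DDL statements are not delivered through this envelope in the same way; as noted above, they are recorded in the schema history topic.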


Who this course is for:

  • Data Engineers
  • Software Engineers