Hands-on Kafka Connect: Source to Sink in S3, GCS & Beyond

Rating: 3.5 (8 reviews)

Master Kafka Connect with Hands-On experience: S3 Sink, Debezium MySQL CDC Source Connectors, and Connect Cluster Setup

Created by Ganesh Dhareshwar

Last updated 1/2025

English

What you'll learn

In-depth knowledge of Kafka Connect and its architecture
In-depth practical knowledge of running the S3 sink connector in distributed mode
Setting up a Kafka Connect cluster
Complete understanding of the Debezium MySQL CDC source connector
Why Schema Registry is needed and how to integrate it with sink and source connectors
Schema evolution in sink and source connectors

Course content

4 sections • 21 lectures • 2h 51m total length

Course Introduction • 3:21
Kafka Connect Architecture • 8:01
Explore how Kafka Connect architecture orchestrates source and sink connectors with workers and tasks in standalone or distributed modes, enabling fault-tolerant, scalable data movement between Kafka and external systems.

Requirements

Basic understanding of Apache Kafka
Docker and Docker Compose

Description

This course is dedicated entirely to Kafka Connect and its open-source connectors. Kafka Connect has many connectors available; to begin with, this course covers one sink connector and one source connector.

We start by learning what Kafka Connect is and how its architecture works. In the second module, we study the S3 sink connector in detail. First, we learn what the S3 sink connector is and run it in standalone mode. Next, we run the same configuration in distributed mode so that the difference between the two modes is clear.
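As a rough illustration of the distributed-mode workflow: in distributed mode a connector is normally created by sending JSON to a Connect worker's REST API, whereas standalone mode reads a properties file at startup. The sketch below only builds such a request and does not send it. The worker URL, connector name, topic, bucket, and region are placeholder assumptions, not values taken from the course.

```python
import json
import urllib.request

# Hypothetical worker address; 8083 is the usual default REST port for Connect.
CONNECT_URL = "http://localhost:8083/connectors"


def build_s3_sink_request(name, topic, bucket, region):
    """Build (but do not send) a POST request that would create an S3 sink connector."""
    payload = {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "tasks.max": "1",
            "topics": topic,
            "s3.bucket.name": bucket,
            "s3.region": region,
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
            # Number of records written per S3 object; small value for demos.
            "flush.size": "3",
        },
    }
    return urllib.request.Request(
        CONNECT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# All argument values here are made up for illustration.
request = build_s3_sink_request("s3-sink-demo", "orders", "my-demo-bucket", "us-east-1")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it to a running worker.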

We explore the following partitioner classes with examples:

Default Partitioner
Time Based Partitioner
Field Partitioner
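The three partitioners listed above are selected through the connector's `partitioner.class` setting. The following is a minimal sketch of the extra properties each one typically needs. The helper name `partitioner_config`, the one-hour duration, the path format, and the field name `country` are illustrative assumptions rather than the course's own settings.

```python
PARTITIONER_PACKAGE = "io.confluent.connect.storage.partitioner."


def partitioner_config(kind):
    """Return the connector properties for one partitioner (hypothetical helper)."""
    if kind == "default":
        # Objects are laid out by Kafka topic partition.
        return {"partitioner.class": PARTITIONER_PACKAGE + "DefaultPartitioner"}
    if kind == "time":
        # Objects are grouped into time buckets; here one bucket per hour.
        return {
            "partitioner.class": PARTITIONER_PACKAGE + "TimeBasedPartitioner",
            "partition.duration.ms": "3600000",
            "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
            "locale": "en-US",
            "timezone": "UTC",
            "timestamp.extractor": "Record",
        }
    if kind == "field":
        # Objects are grouped by the value of a record field (assumed name).
        return {
            "partitioner.class": PARTITIONER_PACKAGE + "FieldPartitioner",
            "partition.field.name": "country",
        }
    raise ValueError(f"unknown partitioner kind: {kind}")


for kind in ("default", "time", "field"):
    print(kind, "->", partitioner_config(kind)["partitioner.class"])
```

These properties would be merged into the S3 sink connector configuration alongside the bucket and format settings.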

After that, we learn how to integrate Kafka Connect with Schema Registry and test schema evolution in BACKWARD compatibility mode.
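BACKWARD compatibility means that a consumer using the new schema can still read data written with the previous one. For Avro record schemas this roughly means that any newly added field needs a default value. The sketch below checks only that one rule on plain dictionaries; it ignores type changes and other rules that Schema Registry also enforces, and the example schemas are invented for illustration.

```python
def is_backward_compatible(old_schema, new_schema):
    """Simplified check: every field added in new_schema must carry a default."""
    old_field_names = {field["name"] for field in old_schema["fields"]}
    for field in new_schema["fields"]:
        if field["name"] not in old_field_names and "default" not in field:
            return False
    return True


# Invented example schemas.
v1 = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
]}
v2_with_default = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
    {"name": "status", "type": "string", "default": "NEW"},
]}
v2_without_default = {"type": "record", "name": "Order", "fields": [
    {"name": "id", "type": "int"},
    {"name": "status", "type": "string"},
]}

print(is_backward_compatible(v1, v2_with_default))
print(is_backward_compatible(v1, v2_without_default))
```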

Next, we learn what a dead letter queue (DLQ) is and test it by producing invalid records to Kafka. Lastly, we automate the creation of the S3 sink connector with a single command using Docker Compose.
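For a sink connector, the DLQ is switched on through the connector's error-handling properties. The sketch below adds those properties to an existing configuration. The helper name `with_dead_letter_queue`, the topic name, and the replication factor of 1 are assumptions suitable only for a single-broker test setup.

```python
def with_dead_letter_queue(config, dlq_topic):
    """Return a copy of a sink connector config with DLQ handling enabled."""
    updated = dict(config)
    updated.update({
        # Keep the task running when a record fails conversion or transformation.
        "errors.tolerance": "all",
        # Failed records are routed to this topic instead of stopping the task.
        "errors.deadletterqueue.topic.name": dlq_topic,
        # Assumed single-broker setup; use a higher factor on a real cluster.
        "errors.deadletterqueue.topic.replication.factor": "1",
        # Attach failure details as record headers to help debugging.
        "errors.deadletterqueue.context.headers.enable": "true",
    })
    return updated


base_config = {"connector.class": "io.confluent.connect.s3.S3SinkConnector", "topics": "orders"}
dlq_config = with_dead_letter_queue(base_config, "dlq-orders")
print(dlq_config["errors.deadletterqueue.topic.name"])
```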

Module 3 is dedicated to setting up a Kafka Connect cluster.

Here, we provision two machines on AWS and start an S3 sink connector worker process on both machines. We then thoroughly test the load balancing and fault tolerance behaviour of our Kafka Connect cluster.
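The idea behind load balancing and fault tolerance can be pictured with a toy model: connector tasks are spread across the live workers, and when a worker disappears its tasks move to the survivors. The round-robin function below is only an illustration of that idea and is not Kafka Connect's actual rebalance protocol. The worker and task names are made up.

```python
def assign_tasks(tasks, workers):
    """Spread tasks over workers round-robin (toy model, not Connect's algorithm)."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for index, task in enumerate(tasks):
        assignment[workers[index % len(workers)]].append(task)
    return assignment


tasks = ["s3-sink-0", "s3-sink-1", "s3-sink-2", "s3-sink-3"]

# Both machines are up: each worker takes half of the tasks.
balanced = assign_tasks(tasks, ["worker-1", "worker-2"])
print(balanced)

# One machine is stopped: the remaining worker picks up every task.
after_failure = assign_tasks(tasks, ["worker-1"])
print(after_failure)
```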

In Module 4, we explore a popular source connector: the Debezium MySQL CDC source connector.

First, we learn how the Debezium CDC connector works internally. Then we start the Debezium MySQL connector in distributed mode using Docker commands. After that, we run DML statements such as INSERT, UPDATE, and DELETE and study how the resulting change events differ. Similarly, we run DDL statements, such as dropping a table, and observe how the schema history Kafka topic captures those changes. Lastly, we integrate the connector with Schema Registry and test the setup by running DDL and DML statements.
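Debezium change events carry an envelope with `before`, `after`, and `op` fields, where `op` is `c` for a create (insert), `u` for an update, `d` for a delete, and `r` for a snapshot read. The sketch below interprets such a payload. The helper name `describe_change` and the sample rows are invented for illustration, and real events contain additional fields such as `source` and `ts_ms`.

```python
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def describe_change(payload):
    """Return (operation name, relevant row) for a Debezium change event payload."""
    operation = OPERATIONS.get(payload["op"], "unknown")
    # A delete has no "after" image, so the deleted row is taken from "before".
    row = payload["before"] if payload["op"] == "d" else payload["after"]
    return operation, row


# Invented sample payloads.
insert_event = {"op": "c", "before": None, "after": {"id": 1, "name": "alice"}}
update_event = {"op": "u", "before": {"id": 1, "name": "alice"}, "after": {"id": 1, "name": "bob"}}
delete_event = {"op": "d", "before": {"id": 1, "name": "bob"}, "after": None}

for event in (insert_event, update_event, delete_event):
    print(describe_change(event))
```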

Who this course is for:

Data Engineers
Software Engineers

Course content

Overview: 2 lectures • 11min

Amazon S3 Sink Connector: 10 lectures • 1hr 24min

Kafka Connect Cluster: 2 lectures • 15min

Debezium MySQL CDC Source Connector: 7 lectures • 1hr 1min
