
Explore how Kafka Connect architecture orchestrates source and sink connectors with workers and tasks in standalone or distributed modes, enabling fault-tolerant, scalable data movement between Kafka and external systems.
Test task distribution and fault tolerance in a two-node Kafka Connect cluster by creating a 3-partition topic and validating S3 sink connector tasks on node 1 and node 2.
Explore how Debezium captures insert, update, and delete events from a MySQL database, including before/after payloads and binlog-based source data, and streams them to Kafka in CDC mode.
This course is completely dedicated to Kafka Connect and its open-source connectors. Many connectors are available for Kafka Connect. To begin with, this course covers one sink connector and one source connector.
We start this course by learning what Kafka Connect is and how its architecture works. In the second module, we study the S3 sink connector in detail. First, we learn what the S3 sink connector is and install it in standalone mode. Next, we run the same configuration in distributed mode so that you can clearly see the difference between the two.
We explore the following partitioner classes with examples:
Default Partitioner
Time Based Partitioner
Field Partitioner
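As a rough illustration of how the three partitioner classes listed above differ in configuration, here is a minimal Python sketch that builds an S3 sink connector definition for each one. The property keys and class names are those used by the Confluent S3 sink connector. The topic, bucket, region, and field name (`orders`, `example-bucket`, `us-east-1`, `country`) are placeholders assumed for this example, not values from the course.

```python
import json

# Base S3 sink settings. Topic, bucket, and region are placeholder assumptions.
BASE_CONFIG = {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "topics": "orders",
    "s3.bucket.name": "example-bucket",
    "s3.region": "us-east-1",
    "flush.size": "3",
    "tasks.max": "1",
}

PARTITIONER_PACKAGE = "io.confluent.connect.storage.partitioner"


def partitioner_settings(kind):
    """Return the partitioner-specific properties for 'default', 'time', or 'field'."""
    if kind == "default":
        # Objects are laid out by Kafka partition number.
        return {"partitioner.class": f"{PARTITIONER_PACKAGE}.DefaultPartitioner"}
    if kind == "time":
        # Objects are laid out by time bucket (hourly here).
        return {
            "partitioner.class": f"{PARTITIONER_PACKAGE}.TimeBasedPartitioner",
            "path.format": "'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH",
            "partition.duration.ms": "3600000",
            "locale": "en-US",
            "timezone": "UTC",
            "timestamp.extractor": "Record",
        }
    if kind == "field":
        # Objects are laid out by the value of a record field (placeholder field name).
        return {
            "partitioner.class": f"{PARTITIONER_PACKAGE}.FieldPartitioner",
            "partition.field.name": "country",
        }
    raise ValueError(f"unknown partitioner kind: {kind}")


def build_connector(name, kind):
    """Combine the base settings with one partitioner into a connector definition."""
    return {"name": name, "config": {**BASE_CONFIG, **partitioner_settings(kind)}}


if __name__ == "__main__":
    for kind in ("default", "time", "field"):
        print(json.dumps(build_connector(f"s3-sink-{kind}", kind), indent=2))
```

Each printed JSON document has the shape that the Kafka Connect REST API accepts when creating a connector in distributed mode.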
After that, we learn how to integrate Kafka Connect with Schema Registry and test schema evolution in BACKWARD compatibility mode.
Next, we learn what a dead letter queue (DLQ) is and test it by producing invalid records to Kafka. Lastly, we automate the creation of the S3 sink connector with a single command using Docker Compose.
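To make the DLQ idea concrete, the sketch below adds Kafka Connect's error-handling properties to an existing sink connector configuration. The property keys are standard Kafka Connect sink settings. The helper name `with_dead_letter_queue` and the topic name `dlq-s3-sink` are assumptions made for this example.

```python
def with_dead_letter_queue(config, dlq_topic, replication_factor=1):
    """Return a copy of a sink connector config with DLQ routing enabled.

    Hypothetical helper: records that fail conversion or transformation are
    sent to `dlq_topic` instead of stopping the task.
    """
    updated = dict(config)  # leave the caller's dictionary untouched
    updated.update({
        # Keep the task running when a record cannot be processed.
        "errors.tolerance": "all",
        # Topic that receives the failed records.
        "errors.deadletterqueue.topic.name": dlq_topic,
        "errors.deadletterqueue.topic.replication.factor": str(replication_factor),
        # Attach failure details (such as the exception) as record headers.
        "errors.deadletterqueue.context.headers.enable": "true",
        # Also write failures to the worker log.
        "errors.log.enable": "true",
        "errors.log.include.messages": "true",
    })
    return updated


if __name__ == "__main__":
    sink = {"connector.class": "io.confluent.connect.s3.S3SinkConnector", "topics": "orders"}
    for key, value in with_dead_letter_queue(sink, "dlq-s3-sink").items():
        print(f"{key}={value}")
```

A replication factor of 1 suits a single-broker test setup. A multi-broker cluster would normally use a higher value.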
Module 3 is dedicated to setting up a Kafka Connect cluster.
Here, we provision two machines on AWS and start the S3 sink connector worker process on both machines. We thoroughly test the load balancing and fault tolerance behaviour of our Kafka Connect cluster.
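The load balancing and fault tolerance behaviour can be pictured with a toy model. The sketch below spreads three connector tasks over two workers round-robin, then reassigns them when one worker disappears. This is only a simplified illustration: Kafka Connect's real rebalance protocol is more involved, and the worker and task names (`node1`, `node2`, `s3-sink-0`, and so on) are assumed for the example.

```python
def assign_tasks(task_ids, workers):
    """Spread tasks over the live workers round-robin (toy model, not Connect's algorithm)."""
    if not workers:
        raise ValueError("at least one live worker is required")
    assignment = {worker: [] for worker in workers}
    for index, task in enumerate(task_ids):
        assignment[workers[index % len(workers)]].append(task)
    return assignment


if __name__ == "__main__":
    # One task per partition of a 3-partition topic (assumed task names).
    tasks = ["s3-sink-0", "s3-sink-1", "s3-sink-2"]

    # Both workers are up: the tasks are shared between them.
    print(assign_tasks(tasks, ["node1", "node2"]))

    # node2 goes down: every task moves to the surviving worker.
    print(assign_tasks(tasks, ["node1"]))
```

In a real cluster, the same effect can be observed by stopping the worker process on one machine and checking the connector's task status on the other.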
In Module 4, we explore a popular source connector: the Debezium MySQL CDC source connector.
First, we learn how the Debezium CDC connector works internally. Then we start our Debezium MySQL connector in distributed mode using Docker commands. After that, we run DML statements such as insert, update, and delete queries and study the change event produced for each. Similarly, we run DDL statements, such as dropping a table, and observe how the schema history Kafka topic captures the changes. Lastly, we integrate the connector with Schema Registry and test the setup by running DDL and DML statements.
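To show what the change events for insert, update, and delete look like, here is a small sketch that reads a Debezium-style event envelope and reports the operation and the changed columns. The `before`, `after`, `source`, and `op` fields and the op codes `c`, `u`, `d`, and `r` follow Debezium's event format. The sample event (table `customers`, its columns and values) is invented for illustration, and `summarize_change` is a hypothetical helper.

```python
# Debezium operation codes and a readable label for each.
OPERATIONS = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot read"}


def summarize_change(event):
    """Summarize one Debezium-style change event (hypothetical helper)."""
    # With schemas enabled in the JSON converter, the envelope sits under "payload".
    payload = event.get("payload", event)
    before = payload.get("before")
    after = payload.get("after")
    changed = {}
    if payload["op"] == "u" and before and after:
        # Keep only the columns whose value differs: column -> (old, new).
        changed = {
            column: (before.get(column), value)
            for column, value in after.items()
            if before.get(column) != value
        }
    return {
        "operation": OPERATIONS.get(payload["op"], "unknown"),
        "table": payload.get("source", {}).get("table"),
        "changed": changed,
    }


if __name__ == "__main__":
    # Invented sample event for an UPDATE on a hypothetical `customers` table.
    sample = {
        "payload": {
            "before": {"id": 1, "email": "old@example.com"},
            "after": {"id": 1, "email": "new@example.com"},
            "source": {"connector": "mysql", "table": "customers"},
            "op": "u",
        }
    }
    print(summarize_change(sample))
```

For an insert, `before` is null and only `after` carries data. For a delete it is the reverse, so `changed` stays empty in both cases.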