Standalone vs Distributed Mode

Stephane Maarek | AWS Certified Cloud Practitioner,Solutions Architect,Developer
Best Selling Instructor, Kafka Guru, 9x AWS Certified

Lecture description

Learn about the two modes for launching Kafka Connect, standalone mode and distributed mode, and their pros and cons

Learn more from the full course

Apache Kafka Series - Kafka Connect Hands-on Learning

Kafka Connect - Learn How to Source Twitter Data, Store in Apache Kafka Topics & Sink in ElasticSearch and PostgreSQL

04:23:34 of on-demand video • Updated July 2021

  • Configure and run Apache Kafka Source and Sink Connectors
  • Learn concepts behind Kafka Connect & the Kafka Connect architecture
  • Launch a Kafka Connect Cluster using Docker Compose
  • Deploy Kafka Connectors in Standalone and Distributed Mode
  • Write your own Kafka Connector
So you have two ways of bringing up your Connect workers, either in standalone or in distributed mode, and we will get to try out both in this course. We'll try standalone first, and then we'll use distributed mode for the rest of the course.

So standalone mode first. Basically, a single process, a single worker, runs all your connectors and tasks. The configuration is bundled with your process, and it's very easy to get started with. It's super useful when you're doing development and testing, or when you're writing your own Kafka connector. It's not fault tolerant: if that process fails or dies, you're left without a connector. It doesn't scale, or at least not horizontally; you can scale vertically by having a better CPU, but that's it. And it's really hard to monitor, because it's a single standalone process.

Now distributed mode. You have multiple workers, they're servers basically, and they run your connectors and your tasks. The configuration is not bundled with the workers; it's submitted using a REST API, and we'll see how to use the REST API in detail. It's super easy to scale: you just add workers, you just add more servers, and automatically these new workers will pick up tasks and execute them. And finally, it's fault tolerant. Basically, if a worker dies, and we'll see this in the next lecture, all its tasks are rebalanced onto the available workers, and your connectors can still go on. So it's really nice: you get fault tolerance and you get horizontal scalability, and all of that makes it really useful for production deployment of connectors.

So remember, standalone mode is made for development and testing, and distributed mode is made for production deployment of connectors.
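
To make the "configuration is submitted using a REST API" point concrete, here is a minimal Python sketch of what submitting a connector to a distributed-mode worker might look like. It is not taken from the lecture. It assumes a worker listening on `localhost:8083` (8083 is Kafka Connect's default REST port), and the connector name, file path, and topic are hypothetical placeholders. The connector class is the file source connector that ships with Apache Kafka. The sketch only builds the `POST /connectors` request; sending it is left to a separate function because that needs a running worker.

```python
import json
import urllib.request

# Assumption: a distributed-mode Connect worker exposes its REST API here.
# 8083 is the default REST port; the host is a placeholder.
CONNECT_URL = "http://localhost:8083"


def build_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) the POST /connectors request for a new connector."""
    payload = {"name": name, "config": config}
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request to the worker. Requires a running Connect cluster."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Hypothetical connector: the name, file path, and topic are illustrative only.
demo_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/demo-input.txt",
    "topic": "demo-topic",
}

request = build_connector_request("demo-file-source", demo_config)
```

With a worker running, `submit(request)` would send the configuration. In standalone mode there is no such step: the same kind of settings would instead live in properties files passed to the worker process when it starts, which is what "the configuration is bundled with your process" refers to.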