What is Zookeeper?

A free video tutorial from Stephane Maarek | AWS Certified Cloud Practitioner,Solutions Architect,Developer
Best Selling Instructor, 10x AWS Certified, Kafka Guru
Rating: 4.7 out of 5 (instructor rating)
68 courses
2,561,293 students

Lecture description

Understand what ZooKeeper is and learn the ZooKeeper basics

Learn more from the full course

Apache Kafka Series - Kafka Cluster Setup & Administration

Hands-On Training on ZooKeeper Quorum Setup, Kafka Cluster Setup and Administration in AWS.

04:05:30 of on-demand video • Updated June 2024

Set up a ZooKeeper and Kafka cluster on three machines in AWS
Learn how to deploy Kafka in production and understand the target architecture for clusters in AWS
Set up a ZooKeeper cluster, and learn its role for Kafka and its usage
Set up Kafka in cluster mode with 3 brokers, including configuration, usage and maintenance
Shut down and recover Kafka brokers to overcome common Kafka broker problems
Configure a Kafka cluster with production settings and optimisations for better performance based on your workload
Set up web administration tools using Docker: ZooNavigator, Kafka Manager, Confluent Schema Registry, Confluent REST Proxy, Landoop Kafka Topics UI
Administer Kafka using Kafka Manager
So here's where the fun actually begins. We're going to set up ZooKeeper in quorum mode, so it's going to be a multi-server deployment, and we're going to discuss configuration, etcetera. But before we get started and set up ZooKeeper, let's try to understand what ZooKeeper is.
ZooKeeper is a pillar for so many distributed applications because it provides features that are awesome. The first one is distributed configuration management, so it can manage the configuration of distributed systems. It can also provide election and consensus. That means if many servers ask, "Hey, who's the leader? Who's the chief?", ZooKeeper says, "You are," and that's it. You'd think that would be easy, but it's really tough, and ZooKeeper does this really well. It also does coordination and locks. That's more low level, but it's good to know. And finally, in Kafka's case, it acts as a key-value store, so it can store a lot of configuration for topics, for brokers, etcetera. We're going to see this when we do a deep dive.
It's not just Kafka: ZooKeeper is also used by Hadoop and other big data systems. It's an Apache project that's stellar. It's very stable, and it hasn't had any major release in many years because it's so stable. So if you're wondering, "Hey, what version of ZooKeeper are we going to use?", the 3.4 branch has been stable for years and years, and that's the one we're going to use. You could say, "Hey, there's 3.5 as well. Should we use it?" 3.5 is going to be awesome. It has many new features that I can't wait to use, but it's not ready yet. It's been in development for many, many years and it's still in beta, so I do not recommend that you use ZooKeeper 3.5 at all. Kafka is not ready for it, I'm not ready for it, and you're not ready for it. So again, for this tutorial, we're going to use ZooKeeper 3.4, and that's going to work out just fine. Now let's look at what ZooKeeper is.
We all know what a file system looks like. There is the slash right here, and that means root. That's the root of your file system, like in Linux, and then we can have folders in the file system. In ZooKeeper you have that same concept, where your root has a node, and we'll call that node app. It could be whatever, but we'll just call it app. And again, if app is a directory, we can have multiple subdirectories or files: we can have /app/finance or /app/sales. So ZooKeeper is really like a file system. There are a few differences, but it looks like one, and each of these things is a node.
So let's talk about terminology a little bit. ZooKeeper has an internal structure like a tree. What you see on the left is called a tree: we have the root at the very top, then you have branches, and at the very bottom you have leaves, or nodes. Each node is called a znode. And why "z"? Well, because ZooKeeper, with a Z. Each znode has a path. So in the blue boxes, for example, in the middle one the path is /app, and in the bottom one it's /app/finance.
Each znode can be persistent or ephemeral. So what's the difference? A persistent znode is something that will stay alive all the time. It's set in stone; ZooKeeper will remember it all the time. An ephemeral znode is a znode that will just go away if your app disconnects. Each of them has advantages, and Kafka uses both. Not for you to worry about, but it's good to know that both exist. Each znode can store multiple child znodes or it can store some data. And you cannot rename a znode. It's impossible. You could copy some, but you cannot rename them.
Finally, and this is super awesome and one of the best features of ZooKeeper, clients can watch znodes. When you watch a znode, you watch it for changes.
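To make the terminology above a bit more concrete, here is a rough sketch in Python of a toy, in-memory znode tree. It is not the ZooKeeper client API and it talks to no server: the `ToyZooKeeper` class, its method names, the `client-1` session id and the `/app/worker-1` path are all made up for illustration. It only models the ideas from the slides: paths, persistent versus ephemeral znodes, no rename, and watches. One assumption to flag is that a watch is modelled as firing once and only telling the client that something changed, so the client re-reads the value itself.

```python
# Toy in-memory model of a znode tree. This is NOT the ZooKeeper client API;
# every class and method name here is hypothetical.
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ZNode:
    data: bytes = b""
    # Session id of the owning client if ephemeral, None if persistent.
    owner: Optional[str] = None
    # Callbacks waiting to be told that this znode changed.
    watchers: List[Callable[[str], None]] = field(default_factory=list)


class ToyZooKeeper:
    def __init__(self) -> None:
        # The root znode "/" always exists.
        self.nodes: Dict[str, ZNode] = {"/": ZNode()}

    def create(self, path: str, data: bytes = b"",
               session: Optional[str] = None) -> None:
        """Create a znode; passing a session id makes it ephemeral."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = ZNode(data=data, owner=session)

    def get(self, path: str) -> bytes:
        return self.nodes[path].data

    def children(self, path: str) -> List[str]:
        """Return the full paths of the direct children of a znode."""
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self.nodes
                      if p != "/" and p.startswith(prefix)
                      and "/" not in p[len(prefix):])

    def watch(self, path: str, callback: Callable[[str], None]) -> None:
        self.nodes[path].watchers.append(callback)

    def set(self, path: str, data: bytes) -> None:
        node = self.nodes[path]
        node.data = data
        # Assumption: watches are one-shot, so clear them before notifying.
        fired, node.watchers = node.watchers, []
        for callback in fired:
            callback(path)  # only says "it changed"; the client re-reads

    def close_session(self, session: str) -> None:
        """Simulate a client disconnecting: its ephemeral znodes go away."""
        for path in [p for p, n in self.nodes.items() if n.owner == session]:
            del self.nodes[path]

    # Deliberately no rename(): znodes cannot be renamed.


zk = ToyZooKeeper()
zk.create("/app")
zk.create("/app/finance", b"v1")                   # persistent
zk.create("/app/sales", b"v1")                     # persistent
zk.create("/app/worker-1", session="client-1")     # ephemeral

seen = []
zk.watch("/app/finance", lambda path: seen.append((path, zk.get(path))))
zk.set("/app/finance", b"v2")   # watcher is notified and re-reads the value
zk.set("/app/finance", b"v3")   # no notification: the watch already fired

zk.close_session("client-1")    # the ephemeral znode disappears
print(zk.children("/app"))      # ['/app/finance', '/app/sales']
print(seen)                     # [('/app/finance', b'v2')]
```

In this sketch the persistent znodes survive the simulated disconnect while `/app/worker-1` does not, and the watcher hears about the first change only. A real client would re-register its watch if it wanted to keep being notified.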
So say my /app/finance changes. I can watch it, and when it changes, ZooKeeper will let me know: "Hey, it has changed. You should check it out." Then I can get the updated value, and that's awesome.
So ZooKeeper is very light in features. It's a very, very minimal project, but what it does, it does really, really well. That's a high-level overview of ZooKeeper. Kafka uses all the features described here. Not for you to worry about, but it's really good for you to know about these things. So hopefully you have a better understanding of what ZooKeeper is. We'll get to play with ZooKeeper in the next lectures, but hopefully these two slides really helped you out. All right. See you in the next lecture.