[Hands-On] Setting up Prometheus + Kafka Broker 1

Stephane Maarek | AWS Certified Cloud Practitioner,Solutions Architect,Developer
Best Selling Instructor, Kafka Guru, 9x AWS Certified
4.7 instructor rating • 41 courses • 925,915 students

Apache Kafka Series - Kafka Monitoring & Operations

Kafka Monitoring Setup with Prometheus and Grafana, Kafka Operations and Kafka Cluster Upgrades Hands-On. Setup in AWS

05:04:58 of on-demand video • Updated July 2021

  • Setup a Multi Broker Kafka Cluster in no-time in AWS (using CloudFormation)
  • Setup Administration Tools such as Kafka Manager, ZooNavigator, LinkedIn's Kafka Monitor
  • Setup Monitoring using Grafana and Prometheus
  • Learn how to perform a safe and automated Roll Restart of Kafka Brokers
  • Update Brokers Configurations in a safe way
  • Rebalance Partitions in a Kafka Cluster
  • Increase and Decrease the Replication Factor of Topics
  • Add a Broker to a Kafka Cluster
  • Service and Replace a Broker in a Kafka Cluster
  • Remove a Broker in a Kafka Cluster
  • Install Command Line Interface (CLI) tools to automate workflows
  • Upgrade a Kafka Cluster with no downtime
OK, so we have quite a lot of things to do, so let's get started setting up Kafka with Prometheus. What we'll do is install the JMX exporter agent on our Kafka brokers, install Prometheus on the administration machine, and set it up as a systemd service. Then we'll verify in Prometheus that the Kafka data is indeed being pulled. And then you'll have to set up the two other brokers and ZooKeeper as an exercise, because I want you to work too. OK, so let's get started.

The first thing to look at is Prometheus itself. So what is Prometheus? If you go to its website, it says that it's a monitoring tool, "from metrics to insight", and it's an open-source monitoring solution. We'll download it later on. The first thing we want to do, though, is set up the Prometheus JMX exporter Java agent. Here's the GitHub project for it. It's basically a way to expose the JMX beans, so our JMX metrics from Kafka, via HTTP so that Prometheus can consume them. That's the general idea, and it's pretty simple. The README gives us the command to run: java with the -javaagent option, pointing to this jar and this config, and then our jar. So the first thing to do is download the jar from here onto our Kafka boxes, and then we'll have the metrics available. Let's do it step by step.

The first thing we want to do is SSH into our Kafka box, so I'll SSH into my first Kafka machine. Let's go to the instances in EC2 to get started. I'll take the public IP, which is right here, and run ssh with that public IP. OK, so we are in Kafka 1. If we do sudo systemctl status kafka, we can see that Kafka is running and everything is great.
OK, so all the code has again been written by me before the course, so we don't have to struggle much. What we need to do is create a prometheus directory, go into that directory, then download the jar they give us, and then the configuration. So let's just do this; we'll copy this line. The first thing is to make a directory and call it prometheus, OK? And in there we're going to download the Java agent jar from the GitHub page. And we're done.

Now, the second thing we need to do is set up the config. If we go here, as you can see, there's an example_configs folder, and within example_configs there's kafka-0-8-2.yml, and it's still very much valid right now. This is the kind of YAML file we want to use to export our metrics, so we also need to download that file. That's fairly easy: we can just run the right wget command. Oops, I'll copy this again and paste it, and here we go.

So now if we look in our directory... and actually I forgot to change into the right directory first, so I need to move these files into my prometheus directory. I'll move this one, and I'll move the next one into the prometheus directory. If we look at the parent directory, we have kafka, kafka_2.12, kafka.properties, and then prometheus. And if we go into prometheus, we can see that we have the JMX Java agent and the kafka-0-8-2.yml file.

Let's have a look at what kafka-0-8-2.yml is, so nano on that file. This is basically a configuration, and I wouldn't change it too much if I were you. It says to export the Kafka metrics whose names match the patterns in these rules, and the rules are written so that every Kafka metric gets exported. Just keep an eye on that file in the repository in case it changes. But it basically pulls everything from Kafka into a Prometheus format. So this looks good: we have our Java agent and we have our Kafka YAML.
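The download steps narrated above could look roughly like the sketch below. The agent version (0.3.1), the Maven Central URL, and the raw GitHub URL for the example config are assumptions that the lecture does not state, and the file may have moved in newer versions of the repository. The function name `download_jmx_exporter` is hypothetical. Using `wget -P` saves both files straight into the prometheus directory, so the "move the files afterwards" step from the recording is not needed.

```shell
# Sketch: fetch the JMX exporter Java agent and the example Kafka rules file.
# Version and URLs below are assumptions, not taken from the lecture.
AGENT_VERSION="${AGENT_VERSION:-0.3.1}"
PROM_DIR="${PROM_DIR:-$HOME/prometheus}"   # directory created in the lecture
AGENT_URL="https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/${AGENT_VERSION}/jmx_prometheus_javaagent-${AGENT_VERSION}.jar"
CONFIG_URL="https://raw.githubusercontent.com/prometheus/jmx_exporter/master/example_configs/kafka-0-8-2.yml"

download_jmx_exporter() {
  mkdir -p "$PROM_DIR" || return 1
  # -P (--directory-prefix) writes each download into the target directory.
  wget -P "$PROM_DIR" "$AGENT_URL" || return 1
  wget -P "$PROM_DIR" "$CONFIG_URL" || return 1
  # List the result so both files can be checked by eye.
  ls -l "$PROM_DIR"
}
```

Nothing runs until `download_jmx_exporter` is called on the broker; it needs outbound internet access from that machine.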
What we need to do now is change our Kafka service definition so that it uses the Java agent. Again, just a reminder: if we go to the README and its Java agent section, it says we need to run with this -javaagent option, pointing to the JMX Prometheus Java agent jar, the port, and the config YAML. For this, we're going to have to edit our systemd unit file. So we'll do nano, with sudo because we need to be elevated, on /etc/systemd/system/kafka.service, and here we are.

We need to change this file to add the Java agent line. So right above ExecStart, we'll write Environment=, and there is this variable called KAFKA_OPTS that you can pass options to, and Kafka will automatically add them to the java command line. So we need to set KAFKA_OPTS with the line we need. Because I'm lazy and I already typed it, and just to make sure we have it right, I'll copy and paste it. Here we go.

Let's have a look at what this did. We have KAFKA_OPTS=-javaagent:, just like in the GitHub repository, then the full path of my Prometheus Java agent (if you didn't download it to the right path, it will error out), then =8080, which is the port the metrics are going to be exported on, so this is what we can query. And then we also pass in the configuration, which is kafka-0-8-2.yml, with its full path. So this looks correct. Now I exit and press Yes to save.

So now if I restart Kafka, things should work: sudo systemctl restart kafka. And it says that kafka.service changed on disk, run systemctl daemon-reload to reload units. So I'll do this: sudo systemctl daemon-reload. That's because we changed this file right here, kafka.service. I run this, and now I can restart Kafka properly. So I'm restarting Kafka, and this command may take a while. OK, it's done now.
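To make the shape of that Environment line concrete, here is a small sketch that assembles it from its three parts: jar path, port, and config path. The /home/ubuntu/prometheus location and the 0.3.1 jar file name are assumptions; use whatever full paths exist on your broker. The helper name `kafka_env_line` is hypothetical.

```shell
# Sketch: build the Environment= line that goes above ExecStart in
# /etc/systemd/system/kafka.service. Paths and jar version are assumptions.
PROM_DIR="${PROM_DIR:-/home/ubuntu/prometheus}"
AGENT_JAR="${PROM_DIR}/jmx_prometheus_javaagent-0.3.1.jar"
AGENT_CONFIG="${PROM_DIR}/kafka-0-8-2.yml"
AGENT_PORT=8080   # port the metrics are served on, as in the lecture

# JMX exporter agent syntax: -javaagent:<jar>=<port>:<config>
KAFKA_OPTS="-javaagent:${AGENT_JAR}=${AGENT_PORT}:${AGENT_CONFIG}"

kafka_env_line() {
  # systemd accepts a quoted VAR=value pair after Environment=
  printf 'Environment="KAFKA_OPTS=%s"\n' "$KAFKA_OPTS"
}

# Print the line so it can be pasted into the unit file.
kafka_env_line
```

After pasting the printed line into the unit file, the lecture's sequence applies: `sudo systemctl daemon-reload`, then `sudo systemctl restart kafka`.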
I can just do sudo systemctl status kafka, and Kafka is active and running. That means that hopefully this -javaagent option got taken into account; otherwise Kafka would have crashed, I guess. If we now do a curl on localhost:8080, we should see all the Kafka metrics. You see all these things? Those are all the Kafka metrics. And that's really good, because this is what Prometheus will get from Kafka every now and then. So in this step, we set up Kafka broker number 1 to have the JMX Java agent running directly in the service. This looks correct.

In the next step, we need to go and set up the administration box. Here on the right-hand side is going to be my administration box, and on the left-hand side is going to be my Kafka box. So let's go into the administration box so we can set up Prometheus there. Administration is right here, here's my public IP, that looks good. Now I'm going to SSH into it. OK, I am in my administration box.

One thing we can do already is see if we can access the metrics of the left-hand side from the right-hand side, so the Kafka 1 metrics from the administration box. For this we'll do a curl on the IP address of Kafka 1, an address in the 172.31 range in my case, on port 8080. And yeah, it works: we get all the Kafka metrics on the right-hand side, pulled from the machine on the left-hand side. So cool, this is working. We have connectivity between our administration box and our Kafka machine, and now we're ready to set up Prometheus on this machine.

So let's have a look at how to set up Prometheus. Let's just scroll down our little tutorial. It's pretty easy. On the Prometheus GitHub there is a releases page with downloads. So let's have a look: if we go to the Prometheus GitHub page, here we go.
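The two curl checks described above (from the broker itself, then from the administration box) can be wrapped in a small helper. This is only a sketch: the function name `count_kafka_metrics` is hypothetical, and the `/metrics` path follows the exporter's README, whereas the lecture curls the bare host and port.

```shell
# Sketch: query a broker's JMX exporter endpoint and count the Kafka samples.
count_kafka_metrics() {
  host="$1"
  port="${2:-8080}"   # port chosen in the KAFKA_OPTS line
  # -s hides the progress meter, -f makes curl fail on HTTP errors.
  # grep -c counts lines that start with "kafka_"; comment lines start with "#".
  curl -sf "http://${host}:${port}/metrics" | grep -c '^kafka_'
}
```

On the broker you would call `count_kafka_metrics localhost`; from the administration box, pass the broker's IP address instead. A count of 0 or a curl error suggests the agent is not loaded or the port is blocked.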
By the way, it has about 18,000 stars, so it's quite a popular project. If we go to Releases, we see that there is a version 2.3.2 at the time of recording, and it gives us Linux ARM64 or whatever build you need. For our machine, what we need is linux-amd64. So we're going to do a wget, here on our administration machine, OK, don't be mistaken. We download Prometheus, and it can take a bit of time because this is quite a big file. For me that's about 30 seconds, so I'll just pause.

OK, the file is now downloaded. If we look into our directory, we see the Kafka Monitor that we downloaded before, and now this Prometheus file. We need to extract this file, and for this we run the tar command with -xzf. Here we go, we run this, and now if we do ls, we see that we have a directory called prometheus-2.3.2.linux-amd64. We'll rename that directory, so I'll just move it to prometheus. Now if we do ls, we have the Kafka Monitor directory, the prometheus directory, and our archive from before, and we're ready to remove the archive. So let me just remove it. Here we go. So we have Kafka Monitor and prometheus, and if we look into the prometheus directory, we see that there is promtool, prometheus, prometheus.yml, et cetera. So this is good.

Now we need to look at this configuration file, prometheus.yml. So nano on prometheus.yml in the prometheus directory. OK, let's have a look. At the top there is a global configuration, and it says that every 15 seconds Prometheus is going to look for new data. So every 15 seconds it is going to query Kafka and say: Kafka, what do you have as new data? And as you scroll down, you can set up different jobs, in a section called scrape_configs.
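The download, extract, rename, and clean-up sequence from this part of the lecture might be scripted as follows. This is a sketch: version 2.3.2 is the one shown in the recording, but the GitHub release URL pattern is an assumption that is not read out in the lecture, and `install_prometheus` is a hypothetical name.

```shell
# Sketch: download, unpack and rename a Prometheus release in the current directory.
PROM_VERSION="${PROM_VERSION:-2.3.2}"   # version shown in the lecture
PROM_ARCHIVE="prometheus-${PROM_VERSION}.linux-amd64"
PROM_URL="https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/${PROM_ARCHIVE}.tar.gz"

install_prometheus() {
  wget "$PROM_URL" || return 1
  # -x extract, -z gunzip, -f read from the named file; creates ${PROM_ARCHIVE}/
  tar -xzf "${PROM_ARCHIVE}.tar.gz" || return 1
  # Rename to a shorter directory name, as done in the lecture.
  mv "$PROM_ARCHIVE" prometheus || return 1
  # The archive is no longer needed once extracted.
  rm "${PROM_ARCHIVE}.tar.gz"
  ls prometheus
}
```

Run `install_prometheus` from the home directory of the administration machine, not on the Kafka broker.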
A scrape config is basically a way to get metrics from wherever you want. As you can see here, there's a job named prometheus, which queries itself on localhost:9090, so Prometheus sees the metrics of Prometheus. What we need to do here is add a scrape config so that we can scrape Kafka. Easy enough, I have everything set up for you. If we go to the prometheus folder in the course code and open prometheus.yml, as you can see, I have added a scrape config with job_name kafka, then static_configs, targets, and here is the Kafka IP. I need to make sure that my Kafka IP is correct, because it's different here. From what I remember, it is an address in the 172.31 range. OK, so this is my Kafka 1. Change the IP for your use case.

I need to paste this scrape config, and even the global section, so I'll just copy this whole thing, and I'll actually remove the existing file. So we cd into prometheus, remove the prometheus.yml file, create a new prometheus.yml file, and paste. As we can see now, this job should have my Kafka broker as a target. Let's just verify that this is the right address. I'll exit and save, Yes. And if I do a curl again on this address, it works, we get the metrics. So the target address in the Prometheus configuration file is correct.

Now that we have this, we should just start Prometheus. For this it's just one command: ./prometheus. So let's start Prometheus. As you can see, it started really, really quickly, and it says: Server is ready to receive web requests. Now we have to go ahead and open Prometheus in our web browser. As you can see here, the listen address is 0.0.0.0:9090, so we have to go to port 9090. So now here I am in my web browser, and I will open a new page and go to port 9090.
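A configuration along the lines described above (the global 15-second interval, the built-in prometheus job, plus one kafka job) could be generated with the sketch below. The exact file from the course is not reproduced in the transcript, so treat the layout as an assumption; `write_prometheus_config` is a hypothetical helper, and the broker address is passed in rather than hard-coded.

```shell
# Sketch: write a prometheus.yml that scrapes Prometheus itself and one broker.
write_prometheus_config() {
  kafka_target="$1"                  # <kafka-1-ip>:8080, supplied by the caller
  out_file="${2:-prometheus.yml}"
  # Unquoted heredoc delimiter so ${kafka_target} is expanded.
  cat > "$out_file" <<EOF
global:
  scrape_interval: 15s   # how often Prometheus pulls from each target

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'kafka'
    static_configs:
      - targets: ['${kafka_target}']
EOF
}
```

From inside the prometheus directory, call it with your own broker address in place of the placeholder, for example `write_prometheus_config "<kafka-1-ip>:8080"`. The promtool binary that ships in the same directory can validate the result with `./promtool check config prometheus.yml` before you start the server.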
And if everything works... no, it doesn't work, obviously, because, as you may have guessed, I need to change the security group one more time. In the security group, we need to add an inbound rule: we'll add port 9090 from everywhere, and we'll call this Prometheus. Yes, that looks good. Any time you get an error such as "this site can't be reached", this is the kind of thing you need to look out for. OK, so let's refresh this page now. Copy the URL again, port 9090. And if everything works, in the web browser we should now see Prometheus.

So this is Prometheus. It doesn't look very, very good, I have to say, but it works really well. If you type any query right here starting with kafka, you get all the Kafka metrics. This is really nice. If we pick, for example, kafka_cluster_partition_replicascount and click Execute, we see all the values of this one particular metric. It's not very usable so far, right? But these are all the metrics from Kafka pulled by Prometheus, and you get the hint that it's working because all these metrics are available to us. This is the kind of stuff that will be displayed by Grafana.

So now we're almost done. Prometheus is indeed running, but we need to set it up as a service, just like before. So I'm going to stop Prometheus with Ctrl+C, and it says "See you next time!". This is nice. See you next time, Prometheus.

Next, what we have to do is go into this little directory prepared for you and set up Prometheus as a systemd service. For this we'll create a new systemd file, so sudo nano /etc/systemd/system/prometheus.service, and we need to fill something in. Just like before, everything is ready for you in the systemd folder, under prometheus.service, and we can copy this entire thing in there. This is provided directly by Prometheus. So here's the description.
It's a Prometheus server. Here's the documentation. We have the same After= as before, for when the network is online. The user that is going to be running this is the ubuntu user. And ExecStart for Prometheus is very easy: we run the prometheus command we just ran before, and we also pass the config file as an argument. So this looks good. Exit and save.

Let's just check right now: if I refresh this page, it says this site can't be reached. But now I'm going to do sudo service start prometheus... and obviously it's not service, it's systemctl, my bad. sudo systemctl start prometheus. And now we're going to do a status on Prometheus just to make sure that everything is working, and it says active (running). So this looks good. I refresh this page and we get our Prometheus available again. And if I run a query again, for instance the replicas count, it works. So we have our metrics, and everything is working in Prometheus.

So this looks nice so far. Just as a summary of what we've done: the JMX exporter is installed on the first Kafka broker, we've installed Prometheus on the administration machine as a service, and we've viewed in Prometheus that the Kafka data is indeed being pulled. And now, for the next lecture, as I said, you're going to have to set up the two other brokers and ZooKeeper. That's a fun exercise for you. All right, I will see you in the next lecture with a solution to that.
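For reference, a unit file matching the description in this lecture (description, documentation link, After= on the network, the ubuntu user, ExecStart with the config file) could be generated with the sketch below. The exact file from the course is not shown in the transcript. The /home/ubuntu/prometheus install path, the documentation URL, and the `--storage.tsdb.path` flag are assumptions added here; the storage flag is included so that the data directory does not depend on the service's working directory. `write_prometheus_unit` is a hypothetical helper name.

```shell
# Sketch: generate a systemd unit for Prometheus. Paths, user and the
# storage flag are assumptions, not the exact file from the course.
write_prometheus_unit() {
  out_file="${1:-prometheus.service}"
  prom_dir="${2:-/home/ubuntu/prometheus}"   # assumed install path
  run_user="${3:-ubuntu}"                    # user named in the lecture
  # Unquoted heredoc delimiter so the variables are expanded.
  cat > "$out_file" <<EOF
[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=${run_user}
ExecStart=${prom_dir}/prometheus --config.file=${prom_dir}/prometheus.yml --storage.tsdb.path=${prom_dir}/data

[Install]
WantedBy=multi-user.target
EOF
}
```

The generated file would then be copied to /etc/systemd/system/prometheus.service with sudo, followed by `sudo systemctl start prometheus` and `sudo systemctl status prometheus`, as in the lecture.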