What is Kafka Streams?

A free video tutorial from Stephane Maarek | AWS Certified Cloud Practitioner,Solutions Architect,Developer
Best Selling Instructor, Kafka Guru, 9x AWS Certified
4.7 instructor rating • 41 courses • 931,808 students

Lecture description

Learn what Kafka Streams is at a high level

Learn more from the full course

Apache Kafka Series - Kafka Streams for Data Processing

Learn the Kafka Streams API with Hands-On Examples, Learn Exactly Once, Build and Deploy Apps with Java 8

04:48:46 of on-demand video • Updated July 2021

  • Write four Kafka Streams applications in Java 8
  • Configure Kafka Streams to use Exactly Once Semantics
  • Scale Kafka Streams applications
  • Program with the High Level DSL of Kafka Streams
  • Build and package your application
  • Write tests for your Kafka Streams Topology
  • And so much more!
English [Auto] Hi, and welcome to this special new course in the Kafka Series. This course is on Kafka Streams. I'm glad you've joined me on this course, so get ready: it's going to be lots of learning. So first, of course, the introduction. We're going to get started with Kafka Streams. We're going to see how to run a first application, see what Kafka Streams is, understand how it fits in the ecosystem, and so on. With this brief introduction, I really hope you can get a takeaway from it and understand what we're going to do for the rest of the course.

So the first question is: what is Kafka Streams? Kafka Streams is an easy data processing and transformation library within Kafka. It ships with the Kafka binaries. It's within the Kafka project, so it's not an external library created by a third party. So here you have Kafka, and you can create Kafka Streams applications of any kind. It could be to transform data, to enrich data, or to perform, for example, fraud detection or monitoring and alerting. So there are lots of possible applications. The idea is that Kafka Streams is a library that sits on top of Kafka and that you build your application on.

So what is Kafka Streams, really? It's a standard Java application. It's just a Java library, and you launch it like any Java application, as we'll see during the course. You don't need to create a separate cluster for a Kafka Streams application like you would for Spark, Flink, or NiFi, and I'll have a lecture that goes over the differences. But the easy thing is that it's just a Java application, no cluster. It's highly scalable, elastic, and fault tolerant, because it inherits every benefit that Kafka provides, since it's integrated with Kafka. And that makes it really, really awesome. It has exactly-once capabilities, and there is a section in this course about exactly what this means, but this is the first library in the world that provides streaming exactly-once capabilities tied in with Kafka. And that's a huge thing in the streaming world.

It processes records one at a time, so there's no batching. This is true streaming; some other libraries, like Spark Streaming, process things in batches. And it works for any application size. So whether you have a small project or a very, very large project, you write the same code, you get the same application, and it scales the same way. So it's really awesome.

So let's look at the architecture design. You've seen this slide if you looked at my Kafka Connect course, but let's go over it again. You have a Kafka cluster, and it has several brokers, in this case four, but it can be from one to one hundred or whatever you want. And you have your sources, and usually the way you onboard sources in the perfect Kafka architecture design is that you have a Connect cluster. If you don't know what a Connect cluster is, I recommend you look at my Connect course. You can find a link in the last lecture of this course. So you have your sources, and your Connect cluster basically onboards them onto Kafka. Now your data is in Kafka and you want to process it. That's where you have your Streams applications. Your Streams applications basically sit on the right-hand side and go from Kafka to Kafka. And that's really cool, because all the data processing and all the data transformation is tightly integrated with Kafka. Finally, you want to expose this transformed data to your sinks, for example a database, Elasticsearch, or whatever. You can use your Connect cluster for this, and this is all described in my Connect course. So in the Connect course you saw the left-hand side, and in this course we're really going to see the right-hand side: data transformation and processing using Kafka Streams.
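The contrast drawn above between record-at-a-time processing and batching can be illustrated with a small plain-Java toy. This is only a sketch of the idea: it does not use the Kafka Streams or Spark APIs, and every name in it (`RecordAtATimeDemo`, `perRecord`, `microBatch`, `flush`) is hypothetical. It logs when each record is read and when its result is emitted, so the difference in ordering is visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Plain-Java illustration only; none of these names are Kafka Streams API.
class RecordAtATimeDemo {

    // Per-record processing: each input is transformed and emitted
    // immediately, so every "read" is directly followed by its "emit".
    static List<String> perRecord(List<String> input, Function<String, String> fn) {
        List<String> log = new ArrayList<>();
        for (String record : input) {
            log.add("read " + record);
            log.add("emit " + fn.apply(record));
        }
        return log;
    }

    // Micro-batch processing: inputs are buffered until batchSize records
    // have arrived (or the input ends), then the whole buffer is emitted.
    static List<String> microBatch(List<String> input, int batchSize,
                                   Function<String, String> fn) {
        List<String> log = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        for (String record : input) {
            log.add("read " + record);
            buffer.add(record);
            if (buffer.size() == batchSize) {
                flush(buffer, fn, log);
            }
        }
        flush(buffer, fn, log); // emit any leftover partial batch
        return log;
    }

    // Transforms and emits everything currently buffered, then empties the buffer.
    private static void flush(List<String> buffer, Function<String, String> fn,
                              List<String> log) {
        for (String record : buffer) {
            log.add("emit " + fn.apply(record));
        }
        buffer.clear();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(perRecord(input, String::toUpperCase));
        System.out.println(microBatch(input, 2, String::toUpperCase));
    }
}
```

In the per-record log, the result for `a` appears before `b` is even read; in the micro-batch log with a batch size of 2, the result for `a` has to wait until `b` has arrived.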
So, a bit of history about Kafka Streams. This API was introduced as part of Kafka 0.10, which was sometime in 2016, and it became fully mature as part of Kafka 0.11, which was June 2017. So this is a really new library. Again, the API can change, and I know it will change. But what you are learning here is still very applicable in case of any changes. As I said before, it is the only library that can leverage the new exactly-once capability from Kafka 0.11, and I have a whole section on this. And it is a serious contender to other stream processing frameworks such as Spark, Flink, NiFi, or any other streaming library. So it's really, really good to learn it, and I'm glad you're taking the journey with me. And then finally, as I said, it's a new library, so it's prone to changes. So don't be afraid if things change in the future: what you need to learn is the ideas behind it. As for the API, all the changes should be somewhat minor in the future.
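As a minimal sketch of how the exactly-once capability mentioned above is usually switched on: it is a configuration setting rather than application code. The example below only builds the `java.util.Properties` object, using the Kafka Streams configuration keys `application.id`, `bootstrap.servers`, and `processing.guarantee`. The class and method names, the application id `my-streams-app`, and the broker address `localhost:9092` are placeholders made up for illustration. The value `exactly_once` is the one introduced with Kafka 0.11; Kafka 3.0 and later deprecate it in favor of `exactly_once_v2`, so check the documentation for the version you run.

```java
import java.util.Properties;

// Sketch only: builds the configuration a Kafka Streams app would be started with.
// Class name, method name, and argument values are hypothetical.
class ExactlyOnceConfigSketch {

    static Properties buildConfig(String applicationId, String bootstrapServers) {
        Properties props = new Properties();
        // Unique id of the Streams application.
        props.put("application.id", applicationId);
        // Kafka broker(s) to connect to.
        props.put("bootstrap.servers", bootstrapServers);
        // The default is "at_least_once"; this opts in to exactly-once processing.
        props.put("processing.guarantee", "exactly_once");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildConfig("my-streams-app", "localhost:9092");
        System.out.println(props.getProperty("processing.guarantee"));
    }
}
```

In a real application, these properties would be handed to the Kafka Streams client together with the processing topology when the application starts.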