Apache Kafka - Spark - Cassandra Real Time Streaming Project

Name: Apache Kafka - Spark - Cassandra Real Time Streaming Project
Rating: 4.2 (35 reviews)

A real-time data streaming project that involves building the architecture from scratch and deploying it.

Created by Sonal Saxena

Last updated 3/2023

English

What you'll learn

Professionals looking for an end-to-end Kafka, Spark, and Cassandra streaming pipeline.
Anyone who wants to understand how to use Apache Kafka in their current architecture.
Engineers who are looking to design a Kafka solution pipeline.
A complete case study from start to execution.

Course content

3 sections • 14 lectures • 32m total length

Introduction: Project_Overview • 0:52
Documents • 0:38
Ubuntu_Machine_Creation • 1:53
Putty_Installation • 1:14
Java_Installation • 1:45
Install Java on Ubuntu first: run sudo apt-get update, install Java, and verify with java -version. Then proceed with installing Kafka, Hadoop, Spark, and Cassandra for an end-to-end streaming pipeline.
Kafka_Setup • 4:35
Install and configure Apache Kafka with ZooKeeper to enable real-time streaming pipelines: create directories, download and extract Kafka, update the config with the machine's IPv4 address, then start ZooKeeper and Kafka and test the setup.
Hadoop_Installation • 3:02
SPARK SCALA Installation • 2:25
Cassandra_Installation • 2:54
Install Apache Cassandra, a NoSQL database, on Ubuntu by adding the repository, installing apt-transport-https, importing the GPG key, updating the package lists, and enabling the service. Then connect to Cassandra and query it with CQL, its SQL-like query language (typically through the cqlsh shell).
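
The Kafka_Setup lecture summary mentions updating the config with an IPv4 address but does not say which setting is changed. One plausible reading, and it is only an assumption here, is the advertised.listeners entry in Kafka's config/server.properties, which ships commented out in the stock file. A minimal Python sketch of that edit follows; the function name, the sample address 10.0.0.5, and the default port 9092 are illustrative, not taken from the course.

```python
import re


def set_advertised_listener(properties_text: str, ipv4: str, port: int = 9092) -> str:
    """Return server.properties text with advertised.listeners set to ipv4:port.

    Assumption: the course's "update config with IPv4 address" step refers to
    this key. The helper itself is hypothetical and not part of Kafka.
    """
    new_line = f"advertised.listeners=PLAINTEXT://{ipv4}:{port}"
    # Match the key whether it is active or commented out with a leading '#'.
    pattern = re.compile(r"^#?[ \t]*advertised\.listeners=.*$", re.MULTILINE)
    if pattern.search(properties_text):
        # Replace only the first occurrence; a lambda avoids escape handling.
        return pattern.sub(lambda _match: new_line, properties_text, count=1)
    # Key not present: append it on its own line.
    needs_newline = bool(properties_text) and not properties_text.endswith("\n")
    return properties_text + ("\n" if needs_newline else "") + new_line + "\n"


if __name__ == "__main__":
    sample = (
        "broker.id=0\n"
        "#advertised.listeners=PLAINTEXT://your.host.name:9092\n"
        "log.dirs=/tmp/kafka-logs\n"
    )
    print(set_advertised_listener(sample, "10.0.0.5"))
```

In practice the text would be read from and written back to config/server.properties before starting ZooKeeper and then the Kafka broker.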

Requirements

Basic understanding of Kafka architecture, Spark, and Cassandra or SQL

Description

The course is designed to provide a comprehensive understanding of real-time big data processing using Kafka, Spark, and Cassandra. In today's world, data is produced at an unprecedented rate, and the ability to process and analyze this data in real-time is critical for making informed decisions. This course focuses on the fundamental concepts and architecture of Kafka, Spark, and Cassandra, and how they work together to create a robust big data processing pipeline.

Students will learn how to set up Kafka clusters and work with Kafka producers and consumers. Students will also learn about Kafka Streams, a client library for building real-time streaming applications that process data directly within Kafka.

Throughout the course, students will gain hands-on experience through practical exercises and projects that simulate real-world scenarios. By the end of the course, students will have an understanding of how to use Kafka, Spark, and Cassandra to build real-time big data processing systems.

Course Objectives:

Understand the fundamental concepts of real-time big data processing
Learn the architecture setup of Kafka, Spark, and Cassandra
Understand how Kafka, Spark, and Cassandra work together to create a real-time big data processing pipeline
Gain hands-on experience with Kafka, Spark, and Cassandra through practical exercises and projects
Learn how to build a real-time big data processing pipeline from scratch
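
To make the data flow in these objectives concrete, here is a rough, library-free Python sketch of the usual division of roles: Kafka buffers incoming events, Spark processes them in small batches, and Cassandra stores the results. Everything in it is a stand-in (an in-memory deque for a topic, a Counter for a Spark job, a dict for a Cassandra table), and the event field "user" is hypothetical. It does not show the course's actual code or any real client API.

```python
from collections import Counter, deque


def produce(topic: deque, events) -> None:
    """Stand-in for a Kafka producer: append events to an in-memory 'topic'."""
    topic.extend(events)


def consume_batch(topic: deque, max_records: int) -> list:
    """Stand-in for a consumer poll: take up to max_records from the topic.

    Simplification: a real Kafka consumer tracks offsets and leaves records
    in the log; here records are simply removed from the queue.
    """
    batch = []
    while topic and len(batch) < max_records:
        batch.append(topic.popleft())
    return batch


def process(batch: list) -> Counter:
    """Stand-in for a Spark micro-batch job: count events per user."""
    return Counter(event["user"] for event in batch)


def upsert(table: dict, counts: Counter) -> None:
    """Stand-in for a Cassandra write: add each count to the row keyed by user."""
    for user, n in counts.items():
        table[user] = table.get(user, 0) + n


def run_pipeline(events, batch_size: int = 2) -> dict:
    """Push events through produce -> consume -> process -> store."""
    topic, table = deque(), {}
    produce(topic, events)
    while topic:
        upsert(table, process(consume_batch(topic, batch_size)))
    return table


if __name__ == "__main__":
    clicks = [{"user": "a"}, {"user": "b"}, {"user": "a"}]
    print(run_pipeline(clicks))
```

The point of the sketch is the separation of stages: each function could be swapped for a real producer, a Spark streaming job, or a Cassandra write without changing the others.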

This course is intended for software engineers, data engineers, and data analysts who have a basic understanding of programming concepts and are familiar with SQL.

Who this course is for:

Engineers looking for a CASE STUDY on Real Time Data Streaming involving KAFKA and SPARK

Course content

Infrastructure Setup for the Pipeline • 9 lectures • 19min

Pipeline Setup For KAFKA SPARK Real Time Streaming • 3 lectures • 8min

FINAL Pipeline Execution End to End (KAFKA + SPARK + CASSANDRA) • 2 lectures • 6min
