Spark Streaming - Stream Processing in Lakehouse - PySpark
What you'll learn
- Real-time Stream Processing Concepts
- Spark Structured Streaming APIs and Architecture
- Working with Streaming Sources and Sinks
- Kafka for Data Engineers
- Working With Kafka Source and Integrating Spark with Kafka
- Stateless and Stateful Streaming Transformations
- Windowed Aggregates using Spark Streaming
- Watermarking and State Cleanup
- Streaming Joins and Aggregation
- Handling Memory Problems with Streaming Joins
- Working with Azure Databricks
- Capstone Project - Streaming application in Lakehouse
Requirements
- Spark Fundamentals and exposure to Spark Dataframe APIs
- Programming Knowledge Using Python Programming Language
Description
About the Course
This course, Apache Spark and Databricks - Stream Processing in Lakehouse, uses the Python language and the PySpark API. It will help you understand real-time stream processing with Apache Spark and Databricks Cloud and apply that knowledge to build real-time stream processing solutions. The course is example-driven and follows a working-session approach: we code live and explain each concept as it is needed.
Capstone Project
This course also includes an end-to-end Capstone project. The project will help you understand real-life project design, coding, implementation, testing, and the CI/CD approach.
Who should take this Course?
I designed this course for software engineers who want to develop real-time stream processing pipelines and applications using Apache Spark. It is also for data architects and data engineers who are responsible for designing and building the organization’s data-centric infrastructure. A third audience is managers and architects who do not implement Spark themselves but work with the people implementing Apache Spark at the ground level.
Spark Version used in the Course
This course uses Apache Spark 3.5. I have tested all the source code and examples used in this course on Azure Databricks Cloud using Databricks Runtime 14.1.
Who this course is for:
- Software Engineers and Architects who want to design and develop Big Data Engineering projects using Apache Spark and Databricks Cloud
- Programmers and developers who aspire to grow and learn Data Engineering using Apache Spark and Databricks Cloud
Instructors
Prashant Kumar Pandey is passionate about helping people learn and grow in their careers by bridging the gap between their existing and required skills. In pursuit of this mission, he authors books, publishes technical articles, and creates training videos to help IT professionals and students succeed in the industry.
With over 25 years of experience in IT as a developer, architect, consultant, trainer, and mentor, he has worked with international software services organisations on various data-centric, Big Data, and AI projects.
Prashant is a firm believer in lifelong continuous learning and skill development. To popularise this concept, he started publishing free training videos on his YouTube channel and conceptualised the idea of creating a Journal of his learning under the banner of Learning Journal.
He is the founder, lead author, and chief editor of the ScholarNest portal, which has offered various skill development courses, training, and technical articles since the beginning of 2018.
Learning Journal is a small team of people passionate about helping others learn and grow in their careers by bridging the gap between their existing and required skills. In pursuit of this mission, we author books, publish technical articles, and create training videos to help IT professionals and students succeed in the industry.
Together we have over 40 years of experience in IT as developers, architects, consultants, trainers, and mentors. We have worked with international software services organizations on various data-centric and Big Data projects.
Learning Journal is a team of firm believers in lifelong continuous learning and skill development. To popularize the importance of lifelong continuous learning, we started publishing free training videos on our YouTube channel and began keeping a journal of our own learning under the Learning Journal banner.
We have authored various skill development courses, training, and technical articles since the beginning of 2018.