Google Dataflow with Apache Beam - Beginner to Pro course
Rating: 4.4 out of 5 (18 ratings)
131 students

Master Google Dataflow with hands-on projects | Apache Beam basics to advanced streaming & batch data pipelines
Created by Saidhul Shaik
Last updated 7/2025
English

What you'll learn

  • Understand what Google Cloud Dataflow is and how it enables scalable data processing
  • Learn the Apache Beam programming model, with PCollections and PTransforms
  • Build end-to-end ETL pipelines for both batch and streaming data
  • Use Google Pub/Sub for real-time data ingestion and understand its architecture
  • Implement template-based pipelines for reusability and automation

Course content

1 section • 9 lectures • 5h 35m total length
  • Material and Datasets (0:02)

    Please download.

  • Course Introduction (3:37)
  • What is Dataflow - Apache Beam Introduction - How it is different from Dataproc (33:31)
  • Workbench Creation - Beam Basics - Extract data from Multiple Data Sources (23:49)
  • How to write Data to Multiple Sinks (49:41)
  • Apache Beam Transformations (1:05:16)
  • Pipeline Creation: Template Method - Case Study 1 (49:48)
  • Batch Pipeline Creation: Custom Code - Case Study 2 (50:04)
  • Streaming Pipeline Creation with Pub/Sub: Custom Code - Case Study 3 (59:52)

Requirements

  • Basic understanding of Python
  • Familiarity with GCP is helpful but not mandatory
  • A willingness to learn hands-on and solve real-world challenges

Description

Are you looking to master Google Dataflow and Apache Beam to build scalable, production-ready data pipelines on Google Cloud Platform (GCP)? Whether you're a data engineer, cloud enthusiast, or aspiring GCP professional, this course takes you from zero to an advanced level through hands-on labs, real-world case studies, and practical assignments.

What You'll Learn

  • Understand the fundamentals of Google Cloud Dataflow and how it fits in the data engineering ecosystem

  • Explore the Apache Beam framework – the programming model behind Dataflow

    • Learn core concepts like PCollections and PTransforms

  • Differentiate Dataflow vs Dataproc and when to use each

  • Set up your own Cloud Workbench environment for hands-on practice

  • Build real-world ETL pipelines (Extract, Transform, Load) using Apache Beam

  • Use Google Pub/Sub for real-time data ingestion and understand its architecture

  • Develop pipelines using both:

    • Template-based method

      • Case Study 1: Template-driven pipeline

    • Custom code approach

      • Case Study 2: End-to-end batch pipeline

      • Case Study 3: End-to-end streaming pipeline

  • Complete hands-on assignments to reinforce learning and prepare for real-world scenarios

Hands-On Labs Include:

  • Beam Basics with Python/Java SDK

  • ETL development on Dataflow

  • Streaming pipeline using Pub/Sub

  • Batch pipeline using Cloud Storage

  • Debugging, monitoring, and optimizing pipeline performance

  • End-to-end pipeline creation from scratch

Who this course is for:

  • Beginners who want to learn and master Google Dataflow and Apache Beam
  • Aspiring Data Engineers and GCP enthusiasts