Apache Airflow Bootcamp: Hands-On Workflow Automation
Rating: 3.9 out of 5 (95 ratings)
633 students
Last updated 4/2025
English

What you'll learn

  • Understand what Apache Airflow is, what it is for, and the pros and cons of using it
  • Step-by-step guide to installing Airflow
  • Launch and navigate the Airflow Web UI and learn about various views: DAG, Grid, Graph, Calendar, Task Duration, Code, Variable and Gantt View
  • Understand what a DAG is, how to create a DAG definition file, and the different methods for DAG creation
  • Learn about DAG Runs, default_args, and DAG arguments, and master scheduling concepts such as depends_on_past, wait_for_downstream, catchup, and backfill
  • Use the Airflow CLI for various operations and access a handy cheatsheet for quick reference
  • Understand tasks and task instances, and learn the lifecycle of a task
  • Master different operators including BashOperator, PostgresOperator, PythonOperator, SqliteOperator, and EmailOperator
  • Implement sensors like FileSensor, SqlSensor, TimeDeltaSensor, and TimeSensor
  • Apply branching logic with BranchSQLOperator, BranchPythonOperator, BranchDayOfWeekOperator, BranchDateTimeOperator, and ShortCircuitOperator
  • Manage DAG dependencies with TaskGroups, TriggerDagRunOperator, and ExternalTaskSensor, and use hooks such as PostgresHook and S3Hook
  • Manage resources with pools and task priorities
  • Learn about two types of executors, SequentialExecutor and LocalExecutor, and how to transition from SequentialExecutor to LocalExecutor
  • Explore the Airflow metadata database, manage roles, and create users with different roles, including admin, public, user, and operator
  • Set and manage task-level and DAG-level SLAs and handle SLA misses
  • Address issues like zombie tasks, SIGTERM, and SIGKILL errors

Course content

20 sections, 123 lectures, 6h 24m total length
  • Introduction to Airflow (2:11)

    Explore Apache Airflow, an open source platform to author, schedule, and monitor workflows, with DAGs, dynamic pipelines, and extensible operators for scalable data orchestration.

  • Pros and cons of using Airflow (2:48)
  • Airflow Architecture (3:48)
  • Common terminology in Airflow (5:47)

Requirements

  • Knowledge of Python
  • Familiarity with command-line interfaces
  • Understanding of database concepts is a plus but not required
  • Everything else is covered in the course, which teaches Airflow from scratch

Description

Hello and welcome to the Apache Airflow Bootcamp: Hands-On Workflow Automation with Practical Examples!

Throughout my career, I’ve built and managed countless workflows using Apache Airflow, and I’m excited to share my knowledge with you.

This course is designed to take you from a complete beginner to a confident user of Apache Airflow. We’ll cover everything from installation to advanced features, and you'll get hands-on experience through practical examples and real-world projects.

What's included in the course?

Introduction to Airflow

  • Understanding the purpose and benefits of using Apache Airflow.

  • Pros and cons of adopting Airflow in your projects.

Airflow Architecture

  • A detailed look into the components that make up Airflow.

  • Key terminology used in Airflow.

Configuration and Installation

  • Step-by-step guide to installing Airflow.

  • The role and configuration of the airflow.cfg file.

Airflow Web UI Views

  • Launching and navigating the Airflow Web UI.

  • DAG View

  • Grid View

  • Graph View

  • Calendar View

  • Task Duration View

  • Code View

  • Variable View

  • Gantt View

DAGs (Directed Acyclic Graphs)

  • What is a DAG?

  • Creating a DAG definition file.

  • Different methods for DAG creation.

  • Understanding DAG Runs, default_args, and DAG arguments.

  • Using parameters in DAGs and passing parameters through TriggerDagRunOperator.

  • Scheduling concepts including depends_on_past, wait_for_downstream, catchup, and backfill.
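
To make the "directed acyclic graph" idea in this section concrete, here is a minimal sketch in plain Python. It does not use Airflow itself: the task names are hypothetical, and the standard-library `graphlib` module stands in for the ordering logic a workflow scheduler needs. It shows only that tasks run after the tasks they depend on, and that a cycle makes the graph invalid.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the dependency graph contains no cycle, i.e. it is a valid DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)  # ['extract', 'transform', 'load', 'notify']
```

A graph such as `{"a": {"b"}, "b": {"a"}}` fails the `is_acyclic` check, which is why a workflow with circular dependencies cannot be scheduled.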

Airflow CLI and Cheatsheet

  • Utilizing the Airflow CLI for various operations.

  • Handy cheatsheet for quick reference.

Tasks in Airflow

  • What are tasks and task instances?

  • The lifecycle of a task.

Operators in Airflow

  • Detailed exploration of operators including BashOperator, PostgresOperator, PythonOperator, SqliteOperator, and EmailOperator.

Sensors

  • Using sensors like FileSensor, SqlSensor, TimeDeltaSensor, and TimeSensor.

Branching

  • Implementing branching logic with BranchSQLOperator, BranchPythonOperator, BranchDayOfWeekOperator, BranchDateTimeOperator, and ShortCircuitOperator.

DAG Dependencies and TaskGroups

  • Managing DAG dependencies and using TaskGroups.

  • Using TriggerDagRunOperator and ExternalTaskSensor.

Hooks

  • Understanding and using hooks such as PostgresHook and S3Hook.

Resource Management

  • Managing resources with pools and task priorities.
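
The interaction between pools and priorities can be sketched with a simplified model in plain Python. This is an illustration under stated assumptions, not Airflow's actual scheduler: the function `schedule`, the task names, and the idea of fixed "rounds" are all hypothetical. It captures only the core behavior that a pool caps how many tasks run at once and that tasks with a higher priority weight are picked first.

```python
import heapq


def schedule(tasks, pool_slots):
    """Group tasks into rounds of at most pool_slots tasks, highest priority first.

    tasks maps a task name to its priority weight (higher runs earlier).
    """
    if pool_slots < 1:
        raise ValueError("pool_slots must be at least 1")
    # heapq is a min-heap, so negate each weight to pop the highest priority first.
    heap = [(-weight, name) for name, weight in tasks.items()]
    heapq.heapify(heap)
    rounds = []
    while heap:
        batch_size = min(pool_slots, len(heap))
        rounds.append([heapq.heappop(heap)[1] for _ in range(batch_size)])
    return rounds


# Hypothetical tasks competing for a pool with two slots.
rounds = schedule({"load_sales": 10, "load_logs": 1, "refresh_report": 5}, pool_slots=2)
print(rounds)  # [['load_sales', 'refresh_report'], ['load_logs']]
```

In a real deployment tasks finish at different times, so slots free up individually rather than in rounds; the sketch trades that detail for readability.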

Executors in Airflow

  • Different types of executors: SequentialExecutor and LocalExecutor.

  • Transitioning from SequentialExecutor to LocalExecutor.

Airflow Metadata Database and Roles

  • Understanding the Airflow metadata database.

  • Managing roles: creating users with different roles, including admin, public, user, and operator roles.

  • Creating custom roles and modifying existing ones.

SLA (Service Level Agreement)

  • Setting and managing task-level and DAG-level SLAs.

  • Handling SLA misses.

Advanced Concepts

  • Using XComs for inter-task communication.

  • Retrieving context parameters and using callback functions.

  • Dealing with zombie tasks, SIGTERM, and SIGKILL errors.

I believe that mastering workflow automation with Airflow can open up incredible opportunities in the field of data engineering. I’ve seen firsthand how it can transform the way we handle data, and I can’t wait to see what you’ll achieve with these skills.

So, whether you’re looking to advance your career, build more efficient data pipelines, or are just curious about Airflow, you’re in the right place. Let’s dive in and start creating some amazing workflows together. Are you ready? Let’s get started!

I wish you great success!

Who this course is for:

  • Data Engineers: Those responsible for building and managing data pipelines can benefit greatly from learning Apache Airflow.
  • Data Scientists: Those who work with large datasets can use Apache Airflow to automate repetitive tasks such as data preprocessing, model training, and evaluation.
  • DevOps Engineers: Those responsible for managing and automating infrastructure can use Apache Airflow to automate deployment processes, monitor system health, and trigger actions based on predefined conditions.
  • Software Developers: Those who build and maintain applications can use Apache Airflow to automate tasks such as data ingestion, data processing, and workflow orchestration.