
Explore Apache Airflow use cases, including ETL orchestration, cron job management, and data movement across services. Discover its support for ML pipelines and automation, plus streaming limitations.
Explore how the Airflow scheduler monitors DAGs, triggers runs by start date and schedule interval, uses an OrderedDict queue, and supports the Sequential, Local, Celery, and Kubernetes executors.
Explore the graph view in Apache Airflow's web server to visualize a DAG's task dependencies and task states for a specific DAG run, with configurable layout and filtering.
View the code that generates a DAG in real time from GitHub to gain context, debug specific cases, and leverage the admin panel for deeper insights.
Explore Airflow configuration by examining sections like core, webserver, scheduler, and kubernetes, and note that changes take effect only after restarting the web server or scheduler.
Learn the DAG concept and its core parameters—start_date, dag_id, and schedule_interval—and how cron-based timing governs execution. Explore max_active_runs, catchup, default_args, and how tasks are defined and sequenced within a DAG.
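To make the timing parameters above concrete, here is a minimal sketch of how `start_date`, a fixed schedule interval, and `catchup` interact. It is a toy model in plain Python and does not import Airflow; the function `due_run_dates` and the example dates are hypothetical. It assumes the commonly described behavior that a run covering an interval is triggered once that interval has ended, so the first run fires at `start_date` plus one interval.

```python
from datetime import datetime, timedelta
from typing import List


def due_run_dates(
    start_date: datetime,
    interval: timedelta,
    now: datetime,
    catchup: bool,
) -> List[datetime]:
    """Return the logical dates of runs that are due at `now` (toy model).

    Hypothetical helper for illustration only; this is not Airflow's scheduler code.
    """
    dates = []
    logical = start_date
    # A run for the interval starting at `logical` is due once that interval has ended.
    while logical + interval <= now:
        dates.append(logical)
        logical += interval
    # Without catchup, only the most recent completed interval gets a run.
    if not catchup:
        return dates[-1:]
    return dates


start = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 12)
daily = timedelta(days=1)

all_runs = due_run_dates(start, daily, now, catchup=True)
latest_only = due_run_dates(start, daily, now, catchup=False)
print(all_runs)     # three completed daily intervals: Jan 1, Jan 2, Jan 3
print(latest_only)  # only the most recent one: Jan 3
```

With `catchup=True` every missed interval since `start_date` gets its own run, which is why a start date far in the past can produce a burst of runs when a DAG is first enabled.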
Discover Airflow Variables as a global key-value store for configuration settings, accessible via the admin panel, definable in code with the Variable class, and supporting CRUD operations.
Install and run Docker on Mac using Docker Desktop, establishing Docker as part of the developer tech stack and preparing to run PostgreSQL in the next lecture.
Explore how to set up a local Kubernetes environment for Apache Airflow using minikube on a Mac, including installing VirtualBox, kubectl, and starting a single-node cluster.
Create a dedicated code folder in your home directory for the Airflow project, and set the AIRFLOW_HOME environment variable to that location to initialize the Airflow setup.
Create your first DAG with an operator, a sensor, and a plugin to grasp Airflow's core components; learn to define a simple element and extend it with resource-specific logic.
Extend BaseOperator to create your first operator, implement an execute function that logs a parameter, decorate with apply_defaults, and wire the operator into a DAG and plugin.
Learn to build a simple Airflow sensor by extending BaseSensorOperator, implement a poke function with a 30-second poke_interval, and attach it to a DAG to gate a downstream task.
Implement XCom in Airflow by pushing a minute value from a sensor to XCom and pulling it in an operator using task_instance.xcom_push and xcom_pull, then verify via DAG run logs.
Explore dynamic DAGs with Apache Airflow using the BranchPythonOperator to conditionally branch execution based on a Python callable, selecting one of multiple pipelines and skipping others.
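The branching idea above can be sketched with the callable alone. In Airflow, a function like this would be passed as `python_callable` to a `BranchPythonOperator` and would return the `task_id` of the branch to follow. The example below is plain Python with no Airflow import; the task ids `even_pipeline` and `odd_pipeline`, the even/odd rule, and the `simulate_branch` helper are assumptions made for illustration.

```python
from typing import Dict

# Hypothetical downstream task ids, one per pipeline.
BRANCHES = ("even_pipeline", "odd_pipeline")


def choose_pipeline(minute: int) -> str:
    """Return the task_id of the single branch that should run."""
    return "even_pipeline" if minute % 2 == 0 else "odd_pipeline"


def simulate_branch(minute: int) -> Dict[str, str]:
    """Toy stand-in for the operator: mark the chosen branch 'run', the rest 'skipped'."""
    chosen = choose_pipeline(minute)
    return {
        task_id: ("run" if task_id == chosen else "skipped")
        for task_id in BRANCHES
    }


print(simulate_branch(10))  # {'even_pipeline': 'run', 'odd_pipeline': 'skipped'}
print(simulate_branch(7))   # {'even_pipeline': 'skipped', 'odd_pipeline': 'run'}
```

In a real DAG the callable would usually read its input from the task context or from XCom rather than taking it as a direct argument; the direct argument here just keeps the sketch self-contained.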
Build an Apache Airflow Docker image from a Dockerfile by defining a base image, Airflow home, dependencies, and an entrypoint to run webserver or scheduler.
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. In this course we will start by covering some basic concepts of Apache Airflow, from the main components (web server and scheduler) to internal components like DAG, Plugin, Operator, Sensor, Hook, XCom, Variable, and Connection.
Later in the course I will teach you more advanced topics like branching, metrics, performance and log monitoring, and Airflow's REST API. I will also help you build your development environment with just one click using Docker and Docker Compose.
Why stop here? After all this, we will create a Kubernetes cluster on Amazon Web Services and deploy our application there!
Finally, I will share some advanced tips for turning your simple Airflow project into a production-ready system.