Udemy
30-Day Money-Back Guarantee

Apache Airflow: The Hands-On Guide

Master Apache Airflow from A to Z. Hands-on videos on Airflow with AWS, Kubernetes, Docker and more
Bestseller
Rating: 4.6 out of 5 (840 ratings)
6,682 students
Created by Marc Lamberti
Last updated 12/2020
English

What you'll learn

  • Coding Production Grade Data pipelines by Mastering Airflow through Hands-on Examples
  • How to Follow Best Practices with Apache Airflow
  • How to Scale Airflow with the Local, Celery and Kubernetes Executors
  • How to Set Up Monitoring with Elasticsearch and Grafana
  • How to Secure Airflow with authentication, crypto and the RBAC UI
  • Core and Advanced Concepts with Pros and Limitations
  • Mastering DAGs with timezones, unit testing, backfill and catchup
  • Organising the DAG folder and keeping things clean
Curated for the Udemy for Business collection

Course content

11 sections • 111 lectures • 13h 0m total length

  • Preview
    01:19
  • Preview
    01:59
  • Preview
    01:13
  • Preview
    01:11
  • A word for you
    00:37

  • Preview
    00:22
  • Why Airflow?
    02:22
  • What is Airflow?
    03:15
  • How does Airflow work?
    03:32
  • [Practice] Installing Airflow
    21:53
  • [Practice] Quick tour of Airflow UI
    09:00
  • [Practice] Quick tour of Airflow CLI
    03:55
  • [Practice] Controlling your DAGs with the CLI
    07:58

  • Introduction
    01:00
  • Docker reminder
    05:57
  • Troubleshoot Docker performance on macOS
    03:18
  • Project: The Forex Data Pipeline
    03:37
  • What is a DAG?
    05:54
  • [Practice] Defining your first DAG
    09:59
  • What is an Operator?
    06:01
  • Preview
    12:27
  • [Practice] Checking if the currency file is available - FileSensor
    09:01
  • [Practice] Downloading the forex rates from the API - PythonOperator
    05:38
  • [Practice] Saving the forex rates in the HDFS - BashOperator
    06:50
  • [Practice] Creating the Hive table forex_rates - HiveOperator
    06:45
  • [Practice] Processing the forex rates with Spark - SparkSubmitOperator
    08:01
  • [Practice] Sending an email notification - EmailOperator
    08:12
  • [Practice] Sending a Slack notification - SlackAPIPostOperator
    09:32
  • Operator Relationships and Bitshift Composition
    02:24
  • [Practice] Adding dependencies between tasks
    02:20
  • [Practice] The Forex Data Pipeline in action!
    06:12
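
The bitshift composition mentioned in this section can be illustrated without Airflow installed. The sketch below is a toy model, not Airflow's actual code: the `Task` class and the task ids are hypothetical stand-ins, and only the idea that `a >> b` declares "b runs after a" is taken from Airflow itself.

```python
class Task:
    """Toy stand-in for an Airflow operator; NOT Airflow's real class."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that run after this one

    def set_downstream(self, other):
        self.downstream.append(other)
        return other

    def __rshift__(self, other):
        # `a >> b` means "b runs after a"; returning b lets chains continue.
        return self.set_downstream(other)

    def __lshift__(self, other):
        # `a << b` means "a runs after b".
        other.set_downstream(self)
        return other


# Hypothetical task ids, loosely named after the Forex Data Pipeline lectures.
check_file = Task("is_forex_file_available")
download = Task("downloading_rates")
save = Task("saving_rates")

# Evaluated left to right: (check_file >> download) >> save
check_file >> download >> save

print([t.task_id for t in check_file.downstream])
print([t.task_id for t in download.downstream])
```

Because each `>>` returns its right-hand operand, a whole linear pipeline can be declared on one line.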

  • Introduction
    00:49
  • Start_date and schedule_interval parameters demystified
    06:46
  • [Practice] Manipulating the start_date with schedule_interval
    11:03
  • Backfill and Catchup
    04:01
  • [Practice] Catching up non triggered DAGRuns
    14:58
  • Dealing with timezones in Airflow
    06:50
  • [Practice] Making your DAGs timezone aware
    13:54
  • How to make your tasks dependent
    03:57
  • [Practice] Creating task dependencies between DagRuns
    12:26
  • How to structure your DAG folder
    04:38
  • [Practice] Organizing your DAGs folder
    09:34
  • [Practice] How the Web Server works
    07:16
  • How to deal with failures in your DAGs
    04:19
  • [Practice] Retry and Alerting
    18:32
  • How to test your DAGs
    07:17
  • [Practice] Unit testing your DAGs
    14:11
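
The interplay between start_date, schedule_interval and catchup is easier to see with concrete dates. The sketch below is plain Python and does not import Airflow. It assumes the behaviour of the Airflow 1.10 releases this course was recorded against, where a DAG run is triggered once its interval has ended, so the first run fires at start_date + schedule_interval. The helper names `run_times` and `runs_to_create` are hypothetical.

```python
from datetime import datetime, timedelta


def run_times(start_date, interval, now):
    """Return (execution_date, trigger_time) pairs for every completed interval."""
    runs = []
    execution_date = start_date
    # A run is only triggered once its whole interval has elapsed.
    while execution_date + interval <= now:
        runs.append((execution_date, execution_date + interval))
        execution_date += interval
    return runs


def runs_to_create(all_runs, catchup):
    """With catchup, every missed run is created; without it, only the latest."""
    return all_runs if catchup else all_runs[-1:]


start = datetime(2021, 1, 1)
daily = timedelta(days=1)
now = datetime(2021, 1, 4, 12, 0)  # the DAG is switched on at this moment

runs = run_times(start, daily, now)
for execution_date, trigger_time in runs_to_create(runs, catchup=True):
    print(execution_date.date(), "triggered at", trigger_time)
```

With these dates, three intervals have completed, so catchup produces three runs, while disabling catchup leaves only the run for 2021-01-03.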

  • Introduction
    01:03
  • Sequential Executor with SQLite
    03:38
  • Local Executor with PostgreSQL
    07:17
  • [Practice] Executing tasks in parallel with the Local Executor
    18:35
  • [Practice] Ad Hoc Queries with the metadata database
    15:39
  • Scale out Apache Airflow with Celery Executors and Redis
    05:01
  • [Practice] Set up the Airflow cluster with Celery Executors and Docker
    07:01
  • [Practice] Distributing your tasks with the Celery Executor
    11:15
  • [Practice] Adding new worker nodes with the Celery Executor
    20:59
  • [Practice] Sending tasks to a specific worker with Queues
    12:44
  • [Practice] Pools and priority_weights: Limiting parallelism - prioritizing tasks
    11:18
  • Kubernetes Reminder
    07:00
  • Scaling Airflow with Kubernetes Executors
    05:16
  • [Practice] Set up a 3-node Kubernetes Cluster with Vagrant and Rancher
    10:51
  • [Practice] Installing Airflow with Rancher and the Kubernetes Executor
    09:55
  • [Practice] Running your DAGs with the Kubernetes Executor
    10:45

  • Introduction
    00:55
  • Preview
    02:36
  • [Practice] Grouping your tasks with SubDAGs and Deadlocks
    09:49
  • Making different paths in your DAGs with Branching
    03:10
  • [Practice] Make Your First Conditional Task Using Branching
    09:47
  • Trigger rules for your tasks
    04:38
  • [Practice] Changing how your tasks are triggered
    13:13
  • Avoid hard coding values with Variables, Macros and Templates
    04:40
  • [Practice] Templating your tasks
    18:32
  • How to share data between your tasks with XCOMs
    03:59
  • [Practice] Sharing (big?) data with XCOMs
    09:58
  • TriggerDagRunOperator or when your DAG controls another DAG
    02:17
  • [Practice] Trigger a DAG from another DAG
    05:24
  • Dependencies between your DAGs with the ExternalTaskSensor
    04:42
  • [Practice] Make your DAGs dependent with the ExternalTaskSensor
    03:47
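
Trigger rules decide whether a task fires based on the states of its upstream tasks. The sketch below is a simplified model in plain Python, not Airflow's implementation: the rule names match Airflow's, but the function `should_run` is hypothetical and it assumes every upstream task has already reached a final state.

```python
def should_run(rule, upstream_states):
    """Decide whether a task fires, given the final states of its upstream tasks."""
    states = list(upstream_states)
    failed = [s for s in states if s in ("failed", "upstream_failed")]
    succeeded = [s for s in states if s == "success"]
    if rule == "all_success":
        return len(succeeded) == len(states)
    if rule == "all_failed":
        return len(failed) == len(states)
    if rule == "all_done":
        return True  # every upstream task has finished, whatever the outcome
    if rule == "one_success":
        return len(succeeded) >= 1
    if rule == "one_failed":
        return len(failed) >= 1
    if rule == "none_failed":
        return len(failed) == 0
    raise ValueError(f"unknown trigger rule: {rule}")


# After a branch, one path succeeded and the other was skipped.
after_branch = ["success", "skipped"]
for rule in ("all_success", "none_failed", "one_success"):
    print(rule, should_run(rule, after_branch))
```

This shows why a task that joins two branches usually needs a rule other than the default all_success: the skipped branch prevents it from firing.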

  • Introduction
    01:28
  • Quick overview of AWS EKS
    03:45
  • [Practice] Set up an EC2 instance for Rancher
    08:17
  • [Practice] Create an IAM User with permissions
    02:34
  • [Practice] Create an ECR repository
    06:49
  • [Practice] Create an EKS cluster with Rancher
    06:21
  • How to access your applications from the outside
    04:19
  • [Practice] Deploy Nginx Ingress with Catalogs (Helm)
    04:56
  • [Practice] Deploy and run Airflow with the Kubernetes Executor on EKS
    05:21
  • [Practice] Cleaning your AWS services
    02:50

  • Introduction
    01:28
  • How the logging system works in Airflow
    03:43
  • [Practice] Setting up custom logging
    17:16
  • [Practice] Storing your logs in AWS S3
    14:40
  • Elasticsearch Reminder
    04:13
  • [Practice] Configuring Airflow with Elasticsearch
    18:08
  • [Practice] Monitoring your DAGs with Elasticsearch
    10:40
  • Introduction to metrics
    04:33
  • [Practice] Monitoring Airflow with TIG stack
    12:12
  • [Practice] Triggering alerts for Airflow with Grafana
    11:30
  • Airflow maintenance DAGs
    02:59

  • Introduction
    00:54
  • [Practice] Encrypting sensitive data with Fernet
    16:54
  • [Practice] Rotating the Fernet Key
    07:19
  • [Practice] Hiding variables
    03:24
  • [Practice] Password authentication and filter by owner
    09:38
  • [Practice] RBAC UI
    14:15

  • What to expect from Airflow 2.0?
    10:41

Requirements

  • Notions of Docker and Python
  • Virtual Box installed (Only for local Kubernetes cluster part)
  • Vagrant installed
  • The course "The Complete Hands-On Introduction to Apache Airflow" can be a nice plus.

Description

Apache Airflow is a platform created by the community to programmatically author, schedule and monitor workflows.

It is scalable, dynamic, extensible and modular.

Without a doubt, mastering Airflow is becoming a must-have, attractive skill for anyone working with data.

What you will learn in the course:

  • The fundamentals of Airflow are explained, such as what Airflow is and how the scheduler and the web server work.

  • The Forex Data Pipeline project is an incredible way to discover many operators in Airflow and to deal with Slack, Spark, Hadoop and more.

  • Mastering your DAGs is a top priority: you will work with timezones, unit test your DAGs, learn how to structure your DAG folder and much more.

  • Scaling Airflow through different executors such as the Local Executor, the Celery Executor and the Kubernetes Executor will be explained in detail. You will discover how to specialise your workers, how to add new workers and what happens when a node crashes.

  • A 3-node Kubernetes cluster will be set up locally with Rancher, Airflow and the Kubernetes Executor to run your data pipelines.

  • Advanced concepts will be shown through practical examples, such as templating your DAGs, making one DAG dependent on another, what SubDAGs and deadlocks are, and more.

  • You will set up a Kubernetes cluster in the cloud with AWS EKS and Rancher in order to use Airflow along with the Kubernetes Executor.

  • Monitoring Airflow is extremely important! That's why you will learn how to do it with Elasticsearch and Grafana.

  • Security will also be addressed in order to make your Airflow instance compliant with your company's requirements: specifying roles and permissions for your users with RBAC, restricting access to the Airflow UI with password authentication, data encryption and more.

In addition:

  • Many practical exercises are given throughout the course so that you will have opportunities to apply what you learn.

  • Best practices are stated when needed to give you the best ways of using Airflow.

  • Quizzes are available to assess your comprehension at the end of each section.

  • Answering your questions quickly is my top priority and I will do my best for you.

I put a lot of effort into giving you the best content and I hope you will enjoy it as much as I enjoyed making it.

At the end of the course you will be more confident than ever using Airflow.

I wish you great success!

Marc Lamberti

Who this course is for:

  • Data Engineers
  • Aspiring Data Engineers
  • DevOps
  • Software Engineers
  • Data Scientists

Featured review

Max Tarasishin
69 courses
26 reviews
Rating: 5.0 out of 5
7 months ago
Pros:
1. A lot of valuable information
2. The information will stay relevant for a long time
3. Extremely thorough and practical
4. Can be easily applied IRL
Cons:
1. Low code quality
2. There are some dubious and unsecure actions that I advise you against (like posting into the yopmail, copying keys for your GMail, etc)

Instructor

Marc Lamberti
Apache Airflow Expert, Big Data Engineer
  • 4.6 Instructor Rating
  • 3,277 Reviews
  • 17,575 Students
  • 3 Courses

Hi there,

My name is Marc Lamberti, I'm 27 years old, and I'm very happy to have aroused your curiosity! I currently work full-time as a Big Data Engineer for the biggest online bank in France, which serves more than 1,500,000 clients. For more than 3 years now, I have built various ETLs to address the problems a bank encounters every day, such as a platform that monitors the information system in real time to detect anomalies and reduce the number of client calls, a tool that detects suspicious transactions or potential fraudsters in real time, an ETL to get value out of massive amounts of data loaded into Cassandra, and so on.

The biggest challenge for a Big Data Engineer is dealing with the growing number of available open source tools. You have to know how to use them, when to use them and how they connect to each other in order to build robust, secure and performant systems that meet your underlying business needs.

I strongly believe that the best way to learn and understand a new skill is by taking a hands-on approach, with just enough theory to explain the concepts and a big dose of practice to be ready for a production environment. That's why in each of my courses you will always find practical examples paired with theoretical explanations.

Have a great learning time!

© 2021 Udemy, Inc.