Apache Airflow on AWS EKS: The Hands-On Guide
What you'll learn
- How to Set Up a Production-Ready Architecture for Airflow on AWS EKS from A to Z
- How to deploy DAGs from Git (public and private)
- How to Create CI/CD Pipelines with AWS CodePipeline to Deploy DAGs
- How to Share DAGs and Store Logs with AWS EFS
- How to Enable Remote Logging with AWS S3 in EKS
- How to Test your DAGs in CI/CD pipelines
- How to Store Sensitive Data in AWS Secrets Manager
Requirements
- A good knowledge of Apache Airflow
- An intermediate knowledge of AWS
- An intermediate knowledge of Docker and Kubernetes
- Note: the AWS services used are NOT free-tier eligible
Struggling to set up Airflow on AWS EKS?
You are at the right place!
With more than 15,000 students, I have received a lot of feedback about how difficult it is to configure Airflow on AWS with the official Helm chart.
Guess what? You are about to learn everything you need to set up a production-ready architecture for Apache Airflow on AWS EKS.
This course is designed to guide you through the different steps of creating a real-world architecture:
Configuring the EKS cluster following best practices
Automatically deploying changes with GitOps
Using Helm to configure and set up Airflow on Kubernetes
Configuring the official Airflow Helm chart to use the Kubernetes Executor and many other features
Deploying DAGs in Airflow with Git-Sync and AWS EFS
Deploying DAGs/Airflow through CI/CD pipelines with AWS CodePipeline
Testing your DAGs automatically
Securing your credentials and sensitive data in a Secret Backend
Enabling remote logging with AWS S3
Creating three environments: dev, staging, and prod
Making the production environment scalable and highly available
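As a small taste of the configuration work covered in the course, here is a minimal sketch of an override file for the official Apache Airflow Helm chart, wiring together the Kubernetes Executor, Git-Sync DAG deployment, and S3 remote logging. The repository URL, bucket name, and connection ID below are placeholders, not the course's actual values:

```yaml
# Sketch of a values override for the official apache-airflow Helm chart.
# Repo URL, bucket name, and connection ID are placeholders.
executor: KubernetesExecutor

dags:
  gitSync:
    enabled: true
    repo: https://github.com/example/airflow-dags.git  # placeholder repo
    branch: main
    subPath: dags

config:
  logging:
    remote_logging: "True"
    remote_base_log_folder: s3://example-airflow-logs  # placeholder bucket
    remote_log_conn_id: aws_default
```

Such a file would typically be applied with `helm upgrade --install airflow apache-airflow/airflow -f values.yaml -n airflow`; the course walks through each of these settings (and many more) in detail.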
This course is not meant to teach the basics of Airflow; you must already be familiar with it.
If you already know Kubernetes, Docker, and AWS, your learning will be easier, but no worries: I explain everything you need.
YOU WON'T learn how to interact with AWS in your DAGs. This course is about designing an architecture, not about writing DAGs.
The course is NOT free-tier eligible, as we are going to use many AWS services and set up a real-world architecture.
Who this course is for:
- Data engineers
- Software Engineers
My name is Marc Lamberti, I'm 27 years old, and I'm very happy to have piqued your curiosity! I currently work full-time as a Big Data Engineer for the biggest online bank in France, which serves more than 1,500,000 clients. For more than three years, I have built ETL pipelines to address the problems a bank encounters every day: a platform that monitors the information system in real time to detect anomalies and reduce the number of client calls, a tool that detects suspicious transactions and potential fraudsters in real time, an ETL that loads massive amounts of data into Cassandra, and so on.
The biggest challenge as a Big Data Engineer is dealing with the growing number of available open-source tools. You have to know how to use them, when to use them, and how they connect to each other in order to build robust, secure, and performant systems that solve your underlying business needs.
I strongly believe that the best way to learn a new skill is to take a hands-on approach, with just enough theory to explain the concepts and a big dose of practice to be ready for a production environment. That's why in each of my courses you will always find practical examples paired with theoretical explanations.
Have a great learning time!