Apache Airflow using Google Cloud Composer: Introduction
What you'll learn
- Understand how Airflow automates task workflows
- Airflow architecture: on-premises (local install), cloud, single-node, and multi-node deployments
- How to use Airflow connections to integrate with different systems and automate data pipelines
- What Google Cloud BigQuery is and, briefly, how it can be used in data warehousing as well as in an Airflow DAG
- Master core functionality such as DAGs, Operators, and Tasks through hands-on demonstrations
- Understand advanced features such as XCom, branching, and SubDAGs through hands-on demonstrations (see the sketch after this list)
- Get an overview of SLAs and the Kubernetes executor in Apache Airflow
- The Python DAG source files (9 .py files) used in the demonstrations are available for download so students can practice
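To give a feel for what these building blocks look like in practice, here is a minimal sketch (not one of the 9 course files) of a DAG that pushes a value to XCom and uses a BranchPythonOperator to pick the next task. Task ids, values, and the schedule are illustrative only and assume an Airflow 2.x environment such as the one Cloud Composer provides.

```python
# Illustrative sketch: DAG, tasks built from operators, an XCom push/pull,
# and a branch that chooses between two downstream tasks.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import BranchPythonOperator, PythonOperator


def push_value(**context):
    # Push a value to XCom for downstream tasks to read.
    context["ti"].xcom_push(key="row_count", value=42)


def choose_branch(**context):
    # Pull the XCom value and return the task_id that should run next.
    row_count = context["ti"].xcom_pull(task_ids="push_value", key="row_count")
    return "process_rows" if row_count > 0 else "skip_processing"


with DAG(
    dag_id="xcom_branching_sketch",      # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    push = PythonOperator(task_id="push_value", python_callable=push_value)
    branch = BranchPythonOperator(task_id="choose_branch", python_callable=choose_branch)
    process = BashOperator(task_id="process_rows", bash_command="echo processing")
    skip = BashOperator(task_id="skip_processing", bash_command="echo nothing to do")

    # Only one of the two final tasks runs, depending on the branch decision.
    push >> branch >> [process, skip]
```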
Requirements
- A Google Cloud Platform account (a free trial account works) - no local installation required
- A good understanding of Python and some exposure to bash shell scripting will help.
Description
Apache Airflow is an open-source platform to programmatically author, schedule and monitor workflows.
Cloud Composer is a fully managed workflow orchestration service that empowers you to author, schedule, and monitor pipelines that span across clouds and on-premises data centers. Built on the popular Apache Airflow open source project and operated using the Python programming language, Cloud Composer is free from lock-in and easy to use.
Because Apache Airflow is hosted in the cloud (Google Cloud Composer), learners can focus on Apache Airflow's product functionality and learn quickly, without the hassle of installing Apache Airflow locally on a machine.
Cloud Composer pipelines are configured as directed acyclic graphs (DAGs) using Python, making it easy for users of any experience level to author and schedule a workflow. One-click deployment yields instant access to a rich library of connectors and multiple graphical representations of your workflow in action, increasing pipeline reliability by making troubleshooting easy.
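As an illustration of such a connector-backed pipeline, below is a minimal sketch that runs a BigQuery query from a DAG. It assumes an existing Cloud Composer environment, the built-in google_cloud_default connection with access to BigQuery, and a hypothetical project, dataset, and table; the operator comes from the Google provider package that ships with Composer.

```python
# Illustrative sketch: a single-task DAG that submits a BigQuery query job
# through an Airflow connection.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="bigquery_connection_sketch",   # hypothetical DAG name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Authenticates against BigQuery via the named Airflow connection.
    count_rows = BigQueryInsertJobOperator(
        task_id="count_rows",
        gcp_conn_id="google_cloud_default",
        configuration={
            "query": {
                # my_project.my_dataset.my_table is a placeholder.
                "query": "SELECT COUNT(*) FROM `my_project.my_dataset.my_table`",
                "useLegacySql": False,
            }
        },
    )
```

To deploy a file like this, it is copied into the environment's DAGs bucket (shown in the Composer console), for example with gsutil cp, and Airflow picks it up automatically.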
This course is designed with beginners in mind, that is, first-time users of Cloud Composer / Apache Airflow. Each topic starts with a presentation of the concepts and is followed by a hands-on demonstration to reinforce the understanding.
The Python DAG source files (9 Python files) used in the demonstrations are available for download for further practice.
Happy learning!!!
Who this course is for:
- People interested in Data warehousing, Big data, Data engineering
- People interested in Automated tools for task workflow scheduling
- Students interested in learning about Airflow
- Professionals who wish to explore how Apache Airflow can be used for task scheduling and building data pipelines
Instructor
The instructor has over 25 years of IT industry experience (product development, consulting, and training).
His most recent engagement was with Oracle India (initially in consulting and then with Oracle University), where he spent more than 12 years. Prior to that, he worked with Capgemini (formerly iGate Global Solutions) and GE, to name a few.
He has managed projects and programs in Enterprise Resource Planning and Business Intelligence implementations in the range of 3,000 man-days, with revenue of about US$6 million per year. These projects covered industry domains such as Oil and Gas, Process Manufacturing, Hi-Tech, Retail, and Telecom across the globe.
He was awarded the "Pace Setter" and "Best Managed Project" awards in recognition of his project management efforts.
He was also instrumental in the recent design, development, and roll-out of a graduate-hire program on the Oracle product stack for partner IT services companies.
From his college days he was fascinated by how devices and sensors interact with computers, which led him to focus on microcontrollers in the early 90s and eventually drew him toward the Internet of Things and, these days, Machine Learning and AI.
Drawing on this experience, he has crafted a variety of Udemy courses on project management, the Internet of Things, ERP & BI, database modeling and design, SQL, Linux, and, most importantly, cloud infrastructure. He plans to create more, as his passion is acquiring and sharing knowledge.
He is certified as:
Oracle Financials Business Process Certified Foundations Associate
Oracle Cloud Infrastructure 2021 Architect Professional
Oracle Cloud Infrastructure 2021 Architect Associate
Oracle Cloud Infrastructure 2021 Cloud Operations Associate
Oracle Cloud Infrastructure Security 2021 Certified Associate
Oracle Autonomous Database Cloud 2021 Certified Specialist
Oracle Machine Learning using Autonomous Database 2021 Certified Specialist
He was also recently certified in Python and IoT by NPTEL (Govt. of India).
He holds a degree in engineering (Computer Science) and an MBA, and has been a certified PMP since 2007.