Azure Data Factory for Beginners - Build Data Ingestion
What you'll learn
- Azure Data Factory
- Azure Blob Storage
- Azure Data Lake Storage Gen2
- Azure Data Factory Pipelines
- Data Engineering Concepts
- Data Lake Concepts
- Metadata Driven Frameworks Concepts
- An Industry Example of How to Build Ingestion Frameworks
- Dynamic Azure Data Factory Pipelines
- Email Notifications with Logic Apps
- Tracking of Pipelines and Batch Runs
- Version Management with Azure DevOps
Requirements
- Basic PC / Laptop
Description
The main objective of this course is to help you learn the Data Engineering techniques of building Metadata-Driven frameworks with Azure Data Engineering tools such as Data Factory, Azure SQL, and others.
Building frameworks is now an industry norm, and knowing how to visualize, design, plan, and implement data frameworks has become an important skill.
The framework that we are going to build together is referred to as the Metadata-Driven Ingestion Framework.
Data ingestion into the data lake from disparate source systems is a key requirement for a company that aspires to be data-driven, and finding a common way to ingest the data is both desirable and necessary.
Metadata-Driven Frameworks allow a company to develop the system just once and have it adopted and reused by various business clusters without the need for additional development, saving the business time and costs. Think of it as a plug-and-play system.
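To make the idea concrete, here is a minimal sketch of the kind of metadata table such a framework is driven by. The table and column names are illustrative assumptions, not the exact schema built in the course.

```sql
-- Hypothetical metadata table: each row describes one source-to-lake feed.
-- The ingestion pipeline reads these rows at runtime instead of
-- hard-coding each source, which is what makes the framework plug-and-play.
CREATE TABLE dbo.SourceConfiguration
(
    SourceConfigurationId INT IDENTITY(1,1) PRIMARY KEY,
    SourceSystem          NVARCHAR(100) NOT NULL, -- e.g. 'CRM', 'Billing'
    SourceContainer       NVARCHAR(200) NOT NULL, -- Blob Storage container holding the files
    SourceFolder          NVARCHAR(400) NOT NULL,
    TargetContainer       NVARCHAR(200) NOT NULL, -- Data Lake Storage Gen2 container
    TargetFolder          NVARCHAR(400) NOT NULL,
    IsEnabled             BIT NOT NULL DEFAULT 1  -- switch a feed on or off without redeploying
);
```

Onboarding a new source then becomes a matter of inserting a row rather than developing a new pipeline.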
The first objective of the course is to onboard you onto the Azure Data Factory platform and help you assemble your first Azure Data Factory pipeline. Once you get a good grip on the Azure Data Factory development pattern, it becomes easier to adopt the same pattern to onboard other sources and data sinks.
Once you are comfortable with building a basic Azure Data Factory pipeline, the second objective is to build a fully fledged, working metadata-driven framework to make the ingestion more dynamic. Furthermore, we will build the framework in such a way that you can audit every batch orchestration and every individual pipeline run for business intelligence and operational monitoring.
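As a rough illustration of what such auditing can look like, the sketch below logs one row per batch and one row per individual pipeline run. The schema is an assumption for illustration, not the course's exact design.

```sql
-- Hypothetical audit tables: one row per batch orchestration,
-- and one row per pipeline run within that batch.
CREATE TABLE dbo.BatchRun
(
    BatchRunId   INT IDENTITY(1,1) PRIMARY KEY,
    BatchName    NVARCHAR(100) NOT NULL,
    StartTimeUtc DATETIME2    NOT NULL DEFAULT SYSUTCDATETIME(),
    EndTimeUtc   DATETIME2    NULL,
    Status       NVARCHAR(20) NOT NULL DEFAULT 'Running' -- Running / Succeeded / Failed
);

CREATE TABLE dbo.PipelineRun
(
    PipelineRunId INT IDENTITY(1,1) PRIMARY KEY,
    BatchRunId    INT NOT NULL REFERENCES dbo.BatchRun (BatchRunId),
    PipelineName  NVARCHAR(200) NOT NULL,
    RowsCopied    BIGINT       NULL,
    StartTimeUtc  DATETIME2    NOT NULL DEFAULT SYSUTCDATETIME(),
    EndTimeUtc    DATETIME2    NULL,
    Status        NVARCHAR(20) NOT NULL DEFAULT 'Running'
);
```

Queries over tables like these are what feed the business intelligence and operational monitoring mentioned above.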
Creating your first Pipeline
What will be covered is as follows:
1. Introduction to Azure Data Factory
2. Unpack the requirements and technical architecture
3. Create an Azure Data Factory Resource
4. Create an Azure Blob Storage account
5. Create an Azure Data Lake Storage Gen2 account
6. Learn how to use the Storage Explorer
7. Create Your First Azure Data Factory Pipeline
Metadata Driven Ingestion
1. Unpack the theory on Metadata Driven Ingestion
2. Describing the High-Level Plan for building the framework
3. Creation of a dedicated Active Directory User and assigning appropriate permissions
4. Using Azure Data Studio
5. Creation of the Metadata Driven Database (Tables and T-SQL Stored Procedure)
6. Applying business naming conventions
7. Creating an email notifications strategy
8. Creation of Reusable utility pipelines
9. Develop a mechanism to log data for every data ingestion pipeline run, as well as for the batch itself
10. Creation of a dynamic data ingestion pipeline
11. Apply the orchestration pipeline
12. Explanation of T-SQL Stored Procedures for the Ingestion Engine (a sketch of one such procedure follows this list)
13. Creating an Azure DevOps Repository for the Data Factory Pipelines
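To give a flavour of item 12, here is a hypothetical example of the kind of T-SQL stored procedure an ingestion engine can call when a pipeline run finishes. It assumes the PipelineRun audit table sketched earlier; the names and parameters are illustrative, not the course's actual code.

```sql
-- Hypothetical logging procedure: a Stored Procedure activity at the end of
-- the dynamic ingestion pipeline could call this, passing values taken from
-- Data Factory system variables and the Copy activity output.
CREATE OR ALTER PROCEDURE dbo.usp_CompletePipelineRun
    @PipelineRunId INT,
    @Status        NVARCHAR(20),  -- 'Succeeded' or 'Failed'
    @RowsCopied    BIGINT = NULL
AS
BEGIN
    SET NOCOUNT ON;

    -- Close off the audit row that was opened when the run started.
    UPDATE dbo.PipelineRun
    SET Status     = @Status,
        RowsCopied = @RowsCopied,
        EndTimeUtc = SYSUTCDATETIME()
    WHERE PipelineRunId = @PipelineRunId;
END;
```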
Event-Driven Ingestion
1. Enabling the Event Grid Provider
2. Use the Get Metadata Activity
3. Use the Filter Activity
4. Create Event-Based Triggers
5. Create and Merge new DevOps Branches
Who this course is for:
- Aspiring Data Engineers
- Developers who are curious about Azure Data Factory as an ETL alternative
Instructor
I'm a Data Management professional who is driven by the power and influence of data in our lives. With the power of data, I have helped companies become more data-driven to gain a competitive edge or meet regulatory requirements.
Over the last 15 years, I have had the pleasure of designing and implementing data warehousing solutions in the Retail, Telco, and Banking industries, and more recently in big data lake-specific implementations.
I've had the pleasure of being a lead, and of leading teams, to implement the above-mentioned strategies. In my spare time, I teach programming online as a YouTuber, as I am passionate about technology.