
Set up the environment for an end-to-end Azure data engineering project: create a resource group and provision Azure Data Factory, Azure Databricks, Azure Key Vault, an Azure Storage account, and an Azure Synapse workspace.
Ingest on-premises data with Azure Data Factory, using a self-hosted integration runtime and a copy pipeline to stage tables in the Azure Data Lake bronze layer as Parquet files.
Build an Azure Data Factory pipeline that copies all on-premises SQL Server tables into the Data Lake bronze layer as Parquet files using Lookup and ForEach activities.
Launch an Azure Databricks workspace, create a data transformation cluster, and mount a data lake using credential passthrough to process bronze, silver, and gold containers.
Orchestrate level-one and level-two data transformations from bronze to silver and silver to gold using Databricks, PySpark, and Delta format, including date conversions and naming convention standardization.
Discover how to build an end-to-end Azure Data Factory pipeline that runs Databricks notebooks for bronze to silver and silver to gold transformations, with real-time monitoring and Delta Lake storage.
Load gold-layer data from Azure Data Lake into Azure Synapse Analytics using serverless SQL, create views from Delta tables, and automate the process with pipelines and stored procedures.
Connect Azure Synapse Analytics serverless SQL to Power BI to import all views from the gold_db, creating an interactive dashboard with modeled relationships and cross-filtered insights.
Learn to implement security and governance in a real-time data engineering project by creating an Azure Active Directory security group to grant resource group access using role assignments.
Test the end-to-end data pipeline by inserting a row into the on-premises SQL Server database, copying it to bronze via Azure Data Factory, transforming it to silver and gold, and loading it into Synapse for Power BI.
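As a rough illustration of the date conversions and naming convention standardization mentioned for the bronze-to-silver and silver-to-gold steps, here is a minimal plain-Python sketch. The function names (to_snake_case, to_date_string, standardize_row), the snake_case target convention, and the sample column names are assumptions made for illustration and are not taken from the course notebooks; in Databricks the same logic would typically be applied to DataFrame columns with PySpark rather than to Python dictionaries.

```python
import re
from datetime import datetime


def to_snake_case(name: str) -> str:
    """Convert a PascalCase/camelCase column name to snake_case (assumed convention)."""
    # Underscore between a lowercase letter or digit and the next uppercase letter.
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Underscore between an acronym and a following capitalized word.
    s = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", s)
    return s.lower()


def to_date_string(timestamp: str) -> str:
    """Reduce a 'YYYY-MM-DD HH:MM:SS' timestamp string to a 'YYYY-MM-DD' date string."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").date().isoformat()


def standardize_row(row: dict) -> dict:
    """Rename every key to snake_case and trim timestamps in columns ending in 'Date'."""
    cleaned = {}
    for column, value in row.items():
        if column.endswith("Date") and isinstance(value, str):
            value = to_date_string(value)
        cleaned[to_snake_case(column)] = value
    return cleaned


if __name__ == "__main__":
    # Hypothetical sample row, not data from the course.
    print(standardize_row({"CustomerID": 1, "ModifiedDate": "2008-06-01 00:00:00"}))
```

Keeping the renaming rule in one small function makes it easy to apply the same convention to every table in a loop instead of hand-editing column names per table.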
In this course, we build a complete end-to-end Azure data engineering project: a data platform that covers data ingestion, data transformation, data loading, and reporting.
The tools covered in this project are:
Azure Data Factory
Azure Data Lake Storage Gen2
Azure Databricks
Azure Synapse Analytics
Azure Key Vault
Microsoft Entra ID (previously called Azure Active Directory, AAD)
Microsoft Power BI
The use case for this project is an end-to-end solution that ingests tables from an on-premises SQL Server database using Azure Data Factory and stores the data in Azure Data Lake. Azure Databricks then transforms the raw data into its cleanest form, Azure Synapse Analytics loads the clean data, and Microsoft Power BI connects to Azure Synapse Analytics to build an interactive dashboard. We also use Microsoft Entra ID (previously called AAD) and Azure Key Vault for security and governance. I also cover complete end-to-end pipeline testing: how new data gets ingested and transformed until it updates the report we build in Power BI.
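To make the Synapse loading step more concrete, here is a minimal sketch of how a serverless SQL view over one gold-layer Delta folder could be generated. It only builds the SQL text in Python; the helper name gold_view_sql, the storage account name examplestorage, the SalesLT schema, the table names, and the gold/<schema>/<table>/ folder layout are all hypothetical placeholders, and the course itself may structure this differently (for example, inside a stored procedure called from a pipeline).

```python
def gold_view_sql(storage_account: str, schema: str, table: str) -> str:
    """Build a CREATE OR ALTER VIEW statement over one gold-layer Delta folder."""
    # Assumed layout: https://<account>.dfs.core.windows.net/gold/<schema>/<table>/
    location = (
        f"https://{storage_account}.dfs.core.windows.net/gold/{schema}/{table}/"
    )
    return (
        f"CREATE OR ALTER VIEW [{table}] AS\n"
        f"SELECT * FROM OPENROWSET(\n"
        f"    BULK '{location}',\n"
        f"    FORMAT = 'DELTA'\n"
        f") AS [result];"
    )


if __name__ == "__main__":
    # Hypothetical account, schema, and table names for illustration only.
    for table_name in ["Address", "Customer"]:
        print(gold_view_sql("examplestorage", "SalesLT", table_name))
        print()
```

Generating one statement per table from a list of table names mirrors the idea of automating view creation rather than writing each view by hand.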