Certification Course in Azure Data Engineering

Rating: 4.5 (29 reviews)

Learn SQL for Data Engineering, Data Warehousing, Data Lake, Data Factory, Databricks, PySpark, Snowflake and DevOps

Created by Human and Emotion: CHRMI

Last updated 3/2025

English

What you'll learn

Learn SQL for Data Engineering, including querying and transforming data, optimizing database performance, and handling large datasets efficiently
Gain expertise in Data Warehousing Concepts, including OLTP vs. OLAP, dimensional modeling, schema designs (Star & Snowflake), and ETL/ELT
Learn about Azure Data Engineering Fundamentals, covering key Azure services such as Azure Data Lake Storage, Blob Storage, Synapse Analytics, and security
Develop hands-on skills in Azure Data Factory (ADF) by building ETL pipelines, integrating data from various sources, and transforming data using Azure services
Gain proficiency in Databricks and PySpark, including distributed computing, Spark SQL, RDDs, and performance optimization for handling big data
Learn how to build and execute PySpark jobs for large-scale data processing and integrate Databricks with Azure services
Understand Delta Tables and Versioning, including ACID transactions, schema enforcement, and time-travel capabilities
Explore Snowflake for Data Engineering, covering architecture, data loading, query optimization, and integration with Azure
Learn how to design and deploy Production Pipelines, following best practices for scalable pipeline architectures, exception handling, and monitoring
Learn Azure DevOps for CI/CD pipeline deployment, version control, and automated testing
Discover how to leverage Azure Data Engineering Analytics, including data analysis, visualization, and monitoring with Azure services.

Course content

11 sections • 109 lectures • 14h 37m total length

Introduction • 7:13
Design, build, and manage scalable data solutions on Azure using Data Factory, Synapse Analytics, Databricks, and Data Lake to ingest, transform, store, and analyze data.

1.1. Introduction to SQL • 1:45
Explore SQL, the structured query language for creating, reading, updating, and deleting data in relational databases such as MySQL, PostgreSQL, and SQLite.
1.1.1. Basics of relational databases and SQL • 10:21
1.1.2. SQL syntax and query structure • 4:34
Explore the basic structure of SQL queries, including SELECT, FROM, WHERE, ORDER BY, GROUP BY, HAVING, and LIMIT, and learn how aggregation functions shape results.
1.1.3. SELECT, WHERE, GROUP BY, and ORDER BY clauses • 13:21
Explore core SQL features, focusing on SELECT with WHERE and on the roles of DML, DDL, DCL, and TCL commands in querying, modifying, defining, securing, and transacting data.
1.1.3. SELECT, WHERE, GROUP BY, and ORDER BY clauses 2 • 2:07
1.2. Advanced SQL techniques • 2:11
Unlock the power of advanced SQL techniques to solve complex data problems, using joins, subqueries, common table expressions, window functions, and indexing to optimize performance on large data sets.
1.2.1. Joins (INNER, OUTER, LEFT, RIGHT) • 11:46
Explore SQL joins in depth, mastering inner, left, right, and full outer joins, plus advanced techniques like self joins and anti joins to retrieve and link data across tables.
1.2.2. Subqueries, CTEs, and Window Functions • 11:45
1.2.3. Aggregations and analytical functions • 21:33
Explore recursive queries and hierarchical data with CTEs, pivot and unpivot, JSON and XML handling, and index and query optimization, plus dynamic SQL and other advanced techniques.
1.3. SQL for Data Engineering • 2:18
Explore how SQL underpins data engineering by enabling ETL, data pipelines, and optimized queries for reliable data infrastructure, integrity, and analytics.
1.3.1. Data manipulation and transformation • 4:01
1.3.2. Handling large datasets and performance tuning • 15:57
Explore key SQL concepts for data engineers, including normalization and denormalization, joins, window functions, indexes, partitioning, temporary tables, and best practices for modular queries and performance.
1.3.3. Data ingestion and validation using SQL • 17:02
Learn SQL techniques for data engineering, including data cleaning, removing duplicates, null handling with COALESCE, aggregations, CTEs and subqueries, and ETL automation with Apache Airflow or Azure Data Factory.
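
As an illustration of the cleaning techniques named in lecture 1.3.3 (removing duplicates, null handling with COALESCE, CTEs, and aggregation), here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from the course; the raw_orders table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical raw table with one exact duplicate row and one missing amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO raw_orders VALUES
        (1, 'alice', 20.0),
        (1, 'alice', 20.0),
        (2, 'bob',   NULL),
        (3, 'alice', 5.5);
""")

# Two CTEs: the first drops exact duplicates, the second replaces NULL
# amounts with 0.0. The final SELECT aggregates per customer.
CLEAN_AND_AGGREGATE = """
    WITH deduped AS (
        SELECT DISTINCT order_id, customer, amount
        FROM raw_orders
    ),
    cleaned AS (
        SELECT order_id, customer, COALESCE(amount, 0.0) AS amount
        FROM deduped
    )
    SELECT customer, COUNT(*) AS orders, SUM(amount) AS total
    FROM cleaned
    GROUP BY customer
    ORDER BY customer
"""

totals = conn.execute(CLEAN_AND_AGGREGATE).fetchall()
print(totals)  # [('alice', 2, 25.5), ('bob', 1, 0.0)]
```

Whether a missing amount should become 0.0 or be excluded is a business decision; the sketch only shows where COALESCE fits in the query.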

2.1. Introduction to Data Warehousing • 11:03
Learn how data warehousing centralizes data from multiple sources, uses ETL to prepare it, and employs star or snowflake schemas to enable analytics and business insights.
2.1.1. OLTP vs. OLAP • 3:25
Differentiate OLTP from OLAP: online transaction processing on real-time data from a single source versus online analytical processing on historical data from multiple sources.
2.1.2. Star and Snowflake schema designs • 2:41
Understand data warehousing schema designs: star schema with a fact table and dimensions like time, product, and customer; snowflake schema with normalized dimensions; and galaxy schema with multiple fact tables.
2.1.3. Dimensional modeling concepts • 18:53
Analyze data warehousing architectures, core components, and models: single, two-tier, and three-tier systems; EDW, data marts, and virtual warehouses; cloud technologies and future trends.
2.2. Data Pipeline Design • 6:04
Discover how to design data pipelines that automate the movement, transformation, and loading of data from sources to a data lake and data warehouse, enabling scalable, reliable, and flexible analytics.
2.2.1. ETL vs. ELT processes • 2:33
Compare ETL and ELT processes: ETL transforms data before loading it into a data warehouse, while ELT loads raw data and transforms it inside the warehouse, using masking, filtering, and automation.
2.2.2. Data staging, integration, and transformation layers • 11:25
Explore data pipeline steps from ingestion to delivery, including transformation, storage, processing and analytics, plus batch, real-time, and hybrid pipelines and design best practices.
2.2.2. Data staging, integration, and transformation layers 2 • 10:55
2.3. Hands-On Activity • 2:57
2.3.1. Creating sample schemas and loading sample data • 7:46
Explore creating a star schema for retail sales analysis with a fact table for transactions and dimension tables for customers, products, and time, enabling SQL-based querying and time-based trends.
2.3.1. Creating sample schemas and loading sample data 2 • 15:39
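
The retail star schema described in lecture 2.3.1 (a fact table for transactions with customer, product, and time dimensions) can be sketched as follows. This is an illustrative example using Python's built-in sqlite3 module, not the course's own lab; all table names, column names, and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per customer, product, and month.
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_time     (time_id     INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

    -- Fact table: one row per sale, pointing at each dimension.
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        product_id  INTEGER REFERENCES dim_product(product_id),
        time_id     INTEGER REFERENCES dim_time(time_id),
        quantity    INTEGER,
        amount      REAL
    );

    INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO dim_product  VALUES (10, 'Laptop', 'Electronics'), (11, 'Desk', 'Furniture');
    INSERT INTO dim_time     VALUES (202401, 2024, 1), (202402, 2024, 2);
    INSERT INTO fact_sales   VALUES
        (1, 1, 10, 202401, 1, 900.0),
        (2, 2, 11, 202401, 2, 300.0),
        (3, 1, 11, 202402, 1, 150.0);
""")

# Time-based trend: join the fact table to the time dimension and
# total the revenue per month.
monthly_sales = conn.execute("""
    SELECT t.year, t.month, SUM(f.amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_time AS t ON t.time_id = f.time_id
    GROUP BY t.year, t.month
    ORDER BY t.year, t.month
""").fetchall()
print(monthly_sales)  # [(2024, 1, 1200.0), (2024, 2, 150.0)]
```

Swapping dim_time for dim_product or dim_customer in the join gives the same kind of rollup by category or by customer, which is the main appeal of the star layout.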

3.1. Overview of Azure Data Engineering • 2:38
3.1.1. Introduction to Azure cloud platform • 11:19
Explore Azure data engineering, designing scalable pipelines to collect, clean, store, and process data using Azure Data Factory, Azure Databricks, Azure Synapse Analytics, and Azure Data Lake Storage Gen2.
3.1.2. Key Azure services for Data Engineering • 15:33
Explore Azure data engineering services (Azure Data Factory, Databricks, Synapse Analytics, Azure Data Lake Storage, Azure Stream Analytics, Azure SQL Database, and Cosmos DB) and apply best practices for scalable, secure data pipelines.
3.2. Azure Storage Solutions • 4:53
Azure storage solutions offer scalable, durable, and highly available cloud storage for unstructured data like files and images, and structured data like databases, with REST API and VM storage options.
3.2.1. Azure Data Lake Storage • 1:57
Azure Data Lake Storage Gen2 provides high-performance, scalable big data analytics with a hierarchical file system, native HDFS support, and integration with Azure Databricks and Azure Synapse Analytics for real-time processing.
3.2.2. Blob storage and file management • 8:22
Learn Azure Blob Storage as a scalable, cost-effective store for unstructured data with hot, cool, and archive tiers, plus Data Lake Storage Gen2 for big data analytics and Azure Files.
3.2.2. Blob storage and file management 2 • 6:06
Learn how Azure archive storage enables low-cost, long-term data retention with varying retrieval latencies, lifecycle management policies for automatic tiering, and redundancy options for disaster recovery.
3.2.3. Security and access control mechanisms • 6:16
Explore Azure storage use cases for backup and disaster recovery, big data analytics with data lake, media streaming with blob storage, and secure access with RBAC and SAS.
3.3. Azure Data Integration • 4:48
3.3.1. Introduction to Azure Synapse Analytics • 1:44
Azure Synapse Analytics provides an end-to-end analytics solution that unifies big data and data warehousing, with native Azure Data Factory integration and SQL querying for structured and unstructured data.
3.3.2. Data movement and integration tools in Azure • 18:04
Explore Azure data integration tools, including Data Factory, Databricks, Stream Analytics, and Event Hub, to build batch and real-time pipelines, orchestrate ETL and ELT workflows, and secure data lake storage.

4.1. Azure Functions and Logic Apps • 2:50
4.1.1. Automating workflows using Logic Apps • 7:39
Discover Azure Logic Apps, a cloud-based, low-code workflow automation service with a visual designer and hundreds of connectors to automate data, notifications, and hybrid connectivity.
4.1.2. Serverless computing with Azure Functions • 18:05
Explore Azure Functions, a serverless, event-driven compute service that runs code on triggers and scales automatically. Use cases include real-time data processing, automation, and lightweight back-end APIs with pay-per-execution pricing.
4.2. Azure Event Hub and Stream Analytics • 3:05
4.2.1. Streaming data ingestion • 2:13
Leverage Azure Event Hubs as an entry point for real-time event data ingestion from apps and devices, with storage and integrations to Azure Stream Analytics, Azure Functions, and Data Lake.
4.2.2. Real-time analytics in Azure • 7:33
4.3. Monitoring and Optimization • 3:02
Monitor, analyze, and optimize Azure data pipelines with Azure Monitor, Log Analytics, and Application Insights, leveraging autoscale and cost management to meet SLAs.
4.3.1. Cost optimization techniques • 9:04
Explore optimization strategies for Azure Data Engineering to boost pipeline efficiency and reduce costs. Implement partitioning and pruning, compression, and CDC incremental loads.
4.3.2. Monitoring and debugging Azure workloads • 13:15
Leverage Azure Monitor and Log Analytics for telemetry, alerts, and log analysis across Azure Data Factory, Databricks, and SQL resources; optimize costs, performance, and pipelines with Advisor and Cost Management.
4.3.2. Monitoring and debugging Azure workloads 2 • 5:33
Learn monitoring and optimization best practices for Azure data engineering, including custom alerts for pipeline failures and cost anomalies, visualizing trends with Azure dashboards, and automating responses.

5.1. Introduction to Azure Data Factory • 2:31
Explore how Azure Data Factory enables cloud-based data integration. Orchestrate workflows across on-premises and cloud sources with Synapse Analytics, Databricks, and Azure Storage for scalable ETL and ELT pipelines.
5.1.1. ADF architecture and components • 6:41
5.1.2. Pipelines, triggers, and datasets • 8:46
Explore the core components of Azure Data Factory, including pipelines, activities, linked services, datasets, triggers and integration runtime, enabling data movement, transformation, and orchestrated workflows.
5.1.2. Pipelines, triggers, and datasets 2 • 6:27
Create an Azure Data Factory pipeline by connecting to blob storage and SQL database, defining datasets, adding a copy data activity, configuring a trigger, and monitoring with the ADF dashboard.
5.2. Building ETL Pipelines in ADF • 3:52
Build scalable ETL pipelines in Azure Data Factory to extract data from on-premises SQL, transform it with cleansing and enrichment, and load it into Azure Synapse Analytics for analytics.
5.2.1. Creating and managing data pipelines • 9:56
Build and manage ETL pipelines in Azure Data Factory by defining sources and destinations and creating linked services and datasets. Design pipelines, apply transforms, and configure parameterization, triggers, and monitoring.
5.2.2. Data transformations using ADF • 8:57
Explore an Azure Data Factory ETL workflow that extracts on-premises sales data, transforms it into monthly trends, stores it in a data lake, and loads it into Azure Synapse Analytics for reporting.
5.3. Integration with Other Services • 2:38
Explore how Azure data engineering integrates with Azure and non-Azure services to support end-to-end data pipelines using Azure Data Factory, IoT Hub, Stream Analytics, Databricks, Synapse, and Power BI.
5.3.1. Integrating ADF with Databricks, SQL Server, and Snowflake • 12:32
Orchestrate data workflows with Azure Data Factory by integrating with on-premises databases, Azure Storage, Synapse, and Databricks, enabling real-time streaming, machine learning pipelines, and automated serverless workflows.
5.3.1. Integrating ADF with Databricks, SQL Server, and Snowflake 2 • 11:23
Harness Power BI with Synapse and ADF for near real-time insights, ingest data from Azure Storage, and process with Databricks or Snowflake for scalable analytics.
5.4. Hands-On Activity • 3:00
Build a sample ETL pipeline in Azure Data Factory, extracting data from Azure Blob Storage, transforming it with a data flow, and loading it into an Azure SQL database.
5.4.1. Building a sample ETL pipeline in ADF • 7:37
5.4.1. Building a sample ETL pipeline in ADF 2 • 7:07

6.1. Introduction to Databricks • 2:14
Explore Databricks, a cloud-based unified data platform built on Apache Spark for big data processing, machine learning, collaborative analytics, and workflows with Azure integrations.
6.1.1. Overview of Databricks and its architecture • 8:57
Databricks offers an open, collaborative data platform that accelerates innovation by bridging data engineers, data scientists, and analysts with Spark, machine learning, and unified analytics.
6.1.2. Setting up Databricks workspaces • 12:31
6.2. Introduction to PySpark • 2:31
Explore PySpark, the Python API for Apache Spark, and its role in big data processing with distributed computing, including SQL, machine learning, and real-time streaming.
6.2.1. Basics of distributed computing • 13:10
Learn Apache Spark's in-memory, high-speed platform for large-scale batch and streaming workloads, and how PySpark provides Python access to distributed data processing, the DataFrame API, MLlib, and streaming.
6.2.2. Dataframes, RDDs, and Spark SQL • 3:35
6.2.2. Dataframes, RDDs, and Spark SQL 2 • 7:28
Explore PySpark use cases in batch processing, real-time streaming, and machine learning with MLlib, then see how PySpark supports ETL, data wrangling, and data warehousing.
6.3. Advanced PySpark Techniques • 3:02
Leverage PySpark, the Python API for Apache Spark, to apply advanced techniques and optimizations that improve performance in large-scale data processing through partitioning, caching, and optimized Spark jobs.
6.3.1. Writing and optimizing PySpark jobs • 9:15
Learn how PySpark caching and persistence speed repeated computations by storing DataFrames in memory or on disk, and apply join strategies like broadcast to reduce shuffling.
6.3.1. Writing and optimizing PySpark jobs 2 • 11:31
Explore window functions in PySpark to compute moving averages, rankings, and range-based aggregations within partitions, and build scalable ML pipelines with MLlib and logistic regression.
6.3.1. Writing and optimizing PySpark jobs 3 • 10:51
Explore Spark SQL in PySpark and the Catalyst optimizer, including predicate pushdown, constant folding, and projection pruning, and gain insights on caching temp views and real-time streaming with Kafka and sockets.
6.3.2. Working with large datasets • 8:46
Partition and repartition in Spark to distribute large datasets across a cluster for processing. Use coalesce and partitioning by region to reduce shuffling and optimize memory during joins and aggregations.
6.4. Hands-On Activities • 5:49
Build PySpark applications by integrating Azure Databricks with Azure data services for end-to-end data engineering, real-time and batch processing, scalable analytics, and machine learning.
6.4.1. Building PySpark applications • 7:45
Set up a Databricks workspace on Azure, create a Spark cluster, and install libraries to build PySpark applications for distributed data processing, analytics, and machine learning.
6.4.1. Building PySpark applications 2 • 7:07
Learn how Databricks on Azure reads and writes data from Azure Blob Storage and Azure Data Lake Storage with PySpark and secure authentication.
6.4.2. Integrating Databricks with Azure services • 7:01
Explore how Azure Databricks integrates with Azure Synapse Analytics to build scalable data pipelines. Leverage PySpark with JDBC for data movement and Azure ML for model training.
6.4.2. Integrating Databricks with Azure services 2 • 5:50
Monitor and manage PySpark workloads in Azure Databricks with cluster monitoring, logs, and alerts, and integrate with Azure Monitor for real-time health metrics of clusters, pipelines, and workloads.

7.1. Delta Lake Fundamentals • 5:46
Explore the fundamentals of Delta Lake, an open source storage layer that enhances Spark and data workloads with reliability, performance, and ACID (atomicity, consistency, isolation, durability) transactions, unifying batch and streaming processing.
7.1.1. Overview of Delta tables • 13:48
Explore Delta Lake architecture, focusing on Delta tables and Delta logs, ACID properties, time travel, and Parquet storage, with hands-on steps for Spark, Delta format, and upserts.
7.1.2. ACID transactions and schema enforcement • 8:34
Explore Delta Lake features like ACID transactions (atomicity, consistency, isolation, durability), rollback, and schema enforcement, plus unified batch and streaming processing and time travel for versioning.
7.1.2. ACID transactions and schema enforcement 2 • 10:11
Delta Lake enforces schema integrity and enables evolution, delivering ACID transactions, data lineage and auditing, and time travel for reliable, scalable data pipelines across batch and real-time workloads.
7.2. Versioning and Time Travel • 2:09
Leverage Delta Lake's versioning and time travel to query historical data, audit changes, and recover from accidental deletions in large-scale data pipelines.
7.2.1. Querying data at specific points in time • 10:40
Explore how Delta Lake versioning preserves a complete history of table changes, enabling time travel and precise queries by timestamp or version number.
7.2.2. Implementing CDC (Change Data Capture) workflows • 4:09
Discover how Delta Lake versioning and time travel enable data recovery, rollback, and auditing, with reproducibility and cross-time data comparisons for analytics.
7.2.2. Implementing CDC (Change Data Capture) workflows 2 • 10:01
Manage Delta tables with Delta Lake through configurable retention and vacuuming of stale files. Enable time travel via the metadata history for auditing, backup, recovery, and reproducibility.

8.1. Introduction to Snowflake • 5:02
Explore Snowflake, a cloud-native, fully managed data platform that unifies data warehousing, data lakes, and data sharing across AWS, Azure, and GCP, enabling high-performance analytics with JSON, Avro, and Parquet.
8.1.1. Architecture and key features of Snowflake • 7:47
Explore Snowflake architecture and key features, including the multi-layer design of storage, compute, and cloud services; scalable virtual warehouses and secure data sharing with automatic scaling.
8.1.2. Warehouses, databases, and schema in Snowflake • 10:05
Explore how Snowflake loads, stores, and processes data with virtual warehouses, automatic query optimization, and secure data sharing, enabling scalable data warehousing, data lakes, and real-time analytics.
8.2. Data Loading and Querying in Snowflake • 3:05
8.2.1. Copying data into Snowflake • 9:10
Learn how Snowflake loads data from internal or external stages using COPY INTO, with Snowpipe enabling real-time ingestion from cloud storage such as S3, Azure Blob, or Google Cloud Storage.
8.2.2. Writing and optimizing queries • 6:06
Explore querying data in Snowflake with standard SQL to retrieve, filter, and analyze using SELECT, WHERE, GROUP BY, and ORDER BY, plus window functions, CTEs, and subqueries.
8.2.2. Writing and optimizing queries 2 • 8:36
Boost Snowflake performance with clustering, caching, and materialized views. Query structured and semi-structured data, including JSON and Parquet, using SQL-based methods and various loading options.
8.3. Snowflake for Data Engineering • 9:02
Explore Snowflake for data engineering, a cloud-native platform with independent compute and storage that enables scalable data loading, transformation, and delivery of semi-structured data to BI and ML tools.
8.3.1. Integration with Azure services • 7:56
Explore Snowflake's data loading and integration for CSV, JSON, Parquet, Avro, and ORC files staged in AWS S3, Azure Blob Storage, and Google Cloud Storage, using COPY INTO and Snowpipe.
8.3.1. Integration with Azure services 2 • 10:06
Discover how Snowflake integrates with third-party platforms like Informatica, AWS Glue, and Google Data Fusion to automate data pipelines, enable cross-environment data movement, and support SQL-based transformations.
8.3.2. Best practices for using Snowflake in production • 11:56
Explore Snowflake's production-ready data engineering practices, including automatic query optimization, clustering keys, materialized views, and secure data sharing with RBAC and encryption to optimize analytics pipelines.

9.1. Designing Production Pipelines • 5:05
Design production pipelines in data engineering to automate ETL or ELT, moving, transforming, and loading data from sources to storage with scalable, secure, real-time or batch workflows.
9.1.1. Best practices for scalable pipelines • 7:06
9.1.2. Handling exceptions and retries • 11:44
Explore the production data pipeline architecture, covering data sources, ingestion, processing, transformation, and storage with Apache Kafka, Spark, Databricks, and Snowflake.
9.1.2. Handling exceptions and retries 2 • 11:28
Explore the orchestration, monitoring, data quality, and deployment layers essential for production data pipelines, using tools like Apache Airflow, Azure Data Factory, and Jenkins to ensure reliability and scalability.
9.2. CI/CD for Azure Data Engineering • 7:08
Automate the end-to-end data pipeline in Azure Data Engineering with CI/CD, from ingestion to transformation to analysis, using Azure DevOps, Azure Data Factory, and Azure Key Vault for secure credentials.
9.2.1. Using Azure DevOps for pipeline deployment • 13:27
Automate Azure data pipelines by integrating Azure DevOps, GitHub, ADF, and Databricks, with secure storage and Azure Key Vault. Implement end-to-end CI/CD from source control to monitoring.
9.2.2. Version control and automated testing • 8:44
Discover tools and Azure services for CI/CD in data engineering, including Azure DevOps, Azure Pipelines, and GitHub Actions. Learn best practices for version control, automated testing, environment consistency, and secure deployments.
9.3. Monitoring and Maintenance • 6:26
Monitor production data pipelines to detect anomalies, track performance, and resolve bottlenecks before they affect data delivery. Maintain pipelines through updates, scalability, and security, ensuring data quality, compliance, and reliable operations.
9.3.1. Monitoring data pipelines in production • 11:03
Learn how to monitor and maintain production data pipelines, including health checks, data quality, throughput, latency, alerting, logging, resource management, and performance optimization in Azure data engineering.
9.3.1. Monitoring data pipelines in production 2 • 7:31
Ensure production pipelines meet quality standards through data validation, profiling, reconciliation, sampling, and data lineage, then maintain security, updates, and compliance across Azure services.
9.3.2. Troubleshooting and performance tuning • 9:23
Explore tools for monitoring and maintenance of Azure data pipelines, including Azure Monitor, Log Analytics, and Data Factory dashboards, with alerts and real-time metrics for troubleshooting and performance tuning.
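
Lecture 9.1.2 deals with handling exceptions and retries in production pipelines. As a generic illustration only (the course listing does not show its own code), the sketch below retries a failing pipeline step with exponential backoff. The names run_with_retries and flaky_load are hypothetical, and the injected sleep function is an assumption made so the example runs without real waiting.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# Hypothetical pipeline step that fails twice before succeeding.
calls = {"count": 0}


def flaky_load() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "loaded"


delays = []  # record the requested waits instead of actually sleeping
result = run_with_retries(
    flaky_load,
    max_attempts=5,
    base_delay=0.5,
    retry_on=(ConnectionError,),
    sleep=delays.append,
)
print(result, calls["count"], delays)  # loaded 3 [0.5, 1.0]
```

Restricting retry_on to errors that are plausibly transient keeps genuine bugs, such as a schema mismatch, from being retried pointlessly.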

Requirements

You should have an interest in the fundamentals of Azure Data Engineering.
Basic understanding of programming and algorithms

Description

Take the next step in your career! Whether you're an aspiring data engineer, an experienced IT professional, a cloud solutions architect, or a data analyst, this course is your opportunity to sharpen your Azure Data Engineering skills, enhance your ability to design scalable data solutions, and advance your professional growth in the field of cloud-based data engineering.

With this course as your guide, you will learn how to:

Master the fundamental skills and concepts required for Azure Data Engineering, including SQL, Data Warehousing, ETL/ELT processes, and cloud-based data integration.
Build and optimize data pipelines using Azure Data Factory (ADF), Databricks, Snowflake, PySpark, and Delta Tables, ensuring efficient data processing and transformation.
Access industry-standard templates and best practices for data architecture, schema design, and performance optimization in cloud environments.
Explore real-world applications of Azure services, including data lake storage, real-time analytics, data monitoring, and security best practices for enterprise-level data management.
Invest in learning Azure Data Engineering today and gain the skills to design and manage scalable, high-performance data solutions that drive business success.

The Frameworks of the Course

Through engaging video lectures, case studies, projects, downloadable resources, and interactive exercises, this course explores Azure Data Engineering, covering SQL, Data Warehousing, ETL/ELT processes, and cloud-based data solutions using Azure services.

The course includes multiple case studies, resources such as templates, worksheets, reading materials, quizzes, self-assessments, and hands-on labs to deepen your understanding of Azure Data Engineering concepts and real-world applications.

In the first part of the course, you’ll learn SQL basics and advanced techniques, data warehousing fundamentals, and data ingestion and transformation using Azure Data Factory (ADF) and Synapse Analytics.
In the middle part of the course, you’ll develop a deep understanding of Databricks and PySpark, Delta Tables, versioning, and real-time data streaming using Azure Event Hub and Stream Analytics.
In the final part of the course, you’ll gain expertise in Snowflake for Data Engineering, designing production pipelines, CI/CD implementation with Azure DevOps, and monitoring data workflows.

Part 1

Introduction and Study Plan

· Introduction and getting to know your instructor

· Study Plan and Structure of the Course

Module 1. SQL Basics and Advanced Concepts

1.1. Introduction to SQL

1.1.1. Basics of relational databases and SQL.

1.1.2. SQL syntax and query structure.

1.1.3. SELECT, WHERE, GROUP BY, and ORDER BY clauses

1.2. Advanced SQL techniques

1.2.1. Joins (INNER, OUTER, LEFT, RIGHT).

1.2.2. Subqueries, CTEs, and Window Functions.

1.2.3. Aggregations and analytical functions.

1.3. SQL for Data Engineering

1.3.1. Data manipulation and transformation.

1.3.2. Handling large datasets and performance tuning.

1.3.3. Data ingestion and validation using SQL.

Module 2. Data Warehousing Concepts

2.1. Introduction to Data Warehousing

2.1.1. OLTP vs. OLAP.

2.1.2. Star and Snowflake schema designs.

2.1.3. Dimensional modeling concepts.

2.2. Data Pipeline Design

2.2.1. ETL vs. ELT processes.

2.2.2. Data staging, integration, and transformation layers.

2.3. Hands-On Activity

2.3.1. Creating sample schemas and loading sample data.

Module 3. Azure Data Engineering Fundamentals

3.1. Overview of Azure Data Engineering

3.1.1. Introduction to Azure cloud platform.

3.1.2. Key Azure services for Data Engineering.

3.2. Azure Storage Solutions

3.2.1. Azure Data Lake Storage.

3.2.2. Blob storage and file management.

3.2.3. Security and access control mechanisms.

3.3. Azure Data Integration

3.3.1. Introduction to Azure Synapse Analytics.

3.3.2. Data movement and integration tools in Azure.

Module 4. Azure Services for Data Engineering

4.1. Azure Functions and Logic Apps

4.1.1. Automating workflows using Logic Apps.

4.1.2. Serverless computing with Azure Functions.

4.2. Azure Event Hub and Stream Analytics

4.2.1. Streaming data ingestion.

4.2.2. Real-time analytics in Azure.

4.3. Monitoring and Optimization

4.3.1. Cost optimization techniques.

4.3.2. Monitoring and debugging Azure workloads

Module 5. Azure Data Factory (ADF)

5.1. Introduction to Azure Data Factory

5.1.1. ADF architecture and components.

5.1.2. Pipelines, triggers, and datasets.

5.2. Building ETL Pipelines in ADF

5.2.1. Creating and managing data pipelines.

5.2.2. Data transformations using ADF.

5.3. Integration with Other Services

5.3.1. Integrating ADF with Databricks, SQL Server, and Snowflake.

5.4. Hands-On Activity

5.4.1. Building a sample ETL pipeline in ADF.

Module 6. Databricks and PySpark

6.1. Introduction to Databricks

6.1.1. Overview of Databricks and its architecture.

6.1.2. Setting up Databricks workspaces.

6.2. Introduction to PySpark

6.2.1. Basics of distributed computing.

6.2.2. Dataframes, RDDs, and Spark SQL.

6.3. Advanced PySpark Techniques

6.3.1. Writing and optimizing PySpark jobs.

6.3.2. Working with large datasets.

6.4. Hands-On Activities

6.4.1. Building PySpark applications.

6.4.2. Integrating Databricks with Azure services.

Module 7. Delta Tables and Versioning

7.1. Delta Lake Fundamentals

7.1.1. Overview of Delta tables.

7.1.2. ACID transactions and schema enforcement.

7.2. Versioning and Time Travel

7.2.1. Querying data at specific points in time.

7.2.2. Implementing CDC (Change Data Capture) workflows.

Module 8. Snowflake Core Concepts

8.1. Introduction to Snowflake

8.1.1. Architecture and key features of Snowflake.

8.1.2. Warehouses, databases, and schema in Snowflake.

8.2. Data Loading and Querying in Snowflake

8.2.1. Copying data into Snowflake.

8.2.2. Writing and optimizing queries.

8.3. Snowflake for Data Engineering

8.3.1. Integration with Azure services.

8.3.2. Best practices for using Snowflake in production.

Module 9. Production Pipelines and Deployment

9.1. Designing Production Pipelines

9.1.1. Best practices for scalable pipelines.

9.1.2. Handling exceptions and retries.

9.2. CI/CD for Azure Data Engineering

9.2.1. Using Azure DevOps for pipeline deployment.

9.2.2. Version control and automated testing.

9.3. Monitoring and Maintenance

9.3.1. Monitoring data pipelines in production.

9.3.2. Troubleshooting and performance tuning.

Part 2

Module 10. Capstone Project

10.1. Project Design and Implementation

10.1.1. Design a complete Data Engineering solution.

10.1.2. Use Azure services, Databricks, Snowflake, and PySpark.

Who this course is for:

Data professionals looking to gain expertise in SQL, Data Warehousing, and ETL/ELT processes for efficient data management and transformation.
New professionals seeking to build a career in Azure Data Engineering by learning cloud-based data solutions, data pipeline development, and big data processing using Azure services.
Existing data engineers, architects, and IT professionals who want to enhance their skills in their respective domain to optimize data workflows and improve performance.
Technical leads, managers, and decision-makers looking to understand scalable data engineering architectures, cloud-based data integration strategies, and real-time data analytics using Azure.

Course content

Introduction • 1 lecture • 7min

Module 1. SQL Basics and Advanced Concepts • 13 lectures • 1hr 59min

Module 2. Data Warehousing Concepts • 11 lectures • 1hr 33min

Module 3. Azure Data Engineering Fundamentals • 11 lectures • 1hr 22min

Module 4. Azure Services for Data Engineering • 10 lectures • 1hr 12min

Module 5. Azure Data Factory (ADF) • 13 lectures • 1hr 31min

Module 6. Databricks and PySpark • 17 lectures • 2hr 7min

Module 7. Delta Tables and Versioning • 8 lectures • 1hr 5min

Module 8. Snowflake Core Concepts • 11 lectures • 1hr 29min

Module 9. Production Pipelines and Deployment • 11 lectures • 1hr 39min
