Azure Synapse Analytics For Data Engineers -Hands On Project

Name: Azure Synapse Analytics For Data Engineers -Hands On Project
Rating: 4.7 (4223 reviews)

Hands on Project for Data Engineers using all the services available in Azure Synapse Analytics [DP-203, DP-500]

Bestseller

Created by Ramesh Retnasamy. 200,000+ Learners

Last updated 7/2025

English

What you'll learn

You will learn how to build a real-world project using Azure Synapse Analytics. The course is taught using real-world NYC Taxi Trips data
You will acquire professional level data engineering skills in Azure Synapse Analytics
You will learn how to create SQL scripts and Spark notebooks in Azure Synapse Analytics
You will learn how to create dedicated SQL pools and spark pools in Azure Synapse Analytics
You will learn how to enable Synapse Link and the analytical store in Cosmos DB
You will learn how to ingest and transform data using Serverless SQL Pool and Spark Pool
You will learn how to load data into dedicated SQL Pool
You will learn how to serve data to Power BI from Serverless SQL Pool and Dedicated SQL Pool
You will learn how to execute scripts and notebooks using Synapse Pipelines and Triggers
You will learn how to do operational reporting from the data stored in Cosmos DB using Azure Synapse Analytics
You will learn how to build reports in Power BI for the data stored in Azure Synapse Analytics

Course content

19 sections • 132 lectures • 13h 28m total length

Course Introduction 8:19
Discover Azure Synapse Analytics for data engineers, covering data integration, enterprise data warehousing, and big data analytics with serverless SQL pool, Spark pool, Synapse Link, and Power BI integration.
Course Materials Download 0:12
Useful Links 0:18

Creating Azure Account 5:52
Create a free Azure account and explore 12 months of popular free services plus £150 or $200 free credit for 30 days, with student options available.
Azure Portal Overview 3:52
Explore the Azure portal overview and sign in at portal.azure.com to access resources. Navigate the menu, home, dashboard, all services, favorites, and recent resources using Azure Monitor and Microsoft Learn.

Introduction to Azure Synapse Analytics 2:45
Explore Azure Synapse Analytics as a limitless analytics service that unifies data integration, enterprise data warehousing, and big data analytics, with scalable compute and storage and serverless or dedicated options.
History of Data Warehouse/ Data Lakes 4:36
Explore the evolution from traditional data warehouses to data lakes and the modern data warehouse, highlighting ETL, governance, unstructured data, and Azure Synapse Analytics implementation.
Emergence of Azure Synapse Analytics 9:00
Trace the emergence of Azure Synapse Analytics as a unified platform for data integration, data lake, big data analytics, and reporting, featuring Synapse Pipelines, Spark Pool, and Serverless SQL Pool.
Create Azure Synapse Analytics Workspace 10:18
Create an Azure Synapse Analytics workspace using the guided wizard. Attach a data lake Gen2 storage account and container, and configure the serverless SQL pool.
Azure Synapse Analytics Workspace Overview 5:32
Explore the Azure Synapse Analytics workspace to access built-in serverless SQL pool and manage dedicated SQL, Spark, and Data Explorer pools, plus configure access controls and the SQL endpoint.
Azure Synapse Studio Overview 8:11
Learn to access Azure Synapse Studio via the Azure portal or web.azuresynapse.net, and navigate its main areas (Home, Data, Develop, Integrate, Monitor, Manage) for unified development and monitoring.
Data Hub Overview 8:03
Navigate the Data Hub in Synapse to manage workspace assets and linked storage, create serverless SQL databases, connect external data, and link datasets like the Bing COVID-19 dataset.
Develop Hub Overview 6:11
Develop scripts in the Develop hub by creating SQL scripts for serverless and dedicated pools, KQL scripts for Data Explorer, notebooks for Spark, data flows, and gallery samples.
Integrate Hub Overview 2:40
Learn how to create and manage data pipelines in Azure Synapse Analytics, using the Integrate hub to copy data, orchestrate transformations, and invoke notebooks, Spark jobs, or SQL procedures.
Monitor Hub Overview 2:28
Explore the Monitor hub to track serverless SQL pool activity, pool status, and query execution for SQL and KQL workloads, while understanding data-driven billing and pipeline runs.
Manage Hub Overview 5:12
Manage an Azure Synapse workspace by configuring pools (serverless, dedicated SQL; Spark; Data Explorer), creating linked services, pipelines and triggers, integration runtimes, security roles, and Git integration.

Section Overview 1:04
Explore Azure Synapse Analytics capabilities through a hands-on project using New York taxi trip data, including data lake upload, functional dashboards and nonfunctional monitoring, and a detailed solution architecture.
NYC Taxi Data Source Overview 5:36
Explore the NYC taxi data ecosystem, comparing yellow, green, for-hire, and high-volume vehicles, using data dictionaries, lookup tables, and the factbook to analyze trips from 2009 to 2021.
NYC Taxi Data Files Overview 2:45
Overview of NYC taxi data files and formats used in the project, including trip data, taxi zones, calendar, and mapping files for rate codes and payments.
Upload NYC Taxi Data to Data Lake 7:23
Upload the NYC taxi data to the data lake by creating a blob container in Azure Storage Explorer, then upload the raw folder organized by year and month.
Project Requirements Overview 3:29
Outline data discovery, ingestion, and transformation requirements for a data lake: ensure quality, apply schema, enable T-SQL and pay-per-query access, store in Parquet, and support BI and IoT reporting.
Solution Architecture Overview 5:43
Explore the solution architecture for Azure Synapse Analytics, covering four compute options with serverless SQL pool as the compute engine, bronze–silver–gold data layers, external tables, Parquet, and Power BI integration.

Section Overview 0:58
Explore the Serverless SQL pool in Azure Synapse Analytics, its architecture, features, and cost model; learn to connect from Azure Data Studio and work with T-SQL statements and limitations.
Introduction to Serverless SQL Pool 8:31
Explore the serverless SQL pool in Azure Synapse, a pay-per-query engine that reads data from the data lake using T-SQL, with a Polaris-driven control and compute node architecture and support for external tables.
Serverless SQL Pool Cost Control 11:36
Explore serverless SQL pool cost control by analyzing the components of data processed, including data and partitioning with Parquet metadata, and implementing UI and T-SQL limits for daily, weekly, and monthly usage.
Connect from Azure Data Studio to Serverless SQL Pool (Optional) 9:39
Connect to Azure Synapse Serverless SQL pool from Azure Data Studio using SQL login or Azure Active Directory, then run queries and explore notebooks, IntelliSense, and source control.

Section Overview 2:11
Explore reading delimited files with the OPENROWSET function in Azure Synapse, handling headers, delimiters, and escaping. Learn to specify data types and query subsets of columns.
OPENROWSET Function Overview 4:21
Use the OPENROWSET function to read remote Azure storage files, returning data as rows from CSV, Parquet, or Delta. It requires BULK and FORMAT parameters, with optional reject options.
Query Taxi Zone File (CSV File) 8:21
Learn to read a taxi zone CSV file from a data lake with the OPENROWSET function, set the header row, and specify field and row terminator options, using CSV parser version 2.0.
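
As a rough illustration of the kind of query this lecture describes, the sketch below renders an OPENROWSET statement for a CSV file with a header row. The helper name, the alias, and the storage URL in the usage lines are hypothetical placeholders, not taken from the course materials, and the helper does plain string formatting for illustration only.

```python
def build_openrowset_csv_query(bulk_path: str, alias: str = "result") -> str:
    """Render a T-SQL query that reads a CSV file through OPENROWSET.

    bulk_path and alias are supplied by the caller; nothing here connects
    to Azure, it only produces the query text.
    """
    return (
        "SELECT *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{bulk_path}',\n"
        "    FORMAT = 'CSV',\n"
        "    PARSER_VERSION = '2.0',\n"   # CSV parser version 2.0
        "    HEADER_ROW = TRUE,\n"        # first row holds column names
        "    FIELDTERMINATOR = ',',\n"
        "    ROWTERMINATOR = '\\n'\n"
        f") AS {alias};"
    )


if __name__ == "__main__":
    # Placeholder URL: substitute your own storage account and container.
    print(build_openrowset_csv_query(
        "https://<storage-account>.dfs.core.windows.net/<container>/raw/taxi_zone.csv",
        alias="zone",
    ))
```

The generated text can be pasted into a SQL script in Synapse Studio and run against the built-in serverless SQL pool.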
Specify Data Types 8:25
Explore inferring and defining explicit data types for data in CSV files, using sp_describe_first_result_set and maximum column lengths to optimize performance and reduce costs in a serverless Synapse pool.
Specify Collation 6:04
Apply UTF-8 collation to avoid implicit conversions by specifying it at the column level or database level, and verify the default collation in Azure Synapse Analytics.
Query Subset of Columns 9:07
Select a subset of columns in Azure Synapse with or without headers, using ordinal positions and the FIRSTROW option to improve performance and control column naming.
Debugging & Identifying Errors 2:30
Debug data type mismatches and truncation errors in Azure Synapse Analytics by using the clearer error messages from parser version 1.0 and restoring the zone length to 50.
Use External Data Source 9:24
Create external data sources in Azure Synapse to point to storage containers and avoid hard-coded URLs, then use them in SELECT queries to access raw data and the bronze, silver, and gold layers.
Query Calendar File (CSV File) - Assignment 9:33
Write a SELECT by hand on the calendar CSV file using OPENROWSET, define a WITH clause, and join with trip data to report weekday versus weekend statistics.
Query Vendor File (Quoted and Escaped Columns) 6:32
Learn to handle delimiter conflicts in vendor data files in Azure Synapse Analytics by using an escape character or field quote to preserve commas within data for the CSV parser.
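
The delimiter conflict described here is not specific to Synapse. As a local analogy, the sketch below uses Python's csv module to read one file that protects embedded commas with a field quote and another that uses an escape character. The sample rows and column names are made up for the example.

```python
import csv
import io

# Made-up sample data: the vendor name contains a comma in both variants.
QUOTED = 'vendor_id,vendor_name\n1,"Creative Mobile Technologies, LLC"\n2,VeriFone Inc\n'
ESCAPED = 'vendor_id,vendor_name\n1,Creative Mobile Technologies\\, LLC\n2,VeriFone Inc\n'


def read_quoted(text: str) -> list:
    """Fields wrapped in double quotes may contain the delimiter."""
    return list(csv.reader(io.StringIO(text), delimiter=",", quotechar='"'))


def read_escaped(text: str) -> list:
    """A backslash before the delimiter marks it as data, not a separator."""
    return list(
        csv.reader(
            io.StringIO(text),
            delimiter=",",
            quoting=csv.QUOTE_NONE,
            escapechar="\\",
        )
    )


if __name__ == "__main__":
    print(read_quoted(QUOTED))
    print(read_escaped(ESCAPED))
```

Both readers return the vendor name as a single field with its comma intact, which is the outcome the field quote and escape character options aim for in the lecture.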
Query Trip Type File (TSV File) - Assignment 2:43
Demonstrate reading a tab-separated values file in Azure Synapse by explicitly setting the field terminator to tab, aliasing the dataset in Synapse Studio, and publishing changes to the Synapse repository.

Section Overview 3:34
Explore processing line-delimited, standard, and classic multi-line JSON with the CSV parser and OPENROWSET, then extract data using the JSON_VALUE and OPENJSON functions.
Query Payment Type (Single Line JSON) - JSON_VALUE Function 8:46
Learn to parse line-delimited JSON with the CSV parser, extract payment type and payment type description using JSON_VALUE, and cast results to smallint and varchar in Azure Synapse Analytics.
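
To make the idea of line-delimited JSON concrete, here is a minimal local sketch in Python: each line is parsed as its own document, two values are pulled out, and each is cast to a fixed type, loosely mirroring the JSON_VALUE plus CAST approach. The field names and sample lines are assumptions made for the example.

```python
import json

# Assumed sample: one JSON document per line.
SAMPLE = (
    '{"payment_type": 1, "payment_type_desc": "Credit card"}\n'
    '{"payment_type": 2, "payment_type_desc": "Cash"}\n'
)


def parse_payment_types(text: str) -> list:
    """Return (payment_type, description) tuples from line-delimited JSON."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        doc = json.loads(line)
        # Explicit casts play the role of CAST(... AS SMALLINT / VARCHAR).
        rows.append((int(doc["payment_type"]), str(doc["payment_type_desc"])))
    return rows


if __name__ == "__main__":
    print(parse_payment_types(SAMPLE))
```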
Query Payment Type (Single Line JSON) - OPENJSON Function 6:14
OPENJSON converts JSON into rows and columns, supports explicit data types and arrays, and enables easier column naming, demonstrated on a payment type dataset.
Query JSON Array 9:17
Discover how to query and explode JSON arrays in Azure Synapse Analytics using JSON_VALUE and OPENJSON, apply CROSS APPLY, and extract payment type descriptions from nested arrays.
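
Exploding a nested array means producing one output row per array element while repeating the parent values. The sketch below shows that shape in plain Python as a stand-in for CROSS APPLY with OPENJSON. The document structure and field names (sub_type, value) are hypothetical.

```python
import json

# Hypothetical document: one payment type with an array of descriptions.
SAMPLE = (
    '{"payment_type": 1, "payment_type_desc": ['
    '{"sub_type": 1, "value": "Credit card"}, '
    '{"sub_type": 2, "value": "Debit card"}]}\n'
    '{"payment_type": 2, "payment_type_desc": ['
    '{"sub_type": 1, "value": "Cash"}]}\n'
)


def explode_descriptions(text: str) -> list:
    """Return one (payment_type, sub_type, value) row per array element."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        doc = json.loads(line)
        for item in doc["payment_type_desc"]:
            # The parent key is repeated for every child element.
            rows.append((doc["payment_type"], item["sub_type"], item["value"]))
    return rows


if __name__ == "__main__":
    for row in explode_descriptions(SAMPLE):
        print(row)
```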
Query Standard JSON 5:07
Explore processing standard JSON files in Azure Synapse by reading the entire file as a single JSON string with vertical tab terminators, then using OPENJSON to extract six elements into two columns.
Query Multi Line JSON (Assignment) 2:24
Learn to process multi-line JSON by reading it into a single JSON string, overriding the row terminator to a vertical tab, and using the OPENJSON function.

Query Single Parquet File 8:04
Learn to query Parquet files in Azure Synapse by reading folders with automatic schema inference, refine data types, and save cost while boosting performance by selecting only the needed columns.
Query Folders and Sub Folders (Assignment) 3:36
Query folders and subfolders in Azure Synapse Analytics to read partitioned Parquet data using wildcard characters, the filename and filepath functions, and recursive folder access.
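
In a serverless SQL pool, filepath(n) returns the part of the path matched by the n-th wildcard, which is how partition values such as year and month can be recovered from folder names. The sketch below imitates that locally with a wildcard match and a regular expression. The folder layout (year=YYYY/month=MM) and file names are assumptions for the example.

```python
import re
from fnmatch import fnmatchcase

# Assumed layout: trip_data/year=2020/month=01/<file>.parquet
WILDCARD = "trip_data/year=*/month=*/*.parquet"
PARTITION_RE = re.compile(r"year=(\d{4})/month=(\d{2})/")


def partition_of(path: str):
    """Return (year, month) for paths matching the wildcard, else None."""
    if not fnmatchcase(path, WILDCARD):
        return None
    match = PARTITION_RE.search(path)
    return (match.group(1), match.group(2)) if match else None


def select_partition(paths, year: str, month: str) -> list:
    """Keep only the files that belong to one year/month partition."""
    return [p for p in paths if partition_of(p) == (year, month)]


if __name__ == "__main__":
    files = [
        "trip_data/year=2020/month=01/part-0.parquet",
        "trip_data/year=2020/month=02/part-0.parquet",
        "trip_data/readme.txt",
    ]
    print(select_partition(files, "2020", "01"))
```

Filtering on the partition values before reading any file content is what keeps the amount of data processed, and so the cost, down.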
Query Delta files 13:33
Discover how Delta Lake uses Parquet data with a delta log for transactions and time travel, and why queries point only at the main folder of data partitioned by year and month.

Data Discovery Overview 3:21
Explore data discovery by querying files directly without loading into databases, identify records and counts per day, week, and month, and join datasets with simple transformations to drive business value.
Identify Duplicates 5:20
Identify duplicates in a file by counting records per primary key (location ID) and using HAVING COUNT(*) > 1 in Azure Synapse Analytics.
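
The duplicate check can be tried locally. The sketch below runs the same GROUP BY ... HAVING COUNT(*) > 1 pattern against an in-memory SQLite table instead of a serverless SQL pool. The table name, column names, and sample rows are assumptions for the example.

```python
import sqlite3


def find_duplicate_ids(rows) -> list:
    """Return (location_id, count) for every location_id seen more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxi_zone (location_id INTEGER, borough TEXT, zone TEXT)"
    )
    conn.executemany("INSERT INTO taxi_zone VALUES (?, ?, ?)", rows)
    duplicates = conn.execute(
        """
        SELECT location_id, COUNT(*) AS number_of_records
        FROM taxi_zone
        GROUP BY location_id
        HAVING COUNT(*) > 1
        ORDER BY location_id
        """
    ).fetchall()
    conn.close()
    return duplicates


if __name__ == "__main__":
    sample = [
        (1, "EWR", "Newark Airport"),
        (2, "Queens", "Jamaica Bay"),
        (2, "Queens", "Jamaica Bay"),  # deliberate duplicate
    ]
    print(find_duplicate_ids(sample))
```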
Data Quality Checks 9:04
Identify data quality issues in the total amount field using basic checks (min, max, average) and nulls, revealing negative values and null payment types to inform data cleansing and reporting.
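
A minimal sketch of these basic checks, again using in-memory SQLite as a stand-in for the serverless SQL pool: it computes min, max, and average of the amount and counts negative amounts and null payment types. Column names and sample values are assumptions for the example.

```python
import sqlite3


def profile_trips(rows) -> dict:
    """Summarise total_amount and count suspicious records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trip (total_amount REAL, payment_type INTEGER)")
    conn.executemany("INSERT INTO trip VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT MIN(total_amount),
               MAX(total_amount),
               AVG(total_amount),
               COUNT(*),
               SUM(CASE WHEN total_amount < 0 THEN 1 ELSE 0 END),
               SUM(CASE WHEN payment_type IS NULL THEN 1 ELSE 0 END)
        FROM trip
        """
    ).fetchone()
    conn.close()
    keys = ("min", "max", "avg", "records", "negative_amounts", "null_payment_types")
    return dict(zip(keys, result))


if __name__ == "__main__":
    print(profile_trips([(10.0, 1), (-5.0, 2), (25.0, None), (30.0, 1)]))
```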
Joining Files 8:29
Join files to compute trips per borough by combining trip data with taxi zone data using OPENROWSET joins. Ensure location_id is not null, then group by borough and chart results.
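
The join itself is ordinary SQL. As a local sketch, the example below loads a few made-up trip and zone rows into in-memory SQLite tables, where the course would read both sides through OPENROWSET, and counts trips per borough. Table and column names are assumptions.

```python
import sqlite3


def trips_per_borough(trips, zones) -> list:
    """Return (borough, trip_count) ordered by trip count, highest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trip (pu_location_id INTEGER)")
    conn.execute("CREATE TABLE taxi_zone (location_id INTEGER, borough TEXT)")
    conn.executemany("INSERT INTO trip VALUES (?)", trips)
    conn.executemany("INSERT INTO taxi_zone VALUES (?, ?)", zones)
    result = conn.execute(
        """
        SELECT z.borough, COUNT(*) AS number_of_trips
        FROM trip AS t
        JOIN taxi_zone AS z ON t.pu_location_id = z.location_id
        WHERE t.pu_location_id IS NOT NULL
        GROUP BY z.borough
        ORDER BY number_of_trips DESC, z.borough
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(trips_per_borough(
        trips=[(1,), (1,), (2,), (None,)],
        zones=[(1, "Manhattan"), (2, "Queens")],
    ))
```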
Transform Data 6:11
Compute trip duration in hours by taking the difference between pickup time and drop-off time using DATEDIFF, then group by hourly ranges to count trips while filtering invalid records.
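
The bucketing logic can be sketched in plain Python: compute elapsed whole minutes between pickup and drop-off, divide by 60 to get an hour bucket, skip records where drop-off precedes pickup, and count trips per bucket. The timestamps are made up, and elapsed-time arithmetic is used here as an approximation of DATEDIFF, which counts boundary crossings and can differ at the edges.

```python
from collections import Counter
from datetime import datetime


def trips_by_hour_bucket(trips) -> dict:
    """Map whole-hour duration bucket -> number of valid trips."""
    counts = Counter()
    for pickup, dropoff in trips:
        start = datetime.fromisoformat(pickup)
        end = datetime.fromisoformat(dropoff)
        if end < start:
            continue  # invalid record: drop-off before pickup
        minutes = int((end - start).total_seconds() // 60)
        counts[minutes // 60] += 1  # bucket 0 = under 1 hour, 1 = 1 to 2 hours
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [
        ("2020-01-01 10:00:00", "2020-01-01 10:30:00"),
        ("2020-01-01 10:00:00", "2020-01-01 11:15:00"),
        ("2020-01-01 10:00:00", "2020-01-01 09:00:00"),  # invalid
    ]
    print(trips_by_hour_bucket(sample))
```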
Data Discovery Assignment 6:54
Identify the percentage of cash and credit card transactions by borough by joining trip data, taxi zone, and payment type, computing totals and percentages to guide campaigns.
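
One possible shape for this assignment, sketched locally with in-memory SQLite: join trips to zones and payment types, keep only cash and credit card trips, and compute each share per borough. This is an illustration under assumed table names, column names, and sample rows, not the course's solution.

```python
import sqlite3


def payment_share_by_borough(trips, zones, payment_types) -> list:
    """Return (borough, cash_pct, card_pct) over cash and card trips only."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE trip (pu_location_id INTEGER, payment_type INTEGER);
        CREATE TABLE taxi_zone (location_id INTEGER, borough TEXT);
        CREATE TABLE payment_type (payment_type INTEGER, description TEXT);
        """
    )
    conn.executemany("INSERT INTO trip VALUES (?, ?)", trips)
    conn.executemany("INSERT INTO taxi_zone VALUES (?, ?)", zones)
    conn.executemany("INSERT INTO payment_type VALUES (?, ?)", payment_types)
    rows = conn.execute(
        """
        SELECT z.borough,
               COUNT(*) AS total,
               SUM(CASE WHEN p.description = 'Cash' THEN 1 ELSE 0 END) AS cash,
               SUM(CASE WHEN p.description = 'Credit card' THEN 1 ELSE 0 END) AS card
        FROM trip AS t
        JOIN taxi_zone AS z ON t.pu_location_id = z.location_id
        JOIN payment_type AS p ON t.payment_type = p.payment_type
        WHERE p.description IN ('Cash', 'Credit card')
        GROUP BY z.borough
        ORDER BY z.borough
        """
    ).fetchall()
    conn.close()
    return [
        (borough, round(100.0 * cash / total, 2), round(100.0 * card / total, 2))
        for borough, total, cash, card in rows
    ]


if __name__ == "__main__":
    print(payment_share_by_borough(
        trips=[(1, 1), (1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (2, 3)],
        zones=[(1, "Manhattan"), (2, "Queens")],
        payment_types=[(1, "Credit card"), (2, "Cash"), (3, "No charge")],
    ))
```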

Requirements

All the code and step-by-step instructions are provided, but the skills below will greatly benefit your journey
Basic SQL knowledge will be required
Basic Python programming experience will be required
Knowledge of cloud fundamentals will be beneficial, but not necessary
An Azure subscription will be required. If you don't have one, we will create a free account in the course

Description

Welcome!

I am looking forward to helping you learn one of the most in-demand data engineering tools in the cloud, Azure Synapse Analytics! This course is taught by implementing a data engineering solution using Azure Synapse Analytics for a real-world project that analyses and reports on NYC Taxi Trips data.

This is like no other course on Udemy for Azure Synapse Analytics. Once you have completed the course, including all the assignments, I strongly believe that you will be in a position to start a real-world data engineering project on your own and will be proficient in Azure Synapse Analytics. The primary focus of the course is Azure Synapse Analytics, but it also covers the relevant concepts and connectivity to the other technologies mentioned.

The course follows a logical progression of a real world project implementation with technical concepts being explained and the scripts and notebooks being built at the same time. Even though this course is not specifically designed to teach you the skills required for passing the exams Azure Data Engineer Associate Certification [DP-203] or Designing and Implementing Enterprise-Scale Analytics Solutions Using Microsoft Azure and Microsoft Power BI [DP-500], it can greatly help you get most of the necessary skills required for the exams.

I value your time as much as I do mine, so I have designed this course to be fast-paced and to the point. The course is taught in simple English with no jargon. I start from the basics, and by the end of the course you will be proficient in the technologies used.

Currently the course teaches you the following

Azure Synapse Analytics Architecture
Serverless SQL Pool
Spark Pool
Dedicated SQL Pool
Synapse Pipelines
Synapse Link for Cosmos DB / Hybrid Transactional and Analytical Processing (HTAP) capability
Power BI Integration with Azure Synapse Analytics
Azure Data Lake Storage Gen2 integration with Azure Synapse Analytics
Project using NYC Taxi Trips data using the above technologies

Please note that the following are not currently covered

Data Flows
Advanced concepts around Dedicated SQL Pool
Spark Programming
SQL Fundamentals

Who this course is for:

University students looking for a career in Data Engineering
IT developers working in other disciplines who are trying to move to Data Engineering
Data Engineers/Data Warehouse Developers currently working on on-premises technologies or other cloud platforms such as AWS or GCP who want to learn Azure Data Technologies
Data Architects looking to gain an understanding of the Azure Data Engineering stack

Course content

Introduction • 3 lectures • 9min

Azure Subscription (Optional) • 2 lectures • 10min

Azure Synapse Analytics Overview • 11 lectures • 1hr 5min

NYC Taxi Project Overview • 6 lectures • 26min

Serverless SQL Pool - Overview • 4 lectures • 31min

Serverless SQL Pool - Query CSV • 11 lectures • 1hr 9min

Serverless SQL Pool - Query JSON • 6 lectures • 35min

Serverless SQL Pool - Query Folders & Multiple Files • 2 lectures • 18min

Serverless SQL Pool - Query Columnar Formats • 3 lectures • 25min

Serverless SQL Pool - Data Discovery • 6 lectures • 39min
