
Discover how to keep using Azure portal for free after 12 months with Microsoft Learn sandbox sessions. Switch directory to Microsoft Learn sandbox and work within four hours per session.
Share your reviews to support discourse; you’ll be invited to review after a few minutes. Set 720p quality, enable captions, adjust caption position and font size, and use speed control.
Create a free Microsoft Azure subscription by signing in with a Microsoft account. Receive $200 credit for 30 days, with 25 always free services and 12 months of free services.
Explore the Azure portal, learn to create, manage, and monitor resources via the web interface, customize dashboards, use global search, access marketplace and all services, and cloud shell options.
Learn how to delete resources and resource groups to avoid unwanted costs, and use cost management, cost analysis, and budgets with alerts to monitor and forecast spending.
Data volumes grow exponentially, and data is the biggest superpower in the present times. The data engineer profile evolves to meet rising demand for data professionals in Microsoft Azure environments.
Trace the data flow from collection and ingestion to storage, data wrangling, and dimensional and structural modeling for analysis, using data factory and enabling business insights.
Explore how Azure Data Factory v2 ingests and transforms raw data using distributed file systems, PolyBase, and Spark, storing it in blob storage or a data lake for analytics.
Dive into non-relational datastore concepts with blob storage, data lake, and Cosmos DB, and explore replication, security options, cost, throughput, partitioning, and global distribution.
Explore how Azure storage services handle blob, file, table, and queue data with durability, high availability, encryption, and regional redundancy, while offering a REST API and client libraries for scalable access.
Explore data redundancy options to protect data against hardware failures and disasters. Compare locally, zone, and geo redundant storage, including read-access variants, and weigh cost against availability.
Azure blob storage stores any file type as blobs within a storage account and containers, with block, page, and append blob types for streaming, logging, backup, and archiving.
Explore NoSQL table storage as a key-value store with rows and fields, emphasizing semi-structured data, partitioning, and fast, scalable data insertion and retrieval with Cosmos DB.
Discover how Azure file storage provides a fully managed, cross-platform file share that scales from on-premises to the cloud, enabling centralized configurations and tools accessible via SMB or NFS.
Explore how disk storage attaches to virtual machines, OS disks and data disks. Compare managed versus unmanaged disks and disk types like standard HDD, standard SSD, premium SSD, and ultra.
Understand how Cosmos DB evolved from DocumentDB to solve global distribution and scalability, storing data as JSON documents with querying capabilities.
Master Cosmos DB's multi-model APIs, including SQL, Table, MongoDB, Cassandra, and Gremlin. Learn to migrate on-premises data and select the right API for graphs, documents, and key-value stores.
Provision a Cosmos DB account in Azure by selecting subscription, resource group, and a unique name with Core SQL API, then configure networking, encryption, review, and create.
Explore Cosmos DB concepts by creating a database, containers, and items in the Data Explorer, and understand throughput, partition keys, and unique keys across multiple APIs.
Configure Cosmos DB throughput to maximize performance, measure throughput in request units (RU), and monitor latency, with alerts at 80 percent consumption and scalable options above 400 RU/s.
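The 80 percent alert mentioned above can be illustrated with a minimal sketch. This is plain Python, not an Azure SDK call; the function names and the way consumption is supplied are assumptions made for illustration only.

```python
def ru_utilization(consumed_ru_per_s: float, provisioned_ru_per_s: float) -> float:
    """Return the fraction of provisioned throughput that is being consumed."""
    if provisioned_ru_per_s <= 0:
        raise ValueError("provisioned throughput must be positive")
    return consumed_ru_per_s / provisioned_ru_per_s


def should_alert(consumed_ru_per_s: float, provisioned_ru_per_s: float,
                 threshold: float = 0.80) -> bool:
    """Hypothetical alert rule: fire when utilization reaches the threshold."""
    return ru_utilization(consumed_ru_per_s, provisioned_ru_per_s) >= threshold


if __name__ == "__main__":
    # A container provisioned at 400 RU/s that is consuming 340 RU/s.
    print(should_alert(340, 400))
```

In a real deployment the consumed value would come from Azure Monitor metrics rather than a function argument.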
Explore how Cosmos DB scales horizontally with unlimited storage and throughput by distributing data across multiple machines behind a container, through partitioning.
Explore how Cosmos DB uses a partition key to divide items into logical partitions that map to physical partitions, with multiple logical partitions able to share a single physical partition.
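The mapping from partition key to logical and physical partitions can be sketched as follows. This is an illustration only: Cosmos DB uses its own internal hashing and range assignment, so the MD5-and-modulo scheme and the function names here are assumptions standing in for it.

```python
import hashlib


def physical_partition_for(partition_key: str, physical_partition_count: int) -> int:
    """Map a partition key value to a physical partition (stand-in hash scheme)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % physical_partition_count


def place_items(items, key_field: str, physical_partition_count: int):
    """Group items as {physical partition: {logical partition key: [items]}}."""
    placement = {}
    for item in items:
        logical = item[key_field]  # one logical partition per distinct key value
        physical = physical_partition_for(logical, physical_partition_count)
        placement.setdefault(physical, {}).setdefault(logical, []).append(item)
    return placement


if __name__ == "__main__":
    users = ["u1", "u2", "u3", "u4", "u5"]
    items = [{"userId": u, "post": n} for u in users for n in range(2)]
    for physical, logical_partitions in place_items(items, "userId", 2).items():
        print(physical, sorted(logical_partitions))
```

All items that share a key value always land together, while several key values can end up on the same physical partition.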
Learn how Cosmos DB throughput can be configured at the database level or at the container level, and how that choice creates shared versus dedicated throughput.
Learn how container throughput is evenly divided among physical partitions in Cosmos DB, recognize hot partitions, and apply partition keys and partition-on-throughput strategies to spread data and queries.
Learn the difference between single partition and cross partition queries in Cosmos DB using a social network scenario, emphasizing efficiency and when fan out occurs.
Learn to select a high-cardinality partition key, like user id or product id, to avoid hot partitions and evenly distribute data and queries across partitions, while understanding partition size limits.
Create a Cosmos DB database and container, insert and update JSON items, and run queries with the data explorer. Learn about partition keys and throughput to optimize data operations.
Configure Cosmos DB time to live (TTL) to automatically delete documents after a set time by enabling TTL on the container and specifying a number of seconds, or -1 to never expire.
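The expiry rule can be sketched in plain Python. The function and parameter names are hypothetical, and the item-level override shown here is an assumption beyond what the summary states; treat it as a model of the behavior, not the service's implementation.

```python
from typing import Optional


def is_expired(last_modified_ts: int, now_ts: int,
               container_default_ttl: Optional[int],
               item_ttl: Optional[int] = None) -> bool:
    """Decide whether an item has outlived its time to live (all values in seconds)."""
    if container_default_ttl is None:
        return False  # TTL is not enabled on the container, so nothing expires
    # Assumption: an item-level value, when present, overrides the container default.
    effective_ttl = item_ttl if item_ttl is not None else container_default_ttl
    if effective_ttl == -1:
        return False  # -1 means never expire
    return now_ts - last_modified_ts >= effective_ttl


if __name__ == "__main__":
    print(is_expired(last_modified_ts=0, now_ts=100, container_default_ttl=60))
    print(is_expired(last_modified_ts=0, now_ts=100, container_default_ttl=-1))
```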
Explore Cosmos DB global distribution by replicating data across multiple data centers, improving read and write latency, availability, and disaster recovery with multi-region replication and automatic failover.
Cosmos DB enables multi-region writes, allowing read and write across centers, with conflict resolution options like last writer wins, merge procedures, or conflict feeds.
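A last-writer-wins policy can be sketched as picking the version with the highest value at a chosen property. The `_ts` field name and the sample documents below are illustrative assumptions; merge procedures and the conflict feed are not modeled.

```python
def resolve_last_writer_wins(versions, resolution_path: str = "_ts"):
    """Return the conflicting version with the largest value at resolution_path."""
    if not versions:
        raise ValueError("at least one version is required")
    return max(versions, key=lambda version: version[resolution_path])


if __name__ == "__main__":
    # Two regions wrote the same item at nearly the same time.
    conflicting = [
        {"id": "cart-1", "region": "West US", "total": 40, "_ts": 1700000010},
        {"id": "cart-1", "region": "North Europe", "total": 55, "_ts": 1700000012},
    ]
    print(resolve_last_writer_wins(conflicting)["region"])
```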
Explore manual and automatic failover in Cosmos DB across multi-region setups, configure priorities, and ensure seamless continuity while preserving consistency during regional outages.
Explore Cosmos DB consistency levels—strong, bounded staleness, session, consistent prefix, and eventual—and how they trade latency, availability, and data freshness in design.
Explore how to use the Azure CLI to create and configure a Cosmos DB account, including resource groups, databases, and options like default consistency level, automatic failover, and multi-region settings.
The data lake is a massive repository for structured, unstructured, and raw data in native format, handling any volume. It enables immediate loading and later transformations, unlike traditional warehouses.
Trace how Data Lake Storage Gen2 evolved from HDFS in the cloud, blending blob storage's cost efficiency and tiers with distributed file system fault tolerance for big data analysis.
Compare blob storage with data lake. Built on blob storage, data lake storage gen2 enables big data analytics with Hadoop integration.
Explore Azure blob and data lake security options, including storage account keys, shared access signatures, and Azure AD RBAC and ACL, with network and IP restrictions.
Learn how same-region high availability uses multiple instances and a load balancer to avoid downtime, and how recovery time objective and recovery point objective govern data loss during failover.
Examine high availability and disaster recovery options for storage, including locally and zone redundant storage, plus manual failover to a secondary region and blob data protection features.
Explore Cosmos DB high availability and disaster recovery options, including regional replication, global distribution, automatic and manual failover, and automated backup and restore with blob storage.
Learn about relational data stores, Azure database offerings (single, elastic, managed), the Azure data warehouse, and PolyBase loading, MPP architecture, storage and data distribution, partitioning, and loading methods.
Discover how Azure SQL, a fully managed relational database as a service, leverages elastic pools and offers 99.99% uptime, geo-replication, and security features.
Examine IaaS vs. PaaS for SQL Server workloads: manage SQL Server in a virtual machine versus a fully managed database service with automated backups, scaling, and high availability.
Explore Azure SQL deployment options in a PaaS framework, including single database, elastic pool, and managed instance, with guaranteed resources and shared pool benefits.
Provision a single database in Azure SQL Database and explore deployment options—single database, elastic pool, and managed instance—along with firewall and connectivity basics.
Explore Azure SQL Database purchasing models and service tiers, including DTU-based and vCore-based options, with general purpose, standard, business critical, and hyperscale deployments.
Provision three Azure SQL databases and group two into an elastic pool on a shared server, migrate the third, then demonstrate scaling and cleanup of resources.
Explore Azure SQL Database security layers, including firewall-based network access, authentication and authorization, auditing and threat protection, encryption in transit and at rest, dynamic data masking, and vulnerability assessment.
Explore vertical and horizontal scaling for Azure databases, including scale up and scale down, scale out with read-only replicas, and global scale out through sharding.
Compare traditional on-premises data warehouses with modern architectures that separate compute and storage, and ingest, clean, and model data for a single source of truth.
Create a dedicated SQL pool by provisioning a server, configuring firewall rules, and pausing compute to save costs, with deployment options inside the workspace or as a separate service.
Create a new Azure Synapse Analytics workspace, provision a data storage account and file system, set access roles, configure security, and open Synapse Studio to manage jobs.
Explore Synapse Studio's data, develop, integrate, monitor, and manage tabs to connect storage, link datasets, run SQL scripts, notebooks, data flows, and pipelines, and publish reports to Power BI.
Demonstrates how to create a dedicated SQL pool and an Apache Spark pool in a workspace, compares serverless versus dedicated pools, and discusses auto-pause and cost considerations.
Resume a dedicated SQL pool, create and populate a table with millions of rows, and analyze taxi trip data. Publish scripts, explore round-robin and hash distributions, and export results.
Demonstrates analyzing data from multiple sources with an Apache Spark notebook, loading datasets into a data frame, and ingesting results into Spark databases for New York taxi data.
Use the serverless SQL pool for ad hoc queries across blob storage and external data sources, create a serverless database, and link external CSV and Parquet files.
Demonstrates running a data factory pipeline inside Synapse Analytics Studio to copy data from SQL Server to a data warehouse, with connections, mapping, and monitoring.
Explore monitoring of pipelines, triggers, and integration runtimes in Synapse Studio, and review various query types, such as Spark, serverless, SQL pool, and data flow, with logs, inputs, outputs, and cost insights.
Explore the Azure Synapse MPP architecture with a control node, compute nodes, and a data movement service enabling parallel queries across 60 distributions and scalable data warehousing units.
Explore Azure Synapse storage and sharding patterns, including replicated, round-robin, and hash distributions, and learn how distributions drive parallel queries and table performance.
Learn to select data distributions in Azure Synapse to avoid data skew, using hash keys, replicated tables, and round-robin distributions across 60 buckets, and optimize joins, grouping, and performance.
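The effect of the distribution choice on skew can be sketched with a small simulation. Synapse uses its own deterministic hash function, so the SHA-256-and-modulo scheme below is a stand-in, and the function names are hypothetical.

```python
import hashlib

DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def hash_distribution(value) -> int:
    """Stand-in hash: the same key value always maps to the same distribution."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


def round_robin_counts(row_count: int):
    """Rows per distribution when rows are dealt out in turn."""
    counts = [0] * DISTRIBUTIONS
    for row_number in range(row_count):
        counts[row_number % DISTRIBUTIONS] += 1
    return counts


def hash_counts(keys):
    """Rows per distribution when rows are placed by hashing a key column."""
    counts = [0] * DISTRIBUTIONS
    for key in keys:
        counts[hash_distribution(key)] += 1
    return counts


def skew_ratio(counts) -> float:
    """Largest distribution divided by the average; 1.0 means perfectly even."""
    return max(counts) / (sum(counts) / len(counts))


if __name__ == "__main__":
    print(skew_ratio(round_robin_counts(6000)))        # even spread
    print(skew_ratio(hash_counts(["US"] * 6000)))      # one key value: worst skew
    print(skew_ratio(hash_counts(range(6000))))        # high-cardinality key
```

A low-cardinality hash key concentrates rows in a few distributions, which is the skew the summary warns about.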
Choose the smallest data types and default lengths for integers and characters to save space and move data efficiently between compute nodes; compare clustered columnstore, heap, and clustered index types.
Explore table partitioning in Azure Synapse, splitting large tables into partitions by date to speed queries, ease data load and maintenance, and avoid performance pitfalls from excessive partitions.
Apply dimensional modeling in Azure Synapse with hash-distributed fact tables for efficient joins. For dimensions, use hash or round-robin for small tables and avoid partitioning them.
This demo analyzes an on-premises Adventure Works data warehouse before migrating to Azure, using round-robin and hash distribution to prepare large fact and dimension tables and assess data types.
Explore single-client loading methods such as SSIS, Data Factory, or BCP, and parallel loading with PolyBase that bypasses the control node to feed compute nodes in a data warehouse.
Compare loading data into Azure Synapse with SSIS and PolyBase; note control node bottlenecks versus parallel loads from blob storage, and set up an external data source, file format, and external table.
Export the on-premises table to a flat file, upload it to blob storage, and run the PolyBase six-step load to the Azure data warehouse, monitoring and validating 60 distributions.
Scale Azure data warehouse on demand by adjusting the data warehouse unit (DWU), pause to save costs, and automate start or pause with PowerShell, Data Factory, or CLI.
Learn how to back up and restore an Azure SQL data warehouse with snapshots and restore points, seven-day retention, up to 42 points, user-defined options, final snapshots, and regional replication.
Differentiate online transaction processing databases from data warehouses: online transaction processing handles create, read, update, delete operations, while data warehouses optimize queries and reports with massively parallel processing across horizontally partitioned compute nodes.
Implement dynamic data masking in SQL Server to shield sensitive data such as social security numbers, credit cards, and emails using default, random number, email, and credit card masking.
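What the four mask types produce can be approximated in plain Python. These functions are illustrative stand-ins with hypothetical names, not SQL Server's implementation; in the database, masking is declared on the column and applied at query time.

```python
import random


def mask_default(value: str) -> str:
    """Approximate the default mask for string columns: hide the whole value."""
    return "XXXX"


def mask_email(value: str) -> str:
    """Approximate the email mask: keep the first character, hide the rest."""
    return value[:1] + "XXX@XXXX.com"


def mask_credit_card(value: str) -> str:
    """Approximate a partial mask that exposes only the last four digits."""
    digits = [ch for ch in value if ch.isdigit()]
    return "XXXX-XXXX-XXXX-" + "".join(digits[-4:])


def mask_random_number(low: int, high: int) -> int:
    """Approximate the random mask: return a random number in the given range."""
    return random.randint(low, high)


if __name__ == "__main__":
    print(mask_default("123-45-6789"))
    print(mask_email("alice@example.com"))
    print(mask_credit_card("1234-5678-9012-3456"))
    print(mask_random_number(1, 100))
```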
Learn to encrypt data at rest, in motion, and in use across Azure services using symmetric and asymmetric encryption, Key Vault, and Always Encrypted with deterministic or randomized options.
Discover how Data Factory, a cloud version of SSIS, lets you copy and transform data across 80 connectors, from on-premises to cloud sources, with built-in data flows for transformations.
Learn when to use Azure Data Factory versus specialized migration and streaming services, leveraging version 2, connectors, and event-based triggers for cloud data workflows, with SSIS integration.
Create a new Azure Data Factory v2, assign a unique name, choose subscription, resource group, and location, then explore the home, author, monitor, and management hubs to build pipelines.
Explore how Azure Data Factory components—integration runtime, activities, datasets, linked services, sources, sinks, and pipelines—work together to move and transform data, using copy and data flow tasks.
Develop and organize Azure Data Factory pipelines with activities, folders, and templates. Configure data movement, transformation, and control activities; validate, publish, and view JSON-backed code.
Connect and organize your data flows by defining linked services and datasets in Azure Data Factory, enabling copy activities to reference blob storage data with proper schema.
Learn how Data Factory's integration runtime executes activities, data flows, and data movement by bridging linked services and datasets with serverless, managed compute.
Use Azure Data Factory to copy data from blob storage to a SQL database with the copy data activity. Create linked services and datasets, map fields, and monitor the pipeline.
Demonstrate building a copy data activity pipeline in Azure Data Factory's Author page, sourcing from blob storage and sinking to SQL Server, with schema mapping, publishing, monitoring, and troubleshooting duplicates.
Explore the data factory user property across activities, view source and destination in the monitor, auto generate properties, and add custom user properties like Ishant.
Learn how parameters in Data Factory pipelines enable dynamic inputs, such as file names, container names, and destination details, to run the same workflow for multiple sources.
Explore data flow in data factory to transform data graphically with drag-and-drop, generating code behind the scenes for scalable mapping and wrangling data flows.
Demonstrates mapping data flow in Azure Data Factory by joining product and product category files from blob storage, producing a final output with corrected column names and data preview.
Explore wrangling data flow in Azure Data Factory to clean and transform data with Power Query, including column removal, renaming, and value replacement, and output to blob storage.
Explore how Azure Databricks harnesses Apache Spark on the Azure cloud to deliver a fully managed data lake, data factory integration, and machine learning workflows.
Connect Databricks to data in place via a service principal, mount a data lake, process with Scala, Python, or SQL, and save results back to the data lake.
Provision a Databricks service, create a workspace, and build an interactive cluster with a notebook in Azure; enable premium tier, RBAC, auto termination, and auto scaling.
Mount the data lake to Databricks DBFS using a service principal and app registration. Configure client id, directory id, and secret, grant read, write, execute, and verify access with dbutils.
Explore, analyze, clean, transform, and load taxi data in Databricks using notebooks and Spark DataFrames, reading from a data lake and writing results back to the data lake.
Explore Spark basics, an in-memory analytics engine, its evolution from MapReduce and Hadoop, and the RDD, DataFrame, and Dataset abstractions with lazy evaluation and actions.
Explore interactive and automated clusters in Azure Databricks for notebook analysis and scheduled jobs. Learn standard and high concurrency modes, auto scaling, and idle termination to save cost.
Explore Azure Databricks workspace fundamentals, including databases and tables, notebooks with multi-language support, and jobs with scheduling and cluster configurations.
Explore the streaming analytics service for real-time data, learn to configure inputs and outputs, write processing logic, and apply tumbling, hopping, sliding, and session windowing with end-to-end demos.
Describe live data processing with event producers, processors, and consumers and real-time responses to anomalies in banking and markets, using Azure Stream Analytics and related services.
Discover how Azure Stream Analytics delivers fully managed real-time analytics for fast-moving data, ingesting from Event Hubs, IoT Hub, and Azure Blob Storage to produce outputs.
Explore streaming analytics by grouping timestamped events into windows, computing metrics like average or count, and learning four types: tumbling windows, hopping windows, sliding windows, and session windows.
Understand tumbling windows by dividing time into non-overlapping buckets, using group by with a time unit and bucket value, e.g., ten-second intervals counting events.
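The ten-second bucketing can be sketched in plain Python. This is a simplified model, not Stream Analytics query syntax: windows are treated here as half-open intervals starting at multiples of the window size, which is an assumption about boundary handling.

```python
from collections import Counter


def tumbling_window_counts(timestamps, window_seconds: int = 10):
    """Count events per non-overlapping window, keyed by window start time."""
    counts = Counter()
    for ts in timestamps:
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)


if __name__ == "__main__":
    # Event times in seconds; each event falls into exactly one bucket.
    events = [1, 5, 12, 14, 19, 27]
    print(tumbling_window_counts(events))
```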
Understand hopping windows, where ten-second intervals overlap every five seconds, counting events like tweets across overlapping windows to illustrate dynamic time-based aggregations.
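The overlap can be sketched by assigning each event to every window that covers it. As a simplifying assumption, windows here are half-open intervals that start at multiples of the hop size; the function name is hypothetical.

```python
from collections import Counter


def hopping_window_counts(timestamps, window_seconds: int = 10, hop_seconds: int = 5):
    """Count events per overlapping window, keyed by window start time."""
    counts = Counter()
    for ts in timestamps:
        # Latest window start at or before the event, then walk back one hop at a time.
        start = int(ts // hop_seconds) * hop_seconds
        while start > ts - window_seconds:
            counts[start] += 1
            start -= hop_seconds
    return dict(counts)


if __name__ == "__main__":
    # With a 10-second window hopping every 5 seconds, each event is counted twice.
    tweets = [6, 7, 12]
    print(hopping_window_counts(tweets))
```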
Explore sliding window analysis with a fixed ten-second window. Each new event starts a window, creating overlap and producing window results of 1, 2, 4, and 1.
Define the session window as a non-fixed, non-overlapping window that starts on an event, ends after five minutes of silence, or after ten minutes (max), using minute units.
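The session rule can be sketched as grouping events until the gap or the total duration is exceeded. This simplified model uses hypothetical names and closes a session as soon as an event would break either limit; the service's exact boundary behavior may differ.

```python
def session_windows(timestamps_min, timeout_min: int = 5, max_duration_min: int = 10):
    """Group event times (in minutes) into sessions by idle gap and maximum length."""
    sessions = []
    current = []
    for ts in sorted(timestamps_min):
        too_quiet = bool(current) and ts - current[-1] > timeout_min
        too_long = bool(current) and ts - current[0] > max_duration_min
        if too_quiet or too_long:
            sessions.append(current)  # close the running session
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    print(session_windows([0, 2, 4, 12, 13, 30]))  # split by silence
    print(session_windows([0, 4, 8, 11]))          # split by maximum duration
```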
Learn to set up blob storage input and output, configure a stream analytics job with a processing query, and start processing uploaded JSON files that flow to the output.
Demonstrates processing data from an IoT hub using a streaming analytics job, with a simulated device feeding sensor data and outputs saved to blob storage.
Azure Monitor centralizes monitoring by collecting and analyzing data from metrics and logs across resources, enabling alerts, insights, log analytics, diagnostic settings, workbooks, dashboards, and custom views.
Explore the Azure Monitor service and its core tools—metrics, logs, alerts, activity logs, diagnostic settings, and workbooks—for monitoring, diagnosing, and optimizing resources.
Learn to monitor blob and data lake storage with insights and workbooks, analyzing metrics, alerts, and diagnostic settings to troubleshoot latency, availability, and capacity.
Explore Azure Synapse Analytics monitoring by reviewing query activity, alerts, metrics, and diagnostic settings; learn to create alert rules, configure actions, view query plans, and analyze the performance dashboard.
Monitor Cosmos DB in Azure with metrics, alerts, and diagnostic settings to track throughput, storage, latency, availability, consistency, and SLA performance.
Learn to monitor Azure data services including data factory, databases, and streaming analytics using metrics, log analytics, alerts, and service-specific tools for reliable operation.
Monitor data factory pipelines with dashboards and alerts, tracking completion status, run duration, errors, and resource usage. Explore pipelines, triggers, and integration runtimes to understand performance and set proactive alerts.
Learn to configure alerts and metrics for a data factory, create new alert rules and action groups, and set diagnostic settings to collect logs and metrics.
Explore monitoring options for Azure Databricks, including the Ganglia monitoring system embedded with Databricks, the Azure Monitor workflow, and Grafana with a Log Analytics workspace.
Monitor Azure stream analytics via portal, SDK, or Visual Studio, and configure alerts for utilization, runtime errors, watermark delay, and deserialization errors.
Optimize data partitioning and troubleshoot bottlenecks to boost performance. Structure the data lake, optimize ingestion and stream analytics, and apply hash, round-robin, and replicated tables in PolyBase ingestion for Synapse Analytics.
Identify and troubleshoot data partitioning bottlenecks by applying horizontal, vertical, and functional partitioning strategies, and follow best practices for balanced workload distribution and cross-partition optimization.
Optimize data lake storage by maximizing throughput with parallel reads/writes and fast cloud links, and size data between 256 megabytes and 100 gigabytes with month/date naming in the same region.
Optimize stream analytics by tuning input, output, and query processing; monitor CPU utilization and memory, and apply partitioning with PARTITION BY for scalable, parallel processing.
Implement best practices for Azure Synapse Analytics by maintaining up-to-date statistics on key columns, using PolyBase for large data loads, and distributing large tables by join-optimized columns.
Learn to manage data lifecycle in Azure blob storage by configuring policies that move blobs from hot to cool to archive, enabling rehydration and cost governance.
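A lifecycle rule of this kind can be sketched as a function of blob age. The 30-day and 90-day thresholds and the function name are hypothetical values chosen for illustration; real policies are defined as rules on the storage account.

```python
def target_tier(days_since_modified: int,
                cool_after_days: int = 30,
                archive_after_days: int = 90) -> str:
    """Pick the access tier a blob should be in, given its age in days."""
    if days_since_modified >= archive_after_days:
        return "archive"
    if days_since_modified >= cool_after_days:
        return "cool"
    return "hot"


if __name__ == "__main__":
    for age in (5, 45, 200):
        print(age, target_tier(age))
```

Blobs in the archive tier must be rehydrated to hot or cool before they can be read, which is why the thresholds deserve some thought.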
Explore the four data categories—structured, semi-structured, unstructured, and streaming—highlighting schema, flexibility, and real-time analysis with examples like XML, JSON, GPS, and IoT.
Explore relational and non-relational storage types, including key-value, document, graph, column-family, object, and file storage, and learn how to match data structure and latency needs for analytics and apps.
Choose Azure storage by matching your on-prem data to Cassandra API, MongoDB API, or SQL API, then use blob storage or data lake with time series and graph options.
Discover how the Azure data platform architecture layers load, storage, process, serve, and visualize data from sources, including streaming and relational data, using Event Hubs, Data Factory, Synapse, and PolyBase.
Explore RTO and RPO for disaster recovery, and see how lower RTO and near real-time data replication to an alternate region reduce downtime and data loss.
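How the two objectives are checked against an incident can be sketched with simple time arithmetic. The function name, the timestamps, and the objective values below are invented for illustration.

```python
from datetime import datetime, timedelta


def meets_objectives(last_replicated: datetime, failure: datetime,
                     service_restored: datetime,
                     rpo: timedelta, rto: timedelta):
    """Return (rpo_met, rto_met) for one outage."""
    data_loss_window = failure - last_replicated  # work not yet copied elsewhere
    downtime = service_restored - failure         # time the service was unavailable
    return data_loss_window <= rpo, downtime <= rto


if __name__ == "__main__":
    rpo_met, rto_met = meets_objectives(
        last_replicated=datetime(2024, 1, 1, 11, 58),
        failure=datetime(2024, 1, 1, 12, 0),
        service_restored=datetime(2024, 1, 1, 12, 45),
        rpo=timedelta(minutes=5),
        rto=timedelta(minutes=30),
    )
    print(rpo_met, rto_met)
```

Here replication lagged by two minutes, within a five-minute RPO, while the 45-minute outage misses a 30-minute RTO.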
Apply scenario-based design to choose Cosmos DB for global, real-time pricing; use Data Lake for business intelligence reporting from raw data; and Blob Storage for cost-effective video storage.
Evaluate scenarios to decide between SQL database elastic pool and data warehouse, illustrating cost-sensitive transactional workloads versus complex queries on massive data.
Design batch processing architectures on Azure using Blob Storage, Cosmos DB, Databricks, and Spark, with Data Factory or SSIS orchestration for end-to-end processing and reporting.
Learn data ingestion methods in Azure, compare batch and real-time approaches, and explore tools like the CLI, AzCopy, Data Lake, Hadoop, Sqoop, PolyBase, and Data Factory for efficient ingestion and orchestration.
Analyze real-time processing architecture across message ingestion, streaming, analytical data store, and reporting to deliver near-real-time insights with Event Hubs, Kafka, IoT Hub, and Spark Streaming.
Compare dual stream analytics architectures for real-time dashboards, storage-backed reporting, and real-time alerting, using event hubs, streaming analytics, and reference data.
Master lambda architecture for real-time hot path and batch cold path processing. Use data factory, data lake, data warehouse, and machine learning studio for archive, analysis, and AI-driven insights.
Enable secure connectivity with virtual network service endpoints and private endpoints to keep traffic within the network backbone and grant access to specific storage resources.
This course is designed to help you and your team develop the skills necessary to pass the Microsoft Azure DP-203 certification exam.
DP-203 is intended for Azure data engineers. This exam is all about implementation and configuration, so you need to know how to create, manage, use, and configure data services in the Azure portal.
Why should one take the DP-203 certification?
According to the 2019 Dice.com report, there was 88 percent year-over-year growth in job postings for data engineers, the highest growth rate among all technology jobs.
According to a recent study by Pearson VUE, after taking the certification, 65% of people say they feel more confident in their current job, and 35% say that their salary has increased.
Highlights of course
Course is completely up-to-date with the latest syllabus released by Microsoft for DP-203
Course covers 100% exam syllabus
Course content
25+ hrs of content
2 practice tests
Quiz - specially designed to clear concepts of objectives.
Further study material
PPT and Demo resources
Course includes:
Full lifetime access with all future updates
30-Day Money-Back Guarantee
Certificate of course completion
Intended Audience
Anyone who wants to prepare for DP-203 exam
Anyone who wants to become an Azure Data Engineer
Microsoft Azure Data Engineers
Microsoft Azure Data Scientist
Database and BI developers
Database Administrators
Data Analyst or similar profiles
On-Premises Database related profiles who want to learn how to implement these technologies in Azure Cloud.
Prerequisites
Basic Database concepts
Language
English
Please make sure you are comfortable with English; the captions alone are not good enough to understand the course.
Technologies covered in DP-203 certification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Implement non-relational data stores
Non-relational data stores (Blob Storage)
Cosmos DB
Implement relational data stores
Azure SQL Server
Azure Synapse Analytics Service
Manage data security
Data masking
Data Encryption
Develop batch processing solutions
Azure Data Factory
Azure Databricks
Develop streaming solutions
Azure Streaming Service
Monitor data storage
Monitoring for Azure Blob Storage
Monitoring for Azure Data Lake
Monitoring for Azure SQL Database
Monitoring for Azure Synapse Analytics
Monitoring for Azure Cosmos DB
Azure Monitoring Service
Monitor data processing
Monitoring for Azure Data Factory
Monitoring for Azure Databricks
Monitoring for Azure Stream Analytics
Optimize Azure data solutions
Optimize Azure Data Lake
Optimize Azure Stream Analytics
Optimize Azure Synapse Analytics
Optimize Azure SQL Database
Some student feedback
One of the most amazing courses i have ever taken on Udemy. Please don't hesitate to take this course. The instructor is really professional and has a great experience about the subject of the course. - Khadija Badary
Very nicely explained most of the concepts. a must have course for beginners - Manoranjan Swain
I appreciate this course explaining everything in great detail for a beginner. This will assist me in overcoming challenges at my work - Benjamin Curtis
Good course for Beginners. Labs are really helpful to grasp the concept. Thank you - Sapna