
Discover Microsoft Fabric, a unified data platform for data analysts, data engineers, and data scientists, covering data warehousing, real-time intelligence, Power BI, and data factory through end-to-end multi-experience projects.
Microsoft Fabric unifies data factory, Synapse Analytics, real-time intelligence, data science, and Power BI in a single SaaS platform built on OneLake with Delta Parquet for centralized data.
Explore how Microsoft Fabric serves data analysts, BI engineers, and data engineers through a course starting with essentials and Azure setup, covering warehousing, Power BI, Data Factory, and real-time intelligence.
Learn why Azure is essential for Microsoft Fabric, powering it with Azure cloud and integration with Azure Data Lake Storage Gen two, Synapse Analytics, Azure Data Factory, and machine learning.
Azure, Microsoft's cloud platform delivering IaaS, PaaS, and SaaS with pay-as-you-go pricing. Explore regions, availability zones, security, compliance, monitoring, and governance essentials.
Learn how the Azure portal acts as a unified web console to sign in, manage resources via subscriptions and resource groups, and monitor costs with cost analysis, budgets, and alerts.
Learn how to start a Microsoft Fabric free trial with your domain by creating a new user, granting global administrator and owner roles, then sign in to activate the trial.
Use your organizational Azure account to start the Microsoft Fabric free trial by signing up, entering details, and selecting start trial from your profile to see 59 days remaining.
Explore the Microsoft Fabric user interface, create and manage workspaces and objects across experiences like data engineering and data warehouse, and control access and governance within a unified, capacity-driven platform.
Set up a resource group and a Data Lake Storage Gen2 account to access Fabric data, enabling organized storage with hierarchical namespace, multiple file formats, and a Fabric-specific storage container.
Explore Parquet and Delta formats for efficient big data analytics. Delta adds a transaction log and ACID features, with governance and schema enforcement in Fabric.
Explore Synapse Data Warehouse and Power BI experiences in Microsoft Fabric by learning SQL basics, data warehousing, data modeling, and Power BI reporting, culminating in a report and dashboard.
Explore how data warehouses organize data with fact and dimension tables, implement star and snowflake schemas, and apply semantic modeling on a lakehouse using delta format and SQL.
Create a new Microsoft Fabric workspace named Data Warehouse Tutorial, attach the trial license, and verify it as your current workspace across experiences so all items are stored there.
Create your first data warehouse in fabric by selecting the data warehouse experience, naming it, and exploring schemas, tables, views, and functions to shape a semantic model for business intelligence.
Master T-SQL, Microsoft's SQL extension with procedural features for complex data manipulation in a data warehouse; create, run, and save queries (private or shared) and explore schemas such as dbo.
Learn how to add single-line and multi-line comments to SQL queries, with comments ignored by the engine and no impact on performance.
Access the SQL code for each lecture on the GitHub wiki, where you can view and copy code into your SQL script editor, with links and resources in the readme.
Create and manage schemas in Microsoft Fabric using T-SQL, applying the create schema syntax, specifying optional owner, and dropping schemas with drop schema, illustrated by a tutorial_schema example.
Explore data types in fabric data warehouse tables, including numeric, date/time, char, varchar, and binary, with guidance on precision, scale, and uses for decimal, float, and real.
Create and drop tables in Microsoft Fabric's data warehouse using T-SQL. Explore schema contexts like dbo and tutorial_schema, specify column names, data types, and optional constraints.
Learn to create a table, then insert single and multiple records with insert into and values, validate data types (integer, varchar, date), handle nulls, and use insert into select.
Load the retail dataset into the warehouse by creating a lakehouse and using a data pipeline to copy four parquet files—orders, users, products, reviews—into tutorial_schema as a double star schema.
Recaps the select statement, showing how to retrieve id, product_id, and quantity from the orders table with schema and database qualifiers, and explains aliasing and select star.
Use the distinct keyword after select to eliminate duplicates and return unique records. See four distinct categories, then two hundred unique category-title combinations, or two hundred rows with the star.
Explore built-in t-sql functions and expressions to craft queries, including arithmetic, string, date, and time functions, formatting outputs, aliasing results, and case statements.
Learn how to sort and limit data in SQL by using order by, asc/desc, multiple columns, expressions, and top N limits.
Master filtering sql query results with the where clause, using operators =, >, between, and in, with and/or and parentheses, then order by price and select top results.
Master grouping and aggregation in SQL with group by and aggregate functions like count, sum, average, max, and min. Use having to filter post-aggregation results and alias totals for clarity.
Master sql joins to combine data from related tables using inner, left, right, and full outer joins. Learn aliasing, selecting specific columns, and linking orders, users, and products.
Understand the execution order: from, join, where, group by, having, then select, then distinct, then top, then order by; recognize that join runs before other clauses except from.
Create a new table with the CREATE TABLE AS SELECT statement to copy data and match the schema from a select, including simple or complex queries and CTAS examples.
Update and delete records in the users_ctas table using update and delete statements, with a where clause and upper() to transform names, plus dropping the table when done.
Master subqueries, inner or nested queries that run first, filtering data before the outer query, with examples from orders and products and usage in from, where, and having clauses.
Explore how views in SQL function as virtual tables, and learn to create, save as view, query, and drop them, including data access restrictions.
Explore the visual query editor in Microsoft Fabric’s data warehouse to build queries with drag-and-drop transformations, merges, and filtering, no code required.
Discover zero copy clones that copy metadata without duplicating data, and use time travel on delta tables to query historical versions, enabling development, testing, and recovery.
Create and run sql queries across multiple warehouses using the sql query editor, referencing warehouses with a three level namespace to join tables from different warehouses.
Monitor SQL queries against your data warehouse using the queryinsights schema, which stores 30 days of execution data in views like exec_requests_history, exec_sessions_history, frequently_running_queries, and long_running_queries for performance insights.
Learn how stored procedures in T-SQL encapsulate complex logic and multiple statements, enabling reusable, parameterized operations, execution via exec, and safe modification with alter.
Learn to mirror data from Snowflake, Azure SQL, and Cosmos DB into Fabric's OneLake, using low-cost, low-latency replication and explore mirrored databases, semantic models, and SQL analytics endpoints.
Explore building a semantic model for a data warehouse using star and snowflake schemas, and leverage Direct Lake with Power BI to create and customize reports and visuals.
Establish relationships between tables in the semantic model to enable cross-table filtering and interactive visuals in Power BI, using cardinalities, cross-filter directions, and referential integrity.
Create dynamic measures with DAX in Power BI to compute total_order_amount using SUMX on the orders table, showing how measures differ from static SQL columns and support interactive visualization contexts.
Rename staging tables and transform them into a presentation layer. Create fact and dimension tables (fact_orders, fact_reviews, dim_products, dim_users) and model them in a semantic model for Power BI.
Build a multi-page Power BI report across revenue, customers, and products by connecting to the warehouse, creating visuals, and configuring interactions and slicers.
Pin visuals from reports to a single-page Power BI dashboard to create a real-time, multi-source view of key metrics, with resizable tiles, text tiles, and refreshed data from semantic models.
Create and manage apps in your workspace to package Power BI reports and dashboards, customize audience access, and publish or update the Retail Analytics App.
Explore how Data Factory offers a modern data integration platform to ingest, prepare, and transform data from databases, data warehouses, lakehouses, and real-time sources, via pipelines and dataflows.
Create a dedicated workspace by opening experiences and workspaces, then click new workspace, name it DataFactoryTutorial using camel casing, and apply it for the remainder of this section.
Create a lakehouse to store tables and files, distinguishing it from a warehouse. Ingest data with pipelines, data flows, use delta tables and notebooks, and query via SQL analytics endpoint.
Learn to build and manage data pipelines in Data Factory, configure copy data and wait activities, sequence tasks, validate and monitor runs via the pipeline canvas and run history.
Download the Citibike data zip and upload it to the lakehouse for data pipeline demonstrations, covering Jersey City rides from January to March 2024 (ride_id, rideable_type, dates, stations, member type).
Move files in a lakehouse using Copy Data activity in a Fabric data pipeline, configuring source and sink paths and converting between delimited text and parquet formats with snappy compression.
Explore copy data activities in Microsoft Fabric, comparing copy behaviors: preserve hierarchy, flatten hierarchy, and merge files across folders, subfolders, and formats like CSV and parquet.
Learn to copy data from delimited text files into lakehouse tables and a warehouse using the copy data activity, with append and overwrite options, validating results with row counts.
Learn to copy a lakehouse folder to an ADLS Gen2 container using a Fabric pipeline, configuring Azure blob storage connections, delivering preserved folder hierarchy and managing connections.
The lecture demonstrates the lookup activity in Data Factory, showing how it reads a file or table and returns single values or arrays for downstream activities.
Use the Get Metadata activity in Data Factory to retrieve metadata for tables or files, including exists, column count, structure, and item details, with results in JSON.
Explore how data factory's dynamic content uses expressions and functions to build source and destination paths, leveraging pipeline id, run id, utcnow, formatDateTime, and concat to name files.
Use pipeline parameters to pass external values into resources, parameterize a lookup table like jc_citibike, and reuse components with different values through expressions and defaults.
Define and set pipeline variables with set variable activity, creating variables and assigning utcnow-based values during run; reuse and reassign them across activities like lookup to read parquet in lakehouse.
Use the ForEach activity in a Data Factory pipeline to iterate over a collection produced by Get Metadata and run inner activities for each element in a loop.
Explore how the switch activity implements branching logic in data pipelines by evaluating an expression. Convert results to strings and drive cases or a default path with set variable.
Use the invoke pipeline activity to trigger another pipeline, configure the target (pl_get_metadata), view run IDs, and chain invocations with optional wait on completion for sequential or parallel execution.
Explore data transformation with Dataflows Gen2 in the Data Factory experience, merging Citibike CSV files from the lakehouse into a single jc_citibike_q1 table in a warehouse.
Execute sql operations like select, update, insert, create, alter, and drop in a data factory script activity to query a data warehouse and copy a table with script parameters.
Use the stored procedure activity in a data factory pipeline to filter jc_citibike by ride_type (electric_bike) and create jc_citibike_filter.
Finalize your data factory pipeline and schedule runs at regular intervals. Adjust the cadence to hourly or daily, with a default every 15 minutes, and set start and end times.
Design a Microsoft Fabric data project that ingests NYC taxi yellow parquet data into a lakehouse, uses Data Factory and Dataflow Gen2 for cleansing, and delivers a Power BI report.
Analyze 2024 NYC taxi data in parquet format (yellow taxi trip records) with pickup/dropoff times, counts, and totals, using the data dictionary and taxi zone CSV in fabric.
Create a nyctaxidata workspace with a lakehouse and warehouse, upload nyctaxi yellow parquet files and the taxi_zone_lookup csv, and build a staging pipeline to copy the lookup.
Ingest the yellow_trip_data_2024-01 parquet file into the staging nyctaxi_yellow table using a dynamic pipeline, implement data cleansing, and record processing metadata.
Build a dataflow gen2 to clean, transform, and enrich nyctaxi staging data with borough and zone lookups, then append to nyctaxi_yellow. Publish, orchestrate a pipeline, and log processing metadata.
Design an end-to-end data orchestration that invokes staging and presentation pipelines, schedules runs, and validates February data in the lakehouse for a Power BI report.
Add the NYC yellow taxi table to the semantic model, then create a Power BI report with date filters, KPIs, and visuals to track revenue and trips.
Replacing dataflow gen2 with a stored procedure in the pl_pres_processing_nyctaxi pipeline processes data from staging to presentation by populating dbo.nyctaxi_yellow with left-joined staging data, boosting performance.
Explore data engineering with spark, an open source distributed computing system for big data processing. Learn about dataframes, python focus, notebooks, and transforming the retail analytics data set.
Explore Apache Spark, an open source distributed framework for big data processing. See how a driver program, Spark context, and cluster manager orchestrate parallel tasks across a cluster using dataframes.
Master Spark SQL and the DataFrame API to query structured and semi-structured data with SQL, powered by Spark Core. Leverage DataFrames' named columns to simplify processing in Microsoft Fabric.
Configure a Microsoft Fabric workspace with a Spark starter pool, tune node sizes and runtimes, and enable libraries and automatic logging for Spark tasks.
Create a lakehouse in the data engineering workspace, enable lakehouse schemas, and use the SQL analytics endpoint for read/write access to delta and parquet, merging data lake and warehouse capabilities.
Upload retail analytics data to the lakehouse in Microsoft Fabric by importing orders and products parquet files, reviews json, and users csv from a zipped dataset, removing ds_store files.
Explore interactive notebooks in Microsoft Fabric, linking lakehouses, running Spark sessions, scheduling notebooks, and managing cells across Python, SQL, R, and Scala for end-to-end data engineering workflows.
Download and unzip the instructors notebooks zip, import the ipynb notebooks into a new instructors notebooks folder in Fabric, and detach and reattach them to your lakehouse.
Learn how to add comments in a code cell for documentation, using a hash followed by a space; comments are ignored by spark, while the code executes.
Explore Python built-in functions and modules, learn how to import libraries, alias modules, and use functions like print, len, and floor from the math module.
Learn to read parquet data into a spark dataframe with PySpark, using spark.read.parquet and spark.read.format('parquet').load(path), then display and profile the data in a notebook.
Read csv and json files with Spark, define or infer schemas, use header and delimiter options, and apply ddl-based schemas (int, timestamp, date, string) for data types; handle multi-line json.
Define schema in PySpark with StructType and StructField, importing data types from Spark SQL. Translate data definition language to SQL for readable, repeatable schemas built from StructField entries.
Learn how to write multi-line strings using triple quotes, avoid errors with line breaks, and read a schema into a DataFrame in a single cell.
Learn to load parquet, CSV, and JSON files into a spark data frame via a UI shortcut, with header true, while recognizing the need to define the schema.
Discover the limitations of SQL in Spark and how to load data into dataframes from Parquet, CSV, and JSON, while using Python for PySpark and integrating SQL where appropriate.
learn to select columns from data frames in Spark using df.select, wildcard and explicit references, and aliasing with call and col functions.
Learn to reference data frame columns using dot notation, alias the unit price as price, select id, quantity, and price, and display results from the orders data.
Learn to write Spark data frames to lakehouse files using the Data Frame Writer API, selecting formats (CSV, Parquet, Delta) and modes (append, overwrite, ignore, error) via path and options.
learn to write data frames to tables in a lakehouse and read them back into data frames. use delta, parquet, or csv formats and explore schema options.
Explore adding, updating, and removing DataFrame columns with withColumn, creating id2, order total, and literal values, plus using expr for SQL-like expressions.
Convert data types in a dataframe using cast with select or withColumn, demonstrated on a sample orders dataset; import StringType and col, view dtypes, and verify the change.
Learn to rename DataFrame columns in Spark using with column renamed or with columns renamed, apply dictionary mapping, and alias via select for order timestamp, unit price, and id fields.
Explore filtering rows in a data frame by using sql expressions or spark column references, including and/or conditions, and practical examples like gizmo category and price over 50.
Sort data frames by specified columns with the sort method and col, using asc or desc; then limit results to the top ten by price and show them.
Learn how to group and aggregate data frames using group by, apply aggregations like sum and average, alias and rename results, and sort across one or more columns.
Join data frames in Python using the dataframe join method, specifying the right data frame, join columns, and a join type (inner, full, left) to combine data like SQL.
Learn how to union two dataframes in PySpark and remove duplicate rows with drop duplicates, including selecting columns to define duplicates, illustrated with a df_products example.
Learn to use SQL syntax in PySpark to create schemas and tables in a lakehouse with spark.sql. Compare SQL and Python for schema creation and multi-line queries.
Create local temporary views from data frames via create or replace temp view to run SQL within a spark session; the views are session-scoped and disappear when it ends.
Explore how shortcuts in OneLake link to internal and external storage, using target and shortcut paths, to access data in lakehouses and external storages without copying.
Create spark dataframes from Python data structures using the spark session create dataframe method. Use lists, tuples, sets, and dictionaries with an optional schema to shape columns.
Run a notebook via a data pipeline and pass base parameters to enable dynamic results. Apply parameter cells and f-strings to filter a parquet dataset by vendor name.
Explore the medallion architecture for a data lakehouse, from bronze ingests through silver validation to gold enrichment, enabling incremental data quality and analytics for BI and data science.
Implement the medallion architecture in a lake house using bronze, silver, and gold delta tables, with a landing zone and a retail dataset, via a notebook for end-to-end processing.
Explore how to customize spark pools in Microsoft Fabric, comparing starter pools with custom large pools, configuring memory-optimized nodes, auto scaling, and dynamic executors for efficient data tasks.
Explore Spark environments in Microsoft Fabric, selecting runtimes, configuring compute, adding libraries from public repositories or directories, and creating test_environment to publish, attach to pools, and tune cores and memory.
Explore the real-time intelligence experience in microsoft fabric by streaming and processing live data into a lakehouse and real-time data hub, then analyze and visualize it for real-time data engineers.
Discover microsoft fabric's real time intelligence, enabling real time ingestion, processing, and analysis of streaming data within a unified analytics platform, with event streams, lakehouses, and reflex alerts.
Create a dedicated workspace named RealTimeTutorial and a lakehouse named RealTimeLakehouse for some demos. Enable Lakehouse schemas if available to streamline data workflows.
Create an eventstream in the real-time workspace, ingest bicycle sample data, and route to lakehouse table bike_data_stream. Use processed_timestamp, apply transformations, and configure a 1-day retention to support real-time analytics.
Create an eventhouse to host KQL databases, load sample weather analytics data from various sources, and build KQL querysets to query with KQL or SQL.
Demonstrates routing an eventstream to a lakehouse and a KQL database, with destination linking, event processing before ingestion, and creating a bike_stream_kql sink table.
Learn to analyze bike point data in real time with KQL and SQL queries, visualize results in Power BI, and build real-time dashboards with tiles.
Create and configure reflexes in the data activator to react to events and streams with objects, properties, and triggers; monitor empty docks and trigger alerts when thresholds exceed.
Create reflex alerts from a Power BI report visual to monitor metrics like daily revenue by date and payment method, triggering email notifications when revenue exceeds 4 million.
Delete the workspace to stop ongoing event streams from real-time intelligence, then remove the workspace via the three dots in workspace settings to delete all items.
Build an Azure data lake gen2 pipeline with bronze, silver, and gold layers to ingest, cleanse, and join nyc taxi yellow data with zone lookup, automated by data factory activator.
Create a dedicated workspace named NYCTaxiDataProject2 and a lakehouse named ProjectLakehouseDE, then enable lakehouse schemas to organize data for your NYCTaxi project.
Create an Azure Data Lake Gen2 storage container dedicated to the project and link it to the lakehouse as a shortcut, enabling notebook access via the landing-zone.
Assign the storage blob data contributor role to fabricuser1 via IAM; owning is not enough, and this grants read access to the container.
Upload 2024 January through May yellow taxi trip parquet files and the taxi zone lookup csv to Azure landing zone, in the NYC yellow taxi data folder, one by one.
Create bronze, silver, and gold schemas and a silver layer taxi zone lookup static table in the lakehouse; upload the taxi_zone_lookup.csv and query it via delta load.
Open and attach bronze, silver, and gold notebooks in fabric, process parquet data from the landing zone, and build nyc_taxi_yellow tables across layers using processing_timestamp.
Create a data factory pipeline named NYC taxi data orchestration to orchestrate bronze, silver, and gold processing notebooks. Set a processing_timestamp base parameter and delete landing zone data after processing.
Demonstrates an end-to-end pipeline run across bronze, silver, and gold layers in project lakehouse de, processing parquet data with a processing_timestamp and notebook-based queries.
Automate data ingestion by triggering the pipeline when a file is added to the landing zone, using a reflex to monitor blob creation and process bronze and silver layers.
Complete the project analysis by processing January–March 2024 data, extend to April and May, and enhance analysis with lake house tables in a semantic model to generate Power BI reports.
** Important Notice: As a pre-requisite for this course please make sure you have access to either your companies Azure account or your own personal Azure account that has been created at least 90 days ago. Otherwise you will not be able to start a Free Trial of MS Fabric **
Mastering Microsoft Fabric is your ticket to staying ahead in today’s data-driven world. With data skills in high demand, learning this all-in-one platform will set you apart in the job market. From automating data processes to creating real-time insights, Microsoft Fabric equips you with the cutting-edge tools needed by top employers. Whether you’re looking to boost your career or lead your companies data strategy, becoming proficient in Microsoft Fabric is a game-changer. Don't miss out on the opportunity to become an in-demand expert in the fast-growing field of data analytics and engineering.
This course offers a balanced mix of theory and practical application. You'll dive deep into the concepts behind each component of Microsoft Fabric, followed by hands-on walkthroughs that allow you to apply what you’ve learned. The multi-experience course projects will challenge you to integrate your knowledge across different Fabric tools, preparing you to handle real-world scenarios with confidence.
What You'll Learn:
Data Warehousing: Understand the principles of data warehousing, master SQL for querying and managing data, and learn how to design and implement data warehouses within Microsoft Fabric.
Power BI: Create powerful, interactive reports and dashboards.
Data Engineering: Develop robust data engineering pipelines using Spark (PySpark), manage large-scale data transformations, and automate workflows efficiently.
Data Factory: Master the use of Microsoft’s Data Factory for orchestrating and automating data movement and transformation.
Real-Time Intelligence: Leverage real-time data processing capabilities to gain instant insights and act swiftly in dynamic business environments.
Course Features:
Theory Lectures: Grasp the foundational concepts and methodologies behind each component of Microsoft Fabric.
Hands-On Walkthroughs: Apply your learning in real-time with detailed, step-by-step walkthroughs of key tools and features.
Multi-Experience Projects: Engage in comprehensive projects that require you to integrate multiple Fabric experiences, simulating real-world business challenges.
By the end of this course, you will be proficient in using Microsoft Fabric to design, build, and manage comprehensive data solutions. Whether you're a data engineer, analyst, or business intelligence professional, this course will equip you with the skills needed become an expert with Microsoft Fabric.