Name: Mastering Databricks SQL Warehouse and Spark SQL
Rating: 4.4 (572 reviews)

Created by Durga Viswanatha Raju Gadiraju

Last updated 9/2022

English

What you'll learn

Setup Databricks SQL Warehouse Environment using Azure Databricks for hands-on Practice
Getting Started with Databricks SQL for Data Analysis or Data Engineering
Features of Databricks SQL Warehouse - Clusters, Query Editor, Visualizations and Dashboards, etc
Overview of building reports and dashboards using Databricks SQL
Creating Databases and Tables using Databricks SQL or Spark SQL
Writing Basic Queries using Databricks SQL or Spark SQL
DML to load data into Databricks SQL or Spark SQL Tables
Advanced Operations such as Ranking and Aggregations using Databricks SQL or Spark SQL
Processing Semi-Structured Data using Databricks SQL or Spark SQL
In-depth Coverage of Delta Tables including all possible DML Operations such as Insert, Update, Delete, Merge, etc
End-to-End Life Cycle of Analyzing Data in Files using Databricks (from Uploading Files to Databricks through Reports and Dashboards)

Course content

17 sections • 198 lectures • 14h 7m total length

Introduction to Mastering Databricks SQL Warehouse and Spark SQL (3:56)

Introduction to Setup Databricks Environment using Azure (2:18)
Signup for Azure Portal (2:07)
Sign up for the Azure portal and set up Databricks on Azure to begin using Databricks SQL or engineering clusters; Databricks is also available on AWS and GCP.
Setup Azure Databricks using Azure Portal (5:23)
Launching Azure Databricks Environment (4:07)
Access the Azure Databricks environment via the unique workspace URL. Sign in with Azure AD single sign-on to access notebooks, clusters, and data science, engineering, or SQL workspaces.
Create Single Node Databricks Cluster (4:25)
Create a single-node Databricks cluster on Azure, with a selectable access mode and idle timeout to control costs, and attach notebooks to run Python, Scala, R, and SQL.
Editing Databricks Clusters using Databricks UI (2:38)
Learn how to edit and reconfigure Databricks clusters through the workspace UI, including changing runtime, node type, access mode, and inactivity settings, with a practical cluster wizard walkthrough.
Getting Started with Databricks Notebooks (5:14)
Create a notebook and attach it to a running Databricks cluster to run code. Switch between Python, SQL, Scala, and R, use magic commands, and run individual cells or full notebooks.
Create Databricks SQL Warehouse (3:01)
Explore the Databricks SQL editor interface, create and configure a SQL warehouse, choose a 2X-Small or X-Small cluster size, set auto stop, manage permissions, and prepare to develop queries.
Increase Quota to Create Databricks SQL Warehouse Cluster (5:42)
Increase your Azure quota to create a Databricks SQL warehouse cluster by submitting a quota request, then configure SQL warehouses and run queries once the quota is approved.
Run Queries using Databricks SQL Warehouse (4:55)
Run your first query on the nyctaxi trips table in the samples metastore using the Databricks SQL warehouse, then explore databases, tables, and visualizations.
Overview of Uploading Data using Databricks SQL Warehouse UI (5:15)
Review Data Explorer of Data Science and Engineering Environment (4:45)
Upload data via the Databricks SQL Warehouse UI, review the Data Science and Engineering interface, start clusters, explore catalogs and the DBFS file store, and create tables using the sales.csv path.
Analyze Sales Data using Databricks Notebooks (7:09)
Create a Databricks notebook, connect to the active cluster, and read a tab-delimited sales CSV file with spark.read.csv, using the header and an inferred schema, then define a precise schema to resolve issues.
Terminate Databricks Data Science and Engineering Clusters (1:45)
Terminate Databricks clusters through the inactivity timeout (30 minutes) or via the cluster dashboard, and delete the cluster state to avoid runtime and infrastructure charges; learn to create clusters and clean up resources.
Terminate Databricks SQL Warehouse Clusters (2:11)
Terminate Databricks SQL warehouse clusters by stopping the SQL warehouse in the UI to halt charges. Clean up stopped warehouses by deleting them, noting that you need compute to access tables.
Delete Azure Databricks Workspace (2:54)
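
As a rough illustration of the first-query lecture in this section, a query against the sample trips table might look like the sketch below. It assumes the built-in `samples` catalog is visible to the SQL warehouse and that the table exposes the columns named here; the query used in the course itself may differ.

```sql
-- Sketch only: assumes the `samples` catalog is available in the workspace
-- and that nyctaxi.trips has these columns.
SELECT tpep_pickup_datetime,
       trip_distance,
       fare_amount,
       pickup_zip,
       dropoff_zip
FROM samples.nyctaxi.trips
ORDER BY tpep_pickup_datetime
LIMIT 10;
```

The LIMIT clause keeps the result small, which is a sensible habit on a 2X-Small warehouse while exploring unfamiliar tables.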

Overview of Databricks SQL Platform - Introduction (3:33)
Explore the Databricks SQL platform on AWS, GCP, and Azure with premium access, featuring Photon-accelerated queries and lakehouse integration with business intelligence tools.
Run First Query using SQL Editor of Databricks SQL (3:58)
Overview of Dashboards using Databricks SQL (1:46)
Explore Databricks SQL visualization features with sample dashboards like NYC taxi trip analysis and retail revenue, and learn to connect BI tools via the SQL endpoint and use the SQL editor.
Overview of Databricks SQL Data Explorer to review Metastore Database and Tables (2:38)
Use Databricks SQL Editor to develop scripts or queries (4:38)
Review Metadata of Tables using Databricks SQL Platform (4:28)
Overview of loading data into retail_db tables (1:16)
Learn how to load non-Delta source data into Delta-format retail_db tables using the Databricks CLI and a data engineering cluster, and verify results with the SQL editor.
Configure Databricks CLI to push data into Databricks Platform (3:51)
Configure the Databricks CLI, generate and use a new token, validate access, and push raw data into Delta tables created via the SQL editor on the Databricks SQL platform.
Copy JSON Data into DBFS using Databricks CLI (5:48)
Leverage the Databricks CLI to copy the retail_db_json data into DBFS public, creating folders, performing recursive copies, and validating with the CLI and the web interface before loading into Delta tables.
Analyze JSON Data using Spark APIs (5:57)
Create a notebook in Databricks, spin up a single-node cluster, and analyze JSON data with spark.read.json to prepare Delta tables and align column order.
Analyze Delta Table Schemas using Spark APIs (3:45)
Analyze Delta table schemas with Spark APIs to review column names and data types, fix order discrepancies, and validate loading into retail_db tables.
Load Data from Spark Data Frames into Delta Tables (2:55)
Load data from Spark data frames into Delta tables by aligning column order to the target table and using APIs to overwrite and validate results in Databricks SQL.
Run Adhoc Queries using Databricks SQL Editor to validate data (3:59)
Validate that data is properly loaded into retail_db tables by running ad hoc queries in the Databricks SQL editor, comparing counts and distinct values for orders and order items.
Overview of External Tables using Databricks SQL (3:55)
Create external tables in Databricks SQL on top of files in DBFS or a data lake, defining the structure to query CSV or JSON data without copying files.
Using COPY Command to Copy Data into Delta Tables (7:19)
Manage Databricks SQL Endpoints (4:17)
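
The external-table and validation lectures in this section could be sketched roughly as follows. The DBFS path, the table names `retail_db.orders_ext` and `retail_db.orders`, and the `order_status` column are assumptions chosen for illustration and are not taken from the course material; the sketch also assumes a workspace where tables can be defined over DBFS paths.

```sql
-- Sketch only: path, table names, and column names are hypothetical.

-- External table: only metadata is registered; the JSON files stay where they are.
CREATE TABLE IF NOT EXISTS retail_db.orders_ext
USING JSON
LOCATION 'dbfs:/public/retail_db_json/orders';

-- Ad hoc validation: compare row counts and distinct values between the
-- files (via the external table) and the loaded Delta table.
SELECT 'orders_ext' AS source_name,
       count(*) AS row_count,
       count(DISTINCT order_status) AS status_count
FROM retail_db.orders_ext
UNION ALL
SELECT 'orders' AS source_name,
       count(*) AS row_count,
       count(DISTINCT order_status) AS status_count
FROM retail_db.orders;
```

If the two rows report the same counts, the load is at least consistent at that level; a mismatch points to a file that was skipped or loaded twice.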

Review Databases using Databricks SQL Data Explorer (3:08)
Create Database or Schema using Databricks SQL (3:53)
Create the lms_bronze database in Databricks SQL, learn that a database is a schema, review it with the schema browser and Data Explorer, and validate creation from the run feedback.
Using IF NOT EXISTS while Creating Databases using Databricks SQL (2:05)
Listing or Showing Databases and Getting Metadata of Databases using Databricks (3:23)
List databases with SHOW DATABASES and inspect metadata with DESCRIBE DATABASE, revealing location, owner, namespace, and comments, while Data Explorer offers access to lms_bronze and lms_silver.
Understand Default Location of Databricks SQL Database or Schema (3:20)
Create Database or Schema using Location in Databricks SQL Warehouse (3:09)
Drop Databases in Databricks SQL Warehouse (4:49)
Alter Database in Databricks SQL Warehouse (5:08)
Comments on Databases in Databricks SQL Warehouse (5:41)
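
The database lectures in this section map onto a handful of statements, sketched below. The comment text and the property name are made up for illustration, and the final DROP removes the database together with its tables, so treat it as a cleanup step for a practice environment only.

```sql
-- Sketch only: comment text and property values are hypothetical.

-- IF NOT EXISTS makes the script safe to rerun.
CREATE DATABASE IF NOT EXISTS lms_bronze
COMMENT 'Bronze layer for course catalog data';

-- List databases, then inspect location, owner, and comment.
SHOW DATABASES;
DESCRIBE DATABASE EXTENDED lms_bronze;

-- Attach a custom property to the database.
ALTER DATABASE lms_bronze SET DBPROPERTIES ('layer' = 'bronze');

-- CASCADE also drops any tables inside the database.
DROP DATABASE IF EXISTS lms_bronze CASCADE;
```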

List Databases and Save Databricks SQL Script (2:10)
Create Table using Delta Format in Databricks SQL Warehouse (4:41)
Create a Delta-format table in Databricks SQL Warehouse using the lms_silver database, with a users table featuring user_id int, user_first_name string, user_last_name string, and user_email string.
Understand Location and Using Clause to specify File Format for Databricks (7:27)
Create External Table using Delta Format in Databricks SQL Warehouse (4:26)
Drop External Table and Delete Folder in Databricks SQL Warehouse (5:46)
Drop an external table to delete its metadata only, then clean up the data folder using the Databricks CLI or a notebook, noting that Unity Catalog may allow dropping the external location.
Overview of DML or CRUD Operations using Databricks SQL (2:26)
Learn how to manage data in a Delta table using Databricks SQL, covering insert, select, update, and delete operations (DML/CRUD) with table metadata and basic queries.
Insert Records into Databricks SQL Warehouse table (7:58)
Insert Multiple Records into Databricks SQL Warehouse table (4:17)
Learn to insert multiple records into a Databricks SQL warehouse table in a single INSERT statement, including cleanup with TRUNCATE, column order alignment, and validating results.
Update Existing Records in Databricks SQL Warehouse table (4:31)
Update Databricks SQL warehouse tables by using UPDATE and SET to modify single or multiple columns, with WHERE conditions to target specific user IDs and handle nulls.
Update Existing Records in Databricks SQL Warehouse table based on Null Values (2:56)
Learn to perform DML updates on a Databricks SQL warehouse table when columns are null, using IS NULL and IS NOT NULL to set last names to a placeholder LNU.
Delete Existing Records in Databricks SQL Warehouse table (4:09)
Cleanup Users Tables from Databricks SQL Warehouse Database or Schema (6:34)
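
A minimal sketch of the table and DML lectures above, using the users table columns named in this section. The sample rows and email addresses are invented for illustration.

```sql
-- Sketch only: the sample rows are hypothetical.
CREATE DATABASE IF NOT EXISTS lms_silver;

CREATE TABLE IF NOT EXISTS lms_silver.users (
  user_id INT,
  user_first_name STRING,
  user_last_name STRING,
  user_email STRING
) USING DELTA;

-- Insert several records in one statement; values follow the column order.
INSERT INTO lms_silver.users VALUES
  (1, 'Scott', 'Tiger', 'scott@example.com'),
  (2, 'Donald', NULL, 'donald@example.com'),
  (3, 'Mickey', NULL, 'mickey@example.com');

-- Replace missing last names with the placeholder LNU.
UPDATE lms_silver.users
SET user_last_name = 'LNU'
WHERE user_last_name IS NULL;

-- Remove a single record, then validate what is left.
DELETE FROM lms_silver.users WHERE user_id = 3;
SELECT * FROM lms_silver.users ORDER BY user_id;

-- Cleanup: empty the table, then drop it.
TRUNCATE TABLE lms_silver.users;
DROP TABLE IF EXISTS lms_silver.users;
```

Note that `user_last_name = NULL` would never match any row; null checks need IS NULL or IS NOT NULL.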

Getting Started with Databricks fs Commands using Databricks CLI (4:06)
Learn to use the Databricks CLI to manage DBFS files, listing, copying, moving, and deleting, and prepare datasets for Delta-format tables.
Create Folder in DBFS using Databricks CLI Commands (2:31)
Create a folder in DBFS with the Databricks CLI by using mkdirs under file store to set up LMS_DL and copy the course catalog data from a local folder.
Copy Files from Local File System into DBFS using Databricks CLI Commands (5:16)
Copy local files into DBFS with databricks fs cp, using the -r and profile options, create LMS_dl/course_catalog, and verify via ls.
Overwrite Files while Copying into DBFS using Databricks CLI Command (4:33)
Copy folders into DBFS with databricks fs cp and the overwrite option, then preview small text files with cat to inspect JSON data.
Understand Course Catalog Data in the files uploaded to DBFS (5:46)
Review the course catalog JSON records stored in DBFS, identify their string attributes, and map them into course and instructor tables using Delta format across bronze and silver layers.
Options to Analyze Data using Databricks SQL Queries (1:34)
Explore practical options to analyze data in Databricks by running queries against the file path, or via views, external tables, or managed tables in DBFS, with or without loading data.
Run Select Queries using DBFS Path in From Clause (5:00)
Learn to run SELECT queries directly from a DBFS path using the FROM clause with backticks, validate data with JSON files, and explore using views and external tables in Databricks SQL.
Run Queries using Temporary Views in Databricks SQL (6:46)
Explore creating temporary views in Databricks SQL from JSON files via a path, and why temporary views may not power dashboards in SQL warehouse.
Run Queries using External Tables in Databricks SQL (3:39)
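
The query options described in this section (file path, temporary view, external table) can be sketched as below. The path `dbfs:/FileStore/lms_dl/course_catalog` and the object names are assumptions; substitute the folder you actually created with the CLI.

```sql
-- Sketch only: the DBFS path and object names are hypothetical.

-- Option 1: query the files in place. The format name is followed by the
-- path wrapped in backticks.
SELECT *
FROM json.`dbfs:/FileStore/lms_dl/course_catalog`
LIMIT 10;

-- Option 2: a temporary view over the same path. It exists only for the
-- current session.
CREATE OR REPLACE TEMPORARY VIEW course_catalog_tv AS
SELECT * FROM json.`dbfs:/FileStore/lms_dl/course_catalog`;

SELECT count(*) AS record_count FROM course_catalog_tv;

-- Option 3: an external table, which is registered in the metastore and
-- therefore visible to other sessions.
CREATE DATABASE IF NOT EXISTS lms_bronze;

CREATE TABLE IF NOT EXISTS lms_bronze.course_catalog_ext
USING JSON
LOCATION 'dbfs:/FileStore/lms_dl/course_catalog';
```

Because a temporary view disappears with the session that created it, objects meant to be shared are usually better defined as permanent views or tables.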

Queries to Process Values in JSON String Columns (1:56)
Learn how to query and process values in JSON string columns using Databricks SQL, with a focus on external tables, JSON processing functions, and building queries for data pipelines.
Get Distinct and Count based on Key using Course Catalog Data (5:59)
Filter Data using Basic Databricks SQL Queries using Course Catalog Data (5:09)
Filter data with a WHERE clause on the course_catalog table to retrieve records for instructors or courses, using star or explicit columns and noting case sensitivity.
Exploring Functions using Databricks SQL (6:22)
Understand Record Column Values in Course Catalog Table (2:41)
Processing JSON String Values using Databricks SQL Queries (7:37)
Learn to parse JSON strings in Databricks SQL with from_json, define schemas using struct, and extract fields like instructor_id and instructor_name for instructors and courses in the catalog.
Process Instructors JSON Records using Databricks SQL Queries (3:34)
Parse instructor and course records from JSON strings in the course_catalog table using from_json with the proper schema, enabling queries on course_id, instructor_id, and course_title.
Create View for Instructors using Databricks SQL Queries (3:58)
Create permanent views named instructors_v and courses_v in lms_bronze to process instructors and courses data; validate with SELECT, SHOW TABLES, and DESCRIBE.
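
A minimal sketch of the JSON processing described in this section. It assumes that `lms_bronze.course_catalog` has a string column `key` identifying the record type and a string column `record` holding the JSON payload, and that instructor rows carry the key value `'instructors'`; those names and values are guesses based on the lecture titles, not confirmed by the course.

```sql
-- Sketch only: the columns `key` and `record` and the key value are assumptions.

-- How many records of each type does the catalog hold?
SELECT key, count(*) AS record_count
FROM lms_bronze.course_catalog
GROUP BY key;

-- Parse the JSON string into a struct, then expose its fields as columns
-- through a permanent view.
CREATE OR REPLACE VIEW lms_bronze.instructors_v AS
SELECT parsed.r.instructor_id,
       parsed.r.instructor_name
FROM (
  SELECT from_json(record, 'instructor_id INT, instructor_name STRING') AS r
  FROM lms_bronze.course_catalog
  WHERE key = 'instructors'
) AS parsed;

-- Validate the view.
SELECT * FROM lms_bronze.instructors_v LIMIT 10;
DESCRIBE lms_bronze.instructors_v;
```

String comparison in the WHERE clause is case sensitive, so `'Instructors'` and `'instructors'` would select different rows.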

Create Delta Table for Course Catalog Data Set (4:46)
Get File Names along with Data using Databricks SQL Queries (1:58)
Overview of Databricks SQL COPY Command (3:07)
Copy Data from single file into Delta Tables using Files (4:19)
Copy Data from multiple files into Delta Tables using Files (2:35)
Copy data from multiple files into Delta tables with the COPY command after truncating the target table, and learn why loaded files are ignored and how to override this behavior.
Copy Data from multiple files into Delta Tables using Pattern (4:49)
Copy data from multiple files into Delta tables using a pattern, while overriding the default ignore behavior with copy options such as force and merge schema.
Create Course Catalog Table in Databricks SQL Warehouse with additional Column (3:50)
Create a Delta table with an extra created_ts column in Databricks SQL Warehouse, then populate it via COPY INTO using a query that derives created_ts from current_timestamp when the structures differ.
Copy Data from Files using Queries into Delta Tables (6:16)
Validate Course Catalog Table in Bronze Layer (4:12)
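
The COPY INTO variants covered in this section might look like the sketch below. The DBFS path, the `key` and `record` columns, and the assumption that each JSON file stores both as string fields are illustrative guesses; only `created_ts` and `current_timestamp` come from the lecture descriptions.

```sql
-- Sketch only: path and the key/record columns are hypothetical.
CREATE DATABASE IF NOT EXISTS lms_bronze;

CREATE TABLE IF NOT EXISTS lms_bronze.course_catalog (
  key STRING,
  record STRING,
  created_ts TIMESTAMP
) USING DELTA;

-- The target has one more column than the files, so the source is a query
-- that derives created_ts at load time. PATTERN selects the files, and
-- 'force' reloads files that an earlier COPY INTO already ingested.
COPY INTO lms_bronze.course_catalog
FROM (
  SELECT key,
         record,
         current_timestamp() AS created_ts
  FROM 'dbfs:/FileStore/lms_dl/course_catalog'
)
FILEFORMAT = JSON
PATTERN = '*.json'
COPY_OPTIONS ('force' = 'true');

-- Validate the load.
SELECT key, count(*) AS record_count, max(created_ts) AS last_loaded
FROM lms_bronze.course_catalog
GROUP BY key;
```

Without the force option, rerunning the same statement skips files it has already loaded, which is what makes COPY INTO safe to schedule repeatedly.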

Introduction to Insert or Merge Query Results or View into Delta Tables using Databricks SQL (3:08)
Learn to use the MERGE INTO syntax to insert or update data in Delta tables from tables, views, queries, or CTEs, with WHEN MATCHED and WHEN NOT MATCHED conditions.
Create Course Catalog and Instructors Tables using Databricks SQL (6:35)
Create and manage Delta tables in Databricks SQL by using MERGE and INSERT statements to populate lms_silver.instructors and lms_bronze.course_catalog, with conditional database and table creation.
Copy Data into Course Catalog Table from JSON Files using Databricks SQL (5:30)
Copy JSON data from DBFS into the lms_bronze.course_catalog Delta table using a Databricks SQL COPY command, validating with selects and row counts.
Insert Query Results into Delta Table using Databricks SQL (6:33)
Filter the bronze course_catalog for instructors, convert JSON to a struct with from_json, explode to expose instructor_id and instructor_name, then insert into lms_silver.instructors with created and updated timestamps.
Exercise to Create Courses Table and Insert Data (2:14)
Copy Instructors Data into Course Catalog Table from new file (3:23)
Understand the Concept of Merge or Upsert in DML or CRUD Operations (2:04)
Learn how merge upserts update existing records and insert new ones in the instructors silver table, using key matches between the source results and the instructors data.
Develop Query to Get the latest Instructors Records from Course Catalog Table (2:43)
Develop a Spark SQL merge to insert the latest instructor records from the course_catalog into lms_silver.instructors, using a subquery to fetch the max of bl_created_ts from lms_bronze.course_catalog.
Overview of Merge Statement Syntax using Databricks SQL (4:40)
Explore the MERGE statement syntax in Databricks SQL: merge into lms_silver.instructors from a Delta-format source using an ON clause, and update or insert when matched or not matched.
Merge Data into Instructors Table from Course Catalog using Databricks SQL (9:15)
Merge data from the course catalog into lms_silver.instructors using Databricks SQL, leveraging a CTE for the source and performing update and insert steps with validation.
Exercise to merge Courses Data from Course Catalog into Courses Table (3:35)
Execute a MERGE statement to upsert course catalog data into the courses table using a CTE, nested queries, and from_json for incremental data via the max created timestamp.
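
The upsert described in this section can be sketched as a single MERGE statement. The lectures build the source with a CTE; this sketch uses an inline subquery in the USING clause instead, which expresses the same logic. The columns `key`, `record`, `created_ts`, and `updated_ts` and the key value `'instructors'` are assumptions; `instructor_id`, `instructor_name`, and `bl_created_ts` appear in the lecture descriptions.

```sql
-- Sketch only: key, record, created_ts, updated_ts, and the key value are hypothetical.
MERGE INTO lms_silver.instructors AS t
USING (
  -- Source: instructor records from the most recent bronze load only.
  SELECT parsed.r.instructor_id,
         parsed.r.instructor_name
  FROM (
    SELECT from_json(record, 'instructor_id INT, instructor_name STRING') AS r
    FROM lms_bronze.course_catalog
    WHERE key = 'instructors'
      AND bl_created_ts = (
        SELECT max(bl_created_ts) FROM lms_bronze.course_catalog
      )
  ) AS parsed
) AS s
ON t.instructor_id = s.instructor_id
-- Existing instructor: refresh the name and the update timestamp.
WHEN MATCHED THEN UPDATE SET
  t.instructor_name = s.instructor_name,
  t.updated_ts = current_timestamp()
-- New instructor: insert a fresh row.
WHEN NOT MATCHED THEN INSERT
  (instructor_id, instructor_name, created_ts, updated_ts)
  VALUES (s.instructor_id, s.instructor_name, current_timestamp(), current_timestamp());
```

MERGE fails if more than one source row matches the same target row, so the source query should yield at most one row per instructor_id.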

Requirements

Basic SQL Skills and Data Analysis Skills
Computer with decent configuration and Internet
Valid Azure Account with Databricks (instructions are provided to set up the environment using Azure Databricks for hands-on practice)

Description

Databricks SQL Warehouse is a relatively new technology for building a Data Lakehouse or Data Warehouse on the powerful Apache Spark engine, where analytics can be built at scale. In this comprehensive course, you will learn all the key skills required to master Databricks SQL Warehouse, including Spark SQL, as the SQL in Databricks SQL Warehouse is based on Spark SQL.

This course also covers most of the curriculum needed to clear the Databricks Certified Data Analyst Associate exam offered by Databricks.

Here are the high-level details related to this course. This is a beginner-level course in which you will not only learn the syntax and semantics of Databricks SQL or Spark SQL but also understand the underlying concepts.

Setup Course Material and Environment for Databricks SQL Warehouse
Managing Databases using Databricks SQL Warehouse
Manage Delta Tables in Databricks SQL Warehouse
Setup Data Set for Databricks SQL Views and Copy Commands
Databricks SQL or Spark SQL Queries to Process Values in JSON String Columns
Copy Data into Delta Tables in Databricks SQL Warehouse
Insert or Merge Spark SQL or Databricks SQL Query Results or View into Delta Tables
Merge Spark SQL or Databricks SQL Query Results and Data from Delta Table with Delete into Delta Tables
Basic SQL Queries using Spark SQL or Databricks SQL
Performing Aggregations using Group By and filtering using Having leveraging Spark SQL or Databricks SQL
Aggregations using Windowing or Analytical Functions including Cumulative Aggregations using Spark SQL or Databricks SQL
Ranking using Windowing or Analytical Functions using Spark SQL or Databricks SQL
Dealing with different file formats such as parquet, json, csv, etc using Spark SQL or Databricks SQL
All Important Types of Joins such as Inner, Left or Right Outer, and Full Outer using Spark SQL or Databricks SQL
Visualizations and Dashboards using Databricks SQL Warehouse
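
To give a flavor of the aggregation and ranking topics in the list above, here is a small sketch against the sample trips table. It assumes the built-in `samples` catalog is available and that the table has the columns used here; the threshold of 100 trips is arbitrary.

```sql
-- Sketch only: assumes samples.nyctaxi.trips exists with these columns.

-- Aggregation with GROUP BY, filtered on the aggregate with HAVING.
SELECT pickup_zip,
       count(*) AS trip_count,
       round(sum(fare_amount), 2) AS total_fare
FROM samples.nyctaxi.trips
GROUP BY pickup_zip
HAVING count(*) >= 100
ORDER BY total_fare DESC;

-- Ranking and a cumulative aggregation using window functions:
-- the three highest fares per pickup zip, with a running total by pickup time.
WITH ranked AS (
  SELECT pickup_zip,
         tpep_pickup_datetime,
         fare_amount,
         dense_rank() OVER (
           PARTITION BY pickup_zip
           ORDER BY fare_amount DESC
         ) AS fare_rank,
         sum(fare_amount) OVER (
           PARTITION BY pickup_zip
           ORDER BY tpep_pickup_datetime
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
         ) AS running_fare
  FROM samples.nyctaxi.trips
)
SELECT *
FROM ranked
WHERE fare_rank <= 3
ORDER BY pickup_zip, fare_rank;
```

The window function result cannot be filtered in the same WHERE clause that computes it, which is why the ranking is wrapped in a CTE before filtering.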

We have also provided quite a few exercises, along with solutions and explanations, throughout the course.

Key Takeaways of Mastering Databricks SQL and Spark SQL using Databricks SQL Warehouse

Setup Environment to learn Databricks SQL and Spark SQL using Azure
Support via Udemy Q&A backed by our expert team
Data Sets and Material via GitHub Repository along with instructions to practice Databricks SQL or Spark SQL
Lifetime Access to High Quality Video Lectures to learn Databricks SQL and Spark SQL

Who this course is for:

Data Analysts and BI Developers who want to understand Databricks SQL or Spark SQL Queries to analyze the data
Data Engineers who would like to understand Databricks SQL
QA Analysts or Engineers who would like to understand Databricks SQL for the validation of the data
Business Analysts to analyze the Data in the Data Lake using Databricks SQL or Spark SQL Queries
BI Developers who want to understand how to connect BI Tools to Databricks SQL Endpoint and develop required reports and dashboards

Course content

Introduction to Mastering Databricks SQL Warehouse and Spark SQL (1 lecture • 4min)

Setup Databricks Environment using Azure (16 lectures • 1hr 4min)

Setup Course Material and Environment for Databricks SQL (3 lectures • 12min)

Getting Started with Databricks SQL (16 lectures • 1hr 4min)

Managing Databases using Databricks SQL Warehouse (9 lectures • 35min)

Manage Delta Tables in Databricks SQL Warehouse (12 lectures • 57min)

Setup Data Set for Databricks SQL Views and Copy Commands (9 lectures • 39min)

Queries to Process Values in JSON String Columns (8 lectures • 37min)

Copy Data into Delta Tables in Databricks SQL Warehouse (9 lectures • 36min)

Insert or Merge Query Results or View into Delta Tables using Databricks SQL (11 lectures • 50min)
