
Sign up for the Azure portal and set up Databricks on Azure to begin using Databricks sql or engineering clusters, with options on aws and gcp.
Access the Azure Databricks environment via the unique workspace URL. Sign in with Azure AD single sign-on to access notebooks, clusters, and data science, engineering, or SQL workspaces.
Create a single-node Databricks cluster on Azure, with a selectable access mode and idle timeout to control costs, and attach notebooks to run Python, Scala, R, and SQL.
Learn how to edit and reconfigure Databricks clusters through the workspace UI, including changing runtime, node type, access mode, and inactivity settings, with a practical cluster wizard walkthrough.
Create a notebook and attach it to a running Databricks cluster to run code. Switch between Python, SQL, Scala, and R, use magics, and run individual cells or full notebooks.
Explore the Databricks SQL editor interface, create and configure a SQL warehouse, choose 2x-small or x-small clusters, set auto stop, manage permissions, and prepare to develop queries.
Increase your Azure quota to create a Databricks SQL warehouse cluster by submitting a quota request, then configure SQL warehouses and run queries once the quota is approved.
Run your first query on the nyctaxi trips table in the samples metastore using the Databricks sql warehouse, then explore databases, tables, and visualizations.
Upload data via Databricks SQL Warehouse UI, review data science and engineering interface, start clusters, explore catalogs, DBFS file store, and create tables using the sales.csv path.
Create a Databricks notebook, connect to the active cluster, and read a tab-delimited sales csv with spark.read.csv, using header and inferred schema, then define a precise schema to resolve issues.
Terminate databricks clusters by inactivity (30 minutes) via the cluster dashboard, and delete the cluster state to avoid runtime and infrastructure charges; learn to create clusters and clean up resources.
Terminate Databricks SQL warehouse clusters by stopping the SQL warehouse in the UI to halt charges. Clean up stopped warehouses by deleting them, noting you need compute to access tables.
Install and verify the Databricks CLI with Python 3 pip across Windows, Mac, and Linux, ensuring Python 3 compatibility and avoiding Python 2, with the 0.17.1 release.
Explore the databricks sql platform on aws, gcp, and Azure with premium access, featuring photon-accelerated queries and seamless data lake house integration with business intelligence tools.
Explore Databricks SQL visualization features with sample dashboards like NYC taxi trip analysis and retail revenue, and learn to connect BI tools via SQL endpoint and use the SQL editor.
Learn how to load non-delta source data into delta-format retail_db tables using Databricks CLI and a data engineering cluster, and verify results with the SQL editor.
Configure the databricks cli, generate and use a new token, validate access, and push raw data into delta tables created via sql editor on the databricks sql platform.
Leverage the Databricks CLI to copy the retail_db_json data into DBFS public, creating folders, performing recursive copies, and validating with CLI and the web interface before loading into delta tables.
Create a notebook in databricks, spin up a single-node cluster, and analyze json data with spark.read.json to prepare delta tables and align column order.
Analyze delta table schemas with spark APIs to review column names and data types, fix order discrepancies, and validate loading into retail DB tables.
Load data from spark data frames into delta tables by aligning column order to the target table and using APIs to overwrite and validate results in Databricks SQL.
Validate that data is properly loaded into retail db tables by running ad hoc queries in the Databricks sql editor, comparing counts and distinct values for orders and order items.
Create external tables in Databricks SQL on top of files in DBFS or data lake, defining the structure to query CSV or JSON data without copying files.
Create the LMS underscore bronze database in Databricks SQL, learn that a database is a schema, review with the schema browser and data explorer, and validate creation with run feedback.
List databases with show databases and inspect metadata with describe database, revealing location, owner, namespace, and comments, while data explorer offers access to LMS underscore bronze and LMS underscore silver.
Create a delta format table in Databricks SQL Warehouse using the LMS underscore silver database, with a users table featuring user_id int, user_first_name string, user_last_name string, and user_email string.
Drop an external table to delete its metadata only, then clean up the data folder using Databricks CLI or notebook, noting Unity Catalog may enable drop external location.
Learn how to manage data in a delta table using Databricks SQL, covering insert, select, update, and delete operations (DML/CRUD) with table metadata and basic queries.
Learn to insert multiple records into a Databricks SQL warehouse table in a single insert statement, including cleanup with truncate, column order alignment, and validating results.
Update Databricks SQL warehouse tables by using update and set to modify single or multiple columns, with where conditions to target specific user IDs and handle nulls.
Learn to perform dml updates on a Databricks SQL warehouse table when columns are null, using is null and is not null to set last names to a placeholder LNU.
Learn to use the Databricks CLI to manage DBFS files, listing, copying, moving, and deleting, and prepare datasets for delta file formatted tables.
Create a folder in DBFS with the Databricks CLI by using mkdirs under file store to set up LMS_DL and copy the course catalog data from a local folder.
Copy local files into dbfs with databricks fs cp, using -r and profile options, create LMS_dl/course_catalog and verify via ls.
Copy folders into dbfs with Databricks fs cp and the overwrite option, then preview small text files with cat to inspect json data.
Review the course catalog json records stored in dbfs, identify their string attributes, and map them into course and instructor tables using delta format across bronze and silver layers.
Explore practical options to analyze data in Databricks by running queries against the file path, or via views, external tables, or managed tables in DBFS, with or without loading data.
Learn to run select queries directly from a DBFS path using from clause with backticks, validate data with JSON files, and explore using views and external tables in Databricks SQL.
Explore creating temporary views in Databricks SQL from JSON files via a path, and why temporary views may not power dashboards in SQL warehouse.
Learn how to query and process values in JSON string columns using Databricks SQL, with a focus on external tables, JSON processing functions, and building queries for data pipelines.
Filter data with a where clause on the course_catalog table to retrieve records for instructors or courses, using star or explicit columns and noting case sensitivity.
Learn to parse JSON strings in Databricks SQL with from_underscore_JSON, define schemas using struct, and extract fields like instructor_id and instructor_name for instructors and courses in the catalog.
Parse instructor and course records from json strings in the course_catalog table using from_json with the proper schema, enabling the query of course_id, instructor_id, and course_title.
Create permanent views named instructors underscore V and courses underscore V in LMS underscore bronze to process instructors and courses data; validate with select, show tables, and describe.
Copy data from multiple files into delta tables with the copy command after truncating the target table, and learn why loaded files are ignored and how to override this behavior.
Copy data from multiple files into Delta tables using pattern, while overriding default ignore behavior with copy options such as force and merge schema.
Create a Delta table with an extra created_ts column in Databricks SQL Warehouse, then populate it via copy into using a query that derives created_ts from current_timestamp, when structures differ.
Learn to use the merge into syntax to insert or update data into delta tables from tables, views, queries, or cte, with when matched and when not matched conditions.
Create and manage delta tables in Databricks SQL by using merge and insert statements to populate LMS silver instructors and LMS bronze course_catalog, with conditional database and table creation.
Copy json data from dbfs into the lms_bronze.course_catalog delta table using a Databricks SQL copy command, validating with selects and row counts.
Filter the bronze course_catalog for instructors, convert JSON to a struct with from_json, explode to expose instructor_id and instructor_name, then insert into LMS_silver.instructors with created and updated timestamps.
Learn how merge upserts update existing records and insert new ones in the instructors silver table, using key matches between the source results and the instructors data.
Develop a Spark SQL merge to insert the latest instructor records from the course_catalog into LMS_silver.instructors, using a subquery to fetch max of bl_created_ts from LMS_bronze.course_catalog.
Explore the merge statement syntax in Databricks SQL, merge into LMS_silver.instructors from a delta format source using an on clause, and update or insert when matched or not matched.
Merge data from the course catalog into LMS_silver.instructors using Databricks SQL, leveraging a CTE for the source and performing update and insert steps with validation.
Execute a merge statement to upsert course catalog data into the courses table using a cte, nested queries, and from_json for incremental data via the max created timestamp.
Databricks SQL Warehouse is relatively new technology to build Data Lakehouse or Data Warehouse leveraging powerful Apache Spark Engine where the analytics can be built at scale. As part of this comprehensive course, you will learn all key skills required to master Databricks SQL Warehouse including Spark SQL as the SQL in Databricks SQL Warehouse is based on Spark SQL.
This course also covers most of the curriculum relevant to clear the Databricks Certified Data Analyst Associate Exam offered by Databricks itself.
Here are the high-level details related to this course. This is a beginner level course where you will be able to not only learn syntax and semantics of Databricks SQL or Spark SQL, you will also understand the concepts of the same.
Setup Course Material and Environment for Databricks SQL Warehouse
Managing Databases using Databricks SQL Warehouse
Manage Delta Tables in Databricks SQL Warehouse
Setup Data Set for Databricks SQL Views and Copy Commands
Databricks SQL or Spark SQL Queries to Process Values in JSON String Columns
Copy Data into Delta Tables in Databricks SQL Warehouse
Insert or Merge Spark SQL or Databricks SQL Query Results or View into Delta Tables
Merge Spark SQL or Databricks SQL Query Results and Data from Delta Table with Delete into Delta Tables
Basic SQL Queries using Spark SQL or Databricks SQL
Performing Aggregations using Group By and filtering using Having leveraging Spark SQL or Databricks SQL
Aggregations using Windowing or Analytical Functions including Cumulative Aggregations using Spark SQL or Databricks SQL
Ranking using Windowing or Analytical Functions using Spark SQL or Databricks SQL
Dealing with different file formats such as parquet, json, csv, etc using Spark SQL or Databricks SQL
All Important types of Joins such as Inner, left or right outer, full outer using Spark SQL or Databricks SQL
Visualizations and Dashboards using Databricks SQL Warehouse
We have also provided quite a few exercises along with solutions with explanations through the course.
Key Takeaways of Mastering Databricks SQL and Spark SQL using Databricks SQL Warehouse
Setup Environment to learn Databricks SQL and Spark SQL using Azure
Support via Udemy Q&A backed by our expert team
Data Sets and Material via GitHub Repository along with instructions to practice Databricks SQL or Spark SQL
Life Time Access to High Quality Video Lectures to learn Databricks SQL and Spark SQL