Databricks Certified Data Engineer Professional - Preparation
What you'll learn
- Learn how to model data management solutions on Databricks Lakehouse
- Build data processing pipelines using the Spark and Delta Lake APIs
- Understand how to use the Databricks platform and its tools, and the benefits they offer
- Build production pipelines using best practices around security and governance
- Learn how to monitor and log production jobs
- Follow best practices for deploying code on Databricks
Requirements
- MUST HAVE - All the skills of an "Associate" Data Engineer on the Databricks platform
- If you feel you lack these skills, you should first study my preparation course on Udemy for the Associate-level certification. There I cover all the fundamental concepts of the Databricks Lakehouse with hands-on training.
Description
If you are interested in becoming a Certified Data Engineer Professional from Databricks, you have come to the right place! This study guide will help you prepare for the certification exam.
By the end of this course, you should be able to:
- Model data management solutions, including:
  - Lakehouse (bronze/silver/gold architecture, tables, views, and the physical layout)
  - General data modeling concepts (constraints, lookup tables, slowly changing dimensions)
- Build data processing pipelines using the Spark and Delta Lake APIs, including:
  - Building batch-processed ETL pipelines
  - Building incrementally processed ETL pipelines
  - Deduplicating data
  - Using Change Data Capture (CDC) to propagate changes
  - Optimizing workloads
- Understand how to use the Databricks platform and its tools, and the benefits they offer, including:
  - Databricks CLI (deploying notebook-based workflows)
  - Databricks REST API (configuring and triggering production pipelines)
- Build production pipelines using best practices around security and governance, including:
  - Managing cluster and job permissions with ACLs
  - Creating row- and column-oriented dynamic views to control user/group access
  - Securely deleting data as requested under GDPR and CCPA
- Configure alerting and storage to monitor and log production jobs, including:
  - Recording logged metrics
  - Debugging errors
- Follow best practices for managing, testing, and deploying code, including:
  - Relative imports
  - Scheduling jobs
  - Orchestrating jobs
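To give a flavor of two of the pipeline topics listed above, deduplicating data and using CDC to propagate changes, here is a minimal sketch in plain Python. It is an illustration only, not the Spark or Delta Lake API and not code taken from the course; the field names `id`, `seq`, `op`, and `row` are hypothetical. On Databricks, this kind of logic would typically be expressed with a Delta Lake `MERGE` rather than Python dictionaries.

```python
# Illustrative only: plain-Python sketch of CDC-style change propagation.
# The change-record fields (id, seq, op, row) are assumed for this example.

def latest_change_per_key(changes):
    """Deduplicate a change feed: keep only the newest change for each key."""
    latest = {}
    for change in changes:
        key = change["id"]
        # A higher sequence number means a more recent change for that key.
        if key not in latest or change["seq"] > latest[key]["seq"]:
            latest[key] = change
    return latest


def apply_changes(target, changes):
    """Return a copy of target with the deduplicated changes applied."""
    result = dict(target)
    for key, change in latest_change_per_key(changes).items():
        if change["op"] == "delete":
            result.pop(key, None)       # remove the row if it exists
        else:
            result[key] = change["row"]  # insert or update (upsert)
    return result


target = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
changes = [
    {"id": 2, "seq": 1, "op": "update", "row": {"name": "Bobby"}},
    {"id": 2, "seq": 2, "op": "delete", "row": None},
    {"id": 3, "seq": 1, "op": "insert", "row": {"name": "Cy"}},
    {"id": 3, "seq": 1, "op": "insert", "row": {"name": "Cy"}},  # duplicate record
]

updated = apply_changes(target, changes)
print(updated)
```

The sketch keeps only the most recent change per key before applying it, so the earlier update to key 2 is superseded by its later delete, and the duplicated insert for key 3 is applied once.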
With the knowledge you gain during this course, you will be ready to take the certification exam.
I am looking forward to meeting you!
Who this course is for:
- Anyone aiming to pass the Databricks Data Engineer Professional certification exam
- Junior Data Engineers on Databricks wanting to gain the skills of Professional Data Engineers
Instructor
Senior data engineer with a master’s degree in data mining, based and working in France. I have over 10 years of experience working on software and data projects, including large data projects on Databricks.
I'm the author of the O’Reilly book “Databricks Certified Data Engineer Associate Study Guide: In-Depth Guidance and Practice”. Get your FREE PDF copy on my LinkedIn.
I hold 8 certifications from Databricks (check them on my website).
Feel free to connect with me on LinkedIn to stay updated on more content and exclusive opportunities.