Google Cloud Professional Data Engineer Certification Course

Name: Google Cloud Professional Data Engineer Certification Course
Rating: 4.4 (757 reviews)

Learn to use Google Cloud for Data Engineering and prepare for the Certification Exam!

Created by Jose Portilla, Pierian Training

Last updated 5/2023

English

English [Auto], Spanish [Auto]

What you'll learn

Learn what's needed to pass the Google Cloud Professional Data Engineer Certification Exam
Design Data Processing Systems with Google Cloud Tools
Operationalize Machine Learning Models for Deployment
Learn fundamentals of Big Data and Machine Learning
Discover how to use Google Cloud SQL and Spark
Learn to use BigQuery ML for predictions
Build real-time IoT Dashboards with Dataflow and Data Studio
Learn to use Google Cloud APIs for ML, such as Vision API and AutoML
Build Data Lakes and Data Warehouses on Google Cloud
Use Dataflow for Serverless Data Processing with Apache Beam
Understand the Google Cloud Platform Console
Create Virtual Machines with Google Compute Engine

Course content

14 sections • 90 lectures • 10h 30m total length

COURSE OVERVIEW LECTURE - PLEASE DO NOT SKIP! 0:30
Course Curriculum Overview 3:43
Explore Google Cloud data and machine learning services, from storage options like Cloud Spanner and Cloud SQL to BigQuery, Dataproc, Data Fusion, Pub/Sub, and Vertex AI offerings such as AutoML.
GCP Data Overview 5:58
Explore how Google Cloud organizes compute, storage, data, and machine learning services, with IAM security at its core, and highlights such as BigQuery ML and Vertex AI.
Data Lifecycle 11:17
Explore the data lifecycle on Google Cloud Platform, covering the big data attributes of volume, velocity, and variety, and how ingesting, storing, processing, analyzing, and visualizing data map to GCP services.

Introduction to Google Cloud 0:07
What is Cloud? A Cloud Computing Overview 12:02
Define cloud computing via NIST's five traits (on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service), plus Google's container-based, serverless approach.
GCP Network Infrastructure 12:19
Discover how Google Cloud network infrastructure connects computers worldwide via regions, zones, and multi-region locations, using submarine cables and points of presence (PoPs) for fault-tolerant, low-latency cloud deployments.
GCP Network Connections 18:58
Learn how Google Cloud uses edge PoPs, edge nodes, and CDN caching to deliver static content and route BigQuery queries via PoPs to data centers for fast results.
Why Choose GCP? 12:37
Discover why to choose Google Cloud Platform for compute, storage, big data, and machine learning, with open accessibility, strong multi-cloud security, and actionable budgeting and billing controls.
Google Cloud Account Set-up 6:59
Set up your Google Cloud account, claim $300 of free credits for 90 days, and tour the Google Cloud console to create projects and manage resources.
Billing and Budgets 4:41
Learn to monitor Google Cloud costs using billing reports and budgets, view forecasted costs and cost details, set alerts and automated actions across projects, and tailor cost controls.
Billing Tour: DEMO 7:05
Tour the Google Cloud billing console, including the overview, reports, and budgets tabs. Understand credits and single or multiple billing accounts as you prepare your first budget.
Setting a Budget Alert: DEMO 4:16
Create your first Google Cloud budget in the console: set a monthly scope across all projects, apply credits, define a target amount of 150, and add an email alert at the 1% threshold.

GCP Storage Overview 0:45
Explore the data storage options on GCP covered by the Google Cloud data engineer certification, from Cloud Storage and Filestore to Cloud SQL, Spanner, Firestore, Bigtable, and Memorystore.
GCP Storage Tour 5:39
Take a brief tour of Google Cloud storage options, including Cloud Storage, Filestore, Cloud SQL, Cloud Spanner, Firestore, and Bigtable, with use cases and the decision tree.
Cloud Storage 23:16
Explore Cloud Storage on Google Cloud Platform, covering buckets and objects, storage classes, security, and accessibility; learn to manage versions, lifecycle, and access with signed URLs and IAM.
Cloud Storage DEMO 16:45
Create a Cloud Storage bucket, upload and download an image, share it via a public link, organize with folders, and delete objects and the bucket.
Filestore 6:49
Discover Filestore, a managed NAS for Compute Engine and GKE with NFSv3 support, offering basic, enterprise, and high scale tiers for unstructured data.
Filestore DEMO 14:14
Set up a Google Cloud Filestore instance with an NFS client, mount a file share from the command line, create a test file, and clean up to avoid charges.
Cloud SQL 8:43
Cloud SQL provides a fully managed relational database service for MySQL, PostgreSQL, and SQL Server, offering high availability, persistent disks, static IP connectivity, IAM-based authentication, and clear pricing.
Cloud SQL DEMO 15:41
Follow a hands-on Cloud SQL demo: create a new project and a PostgreSQL Cloud SQL instance, connect via Cloud Shell, create a guest book database, and delete the instance.
Cloud Spanner 6:14
Explore Cloud Spanner, a relational database that scales horizontally to unlimited size with automatic regional distribution, TrueTime-based serializable transactions, and 99.999% availability.
Cloud Spanner DEMO 15:13
Create a Cloud Spanner instance, database, schema, and table. Insert and query data, then clean up resources, noting the Google Standard SQL and PostgreSQL dialects and processing units along the way.
Cloud Bigtable 7:58
Explore Cloud Bigtable, a scalable NoSQL key-value store indexed by a single row key, with column families and timestamped values for petabyte-scale data and sub-10 millisecond latency.
Memorystore 3:52
Explore Memorystore for Redis and Memcached, a fully managed in-memory data store that delivers sub-millisecond latency with private IPs, IAM integration, and scale on demand.

Big Data Overview 1:23
Explore the big data overview and learn how Google Cloud provides managed services for open source technologies like MapReduce, Hadoop, Apache Pig, Spark, and Kafka.
MapReduce 6:55
MapReduce coordinates parallel processing of huge datasets across a cluster by mapping data, shuffling keys, and reducing results, using a master node and worker nodes.
Apache Hadoop 5:58
Learn how Apache Hadoop enables distributed storage and processing across clusters with components like HDFS, YARN, and MapReduce, plus replication for fault tolerance and the Dataproc managed service.
Apache Pig 2:25
Learn Apache Pig, a high-level platform whose Pig Latin language runs on Hadoop MapReduce, for simple data analysis such as word count. Dataproc supports Pig queries on YARN, though this course does not use Pig.
Apache Spark 5:56
Explore Apache Spark as the in-memory, fault-tolerant evolution of big data analysis from MapReduce, with HDFS, RDDs, transformations and actions, and Dataproc on GCP.
Apache Kafka 4:26
Explore Apache Kafka as a scalable event streaming platform that stores immutable logs, supports topics and partitions, and enables producers, consumers, and real-time data processing via its five APIs.

Introduction to BigQuery 1:19
Explore the differences between data lakes and data warehouses, learn BigQuery basics, data loading schemas, nested and repeated fields, partitioning and clustering, and monitoring, streaming, and machine learning capabilities.
Data Lakes vs Data Warehouse 15:20
Explore the differences between data lakes and data warehouses, showing how Cloud Storage stores varied data types and how BigQuery enables analytics with a schema defined before storage and scalable pipelines.
BigQuery Architecture 9:42
Explore how BigQuery architecture delivers a cloud-first, serverless data warehouse by separating storage from compute and leveraging Borg, Dremel, Colossus, and Jupiter for scalable, columnar storage via Capacitor.
BigQuery Basic Hierarchy 11:47
Explore the basic BigQuery hierarchy (projects, datasets, tables, views, and jobs) and learn how native and external tables, federated queries, and the SQL-to-job lifecycle power analytics.
BigQuery Basics: DEMO 10:42
Explore BigQuery basics in the console, using the sandbox free tier, and learn to add public datasets, inspect data structures with INFORMATION_SCHEMA, and run useful SQL queries.
BigQuery Command Line Tool: DEMO 8:41
Explore the BigQuery command line tool (bq) in Cloud Shell, running show and query commands, working with public data such as Shakespeare, and comparing legacy versus standard SQL.
BigQuery Ingesting Data Input 5:11
Discover how to ingest data into BigQuery from diverse sources, and understand the storage versus compute separation, external data trade-offs, and streaming or batch loading with the Storage Write API.
BigQuery Loading Data: DEMO 15:33
Load data into BigQuery from a Google Drive spreadsheet, a CSV upload, and a Cloud Storage file, with steps to create datasets and tables.
BigQuery Understanding Schemas 7:18
Explore how to design schemas in BigQuery, comparing normalized and denormalized data, and using nested and repeated fields to balance performance with data integrity.
BigQuery Nested and Repeated Fields 7:17
Explore how BigQuery uses nested and repeated fields with STRUCT and ARRAY to build denormalized schemas, balancing performance with flexible, multi-value records.
Nested Fields: DEMO 7:51
Explore BigQuery nested fields by querying and previewing tables with nested structures, including repository attributes and forks.
Partitioning and Clustering 5:48
Explore how BigQuery partitioning and clustering optimize storage and query performance by pruning partitions, managing partitioned data, and co-locating related values for faster filters and aggregations.
BigQuery and Machine Learning 3:48
Discover how BigQuery ML enables SQL practitioners to build and run models like linear regression, logistic regression, k-means, boosted trees, and deep neural networks inside BigQuery using standard SQL.
BigQuery and Machine Learning: DEMO 11:50
Learn to create a BigQuery dataset and build a BigQuery ML linear regression model to predict penguin body mass from features, then evaluate and predict results.
BigQuery Best Practices 5:29
Learn BigQuery best practices to boost performance and reduce costs using selective columns, early filters, and efficient joins, plus partitioning and clustering.
BigQuery IAM Policy and Monitoring 4:23
Explore how BigQuery integrates with IAM policies and Cloud Monitoring to manage permissions and roles across project, dataset, and table resources, and to track jobs, bytes scanned, and query times.
BigQuery Streaming 4:32
Learn how BigQuery streaming connects compute to storage via the Storage Write API, including the default stream, batch loading, and the pending, committed, and buffered modes for low-latency ingestion.

Introduction to Dataproc 0:37
Explore the Hadoop ecosystem and how Dataproc features and Cloud Storage compare to Hadoop file systems, then cover Dataproc optimization and a practical Apache Spark demo on Google Cloud Platform.
Hadoop Based Ecosystem Review 4:07
Explore the Hadoop-based ecosystem, including HDFS, MapReduce, Spark, Hive, and Pig, and its fit with Dataproc. Compare on-premises limitations and see how Dataproc enables scalable cluster management in the cloud.
Dataproc Key Features 6:06
Explore Dataproc, a fully managed, scalable service for running Hadoop, Spark, Flink, Presto, and 30+ open source tools, with serverless deployment, logging, monitoring, rapid cluster startup, and tight GCP integration.
Dataproc Optimization 5:15
Optimize Dataproc on GCP by selecting data location and region, balancing firewall rules, input files, and Cloud Storage sizing, and using ephemeral clusters with autoscaling and workflow templates.
Dataproc: DEMO 10:39
Watch a Dataproc demo that creates a cluster with a master and workers, submits a Spark Pi job, and cleans up by deleting the cluster.
Dataproc with Cloud Storage 5:44
Dataproc leverages Cloud Storage to separate compute from storage, delivering direct data access and HDFS compatibility with gs:// prefixes, while enabling scalable, low-maintenance analytics on GCP.

Introduction to Data Fusion 0:28
Explore Cloud Data Fusion by examining core concepts, setting up a pipeline with the user interface, and demonstrating a practical Cloud Data Fusion workflow.
Cloud Data Fusion 7:01
Learn how Cloud Data Fusion provides a drag-and-drop, managed data integration platform that connects diverse data sources, cleans and transforms data, and orchestrates pipelines on Dataproc with Spark or MapReduce.
Data Fusion User Interface 5:09
Explore Cloud Data Fusion's user interface, focusing on Wrangler for data cleaning, the Studio pipeline builder, and the metadata tools to inspect schema and lineage, then deploy and monitor pipelines.
Data Fusion Download JSON Files Here 0:03
Data Fusion: DEMO 12:42
Explore Cloud Data Fusion's Studio pipeline builder and its graphical interface. Import and export pipelines, deploy and run them, and examine metadata lineage across sources and sinks like BigQuery.

Introduction to Cloud Composer 0:29
Discover Apache Airflow fundamentals, then see the features of Cloud Composer, a managed Airflow service on GCP, followed by a practical demonstration.
Apache Airflow Overview 4:55
Explore Apache Airflow, an open source workflow management platform for data engineering pipelines, defined by Python DAGs. Leverage dynamic pipelines with Jinja parameterization, nodes, tasks, operators, and Cloud Composer integrations.
Cloud Composer 4:36
Cloud Composer provides a fully managed workflow orchestration service for Apache Airflow, enabling you to schedule, monitor, and manage DAGs across clouds and on-premises data centers with Python scripts.
Cloud Composer: DEMO 8:33
Explore a hands-on Cloud Composer demo that guides you through enabling APIs, creating an environment, uploading a quickstart DAG, and using the Apache Airflow UI to monitor DAGs.

Introduction to Cloud Dataflow 0:25
Explore Cloud Dataflow, pipelines, templates, and SQL, and see a practical demo of using Cloud Dataflow on GCP.
Cloud Dataflow Overview 6:52
Explore Cloud Dataflow, a fully managed Apache Beam service for unified batch and streaming data processing, with autoscaling and seamless integration from Pub/Sub to BigQuery.
Cloud Dataflow: Templates and SQL 3:49
Explore Cloud Dataflow templates and Dataflow SQL to design, stage, and run Apache Beam pipelines from the cloud, with flexible templates, runtime parameters, and BigQuery web user interface integration.
Cloud Dataflow: DEMO 14:37
Explore a hands-on Cloud Dataflow demo: configure permissions and APIs, install Apache Beam in Cloud Shell, run a word count pipeline on Shakespeare text, and view results in Cloud Storage.

Introduction to Pub Sub 0:29
Explore the open source projects Apache Kafka and Apache Pulsar, whose concepts Pub/Sub builds on, and examine the Pub/Sub architecture. Finish with a practical demonstration of using Pub/Sub.
Apache Kafka Pulsar 5:34
Learn how the concepts behind Apache Kafka and Pulsar underpin GCP Pub/Sub, with producers, consumers, and topics delivering durable, immutable logs, low latency, multi-tenancy, and horizontal scalability.
Pub Sub Overview 2:48
Explore Cloud Pub/Sub on GCP, a publisher-subscriber messaging system with latency on the order of 100 ms for asynchronous ingestion and distribution via topics and subscriptions.
Pub Sub Architecture 8:49
Explore Pub/Sub architecture with topics and subscriptions, compare pull and push subscriptions, and see how publishers and subscribers exchange events through topics with filtering, acks, and ordering.
Pub Sub DEMO 5:11
Learn how to create a Pub/Sub topic and subscription in Google Cloud, publish messages, and pull and acknowledge them, with defaults and optional filters.

Requirements

Google Cloud Account (to follow along with demo labs and assignments)
Some Basic Google Cloud Platform experience

Description

Welcome to your one-stop shop for passing the Google Cloud Professional Data Engineer Certification Exam!

A FULL 50 QUESTION PRACTICE EXAM IS INCLUDED WITH THIS COURSE!

We've designed this course to be a complete resource for you to learn how to use Google Cloud to pass the Professional Data Engineer Certification Exam!

As you may have heard, Google Cloud is growing at a tremendous rate, with almost 50% YoY growth, which is higher than the growth rate of the overall cloud market. Since Google has some of the most advanced data and machine learning offerings of any cloud provider, it makes sense to get skilled in this highly in-demand field today.

In this course we'll teach you how to make data-driven decisions by collecting, transforming, and publishing data. This certification preparatory course will show you how to use Google Cloud to design, build, and operationalize data systems that can run at the scale of Google.

In this course, we'll prepare you for the Google Cloud Professional Data Engineer Certification Exam by teaching you about the following:

Designing Data Processing Systems
- Google Cloud Storage
- Data Pipelines
- BigQuery
- Dataflow
- Cloud Composer
Operationalize Data Systems
- Cloud Bigtable
- Cloud SQL
- Data Cleaning and Transformation
- Data Monitoring
Machine Learning DevOps
- Google Cloud ML APIs
- Deploying ML Pipelines
- Infrastructure Decisions
Data Solutions Quality
- Data Security and Access
- Test Suites and Troubleshooting
- Verification and Monitoring
- Data Portability
And much more!

While this course is specifically designed to help you pass the Google Cloud Professional Data Engineer Certification Exam, we believe anyone who is interested in using Google Cloud to create development operations for the latest data products will benefit massively from taking this course.

Not only do you get great technical content with this course, but you'll also get access to our in-course Question and Answer Forums and our Discord Student Chat Channel.

You can try the course risk-free with a 30-day money-back guarantee!

Enroll today and we'll see you inside the course!

Who this course is for:

Anyone wanting to pass the Professional Data Engineer Certification from Google Cloud

Course content

Introduction and Course Welcome • 4 lectures • 21min

Introduction to Google Cloud • 9 lectures • 1hr 19min

Google Storage Overview • 12 lectures • 2hr 5min

Big Data Overview • 6 lectures • 27min

Google Bigquery • 17 lectures • 2hr 17min

Dataproc on Google Cloud • 6 lectures • 32min

Cloud Data Fusion • 5 lectures • 25min

Cloud Composer • 4 lectures • 19min

Cloud Dataflow • 4 lectures • 26min

Google Cloud Pub Sub • 5 lectures • 23min
