
Explore Google Cloud data and machine learning services, from storage options like Cloud Spanner and Cloud SQL to BigQuery, Dataproc, Data Fusion, Pub/Sub, and Vertex AI offerings such as AutoML.
Explore how Google Cloud organizes compute, storage, data, and machine learning, with IAM security at its core, and highlights like BigQuery ML and Vertex AI.
Explore the data life cycle on Google Cloud Platform, covering big data attributes volume, velocity, and variety, and how ingesting, storing, processing, analyzing, and visualizing data map to GCP services.
Define cloud computing via NIST's five traits—on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service—plus Google's container-based, serverless approach.
Discover how Google Cloud network infrastructure connects computers worldwide via regions, zones, and multi-region locations, using submarine cables and points of presence (PoPs) for fault-tolerant, low-latency cloud deployments.
Learn how Google Cloud uses edge PoPs, edge nodes, and CDN caching to deliver static content and route BigQuery queries via PoPs to data centers for fast results.
Discover why to choose Google Cloud Platform for compute, storage, big data, and machine learning, with open accessibility, strong multi-cloud security, and actionable budgeting and billing controls.
Set up your Google Cloud account, claim $300 of free credits for 90 days, and tour the Google Cloud console to create projects and manage resources.
Learn to monitor Google Cloud costs using billing reports and budgets, view forecasted costs and cost details, set alerts and automated actions across projects, and tailor cost controls.
Tour the Google Cloud billing console, including the overview, reports, and budgets tabs. Understand credits and single or multiple billing accounts as you prepare your first budget.
Create your first Google Cloud budget in the console, set monthly scope across all projects, apply credits, define a 150 target, and add a 1% email alert.
Explore a variety of data storage options on GCP for the Google Cloud data engineer certification, from Cloud Storage and Filestore to Cloud SQL, Spanner, Firestore, Bigtable, and Memorystore.
Take a brief tour of Google Cloud storage options, including Cloud Storage, Filestore, Cloud SQL, Cloud Spanner, Firestore, and Bigtable, with use cases and the decision tree.
Explore Cloud Storage on Google Cloud Platform, covering buckets and objects, storage classes, security, and accessibility; learn to manage versions, lifecycle, and access with signed URLs and IAM.
Create a Cloud Storage bucket, upload and download an image, share it via a public link, organize with folders, and delete objects and the bucket.
Discover Filestore, a managed NAS for Compute Engine and GKE with NFSv3 support, offering basic, enterprise, and high scale tiers for unstructured data.
See how to set up a Google Cloud Filestore instance with an NFS client, mount a file share from the command line, create a test file, and clean up to avoid charges.
Cloud SQL provides a fully managed relational database service for MySQL, PostgreSQL, and SQL Server, offering high availability, persistent disks, static IP connectivity, IAM-based authentication, and clear pricing.
Follow a hands-on Cloud SQL demo: create a new project and a PostgreSQL Cloud SQL instance, connect via Cloud Shell, create a guestbook database, and delete the instance.
Explore Cloud Spanner, a relational database that scales horizontally to unlimited size with automatic regional distribution, TrueTime-based serializable transactions, and 99.999% availability.
Demonstrates creating a Cloud Spanner instance, database, schema, and table. Inserts and queries data, then cleans up resources while noting Google Standard SQL vs PostgreSQL dialect and processing units.
Explore Cloud Bigtable, a scalable NoSQL key-value store indexed by a single row key, with column families and timestamped values for petabyte-scale data and sub-10 millisecond latency.
Explore Memorystore for Redis and Memcached, a fully managed in-memory data store that delivers sub-millisecond latency with private IPs, IAM integration, and scale on demand.
Explore the big data overview and learn how Google Cloud provides managed services for open source technologies like MapReduce, Hadoop, Apache Pig, Spark, and Kafka.
MapReduce coordinates parallel processing of huge datasets across a cluster by mapping data, shuffling keys, and reducing results, using a master node and worker nodes.
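The map, shuffle, and reduce steps described above can be sketched in plain Python on a single machine. This is only an illustration of the data flow, not Hadoop's implementation: the function names and the sample input are made up for the example, and each input string stands in for a split that a separate worker node would process.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # In a real cluster the map calls run in parallel on worker nodes;
    # here they simply run one after another.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input splits.
splits = ["to be or not to be", "to err is human"]
counts = word_count(splits)
print(counts["to"], counts["be"])  # prints: 3 2
```

The shuffle step is what makes the reduce step simple: once every value for a key sits in one place, each key can be reduced independently of the others.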
Learn how Apache Hadoop enables distributed storage and processing across clusters with components like HDFS, YARN, and MapReduce, plus replication for fault tolerance and the Dataproc managed service.
Learn Apache Pig, a high-level Pig Latin language on Hadoop MapReduce for simple data analysis and word count. Dataproc supports Pig queries on YARN, though we won't use Pig.
Explore Apache Spark as the in-memory, fault-tolerant evolution of big data analysis from MapReduce, with HDFS, RDDs, transformations and actions, and Dataproc on GCP.
Explore Apache Kafka as a scalable event streaming platform that stores immutable logs, supports topics and partitions, and enables producers, consumers, and real-time data processing via its five APIs.
Explore the differences between data lakes and data warehouses, learn BigQuery basics, data loading schemas, nested and repeated fields, partitioning and clustering, and monitoring, streaming, and machine learning capabilities.
Explore the differences between data lakes and data warehouses, showing how cloud storage stores varied data types and how BigQuery enables analytics with schema before storage and scalable pipelines.
Explore how BigQuery architecture delivers a cloud-first, serverless data warehouse by separating storage from compute and leveraging Borg, Dremel, Colossus, and Jupiter for scalable, columnar storage via Capacitor.
Explore the basic BigQuery hierarchy—projects, datasets, tables, views, and jobs—and learn how native and external tables, federated queries, and the SQL-to-job lifecycle power analytics.
Explore BigQuery basics in the console, using the sandbox free tier, and learn to add public datasets, inspect data structures with information schema, and run useful SQL queries.
Explore the BigQuery command line tool (bq) in Cloud Shell, running show and query commands, working with public data such as Shakespeare, and comparing legacy versus standard SQL.
Discover how to ingest data into BigQuery from diverse sources, and understand the storage versus compute separation, external data trade-offs, and streaming or batch loading with the Storage Write API.
Demonstrates how to load data into BigQuery from a Google Drive spreadsheet, a CSV upload, and a Cloud Storage file, with steps to create datasets and tables.
Explore how to design schemas in BigQuery, comparing normalized and denormalized data, and using nested and repeated fields to balance performance with data integrity.
Explore how BigQuery uses nested and repeated fields with STRUCT and ARRAY to build denormalized schemas, balancing performance with flexible, multi-value records.
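To make the idea of nested and repeated fields concrete, here is a rough Python analogy: a dict plays the role of a nested record and a list of dicts plays the role of a repeated field. The table, field names, and values are hypothetical, and the flattening helper only mimics the spirit of BigQuery's UNNEST operator rather than reproducing its semantics.

```python
# Denormalized rows: "customer" is a nested record, "items" is a repeated field.
orders = [
    {
        "order_id": 1,
        "customer": {"name": "Ada", "country": "UK"},
        "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    },
    {
        "order_id": 2,
        "customer": {"name": "Lin", "country": "SG"},
        "items": [{"sku": "A1", "qty": 5}],
    },
]


def unnest_items(rows):
    """Yield one flat record per (order, item) pair."""
    for row in rows:
        for item in row["items"]:
            yield {
                "order_id": row["order_id"],
                "customer_name": row["customer"]["name"],
                "sku": item["sku"],
                "qty": item["qty"],
            }


flat = list(unnest_items(orders))
total_a1 = sum(record["qty"] for record in flat if record["sku"] == "A1")
print(len(flat), total_a1)  # prints: 3 7
```

Keeping the items inside the order row avoids a join at query time, which is the performance side of the trade-off; the cost is that the same customer details may be repeated across many orders.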
Explore BigQuery nested fields by querying and previewing tables with nested structures, including repository attributes and forks.
Explore how BigQuery partitioning and clustering optimize storage and query performance by pruning partitions, managing partitioned data, and co-locating related values for faster filters and aggregations.
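Partition pruning can be pictured with a small in-memory model. The sketch below is an analogy only, with a hypothetical class and column name: rows are bucketed by date, and a query with a date filter touches just the buckets inside the requested range instead of scanning every row.

```python
from collections import defaultdict
from datetime import date


class PartitionedTable:
    """Toy table that stores rows in one bucket per event_date."""

    def __init__(self):
        self.partitions = defaultdict(list)  # partition date -> rows

    def insert(self, row):
        self.partitions[row["event_date"]].append(row)

    def total_rows(self):
        return sum(len(rows) for rows in self.partitions.values())

    def query(self, start, end):
        """Return (matching rows, number of rows scanned)."""
        matches, scanned = [], 0
        for day, rows in self.partitions.items():
            if start <= day <= end:  # partitions outside the filter are skipped
                scanned += len(rows)
                matches.extend(rows)
        return matches, scanned


table = PartitionedTable()
for day in (1, 2, 3):
    for user in ("a", "b"):
        table.insert({"event_date": date(2024, 1, day), "user": user})

rows, scanned = table.query(date(2024, 1, 2), date(2024, 1, 2))
print(len(rows), scanned, table.total_rows())  # prints: 2 2 6
```

Only two of the six stored rows are read for the single-day query, which is the effect that reduces bytes scanned when a filter lines up with the partitioning column.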
Discover how BigQuery ML enables SQL practitioners to build and run models like linear regression, logistic regression, k-means, boosted trees, and deep neural networks inside BigQuery using standard SQL.
Learn to create a BigQuery dataset and build a BigQuery ML linear regression model to predict penguin body mass from features, then evaluate and predict results.
Learn BigQuery best practices to boost performance and reduce costs using selective columns, early filters, and efficient joins, plus partitioning and clustering.
Explore how BigQuery integrates with IAM policies and cloud monitoring to manage permissions and roles across project, dataset, and table resources, and to track jobs, bytes scanned, and query times.
Learn how BigQuery streaming connects compute to storage via the Storage Write API, including the default stream, batch loading, and modes—pending, committed, and buffered—for low-latency ingestion.
Explore the Hadoop ecosystem and how Dataproc features and Cloud Storage compare to Hadoop file systems, then cover Dataproc optimization and a practical Apache Spark demo on Google Cloud Platform.
Explore the Hadoop-based ecosystem, including HDFS, MapReduce, Spark, Hive, and Pig, and its fit with Dataproc. Compare on-premises limitations and see how cloud Dataproc enables scalable cluster management.
Explore Dataproc, a fully managed, scalable service for running Hadoop, Spark, Flink, Presto, and 30+ open source tools, with serverless deployment, logging, monitoring, rapid cluster startup, and tight GCP integration.
Optimize Dataproc on GCP by selecting data location and region, balancing firewall rules, input files, and Cloud Storage sizing, and using ephemeral clusters with autoscaling and workflow templates.
Watch a Dataproc demo that creates a cluster with a master and workers, submits a Spark Pi job, and cleans up by deleting the cluster.
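For readers curious about what a Pi job computes, the sketch below shows the usual Monte Carlo idea in plain Python, assuming the demo uses the standard sampling approach. It is not the Spark code from the demo: the function names, seeds, and sample sizes are made up, and the "partitions" are simulated sequentially rather than distributed to workers.

```python
import random


def count_inside(samples, seed):
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return inside


def estimate_pi(partitions=4, samples_per_partition=50_000):
    # Each partition would be a task on a worker; the partial counts are then summed.
    partial_counts = [
        count_inside(samples_per_partition, seed) for seed in range(partitions)
    ]
    total = partitions * samples_per_partition
    # The quarter circle covers pi/4 of the unit square.
    return 4.0 * sum(partial_counts) / total


pi_estimate = estimate_pi()
print(f"Pi is roughly {pi_estimate:.3f}")
```

Because every partition works on its own samples and only the counts are combined, adding workers increases the number of samples without changing the logic.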
Dataproc leverages Cloud Storage to separate compute from storage, delivering direct data access and HDFS compatibility with gs:// prefixes, while enabling scalable, low-maintenance analytics on GCP.
Explore Cloud Data Fusion by examining core concepts, setting up a pipeline with the user interface, and demonstrating a practical Cloud Data Fusion workflow.
Learn how Cloud Data Fusion provides a drag-and-drop, managed data integration platform that connects diverse data sources, cleans and transforms data, and orchestrates pipelines on Dataproc with Spark or MapReduce.
Explore Cloud Data Fusion's user interface, focusing on Wrangler for data cleaning, the Studio pipeline builder, and the metadata tools to inspect schema and lineage, then deploy and monitor pipelines.
Explore Cloud Data Fusion's Studio pipeline builder and its graphical interface. Import and export pipelines, deploy and run them, and examine metadata lineage across sources and sinks like BigQuery.
Discover Apache Airflow fundamentals and see how Cloud Composer, a managed Airflow on GCP, provides features and a practical demonstration.
Explore Apache Airflow, an open source workflow management platform for data engineering pipelines, defined by Python DAGs. Leverage dynamic pipelines with Jinja parameterization, nodes, tasks, operators, and Cloud Composer integrations.
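The core idea behind a DAG of tasks can be shown without Airflow itself. The sketch below uses only Python's standard library (`graphlib`, available from Python 3.9) to order and run tasks so that every task starts after its upstream dependencies. The task names and the `run_task` helper are hypothetical, and this is not Airflow's API or scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "enrich": {"extract"},
    "load": {"clean", "enrich"},
}

run_log = []


def run_task(name):
    """Stand-in for real work; it only records that the task ran."""
    run_log.append(name)


# static_order() yields tasks so that dependencies always come first.
for task in TopologicalSorter(dependencies).static_order():
    run_task(task)

print(run_log)
```

The two middle tasks have no dependency on each other, so a scheduler is free to run them in either order or in parallel; only the constraints expressed in the graph are guaranteed.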
Cloud Composer provides a fully managed workflow orchestration service for Apache Airflow, enabling you to schedule, monitor, and manage DAGs across clouds and on-premises data centers with Python scripts.
Explore a hands-on Cloud Composer demo that guides you through enabling APIs, creating an environment, uploading a quickstart DAG, and using the Apache Airflow UI to monitor DAGs.
Explore Cloud Dataflow, pipelines, templates, and SQL, and see a practical demo of using Cloud Dataflow on GCP.
Explore Cloud Dataflow, a fully managed Apache Beam service for unified batch and streaming data processing, with autoscaling and seamless integration from Pub/Sub to BigQuery.
Explore Cloud Dataflow templates and Dataflow SQL to design, stage, and run Apache Beam pipelines from the cloud, with flexible templates, runtime parameters, and BigQuery web user interface integration.
Explore a hands-on Cloud Dataflow demo: configure permissions and APIs, install Apache Beam in Cloud Shell, run a word count pipeline on Shakespeare text, and view results in Cloud Storage.
Explore the open source projects Apache Kafka and Apache Pulsar that Pub/Sub is based on, and examine the Pub/Sub architecture. Finish with a practical demonstration of using Pub/Sub.
Learn how Apache Kafka and Pulsar underpin GCP Pub/Sub with producers, consumers, and topics, delivering durable, immutable logs, low latency, multi-tenancy, and horizontal scalability.
Explore Cloud Pub/Sub on GCP, a publisher-subscriber messaging system with 100 ms latency for asynchronous ingestion and distribution via topics and subscriptions.
Explore Pub/Sub architecture with topics and subscriptions, compare pull and push subscriptions, and show how publishers and subscribers exchange events through topics with filtering, acks, and ordering.
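The relationship between topics, subscriptions, pulls, and acknowledgements can be modelled with a toy in-memory version. The classes below are hypothetical and are not the Pub/Sub client library; they only illustrate that each subscription gets its own copy of a message and that a message keeps being delivered until it is acknowledged.

```python
import itertools


class Subscription:
    """Holds messages for one subscriber until they are acknowledged."""

    def __init__(self, name):
        self.name = name
        self.pending = {}  # ack_id -> message data

    def pull(self, max_messages=10):
        # Unacknowledged messages are returned again on later pulls.
        return list(self.pending.items())[:max_messages]

    def ack(self, ack_id):
        self.pending.pop(ack_id, None)


class Topic:
    """Fans each published message out to every attached subscription."""

    def __init__(self):
        self.subscriptions = []
        self._ids = itertools.count(1)

    def subscribe(self, name):
        subscription = Subscription(name)
        self.subscriptions.append(subscription)
        return subscription

    def publish(self, data):
        message_id = next(self._ids)
        for subscription in self.subscriptions:
            subscription.pending[f"{subscription.name}-{message_id}"] = data
        return message_id


topic = Topic()
billing = topic.subscribe("billing")
audit = topic.subscribe("audit")
topic.publish("order-created")

for ack_id, data in billing.pull():
    billing.ack(ack_id)  # billing is done with the message

print(len(billing.pull()), len(audit.pull()))  # prints: 0 1
```

A push subscription would invert the last step: instead of the subscriber calling `pull`, the service would send each message to an endpoint registered by the subscriber.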
Learn how to create a Pub/Sub topic and subscription in Google Cloud, publish messages, and pull or acknowledge them, with defaults and optional filters.
Welcome to your one-stop shop for passing the Google Cloud Professional Data Engineer Certification Exam!
A FULL 50 QUESTION PRACTICE EXAM IS INCLUDED WITH THIS COURSE!
We've designed this course to be a complete resource for you to learn how to use Google Cloud to pass the Professional Data Engineer Certification Exam!
As you may have heard, Google Cloud is growing at a tremendous rate, with almost 50% YoY growth, and has a higher growth rate than overall cloud. Since Google has some of the most advanced data and machine learning offerings of any cloud provider, it makes sense to get skilled in this highly in-demand field today.
In this course we'll teach you how to make data-driven decisions by collecting, transforming, and publishing data. This certification preparatory course will show you how to use Google Cloud to design, build, and operationalize data systems that can run at the scale of Google.
In this course, we'll prepare you for the Google Cloud Professional Data Engineer Certification Exam by teaching you about the following:
Designing Data Processing Systems
Google Cloud Storage
Data Pipelines
BigQuery
Dataflow
Cloud Composer
Operationalize Data Systems
Cloud Bigtable
Cloud SQL
Data Cleaning and Transformation
Data Monitoring
Machine Learning DevOps
Google Cloud ML APIs
Deploying ML Pipelines
Infrastructure Decisions
Data Solutions Quality
Data Security and Access
Test Suites and Troubleshooting
Verification and Monitoring
Data Portability
And much more!
While this course is specifically designed to help you pass the Google Cloud Professional Data Engineer Certification Exam, we believe anyone who is interested in using Google Cloud to create development operations for the latest data products will benefit massively from taking this course.
Not only do you get great technical content with this course, but you'll also get access to our in-course Question and Answer Forums and our Discord Student Chat Channel.
You can try the course risk-free with a 30-day money-back guarantee!
Enroll today and we'll see you inside the course!