
Frame machine learning problems from a business perspective, decide when to apply machine learning, design scalable solutions, build data pipelines and features, develop models, test them, and monitor performance.
Open the course console, use the Overview and Q&A tabs to search questions, post new ones, and view current or all lectures, or message Dan Sullivan on LinkedIn for answers.
Assess whether a problem suits machine learning by examining data size and task type, from classification and identification to prediction and segmentation, with examples like spam, fraud, and image analysis.
Define the project scope and success criteria, identify the problem type (classification, prediction, or clustering), and establish metrics like accuracy, precision, and recall alongside business goals.
Learn to deploy machine learning models in production using MLOps, CI/CD pipelines, monitoring, and scalable infrastructure. Explore data validation, data preparation, retraining, and model registry for reliable, repeatable deployment.
Supervised learning uses labeled data to solve classification and regression, leveraging features from structured data, images, or text, with algorithms like logistic regression, decision trees, neural networks, and ensemble methods.
Use regression algorithms to predict numeric values, including linear, polynomial, and decision tree regressions. Build a baseline model with linear regression and minimize prediction error using MSE, RMSE, and MAE.
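As a minimal sketch of the three error metrics named above, the following uses only the Python standard library. The function names and the sample values are illustrative assumptions, not course material.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: MSE expressed in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and baseline predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 11.0]

print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
```

Because MSE and RMSE square each residual, they penalize the single large miss (9.0 predicted as 11.0) more heavily than MAE does.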
Explore unsupervised learning without labeled data, covering clustering with k-means, association rules, and dimensionality reduction via principal component analysis and autoencoders, with applications to grouping, anomaly detection, and data compression.
Discover semi-supervised learning that blends labeled and unlabeled data to build models when labeling is costly, using anchor points and clustering on low-dimensional manifolds.
Explore reinforcement learning, where an agent learns from trial and error via positive or negative feedback from an environment to maximize rewards, as in games or robotics in complex environments.
Convert inputs into numeric feature vectors by encoding continuous, discrete, and categorical attributes, flattening images, and using word counts or term frequency-inverse document frequency (TF-IDF), then train and evaluate.
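The text-to-vector step can be sketched as follows. This assumes one common TF-IDF variant (term frequency as count divided by document length, inverse document frequency as log(N / df)); libraries often use smoothed variants. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return (vocabulary, vectors) for a list of non-empty text documents.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors

docs = ["free money now", "meeting notes now", "free free offer"]
vocab, vectors = tf_idf(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(v, 3) for v in vec])
```

Every document maps to a vector of the same length (one position per vocabulary term), which is what lets a model train on text of varying length.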
Model output structure varies by task; classification yields a category indicator, regression a numeric value, clustering a group of items, and PCA a set of vectors, sometimes including probability distributions.
Identify data-related and process risks that threaten successful machine learning model development. Address insufficient data, imbalanced data, data quality issues, bias, labeling consistency, data poisoning, and privacy and confidentiality compliance.
Explore the three categories of machine learning—unsupervised learning, supervised learning, and reinforcement learning—covering clustering, anomaly detection, principal component analysis, regression, classification, and fraudulent credit card transactions.
Compare two broad approaches to machine learning: symbolic artificial intelligence and deep learning with neural networks, and trace their dominance and the rise of large neural nets.
Model domains with symbols and apply rule-based inferences in symbolic machine learning, using decision trees and random forests, Naive Bayes, SVM, and kNN on iris classification.
Explore how neural networks and deep learning map input features to outputs using weighted inputs, non-linear activation such as sigmoid, tanh, and ReLU, across multiple layers.
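A minimal sketch of the forward pass described above: each unit computes a weighted sum of its inputs plus a bias, then applies a non-linear activation. The layer sizes, weights, and the helper name `dense` are illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

# tanh is available directly as math.tanh.

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

x = [1.0, 2.0]  # input features
# Hidden layer with two ReLU units (hypothetical weights).
hidden = dense(x, [[0.5, 0.25], [-1.0, 0.25]], [0.0, 0.0], relu)
# Output layer with one sigmoid unit.
output = dense(hidden, [[2.0, 1.0]], [-2.0], sigmoid)

print(hidden, output)
```

Stacking more `dense` calls gives a deeper network; without the non-linear activations, the stack would collapse into a single linear map.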
Explore how features are input attributes that describe an entity and how labels are the predicted outputs, illustrated with iris, selling price, fraud detection, and income examples.
Enhance model quality by engineering features through transformations, deriving new features and mapping existing ones, including one-hot encoding, scaling, bucketing, and feature crosses, with time series and text data.
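Three of the transformations listed above can be sketched in a few lines. The function names, category list, and bucket boundaries are assumptions chosen for illustration.

```python
import bisect

def one_hot(value, categories):
    """Encode a categorical value as a 0/1 indicator vector."""
    return [1 if value == c else 0 for c in categories]

def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def bucketize(value, boundaries):
    """Map a numeric value to a bucket index given sorted boundaries."""
    return bisect.bisect_right(boundaries, value)

print(one_hot("red", ["blue", "green", "red"]))
print(min_max_scale([10, 20, 30]))
print(bucketize(25, [18, 35, 65]))  # hypothetical age boundaries
```

In practice, the minimum and maximum used for scaling should come from the training set only and then be reused on validation and test data.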
Define the problem, collect data, and establish an evaluation method, then train, validate, and test a machine learning model through iterative data preparation and hyperparameter tuning.
Evaluate models using accuracy, precision, and recall for classification and mean squared error for regression; learn to use a confusion matrix and separate train and test data.
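The classification metrics above all derive from the four cells of a binary confusion matrix. The sketch below assumes labels of 1 (positive) and 0 (negative); the helper name and the sample labels are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical held-out test labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found

print(tp, fp, fn, tn, accuracy, precision, recall)
```

These numbers are only meaningful when computed on data the model did not see during training, which is why the train and test sets are kept separate.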
Discover how gradient descent minimizes loss by updating weights with a learning rate and how backpropagation efficiently computes gradients to train neural nets.
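As a rough illustration of the update rule, the sketch below runs gradient descent on a one-feature linear model with mean squared error loss. The data, learning rate, and step count are assumptions. For a model this small the gradients are written out by hand; backpropagation applies the same chain rule layer by layer in a deeper network.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by minimizing mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

A learning rate that is too large makes the updates overshoot and diverge; one that is too small makes convergence slow.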
Explore common model problems such as underfitting and overfitting, and learn fixes like adjusting complexity, training time, and regularization. Understand bias and variance tradeoffs and their impact on generalization.
Explore practical model building in Google Cloud using AutoML (Tables, Vision, Natural Language), AI Platform Training, Kubeflow, Dataproc with Spark ML, and BigQuery ML for SQL-based training and deployment.
Explore Google's pre-trained models for vision, language, and conversation, including transfer learning, video metadata, named entities, sentiment analysis, translation, speech to text, text to speech, and Dialogflow.
Explore machine learning pipelines from development to production, using containers, orchestration, feature store, and model registry to enable scalable, repeatable deployments, monitoring, and automated continuous delivery.
Wrap your model as a RESTful service, containerize it, and deploy it into production with Kubernetes or Cloud Run, enabling health checks and monitoring.
Leverage Kubeflow on Kubernetes for end-to-end ML workflows (scaffolding, pipelines, hyperparameter tuning, and model serving), plus Vertex AI AutoML on Google Cloud for diverse data.
Explore Vertex AI as a comprehensive platform for preparing datasets, labeling tasks, pipelines, model training, experiments, and model registry, and delivering predictions via endpoints or batch jobs.
Master Vertex AI datasets, which abstract storage details into a single unit for training, with support for tabular, image, text, and video data from cloud sources.
Explore how Vertex AI Feature Store manages prepared data by storing features from transformed data, using entity types and features for house sale price, and monitor distributions with alerts.
Explore Vertex AI Workbench notebooks, comparing managed notebooks and user-managed notebooks. Learn how to tailor environments with PyTorch, OS choices, GPUs, security, and networking for deep learning workflows.
Train a tabular model in Vertex AI with AutoML on a corrected dataset, then evaluate with precision-recall, ROC, and the confusion matrix, and iterate with feature engineering to improve performance.
Explore cloud storage for machine learning, focusing on object storage for large training data and unstructured data. Learn storage classes, redundancy, lifecycle policies, retention, object holds, and security controls.
Explore how BigQuery, a managed Google Cloud analytics database, enables data warehousing with SQL, scales to petabytes, and supports datasets, tables, views, models, and partitioning with clustering to reduce scanning.
Explore creating scalable machine learning production environments with Google Cloud Compute Engine virtual machines, managed instance groups, and container orchestration using Cloud Run and Kubernetes Engine.
Choose between GPUs and TPUs on Google Cloud for deep learning training, considering precision, cost, and scalability across model sizes.
Deploy ML models to edge devices to reduce latency in industrial settings, vehicle fleets, and remote sensors. Optimize for constrained memory and CPU with TensorFlow Lite.
Identify sensitive data in datasets, protect it without harming model performance, and establish governance with secure access, encryption in transit and at rest, masking, tokenization, and data coarsening.
Explore descriptive statistics to summarize data, using measures like mean, median, variance, and standard deviation, and understand distributions, central tendency, and data spread for both numeric and categorical features.
Practice feature selection to train efficient, high-performing models. Apply Pearson's correlation, Spearman's rank coefficient, ANOVA, Kendall's rank coefficient, chi square, and mutual information as described.
Use feature crosses to create synthetic features by multiplying two features, boosting predictive performance on nonlinear relationships with non-neural-network algorithms.
Organize training sets using Vertex AI managed datasets to label, annotate, track data lineage, and generate statistics, while automatically splitting data into training, testing, and validation sets.
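To illustrate why a multiplied feature helps, the sketch below uses a small XOR-style dataset. No single threshold on either original feature separates the classes, but the sign of the product does. The data and the helper name `add_cross` are hypothetical.

```python
def add_cross(rows, i, j):
    """Append the product of features i and j to each row."""
    return [row + [row[i] * row[j]] for row in rows]

# Label is 1 when both features share the same sign (XOR-style pattern).
rows = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
labels = [1, 0, 0, 1]

crossed = add_cross(rows, 0, 1)
# A single linear threshold on the crossed feature now separates the classes.
preds = [1 if row[2] > 0 else 0 for row in crossed]

print(crossed)
print(preds == labels)
```

A linear model given only the first two columns cannot fit this pattern; given the third column it can, which is the benefit the lecture attributes to crosses for algorithms that do not learn such interactions on their own.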
Handle missing data in training sets by deleting rows, imputing with mean/median/mode, or carrying forward last observed values; weigh simplicity against bias and data leakage, especially in time series.
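Two of the strategies above can be compared on a short time series, with `None` marking missing values. The function names and sample values are assumptions.

```python
from statistics import mean, median

def impute_constant(values, strategy="mean"):
    """Fill gaps with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

def forward_fill(values):
    """Fill each gap with the last observed value (None if none seen yet)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Hypothetical time-ordered readings with gaps.
series = [10.0, None, 14.0, None, None, 18.0]

print(impute_constant(series))  # uses every observed value, including later ones
print(forward_fill(series))     # uses only values seen so far
```

In a time series, the mean computed over the whole column includes observations that come after the gap, which is one way imputation can leak future information into training data; forward fill avoids that at the cost of repeating stale values.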
Identify why outliers occur, from errors to natural minority cases, and apply z-scores, interquartile range, box plots, and DBSCAN to detect and decide how to handle them.
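The z-score and interquartile range checks can be sketched with the standard library. The thresholds (2.5 standard deviations, 1.5 times the IQR) and the sample data are assumptions; in very small samples the largest attainable z-score is bounded, so the conventional cutoff of 3 may never trigger.

```python
from statistics import mean, pstdev, quantiles

def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]

print(zscore_outliers(data, threshold=2.5))
print(iqr_outliers(data))
```

Detection is only the first step: whether a flagged value is an error to remove or a legitimate minority case to keep is a judgment made with knowledge of the data source.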
Identify data leakage, where training data include information unavailable at prediction time, and learn examples like future-based imputations, session total counts, and city proxies that bias model performance.
Machine Learning Engineer is a rewarding, in-demand role that is increasingly important to organizations building data-intensive services in the cloud. The Google Cloud Professional Machine Learning Engineer certification is one of the field's most recognized credentials. This course will help prepare you to take and pass the exam. Specifically, this course will help you understand the details of:
Building and deploying ML models to solve business challenges using Google Cloud services and best practices for machine learning
Aspects of machine learning model architecture, data pipeline structures, and optimization, as well as monitoring model performance in production
Fundamental concepts of model development, infrastructure management, data engineering, and data governance
Preparing data, optimizing storage formats, performing exploratory data analysis, and handling missing data
Feature engineering, data augmentation, and feature encoding to maximize the likelihood of building successful models
Responsible AI throughout the ML development process, and applying proper controls and governance to ensure fairness in machine learning models
By the end of this course, you will know how to use Google Cloud services for machine learning and, just as importantly, you will understand the machine learning concepts and techniques needed to use those services effectively.
Unlike courses that set out to teach you only how to use particular Google Cloud services, this course is designed to teach you those services as well as all the topics covered in the Google Cloud Professional Machine Learning Engineer Exam Guide, including machine learning fundamentals and techniques.
The course begins with a discussion of framing business problems as machine learning problems, followed by a chapter on the technical framing of ML problems. We next review the architecture of training pipelines and supporting ML services in Google Cloud, such as:
Vertex AI Datasets
AutoML
Vertex AI Workbenches
Cloud Storage
BigQuery
Cloud Dataflow
Cloud Dataproc.
Machine learning infrastructure and security are reviewed next.
We then shift focus to building and implementing machine learning models starting with managing and preparing data for machine learning, building machine learning models, and training and testing machine learning models. This is followed by chapters on machine learning serving and monitoring and tuning and optimizing both the training and serving of machine learning models.
Machine learning operations, also known as MLOps, borrows heavily from software engineering practices. As a machine learning engineer, you will use your understanding of software engineering practices and apply them to machine learning. Machine learning engineers know how to use ML tools, build models, deploy to production, and monitor ML services. They also know how to tune pipelines and optimize the use of compute and storage resources.
Machine learning engineers and data engineers complement each other. Data engineers build services and pipelines for collecting, storing, and managing data while machine learning engineers use those data services as a starting point for accessing data and building ML models to solve specific business problems.