
In this lecture, we begin our journey into Linear Regression using BigQuery ML by solving a real-world business problem. The goal of this tutorial is to understand how structured data can be used to train a machine learning regression model directly inside BigQuery, without moving data outside the platform.
We focus on a housing price prediction use case, where the objective is to predict the median house value based on several important input features. This type of problem is common in domains such as real estate, finance, and market analysis, making it an excellent example to understand regression modeling concepts.
Business Problem Overview
The task is to predict the median house value, which acts as the target variable, using the following features:
Housing median age
Total number of rooms
Total number of bedrooms
Population
Number of households
Median income
By analyzing how these features influence house prices, we build a regression model that learns patterns from historical data and makes accurate predictions on unseen data.
Approach and Workflow
In this lecture, we follow a structured and practical workflow that mirrors how machine learning projects are implemented in real environments:
Uploading raw data files to Google Cloud Storage (GCS)
Creating a BigQuery dataset and table to store housing data
Splitting the data into training and testing datasets
Training a Linear Regression model using BigQuery ML
Evaluating the model’s performance on test data
Generating predictions for median house values
Each step is explained clearly so you understand why it is required and how it fits into the overall machine learning pipeline.
Key Concepts Covered
Understanding regression problems and target variables
Identifying features and their role in prediction
End-to-end machine learning workflow in BigQuery ML
Training, evaluating, and predicting using a regression model
Applying ML concepts to a real business scenario
In this lecture, we focus on one of the most important foundational steps in any machine learning workflow: preparing and organizing data in BigQuery. Before we can train any machine learning model, the data must be properly stored, structured, and accessible. This tutorial walks you through that complete setup using Google Cloud Console, Cloud Storage, and BigQuery.
You will learn how to move raw data into Google Cloud, create a dataset, and convert a CSV file into a structured BigQuery table that can later be used for machine learning with BigQuery ML.
What You Will Do in This Lecture
This lecture follows a step-by-step, practical approach that mirrors real-world cloud data preparation:
Accessing the Google Cloud Console
Creating a Google Cloud Storage (GCS) bucket
Understanding and selecting bucket configuration options such as:
Location type and region
Storage class
Access control and public access prevention
Data protection and encryption settings
Creating folders inside the bucket to organize machine learning data
Uploading a CSV file containing housing data to Cloud Storage
Creating a BigQuery Dataset
Once the data is available in Cloud Storage, the lecture moves into BigQuery Studio, where you will:
Create a new BigQuery dataset
Select the appropriate region to match the Cloud Storage bucket
Understand key dataset configuration options, including:
External dataset settings
Table expiration
Encryption options
Advanced dataset settings
Verify dataset creation and explore dataset metadata
This ensures that your data storage and processing locations are aligned, which is a best practice in Google Cloud.
Creating and Exploring a BigQuery Table
After creating the dataset, you will learn how to:
Create a BigQuery table from a CSV file stored in Cloud Storage
Configure table settings such as:
Table name
Schema auto-detection
Table type (native table)
Successfully load the data into BigQuery
You will then explore the created table by:
Reviewing the table schema (columns, data types, and modes)
Checking table and storage information
Previewing the data records directly in BigQuery
Understanding record counts and table statistics
This step confirms that the data has been loaded correctly and is ready for further analysis.
Key Concepts Covered
Data ingestion using Google Cloud Storage
Organizing machine learning data using folders and buckets
Creating datasets and tables in BigQuery
Understanding table schema and metadata
Validating data through table preview and exploration
Preparing structured data for BigQuery ML workflows
In this lecture, we move to the next critical step in the machine learning workflow: exploring and preparing the data for modeling. Since the dataset and table have already been created in the previous tutorial, this session focuses on understanding the data stored in BigQuery and ensuring it is clean, complete, and ready for training a machine learning model.
Data exploration is a crucial step because the quality of your model depends heavily on the quality of your data. In this tutorial, you will learn how to inspect raw data, analyze data structure, identify data issues, and handle missing values using BigQuery.
Viewing and Understanding the Raw Data
We begin by querying the housing data table to:
View a sample of records from the dataset
Limit the number of rows returned for easier inspection
Understand how the data looks in its raw form
This step helps you become familiar with the dataset before performing any transformations or analysis.
Exploring Schema and Data Types
Next, we examine the table schema to understand how the data is structured:
Identify all column names in the table
Review the data type of each column
Confirm which columns are numerical and which are categorical
Understanding data types is essential before applying machine learning algorithms, as it ensures the data is suitable for analysis and modeling.
Generating Summary Statistics
To gain deeper insights into the numerical data, we calculate summary statistics such as:
Total number of records in the table
Minimum, maximum, and average values for key numerical columns
Distribution insights for features like income and house prices
These statistics help you understand data ranges, spot outliers, and assess whether the values are reasonable for the business problem.
Detecting Missing Values
Missing data can negatively impact machine learning models. In this lecture, you will learn how to:
Identify columns that contain null or missing values
Count the number of missing records per column
Isolate and inspect rows where missing values occur
This step allows you to clearly see which features require data cleaning.
Handling Missing Values with Data Cleaning
After identifying missing values, we focus on cleaning the data by:
Calculating the mean value of the affected column
Replacing null values with the column’s mean
Creating a new cleaned table to preserve the original dataset
This approach ensures that the dataset is complete and suitable for training a regression model without losing important records.
Saving Queries for Reusability
To maintain a clean and reusable workflow, we also save important queries:
Store data exploration and cleaning queries in BigQuery
Organize queries for future reference and collaboration
Prepare the environment for upcoming modeling steps
Key Concepts Covered
Data exploration using BigQuery
Understanding schema and data types
Summary statistics for numerical features
Identifying and handling missing values
Creating cleaned datasets for machine learning
Best practices for data preparation in BigQuery
In this lecture, we move to a critical stage of the machine learning workflow: splitting the prepared dataset into training and testing sets. Since the data has already been explored and cleaned in previous tutorials, this session focuses on creating reliable and non-overlapping datasets that will be used for model training and evaluation.
Proper data splitting is essential to ensure that a machine learning model can be evaluated fairly and performs well on unseen data.
Why Train and Test Splits Matter
Before building a machine learning model, it is important to:
Train the model on one portion of the data
Evaluate the model on a separate, unseen portion
Avoid data leakage and duplicate records
Ensure consistent and reproducible results
In this lecture, we achieve this using a deterministic hashing strategy inside BigQuery.
Adding a Deterministic Hash Column
We begin by creating a new view that adds a hash-based column to the cleaned housing dataset.
Key ideas covered include:
Creating a deterministic hash value using selected feature columns
Converting the hash into a numeric range
Ensuring the hash value is always non-negative
Generating a hash bucket between 0 and 99
This hash column ensures that each row is always assigned to the same bucket, making the train-test split consistent across multiple runs.
Ensuring Non-Overlapping Datasets
By using a hash-based approach:
The same data row will never appear in both training and testing sets
The split remains stable even if the data is reprocessed
There is no randomness that could change results between runs
This method is especially useful in production-grade machine learning pipelines.
Creating the Training Dataset
Next, we create a training data view by selecting rows based on the hash bucket range.
Key points include:
Selecting approximately 80% of the data for training
Using views instead of physical tables for efficiency
Verifying the training dataset by querying the view
Creating the Testing Dataset
After creating the training view, we create a testing data view:
The remaining 20% of the data is assigned to testing
This dataset is kept separate from training data
The testing set is used only for model evaluation
This clear separation allows for accurate performance measurement in later steps.
Key Concepts Covered
Importance of train-test data splitting
Deterministic hashing for reproducible splits
Preventing data leakage in machine learning
Creating views for training and testing datasets
Validating data splits in BigQuery
In this lecture, we reach one of the most important milestones in the machine learning workflow: building and evaluating a machine learning model. After successfully exploring the data and splitting it into training and testing datasets, this session focuses on creating a Linear Regression model using BigQuery ML and evaluating its performance.
This lecture demonstrates how machine learning models can be trained and evaluated directly inside BigQuery using SQL-based workflows, without exporting data or using external tools.
Creating the Linear Regression Model
We begin by creating a Linear Regression model to predict house prices.
Key concepts covered in this section include:
Creating a machine learning model inside a BigQuery dataset
Using a model replacement strategy to ensure consistent results during experimentation
Selecting linear regression as the model type for predicting numerical values
Defining the target (label) column as the median house value
Choosing the appropriate feature columns from the training dataset
Once the model training starts, we review the job execution details and understand how BigQuery ML processes model creation internally.
Understanding Model Training Details
After the model is created, we explore important training-related information, such as:
Job execution summary and resource usage
Training iterations and completion status
Loss values and how they change during training
Execution stages including preprocessing, training, and evaluation
Visual graphs showing:
Training iteration versus loss
Training iteration versus duration
These insights help you understand how the model learns from data and how efficiently the training process is executed.
Exploring Model Metadata and Configuration
Once the model is available, we review its details within BigQuery:
Model type and model ID
Dataset and regional location
Training options and configuration parameters
Label column and feature columns used during training
Model modification and expiration settings
This section helps you understand how BigQuery stores and manages machine learning models.
Model Evaluation on Test Data
After training, the model is evaluated using a separate test dataset to measure its real-world performance.
In this part of the lecture, you will learn:
How to evaluate a trained model using test data
Why evaluation on unseen data is critical
How to interpret common regression metrics, including:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Mean Squared Log Error
Median Absolute Error
R-squared (R²) score
Explained variance
We compare the evaluation results from the test dataset with the training metrics to ensure the model generalizes well and does not overfit.
Understanding Model Performance
Special attention is given to the R-squared (R²) score, which indicates how well the model explains the variance in house prices. By comparing R² values from both training and test datasets, we assess the consistency and reliability of the model.
Key Concepts Covered
Creating machine learning models using BigQuery ML
Linear regression for numerical prediction problems
Model training workflow inside BigQuery
Interpreting training metrics and execution graphs
Evaluating models using test datasets
Understanding regression performance metrics
Comparing training and testing results
In this lecture, we complete the end-to-end linear regression workflow by focusing on model inference, also known as making predictions. After successfully creating and evaluating the linear regression model in the previous tutorial, this session demonstrates how to use the trained model to generate predictions on both test data and new, unseen data.
This is a crucial step where machine learning models deliver real business value by producing actionable outputs.
Generating Predictions on Test Data
We begin by making predictions on the test dataset that was created earlier.
In this part of the lecture, you will learn how to:
Use a trained BigQuery ML model to generate predictions
Apply the model to test data without including the target variable
Store prediction results in a new BigQuery table
Understand the structure of prediction outputs, including:
Input feature columns
Predicted median house value
This step allows you to compare predicted values with actual values and validate model performance further if required.
Understanding Prediction Output Tables
Once predictions are generated, we explore the newly created prediction table:
Review the table schema
Verify feature columns and predicted values
Preview sample prediction records
This helps confirm that the model is producing outputs in the expected format.
Making Predictions on New, Unseen Data
Next, we move beyond test data and generate predictions on completely new data, simulating a real-world production scenario.
Key steps covered include:
Uploading new input data to Google Cloud Storage
Creating a BigQuery table from the new data file
Ensuring the new dataset contains only feature columns
Verifying the schema and previewing the data before prediction
This demonstrates how a trained model can be reused to make predictions without retraining.
Storing Predictions for New Data
After preparing the new data, we generate predictions and:
Save the prediction results in a separate BigQuery table
Review the schema and preview the prediction results
Understand how predicted house values are produced for unseen records
This step shows how BigQuery ML supports scalable and repeatable inference workflows.
Key Concepts Covered
Model inference using BigQuery ML
Generating predictions on test datasets
Predicting outcomes for new, unseen data
Creating and managing prediction tables
Understanding prediction output structure
Completing an end-to-end linear regression pipeline
In this lecture, we explore an alternative and more interactive way to execute BigQuery ML workflows by using BigQuery SQL Notebooks. Until now, all SQL queries were executed using the BigQuery SQL Query Editor. In this tutorial, you will learn how to run the same machine learning workflow inside a BigQuery Studio notebook, which provides a more organized and collaborative environment.
This approach is especially useful for experimentation, documentation, and team collaboration.
Introduction to BigQuery SQL Notebooks
We begin by understanding what BigQuery SQL Notebooks are and why they are useful:
A managed notebook environment within BigQuery Studio
Supports execution of SQL queries in an interactive, cell-based format
Enables better organization of machine learning workflows
Ideal for documenting end-to-end ML pipelines
You will also see the available notebook options and learn how to create a BigQuery SQL notebook.
Enabling Required APIs and Runtime Setup
Before running queries in the notebook, we configure the required environment:
Enabling the BigQuery Unified API
Reviewing and confirming permission settings
Connecting the notebook to a runtime environment
Selecting the appropriate region for execution
This setup ensures the notebook can securely access BigQuery resources.
Executing SQL Queries in a Notebook Environment
Once the notebook is ready, we demonstrate how to:
Use special notebook commands to execute BigQuery SQL
Run SQL queries directly inside notebook cells
Verify successful execution by querying sample records from tables
This shows how SQL execution in notebooks differs slightly from the standard query editor while producing the same results.
Organizing the End-to-End Machine Learning Workflow
Next, we structure the entire Linear Regression workflow inside the notebook:
Step 1: Data exploration
Step 2: Splitting data into training and testing sets
Step 3: Creating the linear regression model
Step 4: Evaluating the model
Step 5: Generating predictions on test data and new data
Each step can be placed in separate notebook cells or grouped logically, making the workflow easier to read, run, and maintain.
Saving and Reusing Notebooks
Finally, we save the notebook so it can be:
Reused for future experiments
Shared with team members
Used as documentation for the complete ML pipeline
This makes BigQuery SQL notebooks a powerful tool for reproducible machine learning workflows.
Key Concepts Covered
Difference between SQL Query Editor and SQL Notebooks
Creating and managing BigQuery SQL notebooks
Runtime configuration and API enablement
Executing BigQuery SQL inside notebook cells
Structuring end-to-end ML workflows in notebooks
Best practices for organizing ML experiments
In this lecture, we begin a new regression use case by introducing Boosted Trees regression in BigQuery ML. Before building the machine learning model, this tutorial focuses on the most important foundation step: data preparation and exploratory data analysis (EDA).
You will work with an insurance dataset and understand how to upload data, create datasets and tables in BigQuery, and explore the data to uncover meaningful patterns that influence insurance charges.
Uploading Insurance Data to Google Cloud Storage
We start by uploading the insurance CSV file to Google Cloud Storage (GCS):
Use an existing GCS bucket created earlier for BigQuery ML
Organize data inside the regression folder
Upload the insurance dataset that will be used for modeling
This step ensures the raw data is securely stored and ready for ingestion into BigQuery.
Creating Dataset and Table in BigQuery
After uploading the file, we move to BigQuery and perform the following steps:
Create a new dataset named insurance demo
Select an appropriate region to match the storage location
Create a native BigQuery table using the CSV file
Enable schema auto-detection for faster setup
Once the table is created, we review:
Table schema and column data types
Feature columns such as age, BMI, smoker status, region, and children
Target variable charges, which represents insurance cost
Understanding the Regression Problem
This is a regression problem, where the goal is to predict a numerical value:
Target variable: Insurance charges
Feature variables: Age, sex, BMI, children, smoker, and region
Understanding this distinction is critical before applying boosted trees regression.
Initial Data Exploration
We begin exploring the dataset to understand its structure and quality:
View sample records from the table
Confirm the total number of records
Validate that the dataset is suitable for regression modeling
Summary Statistics and Data Quality Checks
Next, we analyze key statistics for the target variable:
Total number of records
Minimum, maximum, and average insurance charges
Verification that there are no missing values in the target column
We then extend the missing value analysis across all columns to confirm that the dataset is complete and clean.
Analyzing Categorical Variables
For categorical features, we calculate unique value counts to understand data distribution:
Sex
Smoker
Region
This helps identify how many categories exist and how they may influence the model.
Feature-Level Insights and Relationships
We perform several exploratory analyses to understand relationships between features and insurance charges:
Smoker vs Charges:
Clear difference in average insurance charges between smokers and non-smokers
BMI Categories vs Charges:
Higher BMI categories show higher average insurance costs
Age vs Charges:
Insurance charges increase as age increases
Children vs Charges:
No strong correlation observed
Region vs Charges:
Moderate variation in charges across regions
Smoker vs Region:
Smokers consistently have higher charges across all regions
These insights help justify why boosted trees, which can capture non-linear relationships and feature interactions, are well-suited for this problem.
Key Concepts Covered
Uploading regression data to Google Cloud Storage
Creating datasets and tables in BigQuery
Understanding regression problems and target variables
Performing exploratory data analysis (EDA)
Checking data quality and missing values
Analyzing numerical and categorical features
Identifying important patterns before model training
In this lecture, we move from data exploration to building a powerful regression model using Boosted Trees in BigQuery ML. After understanding the insurance dataset in the previous tutorial, this session covers the complete machine learning workflow, including data splitting, model training, evaluation, and prediction.
Boosted Trees are well-suited for tabular data and can capture complex, non-linear relationships, making them an excellent choice for predicting insurance charges.
Splitting the Dataset into Training and Testing Sets
We begin by preparing the dataset for modeling using a deterministic, hash-based split.
Key points covered in this section:
Creating a new table that includes a data split indicator column
Using a deterministic hashing approach to ensure:
Consistent train-test splits
No overlap between training and testing data
Reproducible results across multiple runs
Assigning approximately 80% of data to training and 20% to testing
We then verify the split proportions to confirm that the dataset is correctly divided.
Creating Training and Testing Views
To keep the workflow efficient and modular:
A training data view is created from the split table
A testing data view is created for model evaluation
Views allow flexible reuse without duplicating data
This structured setup prepares the data for machine learning model training.
Building a Boosted Trees Regression Model
Next, we create and train a Boosted Trees regression model.
Important concepts explained include:
Selecting Boosted Tree Regressor for numerical prediction problems
Defining insurance charges as the target variable
Choosing feature columns from the training dataset
Understanding key model configuration options:
Number of training iterations
Learning rate and its impact on model updates
Row subsampling to improve generalization
Tree depth to control model complexity
Minimum node size to avoid overly specific splits
Early stopping to prevent overfitting and unnecessary computation
Each option is explained in simple terms so you understand how it affects model performance.
Understanding Model Training and Performance
After training, we explore the model details:
Model metadata and training configuration
Training progress, including:
Iteration versus loss
Iteration versus duration
Initial evaluation metrics on training data
Special focus is given to the R-squared (R²) score, which indicates how well the model explains variation in insurance charges.
Evaluating the Model on Test Data
We then evaluate the trained model using the test dataset to measure real-world performance.
Evaluation metrics covered include:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Mean Squared Log Error
Median Absolute Error
R-squared (R²) score
We compare training and test metrics to ensure the model generalizes well and is not overfitting.
Generating and Storing Predictions
Finally, we use the trained model to:
Generate predictions on the test dataset
Store prediction results in a new BigQuery table
Compare actual insurance charges with predicted values
Validate that predictions are reasonable and consistent
This step demonstrates how machine learning models deliver real business value through predictions.
Key Concepts Covered
Deterministic train-test data splitting
Creating reusable training and testing views
Boosted Trees regression in BigQuery ML
Understanding and tuning model parameters
Evaluating regression models using standard metrics
Generating and storing predictions
Building an end-to-end regression pipeline in BigQuery
In this lecture, we begin a new machine learning journey by introducing Binary Classification using BigQuery ML. Unlike regression problems where the goal is to predict numerical values, binary classification focuses on predicting one of two possible outcomes. In this tutorial, we work with an employee attrition dataset to predict whether an employee is likely to stay with the company or leave.
Before building any machine learning model, we focus on data ingestion and exploratory data analysis (EDA), which are critical steps for understanding the problem and preparing the dataset.
Uploading Classification Data to Google Cloud Storage
We start by preparing the raw data:
Use an existing Google Cloud Storage (GCS) bucket created earlier
Create a new folder dedicated to classification datasets
Upload the employee attrition CSV file to Cloud Storage
This ensures that the data is securely stored and ready for processing in BigQuery.
Creating Dataset and Table in BigQuery
After uploading the data, we move to BigQuery Studio and perform the following steps:
Create a new dataset named attrition demo
Select the appropriate region to match the Cloud Storage location
Create a native BigQuery table using the CSV file
Enable schema auto-detection for faster setup
Once the table is created, we review:
Table schema and column data types
Feature columns such as age, gender, department, job role, income, and more
Target variable attrition, which indicates whether an employee leaves the organization
Understanding the Binary Classification Problem
This is a binary classification problem because:
The target variable has only two possible values: true and false
True indicates that an employee leaves the company
False indicates that an employee stays with the company
Understanding this distinction is essential before applying classification algorithms.
Exploring the Dataset
We then perform exploratory data analysis to understand the structure and quality of the data:
View sample records from the dataset
Review column names and data types
Identify which features are numerical and which are categorical
Confirm the total number of records
Checking Data Quality and Missing Values
Next, we verify data completeness:
Count total records in the dataset
Check for missing values in key columns such as attrition, age, and job role
Confirm that the dataset does not contain null values in important fields
This step ensures the data is suitable for machine learning.
Analyzing Target Variable Distribution
To understand class balance, we analyze the attrition distribution:
Count employees who stay versus those who leave
Identify class imbalance, which is common in real-world classification problems
Understanding class distribution helps in selecting the right evaluation metrics later.
Feature-Level Insights and Relationships
We explore how different features relate to employee attrition:
Gender vs Attrition: Compare attrition rates between male and female employees
Department-wise Attrition: Calculate attrition percentage across departments
Job Role vs Attrition: Identify roles with higher attrition rates
Numerical Feature Summary: Analyze minimum, maximum, average, and standard deviation for income and other numerical features
These insights provide strong intuition about which features may influence employee attrition.
Key Concepts Covered
Introduction to binary classification problems
Uploading classification data to Google Cloud Storage
Creating datasets and tables in BigQuery
Understanding target variables and feature columns
Exploratory data analysis (EDA) for classification
Analyzing categorical and numerical features
Understanding class distribution and attrition patterns
In this lecture, we continue our journey into Binary Classification using BigQuery ML by building, evaluating, and using a Logistic Regression model. In the previous tutorial, we prepared the employee attrition dataset and explored the data. In this session, we focus on converting that data into a complete, end-to-end machine learning pipeline.
The goal of this lecture is to predict employee attrition, meaning whether an employee is likely to leave the organization or stay.
Splitting the Dataset into Training and Testing Sets
We begin by preparing the data for model training and evaluation.
Key steps covered include:
Adding a deterministic hash column to uniquely identify each record
Creating a new table that includes this hash value
Ensuring that the same employee record never appears in both training and testing datasets
Using the hash value to split the data into:
80% training data
20% testing data
This approach guarantees consistent, reproducible, and non-overlapping data splits, which is a best practice in machine learning workflows.
Creating Training and Testing Views
After adding the hash column:
A training data view is created using records assigned to the training split
A testing data view is created using records assigned to the testing split
Views are used to keep the workflow efficient and flexible without duplicating data
This setup prepares clean and structured inputs for model training.
Training a Binary Classification Model Using Logistic Regression
Next, we train a Logistic Regression model, which is one of the most commonly used algorithms for binary classification problems.
In this section, you will learn:
How to create a machine learning model inside BigQuery
Why logistic regression is suitable for predicting binary outcomes
How to define the target variable (attrition)
How to select feature columns from the training dataset
How BigQuery ML handles model training automatically
Once training is complete, we explore the model details and configuration.
Understanding Model Training and Metrics
After the model is created, we review important training insights:
Number of training iterations
Loss reduction during training
Learning rate behavior
Feature columns used in the model
We then analyze training performance metrics, including:
Accuracy
Precision
Recall
F1 score
Log loss
ROC and AUC
You will also see how changing the classification threshold impacts these metrics, which is critical for real-world decision-making.
Evaluating the Model on Test Data
To measure real-world performance, we evaluate the model using the test dataset.
This section explains:
Why test data evaluation is important
How training and test accuracy should be compared
Interpreting evaluation metrics on unseen data
Verifying that the model generalizes well and is not overfitting
The similarity between training and test accuracy confirms the reliability of the model.
Making and Storing Predictions
Finally, we use the trained model to generate predictions:
Predict employee attrition on the test dataset
Generate both:
Predicted class (true or false)
Predicted probabilities for each class
Store prediction results in a new BigQuery table
Understand how to interpret prediction outputs and probabilities
You will also learn how this same process can be applied to new, unseen data in real-world scenarios.
Key Concepts Covered
Binary classification concepts
Deterministic train-test data splitting
Logistic regression in BigQuery ML
Model training and evaluation metrics
Precision, recall, accuracy, F1 score, and ROC AUC
Making predictions with probability scores
End-to-end classification workflow in BigQuery
In this lecture, we begin a new machine learning topic: Multi-Class Classification using BigQuery ML. Unlike binary classification, where the target variable has only two possible outcomes, multi-class classification involves predicting one class out of many possible categories.
In this tutorial, we work with a wine quality dataset, where the goal is to predict the quality category of wine based on several chemical and physical properties.
Uploading Multi-Class Classification Data to Google Cloud Storage
We start by preparing the raw dataset:
Use an existing Google Cloud Storage (GCS) bucket created earlier
Upload a new CSV file containing wine quality data into the classification folder
Ensure the dataset is securely stored and ready for BigQuery processing
Creating Dataset and Table in BigQuery
After uploading the data, we move to BigQuery Studio and perform the following steps:
Create a new dataset named wine quality demo
Select the appropriate region to align with the storage location
Create a native BigQuery table using the CSV file
Enable schema auto-detection for faster table creation
Once the table is created, we review:
Table schema and column data types
Feature columns such as acidity, alcohol, pH, sugar, and more
Target variable quality, which contains multiple class labels
Understanding the Multi-Class Classification Problem
This is a multi-class classification problem because:
The target variable has more than two possible classes
Wine quality values range across multiple categories
The model must predict one quality class per record
Understanding the nature of the target variable is critical before building a classification model.
Exploring the Wine Quality Dataset
We then perform exploratory data analysis (EDA) to understand the dataset:
View sample records from the table
Confirm total record count
Identify minimum and maximum values for the quality column
Understand how many unique classes exist in the target variable
Analyzing Target Class Distribution
To assess class balance, we analyze the distribution of quality classes:
Count records for each quality category
Observe whether the dataset is balanced or skewed
Understand potential challenges for multi-class modeling
Feature-Level Insights
We explore relationships between features and the target variable:
Analyze average alcohol content by wine quality
Observe how alcohol levels vary across different quality classes
Gain intuition about feature importance before model training
Checking Data Quality and Missing Values
Next, we assess data completeness:
Identify columns with missing values
Count the number of missing records per column
Evaluate whether missing values require treatment
Since missing values are minimal compared to the dataset size, we note them for future handling.
Splitting the Dataset into Training and Testing Sets
To prepare the data for modeling, we split it into training and testing datasets using a deterministic approach:
Add a hash-based data split column to the dataset
Ensure consistent and reproducible train-test splits
Assign approximately 80% of records to training and 20% to testing
This approach prevents data leakage and ensures reliable model evaluation.
Creating Training and Testing Views
Finally, we create structured views for modeling:
A training data view containing only training records
A testing data view containing only testing records
Rename columns to remove spaces for better compatibility with machine learning models
This structured setup ensures clean and ready-to-use inputs for the next steps.
Key Concepts Covered
Introduction to multi-class classification
Uploading classification data to Google Cloud Storage
Creating datasets and tables in BigQuery
Understanding multi-class target variables
Exploratory data analysis for multi-class problems
Handling missing values
Deterministic train-test data splitting
Preparing clean training and testing views
In this lecture, we move to the most important phase of the multi-class classification workflow: training, evaluating, and interpreting a machine learning model using Boosted Tree Classifier in BigQuery ML. In the previous tutorial, we explored the wine quality dataset and prepared clean training and testing views. In this session, we use that prepared data to build a powerful classification model.
The objective of this lecture is to predict the wine quality category, where the target variable contains multiple classes, making this a true multi-class classification problem.
Training a Boosted Tree Classification Model
We begin by creating and training a Boosted Tree Classifier.
Key concepts explained in this section include:
Creating or replacing a machine learning model inside a BigQuery dataset
Using Boosted Tree Classifier, a gradient boosted decision tree algorithm suitable for complex classification problems
Defining the target variable (wine quality) for multi-class prediction
Understanding important training options:
Maximum number of boosting iterations
Learning rate and its impact on model updates
Maximum tree depth to control model complexity
Row subsampling to reduce variance and prevent overfitting
Early stopping to halt training when performance stops improving
Each parameter is explained in simple terms so you understand how it influences model behavior.
Understanding Model Training Execution
Once training starts, we examine the model execution details:
Job execution summary and resource usage
Training stages such as validation, preprocessing, training, and evaluation
Number of planned versus completed training iterations
Training duration and computational effort
Execution graphs showing:
Iteration versus loss
Iteration versus duration
Learning rate behavior
These insights help you understand how the model learns during training.
Exploring Model Performance on Training Data
After training, we explore the model details:
Model metadata and configuration
Training performance metrics
Overall training accuracy
Confusion matrix showing how predictions are distributed across all classes
The confusion matrix helps visualize how well the model distinguishes between different wine quality categories.
Evaluating the Model on Test Data
Next, we evaluate the model using the test dataset, which was kept separate during training.
In this section, you will learn:
How to evaluate a multi-class classification model on unseen data
Why test accuracy is important for measuring generalization
How to compare training accuracy and test accuracy
Interpreting evaluation metrics for multi-class classification
We observe that test accuracy is slightly lower than training accuracy, which is expected and indicates realistic model behavior.
Making Predictions and Storing Results
After evaluation, we generate predictions on the test data and store the results in a new table.
This part of the lecture covers:
Generating predicted class labels for each record
Understanding predicted probabilities for each class
Selecting the final predicted class based on the highest probability
Storing prediction results in BigQuery for further analysis
This demonstrates how the trained model can be used for real-world inference.
Understanding Feature Importance
Finally, we analyze feature importance to interpret the model’s decisions.
Key points include:
Identifying which features contribute most to predictions
Understanding importance metrics such as:
Importance gain
Importance weight
Importance cover
Observing which chemical properties most influence wine quality predictions
Feature importance helps explain model behavior and builds trust in machine learning results.
In this lecture, we begin our journey into Time Series Forecasting using BigQuery ML. Time series forecasting is used when data is collected over time and the goal is to predict future values based on historical patterns. This tutorial introduces the core concepts, tools, and workflow required to build a time series forecasting model directly inside BigQuery.
We focus on understanding the ARIMA+ model, which is a powerful and scalable approach provided by BigQuery ML for forecasting time-based data.
Understanding Time Series Forecasting in BigQuery
We start by reviewing the official Google Cloud documentation for time series forecasting to understand:
What time series forecasting is
When to use univariate forecasting with ARIMA models
When to use multivariate forecasting with ARIMA + external regressors
The overall time series modeling pipeline, including:
Data preprocessing
Time series decomposition
Forecast generation
Forecast explanation
Anomaly detection
We also discuss how BigQuery ML supports large-scale time series forecasting, including handling multiple time series and automatic scaling using distributed cloud resources.
Uploading Time Series Data to Google Cloud Storage
Next, we prepare the dataset for modeling:
Use an existing Google Cloud Storage (GCS) bucket
Create a dedicated folder for time series data
Upload a CSV file containing historical Superstore sales data
This step ensures that the raw time series data is securely stored and ready for analysis.
Creating Dataset and Table in BigQuery
After uploading the data, we move to BigQuery Studio and:
Create a new dataset named Superstore Sales Demo
Select the appropriate region to match the storage location
Create a native BigQuery table using the uploaded CSV file
Enable schema auto-detection for faster setup
We then review:
Table schema and data types
Key columns such as order date and sales amount
Total number of records available for forecasting
Exploring and Visualizing Time Series Data
Before building the model, we explore and visualize the data to understand trends and patterns:
View sample records ordered by date
Aggregate daily sales values
Visualize sales trends using:
Line charts
Bar charts
Scatter plots
Visualization helps identify seasonality, trends, and potential anomalies in the data.
Creating and Training a Time Series Forecasting Model
We then build a time series forecasting model using ARIMA+ in BigQuery ML.
Key concepts covered include:
Selecting ARIMA+ as the model type
Defining the time column (order date)
Defining the value column (total sales)
Adding holiday effects to improve forecast accuracy
Training the model using aggregated historical sales data
BigQuery ML automatically handles model selection, parameter tuning, and optimization.
Understanding Model Evaluation and Diagnostics
After training, we explore the model details and evaluation results:
Model metadata and training configuration
Automatically generated ARIMA model variants
Key ARIMA parameters such as:
Autoregressive terms (p)
Differencing (d)
Moving average terms (q)
Detection of:
Seasonality patterns
Holiday effects
Spikes, dips, and structural changes
Understanding AIC (Akaike Information Criterion):
Used as a model quality metric
Lower AIC indicates a better-fitting model
This section helps you understand how BigQuery ML evaluates and selects the best forecasting model.
In this lecture, we complete the time series forecasting workflow in BigQuery ML by focusing on model evaluation, future forecasting, forecast explanation, and anomaly detection. In the previous tutorial, we explored and visualized the data and created a time series forecasting model using ARIMA+. This session demonstrates how to use that trained model to extract meaningful insights and actionable predictions.
Evaluating the Time Series Model
We begin by evaluating the trained ARIMA+ model to understand how well it fits the historical data.
In this section, you will learn:
How to evaluate a time series model using BigQuery ML
How BigQuery ML generates multiple candidate ARIMA configurations
How to interpret evaluation outputs visually using charts
Why evaluation is important before trusting future forecasts
This evaluation step helps confirm that the model has learned the underlying patterns in the data.
Inspecting Model Coefficients
Next, we inspect the ARIMA model coefficients to understand how the model behaves internally.
Key concepts explained include:
Autoregressive (AR) coefficients
Indicate how strongly past values influence current values
Moderate positive influence suggests recent history matters
Moving Average (MA) coefficients
Capture the impact of past prediction errors
Positive and negative values reflect different correction patterns
Intercept or drift
Represents the baseline trend in the time series
Indicates whether the series has an upward or downward trend
This step provides interpretability and helps build confidence in the model.
Forecasting Future Sales
After evaluation, we use the trained model to forecast future sales values.
In this part of the lecture, you will learn:
How to generate forecasts for a defined future horizon
Forecasting the next 30 time periods
Understanding confidence levels in forecasts
Storing forecast results in a BigQuery table for further analysis
This demonstrates how time series models deliver real business value by predicting future outcomes.
Understanding Forecast Output Columns
We carefully review the forecast output table and explain each column:
Forecast timestamp: Date for which the prediction is made
Forecast value: Predicted sales value
Standard error: Measure of uncertainty in the prediction
Confidence level: Probability that the true value lies within the interval
Prediction interval (lower and upper bounds): Expected range of values
Confidence interval (lower and upper bounds): Statistical confidence range
Understanding these fields helps interpret not just predictions, but also the uncertainty around them.
Explaining the Forecast
We then generate a forecast explanation to understand how the model adjusted historical data:
Review adjusted time series values
Analyze confidence and prediction intervals
Understand how BigQuery ML explains forecast behavior
This step is useful for transparency and model validation.
Detecting Anomalies in Historical Data
Finally, we use the trained model to detect anomalies in historical sales data.
This section covers:
Identifying unusual spikes or drops in sales
Understanding anomaly flags (true or false)
Interpreting anomaly probability scores
Reviewing lower and upper bounds for expected values
Anomaly detection is extremely useful for identifying outliers, operational issues, or unexpected business events.
Key Concepts Covered
Evaluating time series models in BigQuery ML
Interpreting ARIMA model coefficients
Forecasting future values with confidence intervals
Explaining time series forecasts
Detecting anomalies in historical data
Understanding uncertainty and probability in forecasts
In this lecture, we begin an exciting new topic: Matrix Factorization, which is widely used to build recommendation systems. Recommendation engines power many real-world applications such as movie suggestions, product recommendations, and content personalization. In this tutorial, you will learn how to design and prepare a recommendation system using BigQuery ML.
The focus of this lecture is on data preparation, model setup, and infrastructure requirements needed to train a matrix factorization model in BigQuery.
Understanding Matrix Factorization for Recommendations
Matrix factorization is a machine learning technique used to:
Learn user preferences based on historical interactions
Identify relationships between users and items
Generate personalized recommendations
In this use case, we build a movie recommendation system using user ratings data.
Uploading Recommendation Data to Google Cloud Storage
We begin by preparing the raw datasets:
Use an existing Google Cloud Storage (GCS) bucket
Create a dedicated folder for matrix factorization
Upload two CSV files:
Movies (movie details such as title and genres)
Ratings (user ratings for movies)
This step ensures the data is securely stored and ready for BigQuery processing.
Creating Dataset and Tables in BigQuery
Next, we move to BigQuery Studio and perform the following steps:
Create a new dataset named recommendation demo
Select the appropriate region for processing
Create two native BigQuery tables:
Movies table containing movie metadata
Ratings table containing user ratings and timestamps
We then explore both tables to understand their structure and identify the common column (movie ID) that links them.
Creating the Final Training Dataset
To prepare data for matrix factorization:
We join the movies and ratings tables using the movie ID
Create a final consolidated table containing:
User ID
Movie ID
Rating
Timestamp
Movie title
Movie genres
This combined dataset forms the foundation for training the recommendation model.
Creating the Matrix Factorization Model
After preparing the final dataset, we configure the Matrix Factorization model:
Specify the model type as matrix factorization
Define:
User column
Item (movie) column
Rating column
Select relevant columns from the final dataset for training
At this stage, we encounter an important limitation related to BigQuery ML infrastructure.
Understanding BigQuery Reservation Requirements
Matrix factorization models cannot be trained using on-demand BigQuery resources. To proceed, we must configure a BigQuery reservation.
In this section, you learn:
Why reservations are required for matrix factorization
How to create a reservation using Capacity Management
Selecting:
Reservation type (Enterprise)
Maximum slot capacity
Autoscaling behavior
Understanding baseline slots, autoscaling slots, and cost estimation
Reviewing advanced reservation settings
This step is critical and often overlooked, making it an important real-world lesson.
Assigning the Reservation to the Project
Once the reservation is created:
We assign it to the active Google Cloud project
Specify the job type for the reservation
Ensure BigQuery jobs can use the reserved capacity
Only after this assignment can the matrix factorization model be trained.
Training the Matrix Factorization Model
With the reservation in place:
The training process starts successfully
BigQuery ML begins learning latent factors for users and movies
Training runs using the reserved compute capacity
This completes the setup and training phase of the recommendation system.
Key Concepts Covered
Introduction to matrix factorization
Recommendation systems in BigQuery ML
Preparing user–item interaction data
Joining multiple datasets for model training
Creating matrix factorization models
BigQuery capacity reservations and slot management
Assigning reservations to projects
Infrastructure requirements for advanced ML models
In this lecture, we complete the Matrix Factorization workflow by focusing on model verification, recommendation generation, enrichment, and cleanup. In the previous tutorial, we successfully trained a matrix factorization model using BigQuery ML. In this session, we explore the trained model in detail and use it to generate meaningful movie recommendations.
This lecture demonstrates how recommendation systems work end to end and how they can be applied in real-world scenarios.
Verifying the Trained Matrix Factorization Model
We begin by verifying that the matrix factorization model has been created successfully.
In this section, you will learn how to:
Locate the trained model inside the BigQuery dataset
Review model metadata such as:
Model type and model ID
Creation time and location
Training configuration and parameters
Understand important training settings, including:
Number of latent factors
Iterations completed
Regularization settings
Early stopping behavior
Review training diagnostics such as:
Iteration versus loss
Iteration versus duration
We also examine evaluation metrics like MAE, MSE, and R-squared to understand overall model quality.
Understanding Model Inference for Recommendations
Once the model is verified, we move into inference, where the recommendation system starts delivering value.
You will understand:
How matrix factorization predicts user–item interactions
Why there is no label column in recommendation models
How predicted ratings represent user preference strength
This lays the foundation for generating recommendations.
Generating Recommendations for All Users
Next, we generate recommendations for all users in the dataset:
Store recommendation results in a new BigQuery table
Review recommendation outputs including:
User ID
Movie ID
Predicted rating
These predicted ratings indicate how strongly a user is expected to like a particular movie.
Enriching Recommendations with Movie Titles
To make recommendations more meaningful and user-friendly, we enrich them by:
Joining recommendation results with movie metadata
Replacing movie IDs with movie titles
Sorting recommendations by predicted rating
Reviewing top recommendations across users
You will also learn why predicted ratings may exceed the original rating scale and how they should be interpreted as relative preference scores rather than exact ratings.
Generating Recommendations for a Specific User
In real-world applications, recommendations are often generated per user.
In this section, you learn how to:
Generate personalized recommendations for a specific user
Retrieve top recommended movies for that user
Rank recommendations based on predicted preference
This demonstrates how matrix factorization powers personalized recommendation engines.
Cleaning Up BigQuery Reservations (Important Cost Management Step)
Since matrix factorization requires BigQuery reservations, we perform an important cleanup step:
Remove reservation assignments from the project
Delete the reservation entirely
Ensure no unused resources remain active
This step is critical to avoid unexpected cloud costs and reflects best practices for production environments.
Summary of the Matrix Factorization Workflow
By the end of this lecture, you will clearly understand the complete process:
Preparing and joining user–item data
Training a matrix factorization model
Verifying model training and evaluation
Generating recommendations for all users
Creating personalized recommendations
Managing and cleaning up BigQuery resources
Key Concepts Covered
Recommendation systems using matrix factorization
Model inspection and evaluation
Generating and storing recommendations
Enriching recommendations with metadata
Personalized recommendations for individual users
BigQuery cost and reservation management
End-to-end recommendation workflow in BigQuery ML
In this lecture, we begin learning about Autoencoders in BigQuery ML, with a strong focus on data preparation and feature engineering, which are critical steps before training any anomaly detection model. Autoencoders are commonly used for identifying unusual patterns or outliers in data, and clean, well-structured input data is essential for their success.
This tutorial prepares the foundation required to build an autoencoder-based anomaly detection system.
Uploading Data to Google Cloud Storage
We start by preparing the raw data:
Use an existing Google Cloud Storage (GCS) bucket
Create a new folder dedicated to autoencoders
Upload the Superstore sales CSV file into this folder
This ensures that the source data is securely stored and ready for BigQuery processing.
Creating Dataset and Table in BigQuery
After uploading the file, we move to BigQuery Studio and perform the following steps:
Create a new dataset named market demo
Select the appropriate region to align with the storage location
Create a native BigQuery table using the uploaded CSV file
Enable schema auto-detection for quick setup
Once the table is created, we review:
Table schema and column data types
Sample records from the dataset
Total number of records available for analysis
Comprehensive Data Exploration
Before building any machine learning model, we perform a detailed exploration of the data.
In this step, we analyze:
Total number of records
Presence of null values in the sales column
Minimum, maximum, and average sales values
Standard deviation of sales
This overview helps us understand the overall distribution and scale of the data.
Identifying Obvious Outliers
Next, we check for extreme values in the sales data:
Analyze the highest sales values
Review how frequently large values occur
Identify potential outliers that may impact modeling
This step provides early insights into unusual patterns in the dataset.
Data Cleaning with Additional Feature Engineering
After exploration, we move into data cleaning and feature enhancement, which is essential for anomaly detection.
In this step, we:
Remove null and invalid sales values
Ensure sales data is properly converted to numeric format
Add new analytical features, including:
A unique sequential row identifier
Sales percentiles to understand relative position in the dataset
Z-scores to measure how far each value deviates from the average
These additional features help quantify abnormal behavior in the data.
Building the Features Table for Anomaly Detection
Finally, we prepare the features table, which is required for autoencoder-based anomaly detection in BigQuery ML.
Key ideas covered include:
Creating a compact feature vector for each record
Packing the sales value into a structured feature format
Retaining a row identifier to trace anomalies back to original records
This table represents the final input that will be used by the autoencoder model.
Key Concepts Covered
Introduction to autoencoders and anomaly detection
Uploading data to Google Cloud Storage
Creating datasets and tables in BigQuery
Exploratory data analysis for anomaly detection
Identifying extreme values and outliers
Data cleaning and validation
Feature engineering using percentiles and Z-scores
Preparing feature vectors for BigQuery ML models
In this lecture, we complete the Autoencoder-based anomaly detection workflow using BigQuery ML. In the previous tutorial, we explored the data, cleaned it with additional features, and prepared a features table. In this session, we move from data preparation to model training, evaluation, anomaly detection, and operational monitoring.
This lecture demonstrates how autoencoders can be used to identify unusual patterns in large datasets and how to convert those results into a production-ready anomaly monitoring table.
Verifying Cleaned and Feature-Engineered Data
We begin by verifying the cleaned dataset created earlier:
Review columns such as sales, row ID, sales percentile, and Z-score
Confirm that the data is properly structured and ready for modeling
Validate that feature engineering steps were applied correctly
This ensures the autoencoder receives clean and meaningful input.
Training the Autoencoder Model
Next, we train an Autoencoder model for anomaly detection.
Key concepts covered include:
Creating and training an autoencoder in BigQuery ML
Limiting training iterations for faster convergence
Using early stopping to automatically halt training when improvements stop
Setting minimum relative improvement thresholds to avoid unnecessary computation
We review job execution details such as training duration, data processed, and iteration behavior.
Evaluating the Autoencoder Model
After training, we evaluate the model to understand reconstruction performance:
Review evaluation metrics such as:
Mean Absolute Error (MAE)
Mean Squared Error (MSE)
Mean Squared Log Error
Understand how these metrics indicate how well the model reconstructs normal data
Use both model details and explicit evaluation queries for validation
This confirms that the autoencoder is working as expected.
Quick Anomaly Preview
We then generate a quick anomaly preview to identify the most unusual records:
Detect anomalies based on reconstruction error
Specify an approximate contamination rate (percentage of anomalies)
Sort records by mean squared error to surface the most extreme anomalies
This gives an immediate view of high-risk data points.
Anomaly Summary Statistics
To better understand anomaly distribution, we compute summary statistics:
Count of normal vs anomalous records
Percentage of anomalies relative to total data
Verification that anomaly proportion aligns with the chosen contamination rate
This step provides confidence in anomaly labeling consistency.
Materializing Anomalies into a Monitoring Table
Next, we convert anomaly detection results into a production-ready monitoring table.
In this step, we:
Create a consolidated table that stores anomaly detection results
Include key fields such as:
Sales values
Percentiles and Z-scores
Reconstruction error
Anomaly flag
Categorize anomalies by:
Anomaly type (high value, low value, mid-range, normal)
Severity level (severe, high, medium, low, normal)
Add timestamps for tracking and auditing
This table is ideal for dashboards, alerts, and downstream analytics.
Querying and Analyzing Anomalies
Finally, we analyze the anomaly monitoring table to extract insights:
Count anomalies by type
Analyze severity level distribution
Combine anomaly type and severity for deeper analysis
Retrieve the top most extreme anomalies for investigation
These queries demonstrate how anomaly detection results can be operationalized.
Key Concepts Covered
Autoencoder training in BigQuery ML
Model evaluation for anomaly detection
Detecting anomalies using reconstruction error
Contamination rate and anomaly thresholds
Creating anomaly monitoring tables
Classifying anomalies by type and severity
Querying and analyzing anomaly results
Building dashboards and alert-ready datasets
In this lecture, we introduce a complete Feature Engineering Pipeline using BigQuery, which is a critical step before building any machine learning model. Feature engineering transforms raw data into meaningful inputs that machine learning algorithms can understand and learn from effectively.
This tutorial focuses on designing, applying, and validating feature engineering techniques using SQL in BigQuery, following a structured and practical approach.
Creating the Dataset and Sample Data
We begin by creating a dedicated dataset named Feature Demo, which will be used throughout this pipeline.
Next, we generate sample customer data to demonstrate feature engineering techniques. This synthetic dataset is intentionally designed to include:
Numerical features
Categorical features
Date-based features
Target variable for prediction
Missing (null) values to simulate real-world data issues
The customer data includes fields such as:
Age
Gender
Region
Last activity score
Purchase indicator (target variable)
Customer tenure (date since signup)
Total spending
Last visit date
Support tickets
This dataset helps us demonstrate how raw, imperfect data can be transformed into model-ready features.
Performing Data Quality Checks
Before applying feature engineering, we perform data quality checks to understand the dataset:
Total number of records
Count of non-null values in key columns
Detection of missing values
Basic statistics such as:
Average age
Average total spend
Purchase rate
These insights help guide decisions such as how to handle missing values and what default values to use.
Comprehensive Feature Engineering Pipeline
This section forms the core of the lecture, where raw data is transformed into rich machine learning features.
1. Handling Missing Values
Numerical fields are filled with meaningful defaults (based on data statistics)
Categorical fields are replaced with an “Unknown” category
Date fields are assigned default placeholder values
Ensures no null values remain in important features
2. One-Hot Encoding for Categorical Variables
Converts categorical values (gender, region) into binary indicator columns
Makes categorical data usable by machine learning algorithms
3. Numerical Transformations
Calculates time-based features such as:
Days since last visit
Derives behavioral metrics like:
Spend per day
4. Interaction Features
Combines multiple features to capture complex patterns:
Age × activity score
Activity score × total spend
Helps the model understand behavioral relationships
5. Feature Normalization
Applies min–max scaling to numerical features
Scales values between 0 and 1
Prevents features with large values from dominating model training
6. Creating Categorical Buckets
Groups continuous values into meaningful categories:
Age groups (young, middle, senior)
Activity levels (low, medium, high)
Spending tiers (no spend, low, medium, high)
All engineered features are stored in a new table called Customer Features.
Exploring and Understanding the Engineered Features
We then explore the newly created feature table to understand:
How missing values have been handled
How categorical variables are encoded
How normalized values fall within the expected range
How interaction and bucketed features add context
This step helps validate that feature transformations are logical and useful.
Verifying Feature Engineering Quality
To ensure the pipeline is reliable, we perform validation checks:
Confirm no null values remain in critical columns
Verify the target variable is complete
Validate row counts before and after transformation
This ensures the data is clean, consistent, and model-ready.
Key Concepts Covered
Importance of feature engineering in machine learning
Creating synthetic datasets for experimentation
Data quality assessment
Handling missing values
One-hot encoding categorical features
Feature normalization
Feature interactions
Bucketizing continuous variables
Validating feature pipelines in BigQuery
In this lecture, we continue our Feature Engineering Pipeline in BigQuery by moving from data preparation to machine learning model creation, evaluation, interpretation, and prediction. In the previous tutorial, we completed data creation, quality checks, and comprehensive feature engineering. Now, we use those engineered features to build and compare multiple machine learning models.
This lecture demonstrates how different algorithms perform on the same feature set and how to interpret their results.
Creating a Logistic Regression Model
We begin by training a Logistic Regression model to solve a binary classification problem—predicting whether a customer will make a purchase.
Key concepts covered:
Why logistic regression is suitable for binary classification
Defining the target variable (purchase or no purchase)
Automatically handling class imbalance using class weights
Limiting training iterations to prevent overfitting
Using both raw and engineered features for training
After training, we explore the model details, including training behavior, loss curves, and evaluation metrics.
Understanding Logistic Regression Evaluation
We review the evaluation results of the logistic regression model, including:
Accuracy
Precision and recall
F1 score
Log loss
Area Under the ROC Curve (AUC)
Confusion matrix and ROC curves
These metrics help assess how well the model predicts customer purchase behavior.
Training a Random Forest Classification Model
Next, we train a Random Forest classifier, which is more powerful for capturing:
Non-linear relationships
Feature interactions
Complex decision boundaries
Important concepts explained:
Using multiple decision trees for better generalization
Voting mechanism across trees
Enabling global explainability for feature importance analysis
We explore the trained model, review training progress, and understand evaluation outputs.
Comparing Model Performance
We evaluate both models using the same evaluation approach and compare:
Accuracy
Precision
Recall
F1 score
Log loss
AUC
This comparison highlights how different algorithms perform on the same feature-engineered dataset.
Feature Importance Analysis
To understand which features matter the most, we perform feature importance analysis for both models.
Random Forest Feature Importance
Identifies which features contribute most to prediction decisions
Measures how much each feature reduces impurity across all trees
Highlights top drivers such as spending behavior and activity levels
Logistic Regression Feature Weights
Interprets model coefficients directly
Positive weights indicate increased probability of purchase
Negative weights indicate decreased probability
Absolute weight shows feature impact strength
This section helps you move beyond prediction and into model explainability.
Making Predictions with Probability Scores
Finally, we use the trained model to make predictions:
Generate predicted labels (purchase or no purchase)
Output probability scores for each prediction
Sort customers based on likelihood of purchase
Probability scores allow for more informed business decisions, such as targeting high-probability customers.
Key Concepts Covered
Training classification models in BigQuery ML
Logistic regression vs random forest
Model evaluation and comparison
Interpreting performance metrics
Feature importance and model explainability
Predicting outcomes with probability scores
Using feature-engineered data for ML models
In this lecture, we complete the Feature Engineering Pipeline in BigQuery by moving beyond basic modeling into advanced customer segmentation, business insights generation, and final model deployment. This session brings together everything we have built so far and shows how feature engineering directly translates into real business value.
By the end of this lecture, you will understand how to transform engineered features into actionable customer intelligence and how to build a production-ready machine learning model.
Recap of Completed Pipeline Steps
Before moving forward, we briefly recap the steps already completed:
Creating sample customer data
Performing data quality checks
Applying comprehensive feature engineering
Verifying engineered features
Training machine learning models (Logistic Regression and Random Forest)
Evaluating model performance
Performing feature importance analysis
Making predictions with probability scores
With this foundation in place, we move to the advanced stages of the pipeline.
Advanced Customer Segmentation (Step 9)
In this section, we build sophisticated customer segments using behavioral and financial features.
Customer Value Segmentation
Customers are grouped based on spending behavior and activity level into categories such as:
High Value – Active
High Value – Inactive
Low Value – Active
Low Value – Inactive
This segmentation helps identify customers who contribute the most value and those who need re-engagement.
Churn Risk Assessment
We assess churn risk using:
Number of support tickets
Recency of customer visits
Customers are classified into High, Medium, or Low churn risk, enabling proactive retention strategies.
Percentile Rankings
We calculate percentile rankings for:
Spending behavior
Activity levels
This shows where each customer stands relative to others on a 0–1 scale.
All these insights are stored in a new Customer Segments table.
Generating Business Insights (Step 10)
Next, we derive actionable business insights from customer segments.
Customer Segment Distribution
We analyze each segment to understand:
Customer count
Purchase conversion rate
Average spending
Average activity level
This helps identify the most valuable and responsive customer groups.
Churn Risk Analysis
We study how churn risk impacts purchase behavior by analyzing:
Purchase rate by churn risk
Average support tickets
Average days since last visit
This highlights the relationship between customer engagement and churn.
Regional Performance Analysis
We evaluate performance across regions by comparing:
Customer count
Purchase rate
Average spending
These insights help tailor regional marketing strategies.
Final Model Creation with Enhanced Features (Step 11)
With enriched features and segments, we build a final advanced machine learning model:
Uses a Boosted Tree Classifier
Trained on:
Core demographic features
Behavioral features
Engineered features
Customer segments and risk indicators
Optimized using:
Higher iterations
Learning rate tuning
Subsampling
Enabled global explainability for transparency
This becomes the most powerful and business-aware model in the pipeline.
Final Model Comparison
We compare all trained models:
Logistic Regression
Random Forest
Enhanced Boosted Tree
Models are evaluated using:
Accuracy
Precision
Recall
F1 Score
ROC AUC
This comparison helps identify the best-performing model based on objective metrics.
Final Predictions with Business Context
We then generate business-ready predictions and store them in a dedicated table.
Each prediction includes:
Customer details
Behavioral and financial metrics
Customer segment
Marketing priority (High / Medium / Low)
Purchase probability
Predicted purchase outcome
Prediction timestamp
These enriched predictions can be directly used by marketing, sales, and business teams.
Business Insights from Predictions
Finally, we analyze saved predictions to answer real business questions:
High-priority customer summary
Marketing priority distribution
Customer segment vs marketing priority analysis
This shows how machine learning outputs can be transformed into decision-making insights.
Key Concepts Covered
Advanced customer segmentation
Churn risk assessment
Percentile-based customer ranking
Business insight generation
Advanced boosted tree modeling
Model comparison and selection
Business-context-aware predictions
End-to-end feature engineering pipeline
In this lecture, we begin our journey into Vertex AI Model Garden, one of the most powerful and important components of Google Cloud Vertex AI. This session focuses on understanding what Model Garden is, why it exists, and how it can be used to quickly access and experiment with modern AI models.
By the end of this lecture, you will have a clear conceptual understanding of Model Garden and how it fits into the broader Vertex AI ecosystem.
What Is Vertex AI Model Garden?
Vertex AI Model Garden is Google Cloud’s centralized hub for AI and machine learning models. You can think of it as an “app store for AI models”, where you can:
Discover Google’s foundation models
Explore partner and open-source models
Experiment with models directly in Vertex AI Studio
Fine-tune or deploy models for real-world use
It provides a single interface to explore, test, and use a wide variety of pre-trained and open-source models.
Navigating Vertex AI and Model Garden
In this lecture, we walk through:
Accessing Vertex AI from the Google Cloud Console
Understanding the main Vertex AI sections such as:
Tools
Notebooks
Vertex AI Studio
Agent Builder
Data, Model Development, and Deployment
Within the Tools section, we focus on Model Garden and explore its layout and capabilities.
Capabilities and Tasks Supported
Model Garden supports a wide range of AI tasks, including:
Text generation
Text classification
Entity extraction
Image classification
Image segmentation
Text embeddings
Multimodal use cases
These tasks can be performed using ready-to-use models without building everything from scratch.
Model Collections and Providers
You will learn about different model collections available in Model Garden, such as:
Google models
Partner models
Self-deploy partner models
We also explore models from popular providers, including:
Meta
Anthropic
Salesforce
Hugging Face
Mistral AI
AI21 Labs
This shows how Model Garden brings together models from multiple ecosystems under one platform.
Foundation Models Overview
The lecture introduces foundation models, including:
Gemini model family (Gemini 1.x, 2.x, 2.5 variants)
Image generation and editing models
Music generation models
Large language models from partners like Claude, LLaMA, and Mistral
You will understand how these models are grouped and what types of problems they are designed to solve.
Fine-Tunable and Task-Specific Models
We also discuss:
Fine-tunable models, which can be customized using notebooks or pipelines
Task-specific solutions, which are pre-built and ready for immediate use
This helps you decide when to use a pre-trained model and when customization is needed.
Exploring Model Details
Using a text generation model as an example, the lecture demonstrates how to:
View model overview and description
Understand supported input and output formats
Review common use cases like summarization and question answering
Explore prompt design examples
Access official documentation and references
This section helps you understand how to evaluate a model before using it.
Using Models in Vertex AI Studio
You will see how models can be opened directly in Vertex AI Studio, where you can:
Interact with models using prompts
Switch between different models easily
Adjust output settings and response formats
Control advanced options like temperature, token limits, and response streaming
This highlights how Model Garden enables fast experimentation without setup overhead.
Key Concepts Covered
What Vertex AI Model Garden is
Types of models available
Model providers and collections
Foundation vs fine-tunable models
Using models in Vertex AI Studio
Model configuration and advanced settings
In this lecture, we take a hands-on approach to building a real-world Generative AI application using Vertex AI Model Garden. You will learn how to transform a simple idea into a fully working web application powered by Google’s Gemini models—without worrying about infrastructure complexity.
The focus of this session is on text generation, prompt design, grounding, and rapid deployment using Vertex AI Studio and Cloud Run.
What This Lecture Covers
In this tutorial, we build a Trip Planner application that generates a detailed, day-by-day travel itinerary based on user inputs. The application is created using a text generation model from Vertex AI Model Garden and deployed as a web app.
Selecting a Model from Model Garden
You will learn how to:
Navigate to Vertex AI Model Garden
Choose a text generation task
Select a suitable Gemini model
Open the model directly in Vertex AI Studio
Switch between available Gemini model variants when needed
This step demonstrates how easy it is to start experimenting with powerful foundation models.
Designing the Prompt for the Application
A major part of this lecture focuses on prompt engineering. You will see how to:
Clearly define the intent of the application
Specify required input parameters, such as:
Starting point
Destination
Trip duration (in days)
User interests (multiple interests supported)
Describe the expected output format, including:
Day-by-day itinerary
Activities and experiences
Accommodation suggestions
Transportation options
Estimated costs
You will also learn how Gemini can automatically generate a high-quality prompt based on your intent, saving time and improving consistency.
Enhancing Accuracy with Grounding
To improve reliability and relevance, the lecture demonstrates how to:
Enable grounding with Google Search
Anchor model responses to verifiable, real-world information
Reduce hallucinations in generated content
This is especially important for travel-related applications where accuracy matters.
Deploying the Application as a Web App
Once the prompt is finalized, you will learn how to:
Convert the prompt into a working web application
Deploy the app using Cloud Run
Configure public access for testing
Understand how Vertex AI automatically handles backend setup
This shows how quickly you can go from prompt to production-ready app.
Testing the Trip Planner Application
You will see a live demonstration of the application by providing inputs such as:
Starting city
Destination country
Trip duration
Interests like culture, food, art, and adventure
The application generates:
A complete multi-day itinerary
Transportation recommendations
Accommodation options
Cost estimates for activities, food, and travel
This validates how Generative AI can create rich, structured outputs from simple inputs.
Exploring the Cloud Run Service
The lecture also covers:
Viewing application metrics such as:
Request count
Latency
CPU and memory usage
Exploring deployed application files
Understanding how prompts power the app logic
Knowing how to update the application by modifying the prompt and redeploying
Cleaning Up Resources
To avoid unnecessary costs, you are shown how to:
Safely delete the Cloud Run service
Confirm that resources are fully removed
This highlights best practices for cost management on Google Cloud.
Key Takeaways from This Lecture
By the end of this lecture, you will understand:
How to use Vertex AI Model Garden for application development
How to design effective prompts for real-world use cases
How grounding improves AI reliability
How to deploy AI-powered applications quickly using Cloud Run
How Generative AI can be turned into a usable product with minimal effort
In this lecture, you will learn how to build a real-world Entity Extraction application using Vertex AI Model Garden. The goal of this tutorial is to demonstrate how Generative AI models can automatically extract structured information from unstructured job descriptions, including text, PDF files, and images.
This session focuses on Entity Extraction, prompt design, application deployment, and iterative improvement using Vertex AI Studio and Cloud Run.
What You Will Build in This Lecture
You will create an application that:
Accepts job descriptions as input
Supports text, PDF, and image formats
Automatically extracts key entities such as:
Job title
Company name
Location
Employment type
Required experience
Education qualifications
Salary range
Application deadline
This type of application is commonly used in HR systems, recruitment platforms, and resume screening tools.
Selecting an Entity Extraction Model
You will start by:
Navigating to Vertex AI Model Garden
Choosing the Entity Extraction task
Selecting a suitable Gemini model
Opening the model in Vertex AI Studio
Understanding how model versions can change and how to proceed with available variants
This step shows how Model Garden simplifies access to powerful foundation models.
Exploring the Prompt Gallery
Before creating the application prompt, you will explore the Prompt Gallery, which provides ready-made prompt examples for different use cases such as:
Chatbots
Reviews analysis
Document processing
Text classification
This helps you understand how prompts are structured and how they can be adapted for your own application.
Designing an Effective Entity Extraction Prompt
A key part of this lecture is prompt engineering. You will learn how to:
Start with a rough or imperfect prompt intent
Use Gemini’s “Help me write” feature to refine it
Convert vague requirements into a clear system instruction
Define exactly which entities should be extracted
Specify supported input formats such as text, PDF, and images
You will see how refining the prompt dramatically improves extraction accuracy and application behavior.
Deploying the Entity Extraction Application
Once the prompt is finalized, you will:
Deploy the application directly from Vertex AI Studio
Enable public access for easy testing
Understand how the app is automatically hosted using Cloud Run
Learn how quickly a proof-of-concept AI app can be created
Testing the Application with Real Data
You will test the application using:
Plain text job descriptions
PDF job description documents
The application successfully extracts and displays structured entities such as job title, company name, location, experience, and salary information, demonstrating the practical power of Generative AI for document understanding.
Improving the Application Through Iteration
You will also learn an important real-world skill:
How to redeploy the application when requirements change
How modifying the prompt can enable new features, such as file uploads
Why iterative prompt refinement is critical in AI application development
This shows how AI applications evolve through experimentation.
Exploring the Deployed Application
The lecture briefly covers:
Viewing the application in Cloud Run
Accessing the source code editor
Understanding where prompts and configurations are stored
Navigating between Vertex AI Studio and Cloud Run
Cleaning Up Resources
To follow best practices, you will see how to:
Delete deployed applications
Avoid unnecessary cloud costs
Keep your Google Cloud environment clean
Key Takeaways from This Lecture
By the end of this lecture, you will understand:
How to build an Entity Extraction application using Vertex AI
How to design effective prompts for structured data extraction
How to support multiple input formats like text, PDF, and images
How to deploy AI applications quickly using Cloud Run
How to iteratively improve AI behavior through prompt refinement
In this lecture, you will learn how to build a multi-language translation application using Vertex AI Model Garden. The focus of this tutorial is on using Generative AI models to translate product descriptions from English into multiple languages through a simple, deployable application.
This lecture demonstrates how quickly and effectively translation use cases can be implemented using Google Cloud’s Gemini models and Vertex AI Studio.
What You Will Build in This Lecture
You will create a translation application that:
Accepts product descriptions written in English
Translates the content into multiple target languages, such as:
Hindi
Spanish
French
German (and other supported languages)
Displays all translated outputs in a clear and structured format
This type of application is commonly used in e-commerce platforms, global marketplaces, and multilingual content systems.
Exploring the Translation Task in Model Garden
You will begin by:
Navigating to Vertex AI Model Garden
Selecting the Translation task
Choosing a Gemini model suitable for language translation
Opening the model in Vertex AI Studio
This step helps you understand how Model Garden categorizes AI tasks and makes them easy to explore and use.
Designing the Translation Prompt
A major part of this lecture focuses on prompt design. You will learn how to:
Define a clear prompt intent for translation
Use Gemini’s “Help me write” feature to automatically generate a high-quality system instruction
Specify:
Source language (English)
Multiple target languages
The type of content being translated (product descriptions)
Save and reuse prompts for future applications
This ensures the translations are accurate, consistent, and suitable for production use.
Deploying the Translation Application
Once the prompt is finalized, you will:
Save the prompt with region and autosave settings
Use the Build with code option to deploy the application
Allow public access for easy testing
Automatically deploy the application using Cloud Run
This shows how quickly an AI-powered web application can be launched with minimal setup.
Testing the Application
After deployment, you will:
Open the live application
Enter a product name and description
Submit the input for translation
View the translated content in multiple languages
This confirms that the application works as expected and demonstrates real-world usability.
In this lecture, you will learn how to generate high-quality images using Vertex AI Model Garden. The focus of this tutorial is on understanding how foundation image generation models can be used to create visually rich and creative images using natural language prompts.
This session introduces image generation as a practical and powerful use case of Generative AI on Google Cloud.
Introduction to Image Generation in Model Garden
The lecture begins by exploring the Model Garden interface inside Vertex AI and selecting the Image Generation category. You will learn how to:
Browse available foundation image models
Understand the difference between model variants
Select an advanced image generation model suitable for creative tasks
You will work with a high-quality image generation model designed for professional-grade outputs.
Understanding the Image Generation Model
Before generating images, the lecture explains important model information, including:
Model overview and capabilities
Common use cases such as illustration, design, and creative artwork
Available documentation for deeper learning
Pricing considerations for image generation workloads
This helps you make informed decisions when choosing models for real-world projects.
Configuring Image Generation Settings
You will then open the model in Vertex AI Studio and configure key generation options, such as:
Aspect ratio (using the default 1:1 format)
Safety settings, including:
Person generation settings
Age constraints
Safety filter thresholds
Deployment region selection
These configurations ensure that generated images follow safety guidelines and meet project requirements.
Writing an Effective Image Prompt
A major focus of this lecture is prompt design for image generation. You will learn how to:
Describe artistic style clearly
Specify colors, mood, and visual tone
Define subject behavior and expressions
In this tutorial, the prompt is designed to generate a bright, colorful, cartoon-style illustration with:
Bold outlines
Exaggerated features
Playful expressions
A cheerful and friendly atmosphere
A subject is added to the prompt to guide the model in creating a specific visual concept.
Generating and Reviewing Images
After submitting the prompt, the model generates multiple image variations. You will:
Review different generated image options
Understand how the model interprets your prompt
See how slight prompt changes affect visual output
This helps you evaluate and refine creative results.
In this lecture, you will learn how to build a multi-modal Generative AI application using Vertex AI Model Garden. Multi-modal models can understand and process both images and text together, enabling powerful use cases such as automatic content creation for social media and video platforms.
This tutorial demonstrates how to use Gemini 2.5 Pro, one of Google’s advanced multi-modal models, to generate marketing and social media content directly from uploaded images.
Introduction to Multi-Modal Generation
The lecture begins by introducing multi-modal generation, where a single AI model works with multiple input types simultaneously. In this case, the model analyzes:
Images (visual context)
Text prompts (instructions and intent)
You will see how this capability enables richer and more contextual outputs compared to text-only models.
Selecting a Multi-Modal Model
You will explore the Gemini 2.5 Pro model inside Vertex AI Model Garden and review:
Model overview and capabilities
Supported input types (text + image)
Common use cases such as content generation, media analysis, and creative automation
The model is then opened in Vertex AI Studio for interactive use.
Designing a Multi-Modal Prompt
A key part of this lecture focuses on prompt design for multi-modal tasks. Using the built-in assistance feature, you will create a structured prompt that instructs the model to:
Analyze an uploaded image
Generate multiple types of content from that image, including:
Instagram Reel title
Instagram caption
YouTube thumbnail text
YouTube video title
YouTube video hashtags
The lecture shows how the system can suggest a well-structured prompt and how you can customize it further if needed.
Deploying the Application
Once the prompt is ready, you will deploy the application as a web app powered by Cloud Run. During deployment, you will:
Allow public access for testing
Create a fully functional interactive application
Wait for the deployment to complete and access the live app
This allows you to test the multi-modal experience in a real user interface.
Testing with Real Images
After deployment, you will test the application using different images:
Example 1: Product Image
Uploading a smartphone image
Automatically generating:
Instagram Reel title and caption
YouTube video title
Thumbnail text
Relevant hashtags
Example 2: Travel Image
Uploading a scenic sea view image
Receiving creative, engaging social media content such as:
Eye-catching Instagram reel titles
Emotional captions
YouTube video titles suitable for travel content
Optimized hashtags
These examples clearly show how the model adapts its output based on the visual context of each image.
In this lecture, you will get a clear and practical introduction to Google Cloud’s Document AI API, a powerful service designed to automatically understand, digitize, and extract information from documents using artificial intelligence.
This session focuses on concepts, architecture, and real-world use cases, helping you understand what Document AI is, how it works, and when to use it, before moving into hands-on implementations in later lectures.
What Is Document AI?
The lecture begins by explaining Document AI in simple, non-technical terms:
Document AI is a Google Cloud service that uses AI to read, understand, and extract information from documents
It works like a smart digital assistant that can process large volumes of documents automatically
It converts unstructured documents (PDFs, scanned images, forms) into structured, usable data
You will see how Document AI eliminates the need for slow and error-prone manual data entry.
Why Document AI Is Important
You will learn why document processing is a critical challenge for businesses:
Most business information is still stored in documents
Manual document digitization is time-consuming and expensive
Extracting information accurately at scale is difficult without AI
Common real-world examples discussed include:
Invoice and receipt processing
Medical intake forms
Identity verification using ID cards
Digitizing printed or handwritten documents
Expense report validation
Document AI Architecture and Core Components
The lecture walks through the high-level architecture of Document AI, including:
How Document AI sits on top of Vertex AI
How it integrates with:
Google Cloud Storage
BigQuery
Vertex AI Search and Conversation
How it supports scalable, end-to-end document processing pipelines
You will also understand how Document AI uses OCR, layout analysis, classification, and entity extraction together to process documents intelligently.
Key Document Processing Capabilities
This lecture explains the main capabilities supported by Document AI:
Text and layout extraction (OCR)
Key-value pair detection
Table extraction
Document classification
Splitting and routing documents by type
Entity extraction and normalization
Multilingual document support
Dataset preparation for training and evaluation
These capabilities allow businesses to automate complex document workflows without building custom ML models from scratch.
Understanding Document AI Processors
A major part of this lecture focuses on Document AI processors, which act as the bridge between:
Raw document files
Machine learning models that analyze them
You will learn:
What a processor is
Why processors are required
Different processor categories such as:
Digitization
Extraction
Classification
How to choose the right processor based on your use case
Examples include:
Enterprise Document OCR for text extraction
Form Parser for structured forms
Invoice and receipt processors for financial documents
Typical Workflow with Document AI
The lecture outlines the standard steps to use Document AI:
Choose the appropriate processor
Create the processor in Google Cloud
Send documents for processing
Receive structured output such as:
Extracted fields
Tables
OCR text
Confidence scores
This gives you a clear mental model of how Document AI fits into real applications.
Supported File Types and Limitations
You will also learn about practical constraints, including:
Supported document formats (PDF, PNG, JPEG, TIFF, GIF)
Maximum file size limits
Page limits per processor
Language support differences across processors
Understanding these limits is essential before building production solutions.
Live Demonstration with Sample Documents
The lecture includes hands-on exploration using sample documents, such as:
Invoices
Personal information forms
You will see how Document AI:
Identifies key-value pairs
Extracts structured fields like dates, amounts, names, and addresses
Extracts tables and OCR text
Displays extracted information visually
This helps you clearly understand the quality and depth of extraction provided by Document AI.
In this lecture, you will move from understanding Document AI concepts to practically setting up Document AI in Google Cloud. This session focuses on the essential configuration steps required before using Document AI programmatically, ensuring you are fully prepared for hands-on implementation in upcoming tutorials.
What This Lecture Covers
In this tutorial, you will learn how to enable the Document AI API, create a document processor, and prepare Google Cloud Storage for document processing. These steps form the foundation of any real-world Document AI solution.
Step 1: Enabling the Document AI API
The lecture begins by guiding you through enabling the Cloud Document AI API from the Google Cloud Console. You will understand:
Where to find the Document AI API in the Google Cloud API Library
How to enable the API for your project
What information is available on the API overview page, including:
Pricing details
Page processing limits
Quotas and system limits
API metrics and usage monitoring
How Document AI integrates with other Google services such as:
Google Drive API
Gmail API
Gemini API
This step ensures your Google Cloud project is fully authorized to use Document AI services.
Step 2: Exploring Document AI in the Console
After enabling the API, you will explore the Document AI section in Google Cloud Console, where you will see:
An overview of how Document AI transforms unstructured documents into structured data
A guided workflow explaining:
Creating a processor
Extracting structured data
Using the extracted data for downstream applications
This gives you a clear understanding of how Document AI works end to end.
Step 3: Understanding Document AI Processors
A major focus of this lecture is learning about Document AI processors, which are responsible for analyzing documents. You will explore:
Processor Gallery
Prebuilt processors available for immediate use
Categories such as:
General processors
Specialized processors
Trainable processors
Common Processor Types
Document OCR – for extracting text and layout from documents
Form Parser – for extracting key-value pairs, checkboxes, and structured fields
Layout Parser – for identifying document structure and sections
Specialized Processors
You will also see processors designed for specific use cases, such as:
Invoice processing
Expense parsing
Bank statement analysis
Identity document verification
Lending document classification
You will learn how to choose the right processor based on your business requirement.
Step 4: Creating a Document Processor
In this lecture, you will create a Form Parser processor step by step. You will understand:
How to name and configure a processor
How to choose the region for deployment
Encryption options and why the default Google-managed encryption is sufficient in most cases
Processor details such as:
Processor ID
Status
Processor type
Region
Prediction endpoint
This processor will later be used to extract structured data from uploaded documents.
Step 5: Testing the Processor with Sample Documents
Once the processor is created, you will test it directly from the console by uploading a sample PDF document. You will see:
Automatic extraction of key-value pairs such as:
Name
Date of birth
Phone number
Address
Signature
Detection of entities like:
Person
Date
Phone number
Address
Exporting extracted results as structured JSON output
This live test demonstrates the power and accuracy of Document AI processors.
Step 6: Creating a Google Cloud Storage Bucket
To prepare for batch document processing, you will create a Google Cloud Storage (GCS) bucket. This section covers:
Creating a new GCS bucket with proper region selection
Understanding storage class options
Using default access control and security settings
Organizing data using folders
You will create:
An input folder to store documents for processing
An output folder to store processed results
Sample PDF documents are uploaded to the input folder, preparing the environment for automation.
In this lecture, you will begin the hands-on implementation of Document AI using a Google Colab notebook. This tutorial focuses on setting up the Colab environment, understanding the business problem, and preparing everything required before writing the core document processing logic.
This lecture acts as the bridge between Document AI setup in the Google Cloud Console and real-world programmatic usage.
Business Problem Overview
The lecture starts by clearly defining the business use case:
You have PDF documents stored in a Google Cloud Storage bucket
These documents need to be processed using a Document AI Form Parser processor
The goal is to extract structured entities such as key-value pairs from the PDFs
The final output should be a clean, structured Excel file containing the extracted information
This real-world scenario reflects common enterprise use cases such as form processing, data digitization, and automated document analysis.
Working in Google Colab
You will be introduced to the Google Colab notebook environment that will be used throughout the implementation. Colab is chosen because it:
Runs in the cloud with no local setup required
Integrates seamlessly with Google Cloud services
Supports authentication with Google accounts
Is ideal for experimentation and prototyping
Step 1: Environment Setup in Google Colab
The first major focus of this lecture is preparing the Colab environment so it can communicate with Google Cloud services.
You will understand why each dependency is required, including:
A library to interact with Document AI
A library to work with Google Cloud Storage
A library for data manipulation and tabular processing
A library to read and write Excel files
Authentication libraries to securely access Google Cloud resources
The emphasis is on why these tools are needed rather than how to write the code.
Step 2: Authenticating Google Colab with Google Cloud
Next, the lecture explains how to authenticate Google Colab with your Google Cloud account.
You will learn:
How Colab securely connects to your Google Cloud project
Why authentication is mandatory for accessing Document AI and Cloud Storage
What happens during the sign-in and permission approval process
How this authentication allows the notebook to act on behalf of your project
This step ensures that your notebook has the required permissions to process documents.
Step 3: Project Configuration and API Enablement
After authentication, the lecture covers project-level configuration, including:
Setting the correct Google Cloud project ID
Verifying that the notebook is connected to the intended project
Enabling required APIs such as:
Document AI API
Google Cloud APIs
Cloud Storage APIs
This ensures that all services used later in the workflow are available and properly configured.
Outcome of This Lecture
By the end of this tutorial, you will have:
A clear understanding of the document processing business problem
A fully configured Google Colab notebook
All required libraries installed
Secure authentication between Colab and Google Cloud
The correct project and APIs enabled
This setup is essential before writing the logic to process PDFs, extract entities, and export results.
In this lecture, you will continue building your Document AI processing pipeline by completing Step 2: Configuration in the Google Colab environment. This tutorial focuses on setting up project-level configurations, authentication handling, storage paths, and client initialization, which are critical before processing any documents.
This lecture ensures that your notebook is fully connected, secure, and correctly aligned with your Google Cloud resources.
Recap of Previous Step
In the previous lecture, you successfully:
Installed all required libraries in Google Colab
Authenticated Colab with your Google Cloud account
Enabled access to Document AI and Cloud Storage APIs
With authentication complete, this lecture moves into environment configuration.
Understanding the Configuration Phase
Configuration is the foundation that allows your notebook to:
Locate the correct Google Cloud project
Access the right Document AI processor
Read PDF files from Cloud Storage
Store outputs in structured locations
Apply billing and quotas correctly
Without proper configuration, document processing workflows often fail or behave inconsistently.
Key Imports and Their Purpose
The lecture begins by explaining the required imports and why they matter:
Operating system utilities to manage directories and file paths
Data manipulation tools to prepare and analyze extracted information
Typing utilities to improve readability and reduce errors in larger projects
This step emphasizes clarity, maintainability, and best practices.
Project Configuration Variables
Next, you configure all essential project-level details:
Google Cloud Project ID to ensure all operations run in the correct project
Region (location) aligned with the Document AI processor and storage bucket
Processor ID that uniquely identifies the Form Parser processor you created earlier
This ensures your requests are routed to the correct Document AI resources.
Google Cloud Storage Paths
You will define and understand three important Cloud Storage locations:
Input path – where PDF documents are stored
Output path – where processed JSON results will be written
Excel output path – where the final structured Excel files will be saved
These paths help keep the pipeline organized and scalable.
Local Working Directory in Colab
The lecture also covers setting up a local working directory inside the Colab environment:
Used for temporary downloads and file inspection
Helpful for debugging and verification
Created dynamically to avoid manual setup
This allows you to work with files locally while still using cloud storage.
Authentication and Credential Management
A major focus of this lecture is secure credential handling, including:
Loading default Google Cloud credentials
Refreshing expired credentials automatically
Applying correct permission scopes
Setting the quota project to avoid billing conflicts
This step is especially important in multi-project or enterprise environments.
Client Initialization
You will then initialize the core service clients:
Cloud Storage client to upload and download files
Document AI client to send documents for processing
These clients establish the connection between your Colab notebook and Google Cloud services.
Processor Resource Path
The lecture explains how a fully qualified processor resource path is constructed:
Combines project ID, region, and processor ID
Acts as the unique identifier for Document AI requests
Ensures documents are processed by the correct processor
This step confirms that your environment is ready for real processing tasks.
In this lecture, you will complete Step 3 of the Document AI pipeline by building helper utilities for Google Cloud Storage (GCS). These helper functions are designed to simplify and standardize common storage operations that are repeatedly used in document processing workflows.
By the end of this lecture, your project will have a clean and reusable way to interact with Cloud Storage, making the entire pipeline more readable, reliable, and scalable.
Recap of Progress So Far
Before this lecture, you have successfully completed:
Step 1: Installing required libraries and authenticating Google Colab with Google Cloud
Step 2: Configuring the project, processor, credentials, and storage paths
With authentication and configuration in place, the next logical step is to simplify Cloud Storage operations.
Why GCS Helper Functions Are Important
When working with Document AI, you often need to:
Read PDF files from Cloud Storage
Upload local files to a bucket
List multiple documents for batch processing
Parse Cloud Storage paths consistently
Writing this logic repeatedly can make notebooks hard to maintain. Helper utilities solve this problem by encapsulating repetitive tasks into reusable functions.
Goals of the GCS Helper Utilities
The helper utilities created in this lecture focus on four main objectives:
Parsing Google Cloud Storage URIs
Listing files inside a bucket using a path prefix
Uploading files from Colab or a local environment to GCS
Simplifying repetitive file-handling tasks
These utilities act as the backbone for all upcoming document processing steps.
Helper Function 1: Parsing GCS URIs
The first utility focuses on parsing Google Cloud Storage addresses.
This helper:
Takes a GCS URI as input
Separates the bucket name from the file or folder path
Makes it easier to work with storage APIs that require these components separately
This step ensures that GCS paths are handled consistently across the notebook.
Helper Function 2: Listing Files in a GCS Bucket
The second utility helps you discover documents stored in Cloud Storage.
This helper:
Searches for files inside a bucket
Filters files using a specific folder or prefix
Returns a list of matching files for further processing
This is especially useful for batch processing, where multiple PDF documents need to be sent to Document AI at once.
Helper Function 3: Uploading Files to Cloud Storage
The third utility simplifies uploading local files to Google Cloud Storage.
This helper:
Takes a file from your local machine or Colab environment
Uploads it to a specified bucket and folder
Ensures files are available for Document AI processing
This step is helpful when testing new documents or dynamically adding files to the pipeline.
Benefits of Using GCS Helper Utilities
By introducing these helper functions, you gain:
Cleaner and more readable notebooks
Reduced code duplication
Fewer errors related to file paths
Easier debugging and maintenance
A scalable structure for production pipelines
These utilities allow you to focus on document processing logic instead of low-level storage handling.
Completion of Step 3
At the end of this lecture, you have successfully:
Created reusable utilities for Cloud Storage
Standardized how GCS paths are parsed and managed
Simplified file listing and upload operations
This completes Step 3: GCS Helpers in the Document AI workflow.
In this lecture, you will learn how to run the Document AI Form Parser in batch mode on multiple PDF files stored in Google Cloud Storage (GCS). This is a critical step when working with real-world document processing systems, where handling documents one by one is inefficient and costly.
By the end of this lecture, you will understand how to process multiple documents simultaneously, extract structured data, and store the results in a scalable and organized way.
What Problem This Lecture Solves
In real business scenarios, documents such as:
Personal information forms
Job application forms
Invoices, receipts, or contracts
are usually stored in cloud storage and arrive in large volumes. Processing them individually is slow and expensive. This lecture demonstrates how to solve that problem using batch document processing with Document AI.
Objective of This Lecture
In this tutorial, you will:
Process multiple PDF files at once from a GCS bucket
Use the Form Parser processor created earlier
Automatically extract structured entities from each document
Store the extracted results as JSON files in Cloud Storage
Input Documents Used
The lecture works with two sample PDF files that were already uploaded in earlier steps:
Personal Information Form
Job Application Form
These documents are stored inside the input folder of a GCS bucket and are ready for batch processing.
Why Batch Processing Is Important
This step is performed to achieve the following goals:
1. Discover All Ready-to-Process PDFs
The system automatically finds all PDF documents stored in the input folder, eliminating the need to manually specify each file.
2. Clean Previous Results
Before starting a new batch job, old output files are removed. This prevents confusion and ensures that only fresh and accurate results are generated.
3. Process Multiple Documents Together
Instead of sending one document at a time, all PDFs are sent together to Document AI for simultaneous processing, which improves speed and efficiency.
4. Handle Large-Scale Operations
Batch processing is:
Faster than individual processing
More cost-effective
Suitable for enterprise-scale document workflows
What Happens During Batch Processing
During this step:
All PDF files are collected from the GCS input folder
Old output data is cleaned automatically
Documents are submitted to Document AI in a single batch job
Document AI processes each PDF using the Form Parser
Extracted entities are saved as JSON files, one per document
Understanding the Output
After batch processing completes:
A new output folder is created in the GCS bucket
Separate subfolders are generated for each document
Each subfolder contains a JSON file with extracted data
These JSON files include:
Key-value pairs such as names, dates, email addresses, phone numbers
Structured fields like application ID, address details, signatures, and more
OCR-based text extraction and entity detection
Verifying the Results
In this lecture, you also learn how to:
Open and inspect the generated JSON files
Match extracted entities with the original PDF content
Verify that fields such as names, dates, email IDs, phone numbers, and addresses are correctly extracted
This validation step confirms that the Form Parser is working accurately.
In this lecture, we focus on Step 5 of the Document AI workflow: creating parsing helper functions. This step is essential for converting the raw output generated by Document AI into a format that is easy to understand, analyze, and store.
After batch processing, Document AI produces complex JSON files with deeply nested structures. These files are powerful but not directly usable for business analytics or reporting. In this tutorial, you will learn how to design helper functions that extract, flatten, and organize this data into clean, structured formats.
Why This Step Is Important
Document AI output is:
Highly structured
Deeply nested
Designed for machine processing, not direct human use
To make this data usable in:
Excel reports
Databases
Dashboards
Downstream machine learning pipelines
we must first parse and simplify it. This lecture shows you how to do exactly that.
Objective of This Lecture
By the end of this tutorial, you will understand how to:
Read Document AI output files from Google Cloud Storage
Convert layout-based text into readable content
Flatten complex entity structures into simple rows
Extract key-value pairs in a clean, standardized format
These parsing helpers form the backbone of any production-grade document processing pipeline.
Overview of Helper Functions Created
In this lecture, we define four essential helper functions, each with a clear and focused responsibility.
1. Converting Layout Objects to Readable Text
Document AI stores extracted text using layout references and index positions.
The first helper function:
Converts layout objects back into readable text
Uses index ranges to reconstruct meaningful text segments
Makes extracted content human-readable
This step is critical because most entities refer to text indirectly through layout indexes.
2. Loading Document AI Output from GCS
After batch processing, results are saved as JSON files in a GCS output bucket.
The second helper function:
Downloads the JSON output files from Google Cloud Storage
Loads them into Python-friendly data structures
Makes the extracted results accessible for further processing
This bridges the gap between cloud-based processing and local analysis.
3. Flattening Nested Entities
Document AI entities often contain:
Parent-child relationships
Nested sub-entities
Repeated structures
The third helper function:
Recursively traverses these nested entities
Flattens them into a simple list of rows
Prepares the data for spreadsheets or databases
This transformation is crucial for exporting data into tabular formats like Excel.
4. Extracting Key-Value Pairs
The final helper function serves as the main entry point for structured data extraction.
It:
Collects all detected entities
Applies flattening logic
Outputs clean, structured rows containing key-value information
This makes it easy to extract meaningful business data such as:
Names
Dates
IDs
Addresses
Contact details
In this lecture, we complete the final step of our Document AI mini-project by converting the processed JSON output into clean, easy-to-use Excel files.
In the previous tutorial, we created helper functions to parse and organize the Document AI output. Now, we use those helpers to generate one Excel file per document, containing only the extracted entities and their confidence scores.
This step transforms raw AI output into a business-friendly format that can be shared with analysts, operations teams, or stakeholders.
Objective of This Lecture
By the end of this tutorial, you will understand how to:
Locate Document AI output JSON files stored in Google Cloud Storage
Process each JSON file independently
Extract only meaningful entities from each document
Create one Excel file per document
Upload the generated Excel files back to Google Cloud Storage
This completes an end-to-end document processing pipeline, from PDFs to structured Excel reports.
Understanding the Input and Output
Input
PDF documents stored in a GCS input folder
Document AI output stored as JSON files in a GCS output folder
Each JSON file corresponds to one PDF document
Output
One Excel file per document
Each Excel file contains:
Field Name
Field Value
Confidence Score
This makes the output easy to review, validate, and analyze.
Step-by-Step Workflow Covered in This Lecture
1. Identifying JSON Output Files
The process begins by locating all JSON files created during batch processing
These files are automatically discovered from the GCS output folder
The system confirms how many documents were processed
This ensures no document is missed.
2. Initializing Per-Document Processing
A tracking mechanism is set up to monitor generated Excel files
Each JSON file is processed independently
This guarantees one Excel file per input PDF
3. Extracting Entities from JSON Files
For each JSON file:
The Document AI output is loaded from GCS
Parsed entity data is extracted using previously created helper functions
Only key-value entities are selected
Non-essential metadata is ignored to keep the Excel file clean.
4. Generating Meaningful Excel File Names
Excel files are named based on the original PDF file
This makes it easy to map:
PDF → JSON → Excel
Clear naming is important for traceability in production pipelines.
5. Creating Excel Files
Each document gets a single-sheet Excel file
The sheet contains:
Entity name
Extracted value
Confidence score
This format is ideal for validation and business reporting.
6. Uploading Excel Files to GCS
All generated Excel files are uploaded back to Google Cloud Storage
A dedicated folder is created for Excel outputs
Files are now accessible from anywhere in the cloud
This enables easy sharing and integration with other systems.
Final Results Demonstrated
At the end of this lecture, you will see:
PDF files in the input folder
JSON files containing raw Document AI output
Excel files containing clean, structured entity data
One Excel file per document
This confirms a successful end-to-end processing flow.
In this lecture, we rebuild our Document AI Form Parser project, but this time with an interactive user interface using Gradio. Instead of processing documents only through scripts, we create a user-friendly web app where users can upload a PDF, extract entities, and download the results as an Excel file.
This approach demonstrates how AI-powered document processing can be exposed to end users through a simple and intuitive interface.
What This Project Does
In this application:
A user uploads a single PDF form using a web interface
Google Document AI Form Parser extracts entities from the document
Extracted entities are organized into structured fields
Results are exported into an Excel file
The Excel file can be downloaded directly from the UI
This mimics a real-world document automation tool used in HR, finance, compliance, and operations teams.
High-Level Architecture
The project integrates three key components:
Google Document AI – for intelligent entity extraction
Pandas & Excel export – for structured output
Gradio – for building a simple web-based user interface
Together, they create a complete end-to-end document processing application.
Step-by-Step Breakdown of the Workflow
Step 1: Installing Required Libraries
We begin by installing all necessary libraries for:
Document AI communication
Authentication
Data processing and Excel generation
UI creation using Gradio
This ensures the environment is fully prepared.
Step 2: Imports and Basic Setup
Core Python utilities are imported for:
File handling
Debugging and error tracking
Data manipulation
Type safety and readability
This makes the project easier to maintain and debug.
Step 3: Authentication in Google Colab
Since the application runs inside Google Colab, user authentication is required:
The user logs in with a Google account
Secure access is granted to Google Cloud services
The notebook can now call the Document AI API
Step 4: Google Cloud & Document AI Configuration
In this step:
The Google Cloud project is configured
The processor location and processor ID are defined
The Document AI client is initialized
Authentication credentials are refreshed if required
This connects the application directly to the Form Parser processor.
Step 5: Entity Extraction Helper Functions
Document AI returns structured but complex data.
To simplify this:
Helper functions flatten nested entities
Output is standardized into three columns:
Field Name
Field Value
Confidence
This format is ideal for Excel and reporting.
Step 6: Reading Uploaded PDF Files
A helper function is used to:
Accept uploaded PDFs from different sources
Convert them into raw bytes
Ensure flexibility and reliability for file uploads
This allows smooth integration with the Gradio interface.
Step 7: Core Processing Pipeline
This is the heart of the application. The pipeline performs:
PDF validation
Entity extraction using Document AI
Conversion of results into a structured table
Exporting the output as an Excel file
Returning status messages for success or failure
Robust error handling ensures clear feedback to the user.
Step 8: Building the Gradio User Interface
A simple but powerful UI is created where users can:
Upload a PDF form
Click a button to extract entities
Preview extracted data in a table
Download the Excel file
View success or error messages
This removes the need for technical knowledge from the end user.
Step 9: Launching the Application in Colab
The Gradio app is launched with:
A public shareable URL
A Colab-compatible fallback option
This ensures the app works reliably inside notebooks and browsers.
In this lecture, we begin our journey into AutoML for tabular data using Google Cloud and Vertex AI. This tutorial focuses on preparing data correctly, which is one of the most critical steps before training any AutoML model.
You will learn how to move from a raw CSV file to well-structured training, testing, and evaluation datasets, ready to be used in Vertex AI AutoML.
What You Will Learn in This Lecture
In this tutorial, you will learn how to:
Prepare tabular data for AutoML workflows
Use Google Cloud Storage to manage datasets
Create datasets and tables in BigQuery from CSV files
Understand a real-world binary classification problem
Split data into training, testing, and evaluation sets in a stable and reproducible way
Business Problem Overview
We work with an Employee Attrition dataset, where the goal is to predict:
Whether an employee will stay with the company or leave
Key points:
The target column is attrition
This is a binary classification problem
All other columns act as feature variables
This is a very common business use case in HR analytics and workforce planning.
Step-by-Step Workflow Covered in This Lecture
1. Creating a Google Cloud Storage (GCS) Bucket
We start by creating a dedicated GCS bucket:
The bucket is created in the US Central 1 region
A folder named input is created inside the bucket
The employee attrition CSV file is uploaded to this folder
This bucket will serve as the data source for BigQuery and Vertex AI.
2. Creating a BigQuery Dataset
Next, we move to BigQuery and:
Create a new dataset for AutoML
Ensure the dataset is created in the same region as the GCS bucket
This region consistency is important for smooth integration with Vertex AI
3. Creating a BigQuery Table from the CSV File
Using the uploaded CSV file:
A BigQuery table is created directly from Google Cloud Storage
Schema detection is handled automatically
No partitioning is applied for simplicity
Once created, the table is explored to:
Review column names and data types
Preview sample records
Identify the target column (attrition)
4. Understanding the Dataset Structure
At this stage:
The attrition column is identified as the label
All remaining columns are treated as features
The problem is confirmed as binary classification
This clarity is essential before moving into AutoML training.
5. Creating Stable Data Splits (Train, Test, Evaluation)
Instead of random splits, a deterministic and reproducible splitting strategy is used:
a. Split Key Table
A separate table is created to generate a stable split key for each employee
The split key is a numeric value between 0 and 1
This ensures consistent splits every time the pipeline is run
b. Training Table
Approximately 70% of the data
Used to train the AutoML model
c. Testing Table
Approximately 15% of the data
Used to evaluate model performance during training
d. Evaluation Table
Remaining 15% of the data
Used for final unbiased evaluation
This three-way split follows best practices for machine learning.
6. Validating the Data Splits
Finally, the lecture verifies:
Row counts in each dataset
Percentage distribution across train, test, and evaluation tables
This confirms that:
The data is split correctly
There is no overlap
The distribution closely matches the intended ratios
In this lecture, we move from data preparation to model training using Vertex AI AutoML for tabular data. After successfully creating the training, testing, and evaluation tables in the previous tutorial, we now focus on building a fully managed machine learning model without writing any training code.
This session demonstrates how to create a Vertex AI Dataset, connect it to BigQuery, and train an AutoML classification model step by step using the Google Cloud Console.
Step-by-Step Workflow Covered in This Lecture
1. Navigating to Vertex AI
We begin by opening Vertex AI in the Google Cloud Console and selecting the correct region (US Central 1) to match our dataset and BigQuery tables.
2. Creating a Vertex AI Dataset
We create a new dataset with the following configuration:
Dataset Type: Tabular
Objective: Regression and Classification
Encryption: Google-managed encryption key
This dataset will serve as the input source for AutoML training.
3. Connecting BigQuery as the Data Source
Instead of uploading CSV files again, we directly connect:
The BigQuery training table (employee_train)
Dataset format is automatically recognized as BigQuery
Dataset metadata such as location, encryption, and data types is displayed
This approach is scalable and commonly used in production environments.
4. Generating Dataset Statistics
Once the dataset is created:
Vertex AI automatically computes column statistics
You can see:
Number of distinct values per column
Data type distribution (boolean, integer, string, etc.)
This helps validate data quality before training
You can also view dataset lineage using graphical or list views.
5. Starting AutoML Model Training
Next, we start training a new model:
Training Method: AutoML
Model Type: Tabular Classification
Target Column: attrition
Model Name: AutoML Tabular Model
This tells Vertex AI that we want to predict employee attrition using AutoML.
6. Understanding Data Splitting During Training
Although we already created train, test, and evaluation tables earlier:
AutoML still performs an internal split on the training dataset
Default split configuration:
80% Training
10% Validation
10% Testing
The split method used is random
Other options such as manual or chronological splits are available, but default settings are sufficient for most use cases.
7. Training Configuration and Optimization
During training setup:
Feature transformations are handled automatically
Optimization objective defaults to ROC AUC
Feature Store integration is optional and skipped
Encryption remains Google-managed
All settings follow AutoML best practices, allowing the platform to automatically tune the model.
8. Setting the Training Budget
Since our dataset contains fewer than 10,000 rows:
Training budget is set between 1 to 3 hours
We select 1 hour
Early stopping is enabled by default
Vertex AI provides a clear pricing estimate before training begins, ensuring transparency in cost.
9. Monitoring Model Training
Once training starts:
The job appears under Vertex AI → Training
You can monitor:
Training status
Pipeline ID
Model type (Tabular Classification)
Budget and elapsed time
Region and timestamps
Training typically takes up to a few hours, depending on dataset size and complexity.
In this lecture, we complete the AutoML Tabular machine learning lifecycle by exploring the trained model, analyzing its performance, and deploying it to a Vertex AI endpoint for real-time predictions.
After the AutoML training finishes successfully, this tutorial focuses on model evaluation, interpretability, deployment, and testing, which are critical steps before using any machine learning model in production.
What You Will Learn in This Lecture
By the end of this session, you will be able to:
Locate your trained AutoML model in the Vertex AI Model Registry
Understand key evaluation metrics for a classification model
Interpret confusion matrix, ROC curve, and precision-recall curve
Analyze feature importance generated by AutoML
Deploy the model to a Vertex AI endpoint
Test the deployed endpoint with real input values
Interpret prediction results and confidence scores
Step-by-Step Workflow Covered in This Lecture
1. Accessing the Trained Model
Once training is completed:
The model moves from Training Jobs to the Model Registry
From here, you can manage, deploy, and monitor the model
Clicking the model opens detailed evaluation and deployment options
2. Understanding the Evaluation Metrics
Inside the Evaluation tab, Vertex AI provides multiple performance metrics:
Confidence Threshold (0.5) – used to convert probabilities into predictions
Accuracy
Precision
Recall
F1 Score
ROC AUC
Log Loss
These metrics help you understand how well the model predicts employee attrition.
3. Visual Performance Analysis
Vertex AI also provides rich visualizations, including:
ROC Curve – shows the trade-off between true positive and false positive rates
Precision-Recall Curve – useful when dealing with class imbalance
Threshold Curves – help adjust prediction confidence levels
These graphs help you choose the right confidence threshold for business use cases.
4. Confusion Matrix Interpretation
The confusion matrix clearly shows:
True Positives
True Negatives
False Positives
False Negatives
Using this matrix, we can manually calculate the model accuracy, which comes out to approximately 57–60%. This confirms the evaluation score shown by Vertex AI.
Note: The accuracy is relatively low because the dataset is small. With larger and more balanced data, AutoML generally performs much better.
5. Feature Importance Analysis
Vertex AI automatically calculates feature importance, helping you understand:
Which features most influence predictions
How strongly each feature impacts the model
In this case:
Features like OverTime, JobLevel, and other employment-related attributes rank highest
This improves model explainability and trust
6. Deploying the Model to an Endpoint
Next, we deploy the model for online predictions:
Create a new endpoint
Select the region (US Central 1)
Assign 100% traffic to this model
Choose compute resources (machine type, CPU, memory)
Skip model monitoring for now
Once deployed, the endpoint becomes active and ready to receive prediction requests.
7. Exploring the Endpoint Dashboard
After deployment, you can monitor:
Request count
Latency
Error rate
Response time
Model inference metrics
Initially, these metrics show no data because the endpoint is newly created.
8. Testing the Deployed Model
To test the endpoint:
Provide input values for all required features
Submit a prediction request
Receive:
Predicted label (true or false)
Confidence score
In this case:
Prediction = False
Meaning: the employee is not likely to leave the organization
This confirms that the endpoint is working correctly.
In this lecture, we move beyond online predictions and learn how to perform batch inference using a trained AutoML Tabular model in Vertex AI. Batch inference is a critical capability when you need to generate predictions for large datasets at once, instead of making real-time requests through an endpoint.
You will learn how to configure, run, and analyze batch prediction jobs using BigQuery tables as input and Google Cloud Storage (GCS) as output.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Understand the difference between online inference and batch inference
Create a batch inference job using a BigQuery table
Configure batch prediction output in CSV format
Store prediction results in a GCS bucket
Explore and interpret batch prediction output files
Understand available input and output options for batch inference in Vertex AI
Step-by-Step Topics Covered
1. Why Batch Inference Is Important
Batch inference is used when:
You need predictions for thousands or millions of records
Real-time latency is not required
Predictions are needed for reporting, analytics, or downstream processing
Cost efficiency is more important than low-latency responses
In this tutorial, we generate predictions for the employee test dataset created earlier.
2. Creating a Batch Inference Job
You begin by creating a new batch inference job and configuring:
Batch inference name for easy tracking
Input source as a BigQuery table (employee_test)
Output format as CSV
Output location as a Google Cloud Storage bucket
This setup allows Vertex AI to process all test records automatically and store predictions in cloud storage.
3. Input and Output Configuration
Key configuration choices explained in this lecture include:
Input source options
BigQuery tables
CSV files from Cloud Storage
Output format options
CSV files
BigQuery tables
JSON format
TFRecord format
For this demo, we use:
BigQuery as the input
CSV files in GCS as the output
4. Running the Batch Inference Job
Once configured:
Vertex AI creates a batch inference pipeline
The job runs asynchronously
The status changes from Running to Completed
Multiple output files are generated automatically
This approach allows efficient and scalable predictions without manual intervention.
5. Exploring Batch Prediction Results
After the job completes:
The output folder in GCS contains multiple CSV files
Files are split automatically for scalability
Each CSV contains:
Prediction probabilities for each class
Separate scores for:
Attrition_False_Score
Attrition_True_Score
This makes it easy to:
Analyze prediction confidence
Choose custom decision thresholds
Integrate results into analytics or dashboards
6. Understanding Prediction Output
Each record includes:
Probability that the employee will stay
Probability that the employee will leave
These scores help business teams:
Identify high-risk employees
Take preventive actions
Perform trend analysis at scale
7. Where Batch Inference Fits in the ML Lifecycle
This lecture completes an important phase of the ML workflow:
Model training
Model evaluation
Online inference (real-time predictions)
Batch inference (large-scale predictions)
Batch inference is commonly used in:
HR analytics
Marketing segmentation
Financial risk assessment
Periodic reporting pipelines
In this lecture, we focus on an important but often ignored step in machine learning projects—resource cleanup. After completing the AutoML Tabular workflow in Vertex AI, it is critical to delete unused resources to avoid unnecessary cloud costs, keep the project organized, and follow cloud best practices.
You will learn how to safely delete all resources created during the AutoML Tabular project, including Vertex AI components, BigQuery datasets, and Cloud Storage files.
Why Resource Cleanup Is Important
Machine learning services on Google Cloud are billable resources. If left running, they can continue to incur costs even when not in use. This lecture helps you:
Prevent unexpected cloud charges
Maintain a clean and manageable Google Cloud project
Follow production-grade cloud hygiene practices
Understand dependencies between cloud resources
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Correctly delete a deployed Vertex AI endpoint
Remove trained AutoML Tabular models and training pipelines
Delete Vertex AI datasets
Clean up BigQuery datasets and tables
Remove input and output files from Google Cloud Storage
Understand the correct order of deletion to avoid errors
Step-by-Step Topics Covered
1. Cleaning Up Vertex AI Resources
You start by cleaning all resources created in Vertex AI, following the correct order:
a. Deleting the Endpoint
Endpoints must be deleted before deleting models or pipelines
A model must be undeployed from the endpoint first
Once undeployed, the endpoint can be safely deleted
b. Deleting the Training Pipeline
After the endpoint is removed, the training pipeline can be deleted
This removes the AutoML training job and related metadata
c. Deleting the Vertex AI Dataset
The dataset created for AutoML training is deleted next
This ensures no unused datasets remain in the project
2. Deleting BigQuery Resources
Next, you clean up BigQuery:
Delete the dataset created for AutoML tabular data
This automatically removes:
Training tables
Test tables
Evaluation tables
This step ensures no unused BigQuery storage remains active.
3. Cleaning Up Cloud Storage (GCS)
Finally, you remove Cloud Storage resources:
Delete the input folder containing uploaded CSV files
Delete the output folder containing batch prediction results
Ensure the bucket is fully cleaned
This completes the removal of all storage resources created during the project.
Best Practices Highlighted in This Lecture
Always undeploy models before deleting endpoints
Follow the correct deletion order to avoid permission errors
Clean up resources immediately after experimentation
Regularly review your cloud environment for unused assets
In this lecture, we begin our journey into AutoML Forecasting using Google Vertex AI. Forecasting is a powerful machine learning technique used to predict future values based on historical time-series data, such as stock prices, sales, demand, or traffic trends.
In this session, you will learn how to prepare data, create datasets, and configure an AutoML Forecasting model step by step using Google Cloud services.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Prepare time-series data for forecasting
Upload forecasting data to Google Cloud Storage (GCS)
Create BigQuery datasets and tables for AutoML forecasting
Understand required columns for forecasting models
Create a Vertex AI Tabular Forecasting dataset
Configure and start training an AutoML Forecasting model
Key Topics Covered
1. Uploading Forecasting Data to Cloud Storage
Use an existing GCS bucket created earlier
Create an input folder for organizing forecasting files
Upload:
Historical data file (training data)
Test data file (future timestamps with missing target values)
This ensures clean separation between training and prediction data.
2. Creating a BigQuery Dataset for Forecasting
Create a new dataset dedicated to forecasting
Select the same region (us-central1) to maintain compatibility
Organize forecasting tables inside a single dataset
This dataset will act as the data source for Vertex AI.
3. Creating BigQuery Tables from CSV Files
You will create multiple tables:
Historical stock data table
Contains columns like date, open, high, low, close, and volume
Test data table
Contains future dates where the target value needs to be predicted
Training table
A cleaned and simplified version used specifically for AutoML forecasting
4. Understanding Required Columns for Forecasting
For AutoML Forecasting, you must define:
Timestamp column
Represents the time dimension (e.g., daily dates)
Target column
The value you want to forecast (e.g., closing stock price)
Series identifier column
Identifies each time series
Even if there is only one series, a placeholder ID is required
This lecture explains why these columns are mandatory and how they are used internally by AutoML.
5. Creating a Vertex AI Forecasting Dataset
Navigate to Vertex AI → Datasets
Choose Tabular → Forecasting
Select BigQuery table as the data source
Specify:
Series identifier column
Timestamp column
Once created, the dataset becomes ready for model training.
6. Configuring AutoML Forecasting Model Training
You will learn how to configure key forecasting parameters, including:
Target column (value to predict)
Time granularity (daily data)
Forecast horizon
Number of future time periods to predict
Context window
Amount of historical data used for predictions
Holiday region
Helps the model learn seasonal effects
Chronological data split
Ensures proper time-based training, validation, and testing
7. Training Budget and Model Execution
Select AutoML as the training method
Allocate training budget based on dataset size
Start the AutoML Forecasting training job
Monitor the training status in Vertex AI
The training process runs asynchronously and may take up to a few hours depending on data size and configuration.
In this lecture, we continue our AutoML Forecasting journey by exploring the trained forecasting model, understanding its evaluation metrics, and generating predictions using batch inference in Vertex AI.
You will learn why forecasting models work differently from other machine learning models and how to generate predictions correctly using Google’s managed forecasting workflow.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Explore a trained AutoML Forecasting model in Vertex AI
Understand key forecasting evaluation metrics
Analyze feature importance for time-series models
Learn why forecasting models cannot be deployed to online endpoints
Generate predictions using batch inference
Access and interpret forecasting results stored in BigQuery
Key Topics Covered
1. Reviewing the Trained AutoML Forecasting Model
The forecasting model is successfully trained using Vertex AI
Training duration is close to two hours, based on the allocated budget
The model appears in the Model Registry under training pipelines
Model type: Tabular Forecasting
This confirms that the AutoML forecasting workflow has completed successfully.
2. Understanding Model Evaluation Metrics
Inside the Evaluate tab, you analyze forecasting performance using industry-standard metrics:
MAE (Mean Absolute Error) – Measures average absolute error
MAPE (Mean Absolute Percentage Error) – Measures percentage-based error
RMSE (Root Mean Squared Error) – Penalizes large errors
RMSLE (Root Mean Squared Log Error) – Useful for scale-sensitive data
R² Score – Indicates how well the model explains variance
These metrics help you judge how reliable the forecast is.
3. Feature Importance in Forecasting Models
Feature importance shows how much each feature contributes to predictions
Date (time) contributes the most
Target value (close price) also plays a significant role
This confirms that the model is learning meaningful time-based patterns.
4. Why Forecasting Models Cannot Be Deployed to Endpoints
Unlike classification or regression models, AutoML Forecasting models cannot be deployed for real-time predictions
Attempting to deploy results in an error
Forecasting predictions must be generated using batch inference only
This is an important architectural difference to understand.
5. Automatically Exported Evaluation Predictions
During training, the option to export test data to BigQuery was enabled
Vertex AI automatically creates a BigQuery table containing:
Actual values
Predicted values
Prediction timestamps
Series identifiers
This allows easy inspection and validation of model predictions.
6. Creating a Batch Inference Job
You then create a batch prediction job by:
Selecting the trained forecasting model
Choosing a BigQuery test table as input
Specifying BigQuery as the output destination
Disabling optional explainability and monitoring
Submitting the batch inference job
Batch inference is the correct and recommended way to generate forecasts.
7. Monitoring Batch Prediction Execution
The batch inference job runs for around 20 minutes
Job status shows:
Number of predictions processed
Success and failure counts
Execution time
Logs are available for auditing and debugging
8. Viewing Forecasting Results in BigQuery
After completion:
A new BigQuery table is created automatically
This table includes:
Date
Predicted forecast value
Series identifier
Each row represents a future forecast point
These predictions can now be used for reporting, dashboards, or downstream analysis.
In this lecture, we begin learning how to work with text data on Google Cloud, focusing on sentiment analysis using Vertex AI. You will understand why traditional AutoML Text models are no longer supported and how Gemini models are now used instead for text classification tasks.
This lecture walks you through the complete transition from AutoML Text to Gemini-based fine-tuning, using a real-world example of restaurant reviews.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Understand why AutoML Text is deprecated in Vertex AI
Learn how text classification is now handled using Gemini models
Prepare text datasets for supervised fine-tuning
Understand the required JSON format for Gemini fine-tuning
Fine-tune a Gemini model for sentiment classification
Use trained models for batch predictions
Key Topics Covered
1. Understanding the Text Classification Use Case
The dataset contains restaurant reviews
Each review has a label:
1 → Positive review
0 → Negative review
The dataset includes 1,000 total records
This represents a binary text classification problem
2. Uploading Text Data to Google Cloud Storage
A TSV (tab-separated) file is uploaded to a GCS bucket
The file contains:
Review text
Sentiment label
This data will be used for training and prediction
3. AutoML Text Deprecation Explained
Attempting to create a Text dataset in Vertex AI results in an error
Google has deprecated AutoML NLP models
Vertex AI now recommends using:
Gemini models
Vertex AI Studio
Generative AI workflows
This is an important architectural change you must understand when working with text data on Google Cloud.
4. Using Gemini for Text Classification
Instead of AutoML Text, you use Gemini foundation models
Gemini can classify text using prompt-based instructions
You define:
What is a positive review
What is a negative review
The model responds with:
1 for positive sentiment
0 for negative sentiment
5. Why Fine-Tuning Is Required
Prompt-based classification works, but fine-tuning improves:
Accuracy
Consistency
Efficiency
Fine-tuning is ideal when:
You have labeled data
The task is clearly defined
Text classification is a perfect use case for supervised fine-tuning
6. Preparing Data for Gemini Fine-Tuning
The original TSV file is converted into three JSON files:
Training data
Validation data
Batch prediction data
Each JSON record includes:
System instruction (defines model behavior)
User content (review text)
Model response (label: 0 or 1)
This format teaches Gemini exactly how to classify sentiment.
7. Understanding the Gemini Fine-Tuning JSON Structure
Each record clearly defines:
System role → Explains the classification rules
User role → Contains the review text
Model role → Contains the correct label
This structure is required for supervised tuning.
8. Fine-Tuning a Gemini Model in Vertex AI
A fine-tunable Gemini model is selected from Model Garden
A new tuned model is created with:
A base Gemini model
Training data from Cloud Storage
Validation data from Cloud Storage
Default training parameters are used
Fine-tuning runs for more than one hour
9. Monitoring the Fine-Tuning Process
The model status shows as Running
Vertex AI manages the entire training pipeline
Once completed, the tuned model will be available for predictions
In this lecture, we continue working with the fine-tuned Gemini text classification model that was trained in the previous tutorial. Now that the tuning process has completed successfully, we focus on understanding the training results, testing the model interactively, and running batch inference on unseen data.
This lecture helps you move from model training to model validation and real-world usage.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Understand fine-tuning results and metrics for a Gemini model
Interpret training vs validation accuracy and loss
Explore checkpoints and model versions
Test a tuned Gemini model using Vertex AI Studio
Perform batch inference on multiple text inputs
Read and understand batch prediction outputs
Key Topics Covered
1. Reviewing the Fine-Tuning Status and Progress
The tuned model Text Classifier v3 has successfully completed training
The tuning status shows Succeeded
Training progress is visualized using:
Accuracy curves
Loss curves
The blue line represents training performance
The pink line represents validation performance
This comparison helps identify overfitting or underfitting.
2. Understanding Checkpoints and Model Versions
Multiple checkpoints are created during tuning
Each checkpoint represents a saved model state
A separate endpoint is deployed for each checkpoint
This allows testing different versions of the model
Accuracy is tracked for both:
Training data
Validation data
3. Dataset and Token Statistics
Training dataset contains 800 labeled records
Detailed statistics are shown for:
Input tokens per example
Output tokens per example
Messages per example
You can see:
Minimum, maximum, mean, and median values
This helps you understand:
Text length distribution
Model token usage
4. Exploring the Model in Vertex AI Model Registry
The tuned model is available in the Model Registry
Key tabs include:
Evaluate
Test
Batch Inference
Version Details
Lineage
Evaluation can also be done using Generative AI Evaluation Service in Colab
5. Testing the Tuned Model in Vertex AI Studio
The model is opened directly in Vertex AI Studio
To get correct predictions:
A system instruction must be provided
The instruction defines classification rules:
1 for positive sentiment
0 for negative sentiment
Example tests include:
Positive reviews → output 1
Negative reviews → output 0
This ensures consistent and reliable predictions.
6. Advanced Testing Options
You also explore additional configuration options such as:
Structured output (on/off)
Thinking budget:
Auto
Manual
Off
Grounding options:
Google Search
Your own data
Advanced parameters:
Temperature
Output token limits
Top-p sampling
These settings help control model behavior.
7. Running Batch Inference on Multiple Reviews
Batch inference is created using:
The tuned Gemini model
A JSON file stored in Cloud Storage
The batch file contains multiple text reviews
The output is saved to a Cloud Storage folder
Batch inference allows:
Large-scale predictions
Automated evaluation
Offline processing
8. Understanding Batch Inference Output
The output file contains:
Input text
Predicted label (0 or 1)
Confidence information
Token usage details
Each record shows:
Whether the sentiment is positive or negative
This makes it easy to validate model behavior on real data
In this lecture, we focus on evaluating a fine-tuned Gemini model using the Vertex AI Generative AI Evaluation Service. After testing the model in the previous tutorial, this session helps you measure model quality, compare responses against reference answers, and understand evaluation metrics in a practical and structured way.
This lecture introduces both automatic similarity-based evaluation and model-based qualitative evaluation, which are critical for validating real-world Generative AI applications.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Launch and use the Vertex AI Evaluation notebook
Run evaluations using Colab or Vertex AI Workbench
Understand reference-based evaluation metrics
Use model-based pointwise and pairwise metrics
Interpret evaluation results in a simple and practical way
Decide whether a model is performing well enough for production
Key Topics Covered
1. Accessing the Evaluation Notebook
The evaluation notebook is launched directly from the Evaluate tab
You can open it in:
Google Colab
Colab Enterprise
Vertex AI Workbench
GitHub
The notebook is provided by Google LLC and licensed under Apache License 2.0
This notebook is specifically designed for Generative AI model evaluation in Vertex AI.
2. Understanding the Purpose of the Evaluation Notebook
The notebook demonstrates how to evaluate models using the Vertex AI Python SDK for:
First-party models (Gemini)
Third-party models (Anthropic, OpenAI, etc.)
Prompt engineering experiments
In this lecture, the focus is on first-party Gemini models.
3. Authentication and SDK Initialization
The notebook requires authentication to your Google Cloud project
You specify:
Project ID
Region (for example, us-central1)
This ensures the evaluation runs securely within your cloud environment
4. Loading the Evaluation Dataset
A sample evaluation dataset (Open-Orca style dataset) is used
The dataset contains:
System prompts
Questions
Reference (expected) answers
The dataset is converted into a structured format for evaluation
This dataset acts as the ground truth for comparison.
5. Reference-Based Evaluation (ROUGE-L Metric)
The first evaluation uses ROUGE-L, a similarity-based metric
ROUGE-L measures:
How much the model response overlaps with the reference answer
Key statistics explained:
Mean score (average similarity)
Standard deviation (variation across examples)
Interpretation:
Scores closer to 1.0 indicate strong similarity
Lower scores mean partial or weak overlap
This helps identify whether the model is answering correctly, even if phrasing differs
6. Understanding the Evaluation Table
Each evaluation row includes:
System instruction
Input question
Reference answer
Model response
ROUGE-L similarity score
This allows you to:
Inspect individual failures
Understand why certain responses scored low
Identify patterns in model behavior
7. Model-Based Pointwise Evaluation Metrics
In addition to similarity metrics, the lecture explores model-based qualitative metrics, such as:
Fluency
Coherence
Safety
Summarization quality
Instruction following
Text quality
These metrics are:
Generated by AI models
Based on human-like judgment
Useful for evaluating writing quality and readability
8. Interpreting Pointwise Evaluation Results
For each response, the evaluation provides:
A score (higher is better)
A natural language explanation
This helps you understand:
Why a response is fluent or not
Whether the answer is safe and well-structured
How well the model followed the instruction
9. Creating Custom Pointwise Metrics
The notebook also allows:
Defining your own evaluation criteria
Measuring custom attributes such as:
Linguistic acceptability
Domain-specific quality
This is useful for enterprise-specific use cases
10. Pairwise Evaluation Metrics
Pairwise evaluation compares:
Two model responses side-by-side
Metrics such as:
Summarization quality
Text verbosity
The output clearly indicates:
Which response is better
Why it is better
This is extremely helpful when:
Comparing different prompts
Comparing different model versions
11. Extending Evaluation to Other Models
The same framework can be used to evaluate:
Third-party models hosted on Vertex AI
Multiple Gemini versions
This allows fair and consistent comparison across models
12. Prompt Engineering Evaluation
The notebook also supports:
Comparing multiple prompt templates
Measuring how prompt changes affect quality
Viewing experiment logs for evaluation runs
This makes evaluation a powerful tool for prompt optimization.
In this lecture, we begin our journey into AutoML for Image Data using Google Vertex AI. The primary focus of this session is image data preparation, which is one of the most important and often overlooked steps in building a successful machine learning model.
You will learn how to prepare a real-world image dataset, organize it correctly, upload it to Google Cloud Storage, and make it ready for AutoML single-label image classification in Vertex AI.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Understand image AutoML options available in Vertex AI
Select the correct image objective for your use case
Prepare an image dataset for single-label classification
Organize images using labels and folders
Upload image data to Google Cloud Storage (GCS)
Create a CSV file required by Vertex AI for image datasets
Make your dataset fully ready for AutoML training
Key Topics Covered
1. Introduction to AutoML for Images in Vertex AI
Overview of Vertex AI image datasets
Available image objectives:
Single-label classification
Multi-label classification
Object detection
Image segmentation
Why single-label classification is selected for this tutorial
This lecture focuses on classifying images into one category only.
2. Understanding the Image Dataset
Introduction to the Rock–Paper–Scissors dataset from Kaggle
Dataset details:
Total images: 2,892
Three classes: rock, paper, scissors
Real-world example of a classification problem
This dataset is ideal for beginners to understand image classification concepts.
3. Downloading and Exploring the Dataset
Downloading the dataset from Kaggle
Extracting the ZIP file
Understanding dataset structure:
Separate folders for each class
Images grouped by label
Visual inspection of images to understand variations
4. Training and Testing Data Split
Creating a training dataset:
100 images per class (rock, paper, scissors)
Using the provided test dataset as-is:
124 images per class
Importance of separating training and testing data
This ensures better model evaluation later.
5. Uploading Images to Google Cloud Storage (GCS)
Creating folders inside a GCS bucket:
rock
paper
scissors
Uploading training images to their respective folders
Verifying successful uploads
Each folder name directly represents the label for classification.
6. Why a CSV File Is Required for AutoML Images
Vertex AI requires a CSV or JSONL file that tells the system:
Where each image is stored (GCS path)
What label belongs to each image
This lecture uses a CSV file, which is simpler to understand.
7. Understanding the CSV File Structure
The CSV file contains:
Image file path (GCS URI)
Label name
Important rules explained:
GCS paths are case-sensitive
Labels must:
Start with a letter
Contain only letters, numbers, or underscores
Optional column for dataset splitting is discussed but not used
8. Generating Image Paths Using Cloud Shell
Using Cloud Shell to list all images from each folder
Extracting image paths automatically
Assigning correct labels to each image
Creating individual text files for:
rock
paper
scissors
Combining all files into one CSV dataset file
This approach avoids manual errors and saves time.
9. Uploading the CSV File to GCS
Copying the generated CSV file to the GCS bucket
Verifying the file upload
Downloading and validating the CSV content
Final dataset summary:
Total records: 300
One row per image
Each image mapped to its correct label
In this lecture, we move to the next critical step in our AutoML image classification workflow:
creating a Vertex AI image dataset and training a machine learning model using the data prepared in the previous tutorial.
By the end of this session, you will understand how to take a fully prepared image dataset and use Vertex AI AutoML to train a single-label image classification model—without writing any machine learning code.
What You Will Learn in This Lecture
By completing this lecture, you will be able to:
Create an image dataset in Vertex AI
Import labeled images using a CSV file from Google Cloud Storage
Understand dataset statistics and label distribution
Configure AutoML training options for image classification
Start training an AutoML image model on the cloud
Understand compute, pricing, and training workflow in Vertex AI
Topics Covered in Detail
1. Recap of Data Preparation
We begin with a quick recap of what was completed earlier:
Three image folders were created:
rock
paper
scissors
Images were uploaded to Google Cloud Storage
A CSV file was created containing:
Image file path (GCS URI)
Corresponding label for each image
This CSV file acts as the bridge between Cloud Storage and Vertex AI.
2. Creating an Image Dataset in Vertex AI
You will learn how to:
Navigate to Vertex AI → Datasets
Create a new dataset with:
Image as the data type
Single-label classification as the objective
Correct region selection (US Central 1)
Once created, this dataset becomes the foundation for training the model.
3. Importing Images Using Google Cloud Storage
Instead of uploading images manually, we use the CSV import method, which allows Vertex AI to:
Read image paths directly from Cloud Storage
Automatically assign labels from the CSV file
Import all images in one step
Key points covered:
Supported image formats
Why CSV import is preferred for large datasets
Default data split selection when no split column is provided
4. Verifying Imported Data
After import, we review:
Total number of images
Labeled vs unlabeled images
Distribution across labels:
Rock
Paper
Scissors
Dataset properties such as:
Objective type
Region
Encryption
Annotation sets
This step ensures your dataset is ready for training.
5. Dataset Analysis and Readiness Check
Vertex AI automatically analyzes the dataset and confirms:
All images are labeled correctly
Each class has sufficient data
The dataset meets AutoML training requirements
This helps avoid training failures later.
6. Configuring AutoML Training
You will learn how to configure training options, including:
Selecting AutoML as the training method
Choosing Cloud deployment (not Edge)
Training a new model version
Naming and describing the model clearly
We also discuss when and why Edge deployment might be used.
7. Data Splitting Strategy
Training configuration includes:
Random data split
Default ratios:
80% training
10% validation
10% testing
You will also learn when manual data splits might be useful.
8. Incremental Training Explained
This lecture explains:
What incremental training is
When it should be enabled
Why it is disabled for this first model
This concept is important for future model improvements and cost optimization.
9. Compute and Pricing Configuration
Before starting training, we:
Select minimum compute requirements
Understand node-hour usage
Choose a cost-effective training setup
This section helps you control training cost without sacrificing learning outcomes.
10. Starting the Training Process
Finally, the AutoML image classification training is started:
Training status can be monitored in Vertex AI
Model training runs in the background
Expected training time is around two hours
You also learn how to track training progress from the Training section.
In this lecture, we complete the end-to-end AutoML image classification workflow by evaluating the trained model, deploying it to an endpoint, testing real-time predictions, and finally running batch inference on multiple images stored in Google Cloud Storage.
This session helps you understand how a trained AutoML image model behaves in real-world scenarios and how predictions are generated both online and in batch mode.
What You Will Learn in This Lecture
By the end of this lecture, you will be able to:
Explore training pipeline results and completion details
Analyze AutoML image model evaluation metrics
Understand confidence scores, precision, recall, and confusion matrix
Deploy an image classification model to a Vertex AI endpoint
Perform real-time (online) image predictions
Understand why small datasets can affect prediction accuracy
Prepare input data for batch image inference
Run batch inference using JSONL input files
Analyze batch prediction outputs stored in Cloud Storage
Topics Covered in Detail
1. Training Pipeline Completion Overview
We begin by reviewing the completed training pipeline:
Training status marked as Finished
Total training duration of approximately 1 hour and 43 minutes
Trained model: Rock Paper Scissors Image Classification Model
This confirms that AutoML training completed successfully.
2. Model Evaluation Results
In the Evaluate tab, we analyze model performance:
Default confidence threshold: 0.5
Explanation of confidence score and model certainty
Label-wise evaluation for:
Rock
Paper
Scissors
Precision and recall shown as 100%
Dataset split:
Training images: 240
Validation images: 30
Test images: 30
Confusion matrix interpretation
⚠️ Important note: Perfect evaluation metrics on a small dataset may not reflect real-world performance.
3. Deploying the Model to an Endpoint
Next, we deploy the trained model for online predictions:
Create a new deployment endpoint
Select region and access type
Use default encryption settings
Configure traffic split (100% to one model)
Allocate compute resources
Complete endpoint deployment
This makes the model available for real-time inference.
4. Testing Online Predictions
After deployment, we test the model by uploading images:
Predictions return probabilities for each label
Observed cases where:
Predictions are incorrect
Confidence scores are misleading
Explanation of why misclassifications occur:
Training dataset is too small
Model needs more diverse data for better generalization
This highlights an important real-world ML concept:
Good evaluation scores do not always guarantee good predictions.
5. Introduction to Batch Inference
We then move to batch inference, which is used when:
Predicting labels for many images at once
Images are already stored in Cloud Storage
Real-time predictions are not required
Batch inference is ideal for offline processing and large datasets.
6. Preparing Data for Batch Prediction
Before running batch inference, we prepare:
A Cloud Storage folder containing test images
A JSONL file that:
References image paths in GCS
Specifies image MIME type
This JSONL file acts as the input manifest for the batch job.
7. Granting Required Permissions
To avoid job failures, required permissions are explained:
Vertex AI service account must have:
Read access to input images
Write access to output folder
Proper IAM roles are required for batch jobs to succeed
This step is critical for production-ready pipelines.
8. Running the Batch Inference Job
We then create the batch inference job:
Specify:
Input JSONL file
Output format (JSONL)
Output destination folder
Monitor job execution
Handle initial failure and retry after permission fixes
Confirm successful job completion
9. Analyzing Batch Prediction Output
Finally, we analyze the batch inference results:
Output JSONL file contains:
Image path
Predicted label
Confidence scores
Observations:
Some predictions are incorrect
Only one image classified correctly
Explanation:
Limited training data leads to poor generalization
This reinforces the importance of larger and more diverse datasets.
In this lecture, you will begin your journey with the Agent Development Kit (ADK) by building and understanding your very first AI agent. The focus of this session is on creating a simple yet practical example called the Fun Facts Agent, which demonstrates how agents are structured, configured, and prepared to run using ADK.
We start by introducing the overall objective of the lecture—learning what ADK is and how it can be used to build AI-powered agents. You will then walk through the complete project setup, starting from the folder structure to preparing the environment required to run an agent successfully.
What This Lecture Covers
Introduction to ADK
What the Agent Development Kit (ADK) is
Why ADK is used for building AI agents
Overview of the Fun Facts Agent use case
Understanding the Project and Folder Structure
Explanation of the SDK folder and its purpose
Organizing agents into categories (for example, starter agents and tool agents)
How to manage multiple agents under a single category
Role of important files such as:
README.md for documentation and notes
.env file for environment variables
agent.py for defining the agent logic
Setting Up the Development Environment
Creating a single virtual environment at the SDK level
Why one shared environment is useful for multiple agents
Installing all required dependencies using a centralized requirements file
Overview of key dependencies used in the project, including:
Google Agent Development Kit (ADK)
Libraries for language model interaction
Utility and environment management packages
Exploring Gemini Models
Introduction to Gemini language models
Understanding the differences between Gemini model variants
Selecting an appropriate model (Gemini 2.5 Flash) for the Fun Facts Agent
Where to find official documentation and pricing details
How model choice impacts performance and cost
Creating and Managing the API Key
Generating a Google API key using Google AI Studio
Associating the API key with a Google Cloud project
Storing the API key securely in the .env file
Best practices for managing API keys in agent-based projects
Defining the Fun Facts Agent
Setting the agent name and description
Configuring the language model used by the agent
Writing clear instructions that define the agent’s behavior
Adding behavior rules and example interactions to guide responses
Ensuring outputs are short, accurate, friendly, and kid-friendly
In this lecture, we take the next step in our ADK (Agent Development Kit) journey by running and testing the Fun Facts Agent that we prepared in the previous tutorial. After completing the setup tasks earlier, this session focuses on understanding how Python packages work in ADK projects, resolving common environment issues, and interacting with the agent through the ADK web interface.
We begin by revisiting the progress made so far and then move into the final steps required to successfully execute the agent and verify its behavior.
What This Lecture Covers
Recap of Previous Setup
Understanding the folder structure
Creating and activating the virtual environment
Installing all required dependencies
Exploring Gemini models
Creating and configuring the API key
Understanding the __init__.py File
Why the __init__.py file is required in agent folders
How it helps treat a folder as a Python package
Making the agent.py file accessible when the agent package is imported
Relationship between __init__.py, agent.py, and the ADK framework
Resolving Common Environment Errors
Understanding why dependency-related errors can appear after installation
Configuring the correct Python interpreter in VS Code
Linking VS Code to the virtual environment created for the SDK
Verifying that the ADK libraries are correctly recognized
Running the Agent Using ADK
Navigating to the correct agent directory
Understanding available ADK commands and their purpose:
Running agents locally
Starting a web-based interface
Interactive testing and evaluation options
Launching the agent using the ADK web command for local testing
Interacting with the Fun Facts Agent
Accessing the local web UI
Chatting with the agent using natural language
Generating fun, kid-friendly facts on different topics
Confirming that the agent responds correctly and consistently
Understanding the ADK Web Interface
Exploring the chat panel
Understanding the Events tab:
User prompts and agent responses
Role of the language model
Invocation details and metadata
Introduction to additional tabs such as:
State and session information
Artifacts and evaluation data
High-level overview of these sections, with deeper explanations planned for upcoming lectures
In this lecture, we move beyond the basic starter agent and step into a more powerful concept in the Agent Development Kit (ADK): Tools Agents. Tools allow an AI agent to go beyond simple text generation and interact with external systems, perform calculations, fetch real-time information, and execute specific actions.
In this session, you will design, run, and understand three different tools-based agents:
Search Agent
Time Agent
Calculator Agent
Each agent demonstrates a different way tools can be integrated into AI workflows using ADK.
What This Lecture Covers
Introduction to Tools in ADK
What a tool is in the context of AI agents
How tools extend an agent’s capabilities beyond reasoning and text generation
High-level flow of tool-based agent execution:
User request
Model reasoning
Tool selection and execution
Final response generation
Types of Tools Supported by ADK
Function tools (custom Python or Java functions)
Built-in tools (such as Google Search and code executors)
Third-party tools
Google Cloud tools
MCP (Model Context Protocol) tools
OpenAPI-based tools for REST API integration
Guidance on when to use each type
Search Agent: Using a Built-in Tool
Understanding the Google Search built-in tool
Configuring an agent to fetch up-to-date information from the web
Defining agent instructions and behavior for research-oriented tasks
Running the Search Agent using the ADK web interface
Testing real-world queries such as recent AI news
Exploring grounding metadata, sources, and web references in agent responses
Understanding how search results are traced back to their sources
Time Agent: Using a Custom Function Tool
Converting a simple Python function into an ADK tool
Using function calling to fetch real-time system data
Designing an agent that answers time-related queries accurately
Understanding how the agent invokes the function tool during execution
Exploring events, requests, and responses for function-based tools
Calculator Agent: Multiple Custom Tools
Designing a calculator agent with multiple tools
Performing mathematical calculations safely
Supporting unit conversion using custom logic
Understanding how one agent can choose between multiple tools
Observing how tool calls appear in the ADK events and logs
Understanding Agent Events and Metadata
How tool usage is reflected in the Events panel
Differentiating between agent responses and tool executions
Understanding request context, system instructions, and model usage
Introduction to sessions, state, and evaluation options
In this lecture, you will learn how to go beyond Gemini models and use other large language models (LLMs) with the Agent Development Kit (ADK). Until now, all agents were built using Gemini API keys. In this session, we introduce a flexible and scalable approach that allows you to use multiple LLM providers through a single integration.
You will be introduced to LiteLLM, a lightweight Python framework that acts as a unified interface for calling models from different providers, and OpenRouter, a platform that provides access to many popular AI models using a single API key.
What This Lecture Covers
Why Use Non-Gemini Models in ADK
Limitations of using only one LLM provider
Benefits of model flexibility for different use cases
How ADK supports external LLMs through integrations
Introduction to LiteLLM
What LiteLLM is and why it is used
Using a single API to call multiple LLM providers
Supported capabilities:
Chat-based models
Text generation
Streaming responses
Embeddings
Built-in support for logging, retries, and cost tracking
Easy model switching by changing only the model name
Using OpenRouter as a Unified API Layer
What OpenRouter is and how it works
Accessing models from multiple providers such as:
OpenAI
Anthropic
Mistral
xAI (Grok)
Benefits of using one API key instead of multiple provider-specific keys
Understanding paid vs free models
Adding credits and managing usage (optional for paid models)
Creating and Managing API Keys
Signing in to OpenRouter
Creating a new API key
Setting usage limits
Storing the API key securely in the environment file
Reusing the same API key across multiple agents
Building an Agent with OpenAI Models
Configuring LiteLLM with OpenRouter and OpenAI models
Creating a motivational quote agent powered by GPT models
Using a custom function as a tool for structured input handling
Observing how the agent invokes tools before generating responses
Testing the agent with different topics and reviewing outputs
Building an Agent with Anthropic Models
Switching from OpenAI to Anthropic models using only a model name change
Creating a programming joke agent powered by Claude models
Understanding how the same tool-based design works across providers
Testing humor and creative responses with different prompts
Using Free Models (Grok / xAI)
Identifying free models available in OpenRouter
Switching to a free model without changing agent logic
Running agents with Grok models for experimentation
Observing differences in responses across providers
Understanding Events and Tool Calls
How LiteLLM-based agents appear in ADK events
Tracking tool invocation, request context, and model responses
Verifying which model and provider handled each request
In this lecture, you will learn how to work with structured data in AI agents using the Agent Development Kit (ADK). So far, most agents have produced free-form text. In many real-world applications, however, AI systems are expected to return well-defined, predictable, and structured outputs that can be easily consumed by applications, APIs, or downstream systems.
This session introduces the concept of data structuring and schema-based outputs, enabling your agents to return consistent and validated responses.
What This Lecture Covers
Introduction to Structured Data in ADK
Why structured data is important in AI-driven systems
Common use cases for structured outputs (emails, reports, forms, content pipelines)
How ADK supports structured input and output using schemas
Schema Definitions in ADK
Understanding input schema and output schema
Role of the output key in mapping agent responses
How schemas define expected fields, data types, and constraints
Ensuring predictable and machine-readable outputs from language models
Using Pydantic for Structured Output
Why Pydantic is used for data validation and serialization
Defining structured models using BaseModel
Adding metadata, constraints, and descriptions to fields
Validating and cleaning model-generated data automatically
Email Agent: Structured Email Generation
Designing an agent that generates professional emails
Defining structured fields such as:
Subject
Body
Tone
Priority
Estimated read time
Suggested attachments
Follow-up requirement
Using predefined choices with enums for tone and priority
Applying custom validation logic to clean and standardize output
Running the agent and analyzing structured responses in the ADK interface
Understanding how the output key maps the structured result in the agent state
Exploring Agent Events and State
Viewing raw model output versus structured state output
Understanding how structured fields appear in the State tab
Tracking how schemas control and shape the final response
Blog Agent: Structured Content Creation
Creating a blog generation agent with structured output
Defining fields such as:
Blog title
Meta description
Blog body
Tags
Estimated read time
Generating complete blog posts in a consistent, reusable format
Observing how structured content can be reused for websites, CMS platforms, or SEO workflows
In this lecture, you will learn how to run AI agents without using the ADK Web interface. Until now, agents were executed and tested using the ADK Web UI. In real-world applications, however, agents often need to run programmatically, such as inside backend services, APIs, scripts, or automation pipelines.
To achieve this, the Agent Development Kit (ADK) provides three core concepts: Session, State, and Runner. This lecture focuses on understanding these concepts and how they work together to manage conversations and execute agents without a web-based interface.
What This Lecture Covers
Why Context Matters in AI Conversations
Importance of maintaining conversational context in multi-turn interactions
How agents avoid repetition and maintain continuity
Comparison with human-like conversation flow
Core Concepts in ADK Context Management
Session
Represents a single conversation thread
Holds the sequence of user messages and agent actions
Exists only for the duration of a conversation
State
Stores conversation-specific data within a session
Used to manage temporary information relevant to the current interaction
Memory
Stores long-term or cross-session information
Can act as a searchable knowledge base
Useful for recalling information across multiple conversations
Context Management Services in ADK
Role of Session Service and Memory Service
Overview of how ADK manages conversation lifecycle
Understanding the interaction between execution logic and storage
Session Service Implementations
In-memory session service for temporary, local storage
Vertex AI session service for managed, cloud-based sessions
Database-backed session service for persistent storage
When to use each option based on application needs
Understanding the Agent Execution Lifecycle
High-level overview of the ADK runtime
How the runner processes events and agent logic
Interaction between:
Language models
Tools
Callbacks
Session and state storage
Running an Agent Without ADK Web
Project structure for non-web execution
Role of environment configuration using .env files
Loading API keys and runtime settings securely
Creating an agent programmatically using an LM-based agent
Initializing and managing sessions in memory
Using the runner to orchestrate agent execution
Sending user queries and receiving agent responses through events
Asynchronous Execution Flow
Why asynchronous execution is used
Handling agent responses through event streams
Extracting and displaying the final response from events
In this lecture, we put theory into practice by running the agent without using the ADK Web interface and closely examining the execution flow and output. Building on the previous tutorial—where we analyzed the code structure—this session focuses on execution, debugging, and understanding how each component behaves at runtime.
You will see how an ADK agent runs end-to-end using the session service, runner, and event stream, and how to troubleshoot common errors that may appear during execution.
What This Lecture Covers
Executing the Agent Without ADK Web
Navigating to the correct project folder
Running the agent directly using the Python command
Understanding how this approach differs from using adk web
Debugging a Common Runtime Error
Interpreting the error related to session handling
Understanding why the error occurs in asynchronous code
Learning the difference between synchronous and awaitable returns
Applying the correct fix and re-running the agent successfully
Understanding What Changed
Why the session creation method does not require await
How correcting this resolves the execution flow
Best practices for handling async and sync methods in ADK-based projects
Analyzing the Agent Output
Confirming successful session creation
Verifying that the agent initialized correctly
Understanding how the runner processes the user query
Observing how events are generated and streamed
Extracting and displaying the final agent response
End-to-End Execution Flow
Session service initialization
Agent setup and configuration
Query formatting into structured content
Runner execution
Event generation and response extraction
In this lecture, you will build a persistent AI agent that can remember conversations, user details, reminders, and notes even after the application is closed and restarted. Until now, the agents you created stored session data only in memory, which meant all information was lost once the application stopped. In this tutorial, you move one step closer to real-world, production-ready agents by using a local database for persistent session and state storage.
We achieve this by using the Database Session Service provided by the Agent Development Kit (ADK).
What This Lecture Covers
Why Persistent Sessions Matter
Limitations of in-memory session storage
Importance of data persistence for real applications
How agents can “remember” users across restarts
Understanding Database Session Service
Overview of ADK session service implementations:
In-memory session service
Vertex AI session service
Database session service
How database-backed sessions work
Supported databases (SQLite, PostgreSQL, MySQL)
When and why to choose database session service
Project Structure Overview
Purpose of each file in the project:
main.py – application entry point
agent.py – defines agent behavior and tools
utils.py – helper utilities for UI and response handling
__init__.py – package initialization
.env – environment configuration
How all components work together as one system
Main Application Flow (main.py)
Initializing database-backed session storage
Connecting to a local SQLite database
Defining initial session state for new users
Detecting existing user sessions and restoring them
Setting up the agent runner
Running an interactive command-line chat loop
Handling user exit commands gracefully
Memory Agent Design (agent.py)
Defining agent personality and responsibilities
Managing structured state such as:
User name
Reminders
Notes
Tool-based functions for:
Adding, viewing, updating, and deleting reminders
Clearing all reminders
Adding and viewing notes
Updating user profile information
Fetching complete user details
How agent tools directly read from and write to persistent state
Utility Functions (utils.py)
Enhancing command-line experience with colored output
Displaying session state in a readable format
Processing different agent response types
Showing state before and after each interaction
Printing a professional welcome banner with usage guidance
How Persistence Works End-to-End
User interacts with the agent
Agent updates state through tools
State is automatically saved to the database
Application restarts
Existing session data is restored seamlessly
In this lecture, you will see the complete persistent memory agent in action. After understanding the code structure in the previous tutorial, this session focuses on two key goals:
Understanding how all four Python files work together as a single system
Running the agent and observing real, persistent behavior using a local database
This lecture demonstrates how an AI agent can remember user information across multiple runs, making it suitable for real-world applications.
How the Application Works as a Whole
Before running the agent, we first clarify the role of each file and how they collaborate:
main.py – The Orchestrator
Starts the application
Manages session creation and restoration
Controls the interaction flow between user, agent, and database
agent.py – The Brain
Defines the agent’s intelligence and personality
Contains all tools for managing reminders, notes, and user information
Updates and reads persistent state
utils.py – The Interface Manager
Handles user-friendly output formatting
Displays state before and after each interaction
Processes agent responses and presents them clearly in the terminal
init.py – The Foundation
Enables proper package structure
Allows smooth imports between files
Together, these files create a seamless flow where:
User input is captured
The agent processes requests
State changes are saved to the database
Results are displayed clearly to the user
Running the Persistent Memory Agent
Once the application is executed, you observe:
Automatic session creation
A welcome message explaining agent capabilities
Confirmation that reminders and notes will persist across sessions
Creation of a local SQLite database file to store all information
Live Agent Capabilities Demonstrated
During the interactive session, the agent successfully:
Recognizes and remembers the user’s name
Adds multiple reminders
Displays stored reminders on request
Updates existing reminders
Deletes specific reminders
Stores personal notes
Retrieves notes accurately
Updates user profile information (such as name changes)
Maintains all data even after the application is closed and restarted
This clearly shows how persistent storage works in practice.
Verifying Persistence with Database Inspection
To validate persistence, the lecture also covers:
Restarting the application and confirming data is restored
Viewing reminders and notes after restart
Inspecting the SQLite database using a database viewer
Understanding stored tables such as:
Sessions
Events
Application state
Seeing how conversations, state updates, and timestamps are saved internally
In this lecture, you will be introduced to the concept of Multi-Agent Systems in the Agent Development Kit (ADK) and begin building your first multi-agent application. As AI applications grow in complexity, relying on a single monolithic agent becomes difficult to manage, scale, and maintain. Multi-agent systems solve this problem by dividing responsibilities across multiple specialized agents that work together toward a common goal.
This session focuses on understanding the concept, architecture, and initial implementation of a multi-agent system using ADK.
What This Lecture Covers
What Is a Multi-Agent System in ADK
Why single-agent systems become hard to manage at scale
How ADK supports agent composition
Understanding collaboration between multiple agents
Parent–child relationships in multi-agent hierarchies
Key Concepts from ADK Documentation
LM-powered agents vs custom agents
Agent hierarchy and the single-parent rule
Role of orchestrator or manager agents
Overview of workflow agents (sequential, parallel, loop – introduced for later tutorials)
Real-World Analogy of a Multi-Agent System
A manager (boss) agent that receives user requests
Specialized sub-agents that perform specific tasks
Tools that provide utility functions
Clear separation of responsibilities
Multi-Agent Architecture for This Project
Manager Agent (Router)
Acts as the decision maker
Routes user requests to the correct agent or tool
Never performs tasks directly
Sub-Agents
Joke Agent – handles jokes and humor
Stock Agent – provides stock prices and market data
News Agent – fetches and summarizes news
Tools
Time tool for date and time queries
Folder and Package Structure
Manager folder with routing logic
Sub-agents folder containing specialized agents
Tools folder for reusable utility functions
Use of __init__.py files to enable clean imports and package structure
Designing the Routing Logic
Defining clear instructions for the manager agent
Listing available specialists and their responsibilities
Building a decision-tree style routing strategy
Enforcing strict rules so the manager always delegates work
Ensuring requests are routed correctly based on intent
Understanding Agent vs Tool Usage
Why some components are added as sub-agents
Why certain agents are registered as tools
How built-in tools influence agent composition
In this lecture, we complete the multi-agent system implementation by understanding all the sub-agents and tools in detail and then running the system to see how everything works together in practice. Building on the previous tutorial—where we designed the root manager agent—this session demonstrates how specialized agents collaborate under a single coordinator to deliver accurate, task-specific responses.
You will clearly see how task delegation, agent coordination, and tool execution happen inside an ADK-powered multi-agent application.
What This Lecture Covers
Overview of the Multi-Agent Flow
Role of the manager (root) agent as a coordinator
How user queries are routed to the correct agent or tool
Clear separation of responsibilities between agents
Joke Agent
Purpose: generating original, nerdy, and technical humor
Designed for programming, science, math, and technology topics
Uses creativity from the language model without external tools
Structured response style with optional explanations
Focus on originality, wordplay, and technical references
News Agent
Purpose: researching and summarizing current news and events
Uses Google Search as a built-in tool
Handles time-sensitive queries effectively
Produces structured summaries with headlines, key points, and sources
Designed for accuracy, clarity, and relevance
Stock Agent
Purpose: retrieving and analyzing stock market data
Supports single and multiple stock queries
Provides real-time prices and market context
Includes validation and error handling for invalid ticker symbols
Focuses on clear, professional financial insights
Tools Module
Time-related utilities for date and time queries
Standardized output format for consistency
Acts as lightweight functionality used by the manager
Demonstrates how tools complement agents in a multi-agent system
Understanding Agent and Tool Registration
Why some components are registered as sub-agents
Why certain functionalities are registered as tools
How built-in tools affect agent composition and routing
Running the Multi-Agent System
Launching the application using ADK Web
Interacting with the system using natural language
Observing task delegation in real time
Testing different scenarios:
News queries
Time-related questions
Technical jokes
Stock price requests
Exploring Events and Delegation
How the manager delegates tasks to sub-agents
Understanding event logs for:
Task transfer
Tool execution
Agent responses
Seeing how responses flow back to the manager and then to the user
In this lecture, you will be introduced to the concept of a Stateful Multi-Agent System using the Agent Development Kit (ADK). This is an important milestone where we move from simple, stateless chat interactions to real-world AI systems that remember context, manage state, and coordinate multiple specialized agents.
The session is divided into three logical parts:
Understanding the problem and theory
Exploring the architecture and project structure
Preparing to run the system in the next tutorial
Part 1: Understanding the Problem
Limitations of Traditional Chatbots
Each user message is treated as an isolated interaction
No memory of previous actions or user intent
Example: adding items to a cart and then being unable to retrieve them
Why this happens: lack of persistent state between messages
Real-World E-commerce Requirements
Persistent shopping cart across messages
Order history tracking
Context-aware responses (for example, “add two more”)
Domain-specific knowledge (products, returns, policies)
Support for complex workflows such as browsing, cart management, checkout, and returns
The Solution: Stateful Multi-Agent Architecture
Persistent session-based state
Multiple specialized agents for different tasks
Tool-based actions that modify shared state
Intelligent routing of user requests
Part 2: What Is a Stateful Multi-Agent System?
State Management
The system remembers information across interactions
State survives between messages in a session
Examples: cart items, order history, user details
Multi-Agent Architecture
Different agents handle different responsibilities
Each agent is an expert in its own domain
A routing agent decides which sub-agent should handle each request
Combining Both Concepts
Memory ensures continuity
Specialization ensures clarity and scalability
Together, they form a stateful multi-agent system
Part 3: Project Structure Overview
Root Project Directory
Environment configuration file for API keys
main.py as the application entry point
utils.py containing helper functions for interaction handling
Shopping Assistant Module
Root routing agent that delegates all tasks
No direct tools, only coordination
Sub-Agents
Product Catalog Agent
Read-only access to product information
Handles browsing and product queries
Cart Agent
Manages cart operations
Supports adding, updating, removing, and clearing items
Order Agent
Converts cart into orders
Tracks order status
Returns Agent
Handles product returns
Enforces return policies and updates order state
Shared Session State
User details
Cart items
Order history
Interaction history
Accessible by all agents
System Architecture and Data Flow
Architecture Overview
A shared session state accessible to all agents
A central routing agent coordinating requests
Specialized sub-agents performing actions
Tools modifying the shared state
Data Flow Example
User submits a request
Request is routed by the shopping assistant
Appropriate sub-agent executes logic and tools
State is updated
Response is returned to the user with full context preserved
In this lecture, we move from theory to implementation by deeply understanding the Python code behind the Stateful Multi-Agent system. In the previous tutorial, you learned what a stateful multi-agent system is and why it is needed. In this session, the focus is on how it is built.
We carefully walk through each Python file, explain its responsibility, and show how all components work together to create a realistic online retail assistant that remembers user context, manages a shopping cart, tracks orders, and handles returns.
What This Lecture Covers
1. Main Application Entry Point (main.py)
Role of main.py as the starting point of the application
Use of in-memory session service for maintaining state during a session
Explanation of initial state, including:
User name
Cart items
Order history
Interaction history
How a new session is created with predefined state
Setting up the runner to connect the agent with session management
Interactive conversation loop:
Accepting user input
Handling exit conditions
Sending queries to the agent
Displaying the final session state
Note: You also learn why database-backed session services are recommended for production systems.
2. Utility Functions (utils.py)
Purpose of utility helpers in large agent applications
Terminal formatting using ANSI color codes
Functions for:
Updating interaction history
Storing user queries and agent responses
Displaying the current session state in a readable format
Processing agent response events
Calling agents asynchronously
How utilities keep the main logic clean and readable
3. Environment Configuration (.env)
Managing API keys securely
Separation of configuration from application logic
Best practices for environment-based setup
4. Root Routing Agent – Shopping Assistant
Role of the Shopping Assistant as the main routing agent
Responsibilities:
Understanding user intent
Delegating tasks to specialized sub-agents
Maintaining shared session state
Awareness of:
User information
Cart contents
Order history
Interaction history
Rules for delegation:
When to send queries to cart, product, order, or returns agents
When to ask clarifying questions
Why this agent has no tools and focuses purely on orchestration
5. Cart Agent – Cart Management Logic
Tools for:
Adding items to the cart
Removing items
Updating quantities
Clearing the cart
Product validation using product IDs
Mapping natural language product names to internal IDs
Handling categories such as:
Electronics
Home & Kitchen
Sports & Outdoors
Books & Media
Clear separation between cart logic and product browsing
6. Order Agent – Order Processing
Tools for:
Placing orders from cart items
Retrieving order status
Automatic cart clearing after successful order placement
Order lifecycle handling:
Processing
Shipped
Delivered
Cancelled
Returned
Clear guidance on order history and status tracking
7. Product Catalog Agent – Product Discovery
Read-only agent with no tools
Hardcoded product catalog with:
Categories
Product IDs
Prices
Descriptions
Features
Responsibilities:
Browsing products
Searching items
Recommending products
Clear boundary: does not modify cart
8. Returns Agent – Returns & Refunds
Tool for processing returns
Enforcement of 30-day return policy
Validation against order history
Handling:
Eligible returns
Ineligible return requests
Refund explanations
Structured responses for policy clarification and return processing
In this lecture, we bring the Stateful Multi-Agent System to life by running the application and carefully analyzing the output in the terminal. After understanding the complete Python code in the previous tutorial, this session focuses on practical execution, real-time interaction, and observing how state and multiple agents work together in a realistic e-commerce workflow.
You will see how the system behaves like a real online retail assistant—remembering context, managing a shopping cart, placing orders, and handling returns across multiple interactions.
What This Lecture Covers
Running the Stateful Multi-Agent Application
Launching the application from the terminal
Understanding session creation and session ID
Interpreting startup messages and system prompts
Product Browsing and Discovery
Asking the agent about available products
Browsing product categories such as:
Electronics
Home and Kitchen
Sports and Outdoors
Books and Media
Observing how the Product Catalog Agent responds
Shopping Cart Management
Adding single and multiple products to the cart
Viewing cart contents and total price
Updating product quantities
Removing specific items from the cart
Clearing the entire cart
Watching state changes before and after each action
Order Placement and Tracking
Placing orders from cart items
Receiving order confirmation and order IDs
Viewing complete order history
Fetching details of specific orders
Understanding order statuses such as processing and returned
Returns and Refunds Workflow
Initiating a return request
Validating return eligibility based on the 30-day policy
Processing refunds and receiving confirmation
Understanding refund timelines and instructions
State Visibility and Interaction History
Observing state before and after each user interaction
Tracking:
User information
Cart items
Order history
Interaction history
Understanding how shared state enables context-aware responses
Multi-Agent Coordination in Action
Seeing how the root agent routes requests
Observing different sub-agents being triggered automatically
Understanding how tools modify shared state
Confirming that all agents work together seamlessly
Graceful Session Termination
Exiting the application cleanly
Viewing the final session state summary
Confirming that the agent handled the full workflow correctly
In this lecture, you will begin learning about callbacks in the Agent Development Kit (ADK)—a powerful feature that allows you to observe, customize, and control agent behavior without changing the core SDK code. Callbacks act as hooks that run automatically at specific points during an agent’s execution lifecycle.
In this session, we focus specifically on two important callbacks:
Before Agent Callback
After Agent Callback
These callbacks help you understand what happens just before an agent starts working and right after it finishes responding.
What This Lecture Covers
Introduction to Callbacks
What callbacks are and why they are important in ADK
How callbacks let you hook into an agent’s execution flow
Benefits of callbacks for monitoring, logging, validation, and control
Types of Callbacks in ADK
Overview of all six callback types:
Before Agent
After Agent
Before Model
After Model
Before Tool
After Tool
Scope of this lecture: Before Agent and After Agent callbacks only
Before Agent Callback
When it is triggered in the execution flow
Typical use cases:
Initializing counters or state
Starting timers
Performing pre-run checks
Logging agent start events
How it can read and modify session state
Why it always runs before the agent processes a request
After Agent Callback
When it is triggered
Typical use cases:
Measuring execution time
Logging completion status
Cleaning up resources
Tracking how many questions were answered
How it can inspect final state and results
Why it always runs after the agent finishes processing
Execution Flow Explained
Step-by-step flow:
User input
Before Agent callback
Agent processing
After Agent callback
Final response to the user
Understanding that both callbacks run every time, regardless of the query
Simple Real-World Analogy
Restaurant example:
Before Agent: waiter takes the order
Agent Processing: chef prepares the food
After Agent: waiter delivers the food and closes the order
How this analogy maps to agent execution
State Persistence Across Queries
Maintaining counters such as number of questions asked
Tracking timestamps and response duration
Observing how state evolves across multiple user interactions
Practical Demonstration
Running an agent with before and after callbacks enabled
Asking simple and complex questions
Observing:
Question count increasing
Start and end timestamps
Processing time differences based on query complexity
Understanding how callbacks help analyze agent performance
In this lecture, we move one step deeper into the Agent Development Kit (ADK) callback system by exploring Language Model (LM) interaction callbacks—specifically the Before Model Callback and After Model Callback.
While agent-level callbacks focus on the overall lifecycle of an agent, model callbacks allow you to intercept, validate, and modify interactions directly with the AI model itself. This gives you fine-grained control over what goes into the model and what comes out of it.
What This Lecture Covers
Introduction to Model Callbacks
What LM interaction callbacks are
How they differ from agent lifecycle callbacks
Why they are critical for content moderation, logging, and response shaping
Before Model Callback
When it runs in the execution flow
How it intercepts user input before the AI model is called
Common use cases:
Input validation and filtering
Blocking inappropriate or unsafe language
Logging user requests with metadata such as agent name and timestamp
How a callback can completely stop the model from being called
Example behavior:
Detecting prohibited words
Returning a custom error message instead of calling the model
After Model Callback
When it runs in the execution flow
How it intercepts the AI-generated response
Common use cases:
Modifying tone or wording of responses
Replacing negative language with positive alternatives
Enforcing brand or communication guidelines
How it can override the model’s original response before it reaches the user
Complete Execution Flow Explained
User sends a message
Before Model Callback validates or blocks input
AI model generates a response (if allowed)
After Model Callback modifies the response
Final response is delivered to the user
Practical Demonstration
Running an agent with both callbacks enabled
Testing blocked inputs and observing how the model is skipped
Testing normal inputs and seeing AI responses
Observing how specific words are automatically replaced in the final output
Verifying changes through agent events and logs
Key Design Takeaways
Before Model callbacks are ideal for safety, moderation, and validation
After Model callbacks are ideal for tone control, compliance, and response enhancement
Both callbacks allow behavior changes without modifying the core model logic
In this lecture, we complete the callback journey in the Agent Development Kit (ADK) by focusing on Tool Execution Callbacks—specifically the Before Tool Callback and After Tool Callback.
While agent and model callbacks control when an agent starts or how the model responds, tool callbacks allow you to validate, modify, and enhance tool usage itself. This is extremely powerful when your agents interact with external systems, APIs, or business logic.
What This Lecture Covers
1. Introduction to Tool Callbacks
Understanding where tool callbacks fit in the ADK execution lifecycle
Difference between:
Before Tool Callback – runs before a tool is executed
After Tool Callback – runs after the tool returns a result
Why tool callbacks are essential for safe, reliable, and intelligent agent behavior
2. Before Tool Callback – Validation & Control
Concept explained using a real-world ATM security check analogy
Purpose of the before tool callback:
Validate tool input
Modify arguments if needed
Block unsafe or restricted requests
Normalize user inputs (nicknames, abbreviations, shortcuts)
Practical examples discussed:
Converting abbreviations like NYC into New York
Handling alternate city names
Blocking restricted or classified inputs before the tool executes
Key takeaway:
The tool never runs if the before callback blocks the request
3. After Tool Callback – Enhancing the Result
Explained using the ATM receipt and balance update analogy
Purpose of the after tool callback:
Enhance tool output
Add derived information
Apply business rules
Attach alerts, recommendations, or metadata
Practical examples covered:
Adding “feels like” temperature
Generating heat alerts for high temperatures
Providing safety recommendations after tool execution
Key takeaway:
The core tool result remains intact, but the response becomes richer and more user-friendly
4. End-to-End Tool Callback Flow
Complete execution flow explained step by step:
User asks a question
Agent decides to use a tool
Before Tool Callback validates or modifies inputs
Tool executes with updated arguments
After Tool Callback enriches the response
Final response is returned to the user
Clear explanation of how both callbacks work together in a single request
5. Live Demonstration and Testing
Running the agent using ADK Web
Testing real scenarios:
Input normalization (city abbreviations)
Blocked requests due to restricted inputs
Normal weather queries
Heat alerts for high temperatures
Observing how:
Before tool callback changes inputs
After tool callback enhances outputs
Reviewing agent events to clearly see callback execution
Key Learning Outcomes
By the end of this lecture, you will:
Clearly understand Before Tool and After Tool callbacks
Know when and why to use tool callbacks in real applications
Be able to:
Validate and sanitize tool inputs
Block unsafe or unauthorized requests
Enhance tool responses with business logic
Confidently design agents that interact safely with tools, APIs, and external systems
In this lecture, we begin our journey into Sequential Agents using the Agent Development Kit (ADK). Sequential agents are ideal when tasks must be performed in a fixed order, where each step depends on the output of the previous one. This session focuses on building a strong conceptual foundation before moving on to implementation and execution in later lectures.
What You Will Learn in This Lecture
1. Understanding Sequential Agents
What sequential agents are and how they work in ADK
How they differ from single-agent and parallel-agent approaches
Why execution order is critical in many real-world workflows
2. Real-World Analogy for Clarity
Sequential agents explained using a factory assembly line example:
One worker validates raw material
The next worker builds the product
The final worker packages and labels it
Key takeaway: Each step waits for the previous step to complete
3. Sequential Agent Workflow Example
Introduction to a Blog Post Creation Pipeline
How user input flows through multiple agents in a strict order:
Topic Validator Agent – checks if the topic is valid and usable
Content Generator Agent – creates the blog content
SEO Optimizer Agent – enhances the content with SEO recommendations
How outputs from earlier agents are passed to later agents
4. Key Characteristics of Sequential Agents
Fixed Execution Order
Agents always run in the sequence you define
Data Passing Between Agents
Each agent can access outputs produced by previous agents
Session-Based State Storage
Intermediate results are stored in memory during the session
Output Keys
Each agent saves its result using a unique key
Later agents access results using these keys
5. Folder and Project Structure Overview
Understanding the layout of the blog post pipeline project:
Root sequential agent
Sub-agents for validation, content generation, and SEO optimization
Configuration and documentation files
How ADK loads and connects these components at runtime
6. Execution Flow Explained
How the sequential agent is initialized and executed
Step-by-step breakdown of how user input moves through:
Topic validation
Content generation
SEO optimization
How final output is assembled and presented to the user
7. Data Flow and State Management
How session state acts as shared memory across agents
How results persist throughout the workflow
Why state persistence is essential for multi-step pipelines
8. When to Use (and Not Use) Sequential Agents
Best use cases:
When task order matters
When each step depends on the previous one
When you want predictable, repeatable workflows
When responsibilities should be clearly separated
When to avoid sequential agents:
When tasks can run independently
When parallel execution is more efficient
When a single agent can handle the entire task without dependencies
In this lecture, we move from theory to practice by deeply understanding the Python implementation of Sequential Agents using the Agent Development Kit (ADK). Building on the concepts covered in the previous lecture, we now explore how a real-world, multi-step AI pipeline is implemented and executed.
The focus of this session is a Blog Post Pipeline, where multiple agents work in a fixed, sequential order to validate a topic, generate content, and optimize it for SEO.
What You Will Learn in This Lecture
1. Overview of the Sequential Blog Post Pipeline
How the root (route) agent orchestrates a three-step workflow
Understanding the purpose of each stage:
Topic validation
Content generation
SEO optimization
Why sequential execution is ideal for content pipelines
2. Root Sequential Agent (Pipeline Orchestrator)
Role of the root agent as the controller of execution order
How the sequential agent ensures:
Step 1 always runs before Step 2
Step 2 always runs before Step 3
How the pipeline description and agent order define the workflow
3. Topic Validator Agent
Purpose of topic validation before content creation
What the agent checks:
Topic clarity and specificity
Target audience definition
Content depth and relevance
Uniqueness and saturation level
How validation results are stored using an output key
Why output keys are critical for passing data to later agents
Examples of:
Valid topics with clear scope
Invalid topics that are too broad or unclear
4. Content Generator Agent
How this agent uses:
The original user input
The topic validation result from the previous agent
Structure of generated content:
Engaging title
Clear introduction
Well-organized body sections
Strong conclusion
Writing guidelines enforced by the agent
How generated content is saved into session state using a dedicated output key
5. SEO Optimizer Agent
Role of the final agent in the pipeline
How it analyzes:
Topic validation output
Generated blog content
SEO elements covered:
Focus keywords
Related keywords
Meta title
Meta description
SEO optimization suggestions
How SEO recommendations are stored and presented using an output key
6. Understanding Output Keys and State Flow
How each agent writes results into session memory
How later agents read earlier outputs using output keys
Viewing and understanding:
Topic validation results
Blog content draft
SEO recommendations
Why session state persistence is essential for sequential workflows
7. Running the Sequential Agent and Viewing Results
Executing the pipeline using ADK Web
Observing:
Agent-by-agent execution
Output produced at each stage
Testing different scenarios:
A valid, well-defined blog topic
A broad and invalid topic
A clearly scoped technical topic with a defined audience
How all agents still execute in sequence, even when the topic is invalid
Understanding how validation impacts later agent behavior
In this lecture, we explore Parallel Agents in the Agent Development Kit (ADK) and learn how they enable multiple agents to work simultaneously on the same user input. This session covers the complete journey—from core concepts to code structure and finally running the agent to observe real outputs.
What You Will Learn in This Lecture
1. What Are Parallel Agents?
Understanding parallel agents as workflows where multiple sub-agents execute concurrently
How parallel execution differs from sequential execution
Why parallel agents are ideal when tasks are independent of each other
We begin by reviewing the official ADK documentation to understand:
Independent execution of sub-agents
Shared session state
How results are collected after all agents finish
2. Real-World Analogy for Parallel Agents
Parallel agents explained using a restaurant kitchen analogy
Multiple chefs preparing different dishes at the same time
No waiting for one task to finish before starting another
Comparison with sequential workflows using a house construction example
Why parallel execution drastically reduces total processing time
3. When to Use (and Not Use) Parallel Agents
Use parallel agents when:
Tasks are independent
All agents can work with the same user input
Speed and efficiency are important
You want to generate multiple outputs simultaneously
Avoid parallel agents when:
One task depends on another task’s output
Execution order matters
Tasks may conflict with each other
4. Demo Overview: Multi-Channel Content Creator
In this lecture, we build a Multi-Channel Content Creator using parallel agents.
From a single content topic provided by the user, five agents generate content at the same time:
Blog post
LinkedIn post
Twitter (X) thread
Instagram caption
Email newsletter
All agents run concurrently and return platform-specific content.
5. Project and Folder Structure
Root (route) agent responsible for orchestration
Sub-agents organized by content platform
Supporting files such as environment configuration and documentation
This structure clearly separates responsibilities while enabling parallel execution.
6. Root Parallel Agent (Orchestrator)
How the Parallel Agent class is used instead of a sequential agent
How the root agent launches all sub-agents at the same time
How a single user input is broadcast to all agents
Why total execution time depends on the slowest agent, not the sum of all agents
7. Sub-Agents Breakdown
Each sub-agent is designed for a specific platform and works independently:
Blog Post Agent
Generates long-form blog content with proper structure
LinkedIn Post Agent
Creates professional, thought-leadership-focused posts
Twitter (X) Thread Agent
Produces concise, multi-tweet threads within character limits
Instagram Caption Agent
Generates engaging captions with storytelling and hashtags
Email Newsletter Agent
Creates subscriber-friendly emails with subject lines, previews, and CTAs
Each agent:
Uses clear instructions tailored to its platform
Stores its output using a dedicated output key
Writes results to shared session state
8. Understanding State and Output Keys
How each agent saves its result independently
How outputs are stored in session state using output keys
How you can later retrieve or reuse results from specific agents
9. Running the Parallel Agent and Observing Output
Starting the agent using the ADK Web interface
Submitting a single content topic
Watching all five agents execute in parallel
Viewing multiple platform-specific outputs generated at once
Verifying outputs through session state and output keys
You will clearly see that:
All agents run simultaneously
Each agent produces a different format of content
Results are returned together, not step-by-step
In this lecture, we dive deep into Loop Agents in the Agent Development Kit (ADK) and understand how they enable iterative workflows where agents repeatedly run until a defined goal or condition is met. This session combines theory, architecture, code walkthrough, and a live demo to give you a complete and practical understanding of loop-based agent workflows.
What This Lecture Covers
1. What Are Loop Agents?
A loop agent is a workflow agent that repeatedly executes its sub-agents
Execution continues:
Until a termination condition is met, or
A maximum number of iterations is reached
Ideal for scenarios where progressive improvement is required
We begin by reviewing the official ADK documentation to understand:
How loop agents work internally
How sub-agents are executed repeatedly
How loop termination is handled
2. Real-World Analogy for Loop Agents
To simplify the concept, we use easy-to-understand analogies:
Editing a document until it reaches acceptable quality
Learning to ride a bike, where repeated attempts lead to success
Quality control processes, where checks are repeated until standards are met
These examples clearly explain why loop agents are powerful for refinement-based tasks.
3. Demo Use Case: Iterative Essay Improver
In this tutorial, we build an Iterative Essay Improver that:
Takes user-written text
Reviews its quality
Improves it step by step
Stops automatically when the quality score reaches a defined threshold
This demonstrates how loop agents handle review → improve → re-review cycles efficiently.
4. Key Characteristics of Loop Agents
Repeated execution of the same agents
Exit conditions based on logic or tool invocation
Progressive improvement across iterations
Maximum iteration limit to prevent infinite loops
Exit tools that explicitly stop the loop when criteria are met
5. Project and Folder Structure
Single-package design for simplicity
One main agent file containing:
Sub-agents
Loop logic
Exit tools
Environment configuration and documentation files
Unlike earlier projects, all agents are defined in one file to clearly demonstrate loop behavior.
6. Architecture and Workflow Flow
You will understand:
How user input initializes session state
How content flows through multiple iterations
How session state is updated after every loop
How final output is produced once exit conditions are satisfied
This section explains:
Initialization phase
Refinement loop phase
Termination phase
7. Sub-Agents Inside the Loop
The loop agent uses multiple specialized agents:
Content Initializer
Captures user input exactly as provided
Content Reviewer
Scores the content (1–10)
Provides structured feedback
Content Improver
Improves content based on review feedback
Decides whether to continue or exit the loop
Each agent plays a specific role in the iterative refinement process.
8. Exit Condition and Exit Tool
The loop stops when:
Content quality score reaches 8 or higher, or
Maximum iterations (e.g., 5) are completed
A dedicated exit tool is used to break the loop cleanly
This ensures predictable, controlled execution
9. Running the Loop Agent and Observing Output
Launching the agent using the ADK Web interface
Providing simple and complex text inputs
Watching quality scores improve across iterations
Seeing the loop terminate automatically when criteria are met
Inspecting:
Agent events
Session state
Final refined content
10. Comparing Workflow Agents
At the end of the lecture, we review a workflow comparison guide to clearly distinguish:
Sequential agents – step-by-step pipelines
Parallel agents – simultaneous independent tasks
Loop agents – repeated refinement until success
This comparison helps you confidently choose the right workflow agent for your real-world use cases.
In this lecture, you will begin your journey into understanding the Vertex AI RAG (Retrieval-Augmented Generation) Engine and its role in building more accurate and reliable AI applications using Google Cloud.
We start by exploring the official Vertex AI RAG Engine documentation to understand what the RAG Engine is and why it is needed when working with Large Language Models (LLMs).
What You Will Learn in This Lecture
What the Vertex AI RAG Engine is and why it is important
How RAG improves the accuracy and reliability of LLM responses
The problem of private and organizational data with standard LLMs
The high-level architecture and workflow of the RAG Engine
Key concepts involved in the Retrieval-Augmented Generation process
Supported regions and platform availability
Where the RAG Engine fits within Vertex AI and Agent Builder
Understanding the Vertex AI RAG Engine
Large Language Models are trained on public and general data, which means they do not have access to your organization’s private or internal information. The Vertex AI RAG Engine solves this limitation by allowing you to combine your own data with the model’s general knowledge.
By adding relevant organizational data to the model’s context:
The quality of responses improves
Hallucinations (incorrect answers) are reduced
Answers become more relevant, accurate, and context-aware
How the Vertex AI RAG Engine Works
This lecture explains the end-to-end workflow of the RAG Engine using the official architecture diagram from Google’s documentation.
You will understand two major flows:
1. Document Ingestion Flow
This is how your data is prepared and stored:
Source parsing
Data transformation
Embedding creation
Indexing
Storage in a vector database
2. Query and Retrieval Flow
This is how the model uses your data to answer questions:
Query parsing
Retrieval from the vector database
Ranking of relevant data
Sending enriched context to the LLM
Final response generation
These steps together form the Retrieval-Augmented Generation pipeline.
Core Concepts Covered
The lecture introduces the RAG process in a structured and logical order:
Data ingestion
Data transformation
Embedding generation
Indexing
Retrieval
Response generation
This sequence helps you clearly understand how data flows from your documents to the final AI-generated response.
In this lecture, you will move from theory to practice by configuring the Vertex AI RAG (Retrieval-Augmented Generation) Engine and creating your first corpus. This session builds directly on the previous tutorial, where you learned the theoretical concepts behind the RAG Engine.
The focus of this lecture is to help you understand how to set up the RAG Engine inside Google Cloud and ingest your own data so it can later be used by Large Language Models for context-aware responses.
Region Selection for RAG Engine
The lecture begins with selecting the region where the RAG Engine will be configured. Since the RAG Engine is available only in specific locations, you will see how to:
Refer to the official documentation
Identify supported regions
Select a valid region (such as US East 4) for configuration
Configuring the RAG Engine Tier
You will then learn how to configure the RAG Engine tier, which determines performance and cost:
Scaled Tier
Designed for production-grade workloads
Supports auto-scaling
Suitable for large datasets and performance-sensitive applications
Basic Tier
Cost-effective and low compute option
Ideal for experimentation and small datasets
Suitable for latency-insensitive workloads
Useful when working with external vector databases
For this tutorial, the Basic Tier is selected to demonstrate a practical and economical setup.
Creating a Corpus
After configuring the RAG Engine, the next step is to create a corpus, which is a structured collection of documents used by the RAG Engine.
You will learn how to:
Provide a meaningful corpus name and description
Choose the data source for upload
Upload documents directly from your local system
Ingest a PDF document as part of the corpus
Advanced Configuration Options Explained
This lecture also introduces important advanced settings, including:
Chunking Strategy
Chunk size
Chunk overlap
Embedding request limits
These settings control how documents are broken into smaller, searchable pieces. For simplicity, default values are used in this tutorial.
Layout Parsing
You will understand how the layout parser:
Extracts structured content from documents
Creates context-aware chunks
Improves retrieval accuracy for generative AI applications
Default parsing options are selected to keep the setup simple and effective.
Configuring the Vector Store
Finally, you will configure the vector store, which is responsible for storing and retrieving document embeddings.
Topics covered include:
Selecting an embedding model
Understanding multilingual and general-purpose embedding options
Choosing a managed vector database provided by Vertex AI
Once these configurations are complete, the corpus creation process begins.
Verifying the Corpus
After the corpus is created, you will learn how to:
Navigate to the RAG Engine dashboard
Verify the corpus status
Confirm uploaded files
Review embedding model and vector store details
This ensures that your data is fully ingested and ready for querying.
In this lecture, you will learn how to test and query the corpus that you created in the previous tutorial using the Vertex AI RAG Engine. This session focuses on validating whether your data is being correctly retrieved and used by a Large Language Model to generate accurate, grounded responses.
You will perform all testing using Vertex AI Studio, which provides an interactive environment to query your data without writing any code.
Accessing the Corpus for Testing
The lecture starts by showing multiple ways to test the corpus:
Using the Test option directly from the RAG Engine
Opening the corpus and selecting Test in Vertex AI Studio
Both options lead you to the same interactive testing environment.
Using Vertex AI Studio for RAG Queries
Inside Vertex AI Studio, you will:
Select a Large Language Model (such as Gemini 2.5 Flash Preview)
Learn that the model can be switched based on your needs
Understand output settings such as structured output
Explore the Tools section and enable the RAG Engine grounding option
This ensures that the model answers are generated using your uploaded data rather than general knowledge alone.
Configuring Generation Parameters
You will also explore key generation settings that control response behavior:
Region selection for RAG processing
Temperature for response creativity
Output token limits
Seed and Top-P values for response consistency
Default values are used in this tutorial to keep the focus on understanding the workflow.
Querying the Corpus with Real Examples
To test the setup, you will ask natural language questions based on the uploaded PDF document, such as:
Questions related to company policies
Queries based on employee tenure
You will see how:
The model retrieves relevant information from the corpus
Answers are generated with clear explanations
The response is grounded in your uploaded document
Understanding Grounded Responses
One of the key highlights of this lecture is learning how to verify the correctness of responses. For each answer, you will see:
The source document used for retrieval
Extracted text references
A confidence score indicating response reliability
This helps ensure transparency and trust in AI-generated outputs.
Saving Prompts and Reusing Queries
You will learn how to:
Save prompts along with region settings
Reuse them for repeated testing or demonstrations
This is useful when building consistent AI workflows or demos.
From Testing to Application Development
Towards the end of the lecture, you will explore how to move beyond testing:
Viewing auto-generated application code
Understanding how the solution can be deployed using Cloud Run
Opening a managed notebook environment for further experimentation
This bridges the gap between interactive testing and real-world application development.
In this lecture, you will begin building a Retrieval-Augmented Generation (RAG) Agent by first understanding all the theoretical foundations required for the project. This tutorial focuses on concept clarity, architecture understanding, and project structure, which are essential before moving to implementation.
The RAG agent you will build is designed to answer user questions by searching and retrieving information from user-provided documents, making responses more accurate and context-aware.
Understanding RAG in Simple Terms
RAG stands for Retrieval-Augmented Generation.
You can think of it as an open-book exam for AI:
Without RAG, the AI can answer only based on what it learned during training
With RAG, the AI can look into documents before answering
This is similar to using notes or reference material during an exam
This approach helps the AI give more accurate, relevant, and trustworthy answers.
Key RAG Terminology Explained
This lecture introduces important concepts you must know before building a RAG agent:
Corpus
A collection of documents stored together, like books on a library shelf
Embedding
Converting text into numerical form so machines can understand and compare it
Chunking
Breaking large documents into smaller, manageable pieces
Vector Search
Finding similar pieces of text by comparing embeddings
Agent
An AI entity that uses tools to perform tasks such as managing documents and answering questions
Tools
Functions that the agent can call to perform actions like creating a corpus, adding data, or querying documents
How a RAG Agent Works
The lecture explains the complete RAG workflow:
A user asks a question
The agent searches through uploaded documents
Relevant document chunks are identified
The agent uses these chunks as context
The AI generates a final, informed response
This process ensures that answers are grounded in actual data rather than guesses.
RAG Agent Architecture Overview
You will understand the high-level architecture of the system:
User interacts through a web interface
The ADK framework manages agent execution
The agent uses multiple tools to perform tasks
Tools communicate with the Vertex AI RAG service
Vertex AI handles document storage, embeddings, and vector search
This layered architecture keeps the system modular, scalable, and easy to manage.
Project Folder Structure Explained
The lecture walks through the complete project structure and explains the purpose of each component:
Main project folder for the RAG agent
Configuration files for environment and settings
Dependency management files
Documentation files
Virtual environment setup
Agent definition files
Configuration files for chunking and API behavior
Tool definitions for corpus and document management
Utility files containing helper functions
This structure helps maintain clean, organized, and production-ready projects.
Sample Workflows Covered
You will explore real-world workflows such as:
Creating a Corpus
User requests corpus creation
Agent calls the appropriate tool
Tool checks for existing corpus
Corpus is created in Vertex AI
Agent confirms success
Adding Documents
User provides a document link
Agent validates the input
Tool configures chunking and ingestion
Vertex AI processes and stores the data
Agent responds with status
Querying Documents
Agent searches relevant chunks
Retrieves context from the vector database
Generates accurate answers
State Management in the RAG Agent
The lecture also explains state management, which acts as shared memory across tools:
Tools write information to the state
Agent reads from the state when needed
Helps track created corpora, uploaded documents, and query context
This ensures smooth coordination between agent actions and tools.
In this lecture, you will complete the installation and setup required to start working on the RAG Agent project. Building on the previous tutorial where you understood the project overview, this session focuses on preparing your local development environment and Google Cloud configuration.
By the end of this lecture, your system will be fully ready to run and develop the RAG agent.
Step-by-Step Setup Overview
This lecture follows a clear and structured setup process consisting of six main steps, making it easy to follow along.
Step 1: Installing the Google Cloud CLI
You will start by installing the Google Cloud CLI using the official documentation.
The installer supports:
Windows
macOS
Linux distributions
The lecture demonstrates the Windows installation process and explains how to verify that the CLI is installed correctly.
Step 2: Initializing the Google Cloud CLI
Once the CLI is installed, you will:
Initialize the Google Cloud CLI
Select or reinitialize a configuration
Choose the Google account to use
Select the default Google Cloud project
This step connects your local machine to your Google Cloud environment.
Step 3: Setting Up Application Default Credentials
Next, you will authenticate your local environment using Application Default Credentials. This allows your application to securely access Google Cloud services without hardcoding credentials.
You will:
Authenticate using your Google account
Set the selected project as the default project
This step ensures seamless communication between your application and Google Cloud services.
Step 4: Enabling Required APIs
To work with the Vertex AI RAG Engine, certain Google Cloud APIs must be enabled.
In this lecture, you will enable the necessary services to ensure that all Vertex AI features work correctly.
Step 5: Creating a Python Virtual Environment and Installing Dependencies
To keep the project clean and isolated, you will:
Create a Python virtual environment
Activate the virtual environment
Install all required Python dependencies listed in the project
This ensures consistent behavior across different systems and avoids dependency conflicts.
Step 6: Configuring Environment Variables
Finally, you will edit the environment configuration file to:
Specify the Google Cloud project ID
Define the Vertex AI region
Enable Vertex AI usage for Generative AI
You will also learn how to:
Choose a supported region for the Vertex AI RAG Engine
Refer to official documentation for supported regions
Switch regions if quota or availability issues occur
Important Notes on Region Selection
The lecture explains how:
Vertex AI RAG Engine is available only in specific regions
Preview regions can be used for experimentation
Alternative regions can be selected if issues arise
This knowledge helps prevent common configuration problems.
In this lecture, you will take a deep dive into the code structure of the RAG Agent project. After completing the installation and setup in the previous tutorial, this session focuses on understanding how the agent is designed, how different components interact, and what role each file plays.
The goal of this lecture is to help you clearly understand the architecture and logic of the RAG agent before running it.
Project Structure Overview
The lecture begins by walking through the main project folder, which includes:
A Python virtual environment used to isolate dependencies
A project folder that contains all RAG agent logic
Documentation and dependency files
Each folder and file is explained so you understand why it exists and how it fits into the overall system.
Understanding the Main Agent Definition
You will start by exploring the agent definition file, which acts as the brain of the RAG system.
In this part, you will learn:
How the main agent is created using the ADK framework
How a Large Language Model is assigned to the agent
How the agent is given a clear role and responsibilities
How different tools are attached to the agent
How instructions guide the agent’s behavior
You will also understand how the agent decides:
When to manage document corpora
When to answer user knowledge-based questions
Which tool to use for each type of request
Communication and Tool Usage Guidelines
The lecture explains how the agent is guided to:
Respond clearly and concisely
Use the correct tool for each task
Confirm destructive actions like deletions
Focus on helping users manage and query documents effectively
These guidelines ensure predictable and safe agent behavior.
Initialization and Vertex AI Connection
Next, you will explore the package initialization file, which plays a critical startup role.
You will understand how:
Environment variables are loaded
Google Cloud project and region are read
Vertex AI is initialized at startup
The agent is made discoverable to the ADK web framework
This file acts as the bridge between your local project and Google Cloud.
Configuration Management
The lecture then covers the central configuration file, which contains all adjustable settings.
Key configurations explained include:
Google Cloud project and region settings
Chunk size and overlap values
Retrieval parameters such as top-K and distance threshold
Embedding model selection
Rate limits for embedding requests
This file allows you to control RAG behavior without modifying core logic.
Understanding the Tools Layer
A major part of the lecture focuses on the tools used by the agent. Each tool has a clear responsibility, such as:
Creating a new corpus
Adding documents to a corpus
Deleting a corpus
Deleting individual documents
Listing available corpora
Fetching corpus details
Querying a corpus with user questions
You will understand how each tool:
Validates user input
Interacts with Vertex AI RAG services
Updates shared state
Returns meaningful responses to the agent
Utility Functions and Shared Logic
Finally, you will explore the utility module, which contains helper functions used across multiple tools.
These utilities help with:
Resolving corpus resource names
Checking whether a corpus exists
Tracking the currently active corpus
This shared logic keeps the project clean, reusable, and easy to maintain.
In this lecture, you will run the Vertex AI RAG Agent that you built in the previous tutorials and test all its capabilities in action. After understanding the code and architecture, this session focuses on executing the agent, interacting with it through a web interface, and validating that each tool works as expected.
By the end of this lecture, you will see a complete end-to-end RAG workflow, from creating a corpus to querying documents and cleaning up resources.
Starting the RAG Agent
The lecture begins by:
Opening the terminal
Confirming that the Python virtual environment is active
Running the ADK web server from the project directory
Once the server starts, you will open the provided local URL in a browser and select the Vertex AI RAG Agent.
First Interaction with the Agent
You will test the agent with a simple greeting to confirm:
The agent is running correctly
The agent introduces itself clearly
The agent explains its capabilities, such as:
Creating and managing corpora
Adding documents
Answering questions using uploaded documents
This confirms that the agent is fully operational.
Testing Individual RAG Tools
The lecture then walks through testing each tool step by step:
Creating a Corpus
You will ask the agent to create a new corpus
The agent calls the Create Corpus tool
You will verify the corpus creation in the Vertex AI Console
Adding a Document
A document stored in Google Drive is shared and added to the corpus
The agent calls the Add Data tool
You will confirm that the document is ingested and indexed
Embedding model and vector store details are verified
Querying the Corpus
You will ask a question based on the document’s content
The agent retrieves relevant chunks
A grounded and accurate answer is returned
The response is validated against the original PDF
Listing and Inspecting Corpora
You will then test:
List Corpora to view all available corpora
Get Corpus Info to retrieve detailed information such as:
Corpus name
Number of documents
File names inside the corpus
This helps confirm that the agent can inspect and manage stored data.
Deleting Documents and Corpora Safely
The lecture demonstrates safe deletion workflows:
Deleting a specific document from a corpus
Confirming deletion before execution
Verifying document removal in the Vertex AI Console
You will also:
Delete the entire corpus
Confirm the action
Verify that no corpora remain in the account
This shows responsible resource management using the agent.
Executing a Complete Workflow with a Single Prompt
Finally, you will see a powerful example where:
A corpus is created
A document is added
A query is answered
—all within a single user prompt.
The agent automatically:
Selects the correct tools
Executes them in the right order
Returns a final, accurate answer
This highlights the true power of tool orchestration using a RAG agent.
In this lecture, you will get a clear and beginner-friendly introduction to Vertex AI Search, one of the powerful services offered by Google Cloud for building AI-enabled search and recommendation systems.
We begin by understanding what Vertex AI Search actually is, directly referring to the official Google documentation. You will learn how Vertex AI Search allows developers—even those with limited machine learning knowledge—to leverage Google’s foundation models, search expertise, and recommendation systems to build enterprise-grade generative AI applications.
The lecture then explains Vertex AI Search in very simple words, helping you clearly understand:
What problem Vertex AI Search solves
How it differs from traditional keyword-based search
How it uses advanced AI and large language models to understand user intent, not just keywords
You will learn how businesses can use Vertex AI Search with their own private data, such as:
Internal documents
Website content
Product catalogs
Knowledge bases
This makes it possible to build:
Smart enterprise search systems
Conversational Q&A applications
AI-powered chatbots for websites or internal tools
To strengthen your understanding, the lecture also walks through a simplified definition of Vertex AI Search and explains how it is powered by the same advanced AI technologies behind Google Search.
Key Topics Covered in This Lecture
What is Vertex AI Search and why it is important
Use cases for AI-enabled search and recommendations
Difference between traditional search and AI-powered search
Types of data that can be indexed and searched
High-level overview of responsible AI, data governance, and generative AI concepts
Official categories within Vertex AI Search:
Custom Search
Site Search with AI Mode
Media Search
Search for Commerce
Healthcare and industry-specific search
Conversational agents and chat assistants
Recommendation engines for media and retail
Vertex AI Search UI Walkthrough
In the second part of the lecture, you will explore the Vertex AI Search user interface inside the Google Cloud Console. You will see:
Where Vertex AI Search is located within Agent Builder
How to navigate to the AI Applications section
The different types of applications you can build using Vertex AI Search
Categories such as Search & Assistants, Conversational Agents, and Recommendations
A quick walkthrough of managing existing test applications in the console
This UI exploration helps you become familiar with the platform before moving into hands-on implementation.
In this lecture, you will learn how to create data stores in Vertex AI Search, which is a very important step before building any AI-powered search, chatbot, or recommendation application.
In the previous lecture, we covered the theoretical introduction to Vertex AI Search. In this tutorial, we move one step forward and focus on setting up data stores, which act as the knowledge source for Vertex AI Search applications.
What Is a Data Store in Vertex AI Search?
A data store is where you connect and index your data so that Vertex AI Search can:
Search information
Generate grounded answers
Power conversational AI
Support recommendations
Without a data store, Vertex AI Search cannot retrieve or understand your data.
Navigating to Data Stores in the Google Cloud Console
In this lecture, you will see:
How to access Vertex AI Search from the Agent Builder
How the “See All” option redirects you to the AI Applications page
An overview of the Apps, Data Stores, Monitoring, and Settings sections
How to manage existing data stores, including deleting old ones
Understanding Data Source Categories
Before creating a data store, you must choose a data source. Vertex AI Search supports multiple categories:
1. Cloud Sources
These allow you to import data directly from Google Cloud services:
Website content (public websites)
BigQuery
Cloud Storage
Healthcare API and other cloud-based sources
2. Workspace Sources
These connect data from Google Workspace:
Google Drive
Gmail
Google Calendar
Other Workspace services
3. Third-Party Sources
These integrate external platforms such as:
Microsoft Entra ID
Confluence Cloud
Dropbox
OneDrive
ServiceNow
SharePoint
Creating a Data Store Using Website Content
You will learn how to:
Select Website Content as a data source
Understand advanced website indexing and its purpose
Specify valid URL patterns for indexing
Avoid common URL format mistakes
Control inclusion and exclusion of website pages
Configure indexing and refresh options
Name the data store and choose region and AI settings
This section shows how Vertex AI Search can crawl and index public websites, making them searchable through AI.
Exploring Website Data Store Settings
After creation, you will explore:
Data store ID and type
Region and serving state
Language selection
Included URL patterns
Generative AI inclusion or exclusion
Upgrade and edit options
This helps you understand how website-based data stores are structured and managed.
Creating a Data Store Using Cloud Storage (PDF Documents)
Next, the lecture demonstrates how to create a data store using Google Cloud Storage (GCS):
Key Concepts Covered
Creating a GCS bucket
Uploading unstructured documents such as PDF files
Selecting unstructured data for import
Choosing synchronization frequency
Selecting folders instead of individual files
Assigning a data store name and region
Document Processing and Configuration
You will also understand:
Document parsing options (digital parser, OCR, layout parser)
Default document processing behavior
Chunking and generative AI options
How documents are processed and indexed in the background
Viewing import status, events, and processing activity
This explains how Vertex AI Search prepares documents internally for search and retrieval.
Reviewing the Created Data Stores
By the end of the lecture, you will have created and explored:
A website-based data store
A cloud storage-based data store using PDF files
You will also see how to:
Monitor document processing
Check import status and timestamps
Review processing configurations
In this lecture, you will learn how to create a Custom Search application using Vertex AI Search, by using the data stores that we created in the previous tutorials.
Earlier, we successfully created two data stores:
Company Policy Data
Wikidata
In this tutorial, we will use the Wikidata data store to build a fully functional Custom Search application and explore its features in detail.
Understanding Application Types in Vertex AI Search
Before creating the application, this lecture explains the different categories of applications available in Vertex AI Search:
1. Search and Assistant Applications
Gemini Enterprise Custom Search
Custom Search for Healthcare Data
Site Search with AI Mode
Media Search
Search for Commerce
2. Conversational Agents
Conversational AI applications
Chat-based assistants
3. Recommendation Systems
Media Recommendations
Retail Recommendations
For this tutorial, we focus on Custom Search, which is designed to build tailored search and generative experiences on structured and unstructured data.
Creating the Custom Search Application
You will learn how to:
Select Custom Search as the application type
Understand the supported data sources such as public websites and unstructured documents
Configure the search application settings
Enable enterprise-level features like:
Extractive answers
Image search
Website search
Core generative and summarization features
Provide application-level details such as:
Application name
Organization or company name
Deployment location
Connecting the Data Store
In this step, you will:
Attach the Wikidata data store to the application
Complete the application creation process
Verify that the application has been created successfully
This step connects your indexed data to the search engine, making it searchable.
Exploring the Application Overview
After creating the application, the lecture walks through the application overview dashboard, which is divided into multiple functional areas:
System Overview
Prepare (synonyms and autocomplete)
Retrieve (connected data stores)
Signal (boosting, filtering, and promotion)
Serve (safe search and result delivery)
UI, Tuning, and Application Settings
User experience configuration
Search behavior tuning
Model response settings
Previewing and Testing the Search Application
You will test the application using:
Desktop view
Mobile (smartphone) view
By running sample search queries, you will see how:
Search results are retrieved from Wikipedia
The application responds quickly and accurately
The same experience works across different device types
This confirms that the custom search application is working as expected.
Data and Ranking Insights
The lecture also introduces data quality and ranking signals, including:
Tier 1: Popularity-based ranking
Tier 2: Predicted Click-Through Rate (CTR)
Tier 3: Personalized CTR prediction
Since the application is newly created, you will understand why ranking data is not yet available and how it becomes visible after real user interactions.
Configuration Options Explained
You will explore key configuration areas, such as:
Autocomplete behavior and suggestion rules
Query matching options
User interface settings
Feedback and user event collection
Safe search configuration
Base model response tuning
Controls for filtering and promoting results
These settings allow you to fine-tune the search experience based on your requirements.
Integration Options: Widget and API
The lecture explains how this search application can be integrated into real systems using:
Search Widget for easy web integration
API-based integration for backend or server-side use cases
You will understand when to use a widget and when an API-based approach is more suitable.
Analytics Overview
Finally, the lecture introduces the Analytics section, where you can track:
Searches
User visits
Search performance comparisons
You will also learn why analytics data is not immediately available and how it becomes visible after the application is used over time.
In this lecture, you will learn how to create and use a Gemini Enterprise application using Vertex AI Search. This is a powerful, enterprise-grade AI assistant that combines search, chat, content generation, and task automation into a single interface, powered by Google’s Gemini models.
In the previous tutorial, we created a Custom Search application. In this session, we take the next step and build a Gemini Enterprise application, connect it with enterprise data, and explore its full capabilities.
What Is Gemini Enterprise?
Gemini Enterprise is an enterprise-compliant AI search and assistant solution that allows organizations to:
Find answers from large volumes of company data
Chat with AI grounded on internal documents
Generate content using Gemini models
Perform actions using connected applications
Use AI agents for research, ideation, and analysis
All of this is available from one unified interface.
Creating the Gemini Enterprise Application
In this lecture, you will see how to:
Navigate to Vertex AI Search → AI Applications
Select Gemini Enterprise as the application type
Understand its purpose and capabilities
Configure basic application details such as:
Application name
Deployment region (multi-region)
Optional organization details
Create the application successfully
Understanding Data Store Compatibility
After creating the application, the lecture highlights an important concept:
Not all data stores are supported by every application type
Existing data stores may not appear automatically in Gemini Enterprise
To solve this, you will learn how to:
Identify supported data store types
Create a new Cloud Storage-based data store
Import unstructured PDF documents
Understand document content such as company policies
Verify successful data store creation
Connecting Data Stores to Gemini Enterprise
You will then:
Connect the newly created Company Policies data store
Understand indexing and grounding delays
Learn why search results may take time to reflect changes
See best practices when working with enterprise data
Previewing the Gemini Enterprise Experience
The lecture provides a detailed walkthrough of the Gemini Enterprise preview interface, including:
Chat-based interaction with enterprise data
Content generation powered by Gemini
Using connected data sources for grounded answers
Initiating new chats and searches
Exploring the built-in content library
Generative Capabilities: Image and Content Creation
You will also explore:
Image generation using Google’s Nano Banana model
Creating images from text prompts
Viewing generated images and videos in the library
Understanding how Gemini Enterprise supports creative workflows
Exploring Built-In Google Agents
Gemini Enterprise comes with Google-made AI agents, which are demonstrated in this lecture:
Notebook LLM
Upload documents or provide text sources
Chat and analyze content based on uploaded material
Use advanced features like audio overview and deep-dive conversations
Idea Generation Agent
Run multi-agent brainstorming sessions
Generate structured ideas for business and enterprise use cases
Review goals, requirements, and attributes interactively
Deep Research Agent
Generate AI-driven research reports
Analyze complex topics
Produce structured insights and summaries
Configuration and Customization Options
The lecture also explores advanced configuration options, including:
Autocomplete behavior and search UI settings
Feedback and user event collection
Controls for filtering and result management
Assistant system instructions
Web grounding and enterprise search grounding
Knowledge Graph configuration (private and Google Cloud)
Feature management for end users
Model availability and feature toggles
Actions, Agents, and Integrations
You will learn how to:
Add actions by connecting external tools such as Gmail and Calendar
Configure action sources for task execution
Add custom agents using different agent frameworks
Set up identity and workforce authentication for full access
Analytics and Insights
The lecture introduces the Analytics dashboard, covering:
Adoption metrics (active users, retention, growth)
Usage and quality feedback
Agent performance insights
Estimated business value and time saved
Interaction success rates and timelines
These analytics help measure the real business impact of Gemini Enterprise.
Verifying Grounded Responses
Finally, you will see a real example where:
Gemini Enterprise answers questions based on company policy documents
Responses are verified against the original PDF content
The assistant provides accurate, grounded answers from enterprise data
In this lecture, we begin our journey into Vector Search using Vertex AI, a powerful and enterprise-grade search technology developed by Google. This tutorial focuses on building a strong conceptual foundation before moving into practical demos in upcoming lectures.
We start by understanding the theoretical concepts behind vector search and then explore how these concepts are represented in the Vertex AI user interface.
What Is Vector Search?
Vector Search in Vertex AI is a high-performance similarity search engine built on advanced research from Google Research. It uses the same core technology that powers Google Search, YouTube, and Google Play, making it:
Highly scalable
Extremely fast
Reliable for enterprise use cases
This technology enables you to build next-generation search, recommendation, and generative AI applications.
Vector Search vs Traditional Search
This lecture clearly explains the key difference between traditional keyword-based search and vector search:
Traditional Search
Matches exact keywords
Looks for literal word matches
Vector Search
Searches by meaning and context
Understands relationships between words
Finds similar items even when exact words do not match
For example, vector search understands that “bank” and “financial institution” are related, even if the words are different.
Understanding Vectors in Simple Words
To understand vector search, you first need to understand vectors.
A vector is explained as:
A long list of numbers
A mathematical representation of data
Similar to GPS coordinates, where similar items are located close to each other
Text, images, products, and user behavior can all be converted into vectors.
Core Concepts of Vector Search
The lecture breaks vector search into three key steps:
1. Embedding
Data is converted into vectors using AI models
Similar items are placed close together in a shared space
2. Indexing
Vertex AI organizes these vectors into a high-speed index
This index allows extremely fast similarity search
3. Searching
User queries are also converted into vectors
The system finds the nearest neighbors based on meaning
Results are returned instantly
Why Vector Search Is Powerful
You will understand why vector search is widely used:
It understands context and intent
It works with text, images, and products
It supports recommendation systems
It can search through billions of items in milliseconds
It is backed by proven Google-scale infrastructure
Common Use Cases
This lecture highlights real-world use cases of vector search, such as:
Recommendation systems (e.g., “Users who bought this also bought…”)
Chatbots and conversational AI
Semantic document search
Image similarity search
Product discovery and personalization
Exploring Vector Search in Vertex AI UI
You will also explore the Vertex AI Vector Search interface, where you can see:
Indexes
Index Endpoints
From the UI, you can:
Create vector indexes
Configure algorithm and storage settings
Create index endpoints
Manage deployments
This gives you a clear picture of how vector search is structured in Vertex AI.
Key Vector Search Terminology Explained Simply
The lecture explains important terms using easy-to-remember analogies:
Index
The data store that holds all vectors
Compared to a library of books
Index Endpoint
A secure entry point to search the index
Compared to a librarian’s desk
Deploy Index to Endpoint
Makes the index searchable
Like opening the library for visitors
Upsert Data Points
Insert or update vectors in the index
Like adding or updating books
Querying
Searching for similar items
Like asking the librarian a question
These analogies make complex concepts easy to understand and remember.
In this lecture, we begin the hands-on demo of Vector Search using Vertex AI. This tutorial marks the transition from theory to practice, where we start building a complete vector search workflow step by step.
The focus of this lecture is Section 1: Setup and Initialization, which prepares the environment required to work with Vertex AI Vector Search.
Overview of the Demo Environment
In this session, you will work inside Colab Enterprise, using a structured notebook that is divided into seven logical sections:
Setup and Initialization
Data Preparation
Embedding Generation
Vector Index Creation and Deployment
Data Ingestion
Querying and Search
Maintenance
This lecture covers only the first section, ensuring a clean and stable foundation before moving forward.
Understanding the Colab Notebook
You will learn:
How to work with a Colab Enterprise notebook
How to manage notebook options such as renaming, downloading, and deleting
How the notebook is organized into clearly defined sections for better readability and maintenance
This structured approach helps in building and understanding the complete vector search pipeline.
Section 1: Setup and Initialization
This section is divided into three important steps, each explained clearly during the demo.
Step 1: Installing Required Libraries
Installation of essential Google Cloud libraries required to work with:
Vertex AI
Google Cloud Storage
Explanation of why these libraries are required for vector search workflows
Ensuring a clean installation without unnecessary output
Step 2: Kernel Restart
Explanation of why restarting the notebook kernel is sometimes required
Ensuring that newly installed packages are properly loaded
Best practices when working in notebook environments
This step ensures a stable runtime before proceeding further.
Step 3: Configuration and Vertex AI Initialization
In this step, you will understand:
How to configure project-level settings
How to define key variables such as:
Google Cloud Project ID
Region
Cloud Storage bucket name
Dataset file name
How to connect the notebook securely to Vertex AI
How to verify successful initialization
Understanding the Dataset
You will also explore the dataset used in this demo:
A movie dataset stored in Google Cloud Storage
Structured information such as:
Movie ID
Title
Year
Genre
Plot
Director
Cast
Ratings
A brief walkthrough of the dataset format and size
This dataset will be used in later sections for embedding generation and vector search.
Verification and Output
By the end of this lecture, you will:
Confirm that Vertex AI is successfully initialized
Verify project, region, and storage configuration
Ensure the environment is ready for the next steps in the vector search pipeline
In this lecture, we continue our Vector Search demo using Vertex AI and focus on Section 2: Data Preparation, which is a critical step in building any vector search or similarity-based application.
In the previous tutorial, we completed setup and initialization, including package installation, kernel restart, and Vertex AI configuration. In this session, we prepare the data so that it is ready for embedding generation.
Why Data Preparation Is Important
Vector search does not work directly on raw files.
Before generating embeddings, the data must be:
Loaded correctly
Structured properly
Converted into meaningful text
Prepared in a format supported by Vertex AI
This lecture ensures your data is clean, organized, and embedding-ready.
Section 2 Overview: Data Preparation Steps
This section is divided into four clear and logical steps, each explained in detail.
Step 1: Loading the Dataset from Google Cloud Storage
In this step, you will learn:
How to connect to a Google Cloud Storage (GCS) bucket
How to download a JSON dataset stored in GCS
How to parse the JSON file into a usable format
How to verify that the dataset is loaded correctly
You will also validate:
The total number of records loaded
Sample movie details such as title, year, genre, director, and rating
This confirms that the dataset is successfully accessed and ready for processing.
Step 2: Preparing Text for Embedding Generation
Next, the lecture explains how to prepare meaningful text for embeddings.
Key concepts covered:
Why embeddings work best when all relevant information is combined
How to merge multiple movie attributes into a single descriptive text
Why title, genre, and plot are combined to capture the full semantic meaning of a movie
By the end of this step, each movie is represented as a single, rich text description, which is ideal for embedding generation.
Step 3: Creating a JSONL File for Batch Embedding
In this step, you will learn:
What a JSONL (JSON Lines) file is
Why Vertex AI batch embedding requires this format
How each movie is written as a separate JSON object
How to prepare a batch input file containing all records
You will also:
Upload the generated JSONL file to Google Cloud Storage
Verify that the file contains all movie records
Inspect the content to ensure correctness
This file will be used directly by the embedding service in the next tutorial.
Step 4: Granting Required IAM Permissions
To allow Vertex AI to process the data, proper permissions are required.
This step explains:
Why IAM permissions are necessary
How service accounts access project resources
How to grant the required roles for embedding operations
How to confirm that permissions are applied successfully
This ensures that all services can work together without access issues.
In this lecture, we move to Section 3: Embedding Generation, which is one of the most important steps in building a Vector Search system.
So far, we have completed:
Section 1: Setup and Initialization
Section 2: Data Preparation
Now, in this tutorial, we convert our prepared text data into numerical vectors (embeddings) using Google’s advanced AI models.
What Is Embedding Generation?
Embeddings are numerical representations of text that capture meaning and context.
Instead of working with raw text, Vector Search works with vectors, which makes semantic search, similarity matching, and recommendations possible.
In this lecture, we generate embeddings for:
99 movie descriptions
Each movie is converted into a 768-dimensional vector
Section 3 Overview: Embedding Generation Steps
This section is divided into two clear steps, both explained in detail.
Step 1: Running the Batch Prediction Job
In this step, you will learn:
How batch embedding generation works in Vertex AI
Why batch prediction is used for large datasets
How all movie descriptions are sent together to Google’s AI model
How the Text Embeddings 005 model is used to convert text into vectors
How to specify:
Input file stored in Google Cloud Storage
Output location for generated embeddings
What to expect during execution, including approximate processing time
After the job completes, you will:
Confirm that the batch job ran successfully
Identify the output folder created in the GCS bucket
Step 2: Exploring and Extracting Embeddings
Once the embeddings are generated, the lecture shows how to:
Locate the batch output files in Google Cloud Storage
Understand the structure of the output JSON files
Identify embeddings stored for each movie
Extract vector values from the batch output
Store all embeddings in memory for further processing
You will also verify:
The total number of embeddings generated
The dimensionality of each embedding
That each movie has exactly one vector
Verification and Validation
To ensure correctness, the lecture includes:
Validation of the embedding count against the number of movies
Confirmation that each embedding has 768 dimensions
Preview of sample embedding values
This step confirms that the embedding generation process was completed successfully.
In this lecture, we move to Section 4 of the Vector Search demo, where we take a major step toward building a real, production-ready vector search system.
In the previous tutorial, we successfully generated embeddings for our dataset. Now, in this session, we will create a Vector Search index, create an index endpoint, and deploy the index to that endpoint.
This process transforms raw embeddings into a searchable, scalable, and high-performance vector database.
What Is a Vector Search Index?
A vector search index is a specialized database optimized for similarity search.
You can think of it as:
A supermarket-style organization system
Similar items are grouped together
This grouping allows extremely fast search when a query is made
Vertex AI Vector Search uses the same enterprise-grade technology that powers Google’s large-scale products.
Section 4 Overview: Index Creation and Deployment
This section is divided into three major steps:
Creating the Vector Search Index
Creating the Index Endpoint
Deploying the Index to the Endpoint
Each step is explained conceptually and verified through the Vertex AI console.
Step 1: Creating the Vector Search Index
In this step, you will learn:
Why an index is required before searching vectors
How embeddings are organized into a high-speed index
How a unique index ID is generated to avoid naming conflicts
Why stream update is enabled for real-time updates
How distance measurement works for similarity search
How tuning parameters balance speed vs accuracy
Key Concepts Explained
Dimensions: Matches the embedding size (768)
Distance measure: Determines how similarity is calculated
Stream update: Allows adding new data without rebuilding the index
Approximate neighbors count: Controls how many close matches are returned
Leaf node embedding count: Controls internal data grouping
Leaf nodes to search percentage: Improves speed by scanning only the most relevant data
You will also see how Google Gemini can be used to explain complex configurations in very simple words.
Monitoring Index Creation
After starting index creation:
You will monitor the index status in the Vertex AI Vector Search UI
Understand what “Creating” and “Ready” states mean
Verify index properties such as:
Dimensions
Algorithm type
Update method
Distance measure
Shard size
This confirms that the index has been created successfully.
Creating an Index from the UI (Conceptual Overview)
Although the index is created programmatically in this tutorial, you will also see:
How the same index can be created from the Vertex AI UI
Available algorithm options (Tree-based and Brute Force)
Batch vs Stream update modes
Advanced configuration options
This helps you understand both UI-based and programmatic workflows.
Step 2: Creating the Index Endpoint
Next, you will create an Index Endpoint, which acts as:
The API entry point to your vector index
The interface through which applications send search queries
Key points covered:
Why an endpoint is mandatory for querying
Difference between a private and public endpoint
How the endpoint is created and monitored
Verification of endpoint status in the console
Step 3: Deploying the Index to the Endpoint
Creating the index and endpoint is not enough.
They must be connected.
In this step, you will:
Deploy the index to the endpoint
Understand why deployment is required
Learn about replica configuration
Monitor deployment progress
Verify the deployed index from the console
This step makes the vector search system fully operational.
In this lecture, we continue the Vector Search demo using Vertex AI and focus on data ingestion, which is the step where our vector search system becomes truly usable.
In the previous tutorial, we completed Vector Index Creation and Deployment, where we:
Created a Vector Search index
Created an index endpoint
Deployed the index to the endpoint
In this session, we explore the deployed index and then ingest data into it, making the index ready for real search queries.
Exploring the Deployed Vector Index
The lecture begins by exploring the deployed index in the Vertex AI console.
You will learn:
How to locate the deployed index
How to review important index details such as:
Display name and index ID
Deployment status
Connected index endpoint
You will also see the query interface, where:
Different query types are available (dense, sparse, and hybrid search)
Vectors can be provided directly as query input
Filters and numeric restrictions can be applied
Full data point information can be included in query responses
Additionally, the monitoring section is introduced, where you can observe:
Node and shard count
Queries per second
Latency
CPU and memory usage
This helps you understand the performance and health of your vector search system.
Section Overview: Data Ingestion Steps
After reviewing the deployed index, the lecture moves to data ingestion, which consists of three main steps:
Preparing data points
Upserting data points into the index
Verifying ingestion completion
Step 1: Preparing Data Points for Upsert
In this step, you will learn:
What a data point is in Vector Search
How each movie is represented as:
A unique data point ID
A corresponding embedding vector
Why unique IDs (such as movie identifiers) are important
How to verify the total number of prepared data points
How to confirm the embedding dimension
This step ensures that the data is correctly structured before ingestion.
Step 2: Upserting Data Points into the Index
Next, the lecture explains upsert operations, which combine:
Insert: Add new vectors to the index
Update: Modify existing vectors if the ID already exists
Key concepts covered:
Why upserting is required to populate the index
How batch processing avoids quota and timeout issues
How ingestion progress is tracked
How to confirm successful ingestion of all data points
By the end of this step, all 99 movie vectors are successfully added to the index.
Step 3: Verification and Summary
Finally, the lecture provides a clear summary of the completed setup:
Project and region details
Index and endpoint information
Deployed index ID
Dataset size and embedding dimensions
This confirmation ensures the system is fully prepared for querying.
In this lecture, we move to Section 6: Querying and Search, which is the most exciting part of the Vector Search workflow.
At this stage, we have already:
Created and deployed the Vector Search index
Ingested all vector data successfully
Now, the system is fully ready to answer semantic search queries.
What This Lecture Focuses On
This tutorial shows how to query a deployed vector index and retrieve similar items based on meaning, not just keywords. Using real examples, you will see how vector search understands context, intent, and semantics.
The demo is done using a movie dataset, where we search movies based on descriptions rather than exact titles.
Step 1: Testing a Single Semantic Query
The lecture starts with a single test query written in natural language.
You will learn:
How a text query is prepared for vector search
Why the query must first be converted into an embedding
How batch prediction is used to generate an embedding for the query
How the generated query vector is sent to the deployed index
How the system returns the most similar movies
This demonstrates how semantic similarity search works end-to-end.
Step 2: Understanding the Query Workflow
The complete flow is clearly explained:
Define the query in natural language
Generate an embedding for the query
Search the vector index using the query embedding
Retrieve and display the most similar results
This step-by-step breakdown helps you understand what happens behind the scenes during a vector search.
Step 3: Running Multiple Semantic Queries
Next, the lecture demonstrates multiple test queries, each representing a different type of movie description, such as:
Psychological thrillers
Fantasy adventures
Family-friendly animated movies
Crime and mafia dramas
Science-fiction themes
For each query:
The query embedding is generated
The index is searched
Top matching results are displayed
This clearly shows how vector search works across different contexts and genres.
Understanding Search Results
You will notice that:
Results are based on semantic similarity, not exact matches
Some results may not look perfect initially
This is expected and can be improved with tuning and better data
This helps set realistic expectations and prepares you for optimization techniques later.
Step 4: Using Online Prediction for Faster Queries
The lecture then introduces online embedding generation, which is much faster and more suitable for real-time use cases.
You will learn:
The difference between batch embedding and online embedding
Why online prediction is ideal for interactive search
How queries can be converted into vectors in seconds
How the index responds almost instantly
This approach is closer to what you would use in production applications.
Step 5: Building an Interactive Search Function
To make querying easier, the lecture demonstrates:
How an interactive search function works conceptually
How users can input a query and number of results
How results are ranked and displayed clearly
This shows how vector search can be wrapped into user-friendly search features.
In this lecture, we complete Section 7: Maintenance, which is the final step of the Vector Search demo using Vertex AI.
This tutorial focuses on post-processing and resource management, which are critical aspects when working with cloud-based AI systems.
By the end of this lecture, you will understand how to save results for future use and how to clean up cloud resources to avoid unnecessary costs.
What This Lecture Covers
This session is divided into two important maintenance tasks:
Saving results (optional but recommended)
Cleaning up resources after experimentation
Step 1: Saving Results to a File
Although optional, saving results is a good best practice, especially for:
Auditing
Reuse in other applications
Offline analysis
Backup and documentation
In this step, you will learn:
How to enhance the original dataset by adding embeddings
How each movie record is enriched with its corresponding vector
How to save the enhanced dataset:
In Google Cloud Storage
Locally for quick access
How to verify the saved files
How to inspect the saved content to confirm embeddings are included
This ensures that all the work done in previous steps is preserved and reusable.
Step 2: Cleaning Up Cloud Resources
Once the demo is complete, it is important to delete unused resources to prevent unnecessary billing.
This lecture explains:
Why cloud cleanup is important
Why indexes and endpoints cannot be deleted directly when they are deployed
The correct order for deleting resources:
Undeploy the index
Delete the index endpoint
Delete the vector search index
You will see how to:
Perform cleanup using the Vertex AI UI
Understand common deletion errors and how to resolve them
Use a confirmation-based cleanup process to avoid accidental deletion
Verification of Cleanup
After the cleanup process:
You will refresh the Vector Search console
Confirm that:
The index is deleted
The index endpoint is deleted
Ensure that no active Vector Search resources remain
This confirms a successful and complete cleanup.
What We Achieved in This Lecture
By the end of this tutorial, you have:
Saved the final dataset with embeddings
Stored results safely in cloud and local storage
Learned how to properly delete Vector Search resources
Completed the full Vector Search lifecycle from setup to cleanup
In this lecture, we begin a new section on Vertex AI Feature Store, an essential component for building production-ready machine learning systems on Google Cloud.
This tutorial focuses on the theoretical understanding of the Feature Store, its architecture, and why it plays a critical role in real-world AI applications. Practical demos will follow in upcoming lectures.
What Is Vertex AI Feature Store?
Vertex AI Feature Store is a managed, cloud-native service that helps you store, manage, and serve machine learning features in a consistent and reliable way.
In simple words, it acts as a central place to store features so that:
Features are created once
Used consistently across training and prediction
Shared across teams and models
It integrates tightly with BigQuery and Vertex AI, making it suitable for both offline training and real-time online serving.
Why Feature Store Is Important
Without a Feature Store:
Different teams may compute the same feature in different ways
Training data and prediction data may not match
Bugs like training-serving skew can occur
With Feature Store:
There is a single source of truth for features
Feature definitions are reusable
Models behave consistently in production
Feature Store Workflow Overview
This lecture explains the high-level workflow of Vertex AI Feature Store:
Prepare the data source in BigQuery
Register features using Feature Groups
(Optional) Organize and manage feature definitions
Set up Online Store for low-latency serving
Use Feature Views to control which features are served
Serve features to real-time applications or models
Core Concepts Explained in Simple Words
The lecture clearly explains the three most important concepts of Vertex AI Feature Store:
1. Feature Groups
Logical grouping of related features
Acts as a connection to raw data (BigQuery tables or views)
Helps organize features such as customer data, transaction data, or product data
Think of Feature Groups as folders that organize features logically.
2. Online Store
A high-speed, low-latency storage layer
Used by real-time applications and models
Returns feature values in milliseconds
This is what makes real-time predictions possible.
3. Feature View
Controls which features are synced to the online store
Not all features need to be served in real time
Acts like a playlist of selected features
Feature Views ensure that only the required features are served efficiently.
Advantages of Using Vertex AI Feature Store
The lecture highlights several key advantages:
No Training-Serving Skew
Ensures the same feature logic is used during training and prediction
Reusability
Features created once can be reused by multiple teams and models
Point-in-Time Correctness
Prevents data leakage during model training
Consistency and Governance
Centralized management of feature definitions
Real-World Use Cases
You will also understand how Feature Store is used in real production systems:
Movie Recommendations
Fetch user behavior features instantly for recommendations
Fraud Detection
Retrieve recent transaction counts in real time to block suspicious activity
Ride-Sharing Applications
Use traffic and driver availability features to calculate dynamic pricing
These examples show why low-latency feature serving is critical.
Understanding Feature Store Architecture
Using a simple architecture diagram, the lecture explains:
Raw data stored in BigQuery
Feature Groups connected to raw data
Feature Views selecting features for serving
Online Store serving features with low latency
Real-time applications consuming these features
This visual explanation helps connect all concepts together.
Vertex AI Feature Store UI Overview
The lecture also shows where Feature Store lives inside the Vertex AI console, including:
Feature Groups
Online Store
Advanced search options (to be covered later)
This prepares you to navigate the UI confidently in upcoming demos.
In this lecture, we begin the hands-on demo of Vertex AI Feature Store, focusing on Section 1: Setup and Installation.
In the previous tutorial, we covered the theoretical introduction to Feature Store. Now, we move into a practical, step-by-step implementation using a Colab Enterprise notebook.
This lecture lays the foundation required for online feature serving, which will be used in the upcoming sections.
Overview of the Demo Notebook
For this demo, a Colab Enterprise notebook has been created specifically for:
Feature Store Online Serving with BigQuery and Bigtable.
The notebook is structured into seven clear sections:
Setup and Installation
Prepare Feature Data in BigQuery
Create Feature Online Store
Create Feature View
Sync Data to Online Store
Fetch Features (Online Serving)
Summary and Cleanup
In this tutorial, we focus only on Section 1, ensuring the environment is correctly prepared before moving forward.
Goal of This Feature Store Demo
The overall goal of this demo series is to:
Use a real-world e-commerce dataset
Build an online feature store backed by BigQuery and Bigtable
Enable automatic feature syncing
Fetch features in milliseconds for real-time machine learning predictions
The dataset used contains approximately 29,000 products, making it suitable for real-world scenarios.
Section 1: Setup and Installation
This section is divided into four important steps, each explained clearly.
Step 1: Installing Required Libraries
You will learn:
Which libraries are required to work with Vertex AI Feature Store
Why each library is important:
Vertex AI SDK for Feature Store operations
BigQuery client for accessing feature data sources
Data type handling for compatibility between BigQuery and Python
Why the latest versions are installed
How to suppress unnecessary installation logs for a cleaner notebook
This step ensures that all required dependencies are available.
Step 2: Restarting the Kernel
You will understand:
Why restarting the notebook kernel is sometimes necessary
How it ensures newly installed libraries are properly loaded
When this step can be skipped and when it is recommended
This helps avoid runtime and import issues later.
Step 3: Project Configuration
In this step, you will:
Configure the Google Cloud project and region
Understand how Vertex AI uses project and location settings
Initialize the Vertex AI SDK
Use regional endpoints to reduce latency
Verify project and location configuration
This step sets the default context for all upcoming Feature Store operations.
Step 4: Importing Required Libraries and Clients
This lecture clearly explains the purpose of each imported client and data structure, including:
Clients for creating and managing online feature stores
Clients for real-time feature retrieval
Clients for feature registry and feature group management
Data structures for feature definitions, feature groups, feature views, and online store settings
Request and response types for admin and serving operations
By the end of this step, the notebook is fully equipped to work with Feature Store APIs.
In this lecture, we move to Section 2 of the Feature Store demo, where we prepare feature data in BigQuery.
In the previous tutorial, we completed Setup and Installation, ensuring our environment was ready. Now, we focus on building a clean, reliable, and production-ready data source that will later be used by Vertex AI Feature Store.
This step is critical because Feature Store features must come from well-defined and high-quality data sources.
What This Lecture Covers
This tutorial walks through the entire lifecycle of preparing feature data in BigQuery, from writing a feature extraction query to validating the final dataset.
You will learn how to:
Combine raw data into meaningful features
Analyze data quality and statistics
Create a BigQuery view as a stable feature source
Step 1: Defining the Feature Extraction Query
The lecture begins by defining a SQL query that prepares features for the Feature Store.
Key concepts covered:
Counting successful (good) and problematic (bad) orders per product
Restricting order analysis to the last 30 days
Fetching basic product details such as name, category, brand, and price
Normalizing text fields for consistency
Combining product information with order statistics
Adding a timestamp for feature freshness
This query forms the foundation of all features used later in the Feature Store.
Step 2: Previewing the Dataset
Next, the lecture demonstrates how to:
Execute the feature extraction query in BigQuery
Load results into a DataFrame for inspection
Verify dataset size and structure
You will see that the dataset contains:
Around 29,000 product records
Nine feature columns per product
A quick preview confirms that the data looks correct before deeper analysis.
Step 3: Exploring Dataset Statistics
Before moving data into the Feature Store, it is important to understand it.
In this step, you will:
Review the total number of products
Identify the total number of features
List all feature names
Examine data types for each feature
Analyze price statistics such as:
Minimum
Maximum
Average
Median
This helps ensure that features are numerically and logically valid.
Step 4: Data Quality and Missing Values Check
The lecture then focuses on data quality, which is essential for Feature Store reliability.
You will learn how to:
Identify missing values in each feature
Understand which columns are most affected
Distinguish between:
Products with complete data
Products with missing values
Analyze products with and without order history
The lecture also explains that missing values can be handled later using strategies such as mean or median filling.
Step 5: Category and Order Analysis
To better understand the data distribution, the lecture includes:
Category Analysis
Identifying the top product categories
Understanding how products are distributed across categories
Visualizing category counts for better insights
Order Analysis
Products with good orders
Products with bad orders
Products with no orders
Average number of orders per product
Overall order success rate
These insights help validate that the features reflect real business behavior.
Step 6: Creating a BigQuery Dataset
After validating the data, the lecture demonstrates how to:
Create a dedicated BigQuery dataset for Feature Store
Ensure correct region selection (US)
Verify dataset creation in the BigQuery console
This dataset acts as a container for Feature Store-related views.
Step 7: Creating a BigQuery View for Feature Store
Instead of storing static tables, a BigQuery view is created.
You will learn:
Why views are preferred for Feature Store
How the view dynamically executes the feature extraction query
How this view becomes the data source for Feature Store
How to verify the view’s schema and data
The view contains all feature columns and stays up to date automatically.
Step 8: Verifying the BigQuery View
Finally, the lecture verifies:
Total number of rows in the view
Sample records from the view
Feature timestamps
Schema details such as:
Column names
Data types
Modes
This confirms that the view is ready to be used by Vertex AI Feature Store.
In this lecture, we move to Section 3 of the Feature Store demo, where we create a Feature Online Store in Vertex AI.
In the previous tutorials, we completed:
Section 1: Setup and Installation
Section 2: Preparing Feature Data in BigQuery
Now, we focus on building the high-performance online serving layer that will deliver features in real time to machine learning models.
What Is a Feature Online Store?
A Feature Online Store is a low-latency serving layer that provides feature values to applications and models in milliseconds.
In this tutorial, the online store is backed by Google Bigtable, which offers:
Sub-10 millisecond latency
Automatic scaling based on traffic
High availability and fault tolerance
Ability to handle millions of requests per second
This makes it ideal for real-time machine learning predictions.
Section 3 Overview: Create Feature Online Store
In this section, we complete the following steps:
Initialize Feature Store service clients
Create a Bigtable-backed online store with autoscaling
Verify successful creation
Inspect store details and configuration
Understand autoscaling behavior
Step 1: Initializing the Service Clients
The lecture begins by initializing the service clients required to manage Feature Store resources.
You will understand:
Why different clients are required for admin operations and feature metadata
How the admin client manages online stores and feature views
How the registry client manages feature definitions and metadata
Why using a regional API endpoint improves performance
Why Regional API Endpoints Matter
Without a regional endpoint, requests go to global servers
With a regional endpoint, requests go directly to us-central1, which:
Is closer to the data
Reduces latency
Improves overall performance
Step 2: Creating the Feature Online Store
Next, the lecture demonstrates how to create a Bigtable-backed online store.
Key concepts covered:
Assigning a unique name to the online store
Choosing Bigtable as the storage backend
Configuring autoscaling:
Minimum node count (always at least one node running)
Maximum node count (upper limit for scaling)
CPU utilization target to trigger scaling
Understanding why autoscaling is critical for production systems
You will see that the store is created quickly and transitions to a stable state.
Step 3: Verifying Online Store Creation
After creation, the lecture walks through:
Verifying the online store in the Vertex AI Feature Store UI
Checking:
Store name and status
Region
Storage type (Bigtable)
Autoscaling configuration
Viewing the store in Dataplex Universal Catalog for governance and metadata
This confirms that the online store is successfully provisioned.
Step 4: Inspecting Online Store Details
You will then inspect the online store configuration in detail, including:
Creation timestamp
Autoscaling settings
Minimum and maximum node counts
CPU utilization target
Estimated cost per hour based on node usage
This step helps you understand cost and capacity planning for online serving.
Understanding Bigtable Autoscaling (Conceptual Explanation)
The lecture concludes with a clear, real-world explanation of autoscaling behavior:
Low traffic:
One node is enough
CPU usage stays below the target
Medium traffic:
CPU usage exceeds the target
A second node is added
High traffic:
CPU usage remains high
A third node is added
Traffic drops:
CPU usage falls below the target
Extra nodes are removed
This ensures:
High performance during peak traffic
Cost optimization during low usage
In this lecture, we move to Section 4 of the Feature Store demo, where we create a Feature View.
In the previous tutorial, we successfully created the Feature Online Store. Now, we will connect our BigQuery feature data to this online store using a Feature View.
This step is extremely important because features cannot be served online without a Feature View.
What Is a Feature View?
A Feature View acts as a bridge between BigQuery and the Feature Online Store.
It defines:
Which BigQuery table or view should be used as the data source
Which column acts as the entity ID (lookup key)
How often data should be synced
When and how features are refreshed in the online store
Without a Feature View, the online store has no idea what data to load or how to keep it updated.
Real-World Analogy for Better Understanding
The lecture explains Feature View using a simple real-world analogy:
BigQuery → Warehouse where products are stored
Feature Online Store → Retail store where customers shop
Feature View → Delivery truck
The delivery truck:
Knows what products to deliver (entity ID)
Knows where to pick them from (BigQuery source)
Runs on a fixed schedule (cron)
This analogy makes it easy to understand the role of Feature Views in real systems.
Section 4 Overview: Create Feature View
In this section, we complete the following steps:
Define Feature View name and sync schedule
Configure BigQuery as the feature source
Specify the entity ID column
Define the sync schedule using cron
Create the Feature View
Verify the Feature View using API and UI
Step 1: Defining Feature View Identity and Schedule
The lecture starts by defining:
A Feature View ID (logical name of the view)
A cron-based sync schedule
You will learn:
Why sync schedules are required
How time zones affect scheduling
Why it is important to specify the correct time zone (for example, Pacific Time)
This ensures feature updates happen at the right time and frequency.
Step 2: Understanding Cron Schedules
The lecture clearly explains:
The structure of a cron expression
What each field represents (minutes, hours, day, month, weekday)
How schedules like “run every hour” or “run daily” are defined
You will also see practical examples, such as:
Hourly syncs
Daily syncs
Weekly syncs
High-frequency syncs for fast-changing data
Guidelines are shared on choosing schedules based on feature type, such as:
Product inventory
User demographics
Stock prices
Step 3: Defining the BigQuery Feature Source
Next, the lecture shows how to define BigQuery as the data source for the Feature View.
Key points explained:
How the Feature View connects to a BigQuery view
What a URI represents
Why the entity ID column is critical
How the entity ID is used to fetch feature values for a specific entity
This step tells the Feature Store where the data lives and how to look it up.
Step 4: Creating the Feature View
Once all configurations are defined:
The Feature View is created inside the Feature Online Store
The system registers the BigQuery source, entity ID, and sync schedule
The creation is confirmed with a successful response
At this point, the Feature View is ready to sync data from BigQuery to the online store.
Step 5: Verifying the Feature View
The lecture then verifies the Feature View by:
Listing all Feature Views in the online store
Confirming:
Feature View name
Source type (BigQuery)
Entity ID column
Sync schedule
This confirms that the Feature View is configured correctly.
Step 6: Exploring Feature View in the UI
You will also see the Feature View in the Vertex AI Console, where you can:
View Feature View details
Check creation and update timestamps
Inspect sync configuration
Review BigQuery source path
Explore lineage and relationships
Delete the Feature View if required
This helps you understand how Feature Views are managed visually.
In this lecture, we move to Section 5 of the Feature Store demo, where we sync feature data from BigQuery to the Feature Online Store.
In the previous tutorial, we successfully created the Feature View, which defines what data to sync, how to sync it, and on what schedule. Now, we perform the actual data synchronization so that features become available for real-time serving.
What Does “Syncing” Mean in Feature Store?
Syncing is the process of copying feature data from BigQuery into Bigtable, which is the storage layer behind the online store.
Once synced:
Features can be fetched in milliseconds
Data becomes ready for online predictions
Models and applications can access the latest feature values instantly
Types of Feature View Sync
The lecture explains two ways to sync data:
1. Scheduled Sync
Runs automatically based on the cron schedule
Suitable for regular, automated updates
2. Manual Sync
Triggered on demand
Useful for:
First-time data loading
Testing
Emergency refreshes
In this tutorial, we perform a manual sync.
What Happens During a Sync?
The lecture clearly breaks down the internal workflow of a sync operation:
BigQuery executes the feature view query
Joins product and order data
Computes aggregations
Adds feature timestamps
Data is transformed into Bigtable format
Features are indexed by the Entity ID
Enables fast lookup
Data is written to Bigtable
Features become available for online serving
This process ensures that feature data is optimized for low-latency access.
Step 1: Triggering a Manual Sync
The lecture demonstrates how to:
Use the Feature Store admin client
Trigger a manual sync for a specific Feature View
Capture the sync response and sync ID
You will learn:
Why manual sync is useful
How to identify a sync operation using its ID
What to expect during the sync process
Step 2: Monitoring Sync Progress
After starting the sync, the lecture shows how to:
Monitor the sync operation
Wait for completion
Check whether the sync succeeded or failed
This step ensures that the data has been fully loaded into the online store.
Step 3: Inspecting Sync Details
Once the sync completes, the lecture explains how to review:
Sync name and ID
Start and end timestamps
Sync status codes
Total duration
Throughput (rows processed per second)
These metrics help you understand:
Performance of the sync
Data volume processed
Operational health of the Feature Store pipeline
Step 4: Understanding Sync Results
The lecture then summarizes what happened during the sync:
BigQuery executed the Feature View SQL query
Approximately 29,000 product records were processed
Data was successfully loaded into Bigtable
Features are now ready for online serving
This confirms that the Feature Store is fully populated with real-time data.
Step 5: Listing All Sync Operations
You will also learn how to:
List all sync operations for a Feature View
Review historical syncs
Identify when syncs were triggered (manual or scheduled)
This is useful for:
Auditing
Debugging
Monitoring feature freshness
Verifying Sync in the Vertex AI Console
The lecture also verifies the sync using the Vertex AI UI, where you can:
View the Feature View
See sync history
Confirm successful data loading
This reinforces both programmatic and UI-based verification.
In this lecture, we move into one of the most exciting and practical parts of the Feature Store workflow — fetching features from the online store for real-time serving.
So far, we have completed all the foundational steps, including setup, installation, preparing feature data in BigQuery, creating the feature online store, and syncing data to the online store (Bigtable). Now, it is time to use those features in real time.
Since the features are already synced to Bigtable, they can now be retrieved in milliseconds, making them ideal for real-time machine learning use cases such as recommendations, personalization, and ranking systems.
Key Concepts Covered
1. Initializing the Data Client
You will first initialize the online data client, which is responsible for fetching features from the Feature Online Store. This client connects to Bigtable through the Feature Store service and enables ultra-fast feature access.
Once the data client is ready, the system can start serving features in real time.
2. Fetching Features for a Single Product
Next, you will fetch features for a specific product using its product ID (entity key).
At this stage, you will understand:
How the feature view is used to identify the feature schema
How the entity key maps directly to a row in Bigtable
How features are returned as name–value pairs
This confirms that the features are successfully available for online serving.
3. Displaying Features in a Readable Format
Raw feature responses are often returned in key-value structures. In this lecture, you will also see how to:
Parse the response
Display the feature values in a clean and readable format
Clearly view attributes such as product name, category, price, order counts, and timestamps
This makes feature inspection and debugging much easier.
4. Using Proto Struct (JSON-Like) Format
You will then explore a different output format called Proto Struct format.
Important highlights:
Features are returned as a dictionary-like structure
This format is more JSON-friendly
It is easier to parse programmatically
Ideal for APIs, microservices, and production systems
You will clearly see the difference between key-value format and Proto Struct format.
5. Fetching Features for Different Product IDs
You will test feature retrieval using different product IDs to:
Validate consistency
Experiment with multiple entities
Understand how easy it is to reuse the same logic for different products
This section encourages hands-on experimentation.
6. Comparing Features Across Multiple Products
In this part, you will:
Fetch features for multiple product IDs
Extract selected feature values
Display a comparison table
This is extremely useful for:
Product comparisons
Analytics
Recommendation systems
Debugging feature behavior across entities
7. Measuring Online Feature Retrieval Performance
Performance is critical for real-time machine learning.
In this lecture, you will:
Measure the time taken to fetch features from the online store
Observe response times in the millisecond range (typically 20–70 ms)
Understand why this latency is suitable for real-time inference
You will also see how response times may vary slightly across runs, which is normal in distributed systems.
8. Feature Store vs BigQuery Performance Comparison
This is one of the most important sections of the lecture.
You will:
Compare online Feature Store performance with BigQuery
Run multiple performance tests
Analyze average, minimum, and maximum response times
Visualize the comparison using charts
Key takeaway:
Feature Store delivers milliseconds-level latency
BigQuery takes hundreds to thousands of milliseconds
9. Real-World Business Impact
To make things practical, the lecture concludes with a real-world e-commerce recommendation scenario, covering:
High concurrency requirements
Limitations of using BigQuery for real-time serving
Cost implications
Scalability challenges
User experience impact
You will clearly see why:
BigQuery is excellent for analytics
Feature Store is essential for real-time machine learning systems
Industry Use Cases
The lecture also highlights how major companies use online feature stores in production, including:
Amazon for product recommendations
Netflix for content personalization
Uber for dynamic pricing
Spotify for music recommendations
Airbnb for search ranking
In this lecture, we conclude the end-to-end demo by summarizing everything we have built so far and performing a complete cleanup of all cloud resources. This is an important step in any production-grade system to avoid unnecessary cloud costs and to follow best practices.
By this point, you have already seen how to fetch features for online serving. Now, we take a step back to review the entire solution and understand the infrastructure cost, system capabilities, and key learnings, followed by a safe and complete teardown of resources.
Resource Cost and Usage Analysis
We begin by checking the resource configuration of the Feature Online Store.
Key points covered:
Retrieving online store details
Understanding the Bigtable configuration
Minimum and maximum node count
CPU utilization target
Cost estimates per:
Hour
Day
Month
You will see how even a small number of nodes can impact monthly costs and why it is critical to monitor resource usage during development and testing.
What We Built in This Demo
This demo is not just about feature retrieval—it demonstrates multiple production-level skills, including:
Data Engineering
Preparing and structuring feature data
MLOps Infrastructure
Online feature stores and scalable serving
Performance Engineering
Low-latency feature retrieval
Production Machine Learning Systems
Real-time feature serving at scale
You will also review:
Dataset details
Performance benchmarks
Infrastructure design
Cost considerations
Real-World Applications
The lecture connects the demo to real business scenarios where feature stores are widely used, such as:
E-commerce – product recommendations and ranking
Entertainment – content personalization
Finance – fraud detection and risk scoring
Transportation – dynamic pricing and demand forecasting
This helps you understand how the same architecture can be applied in real production systems.
In this lecture, we begin a new and advanced approach to using Vertex AI Feature Store by introducing Feature Groups. In the previous demo, we successfully implemented online feature serving using a simpler approach. Now, we move one level deeper and learn how Feature Store works when features are managed through a feature registry using Feature Groups.
This lecture lays the foundation for building a more structured, scalable, and production-ready feature management system.
What Are Feature Groups?
A Feature Group is a feature registry resource in Vertex AI that is directly associated with a BigQuery source table.
Key characteristics of Feature Groups:
They represent a logical grouping of feature columns
Each column is registered as an individual feature resource
They use entry ID columns to identify feature records
They require a timestamp column when working with time-series data
They help manage features with metadata, versioning, and governance
Feature Groups make feature management more structured compared to the simple approach.
Feature Groups vs Simple Feature Store Approach
This lecture clearly explains the difference between the two approaches:
Simple Approach
BigQuery View
Feature View
Online Store
Feature Group (Advanced) Approach
BigQuery Table
Feature Group
Feature Registration (with metadata)
Feature View
Online Store
The Feature Group approach adds two extra but critical steps, making it suitable for enterprise and production environments.
Feature Group Creation Options
You will also see:
How Feature Groups can be created using the Vertex AI Console (UI)
Required configuration details such as:
Region
Feature Group name and description
Labels
BigQuery source path
Entry ID column
In this tutorial, however, the focus is on creating Feature Groups using Python-based workflows.
Demo Overview: What We Are Building
Before starting the implementation, we clearly define the architecture we are building:
Feature data stored in BigQuery tables
Feature Groups acting as a feature registry
Individual feature registration with metadata
Feature Views created from the registry
A production-ready Feature Store architecture
This helps you understand the complete flow before moving ahead.
Setup and Installation (Section 1)
This section follows the same setup steps as the previous demo, including:
Installing required libraries
Restarting the runtime (if required)
Configuring Google Cloud project details
Importing necessary libraries
Since these steps were already explained earlier, they are reused here to maintain consistency.
Preparing Feature Data in BigQuery (Section 2)
In this section, we prepare the data specifically for Feature Groups:
Key points covered:
Feature data is created using a BigQuery table, not a view
Feature Groups require tables for registration
Product and order history data is combined
The dataset contains around 29,000 product records
The table schema includes entity ID, product attributes, pricing, order counts, and timestamps
You will also:
Create a new BigQuery dataset
Materialize the query results into a table
Verify the table schema, data types, and row count
Preview the data to confirm correctness
Why Tables Are Mandatory for Feature Groups
A key concept emphasized in this lecture:
Feature Groups do not work with views
Data must be materialized into BigQuery tables
This ensures stable schema, metadata management, and feature registration
This is one of the most important differences from the simple feature store approach.
In this lecture, we move forward with the advanced Feature Store workflow by completing two critical steps: creating a Feature Group and registering individual features with metadata.
In the previous tutorial, we completed the setup, installation, and prepared the feature data in BigQuery. Now, we focus on organizing and governing those features in a structured, reusable, and production-ready manner using Feature Groups.
What Is a Feature Group?
A Feature Group is a logical collection of related features derived from a BigQuery table.
Key benefits of using Feature Groups include:
Centralized metadata management
Reusability across multiple Feature Views
Better team collaboration
Feature governance and lineage tracking
Advanced search support in the UI
Think of a Feature Group as a folder that organizes related features along with rich metadata, making them easier to discover and manage.
Creating the Feature Group (Section 3)
In this section, you create a Feature Group that points directly to the BigQuery table created earlier.
Key concepts covered:
Initializing service clients for Feature Store management
Linking the Feature Group to a BigQuery source table
Defining the entity ID column for feature records
Creating the Feature Group resource
Verifying the Feature Group in the Vertex AI console
After creation, you can view:
Feature Group name and description
BigQuery source URI
Entity ID column
Creation and update timestamps
At this stage, the Feature Group exists but does not yet contain any registered features.
Feature Group Architecture Overview
By this point in the workflow, the architecture looks like this:
BigQuery table containing feature data (around 29,000 products)
Feature Group created and linked to the table
No features registered yet
This clear separation between data storage and feature metadata is a key advantage of the Feature Group approach.
Registering Individual Features (Section 4)
After creating the Feature Group, the next step is feature registration.
What Is Feature Registration?
Feature registration creates metadata entries in the Feature Registry for each feature column.
Each registered feature includes:
Feature name
Description
Labels and tags
Data type
Ownership and lineage information
Why Register Features?
Registering features provides:
Better organization of features
Easier feature discovery
Improved team collaboration
Strong governance and version control
Support for advanced search in the UI
Registering Product Features
In this tutorial, multiple product-related features are registered, including:
Product name
Product category
Product brand
Product cost
Retail price
Good order count
Bad order count
Each feature is registered with:
Clear descriptions
Meaningful labels (such as product info, pricing, and order metrics)
In this lecture, we complete the end-to-end Feature Store implementation using Feature Groups. Until now, we have already covered setup, BigQuery feature preparation, Feature Group creation, and individual feature registration. In this session, we bring everything together by creating the Feature Online Store, building a Feature View from the Feature Registry, syncing data, fetching features, comparing approaches, exploring advanced search, and finally performing a full resource cleanup.
This lecture helps you clearly understand how Feature Groups fit into a production-grade Feature Store architecture and when they should be used.
Creating the Feature Online Store (Section 5)
We begin by creating the Feature Online Store, which is required for fast, low-latency feature serving.
Key points covered:
Bigtable is used as the backend for online serving
Auto-scaling configuration:
Minimum nodes: 1
Maximum nodes: 3
CPU utilization target: 50%
The store is created as a long-running operation
Store details such as name, creation time, and scaling configuration are verified
At this stage:
The online store is ready
No Feature Views are attached yet
Creating Feature View from Feature Registry (Section 6)
This is where Feature Groups are actually used.
Instead of creating a Feature View directly from BigQuery (simple approach), we now:
Create a Feature View from the Feature Registry
Reference the Feature Group and registered features
Define a scheduled sync using a cron schedule
Why this matters:
Feature Registry metadata is used
Multiple Feature Groups can be combined
Better governance, lineage, and reusability
Feature Views are decoupled from raw data sources
After creation, you verify:
Feature View name
Sync schedule
Associated Feature Group
Registered features
Syncing Data to the Online Store
Once the Feature View is created, we sync the data.
Key points:
Sync copies data from BigQuery tables to Bigtable
Feature metadata from the registry is applied
Features become available for online serving
Sync process and status are monitored
Sync duration and success status are verified
This step is identical to the simple approach.
Fetching Features for Online Serving
Feature fetching is exactly the same in both approaches.
Important concepts explained:
Once data is in Bigtable, Feature Groups are not involved in fetching
Fetch APIs, data formats, and performance remain unchanged
Feature Groups are only used during:
Feature registration
Feature View creation
Metadata management
You will see:
Feature fetching for single products
Fetching in Proto Struct (JSON-like) format
Fetching for different product IDs
Formatted and readable outputs
Performance Insights
Performance measurements show:
Feature fetch latency around 70–80 milliseconds
Identical performance to the simple approach
Feature Groups add metadata, not latency
Key takeaway:
Feature Groups do not slow down online serving. They improve organization and governance, not runtime performance.
Exploring Feature Registry – Advanced Search
In this section, you explore the Feature Registry UI and Advanced Search.
You learn how to:
Search for features by name
Explore Feature Groups, Feature Views, and Online Stores
Use keywords to find registered resources
Understand how metadata makes features discoverable
This is especially useful for:
Large teams
Shared feature platforms
Enterprise environments
Simple Approach vs Feature Groups Approach
A clear comparison is provided:
Simple Approach
BigQuery View
Feature View
Online Store
Best for:
Learning and experimentation
Quick prototypes
Small teams
No governance requirements
Feature Groups Approach
BigQuery Table
Feature Group
Feature Registration
Feature View
Online Store
Best for:
Production systems
Large teams
Feature sharing across projects
Governance, compliance, and discovery
Long-term maintenance
Bottom line:
Both approaches have the same serving performance.
The difference lies in organization, governance, collaboration, and discoverability.
Resource Cleanup
The lecture concludes with a complete cleanup of all resources.
Deleted resources include:
Feature View
Feature Online Store
Feature Group and registered features
BigQuery dataset and tables
You verify:
BigQuery datasets are removed
Feature Groups and Online Stores no longer exist
No active resources remain
This ensures:
No unnecessary cloud charges
Best practices for cloud cost management
Become a job-ready Google Cloud Professional Machine Learning Engineer by mastering end-to-end Machine Learning, Generative AI, RAG architectures, and Agent development on Google Cloud Platform. This course is fully hands-on and covers the complete ML lifecycle — from SQL-based modeling with BigQuery ML to advanced AI systems using Vertex AI, Gemini models, and the Agent Development Kit (ADK).
You will start by building production-ready Machine Learning models using BigQuery ML, including regression, boosted trees, classification, recommendation systems, anomaly detection with autoencoders, time-series forecasting, and advanced feature engineering. From there, you will explore Vertex AI Model Garden to create text generation workflows, multimodal AI applications, image generation pipelines, and real-world AI solutions.
The course goes far beyond traditional ML by teaching you how to design intelligent systems powered by Generative AI. You will build complete projects using the Document AI API for large-scale document processing, develop AutoML pipelines for tabular, text, image, and forecasting workloads, and implement enterprise-grade AI agents using Google’s Agent Development Kit.
Inside the Agent Development section, you will create starter agents, tool-enabled agents, multi-agent systems, stateful workflows, persistent storage, callbacks, sequential and parallel agent architectures, and deploy production-ready agents to Vertex AI Agent Engine and Cloud Run.
You will also design modern enterprise AI architectures including:
Retrieval-Augmented Generation (RAG) systems using Vertex AI RAG Engine
Vector Search pipelines with embeddings, indexing, and querying
Vertex AI Search implementations for enterprise search use cases
Gemini File Search API projects to build RAG applications on your own data
Feature Store pipelines for scalable online ML serving
In addition, you will learn how to build Generative AI applications using Google AI Studio, experiment with Gemini models, and create AI-powered applications without complex infrastructure.
Throughout the course, every concept is implemented through real-world, production-style demos, ensuring you gain practical skills aligned with the Google Cloud Professional Machine Learning Engineer certification and modern industry AI workflows.
Key topics covered:
BigQuery ML (Regression, Classification, Boosted Trees, Forecasting, Recommendations, Autoencoders, Feature Engineering)
Vertex AI Model Garden (Text Generation, Translation, Multimodal AI, Image Generation)
Document AI API (Form Parsing, Batch Processing, JSON Extraction, Gradio Applications)
Vertex AI AutoML (Tabular, Text, Image, Forecasting, Batch & Online Predictions)
Agent Development Kit (Starter Agents, Tools, Stateful Multi-Agents, Callbacks, Sequential & Parallel Agents, Deployment)
Vertex AI RAG Engine and RAG Agent Development
Vertex AI Search and enterprise AI search systems
Vertex AI Vector Search (Embeddings, Indexing, Querying, Maintenance)
Vertex AI Feature Store (Online Serving, Feature Groups)
Gemini File Search API for RAG applications
Google AI Studio and Generative AI application development
This course is designed as a complete Google Cloud AI engineering roadmap, combining classical machine learning, Generative AI, agentic workflows, and enterprise search architectures into one structured learning path. Whether you are preparing for certification or building production-ready AI systems, you will gain the skills required to design, deploy, and scale machine learning solutions confidently on Google Cloud.