MLOps Zero to Hero

Name: MLOps Zero to Hero
Rating: 4.6 (2157 reviews)

Learn Production-Grade MLOps using DVC, MLflow, AWS, Docker, Kubernetes, KServe, SageMaker and Kubeflow.

Bestseller

Highest Rated

Created by Abhishek Veeramalla

Last updated 1/2026

English

German [Auto], English [Auto]

What you'll learn

Introduction to Machine Learning Operations (MLOps)
Transition from DevOps Engineer to MLOps Engineer
Machine Learning Basics for DevOps Engineers
Model Deployment and Monitoring in Production
End-to-End ML Pipeline Orchestration
Real-World MLOps Project

Course content

12 sections • 69 lectures • 12h 19m total length

Intro • 1:08
Begin your journey from absolute zero to hero in MLOps by following the course in order, using sections, lectures, and articles; review the GitHub repository for notes.
GitHub Repository for the course • 2:00
Access the public GitHub repository for the course and bookmark it for updates. Explore sections like Introduction to MLOps and Experiment tracking; use the notes and MLflow to revise anytime.
GitHub Repository Link • 0:02

What is Machine Learning and What is a Model? • 4:14
Learn how machine learning trains a model from data to identify patterns and predict outputs. See how data sets and algorithms turn training into a model that predicts flower types.
How is a Model created in realtime (detailed steps) • 6:50
Data scientists build a model from a flower dataset, split 80/20 for training and testing, train with an algorithm, and package the final model in pkl, Joblib, or ONNX.
What is MLOps? • 12:36
Explore how MLOps extends DevOps to the machine learning lifecycle, automating CI/CD, infrastructure as code, and Kubernetes management for rapid model deployment.
Machine Learning Lifecycle Overview • 11:50
Explore the machine learning lifecycle from problem definition to deployment and monitoring in production, including data collection, cleaning, feature engineering, model selection, training, evaluation, and maintenance.
Data Scientist vs ML Engineer vs MLOps Engineer • 9:15
Data scientists define problems, gather and clean data, engineer features, and train models. ML engineers productionize models with APIs; MLOps engineers automate pipelines, deployment, observability, and infrastructure.
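
As a rough illustration of the model-creation steps described in this section (split the data 80/20, train, evaluate, package the model as a pickle file), here is a minimal sketch that uses only the Python standard library. The tiny flower dataset, the nearest-centroid "algorithm", and the file name `model.pkl` are illustrative assumptions, not the course's actual data or code.

```python
import pickle
import random

# Hypothetical toy data: (petal_length_cm, petal_width_cm) -> species label.
DATA = [
    ((1.4, 0.2), "setosa"), ((1.3, 0.2), "setosa"), ((1.5, 0.3), "setosa"),
    ((1.6, 0.2), "setosa"), ((1.4, 0.3), "setosa"),
    ((4.7, 1.4), "versicolor"), ((4.5, 1.5), "versicolor"),
    ((4.9, 1.5), "versicolor"), ((4.0, 1.3), "versicolor"),
    ((4.6, 1.4), "versicolor"),
]


def split_80_20(rows, seed=42):
    """Shuffle the rows and return (train, test) with an 80/20 split."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * 0.8)
    return rows[:cut], rows[cut:]


def train(rows):
    """'Train' a nearest-centroid model: one mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


train_rows, test_rows = split_80_20(DATA)
model = train(train_rows)

# Evaluate on the held-out 20% before packaging.
accuracy = sum(predict(model, f) == y for f, y in test_rows) / len(test_rows)

# Package the trained model as an artifact that can be shipped or versioned.
with open("model.pkl", "wb") as fh:
    pickle.dump(model, fh)
```

A real project would typically swap the toy pieces for a library such as scikit-learn and might save with Joblib or ONNX instead; the split, train, evaluate, package sequence stays the same.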

Introduction to the Project • 1:17
Explore how data scientists tackle a flower species prediction using petal and sepal measurements, and how MLOps and ML engineers collaborate to move the model to the next level.
Useful Links to get started • 0:03
Learn how Data Scientists work without MLOps practices • 11:17
Discover how data scientists address a requirement by clarifying requirements with the product owner, gathering data, setting up a Python environment, training a model with iris data, and saving artifacts.
Learn how MLOps Engineers help Data Scientists • 22:09
MLOps engineers automate data science workflows with version control, RBAC and auditing, and CI/CD, enabling multi-python training, automated environment setup, model training, and artifact storage.
Role of ML Engineers in a project at a high level • 8:47
Learn how ML engineers build scalable APIs for models, enabling back-end integration via /predict endpoints using Python frameworks like Flask or FastAPI.
Learn how MLOps Engineers help ML Engineers with Model deployment • 7:40
Discover how MLOps engineers assist ML engineers by containerizing models with Docker, creating Dockerfiles, installing dependencies, and deploying on Kubernetes with manifests, networking, and security.
Conclusion • 0:47
Discover how MLOps engineers, data scientists, and ML engineers collaborate to train, save, run, and ship models, with MLOps engineers supporting training, saving, and deploying.

What is Data versioning - Why Git isn't enough? • 5:46
Learn why Git isn't enough for data versioning and how DVC provides data version control with auditing and RBAC, handling large CSV and image data in MLOps workflows.
Introduction to DVC (Data Version Control) • 5:19
Discover how DVC provides data version control by moving large data sets and models to remote cloud storage (S3 bucket, Azure Blob, Google Cloud Storage) with versioning.
GitHub Repo Link for the next hands-on lecture • 0:02
DVC Realtime Hands-on [DVC + AWS S3] • 16:21
Master data versioning with DVC and AWS S3, integrating git to manage wine prediction data and its versions via DVC add and push to remote storage.
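
The idea behind data versioning as described in this section is, roughly, that Git tracks a small pointer containing a hash of the data while the data itself lives in separate storage. The sketch below imitates that idea with the standard library only. It is not DVC's implementation: the pointer layout, the `.pointer.json` suffix, the local `cache` directory standing in for remote storage, and the sample CSV content are all simplified assumptions.

```python
import hashlib
import json
import os
import shutil
import tempfile


def add_to_cache(data_path, cache_dir):
    """Copy a data file into a content-addressed cache and write a pointer file.

    Returns the path of the small pointer file, which is what Git would track.
    """
    with open(data_path, "rb") as fh:
        digest = hashlib.sha256(fh.read()).hexdigest()
    os.makedirs(cache_dir, exist_ok=True)
    shutil.copyfile(data_path, os.path.join(cache_dir, digest))
    pointer_path = data_path + ".pointer.json"
    with open(pointer_path, "w") as fh:
        json.dump({"path": os.path.basename(data_path), "sha256": digest}, fh)
    return pointer_path


def checkout(pointer_path, cache_dir, target_path):
    """Restore the data version named by a pointer file from the cache."""
    with open(pointer_path) as fh:
        pointer = json.load(fh)
    shutil.copyfile(os.path.join(cache_dir, pointer["sha256"]), target_path)


workdir = tempfile.mkdtemp()
data_file = os.path.join(workdir, "wine.csv")
cache = os.path.join(workdir, "cache")

# Version 1 of the dataset.
with open(data_file, "w") as fh:
    fh.write("alcohol,quality\n9.4,5\n")
pointer = add_to_cache(data_file, cache)

# Keep a copy of the v1 pointer, standing in for a Git commit of that file.
v1_saved = os.path.join(workdir, "v1.pointer.json")
shutil.copyfile(pointer, v1_saved)

# Version 2: the data changes, so the pointer now names a different hash.
with open(data_file, "a") as fh:
    fh.write("9.8,6\n")
pointer = add_to_cache(data_file, cache)
```

Because only the tiny pointer changes between versions, Git history stays small while every data version remains recoverable from the cache.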

Introduction to Experiment Tracking • 7:41
Track every training run to capture learning rate and other parameters, code and data set versions, metrics, artifacts, and system information, enabling reproducible experiments and progress toward the target efficiency.
What is MLflow? [Introduction to MLflow] • 7:12
Discover how MLflow centralizes experiment tracking, model versioning, and deployment, replacing Excel sheets, with MLOps engineers deploying a centralized server and data scientists instrumenting Python scripts.
Basic MLflow Installation [Non-Production, Demo purpose] • 3:39
Install MLflow on your local machine in a Python virtual environment and run mlflow ui with a SQLite backend on port 7006 for demo purposes; the production setup comes later.
MLflow basic installation doc • 0:03
Basic MLflow Installation on Kubernetes [Non-Production, Demo purpose] • 6:10
Learn how to install and configure MLflow on a local Kubernetes cluster using kind, via Helm charts or manifests, with port forwarding for access, for a demo or PoC.
MLflow production setup using Postgres doc • 0:03
MLflow Installation in Production with PostgreSQL on Kubernetes [Important] • 23:51
Deploy MLflow in production on Kubernetes by linking an MLflow server to an AWS RDS Postgres database. Create a dedicated MLflow database and user, and configure with Helm.
GitHub repository for the next lecture • 0:03
Realtime Tracking with MLflow for Wine Prediction Model [DVC + MLflow] • 22:30
Connect to an MLflow instance from the terminal, set tracking URI and experiment, then update train.py to log parameters and metrics for a wine prediction model using DVC and MLflow.
Comparing multiple runs in an experiment visually using MLflow • 4:57
Learn how MLflow enables data scientists and MLOps engineers to compare multiple runs within an experiment using box plots, parameter and metric comparisons, git commits, and artifacts.
MLflow Quick revision • 2:11
Learn how MLflow automates experiment tracking and artifact storage by integrating with Python scripts, enabling data scientists to compare runs via the MLflow UI and manage models securely in production.
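
To make the "instrument your training script" idea from this section concrete, here is a small sketch of a helper that logs parameters and metrics for one run. It takes the tracking module as an argument so it can be exercised without a server; passing the real `mlflow` module is the intended use. The helper name `log_training_run`, the parameter and metric names, and the placeholder URI are assumptions for illustration.

```python
def log_training_run(tracker, params, metrics, run_name=None):
    """Log one training run through an MLflow-style tracking module.

    `tracker` is expected to be the `mlflow` module, or any object exposing
    start_run, log_param and log_metric with the same call shapes.
    """
    with tracker.start_run(run_name=run_name):
        for name, value in params.items():
            tracker.log_param(name, value)
        for name, value in metrics.items():
            tracker.log_metric(name, value)


# Possible use inside a training script, assuming MLflow is installed and a
# tracking server is reachable. The URI and experiment name are placeholders.
#
#   import mlflow
#   mlflow.set_tracking_uri("http://<mlflow-host>:<port>")
#   mlflow.set_experiment("wine-prediction")
#   log_training_run(mlflow, {"learning_rate": 0.01}, {"rmse": 0.72})
```

Each call produces one run in the experiment, which is what makes the side-by-side run comparison in the MLflow UI possible.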

Introduction to Model Deployment and Model Serving • 4:46
Explore model deployment and model serving, packaging, versioning, and deploying ML models with scalable APIs, runtime resources, and autoscaling in production.
Popular ways to Deploy and Serve models in Production by MLOps engineers • 8:15
Explore four popular model deployment and serving strategies in production: virtual machines, Kubernetes, managed SageMaker, and KServe with Knative Serving.
GitHub repository for the next lecture • 0:03
Intent Classifier Model - We will use this model to understand Model Serving • 13:01
Train the intent classifier model with train.py, save the artifact, and expose predictions via a local Flask API at /predict.

Architecture • 12:51
Implement production-grade ML model deployment with a VPC, subnets, internet gateway, and load balancer, using WSGI and Nginx to enable dynamic auto scaling and handle concurrency.
Important Note • 0:04
Implementing WSGI • 8:38
Deploy the intent classifier with WSGI on an EC2 instance using Gunicorn, create a virtual environment, install dependencies, run the server, and test the /predict endpoint with curl.
Userdata script for the Implementation [Important] • 30:34
Write a user data script for auto scaling group launch template to install the model, API, and Python dependencies. Configure Gunicorn and Nginx as scalable services that start on boot.
End to End Implementation with Notes • 40:02
Implement AWS architecture from VPC to load balancer using CLI, including subnets, internet gateway, route table, security groups, launch template, target group, and auto scaling group; document steps in GitHub.
Cleanup resources • 9:43
Delete cloud resources in the correct order to avoid charges, removing the auto scaling group, load balancer, target group, launch template, security group, subnet, and the VPC.
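
The WSGI lectures in this section rest on a simple contract: the server (Gunicorn, in the course) calls a Python callable with the request environment and a `start_response` function. The sketch below shows a `/predict` endpoint written directly against that contract with the standard library. The keyword-based `classify` function is a hypothetical stand-in for the trained intent classifier, and the JSON request and response shapes are assumptions rather than the course's actual API.

```python
import json


def classify(text):
    """Hypothetical stand-in for the trained intent classifier."""
    words = text.lower().split()
    if "hello" in words or "hi" in words:
        return "greeting"
    if "bye" in words or "goodbye" in words:
        return "farewell"
    return "unknown"


def app(environ, start_response):
    """WSGI entry point: POST /predict with a JSON body like {"text": "..."}."""
    json_header = [("Content-Type", "application/json")]
    if environ.get("PATH_INFO") != "/predict" or environ.get("REQUEST_METHOD") != "POST":
        start_response("404 Not Found", json_header)
        return [b'{"error": "not found"}']
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        text = payload["text"]
        if not isinstance(text, str):
            raise TypeError("text must be a string")
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", json_header)
        return [b'{"error": "expected JSON with a text field"}']
    body = json.dumps({"intent": classify(text)}).encode("utf-8")
    start_response("200 OK", json_header + [("Content-Length", str(len(body)))])
    return [body]
```

If this were saved as `wsgi.py`, Gunicorn could serve it with something like `gunicorn --bind 127.0.0.1:8000 wsgi:app`, with Nginx proxying to that address. Frameworks such as Flask expose the same kind of callable, so the deployment steps do not change.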

Important Note • 0:03
Overview • 4:46
Deploy and serve a model on a Kubernetes cluster by preparing a Dockerfile, building an image, pushing to a model registry, and deploying with manifests and ingress for inference.
Preparing Dockerfile for the Project • 11:49
Prepare a Dockerfile using a Python slim base image, set /app as the working directory, and use separate layers for requirements and source code. Train the model, expose port 6000, and run Gunicorn.
Running and testing the Docker Image locally • 11:06
Test a Docker image locally by building from the Dockerfile, running a container, and binding ports to expose the /predict API for model serving. Verify the setup with curl.
Pushing the Image to a Model Registry • 9:46
Push your model container to a model registry, enable versioning, auditing, and RBAC, then tag, push, and pull with Docker Hub or AWS ECR to share and deploy.
Creating and Managing a Kubernetes cluster • 24:35
Learn to prepare and deploy a Kubernetes cluster for model deployment, from local setups with kind to production with EKS/AKS/GKE, plus namespace-based isolation and quotas.
Preparing Kubernetes manifests for Model Deployment • 18:57
Learn to deploy a model container on a Kubernetes cluster by creating a namespace, deployment with replicas, and a service for stable access, using manifests for namespace, deployment, and service.
[Model serving] - Overview • 7:08
Understand model serving as part of deployment, why NodePort exposure is limited in real-time and disconnected environments, and how an ingress controller with a load balancer enables end-user inference.
Ingress Controller deployment • 5:41
Install a traffic ingress controller in a Kubernetes cluster and learn how ingress resources create a load balancer.
Model Serving using Ingress • 10:20
Demonstrate deploying a model in real time with Kubernetes ingress and an ingress controller. Configure an ingress resource to route example.com/predict to the intent classifier service via a load balancer.
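
The manifests discussed in this section (namespace, deployment with replicas, service, and an ingress routing example.com/predict) can be sketched as plain Python dictionaries and serialized to JSON, which `kubectl apply -f` accepts alongside YAML. The namespace, resource name, and image tag below are hypothetical; port 6000 follows the Dockerfile lecture above, and the service port 80 is an assumption.

```python
import json

NAMESPACE = "ml-serving"                 # hypothetical namespace
APP = "intent-classifier"                # hypothetical resource name
IMAGE = "example/intent-classifier:v1"   # hypothetical image tag
CONTAINER_PORT = 6000                    # port exposed by the container
SERVICE_PORT = 80                        # assumed stable port for the service


def build_manifests(replicas=2):
    """Return namespace, deployment, service and ingress manifests as dicts."""
    labels = {"app": APP}
    meta = {"name": APP, "namespace": NAMESPACE}
    namespace = {"apiVersion": "v1", "kind": "Namespace",
                 "metadata": {"name": NAMESPACE}}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": meta,
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{
                    "name": APP,
                    "image": IMAGE,
                    "ports": [{"containerPort": CONTAINER_PORT}],
                }]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": meta,
        "spec": {"selector": labels,
                 "ports": [{"port": SERVICE_PORT, "targetPort": CONTAINER_PORT}]},
    }
    ingress = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": meta,
        "spec": {"rules": [{
            "host": "example.com",
            "http": {"paths": [{
                "path": "/predict",
                "pathType": "Prefix",
                "backend": {"service": {"name": APP,
                                        "port": {"number": SERVICE_PORT}}},
            }]},
        }]},
    }
    return [namespace, deployment, service, ingress]


manifest_list = {"apiVersion": "v1", "kind": "List", "items": build_manifests()}
manifest_json = json.dumps(manifest_list, indent=2)
```

Written to a file, the JSON could be applied with `kubectl apply -f <file>`. The Ingress only takes effect once an ingress controller is installed, and depending on the controller an `ingressClassName` may also be needed.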

Important Note • 0:02
Introduction to KServe • 12:15
Discover KServe, the open-source Kubernetes-based platform that automates model deployment, serving, and inference across frameworks like scikit-learn, TensorFlow, XGBoost, and PyTorch, with automatic scaling via native Kubernetes and KEDA.
KServe Architecture • 7:38
Understand the KServe architecture on Kubernetes. The KServe controller watches inference service CRDs across namespaces, deploys model pods with a horizontal pod autoscaler, and exposes them via ingress.
KServe End to End Demo for a sample model • 11:30
Install cert-manager, set up a local kind cluster, install CRDs and KServe, and deploy an inference service for a sample model exposed via port-forward.
KServe End to End Demo for the Intent Classifier model • 14:02
Deploy and serve the intent classifier on a Kubernetes cluster with KServe. Install cert-manager, CRDs, and the KServe controller; deploy the inference service.
KServe for LLMOps • 1:38
Explore KServe for LLMOps and why the same platforms used for MLOps, such as MLflow and csv, support LLMOps; practice with multiple models to boost your resume.
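
With KServe, the deployment and service pair is replaced by a single InferenceService custom resource that the KServe controller turns into model pods. A minimal sketch of such a resource, built as a Python dictionary and serialized to JSON, might look like the following. The resource name, namespace, and storage URI are hypothetical, and the sketch assumes the `serving.kserve.io/v1beta1` API with a scikit-learn model.

```python
import json


def inference_service(name, namespace, storage_uri, model_format="sklearn"):
    """Build a minimal KServe InferenceService manifest as a dict."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    # Tells KServe which serving runtime to pick.
                    "modelFormat": {"name": model_format},
                    # Where the controller fetches the model artifact from.
                    "storageUri": storage_uri,
                }
            }
        },
    }


isvc = inference_service(
    name="intent-classifier",                              # hypothetical
    namespace="ml-serving",                                # hypothetical
    storage_uri="s3://example-bucket/intent-classifier/",  # hypothetical
)
isvc_json = json.dumps(isvc, indent=2)
```

Compared with hand-written deployment, service, and ingress manifests, the single resource leaves container image selection, scaling, and routing to the platform.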

Introduction to Amazon SageMaker AI • 9:18
Explore Amazon SageMaker AI as a unified end-to-end ML platform that spans training to deployment within a studio, with domains, notebooks, pipelines, and model serving.
Getting Started with Amazon SageMaker AI • 12:29
Learn to get started with Amazon SageMaker AI, create domains for teams, set up user profiles, and access SageMaker Studio tools like Jupyter notebooks, pipelines, and MLflow.
Amazon SageMaker Production Setup - Part 1 • 12:21
Set up SageMaker AI production by creating domains and user profiles, then enforce ABAC with tagging and policies to restrict access to each user's workspace.
Amazon SageMaker Production Setup - Part 2 • 19:10
Implement SageMaker production setup by creating a domain in a VPC with subnets and a SageMaker domain execution role, using IAM authentication and ABAC for user profiles.
Create and Save Models to Registry using SageMaker • 18:16
Create a model with SageMaker, store it in an S3-based model registry, and prepare for deployment and inference, using Jupyter notebooks and a SageMaker Studio workflow.
Deploy and Serve Model for Inference using SageMaker • 20:52
Learn to deploy and serve models on SageMaker by packaging pickle models with inference.py into a tar.gz, uploading to S3, and creating endpoints; compare two deployment approaches.
Delete Resources [Important] • 2:42
Delete SageMaker resources step by step to avoid cloud charges. Remove the resources, spaces, and user profiles before deleting the domain in the SageMaker console.
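
The packaging step mentioned in the deployment lecture of this section (a pickle model plus inference.py bundled into a tar.gz before upload to S3) can be sketched with the standard library alone. The archive layout (`model.pkl` at the root, the script under `code/`), the toy model object, and the contents of the inference script are assumptions; different SageMaker containers expect different layouts, so check the one you deploy with.

```python
import io
import os
import pickle
import tarfile
import tempfile

# Hypothetical inference script: loads the pickled model from the directory
# where the serving container unpacks the archive.
INFERENCE_PY = '''\
import os
import pickle


def model_fn(model_dir):
    """Load the pickled model from the unpacked archive directory."""
    with open(os.path.join(model_dir, "model.pkl"), "rb") as fh:
        return pickle.load(fh)
'''


def package_model(model, inference_source, out_path):
    """Write model.pkl and code/inference.py into a gzip-compressed tar."""
    members = (
        ("model.pkl", pickle.dumps(model)),
        ("code/inference.py", inference_source.encode("utf-8")),
    )
    with tarfile.open(out_path, "w:gz") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return out_path


toy_model = {"weights": [0.1, 0.2]}  # stand-in for a trained model object
archive = package_model(
    toy_model, INFERENCE_PY, os.path.join(tempfile.mkdtemp(), "model.tar.gz")
)
```

The resulting archive would then be uploaded to an S3 bucket (for example with boto3's `upload_file`) and referenced when creating the SageMaker model and endpoint.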

Requirements

Fundamental understanding of DevOps
Basic understanding of DevOps concepts like Docker, Kubernetes and CI/CD.

Description

MLOps Zero to Hero is a practical, hands-on course designed to help engineers understand how machine learning systems are built, deployed, and operated in real production environments. The course focuses on the real challenges teams face after a model is trained: versioning data, tracking experiments, deploying models, scaling inference, and managing ML workloads reliably.

You will start with the fundamentals of the ML lifecycle and gradually move into core MLOps practices. The course covers data and model versioning using DVC, experiment tracking with MLflow, and containerization using Docker. You will deploy models on Kubernetes, understand production-grade serving patterns, and implement Kubernetes-native inference using KServe.

The course also introduces AWS-based MLOps workflows, including Amazon SageMaker, to help you understand how managed platforms are used in real organizations. You will further explore Kubeflow to learn how ML pipelines and training workloads are orchestrated in Kubernetes environments.

Every concept is explained using simple examples and real-world workflows, with a strong emphasis on clarity and practical understanding rather than theory. By the end of the course, you will have a complete picture of how machine learning moves from experimentation to production — and the confidence to design, deploy, and operate MLOps systems in real projects.

Who this course is for:

DevOps Engineers planning to transition to MLOps roles
Beginners curious about Model Deployment and Model Maintenance
Anyone curious about how ML models are handled at the production level.

Course content

Course Introduction • 3 lectures • 3min

Introduction to MLOps • 5 lectures • 45min

Understand the role of MLOps in ML Lifecycle at a High level [Project based] • 7 lectures • 52min

Data Versioning and Data Version Control (DVC) • 4 lectures • 27min

Experiment Tracking and MLflow • 11 lectures • 1hr 18min

Fundamentals of Model Deployment and Model Serving • 4 lectures • 26min

Model Deployment and Serving using Virtual Machines • 6 lectures • 1hr 42min

Model Deployment and Serving using Kubernetes • 10 lectures • 1hr 44min

KServe • 6 lectures • 47min

AWS SageMaker AI • 7 lectures • 1hr 35min
