Teach on Udemy

Turn what you know into an opportunity and reach millions around the world.

Learn More

Your cart is empty.

Keep shopping

Machine Learning, Data Science & AI Engineering with Python

Name: Machine Learning, Data Science & AI Engineering with Python
Rating: 4.5 (36808 reviews)

Complete hands-on deep learning, AI engineering and Generative AI tutorial with data science, Tensorflow, GPT, OpenAI

Created bySundog Education by Frank Kane, Frank Kane, Sundog Education Team

Last updated 5/2026

English

Arabic [Auto],Bulgarian [Auto],

What you'll learn

Build generative AI systems with OpenAI, RAG, and LLM Agents
Build artificial neural networks with Tensorflow and Keras
Implement machine learning at massive scale with Apache Spark's MLLib
Classify images, data, and sentiments using deep learning
Make predictions using linear regression, polynomial regression, and multivariate regression
Data Visualization with MatPlotLib and Seaborn
Understand reinforcement learning - and how to build a Pac-Man bot
Classify data using K-Means clustering, Support Vector Machines (SVM), KNN, Decision Trees, Naive Bayes, and PCA
Use train/test and K-Fold cross validation to choose and tune your models
Build a movie recommender system using item-based and user-based collaborative filtering
Clean your input data to remove outliers
Design and evaluate A/B tests using T-Tests and P-Values

Course content

17 sections • 148 lectures • 20h 13m total length

Introduction2:41
What to expect in this course, who it's for, and the general format we'll follow.
Udemy 101: Getting the Most From This Course2:10
Important note0:24
Installation: Getting Started0:39
[Activity] WINDOWS: Installing and Using Anaconda & Course Materials10:43
[Activity] MAC: Installing and Using Anaconda & Course Materials8:07
[Activity] LINUX: Installing and Using Anaconda & Course Materials9:11
Python Basics, Part 1 [Optional]4:59
[Activity] Python Basics, Part 2 [Optional]5:17
[Activity] Python Basics, Part 3 [Optional]2:46
[Activity] Python Basics, Part 4 [Optional]4:02
Introducing the Pandas Library [Optional]10:14
Master pandas for loading and cleaning tabular data with read_csv, inspect with head and tail, and translate to NumPy arrays for scikit-learn.

Types of Data (Numerical, Categorical, Ordinal)6:58
We cover the differences between continuous and discrete numerical data, categorical data, and ordinal data.
Mean, Median, Mode5:26
A refresher on mean, median, and mode - and when it's appropriate to use each.
[Activity] Using mean, median, and mode in Python8:30
[Activity] Variation and Standard Deviation11:12
Probability Density Function; Probability Mass Function3:27
Introducing the concepts of probability density functions (PDF's) and probability mass functions (PMF's).
Common Data Distributions (Normal, Binomial, Poisson, etc)7:45
[Activity] Percentiles and Moments12:32
[Activity] A Crash Course in matplotlib12:13
[Activity] Advanced Visualization with Seaborn17:30
[Activity] Covariance and Correlation11:31
[Exercise] Conditional Probability16:04
Exercise Solution: Conditional Probability of Purchase by Age2:20
Here we'll go over my solution to the exercise I challenged you with in the previous lecture - changing our fabricated data to have no real correlation between ages and purchases, and seeing if you can detect that using conditional probability.
Bayes' Theorem5:23
An overview of Bayes' Theorem, and an example of using it to uncover misleading statistics surrounding the accuracy of drug testing.

[Activity] Linear Regression11:01
Explore linear regression by fitting a straight line with least squares (ols) to predict unseen values, evaluate fit with r squared, and compare gradient descent and python/scipy tools.
[Activity] Polynomial Regression8:04
[Activity] Multiple Regression, and Predicting Car Prices16:26
Explore multiple regression to predict car prices using multiple features, and learn how ordinary least squares estimates coefficients, normalizes data, and handles outliers in statsmodels.
Multi-Level Models4:36
We'll just cover the concept of multi-level modeling, as it is a very advanced topic. But you'll get the ideas and challenges behind it.

Supervised vs. Unsupervised Learning, and Train/Test8:57
The concepts of supervised and unsupervised machine learning, and how to evaluate the ability of a machine learning model to predict new values using the train/test technique.
[Activity] Using Train/Test to Prevent Overfitting a Polynomial Regression5:47
Learn how to use a train and test split to evaluate polynomial regression, prevent overfitting, and compare training and test performance using the R squared score.
Bayesian Methods: Concepts3:59
We'll introduce the concept of Naive Bayes and how we might apply it to the problem of building a spam classifier.
[Activity] Implementing a Spam Classifier with Naive Bayes8:05
K-Means Clustering7:23
K-Means is a way to identify things that are similar to each other. It's a case of unsupervised learning, which could result in clusters you never expected!
[Activity] Clustering people based on income and age5:14
Measuring Entropy3:09
Entropy is a measure of the disorder in a data set - we'll learn what that means, and how to compute it mathematically.
[Activity] WINDOWS: Installing Graphviz0:22
[Activity] MAC: Installing Graphviz1:16
[Activity] LINUX: Installing Graphviz0:54
Decision Trees: Concepts8:43
Decision trees can automatically create a flow chart for making some decision, based on machine learning! Let's learn how they work.
[Activity] Decision Trees: Predicting Hiring Decisions9:47
Ensemble Learning5:59
Random Forests was an example of ensemble learning; we'll cover over techniques for combining the results of many models to create a better result than any one could produce on its own.
[Activity] XGBoost15:29
Support Vector Machines (SVM) Overview4:27
Support Vector Machines are an advanced technique for classifying data that has multiple features. It treats those features as dimensions, and partitions this higher-dimensional space using "support vectors."
[Activity] Using SVM to cluster people using scikit-learn9:29

User-Based Collaborative Filtering7:57
One way to recommend items is to look for other people similar to you based on their behavior, and recommend stuff they liked that you haven't seen yet.
Item-Based Collaborative Filtering8:15
The shortcomings of user-based collaborative filtering can be solved by flipping it on its head, and instead looking at relationships between items instead of relationships between people.
[Activity] Finding Movie Similarities using Cosine Similarity9:08
Apply item-based collaborative filtering to find movie similarities using Movielens rating data, exploring with pandas pivot tables, correlation, and missing-value handling to interpret top similar movies.
[Activity] Improving the Results on Movie Similarities7:59
[Activity] Making Movie Recommendations with Item-Based Collaborative Filtering10:22
Build an item-based collaborative filtering recommender with pandas on the movielens data, creating a fake user profile and a user–movie sparse matrix, computing movie correlations, and generating top recommendations.
[Exercise] Improve the recommender's results5:29
Experiment with improving movie recommendations by tweaking item-based collaborative filtering parameters, such as correlation method and minimum period. Evaluate using train-test techniques and RMSE to balance discovery and confidence.

K-Nearest-Neighbors: Concepts3:44
KNN is a very simple supervised machine learning technique; we'll quickly cover the concept here.
[Activity] Using KNN to predict a rating for a movie12:29
Apply k nearest neighbors to predict a movie's rating using genre vectors, popularity, and mean rating, via a distance metric based on cosine similarity and popularity difference.
Dimensionality Reduction; Principal Component Analysis (PCA)5:44
Data that includes many features or many different vectors can be thought of as having many dimensions. Often it's useful to reduce those dimensions down to something more easily visualized, for compression, or to just distill the most important information from a data set (that is, information that contributes the most to the data's variance.) Principal Component Analysis and Singular Value Decomposition do that.
[Activity] PCA Example with the Iris data set9:05
Data Warehousing Overview: ETL and ELT9:05
Cloud-based data storage and analysis systems like Hadoop, Hive, Spark, and MapReduce are turning the field of data warehousing on its head. Instead of extracting, transforming, and then loading data into a data warehouse, the transformation step is now more efficiently done using a cluster after it's already been loaded. With computing and storage resources so cheap, this new approach now makes sense.
Reinforcement Learning12:44
We'll describe the concept of reinforcement learning - including Markov Decision Processes, Q-Learning, and Dynamic Programming - all using a simple example of developing an intelligent Pac-Man.
[Activity] Reinforcement Learning & Q-Learning with Gym12:56
Explore reinforcement learning with OpenAI Gym by training a taxi agent using Q-learning, a Q-table, and exploration strategies to optimize pickup and drop-off routes.
Understanding a Confusion Matrix5:17
What's a confusion matrix, and how do I read it?
Measuring Classifiers (Precision, Recall, F1, ROC, AUC)6:35

Bias/Variance Tradeoff6:15
Bias and Variance both contribute to overall error; understand these components of error and how they relate to each other.
[Activity] K-Fold Cross-Validation to avoid overfitting10:26
Data Cleaning and Normalization7:10
Cleaning your raw input data is often the most important, and time-consuming, part of your job as a data scientist!
[Activity] Cleaning web log data10:56
Normalizing numerical data3:22
A brief reminder: some models require input data to be normalized, or within the same range, of each other. Always read the documentation on the techniques you are using.
[Activity] Detecting outliers6:21
Identify and handle outliers responsibly using standard deviation and interquartile range. Implement principled filtering beyond two standard deviations of the median to preserve meaningful means, demonstrated with income data.
Feature Engineering and the Curse of Dimensionality6:03
Imputation Techniques for Missing Data7:48
Master imputation techniques for missing data, from mean replacement and median considerations to KNN and MICE, while emphasizing data quality and the option to collect more data.
Handling Unbalanced Data: Oversampling, Undersampling, and SMOTE5:35
Explore how to handle unbalanced data in fraud detection using oversampling, undersampling, and SMOTE. Learn how adjusting prediction thresholds balances false positives and false negatives.
Binning, Transforming, Encoding, Scaling, and Shuffling7:51

Warning about Java 21+ and Spark 3!0:39
Spark installation notes for MacOS and Linux users1:32
[Activity] Installing Spark11:06
We'll present an overview of the steps needed to install Apache Spark on your desktop in standalone mode, and get started by getting a Java Development Kit installed on your system.
Spark Introduction9:10
A high-level overview of Apache Spark, what it is, and how it works.
Spark and the Resilient Distributed Dataset (RDD)11:42
We'll go in more depth on the core of Spark - the RDD object, and what you can do with it.
Introducing MLLib5:09
A quick overview of MLLib's capabilities, and the new data types it introduces to Spark.
Introduction to Decision Trees in Spark16:15
We'll walk through an example of coding up and running a decision tree using Apache Spark's MLLib! In this exercise, we try to predict if a job candidate will be hired based on their work and educational history, using a decision tree that can be distributed across an entire cluster with Spark.
[Activity] K-Means Clustering in Spark11:23
We'll take the same example of clustering people by age and income from our earlier K-Means lecture - but solve it in Spark!
TF / IDF6:44
We'll introduce the concept of TF-IDF (Term Frequency / Inverse Document Frequency) and how it applies to search problems, in preparation for using it with MLLib.
[Activity] Searching Wikipedia with Spark8:21
Let's use TF-IDF, Spark, and MLLib to create a rudimentary search engine for real Wikipedia pages!
[Activity] Using the Spark DataFrame API for MLLib8:07
Spark 2.0 introduced a new API for MLLib based on DataFrame objects; we'll look at an example of using this to create and use a linear regression model.

Deploying Models to Real-Time Systems8:42
High-level thoughts on various ways to deploy your trained models to production systems including apps and websites.
A/B Testing Concepts8:23
Running controlled experiments on your website usually involves a technique called the A/B test. We'll learn how they work.
T-Tests and P-Values5:59
How to determine significance of an A/B tests results, and measure the probability of the results being just from random chance, using T-Tests, the T-statistic, and the P-value.
[Activity] Hands-on with T-Tests6:03
Determining How Long to Run an Experiment3:24
Some A/B tests just don't affect customer behavior one way or another. How do you know how long to let an experiment run for before giving up?
A/B Test Gotchas9:26
There are many limitations associated with running short-term A/B tests - novelty effects, seasonal effects, and more can lead you to the wrong decisions. We'll discuss the forces that may result in misleading A/B test results so you can watch out for them.

Deep Learning Pre-Requisites11:43
If you skipped ahead, I'll show you where to get the course materials for just this section. And we'll cover some pre-requisite concepts for understanding how neural networks operate: gradient descent, autodiff, and softmax.
The History of Artificial Neural Networks11:14
We'll cover the evolution of artificial neural networks from 1943 to modern-day architectures, which is a great way to understand how they work.
[Activity] Deep Learning in the Tensorflow Playground12:00
Google's Tensorflow Playground lets you experiment with deep neural networks and understand them - without writing a line of code!
Deep Learning Details9:29
Let's dive into the details on how modern multi-level perceptrons are trained and tuned.
Introducing Tensorflow11:29
We'll cover Google's open-source Tensorflow Python library, and see how it can help you create and train neural networks.
[Activity] Using Tensorflow, Part 113:11
We'll use Tensorflow to create a neural network that classifies handwritten numerals from the MNIST data set. Part 1 of 2.
[Activity] Using Tensorflow, Part 212:03
We'll use Tensorflow to create a neural network that classifies handwritten numerals from the MNIST data set. Part 2 of 2.
[Activity] Introducing Keras13:33
The Tensorflow 1.9 offers a higher-level API called Keras, and makes it easier to construct your neural networks. We'll use Keras to solve the same handwriting recognition problem - but with much less code.
[Activity] Using Keras to Predict Political Affiliations11:48
As another hands-on example, we'll use Keras to build a neural network that learns how to determine if a politician is Republican on Democrat just based on their votes.
Convolutional Neural Networks (CNN's)11:28
CNN's mimic your visual cortex, and can find features in one, two, or three-dimensional data even if you're not sure where exactly that feature is.
[Activity] Using CNN's for handwriting recognition8:02
CNN's are better suited to image data, and we'll prove that by using a CNN in Keras on the MNIST data.
Recurrent Neural Networks (RNN's)11:02
RNN's can handle sequences of data, like events over time or words in a sentence. Learn what's different about how they work, how they are trained, and ways to optimize them.
[Activity] Using a RNN for sentiment analysis9:37
Let's implement a RNN in Keras to determine positive or negative sentiments for real movie reviews from IMDb!
Tuning Neural Networks: Learning Rate and Batch Size Hyperparameters4:39
Deep Learning Regularization with Dropout and Early Stopping6:21
The Ethics of Deep Learning11:02
As with any new technology, sometimes we can become overzealous in how we use it. A few cautionary tales to make sure your deep learning work does more good than harm.

Requirements

You'll need a desktop computer (Windows, Mac, or Linux) capable of running Anaconda 3 or newer. The course will walk you through installing the necessary free software.
Some prior coding or scripting experience is required.
At least high school level math skills will be required.

Description

Master Machine Learning & AI Engineering — From Data Analytics to Agentic AI Solutions

Launch your career in AI with a comprehensive, hands-on course that takes you from beginner to advanced. Learn Python, data science, classical machine learning, and the latest in AI engineering—including generative AI, transformers, and LLM agents / agentic AI.

Why This Course?

Learn by Doing

With over 145 lectures and 21+ hours of video content, this course is built around practical Python projects and real-world use cases—not just theory.

Built for the Real World

Learn how companies like Google, Amazon, and OpenAI use AI to drive innovation. Our curriculum is based on skills in demand from leading tech employers.

No Experience? No Problem

Start from scratch with beginner-friendly lessons in Python and statistics. By the end, you’ll be building intelligent systems with cutting-edge AI tools.

A Structured Path from Beginner to AI Engineer

1. Programming Foundations
Start with a crash course in Python, designed for beginners. You’ll learn the language fundamentals needed for data science and AI work.

2. Data Science and Statistics
Build a solid foundation in data analysis, visualization, descriptive and inferential statistics, and feature engineering—essential skills for working with real-world datasets.

3. Classical Machine Learning
Explore supervised and unsupervised learning, including linear regression, decision trees, SVMs, clustering, ensemble models, and reinforcement learning.

4. Deep Learning with TensorFlow and Keras
Understand neural networks, convolutional neural networks (CNNs), and recurrent neural networks (RNNs), using real code examples and exercises.

5. Advanced AI Engineering and Generative AI
Go beyond traditional ML to learn the latest AI tools and techniques:

Transformers and self-attention mechanisms
GPT, ChatGPT, and the OpenAI API
Fine-tuning foundation models
Advanced Retrieval-Augmented Generation (RAG)
LangChain and LLM agents
Designing and building multi-agent systems with the OpenAI Agents SDK
Real-world GenAI projects and deployment strategies

6. Big Data and Apache Spark
Learn how to scale machine learning to large datasets using Spark, and apply ML techniques on distributed computing clusters.

Designed for Career Growth

Whether you're a programmer looking to pivot into AI or a tech professional seeking to expand your skills, this course delivers a complete, industry-relevant education. Concepts are explained clearly, in plain English, with a focus on applying what you learn.

What Students Are Saying

"I started doing your course... and it was pivotal in helping me transition into a role where I now solve corporate problems using AI. Your course demystified how to succeed in corporate AI research, making you the most impressive instructor in ML I've encountered."
— Kanad Basu, PhD

Enroll Today and Build Your Future in AI

Join thousands of learners who have used this course to land jobs, lead projects, and build real AI applications. Stay ahead in one of the fastest-growing fields in tech.

Start your journey today—from Python beginner to AI engineer.

Who this course is for:

Software developers or programmers who want to transition into the lucrative data science and machine learning career path will learn a lot from this course.
Technologists curious about how deep learning really works
Data analysts in the finance or other non-tech industries who want to transition into the tech industry can use this course to learn how to analyze data using code instead of tools. But, you'll need some prior experience in coding or scripting to be successful.
If you have no prior coding or scripting experience, you should NOT take this course - yet. Go take an introductory Python course first.

Machine Learning, Data Science & AI Engineering with Python

What you'll learn

Explore related topics

Course content

Getting Started12 lectures • 1hr 1min

Statistics and Probability Refresher, and Python Practice13 lectures • 2hr 1min

Predictive Models4 lectures • 40min

Machine Learning with Python16 lectures • 1hr 39min

Recommender Systems6 lectures • 49min

More Data Mining and Machine Learning Techniques9 lectures • 1hr 18min

Dealing with Real-World Data10 lectures • 1hr 12min

Apache Spark: Machine Learning on Big Data11 lectures • 1hr 30min

Experimental Design / ML in the Real World6 lectures • 42min

Deep Learning and Neural Networks16 lectures • 2hr 49min

Requirements

Description

Who this course is for: