Master statistics & machine learning: intuition, math, code

Name: Master statistics & machine learning: intuition, math, code
Rating: 4.8 (3040 reviews)

A rigorous and engaging deep-dive into statistics and machine-learning, with hands-on applications in Python and MATLAB.

Bestseller

Highest Rated

Created by Mike X Cohen

Last updated 6/2026

English

What you'll learn

Descriptive statistics (mean, variance, etc)
Inferential statistics
T-tests, correlation, ANOVA, regression, clustering
The math behind the "black box" statistical methods
How to implement statistical methods in code
How to interpret statistics correctly and avoid common misunderstandings
Coding techniques in Python and MATLAB/Octave
Machine learning methods like clustering, predictive analysis, classification, and data cleaning

Course content

19 sections • 224 lectures • 38h 20m total length

[Important] Getting the most out of this course4:28
Strategies for optimal learning.
About using MATLAB or Python4:09
How to use different programming languages in the course.
Statistics guessing game!8:47
Simulate data and run a statistical analysis. A fun way to start the course :)
Using the Q&A forum5:16
I explain how to get the most out of the interactive part of this course: The Q&A forum!
(optional) Entering time-stamped notes in the Udemy video player1:52

Should you memorize statistical formulas?3:12
A discussion about memorizing formulas.
Arithmetic and exponents4:02
A reminder about foundational arithmetic rules.
Scientific notation5:53
Ways of representing very large and very small numbers.
Summation notation4:21
Mathematical notation for adding a series of numbers.
Absolute value3:04
Absolute value is the distance away from zero, regardless of sign.
Natural exponent and logarithm8:00
Natural exponent and logarithm are two of the most important functions in math and its applications.
The logistic function8:58
The logistic function is used often in statistics, machine learning, and optimization.
Rank and tied-rank6:30
To rank data means to transform raw numerical values into ordinal position. Rank is used in non-parametric statistics.
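As a rough illustration of the rank and tied-rank idea (a sketch, not taken from the course materials), the snippet below converts raw values into ordinal positions and gives tied values the average of the positions they occupy. The function name `tied_rank` and the sample data are made up for this example.

```python
def tied_rank(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    # indices of the data, ordered from smallest to largest value
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # sorted positions are 0-based, ranks are 1-based
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


data = [10, 30, 20, 30, 5]  # hypothetical values; the two 30s are tied
ranks = tied_rank(data)
print(ranks)
```

The two tied values would otherwise take positions 4 and 5, so each receives rank 4.5.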

Is "data" singular or plural?!?!!?!1:53
My take on statistical terminology, grammar, and modern culture.
Where do data come from and what do they mean?6:09
A philosophical discussion about how we can obtain numbers from the universe.
Types of data: categorical, numerical, etc14:56
Data come in different forms, which has implications for ways of visualizing and analyzing data.
Code: representing types of data on computers8:58
Introduction to data types in MATLAB and Python.
Sample vs. population data12:02
There is an important distinction between measuring *all* of the data vs. some of the data.
Samples, case reports, and anecdotes5:31
This distinction is related to sample size, and has implications for the generalizability of experimental findings.
The ethics of making up data6:57
The take-home message here is simple: Don't lie or cheat!

Bar plots11:37
Lecture on how to create and interpret bar plots, including the types of data that are used.
Code: bar plots16:59
Creating bar plots in MATLAB and Python, including parameters.
Box-and-whisker plots5:41
Creating and interpreting box plots, also called box-and-whisker plots.
Code: box plots8:41
Box plots in MATLAB and Python.
"Unsupervised learning": Boxplots of normal and uniform noise2:31
An exercise on creating box plots of random numbers drawn from different distributions.
Histograms11:16
A lecture on how to create and interpret histograms, including frequency vs. proportion.
Code: histograms16:40
Creating and visualizing histograms in code.
"Unsupervised learning": Histogram proportion2:22
An exercise on transforming frequencies (counts) into proportions.
Pie charts5:59
Pie charts are nice visualizations when your data add up to 100%.
Code: pie charts13:22
Create pie charts in code. It's easier than you think!
When to use lines instead of bars6:11
A critical discussion of how to visualize categorical vs. continuous data using lines vs. bars.
Linear vs. logarithmic axis scaling9:04
A comparison of scaling the y-axis and x-axis intervals.
Code: line plots7:24
More on plotting and parameterizing line plots in code.
"Unsupervised learning": log-scaled plots1:44
An exercise on scaling data in different ways.
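The frequency-vs-proportion distinction from the histogram lectures can be sketched in a few lines. This is only an illustration under assumed data: the helper `histogram_counts`, the bin count, and the values are hypothetical, and the course's own code may bin data differently.

```python
def histogram_counts(data, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # the maximum value would land one past the end, so clamp it to the last bin
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # made-up values
counts = histogram_counts(data, n_bins=4)

# proportions are counts divided by the total number of observations
proportions = [c / len(data) for c in counts]

print("counts:     ", counts)
print("proportions:", proportions)
```

Counts depend on the sample size, whereas proportions always sum to 1, which makes histograms of differently sized datasets easier to compare.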

Descriptive vs. inferential statistics4:31
The term "statistics" actually has two broad meanings: characteristics of a sample vs. generalizing to other samples.
Accuracy, precision, resolution7:28
These terms describe how your data relate to the real-world objects that the data measure.
Data distributions11:26
Data come in different distributions, which has implications for how to visualize and analyze datasets.
Code: data from different distributions32:08
You will learn how to create random data with different distributions in MATLAB and Python.
"Unsupervised learning": histograms of distributions1:57
What happens when you plot the distribution of a distribution function? Find out!
The beauty and simplicity of Normal5:29
The Gaussian distribution describes a remarkable and fundamental quality of the universe.
Measures of central tendency (mean)12:47
The mean, aka average, is the most common and insightful measure of a data set.
Measures of central tendency (median, mode)12:17
The mean is not appropriate for all data distributions; here you will learn two non-parametric measures of dataset centrality.
Code: computing central tendency13:57
Computing mean, median, and mode in MATLAB and Python.
"Unsupervised learning": central tendencies with outliers3:07
An exercise to help you understand the impact of outliers on mean, median, and mode.
Measures of dispersion (variance, standard deviation)17:48
You will learn about dispersion, which is how wide the data distribution is.
Code: Computing dispersion26:33
Computing different measures of dispersion in code.
Interquartile range (IQR)4:53
IQR is a measure of the spread of most (but not all) of the data, and is robust to outliers.
Code: IQR15:58
See how to generate the interquartile range in code.
QQ plots7:20
QQ plots show how your data compare to a theoretical normal (Gaussian) distribution.
Code: QQ plots15:34
Learn how QQ plots are created in Python and MATLAB.
Statistical "moments"8:23
Moments are statistical characteristics of the data. Here you'll learn the first four moments of a distribution.
Histograms part 2: Number of bins10:00
More on histograms: Learn the formulas for determining the number of bins (data discretizations) to use.
Code: Histogram bins12:24
Experiment with histogram parameters.
Violin plots3:19
Learn how to create and interpret a beautiful graph for visualizing data and data distributions.
Code: violin plots10:09
See how violin plots are created in code. Tip: Use lots of colors!
"Unsupervised learning": asymmetric violin plots2:31
An exercise to visualize two data distributions in one violin plot.
Shannon entropy11:02
Learn how to interpret this nonlinear measure of data dispersion.
Code: entropy20:15
Shannon entropy in code.
"Unsupervised learning": entropy and number of bins1:26
You will see how the bin-count parameter affects entropy.
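For the Shannon entropy lectures, a minimal sketch of the standard formula, H = -sum(p * log2(p)), applied to categorical data might look like the following. The function name and the coin-flip data are assumptions made for this example, not the course's code.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    # each category contributes -p * log2(p), where p is its proportion
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


fair_coin = ["H", "T"] * 50              # 50% heads, 50% tails
biased_coin = ["H"] * 90 + ["T"] * 10    # 90% heads, 10% tails

h_fair = shannon_entropy(fair_coin)
h_biased = shannon_entropy(biased_coin)

print("fair coin entropy (bits):  ", h_fair)
print("biased coin entropy (bits):", h_biased)
```

The more predictable sequence has lower entropy. For continuous data the values must first be binned, which is why the number of bins affects the result.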

Garbage in, garbage out (GIGO)4:10
No amount of fancy statistics or data cleaning can fix terrible data. Start with good data!
Z-score standardization9:25
Z-score is the most important data normalization in statistics and machine learning.
Code: z-score12:50
Translate the z-score formula into code.
Min-max scaling5:06
Min-max scaling is the second-most important data normalization method.
Code: min-max scaling8:16
Translate min-max scaling into Python and MATLAB code.
"Unsupervised learning": Invert the min-max scaling2:35
An exercise to get from normalized data back to their original scale.
What are outliers and why are they dangerous?14:26
Outliers are unusual values that can completely screw up your analyses and interpretation!
Removing outliers: z-score method9:26
This is one of the most common methods for identifying and removing outliers.
The modified z-score method4:03
The modified z-score method uses the median instead of the mean, and therefore is good for removing outliers in non-normal distributions.
Code: z-score for outlier removal22:30
Implement the modified z-score method in code.
"Unsupervised learning": z vs. modified-z2:38
Does it really matter if you use the regular or modified z-score method? Come find out!
Multivariate outlier detection9:26
Extend the z-score method to outliers in high-dimensional datasets.
Code: Euclidean distance for outlier removal9:01
Multivariate outlier identification and removal, using concepts from geometry.
Removing outliers by data trimming5:47
Another common method for removing outliers, based on threshold-exceedance.
Code: Data trimming to remove outliers11:03
See how data trimming is implemented in MATLAB and Python.
Non-parametric solutions to outliers4:40
Instead of removing outliers, you can use analyses that are robust to outliers.
Nonlinear data transformations13:46
Some outliers can be transformed into non-outliers by applying certain nonlinear transformations.
An outlier lecture on personal accountability3:03
A lecture on one of the main challenges of online learning. Just something to reflect on.
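To make the z-score and modified z-score methods concrete, here is a small sketch using only Python's standard library. The data are invented, and the thresholds (3 for the z-score, 3.5 for the modified z-score) and the 0.6745 scaling constant are common conventions assumed here rather than values confirmed by this page.

```python
import statistics


def z_scores(data):
    """Ordinary z-scores: distance from the mean in standard-deviation units."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - mean) / sd for x in data]


def modified_z_scores(data):
    """Median-based z-scores using the median absolute deviation (MAD)."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    # 0.6745 rescales the MAD so it is comparable to a standard deviation
    # when the data are normally distributed (assumed convention)
    return [0.6745 * (x - med) / mad for x in data]


data = [2.1, 2.4, 1.9, 2.2, 2.0, 2.3, 9.5]  # made-up values; 9.5 is a planted outlier

z = z_scores(data)
mz = modified_z_scores(data)

flagged_z = [x for x, score in zip(data, z) if abs(score) > 3.0]
flagged_mz = [x for x, score in zip(data, mz) if abs(score) > 3.5]

print("flagged by z-score:         ", flagged_z)
print("flagged by modified z-score:", flagged_mz)
```

In this toy sample the extreme value is flagged by the modified z-score but not by the ordinary z-score, because the outlier itself inflates the mean and standard deviation while the median and MAD barely move.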

What is probability?12:17
Introduction to probability and the role of probability in statistics.
Probability vs. proportion9:25
Probability and proportion are really similar concepts, but it's important to know their subtle difference.
Computing probabilities10:28
Instructions on how to compute probabilities (math).
Code: compute probabilities14:34
How to compute probabilities in practice (code).
Probability and odds4:58
Probability and odds are different concepts; see how they differ and how to interpret odds ratios.
"Unsupervised learning": probabilities of odds-space2:30
This exercise on odds-ratios will help make sure you really understand the math of odds-ratios.
Probability mass vs. density13:06
Different terms are used for probabilities, depending on the data type (categorical vs. continuous).
Code: compute probability mass functions11:37
Compute empirical probability mass functions.
Cumulative distribution functions13:46
cdfs are central to evaluating statistical significance. In this video you'll learn how to create and interpret cdfs.
Code: cdfs and pdfs10:10
Here you will learn how to compute cdfs from pdfs, including a potentially confusing aspect of their relationship.
"Unsupervised learning": cdf's for various distributions2:25
An exercise to create cdfs from various random distributions.
Creating sample estimate distributions18:31
Learn how to create a distribution of means from repeated samples. This is key to hypothesis-testing.
Monte Carlo sampling2:53
You already know how to do Monte Carlo sampling; here I will make sure you know the terminology.
Sampling variability, noise, and other annoyances8:41
Sampling isn't perfect, and understanding its limitations will help you properly interpret statistical results.
Code: sampling variability26:15
See an example of sampling variability in code using random data.
Expected value10:09
"Expected value" is really similar to the "average value" but incorporates probabilities of data values in the population.
Conditional probability12:45
The world is a complicated place, and some probabilities make sense only when taking other factors into account.
Code: conditional probabilities20:12
A hands-on example of conditional probabilities in MATLAB/Python.
Tree diagrams for conditional probabilities6:24
Learn how to construct and interpret tree diagrams.
The Law of Large Numbers9:50
The LLN shows how larger samples better approximate the true population parameters.
Code: Law of Large Numbers in action19:23
Think the LLN is too good to be true? You can see it for yourself in code!
The Central Limit Theorem10:34
The CLT is a surprising and yet fundamental aspect of the universe. Hint: All roads lead to Gauss.
Code: the CLT in action16:21
Think the CLT is too good to be true? You can see it for yourself in code!
"Unsupervised learning": Averaging pairs of numbers2:09
Another (surprising!) example of the CLT.
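The Law of Large Numbers and the Central Limit Theorem can both be checked with simulated data. The sketch below is one possible illustration, not the course's own code: the seed, the sample sizes, and the choice of uniform and exponential populations are assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Law of Large Numbers: the mean of uniform(0, 1) draws approaches 0.5
draws = [random.random() for _ in range(100_000)]
mean_small = sum(draws[:10]) / 10
mean_large = sum(draws) / len(draws)

# Central Limit Theorem: means of repeated samples from a skewed
# (exponential) population cluster around the population mean of 1.0
sample_size = 30
n_samples = 5_000
sample_means = []
for _ in range(n_samples):
    sample = [random.expovariate(1.0) for _ in range(sample_size)]
    sample_means.append(sum(sample) / sample_size)

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = 1.0 / sample_size ** 0.5  # population sd / sqrt(n)

print("mean of 10 draws:     ", mean_small)
print("mean of 100000 draws: ", mean_large)
print("mean of sample means: ", mean_of_means)
print("sd of sample means:   ", sd_of_means, "(theory:", theoretical_sd, ")")
```

Plotting a histogram of `sample_means` should show a roughly bell-shaped curve even though the underlying exponential population is strongly skewed.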

IVs, DVs, models, and other stats lingo16:45
You will learn important statistical terminology.
What is an hypothesis and how do you specify one?15:08
Perhaps the most important (and easily confused) term in science.
Sample distributions under null and alternative hypotheses10:38
You will understand how sample statistics are distributed in different states of reality.
P-values: definition, tails, and misinterpretations18:54
What the heck is a p-value?!? Watch this video to find out! You'll also learn common misunderstandings.
P-z combinations that you should memorize6:51
P-values are closely related to z-values (from the z-score normalization). A few pairs of relationships are worth memorizing.
Degrees of freedom12:21
Degrees of freedom (abbreviated df) are used in computing p-values, and therefore in understanding and evaluating statistical significance.
Type 1 and Type 2 errors14:18
Here you will learn two common errors in statistical inference; what they mean, where they come from, and why you cannot avoid them.
Parametric vs. non-parametric tests9:12
Parametric statistics rely on assumptions about null-hypothesis distributions. When those assumptions are violated (or cannot be verified), non-parametric approaches might be more suitable.
Multiple comparisons and Bonferroni correction12:36
You are more likely to make statistical errors when you run multiple statistical tests. In this video, you will learn how to use Bonferroni correction to minimize the risk of statistical errors resulting from multiple comparisons.
Statistical vs. theoretical vs. clinical significance6:51
"Significance" is an ambiguous and multifaceted term; it's not just about p<.05.
Cross-validation11:30
Cross-validation is a different statistical approach, often used in machine-learning. In this video, you will learn what it is and how to compute it.
Statistical significance vs. classification accuracy11:12
This video follows up on the previous video by discussing the relationship (and differences) between cross-validation accuracy and statistical significance (p-values).
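The Bonferroni correction mentioned above amounts to dividing the significance threshold by the number of tests. A minimal sketch, with invented p-values and a hypothetical helper name:

```python
def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and which tests remain significant."""
    n_tests = len(p_values)
    threshold = alpha / n_tests  # stricter per-test threshold
    return threshold, [p <= threshold for p in p_values]


p_values = [0.001, 0.012, 0.03, 0.2]  # made-up p-values from four tests
threshold, significant = bonferroni(p_values)

print("corrected threshold:", threshold)
for p, is_sig in zip(p_values, significant):
    print(f"p = {p}: {'significant' if is_sig else 'not significant'}")
```

Note that p = 0.03 would pass an uncorrected threshold of 0.05 but does not survive the correction for four comparisons.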

Purpose and interpretation of the t-test13:13
The t-test is a simple yet super-duper important statistical test for comparing two groups. Here you will learn about the idea and interpretation of the t-test.
One-sample t-test8:08
There are several t-test formulas depending on the experiment design. The one-sample t-test is the most basic -- and the easiest to learn.
Code: One-sample t-test20:46
See the t-test implemented in code, both manually and using functions.
"Unsupervised learning": The role of variance2:50
Data variance is super-important for the t-test effect size (which comes from the formula for the t-test). In this code challenge, you will explore this relationship yourself!
Two-samples t-test13:06
The two-samples t-test is a minor modification of the one-sample t-test, and is used when you compare two groups against each other.
Code: Two-samples t-test22:09
See the two-samples t-test in practice.
"Unsupervised learning": Importance of N for t-test4:45
This exercise will help you discover the importance of sample size in statistical inference, even while holding constant the effect size and variance.
Wilcoxon signed-rank (nonparametric t-test)7:35
When your data violate assumptions of the t-test, or when you have outliers, you can use non-parametric t-tests.
Code: Signed-rank test18:33
Learn how to implement the signed-rank test in code.
Mann-Whitney U test (nonparametric t-test)6:03
Nonparametric two-sample t-test to test for differences in medians between two groups.
Code: Mann-Whitney U test5:21
See the nonparametric median-based two-group t-test implemented in code.
Permutation testing for t-test significance11:25
Learn how to create your own empirical null-hypothesis distribution to determine statistical significance, instead of relying on assumptions and formulas.
Code: permutation testing25:26
Permutation testing in code.
"Unsupervised learning": How many permutations?5:21
This exercise is your opportunity to explore the number of permutations that you need to create a null-hypothesis distribution.
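Permutation testing, as described above, builds an empirical null-hypothesis distribution by shuffling group labels. The following is a sketch under stated assumptions rather than the course's implementation: the data are made up, the test statistic is the difference in means, and the "+1" in the p-value is a common convention that keeps the estimate from being exactly zero.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # local generator so results are repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_extreme = 0
    for _ in range(n_permutations):
        # shuffling the pooled data breaks any link between value and group
        rng.shuffle(pooled)
        shuffled_diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(shuffled_diff) >= abs(observed):
            n_extreme += 1
    # proportion of shuffles at least as extreme as the observed difference
    p_value = (n_extreme + 1) / (n_permutations + 1)
    return observed, p_value


group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.3]  # hypothetical measurements
group_b = [4.2, 4.0, 4.6, 4.4, 3.9, 4.5, 4.1, 4.3]

observed, p_value = permutation_test(group_a, group_b)
print("observed difference in means:", observed)
print("permutation p-value:", p_value)
```

Increasing `n_permutations` gives a finer-grained p-value at the cost of more computation, which is the trade-off explored in the "How many permutations?" exercise.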

Requirements

Good work ethic and motivation to learn.
Previous background in statistics or machine learning is not necessary.
Python -OR- MATLAB with the Statistics toolbox (or Octave).
Some coding familiarity for the optional code exercises.
No textbooks necessary! All materials are provided inside the course.

Description

Statistics and probability control your life. I don't just mean what YouTube's algorithm recommends that you watch next, and I don't just mean the chance of meeting your future significant other in class or at a bar. Human behavior, single-cell organisms, earthquakes, the stock market, whether it will snow in the first week of December, and countless other phenomena are probabilistic and statistical. Even the very nature of the most fundamental deep structure of the universe is governed by probability and statistics.

You need to understand statistics.

Nearly all areas of human civilization are incorporating code and numerical computations. This means that many jobs and areas of study are based on applications of statistical and machine-learning techniques in programming languages like Python and MATLAB. This is often called 'data science' and is an increasingly important topic. Statistics and machine learning are also fundamental to artificial intelligence (AI) and business intelligence.

If you want to make yourself a future-proof employee, employer, data scientist, or researcher in any technical field -- ranging from data science to engineering to scientific research to deep-learning modeling -- you'll need to know statistics and machine learning. And you'll need to know how to implement concepts like probability theory and confidence intervals, k-means clustering and PCA, Spearman correlation and logistic regression, in computer languages like Python or MATLAB.

There are six reasons why you should take this course:

This course covers everything you need to understand the fundamentals of statistics, machine learning, and data science, from bar plots to ANOVAs, regression to k-means, t-test to non-parametric permutation testing.
After completing this course, you will be able to understand a wide range of statistical and machine-learning analyses, even specific advanced methods that aren't taught here. That's because you will learn the foundations upon which advanced methods are built.
This course balances mathematical rigor with intuitive explanations, and hands-on explorations in code.
Enrolling in the course gives you access to the Q&A, in which I actively participate every day.
I've been studying, developing, and teaching statistics for over 20 years, and I think math is, like, really cool.

What you need to know before taking this course:

High-school level maths. This is an applications-oriented course, so I don't go into a lot of detail about proofs, derivations, or calculus.
Basic coding skills in Python or MATLAB. This is necessary only if you want to follow along with the code. You can successfully complete this course without writing a single line of code! But participating in the coding exercises will help you learn the material. The MATLAB code relies on the Statistics and Machine Learning toolbox (you can use Octave if you don't have MATLAB or the statistics toolbox). Python code is written in Jupyter notebooks.
I recommend taking my free course called "Statistics literacy for non-statisticians". It's 90 minutes long and will give you a bird's-eye view of the main topics in statistics that I go into much much much more detail about here in this course. Note that the free short course is not required for this course, but complements this course nicely. And you can get through the whole thing in about an hour if you watch it on 1.5x speed!
You do not need any previous experience with statistics, machine learning, deep learning, or data science. That's why you're here!

Is this course up to date?

Yes, I maintain all of my courses regularly. I add new lectures to keep the course "alive," and I add new lectures (or sometimes re-film existing lectures) to explain maths concepts better if students find a topic confusing or if I made a mistake in the lecture (rare, but it happens!).

You can check the "Last updated" text at the top of this page to see when I last worked on improving this course!

What if you have questions about the material?

This course has a Q&A (question and answer) section where you can post your questions about the course material (about the maths, statistics, coding, or machine learning aspects). I try to answer all questions within a day. You can also see all other questions and answers, which really improves how much you can learn! And you can contribute to the Q&A by posting to ongoing discussions.

And, you can also post your code for feedback or just to show off -- I love it when students actually write better code than me! (Ahem, doesn't happen so often.)

What should you do now?

First of all, congrats on reading this far; that means you are seriously interested in learning statistics and machine learning. Watch the preview videos, check out the reviews, and, when you're ready, invest in your brain by learning from this course!

Who this course is for:

Students taking statistics or machine learning courses
Professionals who need to learn statistics and machine learning
Scientists who want to understand their data analyses
Anyone who wants to see "under the hood" of machine learning
Artificial intelligence (AI) students
Business intelligence students

Course content

Introductions: 5 lectures • 25min

Math prerequisites: 8 lectures • 44min

IMPORTANT: Download course materials: 1 lecture • 5min

What are (is?) data?: 7 lectures • 56min

Visualizing data: 14 lectures • 2hr

Descriptive statistics: 25 lectures • 4hr 33min

Data normalizations and outliers: 18 lectures • 2hr 32min

Probability theory: 24 lectures • 4hr 29min

Hypothesis testing: 12 lectures • 2hr 26min

The t-test family: 14 lectures • 2hr 45min
