Master statistics & machine learning: intuition, math, code
4.8 (105 ratings)
1,467 students enrolled

A rigorous and engaging deep-dive into statistics and machine-learning, with hands-on applications in Python and MATLAB.
Highest Rated
Created by Mike X Cohen
Last updated 8/2020
English
English [Auto]
Current price: $13.99 Original price: $19.99 Discount: 30% off
This course includes
  • 36 hours on-demand video
  • 1 article
  • 2 downloadable resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What you'll learn
  • Descriptive statistics (mean, variance, etc)
  • Inferential statistics
  • T-tests, correlation, ANOVA, regression, clustering
  • The math behind the "black box" statistical methods
  • How to implement statistical methods in code
  • How to interpret statistics correctly and avoid common misunderstandings
  • Coding techniques in Python and MATLAB/Octave
  • Machine learning methods like clustering, predictive analysis, classification, and data cleaning
Requirements
  • Good work ethic and motivation to learn.
  • Previous background in statistics or machine learning is not necessary.
  • Python -OR- MATLAB with the Statistics toolbox (or Octave).
  • Some coding familiarity for the optional code exercises.
  • No textbooks necessary! All materials are provided inside the course.
Description

Statistics and probability control your life. I don't just mean what YouTube's algorithm recommends that you watch next, and I don't just mean the chance of meeting your future significant other in class or at a bar. Human behavior, single-cell organisms, earthquakes, the stock market, whether it will snow in the first week of December, and countless other phenomena are probabilistic and statistical. Even the very nature of the most fundamental structure of the universe is governed by probability and statistics.

You need to understand statistics.

Nearly all areas of human civilization are incorporating code and numerical computations. This means that many jobs and areas of study are based on applications of statistical and machine-learning techniques in programming languages like Python and MATLAB. This is often called 'data science' and is an increasingly important topic.

If you want to make yourself a future-proof employee, employer, data scientist, or researcher in any technical field, you'll need to know statistics and machine-learning. And you'll need to know how to implement concepts like probability theory and confidence intervals, k-means clustering and PCA, Spearman correlation and logistic regression, in computer languages like Python or MATLAB.

There are six reasons why you should take this course:

  • This course covers everything you need to understand the fundamentals of statistics, machine learning, and data science, from bar plots to ANOVAs, regression to k-means, t-test to non-parametric permutation testing.

  • After completing this course, you will be able to understand a wide range of statistical and machine-learning analyses, even specific advanced methods that aren't taught here. That's because you will learn the foundations upon which advanced methods are built.

  • This course balances mathematical rigor with intuitive explanations and hands-on explorations in code.

  • Enrolling in the course gives you access to the Q&A, in which I actively participate every day.

  • I've been studying, developing, and teaching statistics for 20 years, and I'm, like, really great at math.

What you need to know before taking this course:

  • High-school level maths. This is an applications-oriented course, so I don't go into a lot of detail about proofs, derivations, or calculus.

  • Basic coding skills in Python or MATLAB. This is necessary only if you want to follow along with the code. You can successfully complete this course without writing a single line of code! But participating in the coding exercises will help you learn the material. The MATLAB code relies on the Statistics and Machine Learning toolbox (you can use Octave if you don't have MATLAB or the stats toolbox). Python code is written in Jupyter notebooks.

  • I recommend taking my free course called "Statistics literacy for non-statisticians". It's 90 minutes long and will give you a bird's-eye view of the main topics in statistics that I go into much much much more detail about here in this course. Note that the free short course is not required for this course, but complements this course nicely. And you can get through the whole thing in less than an hour if you watch it on 1.5x speed!

  • You do not need any previous experience with statistics, machine learning, or data science. That's why you're here!

Is this course up to date?

Yes, I maintain all of my courses regularly. I add new lectures to keep the course "alive," and I add new lectures (or sometimes re-film existing lectures) to explain maths concepts better if students find a topic confusing or if I made a mistake in the lecture (rare, but it happens!).

You can check the "Last updated" text at the top of this page to see when I last worked on improving this course!

What if you have questions about the material?

This course has a Q&A (question and answer) section where you can post your questions about the course material (about the maths, statistics, coding, or machine learning aspects). I try to answer all questions within a day. You can also see all other questions and answers, which really improves how much you can learn! And you can contribute to the Q&A by posting to ongoing discussions.

And, you can also post your code for feedback or just to show off -- I love it when students actually write better code than mine! (Ahem, doesn't happen so often.)

What should you do now?

First of all, congrats on reading this far; that means you are seriously interested in learning statistics and machine learning. Watch the preview videos, check out the reviews, and, when you're ready, invest in your brain by learning from this course!

Who this course is for:
  • Students taking statistics or machine learning courses
  • Professionals who need to learn statistics and machine learning
  • Scientists who want to understand their data analyses
  • Anyone who wants to see "under the hood" of machine learning
Course content
210 lectures 35:50:23
+ Introductions
5 lectures 24:32
Statistics guessing game!
08:47
Using the Q&A forum
05:16
(optional) Entering time-stamped notes in the Udemy video player
01:52
+ Math prerequisites
8 lectures 41:53
Should you memorize statistical formulas?
03:12
Arithmetic and exponents
04:02
Scientific notation
05:53
Summation notation
04:21
Absolute value
03:04
Natural exponent and logarithm
05:53
The logistic function
08:58
+ IMPORTANT: Download course materials
1 lecture 03:48
Download materials for the entire course!
03:48
+ What are (is?) data?
7 lectures 56:26
Is "data" singular or plural?!?!!?!
01:53
Where do data come from and what do they mean?
06:09
Types of data: categorical, numerical, etc
14:56
Code: representing types of data on computers
08:58
Sample vs. population data
12:02
Samples, case reports, and anecdotes
05:31
The ethics of making up data
06:57
+ Visualizing data
14 lectures 01:59:31
Bar plots
11:37
Code: bar plots
16:59
Box-and-whisker plots
05:41
Code: box plots
08:41
"Unsupervised learning": Boxplots of normal and uniform noise
02:31
Histograms
11:16
Code: histograms
16:40
"Unsupervised learning": Histogram proportion
02:22
Pie charts
05:59
Code: pie charts
13:22
When to use lines instead of bars
06:11
Code: line plots
07:24
"Unsupervised learning": log-scaled plots
01:44
+ Descriptive statistics
25 lectures 04:32:43
Descriptive vs. inferential statistics
04:31
Accuracy, precision, resolution
07:28
Data distributions
11:26
Code: data from different distributions
32:08
"Unsupervised learning": histograms of distributions
01:57
The beauty and simplicity of Normal
05:29
Measures of central tendency (mean)
12:47
Measures of central tendency (median, mode)
12:17
Code: computing central tendency
13:57
"Unsupervised learning": central tendencies with outliers
03:07
Measures of dispersion (variance, standard deviation)
17:48
Code: Computing dispersion
26:33
Interquartile range (IQR)
04:53
Code: IQR
15:58
QQ plots
07:21
Code: QQ plots
15:34
Statistical "moments"
08:23
Code: Histogram bins
12:24
Violin plots
03:19
Code: violin plots
10:09
"Unsupervised learning": asymmetric violin plots
02:31
Shannon entropy
11:02
Code: entropy
20:15
"Unsupervised learning": entropy and number of bins
01:26
+ Data normalizations and outliers
17 lectures 02:18:25
Garbage in, garbage out (GIGO)
04:10
Z-score standardization
09:25
Code: z-score
12:50
Code: min-max scaling
08:16
"Unsupervised learning": Invert the min-max scaling
02:35
What are outliers and why are they dangerous?
14:26
Removing outliers: z-score method
09:26
The modified z-score method
04:03
Code: z-score for outlier removal
22:30
"Unsupervised learning": z vs. modified-z
02:38
Multivariate outlier detection
09:26
Code: Euclidean distance for outlier removal
09:01
Removing outliers by data trimming
05:47
Code: Data trimming to remove outliers
11:03
Non-parametric solutions to outliers
04:40
An outlier lecture on personal accountability
03:03
+ Probability theory
24 lectures 04:25:52
What is probability?
12:17
Probability vs. proportion
09:25
Computing probabilities
10:28
Code: compute probabilities
14:34
"Unsupervised learning": probabilities of odds-space
02:30
Probability mass vs. density
13:06
Code: compute probability mass functions
11:37
Cumulative probability distributions
10:44
Code: cdfs and pdfs
09:41
"Unsupervised learning": cdf's for various distributions
02:25
Creating sample estimate distributions
18:31
Monte Carlo sampling
02:53
Sampling variability, noise, and other annoyances
08:41
Code: sampling variability
26:15
Expected value
10:09
Conditional probability
12:45
Code: conditional probabilities
20:12
Tree diagrams for conditional probabilities
06:24
The Law of Large Numbers
09:50
Code: Law of Large Numbers in action
19:23
The Central Limit Theorem
10:34
Code: the CLT in action
16:21
"Unsupervised learning": Averaging pairs of numbers
02:09
+ Hypothesis testing
12 lectures 02:22:15
IVs, DVs, models, and other stats lingo
16:45
What is an hypothesis and how do you specify one?
15:08
Sample distributions under null and alternative hypotheses
10:38
P-values: definition, tails, and misinterpretations
18:56
Degrees of freedom
12:21
Type 1 and Type 2 errors
14:18
Parametric vs. non-parametric tests
09:12
Multiple comparisons and Bonferroni correction
08:33
Statistical vs. theoretical vs. clinical significance
06:51
Cross-validation
11:30
Statistical significance vs. classification accuracy
11:12
+ The t-test family
14 lectures 02:44:41
Purpose and interpretation of the t-test
13:13
One-sample t-test
08:08
Code: One-sample t-test
20:46
"Unsupervised learning": The role of variance
02:50
Code: Two-samples t-test
22:09
"Unsupervised learning": Importance of N for t-test
04:45
Wilcoxon signed-rank (nonparametric t-test)
07:35
Code: Signed-rank test
18:33
Mann-Whitney U test (nonparametric t-test)
06:03
Code: Mann-Whitney U test
05:21
Permutation testing for t-test significance
11:25
Code: permutation testing
25:26
"Unsupervised learning": How many permutations?
05:21