Mathematical Statistics for Data Science

Name: Mathematical Statistics for Data Science
Rating: 4.7 (491 reviews)

Ex-Google data scientist's guide to mathematical statistics, covering method of moments, maximum likelihood, and more

Created by Brian Greco

Last updated 11/2023

English

What you'll learn

Learn how to use the method of moments and maximum likelihood estimation to learn from data
Learn how to evaluate and compare different methods using notions such as bias, variance, and mean squared error
Master the Bernoulli, Uniform and Normal Distributions
Learn about the Cramer-Rao lower bound and how to know if we have found the best possible estimator
Learn to evaluate asymptotic properties of estimators, including consistency and the central limit theorem
Learn to create confidence intervals

Course content

11 sections • 65 lectures • 4h 13m total length

Course Introduction • 1:01
Explore the course structure and how each section builds on the previous one, introducing the Bernoulli, Binomial, Uniform, and Normal distributions through lectures, examples, notes, and assignments.

Random variables, PMFs and PDFs • 7:23
Explore random variables, sample spaces, and probability distributions, including the discrete pmf and the continuous pdf. Learn about the Bernoulli, uniform, and normal distributions, and how parameters describe their centers.
The Bernoulli Distribution • 7:27
Model a two-outcome experiment with zero or one, where one occurs with probability p and zero with probability 1-p, expressed via the pmf.
The Uniform Distribution • 6:51
Explore the uniform distribution, a continuous model with equal likelihood on [0, theta], its density function 1/theta, and computing interval probabilities via area under the curve.
The Normal Distribution • 4:32
Study the normal distribution, a bell-shaped continuous distribution centered at mu with variability sigma, and apply the empirical rule that about 95% of values lie within two sigma of mu.
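The empirical rule can be checked numerically. The following is a minimal sketch, not material from the course; the function name `normal_prob_within` is a hypothetical helper. It uses the standard identity P(|X - mu| < k sigma) = erf(k / sqrt(2)), which holds for any normal distribution.

```python
import math


def normal_prob_within(k: float) -> float:
    """Return P(|X - mu| < k * sigma) for a normal random variable X."""
    return math.erf(k / math.sqrt(2))


# Probability of landing within two standard deviations of the mean.
within_two_sigma = normal_prob_within(2.0)
print(f"P(within 2 sigma) = {within_two_sigma:.4f}")
```

The result is slightly above 95%, which is why the rule is usually stated as "about 95%".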
Probability Distribution Recap • 3:03
Define a random variable and its sample space, describe Bernoulli distributions with parameter p, and explain continuous distributions like uniform and normal through probability density functions and shaded areas.
Probability Distributions Quiz

Sample mean and Expected Value • 6:51
Bernoulli Distribution Expected Value • 2:12
Determine the expected value of a Bernoulli random variable by summing x times its PMF over {0,1}, yielding p, the probability of X being one.
Uniform Distribution Expected Value • 3:00
Compute the expected value of a uniform random variable on [0, theta] by integrating x times the pdf 1/theta from 0 to theta, showing the mean is theta over two.
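Written out, the two expected-value computations described above look roughly as follows. This is a sketch of the standard derivations, assuming the usual Bernoulli pmf p^x (1-p)^(1-x); the notation is not taken from the course notes.

```latex
% Bernoulli(p): sum x times the pmf over the two outcomes
E[X] = \sum_{x \in \{0,1\}} x \, p^{x} (1-p)^{1-x}
     = 0 \cdot (1-p) + 1 \cdot p
     = p

% Uniform(0, \theta): integrate x times the density 1/\theta
E[X] = \int_{0}^{\theta} x \cdot \frac{1}{\theta} \, dx
     = \frac{1}{\theta} \cdot \frac{\theta^{2}}{2}
     = \frac{\theta}{2}
```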
Normal Distribution Expected Value • 3:35
Show that the mean, or expected value, of a normal distribution equals mu by symmetry, so the mean also equals the median; the lecture discusses when an expected value exists and contrasts the normal with the Cauchy distribution.
Expected Value Recap • 1:56
Explore the expected value of random variables, from sample means to population means, and how Bernoulli, uniform, and normal distributions determine their means for the method of moments.
Expected Value Practice Problems and Solutions • 0:02

Estimators and the Method of Moments • 6:30
Apply the method of moments to estimate parameters by setting the sample mean equal to the first moment, solving for theta to obtain estimates.
Bernoulli Distribution MOM • 3:37
Compute the method of moments estimator for p in a Bernoulli distribution by equating E[X] with p and using x-bar as the approximate mean, yielding p-hat = x-bar.
Uniform Distribution MOM • 3:22
Estimate theta for a uniform(0, theta) distribution using the method of moments by equating theta/2 to the sample mean to obtain theta_hat = 2 x-bar.
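As an illustration of theta_hat = 2 x-bar, the sketch below simulates data from a uniform distribution and applies the estimator. The true value `theta_true = 10.0`, the sample size, and the function name `mom_uniform_theta` are assumptions chosen for the example, not values from the course.

```python
import random


def mom_uniform_theta(sample):
    """Method of moments estimate for Uniform(0, theta): twice the sample mean."""
    return 2 * sum(sample) / len(sample)


random.seed(0)  # fixed seed so the run is reproducible
theta_true = 10.0  # hypothetical true parameter
sample = [random.uniform(0, theta_true) for _ in range(10_000)]

theta_hat = mom_uniform_theta(sample)
print(f"theta_hat = {theta_hat:.3f} (true theta = {theta_true})")
```

With a sample this large, the estimate should land close to the true value, though any single run will differ from it slightly.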
Normal Distribution MOM • 1:44
Apply the method of moments to a normal distribution by equating the sample mean to the distribution's mean, yielding mu_hat = x_bar.
Method of Moments Recap • 1:50
Learn the method of moments, linking the expected value to the sample mean to form estimators for Bernoulli, uniform, and normal distributions, and prepare to study unbiased estimators.
Method of Moments Practice and Solutions • 0:02

Sampling Distribution, Evaluating Estimators, Bias • 5:51
Explore the sampling distribution of estimators and how to compare them. Learn how bias, defined as expected value minus true value, identifies unbiased estimators.
Properties of Expected Values • 5:07
Apply core properties of expected value to estimators by pulling out constants, summing expectations, and using the law of the unconscious statistician to analyze bias.
Bernoulli MOM Bias • 4:00
Uniform MOM Bias • 3:19
Normal MOM Bias • 4:23
Learn that the method of moments estimator for a normal distribution is the sample mean, and prove that its expected value equals the population mean, making it an unbiased estimator.
Bias Recap • 1:40
Define bias as the difference between an estimator's expected value and true value, and prove unbiasedness for Bernoulli, uniform, and normal estimators using linearity of expectation and method of moments.
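For the uniform case, the unbiasedness argument can be sketched in one line using linearity of expectation. This is the standard derivation for theta_hat = 2 x-bar, written here as an illustration rather than a transcription of the lecture.

```latex
E[\hat{\theta}] = E[2\bar{X}]
  = \frac{2}{n} \sum_{i=1}^{n} E[X_i]
  = \frac{2}{n} \cdot n \cdot \frac{\theta}{2}
  = \theta
\quad\Longrightarrow\quad
\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta = 0
```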
Unbiased Estimators Practice and Solutions • 0:02

Variance • 4:39
Identify how variance measures spread, compare normal and uniform distributions, and derive Var(X) = E[X^2] − (E[X])^2 to compute variance from E[X] and E[X^2].
Bernoulli Distribution Variance • 3:03
Compute the variance of a Bernoulli random variable by using Var(X) = E[X^2] − (E[X])^2, showing that for 0-1 outcomes the variance equals p(1-p).
Uniform Distribution Variance • 3:31
Compute the variance of a uniform distribution on [0, theta] by using E[X] = theta/2 and E[X^2] = theta^2/3, yielding Var(X) = theta^2/12.
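The uniform moments above can be sanity-checked numerically. The sketch below approximates the integrals with a midpoint Riemann sum; the helper `uniform_moment` and the choice `theta = 6.0` are assumptions made for illustration.

```python
def uniform_moment(theta, power, steps=100_000):
    """Approximate E[X**power] for X ~ Uniform(0, theta) with a midpoint Riemann sum."""
    width = theta / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width  # midpoint of the i-th subinterval
        total += x**power * (1 / theta) * width  # x^power times the density 1/theta
    return total


theta = 6.0  # hypothetical parameter value
mean = uniform_moment(theta, 1)           # should be close to theta / 2
second_moment = uniform_moment(theta, 2)  # should be close to theta**2 / 3
variance = second_moment - mean**2        # Var(X) = E[X^2] - (E[X])^2

print(f"E[X] = {mean:.4f}, E[X^2] = {second_moment:.4f}, Var(X) = {variance:.4f}")
```

For theta = 6 the closed forms give theta/2 = 3, theta^2/3 = 12, and theta^2/12 = 3, which the numerical values should match closely.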
Normal Distribution Variance • 1:52
Variance of Estimators and Properties of Variance • 3:47
Explain why the variance of estimators matters by comparing unbiased estimators with different variances using a political poll example, and review variance properties for constants and independent sums.
Bernoulli MOM Variance • 6:31
Explore the variance of the method of moments estimate for p in a Bernoulli distribution; p-hat equals the sample proportion, with variance equal to p(1-p)/n, so a larger n reduces variability.
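The formula Var(p-hat) = p(1-p)/n can be verified exactly for small n by summing over the binomial distribution of the number of ones. The sketch below does this; the function name and the values p = 0.3, n = 20 are hypothetical choices for the example.

```python
from math import comb


def phat_mean_and_variance(p, n):
    """Exact mean and variance of p-hat = (number of ones) / n for Bernoulli(p) data."""
    mean = 0.0
    second_moment = 0.0
    for k in range(n + 1):
        prob = comb(n, k) * p**k * (1 - p) ** (n - k)  # P(exactly k ones)
        p_hat = k / n
        mean += prob * p_hat
        second_moment += prob * p_hat**2
    return mean, second_moment - mean**2


p, n = 0.3, 20  # hypothetical values
mean_phat, var_phat = phat_mean_and_variance(p, n)
print(f"E[p-hat] = {mean_phat:.4f}, Var(p-hat) = {var_phat:.6f}, p(1-p)/n = {p * (1 - p) / n:.6f}")
```

Doubling n in this sketch halves the variance, which is the sense in which a larger sample reduces variability.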
Uniform MOM Variance • 5:35
Normal MOM Variance • 4:52
Variance Recap • 2:34
Variance Practice and Solutions • 0:02

Likelihood Function and Maximum Likelihood Estimation - Motivation • 4:40
Learn how maximum likelihood estimation uses the likelihood function, not the pdf, to infer the mean of a normal distribution from data, starting with a single observation and known variance.
Joint pdf, joint likelihood • 7:05
Explore how to form joint pdfs and joint likelihoods from a random sample, using Bernoulli, uniform, and normal distributions to perform maximum likelihood estimation.
Log-likelihood and finding the MLE • 4:05
Maximize the log-likelihood by differentiating with respect to theta, setting the derivative to zero, and solving for theta, using the score function and second derivative to confirm a maximum.
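To make the maximization concrete, the sketch below evaluates a Bernoulli log-likelihood on a grid of candidate p values and compares the grid maximizer with the closed-form answer p-hat = x-bar that the calculus approach yields. The data set and the grid resolution are assumptions for the example; a grid search stands in here for the derivative-based method described above.

```python
import math


def bernoulli_log_likelihood(p, data):
    """Log-likelihood of i.i.d. Bernoulli(p) data: sum of x*log(p) + (1-x)*log(1-p)."""
    ones = sum(data)
    zeros = len(data) - ones
    return ones * math.log(p) + zeros * math.log(1 - p)


data = [1, 0, 1, 1, 0, 1, 0, 1]  # hypothetical sample: five ones, three zeros

# Candidate values 0.001, 0.002, ..., 0.999 (endpoints excluded to avoid log(0)).
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, data))

# Setting the score to zero gives the sample proportion.
p_closed_form = sum(data) / len(data)

print(f"grid maximizer = {p_grid}, closed form = {p_closed_form}")
```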
Properties of logarithms • 1:50
Review how logarithms convert multiplication to addition, simplify the log-likelihood by turning products of pdfs into sums, and apply exponent and denominator rules for easier differentiation.
Bernoulli MLE • 6:49
Uniform MLE • 10:28
Learn how to compute the maximum likelihood estimator for a uniform distribution, showing that theta hat equals the sample maximum, and compare it to the method of moments estimator.
Mean Squared Error • 3:16
Examine mean squared error as bias squared plus variance, comparing the maximum likelihood and method of moments estimators for a uniform distribution and highlighting when a slightly biased estimator with low variance outperforms an unbiased one.
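The comparison can be sketched with the standard closed-form results for Uniform(0, theta): the method of moments estimator 2 x-bar is unbiased with variance theta^2/(3n), while the sample maximum has expected value n theta/(n+1) and variance n theta^2/((n+1)^2 (n+2)). These formulas are stated here as assumptions of the sketch rather than quoted from the course, and theta = 1, n = 10 are hypothetical values.

```python
def mse_mom(theta, n):
    """MSE of the MOM estimator 2 * x-bar: zero bias, variance theta^2 / (3n)."""
    bias = 0.0
    variance = theta**2 / (3 * n)
    return bias**2 + variance


def mse_mle(theta, n):
    """MSE of the MLE (sample maximum): bias squared plus variance."""
    bias = n * theta / (n + 1) - theta  # equals -theta / (n + 1)
    variance = n * theta**2 / ((n + 1) ** 2 * (n + 2))
    return bias**2 + variance


theta, n = 1.0, 10  # hypothetical values
print(f"MSE (MOM) = {mse_mom(theta, n):.5f}")
print(f"MSE (MLE) = {mse_mle(theta, n):.5f}")
```

Under these formulas the biased MLE has the smaller mean squared error for n = 10, which is the kind of trade-off the lecture description refers to.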
Normal MLE • 6:53
Derive the maximum likelihood estimator for the mean of a normal distribution, showing that mu hat equals the sample mean x-bar, the best estimator under normality.
MLE Recap • 3:21
Learn how maximum likelihood estimation uses likelihood and log-likelihood, with the score and second derivative, to find estimates and compare to method of moments for Bernoulli, normal, and uniform distributions.
MLE Practice and Solutions • 0:02

The Cramer-Rao Lower Bound (CRLB) and Fisher Information • 4:21
Bernoulli CRLB • 4:53
Uniform CRLB • 2:37
Explain why the Fisher information and the CRLB cannot be computed for the uniform distribution, due to parameter-dependent support and the distribution's exclusion from the exponential family.
Normal CRLB • 6:46
Derive the Cramer-Rao lower bound for mu in a normal distribution, showing single-observation information is 1/σ^2; with n observations, the bound is σ^2/n, and X-bar attains it as best estimator.
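The steps behind the normal CRLB can be sketched as follows, treating σ as known and using the definition of Fisher information as the negative expected second derivative of the log-likelihood. The symbols I_1 and I_n for the one-observation and n-observation information are notation chosen for this sketch.

```latex
\log f(x; \mu) = -\tfrac{1}{2} \log(2\pi\sigma^{2}) - \frac{(x - \mu)^{2}}{2\sigma^{2}}

\frac{\partial}{\partial \mu} \log f(x; \mu) = \frac{x - \mu}{\sigma^{2}},
\qquad
\frac{\partial^{2}}{\partial \mu^{2}} \log f(x; \mu) = -\frac{1}{\sigma^{2}}

I_1(\mu) = -E\!\left[ \frac{\partial^{2}}{\partial \mu^{2}} \log f(X; \mu) \right] = \frac{1}{\sigma^{2}},
\qquad
I_n(\mu) = \frac{n}{\sigma^{2}}

\operatorname{Var}(\hat{\mu}) \ge \frac{1}{I_n(\mu)} = \frac{\sigma^{2}}{n} = \operatorname{Var}(\bar{X})
```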
Efficiency • 2:45
Define efficiency as the ratio of the Cramer-Rao lower bound to an unbiased estimator's variance and compare estimators by variance, illustrating asymptotic relative efficiency with mean and median examples.
CRLB Recap • 1:30
Explain the Fisher information as the negative expected second derivative of the log-likelihood, and show that for Bernoulli and normal distributions the estimator variance meets the lower bound.
CRLB Practice and Solutions • 0:02

Distribution of Estimators and Convergence in Distribution • 6:33
Explore the asymptotic distribution of estimators and apply the central limit theorem to assess probability statements for large samples.
Bernoulli MOM/MLE Distribution • 6:57
Uniform MOM Distribution • 5:58
Normal MOM/MLE Distribution • 4:19
Know that for a normal distribution, mu-hat equals the sample mean and is exactly normal; in the nine-sample IQ example, the sample mean has mean 100 and standard deviation 5, so about 95% of sample means fall within 90-110.
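The arithmetic in the IQ example can be reproduced as below. The population standard deviation of 15 is an assumption: it is the value implied by a standard deviation of 5 for the mean of nine observations, and is not stated in the lecture description.

```python
import math

mu = 100.0    # population mean from the example
sigma = 15.0  # assumed population sd, implied by sd 5 with n = 9
n = 9         # sample size from the example

# Standard deviation of the sample mean: sigma / sqrt(n).
sd_of_mean = sigma / math.sqrt(n)

# Range that holds about 95% of sample means (two standard deviations each way).
low, high = mu - 2 * sd_of_mean, mu + 2 * sd_of_mean

print(f"sd of x-bar = {sd_of_mean}, about 95% of sample means in [{low}, {high}]")
```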
Consistency • 2:13
Demonstrate how consistency ensures that as sample size grows, the estimator converges to the true value, with decreasing variance and a distribution concentrating near the truth.
CLT Recap • 2:54

Confidence Intervals • 9:20
Learn to construct confidence intervals by pivoting from theta hat to theta, using the central limit theorem to create 95% intervals that contain the true parameter in about 95% of repeated samples.
Bernoulli Confidence Interval • 6:42
Uniform Confidence Interval based on MOM • 4:33
Construct a 95% confidence interval for theta in a uniform distribution using the method of moments estimator theta_hat. Pivot the interval around theta_hat to get [10,14] for n=48.
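One way to arrive at [10, 14] is sketched below. It assumes theta_hat = 12 (the midpoint of the stated interval, not given in the description) and estimates the standard error by plugging theta_hat into Var(2 x-bar) = theta^2 / (3n); the lecture may organize the calculation differently.

```python
import math

theta_hat = 12.0  # assumed estimate: midpoint of the stated interval
n = 48            # sample size from the example

# Var(2 * x-bar) = 4 * (theta^2 / 12) / n = theta^2 / (3n), with theta_hat plugged in.
standard_error = math.sqrt(theta_hat**2 / (3 * n))

# Approximate 95% interval: estimate plus or minus two standard errors.
ci_low = theta_hat - 2 * standard_error
ci_high = theta_hat + 2 * standard_error

print(f"standard error = {standard_error}, 95% CI = [{ci_low}, {ci_high}]")
```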
Normal Confidence Interval • 4:25
Compute a 95% confidence interval for mu in a normal distribution using x-bar plus or minus two sigma over sqrt(n); with four samples, the interval is 63 to 71 inches.
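The stated interval is consistent with a sample mean of 67 inches and a known sigma of 4 inches. Both values are assumptions inferred from the endpoints, not figures given in the description. A small helper, hypothetically named `normal_mean_ci`, applies the formula:

```python
import math


def normal_mean_ci(x_bar, sigma, n):
    """Approximate 95% interval for mu with known sigma: x-bar +/- 2 * sigma / sqrt(n)."""
    half_width = 2 * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Assumed inputs: x_bar = 67 and sigma = 4 reproduce the stated endpoints for n = 4.
height_ci = normal_mean_ci(x_bar=67.0, sigma=4.0, n=4)
print(f"95% CI for mu: {height_ci} inches")
```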
Confidence Interval Recap, Link to Hypothesis Testing • 1:42
Connect point estimators, such as method of moments and maximum likelihood estimates, to interval estimates via the central limit theorem, and see how 95% confidence intervals work as a long-run recipe.
Confidence Interval Practice and Solutions • 0:02

Requirements

High school algebra, including manipulating functions with variables
Basic knowledge of calculus (integration and differentiation) is recommended for some chapters.
Prior experience with probability or statistics will be useful, but we cover everything assuming no previous knowledge!

Description

This course teaches the foundations of mathematical statistics, focusing on methods of estimation such as the method of moments and maximum likelihood estimators (MLEs), evaluating estimators by their bias, variance, and efficiency, and exploring asymptotic statistics, including the central limit theorem and confidence intervals.

Course Highlights:

57 engaging video lectures, featuring innovative lightboard technology for an interactive learning experience
In-depth lecture notes accompanying each lesson, highlighting key vocabulary, examples, and explanations from the video sessions
End-of-chapter practice problems to solidify your understanding and refine your skills from the course

Key Topics Covered:

Fundamental probability distributions: Bernoulli, uniform, and normal distributions
Expected value and its connection to sample mean
Method of moments for developing estimators
Expected value of estimators and unbiased estimators
Variance of random variables and estimators
Fisher information and the Cramer-Rao Lower Bound
Central limit theorem
Confidence intervals

Who This Course Is For:

Students with prior introductory statistics experience, looking to delve deeper into mathematical foundations
Data science professionals seeking to refresh or enhance their statistics knowledge for job interviews
Anyone interested in developing a statistical mindset and strengthening their analytical skills

Pre-requisites:

This course requires a solid understanding of high school algebra and equation manipulation with variables.
Some chapters utilize introductory calculus concepts, such as differentiation and integration. However, even without prior calculus knowledge, those with strong math skills can follow along and only miss a few minor mathematical details.

Who this course is for:

Anyone who has taken a basic statistics class and wants to dive into more mathematical detail
Data scientists looking to learn some basics of mathematical statistics
Undergraduate and graduate students looking for help in mathematical statistics courses
Academics and professionals wanting a strong foundation for further study in statistics

Course content

Introduction • 1 lecture • 1min

Probability Distributions • 5 lectures • 29min

Expected Values • 6 lectures • 18min

Estimators and the Method of Moments • 6 lectures • 17min

Unbiased Estimators • 7 lectures • 24min

Variance • 10 lectures • 36min

Maximum Likelihood Estimation • 10 lectures • 48min

Fisher Information and the Cramer-Rao Lower Bound • 7 lectures • 23min

Central Limit Theorem • 6 lectures • 29min

Confidence Intervals • 6 lectures • 27min