Statistics & Mathematics for Data Science & Data Analytics

Name: Statistics & Mathematics for Data Science & Data Analytics
Rating: 4.6 (3029 reviews)

Learn the statistics & probability for data science and business analysis

Created by Nikolai Schuler

Last updated 6/2026

English

What you'll learn

Master the fundamentals of statistics for data science & data analytics
Master descriptive statistics & probability theory
Machine learning methods like Decision Trees and Decision Forests
Probability distributions such as Normal distribution, Poisson Distribution and more
Hypothesis testing, p-value, type I & type II error
Logistic Regressions, Multiple Linear Regression, Regression Trees
Correlation, R-Square, RMSE, MAE, coefficient of determination and more

Course content

9 sections • 91 lectures • 11h 25m total length

Welcome! • 2:13
Meet your instructor, Nikolai Schuler, a mathematician turned data engineer, and learn the course goals for statistics, mathematics, data science, and data analytics.
What will you learn in this course? • 5:50
Learn core statistics and mathematics for data science, covering descriptive statistics, probability theory, distributions, hypothesis testing, and regression with practical exercises.
How can you get the most out of it? • 5:49
Use Udemy tools to adjust playback rate, add notes at key moments, and practice with quizzes to stay motivated and apply data science and statistics concepts.
Download: Formula cheat sheet • 0:29

Intro • 2:37
Explore descriptive statistics by comparing central tendency measures, like average height, with measures of spread to describe data and identify typical values in a dataset.
Mean • 6:00
Learn the mean, the most common measure of central tendency in descriptive statistics, by averaging the dataset (2, 1, 3, 4, 7) to estimate daily sales.
Quiz: Mean
Median • 4:57
Learn how to compute the median by ordering data, identifying the middle value for odd counts, and taking the mean of the two middle numbers for even counts.
Quiz: Median
Mode • 3:42
Explore the mode, the easiest central tendency measure, by counting the most frequent values, and identify when the dataset has a single mode, is bimodal, or has no mode.
Quiz: Mode
Mean or Median? • 7:30
Compare mean and median for skewed data, identify outliers, and choose the representative central tendency using examples of prices and income distributions; note when to apply mode for categorical values.
Skewness • 8:22
Explore skewness as a measure of asymmetry in data, distinguishing positively skewed (right-skewed) and negatively skewed (left-skewed) distributions, and learn how mean, median, and outliers shape this feature.
Practice: Skewness • 1:19
Assess skewness in an hourly store traffic dataset using three methods: visual inspection for outliers, compare mean and median, and apply the skewness formula.
Solution: Skewness • 3:25
Explore how a histogram shows a right-skewed dataset and how the mean exceeds the median in such data, then calculate skewness using the mean and the dataset size.
Range & IQR • 9:40
Explore measures of spread, including the range and the interquartile range, and learn to calculate data dispersion using cereal protein examples.
Sample vs. Population • 5:01
Explore the distinction between population and sample, learn to estimate population mean and variance from a sample, and understand symbols like x-bar, s squared, and sigma squared.
Variance & Standard deviation • 10:39
Explore population and sample variance and standard deviation, including the mean-centering formula, squared deviations, degrees of freedom (n-1), and the relationship between variance and standard deviation.
Quiz: Variance
Impact of Scaling & Shifting • 18:33
Explore how shifting and scaling data affect central tendency and spread, showing that mean, median, and mode shift by the added value, while variance scales by lambda squared.
Statistical moments • 5:48
Explore how statistical moments extend mean and variance to describe data shape, including skewness and tail heaviness. Standardize moments to prevent growth with sample size and reveal distribution characteristics.
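
The descriptive statistics lectures above can be tried out in a few lines. The following is a minimal sketch using only Python's standard `statistics` module. It uses the daily sales dataset (2, 1, 3, 4, 7) from the Mean lecture; the scale factor `lam` and the shift `c` are arbitrary values assumed for illustration.

```python
import statistics

# Daily-sales dataset quoted in the "Mean" lecture description.
sales = [2, 1, 3, 4, 7]

mean = statistics.mean(sales)            # arithmetic mean
median = statistics.median(sales)        # middle value of the sorted data
pop_var = statistics.pvariance(sales)    # population variance (divide by n)
sample_var = statistics.variance(sales)  # sample variance (divide by n - 1)

# Assumed values for illustration: scale every point by lam, then shift by c.
lam, c = 3, 10
transformed = [lam * x + c for x in sales]

print(f"mean={mean}, median={median}")
print(f"population variance={pop_var}, sample variance={sample_var}")
print(f"transformed mean={statistics.mean(transformed)}")
print(f"transformed variance={statistics.pvariance(transformed)}")
```

Because the mean (3.4) exceeds the median (3), this small dataset leans to the right. The transformed mean equals `lam * mean + c`, while the transformed variance equals `lam ** 2` times the original variance; the shift `c` has no effect on the spread.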

What is a distribution? • 9:52
Explore how data distributions reveal frequency patterns using histograms, learn to fit a theoretical distribution like the normal curve, and use density functions to compute ranges, percentages, and probabilities.
Normal distribution • 8:53
Explore the normal distribution, its mean and standard deviation, and the empirical rule (68-95-99.7) for estimating intervals, and learn how z-scores and z-score tables simplify practical calculations.
Z-Scores • 12:46
Learn to compute z-scores from a normal distribution using mean and standard deviation, and use z-score tables to convert values to percentiles or to find x-values.
Practice: Normal distribution • 3:42
Practice applying the normal distribution to a sports shoe scenario, using mean 27 cm and standard deviation 2.5 cm to estimate percentages of customers fitting models A and B with z-scores.
Solution: Normal distribution • 7:12
Demonstrates how to use a normal distribution (mu 27, sigma 2.5) to compute the share of people in model A and B ranges via z-scores and z-table areas.
Normal distribution
More distributions • 0:07
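
The z-score calculations from this section can be reproduced without a printed z-table. This is a minimal sketch using `statistics.NormalDist` from the Python standard library with the mean (27 cm) and standard deviation (2.5 cm) from the practice lecture. The size range of 24.5 to 29.5 cm is a hypothetical example, since the listing does not state the actual ranges for models A and B.

```python
from statistics import NormalDist

# Parameters quoted in the practice lecture: mean 27 cm, standard deviation 2.5 cm.
foot_length = NormalDist(mu=27, sigma=2.5)

def z_score(x, dist):
    """Number of standard deviations that x lies from the mean."""
    return (x - dist.mean) / dist.stdev

def share_between(low, high, dist):
    """Fraction of the population with values between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical size range (not from the course): 24.5 cm to 29.5 cm,
# i.e. exactly one standard deviation either side of the mean.
low, high = 24.5, 29.5
share = share_between(low, high, foot_length)

print(f"z-scores: {z_score(low, foot_length)}, {z_score(high, foot_length)}")
print(f"share of customers in range: {share:.4f}")

# Going the other way: the length below which 90% of customers fall.
p90 = foot_length.inv_cdf(0.90)
print(f"90th percentile: {p90:.2f} cm")
```

The range spans z-scores of -1 and 1, so the computed share is roughly 68%, in line with the empirical rule.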

Intro • 0:50
Explore the mathematics of probability, from basic probability calculations to the expected value, and learn about the law of large numbers and the central limit theorem with real-life applications.
Probability Basics • 9:47
Explore the basics of probability notation, the 0–1 range, and the difference between probability and proportion, illustrated with coin tosses and sample versus population ideas.
Calculating Simple Probabilities • 5:20
Learn how to calculate simple probabilities with equal chances using coin tosses and a 52-card deck, by dividing favorable events by all possible outcomes.
Practice: Simple Probabilities • 1:23
Apply simple probabilities by rolling two six-sided dice, enumerate all possible outcomes, and calculate basic probabilities to solve follow-up questions.
Quick solution: Simple Probabilities • 0:39
Compare your results with the quick solution for simple probabilities; skip ahead if you have no questions, or review the detailed solution in the next lecture.
Detailed solution: Simple Probabilities • 6:11
Count all two-dice outcomes to form the 36-outcome sample space and determine events a, b, and c, then compute their probabilities: 1/6, 5/9, and 1/9.
Rule of addition • 12:45
Learn how to calculate probabilities of combined events using the addition rule, including union and intersection, with examples from cards and coin tosses, and avoid double counting.
Practice: Rule of addition • 2:20
Apply the addition rule to compute probabilities for events A and B, such as light colored cars and station wagons, and for clothing or food purchases.
Quick solution: Rule of addition • 0:55
Explore the quick solution to the probability rule of addition by comparing results for A or B and A and B, using examples of light colored cars and station wagons.
Detailed solution: Rule of addition • 7:20
Apply the addition rule to find P(A or B) and P(A and B) from the given probabilities, including the intersection; examples yield 0.45 and 0.20.
Rule of multiplication • 10:50
Apply the multiplication rule to compute the probability both events occur, as shown with independent coin tosses. For dependent events, use P(A)×P(B|A), illustrated by drawing two queens without replacement.
Practice: Rule of multiplication • 0:39
Apply the multiplication rule to determine the probability of drawing two black balls without replacement from a container with seven black and three white balls.
Solution: Rule of multiplication • 3:19
Apply the multiplication rule to find the probability of two black balls drawn: P(A)=7/10 and P(B|A)=6/9, yielding about 0.4667.
Bayes Theorem • 9:37
Study Bayes' theorem as the foundation of conditional probability and see how to relate P(B|A) and P(A|B) through a coin-toss example.
Bayes Theorem - Practical example • 6:42
Use Bayes' theorem in a practical two-bucket experiment to find the probability of bucket one given a red ball, by combining a 50% bucket choice with the conditional probability of drawing red.
Expected value • 10:49
Learn how to calculate the expected value of a random variable by weighting outcomes with their probabilities, and apply it to games and laptop profits.
Practice: Expected value • 1:07
Calculate the expected value and probabilities of discounts per spin on a nine-section wheel of fortune, with one 20 euro, three 10 euro, three 5 euro, and two zero discounts.
Solution: Expected value • 2:44
Apply the expected value formula by multiplying each discount outcome by its probability, summing the results, and obtaining an expected discount per spin of about 7.22 euro.
Law of Large Numbers • 7:53
Explore the law of large numbers, which states that the sample mean converges to the expected value, illustrated by coin toss experiments and the gambler's fallacy.
Central Limit Theorem - Theory • 10:07
Explore the central limit theorem, showing how the distribution of sample means approaches a normal distribution regardless of the original distribution, with coin-toss examples, intuition, visuals, and practical probability calculations.
Central Limit Theorem - Intuition • 7:31
Explore an intuitive, visual understanding of the central limit theorem by simulating sample means from various distributions, showing convergence to a normal distribution as sample size grows.
Central Limit Theorem - Challenge • 10:55
Apply the central limit theorem to a waste management scenario: compute the probability of exceeding 400 kg with 100 customers using the sample mean distribution, and assess expected profit.
Central Limit Theorem - Exercise • 1:49
Apply the central limit theorem to a 25-trader investment bank scenario, with each trader averaging $5,000 weekly and $20,000 standard deviation, to estimate weekly loss probability and 99% cash reserve.
Central Limit Theorem - Solution • 14:12
Using the central limit theorem, this lecture computes the probability the sample mean of 25 traders is below zero and the initial capital needed for 99% safety.
Quiz: Bayes Theorem
Binomial distribution • 15:49
Discover the binomial distribution, its binomial coefficient defined via N factorial and Bernoulli trials, with examples like coin flips and customer purchases, including mean, variance, and normal approximation.
Poisson distribution • 16:37
The Poisson distribution models the number of events in a time interval as a discrete, whole-number variable, with lambda as the mean, probability mass and cumulative distribution functions, and variance equal to lambda.
Real life problems • 15:29
Solve real-life problems using the binomial distribution with Excel calculations, using independence and lambda for the mean. See examples from device failures and customer counts to practice probabilities.
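
Three of the worked exercises in this section can be checked numerically. This is a minimal sketch using only the Python standard library: the multiplication rule for two black balls, the expected discount on the nine-section wheel, and the central limit theorem applied to the 25-trader scenario. The figures come from the lecture descriptions; the definition of the "99% reserve" used here is an assumption, and the course may define it differently.

```python
from fractions import Fraction
from statistics import NormalDist

# Multiplication rule for dependent events: two black balls drawn without
# replacement from 7 black and 3 white balls.
p_first_black = Fraction(7, 10)
p_second_black_given_first = Fraction(6, 9)
p_two_black = p_first_black * p_second_black_given_first

# Expected value: nine equally likely wheel sections with these discounts (euro).
sections = [20] + [10] * 3 + [5] * 3 + [0] * 2
expected_discount = sum(Fraction(d, len(sections)) for d in sections)

# Central limit theorem: mean weekly result of 25 traders, each with
# mean 5,000 and standard deviation 20,000 (dollars).
n, mu, sigma = 25, 5_000, 20_000
mean_result = NormalDist(mu, sigma / n ** 0.5)   # standard error = sigma / sqrt(n)
p_loss = mean_result.cdf(0)                      # P(average result < 0)

# Assumed definition of the "99% reserve" (not from the course): the weekly loss
# of the whole desk that is exceeded only 1% of the time.
total_result = NormalDist(n * mu, sigma * n ** 0.5)
reserve = max(0.0, -total_result.inv_cdf(0.01))

print(f"P(two black) = {p_two_black} = {float(p_two_black):.4f}")
print(f"E[discount]  = {expected_discount} = {float(expected_discount):.2f} euro")
print(f"P(weekly loss) = {p_loss:.4f}")
print(f"99% reserve  = {reserve:,.0f} dollars")
```

Exact fractions avoid rounding in the first two results, which match the 0.4667 and 7.22 euro quoted in the solutions. The trader calculation treats the 25 results as independent, which is what lets the standard error shrink by the square root of n.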

Intro • 1:12
Intro to hypothesis testing using real-world data to prove claims, compare groups, and draw conclusions, with topics like statistical significance, p-value, and type I and II errors.
What is a hypothesis? • 18:50
Explore inferential statistics by stating null and alternative hypotheses, determining the significance level, collecting data, and deciding to reject the null in mean and proportion tests.
Significance level and p-value • 6:12
Understand the level of significance (alpha) and p-values in hypothesis testing, determine when to reject the null, and compare type I and II errors using a coin example.
Type I and Type II errors • 5:02
Explore how type I and type II errors arise in hypothesis testing and how alpha, beta, the p-value, and the significance level shape the decision about the null hypothesis. Increase sample size to boost power.
Confidence intervals and margin of error • 14:46
Learn how confidence intervals and margin of error quantify uncertainty in population proportions, using the central limit theorem and sample size to tighten estimates and manage alpha and beta.
Excursion: Calculating sample size & power • 10:55
Discover how larger sample size lowers margin of error and variance to tighten the confidence interval around the sample mean, while reducing type I and type II errors via power calculations.
Performing the hypothesis test • 19:38
Learn how to perform a two-sided hypothesis test: set a 5% significance level, compute sample mean and z score, and decide whether to reject the null hypothesis using the p-value.
Practice: Hypothesis test • 1:19
Apply hypothesis testing to assess if the school's students outperform others, using sample mean 107 (n=25) vs population mean 100 (sd 15); decide one- or two-sided at 0.05 and conclude.
Solution: Hypothesis test • 5:35
Apply a one-sided z-test to test if the school mean IQ exceeds 100, using n=25 and x̄=107; the p-value 0.0099 is below 0.05, so we reject the null.
t-test and t-distribution • 13:29
When sigma is unknown, perform a t-test using the sample standard deviation, compute the t-score, and compare to the t-distribution (n−1 df) for one- or two-tailed tests.
Proportion testing • 10:03
Compute the sample proportion, its standard deviation, and the z score to test the null hypothesis in proportion testing, using a one-tailed test with alpha 0.05 to interpret the p-value.
Important p-z pairs • 8:09
Learn key p-z combinations for the normal distribution to guide hypothesis testing, using z-scores, standard deviations, and one- or two-tailed significance levels to decide when to reject the null.
Quiz: Hypothesis Testing
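
The one-sided z-test from the IQ practice exercise can be sketched in a few lines. The sample mean (107), sample size (25), null mean (100), and 5% significance level come from the lecture descriptions. A population standard deviation of 15 is assumed here because it is the value that reproduces the quoted p-value of about 0.0099. The function name `one_sided_z_test` is hypothetical.

```python
from statistics import NormalDist

def one_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Upper-tailed z-test of H0: mu <= mu0 against H1: mu > mu0."""
    standard_error = sigma / n ** 0.5
    z = (sample_mean - mu0) / standard_error
    p_value = 1 - NormalDist().cdf(z)      # area to the right of z
    return z, p_value, p_value < alpha

# sigma=15 is an assumption (see the paragraph above); the other values
# come from the practice lecture.
z, p_value, reject = one_sided_z_test(sample_mean=107, mu0=100, sigma=15, n=25)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

With these inputs the z-score is about 2.33 and the p-value falls just under 0.01, so the null hypothesis is rejected at the 5% level. When sigma is unknown, the same structure applies with the sample standard deviation and a t-distribution with n-1 degrees of freedom in place of the normal.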

Intro • 2:06
Learn how regression uses a regression line to predict house prices from living area. Relate two variables and estimate the price for a given living area using the regression line.
Linear Regression • 10:46
Explore linear regression by fitting a line to x and y that minimizes distance to data, using y = mx + b with slope m and intercept b.
Correlation coefficient • 10:08
Explore how the Pearson correlation measures relation between two variables, bounded by -1 and 1, with zero meaning no correlation and values near ±1 indicating predictability of Y from X.
Practice: Correlation • 1:46
Apply correlation analysis to assess whether height or weight better predicts shoe size. Calculate the height–shoe size correlation and compare it to the existing 0.59 weight–shoe size correlation.
Solution: Correlation • 7:32
Compute the correlation between height and shoe size using the correlation formula, including sums, sums of squares, and data points, revealing a strong 0.741 relationship and guiding height-based estimates.
Practice: Linear Regression • 0:31
Apply linear regression to height to estimate shoe size, and create the regression line for this exercise; then we will walk through the solution together.
Solution: Linear Regression • 6:36
Compute the regression line by determining the slope m and intercept b from data, then use it to predict shoe size from height, and plan regression evaluation.
Residual, MSE & MAE • 7:32
Explore residuals and their limitations, then compare squared error with mean squared error and mean absolute error as metrics to assess regression accuracy.
Practice: MSE & MAE • 0:52
Assess regression accuracy by calculating mean squared error and mean absolute error on a new dataset of 10 heights, comparing true shoe sizes with predicted sizes.
Solution: MSE & MAE • 3:19
Compute the mean squared error of 0.35 and the mean absolute error of 0.49 by averaging the squared and absolute residuals over ten data points.
Coefficient of determination • 12:16
Explore how the coefficient of determination, or r-squared, shows how much of the variance in y is explained by x in regression, and its 0-1 range.
Root Mean Square Error • 6:24
Explore how root mean square error measures regression quality by averaging squared residuals and linking it to variance and standard deviation to interpret data dispersion.
Practice: RMSE • 1:00
Apply the root mean square error to assess regression performance and determine the 68 percent error range using the empirical rule and standard deviation.
Solution: RMSE • 2:08
Compute the root mean square error as the square root of the sum of squared residuals divided by N, yielding about 0.55, with roughly 68% of errors within ±0.55.
Quiz: Regression
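
The regression workflow in this section (fit a line, measure correlation, then evaluate the residuals) can be sketched in plain Python. The height and shoe size values below are hypothetical and invented for illustration; they are not the course dataset, so the results will differ from the 0.741 correlation and the error values quoted in the solutions. The function names are also hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def pearson_r(xs, ys):
    """Pearson correlation coefficient, always between -1 and 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def error_metrics(y_true, y_pred):
    """Return (MSE, MAE, RMSE) computed from the residuals."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r ** 2 for r in residuals) / len(residuals)
    mae = sum(abs(r) for r in residuals) / len(residuals)
    return mse, mae, math.sqrt(mse)

# Hypothetical data, invented for illustration (not the course dataset).
heights = [160, 165, 170, 175, 180, 185]   # cm
shoe_sizes = [37, 39, 40, 42, 43, 45]      # EU sizes

m, b = fit_line(heights, shoe_sizes)
predicted = [m * h + b for h in heights]
r = pearson_r(heights, shoe_sizes)
mse, mae, rmse = error_metrics(shoe_sizes, predicted)

print(f"shoe size = {m:.3f} * height + {b:.2f}")
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
print(f"MSE = {mse:.3f}, MAE = {mae:.3f}, RMSE = {rmse:.3f}")
```

For a simple linear regression with one predictor, the square of the Pearson correlation equals the coefficient of determination. The RMSE is expressed in the same units as the target (shoe sizes here), which is what makes the 68% error range interpretation possible.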

Multiple Linear Regression • 16:02
Explore how multiple linear regression extends simple linear regression using multiple predictors, like horsepower and weight, to predict miles per gallon while examining correlation, residuals, and overfitting.
Overfitting • 5:19
Examine overfitting in data science by comparing an overfitted model to linear regression and applying cross-validation with a train-test split to assess generalization.
Polynomial Regression • 13:05
Discover how polynomial regression better fits non-linear data than linear regression by reducing residuals and squared errors, while using the Bayesian information criterion to choose an appropriate order and avoid overfitting.
Logistic Regression • 9:27
Understand why logistic regression outperforms linear regression for classification tasks by modeling probability between zero and one and using a 0.5 decision boundary.
Decision Trees • 21:06
Explore how decision trees classify and regress data using root and leaf nodes, criteria, and data splits, including numerical and categorical features and Gini impurity.
Regression Trees • 14:21
Explore regression trees, combining decision trees with regression to bucket data and predict leaf means using squared residuals. Learn splits, overfitting risks, and advantages such as intuitive use and handling of missing data.
Random Forests • 12:38
Explore how random forests, an ensemble method using bootstrap aggregating of many small decision trees trained on random feature subsets, reduce overfitting and improve accuracy for classification and regression.
Dealing with missing data • 10:16
Learn how missing data can bias results and apply four strategies to handle it: delete a column, remove rows, delete rows by a key column, or impute values.
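
The Gini impurity mentioned in the Decision Trees lecture is the quantity a classification tree tries to reduce with each split. This is a minimal sketch with hypothetical function names and invented labels; it shows the standard formula only and is not taken from the course material.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k ** 2) over the class proportions p_k."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted average impurity of the two child nodes of a split."""
    total = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / total

# Hypothetical labels, invented for illustration.
parent = ["buy", "buy", "buy", "no", "no", "no"]
left, right = ["buy", "buy", "buy", "no"], ["no", "no"]

print(f"parent impurity: {gini_impurity(parent):.3f}")
print(f"impurity after split: {split_impurity(left, right):.3f}")
```

A tree-building algorithm would evaluate many candidate splits this way and keep the one with the lowest weighted impurity. A pure node, where every label is the same, has an impurity of zero.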

ANOVA - Basics & Assumptions • 5:31
Explore the basics of ANOVA, including factors, levels, and one-way versus two-way designs, and examine key assumptions such as normality, equal variances, and random independent sampling.
One-way ANOVA • 12:25
Explore the one-way ANOVA process using smoking categories to test differences in aggressiveness, calculating group means, sum of squares, degrees of freedom, and the F ratio.
F-Distribution • 10:19
Understand the F-distribution and F-ratio in one-way ANOVA, using the F table and critical value to decide whether to reject the null hypothesis.
Two-way ANOVA – Sum of Squares • 15:44
Apply two-way ANOVA to assess how age group and smoking influence aggressiveness, including their interaction. Compute sums of squares (total, columns, rows, within-group, and error) to test effects and interaction.
Two-way ANOVA – F-ratio & conclusions • 11:24
Learn to compute the two-way ANOVA F-ratio, interpret main effects and interaction, and draw conclusions on smoking and age effects using sum of squares, degrees of freedom, and critical values.
Quiz: ANOVA
Wrap up • 0:40
Celebrate completing the statistics journey and recognize your perseverance throughout the course. Offer continued support, invite questions, and look forward to seeing you in another class.
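
Looking back at the one-way ANOVA lecture in this section, the F-ratio calculation (group means, sums of squares, degrees of freedom) can be sketched as follows. The scores are hypothetical and invented for illustration, and the function name `one_way_anova` is an assumption rather than anything from the course.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    values = [x for group in groups for x in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical aggressiveness scores for three groups, invented for illustration.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova(groups)
print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

The resulting F-ratio is then compared with the critical value from an F table. For (2, 6) degrees of freedom at a 5% significance level the critical value is about 5.14, so an F-ratio above that leads to rejecting the null hypothesis of equal group means.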

Requirements

Absolutely no previous experience required. We will learn everything right from the basics and then work our way up step by step
Eagerness and motivation to learn

Description

Are you aiming for a career in Data Science or Data Analytics?

Good news: you don't need a maths degree. This course equips you with the practical knowledge needed to master the necessary statistics.

If you want to become a Data Scientist or a Data Analyst, a good knowledge of statistics & probability theory is very important.

Sure, there is more to Data Science than statistics alone. Still, knowing these fundamentals of statistics plays an essential role.

I know it is very hard to gain a strong foothold in these concepts just by yourself. Therefore I have created this course.

Why should you take this course?

This is the one statistics course that equips you with the knowledge you actually need when you work with data.
This course is taught by an actual mathematician who also works as a data scientist.
This course balances theory with practical real-life examples.
After completing this course, you'll have everything you need to master the fundamentals of statistics & probability needed in data science or data analysis.

What is in this course?

This course gives you the chance to systematically master the core concepts in statistics & probability, descriptive statistics, hypothesis testing, regression analysis, analysis of variance, and some advanced regression / machine learning methods such as logistic regression, polynomial regression, decision trees and more.

In real-life examples you will learn the stats knowledge needed in a data scientist's or data analyst's career very quickly.

If this sounds good to you, then take this chance to improve your skills and advance your career by enrolling in this course.

Who this course is for:

Anybody who wants to master statistics & probability for data science & data analysis
Anybody who wants to pursue a career in Data Science
Professionals and students who want to understand the necessary statistics for data analysis

Course content

Let's get started • 4 lectures • 14min

Descriptive statistics • 13 lectures • 1hr 28min

Distributions • 6 lectures • 43min

Probability theory • 27 lectures • 3hr 14min

Hypothesis testing • 12 lectures • 1hr 55min

Regressions • 14 lectures • 1hr 13min

Advanced regression & machine learning algorithms • 8 lectures • 1hr 42min

ANOVA (Analysis of Variance) • 6 lectures • 56min

Wrap up • 1 lecture • 1min
