
Explore the mathematics behind data science, covering core data concepts, visualizing data, and distributions, plus probability, combinatorics, sampling, hypothesis testing, and regression with real-world applications.
Explore core data concepts, including continuous versus discrete, structured versus unstructured, and nominal versus ordinal data, plus population versus sample and basic measures such as mean, median, mode, and variance.
Learn mean, median, and mode as central tendencies, including weighted and truncated means, and when to apply each for data with outliers.
Explore dispersion measures that quantify how data spread around the mean. Learn how variance and standard deviation capture this spread for population and sample data.
Explain quartiles and the interquartile range using a data set, define the five-number summary (min, Q1, median, Q3, max), and show how to identify outliers with 1.5 times the IQR.
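As a rough illustration of the 1.5 times IQR rule, here is a minimal Python sketch. The data values and the function name `iqr_outliers` are made up for illustration, and the "inclusive" quartile method is only one of several conventions; other conventions can give slightly different Q1 and Q3 values.

```python
import statistics

def iqr_outliers(values):
    """Return (q1, q3, outliers) using fences at 1.5 * IQR beyond the quartiles."""
    # Assumption: the "inclusive" quartile convention; textbooks differ on this.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, q3, outliers

# Hypothetical data set with one suspiciously large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]
q1, q3, outliers = iqr_outliers(data)
print(q1, q3, outliers)
```

With this data the quartiles are 4 and 8, so the fences sit at -2 and 14 and only the value 30 is flagged.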
Explore how data visualizations bridge data science and decision makers to tell clear stories. Learn when to use scatter plots, line plots, histograms, distribution plots, and categorical plots.
Explore scatter plots to compare two characteristics, read data points, and use a regression line with slope and intercept to estimate outcomes. Note that correlation does not imply causation.
Explore line plots to show continuous changes over time, like monthly temperatures, and compare series using slope to interpret rate of change. Use accumulating line plots when communicating annual totals.
Explore histograms as distribution plots with continuous axes, equal bin widths, and bar heights that show counts by age groups, guiding clear data interpretation and probability questions.
Learn how bar plots visualize categorical variables, not continuous ones, using frequency tables, bar spacing, and optional legends to compare categories and data across continents, with vertical or horizontal bars.
Explore box and whisker plots to visualize median, spread, quartiles, and the five-number summary, including min, max, and interquartile range.
Explore violin plots and kernel density estimation plots to visualize distribution shapes. Grasp how bandwidth and kernel shapes determine density curves, quartiles, and the median for comparing data sets.
Identify common plot pitfalls that distort data, such as misleading vertical axes, inconsistent scales, and weak labeling; learn to choose clear chart types and ensure accuracy.
Explore factorial, permutations, and combinations to count possibilities in business contexts, from product bundles to survey scores, and learn how these concepts drive logistics and decision making.
Explore factorials, from n! and zero factorial equals one to counting arrangements in permutations, with division, multiplication, and other operations, including 52-card deck examples.
Unlock permutations by focusing on order and arrangement, using n!/(n-k)! for no repeats and n^k for repeats, with examples like seating and seven-digit phone numbers, and later compare to combinations.
Discover how combinations differ from permutations, with no repeats and repeats cases, using the binomial coefficient (n choose k) and real-world examples like lotteries and sample selections.
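The four counting cases mentioned above (order matters or not, repeats allowed or not) can be sketched in Python with the standard `math` module. The function names and the specific numbers (5 people for 3 seats, a 6-of-49 lottery, 3 scoops from 5 flavors) are illustrative assumptions, not examples taken from the course.

```python
import math

def permutations_no_repeats(n, k):
    # n! / (n - k)!: ordered selections of k items from n without reuse
    return math.perm(n, k)

def permutations_with_repeats(n, k):
    # n^k: ordered selections where each of the k positions can take any of n values
    return n ** k

def combinations_no_repeats(n, k):
    # "n choose k": unordered selections without reuse
    return math.comb(n, k)

def combinations_with_repeats(n, k):
    # (n + k - 1) choose k: unordered selections where items may repeat
    return math.comb(n + k - 1, k)

print(permutations_no_repeats(5, 3))     # seating 5 people in 3 chairs
print(permutations_with_repeats(10, 7))  # seven-digit numbers from digits 0-9
print(combinations_no_repeats(49, 6))    # picking 6 numbers out of 49
print(combinations_with_repeats(5, 3))   # 3 scoops from 5 flavors
```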
Master probability concepts from addition rules and Bayes' theorem to conditional probability, hypothesis testing, and reporting future-event likelihoods for data-driven decision making.
Explore simple probability, understand how experimental probability varies with small samples and converges to the expected probability under the law of large numbers, using coin tosses and dice examples.
Learn to use the addition rule for probabilities, distinguishing union (or) and intersection (and) for mutually exclusive and overlapping events, illustrated with Venn diagrams and a six-sided die.
Apply the multiplication rule to independent and dependent events to compute the probability of A and B. Use P(B|A) to assess independence with coin flips, cards, and defective parts.
Apply Bayes' theorem to compute the probability a defective part came from line one versus line two using conditional probabilities, priors, and a step-by-step tree diagram.
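For a sense of how the two-line defective-part calculation works, here is a small Python sketch of Bayes' theorem. The production shares (60% and 40%) and defect rates (2% and 5%) are invented for illustration, and `posterior_by_line` is a hypothetical helper name.

```python
def posterior_by_line(priors, defect_rates):
    """Return (P(line | defective) for each line, P(defective))."""
    # Joint probability of "came from this line AND is defective"
    joint = {line: priors[line] * defect_rates[line] for line in priors}
    # Total probability of a defective part (the denominator in Bayes' theorem)
    p_defect = sum(joint.values())
    posterior = {line: joint[line] / p_defect for line in joint}
    return posterior, p_defect

# Assumed numbers: share of production and defect rate per line.
priors = {"line one": 0.60, "line two": 0.40}
defect_rates = {"line one": 0.02, "line two": 0.05}

posterior, p_defect = posterior_by_line(priors, defect_rates)
print(p_defect)   # overall defect probability
print(posterior)  # probability each line produced a given defective part
```

Under these assumed numbers, line two makes fewer parts but accounts for the larger share of defects (about 62.5% versus 37.5%).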
Explore discrete probability by distinguishing discrete from continuous variables, constructing probability distributions for coin flips, and computing the mean, variance, and standard deviation of a discrete random variable.
Shift a random variable by a constant to change the mean, median, and mode while leaving the range, interquartile range, and standard deviation unchanged; scale it to change all of these measures, illustrated with a dropshipping example.
Combine two independent random variables to obtain their sum or difference, using precomputed means and variances to quickly derive the total mean, variance, and standard deviation.
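The rule for combining independent random variables can be sketched as follows: means add or subtract, while variances add in both cases. The helper name `combine_independent` and the input numbers are assumptions for illustration, and the variance rule only holds when the two variables are independent.

```python
import math

def combine_independent(mean_x, var_x, mean_y, var_y, operation="sum"):
    """Mean, variance, and standard deviation of X + Y or X - Y for independent X and Y."""
    if operation == "sum":
        mean = mean_x + mean_y
    elif operation == "difference":
        mean = mean_x - mean_y
    else:
        raise ValueError("operation must be 'sum' or 'difference'")
    # Variances add for both the sum and the difference.
    variance = var_x + var_y
    return mean, variance, math.sqrt(variance)

# Hypothetical precomputed means and variances.
total = combine_independent(10, 9, 4, 16, operation="sum")
diff = combine_independent(10, 9, 4, 16, operation="difference")
print(total, diff)
```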
Explore covariance and correlation for two data features using total bill and tip, learn the covariance formula, and see how correlation standardizes to a minus one to one range.
Learn covariance and its link to correlation, see how it relates to variance, and measure how two data series move together through means and the mean of products.
Learn how the Pearson correlation coefficient (r) standardizes the covariance between X and Y by dividing by their standard deviations, producing a unitless measure of direction and strength that ranges from -1 to 1.
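A minimal Python sketch of these two ideas, using the "mean of products minus product of means" form of covariance and then dividing by both standard deviations. It uses population formulas (dividing by N) throughout, and the bill and tip figures are invented for illustration rather than taken from the course data set.

```python
import statistics

def covariance(xs, ys):
    # Population covariance: mean of the products minus the product of the means
    mean_of_products = statistics.fmean(x * y for x, y in zip(xs, ys))
    return mean_of_products - statistics.fmean(xs) * statistics.fmean(ys)

def pearson_r(xs, ys):
    # Standardize covariance by the two population standard deviations
    return covariance(xs, ys) / (statistics.pstdev(xs) * statistics.pstdev(ys))

# Hypothetical total bills and tips.
total_bill = [10.0, 20.0, 30.0, 40.0, 50.0]
tip = [2.0, 3.0, 5.0, 6.0, 9.0]

print(covariance(total_bill, tip))  # depends on the units of both variables
print(pearson_r(total_bill, tip))   # unitless, between -1 and 1
```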
Explore data distributions through histograms, PMFs and PDFs, comparing discrete and continuous cases with discrete uniform, continuous uniform, and common distributions like binomial, Bernoulli, and Poisson.
Explore probability mass functions that map discrete outcomes to probabilities, nonnegative and summing to one, and can be read or graphed alongside an algebraic form for X.
Explain the discrete uniform distribution as a probability mass function with equal outcomes, using a six-sided die to show each value has probability 1/6.
Explore probability density functions for continuous variables, compute interval probabilities by area under the curve, and verify non-negativity with total area equal to one.
Discover the continuous uniform distribution with PDF 1/(B-A) on [A,B] and zero outside; learn its mean (A+B)/2, variance (B-A)^2/12, and how it contrasts with the discrete uniform distribution.
Explore how cumulative distribution functions accumulate probability for discrete and continuous variables, using capital F, and relate to mass and density functions to compute less-than-or-equal probabilities.
Model the number of successes in a fixed number of independent trials with constant probability using the binomial distribution. Compute P(K; N, p), along with the mean Np and variance Np(1-p).
Define a Bernoulli distribution as a single-trial binomial, with success as 1 and failure as 0, where the mean is p and variance is p(1-p) for a rideshare example.
Explore the Poisson distribution as a discrete random variable with rate lambda, its conditions (constant mean, independence, non-overlapping intervals), and the probability formula, with comparison to binomial.
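To make the binomial and Poisson formulas concrete, here is a short Python sketch of both probability mass functions. The function names and the example values (1000 trials with success probability 0.003) are assumptions chosen to show the case where the Poisson distribution with lambda = Np closely tracks the binomial.

```python
import math

def binomial_pmf(k, n, p):
    # P(K = k) for n independent trials with success probability p
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    # P(K = k) for events occurring at a constant average rate lam per interval
    return lam**k * math.exp(-lam) / math.factorial(k)

# Many trials with a small success probability: Poisson with lam = n * p
# is a close approximation to the binomial.
n, p = 1000, 0.003
lam = n * p
binom_p2 = binomial_pmf(2, n, p)
pois_p2 = poisson_pmf(2, lam)
print(binom_p2, pois_p2)
```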
Explore the normal distribution (Gaussian distribution), identify when data are normally distributed, and use mean, standard deviation, and z scores to calculate probabilities in real-world datasets.
Master the normal distribution and its bell curve centered at the mean. Compute population and sample mean, variance, and standard deviation, with N and n, and n-1 bias correction.
Explore the normal distribution by deriving its probability density function from mean and standard deviation, and apply the empirical rule to estimate areas within one, two, and three standard deviations.
Learn how to standardize any normal distribution to the standard normal with mean zero and standard deviation one by applying shift and scale, yielding Z-scores.
Convert any normally distributed X to standard normal Z with Z = (X - μ)/σ, interpret Z-scores, and use the Z table to read probabilities and percentiles.
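The standardization step Z = (X - μ)/σ can be tried out in Python, with `statistics.NormalDist` standing in for the printed Z table. The mean of 100, standard deviation of 15, and the value 130 are made-up numbers for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    # Shift by the mean, then scale by the standard deviation
    return (x - mu) / sigma

# Hypothetical normally distributed variable.
mu, sigma = 100.0, 15.0
x = 130.0

z = z_score(x, mu, sigma)
# Probability of a value at or below x, read from the standard normal...
p_below = NormalDist().cdf(z)
# ...which matches asking the original (unstandardized) distribution directly.
p_direct = NormalDist(mu, sigma).cdf(x)
print(z, p_below, p_direct)
```

Here the value sits two standard deviations above the mean, which corresponds to roughly the 97.7th percentile.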
Explore how sampling from a population estimates characteristics when full data is unavailable. Apply the central limit theorem, confidence intervals, and t tests to compare samples and assess p values.
Explore sampling from a population, including simple random, stratified, cluster, and systematic methods, to create representative samples and minimize biases like voluntary response and non-response.
Explore how population parameters differ from sample statistics, how the central limit theorem gives the sample mean an approximately normal sampling distribution, and how to use standard error and confidence intervals to infer the population mean.
Explore the Student’s t distribution as a bell-shaped alternative to the standard normal, using the t table and degrees of freedom for small samples. For larger samples, use the z distribution.
Learn to build a confidence interval for the population mean from a sample, using t scores when sigma is unknown. Understand the trade-off between confidence level and interval width.
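A confidence interval for the mean with sigma unknown can be sketched in Python as below. The sample values are invented, `t_confidence_interval` is a hypothetical helper, and the critical t values are passed in by hand as if read from a t table (2.262 for 95% and 3.250 for 99%, both with 9 degrees of freedom), since the standard library has no t distribution.

```python
import math
import statistics

def t_confidence_interval(sample, t_critical):
    """Confidence interval for the population mean: mean +/- t* x s / sqrt(n)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)       # sample standard deviation (n - 1 in the denominator)
    standard_error = s / math.sqrt(n)
    margin = t_critical * standard_error
    return mean - margin, mean + margin

# Hypothetical sample of 10 measurements, so degrees of freedom = 9.
sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]

ci_95 = t_confidence_interval(sample, t_critical=2.262)  # from a t table, df = 9
ci_99 = t_confidence_interval(sample, t_critical=3.250)  # from a t table, df = 9
print(ci_95, ci_99)
```

The 99% interval comes out wider than the 95% interval, which is the trade-off between confidence level and interval width.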
Learn the basics of hypothesis testing, including null and alternative hypotheses, significance level, and type I and II errors. Explore one-tailed and two-tailed tests, p values, and A/B testing.
Explore how inferential statistics use sample data to test hypotheses, focusing on formulating null and alternative statements, directionality options, and the goal of supporting, not proving, the chosen claim.
Define null and alternative hypotheses and explain choosing a level of significance with alpha. Examine how type I and II errors and test power shape decisions in hypothesis testing.
Compute the t statistic from the sample mean, the hypothesized mean μ0, and the standard error s/√n; compare one-tailed and two-tailed tests using alpha to determine regions of rejection and acceptance.
Learn to determine significance in hypothesis testing with t statistics, p values, and critical values, including alpha, degrees of freedom, and one- or two-tailed decisions.
Apply hypothesis testing to A/B testing by defining null and alternative hypotheses, using two-tailed tests, selecting appropriate statistics for discrete or continuous data, and assessing significance.
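The one-sample t statistic from this section, t = (sample mean - μ0)/(s/√n), can be sketched in Python as below. The sample, the hypothesized mean of 50, and the helper name `one_sample_t` are assumptions for illustration; the two-tailed critical value 2.262 (alpha = 0.05, 9 degrees of freedom) is typed in as if read from a t table.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for testing the mean against mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation with n - 1
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample; the null hypothesis is that the population mean equals 50.
sample = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
t_stat, df = one_sample_t(sample, mu0=50)

# Two-tailed decision at alpha = 0.05 using a critical value from a t table (df = 9).
t_critical = 2.262
reject_null = abs(t_stat) > t_critical
print(t_stat, df, reject_null)
```

With these numbers the t statistic is about 1.22, which falls inside the critical values, so the null hypothesis is not rejected.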
Welcome to the best online course for learning about the math behind the field of data science!
Working together for the first time ever, Krista King and Jose Portilla have combined forces to deliver a best-in-class course experience in how to use mathematics to solve real-world data science problems. This course has been specifically designed to help you understand the mathematical concepts behind the field of data science, so you can have a first-principles understanding of how to use data effectively in an organization.
Students entering the field of data science are often unsure where to start learning the fundamental math behind the concepts. This course was specifically designed to bridge that gap and give students a clear, guided path through the complex and interesting world of math used in data science. Designed to balance theory and application, this is the ultimate learning experience for anyone wanting to really understand data science.
Why choose this course?
Together, Krista and Jose have taught over 3.2 million students about data science and mathematics. Their joint expertise means you get the best and clearest mathematical explanations from Krista, with framing about real-world data science applications from Jose. At the end of each section is a set of practice problems developed from real-world company situations, where you can directly apply what you know to test your understanding.
What's covered in this course?
In this course, we'll cover:
Understanding Data Concepts
Measurements of Dispersion and Central Tendency
Different ways to visualize data
Permutations
Combinatorics
Bayes' Theorem
Random Variables
Joint Distributions
Covariance and Correlation
Probability Mass and Density Functions
Binomial, Bernoulli, and Poisson Distributions
Normal Distribution and Z-Scores
Sampling and Bias
Central Limit Theorem
Hypothesis Testing
Linear Regression
and much more!
Enroll today and we'll see you inside the course!
Krista and Jose