
Introduction to the course: a full overview
About the instructor
Explore how statistical hypothesis tests infer population claims from sample data in data science and machine learning. Verify or reject population statements using the null hypothesis.
Explore statistical hypothesis tests, a cornerstone of inferential statistics used in data science and machine learning to make decisions from sample data and verify population claims.
Follow the five basic steps of hypothesis testing. Define the null and alternative hypotheses, set significance, and calculate the test statistic and p-value to decide between the hypotheses.
Learn the next steps of hypothesis testing, including selecting the alpha level (five percent or one percent), computing the test statistic and p-value, and deciding to reject the null hypothesis.
Analyze criteria of a hypothesis test by comparing significance level (alpha) and power, and distinguishing type I and type II errors when deciding whether to reject or fail to reject the null hypothesis.
Explore type I and type II errors in hypothesis testing by comparing medicines to test if effectiveness is the same, and learn how level of significance and power influence decisions.
Learn how rejecting the null hypothesis at five percent signals statistical significance, while a one percent level yields highly statistically significant results, highlighting accuracy and common mistakes.
Explain the p-value as the probability, assuming the null hypothesis is true, of a test statistic at least as extreme as the one observed, and how p-values below the significance level lead to rejecting the null in favor of the alternative.
Use a one-sided hypothesis test to assess whether the mean delivery time exceeds 40 minutes, compute a p-value around 0.12, and compare to 0.05 to reach a conclusion.
Explore one- and two-tailed statistical tests and null versus alternative hypotheses, including less than or greater than scenarios. Understand tails of the normal distribution and the role of the t-test.
Identify the key conditions for hypothesis tests: independent observations, sampling with or without replacement, the 10 percent rule, and a sample size of more than 30 for skewed data.
Basic ideas about the normal distribution
Learn how to perform z-tests under normal distribution assumptions, with known population variance, large samples, and random samples, including one- and two-sample mean tests, proportion tests, and paired analyses.
Apply a one-sample z-test for population mean with known sigma and n ≥ 30; formulate null and alternative hypotheses, compute z statistic, and decide using p-values or critical values.
Present a one-sample z-test for population mean, two-tailed at five percent, using sample mean, population mean, and sigma to compute the z-statistic and p-value.
Demonstrate a one-sample z-test for a population mean in a one-tailed setting, comparing 16.1 to 15.2 with variance 2.5 under 1% significance, including null and alternative hypotheses, z, and p-value.
Apply a one-sample z-test for a population mean using a sleep study, with general mean 7.7 and smoker mean 7.5, z = -2.3, p < 0.05, concluding smokers sleep less.
Apply a one-sample z-test for proportions to assess whether the 22% sample supports the historical 25% claim, using a two-tailed 5% significance, and fail to reject the null.
Use the p-value method for a two-tailed z-test of a proportion, calculating the z and p-value with Excel or R and comparing to 0.05 to decide on the null hypothesis.
Apply a one-sample z-test for proportion to evaluate whether a seatbelt awareness campaign raises usage beyond 0.14, using 3 of 590 drivers and a one-tailed p-value.
Perform a two-sample z-test for means to compare saturated fat between two brands. Test the null hypothesis of equal means and interpret the p-value to determine if the brands differ.
Apply a two-sample z-test for two-tailed proportions to compare smokers and non-smokers on tax increase opinions, using a pooled proportion, null and alternative hypotheses, and p-value at 1% significance.
Explore the basics of the t-test, including when to use it for small samples under 30 versus larger samples, and how population variance, random sampling, and normality shape test choice.
Evaluate whether battery mean equals 300 hours using a one-sample two-tailed t-test with unknown variance on a sample of 20 (mean 270, sd 50); conclude the claim is not supported.
Learn how to compute and interpret the two-tailed one-sample t-test p-value, using t statistics and degrees of freedom, with Excel and R demonstrations.
Apply a one-sample t-test to determine whether the pipe mean resistance meets the specification, using a five percent significance level, a one-tailed alternative, and p-value interpretation.
Learn how to perform a two-sample independent t-test, formulate null and alternative hypotheses, and compute the standard error and degrees of freedom for comparing sample means.
This lecture demonstrates an independent two-sample t-test comparing writing scores of two independent classes, covering null and alternative hypotheses, standard error, degrees of freedom, and p-value interpretation for significance.
Learn the paired t-test for dependent samples, such as before-after measurements, form null and alternative hypotheses, and compute the t statistic with the standard error of the difference.
Use a paired two-sample t-test to assess class effectiveness by comparing pre- and post-class scores, formulating hypotheses, computing t and p values, and drawing conclusions.
Explore chi-square test basics for categorical data, covering goodness-of-fit and independence tests, with emphasis on random samples, independent observations, and minimum expected counts of five.
Explore chi-square goodness-of-fit by formulating null and alternative hypotheses, computing expected counts, and using p-values to determine if the juror age distribution matches registered voters.
In a chi-square goodness-of-fit example, brands A, B, and C are tested for equal preference with 90 students; counts 38, 28, 24 yield a chi-square 3.47 (df=2), not significant.
Explore the chi-square test of independence using a two-category contingency table of gender and evening class. Learn to formulate hypotheses, compute expected counts, and interpret p-values at the 5% level.
Explain a chi-square test of independence to assess whether fear of heights depends on gender, outlining null and alternative hypotheses, expected counts, and the significance decision at 5%.
Learn the basics of analysis of variance (ANOVA) to compare means across three or more groups, and understand hypotheses, assumptions, and the F-test with between-group and within-group sums of squares.
Perform a one-way ANOVA to compare mean customer satisfaction across four companies, each with eight observations. Learn to state hypotheses, compute the F-statistic, and conclude whether means differ.
Explore how one-way ANOVA compares mean outcomes across five categories to assess effects on student achievement and anxiety, with practical examples and hints.
Celebrating completion of the course on statistical hypothesis testing, this bonus lecture shares tips to simplify statistical testing and previews probability distributions as part of a continuing learning journey.
Explore the binomial distribution arising from a Bernoulli process, with two outcomes, n trials, and probability of success p; learn its mean, variance, and standard deviation, plus practical examples.
Hypothesis testing is one of the most important concepts in statistics, especially in inferential statistics. This course explains the basis of statistical hypothesis testing and its terminology (p-value, level of significance, type I and type II errors) in detail. Students will be able to make inferences about a population mean, a proportion, differences between means or proportions, relationships between variables, and more. They will learn how to formulate and conduct a hypothesis test step by step and gain insight into the different types of statistical hypothesis tests. Students first get the basic ideas of the normal distribution, the most widely used distribution and the basis of many statistical tests, along with the empirical rule. They will be able to distinguish between the t-test and the z-test. The course also covers the chi-square test, which is used for qualitative data. The course lays the foundation for more advanced statistical hypothesis testing and helps students understand and interpret models and algorithms in data science and machine learning. In particular, those who want to advance their careers in data science and machine learning should complete the course.
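As a rough illustration of the step-by-step process, three of the worked examples named in the lecture list (the battery t-test, the brand-preference chi-square test, and the sleep-study z-test) can be rechecked with the Python standard library alone. This is a minimal sketch, not the instructor's method: the function names are hypothetical, the 5% level for the battery and brand examples is an assumption, the critical value 2.093 is taken from a standard t table, and the sleep-study test is assumed to be one-tailed (lower tail).

```python
import math
from statistics import NormalDist


def one_sample_t(sample_mean, mu0, sample_sd, n):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1


def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for goodness of fit."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Battery example: H0: mu = 300 vs. H1: mu != 300, with n = 20, mean 270, sd 50.
t_stat, df_t = one_sample_t(270, 300, 50, 20)
# Assumption: two-tailed 5% critical value for 19 degrees of freedom,
# read from a standard t table.
T_CRIT_19 = 2.093
reject_battery = abs(t_stat) > T_CRIT_19

# Brand preference example: H0: brands A, B, and C are equally preferred.
observed = [38, 28, 24]
expected = [sum(observed) / len(observed)] * len(observed)
chi2_stat, df_chi = chi_square_gof(observed, expected)
# With 2 degrees of freedom the chi-square upper-tail probability
# has the closed form exp(-x / 2).
chi2_p = math.exp(-chi2_stat / 2)

# Sleep study example: z = -2.3, assumed one-tailed (smokers sleep less).
z_p = NormalDist().cdf(-2.3)

print(f"battery: t = {t_stat:.2f}, df = {df_t}, reject H0 at 5%: {reject_battery}")
print(f"brands:  chi-square = {chi2_stat:.2f}, df = {df_chi}, p = {chi2_p:.3f}")
print(f"sleep:   one-tailed p = {z_p:.4f}")
```

Under these assumptions the battery test rejects the claimed mean of 300 hours, the brand-preference statistic matches the 3.47 quoted in the lecture summary and is not significant at 5%, and the sleep-study p-value falls below 0.05.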