What Is the Normal Distribution?

A free video tutorial from 365 Careers
The Normal distribution

Lecture description

We introduce the Normal distribution and its great importance to statistics as a field.

Learn more from the full course

Statistics for Data Science and Business Analysis

Statistics you need in the office: Descriptive & Inferential statistics, Hypothesis testing, Regression analysis

04:48:26 of on-demand video • Updated February 2024

Ok! Here we go! So far, we have learned that the distribution of a dataset shows us how frequently the possible values occur within an interval. We also said that there are dozens of distributions. Experienced statisticians can tell a Binomial from a Poisson distribution, or a Uniform from an Exponential distribution, with a quick glance at a plot. In this course, though, we will focus on the Normal and Student’s t distributions, for the following reasons:

• They approximate a wide variety of random variables.
• Distributions of sample means with large enough sample sizes can be approximated by a normal distribution.
• All computable statistics are elegant.
• Decisions based on normal distribution insights have a good track record.

If it sounds like gibberish now, I promise that things will be much easier once we get started 😊

Here is a visual representation of a Normal distribution. You have surely seen a normal distribution before, as it is the most common one. The statistical term for it is the Gaussian distribution, but many people call it the Bell Curve, as it is shaped like a bell. It is symmetrical, and its mean, median and mode are equal. If you remember the lesson about skewness, you will recognize that it has no skew! It is perfectly centered around its mean.

Alright. It is denoted in this way: X ~ N(μ, σ²). N stands for normal, the tilde sign shows it is a distribution, and in brackets we have the mean and the variance of the distribution. On the graph, you can notice that the highest point is located at the mean, because it coincides with the mode. The spread of the graph is determined by the standard deviation, which is the square root of the variance.

Now, let’s try to understand the normal distribution a little bit better. Let’s look at this approximately normally distributed histogram. There is a concentration of observations around the mean, which makes sense, as the mean is equal to the mode. Moreover, the histogram is symmetrical on both sides of the mean. We used 80 observations to create this histogram.
The histogram has a mean of 743 and a standard deviation of 140. Okay, great! But what if the mean is smaller or bigger? Let’s first zoom out a bit by adding the origin of the graph. The origin is the zero point. Adding it to any graph gives perspective. Keeping the standard deviation fixed, or in statistical jargon, controlling for the standard deviation, a lower mean would result in the same shape of the distribution, but further to the left on the plane. In the same way, a bigger mean would move the graph to the right. In our example, this gives two new distributions: one with a mean of 470 and a standard deviation of 140, and one with a mean of 960 and a standard deviation of 140.

Alright, let’s do the opposite. Controlling for the mean, we can change the standard deviation and see what happens. This time the graph does not move; it changes shape. A lower standard deviation results in lower dispersion, so more data in the middle and thinner tails. On the other hand, a higher standard deviation will cause the graph to flatten out, with fewer points in the middle and more toward the ends, or in statistics jargon, fatter tails.

Great! These are the basics of a normal distribution. In our next lesson, we will use this knowledge to talk about standardization. Stay tuned!
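
As a minimal sketch, the properties mentioned in the lecture (equal mean, median and mode; symmetry around the mean; the highest point sitting at the mean) can be checked in Python with the standard library's statistics.NormalDist. The mean of 743 and standard deviation of 140 come from the lecture's histogram; the distances 50, 140 and 300 and the integer grid used to search for the peak are assumptions chosen only for illustration.

```python
from statistics import NormalDist

# Parameters taken from the lecture's histogram example.
mu, sigma = 743, 140
dist = NormalDist(mu=mu, sigma=sigma)

# Mean, median and mode coincide for a Normal distribution.
print("mean, median, mode:", dist.mean, dist.median, dist.mode)

# Symmetry: the density is the same at equal distances on either side
# of the mean. The distances below are illustrative assumptions.
for d in (50, 140, 300):
    left = dist.pdf(mu - d)
    right = dist.pdf(mu + d)
    print(f"distance {d}: left={left:.6f} right={right:.6f}")

# The highest point of the curve is at the mean. We search an integer
# grid of 500 units on each side (an arbitrary, illustrative range).
grid = range(mu - 500, mu + 501)
peak_x = max(grid, key=dist.pdf)
print("peak located at x =", peak_x)

# Because of the symmetry, half of the probability lies below the mean.
print("P(X <= mean) =", dist.cdf(mu))
```

Note that NormalDist takes the standard deviation, not the variance, as its second parameter, whereas the N(mean, variance) notation lists the variance; here the variance is 140 squared.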
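
The effect of moving the mean or changing the standard deviation can also be sketched numerically. The means 470, 743 and 960 and the standard deviation of 140 come from the lecture; the alternative standard deviations 70 and 280, the 200-unit cutoff, and the helper name tail_share are hypothetical and used only to illustrate the idea.

```python
from statistics import NormalDist

# Same standard deviation, different means (values from the lecture).
lower = NormalDist(mu=470, sigma=140)
base = NormalDist(mu=743, sigma=140)
higher = NormalDist(mu=960, sigma=140)

# The peak height is identical, so the shape is unchanged;
# only the location of the curve moves left or right.
for d in (lower, base, higher):
    print(f"mean={d.mean:.0f} peak height={d.pdf(d.mean):.6f}")

# Same mean, different standard deviations.
# 70 and 280 are hypothetical values, not taken from the lecture.
narrow = NormalDist(mu=743, sigma=70)
wide = NormalDist(mu=743, sigma=280)


def tail_share(dist, cutoff=200):
    """Probability of falling more than `cutoff` units away from the mean."""
    # By symmetry, both tails hold the same probability.
    return 2 * dist.cdf(dist.mean - cutoff)


# A smaller standard deviation gives a taller peak and thinner tails;
# a larger one flattens the curve and puts more probability in the tails.
for d in (narrow, base, wide):
    print(
        f"sd={d.stdev:.0f} "
        f"peak height={d.pdf(d.mean):.6f} "
        f"share beyond 200 units={tail_share(d):.4f}"
    )
```

Comparing the printed peak heights and tail shares shows the same pattern as the graphs in the lecture: the mean controls where the curve sits, and the standard deviation controls how concentrated or spread out it is.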