What is a distribution?

A free video tutorial from 365 Careers

Lecture description

We explain what a distribution is, what types of distributions there are, and how this helps us better understand statistics.

Learn more from the full course

Statistics for Data Science and Business Analysis

Statistics you need in the office: Descriptive & Inferential statistics, Hypothesis testing, Regression analysis

04:48:23 of on-demand video • Updated January 2021

Before we can talk about testing, we have to learn what a distribution is. And in this lesson, we’ll do just that! In statistics, when we use the term distribution, we usually mean a probability distribution. Good examples are the Normal distribution, the Binomial distribution, and the Uniform distribution.

Alright. Let’s start with a definition! A distribution is a function that shows the possible values for a variable and how often they occur.

Think about a fair die. It has six sides, numbered from 1 to 6. We roll the die. What is the probability of getting 1? It is one out of six, so one-sixth, right? Easy. What is the probability of getting 2? Once again, one-sixth. The same holds for 3, 4, 5 and 6. We have an equal chance of getting each of the 6 outcomes. Now, what is the probability of getting a 7? It is impossible to get a 7 when rolling a single die. Therefore, the probability is 0.

Okay. Let’s generalize. The distribution of an event consists not only of the values that were actually observed; it is made up of all possible values. So, the distribution of the event, rolling a die, will be given by the following table. The probability of getting 1 is one-sixth, or approximately 0.17, the probability of getting 2 is also 0.17, and so on. You can be sure that you have exhausted all possible values when the sum of their probabilities is equal to 1, or 100% (using the exact value of one-sixth, not the rounded 0.17). Similar to what we discussed about getting a 7, for all other values, the probability of occurrence is 0.

And that’s the probability distribution of rolling a die. By the way, it is called a discrete uniform distribution: all outcomes have an equal chance of occurring.

Okay. Each probability distribution has a visual representation. It is a graph describing the likelihood of occurrence of every event. Here’s the graph for our example. It is crucial to understand that the graph is JUST a visual representation of a distribution. Often, when we talk about distributions, we make use of the graph.
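
To make the single-die table concrete, here is a minimal Python sketch of it. The function names `die_distribution` and `probability` are made up for this illustration and are not part of the lecture; exact fractions are used so that the probabilities sum to exactly 1 rather than to the rounded 1.02.

```python
from fractions import Fraction


def die_distribution(sides=6):
    """Return {outcome: probability} for one fair die (hypothetical helper)."""
    return {face: Fraction(1, sides) for face in range(1, sides + 1)}


def probability(distribution, value):
    """Probability of `value`; any value not listed is impossible, so it gets 0."""
    return distribution.get(value, Fraction(0))


one_die = die_distribution()

# Print the table: outcome, exact probability, rounded probability.
for face, p in one_die.items():
    print(face, p, round(float(p), 2))

# An impossible outcome has probability 0.
print("P(7) =", probability(one_die, 7))

# All possible values are covered when the probabilities sum to 1.
print("Sum of probabilities =", sum(one_die.values()))
```

Note that the dictionary, not any plot drawn from it, is what plays the role of the distribution here.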
Because we lean on the graph so often, many people believe that a distribution is the graph itself. However, this is NOT true. A distribution is defined by the underlying probabilities and not the graph. The graph is just a visual representation.

Alright. After this short clarification, let’s explore a different example. Think about rolling two dice. What are the possible outcomes? One and one, two and one, one and two, and so on. Here’s a table with all the possible combinations.

Say we are playing a game where we are trying to guess the sum of the two dice. What’s the probability of getting a sum of 1? It’s 0, as this event is impossible. The minimum sum we can get is 2. So, what’s the probability of getting a sum of 2? There is only one combination that would give us a sum of 2: when both dice are equal to 1. So, 1 out of 36 total outcomes, or approximately 0.03. Similarly, the probability of getting a sum of 3 is given by the number of combinations that give a sum of 3 divided by 36. There are two of them (one and two, two and one), so 2 divided by 36, or approximately 0.06. We can continue in this way until we have the full probability distribution.

Let’s see the graph associated with it. Looking at it, we can easily understand that when rolling two dice, the probability of getting a sum of 7 is the highest. Moreover, we can also compare different outcomes, such as the probability of getting a 10 and the probability of getting a 5. It’s evident that it’s less likely that we’ll get a 10.

Great! The examples that we saw here were of discrete variables. Next, we will focus on continuous distributions, as they are more common in inference. In the next few lessons, we’ll examine some of the main types of continuous distributions, starting with the Normal distribution. See you there!
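
The two-dice counting described above can be reproduced with a short Python sketch. It simply lists all 36 combinations and counts how many give each sum. The name `two_dice_sum_counts` is hypothetical, chosen for this illustration, and the text bars are only a rough stand-in for the graph shown in the video.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_sum_counts(sides=6):
    """Count how many (first die, second die) combinations give each sum."""
    faces = range(1, sides + 1)
    return Counter(a + b for a, b in product(faces, repeat=2))


counts = two_dice_sum_counts()
total = sum(counts.values())  # 36 combinations for two six-sided dice

# The distribution itself: each possible sum mapped to its probability.
two_dice = {s: Fraction(c, total) for s, c in sorted(counts.items())}

# Table with a crude text bar per sum (one '#' per combination).
for s, p in two_dice.items():
    print(f"{s:2d}  {counts[s]}/{total}  {float(p):.2f}  {'#' * counts[s]}")

# The most likely sum, and the comparison between a 10 and a 5.
most_likely = max(two_dice, key=two_dice.get)
print("Most likely sum:", most_likely)
print("P(10) < P(5):", two_dice[10] < two_dice[5])
```

Running it should show the bars rising up to a sum of 7 and falling afterwards, which matches the comparison between a 10 and a 5 made in the lesson.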