# Computing the mean and the weighted mean of the data

A free video tutorial from Luc Zio
Adjunct faculty of Statistics, Data Scientist
4.3 instructor rating • 7 courses • 3,124 students

## Lecture description

In this video we show how to compute the mean, discuss the advantages and disadvantages of the mean, and illustrate how to calculate a weighted mean with real-world examples.

### Learn more from the full course

Introductory statistics Part1: Descriptive Statistics

Learn the concepts, calculations and applications of statistics at your own pace and comfort.

04:14:07 of on-demand video • Updated September 2016

• By the end of this course you will be knowledgeable in using descriptive statistical analysis techniques to summarize and analyze the data
• By the end of this course you will be able to compute measures of center of the data, measures of spread, measures of relative positions
• By the end of this course, you will know how to use the empirical rule for analyzing data
• By the end of this course you will know how to compute the correlation coefficient and make interpretations of the data
• By the end of this course, you will understand the concepts of sample, population and the different methods of sampling
English [Auto] Hello. In this next lecture we will cover the methods for describing the center of the data. Those metrics are called measures of central tendency, or measures of the center of the data. We'll talk about how to compute the mean, the median, the mode, the midrange, the weighted mean, and many other types of measures of the center of the data. We will show how to interpret these statistics, the mean, median, mode, and weighted mean, based on real-world data.

Calculating the sample mean. The sample mean, or average of the data, is found by adding all the observations x and dividing by the number of observations, and it is read "x-bar". Here x could represent the weights of patients, the heights of people, students' scores, etc., whatever type of data we have at hand, and n will be the number of data elements. The sample mean is also known as the arithmetic mean. So here the formula is $\bar{x} = \frac{\sum x}{n}$, where the Greek letter sigma means "summation of". So we add all the elements of x and divide by the number of elements in the sample.

All right, here's an example with five test scores: [unclear], 95, 100. What is the mean of the data? We use the formula x-bar equals the sum of x divided by n. So we add all the data points, divide by five, and we get 91.

Calculating the population mean, or average. Here the process is similar to calculating the sample mean: we add all the elements x in the population and divide by the size of the population. The only difference is that we use the Greek letter mu, $\mu$, to denote the mean of the population. So here $\mu = \frac{\sum x}{N}$: the sum of all the elements in the population divided by the size of the population. Since the population is much bigger than the sample, we use the letter N to denote the size of the population. Computationally it's the same as computing the sample mean.

Advantages of the sample mean: if we take several samples from a population, the means are likely to be approximately the same.
That property is called resistance to sampling variation. What this means is that if we take several samples at random from a population and compute the mean of each sample, obviously the means will not be exactly the same, because we are selecting the elements at random and are not selecting the same elements in each process. However, because we are using random sampling, although the means will vary, that variability will be small, so they are likely to be approximately the same. For example, let's say that I select a sample of 1,000 students at random from a big university and compute the average age of that sample, then repeat the process: select another random sample and compute the average age. In the first sample I may get the average age to be 28.5, in the second one maybe 26.2, in the third let's say 29.1, and so forth. They will vary, but that variability will be small.

All the data is used to compute the mean. That is one other property of the sample mean.

Disadvantages of the mean: the mean is easily affected by extreme observations in the data, which we call outliers. It can be misleading when the data has outliers. Let's suppose that we are computing the average income of college students, and let's say we have 40 students in one classroom. Let's suppose that one of the students is a big entrepreneur and his income is, let's say, two million dollars a year. Now when we compute the average income, we add all the incomes, so that student making 2 million or 10 million dollars a year will compensate for the whole class. When we compute the mean, we will get the wrong impression that every student makes a large income, let's say \$100,000 a year.

Another case, for example: let's say we want to compute the average income of baseball players. Some of them make \$300,000 a year, others just one million dollars a year, and then you have somebody making 20 million dollars a year. So they have large contracts.
When we compute the average income, we just add all the incomes and divide by the number of baseball players, and we get the wrong impression that all of them are millionaires. Why? Because the people making 20 million dollars a year compensate for everybody. They are called outliers because those are unusual cases: you have only a handful of players with very large contracts, and the majority of the players make just an average income a year.

The mean is distorted when the distribution is skewed. We will talk about skewed distributions later on, but when the distribution has a long tail to the right or a long tail to the left, the mean is distorted. In other words, it doesn't give a good interpretation of the data.

Now we will talk about another type of mean, called the weighted mean. In a situation where the measurements have different weights, the weighted mean needs to be calculated. The formula for the weighted mean is $\bar{x} = \frac{\sum w x}{\sum w}$. Here w is the weight, so we multiply each weight by its observation, and then sigma says add them up, all of them, and divide by the sum of the weights. If the measurements have equal weights, for example all of them have a weight of 1, then the result will be just the sample mean, unweighted. But in many cases when the data is weighted, the measurements will not have equal weights.

Let's go to a practical example. We have two scenarios. In the first scenario, the student's grades are assumed to have the same weight. Typically most teachers use that approach, so that at the end of the semester, to compute the average score, we just add all the grades and divide by the number of grades. It is straightforward. But for illustration, suppose we use a weight of one, so all of them have the same weight. How do we compute the weighted mean? We multiply the grade x by the weight w.
So 80 times 1 equals 80, 71 times 1 equals 71, 68 times 1 equals 68, 70 times 1 is 70, and 61 times 1 is 61. That is x times w. The formula then says to add them up, all of them, sigma w x, and when we add all of these we get 350. The sum of the weights, sigma w, equals five. So the weighted mean when every weight is one is just 350 over 5, or 70, which is the same as the unweighted mean, because all the tests have the same weight of one.

Now let's consider the second scenario. In the second scenario the teacher decides that the first test will have a weight of 10 percent, or 0.1. The second test will have a weight of 10 percent also, 0.1. However, the third test will have a weight of 20 percent, 0.2; the fourth test, a weight of 20 percent, 0.2; and the final, or last, test will have a larger weight, 40 percent, 0.4. This means in this particular case that if a student doesn't do well on the last test, his or her grade may be largely affected.

Let's illustrate how to compute the weighted mean here. We multiply x by w: 80 times 0.1 is 8, 71 times 0.1 is 7.1, 68 times 0.2 is 13.6, 70 times 0.2 is 14, and 61 times 0.4 is 24.4. Now applying the formula, we add the products, sigma w times x, and we get 67.1. Then we add up the weights again; here the sum of the weights is one. So the weighted mean in the second scenario is just 67.1 divided by 1, which is 67.1.

As you can see, in this second scenario the average of the student is 67.1, whereas in the unweighted case it was 70. In the first scenario the student didn't do well on the last test, but that test has the same weight as all the other tests, so it could only bring the average grade down so far. In the second scenario the last test, the final, has a larger weight, and the student didn't do well.
So it brought the average score down in the second scenario. So here the student gets 67.1, which is a D, whereas here the student gets 70, which is a C. Thank you.
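
The two grading scenarios from the lecture can be checked with a short Python sketch. The function and variable names (`mean`, `weighted_mean`, `grades`, and so on) are illustrative choices, not something shown in the video; the grades and weights are the ones used in the example.

```python
def mean(values):
    """Arithmetic mean: the sum of the observations divided by their count."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Weighted mean: the sum of w * x divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# The five test grades from the lecture's example.
grades = [80, 71, 68, 70, 61]

# Scenario 1: every test has a weight of 1.
equal_weights = [1, 1, 1, 1, 1]

# Scenario 2: 10%, 10%, 20%, 20%, and 40% for the final test.
final_heavy_weights = [0.1, 0.1, 0.2, 0.2, 0.4]

scenario_1 = weighted_mean(grades, equal_weights)
scenario_2 = weighted_mean(grades, final_heavy_weights)

print(f"Unweighted mean:        {mean(grades):.1f}")
print(f"Scenario 1 (weights 1): {scenario_1:.1f}")
print(f"Scenario 2 (weighted):  {scenario_2:.1f}")
```

With equal weights the weighted mean matches the ordinary mean of 70, while the heavier weight on the low final score pulls the second scenario down to about 67.1. Libraries offer the same calculation; for instance, `numpy.average` accepts a `weights` argument.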