- 80 Views
- Uploaded on
- Presentation posted in: General

Variance & Standard Deviation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

- This lecture is organized by Dr. Alemi and narrated by Yara Alemi. The lecture is based on the OpenIntro Statistics book.

Variance & Standard Deviation

By Farrokh Alemi, Ph.D.

The mean describes the center of data, but the variability in the data is also important. Here we see two sets of data. Both have the same mean. The red data shows a set with tight spread of data around the mean. The green distribution shows a large spread around the mean. We need a measure of how far the data is spread around the mean.

To measure the spread of data we start with deviation. Deviation is the difference between the ith observation and the mean. Here x is the ith observation and meu is the population mean.

Distance

You can think of deviation as how far off the course we are if we deviate from our main road and end up at the observation point.

Observation

Mean

We can deviate to the right of the mean or to the left, so the difference of the mean and observation can be positive or negative. One idea for measuring the spread around the mean is to calculate the sum of deviations for all points in the population. This is problematic. If we just sum the deviations some positive deviations will cancel out other negative deviations and create an image that the spread is less than it is. To avoid this, we first square the deviations and then sum it.

Less than Mean

More than Mean

Observation

Mean

Observation

Mean

The measure of spread calculated in this way is called Variance. It is defined as the average of the square of deviations for every point in the population.

The standard deviation is the square root of the variance. In this formula x indicates the ith observation and x with a bar on top indicates the mean of the sample. N indicates the number of observations in the sample. As in calculation of variance, we are summing the square of the deviations and not the deviations themselves. The standard deviation is roughly the average of the squared deviations. The standard deviation is useful when considering how close the data are to the mean.

To calculate the standard deviation, you first calculate the deviations for each point. Here we have three observations, 5, 2 and 5. The mean is 4. The deviation from the mean are 1, minus 2 and 1. Note that the total sum of deviations is 0.

Once the deviations have been calculated, then you square the deviations. The sum of the squared deviation is 6.

Last you divide the sum by one minus the number of observations and take the square root of it. The sum is 6, there are 3 observations so the division yields 2 and the square root of 2 is 1.7. This value is referred to as standard deviation.

Standard deviation is a measure of spread of data around the mean. If the spread is small, then the value of standard deviation will be small. If the spread is large, the value of standard deviation will be large. We now have a measure for describing the spread of data around the mean.

Large standard deviation

Small standard deviation

One also expects most data to fall within one standard deviation of the mean. The actual percentage that falls within one or two standard deviation depends on the shape of the distribution. Here we see three different distributions that have the same mean and the same standard deviations. Usually 70% of the data fall within one standard deviation of the mean. 95% of the data fall within 2 standard deviation of mean.

Standard Deviation measures the spread of data around the mean

Calculate the standard deviation for the following three observations: 3, 4 and 5.

Let us test your knowledge of standard deviation. See if you can answer this question.