Chapter 4 - PowerPoint PPT Presentation

Chapter 4

The Description of Data: Measures of Variation and Dispersion

Measures of Variation

• We have looked at measures of the center, or location, of data.

• We also need a measure of the dispersion of data.

Range

• The range is the distance spanned by the data.

• The range is calculated by subtracting the smallest data value from the largest.

• The range is sensitive to outliers.

• The range does not provide any information regarding the data between the minimum and maximum.
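The calculation described above can be sketched in a few lines of Python. The helper name `data_range` is an assumption made for illustration; the prices are the six television quotes used in the exercises later in this chapter, and the extra value of 1200 is invented only to show how a single outlier stretches the range.

```python
def data_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)

# Six television prices from the chapter's exercises.
prices = [326, 299, 345, 295, 310, 345]
print(data_range(prices))  # 50

# One hypothetical extreme quote (invented for illustration).
prices_with_outlier = prices + [1200]
print(data_range(prices_with_outlier))  # 905
```

Only the minimum and maximum enter the calculation, which is why the range says nothing about the values in between.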

Interquartile Range

• The interquartile range is the distance spanned by the middle 50% of the data.

• The interquartile range is calculated by subtracting Q1 from Q3.

• The interquartile range is not sensitive to outliers, but still gives insight into the dispersion of the data.
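As a rough sketch, the interquartile range can be computed with Python's standard `statistics.quantiles` function (Python 3.8 or later). Quartile conventions differ between textbooks, calculators, and software, so the default "exclusive" method used here is an assumption and may not match the method used in this course. The data values are invented for illustration.

```python
import statistics

def interquartile_range(values):
    """Return Q3 - Q1, using the default 'exclusive' quartile method."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)

# Invented data: the second list replaces the largest value with an outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
data_with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]

print(data_range(data), data_range(data_with_outlier))
print(interquartile_range(data), interquartile_range(data_with_outlier))
```

In this example the range changes drastically when the outlier is introduced, while the interquartile range stays the same because the middle 50% of the data is unchanged.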

Mean Absolute Deviation

• The mean absolute deviation is the average of the absolute distances between each data value and the mean, µ.
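A minimal sketch of this calculation in Python follows. The function name `mean_absolute_deviation` is an assumption for illustration, and the data are the six television prices from the exercises later in this chapter, treated here as a complete population so that the mean plays the role of µ.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

prices = [326, 299, 345, 295, 310, 345]  # mean is 320
mad = mean_absolute_deviation(prices)
print(round(mad, 2))  # 18.67
```

The absolute value keeps positive and negative deviations from cancelling; without it, the deviations from the mean always sum to zero.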

Variance and Standard Deviation

• The variance is the average squared distance to the mean.

• The standard deviation is the square root of the variance.

Variance and Standard Deviation

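• For a sample, we divide the sum of squared deviations by n − 1 rather than n, which corrects the bias that would otherwise make the sample variance underestimate the population variance.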

• The standard deviations of populations and samples are available from your calculator. Variance can be calculated as the square of the standard deviation.
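For readers working in software rather than on a calculator, the sample variance and standard deviation can be sketched in Python as below. The function name `sample_variance` is an assumption for illustration; the data are the six television prices from the exercises later in this chapter. The standard library functions `statistics.variance` (sample, divides by n − 1) and `statistics.pvariance` (population, divides by n) are included as a cross-check.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

prices = [326, 299, 345, 295, 310, 345]

s2 = sample_variance(prices)   # sample variance
s = math.sqrt(s2)              # sample standard deviation
print(round(s2, 2), round(s, 2))  # 490.4 22.14

# Cross-check against the standard library.
print(statistics.variance(prices))   # sample variance (n - 1)
print(statistics.pvariance(prices))  # population variance (n), smaller
```

Squaring the standard deviation recovers the variance, as the slide notes.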

Chebyshev’s Theorem

• For any k > 1, the proportion of data found within k standard deviations of the mean is at least 1 − 1/k².

Chebyshev’s Theorem

• Chebyshev’s Theorem holds for any distribution, but the bound it gives is often loose.

• The theorem gives only the minimum proportion of data in a given interval; in practice, the actual proportion is usually much higher than this bound.
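The gap between the guaranteed minimum and the observed proportion can be illustrated with a short Python sketch. It assumes the usual statement of Chebyshev's bound, 1 − 1/k² for k > 1; the function names are hypothetical, and the data are the six television prices from the exercises later in this chapter.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations (requires k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2

def proportion_within(values, k):
    """Observed proportion of values within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)

prices = [326, 299, 345, 295, 310, 345]
print(chebyshev_lower_bound(2))      # 0.75
print(proportion_within(prices, 2))  # 1.0
```

For this small sample, all six prices fall within two standard deviations of the mean, well above the 75% that the theorem guarantees.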

The Empirical Rule

• If the distribution of data is normal (bell shaped), then:

• About 68% of the data will be found within one standard deviation of the mean.

• About 95% of the data will be found within two standard deviations of the mean.

• About 99.7% of the data will be found within three standard deviations of the mean.

The Empirical Rule

• The empirical rule applies only to distributions that are approximately normal (bell shaped).

• For such distributions, the empirical rule is much more accurate than Chebyshev’s Theorem.
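The percentages in the empirical rule can be checked against the normal distribution itself. The sketch below uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later); the function name `normal_proportion_within` is an assumption for illustration.

```python
from statistics import NormalDist

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_proportion_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

For comparison, Chebyshev's Theorem guarantees only 75% within two standard deviations, whereas a normal distribution places about 95% there.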

Coefficient of Variation

• The coefficient of variation measures the relative variation of a distribution: the standard deviation expressed as a fraction (usually a percentage) of the mean.

• Since this is a relative measure, there are no units, making it easier to compare the variation of two different populations.
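A minimal Python sketch of the calculation follows, taking the coefficient of variation to be the standard deviation divided by the mean. The slides do not show the formula, so this common definition is an assumption; the function name is hypothetical, and the data are the six television prices from the exercises later in this chapter.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)

prices = [326, 299, 345, 295, 310, 345]
cv = coefficient_of_variation(prices)
print(f"{cv:.2%}")  # 6.92%
```

Because dollars divide out, this figure could be compared directly with the coefficient of variation of data measured in other units.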

Skewness

• Distributions with a long right tail are positively skewed.

• Distributions with a long left tail are negatively skewed.

• Distributions that are not skewed are symmetric.

Pearson’s Coefficient of Skewness

• Pearson’s coefficient of skewness gives a numeric measurement of the skewness of a distribution.

• Distributions with an SK of 0 are symmetric.

• Distributions with a positive SK are positively skewed, while distributions with a negative SK are negatively skewed.
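The slides do not show the formula for SK. One widely used textbook form is SK = 3(mean − median) / standard deviation, and the Python sketch below assumes that form; if this course defines SK differently, the numbers will differ. The function name is hypothetical, and the data are the six television prices from the exercises later in this chapter.

```python
import statistics

def pearson_skewness(values):
    """Pearson's coefficient of skewness, assuming SK = 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return 3 * (mean - median) / sd

prices = [326, 299, 345, 295, 310, 345]  # mean 320, median 318
sk = pearson_skewness(prices)
print(round(sk, 4))  # 0.2709
```

The mean exceeds the median here, so SK is positive and the sample is slightly positively skewed.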

Try it!

• The median price of a home selling in San Diego during 1991 was \$195,000. The first and third quartile prices were \$170,500 and \$232,000 respectively. What was the semi-interquartile range for the cost of a home in San Diego in 1991?

• \$30,750
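The semi-interquartile range is half of the interquartile range, (Q3 − Q1) / 2. The answer above can be checked with a short Python sketch; the function name is an assumption for illustration.

```python
def semi_interquartile_range(q1, q3):
    """Half the distance between the first and third quartiles."""
    return (q3 - q1) / 2

# Quartile prices from the San Diego exercise.
print(semi_interquartile_range(170_500, 232_000))  # 30750.0
```

The median of \$195,000 is not needed for this calculation.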

Try it!

• A sample of 6 prices quoted for a particular television set are \$326, \$299, \$345, \$295, \$310, and \$345.

• Find the range of this sample.

• \$50

Try it!

• A sample of 6 prices quoted for a particular television set are \$326, \$299, \$345, \$295, \$310, and \$345.

• Find the variance for the quoted price of the TV.

• 490.40 (in squared dollars)

Try it!

• A sample of 6 prices quoted for a particular television set are \$326, \$299, \$345, \$295, \$310, and \$345.

• Find the standard deviation for the quoted price of the TV.

• \$22.14

Try it!

• Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of:

• 200

• k = -1.2235

Try it!

• Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of:

• 238.4

• k = 1.0353

Try it!

• Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of:

• 229

• k = 0.4824

Try it!

• Given a set of data with a mean of 220.8 and a standard deviation of 17.0, find the k, or z, value of:

• 198.1

• k = -1.3353
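The four answers above all come from the same calculation: subtract the mean from the value and divide by the standard deviation. A minimal Python sketch, with a hypothetical function name `z_value`, reproduces them.

```python
def z_value(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

mean, sd = 220.8, 17.0
results = {x: round(z_value(x, mean, sd), 4) for x in (200, 238.4, 229, 198.1)}
for x, k in results.items():
    print(x, k)
# 200 -1.2235
# 238.4 1.0353
# 229 0.4824
# 198.1 -1.3353
```

A negative k means the value lies below the mean; a positive k means it lies above.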

Try it!

• Exercise 4.12

• SK = -0.5430