Introduction

Data sets can be compared by examining the differences and similarities between measures of center and spread. The mean and median of a data set are measures of center. These measures describe the typical, or expected, value of a data set. The mean absolute deviation is a measure of spread that describes how far the data values lie from the mean. It is the average of the absolute values of the differences between each data value and the mean. The interquartile range is a measure of spread that describes the range of the middle 50% of a data set. It is the difference between the third and first quartiles.
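
As a rough illustration of these four measures, the following Python sketch computes each one for a small made-up data set. The data values and function names are assumptions for illustration only. The quartiles follow one common classroom convention (the medians of the lower and upper halves, leaving out the middle value when the count is odd); other conventions can give slightly different quartiles.

```python
from statistics import mean, median

def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    center = mean(values)
    return mean([abs(v - center) for v in values])

def interquartile_range(values):
    """Q3 - Q1, with quartiles taken as medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2          # the middle value is skipped when the count is odd
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    return q3 - q1

data = [2, 4, 4, 5, 7, 8, 12]  # hypothetical values, not from the slides
print(mean(data))                     # 6
print(median(data))                   # 5
print(mean_absolute_deviation(data))  # about 2.57
print(interquartile_range(data))      # 4
```

Here the mean (6) is slightly larger than the median (5) because the value 12 pulls the mean upward.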

4.1.3: Comparing Data Sets

Introduction, continued

The center and spread of a data set can also be seen in the shape of a graphical representation of the data. The range of data values can be seen along the x-axis of a graphical representation. Clusters of data values can be seen in graphs that show frequency, such as dot plots and histograms. The interquartile range and median are shown in box plots.

Key Concepts

Measures of center, such as the mean and median, describe the typical, or expected, value of a data set. The mean is influenced by very small or very large data values, whereas the median is largely unaffected by them.
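
A minimal sketch of this effect, using made-up values: adding one very large value to a small data set moves the mean a long way while the median barely changes.

```python
from statistics import mean, median

typical = [10, 12, 13, 14, 15]   # hypothetical values
with_outlier = typical + [100]   # the same data plus one very large value

print(mean(typical), median(typical))            # 12.8 13
print(mean(with_outlier), median(with_outlier))  # the mean rises above 27; the median is 13.5
```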

Measures of spread describe how widely the data values in a set are distributed. Interquartile range and mean absolute deviation are measures of spread.

The interquartile range shows the range of the middle 50% of a data set. It is the difference between the third and first quartiles.

Key Concepts, continued

The mean absolute deviation compares data values to the mean of a data set. A large mean absolute deviation indicates that the data points are, on average, far from the mean.
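
Written as a formula, the mean absolute deviation looks like the expression below. The notation is assumed here rather than taken from the slides: $x_1, \dots, x_n$ are the data values, $n$ is the number of values, and $\bar{x}$ is their mean.

```latex
\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|
```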

Two or more data sets can be compared using measures of center and spread. When choosing a measure of center or spread, identify whether there are very large or very small data values that may influence the mean.

Key Concepts, continued

Data can be compared graphically. The shape of a data set can be seen in a frequency plot, such as a dot plot or histogram.
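
To make the idea of a frequency plot concrete, here is a small sketch that prints a text version of a dot plot. It is only an illustration: the function name and the heights are hypothetical, it assumes whole-number data, and it draws the values down the page rather than along an x-axis.

```python
from collections import Counter

def dot_plot(values, mark="*"):
    """Return a text dot plot: one row per whole-number value, one mark per data point."""
    counts = Counter(values)
    rows = []
    for value in range(min(values), max(values) + 1):
        # Values that never occur still get a row, so gaps in the data stay visible.
        rows.append(f"{value:>3} | {mark * counts[value]}".rstrip())
    return "\n".join(rows)

heights = [61, 62, 62, 63, 63, 63, 65]  # hypothetical values
print(dot_plot(heights))
```

Longer rows mark the values where the data clusters, and empty rows mark gaps.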

Data that is symmetric is concentrated toward the middle of the range of data. The data is arranged the same way on both sides.

Key Concepts, continued

Data that is skewed to the right is concentrated toward the lower range of the data; it has a tail to the right.

Data that is skewed to the left is concentrated toward the upper range of the data; it has a tail to the left.
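
One rough numerical check for skew, offered as a rule of thumb rather than a definition, is to compare the mean with the median: a long tail tends to pull the mean toward it. The sketch below assumes this rule of thumb, a hypothetical function name, and an arbitrary tolerance; it can misjudge unusual data sets, so a plot remains the better guide.

```python
from statistics import mean, median

def skew_direction(values, tolerance=0.5):
    """Rule of thumb only: a tail tends to pull the mean toward it, away from the median."""
    gap = mean(values) - median(values)
    if gap > tolerance:
        return "right"
    if gap < -tolerance:
        return "left"
    return "roughly symmetric"

# All three data sets are made up for illustration.
print(skew_direction([1, 1, 2, 2, 2, 3, 3, 4, 9, 13]))         # right
print(skew_direction([1, 5, 10, 11, 11, 12, 12, 12, 13, 13]))  # left
print(skew_direction([1, 2, 3, 4, 5]))                         # roughly symmetric
```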

Key Concepts, continued

Data that is widely or evenly distributed has greater variation, and data that clusters around a set of values has less variation.

Data can also be compared using a box plot. The width of the box displays the range of the middle 50% of the data; the width increases as variation increases.
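
A box plot is drawn from five values: the minimum, first quartile, median, third quartile, and maximum. The sketch below computes them for two made-up data sets to show how the box widens with variation. The function name and data are hypothetical, and the quartiles are taken as the medians of the lower and upper halves, which is one common classroom convention among several.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum: the five values a box plot draws."""
    ordered = sorted(values)
    half = len(ordered) // 2  # the middle value is skipped when the count is odd
    return (ordered[0], median(ordered[:half]), median(ordered),
            median(ordered[-half:]), ordered[-1])

clustered = [4, 5, 5, 6, 6, 6, 7, 7, 8]     # hypothetical values
spread_out = [1, 2, 4, 5, 6, 7, 8, 10, 11]  # hypothetical values

for name, data in [("clustered", clustered), ("spread out", spread_out)]:
    low, q1, mid, q3, high = five_number_summary(data)
    print(name, "box from", q1, "to", q3, "width", q3 - q1)
```

Both sets have a median of 6, but the box for the clustered set is 2 units wide and the box for the spread-out set is 6 units wide.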

Common Errors/Misconceptions

comparing graphical data that is not drawn using the same scale on the x- and/or y-axes

comparing different measures of center or variation, such as the mean of one data set with the median of another

using an average, such as the mean, to compare data that has very small or very large data values

Guided Practice

Example 2

Each girl in Mr. Sanson’s class and in Mrs. Kwei’s class measured her own height. The heights were plotted on the dot plots below. Use the dot plots to compare the heights of the girls in the two classes.

[Dot plots: Mrs. Kwei’s Class and Mr. Sanson’s Class]

Guided Practice: Example 2, continued

Compare the range of recorded values.

The overall range of heights of girls in the two classes is similar. The heights range from 59 inches to 72 inches in one class and from 60 inches to 72 inches in the other, spans of 13 inches and 12 inches.

Guided Practice: Example 2, continued

Compare the middle values of the data sets.

The girls in Mr. Sanson’s class appear to be taller than the girls in Mrs. Kwei’s class. By looking at where the dots are clustered, we can estimate that the middle height in Mr. Sanson’s class is around 67 inches and the middle height in Mrs. Kwei’s class is around 65 inches.

Guided Practice: Example 2, continued

Compare the variation in the data sets.

The variation in the two sets of heights appears to be similar, except that Mr. Sanson’s data is skewed to the left and Mrs. Kwei’s data is skewed to the right. In both classes, the majority of the heights fall within a span of approximately 6 inches. The majority of the girls in Mr. Sanson’s class are between 64 and 70 inches, and the majority of the girls in Mrs. Kwei’s class are between 61 and 67 inches.

✔

Guided Practice

Example 3

Sam wants to buy a lottery ticket. There are two different tickets that he can buy, and each costs $10. He found a website with information about how much money others have won with their lottery tickets. The information is presented in two box plots, shown on the next slide. Use the two box plots to compare the amounts others have won with Ticket 1 and with Ticket 2.

Guided Practice: Example 3, continued

[Box plots: Ticket 1 and Ticket 2]

Guided Practice: Example 3, continued

Compare the range of data.

The range of amounts won is similar. With Ticket 1, winnings ranged from $0 to $25, and with Ticket 2, winnings ranged from $0 to $22.50.

Guided Practice: Example 3, continued

Compare the center of the data sets.

The median of each data set is shown by the vertical line inside the box. The median winnings from Ticket 1 are higher than the median winnings from Ticket 2.

Guided Practice: Example 3, continued

Compare the variation of the data.

The interquartile range, or middle 50% of each data set, is contained within the box in each box plot. The IQR of winnings with Ticket 1 is smaller than the IQR of winnings with Ticket 2. Also, consider where the median lies within the box. The median winnings with Ticket 1, $7.50, sit slightly to the left in the interquartile range; half of the winnings were less than $7.50 and half were greater.

Guided Practice: Example 3, continued

The median winnings with Ticket 2, $5, sit to the left in the interquartile range; half of the winnings were less than $5 and half were more. Given the wider interquartile range of winnings with Ticket 2 but its lower median, it is more likely that Sam will win a greater amount with Ticket 1.

✔
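
The comparison in this example can be checked with a few lines of arithmetic. The sketch below uses only the minimum, median, and maximum stated above; the dictionary layout and function name are illustrative assumptions. The quartiles are not stated in the text, so the IQR is not computed.

```python
# Values as stated in the example, in dollars.
ticket_1 = {"min": 0.00, "median": 7.50, "max": 25.00}
ticket_2 = {"min": 0.00, "median": 5.00, "max": 22.50}

def value_range(summary):
    """Distance from the smallest to the largest recorded value."""
    return summary["max"] - summary["min"]

print(value_range(ticket_1), value_range(ticket_2))  # 25.0 22.5
print(ticket_1["median"] - ticket_2["median"])       # 2.5
```

The ranges differ by only $2.50, and the median for Ticket 1 is $2.50 higher than the median for Ticket 2.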
