Chapter 12: Describing Distributions with Numbers
We create graphs to give us a picture of the data. We also need numbers to summarize the center and spread of a distribution.
Two types of descriptive statistics for categorical variables:
1) Counts (Frequencies)
2) Rates or Proportions (Relative Frequencies)
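As a small illustration of the two summaries above, the sketch below tallies counts and proportions for a categorical variable. The eye-color data are made up for the example and do not come from the chapter.

```python
from collections import Counter

# Hypothetical categorical data (made up for illustration): eye colors of 10 people.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

# 1) Counts (frequencies): how many observations fall in each category.
counts = Counter(eye_colors)

# 2) Proportions (relative frequencies): each count divided by the total.
total = len(eye_colors)
proportions = {category: count / total for category, count in counts.items()}

for category, count in counts.most_common():
    print(f"{category}: count = {count}, proportion = {proportions[category]:.2f}")
```

The proportions always add up to 1, which is a quick check that no category was missed.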
Question: Who is the best home run hitter ever in major league baseball?
Players with high single-season home run totals:
The median (M) is the midpoint of a distribution when the observations are arranged in increasing order. It is the number such that half the observations are smaller and the other half are larger. (p. 219)
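A minimal sketch of the median calculation follows. The rule for an even number of observations (average the two middle values) is the usual textbook convention and is assumed here. The season totals are made up for illustration and are not real player data.

```python
def median(values):
    """Return the median M of a list of numbers."""
    ordered = sorted(values)  # arrange the observations in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the median is the single middle observation.
        return ordered[mid]
    # Even count (assumed convention): average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical season totals (made up for illustration).
odd_example = [22, 8, 15, 30, 11]        # sorted: 8 11 15 22 30
even_example = [22, 8, 15, 30, 11, 40]   # sorted: 8 11 15 22 30 40

print(median(odd_example))   # 15
print(median(even_example))  # 18.5
```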
Calculate M for Sosa’s home runs per season (8 seasons, through 1999).
Calculate M for Maris’s home runs per season (11 seasons).
The 5-number summary of a data set consists of the following descriptive statistics (p. 221):
Minimum, First Quartile (Q1), Median, Third Quartile (Q3), Maximum
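The sketch below computes a 5-number summary. It assumes one common textbook rule for quartiles: Q1 is the median of the observations below the median's position and Q3 is the median of those above it, leaving out the overall median when the count is odd. Software packages often use slightly different rules, so their quartiles may differ. The function names and the data are hypothetical.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]            # observations below the median's position
    upper = ordered[half + n % 2:]    # observations above the median's position
    return (ordered[0],
            median_of_sorted(lower),
            median_of_sorted(ordered),
            median_of_sorted(upper),
            ordered[-1])


# Hypothetical season totals (made up for illustration).
data = [22, 8, 15, 30, 11, 40, 19]   # sorted: 8 11 15 19 22 30 40
print(five_number_summary(data))     # (8, 11, 19, 30, 40)
```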
Give the 5-number summaries for Sosa’s and Maris’s home runs.
A boxplot is a graphical representation of the 5-number summary. (p. 221)
Interquartile range = IQR = Q3 - Q1, the spread of the middle half of the data (the length of the box in a boxplot).
To draw a horizontal boxplot along a number line:
1) Compute the 5-number summary.
2) Draw a vertical line at Q1 and another at Q3.
3) Draw two horizontal lines connecting them to complete the box.
4) Draw a vertical line inside the box at the median.
5) Draw “whiskers” from the box out to the extremes (Min and Max).
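The drawing steps can be sketched in plain text, with no plotting library. The function `text_boxplot`, its `width` parameter, and the summary values are hypothetical, chosen only to illustrate the steps. A real plot would normally be drawn by hand or with graphing software.

```python
def text_boxplot(summary, width=33):
    """Return three text rows that sketch a horizontal boxplot.

    `summary` is (minimum, Q1, median, Q3, maximum).
    """
    minimum, q1, med, q3, maximum = summary
    if not minimum <= q1 <= med <= q3 <= maximum:
        raise ValueError("summary values must be in increasing order")
    if minimum == maximum:
        raise ValueError("cannot scale the plot when Min equals Max")

    def column(x):
        # Map a data value to a character column between 0 and width - 1.
        return round((x - minimum) / (maximum - minimum) * (width - 1))

    lo, left, mid, right, hi = (column(v) for v in summary)
    edge = [" "] * width     # top and bottom edges of the box
    middle = [" "] * width   # row holding the whiskers and the median

    # Step 2: vertical lines at Q1 and Q3.
    middle[left] = middle[right] = "|"
    edge[left] = edge[right] = "+"
    # Step 3: two horizontal lines complete the box.
    for i in range(left + 1, right):
        edge[i] = "-"
    # Step 4: vertical line at the median.
    middle[mid] = "|"
    # Step 5: whiskers out to Min and Max.
    for i in range(lo, left):
        middle[i] = "-"
    for i in range(right + 1, hi + 1):
        middle[i] = "-"

    edge_row = "".join(edge)
    return [edge_row, "".join(middle), edge_row]


# Step 1: a 5-number summary (hypothetical values, made up for illustration).
summary = (8, 11, 19, 30, 40)
iqr = summary[3] - summary[1]   # IQR = Q3 - Q1, the length of the box

rows = text_boxplot(summary)
print("\n".join(rows))
print("IQR =", iqr)             # IQR = 19
```

Drawing two such plots on the same scale, one above the other, makes it easy to compare the centers and spreads of two distributions.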
Draw boxplots for Sosa’s and Maris’s home runs.