Chapter 4 Displaying and Summarizing Quantitative Data
Objectives • Histogram • Stem-and-leaf plot • Dotplot • Shape • Center • Spread • Outliers • Mean • Median • Range • Interquartile range (IQR) • Percentile • 5-Number summary • Resistant • Variance • Standard Deviation
Dealing With a Lot of Numbers… • Summarizing the data will help us when we look at large sets of quantitative data. • Without summaries of the data, it’s hard to grasp what the data tell us. • The best thing to do is to make a picture… • We can’t use bar charts or pie charts for quantitative data, since those displays are for categorical variables.
Reasons for Constructing Quantitative Frequency Tables 1. Large data sets can be summarized. 2. Can gain some insight into the nature of data. 3. Have a basis for constructing a histogram.
Ways to chart quantitative data • Histograms and stemplots These are summary graphs for a single variable. They are very useful to understand the pattern of variability in the data. • Line graphs: time plots Use when there is a meaningful sequence, like time. The line connecting the points helps emphasize any change over time. • Other graphs to reflect numerical summaries are Dotplots and Cumulative Frequency Curves (Ogive).
Quantitative Data Histogram
Histogram • To make a histogram we first need to organize the data using a quantitative frequency table. • Two types of quantitative data • Discrete – use ungrouped frequency table to organize. • Continuous – use grouped frequency table to organize.
Quantitative Frequency Tables – Ungrouped • What is an ungrouped frequency table? An ungrouped frequency table lists each data value together with its frequency, the number of times that value occurs. • Commonly used with discrete quantitative data.
Quantitative Frequency Tables – Ungrouped • Example: The at-rest pulse rates for 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.
Quantitative Frequency Tables – Ungrouped • Example Continued Note: The (ungrouped) classes are the observed values themselves.
Quantitative Relative Frequency Tables – Ungrouped Note: The relative frequency for a class is obtained by computing f/n, where f is the frequency of the class and n is the total number of observations.
Quantitative Frequency Tables – Grouped • What is a grouped frequency table? A grouped frequency table is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval. • Commonly used with continuous quantitative data.
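The ungrouped table for the pulse-rate example can be sketched in a few lines of Python. This is only an illustration, not the method used on the slides; the function name `ungrouped_frequency_table` and the column layout are assumptions made for the example.

```python
from collections import Counter

# At-rest pulse rates for the 16 athletes in the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]

def ungrouped_frequency_table(values):
    """Return (value, frequency, relative frequency) rows sorted by value."""
    n = len(values)
    counts = Counter(values)  # each observed value is its own class
    return [(value, f, f / n) for value, f in sorted(counts.items())]

table = ungrouped_frequency_table(pulse_rates)

# Print the table: observed value, frequency f, relative frequency f/n.
print(f"{'Pulse':>5} {'f':>3} {'f/n':>7}")
for value, f, rel in table:
    print(f"{value:>5} {f:>3} {rel:>7.4f}")
```

The classes are the observed values themselves, so no class limits have to be chosen.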
Quantitative Frequency Tables – Grouped • Later, we will encounter a graphical display called the histogram. We will see that grouped frequency tables are used to construct these displays.
Quantitative Frequency Tables – Grouped • There are several procedures that one can use to construct a grouped frequency table. • However, because of the many statistical software packages (MINITAB, SPSS etc.) and graphing calculators (TI-83 etc.) available today, it is not necessary to construct such distributions using pencil and paper.
Quantitative Frequency Tables – Grouped • A frequency table should have a minimum of 5 classes and a maximum of 20 classes. • For small data sets, one can use between 5 and 10 classes. • For large data sets, one can use up to 20 classes.
Quantitative Frequency Tables – Grouped • Example: The weights of 30 female students majoring in Physical Education on a college campus are as follows: 143, 113, 107, 151, 90, 139, 136, 126, 122, 127, 123, 137, 132, 121, 112, 132, 133, 121, 126, 104, 140, 138, 99, 134, 119, 112, 133, 104, 129, and 123. Summarize the data with a frequency distribution using seven classes.
Quantitative Frequency Tables – Grouped Example Continued • NOTE: We will introduce the histogram here to help us explain a grouped frequency distribution.
Quantitative Frequency Tables – Grouped Example Continued • What is a histogram? A histogram is a graphical display of a frequency or a relative frequency table that uses classes and vertical (or horizontal) bars (rectangles) of various heights to represent the frequencies.
Histogram • The most common graph used to display one-variable quantitative data.
Quantitative Frequency Tables – Grouped Example Continued • The MINITAB statistical software was used to generate the histogram in the next slide. • The histogram has seven classes. • Classes for the weights are along the x-axis and frequencies are along the y-axis. • The number at the top of each rectangular box represents the frequency for the class.
Quantitative Frequency Tables – Grouped Example Continued Histogram with 7 classes for the weights.
Quantitative Frequency Tables – Grouped Example Continued • Observations • From the histogram, the classes (intervals) are 85 – 95, 95 – 105, 105 – 115, etc., with corresponding frequencies of 1, 3, 4, etc. • We will use this information to construct the grouped frequency distribution.
Quantitative Frequency Tables – Grouped Example Continued • Observations (continued) • Observe that the upper class limit of 95 for the class 85 – 95 is listed as the lower class limit for the class 95 – 105. • Since the value of 95 cannot be included in both classes, we will use the convention that the upper class limit is not included in the class.
Quantitative Frequency Tables – Grouped Example Continued • Observations (continued) • That is, the class 85 – 95 should be interpreted as having the values 85 and up to 95 but not including the value of 95. • Using these observations, the grouped frequency distribution is constructed from the histogram and is given on the next slide.
Quantitative Frequency Tables – Grouped Example Continued • Observations (continued) • In the grouped frequency distribution, the sum of the relative frequencies did not add up to 1. This is due to rounding to four decimal places. • The same observation should be noted for the cumulative relative frequency column.
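The grouped table described in these observations can be reproduced with a short Python sketch. It assumes classes of width 10 starting at 85 (as read from the histogram) and the convention that the upper class limit is excluded. The table itself is not reproduced in the text here, so accumulating the rounded relative frequencies is an assumption about how the cumulative column was computed; the function name is hypothetical.

```python
weights = [143, 113, 107, 151, 90, 139, 136, 126, 122, 127,
           123, 137, 132, 121, 112, 132, 133, 121, 126, 104,
           140, 138, 99, 134, 119, 112, 133, 104, 129, 123]

def grouped_frequency_table(values, start, width, num_classes):
    """Tally values into classes [lower, upper): the upper limit is excluded."""
    n = len(values)
    rows = []
    cumulative_rel = 0.0
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width
        f = sum(1 for v in values if lower <= v < upper)
        rel = round(f / n, 4)                        # rounded to four decimals
        # Assumption: the cumulative column sums the rounded values.
        cumulative_rel = round(cumulative_rel + rel, 4)
        rows.append((lower, upper, f, rel, cumulative_rel))
    return rows

rows = grouped_frequency_table(weights, start=85, width=10, num_classes=7)
for lower, upper, f, rel, cum_rel in rows:
    print(f"{lower:>3} - {upper:<3} {f:>2} {rel:>7.4f} {cum_rel:>7.4f}")

# The rounded relative frequencies need not sum to exactly 1.
rounded_total = round(sum(rel for _, _, _, rel, _ in rows), 4)
print("Sum of rounded relative frequencies:", rounded_total)
```

Because each relative frequency is rounded before it is added, the total can fall slightly short of 1, which is the rounding effect noted above.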
Creating a Histogram It is an iterative process: try and try again. What bin size should you use? • Not too many bins with either 0 or 1 counts • Not so summarized that you lose all the information • Not so detailed that it is no longer a summary Rule of thumb: Start with 5 to 10 bins. Look at the distribution and refine your bins. (There isn’t a unique or “perfect” solution.)
[Figure: the same data set shown twice, once with too many bins (not summarized enough) and once with too few bins (too summarized).]
Histograms Definitions • Frequency Distributions • Example
Lower Class Limits Lower Class Limits are the smallest numbers that can actually belong to different classes
Upper Class Limits Upper Class Limits are the largest numbers that can actually belong to different classes
Class Boundaries are the numbers used to separate classes, but without the gaps created by class limits
Class Boundaries (the numbers separating classes): -0.5, 99.5, 199.5, 299.5, 399.5, 499.5
Class Midpoints (or Class Marks) are the midpoints of the classes. Class midpoints can be found by adding the lower class limit to the upper class limit and dividing the sum by two.
Class Midpoints: 49.5, 149.5, 249.5, 349.5, 449.5
Class Width is the difference between two consecutive lower class limits or two consecutive lower class boundaries. In the example, every class has a width of 100.
Summary of Terminology • Class – One of the non-overlapping intervals into which the data are divided. • Class Limits – The smallest and largest observed values in a given class. • Class Boundaries – Fall halfway between the upper class limit for the smaller class and the lower class limit for the larger class. Used to close the gap between classes. • Class Width – The difference between the class boundaries for a given class. • Class Mark – The midpoint of a class.
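These terms can be checked numerically with a small Python sketch. The class limits 0–99, 100–199, ..., 400–499 are an assumption, inferred from the boundaries and midpoints listed on the slides; the function names are hypothetical.

```python
# Assumed class limits, inferred from the boundaries -0.5, 99.5, ..., 499.5.
class_limits = [(0, 99), (100, 199), (200, 299), (300, 399), (400, 499)]

def class_boundaries(limits):
    """Boundaries fall halfway across the gap between adjacent classes."""
    gap = limits[1][0] - limits[0][1]      # e.g. 100 - 99 = 1
    half = gap / 2
    return [limits[0][0] - half] + [upper + half for _, upper in limits]

def class_midpoints(limits):
    """Midpoint (class mark) = (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in limits]

def class_width(limits):
    """Width = difference between two consecutive lower class limits."""
    return limits[1][0] - limits[0][0]

print(class_boundaries(class_limits))
print(class_midpoints(class_limits))
print(class_width(class_limits))
```

Note that the width equals the difference between consecutive boundaries (99.5 - (-0.5) = 100), not the difference between the limits of one class (99 - 0 = 99).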
Constructing A Frequency Table 1. Decide on the number of classes (should be between 5 and 20). 2. Calculate the class width (round up): class width = (highest value – lowest value) / (number of classes). 3. Starting point: Begin by choosing a lower limit of the first class. 4. Using the lower limit of the first class and the class width, proceed to list the lower class limits. 5. List the lower class limits in a vertical column and proceed to enter the upper class limits. 6. Go through the data set putting a tally in the appropriate class for each data value.
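The six steps can be sketched in Python for whole-number data. This is a minimal illustration, not a definitive procedure: it reuses the 30 weights from the earlier example, takes the lowest data value as the first lower class limit (one possible choice for step 3), and adds a guard that is not on the slide for the case where the rounded width would leave the highest value outside the last class. The function name is hypothetical.

```python
import math

weights = [143, 113, 107, 151, 90, 139, 136, 126, 122, 127,
           123, 137, 132, 121, 112, 132, 133, 121, 126, 104,
           140, 138, 99, 134, 119, 112, 133, 104, 129, 123]

def build_frequency_table(values, num_classes):
    """Return (class width, [(lower limit, upper limit, frequency), ...])."""
    lowest, highest = min(values), max(values)
    # Step 2: (highest - lowest) / number of classes, rounded up.
    width = math.ceil((highest - lowest) / num_classes)
    # Guard (assumption, not on the slide): widen by 1 if the last class
    # would otherwise end below the highest value.
    if lowest + num_classes * width - 1 < highest:
        width += 1
    # Steps 3-4: start at the lowest value and list the lower class limits.
    lower_limits = [lowest + i * width for i in range(num_classes)]
    # Step 5: for whole-number data, each upper limit is one less than
    # the next lower limit.
    upper_limits = [lower + width - 1 for lower in lower_limits]
    # Step 6: tally each value into its class (both limits included).
    frequencies = [sum(1 for v in values if lower <= v <= upper)
                   for lower, upper in zip(lower_limits, upper_limits)]
    return width, list(zip(lower_limits, upper_limits, frequencies))

width, table = build_frequency_table(weights, num_classes=7)
print("class width:", width)
for lower, upper, f in table:
    print(f"{lower:>3} - {upper:<3} {f}")
```

With seven classes the range 151 - 90 = 61 gives 61 / 7 ≈ 8.71, which rounds up to a class width of 9.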
Histogram Then to complete the Histogram, graph the Frequency Table data.
Frequency Histogram vs Relative Frequency Histogram • Frequency histogram: a bar graph in which the horizontal scale represents the classes of data values and the vertical scale represents the frequencies.
Frequency Histogram vs Relative Frequency Histogram • Relative frequency histogram: has the same shape and horizontal scale as a frequency histogram, but the vertical scale is marked with relative frequencies.
Histograms – Facts • Histograms are useful when the data values are quantitative. • A histogram gives an estimate of the shape of the distribution of the population from which the sample was taken. • If relative frequencies are plotted along the vertical axis instead of frequencies, the shape of the histogram is the same.
Making Histograms on the TI-83/84 • Use of Stat Plots on the TI-83/84 • Raw Data: 548, 405, 375, 400, 475, 450, 412, 375, 364, 492, 482, 384, 490, 492, 490, 435, 390, 500, 400, 491, 945, 435, 848, 792, 700, 572, 739, 572
Frequency Table Data:
Class Limits      Frequency
350 to < 450      11
450 to < 550      10
550 to < 650      2
650 to < 750      2
750 to < 850      2
850 to < 950      1
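For readers without a TI-83/84, the same tally can be checked with a short Python sketch that also prints a rough text histogram. It is not a transcription of the calculator steps; the function name `tally` and the starred-bar layout are assumptions for illustration.

```python
raw_data = [548, 405, 375, 400, 475, 450, 412,
            375, 364, 492, 482, 384, 490, 492,
            490, 435, 390, 500, 400, 491, 945,
            435, 848, 792, 700, 572, 739, 572]

def tally(values, start, width, num_classes):
    """Count the values in each class [lower, lower + width)."""
    counts = [0] * num_classes
    for v in values:
        index = (v - start) // width      # which class the value falls in
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts

# Classes 350 to < 450, 450 to < 550, ..., 850 to < 950.
counts = tally(raw_data, start=350, width=100, num_classes=6)

# One row of stars per class, with the frequency in parentheses.
for i, f in enumerate(counts):
    lower = 350 + i * 100
    print(f"{lower} to < {lower + 100} | {'*' * f} ({f})")
```

The single value in the 850 to < 950 class (945) stands well apart from the bulk of the data, which is easy to see in the text bars.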
Quantitative Data Stem and leaf Plot
Stem-and-Leaf Plots • What is a stem-and-leaf plot? A stem-and-leaf plot is a data plot that uses part of a data value as the stem to form groups or classes and part of the data value as the leaf. • Most often used for small or medium-sized data sets. For larger data sets, histograms do a better job. • Note: A stem-and-leaf plot has an advantage over a grouped frequency table or histogram, since a stem-and-leaf plot retains the actual data by showing them in graphic form.
Stemplots • Include a key that shows how to read the stemplot, e.g. 0|9 = 9. How to make a stemplot: • Separate each observation into a stem, consisting of all but the final (rightmost) digit, and a leaf, which is that remaining final digit. Stems may have as many digits as needed. Use only one digit for each leaf: either round or truncate the data values to one digit after the stem. • Write the stems in a vertical column with the smallest value at the top, and draw a vertical line at the right of this column. • Write each leaf in the row to the right of its stem, in increasing order out from the stem. Original data: 9, 9, 22, 32, 33, 39, 39, 42, 49, 52, 58, 70
STEM | LEAVES
0|99
1|
2|2
3|2399
4|29
5|28
6|
7|0
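The stemplot procedure can be sketched in Python for non-negative whole numbers. This is a minimal illustration under that assumption; the function name `stemplot` is hypothetical, and stems with no leaves are kept so that gaps in the data stay visible.

```python
from collections import defaultdict

data = [9, 9, 22, 32, 33, 39, 39, 42, 49, 52, 58, 70]

def stemplot(values):
    """Return stemplot rows as strings of the form 'stem|leaves'."""
    leaves = defaultdict(list)
    for v in sorted(values):              # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)        # stem: all but last digit, leaf: last digit
        leaves[stem].append(leaf)
    rows = []
    # Include every stem from smallest to largest, even those with no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        rows.append(f"{stem}|{''.join(str(d) for d in leaves[stem])}")
    return rows

for row in stemplot(data):
    print(row)
```

Each row reads like the key 0|9 = 9: the stem is the tens digit and every leaf is a ones digit.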