Chapter 5 Understanding and Comparing Distributions

Another Useful Graphical Method: Boxplots


Pulse Rates n = 138

Median: mean of the pulses in ordered positions 69 and 70; median = (70 + 70)/2 = 70

Q1: median of the lower half (lower half = 69 smallest pulses); Q1 = pulse in ordered position 35; Q1 = 63

Q3: median of the upper half (upper half = 69 largest pulses); Q3 = pulse in position 35 from the high end; Q3 = 78


Recall the 5-number summary of data from Chapter 4

  • Minimum, Q1, median, Q3, maximum

  • Pulse data 5-number summary: 45, 63, 70, 78, 111

  • A boxplot is a graphical display of the 5-number summary
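
The median-of-halves rule used for the pulse data can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names are illustrative, and for an odd number of observations it assumes the overall median is left out of both halves (conventions differ on this point). The pulse data themselves are not reproduced here, so the example uses the stand-in values 1 to 138, for which each value equals its ordered position.

```python
def median(sorted_values):
    """Median of a list that is already sorted in increasing order."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    # Even count: mean of the two middle values.
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: when the count is odd, the overall median is excluded
    from both halves. Needs at least two observations.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]        # the `half` smallest values
    upper = values[n - half:]    # the `half` largest values
    return (values[0], median(lower), median(values), median(upper), values[-1])


# Stand-in data: 138 values whose value equals their ordered position.
summary = five_number_summary(range(1, 139))
print(summary)  # (1, 35, 69.5, 104, 138)
```

With n = 138 the sketch picks position 35 for Q1 and position 104 (the 35th from the high end) for Q3, and averages positions 69 and 70 for the median, matching the positions on the pulse slide.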


Example

  • Consider the data shown at the left.

    • The data values 6.1, 5.6, …, are in the right column

    • They are arranged in decreasing order from 6.1 (data rank of 25 shown in far left column) to 0.6 (data rank of 1 in far left column)

    • The center column shows the ranks of the quartiles (in blue) from each end of the data and from the overall median (in yellow)


Boxplot: display of 5-number summary

Five-number summary: min, Q1, m, Q3, max

Largest = max = 6.1

Q3 = third quartile = 4.2

m = median = 3.4

Q1 = first quartile = 2.3

Smallest = min = 0.6


Boxplot: display of 5-number summary

  • Example: ages of 66 “crush” victims at rock concerts, 1999-2000.

    5-number summary: 13, 17, 19, 22, 47


Boxplot construction

1) construct box with ends located at Q1 and Q3; in the box mark the location of median (usually with a line or a “+”)

2) fences are determined by moving a distance 1.5(IQR) from each end of the box, where IQR = Q3 − Q1 is the interquartile range;

2a) upper fence is 1.5*IQR above the upper quartile

2b) lower fence is 1.5*IQR below the lower quartile

Note: the fences only help with constructing the boxplot; they do not appear in the final boxplot display


Box plot construction (cont.)

3) whiskers: draw lines from the ends of the box left and right to the most extreme data values found within the fences;

4) outliers: special symbols represent each data value beyond the fences;

4a) sometimes a different symbol is used for “far outliers” that are more than 3 IQRs from the quartiles
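
Steps 2 through 4a can be sketched in Python as follows. This is an illustrative sketch only: the function name, the returned dictionary keys, and the nine data values are all hypothetical, the quartiles are passed in rather than computed, and a value lying exactly on a fence is treated as within the fences (an assumption, since the steps above do not say).

```python
def boxplot_parts(data, q1, q3):
    """Compute fences, whisker ends, and outliers from data and quartiles."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr   # step 2b
    upper_fence = q3 + 1.5 * iqr   # step 2a

    # Step 3: whiskers reach the most extreme values within the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    # Step 4: every value beyond a fence is plotted individually.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    # Step 4a: "far outliers" lie more than 3 IQRs from the quartiles.
    far = [x for x in outliers if x < q1 - 3 * iqr or x > q3 + 3 * iqr]

    return {
        "fences": (lower_fence, upper_fence),
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(outliers),
        "far_outliers": sorted(far),
    }


# Hypothetical data with hypothetical quartiles Q1 = 11 and Q3 = 17.
parts = boxplot_parts([1, 10, 12, 13, 14, 15, 16, 18, 30], q1=11, q3=17)
print(parts["fences"])    # (2.0, 26.0)
print(parts["whiskers"])  # (10, 18)
print(parts["outliers"])  # [1, 30]
```

Note that the whiskers stop at 10 and 18, the last data values inside the fences, not at the fences themselves.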


Boxplot: display of 5-number summary

Largest = max = 7.9

Distance to Q3: 7.9 − 4.2 = 3.7

Q3 = third quartile = 4.2

Interquartile range: Q3 − Q1 = 4.2 − 2.3 = 1.9

Q1 = first quartile = 2.3

1.5 × IQR = 1.5 × 1.9 = 2.85. Individual #25 has a value of 7.9 years, which is 3.7 years above the third quartile. This is more than 1.5 × IQR = 2.85 above Q3, so individual #25 is a suspected outlier.


ATM Withdrawals by Day, Month, Holidays


Beginning-of-class pulses (n = 138)

  • Q1 = 63, Q3 = 78

  • IQR = 78 − 63 = 15

  • 1.5(IQR) = 1.5(15) = 22.5

  • Lower fence: Q1 − 1.5(IQR) = 63 − 22.5 = 40.5

  • Upper fence: Q3 + 1.5(IQR) = 78 + 22.5 = 100.5
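
The same fences can be used to classify the extremes of the pulse data (minimum 45, maximum 111 from the 5-number summary). The sketch below is a hedged illustration: the function name and the label strings are made up for this example, and only the quartiles and the two extremes come from the slides.

```python
Q1, Q3 = 63, 78               # pulse quartiles from the slide
MINIMUM, MAXIMUM = 45, 111    # extremes from the 5-number summary


def classify(value, q1, q3):
    """Label a value relative to the 1.5*IQR fences and the 3*IQR limits."""
    iqr = q3 - q1
    if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
        return "far outlier"
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "suspected outlier"
    return "within fences"


print(classify(MINIMUM, Q1, Q3))  # within fences
print(classify(MAXIMUM, Q1, Q3))  # suspected outlier
```

The minimum, 45, lies above the lower fence of 40.5. The maximum, 111, lies above the upper fence of 100.5 but below Q3 + 3(IQR) = 123, so it is a suspected outlier and not a far outlier.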

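[Figure: boxplot of the pulse data, labeled at 45 (minimum), 63 (Q1), 70 (median), 78 (Q3), and at the fences 40.5 and 100.5]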


Below is a box plot of the yards gained in a recent season by the 136 NFL receivers who gained at least 50 yards. What is the approximate value of Q3 ?

[Figure: boxplot titled "Pass Catching Yards by Receivers"; axis ticks at 0, 136, 273, 410, 547, 684, 821, 958, 1095, 1232, 1369]

  • 450

  • 750

  • 215

  • 545


Rock concert deaths: histogram and boxplot


Automating Boxplot Construction

  • Excel “out of the box” does not draw boxplots.

  • Many add-ins are available on the internet that give Excel the capability to draw box plots.

  • Statcrunch (http://statcrunch.stat.ncsu.edu) draws box plots.
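
Where no plotting tool is at hand, a rough boxplot can even be produced as plain text. The sketch below is only an illustration of the idea and says nothing about how Excel add-ins or Statcrunch work: the function name, its parameters, the characters used, and the numbers in the example are all hypothetical. It expects the whisker ends and outliers to have been determined already.

```python
def text_boxplot(whisker_low, q1, median, q3, whisker_high, outliers=(), width=41):
    """Render a one-line text boxplot: |---[===+===]---|   *"""
    lo = min([whisker_low, *outliers])
    hi = max([whisker_high, *outliers])
    if hi == lo:
        raise ValueError("need a nonzero range to set the scale")

    def col(x):
        # Map a data value to a character column between 0 and width - 1.
        return round((x - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for c in range(col(whisker_low), col(whisker_high) + 1):
        row[c] = "-"                 # whiskers
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="                 # box from Q1 to Q3
    row[col(whisker_low)] = "|"
    row[col(whisker_high)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(median)] = "+"           # median marked with a "+"
    for x in outliers:
        row[col(x)] = "*"            # one symbol per outlier
    return "".join(row)


# Hypothetical summary: whiskers at 0 and 40, box from 10 to 30, outlier at 50.
plot = text_boxplot(0, 10, 20, 30, 40, outliers=[50], width=51)
print(plot)
```

Values that map to the same column overwrite one another, so a wider `width` gives a more faithful picture.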


Statcrunch Boxplot

Largest = max = 7.9

Q3 = third quartile = 4.2

Q1 = first quartile = 2.3


Tuition 4-yr Colleges


Statcrunch: 2012-13 NFL Salaries by Position


College Football Head Coach Salaries by Conference


2013 Major League Baseball Salaries by Team


End of Chapter 5

