slide1 n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Slides Prepared by JOHN S. LOUCKS St. Edward’s University PowerPoint Presentation
Download Presentation
Slides Prepared by JOHN S. LOUCKS St. Edward’s University

Loading in 2 Seconds...

play fullscreen
1 / 35

Slides Prepared by JOHN S. LOUCKS St. Edward’s University - PowerPoint PPT Presentation


  • 109 Views
  • Uploaded on

Slides Prepared by JOHN S. LOUCKS St. Edward’s University. y. x. Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations Part B. Exploratory Data Analysis Crosstabulations and Scatter Diagrams. Minor mods by McKim. Exploratory Data Analysis.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Slides Prepared by JOHN S. LOUCKS St. Edward’s University' - winifred


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

Slides Prepared by

JOHN S. LOUCKS

St. Edward’s University

slide2

y

x

Chapter 2Descriptive Statistics:Tabular and Graphical PresentationsPart B

  • Exploratory Data Analysis
  • Crosstabulations and

Scatter Diagrams

Minor mods by McKim

exploratory data analysis
Exploratory Data Analysis
  • The techniques of exploratory data analysis consist of
  • simple arithmetic and easy-to-draw pictures that can
  • be used to summarize data quickly.
  • One such technique is the stem-and-leaf display.
stem and leaf display
Stem-and-Leaf Display
  • A stem-and-leaf display shows both the rank order
  • and shape of the distribution of the data.
  • It is similar to a histogram on its side, but it has the
  • advantage of showing the actual data values.
  • The first digits of each data item are arranged to the
  • left of a vertical line.
  • To the right of the vertical line we record the last
  • digit for each item in rank order.
  • Each line in the display is referred to as a stem.
  • Each digit on a stem is a leaf.
slide5

Example: Hudson Auto Repair

The manager of Hudson Auto

would like to have a better

understanding of the cost

of parts used in the engine

tune-ups performed in the

shop. She examines 50

customer invoices for tune-ups. The costs of parts,

rounded to the nearest dollar, are listed on the next

slide.

slide6

Example: Hudson Auto Repair

  • Sample of Parts Cost for 50 Tune-ups
slide7

Stem-and-Leaf Display

5

6

7

8

9

10

2 7

2 2 2 2 5 6 7 8 8 8 9 9 9

1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9

0 0 2 3 5 8 9

1 3 7 7 7 8 9

1 4 5 5 9

a stem

a leaf

stretched stem and leaf display
Stretched Stem-and-Leaf Display
  • If we believe the original stem-and-leaf display has
  • condensed the data too much, we can stretch the
  • display by using two stems for each leading digit(s).
  • Whenever a stem value is stated twice, the first value
  • corresponds to leaf values of 0 - 4, and the second
  • value corresponds to leaf values of 5 - 9.
slide9

Stretched Stem-and-Leaf Display

5

5

6

6

7

7

8

8

9

9

10

10

2

7

2 2 2 2

5 6 7 8 8 8 9 9 9

1 1 2 2 3 4 4

5 5 5 6 7 8 9 9 9

0 0 2 3

5 8 9

1 3

7 7 7 8 9

1 4

5 5 9

slide10

Stem-and-Leaf Display

  • Leaf Units
  • A single digit is used to define each leaf.
  • In the preceding example, the leaf unit was 1.
  • Leaf units may be 100, 10, 1, 0.1, and so on.
  • Where the leaf unit is not shown, it is assumed
  • to equal 1.
example leaf unit 0 1
Example: Leaf Unit = 0.1

If we have data with values such as

8.6 11.7 9.4 9.1 10.2 11.0 8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1

8

9

10

11

6 8

1 4

2

0 7

slide12

Example: Leaf Unit = 10

If we have data with values such as

1806 1717 1974 1791 1682 1910 1838

a stem-and-leaf display of these data will be

Leaf Unit = 10

16

17

18

19

8

The 82 in 1682

is rounded down

to 80 and is

represented as an 8.

1 9

0 3

1 7

crosstabulations and scatter diagrams
Crosstabulations and Scatter Diagrams
  • Thus far we have focused on methods that are used
  • to summarize the data for one variable at a time.
  • Often a manager is interested in tabular and
  • graphical methods that will help understand the
  • relationship between two variables.
  • Crosstabulation and a scatter diagram are two
  • methods for summarizing the data for two (or more)
  • variables simultaneously.
slide14

Crosstabulation

  • A crosstabulation is a tabular summary of data for

two variables.

  • Crosstabulation can be used when:
    • one variable is qualitative and the other is
    • quantitative,
    • both variables are qualitative, or
    • both variables are quantitative.
    • In other words, any time!! (Duh.)
  • The left and top margin labels define the classes for
  • the two variables.
slide15

Crosstabulation

  • Example: Finger Lakes Homes

The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

qualitative

variable

quantitative

variable

Home Style

Price

Range

Colonial Log Split A-Frame

Total

18 6 19 12

55

45

< $99,000

> $99,000

12 14 16 3

30 20 35 15

Total

100

crosstabulation
Crosstabulation
  • Insights Gained from Preceding Crosstabulation
  • The greatest number of homes in the sample (19)
  • are a split-level style and priced at less than or
  • equal to $99,000.
  • Only three homes in the sample are an A-Frame
  • style and priced at more than $99,000.
slide17

Crosstabulation

Frequency distribution

for the price variable

Home Style

Price

Range

Colonial Log Split A-Frame

Total

18 6 19 12

55

45

< $99,000

> $99,000

12 14 16 3

30 20 35 15

Total

100

Frequency distribution

for the home style variable

crosstabulation row or column percentages
Crosstabulation: Row or Column Percentages
  • Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
slide19

Crosstabulation: Row Percentages

Home Style

Price

Range

Colonial Log Split A-Frame

Total

32.73 10.91 34.55 21.82

100

100

< $99,000

> $99,000

26.67 31.11 35.56 6.67

Note: row totals are actually 100.01 due to rounding.

(Colonial and > $99K)/(All >$99K) x 100 = (12/45) x 100

slide20

Crosstabulation: Column Percentages

Home Style

Price

Range

Colonial Log Split A-Frame

60.00 30.00 54.29 80.00

< $99,000

> $99,000

40.00 70.00 45.71 20.00

100 100 100 100

Total

(Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100

slide21

Crosstabulation: Simpson’s Paradox

  • Data in two or more crosstabulations are often

aggregated to produce a summary crosstabulation.

  • We must be careful in drawing conclusions about the
  • relationship between the two variables in the
  • aggregated crosstabulation.
  • Simpson’ Paradox: In some cases the conclusions

based upon an aggregated crosstabulation can be

completely reversed if we look at the unaggregated

data. suggests the overall relationship between the

variables. (Yikes, just ignore that last partial sentence.)

slide22

Simpson’s Paradox (p.48)

Judge

Verdict

Total

Appealed

Luckett Kendall

129 (86%) 110 (88%)

Upheld

Reversed

239

36

  • 21 (14%) 15 (12%)

150 (100%) 125 (100%)

Total (%)

275

Kendall appears to have the better rate of upheld verdicts

slide23

Simpson’s Paradox (Cont’d)

Judge Luckett

Verdict

Total

Appealed

Common Pleas Municipal

29 (91%) 100 (85%)

Upheld

Reversed

129

21

  • 3 (9%) 18 (15%)

32 (100%) 118 (100%)

Total (%)

150

slide24

Simpson’s Paradox (Cont’d)

Judge Kendall

Verdict

Total

Appealed

Common Pleas Municipal

90 (90%) 20 (80%)

Upheld

Reversed

110

15

  • 10 (10%) 5 (20%)

100 (100%) 25 (100%)

Total (%)

125

slide25

Simpson’s Paradox (Cont’d)

Luckett has the better rate of upheld verdicts in both Common Pleas and Municipal Courts

Yet the aggregate table makes it appear the Kendall has the better rate.

What appears to have happened is that Municipal Court appeals have a higher overturn rate and Luckett just happened to try most of his cases there.

What other important factor is missing from this analysis??

slide26

Scatter Diagram and Trendline

  • A scatter diagram is a graphical presentation of the
  • relationship between two quantitative variables.
  • One variable is shown on the horizontal axis and the
  • other variable is shown on the vertical axis.
  • The general pattern of the plotted points suggests the
  • overall relationship between the variables.
  • A trendline is an approximation of the relationship.
  • Example in book is number of commercials vs sales
scatter diagram
Scatter Diagram
  • A Positive Relationship

y

x

scatter diagram1
Scatter Diagram
  • A Negative Relationship

y

x

scatter diagram2
Scatter Diagram
  • No Apparent Relationship

y

x

example panthers football team
Example: Panthers Football Team
  • Scatter Diagram

The Panthers football team is interested

in investigating the relationship, if any,

between interceptions made and points scored.

x = Number of

Interceptions

y = Number of

Points Scored

14

24

18

17

30

1

3

2

1

3

scatter diagram3

35

30

25

20

15

10

5

0

1

0

2

3

4

Scatter Diagram

y

Number of Points Scored

x

Number of Interceptions

slide32

Example: Panthers Football Team

  • Insights Gained from the Preceding Scatter Diagram
  • The scatter diagram indicates a positive relationship
  • between the number of interceptions and the
  • number of points scored.
  • Higher points scored are associated with a higher
  • number of interceptions.
  • The relationship is not perfect; all plotted points in
  • the scatter diagram are not on a straight line.
tabular and graphical procedures
Tabular and Graphical Procedures

Data

Qualitative Data

Quantitative Data

Tabular

Methods

Graphical

Methods

Tabular

Methods

Graphical

Methods

  • Bar Graph
  • Pie Chart
  • Frequency
  • Distribution
  • Rel. Freq. Dist.
  • Percent Freq.
  • Distribution
  • Crosstabulation
  • Dot Plot
  • Histogram
  • Ogive
  • Scatter
  • Diagram
  • Frequency
  • Distribution
  • Rel. Freq. Dist.
  • Cum. Freq. Dist.
  • Cum. Rel. Freq.
  • Distribution
  • Stem-and-Leaf
  • Display
  • Crosstabulation
keep in mind
Keep in mind…
  • Frequency Distribution shows how many.
  • Rel. Freq. Distribution shows what fraction.
  • Percent Freq. Distribution shows what percentage.
  • Can represent any of the above in tables or charts.