Statistics and Data Analysis
Professor William Greene, Stern School of Business, IOMS Department, Department of Economics
Part 1 – Data Presentation
Data Presentation Agenda: Data and Data Types. Representing Data: pie chart, bar chart.
Part 1 – Data Presentation
Pizza Sales by Type
What do the data tell you?
How can you use the information?
What additional information would make these data more informative?
Strongly disagree Disagree Neutral Agree Strongly agree
Moody’s bond ratings: Aaa, Aa, A, Baa, Ba, B, and so on.
Ordered Qualitative Data: German Health Satisfaction Survey; 27,326 individuals. On a scale from 0 to 10, how do you feel about your health?
Bond Ratings Movie Ratings
61 Stern Students’ Ranking of Subway Safety (1994)*
Is there an objective meaning to “3” on some standard scale? Does everyone’s “1” or “2” or “3” … mean the same thing?
* Jeff Simonoff: Data Presentation and Summary, pp. 3-4
No units of measurement
Arithmetic manipulation is usually meaningless. The average of Air and Bus is not Train
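A minimal sketch of why arithmetic on unordered categories is meaningless. The responses and both numeric codings below are made up for illustration. Counting categories gives the same answer however the categories are labeled, while any "average" depends entirely on which arbitrary codes were assigned.

```python
from collections import Counter

# Hypothetical travel-mode responses (made-up data for illustration).
modes = ["Air", "Bus", "Train", "Bus", "Train", "Train", "Air", "Bus", "Train"]

# Counting and finding the most common category are meaningful for nominal data.
counts = Counter(modes)
most_common_mode, most_common_count = counts.most_common(1)[0]

# Two equally arbitrary numeric codings of the same categories (assumptions).
codes_a = {"Air": 1, "Train": 2, "Bus": 3}
codes_b = {"Air": 1, "Bus": 2, "Train": 3}


def mean_code(values, codes):
    """Average of the numeric codes; the result depends entirely on the coding."""
    return sum(codes[v] for v in values) / len(values)


mean_a = mean_code(modes, codes_a)
mean_b = mean_code(modes, codes_b)

# Under coding A the "average of Air and Bus" happens to equal Train's code;
# under coding B it matches no category at all.
avg_air_bus_a = (codes_a["Air"] + codes_a["Bus"]) / 2
avg_air_bus_b = (codes_b["Air"] + codes_b["Bus"]) / 2

print("Counts:", dict(counts))
print("Most common category:", most_common_mode)
print("Mean code under coding A:", round(mean_a, 2))
print("Mean code under coding B:", round(mean_b, 2))
print("Average of Air and Bus, coding A:", avg_air_bus_a)
print("Average of Air and Bus, coding B:", avg_air_bus_b)
```

The counts and the most common category do not change when the codes are reshuffled; the means do.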
Units of measurement make sense. Arithmetic computations make sense.
Pizza Pies Sold, by Type
BAR CHART PIE CHART
Same data. Which is easier to understand?
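One way to see the difference between the two displays is to print both encodings side by side. The sketch below uses made-up pizza types and counts, not the figures behind the slide. A bar encodes each count as a length; a pie slice encodes it as a share of the total.

```python
# Hypothetical pies sold by type (made up; the slide's actual figures are not reproduced here).
sales = {"Plain": 120, "Pepperoni": 95, "Mushroom": 40, "Sausage": 45}

total = sum(sales.values())

# Sorting by count makes the ranking readable at a glance.
ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)

for pizza_type, count in ranked:
    share = 100 * count / total   # what a pie slice encodes: percent of the total
    bar = "#" * (count // 5)      # what a bar encodes: length proportional to the count
    print(f"{pizza_type:<10} {bar:<24} {count:>4}  {share:5.1f}%")
```

In the text bars, the small gap between the two least popular types is visible as a difference in length; the same gap is harder to judge as a difference between two pie-slice angles.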
Box and Whisker Plot for House Price Listings
3rd Quartile = 24933
Interquartile Range = IQR = 24933 - 21677 = 3256
1st Quartile = 21677
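The arithmetic on this slide can be checked directly. The helper `interquartile_range` is a hypothetical name added for illustration; it relies on Python's `statistics.quantiles`, whose default interpolation rule may differ slightly from the one used by the software that produced the slide.

```python
import statistics

# Quartiles as reported on the slide.
first_quartile = 21677
third_quartile = 24933

# The interquartile range is the width of the middle half of the data.
iqr = third_quartile - first_quartile
print("IQR =", iqr)


def interquartile_range(values):
    """IQR of raw data, using the default quartile rule of statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1
```

Different packages use slightly different quartile conventions, so an IQR computed from raw data can differ a little from one tool to another.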
What is an outlier? Why do we believe a particular point is an outlier?
Smaller of (Maximum, Median + 1.5 IQR)
Larger of (Minimum, Median – 1.5 IQR)
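A sketch of the whisker rule as written on this slide, applied to made-up data. The function name `whisker_limits` and its `anchor` option are assumptions added for illustration. With `anchor="median"` the limits follow the slide; with `anchor="quartile"` they follow the common Tukey convention, which measures 1.5 IQR from the first and third quartiles instead.

```python
import statistics


def whisker_limits(values, k=1.5, anchor="median"):
    """Return (lower, upper) whisker ends.

    anchor="median"   : the rule as stated on the slide.
    anchor="quartile" : fences measured from Q1 and Q3 (Tukey's convention).
    """
    q1, med, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    if anchor == "median":
        low_ref, high_ref = med, med
    elif anchor == "quartile":
        low_ref, high_ref = q1, q3
    else:
        raise ValueError("anchor must be 'median' or 'quartile'")
    lower = max(min(values), low_ref - k * iqr)   # larger of (minimum, reference - k * IQR)
    upper = min(max(values), high_ref + k * iqr)  # smaller of (maximum, reference + k * IQR)
    return lower, upper


# Hypothetical sample with one large value (made up for illustration).
data = [1, 2, 3, 4, 5, 6, 7, 50]

lower, upper = whisker_limits(data)
beyond_whiskers = [v for v in data if v < lower or v > upper]

print("Whiskers (slide rule):", (lower, upper))
print("Whiskers (quartile rule):", whisker_limits(data, anchor="quartile"))
print("Points beyond the whiskers:", beyond_whiskers)
```

Points that fall beyond the whisker ends are the ones a box and whisker plot draws individually as candidate outliers.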
HOG, pp. 39-43
A histogram describes the sample data and suggests the nature of the underlying data generating process. Note the “skewness” of the distribution of listings.
HOG, pp. 16-18
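A small sketch of how a histogram exposes skewness, using made-up listing prices, not the data behind the slide. The helpers `text_histogram` and `sample_skewness` are hypothetical names. The skewness here is the simple moment-based coefficient, one of several definitions in use.

```python
import statistics

# Hypothetical listing prices (made up; not the data behind the slide).
prices = [18500, 19900, 20400, 21000, 21700, 22300,
          22900, 23600, 24900, 27500, 31000, 38000]


def text_histogram(values, bin_width, start):
    """Return (bin_start, count) pairs for equal-width bins beginning at start."""
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return [(start + i * bin_width, c) for i, c in enumerate(counts)]


def sample_skewness(values):
    """Moment-based skewness; a positive value indicates a longer right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


bin_width = 5000
bins = text_histogram(prices, bin_width=bin_width, start=15000)
for bin_start, count in bins:
    print(f"{bin_start:>6}-{bin_start + bin_width - 1:<6} {'#' * count}")

print("Mean  :", round(statistics.mean(prices)))
print("Median:", statistics.median(prices))
print("Right-skewed:", sample_skewness(prices) > 0)
```

In a right-skewed sample like this one, a few high values stretch the upper tail and pull the mean above the median.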
Asymmetry (skewness) in the histogram of listing prices…
… shows up in the box and whisker plot. Note the long whisker at the top of the figure.
Graphical tools can be very badly behaved when:
(1) The data have only a few observations.
(2) There are wild observations in the data set.
The box and whisker plot is distorted (and dominated) by one wildly errant observation.
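The effect of a single wild observation can be sketched numerically. The sample below is made up for illustration, and the errant value of 250 stands in for something like a data-entry mistake. The helper `summary` is a hypothetical name.

```python
import statistics

# Hypothetical sample (made up), then the same sample with one wildly errant value.
clean = [21, 22, 22, 23, 24, 24, 25, 26]
contaminated = clean + [250]


def summary(values):
    """Five-number summary plus the mean, using statistics.quantiles defaults."""
    q1, med, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }


def box_share_of_axis(s):
    """Fraction of the plotted range (min to max) that the box (Q1 to Q3) occupies."""
    return (s["q3"] - s["q1"]) / (s["max"] - s["min"])


s_clean = summary(clean)
s_bad = summary(contaminated)

for label, s in (("clean", s_clean), ("contaminated", s_bad)):
    print(label, {k: round(v, 2) for k, v in s.items()},
          "box share of axis:", round(box_share_of_axis(s), 3))
```

The median and quartiles barely move, but the maximum and the mean jump, and the box shrinks to a thin sliver of the plotted range because the axis must stretch to include the errant point.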
“There are lies, damned lies and statistics.” (Benjamin Disraeli)
Probability of Survival to Age 50, Female at Birth: U.S. and 20 Other Wealthy Countries