Gary Klass Department of Politics and Government Illinois State University
From last year's conference: Moving Practitioners Beyond Descriptive Statistics
Saturday, October 18, 2008, 8:30 a.m. - 12:30 p.m.
Seminar Description: Statistical analysis is very important with regard to reports, projects, policy, and the general understanding of information processed on a daily basis. Many people are intimidated by mathematics and statistics, which causes an overreliance on simple descriptive statistics such as means, standard deviations, rates, and percent changes. These simple descriptive statistics all have limitations, however. The primary goal of this presentation is to identify the limitations of descriptive statistics and explore more meaningful bivariate and multivariate analyses, such as z-scores, t-tests, ANOVA, and regression.
Instructor: Jamie Price, President, Socialphenom, Inc., West Palm Beach, Florida
Part I – Principles of Data Display Part II – Statistical Fallacies
Data Presentation Standards
“Graphical Excellence” -- Edward Tufte • well-designed presentation of data of substance, statistics and design • complex ideas communicated with clarity, precision and efficiency • the greatest number of ideas in the shortest time with the least ink in the smallest space.
Best example: Baseball statistics
Better Crime Reports • New Jersey • NYC (Bronx) Compstat • Florida website offense data
Data Presentation Principles • Show the data • Minimize the ink to data ratio • Sort by a meaningful variable • Tell the Truth -- Avoid data distortion • Tables and charts should be self-explanatory • Highlight Meaningful comparisons
The problems with that chart • Time goes right to left • Scaling distortion • Unnecessary 3-D ink • Better not to use different colors to measure the same thing. • Year to year changes are usually not important.
Sort the Data! • By the most meaningful variable • The Alphabet is not the most meaningful variable • Time goes left to right
Sort data by the most important variable. The alphabet is not the most important variable.
Sorting data by the least meaningful variable. Also note: unnecessary decimal place.
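The sorting advice above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's method: the place names and percentages are invented placeholders, and `sort_for_display` is a hypothetical helper. It ranks rows by the measured value instead of by name and drops the unnecessary decimal place before display.

```python
# Hypothetical illustration data: names and percentages are invented
# placeholders, not figures from the slides.
rows = [
    ("Alpha", 12.4),
    ("Bravo", 7.1),
    ("Charlie", 18.9),
    ("Delta", 9.6),
]


def sort_for_display(rows):
    """Rank rows by value (largest first) and round to whole percents."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(name, round(value)) for name, value in ranked]


# Print a small table ordered by the value a reader cares about,
# not by the alphabet.
for name, pct in sort_for_display(rows):
    print(f"{name:<10}{pct:>4}%")
```

Sorting on the value puts the largest and smallest cases at the ends of the table, where a reader looks first; an alphabetical listing leaves the reader to do that ranking mentally.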
Revised Chart: Self-reported Use of Powder Cocaine Over the Past 30 Days, 2003, 2007 % of Arrestees Reporting Use
Tell the Truth! • Avoid Data Distortion
Minimize the ink-to-data ratio • Avoid all ChartJunk • Never use 3-D • Eliminate unnecessary lines
Don’t use Pie Charts!!!!!!! • Never use Pie Charts • Never ever use 3-D Pie Charts • Never compare data across two pie charts • Beware of the Pie Chart’s friends: pyramids, cones, donuts and radars
Simple Graphics: Boxplots, Sparklines, and Dot Plots. The greatest number of ideas in the shortest time with the least ink in the smallest space.
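As a rough illustration of how little ink a sparkline needs, the sketch below draws one from Unicode block characters. The `sparkline` function and the sample series are assumptions made for this example, not something taken from the slides.

```python
# Eight block characters, from lowest to highest bar.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight bar heights, scaled to the series range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a flat series
    top = len(BARS) - 1
    # Scale to 0..7 and round half up to pick a bar.
    return "".join(BARS[int((v - lo) * top / span + 0.5)] for v in values)


# Hypothetical series, invented for illustration only.
print(sparkline([3, 5, 4, 8, 12, 9, 6, 2]))
```

One character per data point lets a whole time series sit inside a table cell or a sentence, which is the "smallest space" idea in practice.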
FBI/UCR Violent Crime Rates: New York City and 69 Largest Cities (2007 population > 250,000) [show outliers]
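The slide does not say how its outliers were identified. A common boxplot convention, assumed here, is Tukey's rule: flag any point more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule with Python's standard library; the rates are invented placeholders, not FBI/UCR figures, and `tukey_outliers` is a hypothetical helper.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying beyond k * IQR from the first or third quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical rates per 100,000 population, invented for illustration only.
rates = [410, 520, 580, 610, 640, 700, 730, 790, 850, 2100]
print(tukey_outliers(rates))
```

Quartile definitions vary between software packages, so a point close to a fence may be flagged by one tool and not by another; the `method="inclusive"` choice here is one option among several.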