Data Analysis for Description

Presentation Transcript

  1. Data Analysis for Description Research Methods for Public Administrators Dr. Gail Johnson Dr. G. Johnson,

  2. Simple But Concrete • The Children’s Defense Fund reports on each day in America: • Four children are killed by abuse or neglect • Five children or teens commit suicide • Eight children or teens are killed by firearms • Seventy-five babies die before their 1st birthday

  3. Simple But Concrete • A million seconds = 11 ½ days • A billion seconds = 32 years • A trillion seconds = 32,000 years
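
The conversions on this slide can be checked with a few lines of arithmetic. This is only a sketch: the 365.25-day year is an assumption (the slide does not say which year length it used), and the function names are illustrative.

```python
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_YEAR = 365.25  # assumption: average year length, leap years included


def seconds_to_days(seconds):
    """Convert a number of seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a number of seconds to years (using DAYS_PER_YEAR)."""
    return seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)


million_days = seconds_to_days(1_000_000)
billion_years = seconds_to_years(1_000_000_000)
trillion_years = seconds_to_years(1_000_000_000_000)

print(f"A million seconds is about {million_days:.1f} days")
print(f"A billion seconds is about {billion_years:.1f} years")
print(f"A trillion seconds is about {trillion_years:,.0f} years")
```

The slide's figures (11 ½ days, 32 years, 32,000 years) are these results rounded for easy recall.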

  4. Simple But Concrete • A $700 billion bailout translates into a $2,333 IOU from every person in the U.S. • Or, using a different metric, it comes to $45 per week for one year for each person in the U.S. • Going one step further, it comes out to $6 a day • Framing: are you willing to pay $6 a day to have a functioning financial system? Read more:,8599,1870699,00.html#ixzz0aqek0mRZ

  5. Going Too Far? • Six dollars a day is also 25 cents an hour, or less than half a penny a minute. • Framing: Would you be willing to pay less than half a penny a minute? • Key Point: Does the comparison point make a difference in what you would be willing to pay? • Read more:,8599,1870699,00.html#ixzz0aqf9HSQ9
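
The reframing on the last two slides is one number divided by different time units. The sketch below assumes a U.S. population of 300 million; the slides do not state the population, but the $2,333 figure implies roughly that value. It also assumes the cost is spread over a single year.

```python
BAILOUT_DOLLARS = 700e9
US_POPULATION = 300e6  # assumption: implied by the slide's $2,333 per person

per_person = BAILOUT_DOLLARS / US_POPULATION
per_week = per_person / 52        # spread over one year
per_day = per_person / 365
per_hour = per_day / 24
cents_per_minute = per_hour / 60 * 100

print(f"Per person: ${per_person:,.0f}")
print(f"Per week:   ${per_week:.0f}")
print(f"Per day:    ${per_day:.2f}")
print(f"Per hour:   ${per_hour:.2f}")
print(f"Per minute: {cents_per_minute:.2f} cents")
```

The slide starts from the rounded $6 a day, which is why it arrives at 25 cents an hour; the unrounded figures are slightly higher but still under half a penny a minute.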

  6. Common Descriptive Analysis • Counts: how many • Decennial census • Percents • Women earned 77% of what men earned in 2006, up from 59% in 1970 • Parts of a whole • Percents (75%) and proportions (.75 or three-quarters)

  7. Common Descriptive Analysis • But be mindful of “bigger pie” distortions when working with percents and proportions • If the pie grows much faster than the slice, the slice will appear relatively smaller as a percent even though it still grew • Best example is budget deficit as a percent of the GDP: if GDP grows much faster than the budget deficit, it will appear smaller even though it has also grown.
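
A small numeric illustration of the "bigger pie" effect may help. All figures below are hypothetical, chosen only so that the slice grows while the pie grows faster; they are not actual deficit or GDP data.

```python
def share(part, whole):
    """Return part as a percent of whole."""
    return part * 100 / whole


# Hypothetical values in billions of dollars.
deficit_then, gdp_then = 100.0, 1_000.0
deficit_now, gdp_now = 120.0, 1_500.0

share_then = share(deficit_then, gdp_then)
share_now = share(deficit_now, gdp_now)
deficit_growth = (deficit_now / deficit_then - 1) * 100

print(f"Deficit grew {deficit_growth:.0f}% in dollars")
print(f"Deficit as a share of GDP: {share_then:.0f}% then, {share_now:.0f}% now")
```

In this made-up case the deficit is larger in dollars, yet it looks smaller as a percent because GDP grew by half over the same period.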

  8. Common Descriptive Analysis • Rates: number of occurrences that are standardized • Deaths of infants per 100,000 births • Crop yields per acre • Crime rates • Rates provide an apples-to-apples comparison between places of different size or populations
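
As a sketch of how a rate standardizes counts, the example below compares two hypothetical places of very different size. The town and city figures are invented for illustration, and `rate_per` is an illustrative helper, not a standard function.

```python
def rate_per(events, population, base=100_000):
    """Return the number of events per `base` people."""
    return events * base / population


# Hypothetical data: a small town and a large city.
town_crimes, town_population = 500, 50_000
city_crimes, city_population = 2_000, 1_000_000

town_rate = rate_per(town_crimes, town_population)
city_rate = rate_per(city_crimes, city_population)

print(f"Town: {town_crimes} crimes, {town_rate:,.0f} per 100,000 residents")
print(f"City: {city_crimes} crimes, {city_rate:,.0f} per 100,000 residents")
```

The city has four times as many crimes in raw counts, but the town has the higher rate once population is taken into account.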

  9. Common Descriptive Analysis • Ratio: numbers presented in relationship to each other • Student to teacher ratio: 15:1 • Divide number of students by the number of teachers • 1,500 students and 45 teachers equals about a 33 to 1 student to teacher ratio (1,500 divided by 45 is 33.3)
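
The student to teacher example is a single division, sketched here with the slide's own numbers:

```python
students = 1_500
teachers = 45

ratio = students / teachers  # students per one teacher

print(f"Student to teacher ratio: about {ratio:.0f} to 1")
```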

  10. Common Descriptive Analysis • Rates of change • Percentage change from one time period to the other • For example: The budget increased 23% from FY 2006 to FY 2007. Three Steps: • Divide newest data by oldest data • Subtract 1 • Multiply by 100 to get the percentage change

  12. Common Descriptive Analysis • Rates of change: applied • What was the rate of change in the 1992 budget deficit as compared to 1980? • Divide the 1992 budget deficit ($290 billion) by the 1980 budget deficit ($73.8 billion) = 3.93 • 3.93 - 1 = 2.93 • 2.93 x 100 = 293 percent • The budget deficit in current dollars (meaning not adjusted for inflation) increased 293 percent.
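
The three steps for a percentage change (divide, subtract 1, multiply by 100) can be sketched as a small function. The function name is illustrative; the deficit figures are the ones given on the slide.

```python
def percent_change(oldest, newest):
    """Percentage change from the oldest value to the newest value."""
    ratio = newest / oldest      # step 1: divide newest by oldest
    change = ratio - 1           # step 2: subtract 1
    return change * 100          # step 3: multiply by 100


deficit_1980 = 73.8   # billions of current dollars
deficit_1992 = 290.0  # billions of current dollars

change = percent_change(deficit_1980, deficit_1992)
print(f"The deficit increased about {change:.0f} percent")
```

The function does not round the intermediate ratio the way the slide does (3.93), but the result still rounds to 293 percent.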

  13. Common Descriptive Analysis • Frequency Distributions • Number and percents of a single variable

  14. In The News: Women Now Are Majority of College Graduates

  15. Interpretation? • How would you interpret these percentages in the comparative trend analysis? • Are you surprised by the changes over time? • Why or why not?

  16. Frequency and Percent Distributions • Survey data: analyzed by distributions • How many men and women are in the program?

      Distribution of Respondents by Gender:

      Male               Female             Total
      Number  Percent    Number  Percent    Number
      100     33%        200     67%        300

  17. Frequency and Percent Distributions • How many men and women are in the program? Write-up: Of the 300 people in this program, 67% are women and 33% are men.
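
A frequency and percent distribution like this one can be produced from raw responses with a counter. The sketch below builds a hypothetical list of responses that reproduces the counts on the slide; real survey data would be read from a file instead.

```python
from collections import Counter

# Hypothetical raw responses that reproduce the slide's counts.
responses = ["Male"] * 100 + ["Female"] * 200

counts = Counter(responses)
total = sum(counts.values())
percents = {group: round(100 * n / total) for group, n in counts.items()}

for group in ("Male", "Female"):
    print(f"{group}: {counts[group]} ({percents[group]}%)")
print(f"Total: {total}")
```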

  18. Different Analysis Tools For Different Situations • Frequency/percent distributions make sense when working with nominal and ordinal data • But frequency/percent distributions for interval/ratio data can result in a ridiculously long table that is impossible to interpret • If I ask 500 people how many years they lived in an area, I can get a wide range of answers. • For this type of data, I would then look at means, medians, modes to describe that variable.

  19. Describing Distributions • Central tendency • Means, Medians, Modes • How similar are the characteristics? • Example: Use when we want to describe the similarity of the ages of a group of people. • Dispersion • Range, standard deviation • How dissimilar are the characteristics? • Example: how much variation in the ages?

  20. Measures of Central Tendency • The 3-Ms: Mode, Median, Mean. • Mode: most frequent response. • Median: mid-point of the distribution • Mean: arithmetic average.

  21. Basic Concepts Revisited • Levels of Measurement • Nominal Level Data: names, categories • E.g., gender, religion, state, country • Ordinal Level Data: data with an order, going from low to high • E.g., highest educational degree, income categories, agree-disagree scales • Interval Level Data: numbers with equal intervals but no true zero point • E.g., IQ scores, GRE scores, temperature in Fahrenheit or Celsius • Ratio Level Data: real numbers with a true zero point • E.g., age, weight, income

  22. Which Measure of Central Tendency to Use? Depends on the type of data you have: • Nominal data: mode • Ordinal data: mode and median • Interval/ratio: mode, median and mean

  23. For Interval Or Ratio Data: Which One To Use? • Concept of the Normal Distribution, also called the bell-shaped curve • In a normal distribution, the mean, median and mode should be very similar • Use mean if distribution is normal • Use median if distribution is not normal

  24. Normal Distribution: Bell-Shaped Curve (figure: bell-shaped curve, labeled: Mean)

  25. Office contributions • $10, $1, $.50, $.25, $.25. • The mean is $2.40 (add up and divide by 5) • The median is $.50 (the mid-point of this distribution) • The mode is $.25 (the most frequently reported contribution) • Best description of contributions is median.
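
The three measures for the office contributions can be checked with Python's standard `statistics` module. This is a minimal sketch using the five amounts from the slide.

```python
from statistics import mean, median, mode

contributions = [10.00, 1.00, 0.50, 0.25, 0.25]  # dollars

contribution_mean = mean(contributions)
contribution_median = median(contributions)
contribution_mode = mode(contributions)

print(f"Mean:   ${contribution_mean:.2f}")
print(f"Median: ${contribution_median:.2f}")
print(f"Mode:   ${contribution_mode:.2f}")
```

The single $10 contribution pulls the mean well above what most people gave, which is why the slide recommends the median here.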

  26. Salaries • Assume that you had 11 teachers. 10 teachers earned $21,000 per year and one earned $1,000,000. • What would be the best measure to describe this data?

  27. Salaries • The average salary would be $110,000. • The median and mode are both $21,000. • The curve would be positively skewed, i.e. the mean is higher than the mode and median • The median would do the best job of describing the center of the salaries
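
The salary example can be reproduced the same way. The sketch below uses the eleven salaries described on the slide and checks the sign of the skew by comparing the mean with the median.

```python
from statistics import mean, median, mode

salaries = [21_000] * 10 + [1_000_000]  # ten teachers plus one outlier

salary_mean = mean(salaries)
salary_median = median(salaries)
salary_mode = mode(salaries)

# A mean well above the median suggests a positive (right) skew.
positively_skewed = salary_mean > salary_median

print(f"Mean:   ${salary_mean:,.0f}")
print(f"Median: ${salary_median:,.0f}")
print(f"Mode:   ${salary_mode:,.0f}")
print(f"Positively skewed: {positively_skewed}")
```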

  28. Skewed Data • Negative skew: The mass of the distribution is concentrated on the right of the figure. It has relatively few low values. The distribution is said to be left-skewed. • Positive skew: The mass of the distribution is concentrated on the left of the figure. It has relatively few high values. The distribution is said to be right-skewed. In the salary example, the $1,000,000 salary pulls the average up. (Source: Wikipedia)

  29. Skewed Distributions: Negative and Positive

  30. Using Means With Survey Data? • Survey data is typically coded using numbers: • Gender: Male is coded 1 • Female is coded 2 • It is faster and less error-prone to code variables using numbers • But the computer could treat these as numbers and will compute a mean if asked • How would you interpret a mean for gender of 1.6? Or a mean for religion of 2.8?

  31. Do Not Use Means With Nominal Data • Gender (and religion) are nominal variables and should only be reported in terms of distributions: • Frequency distribution: 10 men and 12 women • Percentage distribution: 45% men and 55% women
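
To see why a mean of coded nominal data is uninformative, the sketch below codes the 10 men and 12 women from this slide as 1 and 2 (the coding scheme from the previous slide), computes the mean the software would happily return, and then reports the distribution instead.

```python
from collections import Counter
from statistics import mean

GENDER_LABELS = {1: "men", 2: "women"}  # coding scheme from the previous slide
codes = [1] * 10 + [2] * 12             # 10 men and 12 women

# The software will compute this, but the number has no useful meaning.
meaningless_mean = mean(codes)

# Report a frequency and percentage distribution instead.
counts = Counter(GENDER_LABELS[code] for code in codes)
total = len(codes)
percents = {label: round(100 * n / total) for label, n in counts.items()}

print(f"Mean of the codes: {meaningless_mean:.2f}")
for label in ("men", "women"):
    print(f"{label}: {counts[label]} ({percents[label]}%)")
```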

  32. Using Means With Survey Data? • Scales (very satisfied <-> very dissatisfied) are ordinal scales • But they are coded into the computer using numbers • 5 for very satisfied <-> 1 for very dissatisfied • The computer will compute a mean if asked: • The mean was 3.8 for job satisfaction. • The mean satisfaction with faculty performance was 4.2 on a scale from 1-5 • Grade-point averages are an example of means based on an ordinal scale (A-F, coded on a scale of 0-4)

  33. Using Means With Ordinal Data? • There is disagreement in the field, partly based on academic discipline, about whether to use means with ordinal data. • Things like GPA or faculty ratings are often shown as means • It is often helpful for researchers to look at the means initially when working with a lot of data: researchers are looking for unusually high or low means. • It is also true that sometimes it is easier to show the means than the percentage distribution for every variable

  34. Washington Employee Survey

  35. Using Means With Ordinal Data? • But most people are more familiar with polling results, which report percent distributions. • We tend to see something like 55% report supporting cap and trade legislation rather than a mean of 3.4 on a scale of 5 (for) to 1 (against). • The decision about whether means or percent distributions are used to report ordinal data should reflect audience preference and ease of audience understanding. • Not an ideological stance

  36. Measures of Dispersion • Used with Interval and Ratio Data • Simple Description: The Range • Reported salaries ranged from $21,000 to $1,000,000 • Ages in the group ranged from 18 to 32 • Standard Deviation • Measures the dispersion in terms of the distance from the mean • Small standard deviation: not much dispersion • Large standard deviation: lots of dispersion

  37. Standard Deviation • Normal Distribution: Bell-shaped curve • About 68% of the values fall within 1 standard deviation of the mean • About 95% of the values fall within 2 standard deviations of the mean

  38. Normal Distribution (figure: bell-shaped curve showing 95% of the distribution within 2 standard deviations on either side of the mean)

  39. Applying the Standard Deviation • Average test score = 60. • The standard deviation is 10. • Therefore, if the scores are normally distributed, about 95% of the scores are between 40 and 80. • Calculation: two standard deviations = 2 x 10 = 20 • 60 + 20 = 80 • 60 - 20 = 40.
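
The range on this slide is the mean plus and minus two standard deviations. The sketch below repeats that calculation and, assuming the scores follow a normal distribution, uses the standard library's `NormalDist` to check what share of scores the range covers.

```python
from statistics import NormalDist

mean_score = 60
std_dev = 10

# Two standard deviations on either side of the mean.
low = mean_score - 2 * std_dev
high = mean_score + 2 * std_dev

# Assumption: the test scores are normally distributed.
scores = NormalDist(mu=mean_score, sigma=std_dev)
share_within = scores.cdf(high) - scores.cdf(low)

print(f"Range: {low} to {high}")
print(f"Share of a normal distribution in that range: {share_within:.1%}")
```

The exact share is slightly above 95%, which is why the rule is usually stated as "about 95%".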

  40. Standard Deviation with Means • The Standard Deviation is used with interval/ratio level data • Typically, standard deviations are presented with means so the reader can tell whether there is a lot or a little variation in the distribution. • Note: the standard deviation is sometimes used in other statistical calculations, such as z-scores and confidence intervals

  41. Describing Two Variables Simultaneously • Cross-tabulations (cross tabs, contingency tables) • Used when working with nominal and ordinal data • It provides great detail

  42. Describing Two Variables Simultaneously Detail about the race and gender of the 233 people in the workplace:

  43. Describing Race and Gender • Write-up: Of the 233 employees, the greatest proportion are white women (31%) followed by white men (21%). Fifteen percent of the employees are black men and 11% are black women, and 14% are men of other race identity and 6% are women of other race identity.
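
A cross-tabulation counts how many cases fall into each combination of two variables. The sketch below uses ten hypothetical employee records, not the 233 employees described above (their underlying counts are not given in the transcript), and `cell_percent` is an illustrative helper.

```python
from collections import Counter

# Hypothetical (race, gender) records; NOT the 233 employees from the slide.
employees = [
    ("White", "Female"), ("White", "Female"), ("White", "Female"),
    ("White", "Male"), ("White", "Male"),
    ("Black", "Male"), ("Black", "Male"),
    ("Black", "Female"),
    ("Other", "Male"),
    ("Other", "Female"),
]

cells = Counter(employees)  # one count per race-by-gender cell
total = len(employees)


def cell_percent(race, gender):
    """Percent of all employees who fall in one race-by-gender cell."""
    return 100 * cells[(race, gender)] / total


for race in ("White", "Black", "Other"):
    for gender in ("Male", "Female"):
        n = cells[(race, gender)]
        print(f"{race:<6} {gender:<7} {n:>2}  {cell_percent(race, gender):.0f}%")
```

Each percent here is a share of the whole group, which matches the style of the write-up on the slide.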

  44. Describing Two Variables Simultaneously Comparison of Means • Used when one variable is nominal or ordinal, and the second variable is interval/ratio level of measurement. • Examples: • Men in the MPA program have a GPA of 3.2 as compared to 3.0 for women. • The mean overall citizen satisfaction score is 4.2 this year as compared to 3.5 last year. • Mean salary for women was $35,000 as compared to $38,000 for men last year.
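
A comparison of means groups the interval/ratio variable by the categories of the nominal variable and averages each group. The records below are hypothetical, chosen only so the group means match the GPA example; they are not data from the MPA program.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (group, GPA) records for illustration only.
records = [
    ("Men", 3.0), ("Men", 3.4),
    ("Women", 2.8), ("Women", 3.2),
]

# Collect the GPA values for each group.
gpas_by_group = defaultdict(list)
for group, gpa in records:
    gpas_by_group[group].append(gpa)

group_means = {group: mean(values) for group, values in gpas_by_group.items()}

for group, value in group_means.items():
    print(f"{group}: mean GPA {value:.1f}")
```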

  45. Key Points • These simple descriptive analysis techniques can be effective: • They illuminate, provide feedback, inform and might persuade. • The math is generally straightforward. • Descriptive data is generally easier for many people to understand than more complex statistics (stay tuned). • Complex statistics are not inherently better!

  46. The Tough Question • If descriptive data is distorted, it tends to be in the way things are being counted and measured. • The math is usually correct. • Example: The federal debt is often presented just in terms of percent of debt held by the public, but the total debt also includes money borrowed from other government funds. • As a result, the debt looks smaller than it actually is.

  47. The Tough Question • If descriptive data is distorted, it tends to be in the way things are being counted and measured. The math is usually correct. • Example: Health insurance profits look different when calculated as a percent of corporate revenue than when calculated as a percent of all spending on health care. • Profits look smaller when presented as a percent of all health care spending, which is larger than corporate insurance revenue alone.

  48. The Tough Question • Always ask: what exactly is being measured and counted? • Consider whether there are other ways of counting and other ways of doing the analysis that might yield different results (or create different perceptions). • Do the choices reflect a political agenda?

  49. Creative Commons • This powerpoint is meant to be used and shared with attribution • Please provide feedback • If you make changes, please share freely and send me a copy of changes: • • Visit for more information