Another Information-Gathering Technique & Introduction to Quantitative Data Analysis

lsorrentino

Presentation Transcript


  1. Another Information-Gathering Technique & Introduction to Quantitative Data Analysis Neuman and Robson Chapter 11. • Research Data library at SFU • http://www.sfu.ca/rdl/

  2. Quiz 2 Coverage • New Material from the Lectures and from the following Chapters • 7 (Sampling), 8 (Surveys), 10 (Nonreactive Measures & Existing Statistics) and the beginning of Chapter 11 (univariate statistics) • The quiz may also include material covered in the first quiz especially: • Standardization & rates • Scales & indices • validity & reliability, • levels of measurement, • the notions of exhaustive & mutually exclusive categories.

  3. Types of Equivalence for comparative research using existing statistics • lexicon equivalence (technique of back translation) • contextual equivalence (ex. role of religious leaders in different societies) • conceptual equivalence (ex. income) • measurement equivalence (ex. different measure for same context)

  4. Ethical Issues in Comparative Research • ethical issues sometimes very important • ex. impact of demographic research on funding of developing countries, controversy surrounding studies of the origins of AIDS • sensitivity, privacy etc. are sometimes still issues even if the “subjects” are dead.

  5. Quantitative Data • Types of Statistics • Descriptive • Inferential • Common Ways of Presenting Statistics • Tables • Charts • Graphs

  6. Data Preparation • Recall: Coding Issues with War & Peace Journalism codes last day • Entering Data into Spreadsheet or data processing software • Cleaning Data

  7. Recall: Coding Principles • categories exhaustive • mutually exclusive • consistent for all cases • comparable with other studies

  8. Ways of Developing Coding Categories • pre-defined coding schemes, e.g. close-ended questions • Ex. Coding Missing Values (conventions not always used): not applicable=77, don’t know=88, no response=99 • post-collection analysis
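
A minimal sketch of how the missing-value convention might be applied before analysis, assuming responses are stored as integer codes. Only the codes 77, 88 and 99 come from the slide; the function name and the sample responses are hypothetical.

```python
# Missing-value codes from the slide (a convention, not a universal standard).
MISSING_CODES = {77: "not applicable", 88: "don't know", 99: "no response"}

def split_valid_and_missing(codes):
    """Separate substantive answers from missing-value codes."""
    valid = [c for c in codes if c not in MISSING_CODES]
    missing = [c for c in codes if c in MISSING_CODES]
    return valid, missing

# Hypothetical responses to one close-ended question on a 1-5 scale.
responses = [1, 3, 99, 5, 88, 2, 77, 3]
valid, missing = split_valid_and_missing(responses)
print(valid)    # [1, 3, 5, 2, 3]
print(missing)  # [99, 88, 77]
```

Keeping the missing codes in a separate list, rather than discarding them, leaves the analyst free to report how many cases were dropped.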

  9. More Examples of Coding Process • Sheet for One Television Commercial • Excel spreadsheet showing entered codes • SPSS example

  10. Data entry conventions

  11. Discrete & Continuous Variables • Continuous • Variable can take infinite (or large) number of values within range • Ex. Age measured by exact date of birth • Discrete • Attributes of variable that are distinct but not necessarily continuous • Ex. Age measured by age groups (Note: techniques exist for making assumptions about discrete variables in order to use techniques developed for continuous variables)
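
As a rough illustration of the age example, the sketch below turns an exact age (here simplified to whole years) into a discrete age group. The group boundaries and names are hypothetical; the slide does not specify any.

```python
# Hypothetical age-group boundaries (inclusive), chosen only for illustration.
AGE_GROUPS = [(0, 17, "0-17"), (18, 34, "18-34"), (35, 64, "35-64"), (65, 200, "65+")]

def age_group(age_in_years):
    """Map an exact age in whole years to a discrete age-group label."""
    for low, high, label in AGE_GROUPS:
        if low <= age_in_years <= high:
            return label
    raise ValueError(f"age out of range: {age_in_years}")

exact_ages = [16, 25, 34, 35, 70]
print([age_group(a) for a in exact_ages])
# ['0-17', '18-34', '18-34', '35-64', '65+']
```

Note that ages 25 and 34 become indistinguishable once grouped, which is the information loss that comes with a discrete measure.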

  12. Cleaning Data • checking accuracy & removing errors • Possible Code Cleaning • check for impossible codes (errors) • Some software checks at data entry • Examine distributions to look for impossible codes • Contingency cleaning • inconsistencies between answers (impossible logical combinations, illogical responses to skip or contingency questions)
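
The two kinds of cleaning can be sketched as simple checks on each record. Everything here is an assumption made for illustration: the codebook, the variable names, the skip rule (people who are not employed should be coded 77, not applicable, for hours worked) and the sample records.

```python
# Hypothetical codebook: each variable maps to its set of legal codes.
CODEBOOK = {
    "sex": {1, 2, 99},
    "employed": {1, 2, 99},                        # 1 = yes, 2 = no
    "hours_worked": set(range(0, 71)) | {77, 99},  # 77 = not applicable
}

def impossible_codes(record):
    """Possible-code cleaning: variables whose value is not in the codebook."""
    return [var for var, value in record.items()
            if var in CODEBOOK and value not in CODEBOOK[var]]

def contingency_errors(record):
    """Contingency cleaning: answers that contradict the skip pattern."""
    errors = []
    if record["employed"] == 2 and record["hours_worked"] != 77:
        errors.append("not employed but reports hours worked")
    if record["employed"] == 1 and record["hours_worked"] == 77:
        errors.append("employed but hours coded not applicable")
    return errors

records = [
    {"sex": 1, "employed": 1, "hours_worked": 40},  # clean
    {"sex": 3, "employed": 1, "hours_worked": 35},  # impossible code for sex
    {"sex": 2, "employed": 2, "hours_worked": 20},  # skip-pattern inconsistency
]
for number, record in enumerate(records, start=1):
    print(number, impossible_codes(record), contingency_errors(record))
```

Flagged records would normally be checked against the original questionnaire rather than corrected automatically.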

  13. Descriptive Statistics (some topics for next few weeks) • Univariate (one variable) • Frequency distributions • Graphs & charts • Measures of central tendency • Measures of dispersion • Bivariate (two variables) • Crosstabulations • Scattergrams & other types of graphs • Measures of association • Multivariate (more than two variables) • Statistical control • Partials • Elaboration paradigm

  14. Frequency Distribution (Univariate)

      Table 5-1 Alienation of Workers
      Level of Alienation   Frequency
      High                     20
      Medium                   67
      Low                      13
      (Sub Total)             100   (N=150)
      No Response              60
      (Total)                       (N=210)

  15. Simple Univariate Frequency Distributions and Percentages • univariate = one variable • “raw count” (frequencies, percentages)

  16. Conventions in table design • total number of cases (N=) • grouping cases • pro: see patterns • con: lose information

  17. Graph of Frequency Distribution (Univariate)

  18. Another visual representation of a distribution: Pie charts

  19. Critically Analyzing Data on Frequency Distributions: Collapsing Categories and Treatment of Missing Data • Consider Raw Data (Numbers) not just percentages • Examine data preparation • Treatment of missing cases? • Collapsing categories? • Johnson, A. G. (1977). Social Statistics Without Tears. Toronto: McGraw Hill.

  20. Treatment of Missing Data: Raw Data

      Table 5-1 Alienation of Workers
      Level of Alienation   Frequency
      High                     20
      Medium                   67
      Low                      13
      (Sub Total)             100   (N=150)
      No Response              60
      (Total)                       (N=210)

  21. Treatment of Missing Data (%): Comparison of % distributions with and without non-respondents

      Table 5-1 Alienation of Workers
      Level of Alienation     F     %
      High                   30    14
      Medium                100    48
      Low                    20    10
      No Response            60    29
      (Total)               210   100

      Table 5-1 Alienation of Workers
      Level of Alienation     F     %
      High                   30    20
      Medium                100    67
      Low                    20    13
      (Total)               150   100
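
The percentages in both versions of the table can be reproduced from the raw frequencies. This is a minimal sketch; the function name is hypothetical and percentages are rounded to whole numbers, which is why the first column of percentages adds up to 101 rather than exactly 100.

```python
# Frequencies from Table 5-1 on the slide.
freq = {"High": 30, "Medium": 100, "Low": 20, "No Response": 60}

def percent_distribution(counts):
    """Return each category's share of the total, rounded to whole percents."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}

with_missing = percent_distribution(freq)
without_missing = percent_distribution(
    {category: n for category, n in freq.items() if category != "No Response"}
)
print(with_missing)     # {'High': 14, 'Medium': 48, 'Low': 10, 'No Response': 29}
print(without_missing)  # {'High': 20, 'Medium': 67, 'Low': 13}
```

The same 30 highly alienated workers are 14% or 20% of the sample depending only on whether the 60 non-respondents stay in the base.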

  22. Treatment of Missing Data (%): Comparison with high & medium collapsed

      Table 5-1 Alienation of Workers (Non-respondents included)
      Level of Alienation     F     %
      High & Medium         130    62
      Low                    20    10
      No Response            60    29
      (Total)               210   100

      Table 5-1 Alienation of Workers (Non-respondents eliminated)
      Level of Alienation     F     %
      High & Medium         130    87
      Low                    20    13
      (Total)               150   100

  23. Treatment of Missing Data (%): Comparison with medium & low collapsed

      Table 5-1 Alienation of Workers (Non-respondents included)
      Level of Alienation     F     %
      High                   30    14
      Medium & Low          120    58
      No Response            60    29
      (Total)               210   100

      Table 5-1 Alienation of Workers (Non-respondents eliminated)
      Level of Alienation     F     %
      High                   30    20
      Medium & Low          120    80
      (Total)               150   100
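
Collapsing categories is just adding frequencies before computing percentages. The sketch below, with hypothetical helper names, reproduces the two collapsed tables for the 150 respondents.

```python
# Respondent frequencies from Table 5-1 (non-respondents eliminated).
freq = {"High": 30, "Medium": 100, "Low": 20}

def collapse(counts, groups):
    """Merge categories; `groups` maps each new label to the old labels it absorbs."""
    return {new: sum(counts[old] for old in olds) for new, olds in groups.items()}

def percents(counts):
    """Whole-number percentage distribution of a frequency table."""
    total = sum(counts.values())
    return {category: round(100 * n / total) for category, n in counts.items()}

high_medium = collapse(freq, {"High & Medium": ["High", "Medium"], "Low": ["Low"]})
medium_low = collapse(freq, {"High": ["High"], "Medium & Low": ["Medium", "Low"]})
print(percents(high_medium))  # {'High & Medium': 87, 'Low': 13}
print(percents(medium_low))   # {'High': 20, 'Medium & Low': 80}
```

The same data can be read as "87% are alienated" or "80% are not highly alienated", depending on where the medium category is placed.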

  24. Grouping Response Categories (%): Comparison with high & medium response categories collapsed

      Table 5-1 Alienation of Workers
      Level of Alienation   Freq     %
      High & Medium                 62
      Low                           10
      No Response                   29
      (Total)                210   100

      Table 5-1 Alienation of Workers
      Level of Alienation   Freq     %
      High & Medium                 87
      Low                           13
      (Total)                150

  25. Core Notions in Basic Univariate Statistics Ways of describing data about one variable (“uni”=one) • Measures of central tendency • Summarize information about one variable • three types of “averages”: arithmetic mean, median, mode • Measures of dispersion • Analyze variations or “spread” • Range, standard deviation, percentiles, z-scores

  26. Mode • most common or frequently occurring category or value (for all types of data) • Babbie (1995: 378)

  27. Graph (Normal Distribution) with single mode

  28. Bimodal Distribution • When there are two “most common” values that are almost the same (or the same)

  29. Median • middle point of rank-ordered list of all values (only for ordinal, interval or ratio data) • Babbie (1995: 378)

  30. Mean (arithmetic mean) • Arithmetic “average” = sum of values divided by number of cases (only for ratio and interval data) • Babbie (1995: 378)
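
The three averages can be tried out with Python's standard `statistics` module. The data below are hypothetical and are chosen only to show which measure suits which level of measurement.

```python
from statistics import mean, median, mode, multimode

# Hypothetical data at three levels of measurement.
transport = ["bus", "car", "bus", "bike", "bus"]  # nominal: mode only
satisfaction = [1, 2, 2, 3, 5]                    # ordinal ranks: mode and median
ages = [22, 25, 25, 31, 47]                       # ratio: mode, median and mean

print(mode(transport))       # bus
print(median(satisfaction))  # 2
print(mean(ages))            # 30

# A bimodal list has two equally common values.
print(multimode([1, 2, 2, 3, 4, 4]))  # [2, 4]
```

In the age list the mean (30) sits above the median (25) because the single value 47 pulls the mean upward.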

  31. Two Data Sets with the Same Mean

  32. Normal Distribution & Measures of Central Tendency • Symmetric • Also called the “Bell Curve” • Neuman (2000: 319)

  33. Skewed Distributions & Measures of Central Tendency • Skewed to the left • Skewed to the right • Neuman (2000: 319)

  34. Normal & Skewed Distributions

  35. Why Measures of Central Tendency are not enough to describe distributions: Crowd Example • 7 people at bus stop in front of bar aged 25,26,27,30,33,34,35 • median= 30, mean= 30 • 7 people in front of ice-cream parlour aged 5,10,20,30,40,50,55 • median= 30, mean= 30 • BUT issue of “spread” socially significant
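
A quick check of the crowd example, using the ages from the slide. The variable names are hypothetical.

```python
from statistics import mean, median

bus_stop = [25, 26, 27, 30, 33, 34, 35]
ice_cream = [5, 10, 20, 30, 40, 50, 55]

# Both crowds have the same centre...
same_centre = (mean(bus_stop) == mean(ice_cream)
               and median(bus_stop) == median(ice_cream))
print(same_centre)  # True

# ...but very different spreads (largest age minus smallest age).
print(max(bus_stop) - min(bus_stop), max(ice_cream) - min(ice_cream))  # 10 50
```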

  36. Measures of Variation or Dispersion • range: distance between largest and smallest scores • standard deviation: for comparing distributions • percentiles: for understanding position in distribution; % up to and including the number (from below) • z-scores: for comparing individual scores taking into account the context of different distributions
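
Percentiles and z-scores can be sketched in a few lines. The function names and all the numbers below are hypothetical; the percentile rank follows the slide's wording (percent of cases up to and including the value), and other definitions exist.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def percentile_rank(values, x):
    """Percent of cases with a value up to and including x."""
    at_or_below = sum(1 for v in values if v <= x)
    return 100 * at_or_below / len(values)

# Hypothetical: the same mark of 80 in two classes with different distributions.
print(z_score(80, mean=70, sd=10))   # 1.0
print(z_score(80, mean=75, sd=2.5))  # 2.0

# Hypothetical marks for one class.
marks = [55, 60, 65, 70, 75, 80, 85, 90]
print(percentile_rank(marks, 80))    # 75.0
```

The mark of 80 is more exceptional in the second class (two standard deviations above the mean) even though it is closer to that class's mean in raw points.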

  37. Range & Interquartile range • distance between largest and smallest scores • what does a short distance between the scores tell us about the sample? • problems of “outliers” or extreme values may occur

  38. Interquartile range (IQR) • distance between the 75th percentile and the 25th percentile • range of the middle 50% (approximately) of the data • Eliminates problem of outliers or extreme values • Example from StatCan website (11 in sample) • Data set: 6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36 • Ordered data set: 6, 7, 15, 36, 39, 41, 41, 43, 43, 47, 49 • Median: 41 • Upper quartile: 43 • Lower quartile: 15 • IQR = 43 - 15 = 28
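
The quartiles for this data set can be checked with the median-of-halves method: order the values, set the middle value aside when the number of cases is odd, and take the median of each half. This is one of several quartile conventions, and statistical software may use a different one and report slightly different values; the function name is hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    With an odd number of cases the middle value is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

data = [6, 47, 49, 15, 43, 41, 7, 39, 43, 41, 36]
q1, med, q3 = quartiles(data)
print(q1, med, q3, q3 - q1)  # 15 41 43 28
```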

  39. Standard Deviation and Variance • Interquartile range eliminates problem of outliers BUT eliminates half the data • Solution? Measure variability from the center of the distribution. • standard deviation & variance measure how far on average scores deviate or differ from the mean.
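
Written out, the idea is to square each score's deviation from the mean, average the squares (the variance), and take the square root (the standard deviation). The notation below is the conventional one and is an assumption, not a copy of the textbook figure; it divides by N, the number of cases, whereas some texts divide by N - 1 when estimating from a sample.

```latex
\[
s^2 = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}}
\]
```

Here $X_i$ is an individual score, $\bar{X}$ is the mean, $s^2$ is the variance and $s$ is the standard deviation.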

  40. Calculation of Standard Deviation Neuman (2000: 321)

  41. Calculation of Standard Deviation Neuman (2000: 321)

  42. Standard Deviation Formula Neuman (2000: 321)

  43. Calculation of Standard Deviation Neuman (2000: 321)

  44. Interpreting Standard Deviation • amount of variation from mean • social meaning depends on exact case

  45. Details on the Calculation of Standard Deviation Neuman (2000: 321)

  46. The Bell Curve & standard deviation

  47. Discussion of Preceding Diagram • “Many biological, psychological and social phenomena occur in the population in the distribution we call the bell curve (Portney & Watkins, 2000).” • Preceding picture • a symmetrical bell curve, • average score [i.e., the mean] in the middle, where the ‘bell’ shape is tallest. • Most of the people [i.e., 68% of them, or 34% + 34%] have performance within 1 segment [i.e., a standard deviation] of the average score.
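
The 68% figure can be checked against an ideal normal distribution using `NormalDist` from Python's standard library. This is a sketch for a perfectly normal curve; real data only approximate it, and the function name is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```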

  48. Interpreting Standard Deviation • amount of variation from mean • meaning depends on exact case • Illustration: high & low standard deviation

  49. Another Diagram of Normal Curve (Showing Ideal Random Sampling Distribution, Standard Deviation & Z-scores)

  50. Example: Central Tendency & Dispersion (description of distributions) Recall: • 7 people at bus stop in front of bar aged 25,26,27,30,33,34,35 • median= 30, mean= 30 • Range= 10, standard deviation=3.8 • 7 people in front of ice-cream parlour aged 5,10,20,30,40,50,55 • median= 30, mean= 30 • Range= 50, standard deviation=17.9
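
The dispersion figures for the two crowds can be recomputed from the ages on the slide. This sketch uses `pstdev`, which divides by the number of cases N; dividing by N - 1 (`stdev`) would give slightly larger values. The helper name is hypothetical.

```python
from statistics import pstdev

bus_stop = [25, 26, 27, 30, 33, 34, 35]
ice_cream = [5, 10, 20, 30, 40, 50, 55]

def spread(ages):
    """Return (range, standard deviation rounded to one decimal)."""
    return max(ages) - min(ages), round(pstdev(ages), 1)

print(spread(bus_stop))   # (10, 3.8)
print(spread(ice_cream))  # (50, 17.9)
```

For the bus stop the squared deviations from the mean of 30 sum to 100, so the standard deviation is the square root of 100/7, about 3.8.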
