Sampling Observations for Study
Political Science 102 - Introduction to Political Inquiry – Lecture 8
Or…Who Cares about Sampling?
Populations versus Samples
• A population is any well-defined set of units of analysis.
  • The set of cases that we want to understand
  • Determined by the research question
• A sample is a subset of a population.
  • Selected from population by a systematic procedure: the sampling method.
• Sample statistics measure characteristics of the sample
• We use sample statistics to estimate the characteristics of a population
• Key logical inference is external validity
Some Terms and Definitions
• Population parameter
  • Quantifiable characteristic of a population
  • Denoted by capital English or Greek letter
• Sampling Frame
  • Specific population from which sample is actually drawn
• Sample statistic
  • Quantifiable characteristic of a sample
  • Denoted with a small letter or a ^
• Estimator
  • Sample statistic that estimates a population parameter
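To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population of voter ages is made up for illustration, and the names `mu` and `x_bar` are just labels chosen here for the population mean and its estimator.

```python
import random
import statistics

rng = random.Random(102)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 10,000 voters (made-up data).
population = [rng.randint(18, 90) for _ in range(10_000)]

# Population parameter (normally unknown to the researcher): the true mean.
mu = statistics.mean(population)

# Sample statistic: the mean of a simple random sample of 500 voters.
# Used as an estimator of mu.
sample = rng.sample(population, 500)
x_bar = statistics.mean(sample)

print(f"parameter mu = {mu:.2f}, estimate x_bar = {x_bar:.2f}")
```

In real research only `x_bar` is observed; `mu` is computed here only because the population is simulated.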
Populations and Samples
• We would like to analyze the population but cannot
  • Large and impractical data collection
  • Desire to generalize across time (and into the future)
• Seek to produce a sample that matches population parameters
  • Any systematic deviation between sample and population is bias
  • Leads to flawed inferences about population
• Two methods of drawing samples:
  • Probability and Non-Probability samples
Probability Samples
• Each element in the population has a known probability of inclusion in the sample
  • Random selection guards against bias
  • Same kind of effect as random assignment in experiments
• Simple random sample
  • Each element in a population has an equal chance of selection
  • Done by a lottery, a random number generator, dice, etc.
  • Example: The Vietnam Draft Lottery
    • Be sure that you mix well!
  • Computers are useful but dice and tables of random numbers work too!
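A simple random sample can be sketched in a few lines of Python. The helper name `simple_random_sample` and the frame of 366 ID numbers are assumptions made for illustration, not part of the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement.

    Every element has the same chance, n / N, of being selected.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), n)

# Hypothetical sampling frame: ID numbers 1..366, one per possible birthday.
frame = range(1, 367)
draw = simple_random_sample(frame, 10, seed=8)
print(draw)
```

The random number generator does the "mixing" here; passing a seed only makes the draw repeatable for checking.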
Probability Samples
• Systematic sample:
  • Select elements from a list of the population at a predetermined interval
  • Start point must be random or list must be randomized
  • Be aware of cycles in the list that correspond to the sampling interval
• Stratified sample:
  • Population divided into two or more strata based on a criterion
  • Elements selected from each stratum in proportion to the stratum’s representation in the entire population
  • Reduces bias if population parameters are known
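Both procedures can be sketched in Python. The function names, the interval `k`, and the urban/suburban/rural strata are illustrative assumptions; the stratified version allocates by simple rounding, which is only one of several possible allocation rules.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Take every k-th element, starting from a random position in [0, k)."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return frame[start::k]

def proportionate_stratified_sample(strata, n, seed=None):
    """Sample each stratum in proportion to its share of the population.

    strata maps a stratum label to the list of its elements.
    Rounding can make the total differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for label, members in strata.items():
        n_h = round(n * len(members) / total)  # stratum's share of n
        sample[label] = rng.sample(members, n_h)
    return sample

# Made-up frame of 1,000 listed elements, sampled at an interval of 50.
frame = list(range(1000))
sys_sample = systematic_sample(frame, k=50, seed=1)

# Made-up strata with a 60/30/10 split of the population.
strata = {
    "urban": [f"u{i}" for i in range(600)],
    "suburban": [f"s{i}" for i in range(300)],
    "rural": [f"r{i}" for i in range(100)],
}
strat_sample = proportionate_stratified_sample(strata, n=100, seed=1)
print(len(sys_sample), {label: len(v) for label, v in strat_sample.items()})
```

Note that the systematic sketch does nothing about cycles in the list: if the frame repeats with a period matching `k`, every selected element falls at the same point in the cycle.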
Probability Samples
• Disproportionate stratified sample:
  • Elements are drawn disproportionately from the strata.
  • Used to over-represent smaller groups
  • Ensures large enough sub-samples for reliable inferences
  • Example: 1996 National Black Election Study
    • Set of about 1,200 interviews of African American respondents
    • Used in combination with ANES for comparisons
  • Example: Surveys stratified by State (for Senate elections)
• How can we make unbiased inferences about population parameters?
  • Weighted observations!
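One common way to weight, sketched below in Python, is to weight each stratum's sample mean by that stratum's share of the population. The two groups, their sizes, and the survey responses are all made up for illustration, and `weighted_mean` is a hypothetical helper, not a method named in the lecture.

```python
def weighted_mean(strata):
    """Combine stratum means using population shares as weights.

    strata: list of (population_size, sample_values) pairs.
    """
    total_pop = sum(pop for pop, _ in strata)
    return sum(
        (pop / total_pop) * (sum(values) / len(values))
        for pop, values in strata
    )

# Made-up data: 1 = supports a policy, 0 = does not.
# Group A is 90% of the population and group B is 10%,
# but each group gets 500 interviews (B is over-sampled).
group_a = (9_000, [1] * 200 + [0] * 300)  # 40% support in the sample
group_b = (1_000, [1] * 400 + [0] * 100)  # 80% support in the sample

pooled = [*group_a[1], *group_b[1]]
unweighted = sum(pooled) / len(pooled)
weighted = weighted_mean([group_a, group_b])
print(f"unweighted = {unweighted:.2f}, weighted = {weighted:.2f}")
```

Pooling the interviews without weights gives 0.60, which overstates support because group B is over-represented; weighting by population share gives 0.44.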
Probability Samples
• Cluster samples:
  • Group elements for an initial sampling frame (e.g. 50 states).
  • Random samples drawn from increasingly narrow groups (e.g. counties, then cities, then blocks)
  • Final random sample of elements is drawn from the smallest group (individuals living in each household).
  • Almost all face-to-face surveys done this way
• Example: American National Election Study
• Lancet Studies of Iraqi Civilian Casualties
  • 2003 ~ 100,000 deaths
  • 2006 ~ 650,000 deaths
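The narrowing stages of a cluster sample can be sketched in Python. This simplified version uses only three stages (counties, blocks, individuals) and a made-up nested frame with equal-sized clusters; real designs usually have more stages and often select clusters with probability proportional to size.

```python
import random

def multistage_cluster_sample(frame, n_counties, n_blocks, n_people, seed=None):
    """Three-stage sketch: sample counties, then blocks, then individuals.

    frame: {county: {block: [person, ...]}} (a hypothetical nested frame).
    """
    rng = random.Random(seed)
    selected = []
    # Stage 1: random sample of counties.
    for county in rng.sample(sorted(frame), n_counties):
        blocks = frame[county]
        # Stage 2: random sample of blocks within each chosen county.
        for block in rng.sample(sorted(blocks), n_blocks):
            # Stage 3: random sample of individuals within each chosen block.
            selected.extend(rng.sample(blocks[block], n_people))
    return selected

# Made-up frame: 20 counties x 10 blocks x 30 residents.
frame = {
    f"county{c}": {
        f"block{b}": [f"c{c}-b{b}-p{p}" for p in range(30)]
        for b in range(10)
    }
    for c in range(20)
}
respondents = multistage_cluster_sample(frame, 4, 3, 5, seed=8)
print(len(respondents))
```

Interviewers only need to visit 12 blocks in 4 counties to reach all 60 respondents, which is why the design suits face-to-face surveys.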
Nonprobability Samples
• Nonprobability samples: Elements in the population have an unknown probability of inclusion in the sample.
  • Used when probability samples are not feasible
• Purposive samples (Case Studies)
  • Observations selected because of values on variables
  • Select on independent not dependent variables
  • Generalizability comes from exogenous knowledge, not probability
  • Hard cases vs. easy cases
Nonprobability Samples
• Convenience sample:
  • Elements that are convenient for the investigator (e.g. college students)
  • Often used in experimental studies
• Quota sample:
  • Elements are chosen in proportion to representation in population
  • Selection within quotas is nonprobabilistic
  • Often used in surveys prior to random sampling, still used in market research (focus groups)
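A quota sample can be sketched in Python as filling fixed group counts from whoever is encountered first. The helper `quota_sample`, the stream of passers-by, and the 60/40 quotas are assumptions made for illustration.

```python
def quota_sample(arrivals, quotas):
    """Accept people in arrival order until each group's quota is full.

    arrivals: iterable of (person, group) in the order they were encountered.
    quotas: {group: number wanted}.
    """
    remaining = dict(quotas)
    chosen = []
    for person, group in arrivals:
        if remaining.get(group, 0) > 0:
            chosen.append((person, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is filled
    return chosen

# Made-up stream of passers-by; quotas mirror a 60/40 population split.
arrivals = [(f"p{i}", "women" if i % 3 else "men") for i in range(50)]
sample = quota_sample(arrivals, {"women": 6, "men": 4})
print(sample)
```

The group proportions match the population, but nothing random happens inside each quota: the first people to arrive are taken, so inclusion probabilities remain unknown.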
Nonprobability Samples
• Snowball sample:
  • Elements in the target population identify other elements
  • Useful when studying hard-to-locate or hard-to-identify populations that have social networks
  • Highly subject to selection bias & non-representativeness
  • Example: 2004 Study of LGBT environment on college campuses
• A Potential Solution: Matching
  • 1996 study of psychiatric disorders & drug abuse
  • Match on characteristics of subject AND friends
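The referral process behind a snowball sample can be sketched in Python as a walk over a social network. The network, the `per_person` limit on referrals, and the helper name `snowball_sample` are all illustrative assumptions; the sketch does not cover the matching approach.

```python
import random
from collections import deque

def snowball_sample(referrals, seeds, max_size, per_person=2, seed=None):
    """Grow a sample by asking each respondent to name up to per_person others.

    referrals: {person: [people they know in the target population]}.
    """
    rng = random.Random(seed)
    sampled = list(dict.fromkeys(seeds))[:max_size]  # de-duplicated seeds
    seen = set(sampled)
    queue = deque(sampled)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        contacts = [c for c in referrals.get(person, []) if c not in seen]
        rng.shuffle(contacts)
        for contact in contacts[:per_person]:
            if len(sampled) >= max_size:
                break
            seen.add(contact)
            sampled.append(contact)
            queue.append(contact)  # new respondents refer others in turn
    return sampled

# Made-up referral network; "h" and "i" know nobody reachable from "a".
network = {
    "a": ["b", "c", "d"], "b": ["a", "e"], "c": ["f"], "d": [],
    "e": ["g"], "f": [], "g": [], "h": ["i"], "i": ["h"],
}
wave = snowball_sample(network, seeds=["a"], max_size=6, seed=8)
print(wave)
```

Members "h" and "i" can never enter the sample because no one connected to the seed knows them, which is one way the selection bias noted above arises.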