Field Research Methods. Lecture 6: Small-N/Case Studies. Edmund J. Malesky, Ph.D.
Organization of Today’s Lecture • What is small-n design? • What is a case study? • Abuse of the case study method? • Selecting on the dependent variable? • k>n • Why use a case study? • Selecting case studies for the appropriate purpose? • Single-Outcome Studies
What is small-n design? • N is the number of observations in an analysis. • There is some confusion because N is sometimes used to mean the number of observations in a study. • Just as it sounds, this is a research agenda that involves only a small number of observations. • The idea is to use close observation of a small number of units to draw out detailed and rich findings about: • Causal process • Thick descriptive information • Because it looks closely at a few cases and traces causal pathways, qualitative research often outperforms quantitative research in its measurement validity and internal validity. • A well-designed quasi-experiment with only a few cases is often preferable to a non-experiment with many cases. • A very important, but often abused, research program.
What is a Case Study? • The notion of a case study is hard to pin down, because it has been used to describe a number of very different concepts of research. • Small-n • Thick description • Particular type of evidence (ethnographic, clinical, nonexperimental) • That the evidence gathered is naturalistic (real-life) • That the topic is diffuse • That it employs triangulation • That a single observation is highlighted (narrative) • That the research investigates the properties of a single phenomenon.
Gerring’s Definition(s) • “Case denotes a spatially delimited phenomenon (a unit) observed at a single point in time or over some period of time. It comprises the type of phenomenon that an inference attempts to explain.” (19) • Later, Gerring expands the definition to include a “spatially or temporally delimited phenomenon” (211). • This definition can cover: • Nation-States • Subnational units • Individuals • … • Can it cover? • Al Qaeda • Doctors without Borders
Unit of Analysis • Unit of analysis: definitions • “The objects that a hypothesis describes or explains and are the focus of study.” • “A unit is the unit of analysis for an effect if and only if that effect is assessed against the variation among units.” • Think of a unit as a cell in a panel data set. • Case: what is the unit? • “Observations used to draw inferences at whatever level of analysis is of interest.”
Two classes of research • Conventionally, researchers have distinguished between two classes of research • Quantitative research measures differences in number for variables, and usually studies a large number of cases (Large “N”). • Qualitative research measures differences in kind for variables, and usually studies a small number of cases (Small “N”).
Two classes of research • Because it covers a broad range of cases, quantitative research yields conclusions that can be generalized (it has the strongest external validity). • Because it looks closely at a few cases and traces causal pathways, qualitative research often outperforms quantitative research in its measurement validity and internal validity.
Gerring rejects these strict distinctions • Case study: Involves intensive study of a single case, where the purpose of that study is – at least in part – to shed light on a larger class of cases (a population). • Case study research: Incorporates several (multiple) cases. • Cross-Case analysis: At a certain point, the emphasis shifts from an individual case to a sample of cases. Then it becomes a cross-case analysis. • All research can be classified as either case studies (comprising one or a few cases) or cross-case study (comprising many cases).
Four basic points to Gerring’s Critique • Case study and cross-case research are united in an ontological spectrum. There is no true methodological dichotomy. • Case studies have been wrongly treated as qualitative by definition. They may (and often should) involve quantitative techniques. • Case studies are never observations in isolation. To conduct a case study implies that one has conducted cross-case analysis or at least thought about a broader set of cases. • Controversially… The strongest defense of the case study is that it is quasi-experimental in nature.
A Much Maligned Approach • Case studies have received a bad name in social science research, predominantly because the term has been so often abused as to stand for nothing. • Poorly designed case studies usually make one of the following errors: • Motive behind the selection of case studies is not obvious (Is it convenience? Or is it because they are good stories?). Without understanding this, the project is at best useless and at worst terribly misleading. • Generalizability – Can the lessons learned from this case be applied to a larger class? • Falsifiability – Results are presented in such a way that it would be difficult for an impartial researcher to replicate the project and arrive at the same result. • No or Negative Degrees of Freedom: The researcher has more explanatory variables (moving pieces) than observations. It is impossible to determine whether they • Selection on the Dependent Variable: Choosing cases because of their performance on the outcome of interest (e.g., Porter, Competitive Advantage of Nations).
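The degrees-of-freedom problem (k > n) can be made concrete with a toy sketch. The data below are hypothetical and not from the lecture: two cases, two candidate explanatory variables, and an intercept. Two rival explanations both fit the cases perfectly, so the evidence cannot tell them apart.

```python
# Hypothetical data: two cases, two candidate explanations (x1, x2), one outcome y.
cases = [
    {"x1": 1, "x2": 0, "y": 1},
    {"x1": 0, "x2": 1, "y": 0},
]

def predict(coefs, case):
    """Linear prediction: intercept + b1*x1 + b2*x2."""
    b0, b1, b2 = coefs
    return b0 + b1 * case["x1"] + b2 * case["x2"]

def residuals(coefs):
    """Observed outcome minus predicted outcome, for every case."""
    return [case["y"] - predict(coefs, case) for case in cases]

n = len(cases)                      # number of observations
k = 2                               # number of explanatory variables
degrees_of_freedom = n - k - 1      # the extra 1 is the intercept

only_x1_matters = (0, 1, 0)         # rival explanation 1: y = x1
only_x2_matters = (1, 0, -1)        # rival explanation 2: y = 1 - x2

print("degrees of freedom:", degrees_of_freedom)
print("residuals if only x1 matters:", residuals(only_x1_matters))
print("residuals if only x2 matters:", residuals(only_x2_matters))
```

Both coefficient sets leave zero residuals, which is what negative degrees of freedom look like in practice.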
The Most Common Error w/Small-N Research • Galtung: “The traditional quotation/illustration method.” • Cases are picked in accordance with the hypothesis. • Hypotheses are rejected if one deviant case is found.
Case Selection • When selecting cases for your quantitative research sample, it is imperative that you use random selection. • In qualitative research, “selection must be done in an intentional fashion, consistent with research objectives and strategy.” (King, Keohane, and Verba, 1994, p.139)
Selecting Cases on the Independent Variable • “Selecting on the independent variable” means “selecting your cases according to the values of the independent variable that they take on.” • In order to do this, you have to know a little bit about all of your potential cases. • In order to do this right, you cannot act as if you also know the values that the dependent variable takes on.
1. The Typical Case • Definition: Cases (one or more) are typical examples of a cross-case relationship • Cross-Case Technique: A low residual case (on-lier) • Uses: Hypothesis Testing • Representative by Definition
2. The Diverse Case • Definition: Cases (two or more) illuminate the full range of variation on X1, Y, or X1/Y • Cross-Case Technique: Diversity may be calculated by categorical values, standard deviations, or combinations of values • Uses: Hypothesis Generation or Hypothesis Testing • Representative: In the minimal sense of representing the full variation of the population, though they might not mirror the distribution of that population.
Techniques of Case Selection – Diverse Case
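One way to read "diversity by combinations of values" is to keep one case for every observed combination of X and Y. The sketch below assumes a binary cause and a binary outcome; the case names and values are hypothetical, and `diverse_cases` is an illustrative helper, not a function from any library.

```python
# Hypothetical cases: (name, x, y) with a binary cause x and a binary outcome y.
cases = [
    ("A", 0, 0), ("B", 0, 0), ("C", 0, 1),
    ("D", 1, 0), ("E", 1, 1), ("F", 1, 1),
]

def diverse_cases(cases):
    """Keep the first case found for each observed (x, y) combination."""
    chosen = {}
    for name, x, y in cases:
        chosen.setdefault((x, y), name)
    return chosen

selection = diverse_cases(cases)
for combo, name in sorted(selection.items()):
    print(combo, "->", name)
```

The selection covers the full range of variation, but it does not mirror how often each combination occurs in the population.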
3. The Extreme Case • Definition: Represent unusual values on X or Y. • Cross-Case Technique: A case lying many standard deviations away from the mean of X or Y • Uses: Hypothesis Generation Only! • Representative: Not really meant to be. Can be achieved only in relation to other cases. • Self-conscious attempt to maximize variation (e.g., studying German Fascism).
4. Deviant Case • Definition: Cases (one or more) deviate from a cross-case relationship. • Cross-Case Technique: A high residual case (outlier) • Uses: Hypothesis Generation (to develop new explanations of Y) • Representative: Can later be corroborated by cross-case analysis.
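The "many standard deviations away from the mean" rule can be sketched with z-scores. The values of X below are hypothetical, and the choice of the population standard deviation (`pstdev`) is an assumption made for illustration.

```python
from statistics import mean, pstdev

# Hypothetical values of X for five cases.
x = {"A": 2.0, "B": 3.0, "C": 2.5, "D": 3.5, "E": 12.0}

def z_scores(values):
    """Distance of each case from the mean, in standard deviations."""
    nums = list(values.values())
    m, s = mean(nums), pstdev(nums)
    return {name: (v - m) / s for name, v in values.items()}

z = z_scores(x)
extreme = max(z, key=lambda name: abs(z[name]))
print("extreme case:", extreme, "z =", round(z[extreme], 2))
```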
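The typical case (low residual, on-lier) and the deviant case (high residual, outlier) can both be read off the same fitted line. The sketch below uses hypothetical data and a hand-written bivariate least-squares fit; a real application would fit whatever cross-case model the study uses.

```python
from statistics import mean

# Hypothetical (x, y) values for five cases.
data = {"A": (1, 2.5), "B": (2, 3.0), "C": (3, 5.0), "D": (4, 11.0), "E": (5, 8.5)}

xs = [x for x, _ in data.values()]
ys = [y for _, y in data.values()]
x_bar, y_bar = mean(xs), mean(ys)

# Ordinary least squares for one regressor.
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# Residual = observed y minus fitted y.
residual = {name: y - (intercept + slope * x) for name, (x, y) in data.items()}

typical = min(residual, key=lambda name: abs(residual[name]))   # on-lier
deviant = max(residual, key=lambda name: abs(residual[name]))   # outlier
print("typical:", typical, "deviant:", deviant)
```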
5. Influential Case (one of my favorites) • Definition: Cases (one or more) with influential configurations of the independent variables. • Cross-Case Technique: High leverage or Cook’s D. • Uses: Hypothesis Testing to verify the status of a highly influential case. • Representative: Not really the goal. This is the “exception that proves the rule” case.
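Leverage and Cook’s D can be computed by hand for a bivariate regression. The sketch below uses hypothetical data in which one case sits far from the others on X; the formulas are the standard ones for a model with an intercept and one slope (p = 2).

```python
from statistics import mean

# Hypothetical (x, y) values; case F sits far from the others on x.
data = {"A": (1, 1.1), "B": (2, 1.9), "C": (3, 3.2),
        "D": (4, 3.9), "E": (5, 5.1), "F": (15, 8.0)}

names = list(data)
xs = [data[k][0] for k in names]
ys = [data[k][1] for k in names]
n, p = len(names), 2                      # p = intercept + slope

x_bar, y_bar = mean(xs), mean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
intercept = y_bar - slope * x_bar

resid = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
s2 = sum(e ** 2 for e in resid) / (n - p)             # residual variance
leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]

# Cook's D combines the size of the residual with the leverage of the case.
cooks_d = {
    k: (e ** 2 / (p * s2)) * (h / (1 - h) ** 2)
    for k, e, h in zip(names, resid, leverage)
}
most_influential = max(cooks_d, key=cooks_d.get)
print("most influential case:", most_influential)
```

Note that the most influential case here has only a modest residual; its influence comes from its leverage.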
6: Crucial Case • Definition: Cases (one or more) are most likely or least likely to represent a given outcome. • Cross-Case Technique: Qualitative assessment of relative crucialness. • Uses: Hypothesis Testing (confirmatory or disconfirmatory) • Representative: Should be highly representative, given key measures discussed.
7. Pathway Case • Definition: Cases (one or more) where X1, and not X2, is likely to have caused a positive outcome. • Cross-Case Technique: Cross-tab or residual analysis. • Uses: Hypothesis Testing (process tracing of causal mechanisms) • Representative: May be tested by examining residuals of chosen cases.
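With binary variables, the cross-tab version of pathway selection reduces to a filter: keep cases where X1 is present, the rival cause X2 is absent, and the outcome is positive. The cases below are hypothetical, and this is only the simplest reading of the cross-tab technique; the residual-based version is not shown.

```python
# Hypothetical cases with two binary causes (x1, x2) and a binary outcome y.
cases = [
    {"name": "A", "x1": 1, "x2": 1, "y": 1},
    {"name": "B", "x1": 1, "x2": 0, "y": 1},
    {"name": "C", "x1": 0, "x2": 1, "y": 1},
    {"name": "D", "x1": 1, "x2": 0, "y": 0},
    {"name": "E", "x1": 0, "x2": 0, "y": 0},
]

# Pathway candidates: X1 present, rival X2 absent, positive outcome.
pathway = [
    c["name"] for c in cases
    if c["x1"] == 1 and c["x2"] == 0 and c["y"] == 1
]
print("pathway case(s):", pathway)
```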
8. Most Similar • Definition: Cases (two or more) are similar on specified variables other than X and Y. • Cross-Case Technique: Matching, Propensity Score Matching • Uses: Hypothesis Testing and Generation • Representative: May be tested by examining residuals of chosen cases.
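A minimal sketch of matching for most-similar selection: among pairs of cases that differ on X, pick the pair that is closest on the covariates. The cases, the two covariates, and the use of Euclidean distance on standardized covariates are all assumptions for illustration; propensity score matching would replace the distance step with a fitted probability of treatment.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical cases: x is the treatment, covs are two background covariates.
cases = {
    "A": {"x": 1, "covs": [10, 0.30]},
    "B": {"x": 0, "covs": [11, 0.32]},
    "C": {"x": 0, "covs": [25, 0.70]},
    "D": {"x": 1, "covs": [24, 0.10]},
}
names = list(cases)

# Standardize each covariate so that no single scale dominates the distance.
scaled = {name: [] for name in names}
for j in range(2):
    column = [cases[name]["covs"][j] for name in names]
    m, s = mean(column), pstdev(column)
    for name in names:
        scaled[name].append((cases[name]["covs"][j] - m) / s)

def distance(a, b):
    """Euclidean distance between two cases on the standardized covariates."""
    return sum((u - v) ** 2 for u, v in zip(scaled[a], scaled[b])) ** 0.5

# Only pairs that differ on the treatment are candidates.
candidates = [(a, b) for a, b in combinations(names, 2)
              if cases[a]["x"] != cases[b]["x"]]
best_pair = min(candidates, key=lambda pair: distance(*pair))
print("most similar pair:", best_pair)
```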
Selecting Cases on the Independent Variable • Most Similar Systems is: • A Non-Equivalent Group Design (NEGD) with a treatment and comparison group. • Treatment group: N O X O • Comparison group: N O O
Income Inequality and Civil War • [Diagram; node labels: Income Inequality, Poverty, Colonial Past, External Threat, Civil War]
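In the NEGD notation, each O is an observation of the outcome and X is the treatment, so both groups are observed before and after. One common way to analyze such a design, not specified on the slide, is to compare the change in the treatment group with the change in the comparison group. The numbers below are hypothetical.

```python
# Hypothetical outcome measurements before and after the treatment period.
treatment = {"pre": 10.0, "post": 18.0}     # N O X O
comparison = {"pre": 9.0, "post": 12.0}     # N O   O

def change(group):
    """Post-period outcome minus pre-period outcome."""
    return group["post"] - group["pre"]

# Difference in changes: what the treatment group gained beyond the comparison group.
effect = change(treatment) - change(comparison)
print("estimated treatment effect:", effect)
```

The estimate is credible only to the extent that the two groups would have changed by the same amount without the treatment.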
Income Inequality and Civil War • We can hold the confounds constant by selecting these similar cases from Latin America. • It appears that income inequality does lead to civil war.
Lily Tsai: Accountability Mechanisms • “Most similar” cases: High Mountain Village, Jiangxi vs. Li Settlement, Jiangxi • Per capita investment: Nothing vs. 100 yuan (US$12) • Income per cap: 1100 yuan vs. 1200 yuan • Local tax per cap: 126 yuan vs. 150 yuan • Stratification: Low vs. Low • Elections: Poor vs. Poor • Oversight: None vs. None • Solidary group: No vs. Yes
Impact of village-wide lineage groups: Fitted values of investment and roads
Most Similar Design • Can be a powerful research design when it is difficult or costly to study a large number of cases. • When carried out correctly, can be internally valid. • Do not need a large number of cases for a proper test. • Implicit foundation for “area studies.” • Belief that regions share many similarities, and that these similarities are related to similar outcomes (weak test) and not related to dissimilar outcomes (stronger test).
How Similar is Similar? • In most similar designs, which covariates should you try to match? • Similar but irrelevant covariates do not add anything to the test. Likewise, dissimilar but irrelevant covariates do not detract from the test. Both reduce degrees of freedom. • Covariates that are related to both the treatment and outcome variables must be included whether similar or not – otherwise, the result is omitted variable bias.
Problems of Inference • Must include covariates that correlate with X and Y. • If you select cases with similar (relevant) covariates, they are likely to be similar in X as well. Indeed, since covariates and X are correlated, “naturally occurring” cases with similar covariates and different treatments may be outliers. • In “real world” cases, the treatment effect is likely to be small, hard to identify, and uncertain.
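Omitted variable bias can be illustrated with a small simulation. Everything below is an assumption made for the sketch: a confounder Z drives both X and Y, the true effect of X on Y is 1.0, and the effect of Z on Y is 2.0. Leaving Z out roughly doubles the estimated effect of X; partialling Z out of both X and Y recovers it.

```python
import random
from statistics import mean

random.seed(0)
n = 5000

# Simulated data: z is related to both the treatment x and the outcome y.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

def slope(xs, ys):
    """Bivariate least-squares slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return num / sum((a - x_bar) ** 2 for a in xs)

def residualize(values, on):
    """Remove the part of `values` that is linearly predicted by `on`."""
    b = slope(on, values)
    v_bar, o_bar = mean(values), mean(on)
    return [v - (v_bar + b * (o - o_bar)) for v, o in zip(values, on)]

naive = slope(x, y)                                        # z omitted
adjusted = slope(residualize(x, z), residualize(y, z))     # z held constant
print("omitting z:", round(naive, 2), "controlling for z:", round(adjusted, 2))
```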
9. Most Different • Definition: Cases (two or more) are different on specified variables other than X and Y. • Cross-Case Technique: Inverse of matching • Uses: Hypothesis Testing (eliminates deterministic theories) and Generation • Representative: May be tested by examining residuals of chosen cases.
Most Different Case Studies • Mill’s “method of agreement.” • Choose cases that are as different as possible except for the variable of interest (i.e., all receive same treatment). • If X and Y occur despite different covariates, X and Y may be related. • Less useful since it can only disprove a hypothesis.
Lily Tsai: Village Accountability • “Most different” cases: West Gate Village, Fujian vs. Yang Hamlet, Hebei • Income per cap: 6712 yuan vs. 1500 yuan • Distance from county: Near vs. Far • Stratification: High vs. Low • Population: 3700 people vs. 367 people • Type of solidary group: Temple vs. Temple • Vill govt public goods: Good vs. Good
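The logic of the method of agreement in this comparison can be sketched as a check on which attributes the two villages share. The values are taken from the slide, on the assumption that the first block of values belongs to West Gate Village and the second to Yang Hamlet; the dictionary keys are names chosen here for illustration.

```python
# Values from the slide; which block belongs to which village is an assumption.
west_gate = {
    "income_per_cap_yuan": 6712, "distance_from_county": "Near",
    "stratification": "High", "population": 3700,
    "solidary_group": "Temple", "public_goods": "Good",
}
yang_hamlet = {
    "income_per_cap_yuan": 1500, "distance_from_county": "Far",
    "stratification": "Low", "population": 367,
    "solidary_group": "Temple", "public_goods": "Good",
}

outcome = "public_goods"
same_outcome = west_gate[outcome] == yang_hamlet[outcome]

# Attributes the cases share are candidate causes; those that differ are ruled out
# as necessary conditions for the shared outcome.
shared = [k for k in west_gate if k != outcome and west_gate[k] == yang_hamlet[k]]
differing = [k for k in west_gate if k != outcome and west_gate[k] != yang_hamlet[k]]
print("same outcome:", same_outcome)
print("shared:", shared)
print("differing:", differing)
```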
Don’t Select on the Dependent Variable • Restricted range on the DV misestimates the effect of the IV; faulty inferences result. • Example: • Achen and Snidal on rational deterrence theory • Selection of “acute crises” gives an incomplete picture • No variation on DV – can’t learn anything • Example: • Skocpol on revolutions • Does have some information on non-revolutions; events at “moments of revolutionary crisis.”
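The effect of a restricted range on the DV can be seen in a small simulation. The setup is assumed for illustration: the true slope of Y on X is 1.0, and the "selected" sample keeps only cases with a high outcome (Y above 1.0). The slope estimated from the selected cases is much flatter than the slope in the full population.

```python
import random
from statistics import mean

random.seed(1)
n = 5000

# Simulated population in which the true effect of x on y is 1.0.
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]

def slope(xs, ys):
    """Bivariate least-squares slope of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return num / sum((a - x_bar) ** 2 for a in xs)

full_slope = slope(x, y)

# Selecting on the dependent variable: keep only cases with a high outcome.
selected = [(xi, yi) for xi, yi in zip(x, y) if yi > 1.0]
truncated_slope = slope([p[0] for p in selected], [p[1] for p in selected])

print("full sample:", round(full_slope, 2))
print("selected on y:", round(truncated_slope, 2))
```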
But…In some cases it makes sense… • The researcher is trying to establish that x is both a necessary and sufficient condition for y. • You are using Mill’s Method of Agreement to demonstrate that a hypothesis does not hold – deviant case.