
Historical Review – The Evaluation Gap

Rigorous Impact Evaluation: What It Is About and How It Can Be Done in Practice. Alexandra Caspari, Frankfurt/Main, Germany. Conference »Perspectives on Impact Evaluation: Approaches to Assessing Development Effectiveness«, 31st March – 2nd April 2009, Cairo.


Presentation Transcript


  1. Rigorous Impact Evaluation: What It Is About and How It Can Be Done in Practice
     Alexandra Caspari, Frankfurt/Main, Germany
     Conference »Perspectives on Impact Evaluation: Approaches to Assessing Development Effectiveness«, 31st March – 2nd April 2009, Cairo

  2. Historical Review – The Evaluation Gap
     • MDGs (2000), ‘Paris Declaration on Aid Effectiveness’ (2005), and ‘Agenda for Action’ (Accra, 2008):
         • increasing attention to impact evaluations
         • lack of knowledge about the effectiveness of projects and programs
     • 2006: report “When will we ever learn?” of the CGD ‘Evaluation Gap Working Group’
         • gap in quantity and quality of impact evaluations:
             • too few impact evaluations are being carried out, and
             • those conducted are often unable to properly assess impact because of methodological shortcomings
         • recommendation: ‘Collective Action’
     • International initiatives (NONIE, 3IE, …)

  3. What is Impact Evaluation?
     • OECD/DAC (2002): “positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended”
     • emphasis is on ‘produced by’:
         • measures impact with clear causation (causal attribution)
         • considers the counterfactual, i.e. the questions “What difference did this program make?” and “What would have happened without the intervention?”
     Rigorous Impact Evaluation (RIE):
     • distinguished from more “usual evaluations” by the added word “rigorous”
     • focus on clear causation
     • use of adequate methods (to address methodological shortcomings)
     • most important point: selecting an evaluation design that considers the counterfactual

  4. The Counterfactual
     • Causal effect: the actual effect $\delta_i$ caused by a treatment T (a program) is the outcome $Y_i^1$ under the treatment (T=1), i.e. as a program participant, minus the alternative outcome $Y_i^0$ that would have happened without the treatment (T=0), i.e. as a non-participant:
         $\delta_i = Y_i^1 - Y_i^0$
     • Impact is not directly observable:
         • any given individual can be observed either as a treated person (participant) or as an untreated person (non-participant), but not in both states
         • if individual i participates in a program (T=1), the outcome $Y_i^0$ is unobservable
         • this unobservable outcome $Y_i^0$ is called the counterfactual
     • The task is to analyze the difference between the observed outcome and the unobserved potential outcome by choosing the most suitable evaluation design
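
To make the definition above concrete, here is a minimal Python sketch. The four individuals and their outcome values are invented for illustration only; the field names `y0`, `y1` and `treated` are assumptions, not part of the presentation. Both potential outcomes are written down here only because the data are made up. In real evaluation data only `observed_outcome` would be available.

```python
# Hypothetical potential outcomes (illustrative numbers, not real data).
# y0: outcome without the program (T=0), y1: outcome with the program (T=1).
individuals = [
    {"id": 1, "y0": 10.0, "y1": 14.0, "treated": True},
    {"id": 2, "y0": 12.0, "y1": 15.0, "treated": False},
    {"id": 3, "y0": 8.0, "y1": 13.0, "treated": True},
    {"id": 4, "y0": 11.0, "y1": 13.0, "treated": False},
]


def true_effect(person):
    """delta_i = Y_i^1 - Y_i^0; computable only because the data are invented."""
    return person["y1"] - person["y0"]


def observed_outcome(person):
    """The single outcome an evaluator can actually measure for this person."""
    return person["y1"] if person["treated"] else person["y0"]


def counterfactual_outcome(person):
    """The outcome in the state that did not occur; never observed in practice."""
    return person["y0"] if person["treated"] else person["y1"]


for p in individuals:
    print(p["id"], "observed:", observed_outcome(p), "counterfactual: not observable")
```

For a participant the counterfactual is the `y0` value, which is exactly the quantity an evaluation design has to approximate with a comparison group.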

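  5. Considering the Counterfactual
     • often used: non-experimental designs, e.g. the one-group pre-test post-test design (a)
     [Figure: one-group pre-test post-test design (a). Impact indicator over time (t1, t2) for participants P; the change between the two observations is labelled “measured impact”. Legend: ●: observation, P: participants (treated), t: time (first, second observation), X: project intervention]
     • measured impact = change in the participants’ impact indicator between t1 and t2
     • the counterfactual is not considered!
     • with non-experimental designs causal attribution is not possible!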

  6. Considering the Counterfactual
     • necessary: experimental or quasi-experimental designs with an adequate comparison group (‘with-and-without comparison’)
     • “Real” experiments / Randomized Controlled Trials (RCTs):
         (Laboratory) experiments:
             • random assignment of individuals to treatment (P) and control group (C) → groups differ solely due to chance
             • treatment and conditions are known/checkable
         Field experiments:
             • take place in real-world settings
             • treatment and control groups are nevertheless assigned at random
     • Quasi-experiments:
         • no random assignment
         • rely on a source of variation in who is treated that is “as if” randomly assigned
         • control group is often reconstructed ex post

  7. Considering the Counterfactual
     [Figure: three panels, each plotting the impact indicator over time (t1, t2); the figure also marks an “over-estimated impact”.
      • one-group pre-test post-test design (a): participants P only; measured impact is the change from t1 to t2
      • static group comparison (4): P and C compared at t2; measured impact = Dt2 (single difference)
      • pre-test post-test control group design (1)/(2): P and C compared at t1 and t2; measured impact = Dt2 – Dt1 (double difference)
      Legend: ●: observation, P: participants (treated), C: control group (non-treated), D: difference, t: time (first, second observation), X: project intervention]
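
The three measures named on this slide can be sketched in a few lines of Python. The group means below are invented for illustration, and the function names are assumptions rather than anything defined in the presentation.

```python
def before_after_change(p_t1, p_t2):
    """One-group pre-test post-test: change among participants only."""
    return p_t2 - p_t1


def single_difference(p_t2, c_t2):
    """Static group comparison: Dt2, participants minus control at t2."""
    return p_t2 - c_t2


def double_difference(p_t1, p_t2, c_t1, c_t2):
    """Pre-test post-test control group design: Dt2 - Dt1."""
    d_t1 = p_t1 - c_t1
    d_t2 = p_t2 - c_t2
    return d_t2 - d_t1


# Hypothetical mean values of the impact indicator (illustrative only).
p_t1, p_t2 = 40.0, 60.0  # participants at first and second observation
c_t1, c_t2 = 38.0, 50.0  # control group at first and second observation

print("before-after change:", before_after_change(p_t1, p_t2))
print("single difference:  ", single_difference(p_t2, c_t2))
print("double difference:  ", double_difference(p_t1, p_t2, c_t1, c_t2))
```

With these made-up numbers the before-after change is 20, the single difference is 10 and the double difference is 8. The before-after figure is larger because it also contains the change that the control group experienced without any intervention.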

  8. Approaches to Impact Evaluation
     • appropriate impact evaluation designs are often rejected as unnecessarily sophisticated or because of ethical concerns
     • there are various realistic ways in which quasi-experimental designs can be introduced in an ethically and politically acceptable manner:
         • Matching on Observables
         • Regression Discontinuity
         • Propensity Score Matching (PSM)
         • Pipeline Approach
         • Multiple Comparison Group Design

  9. Possible Approaches in Practice
     • Matching on Observables:
         • the characteristics (access to services, economic level, type of housing, etc.) on which the comparison group should match the program group (individuals, households or areas) are identified carefully
         • these are often easily observable or identifiable characteristics
         • unobservable differences have to be kept in mind
         • the control group is built from those individuals, households or areas that match best
         • quasi-experimental design “pretest-posttest-comparison with post-test non-equivalent control group” (3) or at least “static group comparison” (4) is possible → single difference (SD) possible
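
A minimal sketch of matching on observables, assuming nearest-neighbour matching on two numeric characteristics. The households, the feature names `income` and `household_size`, and the use of unscaled Euclidean distance are all illustrative assumptions; the presentation does not prescribe a particular matching rule.

```python
import math

# Hypothetical program households and non-participant candidates.
participants = [
    {"id": "P1", "income": 100.0, "household_size": 4.0, "outcome": 55.0},
    {"id": "P2", "income": 150.0, "household_size": 6.0, "outcome": 62.0},
]
candidates = [
    {"id": "C1", "income": 105.0, "household_size": 4.0, "outcome": 50.0},
    {"id": "C2", "income": 300.0, "household_size": 2.0, "outcome": 80.0},
    {"id": "C3", "income": 145.0, "household_size": 6.0, "outcome": 58.0},
]

# Observable characteristics used for matching (an assumption of this sketch).
FEATURES = ("income", "household_size")


def distance(a, b):
    """Euclidean distance on the raw features; real use would rescale them."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in FEATURES))


def best_match(person, pool):
    """Return the candidate whose observable characteristics are closest."""
    return min(pool, key=lambda c: distance(person, c))


matched = [best_match(p, candidates) for p in participants]
matches = {p["id"]: m["id"] for p, m in zip(participants, matched)}

# Single difference (SD): mean participant outcome minus mean matched outcome.
mean_p = sum(p["outcome"] for p in participants) / len(participants)
mean_c = sum(m["outcome"] for m in matched) / len(matched)
single_difference = mean_p - mean_c

print(matches, single_difference)
```

The sketch can only balance what is measured. Differences in unobserved characteristics, as the slide notes, remain a possible source of bias.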

  10. Possible Approaches in Practice
     • Regression Discontinuity:
         • applicable if program eligibility is determined by a clear threshold on one or more criteria (age, income less than…)
         • the control group is built from those just above the threshold and hence not eligible for the program
         • those individuals will have comparable characteristics
         • quasi-experimental design “pre-test post-test non-equivalent control group design” (2) possible!
         • double difference (DD) possible!
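
The threshold logic can be sketched as follows, assuming a hypothetical income cut-off of 1000 and a window of 100 on either side. Both values, the household data and the field names are invented. Restricting participants to those just below the cut-off is also an assumption of this sketch; the slide itself only specifies that the control group comes from just above it.

```python
THRESHOLD = 1000.0  # hypothetical eligibility rule: income below this value
BANDWIDTH = 100.0   # hypothetical window around the threshold

# Invented households with pre-test and post-test indicator values.
households = [
    {"income": 950.0, "pre": 20.0, "post": 30.0},
    {"income": 980.0, "pre": 22.0, "post": 31.0},
    {"income": 1020.0, "pre": 23.0, "post": 27.0},
    {"income": 1080.0, "pre": 21.0, "post": 26.0},
    {"income": 400.0, "pre": 10.0, "post": 25.0},   # far below: excluded
    {"income": 2500.0, "pre": 40.0, "post": 44.0},  # far above: excluded
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Eligible households just below the threshold (program group near the cut-off).
treated_near = [
    h for h in households
    if THRESHOLD - BANDWIDTH <= h["income"] < THRESHOLD
]
# Ineligible households just above the threshold (comparison group).
comparison = [
    h for h in households
    if THRESHOLD <= h["income"] < THRESHOLD + BANDWIDTH
]

# Double difference: gap between groups after the program minus gap before.
d_t1 = mean(h["pre"] for h in treated_near) - mean(h["pre"] for h in comparison)
d_t2 = mean(h["post"] for h in treated_near) - mean(h["post"] for h in comparison)
double_difference = d_t2 - d_t1

print(len(treated_near), len(comparison), double_difference)
```

Narrowing the window makes the two groups more comparable but leaves fewer observations, so the choice of bandwidth is a trade-off rather than a fixed rule.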

  11. Possible Approaches in Practice
     • Pipeline Approach:
         • applicable if large programs (housing or community infrastructure, immunization, …) are introduced in phases over several years,
         • when there are no major differences between the characteristics of the families or communities scheduled for each phase, and
         • when there are no selection criteria for participants of the first phase (the poorest families, communities, …)
         • participants of phases 2 and 3 = control group for participants of phase 1
         • quasi-experimental design “pre-test post-test non-equivalent control group design” (2) possible!
         • double difference (DD) possible

  12. Important Remarks
     • The international discussion about RIE refers to just one small aspect of evaluation: the causal attribution of impact
     • Impact is measured at the level of target groups/participants; because target groups are typically large, quantitative methods are necessary for this evaluation step (representativeness vs. depth)
     • other evaluation methods are not condemned!
     • causal attribution is necessary but not sufficient → the ‘black box’ remains: why does a program have an impact (or not)?
     • comprehensive, meaningful and reliable impact evaluations need mixed methods, i.e. the use of both quantitative and qualitative methods

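  13. Reference: Caspari, Alexandra/Barbu, Ragnhild (2008): Wirkungsevaluierungen: Zum Stand der internationalen Diskussion und dessen Relevanz für die Evaluierung der deutschen Entwicklungszusammenarbeit [Impact evaluations: on the state of the international discussion and its relevance for the evaluation of German development cooperation]
     • http://www.fh-frankfurt.de/de/.media/~caspari/2008bmzwpwirkungsevaluation.pdf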
