An Introduction to Evaluation Methods Embry Howell, Ph.D. The Urban Institute
Introduction and Overview Why do we do evaluations? What are the key steps to a successful program evaluation? What are the pitfalls to avoid?
Why Do Evaluation? Accountability to program funders and other stakeholders Learning for program improvement Policy development/decision making: what works and why?
“Evaluation is an essential part of public health; without evaluation’s close ties to program implementation, we are left with the unsatisfactory circumstance of either wasting resources on ineffective programs or, perhaps worse, continuing public health practices that do more harm than good.” Quote from Roger Vaughan, American Journal of Public Health, March 2004.
Key Steps to Conducting a Program Evaluation Stakeholder engagement Design Implementation Dissemination Program change/improvement
Stakeholder Engagement Program staff Government Other funders Beneficiaries/advocates Providers
Develop Support and Buy-in Identify key stakeholders Solicit participation/input Keep stakeholders informed “Understand, respect, and take into account differences among stakeholders…” AEA Guiding Principles for Evaluators.
Evaluability Assessment Develop a logic model Develop evaluation questions Identify design Assess feasibility of design: cost/timing/etc.
Develop a Logic Model Why use a logic model? What is a logic model?
Develop Evaluation Questions Questions that can be answered depend on the stage of program development and resources/time.
Assessing Alternative Designs Case study/implementation analysis Outcome monitoring Impact analysis Cost-effectiveness analysis
Early stage of program or new initiative within a program
  Type of evaluation: 1. Implementation analysis/case study
    - Is the program being delivered as intended?
    - What are successes/challenges with implementation?
    - What are lessons for other programs?
    - What unique features of environment lead to success?
Mature, stable program with well-defined program model
  Type of evaluation: 2. Outcome monitoring
    - Are desired program outcomes obtained?
    - Do outcomes differ across program approaches or subgroups?
  Type of evaluation: 3. Impact analysis
    - Did the program cause the desired impact?
  Type of evaluation: 4. Cost-effectiveness analysis
    - Is the program cost-effective (worth the money)?
Confusing Terminology
Process analysis = implementation analysis
Program monitoring = outcome monitoring
Cost-effectiveness = cost-benefit (when effects can be monetized) = return on investment (ROI)
Formative evaluation: similar to case studies/implementation analysis; used to improve program
Summative evaluation: uses both implementation and impact analysis (mixed methods)
“Qualitative”: a type of data often associated with case studies
“Quantitative”: numbers; can be part of all types of evaluations, most often outcome monitoring, impact analysis, and cost-effectiveness analysis
“Outcome measure” = “impact measure” (in impact analysis)
Case Studies/Implementation Analysis Quickest and lowest-cost type of evaluation Provides timely information for program improvement Describes community context Assesses generalizability to other sites May be first step in design process, informing impact analysis design In-depth ethnography takes longer; used to study beliefs and behaviors when other methods fail (e.g. STDs, contraceptive use, street gang behavior)
Outcome Monitoring Easier and less costly than impact evaluation Uses existing program data Provides timely ongoing information Does NOT answer well the “did it work” question
Impact Analysis Answers the key question for many stakeholders: did the program work? Hard to do; requires good comparison group Provides basis for cost-effectiveness analysis
Cost-Effectiveness Analysis/Cost-Benefit Analysis Major challenges: Measuring cost of intervention Measuring effects (impacts) Valuing benefits Determining time frame for costs and benefits/impacts
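The challenges above can be illustrated with a small calculation. This is a minimal sketch, not a method taken from the presentation: the function names, the yearly time steps, and the 3% default discount rate are all assumptions chosen for illustration. Discounting effects at the same rate as costs is also a modeling choice that an evaluator would need to justify.

```python
def present_value(amounts, rate):
    """Discount yearly amounts (year 0 first) back to year 0."""
    return sum(a / (1 + rate) ** year for year, a in enumerate(amounts))


def cost_effectiveness_ratio(costs, effects, rate=0.03):
    """Discounted cost per unit of effect (e.g. dollars per case averted).

    The 3% default rate is an assumption for illustration only.
    """
    pv_effects = present_value(effects, rate)
    if pv_effects == 0:
        raise ValueError("no measured effect; the ratio is undefined")
    return present_value(costs, rate) / pv_effects


def net_benefit(costs, benefits, rate=0.03):
    """Cost-benefit version: usable only when effects are valued in money."""
    return present_value(benefits, rate) - present_value(costs, rate)
```

The time frame matters: passing three years of costs and benefits instead of one can change the sign of `net_benefit`, which is why the time frame is listed as a challenge in its own right.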
An Argument for Mixed Methods Truly assessing impact requires implementation analysis: Did program reach population? How intensive was program? Does the impact result make sense? How generalizable is the impact? Would the program work elsewhere?
Assessing Feasibility/Constraints How much money/resources are needed for the evaluation: are funds available? Who will do the evaluation? Do they have time? Are skills adequate? Need for objectivity?
Assessing Feasibility, contd. Is contracting for the evaluation desirable? How much time is needed for evaluation? Will results be timely enough for stakeholders? Would an alternative, less expensive or more timely, design answer all/most questions?
Particularly Challenging Programs to Evaluate Programs serving hard-to-reach groups Programs without a well-defined or with an evolving intervention Multi-site programs with different models in different sites Small programs Controversial programs Programs where impact is long-term
Developing a Budget Be realistic! Evaluation staff Data collection and processing costs Burden on program staff
Revising Design as Needed After realistic budget is developed, reassess the feasibility and design options as needed.
“An expensive study poorly designed and executed is, in the end, worth less than one that costs less but addresses a significant question, is tightly reasoned, and is carefully executed.” Designing Evaluations, Government Accountability Office, 1991
Developing an Evaluation Plan Time line Resource allocation May lead to RFP and bid solicitation, if contracted Revise periodically as needed
Developing Audience and Dissemination Plan Important to plan products for audience Make sure dissemination is part of budget Include in evaluation contract, if appropriate Allow time for dissemination!
Key Steps to Implementing Evaluation Design Define unit of analysis Collect data Analyze data
Key Decision: Unit of Analysis Site Provider Beneficiary
Collecting Data Qualitative data Administrative data New automated data for tracking outcomes Surveys (beneficiaries, providers, comparison groups)
Human Subjects Protection Need IRB Review? Who does review? Leave adequate time
Qualitative Data Key informant interviews Focus groups Ethnographic studies E.g. street gangs, STDs, contraceptive use
Administrative Data Claims/encounter data Vital statistics Welfare/WIC/other nutrition data Hospital discharge data Linked data files
New Automated Tracking Data Special program administrative tracking data for the evaluation Define variables Develop data collection forms Automate data Monitor data quality Revise process as necessary Keep it simple!!
Surveys Beneficiaries Providers Comparison groups
Key Survey Decisions Mode: In-person (with or without computer assistance) Telephone Mail Internet Response Rate Target Sampling method (convenience, random)
Key Steps to Survey Design Establish sample size/power calculations Develop questionnaire to answer research questions (refer to logic model) Recruit and train staff Automate data Monitor data quality
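The first step above, establishing sample size, can be sketched for one common case: comparing a proportion (for example, the share of beneficiaries with a given outcome) between two groups. This is a minimal sketch using the standard normal-approximation formula; the function names and the default values (5% two-sided significance, 80% power) are assumptions for illustration, and a real study should have its calculation reviewed by a statistician.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Completed surveys needed per group to detect p1 vs. p2.

    Uses the normal approximation for a two-sided test of two
    independent proportions with equal group sizes.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


def inflate_for_nonresponse(completed_needed, expected_response_rate):
    """Number of people to sample so enough of them respond."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("response rate must be in (0, 1]")
    return math.ceil(completed_needed / expected_response_rate)
```

For example, detecting a difference between 50% and 60% under these defaults calls for 385 completed surveys per group, and with an expected response rate of 50% the sample to draw doubles to 770 per group.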
Task                                                   Hours      Duration
 1. Goal clarification                                 ________   ________
 2. Overall study design                               ________   ________
 3. Selecting the sample                               ________   ________
 4. Designing the questionnaire and cover letter       ________   ________
 5. Conduct pilot test                                 ________   ________
 6. Revise questionnaire (if necessary)                ________   ________
 7. Printing time                                      ________   ________
 8. Locating the sample (if necessary)                 ________   ________
 9. Time in the mail & response time                   ________   ________
10. Attempts to get non-respondents                    ________   ________
11. Editing the data and coding open-ended questions   ________   ________
12. Data entry and verification                        ________   ________
13. Analyzing the data                                 ________   ________
14. Preparing the report                               ________   ________
15. Printing & distribution of the report              ________   ________
From: Survival Statistics, by David Walonick
Analyzing Data Qualitative methods Protocols Notes Software Descriptive and analytic methods Tables Regression Other
Dissemination Reports Briefs Articles Reaching out to audience Briefings Press
Ethical Issues in Evaluation Maintain objectivity/avoid conflicts of interest Report all important findings: positive and negative Involve and inform stakeholders Maintain confidentiality and protect human subjects Minimize respondent burden Publish openly and acknowledge all participants
Impact Evaluation Why do an impact evaluation? When to do an impact evaluation?
Developing the counter-factual: “WITH VS. WITHOUT” Random assignment: control group Quasi-experimental: comparison group Pre/post only Other
Random Assignment Design Definition: Measures a program’s impact by randomly assigning subjects to the program or to a control group (“business as usual,” “alternative program,” or “no treatment”)
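The definition above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the split is a simple 50/50 shuffle, and the impact estimate is a plain difference in mean outcomes with no significance test or covariate adjustment.

```python
import random
from statistics import mean


def randomly_assign(subject_ids, seed=None):
    """Shuffle subjects and split them into program and control groups."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def impact_estimate(outcomes, program_ids, control_ids):
    """Mean outcome in the program group minus mean in the control group.

    `outcomes` maps each subject id to its measured outcome.
    """
    program_mean = mean(outcomes[i] for i in program_ids)
    control_mean = mean(outcomes[i] for i in control_ids)
    return program_mean - control_mean
```

Because assignment is left to chance, the two groups should be similar on average in both measured and unmeasured characteristics, so the control group mean stands in for the "without" condition.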
Example of Alternative to Random Assignment: Regression Discontinuity Design (See West, et al, AJPH, 2008)
Quasi-experimental Design Compare program participants to well-matched non-program group: Match on pre-intervention measures of outcomes Match on demographic and other characteristics (can use propensity scores) Weak design: compare participants to non-participants! Choose comparison group prospectively, and don’t change!
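Matching on a pre-intervention measure of the outcome can be sketched as follows. This is a deliberately simple illustration, assuming a single baseline score per person and greedy nearest-neighbor matching without replacement; the function name is hypothetical, and it is not a propensity score method, which would first model the probability of participation from several characteristics.

```python
def match_comparison_group(participants, candidates):
    """Pair each participant with the closest unused comparison candidate.

    Both arguments map a person id to a pre-intervention outcome score.
    Matching is greedy and in participant order, so results can depend
    on that order; each candidate is used at most once.
    """
    available = dict(candidates)
    matches = {}
    for pid, score in participants.items():
        if not available:
            break  # more participants than candidates; the rest go unmatched
        best = min(available, key=lambda cid: abs(available[cid] - score))
        matches[pid] = best
        del available[best]
    return matches
```

In keeping with the advice above, the matched group would be fixed before outcomes are observed and not revised afterward.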
Examples of Comparison Groups Similar individuals in same geographic area Similar individuals in different geographic area All individuals in one area (or school, provider, etc.) compared to all individuals in a well-matched area (or school, provider)
Pre/Post Design Can be strong design if combined with comparison group design Otherwise, falls in category of outcome monitoring, not impact evaluation Advantages: controls well for client characteristics Better than no evaluation as long as context is documented and caveats are described
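Combining pre/post measurement with a comparison group is commonly analyzed as a difference-in-differences. The sketch below is a minimal illustration with a hypothetical function name; it assumes the two groups would have followed the same trend without the program, and it omits standard errors.

```python
from statistics import mean


def difference_in_differences(program_pre, program_post,
                              comparison_pre, comparison_post):
    """Change in the program group minus change in the comparison group."""
    program_change = mean(program_post) - mean(program_pre)
    comparison_change = mean(comparison_post) - mean(comparison_pre)
    return program_change - comparison_change
```

A pre/post-only design reports just `program_change`; subtracting the comparison group's change removes the part of that change that would have happened anyway.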