

Presentation Transcript


    1.

    2. Learning Objectives: At the end of this presentation, participants will: understand the various evaluation design options open to Title II program managers, and know how to select a particular design model given the needs of the project.

    3. Background. Evaluating a Title II program involves assessing its effects and impacts, i.e. verifying the extent to which program activities are associated with intended changes in the behavior and well-being of the beneficiary population. Evaluation objectives may range from simply measuring the level of change in indicators of well-being to attributing a change in the level of those indicators to the intervention being implemented. The choice of model depends on the answer to one basic question: how confident do program managers need to be that the changes they see are a result of program activities?

    4. Three basic evaluative models:
       “Adequacy” evaluations determine whether objectives were met and activities were performed as planned. May or may not require a before/after comparison. Does not require controls.
       “Plausibility” evaluations assess whether it is plausible to attribute an effect to an intervention. Requires a before/after comparison with/without controls and/or treatment of confounding factors (3 sub-types).
       “Probability” evaluations determine the statistical probability that the intervention caused the effect. Requires a before/after comparison with randomized controls. The gold standard of scientific research.

    5. Adequacy evaluation (simple pre/post design): did the expected change occur? Adequacy evaluations only describe whether a condition was met following the intervention. They typically address provision, utilization or coverage questions. They do not need controls, and may not even need before/after data when the comparison is against a set criterion, e.g.: has the measles immunization program reached the expected coverage (e.g. 85%)? They can also answer questions about the magnitude of change (this requires before/after data), e.g.: were stunting rates reduced by at least 10 percentage points?
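
    A minimal Python sketch of what such an adequacy check amounts to, assuming hypothetical survey figures (the 85% coverage criterion and the 10-percentage-point stunting reduction follow the examples above; all other values are invented):

        # Hypothetical survey estimates; only outcome data are needed.
        measles_coverage = 0.88        # coverage measured at the evaluation survey
        coverage_target = 0.85         # criterion set in advance (e.g. 85%)

        stunting_baseline = 0.42       # stunting prevalence at baseline
        stunting_final = 0.30          # stunting prevalence at final evaluation
        reduction_target = 0.10        # required reduction: 10 percentage points

        # Adequacy questions are simple yes/no comparisons against the criteria.
        print("Coverage adequate:", measles_coverage >= coverage_target)
        print("Stunting reduced by 10 points or more:",
              stunting_baseline - stunting_final >= reduction_target)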

    6. Features of Adequacy Statements.
       Pros: the simplest and least expensive of all designs; no attempt is made to control for external effects, and data are needed only for outcomes. In special situations, sufficient to attribute change to the program.
       Cons: in most cases it is not possible to attribute change to the program, since other factors could have caused the change in the absence of the intervention. The absence of controls also makes it impossible to explain a lack of change.
       Bottom line: USAID does not require that evaluations attribute the outcome to the program, so there is no compelling reason preventing a program from using this type of evaluation, provided that the other elements of the M&E system lend credibility to the claim of association between the activity and the outcomes. BUT if a manager wants to understand whether the program works, or wants to be able to claim responsibility for a change in the outcome, more powerful designs are needed.

    7. “Plausibility” evaluations. “Plausibility designs” refer to a family of designs that vary in complexity, from the simple pre-post comparison of the beneficiary population to a control group, to designs that compare those same groups before and after the intervention while simultaneously controlling statistically for other determinants and confounders. Attribution statements go from weak to strong: at the weakest level, a simple comparison across beneficiaries and controls is used to try to remove the influence of unaccounted-for factors; at the strongest level, most alternative explanations have been identified, measured and accounted for.

    8. Plausibility evaluation, Type I: pre-post design with controls. This is the most basic controlled design; it requires a comparison of baseline/final evaluation (BL/FE) data across both the beneficiary and control groups.
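
    As an illustration, a minimal Python sketch of the Type I comparison, assuming hypothetical BL/FE prevalence estimates for the beneficiary and control groups (all figures invented):

        # Hypothetical outcome prevalences (e.g. stunting) from the BL and FE surveys.
        beneficiary = {"BL": 0.45, "FE": 0.32}
        control = {"BL": 0.44, "FE": 0.41}

        change_beneficiary = beneficiary["FE"] - beneficiary["BL"]
        change_control = control["FE"] - control["BL"]

        # The change among beneficiaries over and above the change among controls
        # is the effect that can plausibly be associated with the intervention.
        print("Change among beneficiaries:", round(change_beneficiary, 2))
        print("Change among controls:", round(change_control, 2))
        print("Difference plausibly associated with the program:",
              round(change_beneficiary - change_control, 2))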

    9. Pros and cons of Type I.
       Pros: makes it reasonably plausible to associate a change in the outcome with the intervention; remains conceptually simple; requires only outcome data.
       Cons: there may still be other factors that explain the outcome; controls may differ from project beneficiaries in significant ways (climate, culture, etc.); costly, as it requires doubling the sample size.

    10. Plausibility model, Type II: pre-post design with statistical treatment of determinants and known confounders.
        Definitions: “determinants” are any features that predictably affect the outcome of the intervention; “confounders” are factors that can affect the outcome but over which the program has no control (e.g. climate), and they may be known or unknown. Once measured and quantified, determinants and known confounders can be accounted for (“kept constant”) using multivariate statistical analysis techniques (e.g. regression).
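
    A minimal sketch of the kind of multivariate analysis the slide refers to, assuming a hypothetical pooled BL/FE dataset with invented column names (“post”, “mother_edu”, “rainfall”) and values; ordinary least squares from statsmodels stands in for whichever technique the actual analysis would use:

        import pandas as pd
        import statsmodels.formula.api as smf

        # Hypothetical pooled baseline/final records; names and values are illustrative.
        df = pd.DataFrame({
            "outcome":    [0.8, 1.1, 0.9, 1.3, 1.0, 1.4, 1.2, 1.5],   # outcome indicator
            "post":       [0, 0, 0, 0, 1, 1, 1, 1],                   # 0 = BL, 1 = FE
            "mother_edu": [2, 6, 4, 8, 3, 7, 5, 9],                   # determinant (years)
            "rainfall":   [600, 650, 620, 640, 580, 610, 590, 630],   # known confounder (mm)
        })

        # The coefficient on 'post' estimates the BL-to-FE change in the outcome
        # with the determinant and the known confounder held constant.
        model = smf.ols("outcome ~ post + mother_edu + rainfall", data=df).fit()
        print(model.params)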

    11. Data collected for adequacy vs. plausibility Type II evaluations

    12. Features of the Type II model. It does not involve a control group, so the sample is kept as small as possible. It requires information at BL and FE on the outcome, its determinants, and all known confounders; this makes the interview more complex, adding to the costs.

    13. Plausibility model, Type III: pre-post design with controls and statistical treatment of determinants and known confounders. This design combines the use of controls and the statistical treatment of confounders; the inclusion of the control group addresses unknown or unmeasured confounders. It is the most powerful, and most costly, of the three “plausibility” designs.

    14. Properties of Type III. Assuming a successful intervention, a Type III design provides the information to state:
        - that the severity of the problem was reduced in beneficiary areas;
        - that there is an inverse association between the intensity of the intervention and the severity of the problem;
        - that changes in the known confounders do not explain the observed improvement;
        - that changes in unknown confounders should not explain the improvement;
        - that the severity of the problem did not improve (or improved significantly less) in areas without the intervention.
        Being able to make these statements reduces the possibility that a change in outcome at the beneficiary population level is due to causes other than the program intervention. This level of certainty comes at a high cost, however: it involves an extensive interview instrument (the same as Type II) and doubles the sample size to include a control group.
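
    A minimal Python sketch of a Type III analysis, assuming hypothetical simulated data (all variable names, values and effect sizes invented); it combines the control-group comparison with statistical control of a determinant and a known confounder:

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        # Hypothetical pooled BL/FE records from beneficiary and control areas.
        rng = np.random.default_rng(0)
        n = 200
        df = pd.DataFrame({
            "beneficiary": rng.integers(0, 2, n),    # 1 = program area, 0 = control area
            "post":        rng.integers(0, 2, n),    # 0 = baseline, 1 = final evaluation
            "mother_edu":  rng.integers(0, 10, n),   # determinant (years of schooling)
            "rainfall":    rng.normal(600, 30, n),   # known confounder (mm)
        })
        df["outcome"] = (0.5 + 0.02 * df["mother_edu"] + 0.001 * df["rainfall"]
                         - 0.10 * df["beneficiary"] * df["post"]
                         + rng.normal(0, 0.1, n))

        # The 'beneficiary:post' coefficient is the change among beneficiaries over and
        # above the change among controls, net of the measured determinant and confounder.
        model = smf.ols("outcome ~ beneficiary * post + mother_edu + rainfall", data=df).fit()
        print(model.params)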

    15. Conclusions on plausibility statements. Plausibility statements go from weak to strong; the highest level of plausibility is reached when known alternative explanations have been considered and rejected. At the weakest level, a simple comparison across functional groups is made to reduce confounding factors (e.g. two-by-two tables); at the strongest level, matching or mathematical treatment (e.g. multivariate regression) is used to make the comparison. The stronger the assertion, the greater the data needs and the greater the analytical resource needs; this influences the cost of the study.

    16. Probability statement (causal analysis of before/after differences). Probability evaluations ensure that there is only a small and measurable probability (e.g. p = .05) that the difference between beneficiaries and controls is due to confounders, biases or randomness. This requires randomization in the assignment of participants to study groups (treatment or control). Randomization does not ensure the elimination of confounders, but it ensures that the probability of those factors still playing a role is known and small (e.g. 95% confidence and 80% power).
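
    A minimal sketch of the randomization step and the resulting significance test, assuming a hypothetical list of communities and invented endline outcome values; scipy’s two-sample t-test stands in for whatever test the actual design would call for:

        import random
        from scipy import stats

        # Randomly assign hypothetical communities to treatment or control.
        random.seed(42)
        communities = [f"community_{i}" for i in range(1, 21)]
        random.shuffle(communities)
        treatment, control = communities[:10], communities[10:]

        # Hypothetical endline outcome values for the two randomized groups;
        # the difference is judged against a pre-set significance level (e.g. p < .05).
        treatment_outcomes = [0.31, 0.28, 0.35, 0.30, 0.27, 0.33, 0.29, 0.32, 0.26, 0.34]
        control_outcomes = [0.40, 0.42, 0.38, 0.41, 0.44, 0.39, 0.43, 0.37, 0.45, 0.40]
        t_stat, p_value = stats.ttest_ind(treatment_outcomes, control_outcomes)
        print("p-value:", p_value)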

    17. Pros and cons of probability studies.
        Pros: confounding factors need not be known or specified. Randomization ensures that comparison groups are unbiased, so any effect statistically detected is due to the intervention. Probability statements establish the causality of interventions, and this is the only type of evaluation design that can make that claim. Once causal relations are established by probability studies, similar results can be inferred by program managers in their zone of intervention.
        Cons: social, ethical or political considerations must be set aside when selecting participants or sites. Probability studies have high internal but low external validity; this can create situations that are only marginally related to field realities, making them less useful in a program context.

    18. Choosing the right type of evaluation. The choice is based on: known efficacy/effectiveness of the intervention; timing and pertinence; feasibility, desirability and cost; and audience.

    19. Choosing the right type of evaluation: known efficacy/effectiveness of the intervention. The efficacy of an intervention should be established first, and this usually requires probability studies. E.g.: measles immunization is proven to be efficacious, so there is no need for further probability assessments; an adequacy evaluation judging provision and coverage to be adequate is enough. An effectiveness study may be required to understand what other factors may affect an intervention. E.g.: the effectiveness of home gardening in increasing vitamin A status may vary by context; the long causal pathway requires that the various possible confounders be measured.

    20. Choosing the right type of evaluation: timing and pertinence. Perfect information is useless if it comes too late: the higher the standard, the longer it takes to collect, analyze and report on the data. Prior knowledge that a certain level of operational performance has been attained may be needed; for instance, an evaluation of a micronutrient (MN) fortification program is a waste of time and money if we know that the fortified food has only 20% market penetration. How soon will the impacts be felt? Some indicators (e.g. iron status) react quickly; others respond only in the long run (e.g. stunting). Efficacy trials based on probability assessments are usually politically and ethically unfeasible or undesirable in Title II contexts.

    21. Choosing the right type of evaluation: audience. Who uses the information, and how, affects the level of detail needed and the certainty of the findings offered. E.g.: scientists and project managers want more detail and more precision, and want to understand the mechanics; donor agencies want to confirm the usefulness of their investment and want to see positive outcomes. Strategic considerations (the testing of a new approach, a new country situation, future funding needs) may also justify the need for greater detail and a better understanding of the mechanics involved.

    22. Choosing the right type of evaluation: costs and capabilities.
        Cost: two key aspects determine cost: sample size and length of interview. Sample size: the Adequacy model and Plausibility Type II do not require controls; Plausibility Types I and III both require controls, doubling the size of the sample. Interview length: questionnaires for the Adequacy model and for Type I require only data on outcomes; questionnaires for Types II and III require data on outcomes, determinants and confounders.
        Capability: are measurement issues under control (particularly for determinants and confounders)? As designs become more elaborate, analytical requirements become more complex: are the human and technical resources needed to carry out the complex analyses implied by Types II and III available?
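
    To make the sample-size point concrete, a rough Python sketch using a standard power calculation (the effect size, significance level and power are invented, and the per-group figure is treated, for illustration only, as the survey size a design without controls would need):

        from statsmodels.stats.power import TTestIndPower

        # Hypothetical inputs: standardized effect size 0.3, 5% significance, 80% power.
        n_per_group = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.8)

        print("Households per group:", round(n_per_group))
        print("Without controls (Adequacy, Type II):", round(n_per_group))
        print("With controls (Types I and III):", 2 * round(n_per_group))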

    23. Conclusion. A good evaluation strategy starts with well-defined objectives: we should know from the start what questions we want to answer. Use “optimal ignorance”: collect only the data needed to answer exactly those questions. A good design will facilitate the establishment of schedules, workloads, logistics and budgets. The most complex design is not always the best: the choice should be based on the need at hand. Depending on the context, adequacy may be enough to claim impact.

    24. Bottom line: what is best for Title II? In most cases, Title II programs should do “adequacy” evaluations. Plausibility studies should be reserved mainly for “gold standard” studies, or for when there is a need to know the role of confounders.
