Evaluation Design
Learning Objectives
By the end of the session, participants will be able to:
1. Name the criteria for inferring causality
2. Understand internal and external validity
3. Know the different types of designs for an evaluation
4. Identify the strengths and limitations of the different types of study designs
5. Develop an evaluation framework
6. Select a study design that fits the purpose of a given evaluation
What are some different types of evaluation?
– Formative evaluation
– Process evaluation
– Impact evaluation
Introduction
To show impact, researchers often want to make inferences about cause and effect.
We may, for instance, want to:
– Identify the factors that explain why the use of ITNs is more prevalent in one group than in another
– Know why a particular intervention works
Logic of Causal Inference
Under what conditions may we infer that a change in the dependent variable was really caused by the independent variable, and not by something else?
What are some of the most plausible rival explanations, and how do we rule them out?
Overview
In this section, we will examine appropriate and inappropriate criteria for inferring causality.
We will identify various evaluation designs:
– Experimental
– Quasi-experimental
– Non-experimental
and the ways they attempt to show causality.
Criteria for Inferring Causality: Temporal Relationship
– The cause must precede the effect
Criteria for Inferring Causality: Plausibility
– An association is plausible, and thus more likely to be causal, if it is consistent with other knowledge
Criteria for Inferring Causality: Strength of the Association
– A strong association between possible cause and effect, as measured by the size of the relative risk, is more likely to be causal than a weak association
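To make the relative risk concrete, here is a small worked example; the incidence figures are hypothetical, not drawn from the presentation:

```latex
% Relative risk: incidence in the exposed group divided by incidence in the unexposed group
RR = \frac{I_{\text{exposed}}}{I_{\text{unexposed}}}
   = \frac{30/1000}{10/1000}
   = 3.0
```

A relative risk of 3.0 means the outcome occurs three times as often in the exposed group; values close to 1.0 indicate a weak or absent association.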
Criteria for Inferring Causality: Consistency
Several studies giving the same result
– Clearest when a variety of study designs are used in different settings, since the likelihood that all studies are making the same mistake is minimized
– Lack of consistency does not exclude a causal association: different exposure levels and other conditions may reduce the impact of the causal factor in certain studies
Criteria for Inferring Causality: Dose-response Relationship
Occurs when changes in the level of a possible cause are associated with changes in the prevalence or incidence of the effect
– The prevalence of hearing loss increases with noise level and exposure time
Criteria for Inferring Causality: Reversibility
When the removal of a possible cause results in a reduced disease risk, the likelihood of the association being causal is strengthened
– Cessation of cigarette smoking is associated with a reduction in the risk of lung cancer relative to that in people who continue to smoke
However, when the cause leads to rapid irreversible changes whether or not there is continued exposure (HIV infection), then reversibility cannot be a condition for causality
Internal & External Validity
When considering cause and effect, two forms of validity become relevant:
– Internal validity
– External validity
Internal Validity
Internal validity: the confidence that the results of a study accurately depict whether one variable is or is not a cause of another
A study is internally valid if it meets our criteria:
– Cause precedes effect
– Empirical correlation between cause and effect
– No confounding variable
Threats to Internal Validity
– History
– Maturation or the passage of time
– Testing and instrumentation
– Selection bias
– Loss to follow-up
– Diffusion or imitation of treatments
External Validity
The extent to which the causal relationship depicted in a study can be generalized beyond the study conditions
Evaluating the Effect of an Intervention
Specific Evaluation Questions
Service provision: Are the services available? Are they accessible? Is quality adequate?
Service utilization: Are the services being used?
Service coverage: Is the target population being reached?
Service effectiveness: Is there improvement in disease outcome or health-related behavior?
Impact: Were the improvements due to the program?
Measuring the Effect of an Intervention
[Figure: outcome over time from the start to the end of the intervention, showing the change in outcome with the intervention versus the expected trajectory without it; the gap between the two is the intervention impact.]
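The figure can be summarized in a single expression; this formalization is ours rather than the slide's:

```latex
% Intervention impact: observed outcome minus the estimated counterfactual
\text{Impact} = Y_{\text{with intervention}} - Y_{\text{without intervention (counterfactual)}}
```

Because the without-intervention outcome is never observed for the same population at the same time, the designs that follow differ mainly in how credibly they estimate this counterfactual.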
Deciding on Evaluation Design
1. What is your research question?
2. What is your target population?
3. What do you know about this population?
4. How do you intend to use the results?
5. What do you want to measure (indicators)?
6. What type of inference do you want to draw?
7. When do you need the results?
8. Do you have a sampling frame? What shape is it in?
9. How much are you willing to pay?
10. Where in the program life cycle are you now?
Types of Evaluation Design
Experimental: strongest for demonstrating causality, most expensive
Quasi-experimental: weaker for demonstrating causality, less expensive
Non-experimental: weakest for demonstrating causality, least expensive
Experimental Design
The Basic Experimental Principle
The intervention is the only difference between the two groups.
This is achieved by random assignment.
Pretest-posttest Experimental Design
Experimental designs attempt to provide maximum control for threats to internal validity.
They do so by giving the researchers greater ability to manipulate and isolate the independent variable
– (not always possible in practice)
Pretest-posttest Experimental Design
Essential components:
1. Identify the study population and determine the appropriate sample size for the experimental and control groups
2. Randomly assign individuals to the experimental and control groups
3. Pre-test everyone with a standardized instrument
4. Introduce the independent variable (intervention) to the experimental group while withholding it from the control group
5. Post-test both groups with the same instrument and under the same conditions as the pretest
6. Compare the amount of change in the dependent variable for both the experimental and control groups (a sketch of steps 2 and 6 follows this list)
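As a minimal sketch of steps 2 and 6, the snippet below randomly assigns a participant list to two arms and compares mean pre-to-post change; the function names, seed, and data layout are illustrative assumptions, not part of the original materials.

```python
import random

def randomize(participants, seed=42):
    """Randomly split a list of participant IDs into intervention and control arms (step 2)."""
    rng = random.Random(seed)   # fixed seed so the allocation can be reproduced and audited
    shuffled = list(participants)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def mean_change(pre_scores, post_scores):
    """Average pre-to-post change for one group, given paired scores (step 6)."""
    return sum(post - pre for pre, post in zip(pre_scores, post_scores)) / len(pre_scores)

# Illustrative use with 200 hypothetical participants
intervention_ids, control_ids = randomize([f"ID{i:03d}" for i in range(1, 201)])
# effect_estimate = mean_change(pre_int, post_int) - mean_change(pre_ctrl, post_ctrl)
```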
Pretest-posttest Control Group Design
[Diagram: participants are randomized into an intervention group and a control group; both groups are pre-tested, the intervention group receives the program, and both groups are post-tested over time.]
Factors that May Distort Conclusions
– Dropout
– Instrumentation effects
– Testing effects
– Contamination
If you think that taking the pretest might influence the treatment effects, or that it might bias the post-test responses, you may want to opt for the posttest-only control group design.
Posttest-only Experimental Design
Posttest-only Experimental Design
– Initial group equivalence is achieved through randomization
– Differences between the experimental and control groups at posttest are assumed to reflect the causal impact of the independent variable
Posttest-only Experimental Design
[Diagram: participants are randomized into an intervention group and a control group; the intervention group receives the program, and both groups are post-tested over time, with no pre-test.]
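Because randomization makes the two groups equivalent in expectation, the posttest-only estimate reduces to a simple difference in posttest means; this notation is ours, not the slide's:

```latex
% Estimated program effect under the posttest-only design
\hat{\Delta} = \bar{Y}_{\text{intervention, posttest}} - \bar{Y}_{\text{control, posttest}}
```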
Posttest-only: What to Consider?
Advantages
– Cheaper
– Useful when a pre-test could interfere with program effects
– Randomization ensures equivalent experimental and control groups
Disadvantages
– Cannot assess whether the program is reaching the people for whom it was intended
– Cannot check the comparability of the groups
– Cannot know how much change actually occurred
However, a pretest-posttest design is always preferred.
Group Discussion
In which situations might experimental design not be possible?
Possible Responses
– Randomization needed before the program starts
– Ethics
  o Solution: use an alternative program rather than no program
  o Known efficacy of the intervention
– Political factors
– Scale-up
  o Solution: start out on a small scale and use a delayed-program strategy
Quasi-Experimental Design
Quasi-experimental Designs
– Can be used when random assignment is not possible
– Less internal validity than "true" experiments
– Still provide a moderate amount of support for causal inferences
Principles of the Quasi-experimental Design
[Diagram: an intervention group and a non-randomized comparison group are both pre-tested; the intervention group receives the program, and both groups are post-tested over time.]
(Pre- and post-test with a comparison group, but not randomized)
Keep in mind selection effects: these occur when people selected for the comparison group differ from the experimental group.
38
o Difference-in-Difference analysis: method involves comparing changes before and after the program for individuals in the program and control groups o Regression analysis: Attempts to address the problem of confounding by controlling for difference at baseline Quasi-experimental Statistical Methods
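A minimal difference-in-differences sketch using pandas, assuming a long-format dataset with one row per respondent per survey round; the column names and values are illustrative, not actual program data.

```python
import pandas as pd

# Illustrative long-format data: group, survey round, and outcome (e.g., ITN use)
df = pd.DataFrame({
    "group":   ["program"] * 4 + ["comparison"] * 4,
    "round":   ["pre", "pre", "post", "post"] * 2,
    "outcome": [0.40, 0.44, 0.70, 0.66,    # program group
                0.38, 0.42, 0.50, 0.46],   # comparison group
})

# Mean outcome for each group in each round
means = df.groupby(["group", "round"])["outcome"].mean()

# Change over time in each group, then the difference between those changes
change_program    = means["program", "post"] - means["program", "pre"]
change_comparison = means["comparison", "post"] - means["comparison", "pre"]
did_estimate = change_program - change_comparison
print(f"Difference-in-differences estimate: {did_estimate:.3f}")
```

The same estimate can be obtained from a regression of the outcome on group, round, and their interaction, which also makes it straightforward to control for baseline differences, as the second bullet suggests.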
Summary of Quasi-experimental Design
Advantages
– Provides the assurance that outcomes are actually the results of the program
– Allows you to accurately assess how much of an effect the program has
Disadvantages
– Can demand more time and resources
– Requires access to at least two similar groups
Non-Experimental Design
Pre-test Post-test Non-experimental Design
[Diagram: a single intervention group is pre-tested, receives the program, and is post-tested over time; there is no comparison group.]
Merits of the Pre-test Post-test Non-experimental Design
Advantages
– Relatively simple to implement
– Controls for participants' prior knowledge, attitudes, skills, and intentions
Disadvantages
– Cannot account for non-program influences on outcomes
– Causal attribution is not possible
– Cannot detect small but important changes
– If self-reporting is used rather than objective measures, posttest scores may be lower than pretest scores
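A sketch of the basic pre-test post-test comparison for a single group, assuming paired scores for the same participants; the scores below are invented for illustration.

```python
from scipy import stats

# Illustrative paired scores for the same participants before and after the program
pre  = [52, 47, 61, 55, 49, 58, 44, 50]
post = [58, 51, 63, 60, 48, 62, 50, 57]

changes = [after - before for before, after in zip(pre, post)]
average_change = sum(changes) / len(changes)

# Paired t-test: is the average pre-to-post change different from zero?
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"Mean change: {average_change:.1f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```

Even a statistically significant change cannot be attributed to the program here, since the design cannot rule out non-program influences on the outcome.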
Time Series Design
[Diagram: a single intervention group is measured repeatedly over time, with two or more pretests before the program and two or more posttests after it; there is no comparison group.]
Merits of the Time Series Design
Advantages
– Enables detection of whether program effects are long-term or short-term
– A series of tests before the intervention can eliminate the need for a control group
– A series of tests before the program can be used to project the results that would be expected without it
– Can be used if you have only one site in which to conduct your evaluation
Disadvantages
– Problem of confounding
– Changes in instruments during the series of measurements
– Loss or change of cases
– Changes in group composition
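One common way to analyze a time series design is segmented (interrupted time series) regression; the sketch below uses plain numpy least squares, and the monthly counts and intervention date are illustrative assumptions.

```python
import numpy as np

# Illustrative monthly case counts: six pre-intervention and six post-intervention points
y = np.array([120, 118, 121, 117, 119, 116,    # before the program
              110, 104,  99,  95,  90,  86])   # after the program
t = np.arange(len(y), dtype=float)              # time index
post = (t >= 6).astype(float)                   # 1 once the program has started
t_since = np.where(post == 1, t - 6, 0.0)       # months since the program started

# Segmented regression: baseline level and trend, plus a level change and slope change
X = np.column_stack([np.ones_like(t), t, post, t_since])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, baseline_trend, level_change, slope_change = coef
print(f"Level change at program start: {level_change:.1f} cases")
print(f"Change in slope after program start: {slope_change:.1f} cases per month")
```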
Strengthening Non-experimental Designs
Since there is no control group, confounding can be a problem.
By constructing a plausibility argument and controlling for contextual and confounding factors, non-experimental designs can be strengthened.
How to Construct a Plausibility Argument
Describe trends in:
– Intervention coverage
– Intermediary outcomes
– Impact outcomes
– Contextual factors
Link these trends:
– Temporal, spatial, age-pattern, and "dose-response" associations (a simple sketch of one such check follows this list)
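One simple way to link trends is to check for a "dose-response" pattern across areas; the sketch below assumes district-level estimates of coverage change and mortality change are available from surveys, and all names and numbers are illustrative.

```python
import pandas as pd

# Illustrative district-level changes between baseline and endline surveys
districts = pd.DataFrame({
    "district":            ["A", "B", "C", "D", "E", "F"],
    "itn_use_change":      [0.35, 0.10, 0.28, 0.05, 0.40, 0.22],  # increase in ITN use
    "u5_mortality_change": [-18,  -4,  -15,  -2,  -22,  -11],     # change in 5q0 per 1,000
})

# Did districts with larger coverage gains see larger mortality declines?
r = districts["itn_use_change"].corr(districts["u5_mortality_change"])
print(f"Correlation between coverage gain and mortality change: {r:.2f}")
```

A strong negative correlation supports, but does not prove, the plausibility argument; contextual factors such as rainfall or care-seeking still need to be examined.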
Plausibility Argument for Impact of Malaria Control
Causal pathway: increase in effective intervention coverage → decreased morbidity → decreased malaria-associated mortality
Indicators
– Coverage: ITN ownership, ITN use, IPTp, treatment
– Morbidity: parasite prevalence, anaemia (<8 g/dL), fever
– Mortality: all-cause under-five mortality (5q0)
Contextual factors
– Socioeconomic: education, fertility risk, housing conditions, nutrition
– Climatic factors: rainfall, temperature
– Health interventions: health care utilization (ANC, EPI, vitamin A, PMTCT)
Merits of the Plausibility Argument
Advantages
– Allows national-scale program evaluation and complex interventions
– Can accommodate the use of data from different sources
Disadvantages
– Needs several data points for interventions and outcomes
– Challenges in comparing data from different sources
– Inconsistent data collection methods over time
Summary of Different Study Designs (from most to least robust)
True experimental: partial-coverage or new programs; control group; strongest design; most expensive
Quasi-experimental: partial-coverage or new programs; comparison group; weaker than experimental design; less expensive
Non-experimental: full-coverage programs; no comparison group; weakest design; least expensive
Summary of Different Study Designs
Different designs vary in their capacity to produce information that allows program outcomes to be linked to program activities.
The more confident you want to be about making these connections, the more rigorous the design and the more costly the evaluation.
Impact Evaluation Case Study 1: Reducing Malaria Transmission through IRS
Impact Evaluation Case Study 2: Antimalarial drug policy change
Case Study 3: National-Level Scale-up of malaria interventions
Group Work
Please get into your project groups and select an evaluation design for your program (experimental, quasi-experimental, or non-experimental).
Explain why you chose this design.
Discuss any strengths and weaknesses of this design as they relate to your program.
General Questions to Be Answered by Each Group
– What study design will you use?
– What are the strengths and limitations of your evaluation design?
– How would you know if your program is effective?
– How will you address contextual or confounding factors?
MEASURE Evaluation is a MEASURE program project funded by the U.S. Agency for International Development (USAID) through Cooperative Agreement GHA-A-00-08-00003-00 and is implemented by the Carolina Population Center at the University of North Carolina at Chapel Hill, in partnership with Futures Group International, John Snow, Inc., ICF Macro, Management Sciences for Health, and Tulane University. Visit us online at http://www.cpc.unc.edu/measure