Impact Evaluation Designs for Male Circumcision
Sandi McCoy
University of California, Berkeley
Male Circumcision Evaluation Workshop and Operations Meeting
Our Objective
Estimate the CAUSAL effect (impact) of intervention P (male circumcision) on outcome Y (HIV incidence).
Since we can never actually know what would have happened without the intervention, comparison groups allow us to estimate the counterfactual.
Evaluation Designs for MC IE
Study designs:
1. Cluster
2. Stepped wedge
3. Selective promotion
4. Dose–Response
When not everyone has access to the intervention at the same time (supply variation): Cluster, Stepped wedge
When the program is available to everyone (universal access or already rolled out): Selective promotion, Dose–Response
Cluster Evaluation Designs
- Unit of analysis is a group (e.g., communities, districts)
- Usually prospective
[Figure: clusters divided into Intervention and Comparison groups]
Cluster Evaluation Designs
Case Study: Progresa/Oportunidades Program
- National anti-poverty program in Mexico
- Eligibility based on poverty index
- Cash transfers conditional on school and health care attendance
- 506 communities
  - 320 randomly allocated to receive the program
  - 186 randomly allocated to serve as controls
- Program evaluated for effects on health and welfare
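The cluster allocation described in this case study can be sketched in a few lines. This is only an illustration of randomizing at the community level, not the procedure the Progresa evaluators actually used; the community IDs, the `allocate_clusters` helper and the seed are hypothetical.

```python
import random

def allocate_clusters(cluster_ids, n_treatment, seed=0):
    """Randomly split cluster IDs into a treatment group and a control group."""
    rng = random.Random(seed)      # fixed seed so the allocation can be reproduced
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)          # every ordering of clusters is equally likely
    treatment = sorted(shuffled[:n_treatment])
    control = sorted(shuffled[n_treatment:])
    return treatment, control

# Hypothetical community IDs 1..506; 320 receive the program, the rest are controls.
communities = range(1, 507)
treatment, control = allocate_clusters(communities, n_treatment=320)
print(len(treatment), "treatment communities;", len(control), "control communities")
```

Because whole communities are allocated, the later analysis should treat the community, not the individual, as the unit that was randomized.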
Stepped Wedge or Phased-In
[Figure: clusters by time period, labeled 1-6, with the program phased in over time]
By time period 4, all the clusters receive the intervention. Stepped wedge designs can be used to overcome practical or ethical objections to withholding an intervention from a comparison group. Also, and perhaps most importantly, they allow a trial to be conducted without delaying roll-out of the intervention.
Brown CA, Lilford RJ. BMC Medical Research Methodology, 2006.
Stepped Wedge or Phased-In
Case Study: Rwanda Pay-for-Performance
- Performance-based health care financing
  - Increase quantity & quality of health services provided
  - Increase health worker motivation
- Financial incentives to providers to see more patients and provide higher quality of care
- Phased rollout at the district level
  - 8 districts randomly allocated to receive the program immediately
  - 8 districts randomly allocated to receive the program later
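A phased rollout can be written down as a schedule showing, for each cluster and time period, whether the program is running. The sketch below assumes six clusters, four time periods, a first period in which no cluster is treated, and clusters spread evenly over the remaining steps; none of these details come from the slides, and `stepped_wedge_schedule` is a hypothetical helper.

```python
import random

def stepped_wedge_schedule(cluster_ids, n_periods, seed=0):
    """Return {cluster: [0/1 per period]} with a randomly ordered, staggered start."""
    rng = random.Random(seed)
    order = list(cluster_ids)
    rng.shuffle(order)                     # random order decides who starts first
    n_steps = n_periods - 1                # assumption: period 1 is an untreated baseline
    schedule = {}
    for position, cluster in enumerate(order):
        # Spread clusters evenly over start periods 2..n_periods.
        start = 2 + (position * n_steps) // len(order)
        schedule[cluster] = [1 if period >= start else 0
                             for period in range(1, n_periods + 1)]
    return schedule

schedule = stepped_wedge_schedule(cluster_ids=[1, 2, 3, 4, 5, 6], n_periods=4)
for cluster in sorted(schedule):
    print("cluster", cluster, schedule[cluster])
```

Each row switches from 0 to 1 once and never back, so clusters that have not yet started serve as the comparison group for those that have.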
Selective Promotion
Common scenarios:
- National program with universal eligibility
- Voluntary enrollment in the program
- Comparing enrolled to not enrolled introduces selection bias
One solution: provide additional promotion, encouragement or incentives to a sub-sample:
- Information
- Encouragement (small gift or prize)
- Transport
Selective Promotion
[Figure: flow from Universal eligibility, to Selectively promote (Promotion vs. No Promotion), to Enrollment]
Selective Promotion
[Figure: Not Encouraged and Encouraged groups, with population types Never Enroll, Enroll if Encouraged, and Always Enroll]
- Not Encouraged: 4% incidence
- Δ Effect (Encouraged vs. Not Encouraged): 0.5%
- POPULATION IMPACT: 2% incidence reduction
Selective Promotion
Necessary conditions:
- Promoted and non-promoted groups are comparable
  - Promotion not correlated with population characteristics
  - Guaranteed by randomization
- Promoted group has higher enrollment in the program
- Promotion does not affect outcomes directly
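When these conditions hold, one common way to turn a promoted-versus-not-promoted comparison into an effect of enrollment itself is to divide the difference in outcomes by the difference in enrollment (often called the Wald estimator). The sketch below is an assumption about how such a calculation could look, not the presenter's stated method. The 4% incidence comes from the slides and the 3.5% figure is inferred from the 0.5-point difference; the 25% and 50% enrollment shares are hypothetical, chosen only so that a 0.5-point difference scales to a 2-point effect, since the slides do not report enrollment.

```python
def wald_estimate(outcome_promoted, outcome_not_promoted,
                  enroll_promoted, enroll_not_promoted):
    """Effect of enrollment among those who enroll only if encouraged."""
    # Difference in outcomes between the two randomized groups.
    outcome_difference = outcome_promoted - outcome_not_promoted
    # Extra enrollment caused by the promotion.
    enrollment_difference = enroll_promoted - enroll_not_promoted
    if enrollment_difference <= 0:
        raise ValueError("promotion must raise enrollment for this estimator to apply")
    return outcome_difference / enrollment_difference

effect = wald_estimate(
    outcome_promoted=0.035,      # inferred: 4% minus the 0.5-point difference
    outcome_not_promoted=0.040,  # from the slides
    enroll_promoted=0.50,        # hypothetical enrollment share
    enroll_not_promoted=0.25,    # hypothetical enrollment share
)
print(f"Estimated change in incidence from enrolling: {effect * 100:.1f} percentage points")
```

The estimate applies to people who enroll only because they were encouraged; it says nothing about those who would always or never enroll.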
Selective Promotion
Case Study: Malawi VCT
- Respondents in rural Malawi were offered a free door-to-door HIV test
- Some were given randomly assigned vouchers worth between zero and three dollars, redeemable upon obtaining their results at a nearby VCT center
Dose–Response Evaluations
- Suitable when a program is already in place everywhere
- Examine differences in exposures (doses) or intensity across program areas
- Compare the impact of the program across varying levels of program intensity
[Figure: hypothetical map of program implementation levels]
Dose–Response Evaluations
Example for MC:
- All clinics in a region offer MC, but their capacity is limited and there are queues
- Some towns are visited by mobile clinics that help the fixed clinic rapidly increase MC coverage
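A dose-response comparison of this kind can be summarized as the slope of the outcome on program intensity. The sketch below fits a straight line by ordinary least squares; the town-level numbers (mobile clinic visits and MC coverage) are invented purely for illustration, and `ols_slope` is a hypothetical helper.

```python
def ols_slope(doses, outcomes):
    """Least-squares slope and intercept of outcomes regressed on doses."""
    n = len(doses)
    mean_x = sum(doses) / n
    mean_y = sum(outcomes) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses, outcomes))
    sxx = sum((x - mean_x) ** 2 for x in doses)
    if sxx == 0:
        raise ValueError("doses must vary across areas to estimate a slope")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data: mobile clinic visits per town and MC coverage (%) in that town.
visits = [0, 1, 2, 3, 4]
coverage = [10, 14, 19, 22, 27]
slope, intercept = ols_slope(visits, coverage)
print(f"Each additional visit is associated with {slope:.1f} more points of coverage")
```

The slope is an association. It can be read as an impact only if program intensity was not targeted at towns that differ in ways that also affect the outcome.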
Design Variations for MC IE
Study Design: Cluster, Stepped wedge, Selective promotion, Dose–Response
Allocation Method: Randomization, Matching, Enrolled vs. not Enrolled
Random Allocation
- Each unit has the same probability of being selected to receive the benefit, or to receive it first
- Helps obtain comparability between those who did and did not receive the intervention
  - On observed and unobserved factors
- Ensures transparency and fairness
Unit of Randomization
Individuals, groups, communities, districts, etc.
Matching
Pick a comparison group that “matches” the treatment group based on similarities in observed characteristics
Matching
[Figure: Region A (Treatment) and Region B (Comparison)]
Matching
- Matching helps control for observable heterogeneity
- Cannot control for factors that are unobserved
- Matching can be done at baseline (more efficient) OR in the analysis
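One simple way to pick a matched comparison group is nearest-neighbor matching on observed characteristics. The sketch below is a greedy version without replacement; it is only one of several matching approaches and is not named in the slides. The unit IDs and the two characteristics per unit are invented, and the characteristics are assumed to be on comparable scales already.

```python
def nearest_neighbor_match(treated, comparison_pool):
    """Greedily pair each treated unit with its closest unused comparison unit."""
    available = dict(comparison_pool)   # copy so the caller's pool is not modified
    matches = {}
    for unit_id, features in treated.items():
        if not available:
            break                       # no comparison units left to match
        # Squared Euclidean distance on the observed characteristics.
        best = min(
            available,
            key=lambda cid: sum((a - b) ** 2 for a, b in zip(features, available[cid])),
        )
        matches[unit_id] = best
        del available[best]             # match without replacement
    return matches

# Invented units: two observed characteristics per unit, both scaled 0 to 1.
region_a = {"A1": (0.30, 0.60), "A2": (0.10, 0.40)}
region_b = {"B1": (0.12, 0.38), "B2": (0.28, 0.65), "B3": (0.80, 0.10)}
matches = nearest_neighbor_match(region_a, region_b)
print(matches)
```

Only the characteristics passed in are balanced by this procedure; anything unobserved can still differ between the matched groups.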
Enrolled versus Not Enrolled
- Consider a school-based pregnancy prevention program
- 10 schools in the district are asked if they would like to participate
Enrolled versus Not Enrolled
- 5 schools elect to participate in the program (Pregnancy Prevention Program)
- 5 schools decline participation (No intervention)
Enrolled versus Not Enrolled
- No intervention: pregnancy rate = 3 per 100 student years
- Pregnancy Prevention Program: pregnancy rate = 2 per 100 student years
Schools in the program had fewer adolescent pregnancies… Can we attribute this difference to the program?
Enrolled versus Not Enrolled
The observed effect might be due to differences in unobservable factors which led to differential selection into the program (“selection bias”).
Enrolled versus Not Enrolled
- This selection method compares “apples to oranges”
- The reason for not enrolling might be correlated with the outcome
  - You can statistically “control” for observed factors
  - But you cannot control for factors that are “unobserved”
- Estimated impact erroneously mixes the effect of different factors
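A small simulation can show how self-selection mixes the program effect with the effect of an unobserved factor. Everything in this sketch is invented: the unobserved "motivation" score, the baseline rates, and a true program effect of -1 pregnancy per 100 student years. Motivated schools are both more likely to enroll and have lower pregnancy rates regardless of the program.

```python
import random
from statistics import mean

def naive_enrolled_vs_not(n_schools=2000, true_effect=-1.0, seed=1):
    """Difference in mean pregnancy rates, enrolled minus not enrolled."""
    rng = random.Random(seed)
    enrolled, not_enrolled = [], []
    for _ in range(n_schools):
        motivation = rng.random()               # unobserved factor, between 0 and 1
        enrolls = rng.random() < motivation     # motivated schools enroll more often
        baseline = 4.0 - 2.0 * motivation       # and have lower rates without any program
        rate = baseline + (true_effect if enrolls else 0.0)
        (enrolled if enrolls else not_enrolled).append(rate)
    return mean(enrolled) - mean(not_enrolled)

true_effect = -1.0
naive = naive_enrolled_vs_not(true_effect=true_effect)
print(f"True effect: {true_effect:.2f}; naive enrolled-vs-not difference: {naive:.2f}")
```

The naive comparison overstates the program's benefit because part of the gap comes from motivation, which the evaluator cannot observe and therefore cannot control for.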
Choosing Your Methods
Two decisions determine the design:
Study Design: Cluster, Stepped wedge, Selective promotion, Dose–Response
Allocation Method: Randomization, Matching, Enrolled vs. not Enrolled
Choosing Your Methods
- Identify the “best” possible design given the context
- Best design = fewest risks for error
- Have we controlled for “everything”?
  - Internal validity
- Is the result valid for “everyone”?
  - External validity
  - Local versus global treatment effect
Consider Randomization First
- Minimizes selection bias
- Balances known and unknown confounders
- Most efficient (smaller Ns)
- Simpler analyses
- Transparency
- Decision makers understand (and believe) the results
Choosing Your Methods
To identify an IE design for your program, consider:
- Prospective/retrospective
- Eligibility rules
- Roll-out plan (pipeline)
- Is the universe of eligibles larger than available resources at a given point in time?
- Who controls implementation?
- Budget and capacity constraints?
- Excess demand for program?
- Eligibility criteria?
- Geographic targeting?
Thank you