1
Measuring Impact: 1. Non-experimental methods, 2. Experiments
Vincenzo Di Maro, Development Impact Evaluation. Some parts of this presentation build on Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B. and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
2
Impact evaluation for causal impacts
3
How can we evaluate this? Regression Discontinuity Design
Case: a pay reward offered on the basis of an exam score. Under a new scheme, all tax inspectors have to take a compulsory written exam. Grades range from 0 to 100, where 0 is the worst outcome and 100 the best. All tax inspectors who achieve a minimum score of 50 will be offered entry into the pay reward scheme. Idea: compare tax inspectors with scores a bit below 50 (and so unable to choose the reward scheme) with inspectors with scores a bit above 50 (and so eligible for the scheme).
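To make the comparison concrete, here is a minimal sketch of how such a sharp RD estimate could be computed, assuming a hypothetical DataFrame df with a score column (the exam grade) and a revenue column (revenue collected by each inspector); the 10-point bandwidth is an illustrative choice, not a recommendation.

```python
# Minimal sketch of a sharp regression discontinuity estimate at the
# score-50 cutoff. `df`, its column names, and the bandwidth are
# assumptions for illustration.
import pandas as pd
import statsmodels.formula.api as smf

def rd_estimate(df: pd.DataFrame, cutoff: float = 50.0, bandwidth: float = 10.0):
    """Local linear RD: fit separate slopes on each side of the cutoff;
    the jump at the cutoff is the treatment effect."""
    local = df[(df["score"] >= cutoff - bandwidth)
               & (df["score"] <= cutoff + bandwidth)].copy()
    local["centered"] = local["score"] - cutoff            # zero at the cutoff
    local["eligible"] = (local["score"] >= cutoff).astype(int)
    # The `eligible` coefficient is the discontinuity in revenue at score 50.
    fit = smf.ols("revenue ~ eligible + centered + eligible:centered",
                  data=local).fit()
    return fit.params["eligible"], fit.bse["eligible"]

# Usage (hypothetical data): effect, se = rd_estimate(df)
```

Narrowing the bandwidth trades bias for variance: observations closer to 50 are more comparable, but there are fewer of them.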
4
Regression Discontinuity
[Figure: at baseline, average tax revenue plotted against exam score (0-100), with the cutoff at 50.]
5
Regression Discontinuity
[Figure: after treatment is offered, average revenue plotted against exam score shows a discontinuity at the score-50 cutoff.]
6
Regression Discontinuity
[Figure: zoom on scores 40-60: inspectors close to the 50 cutoff have very similar characteristics; the discontinuity in revenue at 50 identifies the impact.]
7
Regression Discontinuity
Powerful method if you have:
- A continuous eligibility index
- A clearly defined eligibility cut-off
It gives a causal impact, but with a local interpretation.

Method                     Treated    Control    Difference   in %
Regression Discontinuity   $80,215    $69,753    $10,463      15%

Problem: the impact is valid only for subjects close to the cut-off point, that is, only for tax inspectors with exam scores close to 50. Is this the group you want to know about?
8
Summary of impacts so far
Method                            Treated            Control/Comparison   Difference   in %
Participants - Non-participants   $93,827            $70,800              $23,027      33%
Before - After                    $93,827 (after)    $72,175 (before)     $21,653      30%
Difference-in-differences 1       (P1-NP1) $23,027   (P0-NP0) $3,437      $19,590      29%
Difference-in-differences 2       (P1-P0) $21,652    (NP1-NP0) $2,062     $19,590      29%
Regression Discontinuity (RD)     $80,215            $69,753              $10,463      15%

Weak methods can lead to very misleading results: the RD (causal) impact is only around half of the impact estimated with the other, weaker methods. An IE gives valid results only if you use rigorous methods.
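As a worked example, the sketch below recomputes each estimator from the four cell means, using the slide's P/NP (participants/non-participants) and 0/1 (baseline/follow-up) notation; the non-participant baseline mean NP0 is backed out from the table's (NP1-NP0) = $2,062.

```python
# Worked example: the estimators behind the summary table, recomputed
# from the four cell means (values taken from the table above).
P1, P0 = 93_827, 72_175    # participants: follow-up, baseline
NP1, NP0 = 70_800, 68_738  # non-participants: follow-up, baseline (backed out)

participants_vs_non = P1 - NP1           # 23,027  (33% of NP1)
before_vs_after = P1 - P0                # 21,652  (30% of P0)
did_1 = (P1 - NP1) - (P0 - NP0)          # 19,590
did_2 = (P1 - P0) - (NP1 - NP0)          # 19,590 (same quantity, regrouped)

print(participants_vs_non, before_vs_after, did_1, did_2)
```

Note that the two difference-in-differences formulas are algebraically identical, which is why both rows show the same $19,590.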
9
Experiments. Other names: Randomized Controlled Trials (RCTs) or randomization. Assignment to treatment and control is based on chance; it is random (like flipping a coin). The treatment and control groups will have, on average, the same characteristics (they are balanced) at baseline. The only difference is that the treatment group receives the intervention and the control group does not.
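A minimal sketch of random assignment, using Python for illustration; the 482/218/264 split matches the case presented later in this deck.

```python
# Random assignment by (seeded) coin flip. With enough units, treatment
# and control are balanced on average on all baseline characteristics,
# observed and unobserved.
import random

def random_assignment(ids, n_treated, seed=42):
    """Shuffle the units; the first n_treated form the treatment group."""
    rng = random.Random(seed)   # fixed seed keeps the draw reproducible and auditable
    shuffled = rng.sample(ids, len(ids))
    return shuffled[:n_treated], shuffled[n_treated:]

inspectors = list(range(482))                                  # unit IDs
treatment, control = random_assignment(inspectors, n_treated=218)
assert len(control) == 264
```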
10
Experiments: plan
- Design of experiments
- How to implement RCTs
- One treatment and many treatments
- Encouragement design
11
Random assignment
1. Population
2. Evaluation sample (external validity: does the sample represent the population?)
3. Randomize treatment: Treatment group vs. Comparison group (internal validity: are the groups comparable?)
12
Unit of Randomization: choose according to the type of program.
- Individual/Household
- School/Health clinic/Catchment area/Government agency
- Block/Village/Community
- Ward/District/Region
As a rule of thumb, randomize at the smallest viable unit of implementation.
Keep in mind:
- You need a "sufficiently large" number of units to detect the minimum desired impact (power); see the sketch below.
- Spillovers/contamination
- Operational and survey costs
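A back-of-the-envelope version of the power point above: the sketch computes the minimum detectable effect (MDE) for an individually randomized design using the standard two-sample formula; the outcome standard deviation and group sizes are assumptions to be replaced with your own baseline data.

```python
# Minimum detectable effect (MDE) for two equal-sized groups:
# MDE = (z_{1-alpha/2} + z_{power}) * sqrt(2 * sigma^2 / n).
from scipy.stats import norm

def minimum_detectable_effect(sigma, n_per_group, alpha=0.05, power=0.80):
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z * (2 * sigma**2 / n_per_group) ** 0.5

# Hypothetical inputs: revenue std. dev. of $20,000, 241 units per arm.
print(minimum_detectable_effect(sigma=20_000, n_per_group=241))  # ~ $5,100
```

Randomizing at a higher level (village, tax office) inflates the MDE through the design effect from intra-cluster correlation, which is one more reason to randomize at the smallest viable unit.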
13
Implementation. Pure randomization might not be feasible because some eligible units would be excluded from benefits. Usually, though, there are constraints within project implementation that make randomization possible:
- Budget constraints → lottery. There are not enough treatment slots for all eligible subjects; a lottery is a fair, transparent, and ethical way to assign benefits.
- Limited capacity → randomized phase-in. It is not possible to treat all units in the first phase; randomize which group of units serves as the control (they will be treated at a later stage, say after 1 year). A minimal sketch follows.
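A minimal sketch of a randomized phase-in, assuming hypothetical village-level units and two rollout waves; wave-2 units serve as controls while only wave 1 is treated.

```python
# Randomized phase-in: shuffle the eligible units and split them into
# rollout waves; later waves are the control group until they are treated.
import random

def phase_in(units, n_waves=2, seed=7):
    rng = random.Random(seed)
    shuffled = rng.sample(units, len(units))
    return [shuffled[i::n_waves] for i in range(n_waves)]

villages = [f"village_{i}" for i in range(100)]   # hypothetical units
wave_1, wave_2 = phase_in(villages)               # treat wave_1 now, wave_2 after a year
```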
14
Multiple Treatments: different levels of benefits.
Randomly assign people to different intensities of the treatment (e.g., a 20% vs. a 30% reward).
When there is no evidence on which alternative is best, test variations in treatment:
- Randomly assign subjects to different interventions
- Compare one to another
- Assess complementarities
15
Multiple Treatments: 2x2 design. Assess complementarities.

                             No monetary reward          Monetary reward
No social recognition        Control (neither reward)    Monetary reward only
Social recognition reward    Social recognition only     Both rewards

Crossing the two interventions lets you assess complementarities as well as the overall reward effect; the sketch below shows the corresponding analysis.
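A minimal sketch of how a 2x2 design is typically analyzed, assuming a hypothetical DataFrame with a revenue outcome and 0/1 assignment indicators monetary and social; the interaction coefficient is the complementarity.

```python
# Factorial (2x2) analysis: main effects plus interaction.
import pandas as pd
import statsmodels.formula.api as smf

def factorial_effects(df: pd.DataFrame):
    """Assumes columns: `revenue` (outcome) and 0/1 assignment
    indicators `monetary` and `social` (hypothetical names)."""
    fit = smf.ols("revenue ~ monetary + social + monetary:social", data=df).fit()
    return {
        "monetary_alone": fit.params["monetary"],
        "social_alone": fit.params["social"],
        "complementarity": fit.params["monetary:social"],   # extra effect of both
        "both_rewards": fit.params[["monetary", "social",
                                    "monetary:social"]].sum(),
    }
```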
16
Encouragement design. It is not always possible to randomly assign units to a control group:
- Political and ethical reasons
- Participation is voluntary and everyone is eligible
Solution: randomized promotion/encouragement. The program remains available to everyone, but a random sub-sample receives additional promotion, encouragement, or incentives:
- Additional information
- Incentives (a small gift or prize)
- Transport (bus fare)
17
Encouragement design: randomize the incentive to participate (e.g., small gifts).
- Encouraged group: high participation (e.g., 80%)
- Not encouraged group: low participation (e.g., 10%)
The sketch below shows how these participation rates are used to recover the impact.
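A minimal sketch of the resulting Wald (instrumental-variables) estimate; the participation rates come from the slide, while the outcome means are hypothetical.

```python
# Encouragement design: scale the intention-to-treat (ITT) effect by the
# difference in take-up to recover the effect on those induced to
# participate (the LATE). Outcome means below are assumed for illustration.
y_encouraged, y_not_encouraged = 72_000.0, 69_000.0   # hypothetical revenue means
takeup_encouraged, takeup_not = 0.80, 0.10            # from the slide

itt = y_encouraged - y_not_encouraged          # effect of being encouraged
late = itt / (takeup_encouraged - takeup_not)  # encouragement moved take-up by 70 pp
print(itt, late)   # 3000.0, ~4285.7
```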
18
How can we evaluate this? Randomized Controlled Trials
Case: the pay scheme is offered to a subset of inspectors selected at random. Out of the 482 inspectors, 218 are randomly assigned to the treatment group and the rest (264) to the control group. There is no pre-treatment difference between control and treatment, because the only thing that explains assignment to one of the groups is chance. Comparing the treatment and control groups therefore gives a causal impact: the only difference is that one group receives the treatment and the other does not.
19
Treatment and control group balance
All key variables are balanced at baseline. That is, the differences between control and treatment are statistically indistinguishable from zero before the intervention starts. This happens because of randomization; the sketch below shows a standard way to check it.
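A minimal sketch of a baseline balance check, assuming a hypothetical DataFrame with a 0/1 treated indicator; the covariate names are placeholders.

```python
# Baseline balance check: for each covariate, test whether the
# treatment/control difference could be due to chance (after
# randomization, it should be).
import pandas as pd
from scipy.stats import ttest_ind

def balance_check(df: pd.DataFrame, covariates):
    for var in covariates:
        treated = df.loc[df["treated"] == 1, var]
        control = df.loc[df["treated"] == 0, var]
        _, p = ttest_ind(treated, control, equal_var=False)  # Welch's t-test
        print(f"{var}: diff = {treated.mean() - control.mean():.1f}, p = {p:.2f}")

# Usage (hypothetical column names):
# balance_check(df, ["baseline_revenue", "experience_years", "office_size"])
```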
20
RCT causal impact. The impact can be attributed to the intervention, and the RCT serves as the benchmark against which to assess other methods.

Method   Treated    Control    Difference   in %
RCT      $75,611    $68,738    $6,874       10%

Problems: implementation of experiments; external validity.
21
Summary of impacts so far
Method                            Treated            Control/Comparison   Difference   in %
Participants - Non-participants   $93,827            $70,800              $23,027      33%
Before - After                    $93,827 (after)    $72,175 (before)     $21,653      30%
Difference-in-differences 1       (P1-NP1) $23,027   (P0-NP0) $3,437      $19,590      29%
Difference-in-differences 2       (P1-P0) $21,652    (NP1-NP0) $2,062     $19,590      29%
Regression Discontinuity (RD)     $80,215            $69,753              $10,463      15%
RCT                               $75,611            $68,738              $6,874       10%

Different methods give quite different results. The RCT is the benchmark; the other methods can be vastly wrong, although RD comes close to the RCT.
22
Testing other schemes. Three versions of the performance pay incentive were tested:
- "Revenue": incentives based solely on revenue collected above a benchmark predicted from historical data.
- "Revenue Plus": the revenue incentive with adjustments for whether teams ranked in the top, middle, or bottom third of an independent survey of taxpayers.
- "Flexible Bonus": rewards based both on pre-specified criteria set by the tax department and on subjective end-of-period adjustments (based on managers' assessments of the tax units' overall performance).

Method                     Treatment   Control    Difference   in %
RCT "Revenue Incentive"    $75,611     $68,738    $6,874       10%
RCT "Revenue Plus"         $72,174     $68,738    $3,437       5%
RCT "Flexible Bonus"       $69,425     $68,738    $687         1%
23
Experiments. If experiments are not possible, choose methods that are still valid: RD and Diff-in-Diff rather than Before-After or Participants-Non-participants. And remember that experiments allow richer designs, such as multiple treatments.
24
Thank You!
WEB: http://dime.worldbank.org
facebook.com/ieKnow
#impacteval
blogs.worldbank.org/impactevaluations
microdata.worldbank.org/index.php/catalog/impact_evaluation