#ieGovern Impact Evaluation Workshop, Istanbul, Turkey, January 27-30, 2015
Measuring Impact
1. Non-experimental methods
2. Experiments
Vincenzo Di Maro, Development Impact Evaluation
Impact evaluation for causal impacts
Extract from a conversation between a policy-maker (P) and an IE team (IE):
P: Thanks to our reform of the public service, we achieved a lot in the last 5 years.
IE: That's great. How do you know the results are a consequence of the reforms?
P: Well, we now have more educated and motivated civil servants, and their performance improved.
IE: Your reform is certainly positive. But what about improvements in education, general economic growth, or technology? These all happened at the same time as your reform.
P: That's true. So what should I do if I want to attribute the impact to my reform?
IE: Impact evaluation! Let's start designing an IE.
Evaluation: narrow down causality by identifying a counterfactual and comparing WHAT HAPPENED to WHAT WOULD HAVE HAPPENED.
What is counterfactual analysis?
Compare the same individual – with & without the intervention – at the same point in time: missing data (impossible to observe both).
Compare statistically identical groups of individuals – with & without the intervention – at the same point in time: comparable data.
Counterfactual criteria
Treated & control groups have identical initial average characteristics (observed and unobserved),
so that the only difference between them is the treatment,
and therefore the only reason for the difference in outcomes is the treatment.
In search of a counterfactual: which tools?
Not a good counterfactual (misleading impact): Before–After; Participants–Non-participants
Good under some assumptions and limitations: Difference-in-differences; Regression discontinuity
Causal impact: Experiments – Randomized Control Trials
Tools and link to causal impact (diagram): Experiments/RCTs, Regression Discontinuity, Difference-in-differences, Participants–Non-participants, Before-After, Monitoring – ordered here from closest to furthest from a causal impact.
Case study: Incentives for civil servants
Problem: tax collection is inefficient and riddled with corruption
Intervention: performance-based pay schemes for tax collectors
Main outcome: tax revenue
Some figures:
– 482 tax units
– Reward is 30% of tax revenues collected
Case study based on the work of Adnan Q. Khan, Asim I. Khwaja, and Benjamin A. Olken
How can we evaluate this? Participants – Non-participants
Case: the pay reward is voluntary. The performance-pay incentive was offered to all 482 tax units. Each unit could decide to be paid under the incentivized scheme (opt in) or decline it and continue to receive the salary as before (opt out).
Idea: compare revenue of tax units that opted in with those that opted out.
Participants – Non-participants

Method                               Treated     Control/Comparison    Difference    in %
Participants vs. Non-participants    $93,827     $70,800               $23,027       33%

Problem: Selection bias. Why did some tax inspectors opt in?
– Better performers anyway (observable)
– Stronger motivation (unobservable)

Parts of this presentation build on material from Impact Evaluation in Practice, www.worldbank.org/ieinpractice
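To see why selection bias matters, here is a minimal simulation sketch in Python (hypothetical numbers; only the 482 tax units and the opt-in design come from the case study). Units with higher underlying revenue potential are more likely to opt in, so the naive participants-vs-non-participants gap overstates the true effect.

import numpy as np

rng = np.random.default_rng(0)
n_units = 482                                        # tax units, as in the case study

# Hypothetical data-generating process, for illustration only
potential = rng.normal(70_000, 10_000, n_units)      # latent revenue potential of each unit
true_effect = 10_000                                 # assumed true impact of the incentive

# Self-selection: better performers are more likely to opt in
p_opt_in = 1 / (1 + np.exp(-(potential - 70_000) / 5_000))
opted_in = rng.random(n_units) < p_opt_in

revenue = potential + true_effect * opted_in + rng.normal(0, 5_000, n_units)

naive_gap = revenue[opted_in].mean() - revenue[~opted_in].mean()
print(f"Assumed true effect: {true_effect:,.0f}")
print(f"Naive participants - non-participants gap: {naive_gap:,.0f}")

With this set-up the naive gap comes out well above the assumed 10,000 effect, simply because opt-in units were stronger performers to begin with.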
How can we evaluate this? Before – After
Idea: compare revenue of tax units that opted in (treated) after the reward scheme started with…
…the same tax units (control) before the incentive scheme started.
That is: revenue for participants before and after the new pay scheme.
Before – After

Method          Treated (After)    Control/Comparison (Before)    Difference    in %
Before-After    $93,827            $72,175                        $21,652       30%

Problem: Time difference. Other things may have happened between 2010 and 2012.
– Training program for civil servants
– Central management of the tax department got better
Before – After
Compare: the same subjects before and after they receive an intervention.
Problem: other things may have happened over time.

Participants – Non-participants
Compare: the group of subjects treated (participants) with the group that chooses not to be treated (non-participants).
Problem: selection bias. We do not know why some are not participating.

These 2 tools are wrong for IE: both comparisons may lead to biased estimates of the counterfactual, and so of the impact.
Before-After and Monitoring
Monitoring tracks indicators over time
– among participants
It is descriptive before-after analysis.
It tells us whether things are moving in the right direction.
It does not tell us why things happen or how to make more happen.
(Legovini)
Impact Evaluation
Tracks mean outcomes over time in
– the treatment group relative to
– the control group
Compares
– what DID happen with
– what WOULD HAVE happened (counterfactual)
Identifies the cause-effect link
– controlling for ALL other time-varying factors
(Legovini)
How can we evaluate this? Difference-in-Differences
Idea: combine the time dimension (before-after) with the participation choice (participants–non-participants).
Under some assumptions, this deals with the problems above:
– Time differences: other things may have happened over time.
– Selection bias: we do not know why some are not participating.
Difference-in-Differences (chart of revenues over time): participants rise from P0 = $72,175 at T=0 (2010) to P1 = $93,827 at T=1 (2012); non-participants rise from NP0 = $68,738 to NP1 = $70,800 over the same period; estimated impact = $19,590.
Difference-in-Differences
Before-After = (P1 - P0) = wrong impact
Participants – Non-participants = (P1 - NP1) = wrong impact
Difference-in-differences = (P1 - NP1) - (P0 - NP0), or equivalently (P1 - P0) - (NP1 - NP0)
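A minimal sketch in Python, using the four average revenues from the chart above, shows that the two ways of writing the difference-in-differences give the same number:

# Average revenues from the case study chart
P0, P1 = 72_175, 93_827      # participants: before (2010) and after (2012)
NP0, NP1 = 68_738, 70_800    # non-participants: before and after

before_after = P1 - P0                    # 21,652 - ignores what changed over time
participants_vs_non = P1 - NP1            # 23,027 - ignores who chose to opt in
did_1 = (P1 - NP1) - (P0 - NP0)           # 19,590
did_2 = (P1 - P0) - (NP1 - NP0)           # 19,590 - same result, rearranged
print(before_after, participants_vs_non, did_1, did_2)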
Assumption behind Diff-in-Diff (same revenues chart as above, with the counterfactual trend drawn in): treatment and control follow the same trend; estimated impact = $19,590.
Difference-in-Differences
Difference-in-differences combines Participants–Non-participants with Before-After.
It deals with the problems of the previous methods under one fundamental assumption: trends (slopes) are the same in treatment and control groups.
– Possible to test this if you have pre-treatment data (a sketch follows this slide).
– It deals with unobservables only if they are constant over time.
– You can improve diff-in-diff by matching groups on observable characteristics (propensity score matching).
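One way to probe the equal-trends assumption, when at least two pre-treatment periods are available, is a placebo diff-in-diff estimated entirely on pre-period data; it should be close to zero. The sketch below assumes a hypothetical panel file tax_units_panel.csv with columns unit, year, revenue, and participant, and illustrative pre-treatment years 2008 and 2010; none of these names come from the slides.

import pandas as pd

# Hypothetical panel: one row per tax unit per year
df = pd.read_csv("tax_units_panel.csv")          # assumed file and columns, for illustration

pre = df[df["year"].isin([2008, 2010])]          # two pre-treatment years (illustrative)

means = pre.groupby(["participant", "year"])["revenue"].mean().unstack("year")
# Placebo diff-in-diff on pre-period data: should be near zero under equal trends
placebo_did = (means.loc[1, 2010] - means.loc[1, 2008]) \
            - (means.loc[0, 2010] - means.loc[0, 2008])
print(f"Placebo (pre-period) diff-in-diff: {placebo_did:,.0f}")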
Summary of impacts so far

Method                               Treated              Control/Comparison    Difference    in %
Participants vs. Non-participants    $93,827              $70,800               $23,027       33%
Before-After                         $93,827              $72,175               $21,652       30%
Difference-in-differences 1          (P1-NP1) $23,027     (P0-NP0) $3,437       $19,590       29%
Difference-in-differences 2          (P1-P0) $21,652      (NP1-NP0) $2,062      $19,590       29%

If the method is weak, it can lead to a wrong impact estimate and so to wrong policy conclusions.
Participants–Non-participants and Before-After are not good methods for causal impact.
Difference-in-differences is valid under some (often strong) assumptions.
How can we evaluate this? Regression Discontinuity Design
Case: the pay reward is offered on the basis of an exam score. Under a new scheme, all tax inspectors have to take a compulsory written exam. Grades range from 0 to 100, where 0 is the worst outcome and 100 the best. All tax inspectors who achieve a minimum score of 50 are offered entry into the pay-reward scheme.
Idea: compare tax inspectors with a score a bit below 50 (and so unable to choose the reward scheme)…
…with inspectors with a score a bit above 50 (and so eligible for the scheme).
Regression Discontinuity (chart): average tax revenue at baseline plotted against exam score (0-100), with the eligibility cutoff at 50.
Regression Discontinuity (chart): revenues against exam score after the treatment is offered, showing a discontinuity at the 50-point cutoff.
Regression Discontinuity (chart): inspectors with scores close to 50 (roughly between 40 and 60) have very similar characteristics, so the discontinuity at the cutoff identifies the impact.
Regression Discontinuity

Method                      Treated    Control    Difference    in %
Regression Discontinuity    $80,215    $69,753    $10,463       15%

Problem: the impact is valid only for subjects close to the cut-off point, that is, only for tax inspectors with an exam score close to 50. Is this the group you want to know about?
Powerful method if you have:
– a continuous eligibility index
– a clearly defined eligibility cut-off
It gives a causal impact, but with a local interpretation.
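As a rough illustration of the comparison described on this slide, a local mean comparison within the 40-60 score window could be coded as in the Python sketch below. The file name and column names are hypothetical; the cutoff of 50 and the 40-60 window come from the slides, and a real analysis would fit local regressions on each side of the cutoff rather than compare raw means.

import pandas as pd

df = pd.read_csv("tax_inspectors.csv")            # assumed columns: score, revenue

cutoff, bandwidth = 50, 10                        # 40-60 window around the cutoff
window = df[(df["score"] >= cutoff - bandwidth) & (df["score"] <= cutoff + bandwidth)]

eligible = window[window["score"] >= cutoff]      # offered the pay-reward scheme
not_eligible = window[window["score"] < cutoff]   # just below the cutoff, not eligible

rd_estimate = eligible["revenue"].mean() - not_eligible["revenue"].mean()
print(f"RD estimate, local to scores near {cutoff}: {rd_estimate:,.0f}")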
Summary of impacts so far

Method                               Treated              Control/Comparison    Difference    in %
Participants vs. Non-participants    $93,827              $70,800               $23,027       33%
Before-After                         $93,827              $72,175               $21,652       30%
Difference-in-differences 1          (P1-NP1) $23,027     (P0-NP0) $3,437       $19,590       29%
Difference-in-differences 2          (P1-P0) $21,652      (NP1-NP0) $2,062      $19,590       29%
Regression Discontinuity (RD)        $80,215              $69,753               $10,463       15%

Weak methods can lead to very misleading results: the RD (causal) impact is only around half of the impact estimated with the other, weaker methods.
Valid results from IE come only from rigorous methods.
Experiments
Other names: Randomized Control Trials (RCTs) or randomization.
Assignment to treatment and control is based on chance; it is random (like flipping a coin).
Treatment and control groups will have, on average, the same characteristics (balanced) at baseline.
The only difference is that the treatment group receives the intervention and the control group does not.
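A minimal sketch of what random assignment and a baseline balance check might look like for the 482 tax units (Python; the baseline_revenue column is hypothetical, used only to illustrate that randomization balances baseline characteristics on average):

import numpy as np
import pandas as pd

rng = np.random.default_rng(2015)

units = pd.DataFrame({
    "unit_id": np.arange(482),
    "baseline_revenue": rng.normal(70_000, 10_000, 482),   # hypothetical baseline outcome
})

# Random assignment: like flipping a coin for each tax unit (here, exactly half treated)
units["treatment"] = rng.permutation([1] * 241 + [0] * 241)

# Balance check: treatment and control baseline means should be similar
print(units.groupby("treatment")["baseline_revenue"].mean())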
Experiments: plan for tomorrow
– Design of experiments
– One treatment and many treatments
– How to implement RCTs
– What to do when experiments are not possible?
#ieGovern Impact Evaluation Workshop, Istanbul, Turkey, January 27-30, 2015
Thank You!
Web: facebook.com/ieKnow | #impacteval | blogs.worldbank.org/impactevaluations | microdata.worldbank.org/index.php/catalog/impact_evaluation | http://dime.worldbank.org