Development Impact Evaluation Initiative: Innovations & Solutions in Infrastructure, Agriculture & Environment. Naivasha, April 23-27, 2011. In collaboration with the Africa Region, the SD Network, GAFSP, and AGRA. When Randomization Is Not Possible: Quasi-Experimental Methods

Alternatives to Randomization  Sometimes, randomization is simply not possible  Large infrastructure projects  Politically sensitive projects  In these cases, we can use “quasi-experimental” methods to try to mimic the benefits of randomized assignment

Defining the control group  The point of quasi-experimental methods is to obtain a control group that is almost as good as what would have been obtained by randomization  We still have some form of treatment and control, and generally use the difference-in-differences estimator  We just use different methods to select a “good” control group
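
Not on the original slides, but a minimal sketch of the difference-in-differences estimator in Python may help; all data and variable names here are hypothetical:

import pandas as pd
import statsmodels.formula.api as smf

# Toy panel: two groups (treated/control) observed before and after the program.
df = pd.DataFrame({
    "y":       [10, 12, 11, 16],   # outcome
    "treated": [0,  0,  1,  1],    # 1 = treatment group
    "post":    [0,  1,  0,  1],    # 1 = post-intervention period
})

# The coefficient on treated:post is the DiD estimate:
# (treated post - treated pre) - (control post - control pre).
model = smf.ols("y ~ treated + post + treated:post", data=df).fit()
print(model.params["treated:post"])   # (16 - 11) - (12 - 10) = 3.0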

This session  Three quasi-experimental methods for evaluation  Regression discontinuity design  Propensity score matching  Instrumental variables methods  In each: pros and cons, both practical and methodological  Illustrative examples

Regression Discontinuity Designs  RDD exploits the selection process  It applies when there is an official, clear, and reasonably well-enforced eligibility rule  A simple, quantifiable score  Assignment to treatment is based on this score  A threshold is established ▪ Ex: target firms with sales above a certain amount ▪ Those above receive the program, those below do not ▪ Compare firms just above the threshold to firms just below the threshold

RDD Logic  Assignment to the treatment depends, either completely or partly, on a continuous “score” or ranking (sales in the previous example):  Potential beneficiaries are ordered by their score  There is a cut-off point for “eligibility” – a clearly defined criterion determined ex ante  The cut-off determines assignment to the treatment or no-treatment group  These de facto assignments often result from administrative decisions:  Resource constraints limit coverage  Very targeted interventions with expected heterogeneous impact  Transparent rules used rather than discretion
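
In notation (added for reference, not on the original slide): writing the score as $X$, the cut-off as $c$, and the outcome as $Y$, the sharp RDD estimand is the jump in the expected outcome at the cut-off,

\[
\tau_{\text{RDD}} \;=\; \lim_{x \downarrow c} \mathbb{E}[Y \mid X = x] \;-\; \lim_{x \uparrow c} \mathbb{E}[Y \mid X = x].
\]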

Possible discontinuities  Income/land eligibility for government programs:  People with less than 2 ha of land get subsidized loans; those with more do not  Age eligibility criteria:  Children below 5 years get access to new schools; those above 5 go to old schools  Geography:  People on one side of a border get the program; those on the other side do not

RDD Example: Drinking Age  A country is considering implementing a minimum legal drinking age.  Will this cause:  A decrease in drinking?  A decrease in deaths?  Use US data to explore this question

RDD in Practice  Policy: US drinking age; the minimum legal age is 21  Under 21, alcohol consumption is illegal  Outcomes: alcohol consumption and mortality rate  Observation: The policy implies that  individuals aged 20 years, 11 months and 29 days cannot drink  individuals aged 21 years and 1 day can drink  However, do we think that these individuals are inherently different?  Wisdom, preferences for alcohol and driving, party-going behavior, etc.  People born a few days apart are treated differently because of the arbitrary age cut-off established by the law  A few days' or a month's age difference is unlikely to yield variations in behavior and attitudes towards alcohol  Legal status is the only difference between the treatment group (just above 21) and the comparison group (just below 21)

RDD in Practice  The hypothesis: making alcohol consumption illegal lowers consumption and, therefore, the incidence of drunk driving  Idea: use the following groups to measure the impact of a minimum drinking age on the mortality rate of young adults  Treatment group: individuals aged 20 years and 11 months to 21 years  Comparison group: individuals aged 21 years to 21 years and one month  Around the threshold, we can plausibly assume that individuals are as good as randomly assigned to treatment  We can then measure the causal impact of the policy on mortality rates around the threshold
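
A minimal local-linear RDD sketch in Python, using simulated data in place of the US mortality data (a real analysis would use a dedicated RDD package and data-driven bandwidth selection):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(19, 23, n)                 # running variable
legal = (age >= 21).astype(int)              # treatment switches on at 21
# Simulated outcome with a true jump of 2.0 at the cutoff.
y = 10 + 0.5 * (age - 21) + 2.0 * legal + rng.normal(0, 1, n)

df = pd.DataFrame({"y": y, "dist": age - 21, "legal": legal})
local = df[df["dist"].abs() < 1.0]           # bandwidth: ages 20 to 22 only

# Separate linear trends on each side; the coefficient on `legal`
# is the estimated jump at the cutoff.
fit = smf.ols("y ~ legal + dist + legal:dist", data=local).fit()
print(fit.params["legal"])                   # approx. 2.0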

RDD Example  [Figure: around the age-21 cut-off, the MLDA (treatment) reduces alcohol consumption]

RDD Example  [Figure: higher alcohol consumption increases the death rate around age 21. Panels: total number of deaths; total number of accidental deaths related to alcohol and drug consumption; total number of other deaths]

Conclusion: Causal  Since a jump at exactly 21 years is unlikely to be caused by any factor other than the change in legal status, we conclude it is caused by the drinking-age policy.

RDD: Caveats  Requires a program with clear, well-defined eligibility rules  Requires data on many people just above and below the cutoff (there need to be many observations right around the cutoff!)  The program should be the only source of discontinuity at the cutoff (in general, administrative borders are not great for RDD, since many things change at a border)

Method 2: Matching  Match participants with non-participants on the basis of observable characteristics  Counterfactual: the matched comparison group  Each program participant is paired with one or more similar non-participant(s) based on observable characteristics >> On average, matched participants and non-participants share the same observable characteristics (by construction)  Estimate the effect of the intervention using difference-in-differences

How do we do it?  Design a control group by finding close matches on observable characteristics  Carefully select the variables along which to match participants to their control group  So that we retain only ▪ Treatment group: participants for whom a match could be found ▪ Comparison group: non-participants similar enough to the participants >> We trim out a portion of our treatment group!
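
A minimal propensity-score matching sketch in Python with simulated data (hypothetical variable names; a real application would also check overlap and match quality):

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["treated"] = (df["x1"] + rng.normal(size=n) > 0).astype(int)   # selection on x1
df["y"] = 1.0 * df["treated"] + df["x1"] + rng.normal(size=n)     # true effect = 1.0

# 1. Estimate propensity scores P(treated | X) from observables.
ps = LogisticRegression().fit(df[["x1", "x2"]], df["treated"])
df["pscore"] = ps.predict_proba(df[["x1", "x2"]])[:, 1]

treat = df[df["treated"] == 1]
control = df[df["treated"] == 0]

# 2. Nearest-neighbor match on the propensity score, with replacement.
gaps = np.abs(treat["pscore"].values[:, None] - control["pscore"].values[None, :])
matched = control.iloc[gaps.argmin(axis=1)]

# 3. Mean treated-minus-matched-control difference in outcomes (the ATT).
print((treat["y"].values - matched["y"].values).mean())   # roughly 1.0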

Implications  In most cases, we cannot match everyone  Need to understand who is left out  [Figure: distribution of matching scores against wealth for participants and non-participants; matched individuals lie in the region of overlap, and the portion of the treatment group outside it is trimmed out]

Matching: Caveats  Needs lots of data to create good matches  Even with good data, results are less robust than those of other methods  Need to start with a very large sample to ensure there are enough people with matches  Must be convinced that the comparison units (e.g., control villages) were not excluded from the program for important unobserved reasons

Method 3: Instrumental Variables Methods  Idea: if only part of the allocation of projects to places is random, use only that part to get at the causal impact  An instrumental variable is a variable that helps you isolate just that part of the variation in project placement (a “lever” that moves only the good variation in project placement)
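
In equations (added for reference, assuming a single instrument $Z_i$): the first stage isolates the part of treatment $D_i$ driven by the instrument, and the second stage uses only that part,

\[
D_i = \pi_0 + \pi_1 Z_i + v_i \quad \text{(first stage)}, \qquad
Y_i = \beta_0 + \beta_1 \hat{D}_i + \varepsilon_i \quad \text{(second stage)},
\]

where $\hat{D}_i$ is the fitted value from the first stage and $\beta_1$ is the causal effect of interest.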

Method 3: Instrumental Variables Methods  Example: Dinkelman (2011), rural household electrification and employment in South Africa  At the end of apartheid (1994), Eskom promised to make 500,000 new household connections each year, fully subsidized  Will electricity improve employment prospects?  Project selection criteria  Political reasons: part of the “not-easily-identifiable but good reasons for selecting particular target groups”  Cost reasons: high household density, short distance to the existing grid, flatter land gradient

Method 3: Instrumental Variables Methods  Comparing electrified and unelectrified areas is likely biased, because project areas were not selected randomly  Instead, use variation in cost that affects placement: land gradient  Steeper areas are more costly to electrify, so they are less likely to get electricity  Assumption: land gradient is as good as random and should not affect employment through any channel other than the cost to electrify
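
A minimal two-stage least squares sketch in Python mirroring this setup, with simulated community-level data and hypothetical variable names (real work would use a dedicated IV routine, and the paper controls for additional community characteristics):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 1816                                     # communities, as in the paper
gradient = rng.uniform(0, 30, n)             # land gradient (instrument)
confound = rng.normal(size=n)                # unobserved placement factor
# Steeper land -> less likely to be electrified; the confounder also matters.
elec = (-0.1 * gradient + confound + rng.normal(size=n) > -1.5).astype(float)
# Female employment; true effect of electrification set to 0.095 (9.5 p.p.).
emp = 0.2 + 0.095 * elec + 0.05 * confound + rng.normal(0, 0.1, n)

# First stage: regress electrification on the instrument.
first = sm.OLS(elec, sm.add_constant(gradient)).fit()
elec_hat = first.predict(sm.add_constant(gradient))

# Second stage: regress the outcome on the predicted electrification.
# (Doing 2SLS by hand gives the right point estimate, but not the right
# standard errors.)
second = sm.OLS(emp, sm.add_constant(elec_hat)).fit()
print(second.params[1])                      # approx. 0.095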

Data and Context  Unit of analysis:  Census communities (1996, 2001) for rural KZN  Districts (d): ~ 30, ,000 hh's  Community/village (j): ~ 220 hh's, n = 1,816  Geography: 1996 grid infrastructure, proximity to roads and towns, community land gradient  Electricity: administrative data on whether the community had an Eskom project (20% did)

[Map: sample area and project assignments. Legend: received electricity; no electricity; towns; substations; power lines]

[Map: sample area and land gradient. Flatter gradient shown in light yellow, steeper gradient in brown. Legend: towns; substations; power lines]

Method 3: Instrumental Variables Methods  Assumptions and conditions 1. The IV must predict project allocation ▪ “Strong first stage” ▪ This is testable!! 2. The IV must be unrelated to unobservable factors that affect project allocation and outcomes ▪ This is not testable ▪ Need good contextual knowledge to defend this
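
Condition 1 can be checked directly; continuing the simulated sketch above, the first-stage regression's F-statistic is a standard diagnostic (a common rule of thumb is F > 10 for a single instrument):

# First-stage strength check (continuing the simulated example above).
print(first.fvalue)     # F-statistic of the first stage; want it well above 10
print(first.f_pvalue)   # its p-value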

IV Methods: Main results  A 10% increase in land gradient reduces the probability of electrification by 7.7 p.p.  Electrification raises female employment by 9.5 p.p.; no significant impacts on men  Electrification raises the fraction of households using electric lighting by 63 p.p. and cooking with electricity by 23 p.p., and reduces cooking with wood by 27.5 p.p.

Instrumental Variables: Caveats  To use IV, you need a good instrument, and one is not always available!  It is generally very difficult to find a convincing instrument, so this method works only in certain cases

Recap of Methods: Which Is Best?  Randomization  The “gold standard”: produces the most rigorous results, but may not be technically or politically feasible in all cases  RDD  Produces strong results, but requires a clear, measurable allocation rule  Instrumental Variables  Produces strong results, but requires a good instrument, which may not exist  Matching  Produces results that may be less rigorous, but may be easier to implement than the other methods