Non-experimental methods Markus Goldstein The World Bank DECRG & AFTPM

Objective
Find a plausible counterfactual. Every method is associated with an assumption, and the stronger the assumption, the more we need to worry about whether the estimated effect is truly causal.
» Question your assumptions: reality check.

Program to evaluate: the Hopetown HIV/AIDS Program
– Objective: reduce HIV transmission
– Intervention: peer education
– Target group: youth
– Indicator: teen pregnancy rate (a proxy for unprotected sex)

I. Before-after identification strategy (a.k.a. reflexive comparison)
Counterfactual: the rate of pregnancy observed before the program started.
Effect = After minus Before
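
Below is a minimal sketch of this estimator on made-up area-level data (the numbers and variable names are illustrative, not the Hopetown figures): the "effect" is simply the mean outcome after the program minus the mean outcome before.

```python
# Before-after (reflexive) comparison on hypothetical area-level data.
import pandas as pd

# Illustrative teen pregnancy rates (per 1,000) by area and period.
df = pd.DataFrame({
    "area":   [1, 2, 3, 1, 2, 3],
    "period": ["before"] * 3 + ["after"] * 3,
    "rate":   [61.0, 63.5, 64.2, 65.1, 66.8, 67.3],
})

before = df.loc[df["period"] == "before", "rate"].mean()
after = df.loc[df["period"] == "after", "rate"].mean()

# Causal only under the counterfactual assumption of no change over time.
print("Before-after estimate:", round(after - before, 2))
```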

Teen pregnancy rate (per 1,000) in program areas
Before:        62.90
After (2012):  66.37
Difference:    +3.47

The before-after difference equals the effect of the intervention only under the counterfactual assumption of no change over time.
Question: what else might have happened over this period to affect teen pregnancy?

Examine the assumption with prior data
[Table: number of areas and teen pregnancy rate (per 1,000) by year in the pre-program period]
The assumption of no change over time looks a bit shaky.

II. Non-participant identification strategy
Counterfactual: the rate of pregnancy among non-participants.

Teen pregnancy rate (per 1,000) in 2012
Participants:      66.37
Non-participants:  57.50
Difference:        +8.87

Counterfactual assumption: without the intervention, participants would have the same pregnancy rate as non-participants.
Effect = Participants minus Non-participants
Question: how might participants differ from non-participants?
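
A comparable sketch for the participant/non-participant comparison, again with invented data; the difference in means is only an impact estimate if the counterfactual assumption above holds.

```python
# Participant vs. non-participant comparison on hypothetical post-program data.
import pandas as pd

df = pd.DataFrame({
    "participant": [1, 1, 1, 0, 0, 0],
    "rate":        [66.0, 67.2, 65.9, 57.1, 58.0, 57.4],
})

gap = (df.loc[df["participant"] == 1, "rate"].mean()
       - df.loc[df["participant"] == 0, "rate"].mean())

# Causal only if, absent the program, participants would look like non-participants.
print("Participant minus non-participant:", round(gap, 2))
```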

Test the assumption with pre-program data
[Chart: pre-program pregnancy rates for participants vs. non-participants]
REJECT the counterfactual hypothesis of same pregnancy rates.

III. Difference-in-differences identification strategy
Counterfactual:
1. The non-participant rate of pregnancy, purging pre-program differences between participants and non-participants
2. The "before" rate of pregnancy, purging the before-after change for non-participants
1 and 2 are equivalent.

Average rate of teen pregnancy (per 1,000)

                        Before   After (2012)   Difference
Participants (P)         62.90      66.37          +3.47
Non-participants (NP)    70.56      57.50         -13.06
Difference (P - NP)      -7.66      +8.87         +16.53

Participants:      66.37 - 62.90 = +3.47
Non-participants:  57.50 - 70.56 = -13.06
Effect = 3.47 - (-13.06) = 16.53

After:   66.37 - 57.50 = +8.87
Before:  62.90 - 70.56 = -7.66
Effect = 8.87 - (-7.66) = 16.53
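
The same double difference can be computed as the coefficient on a treatment-times-post interaction. The sketch below uses simulated area-level data (all numbers, including the built-in +5 effect, are invented) rather than the Hopetown figures.

```python
# Difference-in-differences on simulated area-by-period data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
areas = pd.DataFrame({"area": range(40), "treat": [1] * 20 + [0] * 20})
df = areas.merge(pd.DataFrame({"post": [0, 1]}), how="cross")

# Treated areas start at a different level (selection) plus a true +5 effect after.
df["rate"] = (60 + 8 * df["treat"] - 3 * df["post"]
              + 5 * df["treat"] * df["post"] + rng.normal(0, 2, len(df)))

# (1) Double difference of the four group means.
m = df.groupby(["treat", "post"])["rate"].mean()
print("2x2 DiD:", (m.loc[(1, 1)] - m.loc[(1, 0)]) - (m.loc[(0, 1)] - m.loc[(0, 0)]))

# (2) Equivalent regression: coefficient on treat:post, SEs clustered by area.
fit = smf.ols("rate ~ treat * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["area"]})
print("Regression DiD:", fit.params["treat:post"])
```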

Counterfactual assumption: without the intervention, participants' and non-participants' pregnancy rates would follow the same trend.

Questioning the assumption: why might participants' trends differ from those of non-participants?

Examine the assumption with pre-program data
[Table: average rate of teen pregnancy over the pre-program years for participants (P), non-participants (NP), and the difference (P - NP)]
The counterfactual hypothesis of same trends doesn't look so believable.

IV. Matching with difference-in-differences identification strategy
Counterfactual: a comparison group constructed by pairing each program participant with a "similar" non-participant from a larger dataset, i.e. creating a control group of non-participants who are similar in observable ways.

Counterfactual assumption: unobserved characteristics do not affect the outcomes of interest.
Unobserved = things we cannot measure (e.g. ability) or things we left out of the dataset.
Question: how might participants differ from matched non-participants?
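
A minimal sketch of matching combined with difference-in-differences, on simulated data: each participant is paired with the nearest non-participant on baseline observables, and the ATT is the gap in before-after changes across matched pairs. The variable names (age, hh_income) and the built-in +3 effect are invented; in practice one would standardize the covariates or match on a propensity score.

```python
# Matching + difference-in-differences on simulated individual-level data.
import numpy as np
import pandas as pd
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "participant": rng.integers(0, 2, n),
    "age":         rng.normal(16, 1.5, n),
    "hh_income":   rng.normal(100, 20, n),
})
# Before-after change in the outcome, with a true participant effect of +3.
df["change"] = -2 + 3 * df["participant"] + rng.normal(0, 1, n)

treated = df[df["participant"] == 1]
controls = df[df["participant"] == 0]

# Pair each participant with the closest non-participant on observables.
nn = NearestNeighbors(n_neighbors=1).fit(controls[["age", "hh_income"]])
_, idx = nn.kneighbors(treated[["age", "hh_income"]])
matched = controls.iloc[idx.ravel()]

# Valid only if no unobservable drives both participation and the change in outcomes.
att = treated["change"].mean() - matched["change"].mean()
print("Matched difference-in-differences ATT:", round(att, 2))
```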

[Chart: before-after change in pregnancy rates for participants vs. matched non-participants; the effect is the gap between the two changes]

We can only test this assumption with experimental data, so apply matching with care and think very hard about unobservables.
Studies that compare both methods (because they have experimental data) find that:
– unobservables often matter!
– the direction of the bias is unpredictable!

V. Regression discontinuity identification strategy
Applicability: when strict quantitative criteria determine eligibility.
Counterfactual: non-participants just below the eligibility cutoff are the comparison for participants just above the eligibility cutoff.

Counterfactual assumption: non-participants just below the eligibility cutoff are the same (in observable and unobservable ways) as participants just above the cutoff.
Question: is the distribution around the cutoff smooth? Then the assumption might be reasonable.
Question: are unobservables likely to be important (e.g. correlated with the cutoff criteria)? Then the assumption might not be reasonable.
However, we can only estimate the impact around the cutoff, not for the whole program.

Example: the effect of school inputs on test scores
– Target a transfer to the poorest schools
– Construct a poverty index from 1 to 100
– Schools with a score <= 50 are in; schools with a score > 50 are out
– Transfer inputs to the poor schools
– Measure outcomes (e.g. test scores) before and after the transfer

[Figure: test scores plotted against the poverty index; poor schools (score <= 50) receive the transfer, non-poor schools (score > 50) do not, and the jump in test scores at the cutoff is the treatment effect]
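
A sketch of how this example might be estimated, using simulated schools with a built-in +4 jump at the cutoff (the bandwidth of 10, the outcome model, and all numbers are invented): keep only schools near the cutoff and fit a local linear regression with separate slopes on each side.

```python
# Sharp regression discontinuity sketch for the school-inputs example.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({"score": rng.uniform(1, 100, n)})
df["treated"] = (df["score"] <= 50).astype(int)     # transfer goes to scores <= 50
# Simulated outcome: smooth in the index, plus a +4 jump from the transfer.
df["test_score"] = 50 + 0.3 * df["score"] + 4 * df["treated"] + rng.normal(0, 3, n)

bandwidth = 10
df["dist"] = df["score"] - 50
local = df[df["dist"].abs() <= bandwidth]

# Local linear regression with separate slopes on each side of the cutoff;
# the 'treated' coefficient is the effect at the cutoff only, not program-wide.
fit = smf.ols("test_score ~ treated + dist + treated:dist", data=local).fit()
print("RD estimate at the cutoff:", round(fit.params["treated"], 2))
```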

Applying RDD in practice: lessons from an HIV-nutrition program
Lesson 1: criteria not applied well
– Multiple criteria: household size, income level, months on ART
– The nutritionist helps her friends fill out the form with the "right" answers
– Now unobservables separate treatment from control…
Lesson 2: watch out for criteria that can be altered (e.g. landholding size)

Summary
– The gold standard is randomization: minimal assumptions needed, intuitive estimates.
– Non-experimental methods require assumptions: can you defend them?

Different assumptions will give you different results
The program: ART treatment for adult patients
Impact of interest: the effect of ART on the children of patients (are there spillover and intergenerational effects of treatment?)
– Child education (attendance)
– Child nutrition
Data: 250 patient households and 500 random-sample households
– Before and after treatment
We can't randomize ART, so what is the counterfactual?

Possible counterfactual candidates
– Random sample, difference-in-differences: are they on the same trajectory?
– Orphans (their parents died, approximating what would have happened in the absence of treatment): but when did the parents die, which orphans do you observe, and which do you not observe?
– Parents who self-report moderate to high risk of HIV: self-report!
– Propensity score matching: unobservables (so why do people get HIV?)

Estimates of treatment effects using alternative comparison groups
[Table of estimates not reproduced in the transcript]
Standard errors are clustered at the household level in each round. The specification includes child fixed effects, a round-2 indicator, and month-of-interview indicators.
Compare to roughly 6.4 if we use the simple difference-in-differences with the random sample.

Estimating the ATT using propensity score matching
– Allows us to define the comparison group using more than one characteristic of the children and their households
– Propensity scores are defined at the household level; the most significant variables are single-headed household and HIV risk
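
A sketch of this approach on simulated household data: a probit of treatment status on household characteristics gives the propensity score, each treated household is matched to the comparison household with the closest score, and the ATT is the mean outcome gap across matched pairs. All variable names (single_headed, hiv_risk, hh_size, the attendance outcome) and coefficients are invented, not the study's data.

```python
# Propensity score matching sketch for the ART example, on simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
n = 750
df = pd.DataFrame({
    "single_headed": rng.integers(0, 2, n),
    "hiv_risk":      rng.uniform(0, 1, n),
    "hh_size":       rng.integers(2, 9, n),
})
latent = -1.5 + 0.8 * df["single_headed"] + 1.2 * df["hiv_risk"]
df["arv_household"] = (latent + rng.normal(0, 1, n) > 0).astype(int)
df["attendance_change"] = 0.05 * df["arv_household"] + rng.normal(0, 0.1, n)

# Probit for the propensity score (probability a household has an ARV recipient).
X = sm.add_constant(df[["single_headed", "hiv_risk", "hh_size"]])
df["pscore"] = sm.Probit(df["arv_household"], X).fit(disp=0).predict(X)

# Nearest-neighbor matching on the propensity score, with replacement.
treated = df[df["arv_household"] == 1]
controls = df[df["arv_household"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(controls[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched = controls.iloc[idx.ravel()]

att = treated["attendance_change"].mean() - matched["attendance_change"].mean()
print("ATT on the change in child attendance:", round(att, 3))
```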

Probit regression results
Dependent variable: the household has an adult ARV recipient

ATT using propensity score matching

Nutritional impacts of ARV treatment
The specification includes child fixed effects, age controls, a round-2 indicator, interviewer fixed effects, and month-of-interview indicators.
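
For concreteness, a hedged sketch of what such a specification could look like in code. The column names (haz for the nutrition outcome, arv_x_round2 for the treatment-times-round-2 term, child_id, interviewer_id, interview_month, household_id) and the household-level clustering are placeholders inferred from the slide text, not the authors' actual code or data.

```python
# Sketch of a child-fixed-effects specification like the one described above.
import statsmodels.formula.api as smf

def fit_nutrition_spec(panel):
    """panel: one row per child per survey round (hypothetical column names)."""
    formula = ("haz ~ arv_x_round2 + age + C(round2) + C(child_id) "
               "+ C(interviewer_id) + C(interview_month)")
    return smf.ols(formula, data=panel).fit(
        cov_type="cluster", cov_kwds={"groups": panel["household_id"]})

# Usage: result = fit_nutrition_spec(panel); print(result.params["arv_x_round2"])
```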

Nutrition with alternative comparison groups
The specification includes child fixed effects, age controls, a round-2 indicator, interviewer fixed effects, and month-of-interview indicators.

Summary: choosing among non-experimental methods
At the end of the day, different methods can give us quite different estimates (or not, in some rare cases).
Which assumption can we live with?

Thank you