Matching Methods & Propensity Scores

Presentation transcript:

Matching Methods & Propensity Scores
Garret Christensen (Taken from Kenny Ajayi)
October 27, 2009
Global Poverty and Impact Evaluation

Program Evaluation Methods
- Randomization (Experiments)
- Quasi-Experiments: Regression Discontinuity; Matching, Propensity Score; Difference-in-Differences

Matching Methods: Creating a counterfactual
To measure the effect of a program, we want to compare outcomes with and without the program for the same individuals,
E[Y | D = 1, X] - E[Y | D = 0, X],
but we only observe one of these outcomes for each individual.
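The reason a counterfactual is needed can be written as a standard decomposition. The potential-outcome notation is an assumption added here and is not used in the slides: Y_1 and Y_0 denote an individual's outcome with and without the program. The observed difference between participants and nonparticipants equals the effect on participants plus a selection-bias term, and matching aims to make that second term zero.

```latex
\underbrace{E[Y_1 \mid D=1, X] - E[Y_0 \mid D=0, X]}_{\text{observed difference}}
  = \underbrace{E[Y_1 - Y_0 \mid D=1, X]}_{\text{effect on participants}}
  + \underbrace{E[Y_0 \mid D=1, X] - E[Y_0 \mid D=0, X]}_{\text{selection bias}}
```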

Evaluation Exercise Argentine Antipoverty Program

Basic Idea
Match each participant (treated) with one or more nonparticipants (untreated) with similar observed characteristics.
Counterfactual = matched comparison group (i.e. nonparticipants with the same characteristics as participants).
(Illustrate with an example.)
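A minimal sketch of this idea, assuming exact matching on a small set of categorical characteristics. The records, the variable names, and the helper `exact_match_gains` are hypothetical and only illustrate the mechanics.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (D = participation flag, X = observed characteristics, Y = outcome).
records = [
    (1, ("rural", "primary"), 12.0),
    (1, ("urban", "secondary"), 20.0),
    (0, ("rural", "primary"), 10.0),
    (0, ("rural", "primary"), 9.0),
    (0, ("urban", "secondary"), 17.0),
    (0, ("urban", "primary"), 11.0),
]

def exact_match_gains(records):
    """For each participant, subtract the mean outcome of nonparticipants with identical X."""
    untreated_by_x = defaultdict(list)
    for d, x, y in records:
        if d == 0:
            untreated_by_x[x].append(y)

    gains = []
    for d, x, y in records:
        # Participants with no identical nonparticipant are left unmatched.
        if d == 1 and x in untreated_by_x:
            gains.append(y - mean(untreated_by_x[x]))
    return gains

gains = exact_match_gains(records)
print(gains)        # gain for each matched participant
print(mean(gains))  # average gain over matched participants
```

With many characteristics, exact cells like these quickly become empty, which is the motivation for matching on a summary measure instead.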

Basic Idea
This assumes that there is no selection bias based on unobserved characteristics, i.e. there is “selection on observables”: participation is independent of outcomes once we control for observable characteristics (X).
What might some of these unobserved characteristics be?

Propensity Score
When the set of observed variables is large, we match participants with nonparticipants using a summary measure, the propensity score: the probability of participating in the program (being treated), as a function of the individual’s observed characteristics.
P(X) = Prob(D = 1 | X)
- D indicates participation in the project
- X is the set of observable characteristics
In practice, matching on individual characteristics is very hard: the entire vector X of observed characteristics could be huge. Instead of attempting to create a match for each participant with exactly the same value of X, we can match on the probability of participation.

Propensity Score
We maintain the assumption of selection on observables: i.e., assume that participation is independent of outcomes conditional on Xi:
E(Y | X, D = 1) = E(Y | X, D = 0) if there had not been a program
This is false if there are unobserved characteristics affecting both participation and outcomes.

Propensity Score Matching
Get representative and comparable data on participants and nonparticipants (ideally using the same survey and a similar time period).

Propensity Score Matching
Estimate the probability of program participation as a function of observable characteristics (using a logit or other discrete choice model).

Jalan and Ravallion (2003)

Propensity Score Matching
Use predicted values from the estimation to generate a propensity score p(xi) for all treatment and comparison group members.
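The estimation and prediction steps can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in any cited study: the data are simulated, the coefficients used to simulate participation are made up, and `fit_logit` and `propensity` are hypothetical helpers that fit a logit by plain gradient ascent using only the standard library. In practice one would normally rely on a statistics package for the logit.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, D, lr=0.5, epochs=500):
    """Fit Prob(D = 1 | X) = sigmoid(b0 + b . x) by gradient ascent on the log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, d in zip(X, D):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            err = d - p  # gradient contribution of this observation
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, x):
    """Predicted probability of participation p(x) for one individual."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

# Simulated data (assumption): two observed characteristics drive participation.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
D = [1 if random.random() < sigmoid(-0.5 + 1.0 * x[0] - 0.5 * x[1]) else 0 for x in X]

beta = fit_logit(X, D)
# One score per individual, participants and nonparticipants alike.
scores = [propensity(beta, x) for x in X]
```

The scores are computed for every treatment and comparison group member, since both sides are needed for the matching step.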

Propensity Score Matching
Match participants: find a sample of nonparticipants with similar p(xi).
Restrict samples to ensure common support.

Common Support
[Figure: density of propensity scores for participants and for non-participants, plotted against the propensity score (horizontal axis, up to 1). Low scores mean a low probability of participating given X; high scores mean a high probability. The range where the two densities overlap is the region of common support.]

Propensity Score Matching
Determine a tolerance limit: how different can matched control individuals or villages be?
Decide on a matching technique: nearest neighbors, nonlinear matching, multiple matches.
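A minimal sketch of these matching choices, assuming a min/max rule for common support, a caliper of 0.05 as the tolerance limit, and single nearest-neighbor matching with replacement. All three are illustrative choices rather than recommendations from the slides; the scores and function names are hypothetical.

```python
def common_support(treated_scores, control_scores):
    """Overlap of the two score ranges (min/max rule)."""
    lo = max(min(treated_scores), min(control_scores))
    hi = min(max(treated_scores), max(control_scores))
    return lo, hi

def nearest_neighbor_match(treated_scores, control_scores, caliper=0.05):
    """Map each treated index to its closest control index, if within the caliper.

    Units outside the common support are dropped; controls may be reused.
    """
    lo, hi = common_support(treated_scores, control_scores)
    candidates = [(j, q) for j, q in enumerate(control_scores) if lo <= q <= hi]
    matches = {}
    for i, p in enumerate(treated_scores):
        if not (lo <= p <= hi) or not candidates:
            continue  # off support, or no usable controls
        j, q = min(candidates, key=lambda c: abs(c[1] - p))
        if abs(q - p) <= caliper:  # tolerance limit
            matches[i] = j
    return matches

# Hypothetical propensity scores.
treated = [0.35, 0.52, 0.62, 0.95]
controls = [0.05, 0.30, 0.38, 0.50, 0.60, 0.70]

support = common_support(treated, controls)
matches = nearest_neighbor_match(treated, controls)
print(support)
print(matches)
```

In this example the participant with score 0.95 has no comparable nonparticipant and is dropped, so the estimate applies only to participants inside the common support.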

Propensity Score Matching
Once matches are made, we can calculate impact by comparing the means of outcomes across participants and their matches.
The difference in outcomes between each participant and its match is the estimate of the gain due to the program for that observation.
Calculate the mean of these individual gains to obtain the average overall gain.
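As a small worked sketch of this calculation, assume a set of matches is already available as a mapping from participant index to matched nonparticipant index. The outcomes, the mapping, and the helper `average_gain` are hypothetical.

```python
from statistics import mean

def average_gain(y_treated, y_control, matches):
    """Return the per-participant gains and their mean over matched participants."""
    gains = [y_treated[i] - y_control[j] for i, j in matches.items()]
    return gains, mean(gains)

# Hypothetical outcomes; participant 3 has no match and does not enter the estimate.
y_treated = [14.0, 18.0, 23.0, 30.0]
y_control = [8.0, 11.0, 12.0, 15.0, 19.0, 22.0]
matches = {0: 2, 1: 3, 2: 4}

gains, overall = average_gain(y_treated, y_control, matches)
print(gains)    # individual gains: 14-12, 18-15, 23-19
print(overall)  # average overall gain
```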

Possible Scenarios
Case 1: Baseline data exists. If we arrive at baseline, we can match participants with nonparticipants using baseline characteristics.
Case 2: No baseline data. If we arrive afterwards, we can only match participants with nonparticipants using time-invariant characteristics.

Extensions
Be cautious of ex-post matching: it risks matching on variables that change due to program participation (i.e. endogenous variables).
What are some invariable characteristics?

Key Factors
Identification Assumption
- Selection on observables: after controlling for observables, treated and control groups are not systematically different
Data Requirements
- Rich data on as many observable characteristics as possible
- Large sample size (so that it is possible to find an appropriate match)

Additional Considerations: Advantages
- Might be possible to do with existing survey data
- Doesn’t require randomization/experiment/baseline data
- Allows estimation of heterogeneous treatment effects, because we have individual counterfactuals instead of just group averages

Additional Considerations: Disadvantages
- Strong (if not heroic) identifying assumption: that there are no unobserved differences. But if individuals are otherwise identical, then why did some participate and others not?
- Requires good quality data: need to match on as many characteristics as possible
- Requires sufficiently large sample size: need a match for each participant in the treatment group

Jalan & Ravallion (2003b): Does piped water reduce diarrhea for children in rural India?

Data
Rural Household Survey
- No baseline data
- Detailed information on: health status of household members; education levels of household members; household income; access to piped water
What would you use for D, Y, and X?

Propensity Score Regression

Matching
Prior to matching, the mean estimated propensity scores for those with and without piped water were, respectively, 0.5495 and 0.1933.
After matching there was negligible difference in the mean propensity scores of the two groups:
- 0.3743 for those with piped water
- 0.3742 for the matched control group
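This kind of before/after comparison of mean propensity scores can be sketched as a simple balance check. The scores and the matched indices below are hypothetical and are not the study's data; `mean_score_gap` is an illustrative helper.

```python
from statistics import mean

def mean_score_gap(treated_scores, control_scores):
    """Difference in mean propensity score between the two groups."""
    return mean(treated_scores) - mean(control_scores)

# Hypothetical scores: matched participants, and the full pool of nonparticipants.
treated = [0.35, 0.52, 0.62]
controls = [0.05, 0.30, 0.38, 0.50, 0.60, 0.70]
matched_controls = [controls[j] for j in (2, 3, 4)]  # assumed matches

gap_before = mean_score_gap(treated, controls)
gap_after = mean_score_gap(treated, matched_controls)
print(round(gap_before, 4), round(gap_after, 4))
```

A gap that shrinks toward zero after matching indicates that the comparison group resembles participants on the score, though it says nothing about unobserved differences.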

Results
“Prevalence and duration of diarrhea among children under five in rural India are significantly lower on average for families with piped water than for observationally identical households without it.”
“However, our results indicate that the health gains largely by-pass children in poor families, particularly when the mother is poorly educated.”

Conclusion
Matching is a useful way to control for OBSERVABLE heterogeneity, especially when randomization or an RD approach is not possible.
However, it requires relatively strong assumptions.