Measuring Impact: Impact Evaluation Methods for Policy Makers. Paul Gertler, UC Berkeley. World Bank Institute, Human Development Network, Middle East and North Africa Region. Note: slides by Sebastian Martinez, Christel Vermeersch and Paul Gertler. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank. This version: November 2009.
2 Impact Evaluation components: Logical Framework (how the program works "in theory"); Measuring Impact (identification strategy); Data; Operational Plan; Resources
3 Measuring Impact: 1) Causal Inference: counterfactuals; false counterfactuals: Before & After (pre & post), Enrolled & Not Enrolled (apples & oranges). 2) IE Methods Toolbox: Random Assignment, Random Promotion, Discontinuity Design, Difference in Differences (Diff-in-diff), Matching (P-score matching)
4 Our Objective: Estimate the CAUSAL effect (impact) of intervention P (program or treatment) on outcome Y (indicator, measure of success). Example: what is the effect of a Health Insurance Subsidy Program (P) on Out-of-Pocket Health Expenditures (Y)?
5 Causal Inference: What is the impact of P on Y? Answer: α = (Y | P=1) - (Y | P=0). Can we all go home?
6 Problem of missing data: α = (Y | P=1) - (Y | P=0). For a program beneficiary we observe (Y | P=1), health expenditures (Y) with the health insurance subsidy (P=1), but we do not observe (Y | P=0), health expenditures (Y) without the health insurance subsidy (P=0).
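A minimal numerical sketch of this missing-data problem (simulated data, purely illustrative and not from HISP): in a simulation we can generate both potential outcomes, but in any real dataset only one of them is ever observed for a given unit, so α has to be built from a comparison group rather than from the same person.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    n = 1000
    y_without = rng.normal(20, 5, n)                   # (Y | P=0): outcome without the subsidy
    y_with = y_without - 10 + rng.normal(0, 2, n)      # (Y | P=1): outcome with the subsidy (true effect = -10)
    p = rng.integers(0, 2, n)                          # who actually receives the program
    y_observed = np.where(p == 1, y_with, y_without)   # only ONE potential outcome is observed per unit

    df = pd.DataFrame({"P": p, "Y": y_observed})
    # the individual-level difference is unobservable; we estimate it from group averages instead
    print(df.loc[df.P == 1, "Y"].mean() - df.loc[df.P == 0, "Y"].mean())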
7 Solution: Estimate what would have happened to Y in the absence of P. We call this the COUNTERFACTUAL. The key to a good impact evaluation is a valid counterfactual!
8 Estimating Impact of P on Y: OBSERVE (Y | P=1), the outcome with treatment; ESTIMATE (Y | P=0), the counterfactual. α = (Y | P=1) - (Y | P=0); IMPACT = outcome with treatment - counterfactual. Intention to Treat (ITT): those offered treatment. Treatment on the Treated (TOT): those receiving treatment. Use a comparison or control group to estimate the counterfactual.
9 Example: What is the impact of giving Fulanito additional pocket money (P) on Fulanito's consumption of candies (Y)?
10 The perfect "Clone": Fulanito (with pocket money) eats 6 candies; Fulanito's clone (without) eats 4 candies; Impact = 2 candies.
11 In reality, use statistics: Treatment group average Y = 6 candies; Comparison group average Y = 4 candies; Impact = 6 - 4 = 2 candies.
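A small sketch of the statistical version of the clone comparison, using made-up candy counts: the impact estimate is the difference between the treatment and comparison group averages, and a t-test indicates how precisely that difference is measured.

    import numpy as np
    from scipy import stats

    # hypothetical candy counts for children in each group
    treatment = np.array([6, 7, 5, 6, 8, 6, 5, 7])
    comparison = np.array([4, 5, 3, 4, 5, 4, 3, 4])

    impact = treatment.mean() - comparison.mean()       # difference in group averages
    t_stat, p_value = stats.ttest_ind(treatment, comparison, equal_var=False)
    print(f"estimated impact = {impact:.2f} candies (t = {t_stat:.2f}, p = {p_value:.3f})")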
12 Finding Good Comparison Groups: We want to find "clones" for the Fulanitos in our programs. The treatment and comparison groups should have identical characteristics, except for benefiting from the intervention. In practice, use program eligibility & assignment rules to construct valid counterfactuals. With a good comparison group, the only reason for different outcomes between treatments and controls is the intervention (P).
13 Case Study: HISP. National Health System Reform: closing the gap in access and quality of services between rural and urban areas; large expansion in the supply of health services; reduction of health care costs for the rural poor. Health Insurance Subsidy Program (HISP): pilot program; covers costs for primary health care and drugs; targeted to the poor (eligibility based on a poverty index). Rigorous impact evaluation with rich data: 200 communities, 10,000 households; baseline and follow-up data two years later; many outcomes of interest, including yearly out-of-pocket health expenditures per capita. What is the effect of HISP (P) on health expenditures (Y)? If the impact is a reduction of $9 or more, then scale up nationally.
14 Case Study: HISP, Eligibility and Enrollment. [Figure: Ineligibles (non-poor) vs. Eligibles (poor); among eligibles, Enrolled vs. Not Enrolled.]
15 Measuring Impact: 1) Causal Inference: counterfactuals; false counterfactuals: Before & After (pre & post), Enrolled & Not Enrolled (apples & oranges). 2) IE Methods Toolbox: Random Assignment, Random Promotion, Discontinuity Design, Difference in Differences (Diff-in-diff), Matching (P-score matching)
16 Counterfeit Counterfactual #1: Before & After. [Figure: outcome Y plotted over time, from B at baseline (T=0) to A at endline (T=1), with C as the counterfactual. Impact?]
17 Case 1: Before & After. What is the effect of HISP (P) on health expenditures (Y)? Observe only beneficiaries (P=1), with 2 observations in time: expenditures at T=0 and expenditures at T=1. "Impact" = A - B. [Figure: Y over time, with B observed at T=0 and A at T=1.]
18 Case 1: Before & After. Impact = (Y | P=1) - (Y | P=0): outcome with treatment (After) minus counterfactual (Before), for health expenditures (Y)**. Estimated impact on health expenditures (Y): Linear Regression: -6.59**; Multivariate Linear Regression: -6.65**. Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
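A sketch of how a before-and-after estimate like this could be computed; the data below are simulated and the column names (expenditure, post, head_edu) are illustrative stand-ins, not the actual HISP variables. The coefficient on the post-period dummy is the before-after "impact".

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # hypothetical long-format panel of enrolled households: one row per household per period
    rng = np.random.default_rng(1)
    n = 500
    base = pd.DataFrame({"post": 0,
                         "expenditure": rng.normal(20, 3, n),
                         "head_edu": rng.integers(0, 12, n)})
    follow = base.assign(post=1, expenditure=base.expenditure - 6.6 + rng.normal(0, 2, n))
    panel = pd.concat([base, follow], ignore_index=True)

    # before-after attributes ALL change over time to the program, which is why it can be biased
    simple = smf.ols("expenditure ~ post", data=panel).fit()
    multivariate = smf.ols("expenditure ~ post + head_edu", data=panel).fit()
    print(simple.params["post"], multivariate.params["post"])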
19 Case 1: What's the Problem? Before & after doesn't control for other time-varying factors! Economic boom: the real impact is A - C, so A - B (= -$6.6) is an underestimate. Economic recession: the real impact is A - D, so A - B is an overestimate. [Figure: Y over time from B (T=0) to A (T=1) with α = -$6.6; C and D are possible counterfactuals.]
20 Measuring Impact: 1) Causal Inference: counterfactuals; false counterfactuals: Before & After (pre & post), Enrolled & Not Enrolled (apples & oranges). 2) IE Methods Toolbox: Random Assignment, Random Promotion, Discontinuity Design, Difference in Differences (Diff-in-diff), Matching (P-score matching)
21 False Counterfactual #2: Enrolled & Not Enrolled. If we have post-treatment data, the Enrolled are the treatment group and the Not Enrolled serve as the "control" group (counterfactual): those ineligible to participate, or those who choose NOT to participate. Selection Bias: the reason for not enrolling may be correlated with the outcome (Y). We can control for observables, but not unobservables!! The estimated impact is confounded with other things.
22 Case 2: Enrolled & Not Enrolled. Measure outcomes post-treatment (T=1): among eligibles (poor), Enrolled Y = 7.8 and Not Enrolled Y = 21.8; ineligibles are non-poor. In what ways might enrolled & not enrolled be different, other than their enrollment in the program?
23 Case 2: Enrolled & Not Enrolled. Impact = (Y | P=1) - (Y | P=0): outcome with treatment (Enrolled) minus counterfactual (Not Enrolled), for health expenditures (Y)**. Estimated impact on health expenditures (Y): Linear Regression: -13.9**; Multivariate Linear Regression: -9.4**. Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
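A sketch of why controlling for observables is not enough in an enrolled vs. not-enrolled comparison. The data are simulated so that a partly unobserved characteristic (poverty) drives both enrollment and spending; all names and numbers are illustrative assumptions, not HISP data.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n = 2000
    poverty = rng.normal(0, 1, n)                                   # largely unobserved by the analyst
    enrolled = (poverty + rng.normal(0, 1, n) > 0).astype(int)      # the poor are more likely to enroll
    expenditure = 20 - 3 * poverty - 9 * enrolled + rng.normal(0, 2, n)   # true program effect = -9
    df = pd.DataFrame({"expenditure": expenditure, "enrolled": enrolled,
                       "proxy": poverty + rng.normal(0, 1, n)})     # a noisy observable proxy for poverty

    naive = smf.ols("expenditure ~ enrolled", data=df).fit()
    controlled = smf.ols("expenditure ~ enrolled + proxy", data=df).fit()
    # the naive estimate overstates the true -9; the observable control helps,
    # but cannot remove the bias coming from what remains unobserved
    print(naive.params["enrolled"], controlled.params["enrolled"])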
24 Policy Recommendation? Will you recommend scaling up HISP? Estimated impact on health expenditures (Y): Case 1 (Before and After): Linear Regression -6.59**, Multivariate Linear Regression -6.65**; Case 2 (Enrolled & Not Enrolled): Linear Regression -13.9**, Multivariate Linear Regression -9.4**. Before-After: Are there other time-varying factors that also influence health expenditures? Enrolled-Not Enrolled: Are reasons for enrolling correlated with health expenditures? Selection bias.
25 Keep in mind: two common comparisons to be avoided!! Before & After (pre & post): compare the same individuals before and after they receive P; problem: other things may have happened over time. Enrolled & Not Enrolled (apples & oranges): compare a group of individuals that enrolled in a program with a group that chose not to enroll; problem: selection bias, we don't know why they are not enrolled. Both counterfactuals may lead to biased estimates of the impact.
26 Measuring Impact: 1) Causal Inference: counterfactuals; false counterfactuals: Before & After (pre & post), Enrolled & Not Enrolled (apples & oranges). 2) IE Methods Toolbox: Random Assignment, Random Promotion, Discontinuity Design, Difference in Differences (Diff-in-diff), Matching (P-score matching)
27 Choosing your IE method(s): key information you will need for identifying the right method for your program: Prospective/retrospective evaluation? Eligibility rules and criteria? Poverty targeting? Geographic targeting? Roll-out plan (pipeline)? Is the number of eligible units larger than available resources at a given point in time? Budget and capacity constraints? Excess demand for the program? Etc.
28 Choosing your IE method(s): Best design = best comparison group you can find + least operational risk. Internal validity: have we controlled for "everything"? (a good comparison group). External validity: is the result valid for "everyone"? (local versus global treatment effect; evaluation results apply to the population we're interested in). Choose the "best" possible design given the operational context.
29 Measuring Impact: 1) Causal Inference: counterfactuals; false counterfactuals: Before & After (pre & post), Enrolled & Not Enrolled (apples & oranges). 2) IE Methods Toolbox: Random Assignment, Random Promotion, Discontinuity Design, Difference in Differences (Diff-in-diff), Matching (P-score matching)
30 Randomized Treatments and Controls: When the universe of eligibles > the number of benefits, randomize! A lottery for who is offered benefits is a fair, transparent and ethical way to assign benefits to equally deserving populations. Oversubscription: give each eligible unit the same chance of receiving treatment; compare those offered treatment with those not offered treatment (controls). Randomized phase-in: give each eligible unit the same chance of receiving treatment first, second, third...; compare those offered treatment first with those offered treatment later (controls).
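A sketch of how such a lottery among oversubscribed eligible units could be implemented in practice; the community list is hypothetical, and the 200/100 split simply mirrors the HISP example that follows.

    import numpy as np
    import pandas as pd

    # hypothetical list of eligible communities competing for 100 program slots
    eligible = pd.DataFrame({"community_id": range(200)})

    # lottery: shuffle the eligible units, then offer treatment first to the top half
    shuffled = eligible.sample(frac=1, random_state=42).reset_index(drop=True)
    assigned = shuffled.assign(group=np.where(shuffled.index < 100, "treatment", "control"))
    print(assigned["group"].value_counts())
    # under a randomized phase-in, the "control" half would simply be offered treatment later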
31 Randomized treatments and controls: 1. Universe; 2. Random sample of eligibles (external validity); 3. Randomize treatment vs. control (internal validity). [Figure: diagram distinguishing ineligible and eligible units.]
32 Unit of Randomization: choose according to the type of program: Individual/Household; School/Health Clinic/catchment area; Block/Village/Community; Ward/District/Region. Keep in mind: you need a "sufficiently large" number of units to detect the minimum desired impact (power); spillovers/contamination; operational and survey costs. As a rule of thumb, randomize at the smallest viable unit of implementation.
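A rough sketch of the "sufficiently large number of units" point using a standard power calculation; the effect size, significance level and power below are illustrative assumptions, and randomizing clusters (e.g. communities) rather than individuals would require inflating the answer by the design effect.

    from statsmodels.stats.power import TTestIndPower

    # units per arm needed to detect an effect of 0.2 standard deviations
    # with 80% power at the 5% significance level (illustrative assumptions)
    n_per_arm = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)
    print(round(n_per_arm))   # roughly 394 units per arm under these assumptions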
33 Case 3: Random Assignment. Health Insurance Subsidy Program (HISP). Unit of randomization: community; 200 communities in the sample. Randomized phase-in: 100 treatment communities (5,000 households) started receiving transfers at baseline (T=0); 100 control communities (5,000 households) receive transfers after follow-up (T=1) if the program is scaled up.
34 Case 3: Random Assignment. [Timeline: 100 treatment communities (5,000 HH) and 100 control communities (5,000 HH) compared over the period from T=0 to T=1.]
35 Case 3: Random Assignment. How do we know we have good clones?
36 Case 3: Balance at Baseline. [Table: baseline means for control vs. treatment communities, with t-statistics, for health expenditures ($ yearly per capita), head's age (years), spouse's age (years), head's education (years), and spouse's education (years). ** = significant at 1%.]
37 Case 3: Balance at Baseline (continued). [Table: baseline means for control vs. treatment communities, with t-statistics, for head is female, indigenous, number of household members, bathroom, hectares of land, and distance to hospital (km). ** = significant at 1%.]
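A sketch of how such baseline balance checks could be run; the data and column names below are simulated stand-ins, not the actual HISP baseline.

    import numpy as np
    import pandas as pd
    from scipy import stats

    rng = np.random.default_rng(4)
    df = pd.DataFrame({"treated": rng.integers(0, 2, 5000),
                       "head_age": rng.normal(45, 12, 5000),
                       "head_edu": rng.integers(0, 12, 5000)})

    # with successful randomization, baseline means should be similar across groups
    for var in ["head_age", "head_edu"]:
        t, p = stats.ttest_ind(df.loc[df.treated == 1, var], df.loc[df.treated == 0, var])
        print(f"{var}: t = {t:.2f}, p = {p:.2f}")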
38 Case 3: Random Assignment. Impact = (Y | P=1) - (Y | P=0): treatment group (randomized to treatment) vs. counterfactual (randomized to comparison), for baseline (T=0) and follow-up (T=1) health expenditures (Y)**. Estimated impact on health expenditures (Y): Linear Regression: -10.1**; Multivariate Linear Regression: -10**. Note: If the effect is statistically significant at the 1% significance level, we label the estimated impact with 2 stars (**).
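A sketch of how the randomized impact could be estimated at follow-up; the data are simulated, the column names are illustrative, and standard errors are clustered at the community level because that is the unit of randomization.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(5)
    n_comm, hh_per = 200, 50
    community = np.repeat(np.arange(n_comm), hh_per)
    treat = np.repeat((np.arange(n_comm) < 100).astype(int), hh_per)
    expenditure = (18 - 10 * treat                           # true effect = -10, illustrative
                   + rng.normal(0, 1, n_comm)[community]     # community-level shock
                   + rng.normal(0, 5, n_comm * hh_per))      # household-level noise
    df = pd.DataFrame({"expenditure": expenditure, "treat": treat, "community": community})

    # with random assignment, regressing the follow-up outcome on the treatment dummy
    # estimates the impact; cluster the standard errors at the unit of randomization
    fit = smf.ols("expenditure ~ treat", data=df).fit(
        cov_type="cluster", cov_kwds={"groups": df["community"]})
    print(fit.params["treat"], fit.bse["treat"])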
39 HISP Policy Recommendation? Impact of HISP on health expenditures (Y): Case 1 (Before and After): -6.65** (Multivariate Linear Regression); Case 2 (Enrolled & Not Enrolled): -13.9** (Linear Regression), -9.4** (Multivariate Linear Regression); Case 3 (Random Assignment): -10** (Multivariate Linear Regression). ** = significant at 1%.
40 Keep in mind: Random assignment, with large enough samples, produces two groups (randomized beneficiaries and a randomized comparison) that are statistically equivalent: we have identified the perfect "clone". It is feasible for prospective evaluations with oversubscription/excess demand, and most pilots and new programs fall into this category!
41 Remember: The objective of impact evaluation is to estimate the CAUSAL effect or IMPACT of a program on outcomes of interest. To estimate impact, we need to estimate the counterfactual: what would have happened in the absence of the program. Use comparison or control groups. We have a toolbox with 5 methods to identify good comparison groups. Choose the best evaluation method that is feasible in the program's operational context.
42 THANK YOU!