Bilal Siddiqi Istanbul, May 12, 2015 Measuring Impact: Non-Experimental Methods

Motivation

Lesson Number 1: Correlation does not imply causation
–Correlation: two things move together
–Causation: one thing causes the other

Impact evaluation is all about causation! Does the intervention (project/policy) CAUSE (good/bad) impacts on the beneficiaries?

How do we establish causation in an IE? We need to find the counterfactual, so we can compare WHAT HAPPENED with WHAT WOULD HAVE HAPPENED IN THE ABSENCE OF THE INTERVENTION.

Counterfactual criteria
Need a "control group" to compare with our "treatment group"
Treatment and control groups have similar initial characteristics
–on average
–observed and unobserved
The only difference is that one group received the treatment
The only reason observed outcomes are different is due to the treatment
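For readers who like notation, the counterfactual idea can also be written in potential-outcomes terms; this formalism is standard but is not part of the original slides. Here Y(1) is a firm's outcome with the intervention, Y(0) its outcome without it, and T = 1 marks the treated group.

```latex
% Impact on the treated: what happened minus what would have happened.
\text{Impact on the treated} = \mathbb{E}\big[\, Y(1) - Y(0) \mid T = 1 \,\big]
% The counterfactual term E[Y(0) | T = 1] is never observed,
% which is exactly why a comparable control group is needed to stand in for it.
```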

In search of a counterfactual: which tools?
Not a good counterfactual → misleading impact:
–Before-after
–Participants-nonparticipants
Good under some assumptions and limitations:
–Difference-in-differences
–Regression discontinuity design
Causal impact:
–Experiments (randomized controlled trials)

In search of a counterfactual: non-experimental tools
Not a good counterfactual → misleading impact:
–Before-after
–Participants-nonparticipants
Good under some assumptions and limitations:
–Difference-in-differences
–Regression discontinuity design
Causal impact:
–Experiments (randomized controlled trials)

What is counterfactual analysis?
Compare (statistically) identical groups of individuals
–with & without the intervention
–at the same point in time
What can non-experimental methods do?
Compare similar groups
–trying to make them as close to identical as possible

Case study: returns to capital in microenterprises
Problem: small firms are credit constrained
Intervention: one-time increases to the capital stock of $100 and $200
Main outcome: profit rates
Some figures:
–800 firms at the baseline (2007)
–More than 50% of the sampled firms invest less than $200
–300 firms applied for and received financing

How can we evaluate this? Participants-nonparticipants
Idea: compare profit rates of firms that applied for and received credit with those that did not.

Participants-nonparticipants
Participants vs. non-participants: Treated 2.1%, Comparison 0.7%, Difference 1.4 pp (300%)
Problem: selection bias. Why did only 300 firms opt in?
–Better performers anyway (observable)
–Better entrepreneurs, better informed (unobservable)
Parts of this presentation build on material from Impact Evaluation in Practice
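A compact way to see the problem, in the same potential-outcomes notation (a standard decomposition, not shown on the slides): the naive comparison equals the true impact on participants plus a selection-bias term.

```latex
\underbrace{\mathbb{E}[Y \mid T=1] - \mathbb{E}[Y \mid T=0]}_{\text{observed gap: 1.4 pp}}
  = \underbrace{\mathbb{E}[Y(1) - Y(0) \mid T=1]}_{\text{impact on participants}}
  + \underbrace{\mathbb{E}[Y(0) \mid T=1] - \mathbb{E}[Y(0) \mid T=0]}_{\text{selection bias}}
```

If better-performing firms are the ones that opt in, the selection-bias term is positive and the 1.4 pp gap overstates the impact.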

How can we evaluate this? Before-after
Idea: compare real profits of treated firms before and after the subsidized credit policy.

Before-after
Before vs. after: Treated (after) 2.1%, Comparison (before) 1.5%, Difference 0.6 pp
Problem: time difference. Other things may have changed over time, for example:
–An alternative program for untreated firms
–Untreated firms did much worse because they did not use the credit
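In the same notation (again not from the slides), the before-after comparison mixes the impact with whatever else changed for the treated firms between 2007 and 2008.

```latex
\underbrace{\mathbb{E}[Y_{2008} \mid T=1] - \mathbb{E}[Y_{2007} \mid T=1]}_{\text{observed change: 0.6 pp}}
  = \text{impact}
  + \underbrace{\mathbb{E}\big[\, Y_{2008}(0) - Y_{2007}(0) \mid T=1 \,\big]}_{\text{change the firms would have seen anyway}}
```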

Before-after
Compare: the same subjects before and after they receive the intervention.
Problem: other things may have happened over time.
Participants-nonparticipants
Compare: the group of subjects treated (participants) with the group that chooses not to be treated (non-participants).
Problem: selection bias. We do not know why they are participating.
These two tools are wrong for IE: both lead to biased estimates of the counterfactual and of the impact.

Before-after and monitoring
Monitoring tracks indicators over time
–among participants only
It is descriptive before-after analysis
It tells us whether things are moving in the right direction
It does not tell us why things happen or how to make more happen
(Legovini)

Impact evaluation
Tracks average outcomes over time in
–the treatment group relative to
–the control group
Compares
–what DID happen with
–what WOULD HAVE happened (counterfactual)
Identifies a causal effect
–controlling for ALL other time-varying factors
(Legovini)

Non-Experimental Methods
1. Difference-in-differences (Diff-in-Diff)
2. Diff-in-Diff with matching
3. Regression discontinuity design (RDD)

Non-Experimental Methods
1. Difference-in-differences (Diff-in-Diff)
2. Diff-in-Diff with matching
3. Regression discontinuity design (RDD)

How can we evaluate this? Difference-in-differences
Idea: combine the time dimension (before-after) with the participation choice (participants-nonparticipants).
Under some assumptions, this deals with the problems above:
–Time differences: other things that happened over time affect both participants and nonparticipants
–Selection bias: we do not know why they are participating, but if the reason does not change over time…

Before-after (%): Impact = P_2008 − P_2007 = 2.1 − 1.5 = 0.6 pp

Before-after + Participants-nonparticipants = Diff-in-Diff (%):
NP_2008 − NP_2007 = 0.2
Impact = (P_2008 − P_2007) − (NP_2008 − NP_2007) = 0.6 − 0.2 = 0.4 pp

You can use a table instead…
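A minimal sketch of that table in code, using the case-study profit rates; note that the 2007 level for non-participants (0.5%) is backed out from the reported 0.2 pp change, so it is an inference rather than a number shown on the slides.

```python
# Difference-in-differences with the case-study profit rates (in %).
profits = {
    "participants":    {2007: 1.5, 2008: 2.1},
    "nonparticipants": {2007: 0.5, 2008: 0.7},  # 0.5 inferred from the 0.2 pp change
}

change_p  = profits["participants"][2008]    - profits["participants"][2007]     # 0.6 pp
change_np = profits["nonparticipants"][2008] - profits["nonparticipants"][2007]  # 0.2 pp
impact    = change_p - change_np                                                 # 0.4 pp

print(f"Participants change:     {change_p:.1f} pp")
print(f"Non-participants change: {change_np:.1f} pp")
print(f"Diff-in-diff impact:     {impact:.1f} pp")
```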

Assumption of common time-trend: Impact = +0.4 pp
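The common time-trend (parallel trends) assumption can be stated explicitly; this is the standard formulation rather than something written on the slide: without the program, participants' profits would have moved by the same amount as non-participants' profits.

```latex
\mathbb{E}\big[\, Y_{2008}(0) - Y_{2007}(0) \mid T=1 \,\big]
  = \mathbb{E}\big[\, Y_{2008}(0) - Y_{2007}(0) \mid T=0 \,\big]
```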

Conclusion: the program had a positive effect on profits for firms that used the subsidized credit. But is the "common time-trend" assumption plausible?

If we have historical data, we can use it to 'test' the assumption.
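A sketch of what such a 'test' could look like with pre-program data; the 2005 and 2006 profit rates below are purely hypothetical, included only to illustrate the check.

```python
# Compare pre-program trends for participants and non-participants.
# All values before 2007 are hypothetical, for illustration only.
participants    = {2005: 1.1, 2006: 1.3, 2007: 1.5}
nonparticipants = {2005: 0.1, 2006: 0.3, 2007: 0.5}

years = sorted(participants)
for y0, y1 in zip(years, years[1:]):
    gap = (participants[y1] - participants[y0]) - (nonparticipants[y1] - nonparticipants[y0])
    print(f"{y0}->{y1}: trend gap = {gap:+.1f} pp")  # gaps near 0 support common trends
```

Similar pre-program trends do not prove the assumption holds during the program period, but large pre-program gaps would be a clear warning sign.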

Difference-in-Differences
Difference-in-differences combines Participants-nonparticipants with Before-after.
It deals with the problems of the previous methods under the fundamental assumption that trends are the same in treatment and control groups.
–Possible to test this if you have pre-treatment data
–Deals with unobserved characteristics only if they are constant over time
–You can improve diff-in-diff by matching groups on observable characteristics at baseline (propensity score matching)

Non-Experimental Methods 1. Difference-in-differences (Diff-in-Diff ) 2. Diff-in-Diff with matching 3. Regression discontinuity design (RDD) 31

Diff-in-Diff with matching
What is the intuition behind matching techniques?
–The intervention targets firms with characteristics we can observe
–We can use these characteristics to find firms similar to the ones that participated
–These firms could be a good comparison group
–In practice we use an index ("propensity score") of characteristics and compare groups with similar values of the index
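A minimal sketch of the propensity-score idea; the firm characteristics, coefficients, and sample below are hypothetical, and a real analysis would also check common support and covariate balance.

```python
# Propensity-score matching sketch with hypothetical firm data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 800
firm_age    = rng.normal(5, 2, n)    # hypothetical observable characteristic
log_revenue = rng.normal(10, 3, n)   # hypothetical observable characteristic
X = np.column_stack([firm_age, log_revenue])
treated = rng.binomial(1, 1 / (1 + np.exp(-(0.3 * firm_age - 2))), n)  # take-up depends on observables

# 1. Estimate the propensity score: probability of participating given observables.
pscore = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# 2. Match each participant to the non-participant with the closest score.
t_idx = np.where(treated == 1)[0]
c_idx = np.where(treated == 0)[0]
matches = c_idx[np.abs(pscore[c_idx][None, :] - pscore[t_idx][:, None]).argmin(axis=1)]

# 3. The matched pairs (t_idx, matches) define the comparison group;
#    outcomes can then be compared directly or fed into diff-in-diff.
```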

Matching…
Challenge: finding nonparticipants comparable to all participants.
[Figure: distributions of the index (propensity score) for participants and non-participants, with the overlapping region marked as the common support.]

It is a bit complicated in practice! (Source: Caliendo, 2008: 33)

Don't worry, there is an easy way of avoiding all this! (Source: Caliendo, 2008: 33)

Summary of impacts so far
–Participants-nonparticipants: Treated 2.1%, Comparison 0.7%, Difference 1.4 pp
–Before-after: Treated 2.1%, Comparison 1.5%, Difference 0.6 pp
–Difference-in-differences: change for treated 0.6 pp, change for comparison 0.2 pp, Difference 0.4 pp
If the method is weak, this can lead to incorrect impact estimates and wrong policy conclusions.
Participants-nonparticipants and Before-after are not good methods for causal impact.
Difference-in-differences is valid under some (often strong) assumptions.

Non-Experimental Methods
1. Difference-in-differences (DD)
2. DD with matching
3. Regression discontinuity design (RDD)

How can we evaluate this? Regression discontinuity design
Case: subsidies offered on the basis of a credit-constraint score.
–All firms that apply are scored on age, revenue, profitability, number of employees, and access to different sources of credit.
–The score ranges from 0 to 100, where 100 means no credit constraint and 0 means a high credit constraint.
–The program aims to help the neediest firms, so it is targeted at firms with a score <= 50.
Idea: compare profits of firms with scores just below 50 (eligible for subsidized credit) with firms with scores just above 50 (ineligible for subsidized credit).
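A minimal sketch of the RDD comparison; the data below are simulated purely for illustration (the jump is set to 0.25 pp to echo the numbers on the following slides), and real applications typically use local regressions with carefully chosen bandwidths.

```python
# RDD sketch around the score-50 cutoff; data are simulated for illustration only.
import numpy as np

rng = np.random.default_rng(1)
score = rng.uniform(0, 100, 800)                  # credit-constraint score (100 = unconstrained)
eligible = score <= 50                            # program rule: score <= 50 gets the credit
profit = 1.5 + 0.01 * score + 0.25 * eligible + rng.normal(0, 0.3, 800)  # simulated jump: 0.25 pp

cutoff, bandwidth = 50, 10
below = eligible & (score > cutoff - bandwidth)   # eligible firms just below the cutoff
above = ~eligible & (score < cutoff + bandwidth)  # ineligible firms just above the cutoff

# Fit a line on each side and compare the predicted profit rates at the cutoff.
fit_below = np.polyfit(score[below], profit[below], 1)
fit_above = np.polyfit(score[above], profit[above], 1)
late = np.polyval(fit_below, cutoff) - np.polyval(fit_above, cutoff)
print(f"Estimated jump at the cutoff (LATE): {late:.2f} pp")
```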

[Figure: profit rate (1.5%–3%) by credit-constraint score for eligible and non-eligible firms. Source: WB – Human Development Network.]

Regression discontinuity design – post intervention
[Figure: profit rate (1.5%–3%) by credit-constraint score after the intervention, with a jump (the treatment effect) at the eligibility cutoff. Source: WB – Human Development Network.]
RDD identifies the Local Average Treatment Effect (LATE).

Regression discontinuity
Regression Discontinuity Design (RDD): Treated 2.35%, Control 2.10%, Difference 0.25 pp
Important: valid only for subjects close to the cut-off point that defines who is eligible for the program.
–Is this the group you want to know about?
Powerful method if you have:
–A continuous eligibility index
–A clearly defined eligibility cut-off
It gives a causal impact, but with a local interpretation.

Summary of impacts so far
–Participants-nonparticipants: Treated 2.1%, Comparison 0.7%, Difference 1.4 pp
–Before-after: Treated 2.1%, Comparison 1.5%, Difference 0.6 pp
–Difference-in-differences: change for treated 0.6 pp, change for comparison 0.2 pp, Difference 0.4 pp
–Regression Discontinuity Design (RDD): Treated 2.35%, Control 2.10%, Difference 0.25 pp
Weak methods can lead to very misleading results: the RDD (causal) impact is only around half of the impact estimated with the other, weaker methods.
Valid results from IE only if you use rigorous methods.

Hopefully, you are now questioning everything...

Preview: experiments
Other names: randomized controlled trials (RCTs) or randomization
–Assignment to treatment and control is based on chance (like flipping a coin)
–Treatment and control groups will have identical characteristics (balanced) at baseline
–The only difference is that the treatment group receives the intervention and the control group does not
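As a preview, random assignment itself is just a coin flip in code; the sketch below splits a hypothetical list of firm IDs into treatment and control.

```python
# Random assignment sketch: split hypothetical firm IDs 50/50 by chance.
import numpy as np

rng = np.random.default_rng(42)
firm_ids = np.arange(800)                  # hypothetical applicant firms
shuffled = rng.permutation(firm_ids)
treatment, control = shuffled[:400], shuffled[400:]
# With enough firms, the two groups are balanced on observed AND unobserved
# characteristics in expectation -- the key advantage over the methods above.
```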

Experiments: plan for the next session
–Design of experiments
–One treatment and many treatments
–How to implement RCTs
–What to do when experiments are not possible?

Thank you!
facebook.com/ieKnow
#impacteval
blogs.worldbank.org/impactevaluations
microdata.worldbank.org/index.php/catalog/impact_evaluation