Impact evaluations: An introduction

Impact evaluations: An introduction
Dr. Bidisha Barooah, Senior Evaluation Specialist, 3ie
St. Stephen's College, March 27th, 2019

Who we are & what we do
3ie is a member-based international NGO promoting evidence-informed development policies and programmes.
- Grant maker and standard setter for policy-relevant impact evaluations, systematic reviews, evidence gap maps, evidence syntheses and replication studies focussed on low- and middle-income countries
- Convener of forums to build a culture of evaluation, capacity to undertake impact evaluations and reviews, and commitment to evidence-informed decision-making
- Producer of knowledge products for policymakers, programme managers, researchers, civil society, the media and donors

Aims of this lecture
At the end of this lecture, you should know:
- What impact evaluations are
- Why we need them
- Methods of evaluation
- What we need to do 'good' evaluations

What are impact evaluations?
- They help us understand what works, why, how and for whom
- Central to impact evaluations is the concept of 'causal inference' (the Rubin Causal Model)
Definitions:
- Unit: the person, place, or thing upon which a treatment will operate, at a particular time
- Treatment: an intervention whose effect you want to measure
- Outcomes: what you want to measure, e.g. health outcomes, test scores

Causal inference
Suppose you are the unit. Your doctor has asked you to lose weight. Your current weight is 60 kg. You have the option of exercising or not.

Y_(t-1)   Action (u)       Y_t(u)
60        Exercise         55
60        Don't exercise   62

The Fundamental Problem of Causal Inference: we can observe at most one of the potential outcomes for each unit.
Causal effect: for each unit, the comparison of the potential outcome under treatment and the potential outcome under control.
Causal effect on you: Y_t(exercise) - Y_t(don't exercise) = 55 - 62 = -7

Average Treatment Effects
This leads to the need for a counterfactual or control group: for example, your clone who did not exercise, measured at the same point in time.
The average treatment effect is the average effect of an intervention on a sample/population: E[Y(Exercise) - Y(No Exercise)].
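A minimal sketch of this potential-outcomes logic in Python, with made-up weights (the numbers and the "clone" counterfactual are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Hypothetical potential outcomes for each unit: weight if they exercise
# and weight if they do not. In reality only one of the two is ever observed.
y_exercise = rng.normal(55, 3, n)
y_no_exercise = y_exercise + 7 + rng.normal(0, 1, n)

# The (unobservable) individual causal effects and their average:
individual_effects = y_exercise - y_no_exercise
ate = individual_effects.mean()
print(f"Average treatment effect of exercising: {ate:.1f} kg")  # roughly -7
```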

Example
- Calculate the ATE
- Calculate the proportion urban by T=1 and T=0
- Endogeneity

An example
In what ways could districts that started the program first be different? This is the concept of endogeneity: a number of factors go into the choice of program placement, not all of which are observable.
Solution: we need a comparison group which has the same characteristics as those selected for the intervention.
* Fictitious numbers

Question
Calculate the impact for all scenarios. Which would you have the most faith in?

Income per month, before and after MGNREGA:
              Before    After
MGNREGA       150000    170000
No MGNREGA    12000     13000

How do you find a counterfactual?
There are many methods: quantitative, qualitative and mixed methods. We will focus on quantitative methods.

Quasi-Experimental Methods
- Difference in Differences
- Regression Discontinuity Design
- Instrumental Variables
- Propensity Score Matching

Difference-in-differences
First difference: between two time points, for the same group.
Second difference: between the two groups.
Difference-in-differences: across time and groups.

Income per household a year before and a year after program launch:
                      Before    After    Difference
MGNREGA Blocks        5000      7000     2000
Non-MGNREGA Blocks    8000      9000     1000
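Using the fictitious numbers above (and filling the non-MGNREGA "after" value implied by its difference column), the double difference is just arithmetic:

```python
# Fictitious incomes from the table above (per household, per year)
mgnrega_before, mgnrega_after = 5000, 7000
non_mgnrega_before, non_mgnrega_after = 8000, 9000

first_diff_treated = mgnrega_after - mgnrega_before          # 2000
first_diff_control = non_mgnrega_after - non_mgnrega_before  # 1000

did = first_diff_treated - first_diff_control
print(did)  # 1000: the change not explained by the common time trend
```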

But is this sufficient?
[Figure: income per capita over time for MGNREGA and non-MGNREGA blocks, with the year of NREGA launch marked]

A school construction program in Indonesia
Low-literacy areas received the high-intensity program (Duflo 2001).

School meals program in India
School meals were started in urban public schools of Delhi in 2003. Implementation was phased, with 410 schools in the first phase (April 2003) and the rest in phase 2 (October 2003). Sample: 19 schools with individual attendance records for 4 time points.

Average Treatment Effect in a regression equation
$A_{ijt} = \alpha_0 + \alpha_1 \, Sep_t + \alpha_2 \, Treat_j \times Sep_t + \mu_i + \varepsilon_{ijt}$
where $Sep_t$ is a post-period (September) indicator, $Treat_j$ indicates phase-1 schools, and $\mu_i$ is a student fixed effect; $\alpha_2$ is the difference-in-differences estimate.
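A minimal sketch of how such a regression could be estimated on a hypothetical student-level panel; all data, coefficients and column names below are invented for illustration, using statsmodels:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical panel: 200 students observed before (April) and after (September)
# the phase-1 schools start serving meals. 'treat' marks phase-1 schools.
n_students = 200
students = pd.DataFrame({
    "student": np.arange(n_students),
    "treat": rng.integers(0, 2, n_students),
    "ability": rng.normal(0, 1, n_students),   # stands in for the fixed effect mu_i
})

rows = []
for sept in (0, 1):
    attendance = (
        0.70 + 0.02 * sept + 0.05 * students["treat"] * sept
        + 0.03 * students["ability"] + rng.normal(0, 0.02, n_students)
    )
    rows.append(pd.DataFrame({
        "student": students["student"], "treat": students["treat"],
        "sept": sept, "attendance": attendance,
    }))
panel = pd.concat(rows, ignore_index=True)

# The coefficient on treat:sept is the difference-in-differences estimate;
# C(student) plays the role of the individual fixed effects mu_i (which also
# absorb the main effect of treat).
model = smf.ols("attendance ~ sept + treat:sept + C(student)", data=panel).fit()
print(model.params[["sept", "treat:sept"]])
```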

Instrumental variables
Z → X → Y: the instrument Z affects Y only through X.
Regress X on Z, then use the predicted values of X in the regression of Y on X: two-stage least squares (2SLS).
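A minimal two-stage least squares sketch on simulated data (the variables, the confounder u and the instrument strength are all made up to illustrate the mechanics):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 5_000

# Simulated data: u is an unobserved confounder, z is an instrument that
# shifts x but has no direct effect on y.
u = rng.normal(size=n)
z = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 2.0 * x - 3.0 * u + rng.normal(size=n)   # true causal effect of x is 2

# Naive OLS of y on x is biased because x is correlated with u.
ols = sm.OLS(y, sm.add_constant(x)).fit()

# Stage 1: regress x on z, keep the fitted values.
x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues
# Stage 2: regress y on the fitted values of x.
tsls = sm.OLS(y, sm.add_constant(x_hat)).fit()

print("OLS estimate: ", ols.params[1])   # biased away from 2
print("2SLS estimate:", tsls.params[1])  # close to 2
```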

Examples of instruments (Angrist and Krueger 2001)

Regression discontinuity design
There is a programme allocation 'threshold rule' dividing participants and non-participants.

Variable        Threshold rule
Poverty index   Impact of development projects on households below a poverty incidence threshold (e.g. BPL cards)
Age             Impact of subsidies for senior citizens (above 60 years old)
Date            Impact of the introduction of a reform after a certain date
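A minimal sketch of the idea on simulated data: a poverty-index cutoff assigns the programme, and a local linear regression around the cutoff estimates the jump. The cutoff value, bandwidth and data are all hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 4_000

# Hypothetical data: households with a poverty index; those below the
# cutoff of 40 receive the programme (e.g. a BPL card).
index = rng.uniform(0, 100, n)
treated = (index < 40).astype(int)
outcome = 10 + 0.05 * index + 2.0 * treated + rng.normal(0, 1, n)  # true jump = 2

df = pd.DataFrame({"outcome": outcome, "treated": treated,
                   "centred": index - 40})

# Local linear regression within a bandwidth of the cutoff, letting the
# slope differ on either side; the coefficient on 'treated' estimates the
# jump in the outcome at the threshold.
window = df[df["centred"].abs() < 10]
fit = smf.ols("outcome ~ treated + centred + treated:centred", data=window).fit()
print(fit.params["treated"])  # close to 2
```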

Propensity Score Matching
Example: evaluate a program on women's empowerment where women are mobilized into self-help groups. Joining a group is voluntary, so we compare participants to non-participants.
This is not as simple as matching on means: each observation gets a 'score', its probability of being in the program given its observable characteristics.
Prennushi and Gupta (2014)
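A minimal sketch of score estimation plus nearest-neighbour matching on simulated data (the covariates, the self-help-group take-up rule and the effect size are all invented; this only handles selection on observables):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(4)
n = 2_000

# Hypothetical observational data: joining a self-help group (shg) depends on
# age and education, which also affect the outcome, so raw comparisons of
# participants and non-participants are confounded.
age = rng.integers(18, 60, n)
education = rng.integers(0, 13, n)
p_join = 1 / (1 + np.exp(-(-2 + 0.03 * education - 0.01 * (age - 35))))
shg = rng.binomial(1, p_join)
outcome = 5 + 0.2 * education - 0.02 * age + 1.0 * shg + rng.normal(0, 1, n)

df = pd.DataFrame({"age": age, "education": education,
                   "shg": shg, "outcome": outcome})

# Step 1: propensity score = predicted probability of joining, given covariates.
ps_model = LogisticRegression().fit(df[["age", "education"]], df["shg"])
df["pscore"] = ps_model.predict_proba(df[["age", "education"]])[:, 1]

# Step 2: match each participant to the nearest non-participant on the score.
treated = df[df["shg"] == 1]
control = df[df["shg"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

att = treated["outcome"].mean() - matched_control["outcome"].mean()
print(f"Matched estimate of the effect of SHG membership: {att:.2f}")
```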

Bias Reduction

Adequate sample: common support

Impacts

Experimental methods
What is a random sample? Selection into the sample is by chance: each member of the population has an equal probability of being chosen into the sample. As a result, sample statistics are unbiased estimates of population statistics, and the sample mean approaches the population mean as n increases.
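A quick illustration of that last point on a made-up "population" of test scores:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical population: test scores of 1 million students.
population = rng.normal(60, 15, 1_000_000)
print("Population mean:", round(population.mean(), 2))

# Simple random samples of increasing size: the sample mean is unbiased at
# every n and gets closer to the population mean as n grows.
for n in (10, 100, 1_000, 10_000):
    sample = rng.choice(population, size=n, replace=False)
    print(n, round(sample.mean(), 2))
```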

Randomization
One way to create a counterfactual is to randomly assign treatment, giving each unit an equal probability of being treated.

Randomization
- Random assignment theoretically takes care of selection
- Produces groups that are comparable on observables and unobservables
- Easy to interpret (?): ATE = mean outcome (treatment) - mean outcome (control)
Average Treatment Effect in a regression equation: $Y_i = \alpha + \beta \, Treatment_i + \varepsilon_i$
Question: do you need a baseline if you randomize? Why/why not?
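A small simulation of that identity, using an invented treatment effect of 5: under random assignment, the simple difference in means and the regression coefficient beta recover the same (unbiased) ATE.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 10_000

# Hypothetical units: potential outcome under control, plus a true effect of 5.
y0 = rng.normal(50, 10, n)
y1 = y0 + 5

# Random assignment: each unit has the same probability of treatment.
treatment = rng.binomial(1, 0.5, n)
y_observed = np.where(treatment == 1, y1, y0)

df = pd.DataFrame({"y": y_observed, "treatment": treatment})

# Difference in means equals the coefficient beta in Y_i = alpha + beta*T_i + e_i.
diff_means = (df.loc[df.treatment == 1, "y"].mean()
              - df.loc[df.treatment == 0, "y"].mean())
beta = smf.ols("y ~ treatment", data=df).fit().params["treatment"]
print(round(diff_means, 2), round(beta, 2))  # both close to 5
```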

Example Reference: Afridi, Barooah and Somanathan (2018)

Biases in Impact Evaluations
[Figure: treatment group vs control group]

ATTRITION

Biases: Spillover and Contamination
These threaten the validity of IEs. Solutions:
- Design: make the unit of treatment the village rather than some villagers
- Intervention: non-transferable vouchers
- Monitoring

OTHER BIASES
Hawthorne effect: the treatment group modifies its behavior not because of the treatment but because it is being observed.
John Henry effect: the control group changes its behavior.
What to do? Sensitive survey and monitoring systems.

Randomization: other examples
Pipeline approach: most development programs are implemented in phases. Assignment to phases may be randomized after participants are chosen.
Banerjee, Duflo, Glennerster and Kinnan (2015)

Randomization: examples
- Factorial design: all groups get a base treatment
- Lottery: oversubscription to a program
- Encouragement design: low sign-up to a program; encourage participation to increase take-up

Impact Evaluation Essentials
- A theory of change
- Formative work
- Sample size determination
- Monitoring systems

Theory of Change
Lays out how a program is expected to work:
- Which activities, procedures and people have to be in place, and in what sequence: the 'lower reaches of the causal chain'?
- What are the 'upper reaches of the causal chain': the range of outputs, (intended and unintended) outcomes and impacts?
- Which resources are required for implementation, and are they available?
- Which data are required for M&E, and are they available?
- Is the programme feasible or achievable?

THEORY OF CHANGE
Voucher scheme established → students attend private school → students learn better in private schools than they would in public schools → higher test scores
Alternative pathway: students are discriminated against and reduce attendance → lower test scores

THE IMPORTANCE OF TOC
Two main reasons why programmes succeed or fail:
- It works (doesn't work) in theory, under optimal conditions
- It works (doesn't work) in practice, due to implementation fidelity and beneficiary participation

SAMPLE SIZES Formally, impact evaluation tests the null hypothesis H0 : impact = 0 (The hypothesis is that the program does not have an impact) against the alternative hypothesis: Ha : impact ≠ 0 (The alternative hypothesis is that the program has an impact).

HIGH PRECISION

LOW PRECISION

WHAT IS PRECISION?

Remember: Set the correct n
[Figure: outcome distributions for the treatment and control groups, with means 2.7 and 3.7]
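A minimal sketch of how the required n could be computed for a two-arm comparison; the effect size, power and significance level below are illustrative choices, not values from the lecture:

```python
from statsmodels.stats.power import TTestIndPower

# Hypothetical design choices: detect a 0.2 standard-deviation impact with
# 80% power at the 5% significance level, equal-sized treatment and control arms.
analysis = TTestIndPower()
n_per_arm = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05,
                                 ratio=1.0, alternative="two-sided")
print(round(n_per_arm))  # just under 400 units per arm
```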

Peru
The Government of Peru is planning to roll out a national youth scholarship program to incentivize vocational training for persons aged 14-16; all in that age group are eligible. The program will be rolled out in a phased manner, with some areas selected for the first phase and others for the second phase; the gap between the two phases will be 2 years. The government wants to know the impact of this program on youth unemployment. You have been given a fixed budget but can choose how to reallocate money within that amount. You have been asked by the government to help select the phase one and phase two areas as well as to identify the impact. How will you design an IE around this? How many rounds of data will you collect?

India The Government of India has started phased implementation of a public works (employment) program. The program starts in the poorest districts first. All households are eligible for participation but take-up is voluntary. The government has also collected extensive baseline data before the roll-out of the program. The government tracks the number of eligible households that have participated in the program. Around 75% of eligible households have received employment under this program in the districts where it was rolled out first. You want to study the impact on household income due to this program. You have also been given a fixed budget but can choose how to reallocate money within that amount. How will you build an IE design?