Using Randomized Evaluations to Improve Policy

Slides:



Advertisements
Similar presentations
AFRICA IMPACT EVALUATION INITIATIVE, AFTRL Africa Program for Education Impact Evaluation Muna Meky Impact Evaluation Cluster, AFTRL Slides by Paul J.
Advertisements

The World Bank Human Development Network Spanish Impact Evaluation Fund.
The World Bank Human Development Network Spanish Impact Evaluation Fund.
The World Bank Human Development Network Spanish Impact Evaluation Fund.
Choosing the level of randomization
Knowing if the RBF mechanism is working Incorporating Rigorous Impact Evaluation into your HRBF program Sebastian Martinez World Bank.
How to randomize? povertyactionlab.org. Quick Review: Why Randomize Choosing the Sample Frame Choosing the Unit of Randomization Options How to Choose.
Presented by Malte Lierl (Yale University).  How do we measure program impact when random assignment is not possible ?  e.g. universal take-up  non-excludable.
Public Policy & Evidence: How to discriminate, interpret and communicate scientific research to better inform society. Rachel Glennerster Executive Director.
EXPERIMENTS AND OBSERVATIONAL STUDIES Chance Hofmann and Nick Quigley
SAMPLING AND STATISTICAL POWER Erich Battistin Kinnon Scott Erich Battistin Kinnon Scott University of Padua DECRG, World Bank University of Padua DECRG,
Cross-Country Workshop for Impact Evaluations in Agriculture and Community Driven Development Addis Ababa, April 13-16, 2009 AIM-CDD Using Randomized Evaluations.
1 Randomization in Practice. Unit of randomization Randomizing at the individual level Randomizing at the group level –School –Community / village –Health.
Experiments and Observational Studies. Observational Studies In an observational study, researchers don’t assign choices; they simply observe them. look.
AADAPT Workshop Latin America Brasilia, November 16-20, 2009 Non-Experimental Methods Florence Kondylis.
Measuring Impact: Experiments
AADAPT Workshop South Asia Goa, December 17-21, 2009 Nandini Krishnan 1.
 Is there a comparison? ◦ Are the groups really comparable?  Are the differences being reported real? ◦ Are they worth reporting? ◦ How much confidence.
Part III Gathering Data.
Why Use Randomized Evaluation? Presentation by Shawn Cole, Harvard Business School and J-PAL Presenter: Felipe Barrera-Osorio, World Bank 1 APEIE Workshop.
Africa Impact Evaluation Program on AIDS (AIM-AIDS) Cape Town, South Africa March 8 – 13, Causal Inference Nandini Krishnan Africa Impact Evaluation.
Impact Evaluation Designs for Male Circumcision Sandi McCoy University of California, Berkeley Male Circumcision Evaluation Workshop and Operations Meeting.
Nigeria Impact Evaluation Community of Practice Abuja, Nigeria, April 2, 2014 Measuring Program Impacts Through Randomization David Evans (World Bank)
Why Use Randomized Evaluation? Isabel Beltran, World Bank.
Applying impact evaluation tools A hypothetical fertilizer project.
Impact Evaluation for Evidence-Based Policy Making
What is randomization and how does it solve the causality problem? 2.3.
Measuring Impact 1 Non-experimental methods 2 Experiments
Africa Impact Evaluation Program on AIDS (AIM-AIDS) Cape Town, South Africa March 8 – 13, Steps in Implementing an Impact Evaluation Nandini Krishnan.
CHOOSING THE LEVEL OF RANDOMIZATION. Unit of Randomization: Individual?
Africa Program for Education Impact Evaluation Dakar, Senegal December 15-19, 2008 Experimental Methods Muna Meky Economist Africa Impact Evaluation Initiative.
Cross-Country Workshop for Impact Evaluations in Agriculture and Community Driven Development Addis Ababa, April 13-16, 2009 Steps in Implementing an Impact.
Impact Evaluation Using Impact Evaluation for Results Based Policy Making Arianna Legovini Impact Evaluation Cluster, AFTRL Slides by Paul J. Gertler &
Bilal Siddiqi Istanbul, May 12, 2015 Measuring Impact: Non-Experimental Methods.
Social Experimentation & Randomized Evaluations Hélène Giacobino Director J-PAL Europe DG EMPLOI, Brussells,Nov 2011 World Bank Bratislawa December 2011.
Global Workshop on Development Impact Evaluation in Finance and Private Sector Rio de Janeiro, June 6-10, 2011 Using Randomized Evaluations to Improve.
Africa Impact Evaluation Program on AIDS (AIM-AIDS) Cape Town, South Africa March 8 – 13, Randomization.
Impact Evaluation for Evidence-Based Policy Making Arianna Legovini Lead Specialist Africa Impact Evaluation Initiative.
Common Pitfalls in Randomized Evaluations Jenny C. Aker Tufts University.
Cross-Country Workshop for Impact Evaluations in Agriculture and Community Driven Development Addis Ababa, April 13-16, Causal Inference Nandini.
Impact Evaluation Impact Evaluation for Evidence-Based Policy Making Arianna Legovini Lead, Africa Impact Evaluation Initiative AFTRL.
Copyright © 2009 Pearson Education, Inc. Chapter 13 Experiments and Observational Studies.
AC 1.2 present the survey methodology and sampling frame used
Why Use Randomized Evaluation?
Experiments vs. Observational Studies vs. Surveys and Simulations
Sampling and Experimentation
Measuring Results and Impact Evaluation: From Promises into Evidence
Quasi Experimental Methods I
An introduction to Impact Evaluation
CHAPTER 4 Designing Studies
Explanation of slide: Logos, to show while the audience arrive.
Chapter 13- Experiments and Observational Studies
Experiments and Observational Studies
Quasi-Experimental Methods
Using Randomized Evaluations to Improve Policy
Implementation Issues Program roll-out
Experiments and Observational Studies
CHAPTER 4 Designing Studies
Development Impact Evaluation in Finance and Private Sector
Impact Evaluation Toolbox
Implementation Challenges
Randomization This presentation draws on previous presentations by Muna Meky, Arianna Legovini, Jed Friedman, David Evans and Sebastian Martinez.
Observational Studies
Randomization This presentation draws on previous presentations by Muna Meky, Arianna Legovini, Jed Friedman, David Evans and Sebastian Martinez.
Impact Evaluation Designs for Male Circumcision
SAMPLING AND STATISTICAL POWER
The objective of this lecture is to know the role of random error (chance) in factor-outcome relation and the types of systematic errors (Bias)
Steps in Implementing an Impact Evaluation
Sample Sizes for IE Power Calculations.
Steps in Implementing an Impact Evaluation
Presentation transcript:

Using Randomized Evaluations to Improve Policy Rachel Glennerster This presentation draws on previous presentations by Muna Meky, Arianna Legovini, Jed Friedman, David Evans and Sebastian Martinez

Objective To separate the impact of the program from other factors—what is the causal effect of a program? Need to answer the counterfactual: what would have happened without the program Comparison group—who, except for the program, are just like those who got the program

Illustration: children learning (+) Program impact (+) Effect of other factors 3

Illustration: children learning FALSE (+) Program Impact 4

Correlation is not causation Knowledgeable about politics = More likely to vote Interested In politics 1) knowledgeable vote OR knowledgeable Interested In politics 2) No effect from either paper on political knowledge Receiving either paper led to more support for the democratic candidate in the gubernatorial election But small sample meant that they could not pick up modest effects of different papers. However, know there was not a large effect vote

Motivation Hard to distinguish causation from correlation from statistical analysis of existing data However complex the statistics can only see that X goes with Y Difficult to correct for unobserved characteristics, like motivation Motivation may be one of the most important things to correct for Selection bias a major issue for impact evaluation Projects started at specific times and places for particular reasons Participants may select into programs First farmers to adopt a new technology are likely to be very different from the average farmer, looking at their yields will give you a misleading impression of the benefits of a new technology

Motivation Retrospective impact evaluation: Collecting data after the event you don’t know how participants and nonparticipants compared before the program started Have to try and disentangle why the project was implemented where and when it was, after the event Prospective evaluation allows you to design the evaluation to answer the question you need to answer Allows you to collect the data you will need 7

Experimental design All those in the study have the same chance of being in the treatment or comparison group By design treatment and comparison have the same characteristics (observed and unobserved), on average Only difference is treatment With large sample, all characteristics average out Unbiased impact estimates

Options for randomization Lottery (only some get the program) Random phase-in (everyone gets it eventually) Variation in treatment (full coverage, different options) Encouragement design (when partial take up) Everyone can get it, some encouraged to get it Across treatments that we think ex ante are equally likely to work 9

Examples in agriculture Lottery Lottery to receive information of a new ag. technology Random phase-in (everyone gets it eventually) Train some farmers groups each year Variation in treatment Some get information on new seed, others get credit, others get demonstration plot on their land etc Encouragement design One farmers support center per district Some farmers get travel voucher to attend the center Across treatments that we think ex ante are equally likely to work 10 10 10

Lottery among the qualified Must get the program whatever Randomize who gets the program Not suitable for the program

Opportunities for randomization Budget constraint prevents full coverage Random assignment (lottery) is fair and transparent Limited implementation capacity Phase-in gives all the same chance to go first No evidence on which alternative is best Random assignment to alternatives with equal ex ante chance of success 12

Opportunities for randomization, cont. Take up of existing program is not complete Provide information or incentive for some to sign up Pilot a new program Good opportunity to test design before scaling up Operational changes to ongoing programs Good opportunity to test changes before scaling them up 13

Different units you can randomize at Village level Women’s association Youth groups School level Individual Farm Farmers’ Association Irrigation block

Group or individual randomization? If a program impacts a whole group usually randomize whole community to treatment or comparison Easier to get big enough sample if randomize individuals Individual randomization Group randomization 15

Unit of randomization Randomizing at higher level sometimes necessary: Political constraints on differential treatment within community Practical constraints—confusing for one person to implement different versions Spillover effects may require higher level randomization Randomizing at group level requires many groups because of within community correlation

Elements of an experimental design Target population Commercial Farmers Potential participants Maize producers Rice producers Evaluation sample Random assignment Treatment Group Control Group Participants  Drop-outs Women, particularly poor 17

External and internal validity External validity The sample is representative of the total population The results in the sample represent the results in the population We can apply the lessons to the whole population Internal validity The estimated effect of the intervention/program on the evaluated population reflects the real impact on that population i.e. the intervention and comparison groups are comparable

External and internal validity An evaluation can have internal validity without external validity Example: a randomized evaluation of encouraging women to stand in elections in urban area may not tell you much about impact of a similar program in rural areas An evaluation without internal validity, can’t have external validity If you don’t know whether a program works in one place, then you have learnt nothing about whether it works elsewhere.

Internal & external validity Randomization National Population Samples National Population

Samples of Population Stratum Internal validity Stratification Randomization Population Population stratum Samples of Population Stratum

Representative but biased: useless Randomization National Population Biased assignment USELESS! 22

Example: Fertilizer Vouchers Distribution, internal validity Sample of Commercial Farmers Random assignment 23

Example: CDD in Sierra Leone Basic sequence of tasks for the evaluation Listing eligible communities in target areas Communities not already served by other programs and of manageable size Baseline data of communities Random assignment to GoBifo, some public draws GoBifo project implemented Follow-up survey and participatory exercise What are some examples could be used in the context of an educaiton ie Analogy is how a new drug is tested. Big difference between medical experiments and social experiments: In medical experiments can do double-blind. What does that mean : flip a coin 24

Efficacy & Effectiveness Proof of concept Smaller scale Pilot in ideal conditions Effectiveness At scale Prevailing implementation arrangements -- “real life” Higher or lower impact? Higher or lower costs? Impacts: lower e.g. deworming, malaria etc things that need critical mass to be eradicated | higher: ideal conditions costs: ecomies of scale vs inefficient scale up (from ngo to inefficient gov systems for example 25

Advantages of “experiments” Clear and precise causal impact Relative to other methods Much easier to analyze Cheaper (smaller sample sizes) Easier to explain More convincing to policymakers Methodologically uncontroversial

What if there are constraints on randomization? Budget constraints: randomize among the most in need Roll-out capacity constraints: randomize who receives first (or next, if you have already started) Randomly promote the program to some 27

When is it really not possible? The treatment already assigned and announced and no possibility for expansion of treatment The program is over (retrospective) Universal take up already Program is national and non excludable Freedom of the press, exchange rate policy (sometimes some components can be randomized) Sample size is too small to make it worth it

Common pitfalls to avoid Calculating sample size incorrectly Randomizing one district to treatment and one district to control and calculating sample size on number of people you interview Collecting data in treatment and control differently Counting those assigned to treatment who do not take up program as control—don’t undo your randomization!! 29

Thank You