
Introduction to Randomised Evaluations
Kamilla Gumede and Martin Abel
University of Cape Town, 11th of October, 2010

J-PAL Africa – Fight poverty
– New research programme within SALDRU
– Regional office of a global network
– Specialises in RANDOMISED IMPACT EVALUATIONS
– Does 3 things:
  – Run evaluations
  – Disseminate results – a public good
  – Train others to run evaluations

Overview
I. Why do we evaluate social programmes?
II. What is an IMPACT?
III. Impact evaluation methodologies
IV. How to run an RCT:
– Advantages of randomised evaluations
– Theory of Change
– Randomisation Design
– External vs. Internal Validity

Evidence-based policy making
There is surprisingly little hard evidence on what works.
Need #1: With better evidence, we can do more with a given budget.
Need #2: If people knew money was going to programs that worked, it could help increase the pot for anti-poverty programs.
Instead of asking "do aid/development programs work?", we should be asking:
– Which work best, why and when?
– How can we scale up what works?

Example: Aid Optimists
"I have identified the specific investments that are needed [to end poverty]; found ways to plan and implement them; [and] shown that they can be affordable."
– Jeffrey Sachs, The End of Poverty

Example: Aid Pessimists
"After $2.3 trillion over 5 decades, why are the desperate needs of the world's poor still so tragically unmet? Isn't it finally time for an end to the impunity of foreign aid?"
– Bill Easterly, The White Man's Burden

Objectives of evaluation
Accountability
Lesson learning, for the:
– Program
– Organization
– Beneficiaries
– World
So that we can reduce poverty through more effective programs.
Different types of evaluation contribute to these different objectives of evaluation.

The different types of evaluation (from broadest to narrowest):
– Evaluation (M&E)
– Program Evaluation
– Impact Evaluation
– Randomized Evaluation

Evaluating Social Programmes

How to measure impact? (I)
What is the outcome after the programme?
What would have happened in the absence of the programme?
Take the difference between:
– what happened (with the programme), and
– what would have happened (without the programme).
That difference is the IMPACT of the programme.

How to measure impact? (II)
Impact is defined as a comparison between:
1. the outcome some time after the program has been introduced
2. the outcome at that same point in time had the program not been introduced (the "counterfactual")
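In the standard potential-outcomes notation (not used on the slide, but equivalent to the definition above), the impact for an individual i is the gap between two outcomes, only one of which is ever observed:

$$\text{Impact}_i \;=\; Y_i(1) - Y_i(0)$$

where $Y_i(1)$ is i's outcome some time after being exposed to the program and $Y_i(0)$ is the outcome at that same point in time had the program not been introduced (the counterfactual).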

Impact: What is it?
[Figure: the primary outcome plotted over time; at the intervention, the observed outcome diverges from the counterfactual, and the gap between the two lines is the impact.]


Counterfactual
The counterfactual represents the state of the world that program participants would have experienced in the absence of the program (i.e. had they not participated in the program).
Problem: the counterfactual cannot be observed.
Solution: we need to "mimic" or construct the counterfactual.

Constructing the counterfactual
The counterfactual is often constructed by selecting a group not affected by the program.
Randomized:
– Use random assignment of the program to create a control group which mimics the counterfactual.
Non-randomized:
– Argue that a certain excluded group mimics the counterfactual.

Methodologies in impact evaluation
Experimental:
– Randomized Evaluations
Quasi-experimental:
– Instrumental Variables
– Regression Discontinuity Design
Non-experimental:
– Pre-post
– Difference in differences
– Cross Sectional Regression
– Fixed Effects Analysis
– Statistical Matching
Running example: the effect of South Africa's Old Age Pension (OAP) on labour supply.

Non-experimental evaluations – Cross Sectional Regression
Bertrand et al. (2003), Posel et al. (2006)
We can control for observable differences (age, gender, education, ...).
There are also unobservable characteristics we cannot control for (motivation, etc.).
→ Which people does a household with a pension attract?
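As an illustration of the kind of specification this slide refers to (a generic sketch; the exact equations in Bertrand et al. 2003 and Posel et al. 2006 differ), a cross-sectional regression might be:

$$y_i \;=\; \alpha + \beta\,\mathrm{Pension}_i + X_i'\gamma + \varepsilon_i$$

where $y_i$ is a labour-supply outcome for household (or household member) i, $\mathrm{Pension}_i$ indicates pension receipt, and $X_i$ collects the observable controls (age, gender, education, ...). The estimate $\hat{\beta}$ is biased whenever unobservables left in $\varepsilon_i$, such as motivation or who chooses to live in a pension household, are correlated with pension receipt.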

Non-experimental evaluations – Panel Data Analysis with Fixed Effects
Ardington et al. (2009)
Fixed effects analysis limits the sample to households that changed pension status over time.
We can control for unobservable characteristics that do not change.
But unobservable characteristics may change over time.
Data requirements: panel data, and a sizeable proportion of households switching status.
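A stylised version of such a fixed-effects specification (again a generic sketch, not the exact equation in Ardington et al. 2009):

$$y_{ht} \;=\; \alpha_h + \beta\,\mathrm{Pension}_{ht} + X_{ht}'\gamma + \varepsilon_{ht}$$

where $\alpha_h$ is a household fixed effect absorbing all characteristics that do not change over time. $\beta$ is identified only from households whose pension status switches, and it remains biased if time-varying unobservables are correlated with those switches.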

How to randomise

A. The basics
From the target population, select the evaluation sample and randomly assign its members to either:
– Treatment Group – is offered treatment
– Control Group – not allowed to receive treatment (during the evaluation period)
[Diagram: Target Population → Evaluation Sample (others not in evaluation) → Random Assignment → Treatment group / Control group]
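A minimal sketch of this assignment step in Python (the school IDs, the fixed seed and the 50/50 split are assumptions for illustration, not part of the slides):

```python
import random

def assign_treatment(evaluation_sample, seed=2010):
    """Randomly split an evaluation sample into a treatment and a control group."""
    rng = random.Random(seed)        # fixed seed so the assignment can be reproduced
    units = list(evaluation_sample)  # copy so the original list is left untouched
    rng.shuffle(units)
    half = len(units) // 2
    treatment = units[:half]         # offered the treatment
    control = units[half:]           # not offered the treatment during the evaluation
    return treatment, control

# Hypothetical evaluation sample of 10 schools
treatment, control = assign_treatment([f"school_{i:02d}" for i in range(10)])
print("Treatment:", treatment)
print("Control:  ", control)
```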

A. Why randomize? – Conceptual Argument
If properly designed and conducted, randomized experiments provide the most credible method to estimate the impact of a program.
Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the program rather than to other factors.
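In other words (standard notation, not shown on the slide), the randomly assigned control group stands in for the counterfactual, so a simple difference in group means gives an unbiased estimate of the program's impact:

$$\widehat{\text{Impact}} \;=\; \bar{Y}_{\text{treatment}} - \bar{Y}_{\text{control}}$$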

Example: Primary vs Secondary

Returns to Secondary Education (?)
Standard way to measure this:
– a regression of earnings on schooling
But are people who complete school "the same" as those who don't?
– They may be more patient, more ambitious, from better-resourced families, and have lower immediate economic opportunities.
Evaluation design: 1,200 teens who qualified but cannot afford secondary school:
– 300 boys and 300 girls get a 4-year scholarship
– Followed for 10 years
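The "standard way" alluded to above is typically a Mincer-style earnings regression; the slide's own equation is not reproduced in the transcript, so the following is a generic stand-in:

$$\ln(w_i) \;=\; \alpha + \beta\,S_i + X_i'\gamma + \varepsilon_i$$

where $w_i$ is earnings and $S_i$ is years (or completion) of schooling. $\hat{\beta}$ is read as the return to education, but the selection concerns listed above mean it need not be causal in observational data.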

In-class test

BREAK

Basic setup of a randomized evaluation
[Diagram: Target Population → Evaluation Sample (others not in evaluation) → Baseline survey → Random Assignment into Treatment group and Control group → Endline survey]

Roadmap to Randomized Evaluations
1. Environment / Context: willing partner; sufficient time; interesting policy question / theory; sufficient resources
2. Theory of Change: mechanism of change (log frame); state assumptions; identify research hypothesis; identify target population; identify indicators; identify threats to validity
3. Randomization Design:
– Intervention: check on competing interventions; simple program packages
– Unit of randomization: individual; cluster design; block randomization
– Randomization mechanism: simple lottery; gradual rollout; rotation design; encouragement
4. Sufficient Sample Size: statistical validity; cluster correlation
5. Strategy to Manage Threats: spillovers; discouragement; attrition; political interference
(Revise and iterate as needed.)

B. Environment / Context
– Willing partner
– Sufficient time
– Interesting policy question / theory
– Sufficient resources

II. Evaluations: Providing evidence for policymaking
[Diagram: programs/policies are shaped by knowledge (evidence; experience, personal and collective), ideology (own and external), and support (budget, political, capacity); impact evaluations feed into the evidence base.]

C. Theory of Change (I)
– What are the possible chains of outcomes in the case of the intervention?
– What are the assumptions underlying each chain of causation?
– What are the critical intermediary steps needed to obtain the final results?
– What variables should we try to obtain at every step of the way to discriminate between various models?

C. Theory of Change (II) – SA Pension System
Bertrand et al. (2003), Posel et al. (2006)
Different theories of change determine what indicators we measure and whom we include in our evaluation.

C. Indicators
Based on the Theory of Change, we identify indicators to test the different lines of causation and measure outcomes.
...room for creativity...
How to measure women's empowerment?
– Measure the fraction of time women speak during village council meetings.
How to measure corruption in infrastructure projects?
– Drill holes in the asphalt of newly built roads and measure the difference between actual and official thickness.

Roadmap to Randomized Evaluations (recap of the five steps above): 1. Environment / Context – 2. Theory of Change – 3. Randomization Design – 4. Sufficient Sample Size – 5. Strategy to Manage Threats.

D. Basic setup of a randomized evaluation
[Diagram: Target Population → Evaluation Sample (others not in evaluation) → Random Assignment into Treatment group and Control group]

Case Study: Microfinance and/or Financial Literacy Training
Evidence on the effectiveness of providing microfinance loans to the poor has been mixed. Some argue that financial literacy training is more effective, while others propose that both loans and training need to be provided to alleviate poverty.
→ How can you design a randomised evaluation to assess which of these claims is true?

D. Forms of Intervention
– Simple Treatment / Control: random assignment to Microfinance or Control group
– Multiple Treatment: random assignment to Microfinance, Financial Literacy, or Control group
– Cross-cutting Design: random assignment to Microfinance, Financial Literacy, Financial Literacy AND Microfinance, or Control group
– Varying Levels of Treatment: random assignment to 1-month Financial Literacy, 6-month Financial Literacy, or Control group
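A minimal sketch of how the cross-cutting (2×2) assignment could be generated (the client IDs, equal cell sizes and the seed are assumptions for illustration):

```python
import random

def cross_cutting_assignment(units, seed=2010):
    """Assign units to the four cells of a 2x2 cross-cutting design:
    microfinance only, financial literacy only, both, or control."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    cells = {"microfinance": [], "financial_literacy": [], "both": [], "control": []}
    cell_names = list(cells)
    for i, unit in enumerate(shuffled):
        cells[cell_names[i % 4]].append(unit)  # rotate through the four cells
    return cells

groups = cross_cutting_assignment([f"client_{i:03d}" for i in range(200)])
for name, members in groups.items():
    print(name, len(members))
```

Comparing the "both" cell against "microfinance" or "financial_literacy" alone then speaks to whether the combined package outperforms either component.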

E. Unit of Randomization
– Individual
– Cluster (classroom, school, district, ...)
Generally, it is best to randomize at the level at which the treatment is administered.
Ethical and practical concerns also play a role.

Case Study: Extra Teachers in Kenya
Confronted with overcrowded schools and a shortage of teachers, in 2005 the NGO ICS offered to provide funds to hire 140 extra teachers each year.
→ What is the best unit of randomisation for our RCT?

F. Method of Randomization
Lottery:
– Pull out of a hat/bucket
– Use a random number generator in a spreadsheet or Stata
Phase-in design
Rotation design
Encouragement design

How to Randomize, Part I – Random assignment through lottery
[Figure: 2006 income per person per month (rupees) for the "Treat" and "Compare" groups after random assignment through a lottery.]

Alternative Mechanism: Phase-in design
– Round 1: Treatment 1/3, Control 2/3
– Round 2: Treatment 2/3, Control 1/3
– Round 3: Treatment 3/3, no Control
The randomized evaluation ends once every group has been phased in, as no control group remains.
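A minimal sketch of generating such a phase-in schedule (the village IDs, three equal rounds and the seed are assumptions for illustration):

```python
import random

def phase_in_assignment(units, rounds=3, seed=2010):
    """Randomly order units into rollout rounds; units reached in later
    rounds serve as controls for those already treated."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    size = len(shuffled) // rounds  # assumes the sample divides evenly across rounds
    return {f"round_{r + 1}": shuffled[r * size:(r + 1) * size] for r in range(rounds)}

schedule = phase_in_assignment([f"village_{i:02d}" for i in range(30)])
for rnd, villages in schedule.items():
    print(rnd, villages)
```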

Roadmap to Randomized Evaluations (recap): 1. Environment / Context – 2. Theory of Change – 3. Randomization Design – 4. Sufficient Sample Size – 5. Strategy to Manage Threats (spillovers, sample bias, attrition).

G. Internal vs. External Validity
Internal Validity: Can we estimate the treatment effect for our particular sample?
– Fails when there are differences between the two groups (other than the treatment itself) that affect the outcome.
External Validity: Can we extrapolate our estimates to other populations?
– Fails when, outside our evaluation environment, the treatment has a different effect.

G. Threats to Internal Validity
Threats to internal validity arise when the control group differs from the counterfactual:
– Spill-overs
– Sample Selection Bias
– Attrition
Examples:
– Individuals assigned to the comparison group could attempt to move into the treatment group (cross-over), and vice versa.
– Individuals assigned to the treatment group could drop out of the program (attrition).

G. External Validity: Generalisability of results
Depends on three factors:
– Program implementation: can it be replicated at a large scale?
– Study sample: is it representative?
  (Does de-worming have the same effects in Kenya and South Africa?)
– Sensitivity of results: would a similar, but slightly different, program have the same impact?

Interested? Become part of the J-PAL research team! “You get to spend a year in Siberia, while I have to stay here in Hawaii, to apply for grants to extend your research time there.”