RANDOMIZED CONTROL TRIALS (RCTS): KEY CONSIDERATIONS AND TECHNICAL COMPONENTS. Making Cents 2012, Washington DC.

Presentation transcript:

RANDOMIZED CONTROL TRIALS (RCTS): KEY CONSIDERATIONS AND TECHNICAL COMPONENTS. Making Cents 2012, Washington DC

Outline
- What makes an RCT
- Common Questions and Concerns
- Key Considerations
- Technical Components
- Evaluation Example

What makes an RCT
- Monitoring & Evaluation
- Program Evaluation
- Impact Evaluation
- Randomized Evaluation
Impact evaluation is one type of program evaluation; a randomized control trial is one type of impact evaluation.

What makes an RCT
Goal: to document the impact of a program or intervention.
- Did the program change lives? What would have happened if the program hadn't existed?
- Compare what happened with what would have happened without the program, also known as the counterfactual.

What makes an RCT
Defining feature: random assignment of units (individual beneficiaries, schools, health clinics, etc.) to treatment and control groups, which allows us to assess the causal effects of an intervention.

Questions and Concerns
- Expense
- Sample size
- Program already started
- Ethics
- Technical capacity

Key Considerations
What types of questions?
- Specific: targeted and focused, testing a certain hypothesis
- Testable: has outcomes that can be measured
- Important: will lead to lessons that affect the way we plan or implement programs
- Micro: macro, expansive questions are not appropriate

Key Considerations
Right project to evaluate?
- Important, specific, and testable question
- Timing: not too early and not too late
- Program is representative, not gold-plated
- Time, expertise, and money to do it right
- Results can be used to inform programming and policy

Key Considerations
Right project to evaluate?
- Plan and randomize before the program starts
- Randomly assign who gets the program
- Evaluation design should occur alongside project design
- Need a sampling frame of intended subjects
- Randomize after the baseline survey

Key Considerations
Not the right project?
- Premature: still requires considerable "tinkering" to work well
- Too small a scale to split into two representative groups
- Unethical or politically unfeasible to deny the program while conducting the evaluation (e.g., if a positive impact has already been proven)
- Program has already begun and is not expanding elsewhere
- Impact evaluation too time-consuming or costly, and therefore not cost-effective

Key Considerations: Timing
[Chart: primary outcome over time; the impact is the gap between the intervention group and the counterfactual after the intervention starts.]
Have to do randomization before the program starts.

Technical Components: Sample
Sample size and unit of randomization: sample size is determined by the effect size. What effect size is feasible?
- Small program impact: 0.1 standard deviation change
- Moderate program impact: 0.2 standard deviation change
- Large program impact: 0.3 or greater standard deviation change

Technical Components: Effect sizes
Define the effect size that is "worthwhile": what is the smallest effect that would justify the program being adopted?
- If the effect is smaller than that, it might as well be zero: we are not interested in proving that a very small effect is different from zero.
- Conversely, any effect larger than that would justify adopting the program, so we want to be able to distinguish it from zero.
Statistical significance vs. policy significance.

Technical Components: Effect sizes
Program manager: "We think our program will increase average income by 20%."
Researcher: "OK. That's equivalent to an effect size of… (frantically calculates)… 0.4 standard deviations… which means you need a sample size of… 2,200 beneficiaries."
(Time and money are spent on data collection, monitoring the intervention, etc. One year passes.)
Researcher: "Well, we did not find a 20% increase in income. Maybe you should scrap the program."
Program manager: "WAIT!! The program is still worthwhile if it only increases income by 10%!!"
Researcher: "Oops. We don't have the power to detect that. We would have needed a sample size of …"
Punchline: define the effect size that is "worthwhile", NOT what you think will happen.
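To illustrate the conversion the researcher is doing, the minimal sketch below turns a target percentage increase in income into a standardized effect size. The baseline mean and standard deviation are invented placeholder values, not figures from the presentation.

```python
# Hypothetical illustration: convert a target percentage increase in income
# into a standardized (Cohen's d style) effect size. The baseline mean and
# standard deviation below are assumed values, not taken from the slides.
baseline_mean_income = 2.00  # assumed mean income, e.g. USD per day
baseline_sd_income = 1.00    # assumed standard deviation of income

for pct_increase in (0.20, 0.10):
    raw_change = pct_increase * baseline_mean_income
    effect_size = raw_change / baseline_sd_income
    print(f"{pct_increase:.0%} increase -> effect size of {effect_size:.2f} SD")
```

The same percentage target translates into a very different standardized effect depending on how dispersed incomes are, which is why the "worthwhile" effect size has to be pinned down before the sample size is fixed.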

Technical Components: Effect sizes

An effect size of… | Is considered… | …and it means that…
0.2 | Modest | The average member of the treatment group had a better outcome than the 58th percentile of the control group
0.5 | Large | The average member of the treatment group had a better outcome than the 69th percentile of the control group
0.8 | Whoa… that's a big effect size! | The average member of the treatment group had a better outcome than the 79th percentile of the control group
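The percentile figures follow from the normal distribution: an effect size d places the average treated outcome at the Phi(d) percentile of the control distribution. A minimal check, assuming normally distributed outcomes:

```python
# Where the percentile interpretations come from, assuming normally
# distributed outcomes: an effect size d places the average treated outcome
# at the Phi(d) percentile of the control distribution.
from scipy.stats import norm

for d in (0.2, 0.5, 0.8):
    print(f"effect size {d}: better than {norm.cdf(d):.0%} of the control group")
```

This prints 58%, 69%, and 79%, matching the table above.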

Technical Components: Sample
Individual level:
- If expecting a 0.2 effect size, one intervention compared to control: 0.8 power requires 770 people; 0.9 power requires 1,040 people
- If expecting a 0.2 effect size, two interventions compared to control: 0.8 power requires 1,150 people
- If expecting a 0.1 effect size, one intervention compared to control: 0.8 power requires 3,150 people
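For reference, a minimal power calculation along these lines can be run with statsmodels. The totals it produces are close to, though not identical to, the slide's figures, which may reflect different rounding or test assumptions.

```python
# A minimal two-sample power calculation (two-sided test, alpha = 0.05).
# The resulting totals are close to the slide's figures (770, 1,040, 3,150)
# but not identical, so treat this as an approximation of the same idea.
from math import ceil
from statsmodels.stats.power import TTestIndPower

power_calc = TTestIndPower()
for effect_size, power in [(0.2, 0.8), (0.2, 0.9), (0.1, 0.8)]:
    n_per_arm = power_calc.solve_power(effect_size=effect_size, alpha=0.05, power=power)
    print(f"d={effect_size}, power={power}: ~{ceil(n_per_arm)} per arm, "
          f"~{2 * ceil(n_per_arm)} in total")
```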

Technical Components: Sample
Cluster level:
- If expecting a 0.2 effect size, one intervention: with clusters of 10 people, 0.8 power requires 115 clusters; with clusters of 20 people, 0.8 power requires 80 clusters
- If expecting a 0.2 effect size, two interventions: with clusters of 30 people, 0.8 power requires 65 clusters; with clusters of 50 people, 0.8 power requires 56 clusters
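These cluster counts can be approximated from the individual-level totals using the design effect 1 + (m - 1) * rho, where m is the cluster size and rho the intra-cluster correlation. The sketch below assumes rho = 0.05, a value not stated on the slide, and lands close to the one-intervention figures above.

```python
# Approximate cluster-randomized sample sizes via the design effect
# DEFF = 1 + (m - 1) * rho. The intra-cluster correlation rho = 0.05 is an
# assumption for illustration; the slide does not report the value it used.
from math import ceil

n_individual_total = 788  # total sample from the individual-level calculation
rho = 0.05                # assumed intra-cluster correlation

for cluster_size in (10, 20):
    deff = 1 + (cluster_size - 1) * rho
    n_total = n_individual_total * deff
    print(f"clusters of {cluster_size}: ~{ceil(n_total / cluster_size)} clusters")
```

Larger clusters mean each additional person within a cluster adds less new information, so the required number of individuals grows as the cluster size grows.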

Technical Components: Randomization
Clients/beneficiaries are randomly assigned to receive the program or different program models: everyone has an equal chance of being assigned to each group. The only difference between the groups is whether they are assigned to receive the new service.
[Diagram: potential beneficiaries are randomized into a treatment group (receives the program) and a control group (no or delayed program).]

How do we normally select participants?
- Closest to the main roads?
- Biggest advocate for program services?
- Greatest need?
- Not served by other organizations?
[Map: villages around headquarters (HQ).]

Random assignment (pre-program)
[Map: red and blue villages after random assignment.]
Randomization allows us to be sure we have a reliable and similar comparison group. When everyone had an equal chance of getting the program and randomization worked, we have our counterfactual: we know what would have happened without the program, and we know outcomes in blue villages are similar to outcomes in red villages.

But how can we be sure they are similar?
[Chart: pre-program income per person per day in leones, treatment vs. comparison (value shown: 10,057).]

What non-random assignment might look like
[Chart: income per person per day in leones, treatment vs. comparison villages, with HQ marked.]
Blue villages don't make a good counterfactual: these villages are different, they are better off.

Pre-program income (randomized)
[Chart: income per person per day in leones before the program, treatment vs. comparison (value shown: 10,057).]
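In practice, this similarity is usually checked with a simple balance test on the baseline data. The minimal sketch below compares mean pre-program income across arms; the file and column names are hypothetical.

```python
# A minimal baseline balance check: compare mean pre-program income in the
# treatment and comparison groups. File and column names are hypothetical.
import pandas as pd
from scipy.stats import ttest_ind

baseline = pd.read_csv("baseline_survey.csv")  # one row per surveyed person
treated = baseline.loc[baseline["treatment"] == 1, "income_per_day_leones"]
comparison = baseline.loc[baseline["treatment"] == 0, "income_per_day_leones"]

t_stat, p_value = ttest_ind(treated, comparison, equal_var=False)
print(f"treatment mean: {treated.mean():,.0f} leones, "
      f"comparison mean: {comparison.mean():,.0f} leones, p = {p_value:.2f}")
# With successful randomization, baseline differences should be small and
# statistically indistinguishable from zero.
```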

Post-program income: measure impact
[Chart: income per person per day after the program, treatment vs. comparison; the difference between the two is the impact.]

Technical Components: Randomization
[Chart: primary outcome over time; the impact is the gap between the intervention group and the counterfactual.]
Randomization is unique: it gives us this reliable counterfactual.

Random Sampling vs. Random Assignment
Random sampling: each individual has the same probability of being included in the sample.
- Only survey/interview some households out of a community
- May select a representative sample of villages out of a district
- Can select a random sample and conduct a needs assessment
Random assignment: each individual is as likely as any other to be assigned to the treatment or control group, which gives us a comparison group to measure impact.

Random sampling and random assignment
1. Randomly sample from the area of interest (select some eligible participants/villages)
2. Randomly assign to treatment and control within that sample
3. Randomly sample for surveys (from both treatment and control)

Mechanics of randomized assignment
Need a pre-existing list of all potential beneficiaries. There are many methods of randomization (see the sketch below):
- Public lottery
- Selection from a hat/bucket
- In the office, privately: random number generator or computer program code
- Staggered/phase-in
- Rotation
Pulling names/communities out of a hat to select program beneficiaries is random assignment.
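One minimal way to implement the "computer program code" option, assuming a CSV list of eligible groups (the file and column names are hypothetical): shuffle the list with a fixed seed and split it in half.

```python
# A minimal randomized-assignment sketch from a pre-existing list of eligible
# groups. The CSV file and its columns are hypothetical; the fixed seed makes
# the draw reproducible and auditable.
import pandas as pd

eligible = pd.read_csv("eligible_groups.csv")  # one row per eligible group

shuffled = eligible.sample(frac=1, random_state=20120615).reset_index(drop=True)
n_treat = len(shuffled) // 2
shuffled["arm"] = ["treatment"] * n_treat + ["control"] * (len(shuffled) - n_treat)

shuffled.to_csv("assignment.csv", index=False)
print(shuffled["arm"].value_counts())
```

Using a recorded seed keeps the draw transparent: anyone with the original list can reproduce exactly the same assignment.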

Is random fair?

Multiple treatments: comparing programs
[Diagram: the sample is split into Treatment 1, Treatment 2, and Treatment 3 arms.]

Phase-in design: slow program roll-out
- Round 1: Treatment 1/3, Control 2/3
- Round 2: Treatment 2/3, Control 1/3
- Randomized evaluation ends
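A minimal sketch of how such a phase-in assignment could be generated, using hypothetical group names: randomly order all groups into three waves, so the waves that have not yet been treated serve as the control group until their turn.

```python
# A minimal phase-in assignment sketch with hypothetical group names: groups
# are randomly ordered into three waves; waves not yet treated act as controls.
import numpy as np

groups = [f"group_{i:03d}" for i in range(1, 91)]  # hypothetical list of 90 groups
rng = np.random.default_rng(seed=42)
rng.shuffle(groups)

wave_1, wave_2, wave_3 = np.array_split(np.array(groups), 3)
# Round 1: wave_1 is treated (1/3); waves 2 and 3 are controls (2/3).
# Round 2: waves 1 and 2 are treated (2/3); wave 3 is the control (1/3).
# After the randomized evaluation ends, wave_3 receives the program as well.
print(len(wave_1), len(wave_2), len(wave_3))
```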

Technical Components: Ethics
- Program proven to work
- Sufficient resources
- Doing harm
- Clear expectations
- Transparent process

Technical Components: Budget
Impact evaluations require a significant budget line: $30,000 - $400,000.
The budget is influenced by:
- Sample size
- Number of waves of data collection
- Logistical costs in-country
- Methods of data collection (PDAs vs. paper-based tools, length of tool, qualitative vs. quantitative)
- Staffing

Overall Goal: Evidence-Based Policy
[Diagram: needs assessment, process evaluation, impact evaluation, and cost-benefit analysis (opportunity costs of money, quantification of costs and benefits).]

Overall Goal: Evidence-Based Policy

Example: Cash transfers for vocational training
The Northern Uganda Social Action Fund (NUSAF)

NUSAF "Youth Opportunity Program"
- Cash grants of about $7,000 per group ($377/person)
- Intended for acquiring vocational skills and tools
- Goals: 1. Raise incomes and employment; 2. Increase community cohesion and reduce conflict; 3. Build capacity of local institutions
- Experiment: groups randomly assigned to receive the grant

- Average age: 25
- Average education: 8th grade
- Average cash earnings: $0.48/day PPP
- Average employment: 10 hours/week
- Female: 33%

535 groups, with 18,000 youth:
- 265 treatment groups receive the grant
- 270 groups assigned to a control group
- Program allocated by lottery among eligible applicants

Timeline of events
- 2006: Tens of thousands apply; hundreds of groups funded
- 2007: Funds remain for 265 groups in 10 districts; government selects, screens, and approves 535 groups
- 2/2008: Baseline survey with 5 people per group; randomization at group level
- 7-9/2008: Government transfers funds to treatment groups
- 10/2010: Mid-term survey commences roughly 2 years after transfer; effective attrition rate of 8%
- 2/2012: Next survey planned

The Results
- Economic outcomes: improved employment and incomes, high returns on investment
- Social cohesion, reconciliation, and conflict: improvements for men, mixed results for women
- Governance and corruption: no evidence of widespread leakage

QUESTIONS
Poverty-action.org
Thank you