www.worldbank.org/hdchiefeconomist The World Bank Human Development Network Spanish Impact Evaluation Fund

The World Bank Human Development Network Spanish Impact Evaluation Fund

This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., S. Martinez, P. Premand, L. B. Rawlings and C. M. J. Vermeersch, 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington DC. The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.

RANDOMIZED TRIALS
Technical Track Session II

Randomized Trials

How do researchers learn about counterfactual states of the world in practice? In many fields (e.g., medical research), evidence about counterfactuals is generated by randomized trials or experiments. Under certain conditions, randomized trials ensure that outcomes in the comparison group really do capture the counterfactual for the treatment group.

Randomization for causal inference

Statisticians recommend a formal two-stage randomization model:
First stage: a random sample of units is selected from a defined population.
Second stage: this sample of units is randomly assigned to treatment and comparison groups.

[Diagram: Eligible population → (random sample) → Sample → (randomized assignment) → Treatment group / Comparison group]

Why two stages of randomization?

First stage, for external validity: ensure that the results in the sample will represent the results in the population within a defined level of sampling error.
Second stage, for internal validity: ensure that the observed effect on the dependent variable is due to the treatment rather than to other confounding factors.
Remember the two conditions?

Two-Stage Randomized Trials

In large samples, two-stage randomized trials ensure that:
[Ȳ1 | D = 1] = [Ȳ1 | D = 0]  and  [Ȳ0 | D = 1] = [Ȳ0 | D = 0]
Why is this true?

The Law of Large Numbers

When you randomly draw a sample of units from a large population, and when the number of units you draw gets large, the average of any characteristic of your sample will tend to become closer to its expected value. As the number of units in your sample grows, on average the sample will look like its original population.
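The deck contains no code, but the convergence is easy to see in a short simulation. Everything below (the uniform score distribution, the sample sizes) is illustrative and not from the deck:

```python
import random
import statistics

# Law of large numbers, illustrated by simulation: draw ever-larger random
# samples from a simulated population and watch the sample mean approach the
# population mean. The "population" is hypothetical: scores uniform on [0, 100].
random.seed(42)
population = [random.uniform(0, 100) for _ in range(100_000)]
pop_mean = statistics.mean(population)

gaps = {}
for n in (10, 1_000, 50_000):
    sample = random.sample(population, n)
    gaps[n] = abs(statistics.mean(sample) - pop_mean)
    print(f"n = {n:>6}: |sample mean - population mean| = {gaps[n]:.2f}")
```

With larger n, the printed gap shrinks toward zero, which is exactly the claim on the slide.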

[Diagram, repeated: Eligible population → (random sample) → Sample → (randomized assignment) → Treatment group / Comparison group]

Two-Stage Randomized Trials

In large samples, two-stage randomized trials ensure that:
[Ȳ1 | D = 1] = [Ȳ1 | D = 0]  and  [Ȳ0 | D = 1] = [Ȳ0 | D = 0]
Thus the estimator
δ̂ = [Ȳ1 | D = 1] − [Ȳ0 | D = 0]
consistently estimates the Average Treatment Effect (ATE).
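A minimal simulation of the two-stage design, with made-up numbers (population size, outcome distribution, a constant treatment effect of 5), shows the difference-in-means estimator landing near the true ATE:

```python
import random

# Sketch of a two-stage randomized trial on simulated data. Each unit has two
# potential outcomes: Y0 (untreated) and Y1 = Y0 + TRUE_ATE. All numbers here
# are illustrative assumptions, not from the deck.
random.seed(0)
TRUE_ATE = 5.0

# Stage 1: draw a random sample from the eligible population (Y0 for each unit).
population = [random.gauss(50, 10) for _ in range(200_000)]
sample = random.sample(population, 10_000)

# Stage 2: randomly assign the sample to treatment (D=1) and comparison (D=0).
random.shuffle(sample)
treated, comparison = sample[:5_000], sample[5_000:]

# Observed outcomes: Y1 for the treated, Y0 for the comparison group.
y_treat = [y0 + TRUE_ATE for y0 in treated]
y_comp = comparison

ate_hat = sum(y_treat) / len(y_treat) - sum(y_comp) / len(y_comp)
print(f"estimated ATE = {ate_hat:.2f} (true ATE = {TRUE_ATE})")
```

Because both stages are random, the difference in group means is a consistent estimate of the ATE, exactly as the estimator δ̂ on the slide claims.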

Population vs. selected group?

If the randomization takes place on a selected group of units, what will we be estimating? The treatment effect on that selected group of units!

[Diagram: Eligible population → (not random) → Selected group (e.g. volunteers) / Non-selected group; Sample = selected group → (randomized assignment) → Treatment group / Comparison group]

Randomized assignment: How?

If you want to assign 50% of the sample to treatment and comparison: flip a coin for each person.
If you want to assign one third of the sample to the treatment group: roll a die for each person. A 1 or a 2 is treatment; a 3, 4, 5 or 6 is comparison.
Other percentages: let Excel give each unit a random number. Decide how many units will be in the treatment group (call this X). Assign the X units that get the highest numbers to the treatment group.
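The "give each unit a random number and take the top X" recipe can be sketched in a few lines of Python; the school roster and the 40-school treatment arm below are hypothetical:

```python
import random

def assign_treatment(units, n_treated, seed=123):
    """Replicate the spreadsheet recipe from the slide: give every unit a
    random number, then put the n_treated highest draws into treatment."""
    rng = random.Random(seed)           # fixed seed makes the assignment reproducible
    draws = {u: rng.random() for u in units}
    ranked = sorted(units, key=lambda u: draws[u], reverse=True)
    return set(ranked[:n_treated]), set(ranked[n_treated:])

# Hypothetical roster: assign 40 of 100 schools to treatment.
schools = [f"school_{i}" for i in range(100)]
treatment, comparison = assign_treatment(schools, n_treated=40)
print(len(treatment), len(comparison))  # 40 60
```

Recording the seed is good practice: it lets anyone re-run the assignment and verify that it was not manipulated.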

Example: Computers for Education, Barrera-Osorio and Linden (2009)

Randomized trial in Colombia
Program activities:
o Refurbish computers donated by private firms and install them in public schools.
o Train teachers in the pedagogic uses of computers with the help of a local university.
2006: 97 schools were subject to randomization; 48 of them received computers, 49 did not.
2008: Follow-up survey.

Step 1: Evaluation question

What is the impact of intervention D on variable of interest Y?
Variable of interest Y:
o Student learning (final outcome)
o Classroom practice (intermediate outcome)
o Number of teachers trained (output)
o Number of working computers in schools (output)
Intervention package D:
D = 1 if the school was selected for installation of computers + teacher training; 0 otherwise.

Step 2: Verify balance

Goal: check that characteristics are balanced between treatment and comparison groups.
Regression: Y_ij = β0 + β1 T_j + ε_ij, where i indexes students, j indexes schools, and T_j is a dummy variable for whether school j was assigned to treatment in the randomization process.
Method: OLS with standard errors clustered at the school level.

Verify balance of test scores

Normalized Test Scores (standard errors in parentheses)
            Treatment Avg.   Comparison Avg.   Difference
Language    (0.10)           (0.08)            (0.13)
Math        (0.08)                             (0.11)
Total       (0.11)           (0.10)            (0.15)

Verify balance of demographic characteristics

Demographic Characteristics (standard errors in parentheses)
                          Treatment Avg.   Comparison Avg.   Difference
Gender                    (0.02)                             (0.03)
Age                       (0.27)           (0.36)            (0.45)
Nr parents in household   (0.02)                             (0.03)
Nr siblings               (0.22)           (0.20)            (0.30)
Receives allowance        (0.02)           (0.03)
Nr friends                (1.91)           (1.15)            (2.22)
Hours of work             (0.40)           (0.73)            (0.83)

Step 3: Estimate impact

In words: compare the average Y for the treatment group with the average Y for the comparison group.
In a regression: Y_ij = β0 + β1 T_j + ε_ij
o Cluster at the school level; this is where you get the standard errors from.
In a table: % of questions answered correctly, with columns for the Treatment Average, the Comparison Average, and the Simple Difference, for Spanish and Math.
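A sketch of Step 3 on simulated data: the function below collapses observations to school means before computing the standard error, which is one simple way to respect school-level clustering when treatment varies only at the school level. All data-generating numbers are invented, not from the study:

```python
import random
import statistics
from collections import defaultdict

def diff_in_means_clustered(data):
    """Difference in mean outcomes between treatment and comparison schools,
    with a simple cluster-level standard error: collapse to school means first.
    `data` is a list of (school_id, treated, y) tuples (hypothetical format)."""
    outcomes = defaultdict(list)
    is_treated = {}
    for school, treated, y in data:
        outcomes[school].append(y)
        is_treated[school] = treated
    t_means = [statistics.mean(v) for s, v in outcomes.items() if is_treated[s]]
    c_means = [statistics.mean(v) for s, v in outcomes.items() if not is_treated[s]]
    diff = statistics.mean(t_means) - statistics.mean(c_means)
    se = (statistics.variance(t_means) / len(t_means)
          + statistics.variance(c_means) / len(c_means)) ** 0.5
    return diff, se

# Simulated data: 40 schools, 30 students each, assumed true impact = 0.10,
# with a shared within-school shock that makes clustering matter.
random.seed(1)
data = []
for school in range(40):
    treated = school < 20
    school_shock = random.gauss(0, 0.05)   # common to all students in the school
    for _ in range(30):
        y = 0.5 + (0.10 if treated else 0.0) + school_shock + random.gauss(0, 0.1)
        data.append((school, treated, y))

impact, se = diff_in_means_clustered(data)
print(f"impact = {impact:.3f}, clustered SE = {se:.3f}")
```

The school-mean shortcut gives the same point estimate as the student-level regression of the slide; in practice one would use OLS with clustered standard errors, as the deck specifies.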

Step 3: Findings

% of questions answered correctly (standard errors in parentheses)
            Treatment Avg.   Comparison Avg.   Simple Difference
Spanish     (0.01)                             (0.02)
Math        (0.02)           (0.01)            (0.02)

Step 4: Estimate impact, multivariate

In words: compare the average Y for the treatment and comparison groups, and add controls for baseline characteristics.
In a multivariate regression: Y_ij = β0 + β1 T_j + β2 X_ij + ε_ij, where X_ij is a vector of baseline characteristics for student i in school j.
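The multivariate regression of Step 4 can be sketched with ordinary least squares on simulated data; the coefficients (an impact of 0.08, a loading of 0.30 on the control) are assumptions for illustration only:

```python
import numpy as np

# Sketch of Step 4: regress Y on a constant, the treatment dummy T, and a
# baseline covariate X. All data are simulated; coefficients are assumptions.
rng = np.random.default_rng(0)
n = 5_000
T = rng.integers(0, 2, n)            # randomized treatment dummy
X = rng.normal(0, 1, n)              # baseline characteristic
Y = 0.4 + 0.08 * T + 0.30 * X + rng.normal(0, 0.5, n)

# OLS via least squares on the design matrix [1, T, X].
design = np.column_stack([np.ones(n), T, X])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
b0, b1, b2 = beta
print(f"impact estimate beta1 = {b1:.3f}")
```

Because T is randomized, adding the control X does not change what β1 estimates; it only tightens the standard error, which is the point of Step 4.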

Step 5: Findings

% of questions answered correctly (standard errors in parentheses)
            Treatment Avg.   Comparison Avg.   Simple Difference   Difference with controls (multivariate)
Spanish     (0.01)                             (0.02)
Math        (0.02)           (0.01)            (0.02)

Output variable (standard errors in parentheses; *** denotes statistical significance)
                             Treatment Avg.   Control Avg.   Difference
Nr of computers at school    (1.28)           (0.75)         (1.49)***
% of teachers trained        (0.03)           (0.04)         (0.05)***
Schools treated              (0.03)                          (0.04)***

Step 5: Findings

Little effect on students' test scores, across grade levels, subjects, and gender. Why? The computers were not incorporated into the educational process.

More thoughts on Randomized Trials

Potential Caveats

Non-compliance:
o Not all units assigned to the treatment will actually receive the treatment (non-compliance).
o Some units assigned to comparison may still receive the treatment (non-compliance).
o For limited non-compliance: use instrumental variables (see Session IV).
Attrition:
o We may not be able to observe what happens to all units.
o Watch out for differential attrition!

More Potential Caveats

Hawthorne effect:
o Just observing units makes them behave differently…
o Tends to disappear over time.
John Henry effect:
o The "comparison" units work harder to compensate.

Level of randomization

District, village, family, child? The level of randomization determines the power of the trial.
The lower the level:
o the more units in the treatment and comparison groups, so the estimate of the difference between T and C becomes more precise;
o the more potential for contamination (children in a class, families in a village); and
o the harder it is to administer the program.
In general: choose the lowest level that is still administratively feasible.
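The precision argument can be made concrete with the textbook formula SE = sqrt(2σ²/n) for a difference in means with n units per arm (equal outcome variance in both arms is assumed; all numbers are illustrative):

```python
# Why more randomized units buy precision: the standard error of a difference
# in means falls with 1/sqrt(units per arm). Outcome SD is assumed to be 1.
def se_diff(n_per_arm, sd=1.0):
    """SE of the T-minus-C difference in means with n_per_arm units per arm."""
    return (2 * sd**2 / n_per_arm) ** 0.5

# Illustrative ladder: e.g. districts -> villages -> families -> children.
for n in (10, 40, 160, 640):
    print(f"{n:>4} units per arm: SE = {se_diff(n):.3f}")
```

Quadrupling the number of randomized units halves the standard error, which is why randomizing at a lower level (more units) yields a more precise estimate, contamination and logistics permitting.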

Randomized vs. Non-Randomized Trials

Randomized experiments:
o Assumptions play a minor role, or no role at all when testing the hypothesis of "no treatment effect",
o in the absence of difficulties such as non-compliance or attrition.
Non-randomized methods:
o Require strong assumptions in order to validate the counterfactual.
o Attrition is just as much of a problem as in randomized trials.

References

Rosenbaum, Paul (2002): Observational Studies, Springer, Chapter 2.
Cochran, W. G. (1965): "The Planning of Observational Studies of Human Populations", Journal of the Royal Statistical Society, Series A, 128, with discussion.
Angrist, J., E. Bettinger, E. Bloom, E. King and M. Kremer (2002): "Vouchers for Private Schooling in Colombia: Evidence from a Randomized Natural Experiment", American Economic Review, 92.
Angrist, J. and V. Lavy (2002): "The Effect of High School Matriculation Awards: Evidence from Randomized Trials", NBER Working Paper.
Barrera-Osorio, F. and L. Linden (2009): "The Use and Misuse of Computers in Education: Evidence from a Randomized Experiment in Colombia", World Bank Policy Research Working Paper WPS4836.

Thank You

Q & A