Threats and Analysis

Course Overview
- What is evaluation?
- Measuring impacts (outcomes, indicators)
- Why randomize?
- How to randomize?
- Sampling and sample size
- Threats and analysis
- Scaling up
- Project from start to finish

Lecture Overview
- Attrition
- Spillovers
- Partial compliance and sample selection bias
- Intention to Treat & Treatment on the Treated
- Choice of outcomes
- External validity

Attrition

Attrition Bias: An Example
The problem you want to address: some children don't come to school because they are too weak (undernourished).
You start a feeding program in which free food vouchers are delivered at home in certain villages, and you want to evaluate it. You have a treatment and a control group.
Weak, stunted children start going to school more if they live in a treatment village, so the first impact of your program is increased enrollment. In addition, you want to measure the impact on children's growth; the second outcome of interest is children's weight.
You go to all the schools (in treatment and control districts) and weigh everyone who is in school on a given day.
Will the treatment-control difference in weight be overstated or understated?
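To see the direction of the bias, here is a minimal simulation sketch (all numbers are invented for illustration): the program genuinely adds about 1 kg on average, but because it also draws previously underweight children into school, comparing only the children who show up understates the effect.

```python
# A toy simulation of attrition bias: children are only weighed if they
# come to school, and the feeding program both adds weight and draws
# weaker children into school. All numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

weight_control = rng.normal(23, 3, n)   # control-group weights (kg)
weight_treat = weight_control + 1.0     # program adds ~1 kg on average

# Attendance: control children come only if they weigh enough; in treatment
# villages the vouchers also pull some underweight children into school.
attend_control = weight_control > 21
attend_treat = weight_treat > 20        # lower effective attendance cutoff

true_effect = weight_treat.mean() - weight_control.mean()
measured = weight_treat[attend_treat].mean() - weight_control[attend_control].mean()

print(f"true effect:     {true_effect:.2f} kg")   # ~1.00
print(f"measured effect: {measured:.2f} kg")      # much smaller: understated
```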

Attrition
Is it a problem if some of the people in the experiment vanish before you collect your data?
It is a problem if the type of people who disappear is correlated with the treatment.
Why is it a problem? When should we expect this to happen?

What if only children > 21 kg come to school?
- Will you underestimate the impact?
- Will you overestimate the impact?
- Neither
- Ambiguous
- Don't know

What if only children > 21 kg come to school?
[Figure: answer illustration from the original slide]

When is attrition not a problem?
- When it is less than 25% of the original sample
- When it happens in the same proportion in both groups
- When it is correlated with treatment assignment
- All of the above
- None of the above

Attrition Bias
Devote resources to tracking participants after they leave the program.
If there is still attrition, check that it is not different in treatment and control. Is that enough? Also check that it is not correlated with observables.
Try to bound the extent of the bias: suppose everyone who dropped out of treatment got the lowest score that anyone got, and everyone who dropped out of control got the highest score that anyone got (and vice versa for the other bound). Why does this help?
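That bounding exercise can be made concrete with a short sketch (hypothetical outcomes; NaN marks attriters): imputing the worst case against the program gives a lower bound on the effect, and the best case gives an upper bound.

```python
# Worst-case bounds under attrition: impute the extreme observed outcomes
# for the missing values. The data here are hypothetical.
import numpy as np

y = np.array([4.0, 6.0, np.nan, 5.0, 2.0, np.nan, 3.0, 1.0])  # NaN = attrited
treat = np.array([1, 1, 1, 1, 0, 0, 0, 0])                    # 1 = treatment

lo, hi = np.nanmin(y), np.nanmax(y)

def diff_in_means(y_imputed):
    return y_imputed[treat == 1].mean() - y_imputed[treat == 0].mean()

# Lower bound: missing treatment outcomes get the lowest observed score,
# missing control outcomes get the highest; the upper bound flips this.
y_low = np.where(np.isnan(y), np.where(treat == 1, lo, hi), y)
y_high = np.where(np.isnan(y), np.where(treat == 1, hi, lo), y)

print(f"effect bounded between {diff_in_means(y_low):.2f} and {diff_in_means(y_high):.2f}")
```

If even the pessimistic bound stays on the same side of zero, attrition alone cannot overturn the result.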

Spillovers

What Else Could Go Wrong?
[Diagram: Total Population → Target Population (the rest is not in the evaluation) → Evaluation Sample → Random Assignment → Treatment Group and Control Group]

Spillovers, Contamination
[Diagram: the randomization chart again, now with treatment spilling over from the Treatment Group onto the Control Group and onto those not in the evaluation]
Speaker notes (repeated from Lecture 2): through random assignment, we can ensure the study is internally valid. Even though there will be some participants and some non-participants within the treatment group, we restrict our analysis to comparing the entire treatment group with the entire control group. There is a saying: once a treatment group, always a treatment group; similarly, once a control group, always a control group. No-shows and cross-overs are called non-compliance; we will cover how to account for this later.

Spillovers, Contamination
[Diagram repeated]
It's worth noting that spillovers won't always lead you to underestimate your impact.

Example: Vaccination for Chicken Pox
Suppose you randomize chicken pox vaccinations within schools, and vaccination prevents transmission of the disease. What problems does this create for the evaluation?
Suppose spillovers are local. How can we measure the total impact?

Spillovers Within School

Without spillovers, School A:
  Pupil    Treated?   Outcome
  1        Yes        no chicken pox
  2        No         chicken pox
  3        Yes        no chicken pox
  4        No         chicken pox
  5        Yes        no chicken pox
  6        No         chicken pox
  Total in treatment with chicken pox: 0%
  Total in control with chicken pox: 100%
  Treatment effect: 0% - 100% = -100%

With spillovers, suppose that because prevalence is lower, some children are not re-infected with chicken pox:
  Total in treatment with chicken pox: 0%
  Total in control with chicken pox: 67%
  Treatment effect: 0% - 67% = -67%

How to Measure Program Impact in the Presence of Spillovers?
Design the unit of randomization so that it encompasses the spillovers. If we expect spillovers to stay within schools, randomizing at the level of the school allows estimation of the overall effect.
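As a minimal sketch (hypothetical school names), school-level randomization gives every pupil the assignment of their school, so within-school spillovers stay inside treated clusters:

```python
# Cluster (school-level) randomization: assign whole schools, not pupils,
# so within-school spillovers stay inside treated units. IDs are hypothetical.
import random

schools = ["school_A", "school_B", "school_C", "school_D", "school_E", "school_F"]

rng = random.Random(42)
treated_schools = set(rng.sample(schools, k=len(schools) // 2))

# Every pupil inherits the assignment of their school.
pupils = [("pupil_1", "school_A"), ("pupil_2", "school_A"), ("pupil_3", "school_D")]
for pupil, school in pupils:
    arm = "treatment" if school in treated_schools else "control"
    print(f"{pupil} ({school}): {arm}")
```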

Partial Compliance and Sample Selection Bias

Non-Compliers
What can you do? Can you switch them? No!
[Diagram: within the Treatment Group, Participants and No-Shows; within the Control Group, Non-Participants and Cross-Overs]
Sometimes you intend for there to be complete coverage in your treatment group and none in your control, but people don't always comply with their original assignment.

Non-Compliers
What can you do? Can you drop them? No!
[Diagram repeated: No-Shows in the treatment group, Cross-Overs in the control group]

Non-Compliers
You can compare the original groups.
[Diagram repeated]

Sample Selection Bias
Sample selection bias can arise if factors other than random assignment influence program allocation. Even if the intended allocation of the program was random, the actual allocation may not be.

Sample Selection Bias
Individuals assigned to the comparison group could attempt to move into the treatment group. School feeding program: parents could attempt to move their children from a comparison school to a treatment school.
Alternatively, individuals allocated to the treatment group may not receive the treatment. School feeding program: some students assigned to treatment schools bring and eat their own lunch anyway, or choose not to eat at all.

Intention to Treat & Treatment on Treated

ITT and ToT
Vaccination campaign in villages: some people in treatment villages are not treated; 78% of people assigned to receive treatment received some treatment.
What do you do? Compare the beneficiaries and non-beneficiaries? Why not?

Intention to Treat (ITT)
What does "intention to treat" measure? "What happened to the average child who is in a treated school in this population?"
Is this difference the causal effect of the intervention?

Intention to Treat: A Worked Example
In School 1, every pupil is intended to be treated ("Intention to Treat? yes"), but only some were actually treated (e.g. Pupil 1 was; Pupil 4 was not). In School 2, no pupil is intended to be treated. The observed change in weight is recorded for every pupil.
A = average change among pupils in School 1 (the treated school) = 3
B = average change among pupils in School 2 (the not-treated school) = 0.9
ITT estimate: A - B = 3 - 0.9 = 2.1
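The ITT computation itself is just a difference in means by original assignment. A minimal sketch follows; the pupil-level changes are invented so that the school averages match the slide's 3 and 0.9:

```python
# ITT: compare average outcomes by original assignment, ignoring actual
# take-up. Pupil-level numbers are invented; the averages (3 and 0.9)
# match the slide's example.
import numpy as np

change_school1 = np.array([4, 4, 4, 0, 4, 2, 4, 6, 0, 2])  # assigned to treatment
change_school2 = np.array([1, 1, 1, 0, 1, 1, 1, 2, 0, 1])  # assigned to control

itt = change_school1.mean() - change_school2.mean()
print(f"ITT estimate: {itt:.1f}")  # 3.0 - 0.9 = 2.1
```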

From ITT to the Effect of the Treatment on the Treated (TOT)
If there is leakage across the groups, the difference between those originally assigned to treatment and those originally assigned to control shrinks; but the difference in the probability of getting treated shrinks as well.
Formally, we recover the effect on those induced into treatment by "instrumenting" treatment status with the original assignment.

Estimating TOT
What values do we need?
- Y(T): average outcome in the group assigned to treatment
- Y(C): average outcome in the group assigned to control
- Prob[treated|T]: probability of being treated given assignment to treatment
- Prob[treated|C]: probability of being treated given assignment to control

Treatment on the Treated (TOT)

B = [E(Y_i | z_i = 1) - E(Y_i | z_i = 0)] / [E(s_i | z_i = 1) - E(s_i | z_i = 0)]
  = (Y(T) - Y(C)) / (Prob[treated|T] - Prob[treated|C])

where z_i is the original random assignment and s_i indicates actually receiving treatment.

TOT Estimator: A Worked Example
A = gain if treated; B = gain if not treated. The TOT estimator is A - B, obtained as:
A - B = (Y(T) - Y(C)) / (Prob(Treated|T) - Prob(Treated|C))
Plugging in the numbers from the worked example: Y(T) = 3, Y(C) = 0.9, Prob(Treated|T) = 60%, Prob(Treated|C) = 20%, so
A - B = (3 - 0.9) / (0.6 - 0.2) = 2.1 / 0.4 = 5.25
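The same arithmetic as a short sketch of the Wald (instrumental-variables) estimator, scaling the ITT effect up by the difference in take-up rates:

```python
# TOT via the Wald estimator: the ITT effect divided by the first stage
# (the difference in treatment probabilities across assignment groups).
# Numbers come from the worked example above.
def tot_wald(y_t, y_c, p_treated_t, p_treated_c):
    """ITT divided by the difference in take-up between assignment groups."""
    itt = y_t - y_c
    first_stage = p_treated_t - p_treated_c
    return itt / first_stage

tot = tot_wald(y_t=3.0, y_c=0.9, p_treated_t=0.6, p_treated_c=0.2)
print(f"TOT estimate: {tot:.2f}")  # 2.1 / 0.4 = 5.25
```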

ITT vs. TOT
If obtaining the estimate is easy, why not always use TOT? TOT estimates a Local Average Treatment Effect (LATE).
ITT may be the policy-relevant parameter of interest. For example, we may not be interested in the medical effect of deworming treatment, but in what would happen under an actual deworming program. If students often miss school and therefore don't get the deworming medicine, the intention-to-treat estimate may actually be the most relevant.

Choice of outcomes

Multiple Outcomes
How do we decide which outcomes to focus on? Only outcomes that are statistically significantly different? The more outcomes you look at, the higher the chance that you find at least one significantly affected by the program purely by chance.
How do you account for this?
- Pre-specify outcomes of interest (J-PAL / AEA web registry)
- Report results on all measured outcomes, even null results
- Correct statistical tests (e.g. Bonferroni)
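A minimal sketch of the Bonferroni correction (the p-values are hypothetical): with m outcomes tested at overall level alpha, hold each individual test to alpha/m.

```python
# Bonferroni correction: testing m outcomes at overall level alpha means
# each individual test is held to alpha / m. P-values are hypothetical.
alpha = 0.05
p_values = {"enrollment": 0.004, "weight": 0.03, "test_scores": 0.20}

threshold = alpha / len(p_values)  # 0.05 / 3 ~ 0.0167
for outcome, p in p_values.items():
    verdict = "significant" if p < threshold else "not significant"
    print(f"{outcome}: p = {p:.3f} -> {verdict} at corrected level {threshold:.4f}")
```

Note that "weight" would pass an uncorrected 0.05 test but fails the corrected one, which is exactly the multiple-comparisons point the slide makes.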

External validity

Threats to External Validity
- Behavioral responses to evaluations
- Generalizability of results

Threat to External Validity: Behavioral Responses to Evaluations
One limitation of evaluations is that the evaluation itself may cause the treatment or comparison group to change its behavior.
- Treatment group behavior changes: Hawthorne effect
- Comparison group behavior changes: John Henry effect
Minimize the salience of the evaluation as much as possible, and consider including controls who are measured at endline only.

Generalizability of Results
Generalizability depends on three factors:
- Program implementation: can it be replicated at a large (national) scale?
- Study sample: is it representative?
- Sensitivity of results: would a similar, but slightly different, program have the same impact?

Further Resources
- Using Randomization in Development Economics Research: A Toolkit (Duflo, Glennerster, and Kremer)
- Mostly Harmless Econometrics (Angrist and Pischke)
- Identification and Estimation of Local Average Treatment Effects (Imbens and Angrist, Econometrica, 1994): LATE via random assignment, first stage, exclusion, and monotonicity