Anja Sautmann Director of Research, Education, and Training, J-PAL


1 Threats and Analysis
Anja Sautmann, Director of Research, Education, and Training, J-PAL

2 Course Overview
What is Evaluation?
Theory of Change
Why Randomize?
Measurement
How to Randomize?
Sampling and Sample Size
Threats and Analysis
Generalizability
Start to Finish
J-PAL | Threats

3 Introduction
The conception phase is important: it is where we design an evaluation that can answer the research questions.
But the implementation phase of the evaluation is also extremely important: many things can go wrong.

4 Objectives
To be able to identify the main threats to validity during the implementation phase of the evaluation
To define strategies to mitigate each of these threats
To learn a few methods that can be used during the analysis phase

5 Lecture Overview
Attrition
Experimental Spillovers & Behavioral Responses to Evaluations
Sample Selection Bias
Partial Compliance, Intention to Treat, and the Effect of Treatment on the Treated
Research Transparency


7 Attrition
Definition: Attrition is the reduction of the subject pool during the experiment. It can happen at any time between sample selection, randomization, and data collection: subjects refuse, move or leave, or cannot be found.
Is it a problem if some subjects vanish before you collect your data? Yes.
Not good: only some types of people vanish.
Worse: attrition differs by treatment status.
Why is it a problem? Representativeness. We lose the key properties of an RCT: a group representative of the population, and two identical population groups.
Why should we expect this to happen? Treatment may change incentives to participate in the survey.

8 Attrition Bias I: Selective Attrition
Consider an employment program that helps the unemployed find jobs.
Randomly assign job seekers to treatment or control and measure the effect on employment rates and income.
Suppose women are less likely to agree to be surveyed than men. Is it possible that we get the treatment-control difference wrong?
Yes. For example, suppose the program helps women more than men: then the effect on employment is understated.

9 Women agree less often to be surveyed than men
Suppose employed women earn lower wages than men:
We will overestimate the program effect on income
We will underestimate the program effect on income
We may overestimate or underestimate the impact
Not sure

10 Attrition Bias I: Selective Attrition
Definition: Attrition of a specific group, at the same rate in the treatment and control groups.
The T-C comparison is a valid estimate of the impact, but only for the population of respondents.
What about the population of non-respondents? The program impact on them could be very large, or zero, or negative!
External validity might be at risk.
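A small worked example may help make the understatement concrete. The numbers below are hypothetical and not taken from the lecture: half of job seekers are women, the program raises employment by 20 percentage points for women and 10 for men, and women answer the survey half as often as men, at the same rate in treatment and control.

```python
# Hypothetical illustration of selective attrition; every number is an assumption.
population_share = {"women": 0.5, "men": 0.5}
effect_pp = {"women": 20.0, "men": 10.0}      # effect on employment, percentage points
response_rate = {"women": 0.5, "men": 1.0}    # same in treatment and control


def weighted_effect(weights, effects):
    """Average the group effects using the given (unnormalized) group weights."""
    total = sum(weights.values())
    return sum(weights[g] * effects[g] for g in weights) / total


# Effect in the full population of job seekers.
true_effect = weighted_effect(population_share, effect_pp)

# Effect among survey respondents: women are under-represented.
respondent_share = {g: population_share[g] * response_rate[g] for g in population_share}
estimated_effect = weighted_effect(respondent_share, effect_pp)

print(f"Population effect: {true_effect:.1f} pp")
print(f"Respondent effect: {estimated_effect:.1f} pp")
```

In this sketch the respondent-only comparison is still a valid causal estimate for respondents, but it is lower than the population-wide effect because the group that benefits more is surveyed less often.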

11 Attrition Bias II: Differential Attrition
Consider a school feeding program that is designed to help get undernourished children to school: stunted children start going to school more if they receive food at school.
Assign schools to treatment and control and measure child growth (weight/height): visit all schools (treatment and control) and weigh everyone who is in school on a given day.
Is it possible that the treatment-control difference in weight is understated?

12 Weight in Treatment and Control Groups
[Figure not reproduced in the transcript]

13 Weight in Treatment and Control Groups
[Figure not reproduced in the transcript]

14 What if only children > 21 kg come to school?
We will underestimate the impact
We will overestimate the impact
Neither
Ambiguous
Not sure

15 What if only children > 21 kg come to school?
[Figure not reproduced in the transcript]
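The question can be explored with a small numerical sketch. Because the figures for these slides are not available in the transcript, the weights below are hypothetical: three children per group, a true gain of 2 kg from the program, and only children heavier than 21 kg present at school on the day of weighing.

```python
from statistics import mean

# Hypothetical weights in kg; not taken from the slides.
control_weights = [20, 25, 30]
true_gain = 2  # assumed true program effect in kg
treatment_weights = [w + true_gain for w in control_weights]  # 22, 27, 32


def observed(weights, threshold=21):
    """Keep only children heavy enough to be in school when weights are taken."""
    return [w for w in weights if w > threshold]


# Comparison if every child could be weighed.
true_difference = mean(treatment_weights) - mean(control_weights)

# Comparison using only the children who show up at school.
observed_difference = mean(observed(treatment_weights)) - mean(observed(control_weights))

print("Difference with everyone weighed:", true_difference)
print("Difference among children in school:", observed_difference)
```

Under these assumed numbers the lightest control child is missing from the data while the lightest treated child is present, so the measured difference falls below the true gain and can even change sign.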

16 Preventing Attrition Bias
Definition: Differential attrition means that participants leave the study at different rates, or for different reasons, in treatment and control.
To avoid it: track participants throughout the experiment; sample non-respondents and devote additional resources to reaching them.
If there is still attrition, check that it does not differ between treatment and control.
Threat to the key property of the RCT: we want to compare outcomes of two groups that differ only because one of them receives the program. Differential attrition can lead to bias.
Internal validity might be at risk.
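One way to carry out the check that attrition does not differ between arms is to compare attrition rates with a two-proportion z-test. The sketch below uses only the Python standard library; the sample sizes and counts are hypothetical, and the function names are illustrative rather than part of any J-PAL tool.

```python
from math import erf, sqrt


def two_proportion_z_test(lost_a, n_a, lost_b, n_b):
    """Two-sided z-test for equal attrition rates in two arms (pooled variance)."""
    p_a, p_b = lost_a / n_a, lost_b / n_b
    pooled = (lost_a + lost_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("No variation in attrition; the test is undefined.")
    z = (p_a - p_b) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: subjects at baseline and subjects lost by follow-up.
n_treat, lost_treat = 500, 40
n_control, lost_control = 500, 75

z_stat, p_value = two_proportion_z_test(lost_treat, n_treat, lost_control, n_control)
print(f"Attrition: treatment {lost_treat / n_treat:.1%}, control {lost_control / n_control:.1%}")
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```

Equal attrition rates are reassuring but not sufficient: the types of people who leave can still differ between arms, so it is also worth comparing baseline characteristics of attriters across treatment and control.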

17 Summary of Attrition
It can be a serious issue:
A threat to internal validity: the causal meaning of your parameter
A threat to external validity: even if the parameter has a causal meaning, it is not representative of the population

18 Lecture Overview
Attrition
Experimental Spillovers & Behavioral Responses to Evaluations
Sample Selection Bias
Partial Compliance, Intention to Treat, and the Effect of Treatment on the Treated
Research Transparency

19 Reminder from Lecture 4: Externalities
[Diagram labels: Total Population; Target Population; Not in evaluation; Evaluation Sample; Random Assignment; Treatment Group; Control Group]
Spillovers were already mentioned in “How to Randomize”.

20 Experimental Spillovers: Insufficient Separation of Treatment and Control
Externalities can create spillovers: treatment may affect control subjects due to the nature of the treatment.
Experimental spillovers: treatment may affect control subjects due to the experimental design.
The difference: experimental spillovers will not occur when everyone receives the program.
Strategy for externalities: measure them.
Strategy for experimental spillovers: avoid them.

21 Spillover I: Difficult (logistically or politically) for Service Providers
Service providers have trouble distinguishing between treatment and comparison (or customizing service), so services are provided to both.
Crossovers: the control group receives the intervention and no longer represents a pure counterfactual.
In medical trials, clinical researchers are so concerned about doctors’ ability to provide one randomly assigned treatment to one patient and a different randomly assigned treatment (or the status quo) to another that it is common practice to take discretion away from the doctor. They design “double-blind” trials, in which neither the patient nor the doctor knows which treatment the patient is getting: the treatment and control “pills” appear identical, and the doctor is not informed which pill is being given to which patient.
This can be difficult or impossible once we start experimenting with different procedures or processes. If we wanted to test the effectiveness of a new process and trained nurses on it, we could not ask them to “apply” that training to some patients and to “forget” or “unlearn” it for others.
I have an example of a project… (use “physician teams”)

22 Solution: Assign to Different Service Providers
Have different teams provide the different treatments
Randomly assign to those teams
I have an example of a project… (Physician teams?)

23 Solution: Randomize at a different unit
If service providers have trouble distinguishing between treatment and comparison (or customizing service):
Change the unit of random assignment
Have providers treat entire clusters the same (as with externalities)

24 Spillover II: Control group receives treatment
If treatment and control individuals know each other, the treatment group may share benefits with the control group.

25 Spillover III: Control group learns from treatment
The control group may change its behavior after seeing the treatment.
[Figure: health of treatment and control groups, from bad to good; true impact = 5, measured impact = 0]

26 Experimental spillovers will typically lead us to…
Underestimate the treatment effect
Overestimate the treatment effect
It depends
Not sure

27 Spillover IV: Behavioral Responses to Evaluations
Example: if treatment and control individuals know each other, the control group may get upset about not receiving treatment.
Treated individuals talk with friends (treatment and control); friends in the control group get upset with researchers or service providers.
Service providers may lose the support of the community.
Attrition: the control group withdraws its participation from the research.

28 Behavioral Responses to Evaluations
Treatment group changes its behavior: experimenter demand effect
Comparison group changes its behavior: resentment and demoralization effects, John Henry effect, anticipation effects
Both groups can be affected: survey effects, Hawthorne (observer) effect

29 Experimenter Demand Effect
Subtle cues from the researcher may inform the treatment group about the expected behavior or effect.
They may change their behavior in order to please the experimenter.

30 John Henry Effect
John Henry: a legendary American railway worker in the 1870s
Heard that his output was being compared to the output of a machine
Worked harder to outperform the machine (and died)

31 Hawthorne Effect
Experiments at Hawthorne Works, a Western Electric factory
Different experiments to increase workers’ productivity, including lighting studies
Productivity gains as a result of the attention paid to workers
When the experiment stops, gains disappear
Similar: survey effects. Simply surveying subjects may increase their attention or prompt reflection on an issue.
[Figure labels: productivity increases; productivity decreases]

32 How to limit evaluation-driven effects?
Use a different level of randomization
Minimize the salience of the evaluation as much as possible
Do not announce the phase-in (downside: announcing it can be useful to reduce attrition!)
Make sure all subjects give informed consent
Make sure staff are impartial and treat both groups similarly (e.g., blind data collection staff to treatment arm; do not inform surveyors of the exact purpose of the intervention)
Create a buffer

33 Solution: Create a Buffer
[Figure label: Not sampled]

34 Lecture Overview
Attrition
Experimental Spillovers & Behavioral Responses to Evaluations
Sample Selection Bias
Partial Compliance, Intention to Treat, and the Effect of Treatment on the Treated
Research Transparency

35 Sample Selection Bias
Sample selection bias could arise if factors other than random assignment influence program allocation:
Individuals assigned to the comparison group move into the treatment group
Individuals allocated to the treatment group do not receive treatment
This can be due to project implementers or to the participants themselves.

36 Non-Compliers
What can you do? Can you drop them?
[Diagram labels: Target Population; Not in evaluation; Evaluation Sample; Random Assignment; Treatment group; Control group; Participants; No-Shows; Non-Participants; Cross-overs]
[Repeat from Lecture 2]
[First click]
[Second click]: Through random assignment, we can ensure the study is internally valid.
[Third click]: Even though there will be some participants and some non-participants within our treatment group, at the first stage we restrict our analysis to comparing the entire treatment group with the entire control group. There is a saying: once a treatment group, always a treatment group.
[Fourth click]: Similarly, once a control group, always a control group.
[Fifth click]: Highlight no-shows and crossovers. This is called non-compliance. We will cover how to account for it later.

37 What can be done?
Ideally: prevent it during the design or implementation phase (cannot always be done)
Monitor it during the implementation phase (measurement!): important to be aware that it happens
Interpret it during the analysis phase: see next section

38 Lecture Overview
Attrition
Experimental Spillovers & Behavioral Responses to Evaluations
Sample Selection Bias
Partial Compliance, Intention to Treat, and the Effect of Treatment on the Treated
Research Transparency

39 A school feeding program
Let’s take the example of a school feeding program.
Some schools receive the program, some don’t (random allocation).
But allocation is imperfectly respected.
Also recall: this is an “entitlement” programme, where participation cannot be prevented or mandated (entitlement was discussed in “How to Randomize”).

40 Your treatment group for analysis is…
Individuals assigned to treatment who were actually treated
All individuals who were actually treated
Individuals assigned to treatment, regardless of whether or not they were treated
Not sure

41 Intention to Treat (ITT)
Easiest way to deal with partial compliance: calculate the Intent to Treat (ITT).
Definition: The difference between the average outcome of the group that was randomly assigned to treatment and that of the group that was randomly assigned to control, regardless of whether they actually received the treatment.
What does “intention to treat” measure? “What happened to the average child who is in a treated school in this population?”
Is this difference the causal effect of the intervention? It can be, if the same sample selection will also happen in the final program.

42 School 1: Treatment, School 2: Control
[Table: one row per pupil (Pupils 1-10) for School 1 (treatment) and School 2 (control), with columns “Intention to treat?”, “Treated”, and “Change in weight”. The individual cell values are not fully recoverable from the transcript.]
Note the incomplete compliance: some pupils always comply, some never comply; some truly comply (would not take treatment in control, but do take it in treatment).

43 NOT correct
[Same pupil-level table as slide 42]
Mean treated in school 1: 4.67
Mean not treated in school 2: 0.5
Difference: 4.17
You can see that those who got treated in the control school gained less weight on average. There will be differences between those who do and don’t comply. All the problems of causal inference reappear.

44 The Intent to Treat
[Same pupil-level table as slide 42]
Mean in school 1: 3.0
Mean in school 2: 1.0
Difference: 2.0

45 Treatment Probability
[Same pupil-level table as slide 42]
The Intent to Treat: mean in school 1: 3.0; mean in school 2: 1.0; difference: 2.0
Treatment probability: fraction treated in school 1: 0.6; fraction treated in school 2: 0.2; difference: 0.4

46 Effect of Treatment on the Treated (TOT)
ITT: “What is the average effect on a child in a treated school?”
But we may want to know the causal effect of the program itself: “What is the average effect on a treated child?” This is the Effect of Treatment on the Treated.
We can estimate this from the information above!

47 Effect of Treatment on the Treated (TOT)
Intuitive idea:
We saw above that assignment to treatment increased weight by 2 kg on average.
This came from a difference in take-up (receipt of treatment) of just 0.4, i.e. 40% of students.
If 40% of students receiving treatment causes an average weight increase of 2 kg, how much would weight increase if 100% of students received treatment?
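The arithmetic behind this intuition can be sketched in a few lines, using the summary numbers reported on the preceding slides (means of 3.0 and 1.0, take-up of 0.6 and 0.2, and the incorrect treated-versus-untreated comparison of 4.67 and 0.5). The function names are illustrative, and the scaling step assumes that being assigned to the treatment school affects weight only through actually receiving the food.

```python
def intent_to_treat(mean_assigned_treatment, mean_assigned_control):
    """ITT: difference in mean outcomes by assignment, ignoring actual take-up."""
    return mean_assigned_treatment - mean_assigned_control


def treatment_on_treated(itt, takeup_treatment, takeup_control):
    """Scale the ITT by the difference in take-up between the two arms."""
    takeup_difference = takeup_treatment - takeup_control
    if takeup_difference == 0:
        raise ValueError("Assignment did not change take-up; TOT is undefined.")
    return itt / takeup_difference


# Summary statistics from the school feeding example.
itt_estimate = intent_to_treat(3.0, 1.0)
tot_estimate = treatment_on_treated(itt_estimate, 0.6, 0.2)

# The incorrect comparison: treated pupils in school 1 vs. untreated in school 2.
naive_estimate = 4.67 - 0.5

print(f"ITT: {itt_estimate:.2f} kg")
print(f"TOT: {tot_estimate:.2f} kg")
print(f"Naive treated-vs-untreated difference: {naive_estimate:.2f} kg")
```

With these numbers the scaled estimate is about 5 kg, which answers the question posed on the slide; the naive comparison gives a different figure because it compares groups that selected themselves into or out of treatment.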

48 Effect of Treatment on the Treated (TOT)
In general, the treatment on the treated (TOT) is: [formula not reproduced in the transcript]
What does the TOT measure? The effect of the program on those individuals who choose to take it up because of the intervention.
Note: effects on those who didn’t take it up might have been quite different.
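A reconstruction consistent with the intuition on the previous slide, offered as an assumption rather than the author's exact notation, is the ratio of the ITT to the difference in take-up (often called the Wald estimator). Here \(\bar{Y}_T\) and \(\bar{Y}_C\) denote mean outcomes of those assigned to treatment and control, and the final step plugs in the numbers from the school feeding example.

```latex
\[
\mathrm{TOT}
  = \frac{\bar{Y}_{T} - \bar{Y}_{C}}
         {\Pr(\text{treated} \mid T) - \Pr(\text{treated} \mid C)}
  = \frac{\mathrm{ITT}}{\text{difference in take-up}}
  = \frac{3.0 - 1.0}{0.6 - 0.2}
  = 5.0
\]
```

This ratio has a causal interpretation only under additional assumptions, notably that assignment affects outcomes solely through take-up and that nobody does the opposite of their assignment.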

49 ITT vs. TOT
If obtaining the TOT is so easy, why not always use it?
ITT may be the policy-relevant parameter of interest if take-up is voluntary: the program can only change outcomes for those who choose to get treated
Non-compliance may not be well observed
The effect on non-compliers may be small or zero: for example, children who eat better food at their parents’ nearby house would not gain weight

50 ITT / TOT: Conclusions
Both ITT and TOT can provide valuable information to decision-makers
TOT gives the effect of the intervention on those individuals who comply with the programme
ITT gives the overall effect of the intervention, allowing for partial compliance (which occurs in many policies)

51 When comparing ITT and TOT:
The TOT is always greater in absolute terms than the ITT
The TOT is always smaller in absolute terms than the ITT
It depends
Not sure

52 Lecture Overview
Attrition
Experimental Spillovers & Behavioral Responses to Evaluations
Sample Selection Bias
Partial Compliance, Intention to Treat, and the Effect of Treatment on the Treated
Research Transparency

53 Another Form of Bias: “Reporting Bias”
Researchers, program managers, and funders would all like to see an impact. It is tempting to “search” for a positive result.
What might one do?
Check many different possible outcomes for a treatment effect
Change the regression specification by including different control variables (covariates)
Downplay null results

54 Testing Multiple Outcomes
The more outcomes you look at, the higher the chance that you find at least one that appears significantly affected by the program.
Recall: each test carries a 5% chance of a Type I error (a false positive)!
How can we avoid false positives from multiple tests?
Pre-specify outcomes of interest
Report results on all measured outcomes, even null results
Correct statistical significance levels for multiple tests (Bonferroni)
Group outcomes together and form indices
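To see how quickly false positives accumulate, and what the Bonferroni correction does about it, consider the minimal sketch below. It assumes the tests are independent, and the p-values are hypothetical rather than results from any study.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """Reject only hypotheses whose p-value is at most alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


for k in (1, 5, 10, 20):
    print(f"{k:>2} outcomes: P(at least one false positive) = {familywise_error_rate(0.05, k):.3f}")

# Hypothetical p-values for five outcomes of one program.
p_values = [0.004, 0.03, 0.20, 0.60, 0.045]
uncorrected = [p <= 0.05 for p in p_values]
corrected = bonferroni_reject(p_values)
print("Significant without correction:", uncorrected)
print("Significant with Bonferroni:  ", corrected)
```

Bonferroni is conservative, especially when outcomes are correlated, which is one reason grouping outcomes into an index can be an attractive alternative.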

55 Including Covariates
Why include covariates? They may explain variation and improve statistical power.
Why not include covariates? Appearance of “specification searching”.
What to control for? If stratified randomization: add strata fixed effects. Other covariates.
General guideline: report both “raw” differences and regression-adjusted results.

56 Committing by Writing a Pre-Analysis Plan
Particularly useful when there are many ways to measure the outcome or many different subgroups
But some drawbacks: What about unexpected outcomes? How to adapt to the main findings? How to know what the relevant covariates are?
We can do conditional PAPs… but they are costly and time-consuming
Up to each J-PAL affiliate to do or not to do a PAP
But: increasingly standard

57 The AEA RCT Registry

58 Conclusions
Internal validity (causal inference, reduction of selection bias) is the primary strength of randomized evaluations, so everything undermining it must be carefully considered.
The design phase and power calculations are important, but so is the ability to anticipate and avoid challenges during the implementation phase.
Take measures to limit and monitor attrition, spillover effects, and partial compliance.
Be aware of behavioral effects.
Know how to correct for problems in analysis.

59 Thank you!

