Published by Itzel Root. Modified over 9 years ago.
1
Review of Identifying Causal Effects
Methods of Economic Investigation, Lecture 13
2
Last Term
Classical OLS has 5 main assumptions
A1. Full Rank: X is a T × k matrix with rank $\rho(X) = k \le T$
A2. Linearity: $y = X\beta + \varepsilon$ where $E(\varepsilon) = 0$
A3. X is exogenous with respect to ε, i.e. $E(\varepsilon \mid X) = 0$. A somewhat weaker condition: X is uncorrelated with ε, so that $E(X'\varepsilon) = 0$
A4. Homoskedasticity: $E(\varepsilon\varepsilon') = \sigma^2 I_T$
A5. Normality: $\varepsilon \sim N(0, \sigma^2 I_T)$
3
Relaxing the assumptions
Even in finite samples, under these assumptions linear regression is BLUE
Last term you talked about some of the consequences of relaxing some of these assumptions
Can rely on large-sample properties to get around some of the problems (e.g. A5)
Can construct more robust estimates that are valid in large samples (e.g. A4)
4
This term
The biggest problem in estimation is A3
We call this the “Conditional Independence Assumption” or CIA
When it fails, our estimates become inconsistent, which means we cannot rely on large samples to fix the problem
There are lots of ways violations of A3 can happen
5
CIA is violated, now what?
We need to figure out what is generating the correlation between X and ε
Measurement error
Omitted variables
Selection on unobservables
Simultaneity of determination
Correlations across different periods
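To see why a violation of A3 is not a large-sample problem, a small simulation may help. This is an illustrative sketch only: the data-generating process, the coefficient values, and the helper names `ols_slope` and `simulate_omitted_variable` are assumptions made for this example, not part of the lecture. A variable z drives both x and y but is left out of the regression, and the bivariate slope settles near 1.5 instead of the true coefficient of 1, however large the sample.

```python
import random

def ols_slope(xs, ys):
    """Slope from a bivariate OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

def simulate_omitted_variable(n, seed=0):
    """Hypothetical DGP: y = 1*x + 1*z + noise, with x = z + noise and z omitted."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # omitted variable
        x = z + rng.gauss(0, 1)                  # regressor, correlated with z
        y = 1.0 * x + 1.0 * z + rng.gauss(0, 1)  # true coefficient on x is 1
        xs.append(x)
        ys.append(y)
    return ols_slope(xs, ys)

slope_small = simulate_omitted_variable(1_000)
slope_large = simulate_omitted_variable(50_000)

# Probability limit of the slope: 1 + Cov(x, z) / Var(x) = 1 + 1/2 = 1.5
print(round(slope_small, 2), round(slope_large, 2))
```

Under these assumptions, adding observations only tightens the estimate around the wrong number, which is what inconsistency means here.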
6
How do we figure out what’s wrong?
We put this in the context of a “program evaluation”
In truth, there may not be any “program” per se
Think of the variable of interest as a “treatment”
For simplicity, we now talk about the variable of interest as an indicator variable that can be zero or one
In practice it can be continuous
Then use derivatives rather than differences to calculate changes
7
The Imaginary Experimental Ideal
Pretend you could run an imaginary experiment
Not bounded by reality
Everything observable
How would you construct a test to isolate the effect of your variable of interest T on the outcome of interest Y?
8
Partitioning the World
Two groups of people in the world
People who got the treatment (so T=1)
People who did not get the treatment (so T=0)
To think of a counterfactual outcome we need to know what would have happened in the absence of the treatment
9
The Road Not Taken…
We imagine 2 states of the world: one where someone gets T and one where that same person does not
Now let’s define our usual notation
               Gets T       Doesn’t Get T
Individual A   $Y_{1A}$     $Y_{0A}$
10
Our Gold Standard
If we could observe both these states of the world we could know what would have happened in the absence of the treatment
The ONLY DIFFERENCE is the treatment, so we know any difference in the Y’s must be caused by the treatment
This is our Average Treatment Effect: $E[Y_{1A} - Y_{0A}]$
11
Back to the real world
Sadly, we do not observe the true counterfactual
What do we observe?
$Y_1$ for all the people in the treatment group (let them all be A’s)
$Y_0$ for all the people in the control group (let them all be B’s)
12
The Difference Estimate
What if we just take the difference between treatment and control?
That is, what if we did: $E[Y_1 \mid T=1] - E[Y_0 \mid T=0]$
NOTE: These are now CONDITIONAL expectations, because we do not observe the treated outcome for the control group or the untreated outcome for the treatment group
13
Decomposing the Difference Estimate
What if we rewrite our difference estimate so that:
$E[Y_{1A} \mid T=1] - E[Y_{0B} \mid T=0]$
$= E[Y_{1A} \mid T=1] - E[Y_{0A} \mid T=1] + \{E[Y_{0A} \mid T=1] - E[Y_{0B} \mid T=0]\}$
$= \text{TOT} + \text{Selection Bias}$
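The decomposition can be checked numerically in a setting where, unlike the real world, both potential outcomes are visible. The sketch below is a made-up example: the unobserved trait `ability`, the constant effect of 2, and the self-selection rule are all assumptions chosen for illustration. People with higher ability are more likely to take the treatment, so the naive difference overstates the effect by exactly the selection-bias term.

```python
import random
from statistics import mean

def simulate(n=20_000, seed=0):
    """Hypothetical population with both potential outcomes recorded."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ability = rng.gauss(0, 1)                       # unobserved trait
        y0 = 1.0 + ability + rng.gauss(0, 1)            # outcome without treatment
        y1 = y0 + 2.0                                   # outcome with treatment
        t = 1 if ability + rng.gauss(0, 1) > 0 else 0   # self-selection on ability
        rows.append((y0, y1, t))
    return rows

rows = simulate()
treated = [r for r in rows if r[2] == 1]
control = [r for r in rows if r[2] == 0]

# What we can observe: Y1 for the treated, Y0 for the controls
naive = mean(r[1] for r in treated) - mean(r[0] for r in control)

# What we would like to know (needs the unobservable Y0 of the treated)
tot = mean(r[1] - r[0] for r in treated)
selection_bias = mean(r[0] for r in treated) - mean(r[0] for r in control)

print(round(naive, 2), round(tot, 2), round(selection_bias, 2))
```

In this setup the treated would have had higher outcomes even without treatment, so `selection_bias` is positive and `naive` equals `tot + selection_bias`.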
14
ATE vs. TOT
Let’s look at the two definitions:
TOT: $E[Y_{1A} - Y_{0A} \mid T=1]$
ATE: $E[Y_{1A} - Y_{0A}]$
Why might these differ even if SB=0?
Heterogeneous Treatment Effects
This means there may be an idiosyncratic individual component to the treatment effect
Not the same as selection, more a function of the actual effect
15
ATE vs. TOT - 2
Suppose $Y_{1A} = \mu_1 + \xi_1$ and $Y_{0A} = \mu_0 + \xi_0$
Then we can rewrite
$\text{TOT} = E[Y_{1A} - Y_{0A} \mid T=1] = E[\mu_1 - \mu_0 \mid T=1] + E[\xi_1 - \xi_0 \mid T=1]$
$\text{ATE} = E[Y_{1A} - Y_{0A}] = E[\mu_1 - \mu_0] + E[\xi_1 - \xi_0]$
$= \{E[\mu_1 - \mu_0 \mid T=1] + E[\xi_1 - \xi_0 \mid T=1]\}\Pr(T=1) + \{E[\mu_1 - \mu_0 \mid T=0] + E[\xi_1 - \xi_0 \mid T=0]\}\Pr(T=0)$
So the ATE is a weighted average of the TOT and the treatment effect for the control group
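A short simulation can illustrate the weighted-average result. The numbers here are assumptions for illustration only: individual gains are 1 plus an idiosyncratic component, and people with larger gains are more likely to be treated. The TOT then exceeds the ATE, the effect on the untreated falls below it, and the weights Pr(T=1) and Pr(T=0) reconcile the three.

```python
import random
from statistics import mean

rng = random.Random(1)
people = []
for _ in range(20_000):
    xi = rng.gauss(0, 1)                       # idiosyncratic gain component
    gain = 1.0 + xi                            # individual effect Y1 - Y0
    t = 1 if xi + rng.gauss(0, 1) > 0 else 0   # larger gains -> more likely treated
    people.append((gain, t))

p_treated = mean(t for _, t in people)
tot = mean(g for g, t in people if t == 1)                  # effect on the treated
effect_on_untreated = mean(g for g, t in people if t == 0)  # effect on the controls
ate = mean(g for g, _ in people)

# ATE as a weighted average of the two group-specific effects
weighted = tot * p_treated + effect_on_untreated * (1 - p_treated)
print(round(tot, 2), round(effect_on_untreated, 2), round(ate, 2), round(weighted, 2))
```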
16
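Visual Representation of Difference
[Figure: potential outcomes $Y_{1A}$, $Y_{1B}$, $Y_{0A}$, $Y_{0B}$ for group A (Assigned Treatment, T=1) and group B (Not Assigned Treatment, T=0), with the comparisons labeled ATE and TOT]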
17
MY CARELESS NOTATION
In Exercise 2, ATE was defined as: $\text{ATE} = E(Y_{1i} - Y_{0i} \mid T_i = 1)$
This is really the ATET
ATET = ATE if $E(Y_{1i} - Y_{0i} \mid T_i = 1) = E(Y_{1i} - Y_{0i} \mid T_i = 0)$
This holds when there is no heterogeneity in treatment effects
In general, these are not the same
18
Why do we care about TOT?
In the case of an experiment, we can get an estimate of TOT (may not be ATE)
Why? We observe: $E[Y_{1A} \mid T=1] - E[Y_{0B} \mid T=0]$
This can be decomposed into two parts:
$E[Y_{1A} - Y_{0A} \mid T=1] + E[Y_{0A} \mid T=1] - E[Y_{0B} \mid T=0]$
If $E[Y_{0A} \mid T=1] = E[Y_{0B} \mid T=0]$ then the observed difference in outcomes is our estimate of TOT!
19
That’s why experiments are good!
If you have an experiment that is randomly assigned with no compliance issues, then we can estimate TOT
If there are compliance issues, then we estimate ITT
$E[Y_{1A} \mid T=1] - E[Y_{0B} \mid T=0]$
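The role of random assignment can be sketched with a simulation in which treatment is decided by a coin flip. As before, the trait `ability`, the effect size of 2, and the sample size are assumptions for illustration. Because assignment is unrelated to ability, the treated group's untreated outcomes look like the control group's, and the observed difference lands close to the TOT.

```python
import random
from statistics import mean

rng = random.Random(2)
treated_y1, treated_y0, control_y0 = [], [], []
for _ in range(20_000):
    ability = rng.gauss(0, 1)              # unobserved trait
    y0 = 1.0 + ability + rng.gauss(0, 1)   # outcome without treatment
    y1 = y0 + 2.0                          # outcome with treatment
    if rng.random() < 0.5:                 # coin-flip assignment, full compliance
        treated_y1.append(y1)
        treated_y0.append(y0)              # counterfactual, unobservable in practice
    else:
        control_y0.append(y0)

observed_diff = mean(treated_y1) - mean(control_y0)
selection_bias = mean(treated_y0) - mean(control_y0)
print(round(observed_diff, 2), round(selection_bias, 2))
```

With this design the selection-bias term is only sampling noise around zero, in contrast to a design where people choose treatment based on ability.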
20
TOT vs. ITT
ITT may not be the same as TOT (and thus, in the case of random assignment, not the same as ATE) because of compliance
DEVIATION FROM PREVIOUS NOTATION:
Before, we assumed that if T=1 then you were both assigned to and received treatment
Now we need two separate things: T, the assignment to treatment, and R, the receipt of treatment
21
Visual Representation of Difference
[Figure: group A (Assigned Treatment, T=1) and group B (Not Assigned Treatment, T=0), each split by receipt of treatment into R=1 and R=0]
Compare A (orange) to B (blue) = ITT
Compare R=1 (solid) to R=0 (striped) = TOT + SB
Compare A, R=1 (orange solid) to B, R=0 (blue striped) = LATE
22
Compliance Issues
In the case where R ≠ T, rewrite the observed difference in outcomes as
$E[Y_1 \mid R=1] - E[Y_0 \mid R=0]$
$= E[Y_A \mid R=1, T=1]\,\Pr[T=1 \mid R=1]$ (Treatment Group Compliers)
$+ E[Y_B \mid R=1, T=0]\,\Pr[T=0 \mid R=1]$ (Always Takers)
$- E[Y_B \mid R=0, T=0]\,\Pr[T=0 \mid R=0]$ (Control Group Compliers)
$- E[Y_A \mid R=0, T=1]\,\Pr[T=1 \mid R=0]$ (Never Takers)
23
Imagine non-compliance is symmetric
Rewrite with $\Pr[T=1 \mid R=0] = \Pr[T=0 \mid R=1] = p$
$E[Y_1 \mid R=1] - E[Y_0 \mid R=0]$
$= \{E[Y_A \mid R=1, T=1] - E[Y_B \mid R=0, T=0]\}(1-p) + \{E[Y_B \mid R=1, T=0] - E[Y_A \mid R=0, T=1]\}\,p$
$= (\text{TOT} + \text{SB})(1-p) + (\text{AT} - \text{NT})\,p$
If SB = 0 (no selection bias, i.e. among compliers, the counterfactual for the treatment group is the same) then this is a weighted average of the TOT and the AT/NT bias.
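The difference between comparing by assignment and comparing by receipt can be sketched as follows. Everything in this example is hypothetical: the unobserved trait `u`, the take-up rule, and the effect size of 2 are assumptions chosen so that some assigned people do not take the treatment and some unassigned people do. Comparing by assignment (T) gives the ITT, which is diluted below 2; comparing by receipt (R) picks up selection on `u` and overshoots.

```python
import random
from statistics import mean

rng = random.Random(3)
rows = []
for _ in range(20_000):
    u = rng.gauss(0, 1)                             # unobserved trait
    t = 1 if rng.random() < 0.5 else 0              # random assignment
    # Hypothetical take-up rule: assignment lowers the bar for receiving treatment
    r = 1 if u > (-1.0 if t == 1 else 1.0) else 0
    y = u + rng.gauss(0, 1) + 2.0 * r               # receipt raises the outcome by 2
    rows.append((t, r, y))

# Compare by assignment: intention to treat
itt = mean(y for t, _, y in rows if t == 1) - mean(y for t, _, y in rows if t == 0)

# Compare by receipt: effect plus selection on u
as_treated = mean(y for _, r, y in rows if r == 1) - mean(y for _, r, y in rows if r == 0)

print(round(itt, 2), round(as_treated, 2))
```

Under this take-up rule the ITT is about 2 times the gap in take-up rates between the two arms, while the by-receipt comparison is contaminated because receivers have higher `u` on average.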
24
Roadmap of the course so far:
[Diagram: Hypothetical counterfactual difference, branching into
  Experiment
    Perfect Compliance: TOT
    Imperfect Compliance: ITT
  Non-experimental
    Fixed Differences between Groups: Fixed Effect
    Groups with Parallel Trends: Difference-in-Differences
    Groups with similar characteristics: Matching Methods
  with the estimand labels TOT, TOT/ITT and TOT attached to the non-experimental methods]
25
What we’ve done so far…
Ways to define a ‘control group’
Fixed Effect
  Assume: Individuals within a group are, on average, the same
  Attribute any within-group difference to treatment
Difference-in-Differences
  Assume: Fixed differences over time
  Attribute any change in trend to treatment
Propensity Score Matching
  Assume: Treatment, conditional on observables, is as if randomly assigned
  Attribute any difference in outcomes to treatment
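As a reminder of the arithmetic behind difference-in-differences, here is a minimal worked example. The four group means are made-up numbers and `diff_in_diff` is a hypothetical helper, used only to show which differences are taken.

```python
# Hypothetical group means of the outcome, before and after the treatment date
mean_outcome = {
    ("treated", "pre"): 10.0,
    ("treated", "post"): 15.0,
    ("control", "pre"): 8.0,
    ("control", "post"): 10.0,
}

def diff_in_diff(m):
    """Change in the treated group minus change in the control group."""
    change_treated = m[("treated", "post")] - m[("treated", "pre")]
    change_control = m[("control", "post")] - m[("control", "pre")]
    return change_treated - change_control

did = diff_in_diff(mean_outcome)
print(did)  # 3.0
```

The fixed gap of 2 between the groups in the pre-period drops out; only the extra change in the treated group (5 versus 2) is attributed to treatment.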
26
Next time…
Instrumental Variables
What are they?
What about LATE?
How to estimate