Published by Deborah Black. Modified over 8 years ago.
1
Analysis of Experimental Data II
Christoph Engel
2
linear model
    I. treatment effect
    II. continuous explanatory variable
    III. heteroskedasticity
    IV. control variables
    V. interaction effects
    VI. outliers
    VII. endogeneity
    VIII. small and big problems
3
I. treatment effect
pro
    (usually) more statistical power
    greater flexibility
        control variables
        heteroskedasticity
        instrumental variables
        time series and panel models
        non-linear functional form
    automatic estimate of effect size
    (in principle) marginal effect
contra
    more assumptions
4
data generation
    set obs 1000
    gen uid = _n
    gen error = rnormal()
    gen treat = (uid > 500)
    gen dv = 5 + 2*treat + error
5
data
6
non-parametric
7
parametric
8
ttest
    hardly ever used with experimental data
    no effect size
    assumes normality
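The non-parametric and parametric comparisons named on the preceding slides could be run on the generated data roughly as follows. This is a sketch, not the presenter's own do-file: the seed is an assumption added for reproducibility, and the choice of `ranksum` as the non-parametric test is a guess, since the slides do not name the test.

```stata
* sketch: regenerate the data, then compare the two groups
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

* non-parametric: Wilcoxon rank-sum (Mann-Whitney) test
ranksum dv, by(treat)

* parametric: two-sample t test
ttest dv, by(treat)
display "difference in means = " r(mu_2) - r(mu_1)
```

The difference in group means has to be computed by hand from the stored results here, whereas the regression on the next slides reports it directly as a coefficient.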
9
(linear) regression
10
reference category: baseline, mean 5.045
treatment: cons + 1.947 = 6.992
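A minimal sketch of how the two group means can be read off a regression on the treatment dummy. The seed is an assumption, so the numbers will not reproduce 5.045 and 1.947 exactly.

```stata
* sketch: treatment dummy regression and the implied group means
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

regress dv treat

* _cons estimates the baseline mean, _cons + treat the treatment mean
lincom _cons
lincom _cons + treat

* cross-check against the raw group means
tabstat dv, by(treat) statistics(mean)
```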
11
(linear) regression reliability of estimates
12
(linear) regression explained variance
13
regression model
    explanandum: depvar(i)
    explanans: indepvars(i)
    explanation: cons, coef
14
regression model
15
fundamental assumption
    error is uncorrelated with explanatory variables
graphical way of testing
    plot residuals against predicted values
    the two should be orthogonal
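The graphical check could look like the sketch below. The variable names `fitted` and `res` and the seed are assumptions. Note that OLS residuals are uncorrelated with the fitted values by construction, so the correlation is only a sanity check; the informative part is any visible pattern in the plot.

```stata
* sketch: residuals against predicted values
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

regress dv treat
predict fitted, xb              // predicted values
predict res, residuals          // residuals

scatter res fitted, yline(0)    // look for patterns around the zero line
correlate res fitted            // zero by construction after OLS
```

Stata's built-in `rvfplot` draws the same residual-versus-fitted plot in one step after `regress`.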
16
plot
17
II. continuous iv
18
data generating process
    dv = 5 + .5*level + error
19
regression
20
interpretation
    in a linear model coef = marginal effect
        take first derivative wrt level
    prediction
        one unit increase of level leads to a .495 increase of dv
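Written out, the derivative step is short. The symbols \beta_0, \beta_1 and \varepsilon_i are notation assumed here for cons, the coefficient on level and the error; the slides themselves only use the Stata names.

```latex
\[
dv_i = \beta_0 + \beta_1\,level_i + \varepsilon_i
\quad\Longrightarrow\quad
\frac{\partial\, dv_i}{\partial\, level_i} = \beta_1,
\qquad \hat\beta_1 = .495
\]
```

Because the derivative does not depend on level, the marginal effect is the same at every level. That holds only for a model that is linear in level.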
21
orthogonality of error
22
prediction
    * fit the model, store the fitted values, overlay them on the raw data
    regress dv level
    predict preddv
    twoway (scatter dv level) (scatter preddv level, connect(L))
23
regression
24
significance
    intuitive criterion
    H0: regressor has no explanatory power, i.e. coef = 0
    is 0 within the confidence interval?
25
how to construct?
    mean (point estimate) -/+ 1.96*SE
    SE = sqrt(diagonal entry in var-covar matrix)
    not very intuitive
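The construction can be traced by hand from the stored estimation results. In this sketch the distribution of `level` is an assumption (the slides do not show how it is drawn), as are the seed and the scalar names.

```stata
* sketch: 95% confidence interval from the variance-covariance matrix
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen error = rnormal()
gen level = 10*runiform()       // assumption: level drawn uniformly on (0,10)
gen dv = 5 + .5*level + error

regress dv level
matrix V = e(V)                 // variance-covariance matrix of the estimates
scalar se_level = sqrt(V[1,1])  // first diagonal entry belongs to level
scalar ci_lo = _b[level] - 1.96*se_level
scalar ci_hi = _b[level] + 1.96*se_level

display "coef = " _b[level] "   SE = " se_level
display "approx. 95% CI: [" ci_lo ", " ci_hi "]"
```

The interval printed by `regress` itself uses the t distribution rather than 1.96, so with 1000 observations it differs only in the third decimal or so.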
26
intuitive approximation
    assuming the error is
        orthogonal
        mean 0
27
graph
28
what goes wrong?
    6.3% below 0
    procedure attributes entire unexplained variance to level regressor
29
III. heteroskedasticity
    dv = 5 + .5*level + .1*level*error
30
estimation
31
problem
    coefficients still unbiased and consistent (error keeps mean 0 at every level)
    but standard errors wrong
        SE level underestimated
        SE cons overestimated
solution
    (heteroskedasticity) robust standard errors
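A sketch of the remedy on the slide's data generating process, comparing conventional and robust standard errors. The distribution of `level`, the seed and the scalar names are assumptions; the Breusch-Pagan test via `estat hettest` is an addition for illustration and is not mentioned in the slides.

```stata
* sketch: conventional versus heteroskedasticity-robust standard errors
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen error = rnormal()
gen level = 10*runiform()       // assumption: level drawn uniformly on (0,10)
gen dv = 5 + .5*level + .1*level*error

regress dv level
scalar se_ols = _se[level]
estat hettest                   // Breusch-Pagan / Cook-Weisberg test
scalar p_bp = r(p)

regress dv level, vce(robust)   // same coefficients, different SEs
scalar se_rob = _se[level]

display "SE(level): conventional = " se_ols "   robust = " se_rob
display "Breusch-Pagan p-value = " p_bp
```

Only the standard errors change between the two regressions; the coefficient estimates are identical.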
32
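technically
\[
\begin{pmatrix}
\sigma & 0 & 0 & 0 & 0 \\
0 & \sigma & 0 & 0 & 0 \\
0 & 0 & \sigma & 0 & 0 \\
0 & 0 & 0 & \sigma & 0 \\
0 & 0 & 0 & 0 & \sigma
\end{pmatrix}
\]
assuming homoskedasticity
    all obs are iid
    variance / sd / se the same all over
    (and all covariance terms are 0)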
33
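by contrast
homoskedastic:
\[
\begin{pmatrix}
\sigma & 0 & 0 & 0 & 0 \\
0 & \sigma & 0 & 0 & 0 \\
0 & 0 & \sigma & 0 & 0 \\
0 & 0 & 0 & \sigma & 0 \\
0 & 0 & 0 & 0 & \sigma
\end{pmatrix}
\]
heteroskedastic:
\[
\begin{pmatrix}
\sigma_1 & 0 & 0 & 0 & 0 \\
0 & \sigma_2 & 0 & 0 & 0 \\
0 & 0 & \sigma_3 & 0 & 0 \\
0 & 0 & 0 & \sigma_4 & 0 \\
0 & 0 & 0 & 0 & \sigma_5
\end{pmatrix}
\]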
34
IV. control variables
35
data generating process
    two dimensional
    orthogonal
    rare in experimental data
    but correlation of indepvars no problem if not very pronounced (multicollinearity)
    dv = 5 + 2*treat + .5*level + error
36
omitted variables
    if orthogonal
        no problem with consistency
        but SE are wrong
        but cons is wrong
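The three claims can be checked by estimating the model with and without the orthogonal control. This is a sketch: `level` is assumed to be drawn uniformly and independently of `treat`, and the seed and scalar names are likewise assumptions.

```stata
* sketch: omitting an orthogonal regressor
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen level = 10*runiform()       // assumption: independent of treat
gen dv = 5 + 2*treat + .5*level + error

regress dv treat                // level omitted
scalar b_short = _b[treat]
scalar se_short = _se[treat]
scalar cons_short = _b[_cons]

regress dv treat level          // full model
display "treat: short = " b_short "   full = " _b[treat]
display "SE:    short = " se_short "   full = " _se[treat]
display "cons:  short = " cons_short "   full = " _b[_cons]
```

In the short model the constant absorbs the mean contribution of `level` (about .5 times its mean), and the unexplained variance of `level` inflates the standard error of `treat`.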
37
prediction
38
same with collinearity
    data generating process as before, but
    replace treat = treat + .1*level
39
consistency affected
40
V. interaction effects
data generating process
    dv = 5 + 2*treat + .5*level - .25*treat*level + error
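One way to estimate this model is with Stata's factor-variable notation, sketched below. The distribution of `level` and the seed are assumptions, and the slides may well have used a hand-built interaction term instead.

```stata
* sketch: interaction of a treatment dummy with a continuous regressor
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen level = 10*runiform()       // assumption: level drawn uniformly on (0,10)
gen dv = 5 + 2*treat + .5*level - .25*treat*level + error

* ## includes both main effects and the interaction
regress dv i.treat##c.level

* slope of level in the baseline is _b[level];
* slope of level in the treatment is the sum of main effect and interaction
lincom level + 1.treat#c.level
```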
41
regression
42
prediction
43
testing net effect
    is something relevant happening in the treatment at the beginning?
44
testing treatment effect at various levels
    is there a treatment effect at the beginning?
    is there one in the end?
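These questions translate into tests of linear combinations of the coefficients. The sketch below assumes that "beginning" and "end" correspond to level 0 and level 10, which follows from the assumed uniform draw of `level` and not from the slides; the seed is an assumption too.

```stata
* sketch: treatment effect at chosen values of level
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen level = 10*runiform()       // assumption: level ranges over (0,10)
gen dv = 5 + 2*treat + .5*level - .25*treat*level + error

regress dv i.treat##c.level

* treatment effect over a grid of levels
margins, dydx(treat) at(level=(0 5 10))

* the same by hand: effect at level 0 and at level 10
lincom 1.treat
lincom 1.treat + 10*1.treat#c.level
```

Under this data generating process the true effect is 2 - .25*level, so it is positive at low levels, zero at level 8 and negative beyond.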
45
everywhere?
46
VI. outliers
data generating process
    dv = 5 + .5*level + error
    replace dv = 1000 if uid > 995
47
heavy problem
49
what to do?
    think of endgame effect
    proximate cause: highest level (last period)
    relatively good, but level insignificant
50
transform dv
51
best: 1/sqrt(dv)
    good for cons: 5.214 after retransformation
    very poor for level: 5665.8468
52
find reason / contingency
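If the extreme observations share an identifiable cause, one way to capture it is a dummy for the affected cases. The sketch below is a guess at what this slide demonstrates: the dummy `outlier` is hypothetical and simply reuses the rule that generated the outliers, and the distribution of `level` and the seed are assumptions.

```stata
* sketch: capturing outliers with a dummy for their known cause
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen level = 10*runiform()       // assumption: level drawn uniformly on (0,10)
gen dv = 5 + .5*level + error
replace dv = 1000 if uid > 995

regress dv level                // distorted by the five extreme observations
scalar b_raw = _b[level]

gen byte outlier = (uid > 995)  // hypothetical: assumes the cause is observed
regress dv level outlier
display "level coef: raw = " b_raw "   with dummy = " _b[level]
```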
53
problem solved
54
VII. endogeneity
    immaterial for treatment effect
        randomization prevents
    easily relevant when explaining treatment effect
data generating process
    level = 2 + .5*trait + error
    dv = 5 + 2*treat + .5*level + error
55
inconsistency
56
2sls
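A sketch of the comparison between OLS and two-stage least squares for the endogeneity example, with `trait` as the instrument for `level`. The same `error` enters both equations, as in the slide's data generating process, which is what makes `level` endogenous. The distribution of `trait` and the seed are assumptions.

```stata
* sketch: OLS versus 2SLS when level is endogenous
clear
set seed 12345                  // assumption: arbitrary seed, not in the slides
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen trait = rnormal()           // assumption: exogenous, standard normal
gen level = 2 + .5*trait + error
gen dv = 5 + 2*treat + .5*level + error

regress dv treat level          // level correlated with error: inconsistent
scalar b_ols = _b[level]

ivregress 2sls dv treat (level = trait)
estat firststage                // strength of the instrument
display "level coef: OLS = " b_ols "   2SLS = " _b[level]
```

The treatment dummy needs no instrument, since randomization already makes it orthogonal to the error.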
57
VIII. small and big problems
    heteroskedasticity
        consistent
        robust SE
    non-normality (of error term)
        (law of large numbers)
        alternative functional form
    non-independence
        dgp induced
        match with statistical model
58
(small and big problems)
    (omitted variables)
        decontextualisation
    outliers
        capture by specification
        (transform dv)
    endogeneity
        (randomization)
        (create) iv