Analysis of Experimental Data II Christoph Engel.

1 Analysis of Experimental Data II Christoph Engel

2 linear model: I. treatment effect; II. continuous explanatory variable; III. heteroskedasticity; IV. control variables; V. interaction effects; VI. outliers; VII. endogeneity; VIII. small and big problems

3 I. treatment effect. pro: (usually) more statistical power; greater flexibility (control variables, heteroskedasticity, instrumental variables, time series and panel models, non-linear functional form); automatic estimate of effect size and (in principle) marginal effect. contra: more assumptions

4 data generation
    set obs 1000
    gen uid = _n
    gen error = rnormal()
    gen treat = (uid > 500)
    gen dv = 5 + 2*treat + error

5 data

6 non-parametric

7 parametric

8 ttest: hardly ever used with experimental data; no effect size; assumes normality
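The non-parametric and parametric comparisons on slides 6 to 8 could be run roughly as follows. This is a minimal sketch on the data from slide 4; the seed is an assumption added only for reproducibility, and the deck does not say which non-parametric test was used (the rank-sum test is one common choice).

```stata
* Sketch: compare dv across treatment and baseline (run as a do-file)
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

* non-parametric: Wilcoxon rank-sum (Mann-Whitney) test
ranksum dv, by(treat)

* parametric: two-sample t test
ttest dv, by(treat)
```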

9 (linear) regression

10 reference category: baseline, mean 5.045; treatment: cons + 1.947 = 6.992
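The reading of the constant as the baseline mean, and of cons plus the treatment coefficient as the treatment mean, can be checked with a short sketch. It assumes the data generating process from slide 4; the seed is an added assumption, so the numbers will differ from those on the slide.

```stata
* Sketch: regression with a treatment dummy reproduces the group means
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

regress dv treat

* mean in the treatment group = cons + coefficient of treat
lincom _cons + treat

* cross-check against the raw group means
tabstat dv, by(treat) statistics(mean)
```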

11 (linear) regression reliability of estimates

12 (linear) regression explained variance

13 regression model: explanandum: depvar(i); explanans: indepvars(i); explanation: cons, coef

14 regression model

15 fundamental assumption: error is uncorrelated with explanatory variables; graphical way of testing: residuals and predicted values should be orthogonal

16 plot
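A plot of this kind could be produced as sketched below, assuming the treatment data from slide 4 (the seed and the variable name `res` are assumptions). Since OLS residuals are uncorrelated with the fitted values by construction, the plot is mostly useful for spotting patterns such as curvature or fanning.

```stata
* Sketch: residuals against predicted values after the treatment regression
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + error

regress dv treat
predict preddv, xb
* res is a hypothetical variable name for the residuals
predict res, residuals

* manual residual-versus-fitted plot
scatter res preddv, yline(0)

* built-in equivalent after regress
rvfplot, yline(0)
```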

17 II. continuous iv

18 data generating process: dv = 5 + 0.5*level + error

19 regression

20 interpretation: in a linear model, coef = marginal effect (take the first derivative with respect to level); prediction: a one-unit increase of level leads to a 0.495 increase of dv
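The step from coefficient to marginal effect can be written out. The symbols \beta_0, \beta_1 and \varepsilon are notation introduced here for cons, the coefficient of level and the error; they do not appear on the slides.

```latex
\[
dv_i = \beta_0 + \beta_1\,\mathit{level}_i + \varepsilon_i
\quad\Rightarrow\quad
\frac{\partial\, E[\,dv_i \mid \mathit{level}_i\,]}{\partial\, \mathit{level}_i} = \beta_1,
\qquad \hat{\beta}_1 = 0.495
\]
```

Because the derivative does not depend on level, the marginal effect is the same at every value of level, which is what makes the coefficient directly interpretable in the linear model.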

21 orthogonality of error

22 prediction
    reg dv level
    predict preddv
    twoway (scatter dv level) (scatter preddv level, connect(L))

23 regression

24 significance: intuitive criterion; H0: regressor has no explanatory power (= coefficient is zero); is 0 within the confidence interval?

25 how to construct? mean (point estimate) -/+ 1.96*SE; SE = sqrt(diagonal entry in the variance-covariance matrix); not very intuitive
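The construction can be traced by hand. This is a minimal sketch: the slides do not show how `level` was generated, so the uniform integer 1 to 10 used here is an assumption, as are the seed and the scalar names. Note that `regress` itself reports intervals based on the t distribution, so its bounds differ very slightly from the 1.96 rule.

```stata
* Sketch: confidence interval for the coefficient of level, built by hand
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: level is a uniform integer from 1 to 10 (not given in the deck)
gen level = ceil(10*runiform())
gen error = rnormal()
gen dv = 5 + 0.5*level + error

regress dv level

* variance-covariance matrix of the estimates; row/column 1 is level
matrix V = e(V)
matrix list V
scalar se_level = sqrt(V[1,1])

* hypothetical scalar names for the interval bounds
scalar ci_lo = _b[level] - 1.96*se_level
scalar ci_hi = _b[level] + 1.96*se_level
display "95% CI for level: [" ci_lo ", " ci_hi "]"
```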

26 intuitive approximation: assuming the error is orthogonal and has mean 0

27 graph

28 what goes wrong? 6.3% below 0; the procedure attributes the entire unexplained variance to the level regressor

29 III. heteroskedasticity: dv = 5 + 0.5*level + 0.1*level*error

30 estimation

31 problem: probably even bias / inconsistency; at any rate standard errors wrong (SE of level underestimated, SE of cons overestimated); solution: (heteroskedasticity) robust standard errors
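The remedy named on this slide could look as follows in Stata. This is a sketch using the data generating process from slide 29; the distribution of `level` and the seed are assumptions, and the Breusch-Pagan test is added here only as one possible diagnostic.

```stata
* Sketch: conventional versus heteroskedasticity-robust standard errors
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: level is a uniform integer from 1 to 10 (not given in the deck)
gen level = ceil(10*runiform())
gen error = rnormal()
gen dv = 5 + 0.5*level + 0.1*level*error

* conventional standard errors
regress dv level

* Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
estat hettest

* same coefficients, robust standard errors
regress dv level, vce(robust)
```

The coefficients are identical in both regressions; only the standard errors, and hence the confidence intervals and p-values, change.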

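32 technically
\[
\begin{pmatrix}
\sigma & 0 & 0 & 0 & 0 \\
0 & \sigma & 0 & 0 & 0 \\
0 & 0 & \sigma & 0 & 0 \\
0 & 0 & 0 & \sigma & 0 \\
0 & 0 & 0 & 0 & \sigma
\end{pmatrix}
\]
assuming homoskedasticity: all obs are iid; variance / sd / se the same all over (and all covariance terms are 0)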

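33 by contrast
\[
\begin{pmatrix}
\sigma & 0 & 0 & 0 & 0 \\
0 & \sigma & 0 & 0 & 0 \\
0 & 0 & \sigma & 0 & 0 \\
0 & 0 & 0 & \sigma & 0 \\
0 & 0 & 0 & 0 & \sigma
\end{pmatrix}
\quad\text{versus}\quad
\begin{pmatrix}
\sigma_1 & 0 & 0 & 0 & 0 \\
0 & \sigma_2 & 0 & 0 & 0 \\
0 & 0 & \sigma_3 & 0 & 0 \\
0 & 0 & 0 & \sigma_4 & 0 \\
0 & 0 & 0 & 0 & \sigma_5
\end{pmatrix}
\]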

34 IV. control variables

35 data generating process: two dimensional; orthogonal; rare in experimental data; but correlation of indepvars is no problem if not very pronounced (otherwise: multicollinearity); dv = 5 + 2*treat + 0.5*level + error

36 omitted variables: if orthogonal, no problem with consistency, but SE are wrong and cons is wrong

37 prediction

38 same with collinearity: data generating process as before, but replace treat = treat + 0.1*level

39 consistency affected
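The contrast between slides 36 and 39 can be reproduced with a short simulation. This is a sketch under assumptions: the distribution of `level`, the seed, and the variables `treat2` and `dv2` are introduced here, and it assumes that the modified treat enters the data generating process, which the slide does not state explicitly.

```stata
* Sketch: omitting a control that is orthogonal versus correlated with treat
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: level is a uniform integer from 1 to 10, independent of treat
gen level = ceil(10*runiform())
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + 0.5*level + error

* full model, then level omitted: the coefficient of treat stays near 2,
* but cons absorbs the mean effect of level and the SE change
regress dv treat level
regress dv treat

* collinear case (hypothetical names treat2, dv2)
gen treat2 = treat + 0.1*level
gen dv2 = 5 + 2*treat2 + 0.5*level + error

* with level omitted, the coefficient of treat2 also picks up part of level
regress dv2 treat2 level
regress dv2 treat2
```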

40 V. interaction effects: data generating process dv = 5 + 2*treat + 0.5*level - 0.25*treat*level + error

41 regression

42 prediction

43 testing net effect: is something relevant happening in the treatment at the beginning?

44 testing treatment effect at various levels: is there a treatment effect at the beginning? is there one at the end?

45 everywhere?
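Tests of the treatment effect at particular levels, and at every level, could be sketched as follows. The distribution of `level` (integers 1 to 10) and the seed are assumptions; under the data generating process on slide 40 the true treatment effect is 2 - 0.25*level, so it shrinks as level rises.

```stata
* Sketch: treatment effect at different values of level
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: level is a uniform integer from 1 to 10 (not given in the deck)
gen level = ceil(10*runiform())
gen error = rnormal()
gen treat = (uid > 500)
gen dv = 5 + 2*treat + 0.5*level - 0.25*treat*level + error

* main effects plus interaction via factor-variable notation
regress dv i.treat##c.level

* treatment effect at the beginning (level = 1) and at the end (level = 10)
lincom 1.treat + 1*1.treat#c.level
lincom 1.treat + 10*1.treat#c.level

* treatment effect at every level from 1 to 10
margins, dydx(treat) at(level=(1(1)10))
```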

46 VI. outliers: data generating process dv = 5 + 0.5*level + error; replace dv = 1000 if uid > 995

47 heavy problem

49 what to do? think of endgame effect; proximate cause: highest level (last period); relatively good, but level insignificant

50 transform dv

51 best: 1/sqrt(dv); good for cons (5.214 after retransformation); very poor for level (5665.8468)

52 find reason / contingency

53 problem solved
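The deck does not show how the problem was solved. One possible reading of "find reason / contingency" is to capture the outliers by specification with an indicator variable, sketched below. The distribution of `level`, the seed, and the variable name `outlier` are assumptions.

```stata
* Sketch: capturing known outliers with a dummy (one possible reading)
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: level is a uniform integer from 1 to 10 (not given in the deck)
gen level = ceil(10*runiform())
gen error = rnormal()
gen dv = 5 + 0.5*level + error
replace dv = 1000 if uid > 995

* naive regression: the five outliers dominate the estimates
regress dv level

* hypothetical indicator for the contingency that produces the outliers
gen outlier = (uid > 995)
regress dv level outlier
```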

54 VII. endogeneity: immaterial for treatment effect (randomization prevents); easily relevant when explaining treatment effect; data generating process: level = 2 + 0.5*trait + error; dv = 5 + 2*treat + 0.5*level + error

55 inconsistency

56 2sls
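The comparison between the inconsistent OLS estimate and 2SLS could be sketched as follows, using the data generating process from slide 54 with trait as the instrument for level. The distribution of `trait` and the seed are assumptions. Because the same error enters both equations, level is correlated with the error in the dv equation.

```stata
* Sketch: OLS versus two-stage least squares with trait as instrument
clear
* assumption: arbitrary seed, not from the slides
set seed 12345
set obs 1000
gen uid = _n
* assumption: trait is standard normal (not given in the deck)
gen trait = rnormal()
gen error = rnormal()
gen treat = (uid > 500)
gen level = 2 + 0.5*trait + error
gen dv = 5 + 2*treat + 0.5*level + error

* OLS: the coefficient of level is pushed away from 0.5
regress dv treat level

* 2SLS: level instrumented by trait, treat treated as exogenous
ivregress 2sls dv treat (level = trait)

* first-stage diagnostics for instrument strength
estat firststage
```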

57 VIII. small and big problems: heteroskedasticity (consistent; robust SE); non-normality of error term ((law of large numbers); alternative functional form); non-independence (dgp induced; match with statistical model)

58 (small and big problems): (omitted variables): decontextualisation; outliers: capture by specification, (transform dv); endogeneity: (randomization), (create) iv
