Analysis of Experimental Data III Christoph Engel.


1 Analysis of Experimental Data III Christoph Engel

2 independence problems
I. repeated measurement
II. cluster
III. (time series)
IV. panel
V. nested data

3 I. repeated measurement
- simplest case: once repeated
- each participant is observed
  - untreated
  - treated
- dgp: dv = 5 + 2*treat + error_uid + error_resid

4 safe solution
- dgp: dv = 5 + 2*treat + error_uid + error_resid
- invites an obvious solution for removing the individual-specific error
- interest is in the treatment effect, i.e. in individual reactions to the manipulation
- generate dv(post) - dv(pre)
- test whether it is significantly different from 0
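The differencing idea can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: both error terms are drawn as standard normal, and the function names `simulate_pre_post` and `one_sample_t` are hypothetical helpers.

```python
import math
import random
import statistics

def simulate_pre_post(n, seed=1):
    """Simulate dv = 5 + 2*treat + error_uid + error_resid, once untreated, once treated."""
    rng = random.Random(seed)
    rows = []
    for uid in range(n):
        error_uid = rng.gauss(0, 1)                     # individual-specific error (sd of 1 assumed)
        pre = 5 + 2 * 0 + error_uid + rng.gauss(0, 1)   # untreated observation
        post = 5 + 2 * 1 + error_uid + rng.gauss(0, 1)  # treated observation
        rows.append((uid, pre, post))
    return rows

def one_sample_t(values):
    """t statistic for the null hypothesis that the mean of values is 0."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))

rows = simulate_pre_post(50)
# error_uid enters pre and post identically, so it cancels in the difference
diffs = [post - pre for _, pre, post in rows]
print("mean difference:", round(statistics.mean(diffs), 3))
print("t statistic:", round(one_sample_t(diffs), 3))
```

Because the individual-specific error appears in both observations of a participant, the difference only contains the treatment effect and the two residual errors.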

5 first differences

6 non-parametric: works with ranks (as Mann-Whitney), but ranks the (first) differences
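Ranking the differences is, presumably, what the Wilcoxon signed-rank test does. The sketch below computes only the two rank sums, not a p-value. It drops zero differences, which is a simplifying assumption and not necessarily how a statistics package treats them; `signed_rank_sums` is a hypothetical helper name.

```python
def signed_rank_sums(diffs):
    """Return (W_plus, W_minus): rank sums of the positive and negative differences."""
    nonzero = [d for d in diffs if d != 0]  # simplification: zero differences are dropped
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    pos = 0
    while pos < len(order):
        end = pos
        # extend the run over ties in the absolute difference
        while (end + 1 < len(order)
               and abs(nonzero[order[end + 1]]) == abs(nonzero[order[pos]])):
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return w_plus, w_minus

print(signed_rank_sums([1.5, -0.5, 2.0, 3.0, -1.0]))
```

If the treatment had no effect, the two rank sums should be of similar size; a large imbalance speaks against the null hypothesis.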

7 parametric: assumes normality; mean ≈ effect size

8 regression of first differences
- correct, but complicated, and not very informative
- Gauss-Markov assumptions
  - independence
  - exogeneity: E(error | iv) = 0
  - no multicollinearity: iv matrix has full rank
  - (no heteroskedasticity)
- note: normality is not assumed!

9 alternative: more informative, but not more effective than the t-test; the t-value of treatment is exactly the same

10 technically the same as
- too conservative if we can safely assume that error_uid is random

11 more efficient
- in the concrete case: small gain
- coefficients totally unaffected
- (additional assumption should be tested)

12 II. cluster
- typical application: stranger design
  - interaction in matching groups might violate independence
- introductory example: one level of dependence only
- dgp: dv = 5 + .5*level + error_uid + error_resid
- intuitively: “experiment with x treatments”

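13 technically

robust:
\[
\begin{pmatrix}
\sigma_1 & 0 & 0 & 0 & 0 & 0 \\
0 & \sigma_2 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma_3 & 0 & 0 & 0 \\
0 & 0 & 0 & \sigma_4 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma_5 & 0 \\
0 & 0 & 0 & 0 & 0 & \sigma_6
\end{pmatrix}
\]

cluster:
\[
\begin{pmatrix}
\sigma_1 & \sigma_{11} & 0 & 0 & 0 & 0 \\
\sigma_{11} & \sigma_1 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma_2 & \sigma_{22} & 0 & 0 \\
0 & 0 & \sigma_{22} & \sigma_2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma_3 & \sigma_{33} \\
0 & 0 & 0 & 0 & \sigma_{33} & \sigma_3
\end{pmatrix}
\]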
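As a reference point for the robust and the cluster pattern, the cluster-robust variance estimator is usually written as a sandwich formula. The notation is an assumption and not taken from the slides: $X_g$ and $\hat{e}_g$ denote the regressors and OLS residuals of cluster $g$, $G$ is the number of clusters, and finite-sample corrections are omitted.

```latex
\[
\widehat{V}_{\text{cluster}}
= (X'X)^{-1}
\left( \sum_{g=1}^{G} X_g' \hat{e}_g \hat{e}_g' X_g \right)
(X'X)^{-1}
\]
```

If every observation is its own cluster, the middle term reduces to $\sum_i \hat{e}_i^2 x_i x_i'$, which corresponds to the heteroskedasticity-robust case with a diagonal error covariance matrix.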

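14 technically

random effects:
\[
\begin{pmatrix}
\sigma_u + \sigma_e & 0 & 0 & 0 \\
0 & \sigma_u + \sigma_e & 0 & 0 \\
0 & 0 & \sigma_u + \sigma_e & 0 \\
0 & 0 & 0 & \sigma_u + \sigma_e
\end{pmatrix}
\]

fixed effects:
\[
\begin{pmatrix}
\sigma & 0 & 0 & 0 \\
0 & \sigma & 0 & 0 \\
0 & 0 & \sigma & 0 \\
0 & 0 & 0 & \sigma
\end{pmatrix}
\]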

15 comparison of approaches
- cluster
  - most conservative
  - only assumes: covariance outside clusters = 0
- random / fixed effects
  - assume more structure
  - fixed effects: (implicitly) estimates an additional coefficient
  - random effects: estimates an additional error term
  - assume off-diagonal cov = 0

16 practically
- discuss assumptions
- random / fixed effects and cluster can be combined
- if random effects are justified
  - coefficients should not be affected (consistent)
  - but standard errors should be larger with clustering

17 III. time series
- very unusual for experiments
- rare application: evolution of the average behaviour of participants over time
- dgp
  - dv = 4 + error if t < 11
  - replace dv = 4 + .2*L10.dv + error if t > 10

18 graphical representation

19 estimation
- simple: OLS
  - y = cons + Lx.y + eps
- if the exact duration of the lag is unknown / not predicted from theory
  - one may use significance for selection
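The time series dgp and the lag regression can be sketched as follows. The error standard deviation of 1, the series length, and the helper names `simulate_series` and `ols_on_lag` are assumptions made for illustration; the regression is plain bivariate OLS of the series on its own lag.

```python
import random

def simulate_series(n_periods=200, lag=10, seed=1):
    """dv = 4 + error for t <= lag, dv = 4 + 0.2 * dv[t - lag] + error afterwards."""
    rng = random.Random(seed)
    dv = []
    for t in range(1, n_periods + 1):
        value = 4 + rng.gauss(0, 1)         # error sd of 1 is an assumption
        if t > lag:
            value += 0.2 * dv[t - 1 - lag]  # the list is 0-based, t is 1-based
        dv.append(value)
    return dv

def ols_on_lag(series, lag):
    """Regress y_t on a constant and y_{t-lag} (lag >= 1); return (cons, slope)."""
    y = series[lag:]
    x = series[:-lag]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

series = simulate_series()
cons, slope = ols_on_lag(series, 10)
print("cons:", round(cons, 3), "coefficient on lag 10:", round(slope, 3))
```

Running `ols_on_lag` for several candidate lags and comparing the coefficients is one informal way to explore which lag matters when theory does not predict it.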

20 lag selection

21 more sophisticated
- partial autocorrelation
  - autocorrelation, conditional on all earlier lags
  - significantly different from 0?
- pac dv, lags(number)
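One common way to make "conditional on all earlier lags" concrete is the regression definition of the partial autocorrelation. The symbols $c$, $\phi_{kj}$ and $\varepsilon_t$ are notation assumed for this sketch, not taken from the slides.

```latex
\[
dv_t = c + \phi_{k1}\, dv_{t-1} + \phi_{k2}\, dv_{t-2} + \dots + \phi_{kk}\, dv_{t-k} + \varepsilon_t
\]
```

The partial autocorrelation at lag $k$ is the last coefficient, $\phi_{kk}$: the association between $dv_t$ and $dv_{t-k}$ that remains once lags $1, \dots, k-1$ are controlled for.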

22 IV. panel
- very frequent
- all participants are tested repeatedly
- (for the moment: no strategic interaction)
- dgp: dv = 5 + 2*treat + .5*level + error_uid + error

23 estimation options
- pooled OLS: ignore dependence
- random effects: allow for within dependence, but assume that the individual-specific error is
  - random
  - independent from ivs
  - independent from residual error
- fixed effects: (implicitly) estimate a coefficient for each unit
- (cluster)

24 pooled OLS: coefficients do not seem biased, but standard errors are exaggerated

25 random effects

26 fixed effects

27 time-invariant regressors
- why do they drop out?
- model uses differencing for removing error_uid
- could be first differences
  - drawback: loss of 1 observation per participant
- alternative: demeaning
  - dv* = dv_t - mean(dv)
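Why demeaning removes time-invariant regressors can be checked on simulated data. The sketch assumes a panel dgp of the form dv = 5 + 2*treat + .5*level + error_uid + error with standard normal errors; the coding of `level` and `treat` and the helper names are hypothetical.

```python
import random

def simulate_panel(n_uid=20, n_periods=6, seed=1):
    """Simulate dv = 5 + 2*treat + 0.5*level + error_uid + error."""
    rng = random.Random(seed)
    rows = []
    for uid in range(n_uid):
        level = uid % 3                  # time-invariant regressor (assumed coding)
        error_uid = rng.gauss(0, 1)      # individual-specific error
        for t in range(n_periods):
            treat = t % 2                # varies within participant (assumed design)
            dv = 5 + 2 * treat + 0.5 * level + error_uid + rng.gauss(0, 1)
            rows.append({"uid": uid, "t": t, "treat": treat, "level": level, "dv": dv})
    return rows

def demean_within(rows, key):
    """Subtract each participant's own mean of `key` from every observation."""
    by_uid = {}
    for row in rows:
        by_uid.setdefault(row["uid"], []).append(row[key])
    means = {uid: sum(vals) / len(vals) for uid, vals in by_uid.items()}
    return [row[key] - means[row["uid"]] for row in rows]

rows = simulate_panel()
level_star = demean_within(rows, "level")
treat_star = demean_within(rows, "treat")
dv_star = demean_within(rows, "dv")

# level never changes within a participant, so its demeaned version is all zeros
print("level drops out:", all(v == 0 for v in level_star))

# within estimate of the treatment coefficient from the demeaned data
beta_treat = sum(x * y for x, y in zip(treat_star, dv_star)) / sum(x * x for x in treat_star)
print("within estimate for treat:", round(beta_treat, 3))
```

A regressor that is constant within a participant equals its own participant mean, so nothing is left after demeaning and no coefficient can be estimated for it.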

28 why not random effects?
- advantages
  - more efficient
  - time-invariant regressors are estimated
- but: additional assumption
  - individual-specific term is random
  - uncorrelated with residual error and ivs

29 test of this assumption
- straightforward
- if the assumption is valid, coefficients of time-variant regressors should not differ
- random
  - may differ per individual
  - but there must not be systematic differences
- shift in level is OK
  - constant may differ

30 Hausman test
- can be done by hand
  - store coef from one model
  - use a Wald test to see whether coef from the alternative model is significantly different
- but tedious with > 1 time-dependent variable
  - use the Stata procedure
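For orientation, the statistic behind the test is usually written as below. The notation is assumed here: $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are the coefficient vectors on the $k$ time-variant regressors from the fixed and random effects models, and $\widehat{V}_{FE}$, $\widehat{V}_{RE}$ are their estimated covariance matrices.

```latex
\[
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
\left[ \widehat{V}_{FE} - \widehat{V}_{RE} \right]^{-1}
(\hat{\beta}_{FE} - \hat{\beta}_{RE})
\;\sim\; \chi^2_k
\]
```

With a single time-variant regressor this collapses to $H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})^2 / (\widehat{se}_{FE}^2 - \widehat{se}_{RE}^2)$, which is why the one-variable case is easy to do by hand. The distribution holds asymptotically under the null hypothesis that the random effects assumption is valid.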

31 Hausman test
    xtreg dv treat level, fe    // fixed effects
    est sto fe
    xtreg dv treat level, re    // random effects (the xtreg default)
    est sto re
    hausman fe re

32 what if the Hausman test is significant?
- relatively frequent in experimental datasets
- mainly due to the interactive component
  - certain participants react in a systematically different way to the actions of others

33 example dgp

34 fe estimates; Hausman p = .0051

35 Hausman-Taylor

36 estimation
- single out ivs suspected to be endogenous
  - i.e. correlated with the random effect
  - (but uncorrelated with the residuals)
- check with a second Hausman test
  - if insignificant, the endogeneity problem is solved

37 second Hausman test: Baltagi, Bresson and Pirotte, Economics Letters 2003, 361

38 what does Hausman-Taylor do?
- remove endogeneity of time-dependent variables by mean differencing
- create consistent estimates of time-invariant regressors
- adjust standard errors
  - technically most difficult
  - GLS
  - (check literature)

39 iv step
- alternative interpretation of the fixed effects estimator
- all time-variant regressors are instrumented
- instrument: deviation from the individual-specific mean
  - correlated with the time-variant regressor
  - uncorrelated with the individual-specific error, since it has been removed by demeaning
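The claim that the demeaned regressor is uncorrelated with the individual-specific error can be written out in one line. The notation is assumed for this sketch: $x_{it}$ is a time-variant regressor, $\bar{x}_i$ its mean over the $T$ observations of participant $i$, and $u_i$ the individual-specific error.

```latex
\[
\sum_{t=1}^{T} (x_{it} - \bar{x}_i)\, u_i
= u_i \left( \sum_{t=1}^{T} x_{it} - T \bar{x}_i \right)
= u_i \cdot 0
= 0
\]
```

Because $u_i$ is constant within a participant, it factors out of the sum, and deviations from a mean always sum to zero. The instrument is therefore orthogonal to the individual-specific error by construction, whatever the correlation between $x_{it}$ and $u_i$.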

40 iv step
- fixed effects is safe, but radical
- all time-variant regressors are instrumented
  - even if only some are endogenous
- time-invariant regressors are removed
  - even if none of them is endogenous

41 iv step
- invites solution
- if only some time-variant regressors are endogenous
  - instrument only those
  - recover time-invariant regressors
- if also some time-invariant regressors are endogenous
  - (use exogenous instruments)
  - use mean deviation from the individual-specific mean of exogenous time-variant regressors as instrument

42 iv step
- use residuals from step 1
  - regression of the mean-differenced model
- create the mean residual for each uid as dv
- explain dv by time-invariant regressors, as instrumented by
  - exogenous time-invariant regressors: instrument themselves
  - within-subject mean of time-variant regressors: >= one per endogenous time-invariant regressor

43 practical matter
- Stata wants at least one time-variant exogenous variable
  - although strictly speaking this is only necessary if at least one time-invariant regressor is endogenous
- usually: use a time trend

44 V. nested data
- very frequent
- most economic experiments are interactive
  - partner design: group
  - stranger design: matching group
  - (if you have forgotten to define matching groups: entire sessions)

45 typical dgps
- 3 layers
  - choice
  - individual
  - group
- 4 layers
  - reaction to other group members’ choices
  - period
  - individual
  - group

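46 cluster

Two panels, individual cluster and group cluster, each with a block-diagonal error covariance matrix:
\[
\begin{pmatrix}
\sigma_1 & \sigma_{11} & 0 & 0 & 0 & 0 \\
\sigma_{11} & \sigma_1 & 0 & 0 & 0 & 0 \\
0 & 0 & \sigma_2 & \sigma_{22} & 0 & 0 \\
0 & 0 & \sigma_{22} & \sigma_2 & 0 & 0 \\
0 & 0 & 0 & 0 & \sigma_3 & \sigma_{33} \\
0 & 0 & 0 & 0 & \sigma_{33} & \sigma_3
\end{pmatrix}
\]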

47 cluster
- SE are not too small
- but are likely to be too big
- cluster ignores additional information about structure

48 mixed effects model
- y_git = X*beta + u_g + u_gi + e_git
- u_g captures group idiosyncrasies
- u_gi captures individual idiosyncrasies
  - conditional on group idiosyncrasies being controlled for
- e_git is residual error
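The dependence structure implied by this model can be spelled out. The derivation assumes, beyond what the slide states, that the three error components are mutually independent with variances $\sigma_g^2$, $\sigma_{gi}^2$ and $\sigma_e^2$; these symbols are introduced here for illustration.

```latex
\[
\operatorname{Cov}(y_{git},\, y_{g'i't'} \mid X) =
\begin{cases}
\sigma_g^2 + \sigma_{gi}^2 + \sigma_e^2 & g = g',\ i = i',\ t = t' \\
\sigma_g^2 + \sigma_{gi}^2 & g = g',\ i = i',\ t \neq t' \\
\sigma_g^2 & g = g',\ i \neq i' \\
0 & g \neq g'
\end{cases}
\]
```

Observations of the same individual share both random terms, observations of different individuals in the same group share only the group term, and observations from different groups share nothing. This is the additional structure that plain clustering leaves unused.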

49 estimation
    xtmixed dv treat level || group:, || uid:,

50 data structure
- defined by
    xtmixed dv treat level || group:, || uid:,
- could also involve random slopes
    xtmixed dv treat level || group: level || uid:,
- covariance structure can be changed
  - default: “independent”
  - assumes covariances across units to be zero

51 Hausman test
- same argument as before, and same test
- if both random effects are indeed random
  - coefficients on time-variant regressors should not significantly differ
  - compared with (one) fixed effect

52 Hausman test: necessary to tell Stata what to compare


