Modelling non-independent random effects in multilevel models
William Browne, Harvey Goldstein
University of Bristol
A standard multilevel (VC) model
The model has a fixed part and a random part made up of a level 2 effect and a level 1 residual. The random effects are assumed independent. But this is sometimes unrealistic, for example in repeated measures growth models with closely spaced occasions, or for schools competing for resources in a ‘zero-sum’ environment.
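For reference, a variance components model of this kind is conventionally written as below. The symbols are the usual textbook notation and are an assumption here, not taken from the slide.

```latex
y_{ij} = \beta_0 + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2),
\\
\operatorname{cov}(u_j, u_{j'}) = 0 \ (j \neq j'), \qquad
\operatorname{cov}(e_{ij}, e_{i'j}) = 0 \ (i \neq i').
```

Here $\beta_0$ is the fixed part, $u_j$ the level 2 random effect and $e_{ij}$ the level 1 residual. The two zero-covariance statements are the independence assumptions that the rest of the talk relaxes.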
Repeated measures growth curves
Start from a simple model of linear growth with random slopes. A model for (non-independent) level 1 residuals can then be written so that their covariance depends on the time separating two occasions, leading to an exponential decay function (Goldstein and Healy 1994).
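One common way to write such a model is sketched below. The notation ($t_{ij}$ for the age of child $j$ at occasion $i$, $\alpha$ for the decay rate) is assumed for illustration and may differ from the form used in the cited work.

```latex
y_{ij} = \beta_0 + \beta_1 t_{ij} + u_{0j} + u_{1j} t_{ij} + e_{ij},
\\
\operatorname{cov}(e_{ij}, e_{i'j}) = \sigma_e^2 \exp\!\left(-\alpha\,|t_{ij} - t_{i'j}|\right),
\qquad \alpha > 0.
```

The covariance equals $\sigma_e^2$ when the two occasions coincide and decays towards zero as they move apart, so the independence model is recovered as $\alpha \to \infty$.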
Schools in competition
The effects of competing schools are correlated, where this correlation is inversely proportional to the (resource) distance between the schools. If we can specify a suitable distance function (or set of distance functions) then we can estimate the relevant parameters. One possibility is to use the extent of overlap between appropriately defined catchment areas. Work using the ALSPAC cohort is currently underway.
Other link functions These have the following forms Logit: Log:
Link functions
Figure: link function f(s). From left to right: hyperbolic, logit, log.
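The three link functions named above can be sketched in Python as follows. The function names are hypothetical and the formulas are the conventional definitions of each link, assumed here; the exact parameterisation on the slides may differ (in particular the sign convention for the log link).

```python
import math


def tanh_link(s: float) -> float:
    """Hyperbolic tangent: maps any real s to a correlation in (-1, 1)."""
    return math.tanh(s)


def logit_link(s: float) -> float:
    """Inverse logit: maps any real s to a correlation in (0, 1)."""
    # Two algebraically equal branches, chosen to avoid overflow in exp().
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    z = math.exp(s)
    return z / (1.0 + z)


def log_link(s: float) -> float:
    """Inverse log: exp(s), a valid correlation in (0, 1] only when s <= 0."""
    if s > 0:
        raise ValueError("log link needs s <= 0 so that exp(s) <= 1")
    return math.exp(s)
```

Only the hyperbolic tangent allows negative correlations; the logit and log forms restrict the correlation to be positive, which is why the choice of link depends on the application.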
Parameters and estimation
We need to estimate the parameters of the correlation function, the variances and the fixed effects. We propose an MCMC algorithm and have programmed this for general 2-level models where correlations can exist at either or both levels and responses can be normal or binary. Steps are a mixture of Gibbs and MH sampling, with adaptive proposal distributions and suitable diffuse priors.
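To illustrate the Metropolis-Hastings part only, the sketch below samples a single decay parameter of an exponential correlation function using a random-walk proposal whose scale is tuned during the first half of the run. It is not the authors' algorithm: the Gibbs steps for fixed effects and variances are omitted, the residuals are taken to have unit variance, and the prior bounds, the tuning rule and all function names are assumptions made for illustration.

```python
import numpy as np

ALPHA_MIN, ALPHA_MAX = 1e-3, 20.0  # hypothetical bounds of a uniform prior


def exp_corr(times, alpha):
    """Correlation exp(-alpha * |t_i - t_k|) between measurement occasions."""
    return np.exp(-alpha * np.abs(np.subtract.outer(times, times)))


def log_lik(alpha, resid, times):
    """Gaussian log-likelihood (up to a constant) of unit-variance residuals.

    resid has shape (n_units, n_occasions).
    """
    corr = exp_corr(times, alpha)
    _, logdet = np.linalg.slogdet(corr)
    quad = np.sum(resid.T * np.linalg.solve(corr, resid.T))
    return -0.5 * (resid.shape[0] * logdet + quad)


def sample_alpha(resid, times, n_iter=2000, seed=0):
    """Random-walk MH for alpha with a simple adaptive proposal scale."""
    rng = np.random.default_rng(seed)
    alpha, scale = 1.0, 0.5
    cur = log_lik(alpha, resid, times)
    draws = np.empty(n_iter)
    accepted = 0
    for it in range(n_iter):
        prop = alpha + scale * rng.normal()
        # Proposals outside the prior support are rejected outright.
        if ALPHA_MIN < prop < ALPHA_MAX:
            new = log_lik(prop, resid, times)
            if np.log(rng.uniform()) < new - cur:
                alpha, cur = prop, new
                accepted += 1
        draws[it] = alpha
        if (it + 1) % 100 == 0:
            # Tune the proposal scale during the first half of the run only.
            if it < n_iter // 2:
                rate = accepted / 100
                if rate > 0.5:
                    scale *= 1.2
                elif rate < 0.3:
                    scale *= 0.8
            accepted = 0
    return draws


def simulate(n_units, times, alpha, seed=1):
    """Draw residual vectors with the exponential correlation structure."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(exp_corr(times, alpha))
    return rng.normal(size=(n_units, len(times))) @ chol.T
```

A bounded uniform prior is used because the likelihood flattens out as the decay parameter grows (the correlation matrix tends to the identity), so an unbounded flat prior would not give a proper posterior.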
Example 1: Growth data
The data are 9 measurements on 20 boys around age 13, approximately 3 months apart. Fitting a 2-level model with random linear and quadratic coefficients does not remove residual autocorrelation among level 1 residuals. We model the correlation as a negative exponentially decreasing function of the time difference. We use a log (exponential) link since correlations should be positive. In discrete time (equal intervals) this is a standard first order autoregressive model. We fit a 4th-degree polynomial with and without a random linear coefficient.
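The link between the exponential decay model and a first order autoregressive model at equal intervals can be checked numerically. In this sketch the decay rate `alpha` is a hypothetical value, not an estimate from the growth data, and the function names are illustrative.

```python
import numpy as np


def exp_decay_corr(times, alpha):
    """Correlation exp(-alpha * |t_i - t_k|) for arbitrary measurement times."""
    return np.exp(-alpha * np.abs(np.subtract.outer(times, times)))


def ar1_corr(n, rho):
    """AR(1) correlation matrix: rho ** |i - k| for n equally spaced occasions."""
    idx = np.arange(n)
    return rho ** np.abs(np.subtract.outer(idx, idx))


times = 0.25 * np.arange(9)   # 9 occasions, 0.25 years apart
alpha = 1.2                   # hypothetical decay rate per year
rho = np.exp(-alpha * 0.25)   # implied lag-1 autocorrelation

# With equal spacing the two correlation matrices coincide.
assert np.allclose(exp_decay_corr(times, alpha), ar1_corr(9, rho))
```

The continuous-time form has the advantage that it also covers unequally spaced occasions, where no single lag-1 correlation exists.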
Results (models with random slope and with intercept only)
For model A the correlation between measurements 0.25 years apart is 0.73 and for model B is
An equivalence
For a 2-level variance components model, the full covariance matrix among the level 1 units in a level 2 unit (in this case with 4 level 1 units) has the sum of the level 2 and level 1 variances on the diagonal and the level 2 variance everywhere off the diagonal. For the model with an equal correlation structure at level 1 and no level 2 variation, the corresponding covariance matrix is equivalent: it has a common variance on the diagonal and a common covariance off the diagonal.
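Written out for 4 level 1 units, the equivalence looks as follows. The symbols $\sigma_u^2$, $\sigma_e^2$, $\sigma^2$ and $\rho$ are conventional notation assumed here for illustration.

```latex
V_{\mathrm{VC}} =
\begin{pmatrix}
\sigma_u^2 + \sigma_e^2 & \sigma_u^2 & \sigma_u^2 & \sigma_u^2 \\
\sigma_u^2 & \sigma_u^2 + \sigma_e^2 & \sigma_u^2 & \sigma_u^2 \\
\sigma_u^2 & \sigma_u^2 & \sigma_u^2 + \sigma_e^2 & \sigma_u^2 \\
\sigma_u^2 & \sigma_u^2 & \sigma_u^2 & \sigma_u^2 + \sigma_e^2
\end{pmatrix},
\qquad
V_{\mathrm{corr}} = \sigma^2
\begin{pmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}.
```

The two matrices are equal when $\sigma^2 = \sigma_u^2 + \sigma_e^2$ and $\rho = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2)$. Because $\sigma_u^2 \ge 0$, the variance components form can only represent $\rho \ge 0$, whereas the correlation form also admits negative values.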
A level 2 example: dependence based on distance apart
We have a three level model consisting of schools at level 3, cohorts or year groups at level 2 (2004, 2005, 2006) and students at level 1. The data are taken from the PLASC/NPD database; the response is GCSE score and the predictors include the KS test score at age 11. We fit this as a 2-level model (school cohorts at level 2) with a correlation structure between cohorts within schools and dummies for years, so that the correlation is modelled by a constant plus a decreasing function of the time difference.
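A correlation structure of this "constant plus decreasing function of time difference" kind could be built as in the sketch below. The hyperbolic tangent link, the choice of 1/d as the decreasing function, the parameter names `a0` and `a1` and their values are all assumptions made for illustration, not the fitted model.

```python
import numpy as np


def cohort_corr(years, a0, a1):
    """Correlation between cohorts: tanh(a0 + a1 / |year_j - year_k|).

    a0 is the constant and a1 / d a (hypothetical) decreasing function
    of the time difference d between two cohorts.
    """
    years = np.asarray(years, dtype=float)
    d = np.abs(np.subtract.outer(years, years))
    corr = np.eye(len(years))
    off = d > 0                      # off-diagonal cohort pairs
    corr[off] = np.tanh(a0 + a1 / d[off])
    return corr


# Hypothetical parameter values for the three cohorts.
corr = cohort_corr([2004, 2005, 2006], a0=0.5, a1=0.3)
```

Not every pair of parameter values yields a positive definite matrix, so a sampler using this form would need to reject proposals that produce an invalid correlation matrix.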
Results
A two level model for examination data with correlation structure at level 2. Inverse tanh link function. Burn-in = 500, sample = 5000. Year 1 (2004) chosen as base category. Uniform priors for variances. The second covariance matrix is the unrestricted estimate.
Columns: Parameter, Estimate, Standard error.
Rows: Intercept; Year; Year; Pretest; Level 1 variance; Level 2 covariance matrix; DIC (PD): (126.4).
Conclusions
These models provide a useful generalisation of standard ‘independence’ models and are readily extended to non-normal responses, cross classifications, etc. They allow us to describe more realistically the behaviour of institutions that interact rather than behave as independent units.