Multilevel modelling: general ideas and uses


1 Multilevel modelling: general ideas and uses
Kari Nissinen Finnish Institute for Educational Research

2 Hierarchical data
The data in question are organized in a hierarchical / multilevel manner: units at the lower level (1-5) are nested within higher-level units (A, B)
[Diagram: lower-level units 1-5 grouped under upper-level units A and B]

3 Hierarchical data
Examples:
Students within classes within schools
Employees within workplaces
Partners within couples
Residents within neighbourhoods
Nestlings within broods within populations…
Repeated measures within individuals

4 Hierarchical data
The key issue is clustering: lower-level units within the same upper-level unit tend to be more homogeneous than two arbitrary lower-level units
E.g. students within a class: positive intra-cluster correlation (ICC)
Repeated measures: autocorrelation (usually positive)

5 Hierarchical data
Clustering => lower-level units are not independent
In cross-sectional studies this is a problem:
Two correlated observations provide less information than two independent observations (partial ’overlap’)
Effective sample size is smaller than the nominal sample size => statistical inference is falsely powerful

6 Clustering in cross-sectional studies
Basic statistical methods do not recognize the dependence of observations
Standard errors (variances) are underestimated => confidence intervals too short, statistical tests too significant
Special methodology is needed for correct variances:
Design-based approaches (variance estimation in a cluster-sampling framework)
Model-based approaches: multilevel models

7 Clustering in cross-sectional studies
Measure of ’inference error’ due to clustering: the design effect (DEFF) = ratio of the correct variance to the underestimated variance (computed as if there were no clustering)
A function of the ratio of nominal to effective sample size and/or of the homogeneity within clusters (ICC)
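For equal-sized clusters the design effect has the standard closed form DEFF = 1 + (m − 1)·ICC, where m is the cluster size; dividing the nominal n by DEFF gives the effective sample size. A minimal sketch (the cluster size and ICC values below are made-up illustration numbers, not from the slides):

```python
# Design effect for equal-sized clusters: DEFF = 1 + (m - 1) * ICC,
# and effective sample size = nominal n / DEFF.

def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation factor for a mean under cluster sampling."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n: int, cluster_size: int, icc: float) -> float:
    """Nominal sample size deflated by the design effect."""
    return n / design_effect(cluster_size, icc)

# 1000 students sampled in classes of 25 with ICC = 0.20:
deff = design_effect(25, 0.20)                    # ≈ 5.8
n_eff = effective_sample_size(1000, 25, 0.20)     # ≈ 172, far below 1000
print(deff, n_eff)
```

Even a modest ICC of 0.20 shrinks the effective sample to roughly a sixth of its nominal size, which is why ignoring clustering makes tests look too significant.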

8 Hierarchical data
Hierarchy is a property of the population, which can carry over into the sample data
Cluster sampling: the hierarchy is explicitly present in data collection => the data possess exactly the same hierarchy (and possible clustering)
Simple random sampling (etc.): clustering may or may not appear in the data
It is present but hidden, and may be difficult to identify
Its effect may be negligible

9 Hierarchical data
Hierarchy does not always lead to clustering: units within a cluster can be uncorrelated
The other side of the coin is heterogeneity between upper-level units: if there is no heterogeneity between clusters, there is no homogeneity within them
Zero ICC => no need for special methodology
Clustering can affect some target variables, but not others

10 Longitudinal data
Clustering = measurements on the same individual are not independent
When analyzing change this is a benefit: each unit serves as its own ’control unit’ (’block design’) => ’true’ change
Autocorrelation ’carries’ this link from one time point to another
Appropriate methods utilize this correlation => powerful statistical inference

11 Mixed models
An approach for handling hierarchical / clustered / correlated data
Typically regression or ANOVA models containing effects of explanatory variables, which can be (i) fixed, (ii) random, or (iii) both
Linear mixed models: normal (Gaussian) error distribution
Generalized linear mixed models: binomial, Poisson, gamma, etc. error distribution

12 Mixed models Variance component models
Random coefficient regression models
Multilevel models
Hierarchical (generalized) linear models
All of these are special cases of mixed models, with similar estimation procedures (maximum likelihood & its variants), etc.

13 Fixed vs random effects
1-way ANOVA fixed effects model: Y(ij) = μ + α(i) + e(ij)
μ = fixed intercept, grand mean
α(i) = fixed effect of group i
e(ij) = random error (’random effect’) of unit ij
random, because it is drawn from a population: it has a probability distribution, often N(0, σ²)

14 Fixed vs random effects
Fixed effects determine the means of the observations: E(Y(ij)) = μ + α(i), since E(e(ij)) = 0
Random effects determine the variances (& covariances/correlations) of the observations: Var(Y(ij)) = Var(e(ij)) = σ²

15 Fixed vs random effects
1-way ANOVA random effects model: Y(ij) = μ + u(i) + e(ij)
μ = fixed intercept, grand mean
u(i) = random effect of group i
random, because the group is drawn from a population of groups: it has a probability distribution N(0, σ(u)²)
e(ij) = random error (’random effect’) of unit ij

16 Fixed vs random effects
Now the mean of the observations is just E(Y(ij)) = μ
The variance is Var(Y(ij)) = Var(u(i) + e(ij)) = σ(u)² + σ²
A sum of two variance components => variance component model

17 Random effects and clustering
Random group effect => units ij and ik within group i are correlated:
Cov(Y(ij), Y(ik)) = Cov(u(i) + e(ij), u(i) + e(ik)) = Cov(u(i), u(i)) = σ(u)²
Positive intra-cluster correlation: ICC = Cov(Y(ij), Y(ik)) / Var(Y(ij)) = σ(u)² / (σ(u)² + σ²)
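The ICC formula above can be checked by simulation: generate data from Y(ij) = μ + u(i) + e(ij) and compare the empirical within-group correlation with σ(u)² / (σ(u)² + σ²). A sketch with arbitrary illustration values for μ and the two variance components:

```python
import numpy as np

# Simulate the random effects model Y(ij) = mu + u(i) + e(ij) with two
# units per group, so the within-pair correlation directly estimates ICC.
rng = np.random.default_rng(42)
n_groups, n_per = 2000, 2
mu, sigma_u, sigma_e = 10.0, 2.0, 1.0      # made-up illustration values

u = rng.normal(0, sigma_u, size=n_groups)            # group effects u(i)
e = rng.normal(0, sigma_e, size=(n_groups, n_per))   # errors e(ij)
y = mu + u[:, None] + e                              # Y(ij)

icc_theory = sigma_u**2 / (sigma_u**2 + sigma_e**2)  # 4 / (4 + 1) = 0.8
icc_empirical = np.corrcoef(y[:, 0], y[:, 1])[0, 1]  # within-group corr
print(icc_theory, icc_empirical)
```

With 2000 groups the empirical correlation should land close to the theoretical ICC of 0.8, illustrating how a shared random group effect induces positive intra-cluster correlation.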

18 Mixed model
Contains both fixed and random effects, e.g.
Y(ij) = μ + βX(ij) + u(i) + e(ij), where i = school, j = student
μ = fixed intercept
β = fixed regression coefficient
u(i) = random school effect (’school intercept’)
e(ij) = random error of student j in school i

19 Mixed model Y(ij) = μ + βX(ij) + u(i) + e(ij)
The mean of Y is modelled as a function of the explanatory variable X through the fixed parameters μ and β
The variance of Y and the within-cluster covariance (ICC) are modelled through the random effects u (’level 2’) and e (’level 1’)
This is the general idea; it extends in many directions
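The slides do not name any software; as one illustration, the school/student model Y(ij) = μ + βX(ij) + u(i) + e(ij) can be fitted with Python's statsmodels MixedLM (all data below are simulated with made-up parameter values):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate students nested in schools, then fit the mixed model
# y ~ x with a random intercept per school.
rng = np.random.default_rng(1)
n_schools, n_students = 50, 30
school = np.repeat(np.arange(n_schools), n_students)
x = rng.normal(size=school.size)
u = rng.normal(0, 1.5, size=n_schools)     # random school effects u(i)
y = 2.0 + 0.5 * x + u[school] + rng.normal(0, 1.0, size=school.size)
df = pd.DataFrame({"y": y, "x": x, "school": school})

model = smf.mixedlm("y ~ x", df, groups=df["school"])
result = model.fit()
print(result.params["x"])   # fixed slope estimate, near the true 0.5
```

The `groups=` argument supplies the level-2 clustering; the fitted output separates the between-school variance (from u) from the residual student-level variance (from e).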

20 Regression lines in variance component model: high ICC

21 Regression lines in variance component model: low ICC

22 An extension: random coefficient regression
Y(ij) = μ + βX(ij) + u(i) + v(i)X(ij) + e(ij)
v(i) = random school slope
The regression coefficient of X varies between schools: β + v(i)
A ’side effect’: the variance of Y varies along with X
one possible way to model unequal variances (as a function of X)
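The random-slope extension can be sketched the same way; in statsmodels (our illustration choice, with simulated data and made-up parameters), `re_formula="~x"` requests a random intercept and a random slope per school:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate schools whose slope on x is beta + v(i), then fit a
# random coefficient regression with random intercept and slope.
rng = np.random.default_rng(2)
n_schools, n_students = 60, 40
school = np.repeat(np.arange(n_schools), n_students)
x = rng.normal(size=school.size)
u = rng.normal(0, 1.0, size=n_schools)     # random intercepts u(i)
v = rng.normal(0, 0.4, size=n_schools)     # random slopes v(i)
y = (1.0 + (0.5 + v[school]) * x + u[school]
     + rng.normal(0, 1.0, size=school.size))
df = pd.DataFrame({"y": y, "x": x, "school": school})

fit = smf.mixedlm("y ~ x", df, groups=df["school"],
                  re_formula="~x").fit()
print(fit.params["x"])   # average slope across schools, near 0.5
```

The fixed coefficient on x now estimates the average slope β, while the estimated slope variance captures how much β + v(i) varies between schools, which is exactly the ’side effect’ of X-dependent variance noted above.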

23 Random coefficient regression

24 Regression for repeated measures data
Y(it) = μ(t) + βX(it) + e(it)
i = individual, t = time, μ(t) = intercept at time t
The errors e(it) of individual i are correlated: different (auto)correlation structures (e.g. AR(1)) can be fitted, as well as different variance structures (unequal variances)

25 Thanks!

