1 LINEAR MIXED-EFFECTS MODELS Nidhi Kohli, Ph.D. Quantitative Methods in Education (QME) Department of Educational Psychology

2 Brief Review - RCMs • Random coefficient models (RCMs) are statistical models developed by focusing on individual trajectories in a subject-specific fashion • The idea behind RCMs is that an individual's set of scores is determined by: 1. Population-average effects that influence the process of change for all subjects, 2. Subject-specific effects unique to a particular subject, and 3. Lack of fit, or residual

3 Brief Review - RCMs • The decomposition of scores into population plus individual effects led to the name mixed-effects model • In other disciplines, for example in education research, the same framework is called a multilevel model or hierarchical model

4 Brief Review - RCMs • The RCM for repeated measures expresses the n_i responses of an individual, y_i = (y_i1, …, y_in_i)′, as a linear combination of p independent variables in the n_i × p matrix X_i • The weights in this regression are the sum of population parameters plus random effects that are unique to each subject

5 Brief Review - RCMs • For person i this relationship is y_i = X_i β_i + e_i, where the individual coefficients, β_i = β + b_i, are the sum of fixed population parameters, β, and subject-specific random effects, b_i

6 Brief Review - RCMs • Level-1 (within subject): y_i = X_i β_i + e_i • Level-2 (between subjects): β_i = β + b_i • Because y_i is determined by these effects, the mean vector and covariance matrix for y_i are, respectively, E(y_i) = X_i β and Cov(y_i) = X_i Φ X_i′ + Λ_i
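The moment formulas above can be checked numerically. Below is a minimal numpy sketch; the occasion vector t, the fixed effects β, the matrix Φ, and σ² are illustrative values, not taken from the slides:

```python
import numpy as np

# Implied marginal moments of y_i under the RCM:
# E(y_i) = X_i beta and Cov(y_i) = X_i Phi X_i' + sigma^2 I.
t = np.array([0.0, 1.0, 2.0, 3.0])          # hypothetical measurement occasions
X = np.column_stack([np.ones_like(t), t])   # n_i x p design: intercept + slope
beta = np.array([10.0, 2.0])                # fixed effects (illustrative)
Phi = np.array([[4.0, 0.5],                 # p x p covariance of random effects
                [0.5, 1.0]])
sigma2 = 2.0                                # level-1 residual variance

mean_y = X @ beta                                # E(y_i) = X_i beta
cov_y = X @ Phi @ X.T + sigma2 * np.eye(len(t))  # Cov(y_i)
```

With these values, E(y_i) = (10, 12, 14, 16)′, and the diagonal of Cov(y_i) grows across occasions because the random-slope variance enters through X_i Φ X_i′.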

7 Brief Review - RCMs • Individual coefficients are the sum of fixed and random terms: for all β_j, j = 1, …, p, β_ij = β_j + b_ij • It happens fairly often in practice that there is no appreciable variability in one or more of the β_ij. For example, consider the linear model y_ij = (β_0 + b_0i) + (β_1 + b_1i) t_ij + e_ij. In the equation above, there are p = 2 fixed terms and p = 2 random terms.

8 Brief Review - RCMs • It might be found empirically that individuals differ in intercept (initial status) but not in slope. When this occurs, we would modify the model so that only intercepts are random: y_ij = (β_0 + b_0i) + β_1 t_ij + e_ij. In the equation above, there are p = 2 fixed terms and only 1 random term.

9 Brief Review - RCMs • Determining whether random effects should be included on each term can sometimes be decided on theoretical grounds, but more often it is decided empirically

10 LME – Distinguishing Fixed from Random Effects • The idea of the RCM is that independent variables always have both a fixed and a random term; that is, the stochastic coefficient for the variable t_ij is β_1i = β_1 + b_1i • The notion of mixed-effects models is to allow greater flexibility in the possible weightings by specifying that variables may have any of three types of coefficients:

11 LME – Distinguishing Fixed from Random Effects 1. Mixed: fixed + random 2. Fixed alone 3. Random alone • To handle this expanded situation, we first allow the number of fixed and random coefficients to differ • As before, let X_i be the n_i × p design matrix of the fixed effects, β

12 LME – Distinguishing Fixed from Random Effects • Additionally, let r denote the number of random effects. Almost always in practice r ≤ p, but this need not be true in every case • A second design matrix, Z_i, of order n_i × r, is associated with the random effects • The general mixed-effects model is written as y_i = X_i β + Z_i b_i + e_i
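As a sketch of how y_i = X_i β + Z_i b_i + e_i generates data, the following simulates one subject under a random-intercept-only specification (so r = 1 < p = 2); all parameter values are illustrative, not from the slides:

```python
import numpy as np

# Simulate one subject's responses under the general mixed-effects model
# y_i = X_i beta + Z_i b_i + e_i (illustrative parameter values).
rng = np.random.default_rng(0)

t = np.array([0.0, 1.0, 2.0, 3.0])          # measurement occasions
X = np.column_stack([np.ones_like(t), t])   # n_i x p fixed-effect design (p = 2)
Z = X[:, :1]                                # n_i x r random-effect design (r = 1)
beta = np.array([10.0, 2.0])                # fixed effects
Phi = np.array([[4.0]])                     # r x r covariance of b_i
sigma2 = 2.0                                # within-subject residual variance

b = rng.multivariate_normal(np.zeros(1), Phi)      # subject-specific effect
e = rng.normal(0.0, np.sqrt(sigma2), size=len(t))  # within-subject residuals
y = X @ beta + Z @ b + e                           # one subject's n_i responses
```

Setting Z = X would recover the standard RCM in which every fixed term also carries a random effect.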

13 LME – Distinguishing Fixed from Random Effects • Note that this more elaborate approach includes the standard RCM as a special case • If all fixed terms also have random effects, we simply set Z_i = X_i • If it is found that one or more of the random effects are unnecessary, then Z_i contains only a subset of the columns of X_i, that is, Z_i ≠ X_i

14 LME – Distinguishing Fixed from Random Effects • Additionally, Z_i may contain variables that are distinct from those in X_i. This extension means that variables with only random effects can be incorporated • Level-1 (within subjects): y_i = X_i β + Z_i b_i + e_i • Level-2 (between subjects): b_i ~ N(0, Φ) • Note that the order of Φ is now r × r

15 LME – Distinguishing Fixed from Random Effects • The repeated measures of individual i are the sum of a fixed part, X_i β, a random part reflecting between-subject variability, Z_i b_i, and a second term reflecting within-subject variability, e_i • This implies that the distribution of y_i is y_i ~ N(X_i β, Z_i Φ Z_i′ + Λ_i)

16 LME – Estimation Issues • The decision to drop random effects is a little complicated • In a typical estimation situation, when we consider whether a parameter might be excluded from a model, it is sufficient to compare an estimate to its standard error • Equivalently, if the confidence interval contains zero, then the parameter is usually excluded from further analyses

17 LME – Estimation Issues • When the decision is to eliminate the j-th set of random effects because [Φ]_jj is small, quite a lot of changes in the model occur: • The j-th element is dropped from each of the individual vectors b_i, and the r elements in the j-th row and column of Φ are excluded

18 LME – Estimation Issues • The decision to delete a random effect should not be based on a simple test of the relevant variance in Φ, i.e., comparing the estimate of [Φ]_jj to its standard error • The rule is that this decision should be based on a likelihood ratio test comparing two models: the full model that includes the random effect(s), and the reduced model excluding the random effect

19 LME – Estimation Issues • Based on the likelihood ratio test, if the overall fit of the reduced model is not appreciably poorer than that of the full model, one can decide to drop the random effect in question • There is an additional concern to be aware of when testing nested random-effects covariance structures:

20 LME – Estimation Issues • The null distribution of the likelihood ratio test no longer follows a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters between the full and reduced models • This is because the variance parameter is being tested on the boundary of its parameter space (variances cannot be negative, and we are testing whether the variance is zero)

21 LME – Estimation Issues • In fact, the null distribution of the likelihood ratio test is a mixture of chi-squared distributions • For example, when comparing two nested models, one with q correlated random effects (e.g., q = 1 with just a random intercept) and one with (q + 1) correlated random effects, the null distribution of the LRT is a 50:50 mixture of chi-squared distributions with q and (q + 1) df (Note: critical values can be found in FLW)
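The 50:50 mixture can be turned into a p-value directly. A short sketch using scipy (the helper name is mine, not from FLW):

```python
from scipy.stats import chi2

def mixture_lrt_pvalue(T, q):
    """P-value for the boundary LRT comparing q vs. (q + 1) correlated
    random effects: a 50:50 mixture of chi-squared(q) and chi-squared(q + 1)."""
    return 0.5 * chi2.sf(T, df=q) + 0.5 * chi2.sf(T, df=q + 1)
```

For example, mixture_lrt_pvalue(3.9, 1) is about 0.095, smaller than the naive chi-squared(2) p-value of about 0.142, illustrating why ignoring the boundary makes the test conservative.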

22 LME – Estimation Issues • Alternatively, Fitzmaurice, Laird, and Ware (2004) make another recommendation. As an ad hoc procedure, they advise using the regular LRT statistic but comparing it to a chi-squared distribution with the usual degrees of freedom, using the more stringent α = 0.01

23 Example: Vocabulary • High school students are measured on a scaled vocabulary test on four repeated occasions. The sample size is N = 64.

24 Example • The quadratic model with random effects on all three terms is specified: y_ij = (β_0 + b_0i) + (β_1 + b_1i) t_j + (β_2 + b_2i) t_j² + e_ij, where t = (0, 1, 2, 3)′. A typical row of the design matrix has entries (1, t_j, t_j²). The covariance matrix of the random effects, Φ, is of order 3 and unstructured (allowing all variances and covariances to be estimated). The covariance matrix of the residuals is Λ_i = σ² I_4.

25 Example • Using the total sample of boys and girls, the maximum likelihood estimates are obtained • The confidence intervals of both the second and third random-effect variances include zero as an interior point. This is an indication that the model can be simplified

26 Example • To check whether the model should be modified, estimate a mixed-effects model like the one above but with random intercepts only • The design matrix for the p = 3 fixed effects is the same as before. The design matrix for the random intercepts has r = 1 column. The two matrices are X_i, with rows (1, t_j, t_j²), i.e., (1, 0, 0), (1, 1, 1), (1, 2, 4), (1, 3, 9), and Z_i = (1, 1, 1, 1)′
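A quick numpy construction of these two design matrices from t = (0, 1, 2, 3)′:

```python
import numpy as np

# Design matrices for the reduced vocabulary model: p = 3 fixed effects
# (intercept, linear, quadratic) and r = 1 random intercept.
t = np.array([0, 1, 2, 3])
X = np.column_stack([np.ones(4, dtype=int), t, t**2])  # rows (1, t_j, t_j^2)
Z = np.ones((4, 1), dtype=int)                         # random intercept only
```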

27 Example • The maximum likelihood estimates of the parameters of the reduced model are as follows, where ϕ is the variance of the single remaining random effect.

28 Example • Under the full model the deviance (−2 ln L) is D_F = 852.6, while for the reduced model D_R = 856.5 • The likelihood ratio test statistic for evaluating whether the reduced model gives an appreciably poorer overall fit is T = D_R − D_F = 856.5 − 852.6 = 3.9

29 Example • Using the FLW (2004) recommendation, we use the regular LRT statistic with the more stringent α = 0.01. The regular LRT is approximately distributed as chi-squared with degrees of freedom equal to the difference in the number of parameters, df = q_F − q_R = 10 − 5 = 5

30 Example • Because T = 3.9 is less than the critical value of chi-squared with df = 5 at α = 0.01 (approximately 15.09), we choose the simpler reduced model over the more complicated full model with its three random effects
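The comparison above can be reproduced in a few lines; scipy's chi-squared quantile supplies the α = 0.01 critical value:

```python
from scipy.stats import chi2

# Deviance comparison for the vocabulary example (deviances from the slides).
D_full, D_reduced = 852.6, 856.5
T = D_reduced - D_full              # LRT statistic, 3.9
df = 10 - 5                         # q_F - q_R = 5
critical = chi2.ppf(0.99, df=df)    # alpha = 0.01 critical value, about 15.09
keep_reduced = T < critical         # True: the reduced model suffices
```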

31 LME – Adding Covariates • When a model fits adequately, one generally wishes to understand more about: • How do the individual differences represented by the individual regressions arise? • Which characteristics in the subjects' histories account for variability in components of the model, such as initial status or rate of change?

32 LME – Adding Covariates • The goal is to provide further explanation of the process under investigation • The way covariates are integrated is based on the fact that random effects have two interpretations: 1. They are regression coefficients that quantify the extent to which an independent variable contributes to the dependent variable 2. At the same time, they are subject-specific variables that differ from person to person

33 LME – Adding Covariates • Note that the random effects can be predicted by, or correlated with, other covariates • If a covariate is correlated with the random effects, then the covariate helps explain an important feature (such as initial status or rate of change) of the entire developmental process

34 Example LME – Adding Covariates Studying Group Differences: • Group (population) differences may arise in the study of intact groups (men versus women, or children versus adolescents), or may be defined by experimental conditions (treatment versus control)

35 Example Skodak and Skeels (1949) • Skodak and Skeels (1949) presented final follow-up data from an eleven-year longitudinal study of N = 100 adopted children. The children were assessed up to four times with the Binet Intelligence test. It will be recalled that the ages at which the children were measured are random, and there are some missing observations. A partial subsample of records for the N_B = 40 boys and N_G = 60 girls is plotted on the next slide

36 Example Skodak and Skeels’ Study

37 Example Skodak and Skeels’ Study • An RCM seems appropriate. The sample is composed of 40 boys and 60 girls. A linear model with random slopes and intercepts will be considered • To allow for possible differences between boys and girls in IQ growth, a model is specified with the same intercept for both boys and girls, β_0, but possibly different slopes: β_1 for boys and β_1 + δ for girls

38 Example Skodak and Skeels’ Study • Note that to make the interpretation of the intercept more meaningful, we center the predictor variable, age (contd. on next slide)

39 Example Skodak and Skeels’ Study • The intercept now represents the predicted mean “Weighted Total Binet IQ Score” of a child at age = 4 • The regression coefficient (slope) for age remains the same • Now define t_ij as age in years; to locate the intercept near the first occasion of measurement, we transform t_ij to t*_ij = age_ij − 4 years, since 4 is approximately the mean age at the first occasion.
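The centering step is a one-line transformation; the ages below are hypothetical, for illustration only:

```python
import numpy as np

# Center age at 4 years so the model intercept refers to the predicted mean
# IQ near the first measurement occasion (ages are hypothetical).
age = np.array([4.2, 6.0, 8.5, 11.0])  # one child's measurement ages in years
t_star = age - 4.0                     # t*_ij = age_ij - 4
```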

40 Example Skodak and Skeels’ Study • For subjects with complete data, the design matrices for the fixed effects have rows (1, t*_ij, 0) for boys and (1, t*_ij, t*_ij) for girls, corresponding to the parameter vector (β_0, β_1, δ)′

41 Example Skodak and Skeels’ Study • The design matrices for the random effects are the same for all subjects; each row of Z_i is (1, t*_ij), carrying a random intercept and a random slope • Note that the variance of the random intercepts and random slopes is common to both genders.

42 Example Skodak and Skeels’ Study • At the first level of variation, it will be assumed that e_i ~ N(0, σ² I_n_i) • At the second level of variation, b_i ~ N(0, Φ)
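Putting slides 37–42 together, a simulation sketch of the two-group growth model; the parameter values and ages are illustrative, not estimates from the study:

```python
import numpy as np

# Simulate IQ trajectories under the Skodak-Skeels-style model: common
# intercept beta0, slope beta1 for boys and beta1 + delta for girls,
# shared random effects b_i ~ N(0, Phi), residual sd sigma.
rng = np.random.default_rng(42)
beta0, beta1, delta = 110.0, 0.5, 0.3   # illustrative fixed effects
Phi = np.array([[25.0, 0.0],
                [0.0, 0.25]])           # common random-effect covariance
sigma = 3.0

def simulate_child(t_star, girl):
    """One child's scores at centered ages t* = age - 4."""
    b0, b1 = rng.multivariate_normal(np.zeros(2), Phi)
    slope = beta1 + (delta if girl else 0.0)
    return beta0 + b0 + (slope + b1) * t_star + rng.normal(0.0, sigma, len(t_star))

t_star = np.array([0.0, 2.0, 5.0, 7.0])  # hypothetical centered ages
y_boy = simulate_child(t_star, girl=False)
y_girl = simulate_child(t_star, girl=True)
```

Because the random effects are drawn from the same Φ for both groups, only the fixed slope differs between boys and girls, exactly as the model specifies.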

43 Example Treatments of Prostate Cancer • A study was conducted to investigate the effects of two interventions in the treatment of prostate cancer in middle-aged men. The severity of prostate cancer is often assessed by a plasma component called prostate-specific antigen (PSA). PSA is an enzyme produced normally in men, but it is elevated in the presence of prostate cancer; conversely, lower PSA levels are associated with better overall functioning. (contd…)

44 Example Treatments of Prostate Cancer • The protocol planned for samples to be collected at baseline and at months 3, 6, 9, and 12. There were 100 subjects in each of two conditions, the Standard Treatment and the New Treatment. It is clear from the plots (next slide) of PSA over time that patients generally improved. The dependent variable follows a linear trend with negative slope in each condition.

45 Example Treatments of Prostate Cancer

46 Example Treatments of Prostate Cancer • There are missing data in the sample. Two subjects died during the study. The other missing data were mainly due to (1) problems with contaminated samples in the laboratory, (2) missed appointments due to patients’ schedule conflicts or illness, or (3) missed appointments due to researcher schedule conflicts.

47 Example Treatments of Prostate Cancer • We will treat the missing data as ignorable because there is no apparent systematic nonresponse. Details regarding the distinct missing-data patterns are in the following table

48 Example Treatments of Prostate Cancer • The primary objective of the study was to investigate whether the treatments lower PSA levels, and to determine whether improvement is more rapid under one treatment or the other • The main study variables were:

49 Example Treatments of Prostate Cancer • A linear relationship was posited between PSA level and time in the study. It was found after some preliminary analyses that random effects are not needed on the slope, so the initial (baseline) model is y_ij = (β_0 + b_0i) + β_1 M_ij + e_ij, where M_ij is months in the study • The intercept measures PSA level at time M_ij = 0, before the treatments were administered

50 Example Treatments of Prostate Cancer • In addition to the linear parameters, β_0 and β_1, the model includes the level-1 and level-2 variances, σ² = var(e_ij) and ϕ = var(b_0i), where cov(e_i) = Λ_i = σ² I_n_i. The covariates, even the indicator variable representing treatment, are excluded from the baseline analysis for the moment.
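A simulation sketch of this baseline model, with illustrative parameter values (not the study's estimates), showing how the detrended variability splits into the level-2 component ϕ and the level-1 component σ²:

```python
import numpy as np

# Simulate y_ij = beta0 + b_0i + beta1 * M_ij + e_ij for a balanced version
# of the PSA study and split the detrended variability into between- and
# within-subject parts (all parameter values illustrative).
rng = np.random.default_rng(1)
months = np.array([0.0, 3.0, 6.0, 9.0, 12.0])   # planned occasions
n_subjects = 200
beta0, beta1 = 6.0, -0.25                       # PSA declines over the study
phi, sigma2 = 1.5, 0.4                          # var(b_0i), var(e_ij)

b0 = rng.normal(0.0, np.sqrt(phi), n_subjects)
e = rng.normal(0.0, np.sqrt(sigma2), (n_subjects, len(months)))
y = beta0 + b0[:, None] + beta1 * months[None, :] + e

resid = y - (beta0 + beta1 * months[None, :])   # remove the fixed trend
between = resid.mean(axis=1).var(ddof=1)        # approx. phi + sigma2 / 5
within = (resid - resid.mean(axis=1, keepdims=True)).var(ddof=1)  # approx. sigma2 * 4 / 5
```

The between-subject spread recovers roughly ϕ (inflated slightly by residual noise in the subject means), while the within-subject spread tracks σ², mirroring the two variance components in the model.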

51 Example Treatments of Prostate Cancer • The next obvious step is to understand how the individual differences initially arise • What characteristics in these subjects’ histories account for differences in initial status? Perhaps age, dietary habits, overall health, or some other predictor could explain them • The broad goal is to investigate whether the intercepts are related to age or to any of the other variables

