Download presentation
Presentation is loading. Please wait.
1
Andrea Friese, Silvia Artuso, Danai Laina
LME4 Andrea Friese, Silvia Artuso, Danai Laina
2
What we are going to do... introduction of the package
linear mixed models: introduction to example fixed and random effects crossed and nested random effects random slopes and random intercepts models comparison and evaluation of the models statistics
3
About lme4 provides functions to fit and analyze statistical linear mixed models, generalized mixed models, and nonlinear mixed models we focus on linear mixed models what they are? → I’ll get there
4
mainly used in fields like biology, physics, or social sciences
developed by Bates, Mächler, Bolker and Walker around 2012 fundamentally based on its predecessor nlme lme4 as baseline for multilevel models + various packages that enhance its feature set
5
Now: what is a linear mixed model?
picture a basic linear model with one dependent variable and one independent variable ( = fixed effect) add various fixed effects add random effects random effect: a factor we want to control, because it can affect the result; “grouping-factor”
6
In lme4: command to analyze fixed effects on the dependent variable:
lm <- lm(variable ~ fixed effect 1 + … , data = lmm.data) summary(lm) to see if/ how individual fixed effects respond in other combinations replace the + operator by * or /: ...fixed effect 1 * fixed effect 2 + fixed effect 3...
7
to fit a random effect into the function:
lmer <- lmer(variable ~ fixed effect (1 | random), data = lmm.data) summary(lmer)
8
Exercises Exercise 1: Run the data that’s provided, then use head(lmm.data) and str(lmm.data) and take a look at your data Exercise 2: Analyze the fixed effects and the dependent variable from our example Exercise 3: Add the random variable ‘school’ To simplify the code check the dataset for abbreviations of fixed effects
9
Types of random effects
we used (1|random)to fit our random effect on the right side of | operator is the so-called “grouping factor” (--> “clusters”) random effects (factors) can be crossed or nested - it depends on the relationship between the variables
10
Crossed random effects
Means that a given factor appears in more than one level of the upper level factor. We can have observations of the same subject across all the random variable ranges (crossed) or at least some of them (partially crossed). We would then fit the identity of the subject and the random factor ranges as (partially) crossed random effects. For example, there are pupils within classes measured over several years. To specify for them: (1|class) + (1|pupil)
11
Be careful: from Wikipedia:
“Multilevel models (also known as hierarchical linear models, linear mixed-effects models, mixed models, nested data models, random coefficient, random-effects models, random parameter models, or split-plot designs)” But while all “hierarchical linear models” are mixed models, not all mixed models are hierarchical. The reason is because you can have crossed (or partially crossed) random factors, that do not represent levels in a hierarchy.
12
Nested random effects When a lower level factor appears only within a particular level of an upper level factor. For example, pupils within classes at a fixed point in time, or classes within schools (implicit nesting). There are two ways to explicitly specify for it in lme4: (1|class) + (1|class:pupil) or (1|class/pupil) Or, we can create a new nested factor: dataset <- within(dataset, sample <- factor(class:pupil))
13
But the nesting can be also explicit in the dataset, when we code differently and uniquely the variables. In this case, using (1|class/pupil) or (1|class)+(1|pupil) in the model will produce the same effect.
14
The same model specification can be used to represent both (partially) crossed or nested factors, so you can’t use the models specification to tell you what’s going on with the random factors, you have to look at the structure of the factors in the data. Example: head(dataset) or str(dataset) To check if there is explicit nesting, we can use the function isNested: with(dataset, isNested(pupil, class)) # FALSE or TRUE
15
And if we need to make explicit the nesting of a random effect in a fixed effect?
In this case: fixed + (1|fixed : random) Because the interaction between a fixed and a random effect is always random.
16
Exercise Exercise 4: In the dataset lmm.data, check if class is nested in school Exercise 5: add the random variable class as a nested effect in school and look at the results; use summary()and plot() Exercise 6: Look at the variance explained by the school factor. What happens if you treat it as a fixed effect? Try out
17
Random slopes vs. random intercepts models
Random effects: Groups Name Variance Std.Dev. class:school (Intercept) school (Intercept) Residual Number of obs: 1200, groups: class:school, 24; school, 6 Our model is what is called a random intercept model: subjects and items are allowed to have different intercept, but not different slopes → coef(). But what if we think that a fixed effect is not the same for all subjects and items?
18
Random slopes model (1 + fixed|random) Example:
randomslopes_models <- lmer(response ~ fixedA + fixedB + (1 + fixedA|random), data=data) This allows the slope of the fixedA variable to vary by the random one. Here we modify our random effect term to include variables before the grouping terms: (1 +open|school/class) tells R to fit a varying slope and varying intercept model for schools and classes nested within schools, and to allow the slope of the open variable to vary by school.
19
Exercise Exercise 7: try to fit to our dataset a random slope model, with openness’ slope varying by school Look at the results
20
How can we compare and evaluate our mixed models?
#Check the arguments of lmer command and find the REML argument ?lmer # Run the following model #Model_1 lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data, REML = T) summary(lmer) Check the REML argument, what does it stand for? REML stands for the Restricted Maximum Likelihood It is the default option when constructing a model in lme4 but it is not suitable for comparing models
21
How can we compare and evaluate our mixed models?
# Run the following model #Model_1 lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data, REML = F) summary(lmer) Do you notice any differences in the output compared to the previous ones? Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: extro ~ open + agree + social + (1 | school/class) Data: lmm.data AIC BIC logLik deviance df.resid
22
How can we compare and evaluate our mixed models?
#Run the following model #Model_2 lmer2 <- lmer(extro ~ social + (1|class) + (1|school), data=lmm.data, REML = F) summary(lmer2) Which is the output this time? Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: extro ~ social + (1 | class) + (1 | school) Data: lmm.data AIC BIC logLik deviance df.resid Which model should we select?
23
How can we simply test statistically our models?
# Test statistically the two models anova(lmer, lmer2) Check the output. Which model would you choose? # Record the deviance and df of residuals that you calculated before for the two models # lmer: deviance: ; df resid: 1193 # lmer2: deviance: ; df resid: 1195 # Perform a chi-squared test pchisq( – , 1195 – 1193, lower.tail=F)
24
And what if we drop out all our fixed factors?
# Fit the following models, a full model and a reduced model in which we dropped our fixed effects Full.lmer <- lmer(extro ~ open + agree + social + (1|school/class), data= lmm.data, REML = F) Reduced.lmer <- lmer(extro ~ 1 + (1|school/class), data=lmm.data, REML=F) # Compare them anova(Reduced.lmer, Full.lmer) What would you conclude?
25
And after choosing the appropriate model?
REML calculates the most optimal estimates of the variables, optimizing the model So our model is re-fitted with this criterion # After choosing the appropriate model #Model_1 lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data, REML = T) summary(lmer)
26
Conclusions To compare different models with different fixed effects and evaluate the amount of explained information → Maximum Likelihood measures The lower the measures, the better our models fit to the data set Depending always on our questions, we construct and select the best fitting models, optimizing them with the REML criterion
27
THANK YOU FOR YOUR ATTENTION
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.