Longitudinal Data & Mixed Effects Models

Danielle J. Harvey UC Davis

Disclaimer
Funding for this conference was made possible, in part, by Grant R13 AG from the National Institute on Aging. The views expressed do not necessarily reflect the official policies of the Department of Health and Human Services; nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government. Dr. Harvey has no conflicts of interest to report.

Outline
- Intro to longitudinal data
- Notation
- General model formulation
- Random effects
- Assumptions
- Example
- Interpretation of coefficients
- Model diagnostics

Longitudinal data features
- Three or more waves of data on each unit/person (two waves for some is okay).
- Outcome values:
  - Preferably continuous (although categorical outcomes are possible);
  - Systematically change over time;
  - Metric, validity and precision of the outcome must be preserved across time.
- Sensible metric for clocking time.
  - Automobile study: months since purchase, miles, or number of oil changes?

Data Format
- Person-level (multivariate or wide format):
  - One line/record for each person, which contains the data for all assessments
- Person-period (univariate or long format):
  - One line/record for each assessment
- Person-period data format is usually preferable:
  - Contains time and predictors at each occasion;
  - More efficient format for unbalanced data.
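The conversion from person-level to person-period format can be sketched as follows. This is a minimal illustration in plain Python; the field names (`id`, `sex`, `score_t0`, ...) and all values are made up for the example.

```python
# Hypothetical person-level (wide) records: one dict per person.
wide = [
    {"id": 1, "sex": "F", "score_t0": 10.0, "score_t1": 9.5, "score_t2": 9.1},
    {"id": 2, "sex": "M", "score_t0": 12.0, "score_t1": 11.8, "score_t2": None},  # missed wave 2
]


def wide_to_long(records, prefix="score_t"):
    """Return one record per person per non-missing assessment."""
    long_rows = []
    for rec in records:
        for key, value in rec.items():
            # Skip non-outcome fields and assessments that were not observed.
            if not key.startswith(prefix) or value is None:
                continue
            long_rows.append({
                "id": rec["id"],
                "sex": rec["sex"],
                "time": int(key[len(prefix):]),  # wave number taken from the field name
                "score": value,
            })
    return long_rows


person_period = wide_to_long(wide)
for row in person_period:
    print(row)
```

Person 2 simply contributes two rows instead of three, which is why the long format copes well with unbalanced data.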

Exploring longitudinal data
- Empirical growth plots (spaghetti plots).
  - If too many, select a random sample.
  - Reveal how each person changes over time.
- Smoothing techniques for trends:
  - Nonparametric: moving averages, splines, lowess and kernel smoothers.
- Examine intra- and inter-individual differences in the outcome.
- Gather ideas about functional form of change.

Spaghetti plots

Exploring longitudinal data (cont)
- More formally: use OLS regression methods.
  - Estimate within-person regressions.
  - Record summary statistics (OLS parameter estimates, their standard errors, R²).
  - Evaluate the fit for each person.
- Examine summary statistics across individuals (obtain their sample means and variances).
- Known biases: sample variance of estimated slopes > population variance in the rate of change.
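The within-person regressions and the known bias in the slope variance can be illustrated with a small simulation. This is a sketch under assumed values: the number of people, the wave times, and every parameter below are invented, and NumPy is used only as a convenient tool.

```python
import numpy as np

rng = np.random.default_rng(42)

n_people = 2000
times = np.array([0.0, 1.0, 2.0, 3.0])   # four waves per person (assumed)
true_slope_var = 0.04                    # population variance of the true slopes
error_sd = 1.0                           # within-person measurement error SD

# Simulate each person's true intercept and slope, then noisy observations.
true_intercepts = rng.normal(10.0, 1.0, size=n_people)
true_slopes = rng.normal(-0.5, np.sqrt(true_slope_var), size=n_people)
y = (true_intercepts[:, None]
     + true_slopes[:, None] * times[None, :]
     + rng.normal(0.0, error_sd, size=(n_people, times.size)))

# Within-person OLS: a separate straight line for each person.
# np.polyfit accepts a 2-D y with one column per data set.
slopes_hat, intercepts_hat = np.polyfit(times, y.T, deg=1)

print("mean of estimated slopes:    ", slopes_hat.mean())
print("variance of estimated slopes:", slopes_hat.var(ddof=1))
print("population slope variance:   ", true_slope_var)
```

Each estimated slope carries its own sampling error, so the variance of the estimates is roughly the true slope variance plus the error variance divided by the sum of squared centered times (here 0.04 + 1/5).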

Exploring longitudinal data (cont)
To explore effects of categorical predictors:
- Group individual plots.
- Examine smoothed individual growth trajectories for groups.
- Examine relationship between OLS parameter estimates and categorical predictors.

Selected References
- Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.
- Diggle, P. J., Heagerty, P., Liang, K.-Y., & Zeger, S. L. (2002). Analysis of Longitudinal Data. Oxford University Press.
- Weiss, R. (2005). Modeling Longitudinal Data. Springer.

Random Effects Models - Notation
- Let $Y_{ij}$ = outcome for the $i$th person at the $j$th time point
- Let $Y$ be a vector of all outcomes for all subjects
- $X$ is a matrix of independent variables (such as cognitive activity frequency and time)
- $Z$ is a matrix associated with random effects (typically includes a column of 1s and time)
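To make the two design matrices concrete, the sketch below builds them for a tiny invented data set of two people with three assessments each. The variable names and values are assumptions for illustration; the choice of columns in X (intercept, covariate, time, covariate by time) is one common layout, not necessarily the one used in the talk.

```python
import numpy as np

# Hypothetical long-format data: 2 people, 3 assessments each.
person = np.array([0, 0, 0, 1, 1, 1])
time = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
cog = np.array([0.5, 0.5, 0.5, -1.0, -1.0, -1.0])   # time-invariant covariate

n_obs = person.size
n_people = person.max() + 1

# Fixed-effects design matrix X: intercept, covariate, time, covariate x time.
X = np.column_stack([np.ones(n_obs), cog, time, cog * time])

# Random-effects design matrix Z: for each person, a column of 1s and a
# column of time, nonzero only in that person's rows (block-diagonal layout).
Z = np.zeros((n_obs, 2 * n_people))
for i in range(n_people):
    rows = person == i
    Z[rows, 2 * i] = 1.0
    Z[rows, 2 * i + 1] = time[rows]

print(X)
print(Z)
```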

Mixed Model Formulation
$Y = X\beta + Z\gamma + \varepsilon$
- $\beta$ are the “fixed effect” parameters
  - Similar to the coefficients in a regression model
  - Coefficients tell us how variables are related to baseline (or overall) level and change over time in the outcome
- $\gamma$ are the “random effects”, $\gamma \sim N(0, \Sigma)$
- $\varepsilon$ are the errors, $\varepsilon \sim N(0, \sigma^2)$
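For a model with a random intercept and a random slope for time, the matrix form can be written out for a single observation. The sketch below assumes one person-level covariate $x_i$ and time $t_{ij}$, with $\beta$ for the fixed effects and $\gamma_{0i}$, $\gamma_{1i}$ for person $i$'s random intercept and slope. The subscripted symbols and the $\tau$ entries of the covariance matrix are notation assumed here for illustration.

```latex
\begin{aligned}
Y_{ij} &= \underbrace{\beta_0 + \beta_1 x_i + \beta_2 t_{ij} + \beta_3 x_i t_{ij}}_{\text{fixed effects}}
        + \underbrace{\gamma_{0i} + \gamma_{1i} t_{ij}}_{\text{random effects}}
        + \varepsilon_{ij}, \\
\begin{pmatrix} \gamma_{0i} \\ \gamma_{1i} \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2).
\end{aligned}
```

Read this way, $\beta_0 + \beta_1 x_i$ is the expected baseline level, $\beta_2 + \beta_3 x_i$ is the expected rate of change, and the two random effects shift those values for each person.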

Working Memory

Random Effects: Why use them?
- Not everybody responds the same way (even people with similar demographics and regular cognitive activities respond differently)
- Want to allow for random differences in baseline level and rate of change that remain unexplained by the covariates

Random Effects Cont.: Way to think about them
- Two bins with numbers in them
- Every person draws a number from each bin and carries those numbers with them
- Predicted baseline level and change based on “fixed effects” adjusted according to a person’s random number
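The two-bins picture can be simulated directly. In this sketch each person draws a random intercept and a random slope once and keeps them for every visit; the fixed effects, the covariance of the two bins, and the visit times are all assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Assumed fixed effects: average baseline level and average annual change.
beta_intercept, beta_time = 0.0, -0.15

# Assumed covariance of the two "bins" (random intercept, random slope).
cov = np.array([[0.50, 0.02],
                [0.02, 0.01]])

n_people = 5
times = np.arange(4, dtype=float)   # assessments at years 0..3

# Each person draws one number from each bin and keeps it for all visits.
draws = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n_people)

for i, (b0, b1) in enumerate(draws):
    # Person-specific line = fixed-effect line adjusted by the person's draws.
    trajectory = (beta_intercept + b0) + (beta_time + b1) * times
    print(f"person {i}: baseline {beta_intercept + b0:+.2f}, "
          f"annual change {beta_time + b1:+.2f}, "
          f"trajectory {np.round(trajectory, 2)}")
```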

Random Effects Cont.: Accounts for correlation in observations
- Correlation structures
  - Compound symmetry (common within-individual correlation)
  - Autoregressive - AR(1) (each assessment most strongly correlated with previous one)
  - Unstructured (most flexible)
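The first two structures are fully determined by a single correlation parameter, which the following sketch makes explicit. The helper names `compound_symmetry` and `ar1` and the value 0.5 are chosen here for illustration.

```python
import numpy as np


def compound_symmetry(n_times, rho):
    """Same correlation rho between every pair of assessments."""
    return np.full((n_times, n_times), rho) + (1.0 - rho) * np.eye(n_times)


def ar1(n_times, rho):
    """Correlation rho**|j - k|, decaying with the lag between assessments."""
    idx = np.arange(n_times)
    return rho ** np.abs(idx[:, None] - idx[None, :])


print(compound_symmetry(4, 0.5))
print(ar1(4, 0.5))
```

An unstructured matrix has no such formula: with n assessments it has n(n+1)/2 separately estimated variances and covariances, which is what makes it the most flexible and the most parameter-hungry option.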

Assumptions of Model
- Linearity
- Homoscedasticity (constant variance)
- Errors are normally distributed
- Random effects are normally distributed
- Typically assume data are missing at random (MAR)

Interpretation of parameter estimates
- Main effects
  - Continuous variable: average association of a one-unit change in the independent variable with the baseline level of the outcome
  - Categorical variable: how baseline level of outcome compares to “reference” category
- Time
  - Average annual change in the outcome for “reference individual”
- Interactions with time
  - How annual change varies by a one-unit change in an independent variable
- Covariance parameters

Graphical Tools for Checking Assumptions
- Scatter plot
  - Plot one variable against another one (such as random slope vs. random intercept)
- Residual plot (a special case of the scatter plot)
  - Scatter plot of residuals vs. fitted values or a particular independent variable
- Quantile-Quantile plot (QQ plot)
  - Plots quantiles of the data against quantiles from a specific distribution (normal distribution for us)
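The coordinates behind a normal QQ plot can be computed with the standard library alone. This is a sketch: the residual values are invented, the helper name `normal_qq_points` is hypothetical, and the plotting positions (k - 0.5)/n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(values):
    """Pair each sorted value with the matching standard-normal quantile."""
    n = len(values)
    ordered = sorted(values)
    std_normal = NormalDist()
    # Plotting positions (k - 0.5) / n keep probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals, invented for illustration.
residuals = [-1.2, 0.3, 0.8, -0.4, 0.1, 1.5, -0.7, 0.0, -0.2, 0.6]

# Standardize so that normally distributed residuals fall near the line y = x.
m, s = mean(residuals), stdev(residuals)
points = normal_qq_points([(r - m) / s for r in residuals])

for theoretical_q, observed_q in points:
    print(f"{theoretical_q:+.2f}  {observed_q:+.2f}")
```

Plotting the second column against the first gives the QQ plot; systematic curvature away from the diagonal suggests the normality assumption is questionable. The same function applies to estimated random intercepts or slopes.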

Residual Plot
Ideal Residual Plot
- “cloud” of points
- no pattern
- evenly distributed about zero

Non-linear relationship
- Residual plot shows a non-linear pattern (in this case, a quadratic pattern)
- Best to determine which independent variable has this relationship, then include the square of that variable in the model
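The effect of adding a squared term can be checked numerically as well as visually. The sketch below uses simulated data with a truly quadratic relationship and ordinary least squares rather than a mixed model, purely to keep the example short; all values are made up.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated data with a truly quadratic relationship (values are made up).
x = np.linspace(0.0, 10.0, 200)
y = 2.0 + 1.5 * x - 0.2 * x**2 + rng.normal(0.0, 0.5, size=x.size)


def residuals_after_fit(degree):
    """Fit a polynomial of the given degree by least squares; return residuals."""
    coefs = np.polyfit(x, y, deg=degree)
    return y - np.polyval(coefs, x)


for degree, label in [(1, "linear only"), (2, "with squared term")]:
    res = residuals_after_fit(degree)
    # Correlation of residuals with centered x^2: near zero means no leftover curvature.
    curvature = np.corrcoef(res, (x - x.mean()) ** 2)[0, 1]
    print(f"{label:>18}: residual SD = {res.std(ddof=1):.2f}, "
          f"corr(residual, centered x^2) = {curvature:+.2f}")
```

With only the linear term, the residuals stay strongly related to the squared predictor, which is the curved pattern seen in the residual plot; once the squared term is in the model that relationship vanishes.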

Non-constant variance
- Residual plot exhibits a “funnel-like” pattern
- Residuals are further from the zero line as you move along the fitted values
- Typically suggests transforming the outcome variable (ln transform is most common)

QQ-Plot

Scatter plot of random effects

Example: Back to some data
- Interested in association between frequency of cognitive activity and decline in working memory
- Outcome = working memory
- X includes lifetime cognitive activity frequency and time
- Incorporate a random intercept and slope, with unstructured covariance (allows for correlation between the random effects)
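A model of this shape could be fit along the following lines. This is a sketch only: it uses the `mixedlm` formula interface from `statsmodels` as one possible tool (the slides do not say which software was used), and the data frame is simulated stand-in data with made-up column names (`id`, `time`, `cog`, `wm`) and made-up parameter values, not the study data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_people, n_waves = 200, 5

# Simulated stand-in data (NOT the study data); all values are assumptions.
person = np.repeat(np.arange(n_people), n_waves)
time = np.tile(np.arange(n_waves, dtype=float), n_people)
cog_by_person = rng.normal(size=n_people)
ranef = rng.multivariate_normal([0.0, 0.0],
                                [[0.50, 0.02], [0.02, 0.04]],
                                size=n_people)

cog = cog_by_person[person]
wm = (-1.0 + 0.3 * cog + ranef[person, 0]
      + (-0.15 + 0.02 * cog + ranef[person, 1]) * time
      + rng.normal(0.0, 0.3, size=person.size))

df = pd.DataFrame({"id": person, "time": time, "cog": cog, "wm": wm})

# Fixed effects: cog, time, and their interaction.
# Random effects: intercept and slope for time within each person,
# with a freely estimated covariance between the two.
model = smf.mixedlm("wm ~ cog * time", data=df, groups=df["id"], re_formula="~time")
result = model.fit()

print(result.fe_params)   # fixed-effect estimates
print(result.cov_re)      # estimated covariance of random intercept and slope
```

The `cog` coefficient describes the association with baseline level, and the `cog:time` coefficient describes how annual change differs per unit of cognitive activity.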

Working Memory Model Results (baseline)
Variable                       Estimate   SE     p-value
Intercept                      -0.99      0.02   <0.001
Cognitive Activity Frequency    0.34      0.04

Model Results (change)
Variable                            Estimate   SE      p-value
Time                                -0.16      0.03    <0.001
Time*Cognitive Activity Frequency    0.02      0.009   0.007
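The reported estimates can be combined into average predicted trajectories. The sketch below uses the four fixed-effect estimates from the baseline and change results and assumes that time is measured in years and that a cognitive activity value of 0 corresponds to the reference individual; the scaling of the activity variable is not stated on the slides, so the one-unit comparison is illustrative.

```python
# Fixed-effect estimates as reported in the model results.
INTERCEPT = -0.99     # baseline level for the reference individual
COG = 0.34            # difference in baseline per unit of cognitive activity
TIME = -0.16          # annual change for the reference individual
TIME_BY_COG = 0.02    # difference in annual change per unit of cognitive activity


def annual_change(cog):
    """Average annual change in working memory at a given activity level."""
    return TIME + TIME_BY_COG * cog


def predicted_score(cog, years):
    """Average (fixed-effects only) working memory score after `years`."""
    return INTERCEPT + COG * cog + annual_change(cog) * years


for cog in (0.0, 1.0):
    print(f"cog={cog}: baseline {predicted_score(cog, 0):.2f}, "
          f"annual change {annual_change(cog):.2f}, "
          f"year 5 {predicted_score(cog, 5):.2f}")
```

Under these assumptions, a person one unit higher in cognitive activity starts 0.34 higher and declines by 0.14 per year instead of 0.16, so the gap between the two average trajectories widens over time.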

Advanced topics
- Time-varying covariates
- Simultaneous growth models (modeling two types of longitudinal outcomes together)
  - Allows you to directly compare associations of specific independent variables with the different outcomes
  - Allows you to estimate the correlation between change in the two processes
