Longitudinal Data & Mixed Effects Models Danielle J. Harvey UC Davis.

Disclaimer: Funding for this conference was made possible, in part, by Grant R13 AG030995 from the National Institute on Aging. The views expressed do not necessarily reflect the official policies of the Department of Health and Human Services; nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government. Dr. Harvey has no conflicts of interest to report.

Outline: Intro to longitudinal data; Notation; General model formulation; Random effects; Assumptions; Example; Interpretation of coefficients; Model diagnostics.

Longitudinal data features: Three or more waves of data on each unit/person (it is okay if some have only two waves). Outcome values are preferably continuous (although categorical outcomes are possible) and change systematically over time; the metric, validity, and precision of the outcome must be preserved across time. A sensible metric for clocking time is needed. In an automobile study, for example: months since purchase, miles, or number of oil changes?

Data Format: The person-level (multivariate or wide) format has one line/record for each person, which contains the data for all assessments. The person-period (univariate or long) format has one line/record for each assessment. The person-period format is usually preferable: it contains time and predictors at each occasion and is a more efficient format for unbalanced data.
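
The difference between the two formats can be sketched with a small conversion. This is a minimal illustration using only the Python standard library; the records, the column names (`score_0`, `score_1`, ...), and the helper `wide_to_long` are hypothetical and not part of the original slides.

```python
# Hypothetical wide-format records: one dict per person, one column per wave.
wide = [
    {"id": 1, "group": "control", "score_0": 0.5, "score_1": 0.4, "score_2": 0.3},
    {"id": 2, "group": "MCI", "score_0": -0.2, "score_1": -0.4, "score_2": None},
]


def wide_to_long(records, prefix="score_"):
    """Return one row per person per assessment, skipping missing assessments."""
    long_rows = []
    for rec in records:
        for key, value in rec.items():
            if key.startswith(prefix) and value is not None:
                long_rows.append({
                    "id": rec["id"],
                    "group": rec["group"],
                    "time": int(key[len(prefix):]),  # wave number taken from the column name
                    "score": value,
                })
    return long_rows


long_data = wide_to_long(wide)
for row in long_data:
    print(row)
```

Person 2 is missing the third assessment, so the long format simply has one fewer row for that person, which is why it handles unbalanced data more gracefully.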

Exploring longitudinal data: Empirical growth plots reveal how each person changes over time; if there are too many, select a random sample. Smoothing techniques for trends are nonparametric: moving averages, splines, lowess, and kernel smoothers. Examine intra- and inter-individual differences in the outcome. Gather ideas about the functional form of change.

Spaghetti plots

Exploring longitudinal data (cont): More formally, use OLS regression methods. Estimate within-person regressions. Record summary statistics (OLS parameter estimates, their standard errors, $R^2$). Evaluate the fit for each person. Examine summary statistics across individuals (obtain their sample means and variances). Known bias: the sample variance of the estimated slopes exceeds the population variance in the rate of change.
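
The within-person regressions described above can be sketched as follows. This is a minimal illustration with the closed-form simple linear regression, using only the standard library; the data values and the helper `ols_line` are hypothetical.

```python
from statistics import mean, variance


def ols_line(times, values):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


# Hypothetical person-period data: person id -> list of (time, outcome).
data = {
    1: [(0, 1.0), (1, 0.9), (2, 0.8)],
    2: [(0, 0.5), (1, 0.1), (2, -0.3)],
    3: [(0, 0.0), (1, 0.0), (2, 0.3)],
}

# Fit one regression per person.
fits = {}
for pid, obs in data.items():
    times, values = zip(*obs)
    fits[pid] = ols_line(times, values)

# Summarize the estimated slopes across individuals.
slopes = [slope for _, slope in fits.values()]
print("mean slope:", mean(slopes))
print("sample variance of slopes:", variance(slopes))
```

Each estimated slope carries its own estimation error, which is one way to see why the sample variance of the slopes tends to overstate the population variance in the rate of change.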

Exploring longitudinal data (cont): To explore the effects of categorical predictors, group the individual plots, examine smoothed individual growth trajectories for groups, and examine the relationship between the OLS parameter estimates and the categorical predictors.

Selected References: Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis. Oxford University Press. Diggle, P. J., Heagerty, P., Liang, K.-Y., & Zeger, S. L. (2002). Analysis of Longitudinal Data. Oxford University Press. Weiss, R. (2005). Modeling Longitudinal Data. Springer.

Random Effects Models - Notation: Let $Y_{ij}$ be the outcome for the $i$th person at the $j$th time point. Let $Y$ be a vector of all outcomes for all subjects. $X$ is a matrix of independent variables (such as baseline diagnosis and time). $Z$ is a matrix associated with random effects (typically includes a column of 1s and time).

Mixed Model Formulation: $Y = X\beta + Z\gamma + \varepsilon$. $\beta$ are the “fixed effect” parameters, similar to the coefficients in a regression model; these coefficients tell us how variables are related to baseline level and change over time in the outcome. $\gamma$ are the “random effects”, $\gamma \sim N(0, \Sigma)$. $\varepsilon$ are the errors, $\varepsilon \sim N(0, \sigma^2)$.
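
To make the matrix form concrete, here is a sketch of the same model written for one person, assuming $Z$ holds a column of 1s and time (as in the notation slide) and a single time covariate. The symbol names are assumptions chosen for illustration, since the Greek letters did not survive in the transcript: $\beta_0, \beta_1$ for the fixed intercept and slope, $\gamma_{0i}, \gamma_{1i}$ for person $i$'s random intercept and slope, $t_{ij}$ for the time of the $j$th assessment, $\Sigma$ for the covariance of the random effects, and $\sigma^2$ for the error variance.

```latex
Y_{ij} = (\beta_0 + \gamma_{0i}) + (\beta_1 + \gamma_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} \gamma_{0i} \\ \gamma_{1i} \end{pmatrix} \sim N(0, \Sigma),
\qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

Read this way, each person has their own line: the population-average intercept and slope, shifted by that person's two random effects.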

Episodic Memory

Working Memory

Random Effects: Why use them? Not everybody responds the same way (even people with similar demographic and clinical information respond differently). We want to allow for random differences in baseline level and rate of change that remain unexplained by the covariates.

Random Effects Cont. A way to think about them: two bins with numbers in them. Every person draws a number from each bin and carries those numbers with them. The predicted baseline level and change based on the “fixed effects” are adjusted according to a person’s random numbers.
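
The two-bins picture can be sketched as a small simulation. This is only an illustration under stated assumptions: all numeric values and names (`BETA_INTERCEPT`, `SD_SLOPE`, `simulate_person`, and so on) are hypothetical, and the two draws are made independently for simplicity, whereas a fitted model may allow them to be correlated.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical fixed effects: average baseline level and average annual change.
BETA_INTERCEPT, BETA_SLOPE = 0.4, -0.05
# Hypothetical spreads of the two "bins" and of the residual error.
SD_INTERCEPT, SD_SLOPE, SD_ERROR = 0.5, 0.03, 0.1


def simulate_person(times):
    """Draw one number from each bin, then generate that person's outcomes."""
    b0 = random.gauss(0, SD_INTERCEPT)  # draw from the intercept bin
    b1 = random.gauss(0, SD_SLOPE)      # draw from the slope bin
    outcomes = [
        (BETA_INTERCEPT + b0) + (BETA_SLOPE + b1) * t + random.gauss(0, SD_ERROR)
        for t in times
    ]
    return b0, b1, outcomes


for person in range(3):
    b0, b1, ys = simulate_person([0, 1, 2, 3])
    print(person, round(b0, 3), round(b1, 3), [round(y, 3) for y in ys])
```

Each person keeps the same two draws at every assessment, which is what makes their repeated measurements resemble one another more than they resemble other people's.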

Random Effects Cont. Random effects account for correlation in observations. Correlation structures: compound symmetry (common within-individual correlation); autoregressive, AR(1) (each assessment most strongly correlated with the previous one); unstructured (most flexible).
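
The first two correlation structures listed above can be written out as matrices. This is a minimal sketch for equally spaced assessments; the function names and the value of `rho` are hypothetical.

```python
def compound_symmetry(n, rho):
    """Correlation matrix with the same correlation rho between any two assessments."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """AR(1) correlation matrix: correlation decays as rho ** (number of steps apart)."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


for name, matrix in [("compound symmetry", compound_symmetry(4, 0.5)),
                     ("AR(1)", ar1(4, 0.5))]:
    print(name)
    for row in matrix:
        print(["%.3f" % value for value in row])
```

An unstructured matrix has no such formula: every variance and covariance is estimated separately, which is why it is the most flexible option and also the one with the most parameters.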

Assumptions of Model: Linearity; homoscedasticity (constant variance); errors are normally distributed; random effects are normally distributed; missing data are typically assumed to be missing at random (MAR).

Interpretation of parameter estimates. Main effects: for a continuous variable, the average association of a one-unit change in the independent variable with the baseline level of the outcome; for a categorical variable, how the baseline level of the outcome compares to the “reference” category. Time: the average annual change in the outcome for the “reference individual”. Interactions with time: how annual change varies with a one-unit change in an independent variable. Covariance parameters.

Graphical Tools for Checking Assumptions. Scatter plot: plot one variable against another (such as random slope vs. random intercept). E.g. residual plot: a scatter plot of residuals vs. fitted values or a particular independent variable. Quantile-quantile plot (QQ plot): plots quantiles of the data against quantiles from a specific distribution (the normal distribution for us).
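
The coordinates behind a normal QQ plot can be computed by hand. The sketch below uses only the standard library (`statistics.NormalDist`, Python 3.8+); the residual values, the helper `qq_points`, and the choice of plotting positions $(i - 0.5)/n$ are assumptions made for illustration, and statistical packages may use slightly different positions.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal QQ plot."""
    n = len(values)
    centre, spread = mean(values), stdev(values)
    # Standardize and sort the data to get the observed quantiles.
    observed = sorted((v - centre) / spread for v in values)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Hypothetical residuals from a fitted model.
residuals = [0.3, -1.2, 0.8, 0.1, -0.4, 1.5, -0.9]
for theo, obs in qq_points(residuals):
    print(round(theo, 3), round(obs, 3))
```

If the residuals are roughly normal, the pairs fall close to the line where the observed quantile equals the theoretical quantile.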

Residual Plot: The ideal residual plot is a “cloud” of points with no pattern, evenly distributed about zero.

Non-linear relationship: The residual plot shows a non-linear pattern (in this case, a quadratic pattern). It is best to determine which independent variable has this relationship and then include the square of that variable in the model.

Non-constant variance: The residual plot exhibits a “funnel-like” pattern: residuals are further from the zero line as you move along the fitted values. This typically suggests transforming the outcome variable (ln transform is most common).

QQ-Plot

Scatter plot of random effects

Example: Back to some data. We are interested in differences in change between diagnostic groups. Outcomes are episodic memory and working memory. $X$ includes diagnostic group (control = reference group) and time. The model incorporates a random intercept and slope, with unstructured covariance (allows for correlation between the random effects).

Episodic Memory Model Results (baseline)

| Variable | Estimate | SE | p-value |
| Intercept | 0.44 | 0.02 | <0.0001 |
| MCI | -0.71 | 0.04 | <0.0001 |
| Demented | -1.9 | 0.07 | <0.0001 |

Model Results (change)

| Variable | Estimate | SE | p-value |
| Time | -0.04 | 0.006 | <0.0001 |
| Time*MCI | -0.08 | 0.01 | <0.0001 |
| Time*Dem | -0.26 | 0.02 | <0.0001 |
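
The reported estimates can be combined into group-specific baselines and annual changes, following the interpretation rules given earlier (control is the reference group). This is a worked-arithmetic sketch: it assumes both results tables refer to the same episodic memory model and that time is measured in years, and the helper names `group_summary` and `predicted` are hypothetical.

```python
# Fixed-effect estimates as reported in the two results tables (control = reference).
estimates = {
    "Intercept": 0.44, "MCI": -0.71, "Demented": -1.9,
    "Time": -0.04, "Time*MCI": -0.08, "Time*Dem": -0.26,
}


def group_summary(est):
    """Return {group: (baseline level, annual change)} implied by the estimates."""
    return {
        "Control": (est["Intercept"], est["Time"]),
        "MCI": (est["Intercept"] + est["MCI"], est["Time"] + est["Time*MCI"]),
        "Demented": (est["Intercept"] + est["Demented"], est["Time"] + est["Time*Dem"]),
    }


def predicted(est, group, years):
    """Average predicted outcome for a group after a given number of years."""
    baseline, change = group_summary(est)[group]
    return baseline + change * years


for group, (baseline, change) in group_summary(estimates).items():
    print(f"{group}: baseline {baseline:.2f}, annual change {change:.2f}")
```

Under these assumptions the control group starts at 0.44 and changes by -0.04 per year, the MCI group starts at -0.27 and changes by -0.12 per year, and the demented group starts at -1.46 and changes by -0.30 per year. These are averages from the fixed effects only; an individual's trajectory would also include their random intercept and slope.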

Advanced topics: Time-varying covariates. Simultaneous growth models (modeling two types of longitudinal outcomes together), which allow you to directly compare associations of specific independent variables with the different outcomes and to estimate the correlation between change in the two processes.
