Longitudinal Data & Mixed Effects Models
Danielle J. Harvey, UC Davis

Disclaimer
Funding for this conference was made possible, in part, by Grant R13 AG from the National Institute on Aging. The views expressed do not necessarily reflect the official policies of the Department of Health and Human Services; nor does mention of trade names, commercial practices, or organizations imply endorsement by the U.S. Government. Dr. Harvey has no conflicts of interest to report.

Outline
- Intro to longitudinal data
- Notation
- General model formulation
- Random effects
- Assumptions
- Example
- Interpretation of coefficients
- Model diagnostics

Longitudinal data features
- Three or more waves of data on each unit/person (two waves can be workable in some settings).
- Outcome values:
  - preferably continuous (although categorical outcomes are possible);
  - expected to change systematically over time;
  - the metric, validity, and precision of the outcome must be preserved across time.
- A sensible metric for clocking time. In an automobile study, should time be months since purchase, miles driven, or number of oil changes?

Data Format
- Person-level (multivariate or wide) format: one line/record per person, containing the data for all assessments.
- Person-period (univariate or long) format: one line/record per assessment.
- Person-period format is usually preferable:
  - it contains time and the predictors at each occasion;
  - it is a more efficient format for unbalanced data.
A reshaping sketch follows below.
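
As a concrete illustration, here is a minimal pandas sketch of converting person-level (wide) data into person-period (long) format. The column names (id, age_baseline, mem_0, mem_1, mem_2) are hypothetical, not from the slides.

```python
import pandas as pd

# Hypothetical wide (person-level) data: one row per person, one outcome column per visit
wide = pd.DataFrame({
    "id": [1, 2],
    "age_baseline": [72, 68],
    "mem_0": [0.1, 0.4],    # outcome at time 0 (years from baseline)
    "mem_1": [0.0, 0.3],    # outcome at time 1
    "mem_2": [-0.2, 0.2],   # outcome at time 2
})

# Reshape to long (person-period) format: one row per person-visit,
# with a time column and the time-invariant predictor carried along
long = pd.wide_to_long(wide, stubnames="mem", i="id", j="time", sep="_").reset_index()
print(long.sort_values(["id", "time"]))
```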

Exploring longitudinal data
- Empirical growth plots (if there are too many people, select a random sample).
- These reveal how each person changes over time.
- Smoothing techniques for trends; nonparametric options include moving averages, splines, lowess, and kernel smoothers.
- Examine intra- and inter-individual differences in the outcome.
- Gather ideas about the functional form of change.

Spaghetti plots
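
A spaghetti plot overlays every person's trajectory in a single panel. Below is a minimal matplotlib sketch, assuming a long-format DataFrame df with hypothetical columns id, time, and y; the lowess overlay echoes the smoothing techniques mentioned above.

```python
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

# df assumed to be long-format data with hypothetical columns: id, time, y
fig, ax = plt.subplots()
for pid, grp in df.groupby("id"):
    ax.plot(grp["time"], grp["y"], color="grey", alpha=0.4, linewidth=1)  # one line per person

smoothed = lowess(df["y"], df["time"])               # overall lowess trend (endog, exog)
ax.plot(smoothed[:, 0], smoothed[:, 1], color="black", linewidth=2)

ax.set_xlabel("Time since baseline")
ax.set_ylabel("Outcome")
ax.set_title("Spaghetti plot: one trajectory per person")
plt.show()
```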

Exploring longitudinal data (cont.)
- More formally, use OLS regression methods:
  - estimate within-person regressions;
  - record summary statistics (OLS parameter estimates, their standard errors, R²);
  - evaluate the fit for each person;
  - examine the summary statistics across individuals (their sample means and variances).
- Known bias: the sample variance of the estimated slopes exceeds the population variance in the rate of change.
A per-person OLS sketch follows below.
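
A minimal sketch of the within-person OLS step, again assuming a long-format DataFrame df with hypothetical columns id, time, and y:

```python
import numpy as np
import pandas as pd

# df assumed long-format with hypothetical columns: id, time, y
rows = []
for pid, grp in df.groupby("id"):
    slope, intercept = np.polyfit(grp["time"], grp["y"], deg=1)  # per-person straight-line fit
    rows.append({"id": pid, "intercept": intercept, "slope": slope})
ols_summary = pd.DataFrame(rows)

# Sample means and variances of the person-specific intercepts and slopes
print(ols_summary[["intercept", "slope"]].agg(["mean", "var"]))
```

As the slide notes, the sample variance of these estimated slopes overstates the population variance of the true rates of change, because it also contains estimation error.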

Exploring longitudinal data (cont.)
To explore the effects of categorical predictors:
- group the individual plots;
- examine smoothed individual growth trajectories within each group;
- examine the relationship between the OLS parameter estimates and the categorical predictors.

Selected References
- Singer, J. D., & Willett, J. B. (2003). Applied Longitudinal Data Analysis. Oxford University Press.
- Diggle, P. J., Heagerty, P., Liang, K.-Y., & Zeger, S. L. (2002). Analysis of Longitudinal Data. Oxford University Press.
- Weiss, R. (2005). Modeling Longitudinal Data. Springer.

Random Effects Models: Notation
- Let Y_ij be the outcome for the i-th person at the j-th time point.
- Let Y be the vector of all outcomes for all subjects.
- X is a matrix of independent variables (such as baseline diagnosis and time).
- Z is the design matrix for the random effects (typically a column of 1s and time).

Mixed Model Formulation
Y = Xβ + Zb + ε
- β are the "fixed effect" parameters, similar to the coefficients in a regression model; they tell us how variables are related to the baseline level of the outcome and to its change over time.
- b are the "random effects", b ~ N(0, G).
- ε are the errors, ε ~ N(0, σ²).
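
For concreteness, a minimal sketch of fitting this kind of model in Python with statsmodels (not necessarily the software used in the talk); the DataFrame df and its columns id, time, and y are hypothetical. The groups argument defines the clustering by person, and re_formula="~time" requests a random intercept and a random slope for time (the columns of Z).

```python
import statsmodels.formula.api as smf

# df assumed long-format with hypothetical columns: id, time, y
model = smf.mixedlm("y ~ time", df, groups=df["id"], re_formula="~time")
result = model.fit()

print(result.summary())       # fixed effects (beta) and variance components
print(result.random_effects)  # estimated per-person random intercepts and slopes (b)
```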

Episodic Memory

Working Memory

Random Effects
Why use them?
- Not everybody responds the same way; even people with similar demographic and clinical information respond differently.
- We want to allow for random differences in baseline level and rate of change that remain unexplained by the covariates.

Random Effects (cont.)
One way to think about them:
- There are two bins with numbers in them (one for baseline level, one for rate of change).
- Every person draws a number from each bin and carries those numbers with them.
- The baseline level and change predicted by the "fixed effects" are adjusted according to a person's random numbers.

Random Effects (cont.)
- Random effects account for correlation among an individual's observations.
- Common within-person correlation structures (illustrated below):
  - compound symmetry: a single common within-individual correlation;
  - autoregressive, AR(1): each assessment is most strongly correlated with the previous one;
  - unstructured: the most flexible.
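
For three equally spaced assessments on person i, the three structures look like this (a sketch in standard notation, not taken from the slides):

```latex
% Compound symmetry: one common correlation rho for every pair of assessments
\mathrm{Corr}(Y_i) = \begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix}
\qquad
% AR(1): correlation decays as the lag between assessments grows
\mathrm{Corr}(Y_i) = \begin{pmatrix} 1 & \rho & \rho^{2} \\ \rho & 1 & \rho \\ \rho^{2} & \rho & 1 \end{pmatrix}
\qquad
% Unstructured: a separate correlation for each pair of assessments
\mathrm{Corr}(Y_i) = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{pmatrix}
```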

Assumptions of the Model
- Linearity
- Homoscedasticity (constant variance)
- Errors are normally distributed
- Random effects are normally distributed
- Missing data are typically assumed missing at random (MAR)

Interpretation of parameter estimates
- Main effects
  - Continuous variable: the average association of a one-unit change in the independent variable with the baseline level of the outcome.
  - Categorical variable: how the baseline level of the outcome compares to the "reference" category.
- Time: the average annual change in the outcome for the "reference individual".
- Interactions with time: how the annual change varies with a one-unit change in an independent variable.
- Covariance parameters (the variances and covariances of the random effects, plus the residual variance).
A worked version of the example model appears below.
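
As an illustration tied to the example later in the talk (control as the reference group, MCI and Demented as indicator variables, time in years; the coefficient labels are my own), the mean structure is:

```latex
E[Y_{ij}] = \beta_0
          + \beta_1\,\mathrm{MCI}_i + \beta_2\,\mathrm{Dem}_i
          + \beta_3\, t_{ij}
          + \beta_4\,(\mathrm{MCI}_i \times t_{ij}) + \beta_5\,(\mathrm{Dem}_i \times t_{ij})
```

Here β0 is the baseline level for a control participant, β1 and β2 compare the baseline levels of MCI and demented participants to controls, β3 is the annual change for a control, and β4 and β5 are the differences in annual change for the MCI and demented groups relative to controls.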

Graphical Tools for Checking Assumptions
- Scatter plot: plot one variable against another (such as the random slope vs. the random intercept).
- Residual plot (a special case): a scatter plot of the residuals vs. the fitted values or a particular independent variable.
- Quantile-quantile (QQ) plot: plots the quantiles of the data against the quantiles of a specific distribution (here, the normal distribution).

Residual Plot
The ideal residual plot is a "cloud" of points with no pattern, evenly distributed about zero. A sketch of residual and QQ plots follows below.
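
A minimal matplotlib/scipy sketch of these two diagnostics, assuming result is a fitted model like the statsmodels sketch above and df is the long-format data with a hypothetical outcome column y:

```python
import matplotlib.pyplot as plt
from scipy import stats

# result assumed to be a fitted statsmodels MixedLM; df the long-format data
fitted = result.fittedvalues
resid = df["y"] - fitted                 # residuals: observed minus fitted

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Residual plot: look for an even, patternless cloud centered on zero
axes[0].scatter(fitted, resid, alpha=0.4)
axes[0].axhline(0, color="black", linewidth=1)
axes[0].set_xlabel("Fitted values")
axes[0].set_ylabel("Residuals")

# QQ plot: residual quantiles against normal quantiles
stats.probplot(resid, dist="norm", plot=axes[1])

plt.tight_layout()
plt.show()
```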

Non-linear relationship
- The residual plot shows a non-linear pattern (in this case, a quadratic pattern).
- It is best to determine which independent variable has this relationship and then include the square of that variable in the model.

Non-constant variance
- The residual plot exhibits a "funnel-like" pattern: the residuals spread further from the zero line as you move along the fitted values.
- This typically suggests transforming the outcome variable (a log transform is most common).

QQ-Plot

Scatter plot of random effects

Example
- Back to some data: we are interested in differences in change between diagnostic groups.
- Outcomes: episodic memory and working memory.
- X includes diagnostic group (control = reference group) and time.
- Incorporate a random intercept and slope, with an unstructured covariance (which allows for correlation between the random effects). A sketch of this specification appears below.
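
A minimal sketch of this specification in statsmodels, under the same assumptions as the earlier sketches; the column names (id, time, dx, epmem, wm) and the group labels are hypothetical, and the random-effects covariance is unstructured by default when re_formula includes both an intercept and time:

```python
import statsmodels.formula.api as smf

# df assumed long-format with hypothetical columns:
#   id, time (years from baseline), dx in {"Control", "MCI", "Demented"},
#   epmem (episodic memory), wm (working memory)
formula = 'epmem ~ C(dx, Treatment(reference="Control")) * time'
m_epmem = smf.mixedlm(formula, df, groups=df["id"], re_formula="~time").fit()
print(m_epmem.summary())

# Same specification for the second outcome
m_wm = smf.mixedlm(formula.replace("epmem", "wm"), df, groups=df["id"], re_formula="~time").fit()
print(m_wm.summary())
```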

Episodic Memory Model Results (baseline)
[Table of estimates, SEs, and p-values for the Intercept, MCI, and Demented terms; the numeric values were not preserved in this transcript.]

Model Results (change)
[Table of estimates, SEs, and p-values for the Time, Time*MCI, and Time*Dem terms; the numeric values were not preserved in this transcript.]

Advanced topics
- Time-varying covariates.
- Simultaneous growth models (modeling two types of longitudinal outcomes together):
  - allow you to directly compare the associations of specific independent variables with the different outcomes;
  - allow you to estimate the correlation between change in the two processes.