Covariance structures in longitudinal analysis Which one to choose?

Repeated Measures

Importance of Covariance Structures
Variability not explained by the fixed effects is modelled in the covariance structure.
The covariance structure represents the background variability that the fixed effects are tested against.
An appropriate structure is therefore needed for valid inferences about the fixed-effects parameters.

Selecting the Appropriate Covariance Structure
Choice of covariance structure is a balance:
Too simple -> Type I error rate increases
Too complex -> power and efficiency decrease

Example
How does the left atrial dimension change over time in patients newly diagnosed with atrial fibrillation?
Atrial fibrillation is an irregularity of the heart’s rhythm. Due to chaotic electrical activity in the upper chambers (atria), the atria quiver instead of contracting in an organized manner.
Atrial enlargement may be related to how easily a subject can return to a normal rhythm and to the likelihood of a blood clot forming, which can lead to stroke.

Heart Diagram

Example - Data
Data source: Canadian Registry of Atrial Fibrillation
Left atrial dimension measured at enrolment, Year 2, Year 4, Year 7 and Year 10
Fit model with fixed effects only -> adjust for age at first diagnosis of atrial fibrillation (AF), gender, hypertension at enrolment and visit year

Example

Model specification
$Y = X\beta + Z\gamma + \varepsilon$
where:
Y = response over time
X = design matrix for fixed effects
$\beta$ = parameters for fixed effects
Z = vector of 1s for the random effects
$\gamma$ = parameters for random effects
$\varepsilon$ = within-subject variation
Without the random-effects term: $Y = X\beta + \varepsilon$

SAS Code
PROC MIXED ;
  CLASS variables ;
  MODEL dependent = fixed-effects ;
  RANDOM random-effects ;
  REPEATED / TYPE = covariance-structure ;

REPEATED vs. RANDOM statement
The RANDOM statement relates to the random effects.
The REPEATED statement relates to the structure of the within-subject errors.
Each statement has a different role, BUT a model with a compound symmetry covariance structure can be specified with either statement.
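The equivalence noted above can be checked numerically. The sketch below uses plain Python rather than SAS. It assumes a random intercept with variance `var_between` and independent within-subject errors with variance `var_within` (both names and all values are illustrative, not taken from the slides), and compares the implied marginal covariance Z G Z' + R for one subject with a compound symmetry matrix.

```python
def marginal_cov_random_intercept(n_times, var_between, var_within):
    """Marginal covariance Z G Z' + R for one subject.

    Z is a column of 1s (random intercept), G = [var_between],
    R = var_within * I (independent within-subject errors).
    """
    return [
        [var_between + (var_within if j == k else 0.0) for k in range(n_times)]
        for j in range(n_times)
    ]


def compound_symmetry(n_times, variance, covariance):
    """Compound symmetry: one common variance, one common covariance."""
    return [
        [variance if j == k else covariance for k in range(n_times)]
        for j in range(n_times)
    ]


# Hypothetical variance components, chosen only for illustration.
v_random = marginal_cov_random_intercept(5, var_between=4.0, var_within=1.0)
v_cs = compound_symmetry(5, variance=5.0, covariance=4.0)
print(v_random == v_cs)  # both routes give the same 5 x 5 matrix
```

One design point: the match holds only when the common covariance is non-negative, because a random-intercept variance cannot be negative, whereas a compound symmetry structure on the errors can accommodate a negative common covariance.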

Models with REPEATED Statement only
No random effects specified in the model -> assumes the random-effects variability is small compared to the within-subject error
The covariance structure is based only on the within-subject error.

General covariance structure
Homogeneity is assumed for practical reasons: it reduces the number of parameters to be estimated.
It is possible to relax the homogeneity assumption; it can be tested, but a sufficient amount of data is needed to specify the separate structures.

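Block Diagonal Covariance Matrix
$r \sim N\left(0,\ \operatorname{diag}(\Sigma, \Sigma, \ldots, \Sigma)\right)$, with one within-subject covariance block $\Sigma$ per subject and zeros elsewhere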
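As a rough illustration of the block-diagonal layout, the following Python sketch stacks one within-subject block per subject along the diagonal. The helper name `block_diag`, the 2 x 2 block and the choice of three subjects are all hypothetical.

```python
def block_diag(blocks):
    """Place square blocks along the diagonal and fill the rest with zeros."""
    size = sum(len(block) for block in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for j, row in enumerate(block):
            for k, value in enumerate(row):
                out[offset + j][offset + k] = value
        offset += len(block)
    return out


# Hypothetical within-subject block (2 time points) shared by 3 subjects.
sigma = [[2.0, 0.5], [0.5, 2.0]]
v = block_diag([sigma] * 3)
for row in v:
    print(row)
```

Observations from different subjects get zero covariance, so only the entries inside each block need to be modelled.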

Covariance structures
Simple (VC – Variance Components): $\Sigma_{jk} = \sigma^2$ if $j = k$, 0 otherwise
1 parameter

Covariance structures
Unstructured (UN): $\Sigma_{jk} = \sigma_{jk}$, a separate variance for each time point and a separate covariance for each pair of time points
15 parameters

Covariance structures
Compound Symmetry (CS): $\Sigma_{jk} = \sigma^2 + \sigma_1$ if $j = k$, $\sigma_1$ otherwise
2 parameters

Covariance structures
First-order Autoregressive [AR(1)]: $\Sigma_{jk} = \sigma^2 \rho^{|j-k|}$
2 parameters

Covariance structures
Toeplitz (TOEP): $\Sigma_{jk} = \sigma^2$ if $j = k$, $\sigma_{|j-k|}$ otherwise
5 parameters
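The structures listed above can be sketched as small matrix builders. This is a minimal Python illustration using the standard textbook forms; the function names and numeric values are hypothetical. `n_params` reproduces the parameter counts shown on the slides for 5 time points.

```python
def simple(t, var):
    """VC: common variance, zero covariances."""
    return [[var if j == k else 0.0 for k in range(t)] for j in range(t)]


def compound_symmetry(t, var, cov):
    """CS: common variance, common covariance."""
    return [[var if j == k else cov for k in range(t)] for j in range(t)]


def ar1(t, var, rho):
    """AR(1): covariance decays as rho ** lag."""
    return [[var * rho ** abs(j - k) for k in range(t)] for j in range(t)]


def toeplitz(bands):
    """TOEP: bands[0] is the variance, bands[h] the covariance at lag h."""
    t = len(bands)
    return [[bands[abs(j - k)] for k in range(t)] for j in range(t)]


def n_params(structure, t):
    """Number of covariance parameters for t time points."""
    counts = {"VC": 1, "CS": 2, "AR(1)": 2, "TOEP": t, "UN": t * (t + 1) // 2}
    return counts[structure]


for name in ("VC", "UN", "CS", "AR(1)", "TOEP"):
    print(name, n_params(name, 5))

# Hypothetical values: the AR(1) covariance shrinks with the lag.
print(ar1(3, var=1.0, rho=0.5)[0])
```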

Draftsman’s plots
2D array of scatterplots, one for each pair of time-lagged observations
For 3 time points Y1, Y2 and Y3: Y1 vs. Y2, Y1 vs. Y3, Y2 vs. Y3
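The panels of a draftsman's plot correspond to all pairs of time points. The Python sketch below enumerates those pairs and attaches a Pearson correlation to each, as a numeric companion to the scatterplots. The data are hypothetical and the helper names are illustrative.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def draftsman_pairs(data):
    """Map each pair of time points to the correlation across subjects."""
    return {
        (a, b): pearson(data[a], data[b]) for a, b in combinations(data, 2)
    }


# Hypothetical measurements for four subjects at three time points.
data = {
    "Y1": [1.0, 2.0, 3.0, 4.0],
    "Y2": [1.5, 2.5, 2.9, 4.2],
    "Y3": [2.0, 1.8, 3.5, 3.9],
}
for pair, r in draftsman_pairs(data).items():
    print(pair, round(r, 2))
```

With T time points there are T(T-1)/2 panels, so 3 panels here and 10 for the five visits in the example data.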

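Draftsman’s plot – Simulation examples
[Figure: draftsman’s plot for independent observations; rows Y1, Y2, Y3 against columns Y2, Y3, Y4]

Draftsman’s plot – Simulation examples
[Figure: draftsman’s plots for the autoregressive and compound symmetry structures]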

Example – Draftsman’s plot

Example - Correlation matrix
[Table: pairwise correlations of left atrial dimension at years 0, 2, 4, 7 and 10 (LA_0, LA_2, LA_4, LA_7, LA_10)]

Variogram
Graphical description of the time/spatial correlation between observations
Summarises the relationship between the differences in pairs of measurements and the distance between the corresponding points
Can be used with equally or unequally spaced observation periods

Variogram
Calculate the sample variogram components:
$v_{ijk} = \tfrac{1}{2}(r_{ij} - r_{ik})^2$, where $r_{ij}$ = residual
$u_{ijk} = |t_{ij} - t_{ik}|$, where $t_{ij}$ = time
Plot $v_{ijk}$ vs. $u_{ijk}$
Process variance: estimated by the average of $\tfrac{1}{2}(r_{ij} - r_{lk})^2$ for $i \neq l$
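The calculation above can be sketched in a few lines of Python. The residuals and times below are hypothetical, and the function names are illustrative. Each subject contributes one (u, v) point per pair of its own occasions, while the process variance uses pairs of residuals from different subjects.

```python
from itertools import combinations


def sample_variogram(times, residuals):
    """Return (u, v) points: time lag and half squared residual difference.

    times and residuals hold one list per subject, in the same order.
    """
    points = []
    for t_i, r_i in zip(times, residuals):
        for j, k in combinations(range(len(t_i)), 2):
            u = abs(t_i[j] - t_i[k])
            v = 0.5 * (r_i[j] - r_i[k]) ** 2
            points.append((u, v))
    return points


def process_variance(residuals):
    """Average of 0.5 * (r_ij - r_lk)^2 over pairs from different subjects."""
    total, count = 0.0, 0
    for r_i, r_l in combinations(residuals, 2):
        for a in r_i:
            for b in r_l:
                total += 0.5 * (a - b) ** 2
                count += 1
    return total / count


# Hypothetical residuals for two subjects measured at years 0, 2 and 4.
times = [[0, 2, 4], [0, 2, 4]]
residuals = [[1.0, 0.0, -1.0], [-0.5, 0.5, 0.0]]
for u, v in sample_variogram(times, residuals):
    print(u, v)
print(process_variance(residuals))
```

In practice the v values are usually averaged or smoothed within each lag u before plotting, since the raw points are very noisy.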

Variogram - Theoretical
[Figure: theoretical variogram plotted against time lag, with the process variance partitioned into measurement error, within-subject correlation and random effects]

Variogram – Sitka tree example

Example - Variogram

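Which covariance structure?
Fit the model with different covariance structures
Compare goodness-of-fit statistics to choose the covariance structure
Goodness-of-fit statistics:
Bayesian information criterion (BIC): $\mathrm{BIC} = -2\,\mathrm{loglik} + d \log n$
Akaike information criterion (AIC): $\mathrm{AIC} = -2\,\mathrm{loglik} + 2d$
where loglik is the maximized log-likelihood, d the number of parameters and n the sample size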
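A minimal Python sketch of this comparison follows. The -2 log-likelihood values and the sample size are hypothetical placeholders chosen for illustration; they are not the results from the atrial fibrillation example.

```python
import math


def aic(neg2_loglik, d):
    """AIC = -2 loglik + 2d."""
    return neg2_loglik + 2 * d


def bic(neg2_loglik, d, n):
    """BIC = -2 loglik + d log n."""
    return neg2_loglik + d * math.log(n)


# Hypothetical (-2 loglik, number of covariance parameters) per structure.
fits = {
    "UN": (1000.0, 15),
    "CS": (1012.0, 2),
    "TOEP": (1008.0, 5),
    "AR(1)": (1030.0, 2),
}
n = 200  # hypothetical sample size

for name, (neg2ll, d) in fits.items():
    print(name, aic(neg2ll, d), round(bic(neg2ll, d, n), 1))

# Smaller is better for both criteria.
best_aic = min(fits, key=lambda s: aic(*fits[s]))
best_bic = min(fits, key=lambda s: bic(fits[s][0], fits[s][1], n))
print(best_aic, best_bic)
```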

Estimation method for the covariance parameters
Maximum Likelihood (ML) versus Restricted Maximum Likelihood (REML)
Both are based on likelihood principles -> properties of consistency, asymptotic normality and efficiency
Differences between the two increase as the number of fixed effects in the model increases
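The effect of the number of fixed effects is easiest to see in the simplest special case, a linear model with independent errors and a single variance, where ML divides the residual sum of squares by n and REML divides it by n - p. The Python sketch below uses that special case only; the general mixed model has no such closed form, and the numbers are hypothetical.

```python
def variance_estimates(rss, n, p):
    """ML and REML residual-variance estimates for independent errors.

    rss: residual sum of squares, n: observations, p: fixed-effect parameters.
    """
    if n <= p:
        raise ValueError("need more observations than fixed-effect parameters")
    return rss / n, rss / (n - p)


# Hold rss and n fixed (artificially) to isolate the effect of p.
for p in (1, 5, 20):
    ml, reml = variance_estimates(rss=100.0, n=50, p=p)
    print(p, round(ml, 3), round(reml, 3), round(ml / reml, 3))
```

The ratio ML/REML equals (n - p)/n, so the downward bias of the ML estimate grows as more fixed effects are fitted.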

ML vs. REML
Goodness-of-fit statistics from the two methods differ in which part of the model they assess
ML: describes the fit of the whole model (fixed and random effects)
REML: describes the fit of the stochastic portion (random effects)

Which goodness-of-fit statistic?
Bayesian information criterion (BIC): $\mathrm{BIC} = -2\,\mathrm{loglik} + d \log n$
Akaike information criterion (AIC): $\mathrm{AIC} = -2\,\mathrm{loglik} + 2d$
The BIC has a higher penalty than the AIC for including more parameters -> it favours a simpler model
A model that is too simple has inflated Type I error rates -> typically, choose the model based on AIC
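The penalty comparison follows directly from the two formulas. For any d > 0, and taking log as the natural logarithm:

```latex
d \log n > 2d \iff \log n > 2 \iff n > e^{2} \approx 7.39
```

So the BIC penalty is the larger one as soon as n is at least 8, which covers essentially any longitudinal data set of practical size.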

Example
Which covariance structure fits the best?
[Table: fit statistics (-2 Res Log Likelihood, AIC, BIC; smaller is better) for UN (15 parameters), CS (2), TOEP (5) and AR(1) (2)]

Fixed Effects Parameter Estimates
[Table: estimate, SE, t-statistic and p-value for Intercept, Age and Female under the UN, CS, TOEP and AR(1) covariance structures; the intercept has p < .0001 under all four structures]

Fixed Effect Parameters – cont’d
[Table: estimate, SE, t-statistic and p-value for Hypertension and Time under the UN, CS, TOEP and AR(1) covariance structures; time has p < .0001 under all four structures]

Likelihood ratio test (LRT)
For nested models, can also test whether the additional parameters give a statistically significant improvement in the model
For the example, the LRT for TOEP (5 parameters) vs. CS (2 parameters) -> choose the CS model
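A likelihood ratio test of this kind can be sketched as follows. The test statistic is the difference in -2 log-likelihood between the simpler and the more complex model, referred to a chi-square distribution with degrees of freedom equal to the number of extra parameters (5 - 2 = 3 here). The -2 log-likelihood values are hypothetical, not the ones from the example, and `chi2_sf` is a small stdlib-only helper written for this sketch.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then step up two df at a time.
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Hypothetical -2 log-likelihoods: CS (2 parameters) vs. TOEP (5 parameters).
neg2ll_cs, neg2ll_toep = 1012.0, 1008.0
statistic = neg2ll_cs - neg2ll_toep
df = 5 - 2
p_value = chi2_sf(statistic, df)
print(statistic, df, round(p_value, 3))
print("keep CS" if p_value > 0.05 else "prefer TOEP")
```

When the models are fitted by REML, this comparison is only meaningful if both models share the same fixed effects.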

Summary
Graphical plots help identify the covariance structure
AIC and BIC to choose between covariance structures
LRT to test whether additional parameters are warranted
