Modeling Continuous Longitudinal Data. Introduction to continuous longitudinal data: Examples.


Modeling Continuous Longitudinal Data

Introduction to continuous longitudinal data: Examples

Homeopathy vs. placebo in treating pain after surgery. Figure: mean pain assessments by visual analogue scales (VAS) on the day of surgery and on days 1-7 after surgery (morning and evening). Lokken, P. et al. BMJ 1995;310: Copyright ©1995 BMJ Publishing Group Ltd.

Divalproex vs. placebo for treating bipolar depression. Davis et al. “Divalproex in the treatment of bipolar depression: A placebo controlled study.” J Affective Disorders 85 (2005)

Randomized trial of in-field treatments of acute mountain sickness. Figure: mean (SD) score of acute mountain sickness in subjects treated with simulated descent (one hour of treatment in the hyperbaric chamber) or dexamethasone. Keller, H.-R. et al. BMJ 1995;310: Copyright ©1995 BMJ Publishing Group Ltd.

Pint of milk vs. control on bone acquisition in adolescent females. Figure: mean (SE) percentage increases in total body bone mineral and bone density over 18 months. P values are for the differences between groups by repeated measures analysis of variance. Cadogan, J. et al. BMJ 1997;315: Copyright ©1997 BMJ Publishing Group Ltd.

Counseling vs. control on smoking in pregnancy. Hovell, M. F et al. BMJ 2000;321: Copyright ©2000 BMJ Publishing Group Ltd.

Longitudinal data: broad form. One row per subject, with columns id, time1, time2, time3, time4 (data values not preserved in this transcript). Hypothetical data from Twisk, chapter 3, page 26, table 3.4. Jos W. R. Twisk. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide. Cambridge University Press, 2003.

Longitudinal data: long form. One row per subject per time point, with columns id, time, score (data values not preserved in this transcript). Hypothetical data from Twisk, chapter 3, page 26, table 3.4.

Converting data from broad to long in SAS:

    data long;
      set broad;
      /* write one output row per subject per time point */
      time=1; score=time1; output;
      time=2; score=time2; output;
      time=3; score=time3; output;
      time=4; score=time4; output;
    run;
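For readers who want to see the same reshaping outside SAS, here is a minimal Python sketch of the broad-to-long conversion. The two rows of scores are made-up illustrative values (not Twisk's table), and the function name `broad_to_long` is hypothetical.

```python
# Made-up broad-form rows: one row per subject, one column per time point.
broad = [
    {"id": 1, "time1": 31, "time2": 29, "time3": 15, "time4": 26},
    {"id": 2, "time1": 24, "time2": 28, "time3": 20, "time4": 32},
]

def broad_to_long(rows, n_times=4):
    """Return one record per subject per time point, like the SAS data step."""
    long_rows = []
    for row in rows:
        for t in range(1, n_times + 1):
            long_rows.append({"id": row["id"], "time": t, "score": row[f"time{t}"]})
    return long_rows

long = broad_to_long(broad)
for rec in long:
    print(rec["id"], rec["time"], rec["score"])
```

Each subject contributes four records, which is the shape the profile plots and the naive ANOVA expect.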

Profile plots (use long form) The plot tells a lot!

Mean response plot

Superimposed…

smoothed

Superimposed…

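Two groups (e.g., treatment vs. placebo). Broad form with columns id, group, time1, time2, time3, time4; three subjects in group A and three in group B (data values not preserved in this transcript). Hypothetical data from Twisk, chapter 3, page 40, table 3.7.

Profile plots by group (A and B)

Mean plots by group (A and B)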

Possible questions…
1. Overall, are there significant differences between time points? From plots: looks like some differences (time3 and time4 look different).
2. Overall, are there significant changes from baseline? From plots: at time3 or time4, maybe.
3. Do the two groups differ at any time points? From plots: certainly at baseline; some difference everywhere.
4. Do the two groups differ in their responses over time?** From plots: their response profiles look similar over time, though A and B are closer by the end.

Statistical analysis strategies
Traditional approaches (this week):
Strategy 1: ANCOVA on the final measurement, adjusting for baseline differences (end-point analysis)
Strategy 2: repeated-measures ANOVA, “univariate” approach
Strategy 3: “multivariate” ANOVA approach
Newer approaches (next week):
Strategy 4: GEE
Strategy 5: Mixed models
In two/three weeks:
Strategy 6: Modeling change

Comparison of traditional and new methods FROM: Ralitza Gueorguieva, PhD; John H. Krystal, MD Move Over ANOVA : Progress in Analyzing Repeated-Measures Data and Its Reflection in Papers Published in the Archives of General Psychiatry. Arch Gen Psychiatry. 2004;61:

Things to consider:
1. Spacing of time intervals. Repeated-measures ANOVA and MANOVA require that all subjects are measured at the same time intervals (our plots above assumed this too!). MANOVA weights all time intervals evenly (as if evenly spaced).
2. Assumptions of the model. ALL strategies assume a normally distributed outcome and homogeneity of variances, but all strategies are robust against violations of this assumption, especially if the sample size is >30. **Univariate repeated-measures ANOVA also assumes sphericity, or compound symmetry.
3. Missing data. All traditional analyses require imputation of missing data.
(Also need to know: does the SAS PROC require the long or broad form of the data?)

Compound symmetry
Compound symmetry requires:
(a) The variance of the outcome variable must be the same at each time point.
(b) The correlations between repeated measurements must be equal, regardless of the time interval between measurements.

(a) Variances at each time point (visually). Does the variance look equal across time points? It looks like the most variability is at time1 and the least at time4.

(a) Variances at each time point (numerically). Table of id, time1, time2, time3, time4 with the variance of each column in the last row (values not preserved in this transcript).

(b) Correlation (covariance) across time points. Matrix of time1 to time4 against time1 to time4 (values not preserved in this transcript). We certainly do NOT have equal correlations! Time1 and time2 are highly correlated, but time1 and time3 are inversely correlated!
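Both checks can be reproduced numerically. The following is a minimal Python sketch, assuming broad-form data held as a list of rows; the scores are made-up illustrative values (not Twisk's table) and the helper names are hypothetical.

```python
from math import sqrt

# Made-up broad-form scores: rows are subjects, columns are time1..time4.
scores = [
    [31, 29, 15, 26],
    [24, 28, 20, 32],
    [14, 20, 28, 30],
    [38, 34, 30, 34],
    [25, 29, 25, 29],
    [30, 28, 16, 34],
]

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Sample covariance with the usual n - 1 denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

columns = [list(col) for col in zip(*scores)]       # one list per time point
variances = [covariance(c, c) for c in columns]     # check (a)
corr = [[correlation(a, b) for b in columns] for a in columns]  # check (b)

print("variances:", [round(v, 2) for v in variances])
for row in corr:
    print([round(r, 2) for r in row])
```

Under compound symmetry the printed variances would be roughly equal and the off-diagonal correlations would be roughly equal to each other.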

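Compound symmetry would look like: a time1 to time4 covariance matrix with the same variance in every diagonal cell and the same covariance in every off-diagonal cell (values not preserved in this transcript).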
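Written out for four time points, conditions (a) and (b) give the following covariance structure. The symbols are notation introduced here for illustration: a common variance sigma squared and a common correlation rho.

```latex
\Sigma \;=\; \sigma^{2}
\begin{pmatrix}
1    & \rho & \rho & \rho \\
\rho & 1    & \rho & \rho \\
\rho & \rho & 1    & \rho \\
\rho & \rho & \rho & 1
\end{pmatrix}
```

Every diagonal entry is the same variance and every off-diagonal entry is the same covariance, no matter how far apart the two time points are.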

Missing Data
Very important to fill in missing data! Otherwise, you have to throw out the whole observation (the subject's entire row in broad form). With missing data, changes in the mean over time may just reflect the drop-out pattern; you cannot compare time point 1 with 50 people to time point 2 with 35 people! We will implement the classic “last observation carried forward” (LOCF) strategy for simplicity. Other, more complicated imputation strategies may be more appropriate.

LOCF (Last Observation Carried Forward). Table of HRSD 1 to HRSD 4 scores for four subjects, shown before and after each missing score is replaced by that subject's most recent observed score (values not preserved in this transcript).
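A minimal Python sketch of LOCF on one subject's row follows. The HRSD scores are made-up illustrative values, `None` marks a missed visit, and the function name `locf` is hypothetical.

```python
def locf(values):
    """Replace each missing value (None) with the most recent observed value.

    A missing value with nothing observed before it stays missing.
    """
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Made-up HRSD scores at four visits for four subjects.
subjects = {
    "Subject 1": [20, 17, None, None],
    "Subject 2": [22, None, 15, None],
    "Subject 3": [19, 18, 16, 12],
    "Subject 4": [25, None, None, None],
}

for name, row in subjects.items():
    print(name, row, "->", locf(row))
```

Note that a subject who drops out after baseline (Subject 4 here) is assigned "no change" at every later visit, which is one reason more careful imputation strategies may be more appropriate.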

Strategy 1: End-point analysis

    proc glm data=broad;
      class group;
      model time4 = time1 group;
    run;

Removes the repeated-measures problem by considering only a single time point (the final one). Ignores intermediate data completely.
Asks whether or not the two group means differ at the final time point, adjusting for differences at baseline (using ANCOVA).
Comparing groups at every follow-up time point in this way would hugely increase your type I error.
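To make the "adjusting for baseline" step concrete, here is a minimal Python sketch of a two-group ANCOVA with a common slope: the pooled within-group slope of the final score on baseline is used to move each group's final mean to the overall baseline mean. The data are made up so that the true adjusted difference is 3, and the function name is hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def ancova_adjusted_means(baseline, outcome, group):
    """Return (pooled within-group slope, {group: baseline-adjusted mean})."""
    labels = sorted(set(group))
    sxy = sxx = 0.0
    stats = {}
    for g in labels:
        xs = [x for x, lab in zip(baseline, group) if lab == g]
        ys = [y for y, lab in zip(outcome, group) if lab == g]
        mx, my = mean(xs), mean(ys)
        stats[g] = (mx, my)
        # accumulate within-group cross-products and squares
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    grand_x = mean(baseline)
    adjusted = {g: my - slope * (mx - grand_x) for g, (mx, my) in stats.items()}
    return slope, adjusted

# Made-up data: time4 = 2 + 0.5 * time1, plus 3 for group B.
time1 = [10, 12, 14, 20, 22, 24]
group = ["A", "A", "A", "B", "B", "B"]
time4 = [7, 8, 9, 15, 16, 17]

slope, adj = ancova_adjusted_means(time1, time4, group)
raw_diff = mean(time4[3:]) - mean(time4[:3])
print("raw difference B - A:", raw_diff)
print("adjusted difference B - A:", adj["B"] - adj["A"])
```

In this constructed example the raw difference in final means is larger than the adjusted one, because group B also started higher at baseline.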

Strategy 1: End-point analysis (PROC GLM output; numeric values not preserved in this transcript)
The output lists the overall ANOVA table (Source, DF, Sum of Squares, Mean Square, F Value, Pr > F for Model, Error and Corrected Total), the fit statistics (R-Square, Coeff Var, Root MSE, time4 Mean), the Type I SS tests for time1 and group, and the time4 LSMEAN for groups A and B with Pr > |t|.
The LSMEANs are the least-squares means of the two groups at time4, adjusted for baseline differences (not significantly different).

From end-point analysis…
1. Overall, are there significant differences between time points? Can’t say.
2. Overall, are there significant changes from baseline? Can’t say.
3. Do the two groups differ at any time points? They don’t differ at time4.
4. Do the two groups differ in their responses over time? Can’t say.

Strategy 2: univariate repeated-measures ANOVA (rANOVA). Just good old regular ANOVA, but accounting for between-subject differences.

BUT first… Naive analysis
Run ANOVA on the long form of the data, ignoring correlations within subjects (also ignoring group for now):

    proc anova data=long;
      class time;
      model score = time;
    run;

Compares means from each time point as if they were independent samples (analogous to using a two-sample t-test when a paired t-test is appropriate). Results in loss of power!

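One-way ANOVA (naïve). Table of id, time1, time2, time3, time4 with column means: variability between the time means (between times) vs. variability within each time column (within time). Values not preserved in this transcript.

One-way ANOVA results (The ANOVA Procedure, Dependent Variable: score; numeric values not preserved in this transcript). The output lists Source, DF, Sum of Squares, Mean Square, F Value, Pr > F for Model, Error and Corrected Total, followed by the Anova SS test for time. Twisk: Output 3.3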

Univariate repeated-measures ANOVA
Explain away some error variability by accounting for differences between subjects:
- SSE was (value not preserved in this transcript)
- This will be reduced by the variability between subjects

    proc glm data=broad;
      model time1-time4=;
      repeated time;
    run; quit;
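The reduction in error variability can be seen by splitting the sums of squares by hand. This is a minimal Python sketch for a single group with complete data; the scores are made-up illustrative values (not Twisk's table), so the printed numbers will not match the SAS output.

```python
# Made-up broad-form scores: rows are subjects, columns are time1..time4.
scores = [
    [31, 29, 15, 26],
    [24, 28, 20, 32],
    [14, 20, 28, 30],
    [38, 34, 30, 34],
    [25, 29, 25, 29],
    [30, 28, 16, 34],
]

n, k = len(scores), len(scores[0])            # subjects, time points
grand = sum(map(sum, scores)) / (n * k)
time_means = [sum(col) / n for col in zip(*scores)]
subj_means = [sum(row) / k for row in scores]

ss_total = sum((x - grand) ** 2 for row in scores for x in row)
ss_time = n * sum((m - grand) ** 2 for m in time_means)       # between times
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)   # between subjects

sse_naive = ss_total - ss_time            # error term of the naive one-way ANOVA
sse_repeated = sse_naive - ss_subjects    # error term after removing subject differences

# F for time: naive (error df = n*k - k) vs. repeated measures (error df = (k-1)*(n-1))
f_naive = (ss_time / (k - 1)) / (sse_naive / (n * k - k))
f_repeated = (ss_time / (k - 1)) / (sse_repeated / ((k - 1) * (n - 1)))

print("SSE naive:", round(sse_naive, 2), " SSE repeated:", round(sse_repeated, 2))
print("F naive:", round(f_naive, 3), " F repeated:", round(f_repeated, 3))
```

The time sum of squares is identical in both analyses; only the error term and its degrees of freedom change.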

rANOVA. Table of id, time1, time2, time3, time4 with a mean per subject (between subjects) and a mean per time point (values not preserved in this transcript).

rANOVA results (The GLM Procedure, Repeated Measures Analysis of Variance: Univariate Tests of Hypotheses for Within Subject Effects; table values not preserved in this transcript)
Columns: Source, DF, Type III SS, Mean Square, F Value, Pr > F, and Adj Pr > F (G - G and H - F). Rows: time (between-time variability) and Error(time) (unexplained variability), followed by the Greenhouse-Geisser Epsilon and the Huynh-Feldt Epsilon.
Repeated-measures p-value = .0752. After G-G correction for non-sphericity, p = .1311 (the H-F correction gives .1114).
Idea of the G-G and H-F corrections: analogous to the pooled vs. unpooled variance t-test. If we have to estimate more things because the variances/covariances aren’t equal, then we lose some degrees of freedom and the p-value increases.
These epsilons should be 1.0 if sphericity holds. The sphericity assumption appears violated.
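As an illustration of why epsilon equals 1.0 under sphericity, here is a minimal Python sketch of the Greenhouse-Geisser (Box) epsilon computed directly from a covariance matrix of the repeated measures. The two matrices are made-up examples, the function name is hypothetical, and SAS's own computation may differ in detail.

```python
def gg_epsilon(cov):
    """Greenhouse-Geisser (Box) epsilon from a k x k covariance matrix."""
    k = len(cov)
    grand = sum(sum(row) for row in cov) / (k * k)        # mean of all entries
    diag_mean = sum(cov[i][i] for i in range(k)) / k      # mean variance
    row_means = [sum(row) / k for row in cov]
    numerator = (k * (diag_mean - grand)) ** 2
    denominator = (k - 1) * (
        sum(x * x for row in cov for x in row)
        - 2 * k * sum(m * m for m in row_means)
        + k * k * grand * grand
    )
    return numerator / denominator

k = 4
# Compound symmetry: variance 4 everywhere, covariance 2 everywhere.
compound = [[4.0 if i == j else 2.0 for j in range(k)] for i in range(k)]
# Correlation that decays with the gap between time points (not spherical).
decaying = [[0.8 ** abs(i - j) for j in range(k)] for i in range(k)]

eps_cs = gg_epsilon(compound)
eps_decay = gg_epsilon(decaying)
print("compound symmetry:", round(eps_cs, 4))
print("decaying correlation:", round(eps_decay, 4))
```

The correction multiplies both the numerator and denominator degrees of freedom of the F-test by epsilon, so an epsilon below 1 gives a larger p-value; its lower bound is 1/(k-1).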

With two groups: Naive analysis
Run ANOVA on the long form of the data, ignoring correlations within subjects:

    proc anova data=long;
      class time group;   /* group (A/B) must also be a CLASS variable */
      model score = time group group*time;
    run;

As if there are 8 independent samples: 2 groups at each time point.

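Two-way ANOVA (naïve). Table of grp, time1, time2, time3, time4 with means per time point within group A and within group B; overall mean=27. Variability is split into within time and between groups (table values not preserved in this transcript). Recall: SST= (value not preserved); group by time= (expression not preserved) =26.79

Results: Naïve analysis (The ANOVA Procedure, Dependent Variable: score; numeric values not preserved in this transcript). The output lists Source, DF, Sum of Squares, Mean Square, F Value, Pr > F for Model, Error and Corrected Total, followed by the Anova SS tests for time, group and time*group.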

Univariate repeated-measures ANOVA
Reduce error variability by between-subject differences:
- SSE was (value not preserved in this transcript)
- This will be reduced by the variability between subjects

    proc glm data=broad;
      class group;
      model time1-time4 = group;
      repeated time;
    run; quit;

rANOVA. Table of grp, time1, time2, time3, time4 with a mean per subject (between subjects in each group) and means per time point within group A and within group B; overall mean=27 (table values not preserved in this transcript).

rANOVA results (two groups) (The GLM Procedure, Repeated Measures Analysis of Variance; table values not preserved in this transcript)
Univariate Tests of Hypotheses for Within Subject Effects: Source, DF, Type III SS, Mean Square, F Value, Pr > F and Adj Pr > F (G - G, H - F) for time, time*group and Error(time), followed by the Greenhouse-Geisser Epsilon and the Huynh-Feldt Epsilon. The time*group test is what we care about! No apparent difference in responses over time between the groups.
Tests of Hypotheses for Between Subjects Effects: Source, DF, Type III SS, Mean Square, F Value, Pr > F for group and Error. Usually of less interest!

From rANOVA analysis…
1. Overall, are there significant differences between time points? No, Time not statistically significant (p=.1743, G-G).
2. Overall, are there significant changes from baseline? No, Time not statistically significant.
3. Do the two groups differ at any time points? No, Group not statistically significant (p=.1408).
4. Do the two groups differ in their responses over time?** No, not even close; Group*Time (p-value>.60).

Strategy 3: rMANOVA
Multivariate: more than one dependent variable.
Multivariate approach to repeated measures: treats the response variable as a multivariate response vector.
Not just for repeated measures, but appropriate for other situations with multiple dependent variables.

Analogous to paired t-test
Recall: the paired t-test compares the mean of the difference values between two time points to its standard error.
MANOVA is just a paired t-test where the outcome variable is a vector of differences rather than a single difference. The resulting statistic is called Hotelling's Trace. Where T is the number of time points, the vector holds T-1 differences.
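The slide's formulas did not survive in this transcript; the standard textbook forms are sketched below as a reminder. The notation is introduced here for illustration: n subjects, T time points, p = T - 1 differences per subject, the mean difference and its standard deviation for the paired t-test, and the mean difference vector with its sample covariance matrix for the multivariate version.

```latex
% Paired t-test on a single difference d_i = y_{i2} - y_{i1}
t \;=\; \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \text{df} = n - 1

% One-sample Hotelling statistic on the vector of p = T - 1 differences
T^{2}_{\text{Hotelling}} \;=\; n \, \bar{\mathbf{d}}^{\top} \mathbf{S}_d^{-1} \, \bar{\mathbf{d}},
\qquad
F \;=\; \frac{n - p}{p\,(n - 1)} \, T^{2}_{\text{Hotelling}} \;\sim\; F_{p,\; n - p}
```

With p = 1 the second expression reduces to the square of the paired t statistic. In this one-sample setting the Hotelling-Lawley trace reported by SAS equals this statistic divided by n - 1.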

T-1 differences. Table of id, group, diff1, diff2, diff3 for the three subjects in group A and the three in group B (values not preserved in this transcript).
Note: weights all differences equally, so hard to interpret if time intervals are unevenly spaced.
Note: assumes the differences follow a multivariate normal distribution, plus the multivariate homogeneity of variances assumption.
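A minimal Python sketch of the one-sample Hotelling statistic on the T-1 differences is given below, ignoring group. The scores are made-up illustrative values, and the helper names (`solve`, `hotelling_t2`) are hypothetical. It also shows that the choice of difference set (successive differences vs. differences from baseline) does not change the statistic.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def hotelling_t2(diffs):
    """Return (T-squared, F) for H0: all mean differences are zero."""
    n, p = len(diffs), len(diffs[0])
    dbar = [sum(col) / n for col in zip(*diffs)]
    cov = [[sum((row[i] - dbar[i]) * (row[j] - dbar[j]) for row in diffs) / (n - 1)
            for j in range(p)] for i in range(p)]
    w = solve(cov, dbar)                             # S^{-1} * dbar
    t2 = n * sum(d * wi for d, wi in zip(dbar, w))
    f = (n - p) / (p * (n - 1)) * t2                 # F with (p, n - p) df
    return t2, f

# Made-up broad-form scores: rows are subjects, columns are time1..time4.
scores = [
    [31, 29, 15, 26], [24, 28, 20, 32], [14, 20, 28, 30],
    [38, 34, 30, 34], [25, 29, 25, 29], [30, 28, 16, 34],
]
successive = [[r[t + 1] - r[t] for t in range(3)] for r in scores]
from_baseline = [[r[t] - r[0] for t in range(1, 4)] for r in scores]

t2_succ, f_succ = hotelling_t2(successive)
t2_base, f_base = hotelling_t2(from_baseline)
print("T2 (successive):", round(t2_succ, 4), " F:", round(f_succ, 4))
print("T2 (from baseline):", round(t2_base, 4), " F:", round(f_base, 4))
```

The covariance matrix of the differences must be invertible, which requires more subjects than differences (here 6 subjects and 3 differences).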

On same output as rANOVA

    proc glm data=broad;
      model time1-time4=;
      repeated time;
    run; quit;

Null hypothesis: diff1=0, diff2=0, diff3=0

Results (time only)
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect (H = Type III SSCP Matrix for time; E = Error SSCP Matrix; S=1 M=0.5 N=0.5). The table reports Value, F Value, Num DF, Den DF and Pr > F for Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace and Roy's Greatest Root (values not preserved in this transcript).
The separate F-statistics (slightly different versions of the MANOVA statistic) all give the same answer: change over time is not significant. Compare to the rANOVA results: G-G time p-value=.13.
Use Wilks’ Lambda in general. Use Pillai’s Trace for small sample sizes (when assumptions of the model are violated).

On same output as rANOVA

    proc glm data=broad;
      class group;
      model time1-time4 = group;
      repeated time;
    run; quit;

Results (two groups) (The GLM Procedure, Repeated Measures Analysis of Variance; table values not preserved in this transcript)
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect: Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, Roy's Greatest Root. No differences between times.
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*group Effect: Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, Roy's Greatest Root. No differences in change over time between the groups (compare to G-G time*group p-value=.6954).

From rMANOVA analysis…
1. Overall, are there significant differences between time points? No, Time not statistically significant (p=.3287).
2. Overall, are there significant changes from baseline? No, Time not statistically significant.
3. Do the two groups differ at any time points? Can’t say (never looked at raw scores, only difference values).
4. Do the two groups differ in their responses over time?** No, not even close; Group*Time (p-value=.89).

Can also test for the shape of the response profile…

    proc glm data=broad;
      class group;
      model time1-time4 = group;
      /* 4 levels of time (time1-time4) give 3 polynomial contrasts */
      repeated time 4 polynomial / summary;
    run; quit;

The GLM Procedure, Repeated Measures Analysis of Variance: Analysis of Variance of Contrast Variables (table values not preserved in this transcript)
time_N represents the nth degree polynomial contrast for time.
Contrast Variable: time_1 (linear): Source, DF, Type III SS, Mean Square, F Value, Pr > F for Mean, group, Error
Contrast Variable: time_2 (quadratic): same layout
Contrast Variable: time_3 (cubic): same layout
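To show what the linear, quadratic and cubic contrast variables measure, here is a minimal Python sketch using the standard orthogonal polynomial weights for four equally spaced time points. Equal spacing is an assumption here, and the profiles and function name are made up for illustration.

```python
from math import sqrt

# Orthogonal polynomial weights for 4 equally spaced time points.
LINEAR = (-3, -1, 1, 3)
QUADRATIC = (1, -1, -1, 1)
CUBIC = (-1, 3, -3, 1)

def contrast_score(profile, weights):
    """Weighted sum of one subject's scores, scaled so the weights have unit length."""
    norm = sqrt(sum(w * w for w in weights))
    return sum(w * y for w, y in zip(weights, profile)) / norm

straight_line = [10, 12, 14, 16]   # rises by 2 at every step
u_shaped = [16, 10, 10, 16]        # falls, then rises

for name, profile in [("straight line", straight_line), ("U-shaped", u_shaped)]:
    parts = [round(contrast_score(profile, w), 3) for w in (LINEAR, QUADRATIC, CUBIC)]
    print(name, "linear/quadratic/cubic:", parts)
```

Each subject gets one score per contrast; the "Mean" line in the SAS output tests whether the average score for that contrast is zero, and the "group" line tests whether it differs between groups.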

Can also get successive paired t-tests

    proc glm data=broad;
      class group;
      model time1-time4 = group;
      repeated time profile / summary;
    run; quit;

**Not adjusted for multiple comparisons!

Repeated Measures Analysis of Variance: Analysis of Variance of Contrast Variables (table values not preserved in this transcript)
time_N represents the nth successive difference in time.
Contrast Variable: time_1 (time1 vs. time2): Source, DF, Type III SS, Mean Square, F Value, Pr > F for Mean, group, Error
Contrast Variable: time_2 (time2 vs. time3): same layout
Contrast Variable: time_3 (time3 vs. time4): same layout

Univariate vs. multivariate
If the compound symmetry assumption is met, the univariate approach has more power (more degrees of freedom). But if compound symmetry is not met, then the type I error of the univariate approach is increased.

Summary: rANOVA and rMANOVA
Require imputation of missing data.
rANOVA requires compound symmetry (though there are corrections for this).
Require subjects measured at the same time points.
But easy to implement and interpret.

Practice: rANOVA and rMANOVA. What effects do you expect to be statistically significant? Time? Group? Time*group? Answer: within-subjects effects, but no between-subjects effects. Time is significant. Group*time is significant. Group is not significant.

Practice: rANOVA and rMANOVA. Between-group effects; no within-subject effects. Time is not significant. Group*time is not significant. Group IS significant.

Practice: rANOVA and rMANOVA. Some within-group effects, no between-group effect. Time is significant. Group is not significant. Time*group is not significant.

References Jos W. R. Twisk. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide. Cambridge University Press, 2003.