Application of repeated measurement ANOVA models using SAS and SPSS: examination of the effect of intravenous lactate infusion in Alzheimer's disease
Krisztina Boda (1), János Kálmán (2), Zoltán Janka (2)
(1) Department of Medical Informatics, (2) Department of Psychiatry, University of Szeged, Hungary
Introduction
Repeated measures analysis of variance (ANOVA) generalizes Student's t-test for paired samples. It is used when an outcome variable of interest is measured repeatedly over time or under different experimental conditions on the same subject.
MIE '2002
The purpose of the discussion
- to show the application of different statistical models, using the SAS and SPSS programs, to investigate the effect of intravenous Na-lactate on cerebral blood flow and on venous blood parameters in Alzheimer's dementia (AD) probands
- to show the most important properties of these statistical models
- to show that different models applied to the same data set may give different results
Topics of Discussion
- The medical experiment
- The data table
- Statistical models and programs
- Statistical analysis of two parameters (venous blood pH and systolic blood pressure) using different models and programs
  - GLM models
  - Mixed models
  - Comparison of the results
- Summary of the key points
- Medical results and discussion
The medical experiment
- Patients: 20 patients having moderate-severe dementia syndrome (AD).
- Experimental design: self-control study. Measurements were performed on the same patient at 0, 10 and 20 minutes after 0.9 % NaCl (Saline) or 0.5 M Na-lactate infusion on two different days.

  Day    Infusion        Measurement times
  Day 1  NaCl (Saline)   0’  10’  20’
  Day 2  Na-lactate      0’  10’  20’
The data: "multivariate" or "wide" form
The data: "univariate" or "long" form
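The relation between the two layouts can be sketched in a few lines of Python. This is only an illustration: the two patient records and their values are made up, and the column names follow the ph<treat>_<time> pattern used in the GLM commands of this deck. A missing measurement removes one row in long form rather than the whole patient.

```python
# Hypothetical wide-form records (values are invented for illustration).
# One row per patient, six pH columns named ph<treat>_<time>.
wide = [
    {"name": "P01", "ph1_0": 7.35, "ph1_10": 7.36, "ph1_20": 7.36,
     "ph2_0": 7.35, "ph2_10": 7.38, "ph2_20": 7.41},
    {"name": "P02", "ph1_0": 7.33, "ph1_10": 7.34, "ph1_20": 7.34,
     "ph2_0": 7.34, "ph2_10": 7.36, "ph2_20": None},  # one missing value
]


def wide_to_long(rows, prefix="ph", treats=(1, 2), times=(0, 10, 20)):
    """Return one record per patient, treatment and time point."""
    long_rows = []
    for row in rows:
        for treat in treats:
            for time in times:
                value = row.get(f"{prefix}{treat}_{time}")
                if value is None:
                    continue  # drop only the missing measurement
                long_rows.append(
                    {"name": row["name"], "treat": treat,
                     "time": time, prefix: value}
                )
    return long_rows


long_form = wide_to_long(wide)
for record in long_form[:3]:
    print(record)
```

The second patient contributes five rows instead of six, which is how the long layout keeps incomplete cases available for the mixed model.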
Statistical model
The statistical models will be shown using one chosen parameter, the venous blood pH.
- 2 repeated measures factors:
  - days (treatments) with 2 levels (Saline or Lactate)
  - time with 3 levels (0, 10 and 20 minutes)
- both factors are fixed: the values of interest are all represented in the data file
Venous blood pH levels [figure; annotations: sample size, interaction]
Statistical models and programs
- t-tests: the repeated use of t-tests may increase the experimentwise probability of Type I error
- ANOVA
  - GLM
  - Mixed
- Programs used
  - SAS 6.12, 8.02
  - SPSS 9.0, 11.0
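A rough sense of how fast the experimentwise error grows can be had from the following sketch. It assumes independent tests at a nominal level of 0.05, which the paired tests on the same patients do not strictly satisfy, so the number is a guide rather than an exact rate. The count of nine tests matches the nine pairwise comparisons listed later in the deck.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Nine pairwise comparisons, each tested at alpha = 0.05 (assumed level).
fwer = familywise_error_rate(0.05, 9)
print(round(fwer, 3))  # 0.37
```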
Repeated measures ANOVA
- Observations on the same subject are usually correlated and often exhibit heterogeneous variability
- A covariance pattern across time periods can be specified within the residual matrix
- Effects:
  - between-subjects effects
  - within-subjects effects
  - interactions
Statistical models

GLM (General Linear Model): $y = X\beta + \varepsilon$
- $y$: a vector of observed data
- $\beta$: an unknown vector of fixed-effects parameters with known design matrix $X$
- $\varepsilon$: an unknown random error vector, assumed to be independently and identically distributed $N(0, \sigma^2)$

MIXED model: $y = X\beta + Z\gamma + \varepsilon$
- $\gamma$: an unknown vector of random-effects parameters with known design matrix $Z$
- $\varepsilon$: an unknown random error vector whose elements are no longer required to be independent and homogeneous
- Assume that $\gamma$ and $\varepsilon$ are Gaussian random variables with expectations 0 and variances $G$ and $R$, respectively
- The variance of $y$ is $V = ZGZ' + R$
- For $G$ and $R$ some covariance structure must be selected
The within-subjects covariance matrix: covariance patterns for 3 time periods
- UN: Unstructured
- CS: Compound Symmetry
- VC: Variance Components
- AR(1): First-Order Autoregressive
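For reference, the four patterns can be written out for three time periods as below. This is the usual textbook notation rather than a reproduction of the original slide; the symbols $\sigma$ and $\rho$ are chosen here for illustration, and VC is shown in the form it takes for a single repeated effect (one common variance, no covariances).

```latex
\[
\text{UN: }
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13}\\
\sigma_{12} & \sigma_2^2 & \sigma_{23}\\
\sigma_{13} & \sigma_{23} & \sigma_3^2
\end{pmatrix}
\qquad
\text{CS: }
\sigma^2
\begin{pmatrix}
1 & \rho & \rho\\
\rho & 1 & \rho\\
\rho & \rho & 1
\end{pmatrix}
\]
\[
\text{VC: }
\sigma^2
\begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1
\end{pmatrix}
\qquad
\text{AR(1): }
\sigma^2
\begin{pmatrix}
1 & \rho & \rho^2\\
\rho & 1 & \rho\\
\rho^2 & \rho & 1
\end{pmatrix}
\]
```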
GLM versus MIXED

GLM
- Requires balanced data; subjects with missing observations are deleted
- Assumes a special form of the within-subject covariance matrix:
  - Type H (sphericity): univariate approach
  - Unstructured: multivariate approach
- Estimates covariance parameters using a method of moments
- ...

MIXED
- Allows data that are missing at random
- Allows a wide variety of within-subject covariance matrices:
  - UN: Unstructured
  - VC: Variance Components
  - CS: Compound Symmetry
  - AR(1): First-Order Autoregressive
  - ...
- Estimates covariance parameters using restricted maximum likelihood, ...
- ...
Statistical analysis of venous blood pH using different models and programs
- Examination of univariate statistics and correlation structure
- GLM univariate and multivariate results, verifying assumptions
- Mixed models
  - Create the model
  - Examine and choose the covariance structure
  - Compare fixed effects
Paired t-test (only for demonstration, not recommended)

  Comparison        Sig. (2-tailed)
  Day 1, 0’-10’     0.140
  Day 1, 0’-20’     0.164
  Day 1, 10’-20’    0.607
  Day 2, 0’-10’     0.009
  Day 2, 0’-20’     0.000
  Day 2, 10’-20’    0.000
  0’, Day1-Day2     0.788
  10’, Day1-Day2    0.018
  20’, Day1-Day2    0.000
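One simple way to see why unadjusted t-tests are not recommended is to apply a Bonferroni correction to the nine reported p-values. The sketch below is an illustration added here, not part of the original analysis. It enters a printed "0.000" as 0.0005, an assumed upper bound for a value rounded to three decimals.

```python
# p-values as listed in the paired t-test table; 0.0005 stands in for
# "0.000" (assumption: the true value is below the rounding limit).
reported = [
    ("Day 1, 0'-10'", 0.140),
    ("Day 1, 0'-20'", 0.164),
    ("Day 1, 10'-20'", 0.607),
    ("Day 2, 0'-10'", 0.009),
    ("Day 2, 0'-20'", 0.0005),
    ("Day 2, 10'-20'", 0.0005),
    ("0', Day1-Day2", 0.788),
    ("10', Day1-Day2", 0.018),
    ("20', Day1-Day2", 0.0005),
]


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


adjusted = bonferroni([p for _, p in reported])
still_significant = [
    label for (label, _), adj in zip(reported, adjusted) if adj < 0.05
]
print(still_significant)
```

Under these assumptions only the three comparisons printed as 0.000 stay below 0.05 after adjustment; the comparisons with unadjusted p-values of 0.009 and 0.018 do not.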
Correlation of pH measurements

          PH1_0   PH1_10  PH1_20  PH2_0   PH2_10  PH2_20
  PH1_0   1       .874    .691    .658    .512    .243
  PH1_10  .874    1       .820    .600    .677    .407
  PH1_20  .691    .820    1       .381    .296    .006
  PH2_0   .658    .600    .381    1       .635    .399
  PH2_10  .512    .677    .296    .635    1       .720
  PH2_20  .243    .407    .006    .399    .720    1
Repeated measures ANOVA
Effects:
- between-subjects effects: none
- within-subjects effects:
  - Treatment (Saline - Lactate): fixed
  - Time (0’-10’-20’): fixed
  - Patient: random
- Interactions: the Treatment*Time interaction will be examined
GLM univariate commands (data must be in "wide" form)

SPSS:
    * Two within-subject factors: treat (2 levels) and time (3 levels).
    GLM ph1_0 ph1_10 ph1_20 ph2_0 ph2_10 ph2_20
      /WSFACTOR = treat 2 Polynomial time 3 Polynomial
      /METHOD = SSTYPE(3)
      /PLOT = PROFILE( time*treat )
      /WSDESIGN = treat time treat*time.

SAS:
    /* No between-subject effects, so the right-hand side of MODEL is empty. */
    proc glm;
      model ph1_0 ph1_10 ph1_20 ph2_0 ph2_10 ph2_20 = ;
      repeated treat 2, time 3 polynomial / summary;
    run;
GLM univariate assumptions and results (SPSS)
- The sphericity test failed; a correction can be applied
- 3 subjects are deleted because of missing values
- The TREATMENT*TIME interaction is significant
GLM multivariate results (SPSS)
Plot in SPSS
Mixed models commands (data must be in "long" form)

SAS 8.02:
    /* Unstructured covariance over the six treat*time measurements per patient. */
    proc mixed covtest;
      class name treat time;
      model ph = treat time treat*time;
      repeated / type=un sub=name r rcorr;
      lsmeans treat*time / pdiff;
    run;

SPSS 11.0:
    * Same model: REML estimation, unstructured repeated covariance.
    MIXED ph BY treat time
      /CRITERIA = CIN(95) MXITER(100) MXSTEP(10) SCORING(1)
        SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
        LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
      /FIXED = treat time time*treat | SSTYPE(3)
      /METHOD = REML
      /PRINT = G LMATRIX R SOLUTION TESTCOV
      /REPEATED = treat time | SUBJECT(name) COVTYPE(UN)
      /SAVE = RESID .
Selecting the covariance structure
- In the SAS command, type=UN requests the Unstructured covariance. Replacing UN with CS, VC, HF, AR(1) and others selects the Compound Symmetry, Variance Components, Huynh-Feldt, First-Order Autoregressive, etc. variance-covariance structures for the repeated (within-subject) effects. The default is VC.
- In the SPSS command, replacing UN in COVTYPE(UN) with ID, CS, VC, HF or AR(1) (SPSS keyword AR1) selects the corresponding covariance structures. No other types are available in SPSS 11.0.
Selecting the covariance structure
- The unstructured covariance is overly complex. In our example the treat*time effect has 6 levels, so the unstructured covariance has 6 variances and (6*5)/2 = 15 covariances, for a total of 21 variances and covariances to be estimated.
- The other structures use fewer covariance parameters for the repeated effects.
- Another problem with the CS, HF and AR(1) structures is that they do not take into account the doubly repeated nature of our model.
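The parameter counts behind this argument can be tabulated with a short sketch. The formulas are the standard counts for each structure applied to t repeated measurements; VC is counted as a single residual variance, and the composite UN@AR(1) count (an unstructured 2 x 2 treatment matrix plus one autoregressive correlation for time) is included for comparison. The function name is made up for this illustration.

```python
def n_cov_params(structure, t):
    """Number of covariance parameters for t repeated measurements."""
    counts = {
        "UN": t * (t + 1) // 2,  # t variances + t(t-1)/2 covariances
        "HF": t + 1,             # t variances + one extra parameter
        "CS": 2,                 # common variance + common covariance
        "AR(1)": 2,              # variance + autocorrelation
        "VC": 1,                 # single residual variance
    }
    return counts[structure]


t = 6  # 2 treatments x 3 time points
for name in ("UN", "HF", "CS", "AR(1)", "VC"):
    print(name, n_cov_params(name, t))

# Composite UN@AR(1): UN for the 2 treatments plus one AR(1) correlation.
un_at_ar1 = n_cov_params("UN", 2) + 1
print("UN@AR(1)", un_at_ar1)
```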
Selecting the covariance structure

Correlation matrix for a block using UN covariance structure

  Row  COL1        COL2        COL3        COL4        COL5        COL6
  1    1.00000000  0.87572288  0.69284518  0.64717994  0.54989785  0.24762745
  2    0.87572288  1.00000000  0.82563330  0.59373944  0.69760124  0.39606241
  3    0.69284518  0.82563330  1.00000000  0.37105322  0.36385899  0.00696442
  4    0.64717994  0.59373944  0.37105322  1.00000000  0.64283355  0.39879596
  5    0.54989785  0.69760124  0.36385899  0.64283355  1.00000000  0.71658187
  6    0.24762745  0.39606241  0.00696442  0.39879596  0.71658187  1.00000000

Correlation matrix for a block using AR(1) covariance structure

  Row  Col1     Col2    Col3    Col4    Col5    Col6
  1    1.0000   0.6169  0.3805  0.2348  0.1448  0.08933
  2    0.6169   1.0000  0.6169  0.3805  0.2348  0.1448
  3    0.3805   0.6169  1.0000  0.6169  0.3805  0.2348
  4    0.2348   0.3805  0.6169  1.0000  0.6169  0.3805
  5    0.1448   0.2348  0.3805  0.6169  1.0000  0.6169
  6    0.08933  0.1448  0.2348  0.3805  0.6169  1.0000
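The AR(1) matrix is fully determined by a single correlation: the entry for measurements i and j is rho raised to the power |i - j|. The sketch below rebuilds the matrix from the printed lag-1 value 0.6169. Because that value is itself rounded, the result agrees with the reported entries only to about three decimals.

```python
def ar1_corr(rho, n):
    """n x n AR(1) correlation matrix: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


matrix = ar1_corr(0.6169, 6)
reported_first_row = [1.0, 0.6169, 0.3805, 0.2348, 0.1448, 0.08933]

for computed, reported in zip(matrix[0], reported_first_row):
    print(f"{computed:.4f}  (reported {reported})")
```

Treating the six measurements as one AR(1) sequence forces the correlation between the last saline measurement and the first lactate measurement (positions 3 and 4) to equal the within-day lag-1 correlation, which is one way to see that this structure ignores the two-day design.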
Selecting the covariance structure: a composite covariance model
- Under a composite covariance model, a separate covariance structure is specified for each of the two repeated factors.
- UN@AR(1): we assume the UN covariance matrix for the treatments (with two treatments, a variance for each day and one covariance between them) and the AR(1) covariance structure for the three time points.
The UN@AR(1) composite covariance model in SAS
For each subject, we have the following covariance matrix:
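The matrix itself did not survive in this copy of the slides. The general form of a UN@AR(1) structure for 2 treatments and 3 time points is the Kronecker product below; the symbols $\sigma_1^2$, $\sigma_2^2$, $\sigma_{21}$ and $\rho$ are notation chosen here, not taken from the original slide.

```latex
\[
\begin{pmatrix}
\sigma_1^2 & \sigma_{21}\\
\sigma_{21} & \sigma_2^2
\end{pmatrix}
\otimes
\begin{pmatrix}
1 & \rho & \rho^2\\
\rho & 1 & \rho\\
\rho^2 & \rho & 1
\end{pmatrix}
=
\begin{pmatrix}
\sigma_1^2 & \sigma_1^2\rho & \sigma_1^2\rho^2 & \sigma_{21} & \sigma_{21}\rho & \sigma_{21}\rho^2\\
\sigma_1^2\rho & \sigma_1^2 & \sigma_1^2\rho & \sigma_{21}\rho & \sigma_{21} & \sigma_{21}\rho\\
\sigma_1^2\rho^2 & \sigma_1^2\rho & \sigma_1^2 & \sigma_{21}\rho^2 & \sigma_{21}\rho & \sigma_{21}\\
\sigma_{21} & \sigma_{21}\rho & \sigma_{21}\rho^2 & \sigma_2^2 & \sigma_2^2\rho & \sigma_2^2\rho^2\\
\sigma_{21}\rho & \sigma_{21} & \sigma_{21}\rho & \sigma_2^2\rho & \sigma_2^2 & \sigma_2^2\rho\\
\sigma_{21}\rho^2 & \sigma_{21}\rho & \sigma_{21} & \sigma_2^2\rho^2 & \sigma_2^2\rho & \sigma_2^2
\end{pmatrix}
\]
```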
Selecting the covariance structure

Correlation matrix for a block using UN@AR(1) covariance structure

  Row  COL1        COL2        COL3        COL4        COL5        COL6
  1    1.00000000  0.73001496  0.53292185  0.22698641  0.16570348  0.12096602
  2    0.73001496  1.00000000  0.73001496  0.16570348  0.22698641  0.16570348
  3    0.53292185  0.73001496  1.00000000  0.12096602  0.16570348  0.22698641
  4    0.22698641  0.16570348  0.12096602  1.00000000  0.73001496  0.53292185
  5    0.16570348  0.22698641  0.16570348  0.73001496  1.00000000  0.73001496
  6    0.12096602  0.16570348  0.22698641  0.53292185  0.73001496  1.00000000

Correlation between time points

           Time 0      Time 10     Time 20
  Time 0   1.00000000  0.73001496  0.53292185
  Time 10  0.73001496  1.00000000  0.73001496
  Time 20  0.53292185  0.73001496  1.00000000

R = 0.227 (correlation between treatments)
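The 6 x 6 block is the Kronecker product of the 2 x 2 treatment correlation matrix and the 3 x 3 time correlation matrix. The sketch below rebuilds it from the two reported correlations (0.22698641 between treatments, 0.73001496 at lag 1 in time) using a small hand-written Kronecker product. The helper names are made up for this illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def ar1_corr(rho, n):
    """n x n AR(1) correlation matrix: entry (i, j) is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


r_treat = 0.22698641  # reported correlation between treatments
rho_time = 0.73001496  # reported lag-1 correlation between time points

treat_corr = [[1.0, r_treat], [r_treat, 1.0]]
block = kron(treat_corr, ar1_corr(rho_time, 3))

for row in block:
    print("  ".join(f"{value:.8f}" for value in row))
```

For example, the entry in row 1, column 5 is 0.22698641 * 0.73001496, which matches the reported 0.16570348 up to rounding.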
Comparison of mixed models with different covariance structures
Based on information criteria about the model fit:
- Akaike's Information Criterion (AIC)
- -2 Restricted Log Likelihood: basis of the likelihood ratio test (for nested models)
Smaller values indicate better models.
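A likelihood ratio test for nested covariance structures can be sketched as follows. The difference in -2 restricted log likelihood is referred to a chi-square distribution whose degrees of freedom equal the difference in the number of covariance parameters. The example uses two values reported later in this deck for the blood-pressure regression models, which share the same fixed effects: 996.5 with 1 covariance parameter and 882.9 with 4. The chi-square tail function is written out by hand so the sketch needs only the standard library. Because one variance estimate in the larger model sits at zero, the usual chi-square reference is only approximate here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)  # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))  # closed form for df = 1
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(m2rll_small, k_small, m2rll_large, k_large):
    """Return (statistic, df, p) for nested models with equal fixed effects."""
    statistic = m2rll_small - m2rll_large
    df = k_large - k_small
    return statistic, df, chi2_sf(statistic, df)


stat, df, p = likelihood_ratio_test(996.5, 1, 882.9, 4)
print(f"chi-square = {stat:.1f}, df = {df}, p = {p:.3g}")
```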
Comparison of covariance structures for pH data
The models with VC and CS structures fit significantly worse than the model with the UN covariance structure. However, the UN@AR(1) model will be used, because
- it reflects the doubly repeated design
- its covariance structure is simpler
Results using mixed model (SAS)

Tests of Fixed Effects (Type=UN@AR)

  Source      NDF  DDF  Type III F  Pr > F
  TREAT       1    88   8.77        0.0039
  TIME        2    88   15.86       0.0001
  TREAT*TIME  2    88   14.22       0.0001
Differences of Least Squares Means

  Effect      TREAT  TIME   _TREAT  _TIME  Difference  Std Error  DF  t      Pr > |t|
  TREAT*TIME  1.00   0.00   1.00    10.00  -0.00668    0.004978   33  -1.34  0.1886
  TREAT*TIME  1.00   0.00   1.00    20.00  -0.00888    0.006548   33  -1.36  0.1844
  TREAT*TIME  1.00   10.00  1.00    20.00  -0.00219    0.004978   33  -0.44  0.6624
  TREAT*TIME  2.00   0.00   2.00    10.00  -0.02260    0.006852   33  -3.30  0.0023
  TREAT*TIME  2.00   0.00   2.00    20.00  -0.05895    0.008888   33  -6.63  <.0001
  TREAT*TIME  2.00   10.00  2.00    20.00  -0.03635    0.006852   33  -5.30  <.0001
  TREAT*TIME  1.00   0.00   2.00    0.00   -0.00459    0.01018    33  -0.45  0.6551
  TREAT*TIME  1.00   10.00  2.00    10.00  -0.02051    0.01024    33  -2.00  0.0536
  TREAT*TIME  1.00   20.00  2.00    20.00  -0.05466    0.01018    33  -5.37  <.0001

Paired t-test, for comparison:

  Comparison        Sig. (2-tailed)
  Day 1, 0’-10’     0.140
  Day 1, 0’-20’     0.164
  Day 1, 10’-20’    0.607
  Day 2, 0’-10’     0.009
  Day 2, 0’-20’     0.000
  Day 2, 10’-20’    0.000
  0’, Day1-Day2     0.788
  10’, Day1-Day2    0.018
  20’, Day1-Day2    0.000
Distribution of residuals using UN@AR(1) covariance structure
Summary of statistical results for venous blood pH
- Changing models might give different results.
- GLM models are useful for balanced data that satisfy special assumptions.
- With a mixed model, the covariance structure of the repeated effects can be taken into account, and cases with missing values are not deleted.
- The treatment*time interaction is evident under every model.
Examination of another parameter: systolic blood pressure (RRS)
The same figure with different scaling [figure panels: different sample size; equal sample size]
GLM results (2 cases are deleted)

GLM multivariate (Wilks’ Lambda, Sig.):
  TREAT       0.868
  TIME        0.095
  TREAT*TIME  0.270

GLM univariate (sphericity assumptions met):
  TIME        0.042
  TREAT*TIME  0.253

Is there a significant time effect? (multivariate p = 0.095, univariate p = 0.042)
Plot in SPSS GLM
Correlation matrix of systolic blood pressures

          BP1_0   BP1_10  BP1_20  BP2_0   BP2_10  BP2_20
  BP1_0   1       .954    .893    .884    .619    .790
  BP1_10  .954    1       .908    .842    .569    .776
  BP1_20  .893    .908    1       .825    .566    .778
  BP2_0   .884    .842    .825    1       .755    .791
  BP2_10  .619    .569    .566    .755    1       .825
  BP2_20  .790    .776    .778    .791    .825    1

Paired t-tests

  Comparison             Sig. (2-tailed)
  RRS 1-0 - RRS 1-10     .409
  RRS 1-0 - RRS 1-20     .003
  RRS 1-10 - RRS 1-20    .009
  RRS 2-0 - RRS 2-10     .515
  RRS 2-0 - RRS 2-20     .439
  RRS 2-10 - RRS 2-20    .845
  RRS 1-0 - RRS 2-0      .715
  RRS 1-10 - RRS 2-10    .672
  RRS 1-20 - RRS 2-20    .155
MIXED: Comparison of covariance structures for BP data
The UN covariance structure fits significantly better than the other structures examined.
Results for time trend using mixed model
- GLM: based on data of 18 patients, the univariate results seem acceptable and show a significant time trend. However, the assumptions of the multivariate approach are more realistic.
  - Multivariate (UN): df = 2, 16; p = 0.095
  - Univariate (CS): df = 2, 34; p = 0.042
- MIXED: based on data of 20 patients, the UN covariance structure has to be used.
  - UN: df = 2, 18; p = 0.045
  - CS: df = 2, 89; p = 0.0587
- The p-values are close. There is a significant increase over time in the BP data.
Using mixed models, an increasing time effect could be shown.
Covariance pattern model vs. random coefficients model
- When the correlation between observations on the same patient is not constant, a covariance pattern model can be used.
- When the relationship of the response variable with time is of interest, a random coefficients model is more appropriate. Here, regression curves are fitted for each patient and the regression coefficients are allowed to vary randomly between the patients.
Individual regression lines
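The idea of individual regression lines can be illustrated with a two-stage sketch: fit an ordinary least-squares line to each patient separately, then summarize the slopes. The three patients and their blood-pressure values below are invented for illustration. This is not the REML estimation performed by a mixed-model procedure, which estimates the average line and the between-patient variation of the coefficients jointly.

```python
def ols_line(xs, ys):
    """Least-squares intercept and slope for one patient's measurements."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


times = [0, 10, 20]  # minutes after infusion
# Hypothetical systolic blood pressures (not data from the study).
patients = {
    "P01": [130, 134, 138],
    "P02": [150, 151, 155],
    "P03": [120, 126, 126],
}

lines = {name: ols_line(times, values) for name, values in patients.items()}
for name, (intercept, slope) in lines.items():
    print(f"{name}: RRS = {slope:.3f}*time + {intercept:.2f}")

mean_slope = sum(slope for _, slope in lines.values()) / len(lines)
print(f"mean slope = {mean_slope:.4f}")
```

The spread of the individual intercepts and slopes is what the random-coefficients model describes with its covariance parameters.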
SAS commands

Fixed effects approach (linear regression with one independent variable). The effect of patient is ignored: all observations are treated as independent.
    proc mixed;
      model rrs = time / s;
    run;

Mixed model (with random coefficients for patients and patients*time):
    proc mixed;
      class name treat;
      model rrs = time / s;
      /* random intercept and slope for each patient */
      random int time / sub=name type=un solution;
    run;

Mixed model with two additional fixed effects (same random coefficients for patients and patients*time); only the MODEL statement changes:
    model rrs = treat time treat*time / s;
Regression lines averaged by treatment
Results I: fixed effects (linear regression)

Covariance Parameter Estimates
  Residual  410.02

Fit Statistics
  -2 Res Log Likelihood  996.5

Solution for Fixed Effects
  Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
  Intercept  138.84    3.0039          111  46.22    <.0001
  TIME       0.2579    0.2323          111  1.11     0.2693

Type 3 Tests of Fixed Effects
  Effect  Num DF  Den DF  F Value  Pr > F
  TIME    1       111     1.23     0.2693

Residual variance: 410.02
RRS = 0.2579*time + 138.84
The time effect is not significant.
Results II: mixed model with fixed and random effects (linear regression)

Covariance Parameter Estimates
  UN(1,1)   NAME  346.73
  UN(2,1)   NAME  -0.5609
  UN(2,2)   NAME  0
  Residual        88.3869

Fit Statistics
  -2 Res Log Likelihood  882.9

Solution for Fixed Effects
  Effect     Estimate  Standard Error  DF  t Value  Pr > |t|
  Intercept  139.03    4.4939          18  30.94    <.0001
  TIME       0.2579    0.1078          18  2.39     0.0279

Type 3 Tests of Fixed Effects
  Effect  Num DF  Den DF  F Value  Pr > F
  TIME    1       18      5.72     0.0279

Residual variance: 88.38
RRS = 0.2579*time + 139.03
The time effect is significant.
Results III: mixed model with two fixed effects and random effects

Covariance Parameter Estimates
  UN(1,1)   NAME  346.73
  UN(2,1)   NAME  -0.5609
  UN(2,2)   NAME  0
  Residual        88.3869

Fit Statistics
  -2 Res Log Likelihood  879.2

Type 3 Tests of Fixed Effects
  Effect      Num DF  Den DF  F Value  Pr > F
  TREAT       1       73      0.53     0.4703
  TIME        1       18      5.71     0.0280
  TIME*TREAT  1       73      1.74     0.1919

Residual variance: 88.38
The time effect is significant. The other two effects are not significant.
We decide to use MODEL II.
Discussion
- Using statistical software without knowing its main properties, or using only its default parameters, may lead to spurious results.
- Using only the default parameters means that simple models are assumed (e.g. the VC covariance pattern in the mixed procedure).
- Medical experiments often result in repeated measures data or nested repeated measures data.
- The use of a carefully chosen statistical model may improve the quality of the statistical evaluation of medical data.
Medical consequences
- The main result is that the diminished elevation of serum cortisol levels indicates a blunted stress response to Na-lactate in AD.
- The decreased vascular responsiveness of the majority of AD cases reflects impaired vasoreactivity and disturbed vasoregulation.
- Since the catecholaminergic system and cholinergic mechanisms are also involved in regulating the reactivity of the brain microvasculature, these alterations might be consequences of the general cholinergic deficit in AD.