How to Analyze and Graphically Present Longitudinal Data


1 How to Analyze and Graphically Present Longitudinal Data
Ayumi Shintani, Ph.D., M.P.H. Department of Biostatistics For handouts and datasets:

2 Example 1. More than 2 repeated measures with 1 group
From Table 1 of Deal et al (1979): Role of respiratory heat exchange in production of exercise-induced asthma. J Appl Physiol 46:
Minute Ventilation Volume vs. Temperature, Dry Gas Experiments (ventilation in l/min)
ID      -10     25     37     50     65     80    Mean     SD   Slope
1      74.5   81.5   83.6   68.6   73.1   79.4    76.8    5.7   -0.01
2      75.5   84.6   70.6   87.3   73.0   75.0    77.7    6.7   -0.02
3      68.9   71.6   55.9   61.9   60.5   61.8    63.4    5.8   -0.40
4      57.0   61.3   54.1   59.2   56.6   58.8    57.8    2.5
5      78.3   84.9   64.0   62.2   60.1   78.7    71.4   10.5   -0.12
6      54.0   62.8   63.0   58.0   56.0   51.5    57.6    4.7   -0.04
7      72.5   68.3   67.8   71.5   65.0   67.7    68.8    2.8   -0.06
8      80.8   89.9   83.2   83.0   85.7   79.6    83.7    3.7
Mean   70.2   75.6   67.8   69.0   66.3   69.1                            -4.5
SD      9.8   11.0   11.1   11.0   10.3   10.8                             4.6

3 Minute Ventilation Volume vs. Temperature
[Figure: mean minute ventilation volume (y-axis, 60.0 to 80.0) at each temperature (Temp_10, Temp25, Temp37, Temp50, Temp65, Temp80). Dots/lines show means; error bars show 95.0% CI of the mean.]

4 Minute Ventilation Volume
[Figure: minute ventilation volume (y-axis, 50.0 to 90.0) at each temperature (-10, 25, 37, 50, 65, 80), with one line per subject (ID 1 to 8). Dots/lines show means.]

5 We want to analyze whether there is an association between minute ventilation volume and temperature. What hypothesis do you want to test, i.e., what exactly do you want to compare? Null hypothesis 1: The mean minute ventilation volume is the same at all temperatures.
[Figure: mean minute ventilation volume at each temperature (Temp_10 to Temp80). Dots/lines show means; error bars show 95.0% CI of the mean.]

6 First, let's ignore repeated measures and perform one-way ANOVA
In order to perform ANOVA, you first need to transform the data from horizontal to longitudinal format. The longitudinal format uses only one variable for the outcome measure, as opposed to the horizontal format, where a different outcome variable is created for each repeated measure.
In SPSS go: Data > Restructure
Step 1: Welcome Data Structure Wizard
- Select the first choice
Step 2: Variables to Cases: Number of variable groups
- Select the first choice
Step 3: Variables to Cases: Select variables
* Select Temp_10 through Temp80 (in the right order) into the Variables to be transposed box (use the Shift key on your keyboard to highlight them all at once)
* Type the name of the output variable (for example, Vent)
* Select ID into the Fixed variable box, then Next
Step 4: Variables to Cases: Create Index Variables
- Select the first choice
Step 5: Variables to Cases: Create One Index Variable
- Click on the first choice (sequential numbers)
- Edit the index variable name from Index1 to TEMP (for temperature)
Step 6: Handling variables not selected
- Keep and treat as fixed variables; the rest remains the same
Step 7: Do nothing, then Finish (before you do this, make sure you have saved the original horizontal file)
Recode the levels of Temp to the actual values (-10, 25, 37, …80)
Save the new longitudinal dataset as Deal Longitudinal.sav
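Outside SPSS, the same horizontal-to-longitudinal restructuring can be sketched in Python with pandas. This is only an illustration, not part of the original workflow: the column names follow the slide's Temp_10 ... Temp80 variables, the three subjects are copied from the Example 1 table, and the names `wide`, `long` and `temp_values` are hypothetical.

```python
import pandas as pd

# Horizontal layout: one row per subject, one column per temperature.
# Values for IDs 1-3 are copied from the Example 1 table.
wide = pd.DataFrame({
    "ID": [1, 2, 3],
    "Temp_10": [74.5, 75.5, 68.9],
    "Temp25": [81.5, 84.6, 71.6],
    "Temp37": [83.6, 70.6, 55.9],
    "Temp50": [68.6, 87.3, 61.9],
    "Temp65": [73.1, 73.0, 60.5],
    "Temp80": [79.4, 75.0, 61.8],
})

# The "recode" step: map each column name to the actual temperature.
temp_values = {"Temp_10": -10, "Temp25": 25, "Temp37": 37,
               "Temp50": 50, "Temp65": 65, "Temp80": 80}

# Variables to cases: ID stays fixed, the six columns stack into Vent.
long = wide.melt(id_vars="ID", var_name="Temp", value_name="Vent")
long["Temp"] = long["Temp"].map(temp_values)
long = long.sort_values(["ID", "Temp"]).reset_index(drop=True)

print(long.head(6))
```

Each subject now contributes six rows, one per temperature, which is the layout the later analyses expect.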

7 Now, let’s perform one-way ANOVA
Analyze > Compare Means > One-way ANOVA
- Select Vent as the dependent variable and Temp as the Factor variable
Another way: Analyze > General Linear Model > Univariate (Multivariate is for when you have more than 1 dependent variable)
- Select Vent as the dependent variable and Temp as the fixed Factor variable
- Click Plots: select Temp as the horizontal variable, click ADD, Continue
- Click Model: select Custom, select Temp into the Model box, Continue
- OK
Authors' test for the effect of temperature: one-way analysis of variance: F(5,42) = .72, p > 0.5
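As a cross-check outside SPSS, the same one-way ANOVA can be sketched with SciPy. The data are the Example 1 values; one cell (ID 7 at temperature 65) is missing from this transcript and is set to 65.0 here, an assumption inferred from that subject's reported mean of 68.8. Variable names are illustrative.

```python
from scipy.stats import f_oneway

TEMPS = [-10, 25, 37, 50, 65, 80]
# Minute ventilation volume per subject, in TEMPS order (Example 1 table).
VENT = {
    1: [74.5, 81.5, 83.6, 68.6, 73.1, 79.4],
    2: [75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
    3: [68.9, 71.6, 55.9, 61.9, 60.5, 61.8],
    4: [57.0, 61.3, 54.1, 59.2, 56.6, 58.8],
    5: [78.3, 84.9, 64.0, 62.2, 60.1, 78.7],
    6: [54.0, 62.8, 63.0, 58.0, 56.0, 51.5],
    7: [72.5, 68.3, 67.8, 71.5, 65.0, 67.7],  # 65.0 is inferred (assumption)
    8: [80.8, 89.9, 83.2, 83.0, 85.7, 79.6],
}

# One group per temperature, ignoring which subject each value came from.
groups = [[row[j] for row in VENT.values()] for j in range(len(TEMPS))]

result = f_oneway(*groups)
f_stat, p_value = result.statistic, result.pvalue
print(f"F(5, 42) = {f_stat:.3f}, p = {p_value:.3f}")
```

Because the subject identifier is never used, this treats all 48 values as independent, which is exactly the simplification the slide asks for at this stage.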

8 Results of one-way ANOVA
Tests of Between-Subjects Effects
Dependent Variable: Vent
Source            Type III Sum of Squares   df   Mean Square   F      Sig.
Corrected Model   (a)                        5   82.773        .727   .607
Intercept                                    1                        .000
Temp                                         5   82.773        .727   .607
Error                                       42
Total                                       48
Corrected Total                             47
a. R Squared = .080 (Adjusted R Squared = -.030)

9 The authors concluded that minute ventilation volume did not differ across temperatures. The flaws in this analysis are:
1. The normality of the distributions of ventilation volumes has not been checked. (Use the Kruskal-Wallis test.)
2. The observations within a single subject are not independent. The subject identification was not used in the analysis. The analysis does not remove variation among subjects by considering each subject as his own control, which would make use of the fact that variation within subjects is usually less than the variation between subjects.
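To see how much the second flaw matters, here is a rough sketch that removes between-subject variation by treating each subject as a block. Note that this is a classical randomized-block (repeated-measures ANOVA) calculation, swapped in for illustration; it is not the linear mixed model used later in the slides. ID 7's value at temperature 65 is missing from the transcript and assumed to be 65.0.

```python
import numpy as np
from scipy.stats import f as f_dist

# Rows are subjects 1-8, columns are temperatures -10, 25, 37, 50, 65, 80.
y = np.array([
    [74.5, 81.5, 83.6, 68.6, 73.1, 79.4],
    [75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
    [68.9, 71.6, 55.9, 61.9, 60.5, 61.8],
    [57.0, 61.3, 54.1, 59.2, 56.6, 58.8],
    [78.3, 84.9, 64.0, 62.2, 60.1, 78.7],
    [54.0, 62.8, 63.0, 58.0, 56.0, 51.5],
    [72.5, 68.3, 67.8, 71.5, 65.0, 67.7],  # 65.0 is inferred (assumption)
    [80.8, 89.9, 83.2, 83.0, 85.7, 79.6],
])
n_subj, n_temp = y.shape
grand = y.mean()

# Partition the total sum of squares.
ss_temp = n_subj * ((y.mean(axis=0) - grand) ** 2).sum()
ss_subj = n_temp * ((y.mean(axis=1) - grand) ** 2).sum()
ss_total = ((y - grand) ** 2).sum()

# One-way ANOVA: subject variation stays in the error term.
df_temp = n_temp - 1
df_err_ignoring = y.size - n_temp
f_ignoring = (ss_temp / df_temp) / ((ss_total - ss_temp) / df_err_ignoring)

# Blocked analysis: subject variation is removed from the error term.
df_err_blocked = (n_subj - 1) * (n_temp - 1)
ss_err_blocked = ss_total - ss_temp - ss_subj
f_blocked = (ss_temp / df_temp) / (ss_err_blocked / df_err_blocked)
p_blocked = f_dist.sf(f_blocked, df_temp, df_err_blocked)

print(f"ignoring subjects: F = {f_ignoring:.3f}")
print(f"subject as own control: F = {f_blocked:.3f}, p = {p_blocked:.3f}")
```

The temperature sum of squares is identical in both analyses; only the error term changes, which is why using each subject as his own control can turn a non-significant result into a significant one.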

10 Linear mixed effect model (???)
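                 Ignoring correlation among measurements   Considering correlation among measurements
Non-parametric   Kruskal-Wallis test (p=0.613)             Friedman Test (p=0.023)
Parametric       One-way ANOVA (GLM) (p=0.607)             Linear mixed effect model (???)
Problem 1 (normality) separates the parametric row from the non-parametric row. Problem 2 (independence) separates the "ignoring correlation" column from the "considering correlation" column.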
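The two non-parametric cells of this table can be sketched with SciPy. The slides report p=0.613 (Kruskal-Wallis) and p=0.023 (Friedman) from SPSS. The data below are the Example 1 values, with ID 7's missing value at temperature 65 assumed to be 65.0.

```python
from scipy.stats import friedmanchisquare, kruskal

TEMPS = [-10, 25, 37, 50, 65, 80]
VENT = {
    1: [74.5, 81.5, 83.6, 68.6, 73.1, 79.4],
    2: [75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
    3: [68.9, 71.6, 55.9, 61.9, 60.5, 61.8],
    4: [57.0, 61.3, 54.1, 59.2, 56.6, 58.8],
    5: [78.3, 84.9, 64.0, 62.2, 60.1, 78.7],
    6: [54.0, 62.8, 63.0, 58.0, 56.0, 51.5],
    7: [72.5, 68.3, 67.8, 71.5, 65.0, 67.7],  # 65.0 is inferred (assumption)
    8: [80.8, 89.9, 83.2, 83.0, 85.7, 79.6],
}

# One list per temperature; position i in every list is the same subject.
groups = [[row[j] for row in VENT.values()] for j in range(len(TEMPS))]

# Kruskal-Wallis ranks all 48 values together and ignores the pairing.
kw = kruskal(*groups)

# Friedman ranks the six temperatures within each subject, so it uses the pairing.
fr = friedmanchisquare(*groups)

print(f"Kruskal-Wallis: H = {kw.statistic:.2f}, p = {kw.pvalue:.3f}")
print(f"Friedman: chi-square = {fr.statistic:.2f}, p = {fr.pvalue:.3f}")
```

Both tests see the same numbers; the only difference is whether the ranking is done across all observations or within each subject.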

11 Are they independent from each other?
In order to solve problem 2, we can use a linear mixed effect model. A linear mixed effect model is similar to linear regression (or general linear regression), where outcome variables are continuous. A linear regression assumes normality and independence of residuals. Similarly, a linear mixed model requires the normality assumption; however, it does not require the independence assumption. We will talk about the normality part later; here, let's spend some time learning about the independence assumption.
ID   temp   trans1
1    -10    74.5
1     25    81.5
1     37    83.6
1     50    68.6
1     65    73.1
1     80    79.4
2    -10    75.5
2     25    84.6
2     37    70.6
2     50    87.3
2     65    73.0
2     80    75.0
3    -10    68.9
3     25    71.6
3     37    55.9

12 A quick way to check this is to do correlation analysis.
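Correlations (Spearman's rho, N = 8 for every pair). Each cell shows the correlation coefficient with its Sig. (2-tailed) in parentheses.
Temperature   -10             25              37              50              65              80
-10           1.000           .952** (.000)   .690 (.058)     .786* (.021)    .738* (.037)    .929** (.001)
25            .952** (.000)   1.000           .667 (.071)     .690 (.058)     .690 (.058)     .881** (.004)
37            .690 (.058)     .667 (.071)     1.000           .762* (.028)    .857** (.007)   .833* (.010)
50            .786* (.021)    .690 (.058)     .762* (.028)    1.000           .857** (.007)   .738* (.037)
65            .738* (.037)    .690 (.058)     .857** (.007)   .857** (.007)   1.000           .857** (.007)
80            .929** (.001)   .881** (.004)   .833* (.010)    .738* (.037)    .857** (.007)   1.000
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
If the observations were independent, you would expect p>0.05 for the Spearman correlation coefficients.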
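A quick way to run the same check outside SPSS is pandas' built-in Spearman correlation on the horizontal data. This is a sketch only; ID 7's value at temperature 65 is missing from the transcript and assumed to be 65.0, and the names `wide` and `rho` are illustrative.

```python
import pandas as pd

TEMPS = [-10, 25, 37, 50, 65, 80]
VENT = {
    1: [74.5, 81.5, 83.6, 68.6, 73.1, 79.4],
    2: [75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
    3: [68.9, 71.6, 55.9, 61.9, 60.5, 61.8],
    4: [57.0, 61.3, 54.1, 59.2, 56.6, 58.8],
    5: [78.3, 84.9, 64.0, 62.2, 60.1, 78.7],
    6: [54.0, 62.8, 63.0, 58.0, 56.0, 51.5],
    7: [72.5, 68.3, 67.8, 71.5, 65.0, 67.7],  # 65.0 is inferred (assumption)
    8: [80.8, 89.9, 83.2, 83.0, 85.7, 79.6],
}

# Horizontal layout: one row per subject, one column per temperature.
wide = pd.DataFrame.from_dict(VENT, orient="index", columns=TEMPS)

# Rank-based correlation between every pair of temperature columns.
rho = wide.corr(method="spearman")
print(rho.round(3))
```

Large off-diagonal coefficients mean that a subject who ventilates more at one temperature tends to do so at every temperature, which is the dependence that one-way ANOVA ignores.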

13 In order to perform correlation analysis, data must be entered horizontally; you can use the deal.sav dataset for this. In case you have created your original database longitudinally and want to convert it to horizontal, here is how to do it.
Read Deal.long.sav into SPSS.
Before you restructure this data, recode the negative value of the Temperature variable, either by recoding the levels to (1,2,3,4,5,6) or by replacing -10 with 10.
Go to: Data > Restructure
Step 1: Welcome Data Structure Wizard
- Select the second choice, "Restructure selected cases into variables", click "next"
Step 2: Select "PATIENT ID" to the identifier variable box (upper left box) and "Temperature" to the index variable box (lower left box), click "next"
Step 3: Finish
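The reverse restructuring (longitudinal to horizontal) can be sketched with a pandas pivot. The two subjects shown are copied from the Example 1 data; the names `long` and `wide` are hypothetical. Unlike the SPSS wizard, this sketch does not need the negative temperature recoded first.

```python
import pandas as pd

# Longitudinal layout: one row per subject per temperature.
long = pd.DataFrame({
    "ID":   [1] * 6 + [2] * 6,
    "Temp": [-10, 25, 37, 50, 65, 80] * 2,
    "Vent": [74.5, 81.5, 83.6, 68.6, 73.1, 79.4,
             75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
})

# Cases to variables: ID identifies the row, Temp indexes the new columns.
wide = long.pivot(index="ID", columns="Temp", values="Vent")
print(wide)
```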

14 Mathematical Presentation of Correlation Structures
Let r_jk denote the correlation coefficient between the jth and kth repeated measures on the same patient. R(r_jk) is the working correlation matrix of Y. With independent observations every off-diagonal r_jk is zero; this is what a linear regression model assumes.

15 Since observations are not independent for repeated measures data (observations within a patient are dependent), we cannot use the independence assumption. A linear mixed model requires an assumption about the correlation structure: you need to make a guess about how measures taken repeatedly are correlated.

16 The figure above assumes that the variance of Y is the same at every point across all categories and that the correlation between any 2 sets of Ys is zero (independent). This structure is called independent (scaled identity in SPSS). This is what 2-way ANOVA assumes for the structure of the error terms, which obviously does not provide a good fit to the isoproterenol data.

17 Let’s look at the next figure
The variance of each Y is the same across all the categories, and the correlation between any 2 Ys is the same (i.e., the correlation is the same whether 2 doses are close together or not). This structure is called Compound Symmetry (Exchangeable).

18 Let’s look at the next figure
The variance of each Y is the same across all the categories, and the correlation between any 2 Ys is equal to r^(distance between Ys), where -1<r<1 (i.e., the correlation is stronger when 2 doses are closer together and weaker when they are further apart). This structure is called First-Order Autoregressive.

19 AR(1): Heterogeneous. This is a first-order autoregressive structure with heterogeneous variances.

20 Mathematical Presentation of Correlation Structures
Let r_jk denote the correlation coefficient between the jth and kth repeated measures on the same patient. R(r_jk) is the working correlation matrix of Y.

21 Model for the correlation
Independence (called "Scaled Identity" in SPSS): Any two observations within the same patient are independent, so their correlation is zero.

22 Model for the correlation (cont.)
Exchangeable (compound symmetry): Any two distinct observations from the same patient have the same correlation coefficient.

23 Model for the correlation (cont.)
Unstructured: Each r_jk has a different value; no structure is assumed in R.

24 Model for the correlation (cont.)
Autoregressive (1): r_jk is a function of the time lag between the 2 points.
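For concreteness, the four working correlation matrices named on these slides can be written out as follows. This is a sketch for four repeated measures; the choice of four time points and the use of a single symbol r for the common correlation parameter are assumptions made for illustration.

```latex
\[
R_{\text{independence}} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
\qquad
R_{\text{exchangeable}} =
\begin{pmatrix}
1 & r & r & r \\
r & 1 & r & r \\
r & r & 1 & r \\
r & r & r & 1
\end{pmatrix}
\]
\[
R_{\text{unstructured}} =
\begin{pmatrix}
1 & r_{12} & r_{13} & r_{14} \\
r_{12} & 1 & r_{23} & r_{24} \\
r_{13} & r_{23} & 1 & r_{34} \\
r_{14} & r_{24} & r_{34} & 1
\end{pmatrix},
\qquad
R_{\text{AR(1)}} =
\begin{pmatrix}
1 & r & r^{2} & r^{3} \\
r & 1 & r & r^{2} \\
r^{2} & r & 1 & r \\
r^{3} & r^{2} & r & 1
\end{pmatrix}
\]
```

In every case $r_{jk} = r_{kj}$ and the diagonal is 1; the structures differ only in how many distinct off-diagonal parameters must be estimated (none, one, six, and one, respectively, for four time points).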

25 Toeplitz: Often fits well for experimental data.
Toeplitz. This covariance structure has homogeneous variances and heterogeneous correlations between elements. The correlation between adjacent elements is homogeneous across pairs of adjacent elements. The correlation between elements separated by a third is again homogeneous, and so on.
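To make the structures concrete, here is a small NumPy sketch that builds example covariance matrices for compound symmetry, AR(1), AR(1) heterogeneous and Toeplitz. The function names and parameter values are hypothetical, and the lag is counted in measurement order (first, second, third measurement, ...), which is an assumption of this sketch.

```python
import numpy as np

def lag_matrix(n):
    """Absolute difference in measurement order between every pair of time points."""
    idx = np.arange(n)
    return np.abs(np.subtract.outer(idx, idx))

def compound_symmetry(n, sigma2, r):
    """Equal variances; every pair of measures shares the same correlation r."""
    corr = np.full((n, n), float(r))
    np.fill_diagonal(corr, 1.0)
    return sigma2 * corr

def ar1(n, sigma2, r):
    """Equal variances; correlation r**lag decays as measures get further apart."""
    return sigma2 * float(r) ** lag_matrix(n)

def arh1(sds, r):
    """AR(1) correlations with a separate standard deviation per time point."""
    sds = np.asarray(sds, dtype=float)
    return np.outer(sds, sds) * float(r) ** lag_matrix(len(sds))

def toeplitz_cov(sigma2, band_corrs):
    """Equal variances; one free correlation per lag (band_corrs[k-1] is lag k)."""
    first_row = np.concatenate(([1.0], np.asarray(band_corrs, dtype=float)))
    return sigma2 * first_row[lag_matrix(len(first_row))]

print(ar1(4, 1.0, 0.5))
print(arh1([1.0, 2.0, 3.0], 0.5))
```

Comparing the printed matrices shows the trade-off: AR(1) needs a single correlation parameter, while Toeplitz frees one parameter per lag and the heterogeneous variant frees the variances as well.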

26 Selection of correlation structure
- If the number of repeats is small and the data are balanced and complete, then an unstructured matrix is recommended.
- If observations are measured over time, then use a structure that accounts for correlation as a function of time (e.g., auto-regressive), and choose the model which provides the smallest AIC value.
- If observations are clustered (i.e., there is no logical ordering), then exchangeable may be appropriate.

27 The model is able to consider the following covariance structures for repeated measures data available in SPSS.
- Ante-Dependence: First Order
- AR(1)
- AR(1): Heterogeneous
- ARMA(1,1)
- Compound Symmetry
- Compound Symmetry: Correlation Metric
- Compound Symmetry: Heterogeneous
- Diagonal
- Factor Analytic: First Order
- Factor Analytic: First Order, Heterogeneous
- Huynh-Feldt
- Scaled Identity
- Toeplitz
- Toeplitz: Heterogeneous
- Unstructured
- Unstructured: Correlations

28 Let’s use a linear mixed effect model to analyze Minute ventilation Volume data.
Read Deal.long.sav into SPSS.
Analyze > Mixed Models > Linear
- Select ID as the Subject variable
- Select Temp as the Repeated variable
- Select an appropriate covariance structure in Repeated Covariance Type, for example AR(1): Heterogeneous, then Continue
- Dependent variable: Vent
- Factor variable (categorical independent variable): Temp
- Covariates (continuous independent variables): none
- Click Fixed: click Custom, highlight all independent variables in the box, choose Main Effect, and put them (Temp) in the Model box
- Click EM Means: select Temp into the "Display means for" box, click Compare main effects, select the Bonferroni method, and set the reference category to "first"
- Click Statistics: choose Parameter estimates, Tests for covariance parameters, Covariances of residuals
- Click Save: select Residuals and Predicted Values
- OK

29 Result of the linear mixed Model with ARH(1)
Information Criteria (a): -2 Restricted Log Likelihood, Akaike's Information Criterion (AIC), Hurvich and Tsai's Criterion (AICC), Bozdogan's Criterion (CAIC), Schwarz's Bayesian Criterion (BIC) [values not shown]
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Minute Ventilation Volume.
Type III Tests of Fixed Effects (a)
Source      Numerator df   Denominator df   F       Sig.
Intercept   1              7.049                    .000
temp        5              21.222           3.099   .030
a. Dependent Variable: Minute Ventilation Volume.
Pairwise Comparisons (b)
(I) Temperature   (J) Temperature   Mean Difference (I-J)   Std. Error   df       Sig. (a)   95% CI Lower Bound   95% CI Upper Bound
25                -10                5.425                  2.339        14.346    .178      -1.513               12.363
37                -10               -2.413                  3.412        16.480   1.000                            7.516
50                -10               -1.225                  3.783        20.843   1.000                            9.493
65                -10               -3.938                  3.843        23.946   1.000                            6.813
80                -10               -1.125                  4.246        19.444   1.000                           10.993
Based on estimated marginal means
a. Adjustment for multiple comparisons: Bonferroni.
b. Dependent Variable: Minute Ventilation Volume.

30 P-values were not adjusted for multiple comparisons.
Pairwise Comparisons (b)
(I) Temperature   (J) Temperature   Mean Difference (I-J)   Std. Error   df       Sig. (a)   95% CI Lower Bound   95% CI Upper Bound
-10               25                -5.425*                 2.339        14.346   .036       -10.431               -.419
-10               37                 2.413                  3.412        16.480   .489        -4.804               9.629
-10               50                 1.225                  3.783        20.843   .749        -6.645               9.095
-10               65                 3.938                  3.843        23.946   .316        -3.995              11.870
-10               80                 1.125                  4.246        19.444   .794        -7.749               9.999
25                -10                5.425*                 2.339        14.346   .036          .419              10.431
25                37                 7.838*                 2.678        16.722   .010         2.180              13.495
25                50                 6.650                  3.458        23.877   .066         -.489              13.789
25                65                 9.363*                 3.787        26.581   .020         1.585              17.140
25                80                 6.550                  4.285        23.101   .140        -2.312              15.412
37                -10               -2.413                  3.412        16.480   .489        -9.629               4.804
37                25                -7.838*                 2.678        16.722   .010       -13.495              -2.180
37                50                -1.188                  2.748        19.753   .670        -6.925               4.550
37                65                 1.525                  3.528        21.550   .670        -5.800               8.850
37                80                -1.288                  4.177        22.832   .761        -9.932               7.357
50                -10               -1.225                  3.783        20.843   .749        -9.095               6.645
50                25                -6.650                  3.458        23.877   .066       -13.789                .489
50                37                 1.188                  2.748        19.753   .670        -4.550               6.925
50                65                 2.713                  2.579        17.391   .307        -2.720               8.145
50                80                 -.100                  3.513        20.601   .978        -7.415               7.215
65                -10               -3.938                  3.843        23.946   .316       -11.870               3.995
65                25                -9.363*                 3.787        26.581   .020       -17.140              -1.585
65                37                -1.525                  3.528        21.550   .670        -8.850               5.800
65                50                -2.713                  2.579        17.391   .307        -8.145               2.720
65                80                -2.813                  2.496        14.017   .279        -8.165               2.540
80                -10               -1.125                  4.246        19.444   .794        -9.999               7.749
80                25                -6.550                  4.285        23.101   .140       -15.412               2.312
80                37                 1.288                  4.177        22.832   .761        -7.357               9.932
80                50                  .100                  3.513        20.601   .978        -7.215               7.415
80                65                 2.813                  2.496        14.017   .279        -2.540               8.165
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
b. Dependent Variable: Minute Ventilation Volume.
The p-values above were not adjusted for multiple comparisons. Select the Bonferroni option to adjust for multiple comparisons. With the adjustment, none of the pairwise comparisons was significant.
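The Bonferroni adjustment itself is simple arithmetic: multiply each unadjusted p-value by the number of comparisons and cap the result at 1. The sketch below applies it to the five comparisons against the reference temperature (-10), using the rounded unadjusted p-values from the output above. Taking the number of comparisons to be 5 is an assumption based on the "reference category" setup; SPSS works from unrounded p-values, so its adjusted value for 25 vs -10 (.178) differs slightly in the third decimal.

```python
# Unadjusted (LSD) p-values for each temperature versus -10, as displayed above.
unadjusted = {25: 0.036, 37: 0.489, 50: 0.749, 65: 0.316, 80: 0.794}

# Number of comparisons being made (assumed: one per non-reference temperature).
m = len(unadjusted)

# Bonferroni: multiply by the number of comparisons, never exceeding 1.
bonferroni = {temp: min(1.0, p * m) for temp, p in unadjusted.items()}

for temp, p_adj in bonferroni.items():
    flag = "significant" if p_adj < 0.05 else "not significant"
    print(f"{temp:>3} vs -10: adjusted p = {p_adj:.3f} ({flag})")
```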

31 Predicted model by the linear mixed effect model with ARH(1) correlation structure.

32 Performing one-way ANOVA using the linear mixed effect model option in SPSS
When you use the independence (scaled identity) structure for the correlation, the model becomes equivalent to one-way ANOVA.
Result of the linear mixed model ignoring dependency among repeated measures:
Information Criteria (a): -2 Restricted Log Likelihood, Akaike's Information Criterion (AIC), Hurvich and Tsai's Criterion (AICC), Bozdogan's Criterion (CAIC), Schwarz's Bayesian Criterion (BIC) [values not shown]
Slide annotation on the Information Criteria table: This value is bigger than the one with ARH(1) on page 29.
The information criteria are displayed in smaller-is-better forms.
a. Dependent Variable: Minute Ventilation Volume.
Type III Tests of Fixed Effects (a)
Source      Numerator df   Denominator df   F      Sig.
Intercept   1              42                      .000
temp        5              42               .727   .607
a. Dependent Variable: Minute Ventilation Volume.

33 Predicted model by the linear mixed effect model with independence correlation structure.
Model parameter estimates are the same as those of the model with ARH(1); however, standard errors are much smaller with the model that considers correlation among repeated measures.

34 Performing residual diagnosis for a linear mixed model in SPSS.
After you perform the analysis on page 28 using the SAVE option, you will have created 2 new variables: residuals and predicted values.
Go to graphics, histogram to create graph 1.
Go to graphics, scatter plot to create graph 2.
Graph 1: Checking for normality.
Graph 2: Checking for a trend in the residuals (if there is a trend, you may want to try a transformation of Y).

35 Fitting slopes (null hypothesis 2: slope = 0)
In the previous analysis, we treated temperature as a categorical variable, which assesses whether the mean ventilation volumes were the same or not. None of the pairwise analyses showed a difference, owing to the loss of power caused by the adjustment for multiple comparisons. In order to assess whether there is an increasing or decreasing trend with temperature, we can of course analyze this data with a regression slope by treating temperature as continuous instead of categorical.
[Figure: LOWESS curve]

36 Performing a linear mixed effect model to assess slope is greater or less than zero.
Read deal.long.sav into SPSS.
Analyze > Mixed Models > Linear
- Select ID as the Subject variable
- Select Temp as the Repeated variable
- Select an appropriate covariance structure in Repeated Covariance Type, for example AR(1): Heterogeneous, then Continue
- Dependent variable: Vent
- Factor variable (categorical independent variable): none
- Covariates (continuous independent variables): Temp
- Click Fixed: click Custom, highlight all independent variables in the box, choose Main Effect, and put them (Temp) in the Model box
- Click Statistics: choose Parameter estimates, Tests for covariance parameters
- Click Save: select Residuals and Predicted Values
- OK

37 Result of the linear mixed model with ARH(1) with temp as continuous.
Estimates of Fixed Effects (a)
Parameter   Estimate   Std. Error   df       t        Sig.   95% CI Lower Bound   95% CI Upper Bound
Intercept                           8.409    22.861   .000
temp                                22.702   .306     .762
a. Dependent Variable: Minute Ventilation Volume.
P=0.762 indicates that the slope is not significantly different from zero.

38 A much simpler way to analyze this data:
Response feature analysis, i.e., analysis of summary measures, using the slope as a summary measure:
Read the Deal Longitudinal.sav dataset into SPSS.
Graphs > Interactive > Scatterplot
- Select Vent as the Y-axis, Temp (as Scale) as the X-axis, and ID (as Categorical) as the panel variable
- Click Fit
- Select "Include constant in equation"
- Prediction line for "Mean"
- Fit line for "Total"
- OK

39 Then perform One-sample non-parametric test (N is small) for the slope
Now, open Deal.sav, create a new variable Slope, and type in each person's slope value.
Then perform a one-sample non-parametric test (N is small) for the slope.
In order to perform a one-sample non-parametric test in SPSS, you need this trick: create a dummy variable with all 0's (SPSS does not work with only the SLOPE variable).
Then go: Analyze > 2-related samples
- Select Slope and Dummy to the test pair list
- Select Wilcoxon as the test type
- OK
ID   Slope   Dummy
1    -0.01   0
2    -0.02   0
3    -0.4    0
4            0
5    -0.12   0
6    -0.04   0
7    -0.06   0
8            0
Using slopes to test for trends: Wilcoxon signed-rank test: P=0.018
We can now conclude that there is a significant association between minute ventilation volume and temperature (as temperature increases, ventilation volume decreases).
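The response feature analysis can be sketched end to end in Python: fit an ordinary least-squares line per subject, keep the slope, and test the eight slopes against zero with the Wilcoxon signed-rank test. SciPy tests a single sample against zero directly, so no dummy variable is needed. The slopes here are computed from the Example 1 data and left unrounded, so individual values and the p-value may differ somewhat from the hand-entered figures on the slide; ID 7's value at temperature 65 is missing from the transcript and assumed to be 65.0.

```python
import numpy as np
from scipy.stats import wilcoxon

TEMPS = [-10, 25, 37, 50, 65, 80]
VENT = {
    1: [74.5, 81.5, 83.6, 68.6, 73.1, 79.4],
    2: [75.5, 84.6, 70.6, 87.3, 73.0, 75.0],
    3: [68.9, 71.6, 55.9, 61.9, 60.5, 61.8],
    4: [57.0, 61.3, 54.1, 59.2, 56.6, 58.8],
    5: [78.3, 84.9, 64.0, 62.2, 60.1, 78.7],
    6: [54.0, 62.8, 63.0, 58.0, 56.0, 51.5],
    7: [72.5, 68.3, 67.8, 71.5, 65.0, 67.7],  # 65.0 is inferred (assumption)
    8: [80.8, 89.9, 83.2, 83.0, 85.7, 79.6],
}

temps = np.array(TEMPS, dtype=float)

# Summary measure: one least-squares slope (l/min per degree) per subject.
slopes = {
    subject: np.polyfit(temps, np.array(values), 1)[0]
    for subject, values in VENT.items()
}

# One-sample Wilcoxon signed-rank test of the slopes against zero.
result = wilcoxon(list(slopes.values()))

for subject, slope in slopes.items():
    print(f"ID {subject}: slope = {slope:.3f}")
print(f"Wilcoxon signed-rank p = {result.pvalue:.3f}")
```

Reducing each subject to a single number removes the within-subject correlation problem entirely, since the eight slopes come from eight different people.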

40 Using summary measures can provide a more intuitive and simplified approach, which sometimes provides greater power to detect differences.

