Comparisons among methods to analyze clustered multivariate biomarker predictors of a single binary outcome Xiaoying Yu, PhD Department of Preventive Medicine.


1 Comparisons among methods to analyze clustered multivariate biomarker predictors of a single binary outcome
Xiaoying Yu, PhD
Department of Preventive Medicine and Community Health
University of Texas Medical Branch
July 31st, 2018

2 Introduction
In an experimental study, repeated measures of multiple biomarker predictors were collected from each subject, while the primary binary outcome was measured only once per subject (that is, it was fixed). This contrasts with repeated measures of an outcome, for which established statistical analysis methods are available. Statistical methodology for repeated measures of predictors with a fixed outcome is not well established. Published analyses not uncommonly ignore the correlation among the repeated measures and treat all of them as independent observations (naïve approach). A simple alternative is to average the repeated measures of each biomarker within subject before conducting the analysis (mean score approach). Here we consider alternative statistical approaches that account for the correlation among a varying number of repeated measures on the predictors.

3 Data Structure
Columns on the slide: Obs, Subj, Y, Rep, X1, X2. Only the Obs, X1, and X2 values survive in this transcript.

Obs   X1      X2
1     1.175   1.489
2     1.499   1.543
3     1.272   1.493
4     1.208   1.352
5     1.189   1.551
6     2.784   3.976
7     0.865   7.034
8     0.824   3.373
9     5.503   1.482
10    2.452   1.917
11    3.541   1.566
12    1.415   3.323

4 Common Methods
Naïve method (common mistake): Assume that the individual observations are independent, ignoring the correlation among repeated measures. Fit a logistic regression model and calculate the area under the ROC curve (C statistic) for the model. Limitation: underestimates the variance because the repeated-measures correlation is ignored.
Mean score method (commonly used): Use the average of the repeated values of each biomarker within a subject as the predictor in the logistic regression. Limitation: aggregates the repeated predictor values without considering their correlation, and reduces the sample size.
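The two common methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis behind the slides: the simulated data, the effect sizes, and the helper names `fit_logistic` and `c_statistic` are all hypothetical, and the logistic fit is a bare Newton-Raphson routine written with NumPy alone.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
        # Tiny ridge term keeps the Hessian invertible.
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd + 1e-8 * np.eye(Xd.shape[1])
        beta += np.linalg.solve(hess, Xd.T @ (y - p))
    return beta

def c_statistic(score, y):
    """AUC from all case/control pairs (ties count as 0.5)."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical simulated data: 1 to 4 repeated measures of X1, X2 per subject,
# one binary outcome per subject.
rng = np.random.default_rng(0)
n_subj = 300
reps = rng.integers(1, 5, size=n_subj)
subj = np.repeat(np.arange(n_subj), reps)          # subject index of each row
u = rng.normal(size=(n_subj, 2))                   # subject-level biomarker means
y_subj = rng.binomial(1, 1 / (1 + np.exp(-(u[:, 0] + 0.5 * u[:, 1]))))
X = u[subj] + rng.normal(scale=0.7, size=(len(subj), 2))

# Naive method: every row is treated as an independent observation.
y_long = y_subj[subj]
beta_naive = fit_logistic(X, y_long)
c_naive = c_statistic(X @ beta_naive[1:], y_long)

# Mean score method: one averaged row per subject.
X_mean = np.column_stack(
    [np.bincount(subj, weights=X[:, k], minlength=n_subj) / reps for k in range(2)]
)
beta_mean = fit_logistic(X_mean, y_subj)
c_mean = c_statistic(X_mean @ beta_mean[1:], y_subj)

print(f"naive C = {c_naive:.3f}, mean score C = {c_mean:.3f}")
```

Note that the naïve fit uses every row (more than `n_subj` observations), which is why its standard errors look smaller than they should, while the mean score fit uses exactly one row per subject.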

5 Proposed Methods
Resampling method: Sample a single observation from the repeated observations of each subject to construct a dataset of independent observations, then fit the logistic regression and obtain the C statistic. Repeat 100 times and calculate the average of the C statistics. Limitation: computationally expensive for large samples.
Two-stage method: First step: fit a multivariate random effects model for the biomarker levels, assuming an unstructured variance-covariance G matrix and a diagonal R matrix. Use this model to derive a best linear unbiased predictor (BLUP) for each biomarker per subject, so that each subject has a single vector of BLUPs. Second step: use the BLUPs from this model as predictors in the logistic regression and ROC analysis. Limitation: relies on the assumed parametric distribution of the predictors.
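Both proposed methods can be sketched as follows. The resampling part follows the steps on the slide. The two-stage part is a deliberate simplification and not the model described above: instead of a multivariate random effects model with unstructured G, it fits a separate random-intercept model per biomarker using ANOVA moment estimators and shrinks each subject mean toward the grand mean (an empirical BLUP). All data, function names, and parameter values are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; returns [intercept, slopes...]."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
        hess = (Xd * (p * (1 - p))[:, None]).T @ Xd + 1e-8 * np.eye(Xd.shape[1])
        beta += np.linalg.solve(hess, Xd.T @ (y - p))
    return beta

def c_statistic(score, y):
    """AUC from all case/control pairs (ties count as 0.5)."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def resampling_c(X, subj, y_subj, n_draws=100, seed=0):
    """Average C statistic over datasets holding one random row per subject.
    Assumes subjects are labeled 0..n-1 and each has at least one row."""
    rng = np.random.default_rng(seed)
    n = len(y_subj)
    order = np.argsort(subj, kind="stable")        # rows grouped by subject
    counts = np.bincount(subj, minlength=n)
    starts = np.cumsum(counts) - counts            # first row of each subject
    cs = []
    for _ in range(n_draws):
        pick = order[starts + rng.integers(0, counts)]
        beta = fit_logistic(X[pick], y_subj)
        cs.append(c_statistic(X[pick] @ beta[1:], y_subj))
    return float(np.mean(cs))

def shrunken_means(x, subj, n):
    """Simplified stage 1: empirical BLUP of each subject's mean for ONE
    biomarker under a random-intercept model (moment estimators)."""
    counts = np.bincount(subj, minlength=n)
    means = np.bincount(subj, weights=x, minlength=n) / counts
    N, grand = len(x), x.mean()
    var_e = np.sum((x - means[subj]) ** 2) / (N - n)          # within-subject
    msb = np.sum(counts * (means - grand) ** 2) / (n - 1)
    n0 = (N - np.sum(counts ** 2) / N) / (n - 1)
    var_u = max((msb - var_e) / n0, 0.0)                      # between-subject
    shrink = var_u / (var_u + var_e / counts)
    return grand + shrink * (means - grand)

# Hypothetical simulated data, 1 to 4 repeated measures per subject.
rng = np.random.default_rng(1)
n = 300
reps = rng.integers(1, 5, size=n)
subj = np.repeat(np.arange(n), reps)
u = rng.normal(size=(n, 2))
y_subj = rng.binomial(1, 1 / (1 + np.exp(-(u[:, 0] + 0.5 * u[:, 1]))))
X = u[subj] + rng.normal(scale=0.7, size=(len(subj), 2))

c_resampled = resampling_c(X, subj, y_subj)

# Stage 2: shrunken subject-level predictors go into the logistic regression.
B = np.column_stack([shrunken_means(X[:, k], subj, n) for k in range(2)])
beta = fit_logistic(B, y_subj)
c_two_stage = c_statistic(B @ beta[1:], y_subj)

print(f"resampling C = {c_resampled:.3f}, two-stage C = {c_two_stage:.3f}")
```

The shrinkage factor is smaller for subjects with fewer repeated measures, so their predictors are pulled more strongly toward the grand mean. A full implementation of the slide's first step would also model the covariance between biomarkers, which this sketch ignores.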

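6 Results and Conclusions (Per Simulated Data)
For each method the slide reports Bias%, SE, and CP against the true C statistic.

Model     True C stat   Naïve Bias%   Naïve CP   Mean score Bias%   Mean score CP
X1 + X2   0.78          0.5           0.71       1.7                0.91
X1 only   0.53          5.6           0.81       5.1                0.97
X2 only   0.77          -0.3          0.70       0.6                0.93

SE annotations on the slide: Accurate (mean score); Severely underestimate; Overestimate.
Two-stage and Resampling values as extracted (cell alignment not recoverable): 0.95 5.8 0.96 5.0 0.98 0.4 0.94

Conclusion: The resampling method outperforms the other methods.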

7 Contact Information
Xiaoying Yu, PhD
Office of Biostatistics
Department of Preventive Medicine and Community Health
University of Texas Medical Branch
Galveston, TX 77555

