1
Preliminary results of: Prediction of TB transmission from Attributes of Infected Cases
2
Review – TB transmission
Reactivation – development of TB disease from latent infection (R* below)
Primary TB disease – disease after initial exposure (T below)
[Figure: diagram with cases labeled Rt, Rn, and T]
3
Review – TB transmission
How do I know whether a case belongs to Rt, Rn, or T?
Epi investigation – low sensitivity
DNA fingerprinting – long turnaround time or low specificity
Goal – develop an alternative method to classify a TB case into one of 3 classes from the infected case's features
4
Potential features to predict TB transmission, drawn from past retrospective TB epi studies
Cough
Sputum
Chest X-ray results
HIV
Sputum smear results
Age
Sex
Country of origin
IV drug user, homeless, alcoholic
Residence (i.e. apartment or house)
Residential location on the island (CSSS)
Some non-linear (bivariate interaction) variables: residence and X-ray results, residence and coughing
5
Model
Multinomial logit regression with the Begg and Gray approximation.
3 response classes to be predicted from the patient's attributes (classes: Rt, Rn, T)
The response class was determined by DNA fingerprint analysis together with the dates of lab tests
Feature selection: Bayesian Model Averaging (BMA) – validated to yield more stable estimates than stepwise selection with AIC or p-value based methods
Almost all predictor variables have missing data, so multiple imputation was performed with m = 15 (ICE method).
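As a rough illustration of the modelling idea: the Begg and Gray approximation replaces one multinomial fit with separate binary logistic regressions, each comparing one class against the reference class, and the fitted linear predictors are then combined into three class probabilities. The sketch below shows only that combination step. The coefficients, feature names, and the `patient` record are hypothetical placeholders, not estimates from this study.

```python
import math

def linear_predictor(coefs, features):
    """Intercept plus the sum of coefficient * feature value."""
    return coefs["intercept"] + sum(coefs[name] * value
                                    for name, value in features.items())

def class_probabilities(eta_rt, eta_t):
    """Combine two binary-logit linear predictors (Rt vs. Rn, T vs. Rn)
    into probabilities for the three classes, with Rn as reference."""
    e_rt, e_t = math.exp(eta_rt), math.exp(eta_t)
    denom = 1.0 + e_rt + e_t
    return {"Rn": 1.0 / denom, "Rt": e_rt / denom, "T": e_t / denom}

# Hypothetical coefficients from two separate binary logistic fits.
coef_rt = {"intercept": -2.0, "hiv": 1.0, "age": -0.01}
coef_t = {"intercept": -1.5, "hiv": 0.5, "age": -0.02}

# Hypothetical patient record.
patient = {"hiv": 1, "age": 40}

probs = class_probabilities(linear_predictor(coef_rt, patient),
                            linear_predictor(coef_t, patient))
print(probs)
```

The three probabilities always sum to 1, and with both linear predictors at 0 each class gets probability 1/3.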
6
Estimates of performance and Internal Validation
10 rounds of 10-fold cross-validation on each of the 15 imputed datasets
Estimates of predictive performance:
AUC for Rt and T, using Rn as reference
Brier score for Rt and T, using Rn as reference: $\frac{1}{n}\sum_{i=1}^{n}(Y_i - y_i)^2$, where $Y_i$ = observed response {1, 0} and $y_i$ = predicted probability
  0 -> perfect prediction
  0.25 -> no better than predicting 0.5 for every case
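The two performance measures can be sketched in a few lines. This is a minimal illustration, assuming binary observed responses (1 = class of interest, 0 = reference class Rn) and predicted probabilities for the same cases. The function names are illustrative, and the AUC is computed here by direct pairwise comparison, which may differ from whatever software the analysis actually used.

```python
def brier_score(observed, predicted):
    """Mean squared difference between observed {1, 0} and predicted probability."""
    n = len(observed)
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n

def auc(observed, predicted):
    """Probability that a random positive case scores higher than a random
    negative case (ties count as one half)."""
    pos = [p for y, p in zip(observed, predicted) if y == 1]
    neg = [p for y, p in zip(observed, predicted) if y == 0]
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Predicting 0.5 for every case gives the uninformative reference values.
print(brier_score([1, 0], [0.5, 0.5]))  # 0.25
print(auc([1, 0], [0.5, 0.5]))          # 0.5
```

Note that 0.25 is the Brier score of a constant 0.5 prediction. When the outcome is rare, always predicting a low probability can score well below 0.25 without discriminating at all, so the Brier score is best read alongside the AUC.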
7
Summary of analysis
Original data
Multiple imputation: 15 imputed datasets (m1, m2, m3, ..., m15) generated
Cross-validation: 10 rounds of 10-fold cross-validation applied to each m* dataset to obtain the predictive performance (i.e. AUC, Brier) of the prediction models, giving est1, est2, est3, ..., est15
Pooling: estimates of prediction error from the 15 datasets pooled according to Rubin's rules
Final estimates
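The pooling step can be sketched as follows. Rubin's rules average the per-dataset estimates and combine the within-imputation and between-imputation variances into one total variance. The sketch assumes each imputed dataset yields a point estimate and its variance, and that the estimate is pooled on its raw scale. Whether the analysis transformed the AUC before pooling is not stated in the slides, and the function name and example numbers are illustrative only.

```python
def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance)."""
    m = len(estimates)
    q_bar = sum(estimates) / m                      # pooled point estimate
    within = sum(variances) / m                     # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between      # total variance
    return q_bar, total

# Illustrative numbers only (not results from this study).
est, var = pool_rubin([0.60, 0.64, 0.62], [0.002, 0.003, 0.0025])
print(est, var)
```

With m = 15 the between-imputation variance is inflated by a factor of 1 + 1/15, so disagreement between imputed datasets widens the final confidence interval.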
8
Results
Preliminary estimates from 1 round of 10-fold cross-validation
Rt: AUC 0.62 (95% CI 0.52, 0.71), Brier score 0.07
T: AUC 0.65 (95% CI 0.53, 0.77), Brier score 0.12
9
Results
Predicted probabilities too low
10
Results
Predictors, Mean Coef, Mean P.P
Sputum Smear 0.01 1.1 0.00 Age -0.01 35.43 1.8 Residence Lung Involvement? 0.02 3.48 0.48 Sex Previous TB diagnosis? -0.03 5.50 X-ray result 0.06 0.26 HIV 1.2 99.88 2.48 IV drug use Alcoholic 0.12 0.07 Coughing Sputum Residence * coughing 0.25 Residence * X-ray 96.39 0.05 7.99
11
Results
Predictors, Mean Coef, Mean P.P
CSSS – West to Central 0.00 -0.33 54.47 CSSS – East 0.02 5.13 0.12 CSSS – Central 0.45 57.05 WHO – Africa high HIV, Americas 0.42 58.6 WHO – Canada 1.19 100 WHO – E. Medit, E. Eur, S.E. Asia -0.27 34.61 0.79 WHO – Haiti 0.24 1.32 WHO – Western Pacific -0.11 15.18 0.03 5.06
CSSS reference: Airport & south of the island
WHO reference: Africa low HIV, industrialized nations