Logistic Regression Model (Limited)


1 Logistic Regression Model (Limited)
Using Machine Learning to Define the Association between Cardiorespiratory Fitness and All-Cause Mortality: The FIT (Henry Ford ExercIse Testing) Project

Mouaz H. Al-Mallah, MD, MSc (1,2); Radwa Elshawi, PhD (3); Amjad M. Ahmed, MBBS, MSc (2); Waqas T. Qureshi, MD, MS (4); Clinton A. Brawner, PhD (1); Michael J. Blaha, MD, MPH (5); Haitham M. Ahmed, MD, MPH (5,6); Jonathan K. Ehrman, PhD (1); Steven J. Keteyian, PhD (1); Sherif Sakr, PhD (2)

1: Division of Cardiovascular Medicine, Henry Ford Hospital, Detroit, MI, USA
2: King Saud bin Abdulaziz University for Health Sciences, King Abdullah International Medical Research Center, King AbdulAziz Cardiac Center, Ministry of National Guard, Health Affairs, Riyadh, Saudi Arabia
3: Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
4: Wake Forest School of Medicine, Medical Center Boulevard, Winston-Salem, NC, USA
5: Johns Hopkins Ciccarone Center for the Prevention of Heart Disease, Baltimore, MD, USA
6: Cleveland Clinic Foundation, Cleveland, OH, USA

Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that assign the data to predetermined categories. The aim of this analysis is to compare the prediction of 10-year all-cause mortality (ACM) using logistic regression (LR) and ML approaches in patients who underwent stress testing.

A total of 34,212 patients (mean age 54 ± 13 years, 55% male) were included in this analysis. The baseline characteristics of the cohort, shown in Table 1 below, indicate a high prevalence of cardiovascular risk factors. During a follow-up of 10 years, a total of 3,921 patients (11.5%) died.

Feature selection for potential inclusion in the machine learning model. METs: Metabolic Equivalents of Task; %HR: Percentage of Heart Rate; SBP: Systolic Blood Pressure; CHD: Coronary Heart Disease.
Accuracy to predict 10-year all-cause mortality using ASCVD risk score, logistic regression, or machine learning approaches

Measure                      Logistic Regression Model (Limited)   Machine Learning
True Positive                1,669                                 3,427
False Positive               2,144                                 846
False Negative               2,252                                 494
True Negative                18,117                                29,445
Sensitivity                  42.57% (41.0% to 44.1%)               87.4% (86.3% to ...)
Specificity                  93.02% (92.7% to ...)                 97.2% (97.0% to ...)
Positive Likelihood Ratio    6.10 (...)                            31.3 (...)
Negative Likelihood Ratio    0.62 (...)                            0.13 (...)
Positive Predictive Value    44.12% (42.8% to 45.5%)               80.20% (79.0% to ...)
Negative Predictive Value    92.6% (92.4% to 92.8%)                98.35% (98.2% to ...)
Model Area Under the Curve   0.824 (...)                           0.923 (...)

Table 1. Baseline characteristics of the cohort

Characteristic                               Data (n=34,212)
Age (years) *                                54 ± 13
Male $                                       18,703 (55)
Race $
  White                                      23,801 (70)
  Black                                      9,768 (29)
  Others                                     643 (1)
Body Mass Index (kg/m2) *                    29.3 ± 5.8
Reason for Test $
  Chest Pain                                 17,547 (51)
  Shortness of Breath                        3,307 (10)
  Pre-Operation                              781 (2)
  Rule out Ischemia                          3,884 (11)
Stress Variables *
  Peak METs                                  9.2 ± 3.1
  Resting Systolic Blood Pressure (mmHg)     132 ± 19
  Resting Diastolic Blood Pressure (mmHg)    82 ± 11
  Resting Heart Rate (bpm)                   74 ± 13
  Peak Systolic Blood Pressure (mmHg)        183 ± 27
  Peak Diastolic Blood Pressure (mmHg)       86 ± 14
  Peak Heart Rate (bpm)                      151 ± 21
  Chronotropic Incompetence $                6,957 (23.3)
Past Medical History $
  Diabetes                                   5,907 (17)
  Hypertension                               20,534 (60)
  Smoking                                    15,249 (43)
  Family History of CAD                      18,299 (51)
Medications Used $
  Diuretic Use                               5,743 (16)
  Hypertensive Medications                   14,905 (42)
  Diabetes Medications                       2,432 (7)
  Statin                                     4,524 (13.2)
  Aspirin                                    5,752 (16.8)
  Beta Blockers                              5,434 (15.9)
  Calcium Channel Blockers                   4,638 (13.5)

All data are presented as: * mean ± standard deviation; $ frequency (percentage). mmHg: millimeters of mercury; bpm: beats per minute; CAD: coronary artery disease.
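The derived measures in the accuracy table follow from the four confusion-matrix counts. As a minimal sketch (the function name `diagnostic_metrics` is an illustrative assumption, not something from the poster), the standard definitions can be applied to the counts reported in the Machine Learning column; the results should agree with the reported point estimates up to rounding. Confidence intervals are not computed here.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


# Counts taken from the Machine Learning column of the accuracy table.
ml = diagnostic_metrics(tp=3427, fp=846, fn=494, tn=29445)
for name, value in ml.items():
    print(f"{name}: {value:.4f}")
```

The four counts sum to 34,212, the cohort size, and the positives (3,427 + 494) sum to the 3,921 deaths reported during follow-up.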
We included 34,212 patients (55% male, mean age 54 ± 13 years) free of coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had complete 10-year follow-up. The primary outcome of this analysis was all-cause mortality at 10 years.

The machine learning method used was the K-Nearest Neighbors (K-NN) classification technique. K-NN is a supervised classification technique in which the classifier uses the training dataset to learn what items from class “Yes” and from class “No” look like. The probability of 10-year ACM was calculated using logistic regression, the ASCVD risk score, and the machine learning technique, and the accuracy of these methods was calculated and compared.

Our analysis demonstrates that ML provides better accuracy and discrimination than LR in the prediction of ACM among patients undergoing stress testing. To the best of our knowledge, this is the first report describing the ML approach for interpreting a large dataset using exercise stress test variables to identify risk of ACM in individuals without known CVD.

Machine learning had the highest area under the curve (AUC 0.923) in the prediction of all-cause mortality compared to logistic regression (AUC 0.824) or the atherosclerotic cardiovascular disease (ASCVD) risk score (AUC 0.785), p < 0.001.

Disclosures: None
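To illustrate how a K-NN classifier assigns a "Yes" or "No" label, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function name `knn_predict`, the choice of k = 3, the Euclidean distance, and the toy data (two made-up, pre-scaled features per patient) are all assumptions for illustration and are unrelated to the FIT Project data.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return (majority label, fraction of the k nearest neighbours labelled 'Yes')."""
    # Order the training points by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    neighbours = [label for _, label in by_distance[:k]]
    label = Counter(neighbours).most_common(1)[0][0]
    p_yes = neighbours.count("Yes") / k
    return label, p_yes


# Hypothetical toy data: (age in decades, peak METs / 10). Values are invented.
train_X = [(4.0, 1.2), (4.5, 1.1), (5.0, 1.0), (7.0, 0.5), (7.5, 0.4), (6.8, 0.6)]
train_y = ["No", "No", "No", "Yes", "Yes", "Yes"]

label, p_yes = knn_predict(train_X, train_y, query=(7.2, 0.5), k=3)
print(label, p_yes)
```

Because K-NN relies on distances, features measured in different units (for example METs and blood pressure) would normally be scaled to comparable ranges before fitting; the toy features here are already on similar scales.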

