© 2003 By Default! A Free sample background from Slide 1 Evaluation of Support Vector Machines for Risk Modeling in Interventional Cardiology Michael E. Matheny, M.D.
Slide 2 Goal
Compare the risk modeling performance of support vector machines and logistic regression over time for the outcome of death in pre-intervention cardiac catheterization patients.
Slide 3 Pre-intervention Risk Assessment
Percutaneous Coronary Intervention (PCI) is a high-volume procedure with significant morbidity & mortality
Risk of death in PCI varies widely based on co-morbidities
Providing accurate case-level estimations can greatly aid patient and physician decision-making
Slide 4 Domain Data Quality
The American College of Cardiology has published a standardized data dictionary (ACC-NCDR) and mandates that accredited centers maintain detailed data on all PCI patients
Some states, including Massachusetts, now have mandatory reporting of case data based on the ACC-NCDR
Slide 5 Current Risk Model
Standard Logistic Regression (LR)
Gold standard for risk modeling in interventional cardiology
Type of generalized linear model (logit link)
–Used in analysis of a binary outcome
–Predicted probability is bounded by 0 and 1
Feature (variable) selection
–From all available data
–Known risk factors from prior studies
–Selected subset of data based on study design
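
The "bounded by 0 and 1" property can be illustrated with a minimal sketch. A logistic model passes a weighted sum of the features through the logistic function, so the output always lies between 0 and 1. The intercept, coefficients, and feature values in the usage note are hypothetical and are not taken from the study.

```python
import math


def logistic(z):
    """Map a linear predictor z to a value bounded by 0 and 1."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_risk(intercept, coefs, features):
    """Predicted risk for one case from a fitted logistic model.

    intercept and coefs are assumed to come from a model fitted elsewhere.
    """
    z = intercept + sum(b * x for b, x in zip(coefs, features))
    return logistic(z)
```

For example, `predict_risk(-4.0, [0.03, 1.2], [70, 1])` evaluates a made-up two-feature model (age in years and a binary shock indicator) and returns a value between 0 and 1.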
Slide 6 Alternative Risk Model
Support Vector Machine (SVM)
Key Features
–Kernel Functions: introduce non-linearity in the hypothesis space without explicitly requiring a non-linear algorithm
  Linear
  Polynomial
  Radial Based
–Global Minimum
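
The three kernel families named above can be sketched as plain functions. This uses the common textbook forms; the exact parameterization used by the SVM software in this study is not given in the slides, so `coef0` and `sigma` here are assumptions.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # coef0 is an assumed offset; some packages fix it at 1, others at 0.
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, sigma=1.0):
    # Radial basis function: similarity decays with squared distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

The SVM optimization itself stays linear in the kernel-induced space; only the similarity function changes.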
Slide 7 Risk Model Evaluation
Discrimination
Provides an estimate of population-level accuracy
Measured by the area under the Receiver Operating Characteristic (ROC) curve
The ROC curve plots sensitivity against 1 - specificity at different thresholds
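
As a minimal sketch, the area under the ROC curve can be computed without drawing the curve: it equals the probability that a randomly chosen case with the outcome receives a higher score than a randomly chosen case without it, with ties counted as one half. The function name and the toy data in the usage note are illustrative only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the outcome (e.g. death), 0 otherwise.
    scores: model outputs, where higher means higher predicted risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))
```

With labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive/negative pairs are ordered correctly, so the result is 0.75. The pairwise loop is quadratic in the number of cases; a rank-based version would be preferable for large data sets.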
Slide 8 Risk Model Evaluation
Calibration
Provides an estimate of case-level accuracy
Hosmer-Lemeshow’s Goodness-of-Fit Test
–Primarily used in logistic regression
–Calculates how well the observed and expected frequencies match
–Handles data sparsity better than more common methods (deviance, Pearson’s chi-square)
–P > 0.05 indicates no significant lack of fit
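
The test can be sketched as follows, assuming the usual construction: cases are sorted by predicted risk, cut into 10 equal-size groups, and observed versus expected event counts are compared with a chi-square statistic on groups - 2 degrees of freedom. The slides do not state the grouping or degrees of freedom used in the study, so both are assumptions here.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, p_value) comparing observed and expected events."""
    # Sort by predicted probability only, so ties keep their input order.
    pairs = sorted(zip(probs, labels), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        denom = expected * (1.0 - expected / size)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, chi2_sf_even_df(stat, groups - 2)
```

A large statistic (small P) means the predicted and observed event counts disagree within risk groups, which is how loss of calibration shows up.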
Slide 9 Source Data
Brigham & Women’s Hospital
Interventional Cardiology Database
January 1, 2002 – October 30, 2004
5383 Cases
–Data split two ways, each into 2/3 Training (3588) and 1/3 Test (1795)
Sequential Split
–sorted chronologically
–October 27, 2003 split
Random Split
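
The two splitting schemes can be sketched as below. The helper names and the `key` argument are hypothetical; the only figures taken from the slide are the 2/3 to 1/3 ratio and the resulting 3588 and 1795 case counts.

```python
import random


def sequential_split(cases, key):
    """Sort cases chronologically; the earliest 2/3 train, the latest 1/3 test."""
    ordered = sorted(cases, key=key)
    cut = len(ordered) * 2 // 3
    return ordered[:cut], ordered[cut:]


def random_split(cases, seed=0):
    """Shuffle cases, then take 2/3 for training and 1/3 for testing."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = len(shuffled) * 2 // 3
    return shuffled[:cut], shuffled[cut:]
```

With 5383 cases both functions yield 3588 training and 1795 test cases. The sequential split exposes the test set to any change in practice or case mix over time, while the random split mixes early and late cases in both sets.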
Slide 10 Sample Demographics
Overview (columns: #, %)
Rows: Age, Diabetic, Hypertensive, Hyperlipidemia, Prior PCI, Salvage Procedure, Cardiogenic Shock, Hemodynamic Instability, Death
Death: 78 (1.45%)
[values for the other rows are not preserved in this transcript]
Slide 11 Model Features
Age (D), Hyperlipidemia, Hx COPD, Gender, HTN, Hx CVD, BMI (D), Diabetes, Hx PVD, Cardiogenic Shock, Creatinine (D), Thrombolytic, Cardiac arrest, Hx CHF, IABP, Hemodynamic instability, CHF, EF (D), Smoker, Prior MI, AMI, Prior CABG, Prior PCI, Procedure urgency (D), Unstable Angina, Chronic Angina, AMI Within 24 Hours
Slide 12 Logistic Regression Model Development
STATA 8.2 (College Station, TX)
Backward Stepwise Technique
Exclusion Threshold (P = 0.05 – 0.15)
Feature Selection
Slide 13 Logistic Regression Feature Selection
Model development
–Sequential Training Set
–Backward stepwise selection (P = 0.10) used for initial feature selection
–Further stepwise feature removal guided by ROC and Hosmer-Lemeshow goodness-of-fit (HL) optimization
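
The second stage, removing features while a performance measure improves, can be sketched as a greedy loop. This is an illustration, not the procedure actually run in Stata: `score` is a hypothetical callback that would refit the model on a candidate feature list and return a single number to maximize (for instance a combination of ROC area and HL P value).

```python
def backward_eliminate(features, score, min_gain=0.0):
    """Greedily drop one feature at a time while the score improves.

    features: list of feature names.
    score: hypothetical callable taking a feature list, returning a float.
    min_gain: improvement required before a feature is dropped.
    """
    current = list(features)
    best = score(current)
    while len(current) > 1:
        # Evaluate every model that omits exactly one remaining feature.
        candidates = [
            (score([f for f in current if f != drop]), drop) for drop in current
        ]
        cand_score, drop = max(candidates, key=lambda c: c[0])
        if cand_score - best <= min_gain:
            break  # no single removal helps enough
        current.remove(drop)
        best = cand_score
    return current, best
```

Because every step refits and rescores the model, the result depends heavily on the chosen score, which is one reason this stage needs manual judgment.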
Slide 14 Logistic Regression Feature Selection
Columns: Feature, ROC, HL P
Rows (features removed): All, -BMI, -EF, -arrest, -hyperlipid, -BMI,EF, -BMI, Urgency, -BMI, Urgency, CHF Hx
[table values not preserved in this transcript]
Slide 15 Logistic Regression Evaluation
Columns: Training ROC, Training HL, Test ROC, Test HL
Rows: SEQ, RND
[only fragments of the HL P values (<0.001) are preserved in this transcript]
Slide 16 Support Vector Machine Model Development
GIST (Columbia University, NY, NY)
STATA 8.2 (College Station, TX)
All variables used
Kernel Choice
–Polynomial (degree 1-6)
–Radial width factor (related to sigma) (0.1-20)
Probabilistic Output Methodology
–Discriminant: distance from hyperplane
–LR model using the discriminant as the only feature
–Established method to convert SVM classification output to a probability estimate
–Allows use of HL goodness-of-fit
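
The probabilistic output step can be sketched as a one-feature logistic regression fitted to the SVM discriminant, an approach commonly known as Platt scaling. The study fitted this model in Stata; the gradient-descent fit, learning rate, and iteration count below are assumptions made only to keep the example self-contained.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_discriminant_lr(discriminants, labels, lr=0.1, iters=5000):
    """Fit P(y=1 | d) = sigmoid(a + b*d) by gradient descent on the log-loss."""
    a, b = 0.0, 0.0
    n = len(labels)
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for d, y in zip(discriminants, labels):
            err = sigmoid(a + b * d) - y
            grad_a += err
            grad_b += err * d
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def to_probability(d, a, b):
    """Convert one SVM discriminant value into a predicted probability."""
    return sigmoid(a + b * d)
```

Once every case has a predicted probability, the same ROC and HL measures used for logistic regression apply to the SVM output.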
Slide 17 Support Vector Machine Polynomial Evaluation (SEQ)
Columns: Training ROC, Training HL, Test ROC, Test HL
Rows: Lin, P, P, P, P, P
[table values and polynomial degrees not preserved in this transcript]
Slide 18 Support Vector Machine Polynomial Evaluation (RND)
Columns: Training ROC, Training HL, Test ROC, Test HL
Rows: Lin, P, P, P, P, P
[table values and polynomial degrees not preserved in this transcript]
Slide 19 Support Vector Machine Radial Evaluation (SEQ)
Columns: Training ROC, Training HL, Test ROC, Test HL
Rows: R, R, R, R, R, R
[table values and radial width factors not preserved in this transcript]
Slide 20 Support Vector Machine Radial Evaluation (RND)
Columns: Training ROC, Training HL, Test ROC, Test HL
Rows: R, R, R, R, R, R
[table values and radial width factors not preserved in this transcript]
Slide 21 Discussion
All Discrimination
All models showed excellent discrimination
No model differed significantly from the others in discrimination
Discrimination was relatively insensitive to changes in the data, even across widely varying levels of calibration
Slide 22 Discussion
LR Calibration
For this data, LR was unable to maintain calibration on the test sets. This is likely due to temporal data drift
The LR models required manual feature selection and expert knowledge to calibrate on the training data sets
Slide 23 Discussion
SVM Calibration
Some versions of both kernel types were able to maintain calibration on both data sets
Calibration was maintained across larger parameter ranges of both kernels for the random data set than for the sequential data set
Current assessments of discrimination and calibration on the training set are insufficient to choose the optimal kernel parameter
Slide 24 Conclusions
SVMs could be superior to LR in terms of maintaining calibration over time in this domain
Further exploration is needed to develop additional markers of model robustness
Further work is needed to evaluate optimal time intervals for creating new models or recalibrating old models
Slide 25 The end