© 2003 By Default! A Free sample background from www.powerpointbackgrounds.com Slide 1 Evaluation of Support Vector Machines for Risk Modeling in Interventional.

Presentation transcript:

Slide 1: Evaluation of Support Vector Machines for Risk Modeling in Interventional Cardiology
Michael E. Matheny, M.D.

Slide 2: Goal
- Comparison of support vector machines and logistic regression risk modeling performance over time for the outcome of death in pre-intervention cardiac catheterization patients.

Slide 3: Pre-intervention Risk Assessment
- Percutaneous Coronary Intervention (PCI) is a high-volume procedure with significant morbidity and mortality
- Risk of death in PCI varies widely based on co-morbidities
- Providing accurate case-level estimates can greatly aid patient and physician decision-making

Slide 4: Domain Data Quality
- The American College of Cardiology has published a standardized data dictionary (ACC-NCDR) and mandates that accredited centers maintain detailed data on all PCI patients
- Some states, including Massachusetts, now have mandatory reporting of case data based on the ACC-NCDR

Slide 5: Current Risk Model: Standard Logistic Regression (LR)
- Gold standard for risk modeling in interventional cardiology
- Type of generalized linear model (logit link)
  - Used in analysis of a binary outcome
  - Predicted probability is bounded by 0 and 1
- Feature (variable) selection
  - From all available data
  - Known risk factors from prior studies
  - Selected subset of data based on study design
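The "bounded by 0 and 1" property follows directly from the logistic form. As a reminder, in notation that is ours rather than the slides' (features $x_1,\dots,x_k$, coefficients $\beta_0,\dots,\beta_k$), the model can be written as:

```latex
\[
p(\text{death} \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)}},
\qquad
\log\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j .
\]
```

The right-hand side of the second expression is linear in the features, while the first expression always lies strictly between 0 and 1.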

Slide 6: Alternative Risk Model: Support Vector Machine (SVM)
- Key features
  - Kernel functions: introduce non-linearity in the hypothesis space without explicitly requiring a non-linear algorithm
    - Linear
    - Polynomial
    - Radial basis
  - Global minimum (training is a convex optimization problem)
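The three kernel families named on this slide can be sketched in a few lines. This is an illustrative sketch only: the function names, the `coef0` offset, and the `sigma` parameterization are our assumptions and are not taken from the slides or from the software used in the study.

```python
import math

def linear_kernel(x, z):
    """Plain dot product: the SVM stays linear in the original features."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """(x.z + coef0)^degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, sigma=1.0):
    """Radial basis kernel exp(-||x - z||^2 / (2 sigma^2)); sigma is the width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

In each case the learning algorithm itself only ever sees kernel values between pairs of cases, which is how non-linearity enters without a non-linear algorithm.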

Slide 7: Risk Model Evaluation: Discrimination
- Provides an estimate of population-level accuracy
- Area under the Receiver Operating Characteristic (ROC) curve
- Graphed by the sensitivity vs. 1-specificity at different thresholds
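As a minimal sketch of the discrimination measure, the area under the ROC curve can be computed without drawing the curve, because it equals the probability that a randomly chosen positive case (here, a death) receives a higher score than a randomly chosen negative case. The function name and the toy data are ours, not the study's.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positives and negatives (ties count 1/2)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a sketch but would normally be replaced by a rank-based computation on larger data.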

Slide 8: Risk Model Evaluation: Calibration
- Provides an estimation of case-level accuracy
- Hosmer-Lemeshow's Goodness-of-Fit Test
  - Primarily used in logistic regression
  - Calculates how well the observed and expected frequencies match
  - Handles data sparsity better than more common methods (Variance, Pearson's)
  - P > 0.05 is a good fit
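A rough sketch of the Hosmer-Lemeshow computation follows, assuming the common formulation: cases are sorted by predicted risk, cut into equal-size groups (ten by default), and observed and expected event counts are compared with a chi-square statistic on `groups - 2` degrees of freedom. The function names are ours, and the p-value helper uses a closed form that is only valid for an even number of degrees of freedom (which holds for the default of ten groups); this is not the routine the study's statistical software uses.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(labels, probs, groups=10):
    """Return (statistic, p_value) using equal-size groups ordered by predicted risk."""
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one case per group")
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected events in the group
        observed = sum(y for _, y in chunk)   # observed events in the group
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat, chi2_sf_even_df(stat, groups - 2)
```

A large p-value (conventionally above 0.05) means the test found no evidence that observed and expected frequencies disagree.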

Slide 9: Source Data
- Brigham & Women's Hospital
- Interventional Cardiology Database
- January 1, 2002 – October 30, 2004
- 5383 cases
  - Data split two ways, each into 2/3 training (3588) and 1/3 test (1795)
    - Sequential split: sorted chronologically, split at October 27, 2003
    - Random split
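The two splitting schemes can be sketched as follows. The record layout (a dictionary with a `"date"` key), the function names, and the fixed random seed are assumptions made for illustration; the slides do not describe how the split was implemented.

```python
import random

def sequential_split(cases, train_fraction=2 / 3, key=lambda c: c["date"]):
    """Sort chronologically, then take the earliest fraction as training data."""
    ordered = sorted(cases, key=key)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def random_split(cases, train_fraction=2 / 3, seed=0):
    """Shuffle with a fixed seed (for repeatability), then cut at the same point."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# With 5383 cases, a 2/3 cut reproduces the sizes reported on the slide.
cases = [{"date": i} for i in range(5383)]
train, test = sequential_split(cases)
print(len(train), len(test))  # 3588 1795
```

The sequential split keeps every test case later in time than every training case, which is what exposes a model to temporal drift; the random split mixes time periods across both sets.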

Slide 10: Sample Demographics Overview
Table columns: #, %. Rows: Age, Diabetic, Hypertensive, Hyperlipidemia, Prior PCI, Salvage Procedure, Cardiogenic Shock, Hemodynamic Instability, Death.
Only the Death row survives in the transcript: 78 (1.45%).

Slide 11: Model Features
Age (D), Hyperlipidemia, Hx COPD, Gender, HTN, Hx CVD, BMI (D), Diabetes, Hx PVD, Cardiogenic Shock, Creatinine (D), Thrombolytic, Cardiac arrest, Hx CHF, IABP, Hemodynamic instability, CHF, EF (D), Smoker, Prior MI, AMI, Prior CABG, Prior PCI, Procedure urgency (D), Unstable Angina, Chronic Angina, AMI Within 24 Hours

Slide 12: Logistic Regression Model Development
- STATA 8.2 (College Station, TX)
- Backwards Stepwise Technique
- Exclusion Threshold (P 0.05 – 0.15)
- Feature Selection

Slide 13: Logistic Regression Feature Selection
- Model development
  - Sequential training set
  - Stepwise backwards (P = 0.10) used for feature selection
  - Stepwise feature removal based on optimization of ROC and Hosmer-Lemeshow goodness-of-fit (HL)

Slide 14: Logistic Regression Feature Selection
Table columns: Feature, ROC, HL P. Rows (features removed from the full model): All; -BMI; -EF; -arrest; -hyperlipid; -BMI, EF; -BMI, Urgency; -BMI, Urgency, CHF Hx. Numeric values were not preserved in the transcript.

Slide 15: Logistic Regression Evaluation
Table columns: Training ROC, Training HL, Test ROC, Test HL. Rows: SEQ, RND. Only fragments of the values survive in the transcript ("<0.001" appears for both rows).

Slide 16: Support Vector Machine Model Development
- GIST (Columbia University, NY, NY)
- STATA 8.2 (College Station, TX)
- All variables used
- Kernel choice
  - Polynomial (degree 1-6)
  - Radial width factor (related to sigma) (0.1-20)
- Probabilistic output methodology
  - Discriminant: distance from hyperplane
  - LR model using the discriminant as the only feature
  - Established method to convert SVM classification output to a probability estimate
  - Allows use of HL goodness-of-fit
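The probabilistic output step, a logistic regression with the SVM discriminant as its only feature, can be sketched as below. This is a plain maximum-likelihood fit by Newton's method written for illustration; it is not the GIST or Stata procedure used in the study, and the function names, starting values, and stopping rule are our assumptions. The fit does not converge when the discriminant separates the classes perfectly.

```python
import math

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_discriminant_lr(discriminants, labels, iterations=50, tol=1e-10):
    """Fit p = sigmoid(a * d + b) by Newton's method; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for d, y in zip(discriminants, labels):
            p = sigmoid(a * d + b)
            w = p * (1.0 - p)
            g_a += (y - p) * d      # gradient of the log-likelihood
            g_b += (y - p)
            h_aa += w * d * d       # entries of the negative Hessian
            h_ab += w * d
            h_bb += w
        det = h_aa * h_bb - h_ab * h_ab
        if det <= 1e-12:            # singular system: stop rather than divide
            break
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if max(abs(step_a), abs(step_b)) < tol:
            break
    return a, b

def predict_risk(d, a, b):
    """Map a signed distance from the hyperplane to an estimated probability."""
    return sigmoid(a * d + b)
```

Because the output is a probability for each case, the same discrimination and calibration measures used for the logistic regression model can be applied to the SVM.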

Slide 17: Support Vector Machine Polynomial Evaluation (SEQ)
Table columns: Training ROC, Training HL, Test ROC, Test HL. Rows: Lin, followed by five polynomial (P) kernels. Numeric values were not preserved in the transcript.

Slide 18: Support Vector Machine Polynomial Evaluation (RND)
Table columns: Training ROC, Training HL, Test ROC, Test HL. Rows: Lin, followed by five polynomial (P) kernels. Numeric values were not preserved in the transcript.

Slide 19: Support Vector Machine Radial Evaluation (SEQ)
Table columns: Training ROC, Training HL, Test ROC, Test HL. Rows: six radial (R) kernels. Numeric values were not preserved in the transcript.

Slide 20: Support Vector Machine Radial Evaluation (RND)
Table columns: Training ROC, Training HL, Test ROC, Test HL. Rows: six radial (R) kernels. Numeric values were not preserved in the transcript.

Slide 21: Discussion: Discrimination
- All models showed excellent performance
- None of the models was significantly different in performance
- This measure was relatively insensitive to changes in data across widely variable levels of calibration

Slide 22: Discussion: LR Calibration
- For this data, LR was unable to maintain calibration. This is likely due to temporal data drift
- The LR models required manual feature selection and expert knowledge to calibrate the training data sets

Slide 23: Discussion: SVM Calibration
- Some versions of both kernel types were able to maintain calibration on both data sets
- Calibration was maintained across larger parameter ranges of both kernels for the random data set than the sequential data set
- Current assessments of discrimination and calibration on the training set are insufficient to choose the optimal kernel parameter

Slide 24: Conclusions
- SVMs could be superior to LR in terms of maintaining calibration over time in this domain
- Further exploration is needed to develop additional markers of model robustness
- Further work is needed to evaluate optimal time intervals for creating new models or recalibrating old models

Slide 25: The end