S117: Acute Setting Predictive Analytics Sharon E. Davis, MS

Presentation transcript:

Calibration Drift Among Regression and Machine Learning Models for Hospital Mortality
S117: Acute Setting Predictive Analytics
Sharon E. Davis, MS, PhD Candidate
Department of Biomedical Informatics, Vanderbilt University

Disclosure
My spouse and I have no relevant relationships with commercial interests to disclose.
AMIA 2017 | amia.org

Motivation
Evolving role of clinical prediction models
  EHR-based models have access to more data and can support more complexity in real time
  Move from classification to individual-level predicted probabilities
  Increasing focus on calibration
Discrimination
  Ability to separate cases and non-cases
  Supports risk stratification
Calibration
  Alignment between predicted and observed probabilities
  More difficult to assess, evolving metrics
Calibration hierarchy and associated metrics [1]
  Strong: agreement within each covariate pattern
  Moderate: agreement of prediction and outcome rate among similar patients
  Weak: no systematic over/underprediction or over/underfitting
  Mean: agreement on average
[1] Van Calster et al. “A calibration hierarchy for risk models was defined: from utopia to empirical data.” JCE 2016.
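The four levels of the hierarchy can be written out formally. The notation below is an assumption made for illustration and does not appear on the slide: $Y$ is the binary outcome, $X$ the covariate vector, $\hat{p}(X)$ the predicted risk, and $a$, $b$ the intercept and slope obtained by regressing the outcome on the logit of the prediction.

```latex
\begin{align*}
\text{Mean:} \quad & \mathbb{E}[Y] = \mathbb{E}[\hat{p}(X)] \\
\text{Weak:} \quad & \operatorname{logit} P(Y = 1 \mid \hat{p}) = a + b \, \operatorname{logit}(\hat{p}), \quad a = 0,\ b = 1 \\
\text{Moderate:} \quad & P(Y = 1 \mid \hat{p}(X) = p) = p \quad \text{for all } p \\
\text{Strong:} \quad & P(Y = 1 \mid X = x) = \hat{p}(x) \quad \text{for all } x
\end{align*}
```

Each level implies the ones above it in this listing, which is why the stricter levels are harder to assess with finite data.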

Motivation
Drifting performance over time, particularly in terms of calibration
Driving forces
  Population shifts in outcome rate, case mix
  Clinical practice and documentation changes (predictor-outcome associations)
Limited understanding
  Focus on logistic models, despite proliferation of more complex modeling methods and some evidence that methods impact drift
  Focus on crude measures of calibration, despite evidence that metrics impact understanding
  Limited exploration of drivers of calibration drift
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME. 2017. “Calibration Drift in Regression and Machine Learning Models for Acute Kidney Injury.” JAMIA. 24(6): 1052-1061.

Objectives
Compare performance over time of prediction models developed using common frequentist statistical methods and machine learning techniques
  Expanded set of modeling methods
  Calibration measured at multiple levels of stringency
Link shifts in patient populations with model performance to identify drivers of calibration drift across models
  Event rate shift
  Case mix shift
  Predictor-outcome association shift

Study Population
National cohort of admissions to VA facilities, 2006-2013
  Total sample size: 1,893,284 admissions
  One-year development period, multi-year validation period
Outcome: 30-day mortality after hospital admission
Predictors based on prior modeling
  Demographics (3)
  Laboratory results (20)
  Diagnoses (32)
  Vitals (7)
  Care utilization (3)
  Admission type (1)

Methods – Model Development
Training set: admissions occurring in 2006 (n = 235,548)
Modeling methods
  Logistic
  L1 regularized logistic (lasso)
  L2 regularized logistic (ridge)
  L1-L2 regularized logistic (elastic net)
  Random forest
  Neural network
Parallel models developed with each method
  Hyperparameters selected with 5-fold cross-validation
  Internal validation with 200 bootstrap iterations
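The two resampling steps named on this slide can be sketched in plain Python. This is a minimal illustration and not the study's code: the helper names (`kfold_indices`, `cv_select`, `bootstrap_corrected`), the toy constant-risk model, and the Brier score are all hypothetical. The slide does not say how the bootstrap iterations were used, so optimism correction is assumed here.

```python
import random

def kfold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_select(data, candidates, fit, score, k=5, seed=0):
    """Return the candidate hyperparameter with the lowest mean held-out score."""
    folds = kfold_indices(len(data), k, seed)
    best, best_score = None, float("inf")
    for c in candidates:
        total = 0.0
        for held_out in folds:
            held = set(held_out)
            train = [d for i, d in enumerate(data) if i not in held]
            test = [data[i] for i in held_out]
            total += score(fit(train, c), test)
        if total / k < best_score:
            best, best_score = c, total / k
    return best

def bootstrap_corrected(data, fit_final, score, n_boot=200, seed=0):
    """Apparent score minus the mean bootstrap optimism (assumed usage)."""
    rng = random.Random(seed)
    apparent = score(fit_final(data), data)
    optimism = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]
        model = fit_final(boot)
        # Optimism: score on the bootstrap sample minus score on the original data.
        optimism += score(model, boot) - score(model, data)
    return apparent - optimism / n_boot

def fit_rate(train, shrink):
    """Toy 'model': a constant risk, shrunk toward 0.5 by `shrink` pseudo-observations."""
    return (sum(train) + 0.5 * shrink) / (len(train) + shrink)

def brier(model, test):
    """Mean squared error of the constant predicted risk (lower is better)."""
    return sum((model - yi) ** 2 for yi in test) / len(test)

# Toy outcomes: 10 deaths among 50 admissions.
outcomes = [1] * 10 + [0] * 40
best = cv_select(outcomes, [0, 5, 50], fit_rate, brier)
corrected = bootstrap_corrected(outcomes, lambda d: fit_rate(d, best), brier)
```

In practice `fit` would wrap one of the six modeling methods and the candidates would be that method's hyperparameter grid, such as the penalty strength for the lasso.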

Methods – Model Validation
Validation period: 2007-2013 (n = 1,657,736)

Methods – Model Validation
Discrimination: AUC
Calibration (hierarchy levels: Strong, Moderate, Weak, Mean)
  Moderate: calibration plot; metrics: flexible calibration curves, estimated calibration index (ECI) [1]
  Weak: metrics: Cox recalibration intercept and slope
  Mean: metric: observed to expected outcome ratio (O:E)
[1] Van Hoorde et al. “A spline-based tool to assess and visualize the calibration of multiclass risk predictions.” JBI. 2015.
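Three of the metrics listed on this slide are simple enough to sketch without any libraries. The code below is an illustrative sketch with hypothetical function names and toy data, not the study's implementation. AUC is computed by pairwise comparison, O:E as a ratio of sums, and the Cox recalibration intercept and slope by fitting a two-parameter logistic regression of the outcome on the logit of the predicted risk with Newton's method.

```python
import math

def auc(y, p):
    """Probability that a random case outscores a random non-case (ties count 0.5)."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def observed_to_expected(y, p):
    """Observed events divided by the sum of predicted risks (ideal value 1)."""
    return sum(y) / sum(p)

def cox_recalibration(y, p, iters=25):
    """Fit logit P(y=1) = a + b * logit(p) by Newton's method; ideal is a=0, b=1."""
    x = [math.log(pi / (1.0 - pi)) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Toy data: 10 admissions at each predicted risk, with observed deaths that
# match the predictions exactly, so the model is calibrated by construction.
risks, deaths = [0.2, 0.5, 0.8], [2, 5, 8]
p = [r for r in risks for _ in range(10)]
y = [1 if i < d else 0 for d in deaths for i in range(10)]

# Overfit variant: the same ranking, but every logit is doubled (too extreme).
p_extreme = [1.0 / (1.0 + math.exp(-2.0 * math.log(r / (1.0 - r)))) for r in p]

oe = observed_to_expected(y, p)
intercept, slope = cox_recalibration(y, p)
intercept_x, slope_x = cox_recalibration(y, p_extreme)
```

The overfit variant has the same AUC as the calibrated predictions, because the ranking is unchanged, but its recalibration slope falls to about 0.5. This is the sense in which discrimination and calibration measure different things.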

Methods – Extending Flexible Calibration Curves
Determining regions of calibration
Tracking regions over time
Rescaling regions by volume of data
  Data concentrated at low predicted probabilities
  Rescaled regions of calibration by proportion of observations in each region
  Assessed magnitude of miscalibration using within-region ECIs
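The rescaling idea can be illustrated with a rough sketch. Several simplifications here are assumptions and not the method on the slide. Equal-width bins stand in for the flexible (spline or loess) calibration curve. A fixed tolerance `tol` stands in for whatever rule the authors used to decide whether a region is calibrated. The within-region index is an unscaled mean squared gap between predicted risk and the bin's observed rate, in the spirit of the ECI.

```python
def regions_of_calibration(y, p, n_bins=10, tol=0.05):
    """Bin by predicted risk, label each bin, and summarise miscalibration.

    Returns (share, eci): share[label] is the proportion of all observations
    falling in bins with that label; eci[label] is the mean squared gap between
    predicted risk and the bin's observed rate within those bins.
    """
    bins = {}
    for yi, pi in zip(y, p):
        b = min(int(pi * n_bins), n_bins - 1)
        bins.setdefault(b, []).append((yi, pi))

    counts, sq_gap = {}, {}
    for members in bins.values():
        obs = sum(yi for yi, _ in members) / len(members)
        mean_pred = sum(pi for _, pi in members) / len(members)
        if mean_pred - obs > tol:
            label = "overpredicted"
        elif obs - mean_pred > tol:
            label = "underpredicted"
        else:
            label = "calibrated"
        counts[label] = counts.get(label, 0) + len(members)
        sq_gap[label] = sq_gap.get(label, 0.0) + sum(
            (pi - obs) ** 2 for _, pi in members)

    n_total = len(y)
    share = {k: v / n_total for k, v in counts.items()}
    eci = {k: sq_gap[k] / counts[k] for k in counts}
    return share, eci

# Toy data: most admissions sit at low predicted risk and are well calibrated;
# a smaller group at predicted risk 0.55 has an observed rate of only 0.20.
p = [0.05] * 20 + [0.55] * 10
y = [1] + [0] * 19 + [1, 1] + [0] * 8
share, eci = regions_of_calibration(y, p)
```

On the probability axis the overpredicted region would look wide, but `share` shows it holds only a third of the observations. That is the point of rescaling by data volume when predictions are concentrated at low probabilities.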

Methods – Population Data Shifts
Differences that arise over time between the population on which the model was developed and the population on which the model is applied
Event rate shift
  Definition: changes in the prevalence of the outcome
  Assessment: distribution of mortality rate over time
Case mix shift
  Definition: changes in the distribution of risk factors in the population
  Assessment: distributions of predictors over time; discrimination and model structure of membership models [1]
Predictor-outcome association shifts
  Definition: changes in form or magnitude of relationships between risk factors and the outcome
  Assessment: changes in the structure of models refit in each 3-month period
[1] Debray et al. “A new framework to enhance the interpretation of external validation studies of clinical prediction models.” JCE 2015.
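A membership model, as used for the case mix assessment, is a classifier trained to tell development-period records from validation-period records using only the predictors. An AUC near 0.5 suggests similar case mix, and higher values suggest a shift. The sketch below is a hypothetical pure-Python illustration with made-up data. The gradient-ascent logistic regression and the in-sample AUC are simplifying assumptions, not the study's procedure.

```python
import math

def fit_logistic(X, z, lr=0.1, epochs=500):
    """Plain gradient-ascent logistic regression: intercept plus one weight per column."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for row, zi in zip(X, z):
            eta = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = zi - 1.0 / (1.0 + math.exp(-eta))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    """Predicted probability of belonging to the validation period."""
    return [1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)))))
            for row in X]

def auc(labels, scores):
    """Pairwise AUC with ties counted as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def membership_auc(dev_X, val_X):
    """Discrimination of a model separating development (0) from validation (1) records."""
    X = dev_X + val_X
    z = [0] * len(dev_X) + [1] * len(val_X)
    return auc(z, predict(fit_logistic(X, z), X))

# Toy single-predictor data: an unchanged case mix versus a clearly shifted one.
dev = [[i / 10] for i in range(10)]
same = [[i / 10] for i in range(10)]
shifted = [[i / 10 + 2.0] for i in range(10)]

auc_same = membership_auc(dev, same)
auc_shifted = membership_auc(dev, shifted)
```

Repeating this for each validation window would give a membership-model discrimination curve over time. Inspecting the fitted weights indicates which predictors drive the shift.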

Discrimination Over Time

Calibration Over Time – Observed to Expected Outcome Ratio
Ideal value: 1

Calibration Over Time – Estimated Calibration Index
Ideal value: 0

Calibration Over Time – Rescaled Regions of Calibration

Linking Population Shifts and Performance
Mortality rate declined from 5.0% to 4.8% over study period
Outcome rate shift dominated by seasonal variation in mortality rate
Limited evidence of predictor-outcome association shifts
Case mix shift occurred throughout the validation period
[Figure: Membership Model Discrimination]
[Table: susceptibility (High / Moderate / Low) of each model (logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network) to outcome rate shift, case mix shift, and association shift; cell values not preserved in the transcript]

Integration With Additional Research
Corresponding analysis in separate cohort experiencing different population data shifts
Acute kidney injury models experienced
  Event rate shift
  Case mix shift
  Predictor-outcome association shift
[Figure: Age inclusion in L1 penalized logistic regression model for AKI]
[Figure: Estimated calibration index for AKI models]
[Table: susceptibility (High / Moderate / Low) of each model (logistic regression, L1 logistic, L2 logistic, L1-L2 logistic, random forest, neural network) to outcome rate shift, case mix shift, and association shift; cell values not preserved in the transcript]
Davis SE, Lasko TA, Chen G, Siew ED, Matheny ME. 2017. “Calibration Drift in Regression and Machine Learning Models for Acute Kidney Injury.” JAMIA. 24(6): 1052-1061.

Limitations
Statistical versus clinically meaningful miscalibration
Additional modeling methods remain to be considered
Limited population data shift scenarios explored
  Variable magnitudes of population data shifts
  Variable combinations of event rate, case mix, association shifts

Conclusions
Calibration drift varies by modeling method and form of underlying population shifts
Selection of calibration metrics impacts understanding of drift across methods
Model updating strategies will need to be tailored to the unique vulnerabilities of modeling methods

Funding Agencies
  VA HSR&D IIR 11-292
  VA HSR&D IIR 13-052
  NLM 5T15LM007450
  NLM 1R21LM011664-01
Research Team
  Michael Matheny, MD, MS, MPH
  Guanhua Chen, PhD
  Thomas Lasko, MD, PhD
Department of Biomedical Informatics

Thank you!
Email: sharon.e.davis@vanderbilt.edu
Department of Biomedical Informatics