Assessing the additional value of diagnostic markers: a comparison of traditional and novel measures
Ewout W. Steyerberg, Professor of Medical Decision Making
Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands
Birmingham, July 2, 2010

Introduction: additional value of a diagnostic marker
- Usefulness / clinical utility: what do we mean exactly?
- Evaluation of predictions
  - Ordering: concordance statistic (c, or AUC)
- Evaluation of decisions
  - Net Reclassification Index (NRI): very popular
  - Net Benefit (NB): decision-analytic, not popular
- Adding a marker to a model
  - Statistical significance? Simple LR testing; not an issue
  - Clinical usefulness: measurement worth the costs?

Overview
Hypotheses:
- NRI is closely related to AUC
- NRI may be misleading

Addition of a marker to a model
- Typically a small improvement in discriminative ability according to the c statistic
- The c statistic is blamed for being insensitive
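As a rough illustration of how modest such gains usually are, the following minimal sketch (in Python, assuming numpy and scikit-learn are available) fits a logistic regression with and without an added marker and compares the apparent c statistics. The data are simulated, and the variable names and effect sizes are illustrative assumptions, not the case-study data.

```python
# Minimal sketch: change in c statistic (AUC) when a marker is added to a model.
# Simulated data only; effect sizes and names are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 2000
x_ref = rng.normal(size=n)              # predictor already in the reference model
marker = rng.normal(size=n)             # new marker under evaluation
lin_pred = -1.0 + 1.2 * x_ref + 0.4 * marker
y = rng.binomial(1, 1 / (1 + np.exp(-lin_pred)))

ref = LogisticRegression().fit(x_ref.reshape(-1, 1), y)
ext = LogisticRegression().fit(np.column_stack([x_ref, marker]), y)

auc_ref = roc_auc_score(y, ref.predict_proba(x_ref.reshape(-1, 1))[:, 1])
auc_ext = roc_auc_score(y, ext.predict_proba(np.column_stack([x_ref, marker]))[:, 1])
print(f"c reference = {auc_ref:.3f}, c extended = {auc_ext:.3f}, delta c = {auc_ext - auc_ref:.3f}")
```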

The Net Reclassification Index:
NRI = [P(move up | event) - P(move down | event)] + [P(move down | non-event) - P(move up | non-event)]
    = improvement in sensitivity + improvement in specificity
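A minimal sketch of how this two-component NRI could be computed at a single risk cut-off, assuming predicted risks from a reference and an extended model are available; the function, its inputs, and the toy numbers are illustrative assumptions rather than the case-study data.

```python
# Minimal sketch of the Net Reclassification Index at a single risk cut-off.
# y is the binary outcome; p_old / p_new are predicted risks from the reference
# and extended models. All numbers below are illustrative assumptions.
import numpy as np

def nri(y, p_old, p_new, cutoff):
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    up, down = p_new > cutoff, p_new <= cutoff
    was_up, was_down = p_old > cutoff, p_old <= cutoff
    move_up = was_down & up
    move_down = was_up & down
    event, nonevent = y == 1, y == 0
    nri_events = move_up[event].mean() - move_down[event].mean()           # = delta sensitivity
    nri_nonevents = move_down[nonevent].mean() - move_up[nonevent].mean()  # = delta specificity
    return nri_events + nri_nonevents, nri_events, nri_nonevents

y     = np.array([1, 1, 1, 0, 0, 0, 0, 0])
p_old = np.array([0.6, 0.4, 0.3, 0.6, 0.4, 0.3, 0.2, 0.1])
p_new = np.array([0.7, 0.6, 0.3, 0.4, 0.3, 0.6, 0.2, 0.1])
total, ev, ne = nri(y, p_old, p_new, cutoff=0.5)
print(f"NRI = {total:.2f} (events {ev:+.2f}, non-events {ne:+.2f})")
```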

Pencina example

[Pencina example, reclassification table: event component .../183 = 12%; non-event component 1/3081 = 0.03%]

Enthusiasm

History of NRI
1. Many object to the AUC
2. Cook: reclassification provides insight
3. Pencina: net reclassification is what counts
4. Many: enthusiasm
5. Objections: 8 letters to the editor (LTTEs), Stat Med 2008
   a) Relationships to other measures; reply: agree
   b) Greenland + Vickers/Steyerberg: need to weight consequences; reply: implicit weighting by prevalence

5a) NRI 'a better measure'?
- NRI requires classification
- Simplest case: binary (high vs low risk)
- If binary, easy to calculate sensitivity and specificity
- NRI = delta sens + delta spec, which reminds us of the Youden index
- Youden index = sens + spec - 1
- Hence NRI = delta Youden index

NRI better than AUC?
- For a binary (single cut-off) test, the ROC curve is a single point: AUC = (sens + spec) / 2
- NRI = delta sens + delta spec
- Hence NRI = 2 x delta AUC!
- Conclusion: the NRI is misleading in claiming to be 'better' than the AUC:
  1. it moves from predictions to classifications
  2. it is simply 2 x delta AUC
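A small numeric check of this identity, under the assumption that the models are compared at a single cut-off so each reduces to a binary test with AUC = (sens + spec) / 2; the sensitivities and specificities below are made up for illustration.

```python
# Numeric check for a single cut-off (binary) test: AUC = (sens + spec) / 2,
# so NRI = delta sens + delta spec = 2 x delta AUC.
# The sensitivities and specificities are illustrative assumptions.
sens_old, spec_old = 0.70, 0.80
sens_new, spec_new = 0.75, 0.83

auc_old = (sens_old + spec_old) / 2
auc_new = (sens_new + spec_new) / 2
nri = (sens_new - sens_old) + (spec_new - spec_old)

print(f"delta AUC = {auc_new - auc_old:.3f}")
print(f"NRI       = {nri:.3f} = 2 x delta AUC = {2 * (auc_new - auc_old):.3f}")
```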

5b) Weighting 'absurd'

Evaluation of decisions
- Clinically meaningful cut-off (or threshold) for the predicted probability: p_t
- p_t reflects the relative weight of true-positive vs false-positive decisions, e.g. if p_t = 50%, wTP = wFP; if p_t = 20%, wTP = 4 x wFP
- Net Benefit: NB = (TP - w x FP) / N, with w = harm / benefit = p_t / (1 - p_t) (Peirce 1884; Vickers 2006)
  - If p_t = 50%, w = 0.5 / (1 - 0.5) = 1; if p_t = 20%, w = 0.2 / (1 - 0.2) = 1/4
- Net Reclassification Index:
  - NRI = improvement in sensitivity + improvement in specificity
  - Implicit weighting by the non-event odds: (1 - Prevalence) / Prevalence
  - Hence inconsistent with Net Benefit if p_t ≠ Prevalence
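A minimal sketch of the Net Benefit calculation with the threshold-implied weight w = p_t / (1 - p_t), using the two thresholds from the slide; the classification counts are illustrative assumptions, not the case-study data.

```python
# Minimal sketch of Net Benefit at a probability threshold p_t (Vickers & Elkin 2006).
# The counts are illustrative assumptions, not the case-study data.
def net_benefit(tp, fp, n, p_t):
    w = p_t / (1 - p_t)          # harm/benefit weight implied by the threshold
    return (tp - w * fp) / n

n = 1000
tp, fp = 450, 120                # classifications of one model at the chosen threshold

for p_t in (0.50, 0.20):
    print(f"p_t = {p_t:.2f}: w = {p_t / (1 - p_t):.2f}, NB = {net_benefit(tp, fp, n, p_t):.3f}")

# The NRI, by contrast, weights event vs non-event reclassifications by the
# non-event odds (1 - prevalence) / prevalence, so it agrees with Net Benefit
# only when the threshold p_t equals the prevalence.
```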

Overview

Case study: testicular cancer
- Prediction of residual tumor after chemotherapy
- N = 544, 299 with tumor (55%)
- Reference models:
  - postchemotherapy mass size
  - ... + reduction in size + primary histology
- 3 tumor markers:
  - AFP: abnormal vs normal
  - HCG: abnormal vs normal
  - LDH: abnormal vs normal, and continuous: log(LDH)

Evaluation of predictions
- The likelihood ratio (LR) statistic and the AUC (c) show the same pattern
- The reference model matters; dichotomization of markers harms performance

Evaluation of decisions at 20% and 55% thresholds
- Net Benefit and NRI are consistent at the 55% threshold (= prevalence), but not at the 20% threshold (see the illustration below)
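A numerical illustration of this point, using the case-study prevalence (299 events, 245 non-events) but hypothetical reclassification counts; applying the same counts at both thresholds is a simplification, since in practice reclassification differs per threshold.

```python
# Hypothetical reclassification counts (not the actual case-study results):
# the added marker loses 20 true positives among the 299 patients with tumor
# and gains 30 true negatives among the 245 without tumor.
events, nonevents = 299, 245     # case study: N = 544, prevalence 55%
d_tp, d_tn = -20, +30            # hypothetical changes in TP and TN counts

nri = d_tp / events + d_tn / nonevents          # delta sens + delta spec

def d_net_benefit(d_tp, d_tn, n, p_t):
    w = p_t / (1 - p_t)
    return (d_tp + w * d_tn) / n                # change in (TP - w*FP)/N, since dFP = -dTN

n = events + nonevents
for p_t in (0.20, 0.55):
    print(f"p_t = {p_t:.2f}: NRI = {nri:+.3f}, delta NB = {d_net_benefit(d_tp, d_tn, n, p_t):+.4f}")

# At p_t = prevalence, delta NB = prevalence * NRI, so the two measures always
# agree in sign there; at p_t = 20% they can point in opposite directions,
# as these hypothetical counts show.
```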

Conclusions
1. The judgment of additional value depends on the measure chosen, the reference model, and the coding of the marker.
2. A decision-analytic perspective is compatible neither with an overall judgment as obtained from the AUC in ROC analysis, nor with the NRI.
3. The current practice of reporting AUC and NRI as measures of usefulness needs to be replaced by routinely reporting net benefit analyses.
4. Further work:
   - NRI and NB for 2 decision thresholds, e.g. CVD at 5% and 20%
   - link to decision analysis / cost-effectiveness analysis

References
- Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006;26:565-74.
- Steyerberg EW, Vickers AJ. Decision curve analysis: a discussion. Med Decis Making 2008;28:146.
- Steyerberg EW, Vickers AJ, Cook NR, et al. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiology, Jan 2010.

From 1 cutoff to consecutive cutoffs
- Sensitivity and specificity → ROC curve
- Net benefit → decision curve

ROC curves

Decision curves
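Finally, a sketch of how a decision curve could be traced: compute Net Benefit over consecutive thresholds and compare it with the treat-all and treat-none strategies. The predicted risks below are simulated stand-ins, not the case-study models; plotting net benefit against p_t for the three strategies gives the decision curve.

```python
# Sketch of a decision curve: Net Benefit across consecutive thresholds,
# for a model vs treat-all vs treat-none. Simulated risks, illustrative only.
import numpy as np

def net_benefit(y, p, p_t):
    treat = p >= p_t
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    n = len(y)
    return tp / n - (fp / n) * p_t / (1 - p_t)

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
p_model = 1 / (1 + np.exp(-(0.2 + 1.0 * x)))   # "predicted" risks (true model, for illustration)
y = rng.binomial(1, p_model)

for p_t in np.arange(0.05, 0.95, 0.05):
    nb_model = net_benefit(y, p_model, p_t)
    nb_all = net_benefit(y, np.ones(n), p_t)   # treat everyone regardless of risk
    nb_none = 0.0                              # treat no one
    print(f"p_t = {p_t:.2f}: model = {nb_model:+.3f}, treat-all = {nb_all:+.3f}, treat-none = {nb_none:+.3f}")
```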