The receiver operating characteristic (ROC) curve


DIAGNOSIS: the pathway of a diagnostic test from bench to bedside. Basic residential course. The receiver operating characteristic (ROC) curve. Giovanni Casazza. April 4-8, 2017, Palazzo Feltrinelli, Gargnano, Lago di Garda, Italy.

Outline. Effect of cut-off variation on sensitivity and specificity; graphical representation of the relationship between sensitivity and specificity (ROC curve); a summary measure of the overall accuracy (AUC); reading a ROC curve.

A diagnostic accuracy study: spleen stiffness. Spleen stiffness: continuous measurement.

Continuous index test results

Continuous index test results. Sensitivity: 23/24 = 95.8%. Among the n=24 patients with any EV, 23/24 test + and 1/24 test –.

Continuous index test results. Specificity: 28/36 = 77.8%. Among the n=36 patients without EV, 8/36 test + and 28/36 test –.

Continuous index test results. Sensitivity: 23/24 = 95.8%; specificity: 28/36 = 77.8%.

          Any EV +    Any EV –    Tot
Test +    23 (TP)     8 (FP)      31
Test –    1 (FN)      28 (TN)     29
Tot       24          36
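The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `sens_spec` is hypothetical, and the counts are the four cells of the 2x2 table shown above.

```python
def sens_spec(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # proportion of patients with EV who test positive
    specificity = tn / (tn + fp)  # proportion of patients without EV who test negative
    return sensitivity, specificity


# Cells of the 2x2 table on the slide: TP=23, FP=8, FN=1, TN=28
sens, spec = sens_spec(tp=23, fp=8, fn=1, tn=28)
print(f"Sensitivity: {sens:.1%}  Specificity: {spec:.1%}")
```

With these counts the printed values should agree with the 95.8% and 77.8% reported on the slide.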

Continuous index test results, cut-off 3.95. Sensitivity: 6/24 = 25% (6/24 test +, 18/24 test –). Specificity: 36/36 = 100% (0/36 test +, 36/36 test –).

Continuous index test results, cut-off 3.75. Sensitivity: 9/24 = 37.5% (9/24 test +, 15/24 test –). Specificity: 35/36 = 97.2% (1/36 test +, 35/36 test –).

Continuous index test results, cut-off 3.68. Sensitivity: 11/24 = 45.8% (11/24 test +, 13/24 test –). Specificity: 35/36 = 97.2% (1/36 test +, 35/36 test –).

Continuous index test results, cut-off 3.59. Sensitivity: 15/24 = 62.5% (15/24 test +, 9/24 test –). Specificity: 34/36 = 94.4% (2/36 test +, 34/36 test –).

Continuous index test results, cut-off 3.25. Sensitivity: 23/24 = 95.8% (23/24 test +, 1/24 test –). Specificity: 22/36 = 61.1% (14/36 test +, 22/36 test –).

Continuous index test results, cut-off 3.00. Sensitivity: 24/24 = 100% (24/24 test +, 0/24 test –). Specificity: 10/36 = 27.8% (26/36 test +, 10/36 test –).

Continuous index test results, cut-off 3.15. Sensitivity: 24/24 = 100% (24/24 test +, 0/24 test –). Specificity: 18/36 = 50% (18/36 test +, 18/36 test –).

The threshold. 2x2 tables for each spleen stiffness (SS) threshold; in every table the column totals are Any EV + = 24 and Any EV – = 36.

Threshold   Test +: EV +   Test +: EV –   Test + total   Test –: EV +   Test –: EV –   Test – total
SS >3.95    6              0              6              18             36             54
SS >3.75    9              1              10             15             35             50
SS >3.68    11             1              12             13             35             48
SS >3.59    15             2              17             9              34             43
SS >3.36    23             8              31             1              28             29
SS >3.25    23             14             37             1              22             23
SS >3.15    24             18             42             0              18             18
SS >3.00    24             26             50             0              10             10

Summary of thresholds - Table

Cut-off value   Test +   Test –   Sensitivity (%)   Specificity (%)
3.95            6        54       25                100
3.75            10       50       37.5              97.2
3.68            12       48       45.8              97.2
3.59            17       43       62.5              94.4
3.36            31       29       95.8              77.8
3.25            37       23       95.8              61.1
3.15            42       18       100               50
3.00            50       10       100               27.8
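As a rough sketch, the summary table can be rebuilt from the true-positive and false-positive counts at each cut-off. The variable names (`counts`, `summary`, `N_DISEASED`, `N_NON_DISEASED`) are hypothetical; the counts themselves are those reported on the preceding slides.

```python
N_DISEASED, N_NON_DISEASED = 24, 36  # patients with and without any EV

# cut-off -> (true positives, false positives), taken from the slides
counts = {
    3.95: (6, 0), 3.75: (9, 1), 3.68: (11, 1), 3.59: (15, 2),
    3.36: (23, 8), 3.25: (23, 14), 3.15: (24, 18), 3.00: (24, 26),
}

summary = []
for cutoff, (tp, fp) in counts.items():
    test_pos = tp + fp
    test_neg = N_DISEASED + N_NON_DISEASED - test_pos
    sens = 100 * tp / N_DISEASED                     # sensitivity in percent
    spec = 100 * (N_NON_DISEASED - fp) / N_NON_DISEASED  # specificity in percent
    summary.append((cutoff, test_pos, test_neg, round(sens, 1), round(spec, 1)))

print("Cut-off  Test+  Test-  Sens(%)  Spec(%)")
for cutoff, pos, neg, sens, spec in summary:
    print(f"{cutoff:7.2f}  {pos:5d}  {neg:5d}  {sens:7.1f}  {spec:7.1f}")
```

Working from counts rather than rounded percentages keeps the derived columns consistent with each other.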

Trade-off between sensitivity and specificity. Unfortunately, as specificity increases, sensitivity decreases, and as sensitivity increases, specificity decreases. Example with nine patients (test + if SS exceeds the cut-off):

pt   SS     cut-off 3.15   cut-off 3.59   cut-off 3.95
1    3.02   -              -              -
2    3.14   -              -              -
3    3.25   +              -              -
4    3.40   +              -              -
5    3.65   +              +              -
6    3.80   +              +              -
7    3.98   +              +              +
8    4.05   +              +              +
9    4.40   +              +              +

As the cut-off increases, only patients with higher SS values are classified as positive: fewer (true and false) positive patients, more (true and false) negative patients. Fewer true positives: sensitivity decreases, Sens=TP/(TP+FN). Fewer false positives: specificity increases, Spec=TN/(TN+FP).

As the cut-off decreases, only patients with lower SS values are classified as negative: fewer (true and false) negative patients, more (true and false) positive patients. More true positives: sensitivity increases, Sens=TP/(TP+FN). More false positives: specificity decreases, Spec=TN/(TN+FP).
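The effect of moving the cut-off can be sketched with the nine SS values from this slide. The sketch assumes a patient tests positive when SS is strictly greater than the cut-off (as in the "SS >" table headings); the disease status of these nine patients is not given, so only the number of test positives is counted. The names `ss_values`, `classify` and `positives` are hypothetical.

```python
# SS values of the nine example patients on the slide
ss_values = [3.02, 3.14, 3.25, 3.40, 3.65, 3.80, 3.98, 4.05, 4.40]


def classify(values, cutoff):
    """Label each value '+' if it exceeds the cut-off, '-' otherwise (assumed rule)."""
    return ["+" if v > cutoff else "-" for v in values]


# Number of test-positive patients at each of the three cut-offs
positives = {c: classify(ss_values, c).count("+") for c in (3.15, 3.59, 3.95)}

for cutoff, n_pos in positives.items():
    print(f"cut-off {cutoff}: {n_pos} test +, {len(ss_values) - n_pos} test -")
```

Raising the cut-off can only move patients from positive to negative, which is why true and false positives both shrink together.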

Summary of thresholds - Graphic

Cut-off value   Sensitivity   Specificity   1 - specificity
3.95            0.250         1.000         0.000
3.75            0.375         0.972         0.028
3.68            0.458         0.972         0.028
3.59            0.625         0.944         0.056
3.36            0.958         0.778         0.222
3.25            0.958         0.611         0.389
3.15            1.000         0.500         0.500
3.00            1.000         0.278         0.722

[Plot: one point per cut-off; axis label: Specificity, scale 1.0 to 0]
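A minimal sketch of how the tabulated pairs become points on the plot: each cut-off contributes one point with 1 - specificity on one axis and sensitivity on the other. The names `rows` and `roc_points` are hypothetical; the values are those in the table.

```python
# (cut-off, sensitivity, specificity) as listed in the table
rows = [
    (3.95, 0.250, 1.000), (3.75, 0.375, 0.972), (3.68, 0.458, 0.972),
    (3.59, 0.625, 0.944), (3.36, 0.958, 0.778), (3.25, 0.958, 0.611),
    (3.15, 1.000, 0.500), (3.00, 1.000, 0.278),
]

# One (1 - specificity, sensitivity) point per cut-off, ordered left to right
roc_points = sorted((round(1 - spec, 3), sens) for _, sens, spec in rows)

for fpr, tpr in roc_points:
    print(f"1 - specificity = {fpr:.3f}   sensitivity = {tpr:.3f}")
```

Sorting by 1 - specificity orders the points from the strictest cut-off (3.95) to the most lenient (3.00).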

Summary of thresholds - Graphic. If we do the same for all the possible cut-off values... [Plot; axis label: Specificity, scale 1.0 to 0.0]

The ROC curve. This curve is known as the Receiver Operating Characteristic (ROC) curve. [Plot; axis label: Specificity, scale 1.0 to 0.0]

The ROC curve. Two cut-offs are marked on the curve: cut-off 3.36 (Sens=0.958, Spec=0.778) and cut-off 3.59. [Plot; axis label: Specificity, scale 1.0 to 0.0]

The ROC curve. The area under the ROC curve (AUC) is a (summary) measure of diagnostic accuracy. AUC is a measure of the ability of the continuous index test to discriminate between diseased and non-diseased. AUC=0.937. [Plot: ROC curve]
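One common way to approximate the area under a ROC curve is the trapezoidal rule. The sketch below applies it to the eight cut-offs tabulated earlier, with the curve anchored at (0, 0) and (1, 1). It is an assumption that this is comparable to how the slide's AUC was obtained: with only eight cut-offs the result is an approximation and need not equal the 0.937 reported on the slide. All names are hypothetical.

```python
N_DISEASED, N_NON_DISEASED = 24, 36

# (true positives, false positives) at the eight cut-offs from the slides
tp_fp = [(6, 0), (9, 1), (11, 1), (15, 2), (23, 8), (23, 14), (24, 18), (24, 26)]

# ROC points as (1 - specificity, sensitivity), anchored at both corners
points = [(0.0, 0.0)]
points += [(fp / N_NON_DISEASED, tp / N_DISEASED) for tp, fp in tp_fp]
points += [(1.0, 1.0)]
points.sort()

# Trapezoidal rule: sum of width * mean height over consecutive points
auc = sum((x1 - x0) * (y0 + y1) / 2
          for (x0, y0), (x1, y1) in zip(points, points[1:]))

print(f"Trapezoidal AUC from eight cut-offs: {auc:.3f}")
```

Using more cut-offs (ideally every observed SS value) would trace the curve more finely and change the estimate.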

The ROC curve. Individual patients plot; box plot.

The ROC curve HVPG vs LS for a Target Condition: which of the two has the higher AUC?

The ROC curve Platelet count/spleen diameter ratio: proposal and validation of a non-invasive parameter to predict the presence of oesophageal varices in patients with liver cirrhosis Gut 2003;52:1200–1205

The ROC curve. The perfect test: sensitivity and specificity both 100%. [Plot: ROC curve of the perfect test]

The ROC curve. A new index test with sensitivity 99% and specificity 1%. Is that test useful for … … ? The worthless test … like flipping a coin. What is the value of the LRs? LR+ = 1 for each point of the curve; LR– = 1 for each point of the curve. [Plot: ROC curve of the worthless test, with the point sensitivity 0.99, specificity 0.01 marked]
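The claim that both likelihood ratios equal 1 can be checked with the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The function name `likelihood_ratios` is hypothetical; the example values are those on the slide.

```python
import math


def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) for a given sensitivity and specificity."""
    lr_pos = sens / (1 - spec)   # how much a positive result raises the odds of disease
    lr_neg = (1 - sens) / spec   # how much a negative result lowers the odds of disease
    return lr_pos, lr_neg


# The test from the slide: sensitivity 99%, specificity 1%
lr_pos, lr_neg = likelihood_ratios(0.99, 0.01)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")

# Any other point on the diagonal (sensitivity = 1 - specificity) behaves the same way
print(all(math.isclose(lr, 1.0) for lr in likelihood_ratios(0.30, 0.70)))
```

A likelihood ratio of 1 leaves the pre-test odds unchanged, which is why such a test carries no diagnostic information despite its 99% sensitivity.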

The ROC curve. Reading the results of a study: Correlation of platelets count with endoscopic findings in a cohort of Egyptian patients with liver cirrhosis. Medicine (2016) 95:23.

The ROC curve. Reading a ROC curve: choosing the cut-off value.

Take home points. The ROC curve is a summary of the (sensitivity, specificity) pairs at each cut-off. Do not give too much importance to the value of the AUC: "read" the whole curve. Assess whether, and how much, the test is useful to rule in or to rule out the target condition. The AUC may be useful to compare the overall accuracy of two or more tests.