Diagnostic Testing
Ethan Cowan, MD, MS
Department of Emergency Medicine, Jacobi Medical Center
Department of Epidemiology and Population Health, Albert Einstein College of Medicine

The Provider Dilemma
• A 26-year-old pregnant female presents after twisting her ankle. She has no abdominal or urinary complaints. The nurse sends a UA and Uricult dipslide before you see the patient. What should you do with the results of these tests?

The Provider Dilemma
• Should a provider give antibiotics if one or both of these tests come back positive?

Why Order a Diagnostic Test?
• When the diagnosis is uncertain
• When an incorrect diagnosis leads to clinically significant morbidity or mortality
• When the test result changes management
• When the test is cost effective

Clinician Thought Process
• The clinician derives the patient's prior probability of disease from:
  • History & physical
  • The literature
  • Experience
• "Index of suspicion": 0%-100%, or "low, medium, high"

Threshold Approach to Diagnostic Testing
• P < P(-): Dx testing and therapy not indicated
• P(-) < P < P(+): Dx testing needed prior to therapy
• P > P(+): intervention only, no further testing needed
[Figure: probability-of-disease scale from 0% to 100%, with a "testing zone" between P(-) and P(+)]
Pauker and Kassirer, 1980; Gallagher, 1998
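A minimal sketch in Python of the threshold rule above; the function name and the probability values passed in are illustrative, not from the talk.

```python
# Sketch of the testing-zone decision rule: compare the pretest probability
# against the no-test threshold P(-) and the test-treatment threshold P(+).

def threshold_decision(prior: float, p_minus: float, p_plus: float) -> str:
    """Classify a pretest probability against the two testing thresholds."""
    if prior < p_minus:
        return "No testing or therapy indicated"
    if prior > p_plus:
        return "Treat without further testing"
    return "Diagnostic testing needed before therapy"

# Hypothetical numbers, chosen only to exercise each branch.
print(threshold_decision(prior=0.30, p_minus=0.05, p_plus=0.80))
```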

Threshold Approach to Diagnostic Testing
• The width of the testing zone depends on:
  • Test properties
  • Risk of excess morbidity/mortality attributable to the test
  • Risk/benefit ratio of the available therapies for the Dx
[Figure: probability-of-disease scale from 0% to 100%, with the testing zone between P(-) and P(+)]
Pauker and Kassirer, 1980; Gallagher, 1998

Test Characteristics
• Reliability
  • Inter-observer
  • Intra-observer
  • Correlation
  • Bland & Altman plot
  • Simple agreement
  • Kappa statistics
• Validity
  • Sensitivity
  • Specificity
  • NPV
  • PPV
  • ROC curves

Reliability
• The extent to which results obtained with a test are reproducible.

Reliability
[Figure: "Not Reliable" vs. "Reliable"]

Intra-rater Reliability
• The extent to which a measure produces the same result at different times for the same subjects.

Inter-rater Reliability
• The extent to which a measure produces the same result on each subject regardless of who makes the observation.

Correlation (r)
• For continuous data
• r = 1: perfect correlation; r = 0: none
[Figure: scatter plot of O1 vs. O2 around the line O1 = O2]
Bland & Altman, 1986

Correlation (r)
• Measures the strength of the relation, not agreement
• Problem: even near-perfect correlation may coexist with significant differences between observations
[Figure: scatter plot of O1 vs. O2 with r = 0.8 but systematic departure from the line O1 = O2]
Bland & Altman, 1986

Bland & Altman Plot
• For continuous data
• Plot of the observation differences (O1 - O2) against the pairwise means ([O1 + O2] / 2)
• Data that are evenly distributed around 0 and fall within about 2 standard deviations of the mean difference exhibit good agreement
Bland & Altman, 1986
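A minimal sketch of the Bland & Altman calculation, assuming two hypothetical sets of paired measurements (the numbers are made up for illustration).

```python
# Bland & Altman analysis: differences vs. means, bias, and limits of agreement.
import numpy as np

# Hypothetical paired measurements from observer 1 and observer 2.
o1 = np.array([12.1, 15.3, 9.8, 20.4, 14.7, 18.2])
o2 = np.array([11.8, 15.9, 10.2, 19.8, 15.1, 17.6])

diff = o1 - o2                   # observation differences (y-axis of the plot)
mean = (o1 + o2) / 2             # pairwise means (x-axis of the plot)
bias = diff.mean()               # systematic difference between the observers
half_width = 1.96 * diff.std(ddof=1)   # roughly "2 SDs": 95% limits of agreement

print(f"bias = {bias:.2f}")
print(f"limits of agreement = ({bias - half_width:.2f}, {bias + half_width:.2f})")
```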

Simple Agreement
• The extent to which two or more raters agree on the classifications of all subjects
• The % of concordance in the 2 x 2 table: (a + d) / N
• Not ideal: subjects may fall on the diagonal by chance

                    Rater 1
                    -         +         total
Rater 2   -         a         b         a + b
          +         c         d         c + d
          total     a + c     b + d     N

Kappa
• The proportion of the best possible improvement in agreement beyond chance that is achieved by the observers
• κ = (p_a - p_e) / (1 - p_e)
  • p_a = (a + d) / N (observed proportion of subjects on the main diagonal)
  • p_e = [(a + b)(a + c) + (c + d)(b + d)] / N² (proportion of agreement expected by chance)

                    Rater 1
                    -         +         total
Rater 2   -         a         b         a + b
          +         c         d         c + d
          total     a + c     b + d     N
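A minimal worked example of simple agreement and kappa for hypothetical 2 x 2 counts a, b, c, d (the counts are not from the talk).

```python
# Simple agreement and Cohen's kappa from the 2 x 2 rater table above.
a, b, c, d = 40, 10, 5, 45          # hypothetical cell counts
n = a + b + c + d

p_a = (a + d) / n                                      # observed agreement
p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # agreement expected by chance
kappa = (p_a - p_e) / (1 - p_e)

print(f"simple agreement = {p_a:.2f}")   # 0.85
print(f"kappa            = {kappa:.2f}")  # 0.70
```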

Interpreting Kappa Values
κ = 1               Perfect
κ > 0.80            Excellent
0.60 < κ < 0.80     Good
0.40 < κ < 0.60     Fair
0 < κ < 0.40        Poor
κ = 0               Chance agreement only (p_a = p_e)
κ < 0               Less than chance agreement

Weighted Kappa
• Used when subjects are rated into more than 2 categories
• Perfect agreement on the main diagonal is weighted more heavily than partial agreement off of it

                    Rater 1
                    1        2        ...    C        total
Rater 2   1         n11      n12      ...    n1C      n1.
          2         n21      n22      ...    n2C      n2.
          ...       ...      ...      ...    ...      ...
          C         nC1      nC2      ...    nCC      nC.
          total     n.1      n.2      ...    n.C      N

Validity
• The degree to which a test correctly diagnoses people as having or not having a condition
• Internal validity
• External validity

Validity
[Figure: "Valid, not reliable" vs. "Reliable and valid"]

Internal Validity
• Performance characteristics:
  • Sensitivity
  • Specificity
  • NPV
  • PPV
  • ROC curves

2 x 2 Table

                        Disease Status
                        cases      noncases     total
Test Result    +        TP         FP           positives
               -        FN         TN           negatives
               total    cases      noncases     N

TP = True Positives, FP = False Positives, FN = False Negatives, TN = True Negatives

Gold Standard
• The definitive test used to identify cases
• Example: traditional agar culture
• The dipstick and dipslide are measured against the gold standard

Sensitivity (SN)
• The probability of correctly identifying a true case
• SN = TP / (TP + FN) = TP / cases
• High SN: a negative test result rules out the Dx ("SnNout")
Sackett & Straus, 1998

Specificity (SP)
• The probability of correctly identifying a true noncase
• SP = TN / (TN + FP) = TN / noncases
• High SP: a positive test result rules in the Dx ("SpPin")
Sackett & Straus, 1998
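A minimal worked example of SN and SP computed from hypothetical 2 x 2 counts (not data from the talk).

```python
# Sensitivity and specificity read down the columns of the 2 x 2 table.
tp, fp, fn, tn = 90, 40, 10, 60   # hypothetical counts

sensitivity = tp / (tp + fn)      # TP / cases
specificity = tn / (tn + fp)      # TN / noncases

print(f"SN = {sensitivity:.2f}")  # 0.90
print(f"SP = {specificity:.2f}")  # 0.60
```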

Problems with Sensitivity and Specificity
• They remain constant over patient populations
• But SN and SP convey how likely a test result is to be positive or negative given that the patient does or does not have the disease
• This is a paradoxical inversion of clinical logic: prior knowledge of disease status would obviate the need for the diagnostic test
Gallagher, 1998

Positive Predictive Value (PPV)
• The probability that a subject labeled (+) is a true case
• PPV = TP / (TP + FP) = TP / total positives
• High SP corresponds to very high PPV ("SpPin")
Sackett & Straus, 1998

Negative Predictive Value (NPV)
• The probability that a subject labeled (-) is a true noncase
• NPV = TN / (TN + FN) = TN / total negatives
• High SN corresponds to very high NPV ("SnNout")
Sackett & Straus, 1998
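Continuing the same hypothetical counts, a short sketch of the predictive values, which read across the rows of the 2 x 2 table rather than down the columns.

```python
# Predictive values from the same hypothetical 2 x 2 counts as above.
tp, fp, fn, tn = 90, 40, 10, 60

ppv = tp / (tp + fp)   # TP / total positives
npv = tn / (tn + fn)   # TN / total negatives

print(f"PPV = {ppv:.2f}")  # 0.69
print(f"NPV = {npv:.2f}")  # 0.86
```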

Predictive Value Problems
• Vulnerable to shifts in disease prevalence (P)
• Do not remain constant over patient populations
• As P rises, PPV rises and NPV falls
Gallagher, 1998

Flipping a Coin to Dx AMI for People with Chest Pain (ED, AMI prevalence 6%)

              AMI    No AMI   total
Heads (+)     3      47       50
Tails (-)     3      47       50
total         6      94       100

SN = 3 / 6 = 50%        SP = 47 / 94 = 50%
PPV = 3 / 50 = 6%       NPV = 47 / 50 = 94%
Worster, 2002

Flipping a Coin to Dx AMI for People with Chest Pain (CCU, AMI prevalence 90%)

              AMI    No AMI   total
Heads (+)     45     5        50
Tails (-)     45     5        50
total         90     10       100

SN = 45 / 90 = 50%      SP = 5 / 10 = 50%
PPV = 45 / 50 = 90%     NPV = 5 / 50 = 10%
Worster, 2002
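A minimal sketch showing how PPV and NPV shift with prevalence when SN and SP are held fixed at 50%, reproducing the ED and CCU figures above via Bayes' theorem.

```python
# Predictive values as a function of prevalence for fixed SN and SP.
def predictive_values(sn: float, sp: float, prevalence: float):
    """Return (PPV, NPV) via Bayes' theorem for a given disease prevalence."""
    ppv = sn * prevalence / (sn * prevalence + (1 - sp) * (1 - prevalence))
    npv = sp * (1 - prevalence) / (sp * (1 - prevalence) + (1 - sn) * prevalence)
    return ppv, npv

for prev in (0.06, 0.90):   # ED vs. CCU prevalence from the slides
    ppv, npv = predictive_values(sn=0.50, sp=0.50, prevalence=prev)
    print(f"prevalence {prev:.0%}: PPV = {ppv:.0%}, NPV = {npv:.0%}")
# prevalence 6%:  PPV = 6%,  NPV = 94%
# prevalence 90%: PPV = 90%, NPV = 10%
```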

Receiver Operator Curve
• Allows consideration of test performance across a range of threshold values
• Well suited for diagnostic tests based on a continuous variable
[Figure: ROC curve, sensitivity (TPR) vs. 1 - specificity (FPR)]

Receiver Operator Curve
• Avoids the "single cutoff trap"
[Figure: overlapping WBC count distributions, "No Effect" vs. "Sepsis Effect"]
Gallagher, 1998

Area Under the Curve (θ)
• A measure of test accuracy
• θ = 0.5 - 0.7: no to low discriminatory power
• θ = 0.7 - 0.9: moderate discriminatory power
• θ > 0.9: high discriminatory power
[Figure: ROC curve, sensitivity (TPR) vs. 1 - specificity (FPR)]
Gryzybowski, 1997
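A minimal sketch of building an ROC curve and its area for a continuous test; the WBC values and disease labels below are hypothetical, not data from the talk.

```python
# ROC curve and AUC for a continuous test, sweeping every observed cutoff.
import numpy as np

wbc = np.array([6.2, 7.1, 8.4, 9.0, 10.5, 11.2, 12.8, 14.1, 15.3, 18.0])
disease = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])   # 1 = case, 0 = noncase

thresholds = np.sort(np.unique(wbc))[::-1]            # descending cutoffs
tpr = [float(np.mean(wbc[disease == 1] >= t)) for t in thresholds]  # sensitivity
fpr = [float(np.mean(wbc[disease == 0] >= t)) for t in thresholds]  # 1 - specificity

# Trapezoidal area under the (FPR, TPR) staircase, anchored at (0,0) and (1,1).
xs = [0.0] + fpr + [1.0]
ys = [0.0] + tpr + [1.0]
auc = sum((x1 - x0) * (y0 + y1) / 2
          for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))
print(f"AUC = {auc:.2f}")   # 0.88 for these made-up data
```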

Problem with ROC Curves
• Same "reverse logic" problem as SN and SP
• Mainly used to describe diagnostic test performance

Appendicitis Example
• Study design: prospective cohort
• Gold standard: pathology report from appendectomy, or CT finding (for negatives)
• Diagnostic test: total WBC
[Figure: study flow from physical exam to CT scan or OR, with outcomes "Appy" vs. "No Appy"]
Cardall, 2004

Appendicitis Example
[Table: WBC > 10,000 vs. < 10,000 cross-tabulated against Appy vs. Not Appy]
• SN 76% (65%-84%)
• SP 52% (45%-60%)
• PPV 42% (35%-51%)
• NPV 82% (74%-89%)
Cardall, 2004

Appendicitis Example
• Patient WBC: 13,000
• Management: get CT with PO & IV contrast
[Figure: study flow from physical exam to CT scan or OR, with outcomes "Appy" vs. "No Appy"]
Cardall, 2004

Abdominal CT

Follow Up
• CT result: acute appendicitis
• Patient taken to the OR for appendectomy

But, was the WBC necessary?
• Answer given in the talk on Likelihood Ratios