Assessing Information from Multilevel (Ordinal) and Continuous Tests ROC curves and Likelihood Ratios for results other than “+” or “-” Michael A. Kohn,


Assessing Information from Multilevel (Ordinal) and Continuous Tests ROC curves and Likelihood Ratios for results other than “+” or “-” Michael A. Kohn, MD, MPP 10/7/2004

Outline of Topics: Introduce the likelihood ratio slide rule. The problem with making tests dichotomous. ROC curves. Likelihood ratios for results of non-dichotomous tests. If time permits: calculating the “c statistic”; logarithms and log odds.

Many Tests Are Not Dichotomous. Ordinal: “-”, “+”, “++”, “+++” for leukocyte esterase on urine dipstick; “Normal”, “Low Prob”, “Intermediate Prob”, “High Prob” on VQ scan. Continuous: systolic blood pressure; WBC count.

Evaluating the Test: Test Characteristics. For dichotomous tests, we discussed sensitivity, P(+|D+), and specificity, P(-|D-). For multilevel and continuous tests, we will discuss the Receiver Operating Characteristic (ROC) curve.

Using the Test Result to Make Decisions about a Patient: For dichotomous tests, we use the LR(+) if the test is positive and the LR(-) if the test is negative. For multilevel and continuous tests, we use the LR(r), where r is the result of the test.

Clinical Scenario: 5-month-old boy with fever. You have the results of a WBC count. How do you use this WBC result to determine whether to treat empirically for possible bacteremia?

Why Not Make It a Dichotomous Test? [Table: WBC count (x1000/uL) split at a single cutoff, with columns Bacteremia and No Bacteremia and a Total row; the cell values are not preserved in this transcript.] Lee GM, Harper MB. Risk of bacteremia for febrile young children in the post-Haemophilus influenzae type b era. Arch Pediatr Adolesc Med. 1998;152(7).

Why Not Make It a Dichotomous Test? Sensitivity = 109/127 = 0.858. Specificity = 6601/8629 = 0.765. LR(+) = 0.858/(1 - 0.765) = 3.65. LR(-) = (1 - 0.858)/0.765 = 0.19.
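As a rough illustration of the arithmetic on this slide, the Python sketch below computes LR(+) and LR(-) from the quoted counts. The function name `likelihood_ratios` is hypothetical, and the denominator of 8629 non-bacteremic children is an assumption taken from the count quoted later in these slides (it is the value consistent with a specificity of 0.76).

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test dichotomized at one cutoff."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Counts quoted on the slide; 8629 D- children is an assumption (see lead-in).
sensitivity = 109 / 127
specificity = 6601 / 8629

lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
print(f"LR(+) = {lr_pos:.2f}, LR(-) = {lr_neg:.2f}")
```

Working from the unrounded fractions matters here: dividing the rounded values 0.86 and 0.24 gives about 3.58 rather than 3.65.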

Clinical Scenario: WBC = 16,000/uL (demonstrate LR slide rule). Pre-test prob: 0.03. LR(+) = 3.65. Post-test prob = ?

Clinical Scenario: WBC = 16,000/uL. Pre-test prob: 0.03. Pre-test odds: 0.03/0.97 = 0.031. LR(+) = 3.65. Post-test odds = pre-test odds x LR(+) = 0.031 x 3.65 = 0.113. Post-test prob = 0.113/(0.113 + 1) = 0.10.

Clinical Scenario: WBC = 28,000/uL. Pre-test prob: 0.03. LR(+) = ? Post-test prob = ?

Clinical Scenario: WBC = 28,000/uL. Pre-test prob: 0.03. Pre-test odds: 0.03/0.97 = 0.031. LR(+) = 3.65 (same as for WBC = 16,000!). Post-test odds = pre-test odds x LR(+) = 0.031 x 3.65 = 0.113. Post-test prob = 0.113/(0.113 + 1) = 0.10.
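The probability-to-odds-to-probability steps above can be sketched in a few lines of Python. The helper name `post_test_probability` is hypothetical; the inputs (pre-test probability 0.03, LR(+) 3.65) are the ones on the slides.

```python
def post_test_probability(pre_test_prob, lr):
    """Update a probability with a likelihood ratio by way of odds."""
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * lr
    return post_test_odds / (1.0 + post_test_odds)


# With a fixed cutoff, WBC = 16,000 and WBC = 28,000 are both just "positive",
# so they receive the same LR(+) and the same post-test probability.
p_wbc_16 = post_test_probability(0.03, 3.65)
p_wbc_28 = post_test_probability(0.03, 3.65)
print(f"WBC 16,000: {p_wbc_16:.2f}; WBC 28,000: {p_wbc_28:.2f}")
```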

Why Not Make It a Dichotomous Test? Because you lose information. The risk associated with WBC=16,000 is equated with the risk associated with WBC=28,000. Choosing a fixed cutpoint to dichotomize a multi-level or continuous test throws away information and reduces the value of the test.

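[Table: counts of children with Bacteremia and No Bacteremia by WBC count interval (x1000/uL), with a TOTAL row. Only the last interval survives in this transcript: 0 - <5, with 0 bacteremia and 543 no bacteremia.] Lee GM, Harper MB. Risk of bacteremia for febrile young children in the post-Haemophilus influenzae type b era. Arch Pediatr Adolesc Med. 1998;152(7).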

WBC Count (x1000/uL)   Bacteremia   No Bacteremia
>= 30                  11.8%        0.8%
25 - <30               9.4%         1.8%
20 - <25               26.8%        5.4%
15 - <20               37.8%        15.5%
10 - <15               11.8%        32.1%
5 - <10                2.4%         38.1%
0 - <5                 0.0%         6.3%
TOTAL                  100%         100%

Histogram: Does not reflect prevalence of D+ (dark D+ columns add to 100%; open D- columns add to 100%). Sensitivity and specificity depend on the cutpoint chosen to separate “positives” from “negatives”. The ROC curve is drawn by serially lowering the cutpoint from highest (most abnormal) to lowest (least abnormal).

WBC Count (x1000/uL)   Sensitivity   1 - Specificity
> 50                   0%            0%
> 30                   11.8%         0.8%
> 25                   21.2%         2.6%
> 20                   48.0%         8.0%
> 15                   85.8%         23.5%
> 10                   97.6%         55.6%
> 5                    100%          93.7%
> 0                    100%          100%
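One way to see where the rows of this table come from is to accumulate the interval percentages from the most abnormal interval downward. The sketch below does that in Python; the list names are hypothetical, and the cutoffs 30 through 0 are assumed from the interval boundaries, since several cells were lost from this transcript.

```python
from itertools import accumulate

# Percent of D+ and D- children in each WBC interval, most abnormal first
# (intervals: 30 and above, 25-<30, 20-<25, 15-<20, 10-<15, 5-<10, 0-<5).
pct_d_pos = [11.8, 9.4, 26.8, 37.8, 11.8, 2.4, 0.0]
pct_d_neg = [0.8, 1.8, 5.4, 15.5, 32.1, 38.1, 6.3]
cutoffs = [30, 25, 20, 15, 10, 5, 0]  # assumed lower bound of each interval

# Lowering the cutoff past an interval adds that interval to the "positives".
sensitivity = list(accumulate(pct_d_pos))
false_pos_rate = list(accumulate(pct_d_neg))

for cutoff, sens, fpr in zip(cutoffs, sensitivity, false_pos_rate):
    print(f"cutoff {cutoff:>2}: sensitivity {sens:5.1f}%, 1 - specificity {fpr:5.1f}%")
```

Because the inputs are rounded percentages, a cumulative value can differ from the original slide by 0.1 percentage point.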

[Figure: ROC curve for WBC count, with points labeled at cutoffs of 30,000/uL, 25,000/uL, 20,000/uL, 15,000/uL, 10,000/uL, and 5,000/uL. Area Under Curve (AUC) = 0.86.]

[Figure: distributions of WBC count for children with bacteremia and with no bacteremia, with a WBC cutoff of 12,000/μL marked.]

Test Discriminates Well Between D+ and D-: [Figures on two slides: distributions of the test result for D- and D+ individuals.]

Test Discriminates Poorly Between D+ and D-: [Figures on two slides: distributions of the test result for D- and D+ individuals.]

Area Under ROC Curve: A summary measure of the test’s discriminatory ability. It is the probability that a randomly chosen D+ individual will have a more positive test result than a randomly chosen D- individual (a tie counts as one half). For example, randomly choose 1 of the 127 bacteremic children and 1 of the 8629 non-bacteremic children. The probability that the bacteremic child’s WBC will fall in a higher WBC interval than the non-bacteremic child’s is 0.86.
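The 0.86 figure can be checked approximately by adding up trapezoids under the ROC curve. The Python sketch below is only an illustration: the function name `trapezoid_auc` is hypothetical, and the ROC points are assumed values rebuilt by cumulating the interval percentages from the WBC table, not numbers read off the original figure.

```python
# (1 - specificity, sensitivity) at each cutoff, most abnormal cutoff first.
# Assumed values, rebuilt from the interval percentages in the WBC table.
roc_points = [
    (0.000, 0.000),  # cutoff above every observed value
    (0.008, 0.118),  # > 30
    (0.026, 0.212),  # > 25
    (0.080, 0.480),  # > 20
    (0.235, 0.858),  # > 15
    (0.556, 0.976),  # > 10
    (0.937, 1.000),  # > 5
    (1.000, 1.000),  # > 0
]


def trapezoid_auc(points):
    """Area under a piecewise-linear ROC curve given (x, y) points in order."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


auc = trapezoid_auc(roc_points)
print(f"AUC = {auc:.2f}")
```

Joining the points with straight lines is what gives tied results (two children in the same WBC interval) a weight of one half.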

Area Under ROC Curve: Corresponds to the Mann-Whitney (Wilcoxon rank-sum) test statistic, which is the non-parametric equivalent of Student’s t test. Also corresponds to the “c statistic” reported in logistic regression models.

“Walking Man” Approach to ROC Curves: Divide the vertical axis into d steps, where d is the number of D+ individuals. Divide the horizontal axis into n steps, where n is the number of D- individuals. Sort individuals from most to least abnormal test result. Moving from the first individual (with the most abnormal test result) to the last (with the least abnormal test result)…

“Walking Man” (continued): …call out “D” if the individual is D+ and “N” if the individual is D-. Let the walking man know when you reach a new value of the test. The walking man takes a step up every time he hears “D” and a step to the right every time he hears “N”. When you reach a new value of the test, he drops a stone.

WBC Count in 5 Bacteremic Children (patient: WBC count): D1: 27; D2: 22; D3: 19; D4: 17; D5: 14.

WBC Count in 10 Non-Bacteremic Children (patient: WBC count): N1: 21; N2: 18; N3: 17; N4: 13; N5: 12; N6: 12; N7: 8; N8: 7; N9: 6; N10: 4.

Call-out sequence (D = bacteremia, N = no bacteremia; parentheses group individuals with the same test value): D D N D N (DN) D N (NN) N N N N
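The walking-man procedure can be sketched in Python using the 15 WBC counts listed above. The function name `walking_man` and the idea of counting grid boxes with trapezoids are illustrative choices, not part of the original slides.

```python
from itertools import groupby

d_pos = [27, 22, 19, 17, 14]                    # bacteremic children
d_neg = [21, 18, 17, 13, 12, 12, 8, 7, 6, 4]    # non-bacteremic children


def walking_man(d_pos, d_neg):
    """Trace the ROC path; return the stones dropped and the boxes under it."""
    labeled = [(v, "D") for v in d_pos] + [(v, "N") for v in d_neg]
    labeled.sort(key=lambda pair: pair[0], reverse=True)  # most abnormal first

    x = y = 0            # steps taken right (N) and up (D)
    stones = [(0, 0)]    # one stone per distinct test value
    boxes = 0.0
    for _, group in groupby(labeled, key=lambda pair: pair[0]):
        labels = [label for _, label in group]
        up, right = labels.count("D"), labels.count("N")
        # Tied D and N values make a diagonal move; the trapezoid under it
        # counts each tied D/N pair as half a box.
        boxes += right * (y + up / 2.0)
        x, y = x + right, y + up
        stones.append((x, y))
    return stones, boxes


stones, boxes = walking_man(d_pos, d_neg)
area = boxes / (len(d_pos) * len(d_neg))
print(stones)
print(f"boxes under curve = {boxes}, area = {area:.2f}")
```

Only the ordering of the values affects the path, so any monotone rescaling of the WBC counts would trace the same curve.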

ROC Curve Describes the Test: It describes the test’s ability to discriminate between D+ and D- individuals. It is not particularly useful in interpreting a test result for a given patient. Example 1: child with WBC count = 16,000. Example 2: dyspnea patient with BNP = 225 pg/ml.

Example 1 Febrile Child with WBC count = 16,000

Lee et al. Arch Peds Adol Med 1998;152:624-28

Example 2 Dyspnea patient with BNP = 225 pg/ml

Maisel et al. N Engl J Med 2002; 347:161-7

Likelihood Ratios: LR(+) = Sensitivity/(1 - Specificity) = P(+|D+)/(1 - P(-|D-)) = P(+|D+)/P(+|D-). LR(-) = (1 - Sensitivity)/Specificity = (1 - P(+|D+))/P(-|D-) = P(-|D+)/P(-|D-).

Likelihood Ratios LR(result) = P(result|D+)/P(result|D-)

Likelihood Ratios: The ratio of the height of the D+ distribution to the height of the D- distribution. For WBC 15 - <20 (37.8% of D+, 15.5% of D-): LR = 37.8%/15.5% = 2.4.

Likelihood Ratio = Slope of ROC Curve. [Figure: ROC curve with cutoffs labeled at 30,000/uL, 25,000/uL, 20,000/uL, 15,000/uL, 10,000/uL, and 5,000/uL; one segment rises 37.8% and runs 15.5%.] Slope = 37.8%/15.5% = 2.4.

Likelihood Ratio
WBC Count (x1000/uL)   Bacteremia   No Bacteremia   LR
30 - <35               11.8%        0.8%            ...
25 - <30               9.4%         1.8%            5.3
20 - <25               26.8%        5.4%            ...
15 - <20               37.8%        15.5%           2.4
10 - <15               11.8%        32.1%           ...
5 - <10                2.4%         38.1%           ...
0 - <5                 0.0%         6.3%            0.00
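As a sketch of how the LR column follows from the other two, the Python snippet below divides each interval's D+ percentage by its D- percentage. The variable names are hypothetical. Because the inputs are rounded percentages rather than the underlying counts, some ratios may differ slightly from the values on the original slide (for example, about 5.2 here versus the 5.3 quoted elsewhere in these slides for WBC = 28,000).

```python
intervals = ["30+", "25-<30", "20-<25", "15-<20", "10-<15", "5-<10", "0-<5"]
pct_d_pos = [11.8, 9.4, 26.8, 37.8, 11.8, 2.4, 0.0]   # P(result | D+), percent
pct_d_neg = [0.8, 1.8, 5.4, 15.5, 32.1, 38.1, 6.3]    # P(result | D-), percent

# LR(result) = P(result | D+) / P(result | D-)
interval_lr = {
    name: pos / neg
    for name, pos, neg in zip(intervals, pct_d_pos, pct_d_neg)
}

for name, lr in interval_lr.items():
    print(f"{name:>7}: LR = {lr:.2f}")
```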

Using “ROC Tables” to Get Interval LRs
WBC Count (x1000/uL)   Sensitivity   1 - Specificity
> 50                   0%            0%
> 30                   11.8%         0.8%
> 25                   21.2%         2.6%
> 20                   48.0%         8.0%
> 15                   85.8%         23.5%
> 10                   97.6%         55.6%
> 5                    100%          93.7%
> 0                    100%          100%

Common Mistake When given an “ROC Table,” it is tempting to calculate an LR(+) or LR(-) as if the test were “dichotomized” at a particular cutoff. Example: LR(+,10,000) = 97.6/55.6 = 1.8 This is NOT the LR of a particular result (e.g. WBC >10,000 and <15,000); it is the LR(+) if you divide “+” from “-” at 10,000.

Common Mistake: [Figure: ROC curve with cutoffs labeled at 20,000/uL, 15,000/uL, 10,000/uL, and 5,000/uL; the values 97.6% and 55.6% are marked.] 97.6/55.6 = 1.8

Using “ROC Tables” to Get Interval LRs: Most abnormal interval (at or above the top cutoff): D+ frequency = sensitivity of the top cutoff; D- frequency = FPR of the top cutoff, where FPR (false-positive rate) = 1 - specificity. For each less abnormal interval (between a higher and a lower cutoff): D+ frequency = sensitivity of the lower cutoff - sensitivity of the higher cutoff; D- frequency = FPR of the lower cutoff - FPR of the higher cutoff. Least abnormal interval (below the lowest cutoff): D+ frequency = 100% - sensitivity of the lowest cutoff; D- frequency = 100% - FPR of the lowest cutoff.
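The differencing rule above can be sketched in Python as follows. The function name `interval_lrs` is hypothetical, and the ROC rows are assumed values rebuilt by cumulating the WBC interval percentages, since several cells of the ROC table were lost from this transcript. The last lines contrast the interval LR for 10 - <15 with the "common mistake" of computing LR(+) at the 10,000 cutoff.

```python
def interval_lrs(roc_rows):
    """roc_rows: (cutoff, sensitivity, fpr), most abnormal cutoff first.

    The last row must have sensitivity = fpr = 1.0 so that the least
    abnormal interval is included. Returns {cutoff: interval LR}.
    """
    result = {}
    prev_sens = prev_fpr = 0.0
    for cutoff, sens, fpr in roc_rows:
        p_given_d_pos = sens - prev_sens   # P(result in interval | D+)
        p_given_d_neg = fpr - prev_fpr     # P(result in interval | D-)
        result[cutoff] = (
            p_given_d_pos / p_given_d_neg if p_given_d_neg > 0 else None
        )
        prev_sens, prev_fpr = sens, fpr
    return result


# Assumed ROC table (cutoff in x1000/uL, sensitivity, 1 - specificity).
roc_rows = [
    (30, 0.118, 0.008),
    (25, 0.212, 0.026),
    (20, 0.480, 0.080),
    (15, 0.858, 0.235),
    (10, 0.976, 0.556),
    (5, 1.000, 0.937),
    (0, 1.000, 1.000),
]

lrs = interval_lrs(roc_rows)
dichotomized_lr_pos_at_10 = 0.976 / 0.556   # LR(+) if "+" means above 10,000

print(f"Interval LR, 15-<20: {lrs[15]:.2f}")
print(f"Interval LR, 10-<15: {lrs[10]:.2f}")
print(f"LR(+) at cutoff 10:  {dichotomized_lr_pos_at_10:.2f}")
```

The interval LR for 10 - <15 comes out below 1 while the dichotomized LR(+) at 10,000 is about 1.8, which is why the two should not be confused.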

Example 1 Febrile Child with WBC count = 16,000

Lee et al. Arch Peds Adol Med 1998;152: Focus on these

Using “ROC Tables” to Get Interval LRs: [Table with columns Cutoff, Sensitivity, Specificity, and 1 - Spec; two of the cutoff rows are marked for use. Cell values are not preserved in this transcript.]

Using “ROC Tables” to Get Interval LRs: For the interval >= 15 and < 17, P(r|D+) = Sens(>=15) - Sens(>=17) = 0.14 and P(r|D-) = FPR(>=15) - FPR(>=17) = 0.07.

Using “ROC Tables” to Get Interval LRs: LR(WBC between 15 and 17) = P(r|D+)/P(r|D-) = 0.14/0.07 = 2. The child has a WBC count of 16,000, so post-test odds = pre-test odds x 2.

Something to notice: The LR we just obtained for a WBC of 16 (15 to 17, actually) was 2.0. The LR for the category 15 - <20 was 2.4. This makes sense, because 16 is at the low end of the 15 - 20 range. The LR for a WBC of 19 would be a little higher than 2.4.

Example 2 Dyspnea patient with BNP = 225 pg/ml

Maisel et al. N Engl J Med 2002; 347:161-7

Likelihood Ratio Slide Rule: Repeat calculations. Pre-test probability: 3%. WBC = 16,000: LR(16,000) = 2.4; post-test probability = ? WBC = 28,000: LR(28,000) = 5.3; post-test probability = ?

Clinical Scenario: WBC = 16,000/uL. Pre-test prob: 0.03. Pre-test odds: 0.03/0.97 = 0.031. LR(WBC between 15 and 20) = 2.4. Post-test odds = pre-test odds x LR(16) = 0.031 x 2.4 = 0.074. Post-test prob = 0.074/(0.074 + 1) = 0.069.

Clinical Scenario: WBC = 28,000/uL. Pre-test prob: 0.03. Pre-test odds: 0.03/0.97 = 0.031. LR(28,000/uL) = 5.3. Post-test odds = pre-test odds x LR(28) = 0.031 x 5.3 = 0.164. Post-test prob = 0.164/(0.164 + 1) = 0.141.

Clinical Scenario WBC = 16,000/uL Post-Test Prob = 7% WBC = 28,000/uL Post-Test Prob = 14% (Recall that dichotomizing the WBC with a fixed cutpoint of 15,000/uL meant that WBC = 16,000/uL would be treated the same as WBC = 28,000/uL and post-test prob = 10%)
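To make the comparison concrete, the Python sketch below applies the two interval LRs quoted in the slides (2.4 and 5.3) and the dichotomized LR(+) of 3.65 to the same 3% pre-test probability. The dictionary keys and the helper name are hypothetical labels chosen for this illustration.

```python
def post_test_probability(pre_test_prob, lr):
    """Probability -> odds, multiply by the LR, then odds -> probability."""
    odds = pre_test_prob / (1.0 - pre_test_prob) * lr
    return odds / (1.0 + odds)


PRE_TEST_PROB = 0.03
interval_lr = {"WBC 16,000 (15-<20)": 2.4, "WBC 28,000 (25-<30)": 5.3}
DICHOTOMIZED_LR_POS = 3.65  # any WBC above the fixed 15,000/uL cutpoint

post_test = {
    label: post_test_probability(PRE_TEST_PROB, lr)
    for label, lr in interval_lr.items()
}
post_test_dichotomized = post_test_probability(PRE_TEST_PROB, DICHOTOMIZED_LR_POS)

for label, prob in post_test.items():
    print(f"{label}: interval LR gives {prob:.3f}, "
          f"fixed cutpoint gives {post_test_dichotomized:.2f}")
```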

Summary: Dichotomizing a multilevel or continuous test by choosing a fixed cutpoint reduces the value of the test. LR(result) = P(result|D+)/P(result|D-). NOTE: Do not calculate an LR(+) or LR(-) for a multilevel test. One can calculate the interval likelihood ratios from an “ROC table” of sensitivities and specificities at various cutoffs.

ROC Curve when a lower test result is more abnormal Gestational age as a predictor of neonatal morbidity. Trace ROC curve by serially moving cutoff from the lowest level (<24 weeks) up to the highest level (<45 weeks)

[Figure: ROC curve for Gestational Age as Predictor of Neonatal Morbidity/Mortality, with the cutoff < 36 weeks labeled.]

Calculating the c Statistic: In the “walking man” approach to tracing out the ROC curve, the actual values of the test are not important for the shape of the ROC curve or the area under it; only the ranking of the values matters. The c statistic for the area under an ROC curve comes out exactly the same as the Wilcoxon rank-sum statistic (or Mann-Whitney U, which is equivalent). It is the non-parametric equivalent of the t test statistic comparing two means.

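[Figure: test results of the bacteremic and non-bacteremic children laid out on the ROC grid.] Boxes under curve = 43.5. Total boxes = 50. Area under curve = 43.5/50 = 0.87.

Replace Test Results with Ranks: [Figure: the same bacteremia and no-bacteremia columns with each test result replaced by its rank.] S = 21.5, the sum of the ranks of the D+ individuals (rank 1 = most abnormal result; tied results share the average rank).

Calculating the C Statistic: S = 21.5. Smin = d(d+1)/2 = 5(6)/2 = 15. Smax = dn + Smin = 5(10) + 15 = 65. C = (Smax - S)/(Smax - Smin) = (65 - 21.5)/(65 - 15) = 43.5/50 = 0.87. Note that Smax - Smin = dn.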
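The rank-based calculation can be sketched in Python with the same 5 bacteremic and 10 non-bacteremic WBC counts used in the walking-man example. The helper name `descending_midranks` is hypothetical; ranking from most to least abnormal, with tied values sharing the average rank, is assumed because it reproduces S = 21.5.

```python
d_pos = [27, 22, 19, 17, 14]                    # bacteremic children
d_neg = [21, 18, 17, 13, 12, 12, 8, 7, 6, 4]    # non-bacteremic children


def descending_midranks(values):
    """Map each value to its rank (1 = largest); ties get the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i+1 .. j are tied; their average is (i + 1 + j) / 2.
        ranks[ordered[i]] = (i + 1 + j) / 2.0
        i = j
    return ranks


ranks = descending_midranks(d_pos + d_neg)
d, n = len(d_pos), len(d_neg)

s = sum(ranks[v] for v in d_pos)   # rank sum of the D+ group
s_min = d * (d + 1) / 2            # every D+ outranks every D-
s_max = d * n + s_min              # every D- outranks every D+
c_statistic = (s_max - s) / (s_max - s_min)

print(f"S = {s}, Smin = {s_min}, Smax = {s_max}, c = {c_statistic:.2f}")
```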

Logarithms: log10(a) = b, where a = 10^b. log10(100) = 2; log10(10) = 1; log10(1) = 0; log10(0.10) = -1; log10(0.01) = -2.

Multiplying is Adding Logs: log10(xy) = log10(x) + log10(y). log10(10 x 100) = log10(10) + log10(100) = 1 + 2 = 3.

Dividing is Subtracting Logs: log10(x/y) = log10(x) - log10(y). log10(10/100) = log10(10) - log10(100) = 1 - 2 = -1.
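A brief Python check of these identities follows. The last part applies them to odds, on the assumption that this is the "log odds" use the outline has in mind: multiplying pre-test odds by an LR is the same as adding their logarithms. The numbers 0.03 and 2.4 are taken from the clinical scenario; the variable names are hypothetical.

```python
import math

# Multiplying is adding logs; dividing is subtracting logs.
log_product = math.log10(10) + math.log10(100)    # log10(10 x 100)
log_quotient = math.log10(10) - math.log10(100)   # log10(10 / 100)
print(log_product, log_quotient)

# Applied to odds (an assumed illustration):
# log(post-test odds) = log(pre-test odds) + log(LR).
pre_test_odds = 0.03 / 0.97
lr = 2.4
log_post_test_odds = math.log10(pre_test_odds) + math.log10(lr)
post_test_odds = 10 ** log_post_test_odds
post_test_prob = post_test_odds / (1 + post_test_odds)
print(f"post-test odds = {post_test_odds:.3f}, probability = {post_test_prob:.3f}")
```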