Diagnostic Tests
Patrick S. Romano, MD, MPH
Professor of Medicine and Pediatrics
The Two-by-two Table
            Disease +    Disease -    Total
Test +      TP           FP           TP + FP
Test -      FN           TN           FN + TN
Total       TP + FN      FP + TN
The Two-by-two Table (cont)
True positives: Patients with disease who test positive
False negatives: Patients with disease who test negative
True negatives: Patients without disease who test negative
False positives: Patients without disease who test positive
Test Characteristics
Sensitivity: TP/(TP + FN)
– Test accuracy (or probability of correct classification) among patients with disease
Specificity: TN/(TN + FP)
– Test accuracy (or probability of correct classification) among patients without disease
Test Characteristics (cont)
Positive predictive value: TP/(TP + FP)
– Predictive value of a positive (abnormal) result, OR post-test probability of disease, given a positive test
Negative predictive value: TN/(TN + FN)
– Predictive value of a negative (normal) result, OR post-test probability of non-disease, given a negative test
CAGE Questionnaire
Have you ever felt you should CUT down on your drinking?
Have people ANNOYED you by criticizing your drinking?
Have you ever felt bad or GUILTY about your drinking?
Have you ever had a drink first thing in the morning to steady your nerves or to get rid of a hangover (EYE opener)?
Prevalence of Alcoholism by CAGE Score
[Table: number of ‘yes’ responses by alcoholism status, n (%); values not preserved]
Performance Characteristics of CAGE: 3-4 “yes” Responses
Performance Characteristics of CAGE: 2-4 “yes” Responses
What Affects Sensitivity?
Choice of cutoff value
Quality of administration of test
– Equipment, technique, reagents, questionnaire
Quality of interpretation of test
Spectrum of disease (severity distribution)
– A truncated sample may result from using a test measure to select recipients of the “gold standard” measure
NOT prevalence
What Affects Specificity?
Choice of cutoff value
– Sensitivity-specificity tradeoff
Quality of administration of test
Quality of interpretation of test
Spectrum of non-disease
– Other prevalent diseases may cause false positive values
NOT prevalence
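The sensitivity-specificity tradeoff that comes with the choice of cutoff can be sketched as follows. The scores below are invented toy data (a 0 to 4 count of "yes" responses), and the helper `sens_spec_at_cutoff` is a hypothetical name; nothing here reproduces the CAGE results from the slides.

```python
def sens_spec_at_cutoff(scores_diseased, scores_healthy, cutoff):
    """Call a score >= cutoff test-positive; return (sensitivity, specificity)."""
    true_pos = sum(s >= cutoff for s in scores_diseased)
    true_neg = sum(s < cutoff for s in scores_healthy)
    return true_pos / len(scores_diseased), true_neg / len(scores_healthy)


# Hypothetical scores, for illustration only.
diseased = [4, 4, 3, 3, 3, 2, 2, 2, 1, 0]
healthy = [0, 0, 0, 0, 0, 1, 1, 1, 2, 3]

results = {c: sens_spec_at_cutoff(diseased, healthy, c) for c in (1, 2, 3, 4)}
for cutoff, (sens, spec) in results.items():
    print(f"cutoff >= {cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff can only move patients from test-positive to test-negative, so sensitivity cannot rise and specificity cannot fall as the cutoff increases.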
What Affects Predictive Values?
Sensitivity
Specificity
Prevalence
PV+ = (Sensitivity)(Prevalence) / [(Sens)(Prev) + (1 - Spec)(1 - Prev)]
PV- = (Specificity)(1 - Prevalence) / [(Spec)(1 - Prev) + (1 - Sens)(Prev)]
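A short sketch of the two formulas above shows how predictive values move with prevalence while sensitivity and specificity stay fixed. The function name `predictive_values` and the 80% / 90% figures are assumptions made for illustration only.

```python
def predictive_values(sens, spec, prev):
    """Return (PV+, PV-) from sensitivity, specificity, and prevalence."""
    pv_pos = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    pv_neg = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return pv_pos, pv_neg


# Hypothetical test: sensitivity 80%, specificity 90%, at three prevalences.
table = {prev: predictive_values(0.80, 0.90, prev) for prev in (0.01, 0.10, 0.50)}
for prev, (pv_pos, pv_neg) in table.items():
    print(f"prevalence={prev:.2f}: PV+={pv_pos:.3f}, PV-={pv_neg:.3f}")
```

As prevalence rises, PV+ increases and PV- decreases, even though the test itself has not changed.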
Performance Characteristics of CAGE: High Prevalence of Alcoholism
Performance Characteristics of CAGE: Low Prevalence of Alcoholism
Test Characteristics
Likelihood ratio (positive):
= Sensitivity / (1 - Specificity)
= (TP/Disease +) / (FP/Disease –)
Likelihood of a (true) positive test among patients with disease, relative to the likelihood of a (false) positive test among those without disease
How much more likely are you to find a positive test result in a person with disease than in a person without disease?
Test Characteristics (cont)
Likelihood ratio (positive):
= Sensitivity / (1 - Specificity)
= (TP/Disease +) / (FP/Disease –)
If ODDS = p(event)/[1 - p(event)], then:
Pre-test odds x Likelihood ratio = Post-test odds
Prior odds x Likelihood ratio = Posterior odds
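The odds form of the update can be sketched in a few lines: convert the pre-test probability to odds, multiply by the positive likelihood ratio, and convert back. The function name `post_test_probability` and the input values (10% pre-test probability, 90% sensitivity and specificity) are illustrative assumptions.

```python
def post_test_probability(pre_test_prob, sens, spec):
    """Post-test probability of disease after a positive test, via the odds form."""
    lr_positive = sens / (1 - spec)                  # likelihood ratio (positive)
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * lr_positive     # prior odds x LR = posterior odds
    return post_test_odds / (1 + post_test_odds)     # odds back to probability


# Illustrative inputs only.
posterior = post_test_probability(pre_test_prob=0.10, sens=0.90, spec=0.90)
print(f"post-test probability: {posterior:.2f}")
```

The result equals the positive predictive value computed from the same sensitivity, specificity, and prevalence; the two are algebraically the same calculation.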
PSA Performance (ROC) Curve
[Figure: ROC curves for urologic practice and community screening. Vertical axis: Sensitivity (TP/[TP + FN]); horizontal axis: 1 - Specificity (FP/[TN + FP])]
[Figure: Sensitivity, 1–Sensitivity, Specificity, and 1–Specificity at cutoffs of 2.5, 5.0, and 10.0 ng/ml, with disease stages A through D]
Using Bayes Theorem
Problem: In your study, you are using a diagnostic test of unknown accuracy. A better "gold standard" test is available, but is too expensive or too complicated for you to adopt. How accurate is your classification of patients based on the cheaper test?
Using Bayes Theorem (cont)
Solution, Step 1: Review the literature (or check with your instrument supplier or manufacturer) to ascertain the sensitivity and specificity of the measure in previous studies.
Using Bayes Theorem (cont)
Solution, Step 2: If possible, do your own "validation." This usually involves applying the gold standard to a subset of your sample and comparing the results with those of the cheaper test. A 5–10% subsample may suffice (depending on sample size).
Using Bayes Theorem (cont)
Solution, Step 3: Apply Bayes theorem to calculate the predictive values of positive and negative tests, based on sensitivity, specificity, and prevalence.
Sensitivity = P(Test + | disease)
Specificity = P(Test - | no disease)
Prevalence = Prior probability of disease in your sample
Using Bayes Theorem (cont)
Solution, Step 3 (cont)
PV+ = (Sensitivity)(Prevalence) / [(Sens)(Prev) + (1 - Spec)(1 - Prev)]
PV- = (Specificity)(1 - Prevalence) / [(Spec)(1 - Prev) + (1 - Sens)(Prev)]
Using Bayes Theorem: Example
You are using daily urinary ratios of pregnanediol-3-glucuronide to creatinine, indexed against each patient's baseline value, to identify anovulatory menstrual cycles. The “gold standard” involves serum progesterone determinations, but cannot be applied to a large community-based sample.
Using Bayes Theorem: Example (cont)
Cycles with a low ratio are labeled anovulatory. The test has a sensitivity of 90% and a specificity of 90%. In the real world, where only 5–10% of cycles are anovulatory, how often will you misclassify cycles?
Using Bayes Theorem: Example (cont)
Assuming 10% prevalence: PV+ = (0.9)(0.1)/[(0.9)(0.1) + (0.1)(0.9)] = 0.50
Assuming 5% prevalence: PV+ = (0.9)(0.05)/[(0.9)(0.05) + (0.1)(0.95)] = 0.32
In other words, 50–68% of all cycles labeled as anovulatory will actually be false positives (i.e., ovulatory).
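The same example can be checked by counting expected cycles rather than applying the formula directly. This is a sketch under stated assumptions: the cohort size of 1,000 cycles and the helper name `expected_counts` are hypothetical; the 90% sensitivity, 90% specificity, and 5–10% prevalence come from the example above.

```python
def expected_counts(sens, spec, prev, n=1000):
    """Expected two-by-two table cells (TP, FP, FN, TN) for n cycles."""
    anovulatory = prev * n            # cycles with the condition
    ovulatory = n - anovulatory       # cycles without it
    tp = sens * anovulatory
    fn = anovulatory - tp
    tn = spec * ovulatory
    fp = ovulatory - tn
    return tp, fp, fn, tn


summary = {}
for prev in (0.10, 0.05):
    tp, fp, fn, tn = expected_counts(sens=0.9, spec=0.9, prev=prev)
    pv_pos = tp / (tp + fp)                       # share of positives that are true
    misclassified = (fp + fn) / (tp + fp + fn + tn)  # share of all cycles mislabeled
    summary[prev] = (pv_pos, misclassified)
    print(f"prevalence={prev:.2f}: PV+={pv_pos:.2f}, "
          f"false positives among positives={1 - pv_pos:.2f}, "
          f"overall misclassified={misclassified:.2f}")
```

Under these assumptions about 10% of all cycles are misclassified at either prevalence, yet half or more of the cycles labeled anovulatory are false positives, because ovulatory cycles greatly outnumber anovulatory ones.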
Thank you!