Critically Evaluating the Evidence: diagnosis, prognosis, and screening Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management.

Presentation transcript:

Critically Evaluating the Evidence: diagnosis, prognosis, and screening Elizabeth Crabtree, MPH, PhD (c) Director of Evidence-Based Practice, Quality Management Assistant Professor, Library and Informatics

- 2/3 legal claims against GPs in UK
- 40,000-80,000 US hospital deaths from misdiagnosis per year
- Adverse events, negligence cases, and serious disability are more likely to be related to misdiagnosis than drug errors
- Diagnosis uses <5% of hospital costs, but influences 60% of decision making

What are tests used for?

- Increase certainty about presence/absence of disease
- Disease severity (triage)
- Monitor clinical course
- Assess prognosis - risk/stage within diagnosis
- Plan treatment
- Screening

Appraising articles on diagnosis in 3 easy steps:
- Are the results valid?
- Are the results important?
- Can the results be applied to my patient?

Appraising articles on diagnosis in 3 easy steps:
- Are the results valid?
  - Appropriate spectrum of patients?
  - Does everyone get the gold standard?
  - Is there an independent, blind or objective comparison with the gold standard?
- Are the results important?
- Can the results be applied to my patient?

Appropriate spectrum of patients?
An example: Prospective Validation of the Pediatric Appendicitis Score, Goldman et al., 2008. Who are the patients being screened with the PAS? (Abstract, Methods pg. 279)
- Study includes the full spectrum of manifestation of the illness (e.g., early and late)
- Study includes patients with illnesses commonly included in the differential

Reference standard applied (does everyone get the gold standard)?
An example: Prospective Validation of the Pediatric Appendicitis Score, Goldman et al., 2008. Did all patients receive CT? (Methods Section, pg. 279)
- Investigators often forgo the reference standard when the diagnostic test is negative (may require a period of follow-up with criteria for need for treatment)

Is there an independent, blind or objective comparison to gold standard?
An example: Screening for Urinary Tract Infections in Infants in the Emergency Department: Which Test Is Best, Shaw et al., 1998. What is the diagnostic test and what is the reference (gold) standard? (Abstract objectives, Methods, and Table 1)
- Subjects should have both the diagnostic test in question and the reference standard
- Be vigilant about the reference standard
- The results of one test should not be known when the other is interpreted, so that neither biases the other

Appraising articles on diagnosis in 3 easy steps:
- Are the results valid?
- Are the results important?
  - Sensitivity, Specificity
  - Predictive Values
  - ROC Curves
  - Likelihood Ratios
- Can the results be applied to my patient?

Sensitivity and Specificity
- Sensitivity is the proportion of people who truly have the condition that the test correctly identifies as positive (e.g., percent of sick people correctly identified as having the condition). Ex: If 100 patients known to have a disease are tested, and 43 test positive, then the test has 43% sensitivity.
- Specificity is the proportion of people who truly do not have the condition that the test correctly identifies as negative (e.g., percent of healthy people correctly identified as not having the condition). Ex: If 100 patients with no disease are tested and 96 return a negative result, then the test has 96% specificity.
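The two slide examples can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function names and the split into true/false counts are illustrative assumptions, not part of the original slides.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of people WITH the condition who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of people WITHOUT the condition who test negative."""
    return true_neg / (true_neg + false_pos)


# Slide example 1: 100 patients with the disease, 43 test positive (so 57 are missed).
sens = sensitivity(true_pos=43, false_neg=57)

# Slide example 2: 100 patients without the disease, 96 test negative (so 4 false alarms).
spec = specificity(true_neg=96, false_pos=4)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Note that both quantities are computed within a group whose true disease status is already known, which is why neither depends on how common the disease is.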

ROC Curves
- Shows the tradeoff between sensitivity and specificity
- The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test
- The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test
- The area under the curve is a measure of test accuracy

Area Under ROC Curve (AUC)
- Overall measure of test performance
- Comparisons between two tests based on differences between (estimated) AUC
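One way to make the AUC concrete: it equals the probability that a randomly chosen patient with the disease scores higher on the test than a randomly chosen patient without it. The sketch below computes the AUC by counting such pairs. The scores are hypothetical, and the helper name `auc` is an assumption for illustration only.

```python
def auc(diseased_scores, healthy_scores):
    """AUC as the fraction of (diseased, healthy) pairs ranked correctly.

    A pair counts 1 if the diseased patient scores higher, 0.5 on a tie.
    """
    wins = 0.0
    for d in diseased_scores:
        for h in healthy_scores:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased_scores) * len(healthy_scores))


# Hypothetical test scores, chosen only to show the two extremes.
perfect = auc([7, 8, 9], [1, 2, 3])   # distributions do not overlap
chance = auc([1, 2, 3], [1, 2, 3])    # distributions overlap completely

print(perfect, chance)
```

With non-overlapping scores every pair is ranked correctly, and with identical score distributions the test ranks pairs no better than a coin flip.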

ROC Curve Extremes
[Figure: two ROC plots, True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%)]
- Best test: the distributions don't overlap at all
- Worst test: the distributions overlap completely

AUC for ROC Curves
[Figure: four ROC plots, True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%), with AUC = 100%, AUC = 90%, AUC = 65%, and AUC = 50%]

An example: Prospective Validation of the Pediatric Appendicitis Score, Goldman et al., 2008 What is the total area under the ROC curve for the PAS? (Results pg. 279, Figure 1)

Patients and clinicians have a different question…
Positive and Negative Predictive Values
- Positive predictive value is the probability that a patient with a positive test result really does have the condition for which the test was conducted.
- Negative predictive value is the probability that a patient with a negative test result really is free of the condition for which the test was conducted.
- Predictive values give a direct assessment of the usefulness of the test in practice, but they are influenced by the prevalence of disease in the population that is being tested.
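The dependence on prevalence is easiest to see numerically. The sketch below applies Bayes' rule to a test with fixed sensitivity and specificity at two different prevalences. The 90%/90% test characteristics and both prevalence values are hypothetical, picked only for illustration.

```python
def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a test applied in a population with the given prevalence."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    true_neg = spec * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test (90% sensitive, 90% specific) in two settings.
ppv_low, npv_low = predictive_values(0.90, 0.90, prevalence=0.01)    # screening-like setting
ppv_high, npv_high = predictive_values(0.90, 0.90, prevalence=0.30)  # referral-like setting

print(f"prevalence  1%: PPV = {ppv_low:.1%}, NPV = {npv_low:.1%}")
print(f"prevalence 30%: PPV = {ppv_high:.1%}, NPV = {npv_high:.1%}")
```

The test itself is unchanged between the two calls. Only the population differs, and the positive predictive value moves from under one in ten to roughly four in five.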

Pre- and Post-Test Probability: a solution to the deficiencies of sensitivity/specificity and predictive values?

Positive Likelihood Ratio
- The probability of an individual with the condition having a positive test, divided by the probability of an individual without the condition having a positive test
- A helpful test will have a large LR positive

Negative Likelihood Ratio
- The probability of an individual with the condition having a negative test, divided by the probability of an individual without the condition having a negative test
- A helpful test will have a small LR negative

- LR < 0.1 = strong negative test result
- LR = 1 = no diagnostic value
- LR > 10 = strong positive test result
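Both likelihood ratios follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below reuses the 43% sensitivity and 96% specificity figures from the earlier slide examples. Treating them as belonging to one and the same test is an assumption made here for illustration.

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-) computed from sensitivity and specificity."""
    lr_pos = sens / (1 - spec)        # P(test+ | disease) / P(test+ | no disease)
    lr_neg = (1 - sens) / spec        # P(test- | disease) / P(test- | no disease)
    return lr_pos, lr_neg


# Assumed test: 43% sensitive, 96% specific.
lr_pos, lr_neg = likelihood_ratios(sens=0.43, spec=0.96)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Under these assumed numbers a positive result is strong evidence (LR+ above 10), while a negative result is weak evidence (LR- well above 0.1), which is the usual pattern for a specific but insensitive test.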

Likelihood Nomogram
Appendicitis: McBurney tenderness, LR+ = 3.4
Pre-test 5%, post-test 20%?
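The nomogram is a graphical shortcut for a three-step calculation: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back to a probability. The sketch below runs that calculation on the slide's numbers (pre-test 5%, LR+ = 3.4). The function name is an illustrative assumption.

```python
def post_test_probability(pre_test_prob, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio          # update with the test result
    return post_odds / (1 + post_odds)               # odds -> probability


# Slide example: appendicitis, McBurney tenderness present.
post = post_test_probability(pre_test_prob=0.05, likelihood_ratio=3.4)

print(f"post-test probability = {post:.1%}")
```

The exact arithmetic gives a figure close to 15%, a little below the value of about 20% read off the nomogram on the slide. Readings taken from a nomogram are approximate.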

Table: Likelihood Ratio and Approximate Change in Probability (%)*
- Values between 0 and 1 decrease the probability of disease
- Values greater than 1 increase the probability of disease
From: J Gen Intern Med August; 17(8): 647–650. doi: /j x

Appraising articles on diagnosis in 3 easy steps:
- Are the results valid?
- Are the results important?
- Can the results be applied to my patient?
  - Can I do the test in my setting?
  - Do results apply to the mix of patients I see?
  - Will the result change my management?
  - Costs to patient/health service?

Why is understanding prognosis important?

Appraising articles on prognosis in 3 easy steps:
- Are the results valid?
  - Sample of patients assembled at a common point in the course of their disease?
  - Follow-up sufficient and complete?
  - Outcome criteria objective?
  - Adjustment for important prognostic factors?
- Are the results important?
- Can the results be applied to my patient?

Sample defined at common point?
An example: Risk of epilepsy after febrile convulsions: a national cohort study, Verity & Golding, 1991. What kind of study is this? Was a defined, representative sample of patients assembled at a common point in the course of their disease? (Abstract)
- Cohort studies: the best design, because patients with the disease are followed over time
- Case-control studies: of limited value, because the strength of inference is weaker

Follow-up sufficient and complete?
An example: Risk of epilepsy after febrile convulsions: a national cohort study, Verity & Golding, 1991. How long were patients followed? Was this long enough to determine the outcome in question? (Abstract)

Outcome criteria objective?
An example: Risk of epilepsy after febrile convulsions: a national cohort study, Verity & Golding, 1991. What were the criteria applied for ascertaining the outcome? (Abstract)

Adjustment for important prognostic factors?
An example: Risk of epilepsy after febrile convulsions: a national cohort study, Verity & Golding, 1991. Were there subgroups to follow? (Abstract & Results)

Appraising articles on prognosis in 3 easy steps:
- Are the results valid?
- Are the results important?
  - What is the risk of the outcome over time?
  - How precise are the estimates?
- Can the results be applied to my patient?

What is the risk of the outcome over time?
- Relative Risk
- Odds Ratios
- Survival Curves
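Relative risk and odds ratio are both computed from the same 2x2 table of outcome by prognostic factor, but they answer slightly different questions. The sketch below uses entirely hypothetical counts (not taken from any study cited in these slides) to show the two calculations side by side.

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk of the outcome in the exposed group divided by risk in the unexposed group."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Odds of the outcome in the exposed group divided by odds in the unexposed group."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical cohort: 30 of 100 patients with the prognostic factor develop the
# outcome, compared with 10 of 100 patients without it.
rr = relative_risk(30, 100, 10, 100)
odds = odds_ratio(30, 100, 10, 100)

print(f"relative risk = {rr:.2f}, odds ratio = {odds:.2f}")
```

When the outcome is common, as in this made-up example, the odds ratio sits noticeably further from 1 than the relative risk. The two converge only when the outcome is rare.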

Survival Curves
[Figure: Post Transplant Survival for Patients with CF at Two Time Periods. Liou et al. Am J Respir Crit Care Med 2005. Copyright © 2009 American Thoracic Society]

How precise are the estimates?
Confidence Intervals
- Around a rate
- Gives the reader a sense of precision
- It represents the range within which the true value is expected to lie, judged by what would happen if the study were repeated many times
- Ex. With a 95% CI, about 95 out of 100 repeated studies would produce an interval that contains the true value
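As a concrete illustration of a confidence interval around a rate, the sketch below computes a 95% interval for an observed proportion using the Wilson score method. The choice of the Wilson method and the counts (20 events among 200 patients) are assumptions for illustration. The slides do not specify a method or these data.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score confidence interval for a proportion (z = 1.96 gives about 95%)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical cohort: 20 of 200 patients develop the outcome (observed rate 10%).
low, high = wilson_interval(events=20, n=200)
# Same observed rate in a cohort ten times larger: the interval narrows.
low_big, high_big = wilson_interval(events=200, n=2000)

print(f"n = 200:  {low:.1%} to {high:.1%}")
print(f"n = 2000: {low_big:.1%} to {high_big:.1%}")
```

The observed rate is the same in both calls. The larger cohort simply pins it down more tightly, which is what a narrower interval conveys to the reader.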

Appraising articles on prognosis in 3 easy steps:
- Are the results valid?
- Are the results important?
- Can the results be applied to my patient?
  - Is my patient so different to those in the study that the results cannot apply?
  - Will this evidence make a clinically important impact on my conclusions about what to offer my patients?

Screening
The process of deciding whether to screen follows these steps:
1) Is the prevalence of the disease high enough in the target population to warrant the time and expense of screening?
2) Does a therapy exist which will significantly reduce the risk of disease?
3) If not, will early screening affect the duration/severity of the disease?
4) Is the screening test itself sufficiently sensitive to catch the disease so that treatment may progress?

- Feeling confident in making the diagnosis and understanding prognosis helps determine whether to proceed with therapy
- When appraising articles, always consider validity, importance, and applicability
- Knowledge translation in this setting is the interpretation and integration of appraised and accepted evidence into clinical practice recommendations