Reference standard Diagnosis: the pathway of a diagnostic test


Presentation transcript:

Diagnosis: the pathway of a diagnostic test. From bench to bedside. Reference standard. Gennaro D’Amico, UOC Gastroenterologia, Ospedale V. Cervello, Palermo. gedamico@libero.it

Terminology. In diagnosis research, the Reference Standard (RS) is the procedure (or test) used to define the true state of the patient (disease vs no disease). A major aim of diagnosis research is to find new diagnostic tests (Index Tests, IT) that are less invasive and less expensive than the RS.

The index test (IT). [Diagram: the test under study classifies subjects as Positive or Negative for the disease under hypothesis.]

The question underlying test accuracy research. Are the patients who test Positive truly with disease (True Positive), and are the patients who test Negative truly free of disease (True Negative)? To answer, a verification test is needed: the RS.

The ideal (perfect) reference (gold) standard RS. “An ideal RS, in an optimal diagnostic accuracy study, would fulfill the following criteria: (1) The RS provides error-free classification of all subjects. (2) The same RS is used to verify all IT results. (3) The IT and RS can be performed within a short interval to avoid changes in target condition status.” Reitsma JB. J Clin Epidemiol 2009; 62: 787-806

Verification of the accuracy of a new test

                                Reference standard
                                Positive (disease)    Negative (no disease)
Test under study    Positive    true positive         false positive
                    Negative    false negative        true negative
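
The 2x2 table above is what sensitivity and specificity are computed from. The following is a minimal sketch, not part of the original slides; the function name and the counts are hypothetical and chosen only for illustration.

```python
def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity of the index test, taking the RS as truth.

    tp, fp, fn, tn are the four cell counts of the verification table.
    """
    sensitivity = tp / (tp + fn)  # share of RS-positive subjects the IT calls positive
    specificity = tn / (tn + fp)  # share of RS-negative subjects the IT calls negative
    return sensitivity, specificity


# Hypothetical counts: 100 RS-positive and 200 RS-negative subjects.
sens, spec = accuracy_from_2x2(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both figures are only as trustworthy as the RS that defines the columns of the table.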

Imperfect RS. A perfect gold standard with 100% sensitivity and specificity is exceptional. The clinical condition to which the IT is to be applied may hamper the application of the (same) RS in all the subjects. The time interval from IT to RS may be too long. A satisfactory RS may not be available. For these reasons the term reference standard is preferred to gold standard.

Problems related to imperfect RS
Problem with RS: sensitivity and specificity <100%. Consequence / bias: imprecise IT accuracy estimation. Possible error / limitation: under- or over-estimation of accuracy.
Problem with RS: RS is invasive and not performed in all negative IT. Consequence / bias: partial verification bias. Possible error / limitation: over-estimation of sensitivity.
Problem with RS: the RS is not independent of the IT and vice versa. Consequence / bias: diagnostic review bias, test review bias, incorporation bias. Possible error / limitation: over-estimation of sensitivity, unclear effect on specificity.
Problem with RS: time interval between IT and RS is too long. Consequence / bias: verification bias. Possible error / limitation: diagnosis may change over time.
Problem with RS: satisfactory RS is not available. Consequence / bias: new definition needed, usually complex. Possible error / limitation: reproducibility of IT accuracy.
Reitsma JB. J Clin Epidemiol 2009; 62: 787-806. Whiting P. Ann Int Med 2004;140:189-202

Different RS: diagnosis of depression. IT: TRH stimulation test for depression (TSH < 7 µIU/ml after IV infusion of 500 µg TRH). Ten sensitivity studies used two different RSs, based on different validated questionnaires: Diagnostic and Statistical Manual III criteria (DSM III), American Psychiatric Association (APA, 1980); Research Diagnostic Criteria (RDC) (Spitzer, 1978). Arana GW. Biol Psych 1990;28:733-737

Different RSs. [Figure: sensitivity (%) of TRH-ST for depression, by reference standard (DSM vs RDC).] Applying different standard procedures for different patients may yield an inconsistent reference for the IT, as each of the «standards» will have its own error rate.

Inappropriate RS: sensitivity of exercise scintigraphy for coronary disease. Planar vs TC; coronarography as RS. [Figure: Planar vs TC.] Detrano R. Arch Int Med 1988;148:1289-1295

Trade-off sensitivity/specificity (courtesy of Dr Mirella Fraquelli). Imperfect RS vs true disease status, examples: broad histopathologic criteria to diagnose colon dysplasia; liver biopsy in diagnosing hepatic fibrosis. Depending on whether the RS falls short in specificity or in sensitivity, the effect on the IT is: underestimates specificity; underestimates sensitivity.
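
How an imperfect RS distorts the measured accuracy of the IT can be worked out with a short calculation. This is a sketch, not taken from the slides: it assumes that IT and RS errors are independent once the true disease status is known, and all numbers (prevalence, accuracies) are hypothetical.

```python
def apparent_accuracy(prev, se_it, sp_it, se_rs, sp_rs):
    """Apparent IT sensitivity/specificity when judged against an imperfect RS.

    Assumption (not from the source): IT and RS errors are conditionally
    independent given the true disease status.
    """
    # Probability that IT and RS are both positive, and that RS is positive.
    both_pos = prev * se_it * se_rs + (1 - prev) * (1 - sp_it) * (1 - sp_rs)
    rs_pos = prev * se_rs + (1 - prev) * (1 - sp_rs)
    # Probability that IT and RS are both negative, and that RS is negative.
    both_neg = prev * (1 - se_it) * (1 - se_rs) + (1 - prev) * sp_it * sp_rs
    rs_neg = prev * (1 - se_rs) + (1 - prev) * sp_rs
    return both_pos / rs_pos, both_neg / rs_neg


# Hypothetical IT with true sensitivity 0.90 and specificity 0.90, prevalence 0.30.
# RS that misses disease (sensitivity 0.80, specificity 1.00):
print(apparent_accuracy(0.30, 0.90, 0.90, se_rs=0.80, sp_rs=1.00))
# RS that over-calls disease (sensitivity 1.00, specificity 0.80):
print(apparent_accuracy(0.30, 0.90, 0.90, se_rs=1.00, sp_rs=0.80))
```

Under these assumptions, an RS that misses disease lowers the apparent specificity of the IT, and an RS that over-calls disease lowers its apparent sensitivity. If IT and RS errors are correlated, the direction of the distortion can differ.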

Verification bias. The RS is not performed in all subjects, or different RSs are used. Partial verification: the RS is performed on test-positives, but not on test-negatives. Differential verification: the RS used for test-positives is different from that used for test-negatives.
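
The effect of partial verification can be illustrated with a small calculation. This is a sketch under stated assumptions, not part of the original slides: all test-positives are verified, only a fraction of test-negatives are, and that fraction is chosen at random with respect to true disease status. The counts are hypothetical.

```python
def observed_accuracy(tp, fp, fn, tn, frac_neg_verified):
    """Accuracy computed only from verified subjects.

    Assumptions (illustrative): every test-positive gets the RS; a random
    fraction `frac_neg_verified` of test-negatives gets the RS.
    """
    fn_seen = fn * frac_neg_verified  # verified false negatives
    tn_seen = tn * frac_neg_verified  # verified true negatives
    sensitivity = tp / (tp + fn_seen)
    specificity = tn_seen / (tn_seen + fp)
    return sensitivity, specificity


# Hypothetical full table: true sensitivity 0.80, true specificity 0.85.
full = observed_accuracy(tp=80, fp=30, fn=20, tn=170, frac_neg_verified=1.0)
partial = observed_accuracy(tp=80, fp=30, fn=20, tn=170, frac_neg_verified=0.25)
print("all verified:      ", full)
print("25% of negatives:  ", partial)
```

With only a quarter of test-negatives verified, the observed sensitivity rises above its true value while the observed specificity falls, because false negatives and true negatives are both under-counted.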

Prospective Investigation of Pulmonary Embolism Diagnosis (PIOPED). The diagnostic accuracy of the ventilation-perfusion (V/Q) scan was assessed by angiography. Angiography was more commonly done in patients with a higher probability of PE based on V/Q scan results: partial verification bias. JAMA 1990;263:2753-59

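Independent assessment of RS. Diagnostic review bias: the reference standard is read unblinded to the result of the test under study. Test review bias: the test under study is read unblinded to the result of the reference standard. [Diagram: blind vs unblind reading of the test under study and of the RS, with the 2x2 table of TP, FP, FN, TN.]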

Incorporation bias. The test that is being evaluated is included in the RS. It can lead to overestimation of test accuracy. It can occur if the final diagnosis is made on the basis of all clinical data (which might include the IT). Examples: PCR for tuberculosis, Mantoux for TB among children, screening for depression.

Possible solutions for imperfect or missing RS
Solution: composite RS. Description: two or more tests combined by a prespecified rule. Limitation: residual misclassification.
Solution: expert panel. Description: all the relevant information for each patient, possibly including follow-up. Limitation: disagreement.
Solution: latent class analysis. Description: statistical model providing probabilities of presence or absence of the disease. Limitation: no clinical definition.
Solution: validation. Description: comparing the IT with different items related to the disease or its severity. Limitation: relevance of the considered items.
Solution: delayed verification. Description: short follow-up to obtain delayed cross-sectional information; RCT. Limitation: disease status may change with time.
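
A composite RS combines component tests by a rule fixed before the study starts. The sketch below is illustrative only and not from the slides: the function name, the two rules ("any" and "all") and the patient data are hypothetical.

```python
def composite_rs(component_results, rule="any"):
    """Classify one subject from several component tests.

    rule="any": disease if at least one component is positive.
    rule="all": disease only if every component is positive.
    The rule must be prespecified, not chosen after seeing the IT results.
    """
    if rule == "any":
        return any(component_results)
    if rule == "all":
        return all(component_results)
    raise ValueError(f"unknown rule: {rule}")


# Hypothetical subjects, each with results of two component tests.
subjects = [(True, True), (True, False), (False, False)]
for results in subjects:
    print(results, "->", composite_rs(results, rule="any"))
```

An "any" rule favours sensitivity of the composite and an "all" rule favours specificity, so some misclassification remains either way.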

Solving problems of imperfect RS: the delayed-type cross-sectional study. The RS is invasive and may not be performed in all subjects, or different standards have to be used. The disease state of included subjects may be verified after a predefined follow-up. Knottnerus JA. J Clin Epidemiol 2003;56:1118-1128

Prospective Investigation of Pulmonary Embolism Diagnosis (PIOPED). The diagnostic accuracy of the ventilation-perfusion (V/Q) scan was assessed by angiography. Angiography was more commonly done in patients with a higher probability of PE based on V/Q scan results. A 1-year follow-up confirmed the diagnosis in subjects not undergoing angiography. JAMA 1990;263:2753-59

Performance of the Wells score in patients with suspected pulmonary embolism during hospitalization: a delayed-type cross-sectional study in a community hospital. Index test in patients with suspected PE: Wells score (>4: PE likely; ≤4: PE unlikely). Verification: CT angiography, V/Q pulmonary scintiscan, angiography, leg venous US showing DVT. Posadas-Martinez ML. Thromb Res 2014;133:177–181

Performance of the Wells score in patients with suspected pulmonary embolism during hospitalization: a delayed-type cross-sectional study in a community hospital. [Flow diagram: 613 patients with suspected PE; no PE: 90 days of follow-up (+1 PE).] Posadas-Martinez ML. Thromb Res 2014;133:177–181

Validation of IT in the absence of a satisfactory RS: interferon gamma levels for latent tuberculosis. The better test would be the one more strongly correlated with exposure. E = IFN-γ; T = tuberculin skin test. Ewer K. Lancet 2003;361:1168-73

Conclusions. The RS is the diagnostic instrument used to verify the accuracy of the results of a new test. The same RS should independently, or even blindly, verify all the new test results and should be correctly performed. Since the RS is almost always imperfect, it should be appropriate for the new test under study. An imperfect RS, or inappropriate use of a satisfactory RS, may lead to incorrect conclusions on the IT accuracy. Composite RS, expert panel judgement, cross-sectional delayed verification and other validation methods may help to overcome the lack of a satisfactory RS.
