Evaluation of Diagnostic Tests


Evaluation of Diagnostic Tests Presenter: Akash Ranjan Moderator: Dr Chetna Maliye

Framework: Introduction; Determining a Useful Diagnostic Test; Evaluation of a Diagnostic Test; the Gold Standard; Measures of Diagnostic Accuracy; the ROC Curve; Multiple Testing; Reliability of a Test; the Relationship between Reliability and Validity; References

Correctly Classifying Individuals by Disease Status Tests are used in medical diagnosis, screening, and research to classify subjects into diseased and non-diseased groups. Ideally, all subjects who have the disease should be classified as "having the disease," and all subjects without it as "disease-free."

Diagnostic Test and Screening Test A diagnostic test is used to determine the presence or absence of a disease when a subject shows signs or symptoms of the disease. A screening test identifies asymptomatic individuals who may have the disease; the diagnostic test is then performed after a positive screening test to establish a definitive diagnosis. Examples of screening tests: Pap smear, fasting and postprandial blood sugar for diabetes mellitus, blood pressure for hypertension, mammography for breast cancer, fasting blood cholesterol for heart disease, ocular pressure for glaucoma.

Useful Diagnostic Test A useful diagnostic test is judged on: reproducibility, accuracy, feasibility, its effects on clinical decisions, and its effects on outcomes.

Evaluation of a Diagnostic Test Evaluation asks whether a test can classify individuals into the correct disease status in a reliable manner, and helps in making decisions about the test's use and interpretation. This is done by determining validity (internal validity and external validity) and reliability.

Simplify Data Many test results are measured on a nominal, ordinal, or continuous scale. These complex data are reduced to a simple dichotomy: Present/Absent, Abnormal/Normal, Diseased/Well.

Distribution of Systolic Blood Pressures: Males, Ages 40–64

Gold Standard The accuracy of a test is established by independent comparison with a "gold standard" applied to all people with the disease and all people without it. Ideally, the gold standard is a 100% accurate test; in practice, tests whose sensitivity and specificity approach 100% are accepted. Examples: histopathology, cytopathology, radiologic contrast procedures, prolonged follow-up, autopsy.

Measures of Diagnostic Accuracy Disease status according to the gold standard test is cross-tabulated against the result of the index test:

              Disease +             Disease −
Index Test +  a (True Positives)    b (False Positives)
Index Test −  c (False Negatives)   d (True Negatives)

Sensitivity Sensitivity = a / (a + c): the proportion of people with the disease who have a positive test result. A sensitive test will rarely miss people with the disease. It is used when there is an important penalty for missing the disease, e.g., cervical cancer, breast cancer, HIV.

Specificity Specificity = d / (b + d): the proportion of people without the disease who have a negative test result. A specific test is useful to confirm ("rule in") a diagnosis, and is preferred when false-positive results can harm patients physically or financially, e.g., before cancer chemotherapy. For a prevalent disease like diabetes mellitus, for which treatment does not markedly alter outcome, false positives should also be limited; otherwise the health system will be overburdened by the diagnostic demands of the positives.
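The two definitions above can be sketched directly from the cells of the 2×2 table. This is an illustrative example, not from the slides; the counts are hypothetical, chosen only to show the arithmetic.

```python
def sensitivity(a, c):
    """Proportion of diseased people with a positive test: a / (a + c)."""
    return a / (a + c)

def specificity(b, d):
    """Proportion of non-diseased people with a negative test: d / (b + d)."""
    return d / (b + d)

# Hypothetical 2x2 table: a = TP, b = FP, c = FN, d = TN
a, b, c, d = 90, 30, 10, 170

print(f"Sensitivity = {sensitivity(a, c):.0%}")  # 90 / 100  = 90%
print(f"Specificity = {specificity(b, d):.0%}")  # 170 / 200 = 85%
```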

Factors Affecting Estimates of Sensitivity and Specificity Spectrum of patients: a test may not discriminate when the differences between patients are subtle. Bias: sensitivity and specificity should be assessed independently; the test result should not be part of the information used in making the diagnosis (e.g., a chest x-ray read with knowledge of the working diagnosis). Chance: a small sample size gives imprecise estimates, so confidence intervals should be reported.

Trade-off between Sensitivity and Specificity It is desirable to have a test that is both highly sensitive and highly specific, but unfortunately this is usually not possible. When test results are expressed on a continuous scale, sensitivity can be increased only at the expense of specificity, and vice versa: the chosen cut-off determines the balance. Example: the trade-off between sensitivity (%) and specificity (%) when diagnosing diabetes with blood sugar measured after an 8-hour fast, where lowering the cut-off raises sensitivity and lowers specificity.

ROC Curve

ROC Curve An ROC curve is constructed by plotting sensitivity against the false-positive rate (1 − Sp) over a range of cut-off values. It shows how severe the trade-off between sensitivity and specificity is, and helps decide where the best cut-off point should be. Tests that discriminate well crowd toward the upper left corner of the plot; tests that perform less well fall closer to the diagonal running from lower left to upper right. The best cut-off is generally near the "shoulder" of the ROC curve, unless there are clinical reasons for minimizing either false negatives or false positives: for well-discriminating tests, as sensitivity increases (lower cut-off point) there is little or no loss in specificity until very high levels of sensitivity are reached.

ROC Curve The accuracy of a test can be summarized by the area under its ROC curve: the larger the area, the better the test. This is useful for comparing alternative tests for the same diagnosis. Example: the CAGE questionnaire is both more sensitive and more specific than the MAST (Michigan Alcoholism Screening Test) and encloses a much larger area under its curve.
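The construction described above can be sketched in a few lines: sweep a cut-off over a continuous test result, record (false-positive rate, sensitivity) at each cut-off, and estimate the area under the curve by the trapezoidal rule. This is a minimal illustration, not from the slides; the (value, disease status) data are hypothetical.

```python
# Hypothetical (test value, true disease status) pairs; status 1 = diseased
results = [(1, 0), (2, 0), (3, 0), (3, 1), (4, 0), (5, 1), (6, 1), (7, 1)]

def roc_points(results):
    """Return sorted (FPR, sensitivity) pairs over all possible cut-offs."""
    values = sorted({v for v, _ in results})
    n_pos = sum(1 for _, s in results if s == 1)
    n_neg = len(results) - n_pos
    points = [(1.0, 1.0)]  # cut-off below all values: everyone tests positive
    for cut in values:      # "positive" means value >= cut
        tp = sum(1 for v, s in results if v >= cut and s == 1)
        fp = sum(1 for v, s in results if v >= cut and s == 0)
        points.append((fp / n_neg, tp / n_pos))
    points.append((0.0, 0.0))  # cut-off above all values: everyone negative
    return sorted(points)

def auc(points):
    """Trapezoidal area under the (FPR, sensitivity) curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

pts = roc_points(results)
print(auc(pts))  # prints 0.90625 for the data above
```

A test with no discriminating power would give an area near 0.5 (the diagonal); a perfect test gives 1.0.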

Predictive Accuracy ("the clinician's dilemma") For the clinician, the dilemma is to determine whether or not the patient has the disease, given the result of a test: what is the probability of disease when the test is positive, and of no disease when the test is negative? Positive predictive value (PPV), also called post-test probability or posterior probability: the probability of disease in a patient with a positive test result. PPV = a / (a + b). It reflects the diagnostic power of a test, depends on sensitivity and specificity, and is directly proportional to disease prevalence in the population.

Predictive Accuracy PPV = (Sensitivity × Prevalence) / [(Sensitivity × Prevalence) + (1 − Specificity) × (1 − Prevalence)]
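The formula above (and its counterpart for NPV) can be sketched to show how strongly the predictive values depend on prevalence. This is an illustrative example, not from the slides; the sensitivity and specificity values are hypothetical.

```python
def ppv(sn, sp, prev):
    """PPV = Sn*P / (Sn*P + (1 - Sp)*(1 - P))"""
    return (sn * prev) / (sn * prev + (1 - sp) * (1 - prev))

def npv(sn, sp, prev):
    """NPV = Sp*(1 - P) / (Sp*(1 - P) + (1 - Sn)*P)"""
    return (sp * (1 - prev)) / (sp * (1 - prev) + (1 - sn) * prev)

sn, sp = 0.95, 0.95  # hypothetical test
for prev in (0.01, 0.10, 0.50):
    print(f"prevalence {prev:>4.0%}: PPV = {ppv(sn, sp, prev):.1%}, "
          f"NPV = {npv(sn, sp, prev):.1%}")
```

Even with a 95%-sensitive, 95%-specific test, the PPV is only about 16% at 1% prevalence, which is why a positive screening result in a low-prevalence population is usually followed by a confirmatory diagnostic test.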

Predictive Accuracy

Predictive Accuracy Negative predictive value (NPV): the probability that a patient with a negative test result does not have the disease. NPV = d / (c + d). It reflects the diagnostic power of the test, depends on sensitivity and specificity, and is inversely proportional to disease prevalence in the population.

Likelihood Ratios Positive likelihood ratio (LR+): the proportion of diseased people with a positive test result (Sn) divided by the proportion of non-diseased people with a positive test result (1 − Sp): LR+ = Sn / (1 − Sp). Negative likelihood ratio (LR−): the proportion of diseased people with a negative test result (1 − Sn) divided by the proportion of non-diseased people with a negative test result (Sp): LR− = (1 − Sn) / Sp.

Likelihood Ratios Example: with LR+ = 2.6, a positive test is about 2.6 times more likely to be found in the presence of DVT (deep vein thrombosis) than in its absence. Advantages of LRs: they do not change with changes in prevalence; they can be used at multiple levels of test results; and they describe the overall odds of disease when a series of diagnostic tests is used.

Likelihood Ratios LRs can be applied by two techniques: a mathematical approach or a likelihood ratio nomogram. Mathematical approach, using the DVT example:

         Disease +   Disease −
Test +   34          168
Test −   1           282

Sn = 97%, Sp = 63%, Prevalence = 7%, PPV = 17%, NPV = 100%, LR+ = 2.6, LR− = 0.05
Step 1: Convert pretest probability to pretest odds: odds = 0.075
Step 2: Post-test odds = pretest odds × LR+ = 0.075 × 2.6 = 0.195
Step 3: Convert post-test odds back to post-test probability: P = 0.195 / (1 + 0.195) = 16%
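The three steps of the mathematical approach can be reproduced in code from the same 2×2 counts. This is a sketch of the slide's worked example; with unrounded intermediate values the post-test probability comes out at about 17% (exactly the PPV, a/(a+b)), while the slide's rounded pretest odds of 0.075 give 16%.

```python
# 2x2 counts from the DVT example: a = TP, b = FP, c = FN, d = TN
a, b, c, d = 34, 168, 1, 282

sn = a / (a + c)                          # sensitivity, ~0.97
sp = d / (b + d)                          # specificity, ~0.63
prevalence = (a + c) / (a + b + c + d)    # pretest probability, ~0.07
lr_pos = sn / (1 - sp)                    # positive likelihood ratio, ~2.6

# Step 1: pretest probability -> pretest odds
pretest_odds = prevalence / (1 - prevalence)
# Step 2: apply the likelihood ratio for a positive result
posttest_odds = pretest_odds * lr_pos
# Step 3: post-test odds -> post-test probability
posttest_prob = posttest_odds / (1 + posttest_odds)

print(round(posttest_prob, 2))  # prints 0.17, i.e. the PPV 34/202
```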

Likelihood Ratios Using a likelihood ratio nomogram

Multiple Tests A single test frequently yields a probability of disease that is neither very high nor very low, and it is usually not acceptable to stop the diagnostic process at that point. Multiple tests help the physician raise or lower the probability of disease in such situations. They are applied in two basic ways: parallel testing (all at once) and serial testing (consecutively).

Multiple Tests Parallel testing (all at once): a positive result on any test is considered evidence for disease, so the net effect is a more sensitive diagnostic strategy. It is used when rapid assessment is needed (e.g., hospitalized or emergency patients) or when a very sensitive strategy is required. Serial testing (consecutive): the decision to order the next test in the series is based on the result of the previous test, and all tests must be positive for the diagnosis to be made, because the process stops with a negative result. This maximizes specificity and PPV but lowers sensitivity and NPV. Serial testing is used when rapid assessment is not required and when tests are expensive or risky; such tests are used only after simpler or safer tests suggest the presence of disease. Example: maternal age and blood tests (AFP, chorionic gonadotropin, and estriol) are used to identify pregnancies at higher risk of delivering a baby with Down syndrome; mothers found to be at higher risk by these tests are then offered amniocentesis.

Multiple Tests

Reliability of a Test Reliability (repeatability) means the test gives the same result on repeated application. Regardless of the sensitivity and specificity of a test, if its result cannot be reproduced, the value and usefulness of the test are minimal. Factors contributing to variation between test results: intra-subject variation (within individual subjects), intra-observer variation (within the same observer), and inter-observer variation (between those reading the test result).

Reliability of a Test Intra-subject variation: in evaluating any test result, it is important to consider the conditions under which the test was performed, including the time of day.

Table: Variation in blood pressure readings during a 24-hour period

Blood Pressure (mmHg)   Female, 27 yr   Female, 62 yr   Male, 33 yr
Basal                   110/70          132/82          152/109
Lowest hour             86/47           102/61          123/78
Highest hour            126/79          172/94          153/107
Casual                  108/64          155/93          157/109

Reliability of a Test Intra-observer variation is variation between two observations made by the same observer. For example, a radiologist who reads the same group of x-rays at two different times may read one or more of them differently the second time. Tests and examinations differ in the degree to which subjective factors enter into the observer's conclusion: the greater the subjective element in the reading, the greater the intra-observer variation is likely to be.

Reliability of a Test Inter-observer variation is variation between observers; it measures the extent to which observers agree or disagree in quantitative terms. Kappa statistic (kappa measure of agreement): the difference between observed and expected agreement, expressed as a fraction of the maximum possible difference. Since the maximum value of the observed agreement (Po) is 1, κ = (Po − Pe) / (1 − Pe), where Po is the observed agreement and Pe is the agreement expected by chance.
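The kappa formula above can be sketched for the common case of two observers classifying the same subjects as positive or negative. This is an illustrative example, not from the slides; the 2×2 agreement counts are hypothetical.

```python
def kappa(n11, n10, n01, n00):
    """Cohen's kappa from a 2x2 inter-observer agreement table.

    n11: both observers positive; n00: both negative;
    n10 and n01: the two kinds of disagreement.
    """
    n = n11 + n10 + n01 + n00
    observed = (n11 + n00) / n  # Po: proportion of subjects agreed on
    # Pe: agreement expected if the observers rated independently,
    # each at their own marginal positive rate
    p1_pos = (n11 + n10) / n    # observer 1's positive rate
    p2_pos = (n11 + n01) / n    # observer 2's positive rate
    expected = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    return (observed - expected) / (1 - expected)

# Hypothetical table: 80% raw agreement, but 50% expected by chance
print(round(kappa(40, 10, 10, 40), 3))  # prints 0.6
```

Note that the 80% raw agreement shrinks to κ = 0.6 once chance agreement is removed, which is exactly why kappa, rather than percent agreement, is reported.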

Relationship between Validity and Reliability Reliability (repeatability): the test gives the same result on repeated application. Validity: the test measures what it is intended to measure.

Comparison of Reliability and Validity Using Graphical Presentation If the distribution of test results forms a broad base centered on the true value, the results can be described as valid; however, they are valid only for a group, because they merely tend to cluster around the true value. It is important to remember that what is valid for a group or a population may not be so for an individual in a clinical setting. When the reliability of a test is poor, the validity of the test for a given individual will also be poor.

References Beaglehole R, Bonita R, Kjellstrom T. Basic Epidemiology. Geneva: World Health Organization; 1993. Fletcher RH, Fletcher SW. Clinical Epidemiology: The Essentials. 3rd ed. Baltimore: Williams & Wilkins; 1996. p. 35-56. Gordis L. Epidemiology. Pennsylvania: Elsevier Saunders; 2004. p. 71-94. Armitage P, Berry G. Statistical Methods in Medical Research. 3rd ed. London: Blackwell Scientific Publications; 1994. p. 445.