
Presentation transcript:

Diagnostic research

Lecture Contents
I. Diagnostics in practice
– Explained with a case
II. Scientific diagnostic research
– Design
– Data-analysis
– Reporting
III. Exercises
IV. Summary

Diagnostics in practice Diagnostics always start with a patient with a complaint/symptom Case: neck stiffness Child, 2 years old, comes to the ER with parents Child turns out to have a very stiff neck What is the physician’s aim?

Diagnostics in practice Aim of the physician: quickly and efficiently determine the correct diagnosis Why diagnose? Basis for medical management Determines treatment choice Gives information about prognosis What are possible diagnoses for neck stiffness?

Diagnostics in practice Differential diagnosis (DD) Bacterial meningitis Viral meningitis Pneumonia ENT infection Other (e.g. myalgia) What is the most important diagnosis? Which one does the physician not want to miss?

Diagnostics in practice Most important diagnosis Bacterial meningitis (BM) If missed: often fatal

Diagnostics in practice Suppose: 20% of all children at the ER with neck stiffness have BM – 20% with disease in that population = prevalence – Prior probability What is your decision for the child in this case?

Diagnostics in practice Decision for child in case Prior probability too low to treat Prior probability too high to send home Decision: reduce uncertainty --> diagnostics What is the best test?

Diagnostics in practice Best test Lumbar puncture (culture of cerebrospinal fluid, "liquor")

Diagnostics in practice
Gold standard: true disease status; ‘truth’
– Never 24 karat
Reference standard/test: decisive test in case of doubt
Perform the reference test for everybody (= every child at the ER with neck stiffness)?

Diagnostics in practice
Reference test for everybody?
– Unethical: too invasive/risky
– Inefficient: too expensive
– Do not perform unnecessarily
How should we then determine the probability of disease presence, and what would be ideal?

Diagnostics in practice
How then?
Simpler diagnostics:
– Usually anamnesis (history taking), physical exam, simple lab tests, imaging, etc.
– Ideal: diagnosis without reference test
Diagnostic process in practice:
– Stepwise process: less --> more invasive
– Not one diagnosis based on 1 test
– Each item: separate test

Suppose: after anamnesis & physical exam (PE), 10% probability of BM
Probability of disease given test results = posterior probability
The bigger the difference between prior and posterior probability, the better the diagnostic value of the tests
Our decision for the child in the case: probability is too high to send home --> next step?

Diagnostics in practice Next step – Additional testing, e.g. blood tests (leucocytes, CRP, erythrocyte sedimentation rate, etc.)

Diagnostics in practice
Suppose: 1% posterior probability after anamnesis, PE + simple lab tests --> posterior probability low enough to send home
Ideal diagnostic process: simple tests reduce the posterior probability to 0 or 100% (without reference test)
Most often the physician continues testing until sufficiently sure (probability close to 0 or 100%)
When one is sufficiently sure depends on the prognosis of the disease if untreated + the risks/costs of treatment

Diagnostics in practice Summarizing What does diagnosing involve in practice? –Estimation of probability of disease presence based on test results of the patient When is the probability of disease best estimated? Why is this usually not done?

Diagnostics in practice
Why not all possible tests?
– Invasive (for patient and budget)
– Unnecessary: different test results give the same information
– However: in practice, more is often tested than necessary!
Which diagnostics are truly necessary? --> scientific diagnostic research

BREAK

Study design
Scientific diagnostic research
– What tests truly contribute to probability estimation?
– Has to serve practice --> follow practice

Study design Research question Domain Study population Determinant(s): test(s) to study Endpoint: presence/absence disease (outcome) Study design: design Data analysis, interpretation + reporting

Research question
Estimate the probability of the presence/absence of disease with as few simple, safe, and cheap tests as possible.
Determinant-outcome relation:
– probability of disease as a function of test results
– outcome = probability of disease = % = prevalence
– test results = determinants

Research question Case What tests contribute to probability estimation of presence or absence of BM in children with neck stiffness at the ER? Or: Determinants of presence/absence disease (BM)? %BM = ƒ(age, gender, fever, blood leucocytes, blood CRP, etc)

Research population Case: All children with neck stiffness in 2002 at ER Utrecht

Domain
For whom? --> domain, generalisation = type of patient with certain symptom/complaint + setting
Research population = 1 sample from domain
Case: All children (e.g. in Western world) suspected of disease (BM) based on neck stiffness (characteristic) in secondary care (setting)

Determinants = Tests to study Diagnostic determinants All possible important tests (in domain) Case Items anamnesis, PE and lab (blood and urine) tests

Endpoint
‘True’ presence/absence of disease = diagnostic outcome = result of the reference test
NB: reference = not infallible, but always the best available test in practice at that moment
Case: positive cerebrospinal fluid (liquor) culture

PICO (EBM) and the corresponding terms in diagnostic research

PICO (EBM)            Diagnostic research
Population/problem    Domain
Intervention          Determinant
Comparison/control    Reference test
Outcome               Outcome

Measure determinants/endpoint
Determinants
– Without knowledge of the outcome (blinded)
– Same method in study and practice --> never measure more precisely than in practice (otherwise the information yield is overestimated)
Endpoint
– Assessment blinded to the determinants
– With the best possible test known in practice

Study design
Observational and descriptive
– Observational = no manipulation of determinants
– Descriptive = not causal
– the determinant only predicts
– no hypothesis about a functional mechanism between determinant and outcome
– >1 determinant

Study design Cross-sectional = Simultaneously measure determinants and outcome

Data-analysis After data collection, per patient –Value determinants (test results) –Diagnostic outcome (reference test)

Data-analysis Data analysis: 3 steps 1) Estimation a priori probability (without test results) 2) Compare each test result separately with reference = univariate 3) Compare combination of test results with reference = multivariate (via model) - Following order in practice - Determine added value test result to already collected (previous) test results

Data-analysis
Case: data from scientific research available: 200 patients with neck stiffness at the ER
– Cerebrospinal fluid (liquor) culture positive (BM+): n=40
– Cerebrospinal fluid (liquor) culture negative (BM-): n=160
Step 1: A priori probability (prevalence) of BM? = % BM+ = 40/200 patients = 20%

Data-analysis: reading the 2 by 2 table
Step 2: Analysis per determinant (univariate). Use a 2 by 2 table.

                 Disease present       Disease absent
Test positive    A (true positive)     B (false positive)
Test negative    C (false negative)    D (true negative)

Data-analysis: reading the 2 by 2 table

          Gold standard
          Disease +    Disease –
Test +    A (TP)       B (FP)
Test –    C (FN)       D (TN)

Horizontally:
– Positive predictive value (PV+) = probability of disease + if test +: PV+ = A / (A + B)
– Negative predictive value (PV-) = probability of disease - if test -: PV- = D / (C + D)
Vertically:
– Sensitivity (SE) = probability of test + if disease +: SE = A / (A + C)
– Specificity (SP) = probability of test - if disease -: SP = D / (B + D)
What numbers do you think are most useful in practice (PV+ and PV- or SE and SP)?

Data-analysis
Perfect diagnostic test: false positives = 0 and false negatives = 0
Example: fever > 38 °C as predictor for BM

                    BM+    BM-    Total
Fever > 38 °C (+)    20     90     110
Fever ≤ 38 °C (-)    20     70      90
Total                40    160     200

Data-analysis: reading the 2 by 2 table

           Gold standard
           BM+           BM–
Fever +    20 (A, TP)    90 (B, FP)
Fever –    20 (C, FN)    70 (D, TN)

Horizontally:
– probability BM+ if fever+ = 20/110 = 18% (PV+ = A / (A + B))
– probability BM- if fever- = 70/90 = 78% (PV- = D / (C + D))
Vertically:
– probability fever+ if BM+ = 20/40 = 50% (SE = A / (A + C))
– probability fever- if BM- = 70/160 = 44% (SP = D / (B + D))
What numbers do you think are most useful in practice (PV+ and PV- or SE and SP)?
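The four measures can be checked with a few lines of Python. This is only a minimal sketch: the class name `TwoByTwo` and its method names are invented here for illustration, and the counts are the fever/BM counts implied by the fractions on this slide (A = 20, B = 90, C = 20, D = 70).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a 2 by 2 table, using the cell letters from the slides."""
    a: int  # true positives  (test +, disease +)
    b: int  # false positives (test +, disease -)
    c: int  # false negatives (test -, disease +)
    d: int  # true negatives  (test -, disease -)

    def sensitivity(self) -> float:
        # Read vertically: probability of test + if disease +
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        # Read vertically: probability of test - if disease -
        return self.d / (self.b + self.d)

    def ppv(self) -> float:
        # Read horizontally: probability of disease + if test +
        return self.a / (self.a + self.b)

    def npv(self) -> float:
        # Read horizontally: probability of disease - if test -
        return self.d / (self.c + self.d)

    def prevalence(self) -> float:
        # Prior probability: all diseased patients over all patients
        return (self.a + self.c) / (self.a + self.b + self.c + self.d)


# Fever > 38 °C versus bacterial meningitis, counts from the case.
fever = TwoByTwo(a=20, b=90, c=20, d=70)
for name in ("prevalence", "sensitivity", "specificity", "ppv", "npv"):
    print(f"{name}: {getattr(fever, name)():.0%}")
```

Note that the parentheses in the denominators matter: A / (A + B), not A / A + B.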

BREAK

Exercise 1 Mercury thermometer or tympanic membrane infrared meter

Exercise 1 Ad question 1
Research question: Can fever be determined with the tympanic membrane infrared meter (TIM)?
Determinant: test under study = TIM
Outcome: fever determined with rectal mercury thermometer (RMT)
Domain: Children in secondary/tertiary care (ER hospital)

Exercise 1 Ad question 2

             Gold standard (GS): RMT
             Fever+        Fever–
TIM > 38°    77 (A, TP)      9 (B, FP)
TIM ≤ 38°    19 (C, FN)    108 (D, TN)

Se = probability TIM+ if RMT+ = 77/96 = 80%
SP = probability TIM- if RMT- = 108/117 = 92%

Exercise 1 Ad question 3

             Gold standard (GS): RMT
             Fever+        Fever–
TIM > 38°    77 (A, TP)      9 (B, FP)
TIM ≤ 38°    19 (C, FN)    108 (D, TN)

PV+ = probability RMT+ if TIM+ = 77/86 = 90%
PV- = probability RMT- if TIM- = 108/127 = 85%

Exercise 1 Ad question 4
– The prior probability of fever in general practice is lower, e.g. 20% (X/213 = 0.2 --> X = 43)
– For similar Se and SP: (A/43 = 0.8 --> A = 34) (D/170 = 0.92 --> D = 156)
– PV+ becomes lower (34/48 = 71%)
– PV- becomes higher (156/165 = 95%)

         GS: RMT
         Fever+    Fever–    Total
TIM+       34        14        48
TIM-        9       156       165
Total      43       170       213
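The same shift in predictive values follows directly from Bayes' rule, without rebuilding the table. The sketch below is illustrative only: the function name `predictive_values` is invented here, and it uses the rounded Se = 0.80 and SP = 0.92 from the exercise.

```python
from __future__ import annotations


def predictive_values(se: float, sp: float, prior: float) -> tuple[float, float]:
    """Return (PV+, PV-) for a test with sensitivity se and specificity sp
    applied in a population with the given prior probability of disease."""
    true_pos = se * prior                # diseased and test positive
    false_neg = (1 - se) * prior         # diseased but test negative
    true_neg = sp * (1 - prior)          # non-diseased and test negative
    false_pos = (1 - sp) * (1 - prior)   # non-diseased but test positive
    pv_pos = true_pos / (true_pos + false_pos)
    pv_neg = true_neg / (true_neg + false_neg)
    return pv_pos, pv_neg


# Hospital setting (prior 96/213, about 45%) versus general practice (20%).
for label, prior in (("hospital", 96 / 213), ("general practice", 0.20)):
    pv_pos, pv_neg = predictive_values(se=0.80, sp=0.92, prior=prior)
    print(f"{label}: prior {prior:.0%}, PV+ {pv_pos:.0%}, PV- {pv_neg:.0%}")
```

Sensitivity and specificity are held fixed here, as the exercise assumes; only the prior changes, and that alone lowers PV+ and raises PV-.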

Exercise 1 Ad question 5
– In general practice, an unnecessarily referred or treated child is less of a problem than unjustified reassurance of the parents
– The negative predictive value in particular must therefore be sufficiently high

Data-analysis: combination of determinants In practice not one single diagnosis based on 1 test –Tests together distinguish ill/non-ill –Method: statistical model Moreover: diagnostic process is hierarchical –(simple --> invasive/expensive) --> always start with anamnesis model --> see case

Data-analysis
Case: model with all anamnestic tests: gender + age + fever + pain
%BM = ƒ(gender, age, fever, pain)
The statistical model can be seen as 1 (composite) test
Quantify the diagnostic value of the model with the area under the ROC curve (ROC = Receiver Operating Characteristic; AUC = Area Under the Curve)
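The slides do not say which statistical model is used. A common choice for a yes/no outcome is logistic regression, so the sketch below assumes that form. The coefficients, the intercept, the variable coding, and the function name `probability_bm` are all hypothetical and invented purely for illustration; they are not estimates from the case data.

```python
import math

# Hypothetical values, NOT fitted to the lecture's data.
INTERCEPT = -3.0
COEFFICIENTS = {"male": 0.2, "age_years": -0.1, "fever": 0.8, "pain": 0.5}


def probability_bm(patient: dict) -> float:
    """Predicted probability of BM under an assumed logistic model:
    p = 1 / (1 + exp(-(intercept + sum of coefficient * test result)))."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# One patient: a 2-year-old boy with fever and pain (1 = yes, 0 = no).
child = {"male": 1, "age_years": 2, "fever": 1, "pain": 1}
print(f"predicted probability of BM: {probability_bm(child):.1%}")
```

Whatever the model form, the point of the slide holds: the model turns several test results into a single predicted probability per patient, which can then be evaluated like one composite test.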

Data-analysis

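Case: AUC anamnesis model = 0.71
Informal interpretation: AUC = % correctly diagnosed (more precisely: the probability that a randomly chosen patient with the disease gets a higher predicted probability than a randomly chosen patient without it)
The larger the ROC area --> the better the model
AUC range: 0.5 --> 1.0
AUC = 0.5 --> bad (Se = 1 - Sp --> diagonal [coin toss])
AUC > 0.7 --> reasonable
AUC > 0.8 --> good
AUC > 0.9 --> excellent
AUC = 1.0 --> perfect (Se = 100% & 1 - Sp = 0%)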
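The AUC equals the fraction of (diseased, non-diseased) patient pairs in which the diseased patient has the higher predicted probability, with ties counted as half. A minimal sketch of that pairwise calculation follows; the function name `auc` and the predicted probabilities are hypothetical and not taken from the case.

```python
def auc(scores_diseased, scores_healthy):
    """Area under the ROC curve as the fraction of diseased/non-diseased
    pairs that the model ranks correctly; ties count as half a pair."""
    correct = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                correct += 1.0
            elif d == h:
                correct += 0.5
    return correct / (len(scores_diseased) * len(scores_healthy))


# Hypothetical predicted probabilities from some model.
bm_pos = [0.9, 0.6, 0.4, 0.3]        # patients with BM
bm_neg = [0.5, 0.3, 0.2, 0.1, 0.1]   # patients without BM
print(f"AUC = {auc(bm_pos, bm_neg):.3f}")
```

The double loop is quadratic in the number of patients, which is fine for a study of a few hundred patients; it is chosen here for readability, not speed.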

Data-analysis
Quantify the added value of additional tests to previous tests
– Extend the previous model (follow the order in practice)
– Quantify the change in AUC
Case
– Model 1: anamnesis model + physical exam (5 extra tests) --> AUC = 0.72 --> interpretation?
– Model 2: anamnesis model + 3 blood tests --> AUC = 0.90 --> interpretation?

Data-analysis

The AUC does not directly say anything about individual patients and is therefore not directly applicable

Reporting Research question Study set-up Research population, setting, determinants, outcome, design Results Predictive values (new) test and/or ROC curve ROC curve combination of tests Added value new test --> ROC curve

Exercise 2

Exercise 2 Ad question 1
- Cross-sectional study in patients suspected of a stomach or duodenal ulcer
- For all patients anamnestic data were collected
- For all patients a gastroscopy was done
- The independent diagnostic value of anamnestic factors (determinants) for the diagnosis of ulcer (outcome: gastroscopy) was calculated

Exercise 2 Ad question 2 Adults with stomach complaints referred to a gastroenterology outpatient clinic in a peripheral hospital

Exercise 2 Ad question 3 Score is 5, risk is 57%

Exercise 2 Ad question 4 -Everybody above the cut-off point has the same risk (and the same below the cut-off point) -Of course this is not true and the score loses precision -Preferably predictive values for score-categories and predictive values for more cut-off points

Exercise 2 Ad question 5

          Peptic ulcer
          +             –
Test +    20 (A, TP)    11 (B, FP)
Test -     5 (C, FN)    64 (D, TN)

PV+ = 20/31 = 65%
PV- = 64/69 = 93%

Exercise 2 Ad question 6 -Predictive values more favourable and therefore preferred -But it is not about the isolated predictive value but about the added diagnostic value given the results of the anamnestic score

Exercise 2 Ad question 7
Perform the anamnestic score and the breath test for a population from the domain. Subsequently perform the reference test (endoscopy) for everybody.
Compare the following determinant-outcome relations:
P(ulcer) = ƒ(age, gender, anamnesis, ...)
P(ulcer) = ƒ(age, gender, anamnesis, ..., breath test)
Then compare the Receiver Operating Characteristic (ROC) curves of the models

Exercise 2 Ad question 8
- The breath test partially contains the same information as the score
- Suppose that the breath test is more often positive with increasing age; then the breath test also measures age, and its added value is therefore less than if the breath test were completely independent of the score

Exercise 2 Ad question 9 - Preferably not, but if the assessor is informed of the data in the score in practice, then it should be the same in the study

Diagnostics Summary (1) Diagnostics in practice –Uncertainty reduction –Determines prognosis & determines policy Diagnostic research Design –Observational –Descriptive

Diagnostics Summary (2)
– Cross-sectional: simultaneous measurement of determinant and outcome (reference standard)
– Always study >1 determinant
Design
– Assess determinants as in practice
– Assess disease status & determinant status with double blinding

Diagnostics Summary (3)
Analysis
– Univariate (per determinant)
– Multivariate: combination of test results in relation to outcome
– Endpoint = ƒ(combination of determinants)
– Determine added value; first analyse the least invasive tests (as in practice)
Reporting
– Mainly added value of test