1 COMPLEX QUESTION ANSWERING BASED ON A SEMANTIC DOMAIN MODEL OF CLINICAL MEDICINE
Dina Demner-Fushman
24 August 2006
2 Informed clinical decision making
Evidence Based Medicine
Combine:
–the best published medical research findings
–clinical judgment
–expertise and experience
Use systematic approach:
–Formulate specific and relevant questions
–Know where to look for an answer
–Answer questions iteratively
[Diagram labels: clinical state and circumstances; patient’s preferences; information resources]
3 Outline
Motivation
Evidence Based Medicine
Hypotheses
Clinical Question Answering system
Evaluation
–System components
   extractors
   Document re-ranking
–Answers
   Multi-tier answers
   Best answers
Contributions, Limitations, Future work
4 Real-life questions
How do we diagnose prostatitis?
–Interactive multi-level answers: Diagnosis of chronic abacterial prostatitis: evaluate
   infection
   inflammation
   biochemistry
   ultrasonography
   …
How much better is Amiodarone in controlling fast atrial fibrillation with rapid ventricular response compared to Cardizem?
Zosyn dosage regimens: 3.375 g or 4.5 g?
–Best answer: The usual total daily dose of Zosyn for adults is 3.375 g every six hours. Patients with nosocomial pneumonia should start with Zosyn at a dosage of 4.5 g …
5 Evidence search and appraisal
Convert information needs into focused questions
Track down the best evidence with which to answer them
Identify “bottom-line” recommendations, supporting evidence, and its strength
6 Information sources
Paper (books, desk references, journals): 50%
Colleagues: 40%
Clinical librarians or services: 32%
Online Resources: 25%
–Primary sources: Bibliographic databases (MEDLINE)
–Secondary sources:
   Systematic reviews (Cochrane collaboration, American College of Physicians Journal Club)
   Databases of expert answers to clinical questions (FPIN, BMJ Clinical Evidence)
7 Question frame
Question: How much better is Amiodarone in controlling fast atrial fibrillation with rapid ventricular response compared to Cardizem?
Task: Therapy (shown alongside Etiology, Diagnosis, Prognosis)
PICO elements:
–Population: [unspecified]
–Problem: Atrial fibrillation with rapid ventricular response
–Intervention 1: Amiodarone
–Intervention 2: Cardizem
–Outcome: achieve rate control
Source: Users' Guides to Evidence-based Medicine (JAMA series)
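The question frame on this slide can be pictured as a small record type. The sketch below is illustrative only: the class name `QuestionFrame`, its field names, and the validation rule are assumptions, not the system's actual data structure; only the field values come from the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Clinical tasks listed on the slide.
TASKS = ("Etiology", "Diagnosis", "Therapy", "Prognosis")


@dataclass
class QuestionFrame:
    """Hypothetical container for a task plus PICO elements."""
    task: str
    problem: str
    interventions: List[str] = field(default_factory=list)
    outcome: Optional[str] = None
    population: Optional[str] = None  # None stands for "[unspecified]"

    def __post_init__(self) -> None:
        # Reject tasks that are not among the four shown on the slide.
        if self.task not in TASKS:
            raise ValueError(f"unknown clinical task: {self.task!r}")


# The example question from the slide, encoded as a frame.
frame = QuestionFrame(
    task="Therapy",
    problem="Atrial fibrillation with rapid ventricular response",
    interventions=["Amiodarone", "Cardizem"],
    outcome="achieve rate control",
)
```

Leaving `population` as `None` mirrors the "[unspecified]" slot rather than guessing a value.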
8 Strength of Evidence
A
–Meta-Analysis
–Randomized Controlled Trials
B
–Cross-Sectional Studies
–Retrospective Studies
C
–Case Report
–Animal Studies
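The three grades above amount to a lookup from study type to letter. A minimal sketch, assuming the study type is already known as a normalized string; the dictionary keys and the function name `strength_of_evidence` are illustrative and not taken from the system:

```python
from typing import Optional

# Study type -> grade, exactly as grouped on the slide.
SOE_GRADES = {
    "meta-analysis": "A",
    "randomized controlled trial": "A",
    "cross-sectional study": "B",
    "retrospective study": "B",
    "case report": "C",
    "animal study": "C",
}


def strength_of_evidence(study_type: str) -> Optional[str]:
    """Return the grade for a study type, or None if it is not listed."""
    return SOE_GRADES.get(study_type.strip().lower())
```

Returning `None` for unlisted study types keeps the sketch from inventing grades the slide does not define.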
9 Hypotheses
–Document frames based on three main EBM components (clinical task, PICO, SoE) are sufficient to answer questions
–Document frames could be generated using a hybrid statistical/knowledge-based approach to leverage existing resources
–Complex clinical questions could be answered through semantic matching of the question and document frames
10 MEDLINE Clinical QA system architecture
[Architecture diagram; labels in extraction order]
Entities & relations annotation; PubMed; Document Retrieval; Query terms; E-Utilities; citations; MetaMap; SemRep; Semantic matching; Answer Generation; Document frame; Question frame; Answer; Annotated citations; PICO; Query Formulation; UMLS; EBM Domain Model; Semantic processing; Knowledge Extraction; Clinical Task Classification; Strength of Evidence Classification
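The retrieval step goes through PubMed's E-Utilities. As a rough illustration, the sketch below builds an ESearch request URL without sending it. The `esearch.fcgi` endpoint and its `db`, `term`, and `retmax` parameters belong to NCBI E-utilities; joining frame terms with `AND` is an assumption about query formulation, and the function name is hypothetical.

```python
from typing import Iterable
from urllib.parse import parse_qs, urlencode, urlparse

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(terms: Iterable[str], retmax: int = 50) -> str:
    """Build (but do not send) a PubMed ESearch URL for the given terms."""
    # Assumed query formulation: every term must match.
    query = " AND ".join(f"({t})" for t in terms)
    params = {"db": "pubmed", "term": query, "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


url = build_esearch_url(["atrial fibrillation", "amiodarone"])
```

The returned identifiers would then be passed to EFetch to obtain the citations; that call is omitted here.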
11 Component architecture
[Component diagram; labels in extraction order]
Search Engine Wrapper; Question Processing; MetaMap Wrapper; Semantic Matcher; Answer Generator; Citations; Query; MEDLINE; Annotated Document
Semantic processor: Task Classifier, Strength of Evidence Classifier, Problem Extractor, Population Extractor, Intervention Extractor, Outcome Extractor
ESearch; EFetch; Question frame; Document frame; Answer
12 Semantic processing example
Semantic processor: Task Classifier, Strength of Evidence Classifier, Problem Extractor, Population Extractor, Intervention Extractor, Outcome Extractor
Title: Amiodarone versus diltiazem for rate control in critically ill patients with atrial tachyarrhythmias.
Abstract: … Patients with atrial fibrillation (n = 57), … were randomly assigned to one of three intravenous treatment regimens. Group 1 received diltiazem … group 2 received amiodarone …. Sufficient rate control can be achieved in critically ill patients with atrial tachyarrhythmias using either diltiazem or amiodarone …
Task: Therapy
Strength of Evidence: A (RCT)
13 Outcome Extractor
Classifiers: Cue-terms, Naïve Bayes, N-gram, Position, Heuristic, Length
Multiple Linear Regression
Score: 0.99 Sufficient rate control can be achieved in critically ill patients with atrial tachyarrhythmias using either diltiazem or amiodarone.
Score: 0.75 Although diltiazem allowed for significantly better 24-hr heart rate control, this effect was offset by a significantly higher incidence of hypotension requiring discontinuation of the drug.
Other extractors: Problem Extractor, Population Extractor, Intervention Extractor
Training: 275 manually annotated abstracts
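One way to read this slide is that each classifier scores a candidate sentence and a linear model merges those scores into one value. The sketch below only applies a set of linear weights; it does not fit them. The weights, the intercept, and the dictionary keys are placeholders chosen for illustration, not values from the system.

```python
from typing import Dict

# Placeholder weights (assumed); a fitted regression would supply real ones.
WEIGHTS: Dict[str, float] = {
    "cue_terms": 0.25,
    "naive_bayes": 0.25,
    "ngram": 0.20,
    "position": 0.10,
    "heuristic": 0.10,
    "length": 0.10,
}


def combine_scores(scores: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS,
                   intercept: float = 0.0) -> float:
    """Linear combination: intercept + sum of weight * classifier score."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing classifier scores: {sorted(missing)}")
    return intercept + sum(w * scores[name] for name, w in weights.items())


def best_sentence(candidates: Dict[str, Dict[str, float]]) -> str:
    """Return the candidate sentence with the highest combined score."""
    return max(candidates, key=lambda s: combine_scores(candidates[s]))
```

Ranking candidates by the combined score is what would place the "Sufficient rate control …" sentence above the "Although diltiazem …" sentence in the example.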
14 Extractor accuracy
Extractor      Test set N   Correct   Unknown / Wrong
Problem        50           90%       5%
Population     100          79%       11% / 10%
Intervention   100          77%       23%
Outcome        358          90%       10%
15 Outline
Motivation
Evidence Based Medicine
Hypotheses
Clinical Question Answering system
Evaluation
–System components
   extractors
   Document re-ranking
–Answers
   Multi-level answers
   Best answers
Contributions, Limitations, Future work
16 Document re-ranking
Semantic Matcher

Question frame
–Task: THERAPY
–Problem: atrial fibrillation
–Intervention: Amiodarone, Cardizem

Scoring
$\text{DocScore} = \lambda_P S_{PICO} + \lambda_S S_{SoE} + \lambda_T S_{Task}$
$S_{PICO} = \lambda_p S_{problem} + \lambda_{pt} S_{population} + \lambda_i S_{intervention} + \lambda_o S_{outcome}$
$S_{SoE} = \lambda_j S_{journal} + \lambda_s S_{study} + \lambda_d S_{date}$
$S_{Task} = \sum_i \lambda_i \, \text{Task\_Indicator}(i)$

Doc1 frame
–Problem: atrial fibrillation
–Intervention: diltiazem, amiodarone
–Outcome: ……
–Outcome score: 0.79
–PICO score: 0.89
–Task: THERAPY, score: 0.64
–SoE score: 0.32

Doc2 frame
–Problem: arterial hypertension
–Intervention: Warfarin
–Outcome: ……
–Task: THERAPY
–Outcome, PICO, Task, and SoE scores: (no values shown)

Doc3 frame
–Problem: atrial fibrillation
–Intervention: coronary surgery
–Outcome: ……
–Task: THERAPY
–Outcome, PICO, Task, and SoE scores: (no values shown)
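The top-level scoring formula is a weighted sum of three component scores. The sketch below evaluates it for Doc1 using the PICO, SoE, and Task scores shown on the slide. The λ values are hypothetical, since the slide does not give them, and the function name is an assumption.

```python
def doc_score(s_pico: float, s_soe: float, s_task: float,
              lam_p: float, lam_s: float, lam_t: float) -> float:
    """Doc Score = lam_p * S_PICO + lam_s * S_SoE + lam_t * S_Task."""
    return lam_p * s_pico + lam_s * s_soe + lam_t * s_task


# Component scores for Doc1, taken from the slide.
DOC1 = {"s_pico": 0.89, "s_soe": 0.32, "s_task": 0.64}

# Hypothetical weights; the slide does not state the actual values.
LAMBDAS = {"lam_p": 0.5, "lam_s": 0.2, "lam_t": 0.3}

doc1_score = doc_score(**DOC1, **LAMBDAS)
```

Re-ranking would then sort the retrieved documents by this score in descending order; Doc2 and Doc3 are left out because their component scores are not shown.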
17 Document re-ranking evaluation
[Figure: results for baseline, filtering, and components]
Relevance judgments for 24 FPIN questions by Dr. CS
18 Answer Generation
Components: Intervention Extractor, Outcome Extractor, Semantic Clustering
Example answer (annotated on the slide with: Metadata, SoE, Task, Cluster label, Intervention, Outcome, Title)
–Cluster label: Imaging by method
–Intervention: [ultrasound] [Doppler studies]
–Outcome: Transrectal ultrasound (TRUS) offers a valuable complement to digital rectal examination (DRE) in diagnosing prostate diseases. A sensitivity of 90.6% and a specificity of 64.2% was reached.
–Title: Automated analysis and interpretation of transrectal ultrasonography images in patients with prostatitis. Eur Urol. 1995;27(1):47-53.
19 UMLS Semantic Clustering
[Hierarchy diagram; labels in extraction order]
Concepts: Magnetic resonance; Doppler studies; MRI of abdomen; Specific ultrasound studies; Imaging by method; Imaging by body site; Ultrasound scan; Evaluation procedure; Procedure by method; Diagnostic imaging; …; Procedures; Investigations; Operations, procedures and interventions; Ultrasonography
Source vocabularies: SNOMED Clinical Terms; Read Codes
Diagram annotations: Pruned top; Interior nodes; Extracted interventions
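A rough way to picture this clustering: each extracted intervention is walked up the hierarchy until the next step would enter the pruned top, and the node reached becomes its cluster label. The sketch below is a toy under stated assumptions. The parent links and the contents of the pruned set are invented for illustration from the slide's labels and are not the actual UMLS relations.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Assumed child -> parent links (illustrative, not real UMLS data).
PARENT: Dict[str, str] = {
    "Doppler studies": "Specific ultrasound studies",
    "Specific ultrasound studies": "Ultrasound scan",
    "Ultrasonography": "Ultrasound scan",
    "Ultrasound scan": "Imaging by method",
    "Magnetic resonance": "Imaging by method",
    "MRI of abdomen": "Imaging by body site",
    "Imaging by method": "Diagnostic imaging",
    "Imaging by body site": "Diagnostic imaging",
    "Diagnostic imaging": "Evaluation procedure",
    "Evaluation procedure": "Procedure by method",
    "Procedure by method": "Procedures",
}

# Assumed "pruned top": nodes too general to serve as cluster labels.
PRUNED: Set[str] = {"Diagnostic imaging", "Evaluation procedure",
                    "Procedure by method", "Procedures"}


def cluster_label(term: str, parent: Dict[str, str], pruned: Set[str]) -> str:
    """Climb from term until the next parent is missing or pruned."""
    node = term
    while node in parent and parent[node] not in pruned:
        node = parent[node]
    return node


def cluster_terms(terms: Iterable[str], parent: Dict[str, str],
                  pruned: Set[str]) -> Dict[str, List[str]]:
    """Group extracted terms under their highest non-pruned ancestor."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for term in terms:
        clusters[cluster_label(term, parent, pruned)].append(term)
    return dict(clusters)


clusters = cluster_terms(
    ["Ultrasonography", "Doppler studies", "MRI of abdomen"], PARENT, PRUNED)
```

Under these assumed links, the two ultrasound terms share the label "Imaging by method", which matches the cluster label used in the answer-generation example.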
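20 Cluster selection for evaluation
Selection strategies:
–UMLS (latest, 3 largest clusters)
–User (latest, 3 best clusters)
–PubMed (3 latest)
Example cluster labels: Imaging by method; biochemistry; infection; inflammation
Rating scale: good, OK, bad
[Figure: question axes labeled Q1, Q2, … 30 (two panels) and Q1, Q2, … 25 (one panel)]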
21 Answer evaluation
Distribution for 25 Clinical Evidence questions (cluster selection and judgment by Dr CA)

Clinical Evidence categories (Beneficial to Harmful)
B      LB     T      U+N    H
0.13   0.25   0.13   0.46   0.01
0.23   0.27   0.12   0.37   0.01
0.35   0.28   0.11   0.26   -

Evidence support by cluster selection strategy
Strategy   good   OK     bad
PubMed     0.57   0.14   0.27
UMLS       0.72   0.09   0.19
User       0.85   0.08   0.07
22 Outline
Motivation
Evidence Based Medicine
Hypotheses
Clinical Question Answering system
Evaluation
–System components
   extractors
   Document re-ranking
–Answers
   Multi-level answers
   Best answers
Contributions, Limitations, Future work
23 Answer precision at 5
221 answers to 24 questions judged by Drs. CS and KWH
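Precision at 5 is the fraction of the five top-ranked answers judged acceptable. A minimal sketch of the metric follows, assuming each question's judgments are available as an ordered list of booleans; the function names are illustrative, and averaging over questions is one common way to report the metric, not necessarily the one used here.

```python
from typing import List, Sequence


def precision_at_k(judgments: Sequence[bool], k: int = 5) -> float:
    """Fraction of the top-k ranked answers judged acceptable.

    Divides by k even when fewer than k answers were returned, so a
    short answer list is penalized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for judged_ok in judgments[:k] if judged_ok) / k


def mean_precision_at_k(per_question: List[Sequence[bool]], k: int = 5) -> float:
    """Average precision at k over a non-empty list of questions."""
    if not per_question:
        raise ValueError("need at least one question")
    return sum(precision_at_k(j, k) for j in per_question) / len(per_question)
```

For example, a question whose top five answers were judged acceptable, not acceptable, acceptable, acceptable, not acceptable scores 3/5.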
24 Contributions
–Leveraging semantic domain model as a foundation for an end-to-end clinical question answering system.
–Identification of the domain-model components necessary and sufficient for system development.
–Demonstration of applicability of the system architecture for complex question answering in the clinical domain.
–Methods for combining information extraction based on statistical and knowledge-based methods.
–Adaptation of question answering evaluation methods for the clinical domain.
–Development of test collections for information extraction and question answering evaluation.
25 Limitations
–No user interface
–Manual question processing
–PubMed for document retrieval
–Processing speed of automatic semantic annotation
26 Future work
–Combining knowledge-based and corpus-based methods beyond outcome extractor
–Developing a corpus-based stopping condition for hierarchical ontological clustering
–In-depth study of PICO frame alternatives
–Combining ranking results of different search engines
27 Thanks to my advisory cloud!
Douglas Oard, Jimmy Lin, Philip Resnik, Dagobert Soergel, Ben Shneiderman, Susan Hauser, Thomas Rindflesch, George Thoma, Alan Aronson, Susanne Humphrey