Retrieval of Similar Electronic Health Records using UMLS Concept Graphs Laura Plaza and Alberto Díaz Universidad Complutense de Madrid.



Motivation
- When facing complex and atypical cases, physicians need to refer to similar previous cases.
- The adoption of EHRs by office-based physicians and hospitals is increasing.
- Still, the time required to find similar cases can be prohibitive if no effective access is provided.
Task: given a reference record, retrieve others from the clinical database that are similar to it.
Retrieval of Similar Electronic Health Records Using UMLS Concept Graphs (Plaza and Díaz, 2010)

A Different IR Task
- Health records mix highly structured information with idiosyncratic narrative text.
- Unique sublanguage characteristics:
  - Verbless sentences, punctuation, spelling errors
  - Synonyms and homonyms
  - Neologisms
  - Acronyms and abbreviations
- When can two health records be considered similar?

Two EHR are Similar if…
- Same symptom or sign (e.g. fever or 5 kg weight loss)
- Same diagnosis (e.g. bacterial pneumonia)
- Same test or procedure (e.g. cerebral NMR or endoscopic biopsy)
- Same medicament (e.g. clopidogrel)
- But absent criteria are not relevant for the task!

Using UMLS for Concept Annotation
- UMLS consists of three main components: the Specialist Lexicon, the Metathesaurus and the Semantic Network.
- We use MetaMap to translate free-form text to Metathesaurus concepts.
- Advantages:
  - Broad coverage
  - Performs word sense disambiguation
  - Numerous entries for acronyms and abbreviations
  - Etc.

Our Proposal
- A four-step graph-based method:
  1. Extraction of UMLS concepts
  2. Negation detection
  3. Semantic graph-based representation
  4. Ranking similar EHR
Example report:
CLINICAL HISTORY: Eleven years old with ALL, bone marrow transplant on Jan. 2, now with 3 day history of cough.
IMPRESSION: No focal pneumonia. Likely chronic changes at the left lung base. Mild anterior wedging of the thoracic vertebral bodies.

Our Proposal: Extracting UMLS Concepts
- We use MetaMap to extract the UMLS concepts from the Metathesaurus and their semantic types from the Semantic Network.
- But, according to the expert, not all concepts are relevant to the task.
- Thus, the expert mapped the similarity criteria (symptoms and signs, diagnoses, tests or procedures, medicaments) to UMLS semantic types, and only concepts of those types are considered.

Our Proposal: Extracting UMLS Concepts
Category             UMLS Semantic Types
Symptoms and Signs   Sign or Symptom; Finding
Diseases             Disease or Syndrome; Pathologic Function
Procedures           Therapeutic or Preventive Procedure; Diagnostic Procedure
Body Parts           Body Location or Region; Body Part, Organ, or Organ Component
Medicaments          Pharmacologic Substance
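The filtering step described above can be sketched as follows. This is only an illustration: the input format (pairs of concept name and semantic type) and the function name `group_by_category` are assumptions, not the authors' code, and MetaMap itself is not called. The semantic type names follow the table, using the standard UMLS spelling "Diagnostic Procedure".

```python
# Category -> UMLS semantic types, as listed in the table above.
CATEGORY_TYPES = {
    "Symptoms and Signs": {"Sign or Symptom", "Finding"},
    "Diseases": {"Disease or Syndrome", "Pathologic Function"},
    "Procedures": {"Therapeutic or Preventive Procedure", "Diagnostic Procedure"},
    "Body Parts": {"Body Location or Region", "Body Part, Organ, or Organ Component"},
    "Medicaments": {"Pharmacologic Substance"},
}


def group_by_category(concepts):
    """Keep only concepts whose semantic type belongs to a category.

    `concepts` is an iterable of (concept_name, semantic_type) pairs;
    this input format is hypothetical and stands in for MetaMap output.
    """
    grouped = {category: [] for category in CATEGORY_TYPES}
    for name, semantic_type in concepts:
        for category, types in CATEGORY_TYPES.items():
            if semantic_type in types:
                grouped[category].append(name)
    return grouped


if __name__ == "__main__":
    extracted = [
        ("Coughing", "Sign or Symptom"),
        ("Pneumonia", "Disease or Syndrome"),
        ("Years", "Temporal Concept"),  # not in any category, so dropped
    ]
    print(group_by_category(extracted))
```

Concepts whose semantic type is outside the five categories simply do not appear in the result, which mirrors the expert's decision to ignore them.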

Our Proposal: Negation Detection
- According to the expert, absent or negated criteria (e.g. "On admission, the patient had no internal bleeding") are not relevant for the task.
- Thus, negated UMLS concepts are ignored.
- Negations in medical records usually appear in a small number of forms, which are easy to identify using a simple lexical scanner built from regular expressions.

Our Proposal: Negation Detection
Lexical patterns and examples:
- no|without|rule out + concept + (or concept)*
  Examples: No pneumonia; Without fever or cough
- no|without|rule out + adj + concept + (or concept)*
  Examples: No significant hydronephrosis; Rule out cardiac abnormality
- no|without + noun + of + concept + (or concept)*
  Examples: No signs of tuberculosis; Without evidence of hydroureter
- evaluate for + (noun|adj)? + concept + (or concept)*
  Examples: Evaluate for foreign body; Evaluate for abnormalities
- lack of|absence of + (noun|adj)? + concept + (or concept)*
  Examples: Lack of kyphosis; Absence of heart murmur
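A minimal sketch of such a lexical scanner is given below. It is not the authors' implementation: the function name `negated_concepts` is hypothetical, the concept mentions are assumed to be supplied by an earlier extraction step, and, since no part-of-speech tagger is used, the adjective and noun slots of the patterns are approximated by any single word.

```python
import re

# Optional adjective/noun slot, approximated by any single word (assumption).
# It is lazy so that a multi-word concept such as "foreign body" is tried
# before one of its words is consumed as the optional slot.
_WORD = r"(?:\s+\w+)??"

# Trigger phrases taken from the lexical patterns above.
_TRIGGERS = "|".join([
    r"(?:no|without|rule\s+out)" + _WORD,   # patterns 1 and 2
    r"(?:no|without)\s+\w+\s+of",           # pattern 3, e.g. "no signs of"
    r"evaluate\s+for" + _WORD,              # pattern 4
    r"(?:lack|absence)\s+of" + _WORD,       # pattern 5
])


def negated_concepts(sentence, concepts):
    """Return the concept mentions in `sentence` covered by a negation pattern."""
    if not concepts:
        return []
    # Longest mentions first, so "foreign body" is preferred over "body".
    alt = "|".join(
        re.escape(c.lower()) for c in sorted(concepts, key=len, reverse=True)
    )
    pattern = re.compile(
        rf"\b(?:{_TRIGGERS})\s+(?P<first>{alt})(?P<rest>(?:\s+or\s+(?:{alt}))*)\b"
    )
    found = []
    for match in pattern.finditer(sentence.lower()):
        found.append(match.group("first"))
        # Concepts chained with "or" after the first one.
        found.extend(re.findall(alt, match.group("rest")))
    return found


if __name__ == "__main__":
    print(negated_concepts("No focal pneumonia.", ["pneumonia", "cough"]))
    print(negated_concepts("Without fever or cough", ["fever", "cough"]))
```

Concepts returned by the scanner would then be dropped before the graph is built, in line with the rule that negated concepts are ignored.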

Our Proposal: Semantic Graph Representation
- First, the concepts are retrieved from the UMLS Metathesaurus along with their complete hierarchy of hypernyms (is-a relations).
- Second, all concept hierarchies for each category are merged, building a single graph for each category in the EHR.
- Finally, each concept is assigned a weight, using the Jaccard similarity coefficient, attaching greater importance to specific concepts than to general ones.

Our Proposal: Semantic Graph Representation
[Figure: example concept hierarchy whose nodes carry the weights 1/5, 2/5, 3/5, 4/5 and 5/5, from the most general concept to the most specific one.]
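The slides do not say which two sets the Jaccard coefficient compares. One reading that reproduces the fractions 1/5 to 5/5 for a single is-a chain of five concepts is sketched here: the weight of a concept is the Jaccard coefficient between the set formed by the concept and its ancestors and the set of all concepts in the graph. This choice of sets, the `parents` dictionary format and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def concept_weights(parents):
    """Weight each concept by Jaccard(ancestors-and-self, all concepts).

    `parents` maps every concept in the graph to the set of its direct
    hypernyms (an empty set for a root). This format is hypothetical.
    """
    all_concepts = set(parents)

    def ancestors_and_self(concept):
        seen, stack = set(), [concept]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(parents[node])
        return seen

    weights = {}
    for concept in all_concepts:
        alpha = ancestors_and_self(concept)
        # Jaccard coefficient |alpha & beta| / |alpha | beta|, beta = all concepts.
        weights[concept] = Fraction(
            len(alpha & all_concepts), len(alpha | all_concepts)
        )
    return weights


if __name__ == "__main__":
    chain = {
        "Clinical finding": set(),
        "Finding by site": {"Clinical finding"},
        "Respiratory finding": {"Finding by site"},
        "Functional finding of respiratory tract": {"Respiratory finding"},
        "Coughing": {"Functional finding of respiratory tract"},
    }
    for name, weight in sorted(concept_weights(chain).items(), key=lambda kv: kv[1]):
        print(weight, name)
```

Under this reading, deeper (more specific) concepts have more ancestors and therefore larger weights, which matches the stated aim of favouring specific concepts over general ones.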

Our Proposal: Ranking Similar EHR
- We compute the similarity between the reference EHR and every record in the database, and rank the records by that similarity.
- Given two graphs, A and B, the similarity of A to B is measured as follows:
  - First, each concept of A that is not in B is assigned a score of 0, while each concept of A that is also in B is assigned a score equal to its weight in graph A.
  - Next, the sum of the scores for all concepts in A is computed.
  - Finally, this result is normalized over the interval [0, maximum similarity].
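The scoring and ranking steps above can be sketched as follows. Two points are assumptions, since the slides do not spell them out: the maximum similarity is taken to be the sum of all weights in A (the score A obtains against itself), so the result lies in [0, 1], and a single graph per record is used rather than one graph per category. The function names and data formats are hypothetical.

```python
def similarity(weights_a, concepts_b):
    """Similarity of graph A to graph B.

    `weights_a` maps each concept of A to its weight in A;
    `concepts_b` is the set of concepts in B.
    """
    # Concepts of A missing from B contribute 0; shared ones contribute
    # their weight in A.
    score = sum(w for concept, w in weights_a.items() if concept in concepts_b)
    # Assumed normalisation: divide by the score of A against itself.
    max_similarity = sum(weights_a.values())
    return score / max_similarity if max_similarity else 0.0


def rank(reference_weights, database):
    """Rank records by decreasing similarity to the reference record.

    `database` maps a record identifier to the set of concepts in its graph.
    """
    scored = [
        (similarity(reference_weights, concepts), record_id)
        for record_id, concepts in database.items()
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    reference = {"Clinical finding": 0.2, "Disease": 0.4, "Bacterial pneumonia": 1.0}
    records = {
        "r1": {"Clinical finding"},
        "r2": {"Clinical finding", "Disease", "Bacterial pneumonia"},
        "r3": {"Coughing"},
    }
    for score, record_id in rank(reference, records):
        print(record_id, round(score, 3))
```

Note that the measure is asymmetric: the similarity of A to B uses only the weights of A, so it generally differs from the similarity of B to A.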

Our Proposal: Ranking Similar EHR
[Figure: two example concept graphs, Graph A and Graph B, with weighted nodes. Weights shown: 1/11, 2/11, 3/11, 8/11, 9/11, 10/11, 11/11 and 3/5, 4/5, 5/5. Node labels: Clinical finding, Finding by site, Disorder by body site, Disease, Infectious disease, Virus Diseases, Bacterial pneumonia, Pneumococcal pneumonia, Pneumonia due to Streptococcus, Mycoplasma pneumonia, Pneumonia due to anaerobic bacteria, Pneumonia due to pleuropneumonia, Respiratory finding, Functional finding of respiratory tract, Coughing.]

Experiments
- Test collection: 50 radiology reports from the CMC-NLP 2007 Challenge corpus.
- Query collection: a subset of 20 reports from the test collection.
- Two hospital physicians were asked to select, for each report in the query collection, the most similar reports within the test collection.
- There is substantial agreement between the judges (Kappa test, k = 0.7980).
- Precision and recall of our method are compared with those obtained by a term-based approach.
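For one query report, the evaluation measures can be computed as in the sketch below. It assumes the standard set-based definitions of precision and recall and a balanced F-score (F1); the slides only say "F-score", so the balanced variant is an assumption, as are the function name and the report identifiers in the example.

```python
def precision_recall_f(retrieved, relevant):
    """Set-based precision, recall and balanced F-score for one query.

    `retrieved` holds the reports returned by the method; `relevant` holds
    the reports the judges selected as similar.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical report identifiers, for illustration only.
    p, r, f = precision_recall_f({"r1", "r2", "r3", "r4"}, {"r2", "r3", "r5"})
    print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Averaging these values over the 20 query reports would give collection-level figures; whether the authors averaged per query or pooled the decisions is not stated in the slides.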

Results
[Tables: Precision, Recall and F-score of the graph-based method and the term-based method under the Union and Intersection settings; a second table reports Precision and F-score only. The numeric values are missing from the transcript.]

Conclusion and Future Work
- The method achieves relatively high precision and recall, which are also well balanced.
- UMLS occasionally fails to recover relevant concepts, especially when they are expressed in shortened forms.
- Spelling errors in the clinical records are another obstacle to concept identification.
- Future work will test the method on a different evaluation collection containing longer medical records structured in different sections.