The UMLS and the Semantic Web

Presentation transcript:

The UMLS and the Semantic Web
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications, Bethesda, Maryland - USA
W3C Semantic Web Health Care and Life Sciences Interest Group, BioRDF Teleconference, September 22, 2008

Outline
  The UMLS (in a nutshell)
    Lexical resources
    Metathesaurus
    Semantic Network
  Why is the UMLS relevant to the Semantic Web?
  Issues and challenges

Unified Medical Language System (UMLS)

UMLS: 3 components
  SPECIALIST Lexicon (lexical resources)
    200,000 lexical items
    Part of speech and variant information
  Metathesaurus (terminological resources)
    5M names from over 100 terminologies
    1M concepts
    16M relations
  Semantic Network (ontological resources)
    135 high-level categories
    7000 relations among them

UMLS Characteristics (1)
  Current version: 2008AA (2-3 annual releases)
  Type: Terminology integration system
  Domain: Biomedicine
  Developer: NLM
  Funding: NLM (intramural)
  Availability
    Publicly available: Yes* (cost-free license required)
    Repositories: UMLS
    URL: http://umlsks.nlm.nih.gov/

UMLS Characteristics (2)
  Number of
    Concepts: 1.5M (2008AA)
    Terms: ~6M
  Major organizing principles (Metathesaurus):
    Concept orientation
    Source transparency
    Multi-lingual through translation
  Formalism: Proprietary format (RRF)

UMLS: Integrating subdomains
[Diagram: the UMLS at the center, linking the following subdomains and their vocabularies]
  Clinical repositories: SNOMED CT
  Genetic knowledge bases: OMIM
  Biomedical literature: MeSH
  Genome annotations: GO
  Anatomy: FMA
  Model organisms: NCBI Taxonomy
  Other subdomains: …

Trans-namespace integration
[Diagram: the same subdomains as on the previous slide, with one disease traced across namespaces through the UMLS]
  SNOMED CT (clinical repositories): Addison's disease (363732003)
  MeSH (biomedical literature): Addison Disease (D000224)
  UMLS: C0001403
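
The trans-namespace integration on this slide can be read as a lookup that goes through the shared Metathesaurus concept identifier. The following is a minimal sketch in Python, assuming a tiny hand-built table that contains only the three identifiers shown on the slide. The table layout and the function name `translate` are hypothetical and are not part of any UMLS tool.

```python
# Hand-built table holding only the identifiers shown on the slide.
# (source vocabulary, source identifier) -> UMLS concept identifier (CUI)
SOURCE_TO_CUI = {
    ("SNOMED CT", "363732003"): "C0001403",
    ("MeSH", "D000224"): "C0001403",
}


def translate(source, code, target):
    """Map a code in one vocabulary to codes in another via the shared CUI."""
    cui = SOURCE_TO_CUI.get((source, code))
    if cui is None:
        return []  # unknown code: nothing to bridge
    return sorted(
        other_code
        for (other_source, other_code), other_cui in SOURCE_TO_CUI.items()
        if other_cui == cui and other_source == target
    )


if __name__ == "__main__":
    print(translate("SNOMED CT", "363732003", "MeSH"))
```

Neither source vocabulary needs to know about the other in this sketch: both only point at C0001403, which is the sense in which the integration goes beyond shared identifiers.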

[Figure: the concept Heart, shown at two levels]
  Concepts (Metathesaurus): Heart, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Esophagus, Left Phrenic Nerve, Angina Pectoris, Cardiotonic Agents, Tissue Donors
  Semantic Types (Semantic Network): Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Pharmacologic Substance; Disease or Syndrome; Population Group

Why is the UMLS relevant to the Semantic Web?

Relevance to the SW: Metathesaurus
  Terminology integration system
    Trans-namespace integration
    Integration beyond shared identifiers
  Repository of biomedical terminologies/ontologies
    Many UMLS vocabularies used for the annotation of datasets (including clinical records)

Relevance to the SW: Metathesaurus
  Broad coverage of biomedicine
  Large user base
  Tooling available
    E.g., visualization, named entity recognition, etc.

Relevance to the SW: Semantic Network
  Top-level ontology of the biomedical domain
    Broad biomedical categories
    Helps partition biomedical concepts
  Semantic relations
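
How semantic types partition concepts can be illustrated with a small grouping step. This is a minimal sketch in Python, assuming a hand-copied list of concept and semantic type pairs for a few of the concepts named earlier in the deck. The list and the function name `partition_by_semantic_type` are illustrative only; a real application would read the assignments from the UMLS release files.

```python
from collections import defaultdict

# (concept name, semantic type) pairs, hand-copied for illustration only.
CONCEPT_TYPES = [
    ("Heart", "Body Part, Organ, or Organ Component"),
    ("Esophagus", "Body Part, Organ, or Organ Component"),
    ("Angina Pectoris", "Disease or Syndrome"),
    ("Cardiotonic Agents", "Pharmacologic Substance"),
    ("Tissue Donors", "Population Group"),
]


def partition_by_semantic_type(pairs):
    """Group concept names under the semantic type assigned to them."""
    groups = defaultdict(list)
    for concept, semantic_type in pairs:
        groups[semantic_type].append(concept)
    return dict(groups)


if __name__ == "__main__":
    for semantic_type, concepts in partition_by_semantic_type(CONCEPT_TYPES).items():
        print(f"{semantic_type}: {', '.join(concepts)}")
```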

Issues and Challenges

Issues and challenges
  Availability
    Mandatory license agreement
  Discoverability
    No metadata
  Formalism
    No easy conversion to SKOS/RDF(S)/OWL
  Identifiers
  Steep learning curve

Availability
  Some source vocabularies have intellectual property restrictions
    E.g., most drug vocabularies
    Complex agreement for SNOMED CT: available at no cost for member countries of the IHTSDO
  Mandatory license agreement
    No cost for research
    May require negotiation with the vocabulary developer for production applications
  MetamorphoSys helps extract selected sources from the UMLS
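
The idea of extracting selected sources can be pictured as a filter over rows tagged with their source vocabulary. This is a minimal sketch in Python and is not how MetamorphoSys itself is implemented. The row layout, the source abbreviations `SNOMEDCT` and `MSH`, and the function name `select_sources` are assumptions made for the illustration.

```python
# Each row: (CUI, source abbreviation, name). Rows are illustrative only;
# the source abbreviations are assumed, not read from a UMLS release.
ROWS = [
    ("C0001403", "SNOMEDCT", "Addison's disease"),
    ("C0001403", "MSH", "Addison Disease"),
]


def select_sources(rows, allowed_sources):
    """Keep only the rows contributed by the sources a license permits."""
    allowed = set(allowed_sources)
    return [row for row in rows if row[1] in allowed]


if __name__ == "__main__":
    # Example: a subset restricted to MeSH only.
    for row in select_sources(ROWS, ["MSH"]):
        print(row)
```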

Discoverability
  Discoverability of individual concepts
    UMLSKS web services
    Search all UMLS source vocabularies at the same time
    Named entity recognition/normalization (e.g., MetaMap)
  Discoverability of terminologies/ontologies
    No comprehensive registries
    No rich registries, i.e., registries with rich metadata supporting the discoverability of terminologies/ontologies

Formalism
  UMLS: Proprietary format
    Rich Release Format (RRF)
    All terminologies/ontologies represented in the same format
  No easy conversion to SKOS/RDF(S)/OWL
    Underspecified semantics
      Child/parent is not necessarily subClassOf
    Complex semantics
      Descriptors / concepts / terms
      Rich attribute set
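
To make the conversion question concrete, here is a minimal sketch in Python that reads one pipe-delimited RRF row and emits a few SKOS statements as N-Triples-style strings. Several things are assumptions: the column order listed for MRCONSO.RRF, the `http://example.org/umls/` namespace, the rule for picking a preferred label, and the sample row, whose LUI, SUI and AUI values are placeholders rather than real UMLS data.

```python
# Column order assumed for MRCONSO.RRF (pipe-delimited, with a trailing pipe).
MRCONSO_FIELDS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

BASE = "http://example.org/umls/"  # hypothetical namespace, not an official one
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
LANG_TAGS = {"ENG": "en"}  # UMLS language code -> RDF language tag

# Placeholder row: the LUI, SUI and AUI values are made up for illustration.
SAMPLE = (
    "C0001403|ENG|P|L0000001|PF|S0000001|Y|A0000001|||D000224|MSH|MH|"
    "D000224|Addison Disease|0|N||"
)


def parse_mrconso_line(line):
    """Split one RRF row into a field-name -> value mapping."""
    values = line.rstrip("\n").split("|")
    # zip stops at the last named field, so the empty string produced by
    # the trailing pipe is ignored.
    return dict(zip(MRCONSO_FIELDS, values))


def to_skos_triples(record):
    """Emit a type, a label and a notation statement for one row."""
    subject = f"<{BASE}{record['CUI']}>"
    text = record["STR"].replace("\\", "\\\\").replace('"', '\\"')
    lang = LANG_TAGS.get(record["LAT"])
    literal = f'"{text}"@{lang}' if lang else f'"{text}"'
    # Simplifying assumption: TS=P, STT=PF, ISPREF=Y marks the preferred name.
    preferred = (record["TS"], record["STT"], record["ISPREF"]) == ("P", "PF", "Y")
    label = "prefLabel" if preferred else "altLabel"
    return [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f"{subject} <{SKOS}{label}> {literal} .",
        f'{subject} <{SKOS}notation> "{record["CODE"]}" .',
    ]


if __name__ == "__main__":
    for triple in to_skos_triples(parse_mrconso_line(SAMPLE)):
        print(triple)
```

A syntactic mapping like this covers only names and codes. It does not settle the harder question raised on the slide, namely what a child/parent link or a descriptor/concept/term distinction should mean in SKOS, RDF(S) or OWL.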

Identifiers for biomedical entities
  What is identified?
    Entity vs. resource about the entity
  Which identifier to pick? E.g., Addison's disease
    363732003 (SNOMED CT)
    D000224 (MeSH)
    C0001403 (UMLS Metathesaurus)
  Which format?
    URI vs. LSID
  Which authoritative source for minting URIs?
    Ontology developers vs. (e.g.) Bio2RDF
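
The format question can be made concrete by rendering the three identifiers from the slide in both styles. This is a minimal sketch in Python, assuming a hypothetical authority `example.org` and hypothetical namespace labels. It does not reflect the scheme of any actual provider, and the LSID layout used is `urn:lsid:<authority>:<namespace>:<object>`.

```python
# Hypothetical base URI and LSID authority; neither is an official scheme.
BASE_URI = "http://example.org/id/"
LSID_AUTHORITY = "example.org"

# (namespace label, local identifier) for Addison's disease, from the slide.
# The namespace labels are made up for this illustration.
IDENTIFIERS = [
    ("snomedct", "363732003"),
    ("mesh", "D000224"),
    ("umls", "C0001403"),
]


def as_uri(namespace, local_id):
    """Render an identifier as an HTTP URI under the hypothetical base."""
    return f"{BASE_URI}{namespace}/{local_id}"


def as_lsid(namespace, local_id):
    """Render an identifier as an LSID under the hypothetical authority."""
    return f"urn:lsid:{LSID_AUTHORITY}:{namespace}:{local_id}"


if __name__ == "__main__":
    for namespace, local_id in IDENTIFIERS:
        print(as_uri(namespace, local_id), as_lsid(namespace, local_id))
```

Both renderings depend on who controls the authority part, which is the minting question on the slide: the same local identifier yields different global identifiers depending on whether the ontology developer or a third party issues them.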

Steep learning curve
  Large resource
    1.5M concepts
    6M terms
    Over 20M relations
  Complex structure
    Metathesaurus
    Semantic Network
    Rich set of attributes
    Rich set of relations
      Terminological
      Semantic
      Statistical
      Mapping
  Multiple languages
  Complex domain

Conclusions

Conclusions
  UMLS as a terminology integration system
    Helps bridge across namespaces
    Helps integrate information sources
    Beyond shared identifiers
  UMLS as a repository of terminologies/ontologies
    Single source, single format for 143 vocabularies
  Issues with availability, discoverability and formalism
  Identifiers for biomedical entities

References
  UMLS: umlsinfo.nlm.nih.gov
  UMLS browsers (free, but UMLS license required)
    Knowledge Source Server: umlsks.nlm.nih.gov
    Semantic Navigator: http://mor.nlm.nih.gov/perl/semnav.pl
    RRF browser (standalone application distributed with the UMLS)

References
  Recent overviews
    Bodenreider O. (2004). The Unified Medical Language System (UMLS): Integrating biomedical terminology. Nucleic Acids Research; 32 (Database issue): D267-D270.
    Bodenreider O. (2006). From terminology integration to information integration: Unified Medical Language System (UMLS). BioRDF Teleconference, W3C Semantic Web Health Care and Life Sciences Interest Group, June 5, 2006. http://mor.nlm.nih.gov/pubs/pres/060605-BioRDF.pdf

Medical Ontology Research
  Contact: olivier@nlm.nih.gov
  Web: mor.nlm.nih.gov
  Olivier Bodenreider
  Lister Hill National Center for Biomedical Communications
  Bethesda, Maryland - USA