Unified Medical Language System® (UMLS®) NLM Presentation Theater MLA 2007 National Library of Medicine National Institutes of Health U.S. Dept. of Health and Human Services
The UMLS consists of 3 Knowledge Sources, used separately or together: Metathesaurus (1 million+ biomedical concepts from over 100 sources); Semantic Network (135 broad categories and 54 relationships between them); SPECIALIST Lexicon + Tools (lexical information and programs for language processing).
UMLS Objectives: began in 1986 as a long-term R&D project; designed for systems developers; develop multi-purpose tools to enhance understanding of medical meaning across systems; overcome barriers to effective retrieval of machine-readable information; overcome the variety of ways the same concepts are expressed in machine-readable and human language. Dr. Donald A. Lindberg, Director, National Library of Medicine.
UMLS Uses: information retrieval; thesaurus construction; natural language processing; automated indexing; electronic health records (EHR); distribution mechanism for HIPAA, CHI, PHIN regulatory standards; SNOMED CT.
UMLS Releases: 3-4 updates (releases) per year; 3 Knowledge Sources in 3 sets of relational files; tools: MetamorphoSys (install and customize), RRF Subset Browser, lexical tools including LVG; download from UMLSKS or on DVD.
UMLS Access. UMLS Knowledge Source Server (UMLSKS): browse the 3 Knowledge Sources; query with Java or XML-based APIs; download files and programs; access documentation and other resources. Local installation: use MetamorphoSys to install the Knowledge Sources and customize the Metathesaurus; view the customized subset with the browser; use load scripts to load into a database.
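The “load scripts” step on this slide amounts to reading pipe-delimited RRF rows into database tables. A minimal Python sketch of that idea, assuming SQLite as the target database and the Metathesaurus concept-names file (MRCONSO.RRF) as input. The column list, the `load_rrf` helper, and the sample row (with invented identifiers) are illustrative assumptions, not the load scripts distributed with the UMLS:

```python
import sqlite3

# Assumed column order for MRCONSO.RRF; verify against the release documentation.
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def load_rrf(conn, table, columns, lines):
    """Create `table` if needed, insert pipe-delimited rows, return the row count."""
    col_defs = ", ".join(f"{name} TEXT" for name in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        # Each RRF row ends with a trailing "|", so the split yields one
        # extra empty field, which is dropped here.
        rows.append(line.split("|")[:len(columns)])
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()
    return len(rows)


# Hypothetical sample row: every identifier below is invented for illustration.
SAMPLE = [
    "C9999999|ENG|P|L9999999|PF|S9999999|Y|A9999999||||MSH|MH|X000000|Addison Disease|0|N||\n",
]

conn = sqlite3.connect(":memory:")
loaded = load_rrf(conn, "MRCONSO", MRCONSO_COLUMNS, SAMPLE)
names = conn.execute(
    "SELECT STR, SAB FROM MRCONSO WHERE CUI = ?", ("C9999999",)
).fetchall()
print(loaded, names)
```

For a real subset, an open file handle such as `open("MRCONSO.RRF", encoding="utf-8")` could be passed in place of `SAMPLE`, since the helper only iterates over lines.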
UMLS License Agreements. Terms and conditions of use online: Semantic Network; SPECIALIST Lexicon & Lexical Tools. Metathesaurus: sign formal license agreement; additional restrictions apply to use of some sources, as noted in the Appendix to the License Agreement.
Metathesaurus: 100+ general and specialized biomedical vocabularies; 17 languages (63% English); 1 million+ concepts; 6 million+ names; 100K+ relationships (hierarchical, semantic, statistical and mapping relationships); distributed in a common electronic format.
Metathesaurus Source Vocabularies: vary in purpose, structure and properties; used in clinical, research, administrative, public health reporting; all are sets of valid values. Thesauri, e.g., MeSH, CRISP, NCI; statistical classifications, e.g., ICD-9-CM; billing codes, e.g., CPT, ABC Codes; clinical coding systems, e.g., SNOMED CT.
SNOMED CT: comprehensive clinical terminology; created by the College of American Pathologists; ownership transferred to the International Health Terminology Standards Development Organisation (IHTSDO) in April 2007; 9 charter member countries, including the U.S.; NLM represents the U.S.; NLM distributes SNOMED CT to U.S. users in both native and UMLS formats.
Metathesaurus Concepts: synonymous terms clustered into a concept; unique identifier (CUI) is assigned; source information preserved.
Concept: Addison’s disease (CUI C…)
  Addison’s disease [SNOMED CT, PT]
  Addison’s Disease [MedlinePlus, PT, T1233]
  Addison Disease [MeSH, PT, D…]
  Primary Adrenal Insufficiency [MeSH, EN, D…]
  Primary hypoadrenalism [MedDRA, LT]
  syndrome, Addison…
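The clustering described on this slide can be pictured as one concept record holding several source-specific names. A minimal Python sketch, assuming hypothetical `Atom` and `Concept` classes and a placeholder identifier (`CUI-EXAMPLE`), since the slide's CUI is truncated; this illustrates the idea and is not the Metathesaurus file format:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Atom:
    """One name for a concept as it appears in one source vocabulary."""
    name: str
    source: str
    term_type: str


@dataclass
class Concept:
    """A cluster of synonymous names sharing one concept identifier."""
    cui: str
    atoms: list = field(default_factory=list)

    def sources(self):
        """Source vocabularies contributing at least one name."""
        return sorted({atom.source for atom in self.atoms})


# "CUI-EXAMPLE" is a placeholder, not a real identifier.
addison = Concept("CUI-EXAMPLE", [
    Atom("Addison's disease", "SNOMED CT", "PT"),
    Atom("Addison's Disease", "MedlinePlus", "PT"),
    Atom("Addison Disease", "MeSH", "PT"),
    Atom("Primary Adrenal Insufficiency", "MeSH", "EN"),
    Atom("Primary hypoadrenalism", "MedDRA", "LT"),
])


def build_name_index(concepts):
    """Map each lowercased name to the set of concept identifiers using it."""
    index = {}
    for concept in concepts:
        for atom in concept.atoms:
            index.setdefault(atom.name.lower(), set()).add(concept.cui)
    return index


index = build_name_index([addison])
print(index["primary adrenal insufficiency"])
print(addison.sources())
```

Keeping the source and term type on each name, rather than merging names into a single preferred string, is what lets every vocabulary's view survive inside the shared concept.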
“The UMLS approach … assumes continuing diversity in the formats and vocabularies of different information sources and in the language employed by different elements of the biomedical community. It is not an attempt to build a single standard biomedical vocabulary.” Betsy L. Humphreys, Deputy Director, National Library of Medicine. Source: Humphreys, BL and PL Schuyler, The Unified Medical Language System: Moving beyond the vocabulary of bibliographic retrieval. In Broering NC, ed. High-Performance Medical Libraries: advanced information management for the virtual era. Westport (CT): Meckler; 1993, p. 33.
Semantic Network. 135 Semantic Types: broad subject categories in 2 hierarchies; assigned to all Metathesaurus concepts. 54 Semantic Relationships: useful, important links between Types; hierarchical “isa” and associative relations. Categorize the Metathesaurus; enhance meaning of concepts.
“Biologic Function” hierarchy (isa)
Biologic Function 360
  Pathologic Function 9983
    Disease or Syndrome
      Mental or Behavioral Dysfunction 5691
      Neoplastic Process
    Cell or Molecular Dysfunction 1276
    Experimental Model of Disease 72
  Physiologic Function 691
    Organism Function 1528
      Mental Process 1224
    Organ or Tissue Function 2912
    Cell Function 4417
    Molecular Function
      Genetic Function 1340
Semantic Relations: between types
  Disease or Syndrome associated_with Finding
  Disease or Syndrome result_of Pathologic Function
  Body Part, Organ, or Organ Component location_of Disease or Syndrome
  Hormone affects Disease or Syndrome
  Hormone causes Disease or Syndrome
  Hormone complicates Disease or Syndrome
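The “isa” hierarchy and the associative relations between types can be combined in a small lookup. A minimal Python sketch, assuming that a relation stated for a type also applies to its “isa” descendants; the `ISA` map covers only part of the “Biologic Function” hierarchy, and the names `ISA`, `RELATIONS`, `lineage`, and `relations_between` are illustrative, not part of any UMLS tool:

```python
# Child -> parent links for part of the "Biologic Function" hierarchy.
ISA = {
    "Pathologic Function": "Biologic Function",
    "Physiologic Function": "Biologic Function",
    "Disease or Syndrome": "Pathologic Function",
    "Cell or Molecular Dysfunction": "Pathologic Function",
    "Experimental Model of Disease": "Pathologic Function",
    "Mental or Behavioral Dysfunction": "Disease or Syndrome",
    "Neoplastic Process": "Disease or Syndrome",
}

# (subject type, relation, object type) triples taken from the slide.
RELATIONS = {
    ("Disease or Syndrome", "associated_with", "Finding"),
    ("Disease or Syndrome", "result_of", "Pathologic Function"),
    ("Body Part, Organ, or Organ Component", "location_of", "Disease or Syndrome"),
    ("Hormone", "affects", "Disease or Syndrome"),
    ("Hormone", "causes", "Disease or Syndrome"),
    ("Hormone", "complicates", "Disease or Syndrome"),
}


def lineage(semantic_type):
    """Return the type followed by its ancestors, nearest first."""
    chain = [semantic_type]
    while semantic_type in ISA:
        semantic_type = ISA[semantic_type]
        chain.append(semantic_type)
    return chain


def relations_between(subject, obj):
    """Relations linking the two types, including those inherited via isa."""
    subject_line = set(lineage(subject))
    object_line = set(lineage(obj))
    return sorted(
        rel for s, rel, o in RELATIONS if s in subject_line and o in object_line
    )


print(lineage("Neoplastic Process"))
print(relations_between("Hormone", "Neoplastic Process"))
```

Here “Neoplastic Process” picks up the three Hormone relations only because it sits below “Disease or Syndrome”; a lookup without the inheritance assumption would return nothing for that pair.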
SPECIALIST Lexicon and Lexical Tools: English lexicon of 300K+ common words and biomedical terms; lexical records encode information on syntax, morphology, and orthography; used with associated lexical tools in Metathesaurus production and in natural language processing applications.
SPECIALIST Lexicon Lexical Entry: {base=disease entry=E cat=noun variants=reg variants=uncount compl=pphr(of,np|bone|) compl=pphr(of,np|breast|) compl=pphr(of,np|liver|) compl=pphr(of,np|ovary|)} Slots: base = base form; entry = unique identifier; cat = part of speech; variants = lexical variants; compl = prepositional phrase complements.
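A lexical entry like the one on this slide is a set of slot=value pairs in which a slot may repeat. A minimal Python sketch of reading such a record, assuming the slots are separated by whitespace as they appear here; `parse_lexical_entry` is a hypothetical helper, not one of the SPECIALIST tools, and the truncated `entry=E` identifier is kept exactly as shown:

```python
from collections import defaultdict

# The record as it appears on the slide; the entry identifier is truncated there.
ENTRY = (
    "{base=disease entry=E cat=noun variants=reg variants=uncount "
    "compl=pphr(of,np|bone|) compl=pphr(of,np|breast|) "
    "compl=pphr(of,np|liver|) compl=pphr(of,np|ovary|)}"
)


def parse_lexical_entry(record):
    """Return a dict mapping each slot name to the list of its values."""
    body = record.strip().strip("{}")
    slots = defaultdict(list)
    for token in body.split():
        # Split on the first "=" only, so values may contain any other characters.
        key, _, value = token.partition("=")
        slots[key].append(value)
    return dict(slots)


entry = parse_lexical_entry(ENTRY)
print(entry["base"], entry["cat"], entry["variants"])
print(len(entry["compl"]), "prepositional phrase complements")
```

Storing every slot as a list keeps repeated slots such as `variants` and `compl` intact instead of letting the last value overwrite earlier ones.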
Lexical Tools: manage lexical variation in biomedical terminologies and text; used separately or with the SPECIALIST Lexicon; perform transformations selected and ordered by users; 3 primary programs: normalizer, word index generator, lexical variant generator.
Normalization 1: steps applied to “Hodgkin’s diseases, NOS”
  Remove genitive: Hodgkin diseases, NOS
  Remove stop words: Hodgkin diseases,
  Lowercase: hodgkin diseases,
  Strip punctuation: hodgkin diseases
  Uninflect: hodgkin disease
  Sort words: disease hodgkin
Normalization 2: the following strings all normalize to “disease hodgkin”: Hodgkin Disease; HODGKINS DISEASE; Hodgkin's Disease; Disease, Hodgkin's; Hodgkin's, disease; HODGKIN'S DISEASE; Hodgkin's disease; Hodgkins Disease; Hodgkin's disease NOS; Hodgkin's disease, NOS; Disease, Hodgkins; Diseases, Hodgkins; Hodgkins Diseases; Hodgkins disease; hodgkin's disease; Disease, Hodgkin.
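The normalization steps on these two slides can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the SPECIALIST normalizer: the stop-word list is a small assumed set, and `uninflect` is a crude suffix rule standing in for the lexicon-based uninflection the real tool performs.

```python
import re
import string

# Assumed stop-word list for illustration only.
STOP_WORDS = {"nos", "of", "and", "with", "for", "to", "in", "by", "on", "the"}


def uninflect(word):
    """Crude stand-in for uninflection: drop a plural-looking final 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(term):
    """Apply the slide's steps in order and return the normalized string."""
    # 1. Remove genitive ('s, with a straight or curly apostrophe).
    term = re.sub(r"['’]s\b", "", term, flags=re.IGNORECASE)
    # 2. Remove stop words, ignoring case and attached punctuation.
    tokens = [
        tok for tok in term.split()
        if tok.strip(string.punctuation).lower() not in STOP_WORDS
    ]
    # 3. Lowercase.
    text = " ".join(tokens).lower()
    # 4. Strip punctuation (replaced by spaces so words stay separated).
    text = re.sub(r"[^\w\s]", " ", text)
    # 5. Uninflect each word, then 6. sort the words.
    return " ".join(sorted(uninflect(word) for word in text.split()))


VARIANTS = [
    "Hodgkin’s diseases, NOS", "Hodgkin Disease", "HODGKINS DISEASE",
    "Hodgkin's, disease", "HODGKIN'S DISEASE", "Hodgkin's disease NOS",
    "Diseases, Hodgkins", "Disease, Hodgkin",
]
for variant in VARIANTS:
    print(f"{variant!r} -> {normalize(variant)!r}")
```

Because every variant collapses to the same string, the normalized form can serve as a lookup key for matching text against terminology entries despite differences in case, word order, inflection, and punctuation.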
UMLS Knowledge Source Server (UMLSKS) Home Page. From top links or buttons: search the 3 Knowledge Sources. From sidebar: downloads; documentation; resources.
UMLS Documentation and Support: UMLS Home Page; UMLSKS; NLP and Lexical Tools; NLM customer service.
Summary. 3 Knowledge Sources: Metathesaurus, Semantic Network, Lexicon + Lexical Tools. MetamorphoSys: install, customize, browse. UMLSKS: browse, query, download. License Agreement.
Thank you