Unified Medical Language System® (UMLS®) NLM Presentation Theater MLA 2005 May 16 & 17, 2005 Rachel Kleinsorge
What does UMLS stand for?
  • Unified
  • Medical
  • Language
  • System
  UMLS®: Unified Medical Language System®
The UMLS is: The Knowledge Sources (delivered as machine-readable files)
  • Metathesaurus: concepts
  • Semantic Network: categories and relationships
  • SPECIALIST Lexicon & Lexical Tools: lexicon, lexical databases and programs
UMLS 3 Knowledge Sources
  • Metathesaurus
    - Over 100 source vocabularies
    - Over 1 million concepts
    - Inter-concept relationships
  • Semantic Network
    - 135 Semantic types (broad categories)
    - 54 Semantic relations (between categories)
  • Lexical resources
    - SPECIALIST Lexicon
    - Lexical tools (programs and databases)
History of the UMLS
  • Started at National Library of Medicine, 1986
  • “Long-term R&D project”
  • Complementary to IAIMS (Integrated Academic Information Management Systems)
  [Lindberg & al., Methods, 1993] [Humphreys & al., JAMIA, 1998]
  «[…] the UMLS project is an effort to overcome two significant barriers to effective retrieval of machine-readable information. The first is the variety of ways the same concepts are expressed in different machine-readable sources and by different people. The second is the distribution of useful information among many disparate databases and systems.»
UMLS Objectives
  • Knowledge Sources used to overcome:
    - disparities in language format
    - Ex: atrial fibrillation, auricular fibrillation, af
    - disparities in granularity and perspective
    - Ex: Contusions, hematoma, bruise
    - Ex: Instruct patient to promptly report nosebleeds and excessive bruising (NIC), Epistaxis (MeSH)
    - problems in mapping and aggregating within and across databases and systems
UMLS in Practice
  • Intellectual “middleware” = a set of multi-purpose tools for system developers
  • Databases: 3 separate sets of relational files
  • Tools:
    - MetamorphoSys (installation and customization)
    - Web interface Knowledge Source Server (UMLSKS)
    - Application programming interfaces
    - lvg (lexical programs)
    - RRF Subset Browser
  The UMLS is not an end-user application
UMLS Uses
  • Information retrieval
  • Thesaurus construction
  • Natural language processing
  • Automated indexing
  • Electronic patient records
License Agreements
  • Semantic Network, SPECIALIST Lexicon, and Lexical Programs
    - terms and conditions of use online
  • Metathesaurus
    - license agreement process
    - some restrictions
      * 2. No charges, usage fees or royalties will be paid to NLM.
      * 5. Within 30 days of the end of any calendar year … provide NLM with a brief report
      * 11.c. required to include … identifiers from … the original source vocabularies
      * 12. For material … from some sources additional restrictions … may apply.
What is the UMLS? Overview through an example
Addison’s Disease in medical vocabularies
  • Synonyms: different terms
    - Addisonian syndrome
    - Bronzed disease
    - Addison melanoderma
    - Asthenia pigmentosa
    - Primary adrenal deficiency
    - Primary adrenal insufficiency
    - Primary adrenocortical insufficiency
    - Chronic adrenocortical insufficiency
  • Contexts: different hierarchies
  Diagram labels: symptoms, clinical variants, eponym
Metathesaurus gathers and organizes terms
  • Synonymous terms clustered into a concept
  • Preferred term is chosen
  • Unique identifier (CUI) is assigned
  Concept: Adrenal Gland Diseases, CUI C
    Term                              Source   Code
    Adrenal gland diseases            MeSH     D
    Adrenal disorder                  AOD
    Disorder of adrenal gland         Read     C15z.
    Diseases of the adrenal glands    SNOMED   DB
Cluster of synonymous terms
  Concept C […]
    Term L
      S Adrenal Gland Diseases
      S Adrenal Gland Disease
      S Disease of adrenal gland
      S Disease of adrenal gland, NOS
      S Disease, adrenal gland
      S Gland Disease, Adrenal
    Term L
      S Disorder of adrenal gland, unspecified
      S Unspecified disorder of adrenal glands
    […]
    Term L
      S Adrenal disease
      S ADRENAL DISEASE, NOS
    […]
    Term L
      S Disorder of adrenal gland
      S Adrenal Gland Disorders
    […]
    Term L
      S ADRENAL DISORDER
      S DISORDER ADRENAL (NOS)
    […]
    Term L
      S Nebennierenkrankheiten (GER)
    Term L
      S SURRENALE, MALADIES (FRE)
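The concept / term / string nesting on this slide can be sketched as a small data structure. This is a minimal illustration, not the Metathesaurus file format: the identifiers (`C_DEMO`, `L_1`, `S_1`, ...) are hypothetical placeholders, and only the strings and language tags are taken from the slide.

```python
from collections import defaultdict

# All identifiers below are hypothetical placeholders, not real UMLS values.
ROWS = [
    # (concept id, term id, string id, language, string)
    ("C_DEMO", "L_1", "S_1", "ENG", "Adrenal Gland Diseases"),
    ("C_DEMO", "L_1", "S_2", "ENG", "Adrenal Gland Disease"),
    ("C_DEMO", "L_1", "S_3", "ENG", "Disease of adrenal gland"),
    ("C_DEMO", "L_2", "S_4", "ENG", "Adrenal disease"),
    ("C_DEMO", "L_2", "S_5", "ENG", "ADRENAL DISEASE, NOS"),
    ("C_DEMO", "L_3", "S_6", "GER", "Nebennierenkrankheiten"),
]


def build_clusters(rows):
    """Group strings under terms, and terms under concepts."""
    concepts = defaultdict(lambda: defaultdict(list))
    for cui, lui, sui, lang, text in rows:
        concepts[cui][lui].append((sui, lang, text))
    return concepts


def strings_for(concepts, cui, lang=None):
    """Return every string of a concept, optionally filtered by language."""
    return [
        text
        for strings in concepts[cui].values()
        for _sui, string_lang, text in strings
        if lang is None or string_lang == lang
    ]


clusters = build_clusters(ROWS)
print(len(clusters["C_DEMO"]))                 # number of terms in the concept
print(strings_for(clusters, "C_DEMO", "GER"))  # German strings only
```

Keying everything on the concept identifier is what lets a system retrieve all synonyms, in any language, from a single lookup.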
Semantic Network
  • Organizes terms into 135 broad subject categories
    - Semantic Types (Clinical Drug, Virus)
    - Addison’s Disease Semantic Type: Disease or Syndrome
  • Defines 54 Semantic Relationships
    - Links between categories (isa, causes, treats)
    - Ex: Virus causes Disease or Syndrome
  • Together, types and relations:
    - Form the structure of the semantic network
    - Broadly categorize the biomedical domain
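One way to picture how types and relations work together is as a set of permitted (type, relation, type) triples that is consulted through each concept's semantic type. The sketch below is an assumption-laden toy: the single triple and the Addison's Disease type assignment come from the slide, while `Example virus`, `ALLOWED`, `SEMANTIC_TYPE` and `relation_allowed` are hypothetical names introduced for illustration.

```python
# Type-level relations. This one triple is the example given on the slide.
ALLOWED = {
    ("Virus", "causes", "Disease or Syndrome"),
}

# Concept-to-type assignments. "Example virus" is a hypothetical concept name.
SEMANTIC_TYPE = {
    "Addison's Disease": "Disease or Syndrome",
    "Example virus": "Virus",
}


def relation_allowed(subject, relation, obj):
    """True if the semantic types of the two concepts permit the relation."""
    triple = (SEMANTIC_TYPE.get(subject), relation, SEMANTIC_TYPE.get(obj))
    return triple in ALLOWED


print(relation_allowed("Example virus", "causes", "Addison's Disease"))
print(relation_allowed("Addison's Disease", "causes", "Example virus"))
```

A positive answer only means the relation is permitted between the two categories; it does not assert that the relation actually holds between the two concepts.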
Lexical Tools process terms and text
  Organization is accomplished using:
  • Normalization
  • Semantic pre-processing
  • UMLS editors
  Adrenal gland diseases, Adrenal disorder, Disorder of adrenal gland, Diseases of the adrenal glands → C
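The normalization step can be approximated as: lowercase, strip punctuation, drop stop words, reduce words to a base form, and sort the words. The sketch below is a simplified stand-in, not the lvg programs themselves: the stop-word list is an assumption, and the tiny `BASE_FORMS` table is a hypothetical substitute for the SPECIALIST Lexicon lookup.

```python
import re

# Assumed stop-word list for this sketch.
STOP_WORDS = {"of", "and", "with", "for", "nos", "to", "in", "by", "on", "the"}

# Hypothetical stand-in for a lexicon lookup of uninflected forms.
BASE_FORMS = {"diseases": "disease", "glands": "gland", "disorders": "disorder"}


def normalize(term):
    """Reduce a term to a case-, punctuation- and word-order-independent key."""
    words = re.findall(r"[a-z0-9]+", term.lower())      # lowercase, drop punctuation
    words = [w for w in words if w not in STOP_WORDS]   # drop stop words
    words = [BASE_FORMS.get(w, w) for w in words]       # map to base forms
    return " ".join(sorted(words))                      # ignore word order


for term in ["Adrenal Gland Diseases",
             "Disease of adrenal gland, NOS",
             "Gland Disease, Adrenal",
             "Adrenal disorder"]:
    print(term, "->", normalize(term))
```

In this sketch the first three strings collapse to the same key, while "Adrenal disorder" does not. That gap is why normalization alone is not enough and the slide also lists semantic pre-processing and UMLS editors.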
Summary
  Metathesaurus:
  • clusters terms into concepts
  • assigns unique identifier
  Semantic Network:
  • defines relationships between concepts, organizes concepts into categories
  Lexicon and Lexical Tools:
  • process terms for entry into the Metathesaurus