Lexical ambiguity in SNOMED CT

Slides:



Advertisements
Similar presentations
Semantic Interoperability in Health Informatics: Lessons Learned 10 January 2008Semantic Interoperability in Health Informatics: Lessons Learned 1 Medical.
Advertisements

Copyright © 2001 College of American Pathologists Pseudophakic corneal edema Sample Hierarchy for Corneal Edema Idiopathic corneal edema Phakic corneal.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Nasal Septoplasty Hair bearing flap to eyebrow Repair of eyebrow Operation on eyebrow.
Lab and Messaging CoP 10/25/2012. Agenda Agenda review Introduction to the Specimen Cross-Mapping Table (Specimen CMT) – Background – Expected uses –
Catalina Martínez-Costa, Stefan Schulz: Ontology-based reinterpretation of the SNOMED CT context model Ontology-based reinterpretation of the SNOMED CT.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Fine Needle Biopsy of Abdominal Wall Mass Tru cut biopsy of lesion of abdominal.
Copyright © 2001 College of American Pathologists Sample Hierarchy for Hepatitis Chronic viral hepatitis Chronic viral hepatitis C About SNOMED Relationships.
Terminology in Health Care and Public Health Settings
Ontology-based Convergence of Medical Terminologies: SNOMED CT and ICD-11 Institute for Medical Informatics, Statistics and Documentation, Medical University.
Karen Gibson.  Significant investment in eHealth is underway  Clinical records: ◦ Not only a record for the author ◦ Essential to inform the next person.
Ontology Development in the Sciences Some Fundamental Considerations Ontolytics LLC Topics:  Possible uses of ontologies  Ontologies vs. terminologies.
Healthcare Services as Collective Activity Susan Wakenshaw Xiao MA.
SNOMED CT Denise Downs Knowledge Management & Education Lead Data Standards, Technology Office Department of Health Informatics Directorate.
Concept Model for observables, investigations, and observation results For the IHTSDO Observables Project Group and LOINC Coordination Project Revision.
SNOMED CT – Distributed Content Management Stefan Schulz Content Committee April 2, 2009.
Survey of Medical Informatics CS 493 – Fall 2004 September 27, 2004.
Tommie Curtis SAIC January 17, 2000 Open Forum on Metadata Registries Santa Fe, NM SDC JE-2023.
Definition of a taxonomy “System for naming and organizing things into groups that share similar characteristics” Taxonomy Architectures Applications.
A School of Information Science, Federal University of Minas Gerais, Brazil b Medical University of Graz, Austria, c University Medical Center Freiburg,
WHO – IHTSDO: SNOMED CT – ICD-11 coordination: Conditions vs. Situations Stefan Schulz IHTSDO conference Copenhagen, Apr 24, 2012.
How to write a scientific article Nikolaos P. Polyzos M.D. PhD.
Ontological Analysis, Ontological Commitment, and Epistemic Contexts Stefan Schulz WHO – IHTSDO Joint Advisory Group First Face-to-Face Meeting Heathrow,
Lab and Messaging CoP 11/29/2012. Agenda Agenda review Introduction to the Specimen Cross-Mapping Table (Specimen CMT) – Background – Expected uses –
Semantic Relation Discovery by Using Co-occurrence Information Background: MEDLINE contains high quality semantic metadata covering more than 22 million.
SNOMED Core Structures NAHLN January 2005 Las Vegas, NV.
Detection of underspecifications in SNOMED CT concept definitions using language processing 1 Federal Technical University of Paraná (UTFPR), Curitiba,
SNOMED CT A Technologist’s Perspective Gaur Sunder Principal Technical Officer & Incharge, National Release Center VC&BA, C-DAC, Pune.
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Quality Assurance, Ontology Engineering, and Semantic Interoperability.
SNOMED mapping for Pan-Canadian Surgery Templates Elaine Maloney January 27, 2015 ITHSDO Implementation SIG.
Finding Content in SNOMED CT Jo Oakes – Knowledge & Information Manager.
1 Alberta Health Services Capital Health Palliative Care Program Clinical Vocabulary Pilot Project Project Update Friday April 24, 2009 Dennis Lee & Francis.
Mapping from antecedent SNOMED to SNOMED CT in Histopathology and Cellular Pathology IHTSDO Business Meeting, London April 2016 Presented by Jeremy Rogers.
Mapping the NCI Thesaurus and the Collaborative Inter-Lingual Index Amanda Hicks University of Florida HealthInsight Workshop, Oslo, Norway.
Oncology in SNOMED CT NCI Workshop The Role of Ontology in Big Cancer Data Session 3: Cancer big data and the Ontology of Disease Bethesda, Maryland May.
© University of Manchester Creative Commons Attribution-NonCommercial 3.0 unported 3.0 license Quality Assurance, Ontology Engineering, and Semantic Interoperability.
Metaphrase EVS API Presented to the National Cancer Institute (NCI) Office of Informatics (OI) Noel Ramathal 06/06/2000.
Dispelling the myths about dm+d Presenter: Karen ReesDate: 10 th December 2014 dm+d and the NHS Standard.
Developments in The Netherlands focussed on the IHTSDO Workbench Jos Baptist, senior advisor standardisation processes IHTSDO, Stockholm, October, 2012.
A Proposed Approach to Binding SNOMED CT to HL7 FHIR Dr Linda Bird Senior Implementation Specialist.
Acquisition of Character Translation Rules for Supporting SNOMED CT Localizations Jose Antonio Miñarro-Giménez a Johannes Hellrich b Stefan Schulz a a.
Introduction to Health Informatics Leon Geffen MBChB MCFP(SA)
Assessing SNOMED CT for Large Scale eHealth Deployments in the EU Workpackage 2- Building new Evidence Daniel Karlsson, Linköping University Stefan Schulz,
SNOMED Core Structures RF2
SNOMED CT and Surgical Pathology
UNIFIED MEDICAL LANGUAGE SYSTEMS (UMLS)
The UMLS and the Semantic Web
The “Common Ontology” between ICD 11 and SNOMED CT
Using language technology for SNOMED CT localization
NeurOn: Modeling Ontology for Neurosurgery
Scope The scope of this test is to make a preliminary assessment of the fitness of the Draft Observables and Investigation Model (Observables model)
SNOMED CT and Surgical Pathology
ece 627 intelligent web: ontology and beyond
Can SNOMED CT be harmonized with an upper-level ontology?
Stefan Schulz Medical University of Graz (Austria)
How to bring both kinds of data together for biomarker research ?
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
ICD 11 FC and SNOMED CT – finding as Ontologies of Clinical Situations
Surgical Cancer Treatment
Terminology Quality Evaluation S60 Rashmie Abeysinghe Joint work with
Clinical Assessment Instruments and the LOINC/SNOMED CT Relationship
Kin Wah Fung, MD, MS, MA National Library of Medicine, NIH
Networking and Health Information Exchange
Presented by Joe Nichols MD Principal – Health Data Consulting
ICD-10 and Clinical Documentation
Architecture for ICD 11 and SNOMED CT Harmonization
Category-Based Pseudowords
Harmonizing SNOMED CT with BioTopLite
Stefan Schulz Medical University of Graz (Austria)
Reference terminologies vs. Interface terminologies
BFO meets critics Workshop organised by Barry Smith
Presentation transcript:

Lexical ambiguity in SNOMED CT ODLS 2017 Ontologies & Data in Life Sciences Lexical ambiguity in SNOMED CT Stefan Schulz Catalina Martínez-Costa Jose Antonio Miñarro-Giménez Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria

Background SNOMED CT: largest clinical terminology / ontology (English: 300 k concepts, 750k terms) Two aspects: SNOMED CT as a domain ontology: Labels (FSNs), tentatively self-explaining; formal descriptions and definitions (EL++), still few free text elucidations SNOMED CT as a domain terminology: at least for English, enrichment with quasi-synonyms ("interface terms") SNOMED International (aka IHTSDO). SNOMED CT http://www.snomed.org/snomed-ct

Ontology labels vs. Interface terms Self-explaining Univocal Long Unabridged Unpopular Should be understandable independent of contex Not self-explaining Ambiguous Short Abridged (Acronyms) Popular Depend on user groups, by specialty, institution, dialect "Primary malignant neoplasm of lung" "Leishmania tropica" "Electrocardiogram" "Diagnosis" "Ca Lung " "LT" "ECG" "Dx"

Popularity of terms (Pubmed titles and abstracts) FSN (SNOMED CT) Count SNOMED CT synonyms Primary malignant neoplasm of lung Lung cancer Bronchial carcinoma 120682 3452 Cerebrovascular accident 3819 Stroke 191559 Block dissection of cervical lymph nodes  1 Neck dissection 7512 Electrocardiographic procedure Electrocardiogram ECG 33670 55120 Backache 3489 Back pain 38132 Capillary blood specimens 32 Capillary blood samples 574

Lexical ambiguity in a nutshell "Term" and "concept" are two fundamentally different things: Concepts/classes/types/categories: units of language-independent meaning (Natural language) Terms: units of language, connected to concepts "financial institution" "bank" "riverside" Concept 1 Concept 2

Main questions Why should ontologies care about ambiguity aspects at all when studying ontology? User acceptance of ontology-based systems Quality of structured data entry Use of ontology in NLP scenarios How is lexical ambiguity related to the ontology issues proper? Completeness and quality of ontology content Complex categories

Understanding better SNOMED CT naming Fully specified names Unique – 1 : 1 relation with codes Carry a "hierarchy tag" Without hierarchy tag (e.g. for term matching in texts), ambiguity may arise: Lymphoma (disorder) vs. Lymphoma (morphology) Synonyms May be ambiguous Short forms Entries not ambiguous because accompanied by expanded form, e.g. PIN - Prostatic intraepithelial neoplasia Pressure-induced nystagmus

Scrutiny of ambiguous terms in SNOMED CT SNOMED CT January 2017 release: Extract ambiguous entries Full terms (without hierarchy tags) D1 Acronyms (without abbreviations)  D2 Analysis: Count ambiguities and their cardinality SNOMED CT hierarchies to which ambiguous terms belong Ambiguous terms that are related via non-taxonomic links (e.g. Associated morphology or Has active Ingredient) Ambiguous terms that are related via taxonomic links (is-a) Purpose: Detect regularities, spot errors, derive recommendations to SNOMED Intl.

Results: Frequency and Distribution Frequency and distribution of ambiguous readings of SNOMED CT terms

Results Results D1 Leading patterns of concept tuples connected by the same SNOMED CT (non-acronym) term Strict implications, e.g. 'Folinic acid (product)' subclassOf 'Has active ingredient' some 'Folinic acid (substance)' "Dot types" (logical polysemy) 'Solar keratosis (disorder)' subclassOf 'Associated morphology' some 'Solar keratosis (morphologic abnormality)' Arapinis A,Vieu L (2015). A plea for complex categories in ontologies. Applied Ontology, 10(3-4), 285-296.

Results Results D2 Leading patterns of concept tuples linked by the same acronym extracted from SNOMED CT terms Distribution of patterns much more evely Acronym naming pattern not specific: e.g., O/E – eye, O/E – nose, O/E – mouth, O/E – heart etc.

Conclusion Degree of Lexical ambiguity in SNOMED CT moderate Ontological aspect: ontologically dependent concepts, partly interpretable as complex categories (dot types) Lexical aspect: Amount of ambiguous acronyms lower than expected  risk of wrong mappings Naming aspect: Acronym – expansion patterns not specific:  wrong expansions Should interface terms (synonyms) be managed by the ontology curators?