Ontology mapping: a way out of the medical tower of Babel?

Ontology mapping: a way out of the medical tower of Babel? Frank van Harmelen Vrije Universiteit Amsterdam The Netherlands Antilles

Before we start… A talk on ontology mappings is a difficult talk to give: there is no consensus in the field on the merits of the different approaches, nor on how to classify them; no one can speak with authority on the solution; this is a personal view, with a sell-by date; other speakers will entirely disagree (or disapprove). (Picture of tower of Babel?)

Good overviews of the topic: Knowledge Web D2.2.3, “State of the art on ontology alignment”; Ontology Mapping Survey, talk by Siyamed Seyhmus SINIR; ESWC'05 Tutorial on Schema and Ontology Matching by Pavel Shvaiko and Jerome Euzenat; KER 2003 paper by Kalfoglou & Schorlemmer. These are all different & incompatible…

The Medical tower of Babel. MeSH: Medical Subject Headings, National Library of Medicine, 22,000 descriptions. EMTREE: commercial (Elsevier), drugs and diseases, 45,000 terms, 190,000 synonyms. UMLS: integrates 100 different vocabularies. SNOMED: 200,000 concepts, College of American Pathologists. Gene Ontology: 15,000 terms in molecular biology. NCI Cancer Ontology: 17,000 classes (about 1M definitions).

What are ontologies & what are they used for? (Diagram: world, concept, language.) With no shared understanding there is conceptual and terminological confusion. Agree on a conceptualization and make it explicit in some language. Actors: both humans and machines.

Ontologies come in very different kinds. From lightweight to heavyweight: Yahoo topic hierarchy; Open Directory (400,000 general categories); Cyc (300,000 axioms). From very specific to very general: METAR code (weather conditions at air terminals); SNOMED (medical concepts); Cyc (common sense knowledge).

What’s inside an ontology? In order of increasing semantic “weight” (increasing degree of semantics/formality): terms + specialisation hierarchy; classes + class hierarchy; instances; slots/values; inheritance (multiple? defaults?); restrictions on slots (type, cardinality); properties of slots (symmetric, transitive, …); relations between classes (disjoint, covers); reasoning tasks (classification, subsumption).

In short (for the duration of this talk): ontologies are not definitive descriptions of what exists in the world (= philosophy). Ontologies are models of the world constructed to facilitate communication. Yes, ontologies exist (because we build them).

Ontology mapping is old & inevitable. Old: db schema integration, federated databases. Inevitable: the ontology language is standardised, but don't even try to standardise contents. Compare relational schemas (only structure, no semantics) against ontologies (constrained semantics, logical axioms).

Ontology mapping is important: database integration and heterogeneous database retrieval (traditional); catalog matching (e-commerce); agent communication (theory only); web service integration (urgent); P2P information sharing (emerging); personalisation (emerging).

Ontology mapping is now urgent. Ontology mapping has acquired new urgency: physical and syntactic integration is ± solved (open world, web); automated mappings are now required (P2P); there is a shift from off-line to run-time matching. Ontology mapping has new opportunities: larger volumes of data; richer schemas (relational vs. ontology); applications where partial mappings work.

Different aspects of ontology mapping: how to discover a mapping; how to represent a mapping (subset/equal/disjoint/overlap/is-somehow-related-to; logical/equational/category-theoretical; atomic/complex arguments, confidence measure); how to use it. We only talk about “how to discover”.
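
To make the "how to represent a mapping" aspect concrete, here is a minimal sketch of one possible record for an atomic mapping: two concept names, one of the relation types listed on the slide, and a confidence measure. The class and field names, and the confidence values in the example, are illustrative assumptions and do not come from any of the systems mentioned in the talk.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # Relation types named on the slide.
    SUBSET = "subset"
    EQUAL = "equal"
    DISJOINT = "disjoint"
    OVERLAP = "overlap"
    RELATED = "is-somehow-related-to"


@dataclass(frozen=True)
class Mapping:
    source: str              # concept in the first ontology
    target: str              # concept in the second ontology
    relation: Relation
    confidence: float = 1.0  # assumed to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


def confident(mappings, threshold):
    """Keep only the mappings whose confidence reaches the threshold."""
    return [m for m in mappings if m.confidence >= threshold]


# Hypothetical mappings; the confidence values are made up for illustration.
example = [
    Mapping("OLVG:HIV", "DICE:AIDS", Relation.RELATED, 0.6),
    Mapping("UMLS:Chemicals", "Tambis:Chemical", Relation.EQUAL, 0.9),
]
```

A complex (non-atomic) mapping would need expressions rather than plain names in `source` and `target`; this sketch deliberately leaves that out.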

Many experimental systems (non-exhaustive!): Prompt (Stanford SMI); Anchor-Prompt (Stanford SMI); Chimaera (Stanford KSL); Rondo (Stanford U./ULeipzig); MoA (ETRI); Cupid (Microsoft research); Glue (U of Washington); FCA-merge (UKarlsruhe); IF-Map; Artemis (UMilano); T-tree (INRIA Rhone-Alpes); S-MATCH (UTrento); Coma (ULeipzig); Buster (UBremen); MULTIKAT (INRIA S.A.); ASCO (INRIA S.A.); OLA (INRIA R.A.); Dogma's Methodology; ArtGen (Stanford U.); Alimo (ITI-CERTH); Bibster (UKarlsruhe); QOM (UKarlsruhe); KILT (INRIA LORRAINE).

Different approaches to ontology matching: linguistics & structure; shared vocabulary; instance-based matching; shared background knowledge. We are going to review the first three quickly and spend most time on the fourth one.

Linguistic & structural mappings: normalisation (case, blanks, digits, diacritics); lemmatization, N-grams, edit-distance, Hamming distance; distance = fraction of common parents; elements are similar if their parents/children/siblings are similar. Problem: ontologies are semantic objects, and these methods entirely ignore the semantics. (Decreasing order of boredom.)
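
The linguistic techniques named above can be sketched in a few lines. The following is an illustrative Python sketch, not the code of any system in the talk: `normalise` handles case, blanks, digits and diacritics, `edit_distance` is the standard Levenshtein distance, and `ngram_similarity` is a Jaccard overlap of character n-grams (the choice of Jaccard and of n = 3 is an assumption).

```python
import unicodedata


def normalise(label):
    """Lower-case, strip diacritics, drop digits and collapse blanks."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    no_digits = "".join(c for c in no_marks if not c.isdigit())
    return " ".join(no_digits.lower().split())


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the character n-grams of two strings."""
    grams_a = {a[i:i + n] for i in range(len(a) - n + 1)}
    grams_b = {b[i:i + n] for i in range(len(b) - n + 1)}
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

All three functions look only at character strings, which is exactly the limitation the slide points out: nothing here knows what the labels mean.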

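Matching through shared vocabulary: a concept Q is bounded by its lower bounds Low(Q) and upper bounds Up(Q), $\bigcup \mathrm{Low}(Q) \subseteq Q \subseteq \bigcap \mathrm{Up}(Q)$. Early results with post-doc.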

Matching through shared vocabulary: used in mapping geospatial databases from German land-registration authorities (small); used in mapping bio-medical and genetic thesauri (large).
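
The idea of bounding a concept Q by terms from a shared vocabulary can be illustrated on instance sets. This is a sketch under assumptions: the function name, the argument layout and the toy data in the usage note are hypothetical, and it assumes the upper and lower bounds of Q have already been determined.

```python
def bounds(upper, lower, extension):
    """Return (certain, possible) instance sets for a concept Q.

    upper:     shared terms known to subsume Q
    lower:     shared terms known to be subsumed by Q
    extension: dict mapping each shared term to its set of instances

    `possible` is None when no upper bound is known, i.e. when nothing
    can be excluded.
    """
    certain = set()
    for term in lower:
        certain |= extension[term]       # union of the lower bounds
    possible = None
    for term in upper:
        if possible is None:
            possible = set(extension[term])
        else:
            possible &= extension[term]  # intersection of the upper bounds
    return certain, possible
```

With a made-up extension `{"building": {1, 2, 3, 4}, "residential": {2, 3, 4, 5}, "villa": {3}}`, upper bounds `["building", "residential"]` and lower bound `["villa"]`, instance 3 certainly belongs to Q and only instances 2, 3 and 4 can belong to it.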

Matching through shared instances

Matching through shared instances: used by Ichise et al. (IJCAI’03) to successfully map parts of Yahoo to parts of Google. Yahoo = 8402 classes, 45,000 instances; Google = 8343 classes, 82,000 instances; only 6000 shared instances. 70%-80% accuracy obtained (!). Conclusion from the authors: semantics is needed to improve on this ceiling.
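
As an illustration of the general idea of instance-based matching, the sketch below proposes a match between two classes when they share enough instances. It uses a plain Jaccard overlap with an arbitrary threshold; this is an assumption made for the example and is not the measure used by Ichise et al. The directory names in the usage note are invented.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_by_instances(classes_1, classes_2, threshold=0.5):
    """Propose class pairs whose shared-instance overlap reaches threshold.

    classes_1, classes_2: dicts mapping a class name to its set of instances.
    Returns (class_1, class_2, score) triples, best score first.
    """
    proposals = []
    for name_1, instances_1 in classes_1.items():
        for name_2, instances_2 in classes_2.items():
            score = jaccard(instances_1, instances_2)
            if score >= threshold:
                proposals.append((name_1, name_2, score))
    return sorted(proposals, key=lambda p: -p[2])
```

For example, with `{"Recreation/Autos": {"u1", "u2", "u3"}, "Science/Biology": {"u7", "u8"}}` on one side and `{"Cars": {"u2", "u3", "u4"}, "Life Sciences": {"u7", "u8", "u9"}}` on the other, both intended pairs are proposed and no cross pair is. Only instances present in both directories contribute, which is why a small shared set limits what this approach can find.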

Matching using shared background knowledge (diagram: ontology 1, ontology 2)

Ontology mapping using background knowledge: Case study 1. Work with Zharko Aleksovski @ Philips, Michel Klein @ VU, KIK @ AMC.

Overview of test data: two terminologies from the intensive care domain. OLVG list and AMC list: lists of reasons for ICU admission. DICE hierarchy: additional hierarchical knowledge describing the reasons for ICU admission.

OLVG list: developed by a clinician; 3000 reasons for ICU admission; 1390 used in first 24 hours of stay; 3600 patients since 2000; based on ICD9 + additional material. A list of problems for patient admission: each reason for admission is described with one label; labels consist of 1.8 words on average; redundancy because of spelling mistakes; implicit hierarchy (e.g. many fractures).

AMC list: list of 1460 problems for ICU admission. Each problem is described using 5 aspects from the DICE terminology (2500 concepts (5000 terms), 4500 links): Abnormality (size: 85); Action taken (size: 55); Body system (size: 13); Location (size: 1512); Cause (size: 255). Expressed in OWL; allows for subsumption & part-of reasoning.

Why map the AMC list to the OLVG list? To allow easy entering of OLVG data; re-use of data in epidemiology; quality of care assessment; data-mining (patient prognosis).

Linguistic mapping: compare each pair of concepts; use labels and synonyms of concepts; heuristic method to discover equivalence and subclass relations (e.g. “Long brain tumor” is more specific than “Long tumor”). First round: compare with complete DICE; 313 suggested matches, around 70% correct. Second round: only compare with the “reasons for admission” subtree; 209 suggested matches, around 90% correct. High precision, low recall (“the easy cases”).
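
The slide does not spell out the heuristic. One simple heuristic that reproduces the “Long brain tumor” example is to compare the word sets of two labels: equal word sets suggest equivalence, and a label that contains all words of another plus extra qualifiers is taken to be more specific. The sketch below implements that assumption; it is not claimed to be the method used in the case study.

```python
def tokens(label):
    """Split a label into a set of lower-cased words."""
    return frozenset(label.lower().split())


def lexical_relation(label_a, label_b):
    """Return 'equivalent', 'more specific', 'more general' or None.

    The result describes label_a relative to label_b.
    """
    a, b = tokens(label_a), tokens(label_b)
    if a == b:
        return "equivalent"
    if a > b:   # a has every word of b plus extra qualifiers
        return "more specific"
    if a < b:   # b has every word of a plus extra qualifiers
        return "more general"
    return None
```

Here `lexical_relation("Long brain tumor", "Long tumor")` yields `"more specific"`. Labels with no word-set inclusion give `None`, which is consistent with the slide's observation that such a method finds only the easy cases.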

Using background knowledge: use properties of concepts; use other ontologies to discover relations between properties.

Semantic match (diagram): the DICE problem list is linked to the DICE aspect taxonomies (Abnormality, Action, Body system, Location and Cause taxonomy); these links are given. The OLVG problem list is linked to the same taxonomies by lexical match. Implicit matching: the OLVG problem list and the DICE problem list are then related by property match.

Semantic match (diagram). Taxonomy of body parts: Blood vessel is more general than Vein and Artery; Artery is more general than Aorta. Lexical match: “Aorta thoracalis dissection” has location Aorta; “Dissection of artery” has location Artery. Reasoning: Artery is more general than Aorta, which implies a location match: “Dissection of artery” has a more general location than “Aorta thoracalis dissection”.
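
The reasoning step in this example can be sketched as a lookup in the background taxonomy. The sketch assumes a tree-shaped taxonomy stored as a child-to-parent dictionary and assumes that the "has location" links were already found by lexical matching; the dictionary and function names are illustrative.

```python
# Fragment of the body-part taxonomy from the slide (child -> parent).
PARENT = {"Vein": "Blood vessel", "Artery": "Blood vessel", "Aorta": "Artery"}

# "has location" links, assumed to come from the lexical matching step.
HAS_LOCATION = {
    "Aorta thoracalis dissection": "Aorta",
    "Dissection of artery": "Artery",
}


def ancestors(concept, parent):
    """Yield every strictly more general concept, nearest first."""
    while concept in parent:
        concept = parent[concept]
        yield concept


def compare(a, b, parent):
    """Describe concept a relative to concept b in the taxonomy."""
    if a == b:
        return "equal"
    if b in ancestors(a, parent):
        return "more specific"
    if a in ancestors(b, parent):
        return "more general"
    return None


def location_match(problem_a, problem_b):
    """Compare two problems through the locations they refer to."""
    return compare(HAS_LOCATION[problem_a], HAS_LOCATION[problem_b], PARENT)
```

`location_match("Dissection of artery", "Aorta thoracalis dissection")` returns `"more general"`, matching the conclusion drawn on the slide. The same comparison could be repeated per aspect (cause, abnormality, and so on).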

Example: “Heroin intoxication” and “Drugs overdosis”. Cause taxonomy: Drugs is more general than Heroine. Lexical match: “Heroin intoxication” has cause Heroine; “Drugs overdosis” has cause Drugs. Cause match: “Heroin intoxication” has the more specific cause. Abnormality taxonomy: Intoxicatie is more general than Overdosis. Lexical match: “Heroin intoxication” has abnormality Intoxicatie; “Drugs overdosis” has abnormality Overdosis. Abnormality match: “Heroin intoxication” has the more general abnormality.

Example results (OLVG term, DICE term, matched aspects): OLVG: Acute respiratory failure, DICE: Asthma cardiale (abnormality); OLVG: Aspergillus fumigatus, DICE: Aspergilloom (cause); OLVG: duodenum perforation, DICE: Gut perforation (abnormality, cause); OLVG: HIV, DICE: AIDS (cause); OLVG: Aorta thoracalis dissectie type B, DICE: Dissection of artery (location, abnormality).

Extension: approximate matching. Terms are not precisely defined; terms are not precisely used; exact reasoning will not be useful. (Diagram: overlapping A and B; is A subsumed by B?)

Approximate matching: translate every class name into a propositional formula (both DNF and CNF versions), so that $A \sqsubseteq B = (\bigvee_i A_i \sqsubseteq \bigwedge_k B_k) = \bigwedge_{i,k} (A_i \sqsubseteq B_k)$. Ignore an increasing number of (i,k)-subsumption pairs; the result varies from classical to trivial.
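
A small sketch can make the "classical to trivial" range tangible. It assumes that A is given in DNF as a list of conjuncts and B in CNF as a list of disjuncts, that all atoms are positive, and that no background axioms are used, so a conjunct entails a disjunct exactly when they share an atom. The function names and the `ignore` parameter are illustrative choices, not taken from the talk.

```python
def pair_holds(conjunct, disjunct):
    """A conjunction of positive atoms entails a disjunction of positive
    atoms (without further axioms) exactly when they share an atom."""
    return bool(set(conjunct) & set(disjunct))


def approx_subsumed(a_dnf, b_cnf, ignore=0):
    """Check whether A is subsumed by B, tolerating failing (i, k) pairs.

    a_dnf:  list of conjuncts (sets of atoms); A is their disjunction
    b_cnf:  list of disjuncts (sets of atoms); B is their conjunction
    ignore: number of failing (i, k) pairs that may be disregarded
    """
    failures = sum(1 for a_i in a_dnf for b_k in b_cnf
                   if not pair_holds(a_i, b_k))
    return failures <= ignore
```

With `a_dnf = [{"fracture", "leg"}, {"fracture", "arm"}]` and `b_cnf = [{"fracture"}, {"leg"}]`, exactly one of the four pairs fails. So `ignore=0` (classical subsumption) rejects the match, `ignore=1` accepts it, and `ignore=4` would accept any A and B of this size (trivial).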

Results (obtained on a different domain)

Ontology mapping using background knowledge Case study 2 Work with Heiner Stuckenschmidt @ VU

Case Study: map GALEN & Tambis, using UMLS as background knowledge. Select three topics with sufficient overlap: Substances, Structures, Processes. Define some partial & ad-hoc manual mappings between individual concepts. Represent mappings in C-OWL. Use the semantics of C-OWL to verify and complete the mappings. Partial -> complete later; ad-hoc -> verify later.

Case Study (diagram): UMLS (medical terminology) is connected by a lexical mapping to GALEN (medical ontology) and by a lexical mapping to Tambis (genetic ontology). Verification & derivation is applied to both lexical mappings and yields a derived mapping between GALEN and Tambis. The derived mapping is only possible after an identity assumption on equal domains.

Ad hoc mappings: Substances (UMLS, GALEN). Notice: UMLS has two views vs. GALEN mixed. Notice: mappings high and low in the hierarchy, few in the middle.

Ad hoc mappings: Substances (UMLS, Tambis). Notice the different grain size: UMLS coarse, Tambis fine.

Verification of mappings (diagram): UMLS:Chemicals = Tambis:Chemical; UMLS:Chemicals_viewed_structurally ? Tambis:enzyme; UMLS:Chemicals_viewed_functionally; UMLS:enzyme = Tambis:enzyme. Either the mapping is wrong or the UMLS classes are non-disjoint.

Deriving new mappings (diagram relating UMLS:substance, UMLS:Phenomenon_or_process, UMLS:Chemicals, UMLS:OrganicChemical and Galen:ChemicalSubstance)

“Conclusions”: ontology mapping is (still) hard & open. Many different approaches will be required: linguistic, structural, statistical, semantic, … Currently there is no roadmap theory on what's good for which problems.

Challenges: roadmap theory; run-time matching; “good-enough” matches; large scale evaluation methodology; hybrid matchers (needs roadmap theory).