Kunal Narsinghani Ashwini Lahane Ontology Mapping and link discovery.


Agenda: Introduction; Levels of heterogeneity; Previous work in the field; PROMPT suite of tools; Prompt on Protégé; The Web of Data; CRS: managing co-references; Silk: a link discovery framework.

Introduction: Can a single ontology suffice for various applications? Definition: ontology mapping is the task of relating the vocabularies of two ontologies that share the same domain of discourse. It is a morphism that consists of a collection of functions assigning the symbols used in one vocabulary to the symbols in the other [1]. Mapping provides a common layer through which ontologies can be accessed and can exchange information. Translation is different from mapping.

Introduction: An analogy to the problem: clocks. Levels of heterogeneity in ontologies: syntactic, structural, semantic.

Mapping discovery: The first approach is to use a reference ontology, for example the upper ontologies SUMO and DOLCE. What if a shared ontology is not available? Structural and definitional information can then be used to discover mappings. Example tools: IF-Map, QOM, MAFRA and PROMPT.

IF-MAP architecture Fig: The steps in IF-MAP

PROMPT Suite of Tools: Interactive tools for ontology merging and mapping. An ontology is a formal specification of domain information that facilitates knowledge sharing and reuse. Different ontologies may overlap and need to be reconciled: determine correlations, find all concepts, determine similarities, change the source ontologies or remove the overlap, and record the mapping for future reference.

Ontology Management Tasks: finding correlations, merging ontologies, version management, factoring ontologies. Tools benefit from being tightly integrated into a single framework: uniform user interface, same interaction paradigms, easy access from one tool to another.

PROMPT Knowledge Model: Based on the frame-based knowledge model of Protégé. Types of frames: Class, a set of entities specifying a concept; Slots, the attributes of a class, each with a domain, a range and a unique name; Instances, the elements of a class.

PROMPT Framework: Tools for multiple-ontology management. An extension to the Protégé ontology-editing environment. The open architecture allows easy extension with plugins. Tools in PROMPT: IPROMPT, an interactive ontology merging tool; ANCHORPROMPT, a graph-based tool for finding similarities between ontologies; PROMPTDIFF, for finding a diff between two versions of the same ontology; PROMPTFACTOR, a tool for extracting a part of an ontology.

PROMPT Framework

IPROMPT: Interactive ontology merging tool. Leads the user through the merging process. Makes suggestions for merging. Identifies inconsistencies and potential problems and suggests strategies for resolving them. Uses the structure of concepts and their relations along with user input. Decisions are based on local context. The process is iterative.

IPROMPT Algorithm

Creates initial suggestions based on the lexical similarity of names. The merged ontology contains frames that are similar to frames in the input ontologies. Two ontologies O1 and O2 are merged to form Om. Merging decisions are designer and task dependent. A set of knowledge-based operations is defined. For each operation the tool defines: the changes performed automatically, the new merging suggestions, and the inconsistencies and potential problems.
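The first step above, initial suggestions from lexical similarity of names, can be illustrated with a minimal sketch. It assumes a generic string-similarity ratio from the Python standard library and a hypothetical threshold of 0.8; the actual measure used by IPROMPT is not given in these slides, and the class names are borrowed from the ANCHORPROMPT example.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def initial_suggestions(names_o1, names_o2, threshold=0.8):
    """Return (name1, name2, score) merge candidates, best first.

    The threshold is a hypothetical tuning parameter.
    """
    pairs = []
    for n1 in names_o1:
        for n2 in names_o2:
            score = name_similarity(n1, n2)
            if score >= threshold:
                pairs.append((n1, n2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


o1 = ["TRIAL", "PERSON", "PROTOCOL"]
o2 = ["Trial", "Person", "Design"]
suggestions = initial_suggestions(o1, o2)
for n1, n2, score in suggestions:
    print(f"suggest merging {n1} with {n2} (score {score:.2f})")
```

In an interactive tool these candidates would only be proposals; the user accepts or rejects each one, and every accepted merge can trigger new suggestions.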

Class hierarchies

Suggestion for merging

IPROMPT Operations: Merge classes. Merge slots. Merge instances. Shallow copy of a class: copies the class from the source ontology to the merged ontology. Deep copy of a class: also copies all the parents of the class up to the root of the hierarchy.
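The difference between the shallow and deep copy operations can be sketched as follows. The representation is an assumption made for illustration only: an ontology is a dictionary from class name to its list of parent names, and the function names are hypothetical rather than part of PROMPT.

```python
def shallow_copy(source, merged, cls):
    """Copy only the class itself; parent links are not followed.

    Assumption: the copied class starts without parents in the merged ontology.
    """
    if cls not in source:
        raise KeyError(cls)
    merged.setdefault(cls, [])


def deep_copy(source, merged, cls):
    """Copy the class and every ancestor up to the root, keeping parent links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        merged[current] = list(source[current])
        stack.extend(source[current])


source = {"Thing": [], "Agent": ["Thing"], "Person": ["Agent"]}

merged_shallow = {}
shallow_copy(source, merged_shallow, "Person")

merged_deep = {}
deep_copy(source, merged_deep, "Person")

print(sorted(merged_shallow))  # only the class itself
print(sorted(merged_deep))     # the class plus its ancestors
```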

Inconsistencies & Potential Problems: name conflicts; dangling references; redundancy in the class hierarchy; slot values violating slot-value restrictions.

Additional features: setting up a preferred ontology; maintaining user focus; providing feedback to the user; logging of ontology merging and editing operations.

ANCHORPROMPT: Graph-based tool for finding similarities. Compares larger portions of the ontologies. Goal: augment IPROMPT by determining additional points of similarity. Input: anchors, a set of pairs of related terms. Anchor identification can be manual or automatic. Each ontology is viewed as a directed labeled graph.

ANCHORPROMPT representation

ANCHORPROMPT algorithm

Algorithm: Begins with the anchor pairs (TRIAL, Trial) and (PERSON, Person). Path 1: TRIAL -> PROTOCOL -> STUDY-SITE -> PERSON. Path 2: Trial -> Design -> Blinding -> Person. Determine a similarity score for each pair of related terms. If two pairs of terms from the source ontologies are similar and there are paths connecting the terms, then the elements in those paths are often similar as well.
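The path-based scoring described above can be sketched on the slide's own example. This is a simplified illustration under stated assumptions, not the published algorithm: each ontology is a plain adjacency dictionary, only paths of equal length are compared, and every co-occurrence at the same position adds one to a pair's score. The function names and the `max_len` bound are hypothetical.

```python
from collections import defaultdict


def simple_paths(graph, start, goal, max_len):
    """Yield cycle-free paths from start to goal with at most max_len nodes."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            yield path
            continue
        if len(path) >= max_len:
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:
                stack.append(path + [nxt])


def anchor_scores(g1, g2, anchors, max_len=5):
    """Score term pairs that sit at the same position on paths between anchors."""
    scores = defaultdict(int)
    for a1, a2 in anchors:
        for b1, b2 in anchors:
            if (a1, a2) == (b1, b2):
                continue
            for p1 in simple_paths(g1, a1, b1, max_len):
                for p2 in simple_paths(g2, a2, b2, max_len):
                    if len(p1) != len(p2):
                        continue
                    # Skip the endpoints: they are the anchors themselves.
                    for t1, t2 in zip(p1[1:-1], p2[1:-1]):
                        scores[(t1, t2)] += 1
    return dict(scores)


g1 = {"TRIAL": ["PROTOCOL"], "PROTOCOL": ["STUDY-SITE"], "STUDY-SITE": ["PERSON"]}
g2 = {"Trial": ["Design"], "Design": ["Blinding"], "Blinding": ["Person"]}
anchors = [("TRIAL", "Trial"), ("PERSON", "Person")]

result = anchor_scores(g1, g2, anchors)
print(result)
```

Pairs that accumulate high scores over many paths would then be offered as additional merge candidates.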

PROMPTDIFF: Tool for comparing ontology versions. Version comparison in software code is based on comparing text files, but the same ontology can have different text representations. PROMPTDIFF is a heuristic algorithm that produces a structural diff between two versions. It compares the structure of the two ontology versions and identifies which frames changed and what changes were made.

PromptDiff Algorithm: An extensible set of heuristic matchers. A fixed-point algorithm combines the results of the matchers to produce a structural diff between two versions.
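A fixed-point combination of matchers can be sketched as a loop that reruns every matcher until none of them finds a new match. The two matchers below are illustrative assumptions (same name; single unmatched child of matched parents), and each version is modeled as a dictionary from class name to parent name, which is far simpler than a real frame-based ontology.

```python
def match_same_name(v1, v2, mapping):
    """Match frames that carry the same name in both versions."""
    return {n: n for n in v1 if n in v2 and n not in mapping}


def match_single_unmatched_child(v1, v2, mapping):
    """If matched parents each have exactly one unmatched child, match the children."""
    new = {}
    matched_v2 = set(mapping.values())
    for p1, p2 in mapping.items():
        kids1 = [c for c, parent in v1.items() if parent == p1 and c not in mapping]
        kids2 = [c for c, parent in v2.items() if parent == p2 and c not in matched_v2]
        if len(kids1) == 1 and len(kids2) == 1:
            new[kids1[0]] = kids2[0]
    return new


def structural_diff(v1, v2, matchers):
    """Run all matchers repeatedly until a full pass adds nothing (fixed point)."""
    mapping = {}
    changed = True
    while changed:
        changed = False
        for matcher in matchers:
            found = matcher(v1, v2, mapping)
            if found:
                mapping.update(found)
                changed = True
    return mapping


old = {"Thing": None, "Wine": "Thing", "Red wine": "Wine"}
new = {"Thing": None, "Wine": "Thing", "Red_wine": "Wine"}

mapping = structural_diff(old, new, [match_same_name, match_single_unmatched_child])
renamed = {a: b for a, b in mapping.items() if a != b}
print(renamed)
```

The loop terminates because each matcher only returns frames that are not yet mapped, so the mapping can grow at most once per frame. New matchers can be appended to the list without changing the loop, which is what makes the set extensible.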

PROMPTFACTOR: Tool for factoring out a semantically independent part of a large ontology into a new sub-ontology. Ensures that severed links do not introduce ill-defined concepts in the sub-ontology. The user can specify the concepts of interest. The tool performs the transitive closure of the superclass relation and of all the relations defined by slots. The target ontology works stand-alone.

PromptFactor Algorithm: The user specifies the concept of interest. PromptFactor traverses the ontology starting from that term. It determines the transitive closure of all relations, including the subclass-of relation, and so finds all the parents of the selected term in the hierarchy. The process is interactive and determines inconsistencies.
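The transitive-closure step can be sketched as a graph traversal. The data layout is an assumption for illustration: each term maps to a dictionary with a list of superclasses and a list of classes used as slot ranges; the key names and the `factor` function are hypothetical.

```python
def factor(ontology, concepts):
    """Return every term reachable from `concepts` via superclass and slot links."""
    keep, frontier = set(), list(concepts)
    while frontier:
        term = frontier.pop()
        if term in keep:
            continue
        keep.add(term)
        frame = ontology[term]
        frontier.extend(frame.get("superclasses", []))
        frontier.extend(frame.get("slot_ranges", []))
    return keep


ontology = {
    "Thing": {},
    "Person": {"superclasses": ["Thing"]},
    "Protocol": {"superclasses": ["Thing"]},
    "Trial": {"superclasses": ["Thing"], "slot_ranges": ["Person", "Protocol"]},
    "Drug": {"superclasses": ["Thing"]},
}

sub_ontology = factor(ontology, ["Trial"])
print(sorted(sub_ontology))
```

Because every term referenced by a kept term is itself kept, the extracted part has no dangling links; terms that nothing in the closure refers to (here `Drug`) are left behind.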

Prompt Demo: It is available as a plug-in for Protégé 3.4. It uses linguistic similarity matches between concepts and also matches slot names and slot value types. Where automation is not possible, user intervention is needed and possible actions are suggested. Alignment is followed by merging: alignment establishes links between the ontologies, and merging creates a single coherent ontology.

Prompt Demo

The Web of Data: Data sources span a large range of domains. The RDF data model is used to publish structured data on the web. Explicit RDF links exist between entities in different data sources. However, there is a lack of tools to set RDF links to other data sources.

Silk: It is a link specification language. It allows specification of the links that should be discovered between data sources, as well as the conditions that entities must fulfil to be linked. Link conditions are specified using similarity metrics and can use aggregation functions to combine similarity scores. Data access is performed using SPARQL.

Silk Features: Support for owl:sameAs links and other types of RDF links. Provides a declarative language to specify link conditions. Datasets need not be replicated locally. Caching, indexing and entity pre-selection are used to enhance performance.

Silk LSL example

Silk LSL example..contd

Silk similarity metrics: Similarity metrics can be combined using aggregation functions. Sets of resources can be selected using the Silk RDF path selector language.
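How several similarity metrics might be combined into one link score can be sketched in Python. This is not Silk LSL syntax; it is a plain illustration of the idea, and the metric functions, property names, weights and acceptance threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def string_sim(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_sim(a: float, b: float, max_diff: float) -> float:
    """1.0 for equal numbers, falling linearly to 0.0 at max_diff."""
    return max(0.0, 1.0 - abs(a - b) / max_diff)


def weighted_avg(scored):
    """Aggregate a list of (score, weight) pairs into a single score."""
    total = sum(w for _, w in scored)
    return sum(s * w for s, w in scored) / total


def link_confidence(src, tgt):
    """Combine a label comparison and a population comparison."""
    return weighted_avg([
        (string_sim(src["label"], tgt["label"]), 2.0),
        (numeric_sim(src["population"], tgt["population"], 100_000), 1.0),
    ])


src = {"label": "Berlin", "population": 3_400_000}
tgt = {"label": "berlin", "population": 3_420_000}

confidence = link_confidence(src, tgt)
ACCEPT = 0.9  # hypothetical acceptance threshold
print(f"confidence {confidence:.3f}, link accepted: {confidence >= ACCEPT}")
```

Other aggregations, such as taking the maximum or minimum of the individual scores, fit the same shape by swapping `weighted_avg` for another function.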

Silk Pre-Matching: Comparing all entities in source S with all entities in target T would need O(|S|*|T|) comparisons. Pre-matching finds a limited set of target entities that are likely to match a given source entity. It is performed by indexing the target resources based on their property values. Using this scheme reduces the runtime to O(|S| + |T|).
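The pre-matching idea can be sketched as a token index over one property of the target resources. The tokenization by whitespace, the property name `label` and the example URIs are assumptions for illustration; the slides do not describe how Silk builds its index.

```python
from collections import defaultdict


def build_index(targets, prop):
    """Index target URIs by the lower-cased tokens of one property value."""
    index = defaultdict(set)
    for uri, props in targets.items():
        for token in props[prop].lower().split():
            index[token].add(uri)
    return index


def candidates(index, value):
    """Return target URIs sharing at least one token with the source value."""
    found = set()
    for token in value.lower().split():
        found |= index.get(token, set())
    return found


targets = {
    "t:1": {"label": "Berlin Germany"},
    "t:2": {"label": "Paris France"},
    "t:3": {"label": "Berlin New Hampshire"},
}

index = build_index(targets, "label")  # built once over the target set
matches = candidates(index, "Berlin")  # cheap lookup per source entity
print(sorted(matches))
```

Only the returned candidates are then compared in full with the expensive similarity metrics, so each source entity is compared against a small subset of the target set instead of all of it.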

Silk Implementation

Managing coreferences: The Semantic Web vision is of large quantities of information that are readily available, interlinked and machine readable. In practice the web is fragmented and there is significant overlap, so ‘duplicates’ need to be identified. Co-reference resolution means determining “equivalent” URIs.

Co-reference Resolution Service (CRS): A systematic-analysis and heuristic-based approach to identifying, publishing, managing and using co-reference information. The most prevalent way to express co-reference is owl:sameAs. Equivalence is context dependent.

CRSes: Maintain sets of equivalent URIs. Co-reference data is stored separately, so URI definitions and synonyms are kept apart. Management techniques: history, rollback, annotation. Applications can use multiple CRSes. Core functionality is written in PHP for easy integration and is backed by MySQL.

Data representation in CRS: Equivalent URIs are stored in bundles. One URI in each bundle is considered the canon (the preferred URI). Formation of bundles: check whether the URI already exists in any bundle; if not, create a ‘singleton’ bundle for the new URI; perform a merge, that is, a union of the bundles containing “equivalent” URIs; the constituent bundles that were merged are marked inactive.
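The bundle-formation steps can be sketched as a small in-memory structure. This is an illustrative model only, not the PHP/MySQL implementation the slides mention: the class and method names are hypothetical, and keeping the first bundle's canon after a merge is an assumption.

```python
class CRS:
    """Toy co-reference store: bundles of equivalent URIs with one canon each."""

    def __init__(self):
        self.bundles = []    # each bundle: {"uris": set, "canon": str, "active": bool}
        self.bundle_of = {}  # URI -> index of its currently active bundle

    def add(self, uri):
        """Create a singleton bundle unless the URI is already in a bundle."""
        if uri not in self.bundle_of:
            self.bundles.append({"uris": {uri}, "canon": uri, "active": True})
            self.bundle_of[uri] = len(self.bundles) - 1
        return self.bundle_of[uri]

    def merge(self, uri_a, uri_b):
        """Union the bundles of two equivalent URIs into a new active bundle."""
        a, b = self.add(uri_a), self.add(uri_b)
        if a == b:
            return a
        merged = {
            "uris": self.bundles[a]["uris"] | self.bundles[b]["uris"],
            "canon": self.bundles[a]["canon"],  # assumption: keep the first canon
            "active": True,
        }
        # Constituent bundles are kept but marked inactive.
        self.bundles[a]["active"] = False
        self.bundles[b]["active"] = False
        self.bundles.append(merged)
        new_id = len(self.bundles) - 1
        for uri in merged["uris"]:
            self.bundle_of[uri] = new_id
        return new_id

    def canon(self, uri):
        return self.bundles[self.bundle_of[uri]]["canon"]

    def equivalents(self, uri):
        return set(self.bundles[self.bundle_of[uri]]["uris"])


crs = CRS()
crs.merge("http://a.example/x", "http://b.example/x")
crs.merge("http://b.example/x", "http://c.example/x")
print(crs.canon("http://c.example/x"))
print(sorted(crs.equivalents("http://c.example/x")))
```

Keeping the inactive bundles instead of deleting them is what would make history and rollback possible.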

Examples of bundle formation

Data representation: Data storage uses indexed tables of hashed URIs. This permits fast lookup to find the canon of a given URI and all the URIs in a bundle. URIs are deprecated by flags. Finding all equivalences: coref:coreferenceData links to the bundle for that URI, and the process is repeated recursively for each URI in that bundle.
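An indexed table of hashed URIs can be sketched with an in-memory SQLite database standing in for MySQL. The table layout, column names, SHA-1 hashing and example URIs are all assumptions made for this illustration; the slides do not give the actual schema.

```python
import hashlib
import sqlite3


def uri_hash(uri: str) -> str:
    """Fixed-length key for a URI (SHA-1 chosen here as an assumption)."""
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE uri (
           uri_hash   TEXT PRIMARY KEY,
           uri        TEXT NOT NULL,
           bundle_id  INTEGER NOT NULL,
           is_canon   INTEGER NOT NULL,
           deprecated INTEGER NOT NULL DEFAULT 0)"""
)
conn.execute("CREATE INDEX idx_uri_bundle ON uri (bundle_id)")

rows = [
    ("http://a.example/id/1", 1, 1),
    ("http://b.example/person/1", 1, 0),
    ("http://c.example/p1", 1, 0),
    ("http://a.example/id/2", 2, 1),
]
conn.executemany(
    "INSERT INTO uri (uri_hash, uri, bundle_id, is_canon) VALUES (?, ?, ?, ?)",
    [(uri_hash(u), u, b, c) for u, b, c in rows],
)


def canon_of(uri):
    """Return the canon of the bundle containing `uri`, or None if unknown."""
    row = conn.execute(
        "SELECT c.uri FROM uri u JOIN uri c ON c.bundle_id = u.bundle_id "
        "WHERE u.uri_hash = ? AND c.is_canon = 1",
        (uri_hash(uri),),
    ).fetchone()
    return row[0] if row else None


def bundle_members(uri):
    """Return all non-deprecated URIs in the same bundle as `uri`."""
    cur = conn.execute(
        "SELECT b.uri FROM uri u JOIN uri b ON b.bundle_id = u.bundle_id "
        "WHERE u.uri_hash = ? AND b.deprecated = 0 ORDER BY b.uri",
        (uri_hash(uri),),
    )
    return [r[0] for r in cur]


print(canon_of("http://c.example/p1"))
print(bundle_members("http://b.example/person/1"))
```

Both lookups go through an index (the primary key on the hash, then the bundle index), which is what keeps them fast as the table grows.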

Fig: RDF description of equivalent URIs in a bundle

Ways to speed up: Look up only one URI from each CRS. Follow only the coref:canon predicate. Lookup would then need O(log|S| + log|T|).

References
[1] Natalya F. Noy and Mark A. Musen. The PROMPT Suite: Interactive Tools for Ontology Merging and Mapping. Stanford Medical Informatics, Stanford University.
[2] Hugh Glaser, Afraz Jaffri and Ian C. Millard. Managing Co-reference on the Semantic Web. School of Electronics and Computer Science, University of Southampton, Southampton, Hampshire, UK.
[3] Yannis Kalfoglou and Marco Schorlemmer. Ontology Mapping: The State of the Art.
[4] Kalfoglou, Y. and Schorlemmer, M. (2003a). IF-Map: an ontology mapping method based on information flow theory. Journal on Data Semantics, 1(1):98–127.
[5] Julius Volz, Christian Bizer et al. Silk – A Link Discovery Framework for the Web of Data.