PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment Natalya F. Noy Stanford Medical Informatics Stanford University.


Outline Definitions and motivation The PROMPT ontology-merging algorithm Incremental algorithm (PROMPT) Statistical algorithm (Anchor-PROMPT) The tools Evaluation Future work

Ontologies Characterize concepts and relationships in an application area, providing a domain of discourse Enumerate concepts, attributes of concepts, and relationships among concepts Define constraints on relationships among concepts

Why do we need ontologies? An ontology provides a shared vocabulary for different applications in a domain. An ontology enables interoperation among applications using disparate data sources from the same domain.

Ontologies Are Everywhere Ontologies have been used in academic projects for a long time Knowledge sharing and reuse Reuse of problem-solving methods Ontologies are becoming widely used outside of academia Categorization of Web sites (e.g. Yahoo!) Product catalogs

Need for Ontology Merging There is significant overlap in existing ontologies Yahoo! and DMOZ Open Directory Product catalogs for similar domains

Need for Ontology Merging and Integration. Need to merge or align overlapping ontologies: Chemdex™, a portal for accessing life-science-supply catalogs. Workshop on “Ontologies and Information Sharing” at IJCAI: 6 out of 18 papers (1/3) are about ontology merging and integration.

What Is Ontology Merging

Existing Approaches. Ontology design and integration: term matching (Stanford SKC, ISI); graph-based analysis (Stanford SKC); transformation operators (Ontomorph at ISI); merging tools (Chimaera at Stanford KSL). Object-oriented programming: subject-oriented programming (IBM); “subjective” views of classes; transformation operations; concentrates on methods rather than relations.

Existing Approaches (II). Databases: develop mediators and provide wrappers; define a common data model and mappings; define matching rules to translate directly. Most of these approaches do not provide any guidance to the user and do not use structural information.

Outline Definitions and motivation The PROMPT ontology-merging algorithm Incremental algorithm (PROMPT) Statistical algorithm (Anchor-PROMPT) The tools Evaluation Future work

PROMPT. Our approach is: partial automation; algorithms based on concept-representation structure, relations between concepts, and the user’s actions. Our approach is not: complete automation; an algorithm for matching concept names.

Knowledge Model. A generic knowledge model of OKBC (Open Knowledge-Base Connectivity Protocol). Classes: collections of objects with similar properties; arranged in a subclass–superclass hierarchy. Instances. Slots: first-class objects in a knowledge base; binary relations describing properties of classes and instances. Facets: constraints on slot values (cardinality, min, max).

The PROMPT Algorithm. (1) Make initial suggestions. (2) Select the next operation. (3) Perform automatic updates. (4) Find conflicts. (5) Make suggestions, then return to step 2.
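The cycle on this slide can be sketched as a small driver loop. This is only an illustration under stated assumptions: the function name `prompt_loop`, the tuple encoding of an operation, and the four callbacks are hypothetical and are not the PROMPT tool's API. In particular, the sketch lets the user choose only among pending suggestions, which is a simplification.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding of an operation: (kind, first frame, second frame).
Operation = Tuple[str, str, str]


def prompt_loop(
    initial: List[Operation],
    choose: Callable[[List[Operation]], Operation],
    perform: Callable[[Operation], None],
    find_conflicts: Callable[[Operation], List[str]],
    suggest: Callable[[Operation], List[Operation]],
) -> List[str]:
    """Run the suggest / select / update / check cycle until nothing is pending."""
    suggestions = list(initial)          # step 1: initial suggestions
    conflicts: List[str] = []
    while suggestions:
        op = choose(suggestions)         # step 2: the user selects the next operation
        suggestions.remove(op)
        perform(op)                      # step 3: perform the operation and its updates
        conflicts.extend(find_conflicts(op))   # step 4: find conflicts
        for new_op in suggest(op):       # step 5: make new suggestions
            if new_op not in suggestions:
                suggestions.append(new_op)
    return conflicts
```

In an interactive tool `choose` would wait for the user; in a test it can simply pick the first pending suggestion.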

Example: merge-classes. [Figure, two diagram panels: classes Agency employee, Agent, Employee, Customer, and Traveler, connected by links labeled subclass of, agent for, and has client]

Example: merge-classes (II). [Figure, two diagram panels: classes Agency employee, Agent, Employee, Customer, and Traveler, connected by links labeled subclass of, agent for, and has client; the second panel has no has client link]

Analyzing Global Properties Locally. Global properties: classes that have the same sets of slots; classes that refer to the same set of classes; slots that are attached to the same classes. Local context: incremental analysis; consider only the concepts that were affected by the last operation.

The PROMPT Operation Set. Extends the OKBC operation set with ontology-merging operations: merge classes; merge slots; merge instances; copy of a class (deep or shallow, with or without subclasses, with or without instances); …

After a User Performs an Operation. For each operation: perform the operation; consider possible conflicts (identify conflicts, propose solutions); analyze local context (create new suggestions, reinforce or downgrade existing suggestions).

Conflicts. Conflicts that PROMPT identifies: name conflicts; dangling references; redundancy in a class hierarchy; slot-value restrictions that violate class inheritance.

Example: merge-classes. [Figure: the class Agent]

Operation Steps: merge-classes. Own slots and their values for the new class: ask the user in case of conflicts or use preferences. Template slots for the new class: union of the template slots of the original classes. Subclasses and superclasses for the new class. Conflicts. Suggestions.
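The first two steps above can be sketched as follows. The `ClassFrame` type, the `merge_classes` function, and the `prefer_first` flag are hypothetical names introduced for illustration; they stand in for the "use preferences" option on the slide and are not taken from the tool. Subclass and superclass handling is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ClassFrame:
    """Hypothetical, minimal stand-in for a class frame."""
    name: str
    template_slots: Set[str] = field(default_factory=set)
    own_slots: Dict[str, str] = field(default_factory=dict)


def merge_classes(
    a: ClassFrame, b: ClassFrame, new_name: str, prefer_first: bool = True
) -> Tuple[ClassFrame, List[str]]:
    """Merge two classes; return the merged class and the conflicting own slots."""
    # Template slots: union of the template slots of the original classes.
    merged = ClassFrame(new_name, a.template_slots | b.template_slots)
    # Own slots: on a clash, the value from the preferred class wins.
    low, high = (b, a) if prefer_first else (a, b)
    merged.own_slots = {**low.own_slots, **high.own_slots}
    # Report the clashes so that a tool could ask the user instead.
    conflicts = sorted(
        slot
        for slot in a.own_slots.keys() & b.own_slots.keys()
        if a.own_slots[slot] != b.own_slots[slot]
    )
    return merged, conflicts
```

Returning the conflicting slot names, rather than resolving them silently, leaves room for the "ask the user" branch.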

Template Slots. Copy template slots that don’t exist in the merged ontology. [Figure: class Agent with the slot agent for]

Template Slots. Attach the slots that have already been mapped. [Figure: class Agent with the slots has client and client]

Subclasses and Superclasses. If a superclass (subclass) exists, re-establish the links. [Figure: classes Employee, Agent, and Agency employee with a superclass link]

Dangling References. Facet value: for example, allowed class. [Figure: the slot agent for at class Agent has the facet value Customer; in the second panel the facet value is a dummy frame, Customer_temp]
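One way to read this slide is as a check over allowed-class facet values followed by the creation of placeholder frames. The sketch below assumes that reading; the function names, the dictionary layout, and the `_temp` suffix handling are hypothetical and only mirror the labels visible on the slide.

```python
from typing import Dict, List, Set, Tuple


def find_dangling_references(
    allowed_classes: Dict[str, Set[str]], known_frames: Set[str]
) -> List[Tuple[str, str]]:
    """Return (slot, class) pairs whose allowed class is not in the merged ontology."""
    return sorted(
        (slot, cls)
        for slot, classes in allowed_classes.items()
        for cls in classes
        if cls not in known_frames
    )


def add_dummy_frames(
    dangling: List[Tuple[str, str]], known_frames: Set[str], suffix: str = "_temp"
) -> Dict[str, str]:
    """Create a placeholder frame for every missing class; return old -> dummy names."""
    renamed: Dict[str, str] = {}
    for _, cls in dangling:
        renamed[cls] = cls + suffix
        known_frames.add(cls + suffix)   # the dummy frame now exists in the ontology
    return renamed
```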

Additional Suggestions: Merge Slots. If slot names at the merged class are similar, suggest merging the slots. [Figure: class Agent with the slots client and has client]
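The slide does not say how name similarity is measured. As a stand-in, the sketch below uses `difflib.SequenceMatcher` from the Python standard library with a threshold of 0.7; both the measure and the threshold are assumptions, and `suggest_slot_merges` is a hypothetical name.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Iterable, List, Tuple


def suggest_slot_merges(
    slot_names: Iterable[str], threshold: float = 0.7
) -> List[Tuple[str, str]]:
    """Suggest merging every pair of slots whose names look alike."""
    suggestions = []
    for first, second in combinations(list(slot_names), 2):
        # Compare case-insensitively; ratio() is 1.0 for identical strings.
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio >= threshold:
            suggestions.append((first, second))
    return suggestions
```

With the slot names from the figure, `suggest_slot_merges(["client", "has client", "agent for"])` pairs `client` with `has client` and leaves `agent for` alone.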

Additional Suggestions: Merge Classes. If the set of classes referenced by the merged class is the same as the set of classes referenced by another class, suggest a merge. [Figure labels: Agent, Agency employee, Reservation, Client, has clients, handles reservations]
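The rule above amounts to comparing sets of referenced classes. A minimal sketch, assuming the references are already available as a dictionary from class name to the set of classes it refers to (a hypothetical layout, as is the function name):

```python
from typing import Dict, List, Set


def suggest_merges_by_references(
    merged_class: str, references: Dict[str, Set[str]]
) -> List[str]:
    """Return classes that reference exactly the same classes as merged_class."""
    target = references.get(merged_class, set())
    if not target:
        return []   # a class that references nothing gives no evidence
    return sorted(
        name
        for name, referenced in references.items()
        if name != merged_class and referenced == target
    )
```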

Additional Suggestions: Merge Classes. If names of superclasses (subclasses) of the merged class are similar, suggest merging the classes. [Figure: classes Employee, Agency employee, and Agent with a superclass link]

Check for Cycles. If there is a cycle, suggest removing one of the parents. [Figure: classes Person, Employee, Agency employee, and Agent connected by superclass links]
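A cycle check over superclass links can be done with a depth-first search. The sketch below is a generic graph routine, not the tool's code; it assumes the hierarchy is given as a dictionary from each class to a list of its direct superclasses, and `find_cycle` is a hypothetical name.

```python
from typing import Dict, List, Optional


def find_cycle(superclasses: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle of superclass links as a list of class names, or None."""
    visiting, finished = set(), set()
    path: List[str] = []

    def visit(cls: str) -> Optional[List[str]]:
        visiting.add(cls)
        path.append(cls)
        for parent in superclasses.get(cls, []):
            if parent in visiting:
                # The parent is already on the current path: cycle found.
                return path[path.index(parent):] + [parent]
            if parent not in finished:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        visiting.discard(cls)
        finished.add(cls)
        return None

    for cls in superclasses:
        if cls not in finished:
            found = visit(cls)
            if found:
                return found
    return None
```

Each consecutive pair in the returned list is a child followed by one of its parents, so any of those links is a candidate for removal.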

To Summarize. Perform the actual operation. For the concepts (classes, slots, and instances) directly attached to the operation arguments, perform global analysis for new suggestions. Perform global analysis for new conflicts.

Non-local context. [Figure: the context of a class C, made up of the slots in C and the classes directly referenced by C]

Anchor-PROMPT: Using Non-Local Contexts. Input: a set of anchor pairs. Output: a set of related terms with similarity scores. Where do anchors come from? Lexical matching; interactive tools; user-specified. [Figure: Ontology 1 and Ontology 2]

Generating Paths in the Graph

Similarity Score. (1) Generate a set of all paths (of length < L). (2) Generate a set of all possible pairs of paths of equal length. (3) For each pair of paths and for each pair of nodes in the identical positions in the paths, increment the similarity score. (4) Combine the similarity score for all the paths.
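The scoring steps above can be sketched as below. Several choices are assumptions rather than details from the slides: paths are supplied as ready-made tuples of class names (path generation is omitted), path length is counted in nodes, only pairs of paths that start and end at anchor pairs are compared, only the nodes between the anchors are scored, and scores are combined by simple summation. The example paths in the usage note are invented.

```python
from collections import Counter
from typing import Iterable, Sequence, Set, Tuple

Path = Sequence[str]


def similarity_scores(
    paths1: Iterable[Path],
    paths2: Iterable[Path],
    anchors: Set[Tuple[str, str]],
    max_length: int,
) -> Counter:
    """Count how often two terms occupy the same position in paired paths."""
    scores: Counter = Counter()
    paths2 = list(paths2)
    for p1 in paths1:
        if len(p1) >= max_length:        # keep only paths of length < L
            continue
        for p2 in paths2:
            if len(p2) != len(p1):       # pair only paths of equal length
                continue
            # Assumption: both ends of the two paths must be anchor pairs.
            if (p1[0], p2[0]) not in anchors or (p1[-1], p2[-1]) not in anchors:
                continue
            # Score the nodes in identical positions between the anchors.
            for pair in zip(p1[1:-1], p2[1:-1]):
                scores[pair] += 1
    return scores
```

For instance, with anchors `{("TRIAL", "Trial"), ("PERSON", "Person")}` and one invented path of three nodes in each ontology, the middle terms of the two paths receive a score of 1.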

Equivalence Groups

Anchor-PROMPT: Initial Results (term pairs, one term from each ontology)
TRIAL / Trial
PERSON / Person
CROSSOVER / Crossover
PROTOCOL / Design
TRIAL-SUBJECT / Person
INVESTIGATORS / Person
POPULATION / Action_Spec
PERSON / Character
TREATMENT-POPULATION / Crossover_arm

Knowledge Model Assumptions The only assumption: An OKBC-compliant knowledge model

Outline Definitions and motivation The PROMPT ontology-merging algorithm Incremental algorithm (PROMPT) Statistical algorithm (Anchor-PROMPT) The tools Evaluation Future work

Protégé-2000. An environment for ontology development and knowledge acquisition. Intuitive direct-manipulation interface. Extensibility: ability to plug in new components.

Ontologies in Protégé-2000

Protégé-2000 plugins. Domain-specific user-interface plugins. Alternative back ends for archival storage. Utility programs for knowledge-acquisition tasks. End-user applications.

Protégé-based PROMPT tool. Protégé-2000: has an OKBC-compatible knowledge model; allows building extensions through a plug-in mechanism; can work as a knowledge-base server for the plug-ins.

The PROMPT tool

The PROMPT tool features. Setting a preferred ontology. Maintaining the user’s focus. Providing feedback to the user. Preserving original relations: subclass-superclass relations, slot attachment, facet values. Linking to the direct-manipulation ontology editor. Logging operations.

Outline Definitions and motivation The PROMPT ontology-merging algorithm Incremental algorithm (PROMPT) Statistical algorithm (Anchor-PROMPT) The tools Evaluation Future work

Evaluation. Knowledge-based systems are rarely evaluated. We can use software-engineering approaches to empirical evaluation of tools. We need to develop additional knowledge-base measurements.

Questions we asked How good are PROMPT’s suggestions and conflict-resolution strategies? Does PROMPT provide any benefit when compared to a generic ontology-editing tool (Protégé-2000)?

What we were trying to find out. The benefit that the tool provides: productivity benefit; quality improvement in the resulting ontologies; user satisfaction. Precision and recall of the tool’s suggestions.

Source ontologies for the experiments. Two ontologies of problem-solving methods: the ontology for the Unified Problem-solving Method Development Language (UPML); the ontology for the Method-Description Language (MDL).

Experiment 1: Evaluate the quality of PROMPT’s suggestions. Metrics: precision; recall. Method: automatic logging; automatic data reporting (suggestions that the tool produced, operations that the user performed, suggestions that the user followed).
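Given the three logged quantities, precision and recall can be computed as below. The slide does not spell out the formulas, so this sketch assumes the usual reading: precision is the share of produced suggestions that the user followed, and recall is the share of performed operations that had been suggested. The function name and the string encoding of operations are hypothetical.

```python
from typing import Set, Tuple


def precision_recall(suggested: Set[str], performed: Set[str]) -> Tuple[float, float]:
    """Compute precision and recall of suggestions against performed operations."""
    followed = suggested & performed     # suggestions that the user followed
    precision = len(followed) / len(suggested) if suggested else 0.0
    recall = len(followed) / len(performed) if performed else 0.0
    return precision, recall
```

For example, if the tool produced four suggestions, the user performed six operations, and three of those operations were suggested ones, precision is 3/4 and recall is 3/6.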

Results: the quality of PROMPT’s suggestions
Suggestions that users followed: 90%
Conflict-resolution strategies that users followed: 75%
Knowledge-base operations generated automatically: 74%

Experiment 2: PROMPT versus generic Protégé-2000. Metrics: content of the resulting ontologies; number of explicit knowledge-base operations.

Results: PROMPT versus generic Protégé-2000. The resulting ontologies had only one difference. Operations specified explicitly: 16 with PROMPT versus 60 with generic Protégé-2000.

Results. Experts followed most of PROMPT’s suggestions. Using PROMPT improved the efficiency of ontology merging.

Anchor-PROMPT Evaluation. Experiment setup: two ontologies from the DAML ontology library; varying parameters (maximum path length, number of anchor pairs). Experiment results: ratio of correct results above the median similarity score.

Anchor-PROMPT: Evaluation Results

Anchor-PROMPT Evaluation Results. Equivalence groups of size <= 2 are required. A maximum path length of 2 provides extremely high precision (but low recall). 75% precision with maximum path lengths of 3 and 4.

Future work. Extend the set of heuristics that PROMPT uses for guiding the experts. Extend the techniques to ontology alignment and ontology refactoring. Develop protocols and metrics for a more detailed evaluation of the tools.