Schema Matching: Different Approaches. Khalid Saleem, LIRMM.

Presentation transcript:

1 Schema Matching: Different Approaches (original French title: Découverte de mappings entre schemas : les différentes approches). Khalid Saleem, LIRMM

2 Schema and Ontology
Schemas come from the database community.
- Schemas often do not provide explicit semantics for their data (e.g. ER models, XML document schemas).
Ontologies come from the AI community.
- Ontologies are logical systems that themselves obey some formal semantics.
- They are designed to be interpreted by computers for reasoning (e.g. OWL).
Schemas and ontologies are similar in that:
- Both provide a vocabulary of terms that describes a domain.
- Both constrain the meaning of the terms used in the vocabulary (hierarchy, relations).
Languages shown on the slide: XML, XML Schema, RDF, RDF Schema, OWL

3 Schema vs Ontology: examples
DAML+OIL:
    class-def animal
    %plants are a class that is disjoint from animals
    class-def plant
        subclass-of NOT animal
    %it is necessary but not sufficient for a tree to be a plant:
    class-def tree
        subclass-of plant
    %branches are PART OF trees
    class-def branch
        slot-constraint is-part-of
            has-value tree
    %it is necessary and sufficient for a carnivore to be an animal:
    class-def defined carnivore
        subclass-of animal
        slot-constraint eats
            value-type animal
    %herbivores eat only plants OR part of plants
    class-def defined herbivore
        subclass-of animal
        slot-constraint eats
            value-type plant OR (slot-constraint is-part-of has-value plant)
XML: branch is-part-of tree

4 Match
Takes two schemas/ontologies as input and produces a mapping between the elements of the two schemas that correspond semantically to each other.
Kinds of match: 1-1 match, complex match.

Books, Source A:
    price    book-title            author-name
    26,60    Harry Potter          J. K. Rowling
    11,50    Marie Des Intrigues   Juliette Benzoni
    16,50    Nous Les Dieux        Bernard Werber
    24       Pompei                Robert Harris

Books, Source B (columns): listed-price, title, a-fname, a-lname
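As an illustration of the two kinds of match, here is a minimal Python sketch of a mapping from Source B to Source A. The pairing of columns is inferred from their names, and the direction of the mapping, the `translate` helper, and the use of a single space to join first and last name are all assumptions made for this example, not something the slide specifies.

```python
# Each correspondence: (element in Source A, elements in Source B, transform).
# One source element is a 1-1 match; several source elements is a complex match.
MAPPING_B_TO_A = [
    ("price", ["listed-price"], lambda p: p),                          # 1-1
    ("book-title", ["title"], lambda t: t),                            # 1-1
    ("author-name", ["a-fname", "a-lname"], lambda f, l: f"{f} {l}"),  # complex
]


def translate(row, mapping):
    """Rewrite one Source B row into Source A's vocabulary (hypothetical helper)."""
    return {
        target: transform(*(row[name] for name in sources))
        for target, sources, transform in mapping
    }


# Illustrative Source B row; the split into first/last name is assumed.
row_b = {
    "listed-price": "26,60",
    "title": "Harry Potter",
    "a-fname": "J. K.",
    "a-lname": "Rowling",
}
row_a = translate(row_b, MAPPING_B_TO_A)
print(row_a)
```

A match algorithm only has to discover the first two columns of each correspondence; the transform for a complex match usually needs extra knowledge or user input.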

5 Schema Matching vs Ontology Matching
- Schema matching is usually performed with techniques that try to guess the meaning encoded in the schemas.
- Ontology matching tries to exploit the knowledge explicitly encoded in the ontologies.
- In real-world applications, solutions from both domains are mutually beneficial.

6 Application Domains
Traditional (static):
- Schema integration
- Data warehousing
- E-commerce
- Catalogue integration
New frontiers (dynamic):
- Semantic query processing
- Agent communication
- Web services integration
- P2P databases

7 Basic Classification of Matchers [RB01]
- Schema vs data instance
- Element vs structure
- Language vs constraint
  - String based: prefix, suffix (e.g. auth : author)
  - Tokenization, lemmatization, elimination [GSY04] (e.g. Tool_Kit : (Tool, Kit); Kits : Kit; IsRelatedTo : Related)
  - Data types, value domain (e.g. month)
- Match cardinalities: 1:1, 1:n, n:m (e.g. (Tel Res, Other) : (Tel Day, Evening, Night))
- Auxiliary information: global schema, dictionaries, thesauri, previous match decisions, user input
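The language-based techniques listed above can be sketched in a few lines of Python. This is a toy illustration only: the function names, the stop-word list, and the crude plural-stripping rule standing in for real lemmatization are assumptions for the example, not the method of [RB01] or [GSY04].

```python
import re

STOP_WORDS = {"is", "to", "of", "the", "a"}  # assumed, illustrative list


def tokenize(label):
    """Split a label on underscores, hyphens, spaces and camelCase boundaries."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [token for token in re.split(r"[\s_\-]+", spaced) if token]


def lemmatize(token):
    """Toy lemmatizer: lower-case and strip a plural 's' from longer words."""
    word = token.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(label):
    """Tokenize, lemmatize, then eliminate stop words."""
    lemmas = [lemmatize(token) for token in tokenize(label)]
    return [lemma for lemma in lemmas if lemma not in STOP_WORDS]


def prefix_match(a, b):
    """True when the shorter label is a prefix of the longer one."""
    short, long_ = sorted((a.lower(), b.lower()), key=len)
    return long_.startswith(short)


print(tokenize("Tool_Kit"), normalize("Kits"), normalize("IsRelatedTo"))
print(prefix_match("auth", "author"))
```

In practice the normalized token lists would then be compared with each other or looked up in a thesaurus.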

8 Basic Classification of Matchers [SE05]
Structure-level techniques:
- Graph matching: children, leaves, relations
- Taxonomy-based techniques, e.g. if the super-concepts are the same then the sub-concepts are the same, or vice versa
- Model based: ER, XML or XML Schema, OWL, OO, etc.
Combinational matchers [RB01]:
- Hybrid matcher
- Multiple/composite matcher
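A composite matcher, in the sense of [RB01], runs several matchers independently and combines their results. The Python sketch below combines one element-level and one structure-level score; the specific measures (a `difflib` string ratio and Jaccard overlap of child labels), the equal weights, and the `(label, children)` node representation are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Element-level score: string similarity of the two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def children_similarity(children_a, children_b):
    """Structure-level score: Jaccard overlap of the child label sets."""
    a, b = set(children_a), set(children_b)
    if not a and not b:
        return 1.0  # assumption: two leaves count as structurally identical
    return len(a & b) / len(a | b)


def composite_score(node_a, node_b, weights=(0.5, 0.5)):
    """Weighted combination of the two independently computed scores."""
    (label_a, children_a), (label_b, children_b) = node_a, node_b
    w_name, w_struct = weights
    return (w_name * name_similarity(label_a, label_b)
            + w_struct * children_similarity(children_a, children_b))


same = composite_score(("author", ["name"]), ("author", ["name"]))
different = composite_score(("author", ["name"]), ("publisher", ["name"]))
print(same, different)
```

A hybrid matcher would instead fold both criteria into a single algorithm, which is harder to extend but can be more efficient.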

9 Match Dimensions [SE05]
Designing a match algorithm requires knowing the dimensions along which it will be used:
- Input of the algorithm: data or schema; element level or structure level.
- Characteristics of the matching process: whether exact or approximate matching is required; performance versus quality.
- Output of the algorithm: either a graded result, or one of a set of match algorithms whose results are combined into a final mapping.

10 Existing Matching Tools
- Cupid [MBR01]
- COMA (COMA++) [ADMR05]
- Similarity Flooding
- SemInt
- Artemis
- DIKE
- TransScm
- AutoMed
- Charlie [TBBT04]
Ontology specific:
- NOM/QOM
- OLA
- Anchor-PROMPT
- S-Match [GSY04]
- HICAL
- SKAT

11 Matching Tools, continued
Machine learning:
- GLUE (LSD, CGLUE) [DMDH02]
- Automatch
These tools do not completely fulfil the requirements of large-scale schema matching because they:
- are not fully automated;
- place little emphasis on search space optimisation.

12 Our Approach
Motivation: large-scale scenario
- Peer-to-peer information systems over the XML Web
Our schema matching and integration approach:
- Tree mining techniques
- Name matcher
- Element-level matching
- Structure-level matching
- Search for sub-trees
[Figure: example XML schema trees drawn with single-letter node labels. Legend: a: author, b: book, d: detail, f: information, g: general, h: birth, i: isbn, n: name, o: own-books, p: publisher, r: price, t: title, w: writer. Matches shown: a=w, b=o, f=d]

13 Tree Mining Approach
Inspired by tree mining algorithms and data structures based on node scope values, calculated by a top-down, depth-first pre-order traversal [Z02]. Our work extends these data structures to the schema matching and integration process, in order to handle large sets of XML schema trees.
The approach:
a) employs an element-level name matcher (same node label or synonym) and clusters similar/synonym labels;
b) uses the properties of node scope values to extract semantics from the structure.
Example tree, with node number and scope:
    n0 book (b) [0,5]
        n1 author (a) [1,2]
            n2 name (n) [2,2]
        n3 publisher (p) [3,4]
            n4 name (n) [4,4]
        n5 title (t) [5,5]
E.g. the node with label name, n2 [2,2], is a descendant of the node with label author, n1 [1,2], and not of the node with label publisher, n3 [3,4]; this is verified using the descendant test.
Descendant node check: if the scope of node x is [X,Y] and the scope of node xd is [Xd,Yd], then xd is a descendant of x when Xd > X and Yd <= Y.
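The scope values and the descendant check can be sketched as follows in Python. The nested `(label, children)` tuple representation and the function names are assumptions for this example; the scope of a node is taken to be its own pre-order number together with the pre-order number of its last descendant, which reproduces the values shown on the slide.

```python
def assign_scopes(tree):
    """Return [(label, (start, end)), ...] indexed by pre-order node number."""
    nodes = []

    def visit(node):
        label, children = node
        number = len(nodes)      # pre-order number of this node
        nodes.append(None)       # reserve the slot before visiting children
        last = number
        for child in children:
            last = visit(child)  # number of the last node in the subtree so far
        nodes[number] = (label, (number, last))
        return last

    visit(tree)
    return nodes


def is_descendant(ancestor_scope, candidate_scope):
    """Descendant check from the slide: Xd > X and Yd <= Y."""
    x, y = ancestor_scope
    xd, yd = candidate_scope
    return xd > x and yd <= y


# The example tree from the slide.
book = ("book", [
    ("author", [("name", [])]),
    ("publisher", [("name", [])]),
    ("title", []),
])
scopes = assign_scopes(book)
for number, (label, scope) in enumerate(scopes):
    print(f"n{number} {label} {list(scope)}")
```

Because the check is two integer comparisons, ancestry between any two nodes can be tested without walking the tree, which is what makes scope values attractive for large forests of schema trees.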

14 Tree Mining Approach, continued
Data structures used:
- Label List: sorted list of all node labels in the forest of XML schema trees.
- xGrid: matrix in which each row represents one participating XML tree and each column represents the corresponding node label. Each cell contains the scope values, the parent node number and the mapping information.
Output:
- Creation of a mediated schema tree from the given forest of participating XML schema trees.
- Generation of mapping information between the participating schema trees and the mediated schema tree.
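A rough Python sketch of the two data structures is given below. It is a guess at one possible layout, not the authors' implementation: trees are nested `(label, children)` tuples, each cell is a list so that a label occurring several times in one tree (such as name) keeps all its occurrences, the `mapping` field is left empty, and synonym clustering of labels is omitted.

```python
def flatten(tree):
    """Pre-order list of (label, scope, parent_number) for one schema tree."""
    out = []

    def visit(node, parent):
        label, children = node
        number = len(out)
        out.append(None)  # reserve the slot before visiting children
        last = number
        for child in children:
            last = visit(child, number)
        out[number] = (label, (number, last), parent)
        return last

    visit(tree, None)
    return out


def build_xgrid(forest):
    """Return (label_list, xgrid): one row per tree, one column per label."""
    flat = [flatten(tree) for tree in forest]
    label_list = sorted({label for nodes in flat for label, _, _ in nodes})
    column = {label: index for index, label in enumerate(label_list)}
    xgrid = [[[] for _ in label_list] for _ in forest]
    for row, nodes in enumerate(flat):
        for label, scope, parent in nodes:
            cell = {"scope": scope, "parent": parent, "mapping": None}
            xgrid[row][column[label]].append(cell)
    return label_list, xgrid


# Two small, hypothetical schema trees.
s1 = ("book", [("author", [("name", [])]),
               ("publisher", [("name", [])]),
               ("title", [])])
s2 = ("book", [("title", []), ("writer", [("name", [])])])
label_list, xgrid = build_xgrid([s1, s2])
print(label_list)
print(xgrid[0][label_list.index("name")])
```

Reading down a column shows at a glance which trees contain a given label, and the stored scopes allow the descendant check to be applied without revisiting the trees.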

15 Tree Mining Approach, continued
Mapping information is the column number of the node.
[Figure: schema trees labelled Sm, S1, S2, S3, S4]

16 Conclusion
- Element-level name and linguistic matching, supported by a thesaurus, is an integral part of every match system.
- As systems move towards schema/ontology-based manipulation, and in the absence of global schemas or previous matching results, structure-level matching is equally important for recovering the semantics.
- A peer-to-peer environment requires new methods that deliver both performance and mapping quality, i.e. the integration of tree mining techniques for matching and for search space optimisation.
- Machine learning algorithms can be beneficial in the P2P environment at later stages, once training examples have been created from instance data, provided the target domain remains the same.

17 References
[AH04] Antoniou G. and Harmelen F. (2004) A Semantic Web Primer. The MIT Press
[ADMR05] Aumuller D., Do H. H., Massmann S. and Rahm E. (2005) Schema and Ontology Matching with COMA++. In Proceedings of the International Conference on Management of Data (SIGMOD)
[BR04] Bellahsène Z. and Roantree M. (2004) Querying Distributed Data in a Super-peer based Architecture. DEXA
[BMP04] Bernstein PA., Melnik S., Petropoulos M. and Quix C. (2004) Industrial-Strength Schema Mapping. SIGMOD Record, Vol. 33, No. 4, December 2004
[DMDH02] Doan AH., Madhavan J., Domingos P. and Halevy A. (2002) Learning to Map Ontologies on the Semantic Web. WWW 2002
[MBR01] Madhavan J., Bernstein PA. and Rahm E. (2001) Generic Schema Matching with Cupid. VLDB
[RB01] Rahm E. and Bernstein PA. (2001) A Survey of Approaches to Automatic Schema Matching. VLDB Journal 2001 : 10(4):
[SE05] Shvaiko P. and Euzenat J. (2005) A Survey of Schema-based Matching Approaches. Journal on Data Semantics,
[TBBT04] Tranier J., Baraer R., Bellahsène Z. and Teisseire M. (2004) Where's Charlie: Family Based Heuristics for Peer-to-Peer Schema Integration. IDEAS 2004,
[Z02] Zaki MJ. (2002) Efficiently Mining Frequent Trees in a Forest. 8th ACM SIGKDD Int'l Conf. Knowledge Discovery and Data Mining. July

18 Thank you