Peer Data Management, Concluded and Model Management Zachary G. Ives University of Pennsylvania CIS 650 – Database & Information Systems April 18, 2005.


2 Administrivia
• Next readings and summaries:
  • Dong and Halevy on Personal Info Management
  • 2-paragraph summary of the problems they focus on and their key contributions
• From Piazza to pizza … and scheduling

3 Today’s Trivia Question

4 Our Discussion
• The SW as originally posed:
  • RDF as “semantic” format
  • Also RDFS schema format
  • Ontologies as the standard way of defining concepts
  • Description logics are the way most ontologies are defined (OWL language)
• Piazza PDMS:
  • Relations and views
  • Query language as mapping language
  • Transitive closure of composition of mappings

5 Peer Data Management: Decentralized Mediation for Ad Hoc Extensibility
[Figure: peers DB Projects, UPenn, UW, Stanford, IIT Mumbai]
Data integration: 1 mediated schema, m mappings to sources
Peer data management system (PDMS):
• n mediated “peer schemas,” as few as (n - 1) mappings between them – evaluated transitively
• m mappings to sources
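The point that (n - 1) mappings can suffice when they are evaluated transitively can be illustrated with a small reachability sketch. The topology below is hypothetical (the slide's figure did not survive extraction), and `reachable_peers` is an illustrative helper, not part of Piazza. It treats each mapping as traversable in both directions, which the later slides say is possible for Piazza mappings.

```python
from collections import deque

def reachable_peers(mappings, start):
    """Return every peer reachable from `start` by following
    mappings transitively (breadth-first search)."""
    # Assumption: each mapping can be used in either direction.
    neighbors = {}
    for a, b in mappings:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        peer = queue.popleft()
        for nxt in neighbors.get(peer, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical topology: n = 5 peers linked by n - 1 = 4 mappings.
peers = ["DBProjects", "UPenn", "UW", "Stanford", "IITMumbai"]
mappings = [("UPenn", "DBProjects"), ("UW", "DBProjects"),
            ("Stanford", "UW"), ("IITMumbai", "Stanford")]

print(sorted(reachable_peers(mappings, "UPenn")))
```

With a connected chain or tree of mappings, a query posed at any one peer can in principle reach data at every other peer, which is what makes the sparse mapping graph workable.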

6 Example Rule-Goal Tree Expansion
q: Q(a1, a2) :- SameProject(a1,a2,p), Author(a1,w), Author(a2,w)
[Figure: rule-goal tree rooted at q with rule nodes r0, r1, r2, r3; goal nodes SameProject(a1,a2,p), Author(a1,w), Author(a2,w), ProjMember(a1,p), ProjMember(a2,p), CoAuthor(a1,a2), CoAuthor(a2,a1); leaves S1(a1,p,_), S1(a2,p,_), S2(a1,a2), S2(a2,a1)]
Q’(a1,a2) :- S1(a1,p,_), S1(a2,p,_), S2(a1,a2) ∪ S1(a1,p,_), S1(a2,p,_), S2(a2,a1)
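The last step of the expansion, turning the leaves of the tree into alternative conjunctive rewritings, can be sketched as a product over the alternatives available for each group of subgoals. This is only an illustration: the `expansions` table is hand-coded from the leaves shown on the slide, and the sketch skips unification and the construction of the tree itself.

```python
from itertools import product

# Assumption: each group of query subgoals maps to a list of alternative
# expansions over stored relations, transcribed by hand from the slide.
expansions = {
    "SameProject(a1,a2,p)": [["S1(a1,p,_)", "S1(a2,p,_)"]],
    "Author(a1,w), Author(a2,w)": [["S2(a1,a2)"], ["S2(a2,a1)"]],
}

def rewritings(expansions):
    """Pick one alternative per subgoal group; each combination
    is the body of one conjunctive rewriting."""
    bodies = []
    for choice in product(*expansions.values()):
        bodies.append([atom for alternative in choice for atom in alternative])
    return bodies

for body in rewritings(expansions):
    print("Q'(a1,a2) :- " + ", ".join(body))
```

The two printed bodies correspond to the two branches of the reformulated query Q’ on the slide.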

7 RDF vs. XML
• RDF explicitly names relationships:
  (book, title, “ABC”)
  (book, writtenBy, author)
  (author, name, “John Smith”)
• XML does not always:
  1. ABC John Smith
  2. ABC John Smith
[Figure: RDF graph with nodes book and author and edges title, name, writtenBy]

8 RDF vs. XML 2
• RDF is subject-neutral (a graph)
• XML centers around a subject (a tree):
  1. ABC John Smith
  2. John Smith ABC
• This may result in duplication of contained objects
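The duplication point can be made concrete with a small sketch that unfolds a set of triples into book-centered trees. The identifiers `book1`, `book2`, `author1` and the second title "XYZ" are made up for illustration, and `nest` is a hypothetical helper that assumes the graph is acyclic.

```python
# RDF-style triples: (subject, property, object). The graph has one
# author node shared by two books.
triples = [
    ("book1", "title", "ABC"),
    ("book1", "writtenBy", "author1"),
    ("book2", "title", "XYZ"),
    ("book2", "writtenBy", "author1"),
    ("author1", "name", "John Smith"),
]

def nest(subject, triples):
    """Unfold the graph into a tree rooted at `subject`.
    Shared nodes are copied into every place they are referenced."""
    subjects = {s for s, _, _ in triples}
    tree = {}
    for s, p, o in triples:
        if s == subject:
            child = nest(o, triples) if o in subjects else o
            tree.setdefault(p, []).append(child)
    return tree

# A book-centered, XML-like view: one tree per book.
book_centered = [nest(b, triples) for b in ("book1", "book2")]
print(book_centered)
```

In the graph the author appears once; in the book-centered trees the author subtree is copied under each book, which is the duplication the slide warns about.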

9 An XML Version of the Semantic Web
Data model: XML + Schema
• Vast volumes of data already in XML (or exported as XML)
• CAVEAT: not all relationships are labeled in XML (“XML has no semantics.”)
Concepts: Views ≈ classes; schemas ≈ ontologies
• Views define membership via queries; can reason about containment
• CAVEAT: less expressive than OWL classes
Schema mappings: target schema as query over source
Sophisticated reasoning about mappings is possible by extending existing data integration techniques
• Can use mappings in “forward” and “reverse” directions
• Allows for “chaining” of mappings to answer queries

10 Piazza with XML (WWW03)
Goals:
• Build on XQuery and XML (extended with RDF-style identity, following lead of [Patel-Schneider & Simeon 02])
• Remain computationally inexpensive
• Capture the common mapping types
Directional mapping language based on templates
  {: $var IN document(“doc”)/path WHERE condition :} $var
• Translates between parts of data instances
• Restricted subset of XQuery that’s decidable to reason about
• Supports special annotations and object fusion
Can map XML-XML, XML-RDF, RDF-XML (at data level)

11 Mapping Example between XML Schemas
Target:
  pubs
    book*
      title
      author*
        name
Source:
  authors
    author*
      full-name
      publication*
        title
        pub-type
[Figure: graph with labels pub-type, name, publication, author, writtenBy, title]

12 Example Piazza Mapping
{: $a IN document(“…”)/authors/author,
   $an IN $a/full-name,
   $t IN $a/publication/title,
   $typ IN $a/publication/pub-type
   WHERE $typ = “book”
   PROPERTY $t >= ‘A’ AND $t … {$t} {$an}

13 Challenges
• Query reformulation for XML is significantly harder
  • Hierarchy, 1:n schema constraints, ability to map from values to tags, …
  • Redundant paths
• Can only handle roughly the XML equivalent of conjunctive queries
• See the WWW03 paper (plus later work by Yu and Popa, Deutsch et al., many others) for details

14 What about Values?
• Thus far, we’ve focused on schema mappings
• Almost as important in the real world: mappings of values to values
  • Proteins to binding sites
  • SSNs to customer IDs
  • etc.
• The Hyperion system (KAM 03) focuses on computing transitive relationships between mappings
  • In many cases, we only have partial transitive mappings
  • Key idea: divide all of the mappings into partitions, each of which can compute transitive closures separately
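A minimal sketch of what a transitive relationship between value mappings looks like, assuming each mapping is stored as a table of value pairs. This is plain relational composition, not the Hyperion algorithm, and the identifiers and tables are invented for illustration.

```python
def compose(m1, m2):
    """Relational composition of two value-mapping tables:
    (x, z) is in the result whenever (x, y) is in m1 and (y, z) is in m2."""
    by_key = {}
    for y, z in m2:
        by_key.setdefault(y, set()).add(z)
    return {(x, z) for x, y in m1 for z in by_key.get(y, ())}

# Hypothetical mapping tables (all values are made up).
ssn_to_customer = {("ssn-A", "C17"), ("ssn-B", "C42")}
customer_to_account = {("C17", "ACCT-1"), ("C17", "ACCT-2")}

ssn_to_account = compose(ssn_to_customer, customer_to_account)
print(sorted(ssn_to_account))
```

Note that `ssn-B` drops out of the composed table because `C42` has no entry in the second mapping. That is the "partial transitive mapping" situation the slide describes.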

15 Assessment: The Semantic Web
• The KB world focuses on expressively capturing concepts
• The DB world focuses on integrating and restructuring data (but views are less expressive in certain ways)
• Does either of these seem likely to change the world?
• What barriers need to be removed?

16 From Managing the Web as a Database to Managing Databases of Databases
• Many common operations in:
  • Data integration
  • Data interchange
  • Schema design
  • Semantic Web
  • Schema maintenance/evolution
• For instance:
  • Creating a mediated schema
  • Defining mappings between schemas
  • Seeing what’s different between schemas
• The vision: let’s build a system to manage metadata, not data!

17 Metadata Management
• The challenges:
  • There are lots of metadata representations
    • Different data models; different definition types (e.g., Java classes, XML Schemas, SQL DDL, …)
  • Many of the problems are unsolvable in the abstract
    • e.g., schema matching
    • But maybe we can customize tools for each task
    • And maybe we can get user input to help
• We want to create a clean, composable model of operators
  • Should be “algebraic” in some sense, with nice properties
  • Operators need to be generic but extensible

18 Data vs. Metadata vs. …
Data
• We know what this is
Metadata (models)
• Schemas, types, classes, etc.
Metamodels
• Things like the relational model, O-R model, …
• Bernstein focuses on managing models, with customization for each metamodel (and perhaps special domains)

19 Models
• A model is a set of objects with identity
• Objects have at least extended ER-style traits:
  • attributes/properties
  • is-a, has-a relationships
  • loose associations
• All of these are assumed to have types
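One way to picture this definition is a small data structure in which objects compare by identity rather than by value. The class name `ModelObject` and its field names are illustrative assumptions, not the representation used in Bernstein's work.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False keeps identity-based equality and hashing
class ModelObject:
    name: str
    type: str                                         # every object is typed
    attributes: dict = field(default_factory=dict)    # attribute name -> type
    is_a: list = field(default_factory=list)          # generalization links
    has_a: list = field(default_factory=list)         # containment links
    associations: list = field(default_factory=list)  # loose associations

# A hypothetical two-object model.
person = ModelObject("Person", "entity", attributes={"Name": "string"})
employee = ModelObject("Employee", "entity",
                       attributes={"EmployeeID": "int"}, is_a=[person])
model = {person, employee}  # a model is a set of objects with identity

# Same field values, different identity: still a distinct object.
twin = ModelObject("Employee", "entity",
                   attributes={"EmployeeID": "int"}, is_a=[person])
print(employee == twin)
```

Using identity rather than value equality matters once mappings are introduced, because a mapping has to refer to a specific object in a specific model.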

20 Mappings
A mapping describes a correspondence between parts of two models; it may be annotated with information about computing the transformation
[Figure: model Emp (Emp#, Name, Address) linked through mapping Map ee, with mapping objects 1 (=) and 2 (≈), to model Employee (EmployeeID, FirstName, LastName, Phone)]

21 The Basic Algebraic Operators
Match
• Basically, schema matching: takes two models and returns a mapping between them
• Elementary vs. complex match; reliance on morphisms
Compose
• Takes two mappings and composes them
Diff
• Takes a model A and a mapping A → B, and returns the part of A that’s not mapped
ModelGen
• Takes model A, creates new model B plus mapping A → B
Merge
• Takes models A, B, and a mapping between them; returns the union C, plus mappings A → C, B → C
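To show how these operators fit together, here is a deliberately simplified sketch in which a model is a set of element names and a mapping is a set of (source, target) pairs. Real model management operates on typed object graphs and far richer mappings; the name-equality `match` in particular is a stand-in for a real schema matcher, ModelGen is omitted, and all example data is hypothetical.

```python
def match(a, b):
    """Naive Match: pair elements whose names agree, ignoring case."""
    return {(x, y) for x in a for y in b if x.lower() == y.lower()}

def compose(m_ab, m_bc):
    """Compose: (x, z) whenever (x, y) is in m_ab and (y, z) is in m_bc."""
    return {(x, z) for x, y in m_ab for y2, z in m_bc if y == y2}

def diff(a, m_ab):
    """Diff: the part of A that the mapping A -> B does not touch."""
    mapped = {x for x, _ in m_ab}
    return set(a) - mapped

def merge(a, b, m_ab):
    """Merge: union of A and B with mapped elements collapsed into one.
    Returns (C, mapping A -> C, mapping B -> C).
    Assumes each B element maps to at most one A element."""
    b_to_a = {y: x for x, y in m_ab}
    a_to_c = {(x, x) for x in a}
    b_to_c = {(y, b_to_a.get(y, y)) for y in b}
    c = set(a) | {z for _, z in b_to_c}
    return c, a_to_c, b_to_c

# Hypothetical models.
a = {"id", "name", "address"}
b = {"ID", "Name", "phone"}

m = match(a, b)
c, a_to_c, b_to_c = merge(a, b, m)
print(sorted(m))
print(sorted(diff(a, m)))
print(sorted(c))
```

Each operator returns a model or a mapping, so outputs can feed directly into other operators. That closure is the "algebraic" property the slides ask for.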

22 Model Management in Action

23 Schematic of Changes
[Figure callouts:]
• the new parts in S2 that need to be propagated to d2
• Dest. w/o deleted items from s1
• the XML version of s2

24 Actual Operations

25 What’s Hard?
• Match
  • We saw that LSD is far from perfect, and it’s the best out there…
• Merge
  • Can we make (A merge B) merge C = A merge (B merge C)?
  • (Buneman, Davidson, Kosky 92)
• With Diff, how do we ensure a well-formed model as the result?
  • They return a copy of the model, plus mappings showing what is actually part of the diff
• Composition – it isn’t always closed within the mapping language!

26 More Challenges
• What about:
  • Semantics of the meta-model – how do we handle, e.g., constraints?
  • What to do about approximate correspondences?
  • Can we actually make these things generic but expressive enough to be useful?
• Do you think this vision is feasible?