Provenance in ORCHESTRA. T.J. Green, G. Karvounarakis, Z. Ives, V. Tannen, University of Pennsylvania. Principles of Provenance (PrOPr), Philadelphia, PA.

Presentation transcript:

1 Provenance in ORCHESTRA
T.J. Green, G. Karvounarakis, Z. Ives, V. Tannen, University of Pennsylvania
Principles of Provenance (PrOPr), Philadelphia, PA, June 26, 2007

2 Collaborative Data Sharing [Ives+ CIDR 05]
Peers and relations: P_GUS: G(id,can,nam); P_BioSQL: B(id,nam); P_uBio: U(nam,can)
(m1) G(i,c,n) → B(i,n)
(m2) G(i,c,n) → U(n,c)
(m3) B(i,n) → ∃c U(n,c)
(m4) B(i,c) ∧ U(n,c) → B(i,n)
Example: +U(3,2) comes from +G(1,2,3) via m2
Schema mappings specify how data is logically related.
Update exchange propagates updates and records provenance information (1) to assess trust conditions and (2) to facilitate incremental maintenance.

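3 Insertions and provenance
Base insertions with their provenance tokens: +G(3,5,2) p3, +B(3,5) p1, +G(1,2,3) p4, +U(2,5) p2
G: (3,5,2), (1,2,3)
B: (3,5), (3,2), (3,3), (1,3)
U: (5,c1), (2,5), (2,c2), (3,c3), (3,2)
[Figure: provenance graph connecting the tuples of G, B, and U, with edges labeled by the mappings m1 to m4]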
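The instance on this slide can be reproduced by running the four mappings of slide 2 to a fixpoint over the base insertions. The following is a minimal sketch under stated assumptions, not the ORCHESTRA implementation: the function names (`skolem`, `apply_mappings`, `update_exchange`) are hypothetical, the existential in m3 is replaced by a Skolem term per value of n (so `c(5)`, `c(2)`, `c(3)` play the role of c1, c2, c3), and each derivation is recorded as a triple of mapping name, source tuples, and target tuple.

```python
from itertools import product

def skolem(n):
    # Stand-in for the existential c in m3: one labeled null per value of n.
    return f"c({n})"

def apply_mappings(G, B, U):
    """Apply m1..m4 once; return derivations (mapping, sources, target)."""
    out = []
    for (i, c, n) in G:
        out.append(("m1", (("G", (i, c, n)),), ("B", (i, n))))   # G(i,c,n) -> B(i,n)
        out.append(("m2", (("G", (i, c, n)),), ("U", (n, c))))   # G(i,c,n) -> U(n,c)
    for (i, n) in B:
        out.append(("m3", (("B", (i, n)),), ("U", (n, skolem(n)))))  # B(i,n) -> exists c U(n,c)
    for (i, c), (n, c2) in product(B, U):
        if c == c2:                                              # B(i,c), U(n,c) -> B(i,n)
            out.append(("m4", (("B", (i, c)), ("U", (n, c2))), ("B", (i, n))))
    return out

def update_exchange(base):
    """Chase the mappings to a fixpoint, recording every derivation found."""
    inst = {"G": set(), "B": set(), "U": set()}
    for rel, tup in base:
        inst[rel].add(tup)
    derivations = set()
    changed = True
    while changed:
        changed = False
        for d in apply_mappings(inst["G"], inst["B"], inst["U"]):
            if d not in derivations:
                derivations.add(d)
                rel, tup = d[2]
                inst[rel].add(tup)
                changed = True
    return inst, derivations

# Base insertions from the slide (tokens p1..p4 are omitted in this sketch).
BASE = [("G", (3, 5, 2)), ("B", (3, 5)), ("G", (1, 2, 3)), ("U", (2, 5))]
instance, derivations = update_exchange(BASE)
for rel in ("G", "B", "U"):
    print(rel, sorted(instance[rel], key=str))
```

The recorded derivations correspond to the labeled edges of the provenance graph. The loop terminates on this input because the labeled nulls only ever appear in the second column of U, which no mapping copies into B.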

4 Deletions and provenance
[Figure: the same base insertions (+G(3,5,2) p3, +B(3,5) p1, +G(1,2,3) p4, +U(2,5) p2) and provenance graph over G, B, and U as on slide 3]
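One way recorded provenance can help with deletions is sketched below: once the derivations are stored, the tuples that survive a deletion are those still derivable from the remaining base tuples, and this can be decided without re-running the mappings. This is an illustrative sketch, not the algorithm of the talk. The `DERIVATIONS` list is a hand transcription of the graph on slide 3, the function name `surviving` is hypothetical, and the deletions tried at the end are made-up examples.

```python
# Each entry: (mapping, source tuples, derived tuple), transcribed from slide 3.
DERIVATIONS = [
    ("m1", [("G", (3, 5, 2))], ("B", (3, 2))),
    ("m1", [("G", (1, 2, 3))], ("B", (1, 3))),
    ("m2", [("G", (3, 5, 2))], ("U", (2, 5))),
    ("m2", [("G", (1, 2, 3))], ("U", (3, 2))),
    ("m3", [("B", (3, 5))], ("U", (5, "c1"))),
    ("m3", [("B", (3, 2))], ("U", (2, "c2"))),
    ("m3", [("B", (3, 3))], ("U", (3, "c3"))),
    ("m3", [("B", (1, 3))], ("U", (3, "c3"))),
    ("m4", [("B", (3, 5)), ("U", (2, 5))], ("B", (3, 2))),
    ("m4", [("B", (3, 2)), ("U", (3, 2))], ("B", (3, 3))),
]

# Base tuples keyed by their provenance tokens.
BASE = {
    "p1": ("B", (3, 5)),
    "p2": ("U", (2, 5)),
    "p3": ("G", (3, 5, 2)),
    "p4": ("G", (1, 2, 3)),
}

def surviving(deleted_tokens):
    """Least fixpoint: tuples derivable from the base tuples not deleted."""
    alive = {t for tok, t in BASE.items() if tok not in deleted_tokens}
    changed = True
    while changed:
        changed = False
        for _mapping, sources, target in DERIVATIONS:
            if target not in alive and all(s in alive for s in sources):
                alive.add(target)
                changed = True
    return alive

# Deleting only p2: U(2,5) is still derived from G(3,5,2) via m2.
print(("U", (2, 5)) in surviving({"p2"}))
# Deleting p2 and p3: U(2,5), B(3,2), and everything that depended on them go away.
print(sorted(surviving({"p2", "p3"}), key=str))
```

Starting from the surviving base tuples and growing the set upward matters for cyclic provenance: a tuple supported only by derivations that depend on itself is never added.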

5 Trust and provenance
[Figure: the same base insertions and provenance graph over G, B, and U as on slide 3]
Peer B distrusts any tuple B(i,n) that came from mapping m4 if n [?] 2.
Peer B distrusts any tuple B(i,n) if the data came from Peer G and n ≥ 3, and trusts any tuple from Peer U.

6 Further aspects of ORCHESTRA
● Semantics: insertions and deletions with idbs
● Handling conflicts among trusted updates (Taylor+Ives SIGMOD 06)
● Prototype implementation (demo SIGMOD 07, technical details VLDB 07):
▪ Java middleware layer using database as subcomponent
▪ Provenance expressions stored as tables
▪ tgds become datalog rules with Skolem functions
▪ Update exchange using relational query engine (recursion!)
▪ Feasibility experiments
● Future work / topics for discussion
▪ What else to do with rich provenance information? (ranked trust models, bag semantics, querying provenance, ...)

7 Trust policies
● Peer B distrusts any tuple B(i,n) that came from mapping m4 if n [?] 5.
● Peer B distrusts any tuple B(i,n) if the data came from Peer G and n ≥ 3, and trusts any tuple from Peer U.