On the Origin of Data
Daniel Deutch
Blavatnik School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences



Data Evolvement
This is the era of Data.
– Databases, text, blogs, social data, …
– Huge volumes
– Evolving through automatic tools
– Sent between applications and users

Provenance

Data Provenance
Understanding how and why data has evolved is of fundamental importance
– For authentication
  Both origin and propagators of data should be trustworthy
– For access control
  Confidentiality constraints interplay with the transformation
– For hypothetical reasoning
  What if we change a piece of data? How can we optimally affect data evolvement?

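Example
Alice posted photos with David.
David is worried about Eve seeing his photos.
[Figure: provenance expression of the form · OR ( ( · OR · ) AND NOT · ); the operands were images in the original slide]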

Tracking Provenance
The logic is already implemented (e.g. to decide what photos to show).
We develop tools to “instrument” applications with provenance tracking.
Simply maintaining an “activity log” is not good enough.
– We also want the possible “reasons” for activities
– E.g. “not blacklisted” is not an activity
Instead, we create formulas in generic algebraic constructions based on semirings.
We also develop tools that use the provenance information for analysis.

Generic Expression
Expression: · OR ( ( · OR · ) AND NOT · ) [operands were images in the original slide]
Trust: False OR ( ( True OR True ) AND NOT False ) = True
Number of paths (if Alice and Eve are not friends): 0 + ( ( [operands missing] ) x 1 ) = 2
Latency: ∞ min ( ( 0:05 min 0:08 ) + 0:00 ) = 0:05
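
The three evaluations above share one expression and differ only in how OR and AND are interpreted. The following is a minimal sketch of that idea, not the authors' tooling. The leaf names `a`, `b`, `c`, `not_d` are hypothetical, the negated atom is treated as an ordinary leaf whose value is supplied per interpretation (as the slide does with NOT False), the latencies are encoded as plain numbers 5 and 8, and the inner counting operands are assumed to be 1 and 1 because that reproduces the stated result of 2.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar, Union

K = TypeVar("K")


@dataclass(frozen=True)
class Semiring(Generic[K]):
    plus: Callable[[K, K], K]   # interpretation of OR
    times: Callable[[K, K], K]  # interpretation of AND


@dataclass(frozen=True)
class Leaf:
    name: str  # hypothetical atom name, looked up in an assignment


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Leaf, Or, And]


def evaluate(expr, semiring, assignment):
    """Evaluate one provenance expression under a chosen semiring."""
    if isinstance(expr, Leaf):
        return assignment[expr.name]
    left = evaluate(expr.left, semiring, assignment)
    right = evaluate(expr.right, semiring, assignment)
    if isinstance(expr, Or):
        return semiring.plus(left, right)
    return semiring.times(left, right)


# Shape taken from the slide: a OR ((b OR c) AND NOT d).
EXPR = Or(Leaf("a"), And(Or(Leaf("b"), Leaf("c")), Leaf("not_d")))

TRUST = Semiring(plus=lambda x, y: x or y, times=lambda x, y: x and y)
COUNTING = Semiring(plus=lambda x, y: x + y, times=lambda x, y: x * y)
LATENCY = Semiring(plus=min, times=lambda x, y: x + y)

trust = evaluate(EXPR, TRUST, {"a": False, "b": True, "c": True, "not_d": True})
paths = evaluate(EXPR, COUNTING, {"a": 0, "b": 1, "c": 1, "not_d": 1})  # 1, 1 assumed
latency = evaluate(EXPR, LATENCY, {"a": float("inf"), "b": 5, "c": 8, "not_d": 0})
```

Keeping the expression separate from the semiring is what lets the same tracked provenance answer several different analysis questions without re-running the application.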

Provenance for SQL Queries
Amsterdamer, D., Tannen, Provenance for Aggregate Queries [PODS ’11]
Amsterdamer, D., Tannen, On the Limitations of Provenance for Queries with Difference [TaPP ’11]
D., Milo, Roy, Tannen, Circuits for Datalog Provenance [ICDT ’14]
Amsterdamer, D., Green, Karvounarakis, Tannen, Semiring-based Provenance for SQL Queries (In preparation)
D., Moskovitch, Provenance for Relational Updates (In preparation)

Emps
Dep.    Emp     Prov.
Eng.    Alice   S
Eng.    Bob     T
Sales   Carol   S

GoodEmps
Emp     Prov.
Alice   C
Bob     S
Carol   T

π Dep (Emps ⋈ GoodEmps)
Dep.    Prov.
Eng.    S·C + T·S = S + T = S
Sales   S·T = T
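
The annotated result above can be reproduced with a small sketch. The slide does not name the semiring. The arithmetic S·C + T·S = S + T = S is consistent with clearance levels ordered C < S < T, where `·` keeps the more restrictive level and `+` keeps the less restrictive one, so that ordering is assumed here. The function and variable names are illustrative only.

```python
# Assumed ordering of clearance levels: C < S < T.
LEVELS = {"C": 0, "S": 1, "T": 2}


def times(a, b):
    """Joint use of two tuples: keep the more restrictive level."""
    return a if LEVELS[a] >= LEVELS[b] else b


def plus(a, b):
    """Alternative derivations: keep the less restrictive level."""
    return a if LEVELS[a] <= LEVELS[b] else b


# Annotated relations copied from the slide.
emps = [("Eng.", "Alice", "S"), ("Eng.", "Bob", "T"), ("Sales", "Carol", "S")]
good_emps = {"Alice": "C", "Bob": "S", "Carol": "T"}


def project_dep_of_join(emps, good_emps):
    """Compute the projection on Dep. of the join of Emps and GoodEmps on Emp."""
    result = {}
    for dep, emp, prov in emps:
        if emp not in good_emps:
            continue
        joined = times(prov, good_emps[emp])  # join multiplies annotations
        if dep in result:
            result[dep] = plus(result[dep], joined)  # projection adds them
        else:
            result[dep] = joined
    return result


result = project_dep_of_join(emps, good_emps)
```

Under these assumptions `result` maps Eng. to S and Sales to T, matching the table on the slide.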

Provenance for Social and Web Data
Bienvenu, D., Suchanek, Provenance for Web 2.0 Data [Secure Data Management ’12]
Abiteboul, Bienvenu, D., Deduction in the Presence of Distribution and Contradictions [WebDB ’12]
Abiteboul, D., Vianu, Deduction with Contradictions in Datalog [ICDT ’14]
Amarilli, D., Senellart, Provenance for Order-Aware Transformations (In preparation)

PROPOLIS: Provenance for Process Analysis
D., Moskovitch, Tannen, PROPOLIS: Provisioned Analysis of Data-Centric Processes [VLDB ’13]
D., Moskovitch, Tannen, A Provenance Framework for Data-Dependent Process Analysis (Submitted)
D., Moskovitch, Provenance for Distributed Processes (In preparation)

Thank you!