Query Language Constructs for Provenance
Murali Mani, Mohamad Alawa, Arunlal Kalyanasundaram
University of Michigan, Flint
Presented at IDEAS 2011.

Provenance Metadata

Data about the origins of data.

Applications:
- Check whether a data item is valid (health records)
- How much do we trust an inference/observation (scientific computation)
- Audit trails (manufacturing/shipping/trading)

The database community found provenance could be useful in:
- updating views
- maintenance of materialized views
- interpretation of query results
- querying probabilistic/uncertain data

In short, numerous applications …

OPM (Open Provenance Model)
http://openprovenance.org/

- Developed by several researchers who have been involved with provenance.
- Describes a logical representation of provenance information for a wide variety of applications.
- Provenance information is represented as a directed graph consisting of:
  - Nodes (each an artifact, a process, or an agent)
  - Edges, or dependencies. There are 5 types of edges:
    - used: a process used an artifact
    - wasGeneratedBy: an artifact was generated by a process
    - wasControlledBy: a process was controlled by an agent
    - wasTriggeredBy: a process was triggered by another process
    - wasDerivedFrom: an artifact was derived from another artifact
- Nodes and edges have annotations (attribute-value pairs).
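The graph model described above can be sketched as a small in-memory data structure. This is only an illustration, not an official OPM library: the names `Node`, `Edge`, `ProvenanceGraph`, `NODE_KINDS`, and `EDGE_TYPES` are hypothetical, and the edge directions (each edge points from the effect to its cause) are read off the slide's wording.

```python
from dataclasses import dataclass, field

# The three node kinds listed on the slide.
NODE_KINDS = {"artifact", "process", "agent"}

# For each of the five edge types: (kind of source node, kind of target node).
# Assumption: an edge points from the effect to its cause.
EDGE_TYPES = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}


@dataclass
class Node:
    node_id: str
    kind: str
    annotations: dict = field(default_factory=dict)  # attribute-value pairs


@dataclass
class Edge:
    edge_type: str
    source: str
    target: str
    annotations: dict = field(default_factory=dict)  # attribute-value pairs


class ProvenanceGraph:
    """Hypothetical container for an OPM-style directed graph."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, kind, **annotations):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[node_id] = Node(node_id, kind, annotations)

    def add_edge(self, edge_type, source, target, **annotations):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        source_kind, target_kind = EDGE_TYPES[edge_type]
        # Reject edges whose endpoints have the wrong node kind.
        if (self.nodes[source].kind != source_kind
                or self.nodes[target].kind != target_kind):
            raise ValueError(f"{edge_type} must link {source_kind} -> {target_kind}")
        self.edges.append(Edge(edge_type, source, target, annotations))


g = ProvenanceGraph()
g.add_node("P", "process", type="division")
g.add_node("A1", "artifact")
g.add_edge("used", "P", "A1")
```

Checking endpoint kinds in `add_edge` is a design choice of this sketch; it keeps a malformed edge such as an artifact "using" a process out of the graph.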

OPM: A Simple Example

[Figure: provenance graph with process P (type=division), artifacts A1, A2, A3, A4, two used edges labeled divisor and dividend, and two wasGeneratedBy edges labeled remainder and quotient.]

- A1, A2 are artifacts.
- P is a process that is performing division (A1/A2); note the used edges between P and A1, A2.
- A3, A4 are artifacts generated by P (representing quotient, remainder); note the wasGeneratedBy edges between P and A3, A4.

Example taken from
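The division example can be written down as plain data to make the edge directions concrete. This is a minimal sketch: the tuple layout and the helpers `inputs_of` and `outputs_of` are hypothetical. The transcript does not state which artifact carries which role, so the assignment below (A1 = dividend and A2 = divisor, from "A1/A2"; A3 = quotient and A4 = remainder, from the listing order) is an assumption.

```python
# Nodes: id -> (kind, annotations).
nodes = {
    "P": ("process", {"type": "division"}),
    "A1": ("artifact", {}),
    "A2": ("artifact", {}),
    "A3": ("artifact", {}),
    "A4": ("artifact", {}),
}

# Edges: (edge type, source, target, role).
# ASSUMPTION: the role-to-artifact assignment is inferred, not given by the slide.
edges = [
    ("used", "P", "A1", "dividend"),
    ("used", "P", "A2", "divisor"),
    ("wasGeneratedBy", "A3", "P", "quotient"),
    ("wasGeneratedBy", "A4", "P", "remainder"),
]


def inputs_of(process):
    """Map each role to the artifact the process used in that role."""
    return {role: target
            for edge_type, source, target, role in edges
            if edge_type == "used" and source == process}


def outputs_of(process):
    """Map each role to the artifact the process generated in that role."""
    return {role: source
            for edge_type, source, target, role in edges
            if edge_type == "wasGeneratedBy" and target == process}


print(inputs_of("P"))
print(outputs_of("P"))
```

Note that used edges start at the process while wasGeneratedBy edges end at it, which is why the two helpers read the artifact from opposite ends of the edge.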

Queries for OPM

- We can write complex "multi-step inference" queries using Datalog/SQL based on the different edges in OPM.
  - Example: find artifacts directly or indirectly derived from another artifact (recursive query using wasDerivedFrom edges).
- However, is it sufficient? We may need to express:
  - Sub-graph isomorphism (given a graph query pattern, check whether the pattern appears in a provenance graph). Studied in graph query languages ([Graph-QL], [OPQL], …).
  - Shortest path queries (using some notion of distance). Typically not studied in graph query languages.
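The recursive wasDerivedFrom query mentioned above amounts to a reachability computation. The sketch below does it in Python with a breadth-first traversal rather than in Datalog or SQL; the function name `derived_from`, the `(derived, source)` pair layout, and the sample artifact names are all hypothetical.

```python
from collections import defaultdict, deque


def derived_from(was_derived_from, origin):
    """Return every artifact directly or indirectly derived from `origin`.

    `was_derived_from` is an iterable of (derived, source) pairs, one per
    wasDerivedFrom edge.
    """
    # Edges point from the derived artifact to its source, so index them by
    # source to walk forward from the origin.
    derived_by_source = defaultdict(list)
    for derived, source in was_derived_from:
        derived_by_source[source].append(derived)

    seen = set()
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for artifact in derived_by_source[current]:
            if artifact not in seen:
                seen.add(artifact)
                queue.append(artifact)
    return seen


# Hypothetical data: B and D derive from A, C derives from B, E is unrelated.
sample_edges = [("B", "A"), ("C", "B"), ("D", "A"), ("E", "X")]
print(sorted(derived_from(sample_edges, "A")))  # ['B', 'C', 'D']
```

The `seen` set plays the role of the fixpoint in a recursive query: traversal stops once no new artifact is discovered.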

Our Approach

Examples of Generalized Selection Operator

Conclusions and Future Work

- Observation: a provenance query language should not be restricted to Datalog/SQL.
- Developed a query model that provides constructs for querying structure and for querying content.
- Using our query model, we can express a wide range of queries, including shortest path (not expressible using SQL/Datalog).
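To make the shortest path query concrete, here is a minimal sketch using Dijkstra's algorithm over a provenance graph whose edges carry a numeric distance. It does not reproduce the authors' query constructs, which the transcript does not show. The slides speak only of "some notion of distance", so the non-negative per-edge distance, the function name `shortest_path`, and the sample data are assumptions.

```python
import heapq


def shortest_path(edges, start, goal):
    """Return (total distance, path) from start to goal, or None if unreachable.

    `edges` is an iterable of (source, target, distance) triples with
    non-negative distances (an assumed annotation on each edge).
    """
    adjacency = {}
    for source, target, distance in edges:
        adjacency.setdefault(source, []).append((target, distance))

    best = {start: 0}
    heap = [(0, start, [start])]
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == goal:
            return dist, path
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in adjacency.get(node, []):
            candidate = dist + weight
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor, path + [neighbor]))
    return None


# Hypothetical dependency edges pointing from an effect to its cause.
sample_edges = [("D", "C", 1), ("C", "A", 1), ("D", "B", 1),
                ("B", "A", 5), ("D", "A", 4)]
print(shortest_path(sample_edges, "D", "A"))  # (2, ['D', 'C', 'A'])
```

With every distance set to 1, the same function returns a path with the fewest dependency edges.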

References

[Graph-QL]: He, H., and Singh, A. K. Graphs-at-a-time: Query Language and Access Methods for Graph Databases. ACM SIGMOD (2008).
[OPQL]: Lim, C., Lu, S., Chebotko, A., and Fotouhi, F. OPQL: A First OPM-Level Query Language for Scientific Workflow Provenance. IEEE SCC (2011).
[OPM]: The OPM Provenance Model (OPM), available at