PQLite: An Overly Simplistic Query Language for Data Provenance

Presentation transcript:

PQLite: An Overly Simplistic Query Language for Data Provenance
CMPS203 Final Project
University of California, Santa Cruz, Jack Baskin School of Engineering
Michael {Leece, Sevilla}

Overview
– Introduction
– Current Work
– Design and Implementation
– Conclusions

Introduction: Terminology
Provenance: history + ancestry of an object [1]
– Processes
– Data
Provenance Aware Storage (PASS)
– Transparent collection
PQL: Path Query Language
– Useful for provenance
[Figure: Ancestry Graph]
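The idea of provenance as the ancestry of an object can be illustrated with a small graph walk. This is only a sketch: the object names and the `parents` mapping are invented for the example and do not come from PASS, and Python is used purely for illustration.

```python
from collections import deque

# Hypothetical ancestry data: each object maps to its direct ancestors.
EXAMPLE = {
    "report.pdf": ["latex", "report.tex"],
    "report.tex": ["vim"],
    "latex": [],
    "vim": [],
}

def all_ancestors(parents, start):
    """Return every object reachable from `start` by following ancestor edges."""
    seen = set()
    queue = deque(parents.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already visited, so cycles cannot loop forever
        seen.add(node)
        queue.extend(parents.get(node, ()))
    return seen
```

Calling `all_ancestors(EXAMPLE, "report.pdf")` collects both the processes and the files that contributed to the report.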

Introduction: Applications
– Security
– File System Search
– The Cloud
– New Hierarchical File Systems
– Yan Li’s Photo Album

Current Work: PQL Broken
Obtained PASSv2
Ran PQL query on provenance database
– Infinite loops
– {}
“The problem with PQL and Sage is that the implementation… is really slow, and it’s perhaps too easy to generate PQL queries that do not return any data.” – PASS Team

Current Work: PQL Undocumented

Current Work: Overview
[Figure: PASSv2 architecture. Diagram labels: Waldo Database Dump; PASSv2 Modules; Kernel Space; VFS; Lasagna FS; App1; App2; User Space; BDB.twig]

Design & Implementation: Use Case
What we have
[ P ] 1.0 INODE 4 INODE 12
[ P ] 1.0 NAME 9 "/file.txt"
[ P ] 1.0 TYPE 4 "FILE"
[ P ] 1.0 FREEZETIME 8 TIME
[ P ] 1.0 FREEZETIME 8 TIME
[ P ] 1.0 FREEZETIME 8 TIME
[AP ] 1.1 INPUT 12 --> 2.1
[AP ] 1.2 INPUT 12 --> 8.1
[AP ] 1.3 INPUT 12 --> 16.2
[ PT] 2.0 ARGV 4 [1]"cat"
[ PT] 2.0 ENV 64 [2]"SHELL=/bin/bash" [3]"TERM=xterm" [4]"XDG_SESSION_COOKIE=06c3f2775eb071081dfacb984bf6c " [5]"USER=root" [6]"LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.svgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:" [7]"MAIL=/var/mail/root" [8]"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" [9]"PWD=/test" [10]"LANG=en_US.UTF-8" [11]"SHLVL=1" [12]"HOME=/root" [13]"LOGNAME=root" [14]"LESSOPEN=| /usr/bin/lesspipe %s" [15]"LESSCLOSE=/usr/bin/lesspipe %s %s" [16]"_=/bin/cat" [17]"OLDPWD=/"
[ ] 2.0 EXECTIME 8 TIME
[ P ] 2.0 TYPE 4 "PROC"
[ ] 2.0 PID 4 INT 13739
[ P ] 2.0 NAME 8 "/bin/cat"
[A ] 2.0 FORKPARENT 12 -->
[ P ] 2.0 FREEZETIME 8 TIME
What we want
– A list of files or processes that are one-step ancestors of “/file.txt”
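To make the gap between "what we have" and "what we want" concrete, here is a minimal Python sketch that turns dump records of the shape shown above into a one-step ancestor lookup. Several things are assumptions, not documented PASS behavior: the record layout `[flags] pnode.version ATTRIBUTE size value`, the reading of an `INPUT` record as "this object depends on the object after `-->`", and every function name. Python is used only for illustration.

```python
import re
from collections import defaultdict

# Assumed record shape: [flags] pnode.version ATTRIBUTE size value...
RECORD = re.compile(
    r'^\[(?P<flags>[^\]]*)\]\s+(?P<pnode>\d+)\.(?P<version>\d+)\s+'
    r'(?P<attr>\w+)\s+\d+\s*(?P<value>.*)$'
)
# Assumed cross-reference shape: --> pnode.version
XREF = re.compile(r'^-->\s*(\d+)\.(\d+)$')

def parse_dump(lines):
    """Return (names, parents): pnode -> name, and pnode -> set of ancestor pnodes."""
    names = {}
    parents = defaultdict(set)
    for line in lines:
        m = RECORD.match(line.strip())
        if m is None:
            continue  # skip anything that does not look like a record
        pnode = int(m.group("pnode"))
        attr = m.group("attr")
        value = m.group("value").strip()
        if attr == "NAME":
            names[pnode] = value.strip('"')
        elif attr == "INPUT":
            xref = XREF.match(value)
            if xref:
                parents[pnode].add(int(xref.group(1)))
    return names, parents

def one_step_ancestors(names, parents, target):
    """Names of the direct ancestors of every object called `target`."""
    result = set()
    for pnode, name in names.items():
        if name == target:
            for parent in parents.get(pnode, ()):
                # Ancestors without a NAME record are reported by pnode number.
                result.add(names.get(parent, f"pnode {parent}"))
    return sorted(result)

# A few records copied from the dump above.
SAMPLE = [
    '[ P ] 1.0 NAME 9 "/file.txt"',
    '[AP ] 1.1 INPUT 12 --> 2.1',
    '[ P ] 2.0 NAME 8 "/bin/cat"',
    '[A ] 2.0 FORKPARENT 12 -->',
]
```

With `names, parents = parse_dump(SAMPLE)`, the call `one_step_ancestors(names, parents, "/file.txt")` looks up the process that wrote the file.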

Design & Implementation: Use Case (cont.)
Waldo Database Dump
Query: SELECT r FROM Graph AS r WHERE r.child = "/file.txt"
Components: Query Parser, Dump Parser, Evaluator
[Figure: Ancestry Graph]
Label Map: 1 -> file.txt, 2 -> jazz.jpg, 3 -> bacon.txt, …
Abstract Syntax Tree: Select "r" [From [Alias "Graph" "r"]] [Duo Eq (PathType "r" ["child"]) (Str "/file.txt")]
Response: [(MyNode "/usr/bin/pico" 1,1,[2]), (MyNode "/usr/bin/vi" 2,3,[17,16,15]), (MyNode "/bin/cat" 1,4,[0])]
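The abstract syntax tree on this slide is printed in Haskell constructor syntax. As a rough illustration of how an evaluator might walk such a tree, the sketch below mirrors the constructors (`Select`, `Duo`, `PathType`, `Str`) with Python dataclasses. The semantics are assumed, not taken from the project: `r.child` is read as "the names of r's direct descendants", `Eq` holds when the two sides share a value, and the `Node` type and sample graph are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PathType:
    alias: str
    fields: tuple      # e.g. ("child",)

@dataclass(frozen=True)
class Str:
    value: str

@dataclass(frozen=True)
class Duo:
    op: str            # only "Eq" is handled in this sketch
    left: object
    right: object

@dataclass(frozen=True)
class Select:
    target: str
    source: tuple      # (table, alias), standing in for From [Alias table alias]
    where: Duo

@dataclass(frozen=True)
class Node:            # hypothetical graph row, not the project's MyNode
    name: str
    children: tuple    # names of direct descendants

def resolve(expr, env):
    """Evaluate an expression to a set of string values."""
    if isinstance(expr, Str):
        return {expr.value}
    if isinstance(expr, PathType):
        node = env[expr.alias]
        if expr.fields == ("child",):
            return set(node.children)
        raise ValueError(f"unsupported path: {expr.fields}")
    raise TypeError(f"unsupported expression: {expr!r}")

def holds(cond, env):
    if cond.op != "Eq":
        raise ValueError(f"unsupported operator: {cond.op}")
    return bool(resolve(cond.left, env) & resolve(cond.right, env))

def evaluate(query, graph):
    """Return the graph nodes for which the WHERE condition holds."""
    table, alias = query.source
    if table != "Graph" or query.target != alias:
        raise ValueError("this sketch only supports SELECT r FROM Graph AS r")
    return [node for node in graph if holds(query.where, {alias: node})]

# SELECT r FROM Graph AS r WHERE r.child = "/file.txt"
QUERY = Select("r", ("Graph", "r"),
               Duo("Eq", PathType("r", ("child",)), Str("/file.txt")))

GRAPH = [
    Node("/bin/cat", ("/file.txt",)),
    Node("/usr/bin/vi", ("/file.txt", "/notes.txt")),
    Node("/bin/ls", ()),
]
```

Running `evaluate(QUERY, GRAPH)` keeps only the nodes that have "/file.txt" among their children, which matches the one-step-ancestor use case.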

Design & Implementation: Language Specification, Select Statement

Design & Implementation: Language Specification, Expression

Conclusions: What We Did Well
Functional
– It works. (PQLite > PQL)
Easy to use
– Intuitive (SQL-like) way of querying a provenance graph
– Getting stuff we care about

Conclusions: Lessons Learned
Infinite recursion in parsing
– Left recursion in a recursive descent parser
– Refined syntax
Began coding too soon
Monads are useful
– IO(), Maybe, State, Parsec
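The left-recursion lesson can be reproduced in a few lines. The slides do not show the PQLite grammar, so the dotted-path rule below is a hypothetical stand-in, and the sketch uses plain Python functions rather than the Parsec combinators the project mentions. The first parser follows a left-recursive rule literally and never consumes a token before recursing; the second uses the usual rewrite into a leading item followed by repetition.

```python
def parse_path_left_recursive(tokens, pos=0):
    # path := path "." IDENT | IDENT
    # The first alternative calls this function again at the same position,
    # so no token is ever consumed and the recursion never bottoms out.
    left, pos = parse_path_left_recursive(tokens, pos)
    if pos + 1 < len(tokens) and tokens[pos] == ".":
        return left + [tokens[pos + 1]], pos + 2
    return left, pos

def recurses_forever(tokens):
    """True if the left-recursive parser exhausts the Python call stack."""
    try:
        parse_path_left_recursive(tokens)
    except RecursionError:
        return True
    return False

def parse_path(tokens, pos=0):
    # Refined rule: path := IDENT ("." IDENT)*
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        raise SyntaxError(f"expected identifier at token {pos}")
    parts = [tokens[pos]]
    pos += 1
    # Each loop iteration consumes two tokens, so the parser always advances.
    while (pos + 1 < len(tokens) and tokens[pos] == "."
           and tokens[pos + 1].isidentifier()):
        parts.append(tokens[pos + 1])
        pos += 2
    return parts, pos
```

For the token list `["r", ".", "child", "="]`, `recurses_forever` reports the failure of the first parser, while `parse_path` returns the path components and the position of the first unconsumed token.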

Conclusions: References
1) Margo Seltzer, Kiran-Kumar Muniswamy-Reddy, David A. Holland, Uri Braun, and Jonathan Ledlie. Provenance-Aware Storage Systems. Harvard University Computer Science Technical Report TR-18-05, July
2) Stephanie Jones, Christina Strong, Darrell D. E. Long, and Ethan L. Miller. Tracking Emigrant Data via Transient Provenance. Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11), June
3) Kiran-Kumar Muniswamy-Reddy, Uri Braun, David A. Holland, Peter Macko, Diana Maclean, Daniel Margo, Margo Seltzer, and Robin Smogor. Layering in Provenance Systems. Proceedings of the 2009 USENIX Annual Technical Conference, San Diego, CA, June
4) PQL Language Guide and Reference