PQLite: An Overly Simplistic Query Language for Data Provenance

1 PQLite: Provenance Query Language (sections: Introduction, Current Work, Design & Implementation, Conclusions)
PQLite: An Overly Simplistic Query Language for Data Provenance
Michael {Leece, Sevilla}
mleece@soe.ucsc.edu, msevilla@soe.ucsc.edu
CMPS203 Final Project
University of California, Santa Cruz, Jack Baskin School of Engineering

2 Overview
Introduction
Current Work
Design and Implementation
Conclusions

3 Introduction: Terminology
Provenance: history + ancestry of an object [1]
– Processes
– Data
Provenance Aware Storage (PASS)
– Transparent collection
PQL: Path Query Language
– Useful for provenance
[Figure: Ancestry Graph]

4 Introduction: Applications
Security
File System Search
The Cloud
New Hierarchical File Systems
Yan Li’s Photo Album

5 Introduction: PQL Broken
Obtained PASSv2
Ran PQL query on provenance database
– Infinite loops
– {}

6 Current Work: PQL Broken
Obtained PASSv2
Ran PQL query on provenance database
– Infinite loops
– {}
“The problem with PQL and Sage is that the implementation… is really slow, and it’s perhaps too easy to generate PQL queries that do not return any data.” – PASS Team

7 Current Work: PQL Undocumented

8 Current Work: Overview
[Diagram labels: Waldo, Database Dump, PASSv2 Modules, Kernel Space, VFS, Lasagna FS, App1, App2, User Space, BDB.twig]

9 Design & Implementation: Use Case
What we have
[ P ] 1.0 INODE 4 INODE 12
[ P ] 1.0 NAME 9 "/file.txt"
[ P ] 1.0 TYPE 4 "FILE"
[ P ] 1.0 FREEZETIME 8 TIME 1329510432.493134083
[ P ] 1.0 FREEZETIME 8 TIME 1329510618.420311721
[ P ] 1.0 FREEZETIME 8 TIME 1329510676.040716382
[AP ] 1.1 INPUT 12 --> 2.1
[AP ] 1.2 INPUT 12 --> 8.1
[AP ] 1.3 INPUT 12 --> 16.2
[ PT] 2.0 ARGV 4 [1]"cat"
[ PT] 2.0 ENV 64 [2]"SHELL=/bin/bash" [3]"TERM=xterm" [4]"XDG_SESSION_COOKIE=06c3f2775eb071081dfacb984bf6c364-1329508695.722050-291519720" [5]"USER=root" [6]"LS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.svgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:" [7]"MAIL=/var/mail/root" [8]"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin" [9]"PWD=/test" [10]"LANG=en_US.UTF-8" [11]"SHLVL=1" [12]"HOME=/root" [13]"LOGNAME=root" [14]"LESSOPEN=| /usr/bin/lesspipe %s" [15]"LESSCLOSE=/usr/bin/lesspipe %s %s" [16]"_=/bin/cat" [17]"OLDPWD=/"
[ ] 2.0 EXECTIME 8 TIME 1329510428.104272662
[ P ] 2.0 TYPE 4 "PROC"
[ ] 2.0 PID 4 INT 13739
[ P ] 2.0 NAME 8 "/bin/cat"
[A ] 2.0 FORKPARENT 12 --> 14762.0
[ P ] 2.0 FREEZETIME 8 TIME 1329510428.104272662
What we want
– A list of files or processes that are one-step ancestors of "/file.txt"
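The INPUT records in the dump above are the ones that carry ancestry edges. As a minimal sketch of how such records could be pulled out of the dump text, assuming each record sits on its own line and that fields are whitespace-separated, the following Haskell uses only the Prelude and Data.Maybe. The names Edge, dropFlags, parseInput and inputEdges are hypothetical and are not taken from the PQLite source.

```haskell
import Data.Maybe (mapMaybe)

-- | A versioned object identifier as printed in the dump, e.g. "1.1".
type NodeId = String

-- | One ancestry edge: the object on the left lists the object on the
--   right as an INPUT.
data Edge = Edge { edgeFrom :: NodeId, edgeTo :: NodeId }
  deriving (Eq, Show)

-- | Drop the leading "[...]" flag field, if the line has one.
dropFlags :: String -> String
dropFlags ('[' : rest) = drop 1 (dropWhile (/= ']') rest)
dropFlags s            = s

-- | Recognise records shaped like "[AP ] 1.1 INPUT 12 --> 2.1".
--   Every other record type yields Nothing.
parseInput :: String -> Maybe Edge
parseInput line =
  case words (dropFlags line) of
    [src, "INPUT", _size, "-->", dst] -> Just (Edge src dst)
    _                                 -> Nothing

-- | Collect the edges from a list of dump lines, skipping non-INPUT records.
inputEdges :: [String] -> [Edge]
inputEdges = mapMaybe parseInput
```

Loaded in GHCi, inputEdges applied to the dump lines keeps only the INPUT records; NAME, TYPE, ENV and the other record types are ignored rather than reported as errors.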

10 Design & Implementation: Use Case (cont.)
Query: SELECT r FROM Graph AS r WHERE r.child = "/file.txt"
Abstract Syntax Tree: Select "r" [From [Alias "Graph" "r"]] [Duo Eq (PathType "r" ["child"]) (Str "/file.txt")]
Label Map: 1 -> file.txt, 2 -> jazz.jpg, 3 -> bacon.txt, …
Response: [(MyNode "/usr/bin/pico" 1,1,[2]), (MyNode "/usr/bin/vi" 2,3,[17,16,15]), (MyNode "/bin/cat" 1,4,[0])]
[Diagram components: Waldo Database Dump, Dump Parser, Ancestry Graph, Label Map, Query Parser, Abstract Syntax Tree, Evaluator]
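To make the query-to-response step concrete, here is a small sketch in Haskell. The data types are a guess reconstructed from the AST printed on the slide so that the printed value type-checks; they are not the PQLite definitions. The Graph type, the sample edges, and the functions field, holds and evalQuery are all hypothetical, and the evaluator returns plain parent names instead of the MyNode tuples shown in the response.

```haskell
-- Hypothetical AST, shaped so that the value printed on the slide type-checks.
data Query = Select String [From] [Expr] deriving (Eq, Show)
data From  = From [Alias]                deriving (Eq, Show)
data Alias = Alias String String         deriving (Eq, Show)
data Op    = Eq                          deriving (Eq, Show)
data Expr  = Duo Op Expr Expr
           | PathType String [String]
           | Str String
           deriving (Eq, Show)

-- | Assumed graph representation: one (parent, child) pair per edge.
type Graph = [(String, String)]

-- | The AST from the slide, written as a Haskell value.
slideQuery :: Query
slideQuery =
  Select "r" [From [Alias "Graph" "r"]]
             [Duo Eq (PathType "r" ["child"]) (Str "/file.txt")]

-- | Resolve a path such as ["child"] against one edge.
field :: [String] -> (String, String) -> Maybe String
field ["child"]  (_, c) = Just c
field ["parent"] (p, _) = Just p
field _          _      = Nothing

-- | Check one WHERE condition against one edge bound to the variable `var`.
--   Only the "path = string" form is handled in this sketch.
holds :: String -> (String, String) -> Expr -> Bool
holds var row (Duo Eq (PathType v path) (Str s)) =
  v == var && field path row == Just s
holds _ _ _ = False

-- | Return the parent of every edge that satisfies all conditions.
--   The FROM clause is ignored because there is only one graph here.
evalQuery :: Graph -> Query -> [String]
evalQuery g (Select var _ conds) =
  [ p | row@(p, _) <- g, all (holds var row) conds ]

-- | Made-up edges for illustration only.
sampleGraph :: Graph
sampleGraph =
  [ ("/usr/bin/pico", "/file.txt")
  , ("/usr/bin/vi",   "/file.txt")
  , ("/bin/cat",      "jazz.jpg")
  , ("/bin/cat",      "/file.txt")
  ]
```

With these definitions, evalQuery sampleGraph slideQuery selects the three parents of "/file.txt" in edge order, which corresponds to the one-step-ancestor use case.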

13 Design & Implementation: Language Specification (Select Statement)

14 Design & Implementation: Language Specification (Select Statement)

15 Design & Implementation: Language Specification (Expression)

16 Design & Implementation: Language Specification (Expression)

18 Conclusions: What We Did Well
Functional
– It works. (PQLite > PQL)
Easy to use
– Intuitive (SQL-like) way of querying a provenance graph
– Getting stuff we care about

19 Conclusions: Lessons Learned
Infinite recursion in parsing
– Left recursion in a recursive descent parser
– Refined syntax
Began coding too soon
Monads are useful
– IO(), Maybe, State, Parsec
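The left-recursion lesson can be illustrated with a small sketch. The grammar rule below is an assumption chosen for illustration (a dotted path such as r.child) and is not taken from the PQLite grammar; the parser is hand-rolled over String with the Maybe monad rather than written with Parsec, so that it depends only on base. The names Parser, ident, path and segments are hypothetical.

```haskell
import Data.Char (isAlpha, isAlphaNum)

-- | A parser consumes a prefix of the input and returns a value plus the rest.
type Parser a = String -> Maybe (a, String)

-- | Parse one identifier such as "r" or "child".
ident :: Parser String
ident s = case span isAlphaNum s of
  (name@(c : _), rest) | isAlpha c -> Just (name, rest)
  _                                -> Nothing

-- Left-recursive rule: a direct transcription would start `path` by calling
-- `path` again on the same input, so it never consumes anything and never
-- terminates:
--
--   path ::= path "." ident | ident
--
-- Equivalent rule without left recursion, which is what is implemented here:
--
--   path ::= ident ("." ident)*
path :: Parser [String]
path s = do
  (first, rest) <- ident s
  let (more, rest') = segments rest
  Just (first : more, rest')

-- | Zero or more ".ident" segments; stops at the first thing that does not fit.
segments :: String -> ([String], String)
segments ('.' : s)
  | Just (name, rest) <- ident s =
      let (more, rest') = segments rest
      in  (name : more, rest')
segments s = ([], s)
```

Each call to path consumes at least one identifier before it can loop, which is what guarantees termination. In Parsec the same repetition is usually expressed with a combinator such as sepBy1 instead of explicit recursion.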

20 Conclusions: References
1) Margo Seltzer, Kiran-Kumar Muniswamy-Reddy, David A. Holland, Uri Braun, and Jonathan Ledlie. Provenance-Aware Storage Systems. Harvard University Computer Science Technical Report TR-18-05, July 2005.
2) Stephanie Jones, Christina Strong, Darrell D. E. Long, and Ethan L. Miller. Tracking Emigrant Data via Transient Provenance. Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP '11), June 2011.
3) Kiran-Kumar Muniswamy-Reddy, Uri Braun, David A. Holland, Peter Macko, Diana Maclean, Daniel Margo, Margo Seltzer, and Robin Smogor. Layering in Provenance Systems. Proceedings of the 2009 USENIX Annual Technical Conference, San Diego, CA, June 2009.
4) PQL Language Guide and Reference.

