Provenance Challenge, Sept. 2006: Modeling Provenance through User Views. Sarah Cohen-Boulakia, Shirley Cohen, Susan Davidson, Thunyarat (Bam) Amornpetchkul.

Presentation transcript:

Provenance Challenge, Sept. 2006
Modeling Provenance through User Views
Sarah Cohen-Boulakia, Shirley Cohen, Susan Davidson, Thunyarat (Bam) Amornpetchkul, Olivier Biton
Database group, University of Pennsylvania

Our approach
- Model of provenance
  - Based on a study of user requirements (CIPRES)
  - Based on careful studies of workflow systems (Kepler, MyGrid, Chimera)
  - Minimal information to reason about provenance
  - No workflow system is proposed
- User views
  - Capability of workflow systems to group steps (forming boxes) and to zoom into boxes
  - Multi-granularity levels of provenance
- Implemented in Oracle 10g and Java
  - Relational framework augmented with transitive closure
  - Java/Spring/JDBC: object layer and user interface

Workflow Representation
- Terminology
  - Step-classes (static)
  - An execution of a workflow generates a partial order of steps (dynamic)
    - Steps are instances of step-classes
  - Each step has input and output data
- Figure labels: "8.reslice" is a step; "reslice" is a step-class; input data; output data

Provenance Trace
- Base tables
  - Data(dataid, name, type), DataAttributes(dataid, attribute, value)
    - Data(1, Anatomy Image1, Anatomy Image)
    - DataAttributes(1, center, UChicago), i.e. Center = UChicago
  - InstanceOf(Step, Step-Class, ts), StepParams(step, attribute, value), StageInstance(step, stage)
  - Input(stepId, dataId, ts) / Output(stepId, dataId, ts): stepId takes as input / produces dataId at time ts
- Views
  - Process(stepId, stepClass, input, output, time)
  - …
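The base tables and the Process view can be tried out in a few lines. The following is a minimal sketch, not the authors' Oracle schema: it uses Python's built-in sqlite3 module, the column types are guesses, the underscore in `step_class` replaces the hyphen, and the definition of the Process view (a join of InstanceOf, Input and Output) is an assumption, since the slide gives only its column list. Every row other than Data(1, ...) and DataAttributes(1, center, UChicago) is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Data(dataid INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE DataAttributes(dataid INTEGER, attribute TEXT, value TEXT);
CREATE TABLE InstanceOf(step INTEGER PRIMARY KEY, step_class TEXT, ts INTEGER);
CREATE TABLE Input(stepId INTEGER, dataId INTEGER, ts INTEGER);
CREATE TABLE Output(stepId INTEGER, dataId INTEGER, ts INTEGER);

-- Assumed view definition: one row per (step, input item, output item).
CREATE VIEW Process AS
SELECT io.step AS stepId, io.step_class AS stepClass,
       i.dataId AS input, o.dataId AS output, o.ts AS time
FROM InstanceOf io
JOIN Input  i ON i.stepId = io.step
JOIN Output o ON o.stepId = io.step;
""")

# The first Data row and the DataAttributes row come from the slide;
# the remaining rows are hypothetical.
conn.executemany("INSERT INTO Data VALUES (?, ?, ?)", [
    (1, "Anatomy Image1", "Anatomy Image"),
    (2, "Anatomy Header 1", "Anatomy Header"),
    (3, "Warp param1", "Warp Params"),
])
conn.execute("INSERT INTO DataAttributes VALUES (1, 'center', 'UChicago')")
conn.execute("INSERT INTO InstanceOf VALUES (1, 'align_warp', 10)")
conn.executemany("INSERT INTO Input VALUES (?, ?, ?)", [(1, 1, 10), (1, 2, 10)])
conn.execute("INSERT INTO Output VALUES (1, 3, 11)")

process_rows = conn.execute(
    "SELECT stepId, stepClass, input, output FROM Process ORDER BY input"
).fetchall()
center = conn.execute(
    "SELECT value FROM DataAttributes WHERE dataid = 1 AND attribute = 'center'"
).fetchone()[0]
```

Under this assumed definition, a step with two inputs and one output contributes two Process rows, which is the shape a recursive query over input and output expects.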

Provenance Queries
Q1: Find the process that led to Atlas X Graphic / everything that caused Atlas X Graphic to be as it is.
- Implements transitive closure, which is necessary to return all the data used (recursively) to compute Atlas X Graphic.

    SELECT DISTINCT stepId, stepClass, input, output
    FROM Process
    START WITH output = (SELECT dataid FROM Data WHERE name = 'Atlas X Graphic')
    CONNECT BY PRIOR input = output
    ORDER BY stepId;
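Oracle's CONNECT BY is vendor-specific. As a portable illustration of the same backward traversal, here is a minimal sketch that swaps in a standard recursive common table expression, run through Python's sqlite3 module. It is not the authors' code: the Process table is simplified to hold data names directly, and all rows are hypothetical, loosely following the challenge workflow's step-classes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Process(step INTEGER, step_class TEXT, input TEXT, output TEXT)"
)
# Hypothetical trace: one chain leading to 'Atlas X Graphic' plus an unrelated step.
conn.executemany("INSERT INTO Process VALUES (?, ?, ?, ?)", [
    (1, "align_warp", "Anatomy Image1", "Warp param1"),
    (1, "align_warp", "Reference Image", "Warp param1"),
    (2, "reslice", "Warp param1", "Resliced Image1"),
    (3, "softmean", "Resliced Image1", "Atlas Image"),
    (4, "slicer", "Atlas Image", "Atlas X Slice"),
    (5, "convert", "Atlas X Slice", "Atlas X Graphic"),
    (6, "convert", "Atlas Y Slice", "Atlas Y Graphic"),
])

# Start from the rows that output the target, then repeatedly add rows whose
# output is an input of a row already found (the role of PRIOR input = output).
DEEP_PROVENANCE = """
WITH RECURSIVE lineage(step, step_class, input, output) AS (
    SELECT step, step_class, input, output FROM Process WHERE output = ?
    UNION
    SELECT p.step, p.step_class, p.input, p.output
    FROM Process p JOIN lineage l ON p.output = l.input
)
SELECT DISTINCT step, step_class, input, output
FROM lineage ORDER BY step, input
"""

rows = conn.execute(DEEP_PROVENANCE, ("Atlas X Graphic",)).fetchall()
steps = sorted({row[0] for row in rows})
```

Using UNION rather than UNION ALL discards rows already reached, so the recursion terminates even if the same upstream data item is reachable along several paths.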

Provenance Queries (Cont.)
- All the queries can be answered by our system
  - Code available on the TWiki
- Using SQL
  - CONNECT BY operators
  - Joins with several tables (e.g. Parameters, DataAttribute)
  - MINUS and UNION operators
- The generalization of Q7 (difference between workflows) is currently not answerable

Workflow Variant: User Views
- Figure labels: Box1, Box2; users UBio, UBlackBox, UAdmin (UAdmin can see everything)
- What are user views?
  - Level of detail the user wishes to track
  - Permissions given to the user
  - Ability of the user to see / know the sub-steps (distributed computation)
- Why use user views?
  - Throw away unimportant intermediate results
  - Better understanding of the workflow
  - Reduce the amount of work to be redone

Querying within User Views
- Need information from the workflow: step-class containment and user views
  - Cinput(sid, idid, tsi), Coutput(sid, idid, tso)
  - View UProcess(usr, step, step-class, input, output)
- Query: What are all the data items used to produce "Resliced Image1"?

    SELECT *
    FROM UProcess upc
    WHERE usr = :userName
    START WITH outputName = 'Resliced Image1'
    CONNECT BY PRIOR upc.input = upc.output;

- Answers per user
  - UAdmin: Anatomy Header 1, Anatomy Image1, Reference Image, Reference Header, Warp param1
  - UBio: Anatomy Header 1, Anatomy Image1, Reference Image, Reference Header
  - UBlackBox: empty answer!
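The three different answers follow from how each user's boxes hide intermediate data. The following is a minimal sketch in plain Python, under stated assumptions: the three-step workflow fragment, the box assignments per user, and the rule that a data item is hidden when it is produced and consumed only inside one box are all hypothetical choices made to illustrate the idea, not the authors' implementation or their actual views.

```python
from collections import defaultdict

# Hypothetical workflow fragment: step -> (input data, output data).
STEPS = {
    "align_warp": ({"Anatomy Image1", "Anatomy Header 1",
                    "Reference Image", "Reference Header"}, {"Warp param1"}),
    "reslice": ({"Warp param1"}, {"Resliced Image1"}),
    "softmean": ({"Resliced Image1"}, {"Atlas Image"}),
}

# Hypothetical user views: step -> box that the user sees.
USER_VIEWS = {
    "UAdmin": {"align_warp": "align_warp", "reslice": "reslice", "softmean": "softmean"},
    "UBio": {"align_warp": "Box1", "reslice": "Box1", "softmean": "softmean"},
    "UBlackBox": {"align_warp": "Box", "reslice": "Box", "softmean": "Box"},
}


def collapse(view):
    """Return box -> (visible inputs, visible outputs) for one user view."""
    members = defaultdict(set)
    for step, box in view.items():
        members[box].add(step)
    boxes = {}
    for box, steps in members.items():
        consumed = set().union(*(STEPS[s][0] for s in steps))
        produced = set().union(*(STEPS[s][1] for s in steps))
        consumed_outside = set().union(
            *(STEPS[s][0] for s in STEPS if s not in steps))
        # Hidden: produced and consumed inside the box, never used outside it.
        hidden = (produced & consumed) - consumed_outside
        boxes[box] = (consumed - produced, produced - hidden)
    return boxes


def used_to_produce(user, target):
    """Deep provenance of `target`, restricted to what `user` can see."""
    boxes = collapse(USER_VIEWS[user])
    found, frontier = set(), [target]
    while frontier:
        item = frontier.pop()
        for inputs, outputs in boxes.values():
            if item in outputs:
                new_items = inputs - found
                found |= new_items
                frontier.extend(new_items)
    return found


answers = {user: used_to_produce(user, "Resliced Image1") for user in USER_VIEWS}
```

With these assumed boxes, the user who sees every step gets the warp parameters in the answer, the user whose box spans align_warp and reslice loses that intermediate item, and the user who sees a single black box gets nothing because "Resliced Image1" itself is internal to the box.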

Conclusion, Perspectives
- Able to answer the queries, including
  - Data and step provenance
  - Immediate and deep (recursive) provenance
- Variation of the workflow and queries considering user views
  - Multi-granularity levels of provenance
  - Only visible and necessary data are kept
- Open questions
  - What is the meaning of "stage" in a workflow (with respect to user views)?
  - What are we expecting as an answer to the difference between two workflows (cf. query 7)?
  - Are all the procedures of the workflow "biologically significant" (cf. user views)?

Acknowledgements
- Kepler Group: Shawn Bowers, Bertram Ludascher, Timothy McPhillips
- Biologists from the CIPRES project
- Members of the Database group, University of Pennsylvania
- This work is supported by NSF grants , , and

User interface