Recording Actor Provenance in Scientific Workflows
Ian Wootten, Shrija Rajbhandari, Omer Rana
Cardiff University, UK


What?
- Provenance is concerned with process
  - This may or may not be documented
- Data Provenance: the process which leads to a particular piece of data
- Actor Provenance: the process which leads to a particular actor state
  - How an actor (client or service) arrived at a particular state during an interaction (for stateless actors)

What? Actor Provenance
[Diagram: Service, Enactment Engine, Service; assertion labels A1, A2, B1, B2]
- Interaction Assertions: asserting the contents of a message by an actor sending or receiving it.
- Actor State Assertions: asserting the state of an actor at a particular time during an interaction.

Metrics for Actor State Assertion
- Static: no variation in value over actor lifetime
  - Per Node: node identity, operating system
  - Per Actor: actor identity, name, owner, version
- Dynamic: variation in value over actor lifetime
  - Per Node: memory usage, network traffic
  - Per Actor: execution time, availability
- Instrumented: actor is 'instrumented' at key points in its execution
  - Description of internal data flow
  - E.g. German Aerospace Center (DLR): completion states for action events and file transfers
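The metric taxonomy above could be captured in a small data structure. The following is a minimal sketch, not the authors' representation: the class names (`Kind`, `Scope`, `Metric`, `ActorStateAssertion`), the field names, and all sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Kind(Enum):
    STATIC = "static"              # value fixed over the actor lifetime
    DYNAMIC = "dynamic"            # value varies over the actor lifetime
    INSTRUMENTED = "instrumented"  # emitted at key points inside the actor


class Scope(Enum):
    NODE = "node"    # describes the hosting machine
    ACTOR = "actor"  # describes the client or service itself


@dataclass(frozen=True)
class Metric:
    name: str
    kind: Kind
    scope: Scope
    value: Any


@dataclass
class ActorStateAssertion:
    """Hypothetical record of an actor's state at one time in an interaction."""
    actor_id: str
    interaction_id: str
    timestamp: float
    metrics: list[Metric] = field(default_factory=list)

    def by_kind(self, kind: Kind) -> list[Metric]:
        # Select the metrics belonging to one category of the taxonomy.
        return [m for m in self.metrics if m.kind is kind]


# Illustrative values only; none of them come from the slides.
assertion = ActorStateAssertion(
    actor_id="model-service",
    interaction_id="i-001",
    timestamp=0.0,
    metrics=[
        Metric("operating_system", Kind.STATIC, Scope.NODE, "Linux"),
        Metric("version", Kind.STATIC, Scope.ACTOR, "1.0"),
        Metric("memory_usage_mb", Kind.DYNAMIC, Scope.NODE, 512),
        Metric("execution_time_s", Kind.DYNAMIC, Scope.ACTOR, 2.4),
        Metric("file_transfer_completed", Kind.INSTRUMENTED, Scope.ACTOR, True),
    ],
)
```

Keeping kind and scope as separate fields mirrors the two axes of the slide (static/dynamic/instrumented versus per node/per actor), so a query can filter on either one.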

How? Actor Provenance
[Diagram: Service, Enactment Engine, Service; labels B1, B2, M1, M2; Instrumented Output, Monitor Output]
- Monitoring Sources: service information derived from hosting platform via monitoring sources (e.g. Ganglia)
- Instrumented Actor: service information obtained from instrumented points within an actor.
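As an illustration of the instrumented-actor route, the sketch below records a completion state at each key point of a toy service. It is only one possible approach under stated assumptions: `InstrumentationLog`, `key_point`, and `build_model` are hypothetical names, and the "model" is a plain average standing in for real work.

```python
import time
from contextlib import contextmanager


class InstrumentationLog:
    """Hypothetical collector for events emitted from inside an actor."""

    def __init__(self):
        self.events = []

    @contextmanager
    def key_point(self, name):
        # Wrap one key point of execution and record how it ended.
        start = time.perf_counter()
        state = "completed"
        try:
            yield
        except Exception:
            state = "failed"
            raise
        finally:
            self.events.append({
                "point": name,
                "state": state,
                "elapsed_s": time.perf_counter() - start,
            })


log = InstrumentationLog()


def build_model(rows):
    # Toy service body with two instrumented key points.
    with log.key_point("load_data"):
        data = [float(r) for r in rows]
    with log.key_point("fit_model"):
        return sum(data) / len(data)


result = build_model(["1", "2", "3"])
```

A failing key point is still logged (with state `"failed"`) before the exception propagates, which is the kind of completion-state information the slide associates with instrumentation.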

Why? Standalone and Combined Value
- Standalone State Assertion Value
  - Actor Selection: performance evaluation of past / prediction of future
  - Resource Allocation: actor administrator allocates resources according to performance metrics
- Combined Value: Putting Assertions into Context
  - Interaction, through actor state assertions
    - Determining the likely cause of error / results
    - Understanding what an actor is doing
  - Actor, through interaction assertions
    - Understanding performance pattern observations
    - Understanding instrumented metric observations
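One way to picture "putting assertions into context" is a join of the two assertion types on a shared interaction identifier. The sketch below assumes such an identifier exists on both record types; the function name `put_in_context`, the dictionary keys, and the sample records are all hypothetical.

```python
from collections import defaultdict


def put_in_context(interaction_assertions, state_assertions):
    """Attach actor state assertions to the interaction they were made in."""
    states_by_interaction = defaultdict(list)
    for state in state_assertions:
        states_by_interaction[state["interaction_id"]].append(state)
    return [
        {**ia, "actor_states": states_by_interaction.get(ia["interaction_id"], [])}
        for ia in interaction_assertions
    ]


# Illustrative records only.
interactions = [
    {"interaction_id": "i-001", "message": "build model request", "status": "ok"},
    {"interaction_id": "i-002", "message": "build model request", "status": "error"},
]
states = [
    {"interaction_id": "i-002", "actor_id": "model-service", "memory_mb": 1990},
]

contextual = put_in_context(interactions, states)
```

With the joined view, a reader inspecting the failed interaction `i-002` can see the actor state asserted at the time, which is the starting point for reasoning about a likely cause.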

How? Actor Provenance Registry
- Attempt to provide a mechanism to specify and record actor state assertions for any application
- Generic Mechanism Problems
  - No knowledge of potential resources: monitoring sources, containers
  - No direct knowledge of implementation: instrumented data capture

How? Actor Provenance Registry
- Resource and Rule Registration
  - Resource: monitoring tool
  - Rule: user-defined instructions
- Indirectly from Resources
  - Coordinator polls resources for information
  - Times of interest: service invocation, request
- Directly from Actor
  - Collection of instrumented data
  - Representation?
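The registration and polling flow described above might look roughly like the following. This is a sketch under assumptions rather than the registry's actual interface: the class and method names are invented, rules are modelled as plain Python callables, and the monitoring resource is a stub returning fixed values.

```python
from typing import Any, Callable

Poll = Callable[[], dict]        # a resource returns a snapshot when polled
Extract = Callable[[dict], Any]  # a rule extracts a value from a snapshot


class ActorProvenanceRegistry:
    """Hypothetical registry holding resources, rules, and recorded results."""

    def __init__(self) -> None:
        self._resources: dict[str, Poll] = {}
        self._rules: list[tuple[str, str, Extract]] = []
        self.recorded: list[dict] = []

    def register_resource(self, name: str, poll: Poll) -> None:
        self._resources[name] = poll

    def register_rule(self, name: str, resource: str, extract: Extract) -> None:
        if resource not in self._resources:
            raise KeyError(f"unknown resource: {resource}")
        self._rules.append((name, resource, extract))

    def on_event(self, event: str, interaction_id: str) -> list[dict]:
        # Coordinator step: at a time of interest, poll each needed
        # resource once and apply every registered rule to its snapshot.
        snapshots: dict[str, dict] = {}
        results = []
        for rule_name, resource, extract in self._rules:
            if resource not in snapshots:
                snapshots[resource] = self._resources[resource]()
            results.append({
                "interaction_id": interaction_id,
                "event": event,
                "rule": rule_name,
                "value": extract(snapshots[resource]),
            })
        self.recorded.extend(results)
        return results


poll_count = {"n": 0}


def fake_monitor() -> dict:
    # Stub standing in for a real monitoring tool; values are made up.
    poll_count["n"] += 1
    return {"memory_mb": 512, "load": 0.25}


registry = ActorProvenanceRegistry()
registry.register_resource("monitor", fake_monitor)
registry.register_rule("memory", "monitor", lambda s: s["memory_mb"])
registry.register_rule("load", "monitor", lambda s: s["load"])
assertions = registry.on_event("service_invocation", "i-001")
```

Polling each resource once per event, rather than once per rule, is a design choice of this sketch: it keeps all rule results for one event consistent with a single snapshot.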

How? Actor Provenance Registry
- Integration with PReP [Groth et al.]

Data Mining Prototype
- Record assertions using registry during invocation of a data modelling service
- Service takes incoming data sets and generates a model based upon them
  - Uses Quantitative Structure-Activity Relationship (QSAR) to attempt to correlate biological activity to a chemical compound
  - Larger data set = longer run time

Performance Evaluation
[Figure: performance with no rules, 1 rule, and 5 rules]

Conclusions / Future Work
- Actor Provenance data is important
  - Without it, we don't get the full picture
- Prototype shows that it can be done
  - Room for improvement: interface to monitoring system, caching of results
  - No inclusion of 'instrumented' actor capture
  - Requires service provider adoption to work

Prototype Configuration
- Single machine holding client, service and registry
- Rules executed on invocation of service
  - XQuery
  - Invocations performed 100 times on datasets between 30 KB and 340 KB in size
- Coordinator records rule results to a local file store
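A measurement loop in the spirit of this configuration can be sketched as follows. It is an assumption-laden stand-in, not the prototype's harness: the service and rule are dummies, rules are Python callables rather than XQuery, and only the 100 invocations, the 0/1/5 rule counts, and the 30 KB lower dataset size are taken from the slides.

```python
import time
from statistics import mean


def run_benchmark(service, rules, payload, invocations=100):
    """Return the mean wall-clock time of one invocation plus its rules."""
    timings = []
    for _ in range(invocations):
        start = time.perf_counter()
        service(payload)
        for rule in rules:
            rule()  # each rule runs on every invocation of the service
        timings.append(time.perf_counter() - start)
    return mean(timings)


calls = {"rules": 0}


def dummy_rule():
    # Placeholder for a rule evaluation.
    calls["rules"] += 1


def dummy_service(payload):
    # Placeholder for the data modelling service.
    return len(payload)


payload = b"x" * 30_000  # roughly 30 KB, the smallest dataset size mentioned

results = {
    n: run_benchmark(dummy_service, [dummy_rule] * n, payload)
    for n in (0, 1, 5)
}
```

Comparing `results[0]` with `results[1]` and `results[5]` gives the per-invocation overhead attributable to rule execution; the absolute numbers from this stub say nothing about the real prototype.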