1
Karma Provenance Framework v2. Provenance Challenge Workshop/GGF18. Yogesh L. Simmhan, Beth Plale, Dennis Gannon, Srinath Perera. Indiana University
2
[2/25] 2006-09-13 Outline: Architecture of Karma; Workflow Setup & Collecting Provenance; Provenance Traces; “Canonical” Challenge Queries; Suggested Variations
3
[3/25] 2006-09-13 Provenance Collection: Challenges & Uses. Linked Environments for Atmospheric Discovery (LEAD) project: weather & severe storm prediction applications. Provenance on workflow (process) & data products at fine granularity. Dynamic, long-running workflows. Helps scientists to search for workflows & data products, track workflow execution, and analyze & mine data products from runs.
4
[4/25] 2006-09-13 Karma Provenance Framework. Lightweight: does not duplicate existing metadata cataloging efforts (myLEAD personal metadata catalog; ResCat service & data registry). Glue to integrate metadata on data & services with runtime workflow information. Scalability: 1–500 users, 100s of workflows, 10,000s of data products [1].
[1] Performance Evaluation of the Karma Provenance Framework, Simmhan, Y., et al.; IPAW, 2006
5
[5/25] 2006-09-13 Karma Architecture [2]
Diagram: a Workflow Engine runs a Workflow Instance of 10 services (Service 1, Service 2, … Service 9, Service 10); 10 data products are consumed & produced by each service (edges labeled 10P/10C).
Message Bus (WS-Messenger Notification Broker, WS-Eventing Service API): provenance activities are published as notifications. These are Application–Started & –Finished and Data–Produced & –Consumed activities, and Workflow–Started & –Finished activities.
Karma Provenance Service (Provenance Listener, Activity DB): subscribes & listens to activity notifications.
Provenance Browser Client: queries for workflow, process, & data provenance through the Provenance Query API.
[2] A Framework for Collecting Provenance in Data-Centric Scientific Workflows, Simmhan, Y., et al., Submitted to ICWS Conference, 2006
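The publish/subscribe flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `NotificationBroker`, `ProvenanceListener`, and the dict-based activity layout are hypothetical in-memory stand-ins, not the WS-Messenger or Karma APIs. Only the activity names come from the slide.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class NotificationBroker:
    """Hypothetical in-memory stand-in for a notification broker."""
    subscribers: list = field(default_factory=list)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, activity):
        # Deliver each published activity to every subscriber.
        for callback in self.subscribers:
            callback(activity)


class ProvenanceListener:
    """Hypothetical listener that stores activities per workflow."""

    def __init__(self):
        self.activity_db = defaultdict(list)  # workflow ID -> activities

    def on_activity(self, activity):
        self.activity_db[activity["workflow_id"]].append(activity)


broker = NotificationBroker()
listener = ProvenanceListener()
broker.subscribe(listener.on_activity)

# A service publishes activities as it runs (activity names from the slide).
for kind in ("ApplicationStarted", "DataConsumed",
             "DataProduced", "ApplicationFinished"):
    broker.publish({"workflow_id": "wf-1", "service_id": "Service1",
                    "type": kind})

recorded = [a["type"] for a in listener.activity_db["wf-1"]]
print(recorded)
```

The point of the design is that the services never talk to the provenance store directly; they only publish, and any number of listeners can subscribe.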
6
[6/25] 2006-09-13 Provenance Challenge Workflow. Applications modeled as web services: the GFac toolkit creates a service for command-line applications; the service invokes a shell-script wrapper of the application, passing command-line arguments; created services are automatically instrumented to generate provenance using the Karma client library. Workflow composed as a GPEL* script: XBaya workflow composer GUI; a central GPEL workflow engine orchestrates execution.
*Grid Process Execution Language, an extension of the Business Process Execution Language (BPEL)
7
[7/25] 2006-09-13 Provenance Challenge Workflow
8
[8/25] 2006-09-13 Provenance Traces. Data Provenance: get[Recursive]DataProvenance. What (ID), where (URL), when (Timestamp); how (Process, inputs).
9
[9/25] 2006-09-13 Provenance Traces. Process Provenance: getProcessProvenance. What (ID), when (Timestamp), who (Invoker); state (execution/completion status); input & output data products.
10
[10/25] 2006-09-13 Provenance Traces. Workflow Trace: getWorkflowTrace. What (ID), when (Timestamp), who (Invoker); state (execution/completion status); process provenance of workflow steps.
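The three trace types on the preceding slides could be modeled roughly as follows. This is a sketch only: the class and field names are hypothetical, chosen to mirror the what/where/when/who/how items on the slides, and are not the actual Karma schema. All values in the example are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class DataProvenance:
    data_id: str          # what
    url: str              # where
    timestamp: datetime   # when
    produced_by: str      # how: the producing process
    inputs: List[str] = field(default_factory=list)  # how: input data IDs


@dataclass
class ProcessProvenance:
    process_id: str       # what
    timestamp: datetime   # when
    invoker: str          # who
    state: str            # execution/completion status
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class WorkflowTrace:
    workflow_id: str      # what
    timestamp: datetime   # when
    invoker: str          # who
    state: str            # execution/completion status
    steps: List[ProcessProvenance] = field(default_factory=list)


# Illustrative values only.
when = datetime(2006, 9, 13, 12, 0)
step = ProcessProvenance("process-1", when, "workflow-engine", "completed",
                         inputs=["data-in"], outputs=["data-out"])
product = DataProvenance("data-out", "http://example.org/data-out", when,
                         produced_by=step.process_id, inputs=step.inputs)
trace = WorkflowTrace("workflow-1", when, "user-1", "completed", steps=[step])
print(trace.steps[0].outputs)
```

A workflow trace is then just the workflow-level fields plus the process provenance of each step, matching the nesting described on the slide.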
11
[11/25] 2006-09-13
12
[12/25] 2006-09-13 Provenance Challenge Queries. Legend: ! Answered by Karma Service API directly; Answered by Karma Service API, with post-processing by client; ~ Answered by access to backend DB (SQL); Not answered.
Query: 1 2 3 4 5 6 7 8 9
Result: ! ! ~ ~ ~ ~
13
[13/25] 2006-09-13 Provenance Challenge Queries: Q1. Find everything that caused Atlas X Graphic to be as it is. ! Answered by Karma Service API directly. This is the recursive data provenance of the Atlas X Graphic file. A call to getRecursiveDataProvenance('lead:uuid:1157946992-atlas-x.gif') returns this [www]
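Recursive data provenance amounts to walking from a data product to the process that produced it, then to that process's inputs, and so on. A minimal sketch, assuming a hypothetical in-memory table in place of the Karma service; the shortened data IDs and the `ConvertService` and `SlicerService` names are illustrative assumptions, not values from the slides:

```python
# Hypothetical table: data ID -> (producing process, input data IDs).
PROVENANCE = {
    "atlas-x.gif": ("ConvertService", ["atlas-x.pgm"]),
    "atlas-x.pgm": ("SlicerService", ["atlas.img"]),
    "atlas.img": ("SoftmeanService", ["resliced1.img", "resliced2.img"]),
}


def get_recursive_data_provenance(data_id):
    """Return a nested dict describing everything that led to data_id."""
    if data_id not in PROVENANCE:
        # No record: treat the data product as an original workflow input.
        return {"data": data_id, "produced_by": None, "inputs": []}
    process, inputs = PROVENANCE[data_id]
    return {
        "data": data_id,
        "produced_by": process,
        "inputs": [get_recursive_data_provenance(i) for i in inputs],
    }


tree = get_recursive_data_provenance("atlas-x.gif")
print(tree["produced_by"])
```

The recursion terminates because each step moves strictly upstream; a real provenance graph with cycles would need a visited set.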
14
[14/25] 2006-09-13 Provenance Challenge Queries: Q2. Find the process that led to Atlas X Graphic, excluding all prior to softmean. Answered by Karma Service API, with post-processing by client: 1. First call getDataProvenance. 2. Then recursively get data provenance till 'SoftmeanService' is seen. Returns this [www]
1. let $dataList := ['lead:uuid:1157946992-atlas-x.gif']
2. while ($dataList != empty) do
   // get data provenance for this level
   a. $dataProvenance = karma.getDataProvenance($dataList[0])
   // print process information & remove data from list
   b. Print $dataProvenance; $dataList.delete(0)
   c. if ($dataProvenance.getProducedBy() == 'SoftmeanService') break; // found Softmean. Stop.
   // get input data used by this data & recurse up the tree
   d. foreach ($inputData in $dataProvenance.getUsingData()) do
      i. $dataList.add($inputData)
3. End
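The client-side post-processing loop above can be written as runnable Python. This is a sketch under assumptions: `get_data_provenance` is a hypothetical stand-in for the Karma call, backed by a made-up in-memory table, and every ID and service name other than `SoftmeanService` is illustrative.

```python
from collections import deque

# Hypothetical records: data ID -> producing process and the data it used.
PROVENANCE = {
    "atlas-x.gif": {"produced_by": "ConvertService",
                    "using_data": ["atlas-x.pgm"]},
    "atlas-x.pgm": {"produced_by": "SlicerService",
                    "using_data": ["atlas.img"]},
    "atlas.img": {"produced_by": "SoftmeanService",
                  "using_data": ["resliced1.img"]},
    "resliced1.img": {"produced_by": "ResliceService",
                      "using_data": ["warp1.warp"]},
}


def get_data_provenance(data_id):
    """Stand-in for the service call; returns None for unknown data."""
    return PROVENANCE.get(data_id)


def processes_back_to(data_id, stop_service="SoftmeanService"):
    """Collect producing processes, walking upstream until stop_service."""
    pending = deque([data_id])
    processes = []
    while pending:
        record = get_data_provenance(pending.popleft())
        if record is None:
            continue  # no provenance recorded for this data product
        processes.append(record["produced_by"])
        if record["produced_by"] == stop_service:
            break  # found Softmean; ignore everything before it
        pending.extend(record["using_data"])
    return processes


result = processes_back_to("atlas-x.gif")
print(result)
```

As in the slide's pseudocode, the walk stops at the first occurrence of the stop service, so `ResliceService` and anything further upstream never appear in the result.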
15
[15/25] 2006-09-13 Provenance Challenge: Q4. Find all invocations of align_warp (with parameter "-m 12") that ran on a Monday. ~ Answered by access to backend DB (SQL): 1. Use SQL query to get matching invocations. 2. Call getProcessProvenance to get description of align_warp. Returns this [www]
SELECT invokee.workflow_id, invokee.service_id,
       invokee.workflow_node_id, invokee.workflow_timestep,
       invoker.workflow_id, invoker.service_id,
       invoker.workflow_node_id, invoker.workflow_timestep
FROM invocation_state_table invocation,
     entity_table invokee,
     entity_table invoker,
     notification_table notifications
WHERE invokee.entity_id = invocation.invokee_id
  AND invoker.entity_id = invocation.invoker_id
  AND notifications.source_id = invocation.invokee_id
  AND notifications.notification_type = 'ServiceInvoked'
  AND invokee.service_id = 'urn:qname:http://www.extreme.indiana.edu/karma/challenge06:AlignWarpService'
  AND notifications.notification_xml LIKE '% 12 %'
  AND DayOfWeek(invocation.request_receive_time) = 2; -- 1=Sunday, 2=Monday,...
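The same filter can be illustrated outside the database. The sketch below assumes a hypothetical list of invocation records (the field names and values are made up) and reproduces the 1=Sunday, 2=Monday numbering noted in the query's comment on top of Python's `datetime.weekday()`, which counts Monday as 0.

```python
from datetime import datetime


def day_of_week_sunday_first(ts):
    """Day number using the slide's convention: 1=Sunday, 2=Monday, ..."""
    return (ts.weekday() + 1) % 7 + 1


# Hypothetical invocation records; 2006-09-11 was a Monday.
invocations = [
    {"service": "AlignWarpService", "params": ["-m", "12"],
     "received": datetime(2006, 9, 11, 10, 0)},   # Monday, matches
    {"service": "AlignWarpService", "params": ["-m", "12"],
     "received": datetime(2006, 9, 12, 10, 0)},   # Tuesday
    {"service": "AlignWarpService", "params": ["-m", "6"],
     "received": datetime(2006, 9, 11, 11, 0)},   # wrong parameter
    {"service": "SoftmeanService", "params": [],
     "received": datetime(2006, 9, 11, 12, 0)},   # wrong service
]

matches = [
    inv for inv in invocations
    if inv["service"] == "AlignWarpService"
    and inv["params"] == ["-m", "12"]
    and day_of_week_sunday_first(inv["received"]) == 2
]
print(len(matches))
```

Comparing the parsed parameter list is stricter than the substring match in the SQL, which would also accept any other notification text containing " 12 ".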
16
[16/25] 2006-09-13 Provenance Challenge: Q9. Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all other annotations to these files. Not answered. We do not expect to answer such queries through the provenance system. We push the provenance information to external metadata management systems such as myLEAD, which can answer such “join” queries on data product metadata and provenance.
17
[17/25] 2006-09-13 Variations of Workflow. Workflows with loops. Workflows whose structure changes dynamically or, as a simpler case, workflows with conditional branches. Hierarchical composition of workflows: workflows invoking other workflows.
18
[18/25] 2006-09-13 Variations of Queries. Find all [workflows | processes] with a particular execution status [completed | failed | waiting for input]. Show the client view and service view of the provenance and check for differences.
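Both suggested query variations reduce to simple operations over provenance records. A rough sketch, assuming hypothetical dict-based records; the field names and values are invented for illustration and do not reflect the Karma API:

```python
def with_state(records, state):
    """Return the IDs of records whose execution status equals state."""
    return [r["id"] for r in records if r["state"] == state]


def view_differences(client_view, service_view):
    """Return {field: (client value, service value)} where the views differ."""
    keys = sorted(client_view.keys() | service_view.keys())
    return {
        k: (client_view.get(k), service_view.get(k))
        for k in keys
        if client_view.get(k) != service_view.get(k)
    }


workflows = [
    {"id": "wf-1", "state": "completed"},
    {"id": "wf-2", "state": "failed"},
    {"id": "wf-3", "state": "waiting for input"},
    {"id": "wf-4", "state": "failed"},
]
failed = with_state(workflows, "failed")

# The two views of one invocation, as reported by the caller and the callee.
client_view = {"process": "Service1", "state": "completed", "outputs": 10}
service_view = {"process": "Service1", "state": "completed", "outputs": 9}
diff = view_differences(client_view, service_view)
print(failed, diff)
```

A field missing from one view shows up in the result with `None` on that side, so omissions are reported along with mismatches.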
19
Acknowledgements: Alek Slominski (GPEL Engine), Satoshi Shirasuna (XBaya Composer), LEAD Members, NSF. Questions? www.extreme.indiana.edu/karma
20
[20/25] 2006-09-13 Sample Activities Published. More here [www]
21
[21/25] 2006-09-13 Karma DB Schema