1
Karma Provenance Framework v2 Provenance Challenge Workshop/GGF18 Yogesh L. Simmhan Beth Plale, Dennis Gannon, Srinath Perera Indiana University
2
[2/25] 2006-09-13 Outline Architecture of Karma Workflow Setup & Collecting Provenance Provenance Traces “canonical” Challenge Queries Suggested Variations
3
[3/25] 2006-09-13 Provenance Collection: Challenges & Uses Linked Environments for Atmospheric Discovery (LEAD) project Weather & Severe Storm Prediction Applications Provenance on workflow (process) & data products at fine granularity Dynamic, Long running workflows Helps scientists to search for workflows & data products estimate data quality, track workflow execution, and analyze & mine data products from runs
4
[4/25] 2006-09-13 Karma Provenance Framework Lightweight – do not duplicate existing metadata cataloging effort myLEAD personal metadata catalog ResCat service & data registry Glue to integrate metadata on data & services with runtime workflow information Scalability 1 – 500 users, 100’s of workflows, 10,000’s of data products [1] [1] Performance Evaluation of the Karma Provenance Framework, Simmhan, Y., et al.; IPAW, 2006
5
[5/25] 2006-09-13 Karma Provenance Framework Key Provenance Activities generated during lifetime of workflow Workflow | Service Invoked Data Consumed Data Produced Sending Response Activities modeled as XML messages Published asynchronously by service|workflow|client Presently use WS-Eventing messaging system Activities stored in relational database
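As a rough illustration of an activity modeled as an XML message, the sketch below builds a "Data Produced" activity with Python's standard library. The element names, the `type` attribute, and the workflow and service IDs are hypothetical; they are not Karma's actual message schema, which the slides do not show.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_data_produced_activity(workflow_id, service_id, data_id, timestamp=None):
    """Return a 'DataProduced' activity serialized as an XML string.

    All element and attribute names here are hypothetical, not Karma's schema.
    """
    timestamp = timestamp or datetime.now(timezone.utc)
    activity = ET.Element("activity", {"type": "DataProduced"})
    ET.SubElement(activity, "workflowID").text = workflow_id
    ET.SubElement(activity, "serviceID").text = service_id
    ET.SubElement(activity, "dataProductID").text = data_id
    ET.SubElement(activity, "timestamp").text = timestamp.isoformat()
    return ET.tostring(activity, encoding="unicode")


message = build_data_produced_activity(
    workflow_id="urn:example:workflow-1",      # hypothetical ID
    service_id="urn:example:ConvertService",   # hypothetical ID
    data_id="lead:uuid:1157946992-atlas-x.gif",
    timestamp=datetime(2006, 9, 13, 12, 0, tzinfo=timezone.utc),
)
print(message)
```

A real publisher would hand the serialized message to the messaging system instead of printing it.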
6
[6/25] 2006-09-13 Karma Architecture [Diagram. Components: Workflow Engine; 1 Workflow Instance of Service 1 … Service 10, with 10 Data Products Consumed (10C) & Produced (10P) by each Service; Message Bus (WS-Eventing) with WS-Messenger Notification Broker; Karma Provenance Service with Service API, Provenance Listener, Activity DB, and Provenance Query API; Provenance Browser Client. Flows: Publish Provenance Activities as async Notifications (WorkflowInvoked & SendingResponse Activities; ServiceInvoked & SendingResponse, Data-Produced & -Consumed Activities); Subscribe & Listen to Activity Notifications; Query for Workflow, Process, & Data Provenance.] [1] A Framework for Collecting Provenance in Data-Centric Scientific Workflows, Simmhan, Y., et al., Submitted to ICWS Conference, 2006
7
[7/25] 2006-09-13 Provenance Challenge Workflow Applications modeled as web-services Generic Factory toolkit creates web-service wrappers for command-line applications Service invokes a shell-script/application, passing command-line arguments Created services automatically instrumented to generate provenance using Karma client library Workflow composed as GPEL * script XBaya Workflow composer GUI Central GPEL workflow engine orchestrates execution *Grid Process Execution Language, an extension of the Business Process Execution Language (BPEL)
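The wrapping idea on this slide can be sketched as follows: a function runs a command-line application and emits provenance activities around the call. This is only an assumed shape, not the Generic Factory toolkit or the Karma client library; the dictionary layout, the `publish` callback, and `ExampleService` are hypothetical, and publishing here is synchronous for simplicity, whereas Karma publishes asynchronously.

```python
import subprocess
import sys


def run_with_provenance(service_id, command, inputs, outputs, publish):
    """Run a command-line application and publish provenance activities.

    `publish` is any callable that accepts one activity dict. The activity
    names follow the slides; the dict layout is an assumption.
    """
    publish({"activity": "ServiceInvoked", "service": service_id})
    for data_id in inputs:
        publish({"activity": "DataConsumed", "service": service_id, "data": data_id})

    # Invoke the wrapped application, passing its command-line arguments.
    result = subprocess.run(command, capture_output=True, text=True)

    # Only report outputs as produced if the application succeeded.
    if result.returncode == 0:
        for data_id in outputs:
            publish({"activity": "DataProduced", "service": service_id, "data": data_id})
    publish({"activity": "SendingResponse", "service": service_id,
             "exit_code": result.returncode})
    return result


activities = []
result = run_with_provenance(
    service_id="ExampleService",                      # hypothetical service
    command=[sys.executable, "-c", "print('done')"],  # stand-in application
    inputs=["in-1"],
    outputs=["out-1"],
    publish=activities.append,
)
```

Collecting activities in a list keeps the sketch testable; a real wrapper would pass a publisher bound to the notification broker.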
8
[8/25] 2006-09-13 Provenance Challenge Workflow
9
[9/25] 2006-09-13 Provenance Traces – Building Block Queries Data Provenance: get[Recursive]DataProvenance What (ID), where (URL), when (Timestamp) How (Process, inputs)
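One way to picture the recursive variant of this query is a traversal over per-product records, each carrying the what, where, when, and how listed above. The record fields, the in-memory `store`, and the IDs and URLs below are hypothetical; the sketch does not reproduce Karma's return types.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProvenance:
    data_id: str        # what
    url: str            # where
    timestamp: str      # when
    produced_by: str    # how: the producing process
    inputs: List[str] = field(default_factory=list)  # how: input data IDs


def get_recursive_data_provenance(store: Dict[str, DataProvenance], data_id: str):
    """Return provenance records for data_id and all of its ancestors."""
    records, pending, seen = [], [data_id], set()
    while pending:
        current = pending.pop()
        # Skip products already visited or with no recorded provenance.
        if current in seen or current not in store:
            continue
        seen.add(current)
        record = store[current]
        records.append(record)
        pending.extend(record.inputs)
    return records


# Hypothetical three-level lineage: a -> b -> c, where "a" has no record.
store = {
    "c": DataProvenance("c", "file:///tmp/c", "2006-09-13T12:02:00Z", "step-2", ["b"]),
    "b": DataProvenance("b", "file:///tmp/b", "2006-09-13T12:01:00Z", "step-1", ["a"]),
}
lineage = get_recursive_data_provenance(store, "c")
```

The `seen` set guards against revisiting a product that feeds more than one downstream step.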
10
[10/25] 2006-09-13 Provenance Traces – Building Block Queries Process Provenance: getProcessProvenance What (ID), when (Timestamp), who (Invoker) State (execution/completion status) Input & Output data products
11
[11/25] 2006-09-13 Provenance Traces – Building Block Queries Workflow Trace: getWorkflowTrace What (ID), when (Timestamp), who (Invoker) State (execution/completion status) Process provenance of workflow steps
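The relationship between process provenance and a workflow trace can be sketched as nested records: a trace holds the process provenance of each step. The class layout, the `data_flow` helper, and all IDs below are assumptions made for illustration, not Karma's data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessProvenance:
    process_id: str     # what
    timestamp: str      # when
    invoker: str        # who
    state: str          # execution/completion status
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class WorkflowTrace:
    workflow_id: str
    timestamp: str
    invoker: str
    state: str
    steps: List[ProcessProvenance] = field(default_factory=list)

    def data_flow(self):
        """Return (producer, consumer, data_id) edges between steps."""
        producers = {d: s.process_id for s in self.steps for d in s.outputs}
        return [
            (producers[d], s.process_id, d)
            for s in self.steps
            for d in s.inputs
            if d in producers
        ]


# Hypothetical three-step trace.
trace = WorkflowTrace(
    "wf-1", "2006-09-13T12:00:00Z", "user-1", "completed",
    steps=[
        ProcessProvenance("softmean", "t1", "wf-1", "completed", outputs=["atlas.img"]),
        ProcessProvenance("slicer", "t2", "wf-1", "completed",
                          inputs=["atlas.img"], outputs=["atlas-x.pgm"]),
        ProcessProvenance("convert", "t3", "wf-1", "completed",
                          inputs=["atlas-x.pgm"], outputs=["atlas-x.gif"]),
    ],
)
edges = trace.data_flow()
```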
12
[12/25] 2006-09-13
13
[13/25] 2006-09-13 Provenance Challenge Queries ! Answered by Karma Service API Directly Answered by Karma Service API, with post-processing by client ~ Answered by access to backend DB (SQL) Not answered Query 1 2 3 4 5 6 7 8 9 Result ! ! ~ ~ ~ ~
14
[14/25] 2006-09-13 Provenance Challenge Queries: Q1 Find everything that caused Atlas X Graphic to be as it is ! Answered by Karma Service API Directly This is the recursive data provenance of the Atlas X Graphic file A call to getRecursiveDataProvenance('lead:uuid:1157946992-atlas-x.gif') returns this [www]
15
[15/25] 2006-09-13 Provenance Challenge Queries: Q2 Find the process that led to Atlas X Graphic, excluding all prior to softmean Answered by Karma Service API, with post-processing by client 1. First call getDataProvenance 2. Then recursively get data provenance till ‘SoftmeanService’ is seen Returns this [www]
1. let $dataList := ['lead:uuid:1157946992-atlas-x.gif']
2. while ($dataList != empty) do
   // get data provenance for this level
   a. $dataProvenance = karma.getDataProvenance($dataList[0])
   // print process information & remove data from list
   b. Print $dataProvenance; $dataList.delete(0)
   c. if ($dataProvenance.getProducedBy() == 'SoftmeanService') break; // found Softmean. Stop.
   // get input data used by this data & recurse up the tree
   d. foreach ($inputData in $dataProvenance.getUsingData()) do
      i. $dataList.add($inputData)
3. End
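The client-side post-processing loop above could look roughly like this in Python. `StubKarmaClient` is an in-memory stand-in, not the real Karma client; the record layout and every service name other than `SoftmeanService` are hypothetical.

```python
from collections import deque


class StubKarmaClient:
    """In-memory stand-in for a Karma client; not the real API."""

    def __init__(self, records):
        self._records = records

    def get_data_provenance(self, data_id):
        return self._records.get(data_id)


def provenance_until(client, data_id, stop_service):
    """Walk up from data_id, collecting records until stop_service is seen."""
    trail, pending = [], deque([data_id])
    while pending:
        record = client.get_data_provenance(pending.popleft())
        if record is None:      # no provenance recorded for this product
            continue
        trail.append(record)
        if record["produced_by"] == stop_service:
            break               # found the stop service; go no further
        pending.extend(record["inputs"])
    return trail


# Hypothetical records for a four-step lineage.
client = StubKarmaClient({
    "atlas-x.gif": {"produced_by": "ConvertService", "inputs": ["atlas-x.pgm"]},
    "atlas-x.pgm": {"produced_by": "SlicerService", "inputs": ["atlas.img"]},
    "atlas.img": {"produced_by": "SoftmeanService", "inputs": ["resliced.img"]},
    "resliced.img": {"produced_by": "ResliceService", "inputs": []},
})
trail = provenance_until(client, "atlas-x.gif", "SoftmeanService")
```

As in the slide's pseudocode, the walk stops at the first record produced by the stop service, so anything upstream of it is never fetched.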
16
[16/25] 2006-09-13 Provenance Challenge: Q4 Find all invocations of align_warp with parameter "-m 12" that ran on a Monday ~ Answered by access to backend DB (SQL) 1. Use SQL query to get matching invocations 2. Call getProcessProvenance to get description of align_warp Returns this [www]
SELECT invokee.workflow_id, invokee.service_id, invokee.workflow_node_id, invokee.workflow_timestep,
       invoker.workflow_id, invoker.service_id, invoker.workflow_node_id, invoker.workflow_timestep
FROM invocation_state_table invocation, entity_table invokee, entity_table invoker, notification_table notifications
WHERE invokee.entity_id = invocation.invokee_id
  AND invoker.entity_id = invocation.invoker_id
  AND notifications.source_id = invocation.invokee_id
  AND notifications.notification_type = 'ServiceInvoked'
  AND invokee.service_id = 'urn:qname:http://www.extreme.indiana.edu/karma/challenge06:AlignWarpService'
  AND notifications.notification_xml LIKE '% 12 %'
  AND DayOfWeek(invocation.request_receive_time) = 2; -- 1=Sunday, 2=Monday,...
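The filtering logic of this query can be tried out on a toy table with Python's built-in `sqlite3` module. The single-table schema and the sample rows are simplifications invented for this sketch, not the Karma schema. SQLite has no `DayOfWeek` function, so the sketch uses `strftime('%w', ...)`, which numbers days from 0=Sunday, making Monday 1 instead of 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: one row per service invocation.
conn.execute(
    "CREATE TABLE invocations ("
    " service_id TEXT, request_receive_time TEXT, notification_xml TEXT)"
)
conn.executemany(
    "INSERT INTO invocations VALUES (?, ?, ?)",
    [
        ("AlignWarpService", "2006-09-11 10:00:00", "<args>-m 12 -q</args>"),  # Monday
        ("AlignWarpService", "2006-09-12 10:00:00", "<args>-m 12 -q</args>"),  # Tuesday
        ("AlignWarpService", "2006-09-11 11:00:00", "<args>-m 6 -q</args>"),   # Monday
    ],
)
rows = conn.execute(
    "SELECT service_id, request_receive_time FROM invocations"
    " WHERE service_id = ?"
    " AND notification_xml LIKE ?"
    " AND strftime('%w', request_receive_time) = '1'",  # SQLite: 0=Sunday, 1=Monday
    ("AlignWarpService", "% 12 %"),
).fetchall()
print(rows)
```

Matching the parameter with `LIKE '% 12 %'` is a plain substring test on the notification text, so it would also match any other argument whose value is 12.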
17
[17/25] 2006-09-13 Provenance Challenge: Q9 Find all the graphical atlas sets that have metadata annotation studyModality with values speech, visual or audio, and return all other annotations to these files. Not answered We do not expect to answer such queries through the provenance system We push the provenance information to external metadata management systems such as myLEAD, which can answer such “join” queries on data product metadata and provenance
18
[18/25] 2006-09-13 Variations of Workflow Workflows with loops Workflows whose structure changes dynamically or, as a simpler case, workflows with conditional branches Hierarchical composition of workflows workflows invoking other workflows ~Similar to user-views (UPenn), nested-workflows (myGrid), …
19
[19/25] 2006-09-13 Variations of Queries Find all [workflows | processes] with a particular execution status [completed | failed | waiting for input] Dynamic attribute of provenance? Query for client view and service view of the provenance Check for differences
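Both suggested variations can be sketched over plain dictionaries: filtering records by execution status, and comparing a client view of an invocation against the service view. The record fields and sample values are hypothetical; the slides do not define how Karma would expose either query.

```python
def with_status(records, status):
    """Return the records whose execution state equals `status`."""
    return [r for r in records if r["state"] == status]


def view_differences(client_view, service_view):
    """Compare two provenance views keyed by field name.

    Returns {field: (client_value, service_value)} for fields that differ
    or that appear in only one view (the missing side is None).
    """
    fields = set(client_view) | set(service_view)
    return {
        name: (client_view.get(name), service_view.get(name))
        for name in sorted(fields)
        if client_view.get(name) != service_view.get(name)
    }


# Hypothetical process records and views.
processes = [
    {"id": "p1", "state": "completed"},
    {"id": "p2", "state": "failed"},
    {"id": "p3", "state": "completed"},
]
failed = with_status(processes, "failed")

client_view = {"state": "completed", "outputs": ["d1"]}
service_view = {"state": "failed", "outputs": ["d1"], "host": "node-7"}
diff = view_differences(client_view, service_view)
```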
20
Acknowledgements Alek Slominski (GPEL Engine) Satoshi Shirasuna (XBaya Composer) LEAD Members NSF Questions www.extreme.indiana.edu/karma
21
[21/25] 2006-09-13 Sample Activities Published More here [www]
22
[22/25] 2006-09-13 Karma DB Schema