Published by Shannon Smith. Modified over 9 years ago.
1
Scientific Workflow Interchanging Through Patterns: Reversals and Lessons Learned
Bruno Fernandes Bastos, Regina Maria Maciel Braga, Antônio Tadeu Azevedo Gomes
2
Agenda
Introduction
  – Problem Formulation and Initial Hypothesis
Envisioned Solution
Preliminary Experiments
Reformulated Hypothesis
Qualitative Analysis of the Research Material
  – The myExperiment Repository
Related Work
Conclusions
3
Introduction
Scientific workflows are used for tackling complex problems in different e-science domains
  – A workflow may be described as a directed graph in which the vertices represent the tasks and the edges represent the data relationships between the tasks
Several Scientific Workflow Management Systems (SWfMSs) have been developed for:
  – Specifying scientific workflows with abstractions at a higher level than scripts (Workflow Specification Languages, WfSLs),
  – Orchestrating the execution of the tasks, and
  – Managing the data consumed and produced by these workflows.
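To make the directed-graph view concrete, here is a minimal sketch of a workflow stored as a list of data-dependency edges, with a valid execution order derived by Kahn's topological sort. The task names and the `execution_order` helper are hypothetical and serve only as illustration; they do not come from any of the SWfMSs discussed in this talk.

```python
from collections import deque

# Hypothetical toy workflow: each pair (a, b) means task b consumes data from task a.
EDGES = [
    ("fetch", "align"),
    ("fetch", "annotate"),
    ("align", "report"),
    ("annotate", "report"),
]

def execution_order(edges):
    """Return one valid task order for a workflow DAG (Kahn's algorithm)."""
    successors = {}
    indegree = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        successors.setdefault(dst, [])
        indegree.setdefault(src, 0)
        indegree[dst] = indegree.get(dst, 0) + 1
    # Tasks with no incoming data dependency can run first.
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for nxt in successors[task]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(indegree):
        raise ValueError("workflow graph contains a cycle")
    return order

print(execution_order(EDGES))
```

An orchestrator in a real SWfMS does considerably more (data movement, provenance, failure handling); the sketch only shows why the graph structure is enough to decide which tasks may run before which.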
4
Problem Formulation
We formulated our research problem as follows:
  – The state of the art in SWfMSs does not allow a scientist to easily reuse workflow specifications previously modeled in SWfMSs other than those the scientist is used to working with.
5
Initial Hypothesis
The use of workflow patterns could help in keeping the semantics of a workflow
  – The use of workflow patterns combined with software architecture concepts to capture the key semantics expressed in workflow specifications enables the establishment of automated processes that transform these specifications across different SWfMSs. These processes allow for a reduction in the effort scientists would make to reuse workflow specifications developed by other research groups in SWfMSs that are not part of the usual tooling these scientists employ in their daily work.
6
Envisioned Solution
A novel language for interchanging workflow specifications
  – Using the Acme architecture description interchange language
    It was based on the specification of a single architectural style in which the components were the tasks and the connectors were the patterns
  – Definition of an interchangeable workflow: a workflow composed of a set of “interchangeable elements”
    Constants, subworkflows and web service tasks
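The definition of an interchangeable workflow can be sketched as a simple membership check over task types. This is an illustration under assumptions: the type labels, the flat list representation, and the helper names are hypothetical, and nested subworkflows are not inspected recursively.

```python
# Assumed labels for the three interchangeable element kinds named on the slide.
INTERCHANGEABLE_TYPES = {"constant", "subworkflow", "webservice"}

def is_interchangeable(task_types):
    """True when every task in the workflow is an interchangeable element."""
    return all(t in INTERCHANGEABLE_TYPES for t in task_types)

def interchangeable_share(task_types):
    """Fraction of tasks in the workflow that are interchangeable elements."""
    if not task_types:
        return 0.0
    hits = sum(1 for t in task_types if t in INTERCHANGEABLE_TYPES)
    return hits / len(task_types)

# Hypothetical workflows described only by the types of their tasks.
wf_ok = ["constant", "webservice", "subworkflow"]
wf_mixed = ["constant", "webservice", "local_script", "beanshell"]

print(is_interchangeable(wf_ok), is_interchangeable(wf_mixed))
print(interchangeable_share(wf_mixed))
```

Under this reading, a single task of any other type is enough to make the whole workflow non-interchangeable, which is why the mix of task types available in each SWfMS matters for the experiments.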
7
Envisioned Solution
Patterns
Structural
  – Sequence: binds a single output port to a single input port;
  – Parallel Split: binds a single output port to two or more input ports, replicating the same data from the output port to all input ports;
  – Simple Merge: binds two or more output ports to a single input port, feeding the input port with data received from each output port in an interleaved way.
Behavioral
  – Synchronization: similar in structure to the Simple Merge pattern, but the task with the input port may only be executed when data coming from all the output ports have been received and grouped according to some criteria;
  – Exclusive Choice: similar in structure to the Parallel Split pattern, but only one of the input ports may receive data from the output port, according to some condition.
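The three structural patterns can be recognized from port bindings alone, by counting fan-out per output port and fan-in per input port. The sketch below assumes links are given as (output port, input port) string pairs; the port names and the `classify_structural` function are hypothetical. Since Synchronization and Exclusive Choice share their structure with Simple Merge and Parallel Split, telling them apart would need behavioral information that this sketch does not model.

```python
from collections import defaultdict

def classify_structural(links):
    """Label port bindings as sequence, parallel_split or simple_merge."""
    links = list(links)
    fan_out = defaultdict(set)  # output port -> input ports it feeds
    fan_in = defaultdict(set)   # input port -> output ports feeding it
    for out_port, in_port in links:
        fan_out[out_port].add(in_port)
        fan_in[in_port].add(out_port)

    patterns = []
    # One output port bound to two or more input ports.
    for out_port, ins in fan_out.items():
        if len(ins) >= 2:
            patterns.append(("parallel_split", out_port, sorted(ins)))
    # Two or more output ports bound to one input port.
    for in_port, outs in fan_in.items():
        if len(outs) >= 2:
            patterns.append(("simple_merge", sorted(outs), in_port))
    # Exactly one output port bound to exactly one input port.
    for out_port, in_port in links:
        if len(fan_out[out_port]) == 1 and len(fan_in[in_port]) == 1:
            patterns.append(("sequence", out_port, in_port))
    return patterns

# Hypothetical diamond-shaped workflow: a -> b -> (c, d) -> e
LINKS = [
    ("a.out", "b.in"),
    ("b.out", "c.in"),
    ("b.out", "d.in"),
    ("c.out", "e.in"),
    ("d.out", "e.in"),
]

for pattern in classify_structural(LINKS):
    print(pattern)
```

The ambiguity between structural and behavioral patterns that this sketch leaves open is the same kind of ambiguity that makes pattern identification depend on how each SWfMS implements a given pattern.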
8
Workflow Pattern Identification
Patterns may be implemented in different ways
  – Depending on the features each SWfMS supports
  – E.g., the Exclusive Choice pattern
9
Preliminary Experiments
Experiment planning
  – 4 VisTrails, 46 Kepler and 1452 Taverna specifications
For the first hypothesis, the task type matters
  – VisTrails has only one Web Service and it is not available
  – Kepler has 45 types of tasks but none of them is a Web Service
  – Taverna has more than 100 types and many Web Services
10
Preliminary Experiments
Analysis of the workflow transformations
  – 53% of the Taverna tasks were interchangeable
[Charts: Quantity of Tasks; Quantity of Interchangeable Workflows]
11
Reformulated Hypothesis
The use of workflow patterns and software architecture concepts to capture the key structural semantics expressed in workflow specifications enables the establishment of semi-automated processes that transform these specifications across different WfSLs. These processes allow for a reduction in the effort scientists would make to reuse structurally complex workflow specifications (in the sense of having a large number of tasks and dependency relationships between these tasks) developed by other research groups in SWfMSs that are not part of the usual tooling these scientists employ in their daily work.
12
Further Experiments
When interchanging only the workflow structures, we could interchange almost all workflows (98.28%)
  – Problems related to pattern identification
13
Qualitative Analysis of the Research Material
The myExperiment repository
  – Web service tasks implemented as either local, inaccessible, or authenticated, which made it impossible to execute these workflows, even in their source specifications
  – Lack of documentation: most of the analyzed workflows have little or no metadata
Similar problems reported in the Wf4Ever project
  – Proposal of a new myExperiment repository
14
Qualitative Analysis of the Research Material
The studied systems
  – Once a task has its type defined and its input and output ports linked to other tasks, its type cannot be changed; the task has to be removed instead
    Once it is removed, the relations are gone! This reduces the utility of our approach
  – Some SWfMSs have limitations
    VisTrails does not export subworkflows
15
Related Work
Taverna 2-Galaxy and Tavaxy
  – Limited to two SWfMSs; adapting them to a broader range of SWfMSs would require a complete reformulation of their architectures
    Tavaxy does, however, adopt a pattern-based approach
IWIR
  – The work most similar to ours
  – Its syntactical structures are quite similar to those defined for the SWfMSs
Other works
16
Conclusions
This research endeavor started with exploratory studies aiming at identifying whether it would be possible to establish “future-proof” automated processes for transforming workflows between different SWfMSs.
It is unclear whether the perceived problem actually exists, and the experimental data we employed may point in a different direction.
The fact that the myExperiment repository is full of “toy” workflows made it harder to execute a proof of concept.
17
Questions