A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center.

Slides:



Advertisements
Similar presentations
Enabling the execution of various workflows (Kepler, Taverna, Triana, P-GRADE) on EGEE Tamas Kukla, Tamas Kiss, Gabor Terstyanszky.
Advertisements

Running Workflows on Clouds and Grids Gabor Terstyanszky, University of Westminster T. Fahringer, P. Kacsuk, J. Montagnat, I. Taylor e-Science Workshop,
IBM Software Group ® Design Thoughts for JDSL 2.0 Version 0.2.
A Workflow Engine with Multi-Level Parallelism Supports Qifeng Huang and Yan Huang School of Computer Science Cardiff University
P. Kacsuk, G. Sipos, A. Toth, Z. Farkas, G. Kecskemeti and G. Hermann P. Kacsuk, G. Sipos, A. Toth, Z. Farkas, G. Kecskemeti and G. Hermann MTA SZTAKI.
ER-flow Purpose of the meeting C. Vuerli Contributions by G. Terstyanszky.
Investigating Approaches to Speeding Up Systems Biology Using BOINC-Based Desktop Grids Simon J E Taylor (1) Mohammadmersad Ghorbani (1) David Gilbert.
P-GRADE and WS-PGRADE portals supporting desktop grids and clouds Peter Kacsuk MTA SZTAKI
WS-PGRADE: Supporting parameter sweep applications in workflows Péter Kacsuk, Krisztián Karóczkai, Gábor Hermann, Gergely Sipos, and József Kovács MTA.
ProActive Task Manager Component for SEGL Parameter Sweeping Natalia Currle-Linde and Wasseim Alzouabi High Performance Computing Center Stuttgart (HLRS),
EXTENDING SCIENTIFIC WORKFLOW SYSTEMS TO SUPPORT MAPREDUCE BASED APPLICATIONS IN THE CLOUD Shashank Gugnani Tamas Kiss.
Workflows Information Flows Prof. Silvia Olabarriaga Dr. Gabriele Pierantoni.
SCI-BUS is supported by the FP7 Capacities Programme under contract RI ER-FLOW is supported by the FP7 Infrastructures under contract RI
ER-flow C. Vuerli Contributions by G. Terstyanszky, K. Varga.
Workflow sharing and integration services by the ER-flow project on behalf of the ER-flow consortium EGI Community Forum, Manchester,
A General and Scalable Solution of Heterogeneous Workflow Invocation and Nesting Tamas Kukla, Tamas Kiss, Gabor Terstyanszky.
Sharing, integrating and executing different workflows in heterogeneous multi-cloud systems Peter Kacsuk MTA SZTAKI SCI-BUS is supported.
Scientific Workflow Interchanging Through Patterns: Reversals and Lessons Learned Bruno Fernandes Bastos Regina Maria Maciel Braga Antônio Tadeu Azevedo.
From P-GRADE to SCI-BUS Peter Kacsuk, Zoltan Farkas and Miklos Kozlovszky MTA SZTAKI - Computer and Automation Research Institute of the Hungarian Academy.
© DATAMAT S.p.A. – Giuseppe Avellino, Stefano Beco, Barbara Cantalupo, Andrea Cavallini A Semantic Workflow Authoring Tool for Programming Grids.
Sharing Workflows through Coarse-Grained Workflow Interoperability : Sharing Workflows through Coarse-Grained Workflow Interoperability G. Terstyanszky,
Introduction to SHIWA Technology Peter Kacsuk MTA SZTAKI and Univ.of Westminster
Accelerating Scientific Exploration Using Workflow Automation Systems Terence Critchlow (LLNL) Ilkay Altintas (SDSC) Scott Klasky(ORNL) Mladen Vouk (NCSU)
11 CORE Architecture Mauro Bruno, Monica Scannapieco, Carlo Vaccari, Giulia Vaste Antonino Virgillito, Diego Zardetto (Istat)
INFSO-RI Enabling Grids for E-sciencE Supporting legacy code applications on EGEE VOs by GEMLCA and the P-GRADE portal P. Kacsuk*,
Introduction to WS-PGRADE and gUSE Tutorial Akos Balasko 04/17/
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
AgINFRA science gateway for workflows and integrated services 07/02/2012 Robert Lovas MTA SZTAKI.
ICCS WSES BOF Discussion. Possible Topics Scientific workflows and Grid infrastructure Utilization of computing resources in scientific workflows; Virtual.
P-GRADE and GEMLCA.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
Biomedical and Bioscience Gateway to National Cyberinfrastructure John McGee Renaissance Computing Institute
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Services for advanced workflow programming.
1 Practical information for the GEMLCA / P-GRADE hands-on Tamas Kiss University of Westminster.
SHIWA and Coarse-grained Workflow Interoperability Gabor Terstyanszky, University of Westminster Summer School Budapest July 2012 SHIWA is supported.
Advanced Component Models ULCM & HLCM Julien Bigot, Hinde Bouziane, Christian Perez COOP Project Lyon, 9-10 mars 2010.
Building an European Research Community through Interoperable Workflows and Data ER-flow project Gabor Terstyanszky, University of Westminster, UK EGI.
SHIWA Desktop Cardiff University, Budapest, 3 rd July 2012.
2nd Texas A&M Big Data Workshop Development of “Big Data” Scientific Workflow Management Tools for the Materials Genome Initiative: “Materials Galaxy”
SHIWA: Is the Workflow Interoperability a Myth or Reality PUCOWO, June 2011, London Gabor Terstyanszky, Tamas Kiss, Tamas Kukla University of Westminster.
1 SCI-BUS: building e-Science gateways in Europe: building e-Science gateways in Europe Peter Kacsuk and Zoltan Farkas MTA SZTAKI.
07/02/2012 WS-PGRADE/gUSE in use Lightweight introduction Zoltán Farkas MTA SZTAKI LPDS.
WS-PGRADE/gUSE in use Advance use of WS- PGRADE/gUSE gateway framework Zoltán Farkas and Peter Kacsuk MTA SZTAKI LPDS.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Introduction to SHIWA project EGI User Forum, Vilnius Peter Kacsuk MTA SZTAKI
Supporting Big Data Processing via Science Gateways EGI CF 2015, November, Bari, Italy Dr Tamas Kiss, CloudSME Project Director University of Westminster,
Porting workflows for the Heliophysics Community Dr. Gabriele Pierantoni Trinity College Dublin.
SHIWA Simulation Platform (SSP) Gabor Terstyanszky, University of Westminster EGI Community Forum Munnich March 2012 SHIWA is supported by the FP7.
Usage of WS-PGRADE and gUSE in European and national projects Peter Kacsuk 03/27/
1 Support for parameter study applications in the P-GRADE Portal Gergely Sipos MTA SZTAKI (Hungarian Academy of Sciences)
SHIWA project presentation Project 1 st Review Meeting, Brussels 09/11/2011 Peter Kacsuk MTA SZTAKI
Hadoop on the EGI Federated Cloud Dr Tamas Kiss, CloudSME Project Director University of Westminster, London, UK Carlos Blanco – University.
Using SHIWA Workflow Interoperability Tools for Neuroimaging Data Analysis Applications Vladimir Korkhov 1, Dagmar Krefting 2, Tamas Kukla 3, Gabor Terstyanszky.
SCI-BUS project Pre-kick-off meeting University of Westminster Centre for Parallel Computing Tamas Kiss, Stephen Winter, Gabor.
Exposing WS-PGRADE/gUSE for large user communities Peter Kacsuk, Zoltan Farkas, Krisztian Karoczkai, Istvan Marton, Akos Hajnal,
Building an European Research Community through Interoperable Workflows and Data ER-flow Prof. Gabor Terstyanszky, University of Westminster, UK Heildelberg,
HELIOGATE – A USER EXPERIENCE PERSPECTIVE G.Pierantoni, E. Carley, B. Bentley S. Olabarriaga E. Sciacca, G. Taffoni, G. Castelli, U. Becciani.
Building an European Research Community through Interoperable Workflow and Data Gabor Terstyanszky University of Westminster.
SHIWA SIMULATION PLATFORM = SSP Gabor Terstyanszky, University of Westminster e-Science Workflows Workshop Budapest 09 nd February 2012 SHIWA is supported.
Coarse Grained Interoperability scenarios
Tamas Kiss University Of Westminster
Workflows in Computational Chemistry Prof
Peter Kacsuk MTA SZTAKI
P-GRADE and GEMLCA.
Lightweight introduction
Lightweight introduction
Introduction to the SHIWA Simulation Platform EGI User Forum,
Workflow level parametric study support by the P-GRADE portal
Scientific Workflows Lecture 15
Presentation transcript:

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability Junaid Arshad, Gabor Terstyanszky, Tamas Kiss, Noam Weingarten Center for Parallel Computing, University of Westminster, London, UK (J.Arshad, G.Z.Terstyanszky, T.Kiss,

Scientific Workflows Complex computational experiments involving Analysis of large volumes of data Use of distributed/high performance computing infrastructures (DCIs) Individual tasks performing control and data analytics functions A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 2 Collect Data Transform Data Analytics Data Store Flush Outputs External Aggregation

Meta-workflows Composed of one or more entities such as jobs and/or sub- workflows Re-use these existing entities as components Faster and more efficient development A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 3 Collect Data Transform Data Data Store Flush Outputs External Aggregation Analytics Join Aggreg ate results Run algorith m

Types of Workflows (1) Single Workflow all the nodes of the workflow are individual jobs that are executed and managed by a single workflow system such as P-GRADE [1], Galaxy [2] or Taverna [3] A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 4

5 Single workflow Native meta-workflow Non-native meta-workflow Types of Workflows (2) Native meta-workflow the embedded sub-workflows are all from the host workflow system (WS1) Non-native meta-workflow at least one of the embedded workflows is from external workflow systems

Non-native meta-workflows Two approaches have been established to execute non-native meta- workflows Coarse Grained Interoperability (CGI) Host workflow engine and workflows are considered native whereas all other workflows are considered non-native The non-native workflows and workflow engines are managed as legacy applications Fine Grained Interoperability (FGI) A workflow is considered to have two distinguished parts i.e. abstract and concrete Abstract includes workflow graph and abstract I/O whereas concrete part contains implementation details FGI approach creates a IWIR bundle containing IWIR representation of abstract part and the concrete part. A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 6

Meta-workflow Execution Approaches A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 7 Coarse-grain Workflow Interoperability usage scenarioTransformation of workflows for FGI approach

Formalizing CGI based Approach Based on existing formalization presented in [4] Objective to improve the usability of existing meta-workflows by highlighting the requirements of CGI based workflow interoperability approach Formalization achieved using the Z-notations A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 8

9 WF meta =: {J 1 …J k, WF nat1 …WF natl, WF nnt1,..WF nntm }, where J nath - native job h = 1... k Wf nati - native workflow i = 1... l WF nntj - non-native workflow j = 1... m. Existing formalization Formal description of the CGI approach using Z

Types of CGI-based meta-workflow Focused on CGI-based meta-workflows Aimed to facilitate workflow developers to design new workflows by enabling them to comprehend with attributes and semantics of each type of workflow Following types of meta-workflows considered i.Single job meta-workflow ii.Linear multi-job meta-workflow iii.Parallel multi-job meta-workflow iv.Parameter sweep meta-workflow A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 10

Single job meta-workflow A job representing this workflow can be a simple native job, a native workflow or even a non-native workflow A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 11 J n = {J abs, J cnf, J cnr, J ennat } where J n : native job, J abs : abstract description of the job, J cnf : configuration of the job, J cnr : implementation of the job J ennat : native workflow engine WF n = {WF abs, WF cnr, WF cnf, WF enn } where WF abs : abstract part of workflow WF cnr : concrete part of workflow WF cnf : workflow configuration WF enn : native workflow engine WF nn = {TASKS, WF abs, WF cnr, WF cnf, WF ennn } where J nn = {J abs, J cnf, J cnr, J ennnat } J nn ∈ TASKS WF nn : non-native workflow WF ennn : non-native workflow engine Single native job Single native workflow Single non-native workflow Single Job Meta-Workflow

Linear multi-job meta-workflow A pipeline of multiple jobs in the native workflow system where any (or even all) of these jobs can be native jobs, native or non-native workflows A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 12 WF lmj = {J 1 …, J m }={J n1 …, J nk, WF n1, …,WF ny, WF nn1, …,WF nnz } where J j job of the linear multi-job meta-workflow, j = 1 … m J ni native job ∈ J j, i = 1 … k WF np native workflow ∈ J j, p = 1… y WF nnt non-native workflow, ∈ J j, t = 1… z If J x and J y are two consecutive jobs and J x is not the last job of the workflow, there must be an edge e i = (P xo, P yi ) where P xo : output port of job J x P yi : input port of job J y Linear multi-job meta-workflow

Parallel multi-job meta-workflow A native workflow including parallel branches which can include either single jobs or native and non-native workflows, i.e. composed of several single and/or linear multi-job meta-workflows A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 13 WF pmj = {J 1 …, J m }={J n1 …, J nk, WF n1, …,WF ny, WF nn1, …,WF nnz } where J j job of the parallel multi-job meta-workflow, j = 1 … m J ni native job ∈ J j, i = 1 … k WF np native workflow ∈ J j, p = 1… y WF nnt non-native workflow, ∈ J j, t = 1… z A parallel multi-job meta-workflow must contain an edge that connects split and worker jobs e s = (P go, P wxi ), where P so : output port of the split job J wx : worker job and x = 1 … k P wxi : input port of the worker job J wx and, an edge e m such that connects worker and merge jobs e m = (P wxo, P mi ), where P wxo : output port of a worker job J wx and x = 1 … k P mi : input port of the merge job J m and i = 1 … k Parallel multi-job meta-workflow

Parameter-sweep meta-workflow A parameter sweep workflow in the native workflow system that includes one or more non-native workflows A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 14 WF ps = = {J 1 …, J m }={J n1 …, J nk, WF n1, …,WF ny, WF nn1, …,WF nnz } where J j job of the parallel sweep meta-workflow, j = 1 … m J ni native job ∈ J j, i = 1 … k WF np native workflow ∈ J j, p = 1… y WF nnt non-native workflow, ∈ J j, t = 1… z Connections between the generator, collector and the worker jobs of the parameter sweep meta-workflow are described by two specific types of edges e g = (P go, P wxi ) where P go : output port of the generator job J g P wxi : input port of the worker job J wx and x = 1 … k and e c = (P wxo, P ci ) where P wxo : output of a worker job J wx and x = 1 … k P ci : input port of collector job J c and i = 1 … k Parameter-sweep meta-workflow

SHIWA Simulation Platform (1) The simulation platform contains science gateway SHIWA Science Gateway a submission service SHIWA Submission Service a workflow repository SHIWA Repository a data transfer service Data Avenue Service Enables data, infrastructure and workflow interoperability at both coarse- and fine-grained level Supports the whole workflow life-cycle addressing the challenges of executing workflows of different workflow systems as non-native workflows combining workflows of different workflow systems into meta-workflows running these workflows on different DCIs The gateway is integrated with the WS-PGRADE Workflow System which is used as native workflow engine in the simulation platform Supports ASKALON, Galaxy, GWES, Kepler, MOTEUR, Pegasus, ProActive, Taverna, Triana A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 15

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 16 SHIWA Simulation Platform (2)

Meta-workflows in Heliophysics (1) Simulating complex scenarios such as propagation of Coronal Mass Ejection (CME) Two types of meta-workflows used Linear multi-job meta-workflow Investigation of the fastest CME in a given period of time using an embedded Taverna workflow Parameter sweep meta-workflow finds and propagates the fastest CME over a given time range The WS-PGRADE parameter sweep extension of the Taverna meta-workflow performs the same operation over many time ranges Experiments performed using meta-workflow support in SHIWA Simulation Platform A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 17

A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 18 Linear multi-job meta-workflow for CME propagation Parameter sweep meta-workflow in WS-PGRADE Meta-workflows in Heliophysics (2)

Conclusions and Further Ahead Meta-workflows represent an exciting but challenging prospect within the context of workflow execution and interoperability Formalization for workflow types and CGI based approach for workflow interoperability has been achieved Significant contribution to facilitate design and development of new and existing workflows and workflow engines An example use case from Heliophysics has been presented demonstrating the effectiveness of the efforts for broader scientific domains Envisage expanding the scope of formalization to fine grained workflow interoperability approach A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 19

References [1] A. Kertész, G. Sipos and P. Kacsuk; Brokering multi-grid workflows in the P-GRADE portal, in: Euro-Par 2006: Parallel Processing, vol. 4375, Springer, Berlin, 138–149, [2] Giardine, B., Riemer et al.; Galaxy: A platform for interactive large-scale genome analysis. Genome Research, 15(10):1451-5, [3] Oinn, T., Addis et al.; Taverna: a tool for the composition and enactment of bioinformatics workflows. Bioinformatics, 20(17): , 2004 [4] Terstyanszky, G. et al; Enabling Scientific Workflow Sharing Through Coarse Grained Interoperability in the Future Generation Com[puter Systems Vol 37, 46-59, A Definition and Analysis of the Role of Meta-workflows in Workflow Interoperability 20