Www.cpc.wmin.ac.uk/GEMLCA Enabling the execution of various workflows (Kepler, Taverna, Triana, P-GRADE) on EGEE Tamas Kukla, Tamas Kiss, Gabor Terstyanszky.

Slides:



Advertisements
Similar presentations
Running Workflows on Clouds and Grids Gabor Terstyanszky, University of Westminster T. Fahringer, P. Kacsuk, J. Montagnat, I. Taylor e-Science Workshop,
Advertisements

1 CEOS WGISS Meeting, May 8-12, 2006, Budapest MTA SZTAKI Computer and Automation Research Institute Hungarian Academy of Sciences
1 P-GRADE Portal and GEMLCA Legacy Code Architecture Peter Kacsuk MTA SZTAKI
Legacy code support for commercial production Grids G.Terstyanszky, T. Kiss, T. Delaitre, S. Winter School of Informatics, University.
OMII-UK Steven Newhouse, Director. © 2 OMII-UK aims to provide software and support to enable a sustained future for the UK e-Science community and its.
P. Kacsuk, G. Sipos, A. Toth, Z. Farkas, G. Kecskemeti and G. Hermann P. Kacsuk, G. Sipos, A. Toth, Z. Farkas, G. Kecskemeti and G. Hermann MTA SZTAKI.
MTA SZTAKI Hungarian Academy of Sciences Grid Computing Course Porto, January Introduction to Grid portals Gergely Sipos
WS-PGRADE: Supporting parameter sweep applications in workflows Péter Kacsuk, Krisztián Karóczkai, Gábor Hermann, Gergely Sipos, and József Kovács MTA.
Technical Architectures
Grid Execution Management for Legacy Code Applications Exposing Application as Grid Services Porto, Portugal, 23 January 2007.
Porto, January Grid Computing Course Summary of day 2.
1 Application Specific Module for P-GRADE Portal 2.7 Application Specific Module overview Akos Balasko MTA-SZTAKI LPDS
GMD German National Research Center for Information Technology Innovation through Research Jörg M. Haake Applying Collaborative Open Hypermedia.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space Cracow Grid Workshop’10 Kraków, October 11-13,
1 portal.p-grade.hu További lehetőségek a P-GRADE Portállal Gergely Sipos MTA SZTAKI Hungarian Academy of Sciences.
June Amsterdam A Workflow Bus for e-Science Applications Dr Zhiming Zhao Faculty of Science, University of Amsterdam VL-e SP 2.5.
SICSA student induction day, 2009Slide 1 Social Simulation Tutorial Session 6: Introduction to grids and cloud computing International Symposium on Grid.
GRACE Project IST EGAAP meeting – Den Haag, 25/11/2004 Giuseppe Sisto – Telecom Italia Lab.
AHM /09/05 AHM 2005 Automatic Deployment and Interoperability of Grid Services G.Kecskemeti, Yonatan Zetuny, G.Terstyanszky,
Web-based Virtual Research Environments (VRE): Supporting Collaboration in e-Science Xiaobo Yang, Rob Allan CCLRC e-Science Centre Daresbury Laboratory,
1 portal.p-grade.hu Further information on P-GRADE Gergely Sipos MTA SZTAKI Hungarian Academy of Sciences.
Flexibility and user-friendliness of grid portals: the PROGRESS approach Michal Kosiedowski
A General and Scalable Solution of Heterogeneous Workflow Invocation and Nesting Tamas Kukla, Tamas Kiss, Gabor Terstyanszky.
From P-GRADE to SCI-BUS Peter Kacsuk, Zoltan Farkas and Miklos Kozlovszky MTA SZTAKI - Computer and Automation Research Institute of the Hungarian Academy.
Sharing Workflows through Coarse-Grained Workflow Interoperability : Sharing Workflows through Coarse-Grained Workflow Interoperability G. Terstyanszky,
Introduction to SHIWA Technology Peter Kacsuk MTA SZTAKI and Univ.of Westminster
INFSO-RI Enabling Grids for E-sciencE Supporting legacy code applications on EGEE VOs by GEMLCA and the P-GRADE portal P. Kacsuk*,
Introduction to WS-PGRADE and gUSE Tutorial Akos Balasko 04/17/
GEM Portal and SERVOGrid for Earthquake Science PTLIU Laboratory for Community Grids Geoffrey Fox, Marlon Pierce Computer Science, Informatics, Physics.
1 Advanced features of the P-GRADE portal Peter Kacsuk, Gergely Sipos Peter Kacsuk, Gergely Sipos MTA.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Code Applications Tamas Kiss Centre for Parallel.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
P-GRADE and GEMLCA.
1 P-GRADE Portal: a workflow-oriented generic application development portal Peter Kacsuk MTA SZTAKI, Hungary Univ. of Westminster, UK.
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE User Forum, Manchester, 10 May ‘07 Nicola Venuti
Conference name Company name INFSOM-RI Speaker name The ETICS Job management architecture EGEE ‘08 Istanbul, September 25 th 2008 Valerio Venturi.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Services for advanced workflow programming.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
Workflows Description, Enactment and Monitoring in SAGA Ashiq Anjum, UWE Bristol Shantenu Jha, LSU 1.
1 Practical information for the GEMLCA / P-GRADE hands-on Gergely Sipos On behalf of: MTA.
1 Practical information for the GEMLCA / P-GRADE hands-on Tamas Kiss University of Westminster.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
SHIWA and Coarse-grained Workflow Interoperability Gabor Terstyanszky, University of Westminster Summer School Budapest July 2012 SHIWA is supported.
AHM04: Sep 2004 Nottingham CCLRC e-Science Centre eMinerals: Environment from the Molecular Level Managing simulation data Lisa Blanshard e- Science Data.
Building an European Research Community through Interoperable Workflows and Data ER-flow project Gabor Terstyanszky, University of Westminster, UK EGI.
SHIWA: Is the Workflow Interoperability a Myth or Reality PUCOWO, June 2011, London Gabor Terstyanszky, Tamas Kiss, Tamas Kukla University of Westminster.
1 Other features and next steps Gergely Sipos MTA SZTAKI Hungarian Academy of Sciences.
1 Porting applications to the NGS, using the P-GRADE portal and GEMLCA Peter Kacsuk MTA SZTAKI Hungarian Academy of Sciences Centre for.
Holding slide prior to starting show. Lessons Learned from the GECEM Portal David Walker Cardiff University
Introduction to the program of the summer school Peter Kacsuk MTA SZTAKI SCI-BUS is supported by the FP7 Capacities Programme under contract.
INFSO-RI JRA2 Test Management Tools Eva Takacs (4D SOFT) ETICS 2 Final Review Brussels - 11 May 2010.
© Geodise Project, University of Southampton, Workflow Support for Advanced Grid-Enabled Computing Fenglian Xu *, M.
Grid Execution Management for Legacy Code Architecture Exposing legacy applications as Grid services: the GEMLCA approach Centre.
Introduction to SHIWA project EGI User Forum, Vilnius Peter Kacsuk MTA SZTAKI
SHIWA Simulation Platform (SSP) Gabor Terstyanszky, University of Westminster EGI Community Forum Munnich March 2012 SHIWA is supported by the FP7.
Usage of WS-PGRADE and gUSE in European and national projects Peter Kacsuk 03/27/
SHIWA project presentation Project 1 st Review Meeting, Brussels 09/11/2011 Peter Kacsuk MTA SZTAKI
Grid Execution Management for Legacy Code Applications Grid Enabling Legacy Applications.
SCI-BUS project Pre-kick-off meeting University of Westminster Centre for Parallel Computing Tamas Kiss, Stephen Winter, Gabor.
Overview on the work performed during EPIKH Training Faiza MEDJEK /INFN, CATANIA 1.
Introduction: AstroGrid increases scientific research possibilities by enabling access to distributed astronomical data and information resources. AstroGrid.
SHIWA SIMULATION PLATFORM = SSP Gabor Terstyanszky, University of Westminster e-Science Workflows Workshop Budapest 09 nd February 2012 SHIWA is supported.
RDA 9th Plenary Breakout 3, 5 April :00-17:30
Tamas Kiss University Of Westminster
Peter Kacsuk MTA SZTAKI
P-GRADE and GEMLCA.
Introduction to the SHIWA Simulation Platform EGI User Forum,
Workflow level parametric study support by the P-GRADE portal
Presentation transcript:

Enabling the execution of various workflows (Kepler, Taverna, Triana, P-GRADE) on EGEE Tamas Kukla, Tamas Kiss, Gabor Terstyanszky Centre for Parallel computing University of Westminster London Peter Kacsuk, Gergely Sipos Computer and Automation Research Institute Hungarian Academy of Sciences Budapest

Contents Introduction Workflow interoperability Requirements of workflow engine integration Realising workflow integration Conclusions

Introduction Several widely utilised, Grid workflow management systems, such as Triana, P-GRADE, Taverna, Kepler, CppWfMS, YAWL, or the K-Wf Grid emerged in the last decade. These systems were developed by different scientific communities for various purposes. Therefore, they differ in several aspects. They use –different workflow engines –different workflow description languages –different workflow formalisms –different Grid middleware

Different workflow engines Most systems are coupled with one engine: –Taverna uses Freefluo –Triana uses Triana engine –K-WfGrid uses GWES (Grid workflow execution service) –Older versions of P-GRADE used Condor DAGMan, while its recent version uses its own engine Xen.

Different workflow description languages Most workflow systems use different workflow description languages: –Triana interprets BPEL (Business Process Execution Language) and its own language format. –Taverna workflows are represented in SCUFL. –Older versions of P-GRADE used Condor DAG, now it uses its own defined language. –Kepler uses MOML. –YAWL system uses YAWL language. –K-WfGrid uses GWorkflowDL. Because of this diversity, workflows of a system cannot be reused in another system.

Different workflow formalisms Workflow description languages are based on various workflow formalisms. –Condor DAG uses directed acyclic graphs (DAG). –SCUFL is also DAG based, but it is extended with control constraints. –The new workflow language of P-GRADE is also DAG based, but it is extended with recursion and nesting. –YAWL and GWorkflowDL are based on Petri Nets –BPEL is Pi-Calculus based Different formalisms have different expression capabilities. Therefore, in many cases it is not possible to express a workflow of one type in the description language of another.

Workflow interoperability In order to achieve cross-organisational collaboration between the different scientific communities, workflows should be able to interoperate, communicate with and/or invoke each other during execution. The WfMC defines workflow interoperability in general as: – "The ability of two or more Workflow Engines to communicate and work together to coordinate work." In this definition the workflow engine is a piece of software that provides the workflow run-time environment.

Approaches to workflow interoperability Various solutions can bring workflow interoperability into effect: –Workflow description standardisation Would enable the exchange of workflows of different systems XPDL was defined by the WfMC and BPEL was defined by Microsoft and IBM for this purpose, but they did not gain universal acceptance so far. It is unlikely in the near future –Workflow translation Would enable the translation from one language to another Can be realised by translating via an intermediate workflow language. –YAWL and GWorkflowDL could also be used for this purpose. See BPEL to YAWL translator or SCUFL to GWorkflowDL converter. Cannot be applied in any case

Workflow engine integration An alternative approach to attain workflow interoperability could be realised by workflow engine integration. Executes the workflow in its native environment in by its own workflow engine. Makes workflow management systems to be able to execute non- native workflows. Can be realised by loosely or tightly coupled integration.

Tightly(i) and loosely(ii) coupled engine integration WF System C CCC Workflow engine integration service C C (i) (ii) III I C Engine of WF System A Engine of WF System B Engine of WF System C Interface of WF integration service

Requirements of workflow engine integration Our aim is to provide a solution for workflow sharing and interoperability by integrating different workflow systems in the following fashion: –providing a generic solution, which can be adopted to any workflow system –providing a scalable solution in terms of both number of workflows and amount of data –integration of a new workflow engine to the system should not require code re-engineering, only user level understanding of the engine in question

The concept of the target heterogeneous WF system In a certain type of workflow (meta workflow) other types of workflows can be executed as nodes The goal is that the meta workflow could be any type of WF (Taverna, Triana, Kepler, etc.) The embedded workflows can also be any kind of WFs

Realising workflow integration To provide a generic solution: –It is recommended to realise loosely coupled integration To provide a scalable solution: –It is recommended to utilize Grid resources for workflow engine execution To make the workflow engine deployment straightforward: –It is recommended to handle workflow engines as legacy applications

GEMLCA GEMLCA, that is unique in a sense that it is an application repository extended with a job submitter, allows the deployment of legacy code applications on the Grid. An application can be exposed via a GEMLCA service and can be executed using a GEMLCA client. The legacy application is stored either in the repository of a GEMLCA service or on a third party computational node where GEMLCA can access it. To publish a legacy application via GEMLCA, only a basic user-level understanding of the legacy application is needed, code re-engineering is not required. As soon as the application is deployed, GEMLCA is able to submit it using either GT2, GT4 or gLite Grid middleware. If the workflow engine requires credentials to utilise further Grid resources for workflow execution, these are automatically provided by GEMLCA through proxy delegation.

Exposing workflow engines via GEMLCA Command-line workflow engines, just like other legacy applications, can be exposed via a GEMLCA service, without code re-engineering and can be automatically submitted by GEMLCA to the Grid to a computational node. Three engines (engine of Taverna, Triana, and Kepler) have been en-wrapped by scripts so as to provide a general command line interface for them. This interface is the following: wfsubmit.sh -w wf_descriptor [-p wf_input_params] [-i wf_input_files] [-o wf_output_files] Wrapper scripts are responsible for installing the workflow engine, decompressing the workflow input files, execute the workflow by parametrizing and invoking the workflow engine and finally compress the workflow outputs into one archive file. The engines were exposed using the JSR-168 based GEMLCA administrator portlet.

Exposing Taverna workflow engine on EGEE using GEMLCA Administration Portlet

Parameters

Exposing Taverna workflow engine using GEMLCA Administration Portlet Submission settings

Legacy Code interface Description of the exposed Taverna engine

Parameters

Legacy Code interface Description of the exposed Taverna engine Submission settings

Realisation of a Workflow engine repository and submitter via GEMLCA GEMLCA client Deployed appsBackends WF Engine 1 WF Engine 2 WF Engine 3 GT2 GT4 gLite GEMLCA service User selects the required workflow engine, gives the workflow descriptor, the input parameters and input files. Executable: WF engine (that will be submitted) WF to execute: an input parameter of the GEMLCA job NGS EGEE

Heterogeneous workflow nesting via GEMLCA Workflow system GEMLCA client Deployed appsBackends WF Engine 1 WF Engine 2 WF Engine 3 GT2 GT4 gLite GEMLCA service User selects the required workflow engine, uploads the workflow, the input parameters and input files. Executable: WF engine (that will be submitted) WF to execute: an input parameter of the GEMLCA job NGS EGEE

Parametrization of non-native workflow execution within the P-GRADE portal GEMLCA was integrated to the P-GRADE portal. GEMLCA jobs can be parametrized using a JAVA based GUI within the P-GRADE workflow editor. Any other workflow system can adopt this solution and integrate a GEMLCA client. Selecting Grid Selecting workflow engine Setting workflow output file Setting input parameters Setting workflow descriptor Selecting GEMLCA service Selecting computational site Setting workflow input files

Case Study A case study workflow, that presents how workflows of different systems interoperate, will be presented. It serves only demonstration purposes, it is not a real life example. A high level heterogeneous P-GRADE workflow, nesting a Taverna, Kepler and Triana workflows. The data that are transferred between the workflows are stored files, there is no data transformation. If data transformation is needed, user has to create a data transformer job.

Taverna workflow This workflow fetches several images from a database, creates a few directories and places the images into those directories as image files. Data- base

Kepler workflow This workflow goes through the directory structure of the archive input file and manipulates each image that it finds. The manipulation includes edge highlighting, picture resizing and image type conversion.

Triana workflow This workflow couples the pictures, merges each couple and converts the merged pictures to greyscale images. Then, one colour component, that can be either the blue, green or red, is taken of the greyscale pictures and saved as new image file.

Heterogeneous P-GRADE workflow embedding Triana, Taverna, and Kepler workflows Taverna workflow Kepler workflow Triana workflow P-GRADE workflow

Heterogeneous P-GRADE workflow embedding Triana, Taverna, and Kepler workflows on EGEE Taverna workflow Kepler workflow Triana workflow P-GRADE workflow Submitted to EGEE Executed on EGEE

Heterogeneous P-GRADE workflow embedding Triana, Taverna, and Kepler workflows on NGS Taverna workflow Kepler workflow Triana workflow P-GRADE workflow Submitted to NGS Executed on NGS

Heterogeneous P-GRADE workflow embedding Triana, Taverna, and Kepler workflows on both EGEE and NGS Taverna workflow Kepler workflow Triana workflow P-GRADE workflow Submitted to EGEE Submitted to NGS Submitted to EGEE Executed on EGEE Executed on NGS

Executing heterogeneous meta-workflows in the P-GRADE portal using EGEE and NGS resources Using only EGEE resources Using both NGS and EGEE resources

Executing heterogeneous meta-workflows in the P-GRADE portal using EGEE and NGS resources

Conclusions This presentation introduced a general solution for integrating and accessing heterogeneous workflow engines. The solution can be used to achieve workflow interoperability and sharing at the level of workflow engine integration. The solution exposes various workflow engines via a GEMLCA service, that is capable of submitting the engines to the Grid using GT2, GT4 or gLite based Grid infrastructures, such as NGS or EGEE. It can be extended to support any further grid middleware, by implementing an additional GEMLCA submitter backend. The solution keeps the data at computational sites, it is scalable in terms of number of workflows and amount of data.

Conclusions Workflow engine deployment to this system does not require any code re-engineering, user level understanding is sufficient. The solution can be adopted by any workflow management system by integrating GEMLCA with the selected WF system Our main concern is to investigate and realise further workflow interoperability models and to extend the solution with a workflow repository. The integration technique can be applied not only for workflow engine integration, but for data middle-ware integration as well.

Questions?

Thank You!