GLOBUS PLUG-IN FOR WINGS WORKFLOW ENGINE Elizabeth Martí ITACA Universidad Politécnica de Valencia



INTRODUCTION
This work takes advantage of two concepts: workflow and Grid.
- Workflow automates processes.
- Grid makes it possible to build high-performance computing systems from heterogeneous, geographically distributed resources spanning multiple administrative domains.
A Grid workflow can be defined as the composition of grid application services which execute on heterogeneous and distributed resources in a well-defined order to accomplish a specific goal (Rajkumar Buyya).

MOTIVATION
Many workflow initiatives have appeared: Askalon, Karajan, Kepler, K-WfGrid, Taverna, Triana, etc.
They lack some important characteristics, among them:
- Multi-grid capability.
- Easy extensibility to new middleware.
WINGS provides new features focused on high-level definition, multi-grid and extensibility capabilities. Its most significant features are:
- Expressiveness to capture the specifics of grid computing.
- Flow control structures.
- Support for simple, lightweight operations.
- The ability to deal with different grid middlewares and versions.

WINGS CONCEPTS
WINGS models a workflow with four concepts:
- Data sources: communication points for exchanging data among the different executions of the workflow.
- Activities: abstractions of tasks to be run on the Grid. They describe the functionality of the tasks and are defined by:
  - The input and output parameters (simple or structured types).
  - The list of deployments, which provides the specifics for each grid middleware.
- Executions: specific instances of an activity. The engine is in charge of selecting among the deployments defined for each activity, according to where it is going to be run.
- Operations: simple executions performed by the workflow runtime to pre- or post-process the information available in the data sources, so that the next tasks can use it. Examples: arithmetic and reduction operations, string search operations, field extraction operations, split or merge file operations, etc.
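The relationship between activities, deployments and executions can be sketched with a few data classes. This is only an illustration under stated assumptions: the class names, the field names and the example paths are hypothetical and are not taken from the WINGS XML schema, which the slides do not show.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Deployment:
    """Middleware-specific details for running an activity (hypothetical fields)."""
    middleware: str   # e.g. "Fura" or "Globus"
    executable: str   # illustrative path of the executable for that middleware


@dataclass
class Activity:
    """Abstract task: its parameters plus one deployment per supported middleware."""
    name: str
    inputs: List[str]
    outputs: List[str]
    deployments: List[Deployment] = field(default_factory=list)

    def deployment_for(self, middleware: str) -> Deployment:
        # Pick the deployment that matches the middleware the task will run on.
        for deployment in self.deployments:
            if deployment.middleware == middleware:
                return deployment
        raise LookupError(f"{self.name} has no deployment for {middleware}")


@dataclass
class Execution:
    """Specific instance of an activity, bound to a target middleware."""
    activity: Activity
    middleware: str

    def resolve(self) -> Deployment:
        return self.activity.deployment_for(self.middleware)


rigid = Activity(
    name="rigid-coreg",
    inputs=["base_study", "study"],
    outputs=["aligned_study"],
    deployments=[
        Deployment("Fura", "/opt/fura/bin/rigid"),      # hypothetical path
        Deployment("Globus", "/home/grid/bin/rigid"),   # hypothetical path
    ],
)
chosen = Execution(rigid, "Globus").resolve()
print(chosen.executable)
```

Keeping the middleware details in the deployments list is what lets the same activity definition be reused unchanged on either grid.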

WINGS ENGINE
- It uses a pure data flow language in which a workflow is a sequence of: DS – Execution or Operation – DS (DS: data source).
- This simplifies the description and understanding of the workflow, and also increases its expressiveness.
- The engine is in charge of providing the functionality defined in the XML file, creating an environment to launch concurrent jobs.
- A key issue in a multi-grid environment is the movement of files among the resources of consecutive tasks, so the runtime (RT) tries to:
  - Reduce the number of data transfers.
  - Deal with different physical file storage systems.
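In a pure data flow model, a step can run as soon as all the data sources it reads are available, and steps that become ready together can be launched concurrently. The sketch below illustrates that idea only; the function name, the dictionary layout and the step names are assumptions, not the engine's actual scheduling code.

```python
def execution_waves(steps, initial_sources):
    """Group steps into waves that could be launched concurrently.

    steps: dict mapping step name -> (input data sources, output data sources)
    initial_sources: data sources available before the workflow starts
    """
    available = set(initial_sources)
    pending = dict(steps)
    waves = []
    while pending:
        # A step is ready when every data source it reads already exists.
        ready = sorted(
            name
            for name, (inputs, _outputs) in pending.items()
            if set(inputs) <= available
        )
        if not ready:
            raise ValueError("steps with unsatisfied inputs: " + ", ".join(sorted(pending)))
        waves.append(ready)
        for name in ready:
            _inputs, outputs = pending.pop(name)
            available |= set(outputs)
    return waves


# Hypothetical three-step chain: DS -> step -> DS -> step -> DS ...
workflow = {
    "rigid": (["studies"], ["rigid_out"]),
    "elastic": (["rigid_out"], ["elastic_out"]),
    "fuse": (["elastic_out"], ["result"]),
}
waves = execution_waves(workflow, ["studies"])
print(waves)  # [['rigid'], ['elastic'], ['fuse']]
```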

WINGS ARCHITECTURE
WINGS Core Engine, connected to four groups of components:
- Middleware Engines: Fura, GT2, etc.
- Transference Systems: Fura, IXOS, GridFTP, etc.
- Operations: Arithmetic, Split File, etc.
- Information Systems: Fura RM, MDS, etc.

EXECUTION SCHEME
- Core Engine: performs the logic and control operations. It prepares and selects the tasks that are ready to be launched and the data to use in each execution.
- Plug-ins: in charge of actually performing the file transfers and all the operations needed to complete the execution.

MIDDLEWARE PLUGINS
- Functionality is extended simply by implementing a plug-in and adding it to the system.
- In the first version of the workflow engine, the Fura middleware plug-in was developed.
- Now a Globus Toolkit plug-in has been implemented to enable multi-grid tests.
- Globus was selected because of the large number of current infrastructures that use it as the underlying grid middleware (EGEE, EELA, etc.).
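The split between a core engine and middleware plug-ins can be pictured as a small interface plus a registry. The sketch below is an assumption about how such an extension point might look: the method names mirror the four steps listed for the Globus plug-in, but the interface, the registry and the `RecordingPlugin` stand-in are hypothetical and not taken from WINGS.

```python
from abc import ABC, abstractmethod


class MiddlewarePlugin(ABC):
    """Hypothetical contract a middleware plug-in would fulfil."""

    @abstractmethod
    def prepare_activity(self, task): ...

    @abstractmethod
    def prepare_data(self, task): ...

    @abstractmethod
    def execute(self, task): ...

    @abstractmethod
    def collect_output(self, task): ...


PLUGINS = {}  # middleware name -> plug-in instance


def register(name, plugin):
    """Adding a plug-in to the system is a single registration call."""
    PLUGINS[name] = plugin


def run_task(middleware, task):
    """Core-engine side: pick the plug-in and drive the four phases in order."""
    plugin = PLUGINS[middleware]
    plugin.prepare_activity(task)
    plugin.prepare_data(task)
    plugin.execute(task)
    return plugin.collect_output(task)


class RecordingPlugin(MiddlewarePlugin):
    """Stand-in plug-in that only records which phases were called."""

    def __init__(self):
        self.calls = []

    def prepare_activity(self, task):
        self.calls.append("prepare_activity")

    def prepare_data(self, task):
        self.calls.append("prepare_data")

    def execute(self, task):
        self.calls.append("execute")

    def collect_output(self, task):
        self.calls.append("collect_output")
        return f"{task}.out"


demo = RecordingPlugin()
register("demo", demo)
result = run_task("demo", "rigid-coreg")
print(result, demo.calls)
```

With this shape, supporting a new middleware means writing one class and registering it; the core engine code does not change.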

GLOBUS PLUG-IN
Step 1: Prepare the activity.
- The workflow model is defined in the XML file.
- Create a valid proxy (proxy store).
- Define the execution environment of the task (Globus, Fura, ...).
- Create a working directory on the execution host (GridFTP).
- Create an execution directory (GridFTP).
- Copy the executable to the execution host (third-party copy with UrlCopy).
- If necessary, copy auxiliary data used by the executable (libraries, jar files, ...).

GLOBUS PLUG-IN
Step 2: Prepare the initial data.
- Obtain the information about the input data (XML file) and store it (input parameters matrix).
- Obtain the number of microtasks (combinations of inputs).
- Create an input directory.
- Copy the input data to the input directory (UrlCopy).
- Create an output directory.
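One way to read "combinations of inputs" is as the Cartesian product of the values given for each input parameter, with one microtask per combination. That reading is an assumption, as are the parameter names and file names in this sketch.

```python
from itertools import product


def microtasks(input_matrix):
    """Expand an input parameters matrix into one dict per microtask.

    input_matrix: dict mapping parameter name -> list of candidate values.
    Assumes a microtask is one element of the Cartesian product of the inputs.
    """
    names = list(input_matrix)
    value_lists = [input_matrix[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical inputs: one reference study compared against four other studies.
matrix = {
    "reference": ["base.img"],
    "study": ["s1.img", "s2.img", "s3.img", "s4.img"],
}
tasks = microtasks(matrix)
print(len(tasks))   # 4
print(tasks[0])     # {'reference': 'base.img', 'study': 's1.img'}
```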

GLOBUS PLUG-IN
Step 3: Execute the task.
- Define the RSL file for the task.
  - Executable, arguments, working directory, etc.
- Create a GRAM job for each RSL file.
- Launch the job (batch mode).
  - Parallel execution of microtasks.
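An RSL job description for pre-WS GRAM is a conjunction of `(attribute=value)` relations prefixed with `&`. The sketch below builds such a string from the fields the slide names (executable, arguments, working directory). The helper names and the example paths are hypothetical, the quoting rule (double quotes, with embedded quotes doubled) is stated as an assumption, and the code only produces the text: it does not submit anything to GRAM.

```python
def rsl_quote(value):
    """Quote a value for RSL; assumes embedded double quotes are escaped by doubling."""
    return '"' + str(value).replace('"', '""') + '"'


def build_rsl(executable, arguments=(), directory=None, stdout=None, stderr=None):
    """Return an RSL string describing one microtask."""
    relations = [f"(executable={rsl_quote(executable)})"]
    if arguments:
        quoted = " ".join(rsl_quote(arg) for arg in arguments)
        relations.append(f"(arguments={quoted})")
    if directory is not None:
        relations.append(f"(directory={rsl_quote(directory)})")
    if stdout is not None:
        relations.append(f"(stdout={rsl_quote(stdout)})")
    if stderr is not None:
        relations.append(f"(stderr={rsl_quote(stderr)})")
    return "&" + "".join(relations)


# Hypothetical microtask: paths and file names are illustrative only.
rsl = build_rsl(
    "/home/user/coreg",
    ["base.img", "s1.img"],
    directory="/tmp/wings/exec",
)
print(rsl)
# &(executable="/home/user/coreg")(arguments="base.img" "s1.img")(directory="/tmp/wings/exec")
```

Generating one such string per microtask gives one GRAM job per RSL description, which is how the slide describes the parallel launch.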

GLOBUS PLUG-IN
Step 4: Get the output data.
- Get the output data from the output directory.
  - Wildcards are used to filter files.
- Create a replica of the results in a specified location.
  - The path is specified in the data source definition.
- Clean the intermediate data.
  - A function was implemented to delete directories recursively.
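The two helper ideas in this step, wildcard filtering and recursive directory deletion, can be shown on a local file system. This is a local analogue only: the plug-in works on remote directories through GridFTP, which is not reproduced here, and the function names and file names are hypothetical.

```python
import fnmatch
import os
import tempfile


def filter_outputs(filenames, pattern):
    """Keep only the file names matching a shell-style wildcard pattern."""
    return sorted(fnmatch.filter(filenames, pattern))


def delete_recursively(path):
    """Delete a directory tree depth-first: contents first, then the directory."""
    for entry in os.listdir(path):
        full = os.path.join(path, entry)
        if os.path.isdir(full) and not os.path.islink(full):
            delete_recursively(full)
        else:
            os.remove(full)
    os.rmdir(path)


# Build a small throwaway tree standing in for the intermediate directories.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "output", "tmp"))
for name in ("out_1.img", "out_2.img", "run.log"):
    with open(os.path.join(root, "output", name), "w") as handle:
        handle.write("data")

files_only = [
    name
    for name in os.listdir(os.path.join(root, "output"))
    if os.path.isfile(os.path.join(root, "output", name))
]
selected = filter_outputs(files_only, "out_*.img")
print(selected)  # ['out_1.img', 'out_2.img']

delete_recursively(root)
print(os.path.exists(root))  # False
```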

USE CASE
- A biomedical application that runs a medical image co-registration process (rigid and elastic).
- The co-registration processes compare all the images with the base study and align the voxels of each study so that it is as similar as possible to the reference image.
- The input data are dynamic series of 3D magnetic resonance images acquired after the injection of a contrast bolus in the abdominal area, to study the perfusion of the liver.
- The set is composed of 5 studies with 12 slices each.

USE CASE Biomedical Application
The workflow is composed of three steps:
1. Rigid co-registration.
2. Elastic co-registration (the most CPU-consuming step).
3. A process that transposes the co-registration results from N studies (with K slices) into K studies with N slices.
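The third step is a transposition: if the results are viewed as N lists of K slices, the output is K lists of N slices. A minimal sketch, assuming each study is simply a list of slice identifiers (the identifiers below are made up), using N = 5 and K = 12 as in the use case:

```python
def transpose_studies(studies):
    """Turn N studies of K slices into K studies of N slices."""
    if len({len(study) for study in studies}) > 1:
        raise ValueError("all studies must have the same number of slices")
    # zip(*studies) groups the k-th slice of every study together.
    return [list(group) for group in zip(*studies)]


N, K = 5, 12
studies = [[f"study{n}_slice{k}" for k in range(K)] for n in range(N)]
fused = transpose_studies(studies)
print(len(fused), len(fused[0]))  # 12 5
print(fused[3][2])                # study2_slice3
```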

USE CASE Execution Times
CPU use time per execution phase, in minutes (') and seconds ('').

Configuration | Rigid CoReg | Elastic CoReg | Fuse Slices | Execution Total | Clean Op. | Overhead | Total
Fura | 2'' (F1), 2'' (F2), 2'' (F1), 2'' (F2) | 40' 16'' (F1), 43' 43'' (F2), 40' 05'' (F1), 41' 23'' (F2) | 1'' (F1) | 84' 6'' | 4'' | 55'' | 85' 5''
Globus | 4'' (GN) | 50' 23'' (GN), 50' 26'' (GN), 51' 39'' (GN), 50' 24'' (GN) | 5'' (GN) | 102' 14'' | 1' 28'' | 3' 28'' | 107' 10''
Mixed Fura/Globus | 2'' (F1), 2'' (F2), 2'' (F1), 2'' (F2) | 39' 38'' (F1), 41' 32'' (F2), 50' 56'' (GN), 51' 09'' (GN) | 1'' (F1) | 51' 12'' | 23'' | 2' 25'' | 54''

F1: AMD Opteron 2.4 GHz with 1 GB of RAM (Fura Agent)
F2: AMD Opteron 2.2 GHz with 1 GB of RAM (Fura Agent)
GN: Pentium Xeon 2 GHz with 512 MB of RAM (Globus Node)
Network: Gigabit Ethernet
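The final total of each row can be checked by adding the execution total, the clean operation and the overhead. The short calculation below does that with the figures from the slide; the helper names are hypothetical. For Fura and Globus the sums reproduce the reported totals (85' 05'' and 107' 10''). For the mixed configuration the sum is 54' 00'', which suggests that the slide's final "54" is in minutes rather than seconds.

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


def fmt(total_seconds):
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}' {seconds:02d}''"


# (execution total, clean operation, overhead) as (minutes, seconds) pairs.
rows = {
    "Fura": [(84, 6), (0, 4), (0, 55)],
    "Globus": [(102, 14), (1, 28), (3, 28)],
    "Mixed Fura/Globus": [(51, 12), (0, 23), (2, 25)],
}

totals = {
    name: fmt(sum(to_seconds(m, s) for m, s in parts))
    for name, parts in rows.items()
}
for name, total in totals.items():
    print(f"{name}: {total}")
# Fura: 85' 05''
# Globus: 107' 10''
# Mixed Fura/Globus: 54' 00''
```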

CONCLUSIONS
- We have analyzed previous work; some of it has good features but does not fit our needs.
- WINGS has been designed in a modular way, so that new components can be added to the system through a plug-in.
- We have implemented a Globus plug-in oriented to GT middleware.
- Currently, plug-ins for Fura, Globus Toolkit (pre-WS services) and sub-workflow execution have been developed, making it possible to launch cross-middleware tests with the two grid systems mentioned.

Thanks for your attention!