Presentation transcript:

1/22 Scaling DRMAA Codes to the Grid: A Demonstration on EGEE, TG and OSG
Constantino Vázquez, Eduardo Huedo
Distributed Systems Architecture Research Group, Universidad Complutense de Madrid

2/22 Contents
1. The GridWay Metascheduler
2. The DRMAA standard and GridWay
3. GridWay approach to interoperability
4. The CD-HIT Application

3/22 1. The GridWay Metascheduler
What is GridWay?
GridWay is a Globus Toolkit component for meta-scheduling, creating a scheduler virtualization layer on top of Globus services (GRAM, MDS & GridFTP).
For project and infrastructure directors: GridWay is an open-source community project, adhering to Globus philosophy and guidelines for collaborative development.
For system integrators: GridWay is highly modular, allowing adaptation to different grid infrastructures, and supports several OGF standards.
For system managers: GridWay provides a scheduling framework similar to that found in local LRM systems, supporting resource accounting and the definition of state-of-the-art scheduling policies.
For application developers: GridWay implements the OGF standard DRMAA API (C, Java & more bindings), assuring compatibility of applications with LRM systems that implement the standard, such as SGE, Condor, Torque, ...
For end users: GridWay provides an LRM-like CLI for submitting, monitoring, synchronizing and controlling jobs, which can be described using the OGF standard JSDL.

4/22 1. The GridWay Metascheduler
Global Architecture of a Computational Grid
[Layered diagram: Applications (DRMAA codes in C/Java, results via the CLI) sit on top of the GridWay grid meta-scheduler, which runs over the Globus grid middleware and the underlying infrastructure (PBS and SGE clusters). Highlights: standard API (OGF DRMAA) and command line interface; open-source job execution management and resource brokering over Globus services; standard end-to-end interfaces (e.g. TCP/IP); a highly dynamic & heterogeneous infrastructure with a high fault rate; Application-Infrastructure decoupling.]

5/22 1. The GridWay Metascheduler
GridWay Internals
[Architecture diagram: the GridWay core (Request Manager, Dispatch Manager, Scheduler, Job Pool and Host Pool) is driven through the DRMAA library and the CLI. The Execution Manager handles job preparation, submission, monitoring, control, termination and migration against Grid Execution Services (pre-WS GRAM, WS GRAM); the Transfer Manager moves files through Grid File Transfer Services (GridFTP, RFT); the Information Manager performs resource discovery and monitoring through Grid Information Services (MDS2, MDS2/GLUE, MDS4).]

6/22 2. The DRMAA standard and GridWay
What is DRMAA?
Distributed Resource Management Application API, an Open Grid Forum standard.
Homogeneous interface to different Distributed Resource Managers (DRM): SGE, Condor, PBS/Torque, GridWay.
Bindings: C, Java, Perl (GW 5.2+), Ruby (GW 5.2+), Python (GW 5.2+).

7/22 2. The DRMAA standard and GridWay
C Binding
The native binding; all the others are wrappers around it.
Provides a dynamic library to link DRMAA applications against; linked applications will automatically run on a Grid offered by GridWay.

drmaa_run_job(job_id, DRMAA_JOBNAME_BUFFER-1, jt, error, DRMAA_ERROR_STRING_BUFFER-1);
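The call above is only one step of a DRMAA program. The following is a minimal sketch, assuming the standard DRMAA 1.0 C header (drmaa.h) and the library shipped with GridWay; the submitted command (/bin/ls) is just a placeholder.

/* Minimal DRMAA C sketch: open a session, submit a job, wait for it, clean up.
 * Compile against the GridWay DRMAA library, e.g.: gcc demo.c -ldrmaa */
#include <stdio.h>
#include "drmaa.h"

int main(void)
{
    char error[DRMAA_ERROR_STRING_BUFFER];
    char job_id[DRMAA_JOBNAME_BUFFER];
    char job_id_out[DRMAA_JOBNAME_BUFFER];
    drmaa_job_template_t *jt;
    int stat;

    /* Open the DRMAA session (GridWay is the underlying DRMS). */
    if (drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER-1) != DRMAA_ERRNO_SUCCESS) {
        fprintf(stderr, "drmaa_init() failed: %s\n", error);
        return 1;
    }

    /* Describe the job: here just the remote command to execute. */
    drmaa_allocate_job_template(&jt, error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "/bin/ls",
                        error, DRMAA_ERROR_STRING_BUFFER-1);

    /* Submit the job and block until it finishes (no rusage requested). */
    drmaa_run_job(job_id, DRMAA_JOBNAME_BUFFER-1, jt,
                  error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_wait(job_id, job_id_out, DRMAA_JOBNAME_BUFFER-1, &stat,
               DRMAA_TIMEOUT_WAIT_FOREVER, NULL,
               error, DRMAA_ERROR_STRING_BUFFER-1);
    printf("Job %s finished with status %d\n", job_id, stat);

    /* Release the template and close the session. */
    drmaa_delete_job_template(jt, error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER-1);
    return 0;
}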

8/22 2. The DRMAA standard and GridWay
Java Binding
Uses the Java Native Interface (JNI): it performs calls to the C library to do the actual work.
Two versions of the DRMAA spec; not yet officially recommended by OGF.

session.runJob(jt);

9/22 2. The DRMAA standard and GridWay
Ruby Binding
SWIG: C/C++ wrapper generator for scripting languages and Java.
SWIG binding for Ruby developed by dsa-research.org.

(result, job_id, error) = drmaa_run_job(jt)

10/22 2. The DRMAA standard and GridWay
Python Binding
SWIG binding developed by a 3rd party. Author: Enrico Sirola. License: GPL --> external download.

(result, job_id, error) = drmaa_run_job(jt)

Perl Binding
SWIG binding developed by a 3rd party. Author: Tim Harsch. License: GPL --> external download.

($result, $job_id, $error) = drmaa_run_job($jt);

11/22 3. GridWay Approach to Interoperability
Definitions (by OGF GIN-CG)
Interoperability: the native ability of Grids and Grid technologies to interact directly via common open standards in the near future. A rather long-term solution within production e-Science infrastructures. GridWay provides support for established standards: DRMAA, JSDL, WSRF...
Interoperation: what needs to be done to get production Grid and e-Science infrastructures to work together as a short-term solution. Two alternatives:
- Adapters: "a device that allows one system to connect to and work with another". The middleware/tools must be changed to insert the adapter.
- Gateways: adapters implemented as a service. No need to change the middleware/tools.
GridWay provides both adapters (Middleware Access Drivers, MADs) and a gateway (GridGateWay, a WSRF GRAM service encapsulating GridWay).

12/22 3. GridWay Approach to Interoperability
How do we achieve interoperability?
By using adapters: a device that allows one system to connect to and work with another.
[Diagram: users and applications submit to GridWay, which uses pre-WS / WS GRAM and gLite adapters to reach the middleware of each (virtual) organization and, through it, the SGE and PBS clusters of the underlying infrastructures.]

13/22 3. GridWay Approach to Interoperability
EGEE
Middleware: The Enabling Grids for E-sciencE European Commission funded project brings together scientists and engineers from more than 240 institutions in 45 countries world-wide to provide a seamless Grid infrastructure for e-Science that is available to scientists 24 hours a day.
Interoperability issues:
- Execution Manager driver for pre-WS GRAM.
- Different data staging philosophy: cannot stage to the front node, and the execution node is not known beforehand. SOLUTION: a wrapper.
- Virtual Organization support.

14/22 3. GridWay Approach to Interoperability
Open Science Grid
Middleware: The Open Science Grid brings together distributed, peta-scale computing and storage resources into a uniform shared cyberinfrastructure for large-scale scientific research. It is built and operated by a consortium of universities, national laboratories, scientific collaborations and software developers.
Interoperability issues:
- MDS2 information does not provide queue information: static monitoring.
- Globus container running on a non-standard port: MAD modification.

15/22 3. GridWay Approach to Interoperability
TeraGrid
Middleware: TeraGrid is an open scientific discovery infrastructure combining leadership-class resources at eleven partner sites to create an integrated, persistent computational resource.
Interoperability issues:
- Separate Staging Element and Working Node, with shared homes: use of SE_HOSTNAME and a mix of static and dynamic data.
- Support for raw RSL extensions, to bypass GRAM and pass information to the DRMS.

16/22 4. The CD-HIT Application
Application Description
CD-HIT (Cluster Database at High Identity with Tolerance) performs protein (and also DNA) clustering: it compares protein database entries and eliminates redundancies.
Example: used in UniProt for generating the UniRef data sets. UniProt is the world's most comprehensive catalog of information on proteins; the CD-HIT program is used to generate the UniRef reference data sets UniRef90 and UniRef50, and is also used at the PDB to treat redundant sequences.
Our case: widely used at the Spanish National Oncology Research Center (CNIO). Input DB: 504,876 proteins / 435 MB. Infeasible to execute on a single machine because of memory requirements and total execution time.

17/22 4. The CD-HIT Application
CD-HIT Parallel
Execute cd-hit in parallel mode. Idea: divide the input database and compare each division in parallel.
- Divide the input DB.
- Repeat: cluster the first remaining division (cd-hit), then compare the others against it (cd-hit-2d).
- Merge the results.
This speeds up the process and makes it possible to deal with larger databases.
Computational characteristics: variable degree of parallelism; the grain must be adjusted.
[Diagram: the database DB is divided into A, B, C, D; A is clustered into A90 while B-A, C-A, D-A are computed, then B90 with C-AB and D-AB, then C90 with D-ABC, then D90; the partial results are merged into DB90.]

18/22 4. The CD-HIT Application
[Diagram: cd-hit-div splits the database on the front-end; the comparison tasks (B-A, C-A, D-A, C-AB, D-AB, D-ABC) run through PBS and produce A90, B90, C90, D90, which are merged back on the front-end.]
Database division/merging is performed on the front-end. Several different structures are needed to invoke the underlying DRMS: PBS, SGE and ssh.

19/22 4. The CD-HIT Application
[Diagram: the same workflow (cd-hit-div, comparison tasks, merge) now submits through GridWay, which acts as the DRMS and forwards jobs to your local cluster, EGEE, TG and OSG.]
Merge sequential tasks to reduce overhead. Provide a uniform interface (DRMAA) to interact with the different DRMS; some file manipulation is still needed. A sketch of such a DRMAA-driven loop is shown below.
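The slides do not include the driver code itself; the following is a hedged sketch (not the authors' implementation) of how the per-round cd-hit-2d comparisons could be submitted in parallel through the DRMAA C binding and synchronized before merging. The division count, file names and the cd-hit-2d wrapper script name are assumptions made for illustration.

/* Sketch of the CD-HIT parallel loop over DRMAA (GridWay as the DRMS).
 * Round i compares every remaining division against division i, in parallel,
 * and waits for the whole round before continuing. Illustrative only. */
#include <stdio.h>
#include "drmaa.h"

#define NDIV 10   /* number of database divisions (assumption) */

static void submit_cdhit2d(const char *ref, const char *div, char *job_id)
{
    char error[DRMAA_ERROR_STRING_BUFFER];
    drmaa_job_template_t *jt;
    const char *args[] = { ref, div, NULL };

    drmaa_allocate_job_template(&jt, error, DRMAA_ERROR_STRING_BUFFER-1);
    /* "cd-hit-2d.sh" is a hypothetical wrapper that runs cd-hit-2d remotely. */
    drmaa_set_attribute(jt, DRMAA_REMOTE_COMMAND, "cd-hit-2d.sh",
                        error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_set_vector_attribute(jt, DRMAA_V_ARGV, args,
                               error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_run_job(job_id, DRMAA_JOBNAME_BUFFER-1, jt,
                  error, DRMAA_ERROR_STRING_BUFFER-1);
    drmaa_delete_job_template(jt, error, DRMAA_ERROR_STRING_BUFFER-1);
}

int main(void)
{
    char error[DRMAA_ERROR_STRING_BUFFER];
    char job_id[DRMAA_JOBNAME_BUFFER];
    const char *all_jobs[] = { DRMAA_JOB_IDS_SESSION_ALL, NULL };
    char ref[32], div[32];
    int i, j;

    drmaa_init(NULL, error, DRMAA_ERROR_STRING_BUFFER-1);

    for (i = 0; i < NDIV - 1; i++) {
        /* Division i is assumed already clustered (cd-hit on the front-end). */
        for (j = i + 1; j < NDIV; j++) {
            snprintf(ref, sizeof(ref), "db.%d", i);
            snprintf(div, sizeof(div), "db.%d", j);
            submit_cdhit2d(ref, div, job_id);
            printf("Submitted %s vs %s as job %s\n", div, ref, job_id);
        }
        /* Wait for every job of this round before starting the next one. */
        drmaa_synchronize(all_jobs, DRMAA_TIMEOUT_WAIT_FOREVER, 1,
                          error, DRMAA_ERROR_STRING_BUFFER-1);
    }
    /* The merge of the resulting *90 files would be done on the front-end here. */

    drmaa_exit(error, DRMAA_ERROR_STRING_BUFFER-1);
    return 0;
}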

20/22 4. The CD-HIT Application
Running with 10 divisions
Using the previous set-up on TG, EGEE, OSG and the UCM local cluster.

21/22 4. The CD-HIT Application
Job States - Running with 14 divisions

22/22 QUESTIONS? Thank you for your attention!