INFSO-RI-508833 Enabling Grids for E-sciencE CREAM, WMS integration and possible deployment scenarios Massimo Sgaravatto – INFN Padova.

Slides:



Advertisements
Similar presentations
The gLite API – PART I Giuseppe LA ROCCA INFN Catania ACGRID-II School 2-14 November 2009 Kuala Lumpur - Malaysia.
Advertisements

INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
INFSO-RI Enabling Grids for E-sciencE CREAM: a WebService based CE Massimo Sgaravatto INFN Padova On behalf of the JRA1 IT-CZ Padova.
DataGrid WP1 Massimo Sgaravatto INFN Padova. WP1 (Grid Workload Management) Objective of the first DataGrid workpackage is (according to the project "Technical.
INFSO-RI Enabling Grids for E-sciencE Workload Management System Mike Mineter
INFSO-RI Enabling Grids for E-sciencE DAGs with data placement nodes: the “shish-kebab” jobs Francesco Prelz Enzo Martelli INFN.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Security and Job Management.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Provenance Challenge gLite Job Provenance.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Angela Poschlad (PPS-FZK), Antonio Retico.
INFSO-RI Enabling Grids for E-sciencE The gLite Workload Management System Elisabetta Molinari (INFN-Milan) on behalf of the JRA1.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CREAM and ICE Massimo Sgaravatto – INFN Padova.
Conference name Company name INFSOM-RI Speaker name The ETICS Job management architecture EGEE ‘08 Istanbul, September 25 th 2008 Valerio Venturi.
INFSO-RI Enabling Grids for E-sciencE Scenarios for Integrating Data and Job Scheduling Peter Kunszt On behalf of the JRA1-DM Cluster,
WP1 WMS rel. 2.0 Some issues Massimo Sgaravatto INFN Padova.
EGEE is a project funded by the European Union under contract INFSO-RI Practical approaches to Grid workload management in the EGEE project Massimo.
Enabling Grids for E-sciencE The gLite Workload Management System Alessandro Maraschini OGF20, Manchester,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CREAM Massimo Sgaravatto – INFN Padova On.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMPROXY usage Álvaro Fernández IFIC (CSIC)
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks New Authorization Service Christoph Witzig,
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
INFSO-RI Enabling Grids for E-sciencE EGEE is a project funded by the European Union under contract IST Job sandboxes.
EGEE-II INFSO-RI Enabling Grids for E-sciencE gLite and Condor present and future Claudio Grandi (INFN – Bologna)
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The LCG interface Stefano BAGNASCO INFN Torino.
Status of Globus activities Massimo Sgaravatto INFN Padova for the INFN Globus group
Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Practical using WMProxy advanced job submission.
EGEE 3 rd conference - Athens – 20/04/2005 CREAM JDL vs JSDL Massimo Sgaravatto INFN - Padova.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite – UNICORE interoperability Daniel Mallmann.
WP1 Status and plans Francesco Prelz, Massimo Sgaravatto 4 th EDG Project Conference Paris, March 6 th, 2002.
INFSO-RI Enabling Grids for E-sciencE gLite Test and Certification Effort Nick Thackray CERN.
INFSO-RI Enabling Grids for E-sciencE File Transfer Software and Service SC3 Gavin McCance – JRA1 Data Management Cluster Service.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Job Management Claudio Grandi.
INFSO-RI Enabling Grids for E-sciencE Status of BLAH Francesco Prelz ( ) JRA1 All-Hands meeting, Nicosia,
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CREAM: current status and next steps EGEE-JRA1.
Enabling Grids for E-sciencE Work Load Management & Simple Job Submission Practical Shu-Ting Liao APROC, ASGC EGEE Tutorial.
Enabling Grids for E-sciencE Claudio Cherubino INFN DGAS (Distributed Grid Accounting System)
CE design report Luigi Zangrando
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
EGEE is a project funded by the European Union under contract IST Padova report Massimo Sgaravatto On behalf of the INFN Padova JRA1 Group.
Enabling Grids for E-sciencE EGEE-III INFSO-RI EGEE and gLite are registered trademarks Francesco Giacomini JRA1 Activity Leader.
CREAM Status and plans Massimo Sgaravatto – INFN Padova
EGEE-II INFSO-RI Enabling Grids for E-sciencE IT cluster activity status (Status of WMS & CE) Francesco Prelz – IT.
FESR Trinacria Grid Virtual Laboratory Practical using WMProxy advanced job submission Emidio Giorgio INFN Catania.
Argus EMI Authorization Integration
Resource access in the EGEE project Massimo Sgaravatto INFN Padova
Practical using C++ WMProxy API advanced job submission
Workload Management Workpackage
CEMon
JRA1 IT-CZ cluster meeting Milano, May 3-4, 2004
CREAM and ICE Test Results
Security aspects of the CREAM-CE
Workload Management System ( WMS )
Summary on PPS-pilot activity on CREAM CE
Preview Testbed Massimo Sgaravatto – INFN Padova
and Alexandre Duarte OurGrid/EELA Interoperability Meeting
CREAM Status and Plans Massimo Sgaravatto – INFN Padova
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
gLite Middleware Status
Massimo Sgaravatto INFN Padova On behalf of the CREAM product team
Massimo Sgaravatto INFN Padova On behalf of the Padova CREAM group
ICE-CREAM Luigi Zangrando On behalf of the JRA1 IT-CZ Padova group
The CREAM CE: When can the LCG-CE be replaced?
Grid2Win: Porting of gLite middleware to Windows XP platform
BOSS: the CMS interface for job summission, monitoring and bookkeeping
Short update on the latest gLite status
TCG Discussion on CE Strategy & SL4 Move
Francesco Giacomini – INFN JRA1 All-Hands Nikhef, February 2008
Presentation transcript:

INFSO-RI Enabling Grids for E-sciencE CREAM, WMS integration and possible deployment scenarios Massimo Sgaravatto – INFN Padova for the JRA1 Padova group

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 2 CREAM CREAM service: Computing Resource Execution And Management service Simple, lightweight service for job management operation at the Computing Element (CE) level Web service interface Following and participating in efforts for standardizations –Job description language (JSDL), CE interface –In particular via OMII-Europe Implemented and maintained by the Padova Group of the EGEE JRA1 IT cluster –Same team developing and maintaining the CEMon and ICE services

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 3 CREAM: functionality Job submission –Submission of jobs to a CREAM based CE –Includes also staging of input sandbox files –Job characteristics described via a JDL (Job Description Language) expression  CREAM JDL is basically the same JDL used by the Glite WMS  Support of JSDL is on-going (see next presentation) –Supported job types  Simple, batch jobs  MPI jobs, as supported by the Glite WMS  Support for bulk jobs being integrated Proxy delegation –Possibility to automatically delegate a proxy for each job submission –Possibility to delegate a proxy, and then using it for multiple job submissions  Recommended approach wrt performance, since proxy delegation can be “expensive” –Same approach used in WMproxy

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 4 CREAM functionality Job cancellation –To cancel previously submitted jobs Job status –To get status and other info (e.g. creation/submission/start execution/job completion times, worker node, failure reason, e.g.) of submitted jobs –Also possible to apply filters on submission time and/or job status Job list –To get the identifiers of all your jobs Job suspension and job resume –To hold and then restart jobs

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 5 CREAM functionality Job purge –To clear a job from a CREAM based CE –Can be explicitly called by the client, or can be called via a cron job (e.g. to clean old jobs) Proxy renewal Disabling of new job submissions –Can be used only by CE admin e.g. if the CE has to be shutdown for maintenance –Other operations still allowed –Also possible to define policies on waiting/pending/running jobs to disable new job submissions  E.g. disable new submissions if the number of active jobs is > 3000

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 6 CREAM Architecture

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 7 CREAM usage scenario CREAM should be invoked: –By a generic client (e.g. an end-user willing to interact directly with the CE)  CREAM CLI exists  Also a Java client –Through the Glite WMS  ICE (Interface to Cream Environment) service on the WMS It has the role played by JC+LM+Condor in the submission to non-CREAM CEs  Condor is also working to support submission to CREAM WMS Direct Job Submission through the WMS CREAM

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 8 WMS-CREAM integration NS WMProxy FileList WM ICE MM JA FileList CEMon CREAM Helpers JC+LM Condor Submitter Job Status Handler LCG CEGlite CE WMS node

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 9 Job Status Changes A thread of ICE receives notifications about the job status changes from CEMon closely coupled with CREAM CE As a fail-safe mechanism, another thread is needed to poll the status of jobs still alive –If the relevant notifications are not received via CEMon ICE CREAM CEMon Job Status Handler Notifications sent by CEMon Periodic status polling

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 10 WMS CREAM integration Bulk jobs (DAGs, collections, parametric jobs) cannot be handled by ICE and therefore cannot be submitted to CREAM by the WMS right now This is because now all bulk jobs (DAGs, parametric jobs, job collections) are all represented into DAGs and managed by the Glite WMS via Condor DAGMan We are planning (also in the context of bulk matchmaking) to not use DAG representation anymore for parametric jobs and job collections –So Condor DAGMan would be used only for real DAGs So in this scenario: –Sequential and MPI jobs, “nodes” of a collection and “nodes” of a parametric job could be submitted indifferently to a Glite CE/LCG CE (via JC+LM+Condor) or to a CREAM CE (via ICE) –Nodes of a real DAG would be submitted to a Glite CE/LCG CE (via JC+LM+Condor) since managed by Condor DAGMan  Also to a CREAM CE, when Condor is able to submit to CREAM  Condor submission to CREAM can be used also for other environments where the Glite WMS is not used oE.g. OSG, in they are interested in

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 11 What is needed for a CREAM CE CREAM software Tomcat, Trustmanager, AuthZ framework CEMon and job plugin –Needed only for submissions via the Glite WMS BLAH –For submissions to the underlying LRMS Glexec –Used for credential mapping Gridftp server LCAS-LCMAPS enabled –Not needed if the CREAM CE is used only for submissions via the GLite WMS LCAS, LCMAPS –Used by Glexec and gridftpd Info providers and software needed for publication in the InfoSys –Same software used in the Glite CE: just a matter of configuration LB client –Needed only for submissions via the Glite WMS DGAS –Of course needed only if DGAS is used CREAM WN requires the same exact software of a WN of a Glite CE Detailed installation instructions available at the CREAM Web site:

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 12 CREAM CE and Glite CE Many of these software components used also in the Glite CE 3.0 Could a same machine run a CREAM CE and a Glite CE at the same time ? –Yes, but:  For some components (LCAS, LCMAPS, BLAH) software releases post Glite 3.0 are needed (Glite v. 3.1 should be considered)  We wouldn’t recommend this scenario, since performance could be impacted The load (CPU, memory, etc.) introduced by the Glite CE could impact CREAM performance and vice versa Not tested: just hypotheses Instead we would recommend to have a Glite CE and a CREAM CE on two different machines, sharing the same set of WNs

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 13 BLAH Used by CREAM (and by the Glite CE) to interact with the underlying Local Resource Management System(s) –Multiple resource management systems could be used at the same time in CREAM All the resource management systems supported by BLAH are automatically supported by CREAM BLAH currently supports LSF and PBS/Torque –Support for Condor is on-going

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 14 Tests Testing and debugging right now performed in a small testbed in Padova Next slides show some test results for direct submissions to CREAM via CREAM CLI –CREAM CE and CREAM CLI both in Padova –CREAM CE: P3-1000x2 ; 1 GB RAM –Pretty old tests (March 04, 2006), sorry, but I was not aware of this presentation till last Saturday and so I was not able to collect newer numbers !  We are not doing worst now –Not able to collect numbers for submissions to CREAM through the GLite WMS (ICE) for the same reason  Good results (even if there are some problems being fixed) in particular in submission time to CREAM and in being able to promptly detect job status changes

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 15 Test results Submission of 250 jobs from X threads –Used a single delegation –ISB downloaded from a remote GridFTP server Submission of 250 jobs from a single thread –Overall time needed to submit the jobs: 257 s. - No failures Submission of 250 jobs from 2 threads (125 jobs per thread) –Overall time needed to submit the jobs (as average of the values of each thread): 186 s. - No failures Submission of 249 jobs from 3 threads (83 jobs per thread) –Overall time needed to submit the jobs: 158 s. - No failures Submission of 252 jobs from 4 threads (63 jobs per thread) –Overall time needed to submit the jobs: 157 s. - No failures Submission of 250 jobs from 5 threads (50 jobs per thread) –Overall time needed to submit the jobs: 146 s. - No failures

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 16 Submission of jobs from 5 threads ( jobs per thread) Used a single delegation ISB downloaded from a remote GridFTP server 2088 s. to submit 3000 jobs 1 failure in the submission of 3000 jobs

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 17 Testing in a larger environment Necessary to perform tests on a larger environment –4-5 CREAM installations at different sites –1 WMS ICE enabled (Glite 3.1) Functionality tests, stress tests, performance tests, reliability tests Preview testbed should be the right place –INFN Cert testbed as far as I can understand could be available basically now Plans –We will take care of the installations of the CREAM CEs and of the WMS  Standard Glite installers could be used for WorkerNodes –We will perform the first tests –Then we will open to interested users

Enabling Grids for E-sciencE INFSO-RI Pilsen, July , EGEE JRA1 all-hands meeting 18 Other info CREAM, ICE – CEMon – BLAH –