First ideas for a Resource Management Architecture for Productions Massimo Sgaravatto INFN Padova.

1 First ideas for a Resource Management Architecture for Productions Massimo Sgaravatto INFN Padova

2 First step
[Diagram: jobs are submitted (using globusrun) to three sites (Site1, Site2, Site3), each exposing GRAM on top of a local resource management system (Condor, LSF, PBS).]

3 Overview
- GRAM as a uniform interface to different resource management systems
- Job submission from a single location
- Users must explicitly specify on which Globus resources (Condor pool, LSF cluster, …) the jobs must be executed
- Usage of Globus tools (globusrun, globus-job-status, …) to “manage” the jobs
- Are these “robust” tools with all the required capabilities?

4 Usage examples
% globusrun -b -r lxpd.pd.infn.it/jobmanager-lsf -f file.rsl
file.rsl:
    & (executable=$(CMS)/startcmsim.sh)
      (stdin=$(CMS)/Pythia/run.1)
      (stdout=$(CMS)/Cmsim/log.1)
      (count=1)
      (queue=cmsprod)
% globusrun -b -r lxbo.bo.infn.it/jobmanager-condor -f file.rsl
file.rsl:
    & (executable=$(CMS)/startcmsim.sh)
      (stdin=$(CMS)/Pythia/run.1)
      (stdout=$(CMS)/Cmsim/log.1)
      (count=1)
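The two RSL files above differ only in the queue attribute, so they could be generated from one job description. A minimal Python sketch, assuming nothing beyond the attributes and globusrun flags shown on the slide; the helper names to_rsl and globusrun_command are hypothetical and not part of Globus:

```python
def to_rsl(attributes):
    """Render name/value pairs as a single-request RSL string ("& (a=b) (c=d)")."""
    relations = " ".join(f"({name}={value})" for name, value in attributes.items())
    return "& " + relations


def globusrun_command(contact, rsl_file):
    """Build the globusrun command line used on the slide (batch mode, RSL from file)."""
    return ["globusrun", "-b", "-r", contact, "-f", rsl_file]


# Attributes common to both examples on the slide.
job = {
    "executable": "$(CMS)/startcmsim.sh",
    "stdin": "$(CMS)/Pythia/run.1",
    "stdout": "$(CMS)/Cmsim/log.1",
    "count": 1,
}

# The LSF example additionally names a queue; the Condor example does not.
lsf_rsl = to_rsl({**job, "queue": "cmsprod"})
condor_rsl = to_rsl(job)

lsf_cmd = globusrun_command("lxpd.pd.infn.it/jobmanager-lsf", "file.rsl")
condor_cmd = globusrun_command("lxbo.bo.infn.it/jobmanager-condor", "file.rsl")

print(lsf_rsl)
print(" ".join(lsf_cmd))
```

The sketch only builds strings; writing the RSL to file.rsl and running the command are left to the caller.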

5 What has been tested so far
- http://www.pd.infn.it/~sgaravat/INFN-GRID/Globus/gram-report.pdf
- Tests only with simple programs (just to evaluate the capabilities and functionalities)
- No tests with “real” applications
- No “stress tests” (to evaluate reliability, robustness, …)
- GRAM – LSF: tested; seems to work

6 What has been tested so far
- GRAM – Condor: tested
  - GRAM assumes that the underlying environment is a “uniform” Condor pool (in particular for Vanilla jobs)
  - Difficult to treat the INFN WAN Condor pool as a Globus resource
  - Usage of local “uniform” Condor pools?
- GRAM – PBS: not tested

7 Second step
[Diagram: jobs are submitted (using condor_submit and the Globus Universe) to a Personal Condor, which reaches the GRAM interfaces at Site1, Site2 and Site3 (Condor, LSF, PBS) through globusrun.]

8 Overview
- Personal Condor able to provide robustness and reliability
- Job submission from a single location
- Users must still explicitly specify on which Globus resources the jobs must be executed
- Usage of the Condor interface and tools (condor_submit, condor_q, …) to “manage” the jobs
- “Robust” tools with all the required capabilities (monitoring, logging, …)

9 Usage examples
% condor_submit file.cnd
file.cnd:
    Universe        = globus
    executable      = $(CMS)/startcmsim.sh
    input           = $(CMS)/Pythia/run.1
    output          = $(CMS)/Cmsim/log.1
    GlobusScheduler = lxpd.pd.infn.it/jobmanager-lsf
    queue 1
% condor_submit file.cnd
file.cnd:
    Universe        = globus
    executable      = $(CMS)/startcmsim.sh
    input           = $(CMS)/Pythia/run.1
    output          = $(CMS)/Cmsim/log.1
    GlobusScheduler = lxbo.bo.infn.it/jobmanager-condor
    queue 1
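Since the two submit files differ only in the GlobusScheduler line, a small generator can produce one file per target resource. A minimal Python sketch that uses only the submit commands shown on the slide; the function name submit_description and the dictionary layout are illustrative assumptions, not part of Condor:

```python
def submit_description(executable, stdin, stdout, globus_scheduler, copies=1):
    """Return the text of a Globus Universe submit file like the ones on the slide."""
    lines = [
        "Universe = globus",
        f"executable = {executable}",
        f"input = {stdin}",
        f"output = {stdout}",
        f"GlobusScheduler = {globus_scheduler}",
        f"queue {copies}",
    ]
    return "\n".join(lines) + "\n"


# The two Globus resources used in the slide's examples.
schedulers = [
    "lxpd.pd.infn.it/jobmanager-lsf",
    "lxbo.bo.infn.it/jobmanager-condor",
]

# One submit description per resource; the user still picks the resource explicitly.
descriptions = {
    scheduler: submit_description(
        executable="$(CMS)/startcmsim.sh",
        stdin="$(CMS)/Pythia/run.1",
        stdout="$(CMS)/Cmsim/log.1",
        globus_scheduler=scheduler,
    )
    for scheduler in schedulers
}

print(descriptions["lxpd.pd.infn.it/jobmanager-lsf"])
```

Each generated text would be saved as file.cnd and passed to condor_submit, as in the examples above.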

10 Second step (option 2)
[Diagram labels: Submit jobs (using condor_submit and Globus Universe); Personal Condor; Flocking and condor_submit towards the Condor pool; globusrun towards GRAM/LSF and GRAM/PBS; Site1, Site2, Site3.]

11 Second step (option 3)
[Diagram labels: Submit jobs (using condor_submit and Globus Universe); Personal Condor; Single Condor Pool; condor_submit towards the Condor pool; globusrun towards GRAM/LSF and GRAM/PBS; Site1, Site2, Site3.]

12 Problems
- The Globus Universe architecture is only a prototype
- Only best-effort support from the Condor team
- Tests not completed
  - Ongoing tests use the fork system call as the underlying resource management system
  - Tests with the Globus Universe and LSF or Condor as the underlying resource management system have not yet been performed
- PBS
  - Is it supported by the Globus Universe mechanisms?
  - Do we need it?

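13 Third step
[Diagram: jobs are submitted to a Master, which performs resource discovery against the GIS and passes the jobs via condor_submit (Globus Universe) to a Personal Condor; globusrun then reaches the GRAM interfaces at Site1, Site2 and Site3 (Condor, LSF, PBS). The sites publish information on the characteristics and status of local resources to the GIS.]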

14 Overview
- The Master is smart enough to decide to which Globus resources the jobs must be submitted
- The Master uses the information on the characteristics and status of resources published in the GIS
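The slides do not say how the Master would choose a resource. As one possible illustration only, the Python sketch below picks the least-loaded resource that has enough free CPUs. The record fields (free_cpus, queued_jobs), the selection rule and the sample numbers are all assumptions made up for this example; they do not come from the GIS schema or from the presentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResourceInfo:
    """Hypothetical per-resource record, as the Master might read it from the GIS."""
    contact: str       # GRAM contact string, e.g. host/jobmanager-lsf
    free_cpus: int     # assumed attribute: currently idle CPUs
    queued_jobs: int   # assumed attribute: jobs waiting in the local queue


def choose_resource(resources: List[ResourceInfo], cpus_needed: int = 1) -> Optional[ResourceInfo]:
    """Pick the resource with the shortest queue among those with enough free CPUs."""
    candidates = [r for r in resources if r.free_cpus >= cpus_needed]
    if not candidates:
        return None
    # Fewest queued jobs first; ties are broken in favour of more free CPUs.
    return min(candidates, key=lambda r: (r.queued_jobs, -r.free_cpus))


# Made-up snapshot of published information, for illustration only.
snapshot = [
    ResourceInfo("lxpd.pd.infn.it/jobmanager-lsf", free_cpus=4, queued_jobs=10),
    ResourceInfo("lxbo.bo.infn.it/jobmanager-condor", free_cpus=12, queued_jobs=2),
]

chosen = choose_resource(snapshot)
print(chosen.contact if chosen else "no suitable resource")
```

The chosen contact string would then take the place of the GlobusScheduler value that users had to write by hand in the second step.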

15 Problems and work needed
- The Master doesn’t exist yet: we have to implement it
- It is necessary to define the GIS architecture
- The local GRAMs do not provide the GIS with enough information: the default schema must be extended

16 GRAM & Condor & GIS

17 GRAM & LSF & GIS

18 Fourth step
[Diagram: the third-step architecture (Master, GIS, Personal Condor, globusrun, GRAM over Condor, LSF and PBS at Site1, Site2 and Site3) extended with a Data Catalog and a Data Mover; the Master performs resource discovery against the GIS and data discovery against the Data Catalog.]

