gLite Advanced Job Management

gLite Advanced Job Management
Elisa Ingrà – Consortium GARR
Africa 3 2010 - Joint EUMEDGRID-Support/EPIKH School for Application Porting
Algiers, 4-16 July 2010

Long Lifetime Proxy
Do you need a long-lifetime proxy for your application? Use MyProxy. How?
- Create a long-lived proxy: myproxy-init -d -n -s changeme-myproxyserver
    -d             Use the proxy certificate subject (DN) as the default username
    -n             Don't prompt for a passphrase
    -s <hostname>  Hostname of the myproxy-server
- Create a normal proxy with VO extensions: voms-proxy-init --voms eumed/gilda
- In the JDL file, specify the MyProxy server name: MyProxyServer = "changeme-myproxyserver";
- Submit your job.
For this tutorial you can use myproxy.ct.infn.it as MyProxyServer.

Outline
- WMProxy Overview
- Special Jobs
    - DAG jobs
    - Job collections
    - Parametric jobs
- MPI Jobs

WMProxy Overview
WMProxy is a service providing access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface.
- It has been designed to handle a large number of job submission requests: gLite 3.0 takes ~180 secs for 500 jobs, and the short-term goal is ~60 secs for 1000 jobs.
- It provides additional features such as bulk submission and support for shared and compressed sandboxes for compound jobs.
- It currently co-exists with the WMS.

New request types
Support for the new types relies strongly on newly developed JDL converters and on the DAG submission support:
- all JDL conversions are performed on the server
- a single submission covers several jobs
All new request types can be monitored and controlled through a single request id:
- each sub-job can, however, be followed up and controlled independently through its own id
"Smarter" WMS client commands/API:
- allow submission of DAGs, collections and parametric jobs exploiting the concept of "shared sandbox"
- allow automatic generation and submission of collections and DAGs from sets of JDL files located in user-specified directories on the UI
Notes: You create one JDL for the different jobs, and it is converted into several jobs. Each job can be controlled independently through its own id. The APIs provided let you submit DAGs, collections and parametric jobs, exploiting the concept of shared sandboxes.

Outline
- Special Jobs
    - DAG jobs
    - Job collections
    - Parametric jobs

DAG jobs
- A DAG job is a set of jobs where the input, output, or execution of one or more jobs can depend on other jobs.
- Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs and the edges identify the dependencies.
[Figure: example DAG with nodes nodeA, nodeB, nodeC, nodeD, nodeE]

DAG -- JDL structure
- The JDL description of a DAG must not contain any OutputSandbox attribute other than the ones in the descriptions of the nodes. Do not specify OutputSandbox at the DAG level: the OutputSandbox of the DAG is to be considered the sum of the output sandboxes of all its nodes.
- max_nodes_running: the maximum number of resources involved in running the job.
- The most important attribute of a DAG job is the nodes attribute, in which you specify the sub-jobs of the DAG and their dependencies.

Attribute: InputSandbox
- All nodes that do not contain the InputSandbox attribute in their descriptions inherit its value from the one specified for the DAG.
- Nodes representing jobs without an InputSandbox have to contain the following specification in their description (an empty InputSandbox list):

    [
      …………….
      InputSandbox = {};
      ……………
    ]

Attribute: Nodes
The nodes attribute is composed of a set of attributes, one per sub-job, each specifying the JDL of that sub-job. You can specify the JDL of a sub-job either by reference to a file or with an explicit description.

Attribute: File
- The File attribute is a string representing the path on the local file system of a file containing the JDL description of a job.
- This kind of representation can only be used when submitting to the WMS through a client (glite-wms-job-submit) that is able to resolve the path locally and to expand the JDL with the full description before passing it to the WMS.
- The File attribute cannot be specified together with the Description attribute within the same node description.

Attribute: Dependencies
With the Dependencies attribute you specify the dependencies between nodes. In the sample on the slide, the sub-job nodefilename2 depends on nodefilename1, so nodefilename2 can start running only when nodefilename1 is done.

DAG jdl

    [
      type = "dag";
      max_nodes_running = 4;
      nodes = [
        nodeA = [ file = "nodes/nodeA.jdl"; ];
        nodeB = [ file = "nodes/nodeB.jdl"; ];
        nodeC = [ file = "nodes/nodeC.jdl"; ];
        nodeD = [ file = "nodes/nodeD.jdl"; ];
        dependencies = {
          { nodeA, nodeB },
          { nodeA, nodeC },
          { { nodeB, nodeC }, nodeD }
        };
      ];
    ]

This is a complete DAG job composed of 4 sub-jobs. In this JDL the sub-jobs are specified with a file name, but it is also possible to insert the description of a sub-job directly in the JDL.
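The dependency list above can be read as "parents before children". The following Python sketch is only an illustration of that reading, not part of gLite: it uses the standard-library graphlib module to group the four nodes into waves that could run concurrently. The function name execution_waves and the tuple encoding of each {parents, children} pair are assumptions made for this example.

```python
from graphlib import TopologicalSorter

# Each entry mirrors one {parents, children} pair of the JDL dependencies list.
# This tuple encoding is a convention of this sketch, not a gLite format.
dependencies = [
    (["nodeA"], ["nodeB"]),
    (["nodeA"], ["nodeC"]),
    (["nodeB", "nodeC"], ["nodeD"]),
]


def execution_waves(nodes, deps):
    """Group nodes into waves; a node appears only after all its parents."""
    sorter = TopologicalSorter({name: set() for name in nodes})
    for parents, children in deps:
        for child in children:
            sorter.add(child, *parents)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(["nodeA", "nodeB", "nodeC", "nodeD"], dependencies)
print(waves)  # [['nodeA'], ['nodeB', 'nodeC'], ['nodeD']]
```

The middle wave shows why max_nodes_running matters: nodeB and nodeC have no mutual dependency, so both become eligible as soon as nodeA is done.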

Job Collection
- A job collection is a set of independent jobs that the user wants to submit and monitor as a single request.
- Jobs of a collection are submitted as DAG nodes without dependencies.
- The JDL is a list of classads, each describing one sub-job:

    [
      Type = "collection";
      VirtualOrganisation = "gilda";
      nodes = {
        [ <job descr 1> ],
        [ <job descr 2> ],
        …
      };
    ]

A collection is a special DAG without dependencies. In this case you have to set the Type attribute to collection. The nodes attribute is the same as the nodes attribute of a DAG job, but without the dependencies attribute.

Input Sandboxes
The InputSandbox can contain:
- file paths on the UI machine (i.e. the usual way)
- URIs pointing to files on a remote gridFTP/HTTPS server
- pointers to other files within the DAG/collection
Only local files (file://) are uploaded to the WMS node. Files pointed to by URIs are downloaded directly to the WN by the JobWrapper just before the job is started.

    InputSandbox = {
      "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
      root.nodes.nodeA.description.OutputSandbox[0],
      "file:///home/pacio/myconf"
    };
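To make the local/remote distinction concrete, here is a small Python sketch that sorts quoted InputSandbox entries by URI scheme. It is a hypothetical helper written for this transcript, not a gLite API; the scheme list (gsiftp, https) is taken from the slide, and unquoted references such as root.nodes.nodeA.description.OutputSandbox[0] are outside its scope.

```python
from urllib.parse import urlparse

# Schemes named on the slide as remote sources (assumption: no others matter here).
REMOTE_SCHEMES = ("gsiftp", "https")


def classify_sandbox_entry(entry):
    """Return 'local', 'remote' or 'unknown' for one quoted InputSandbox string."""
    scheme = urlparse(entry).scheme
    if scheme in REMOTE_SCHEMES:
        return "remote"   # fetched on the WN by the JobWrapper
    if scheme in ("", "file"):
        return "local"    # uploaded from the UI to the WMS node
    return "unknown"


entries = [
    "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
    "file:///home/pacio/myconf",
    "date.sh",
]
kinds = {entry: classify_sandbox_entry(entry) for entry in entries}
for entry, kind in kinds.items():
    print(f"{kind:7} {entry}")
```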

Output Sandboxes
- The OutputSandbox attribute lists the job output files; OutputSandboxDestURI lists their destinations.
- A base URI to be applied to all sandbox files can also be specified.
- When the job has completed execution, the JobWrapper copies the files to the specified destination without passing through the WMS node.

    OutputSandbox = {
      "jobOutput", "run1/event1", "jobError"
    };
    OutputSandboxDestURI = {
      "gsiftp://matrix.datamat.it/var/jobOutput",
      "https://grid003.ct.infn.it:8443/home/cms/event1",
      "gsiftp://matrix.datamat.it/var/jobError"
    };
    OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/";

Job collection example

    [
      type = "collection";
      # All nodes will share this InputSandbox
      InputSandbox = {"date.sh"};
      RetryCount = 3;
      nodes = {
        [ file = "jobs/job1.jdl"; ],
        [
          Executable = "/bin/sh";
          Arguments = "date.sh";
          StdOutput = "date.out";
          StdError = "date.err";
          OutputSandbox = {"date.out", "date.err"};
        ],
        [ file = "jobs/job3.jdl"; ]
      };
    ]
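When a collection has many nodes that simply point at JDL files, the collection JDL can be generated rather than typed. The Python sketch below is one possible way to do that; collection_jdl is a hypothetical helper invented for this example, and it emits only the attributes shown on the slide (type, InputSandbox, nodes).

```python
def collection_jdl(node_files, shared_input_sandbox):
    """Return JDL text for a collection whose nodes reference external JDL files."""
    sandbox = ", ".join(f'"{name}"' for name in shared_input_sandbox)
    nodes = ",\n".join(f'    [ file = "{path}"; ]' for path in node_files)
    return (
        "[\n"
        '  type = "collection";\n'
        f"  InputSandbox = {{{sandbox}}};\n"
        "  nodes = {\n"
        f"{nodes}\n"
        "  };\n"
        "]\n"
    )


jdl_text = collection_jdl(["jobs/job1.jdl", "jobs/job3.jdl"], ["date.sh"])
print(jdl_text)
```

The generated text would still need to be saved to a file and submitted with the usual client; that step is left out here.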

Parametric Job
- A parametric job is a job where one or more of its attributes are parameterized.
- The values of the attributes vary according to a parameter.
- Job monitoring and management is always done through a unique jobID, as if the job were a single one (see submission of a collection).

    [
      JobType = "Parametric";
      Executable = "/bin/sh";
      Arguments = "md5.sh input_PARAM_.txt";
      InputSandbox = {"md5.sh", "input_PARAM_.txt"};
      StdOutput = "out_PARAM_.txt";
      StdError = "err_PARAM_.txt";
      Parameters = 4;
      ParameterStart = 1;
      ParameterStep = 1;
      OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
    ]

Parametric job
- The parameter can be either a number or a list of items (typically strings, but not enclosed within double quotes).
- The InputSandbox (if present) has to be consistent with the parameters.

    [ui-test] /home/giorgio/param > cat param2.jdl
    [
      JobType = "Parametric";
      Executable = "/bin/cat";
      Arguments = "input_PARAM_.txt";
      InputSandbox = "input_PARAM_.txt";
      StdOutput = "myoutput_PARAM_.txt";
      StdError = "myerror_PARAM_.txt";
      Parameters = {EARTH,MOON,MARS};
      OutputSandbox = {"myoutput_PARAM_.txt"};
    ]
    [ui-test] /home/giorgio/param > ls
    inputEARTH.txt inputMARS.txt inputMOON.txt param2.jdl

Here Parameters is the list of values the parameter must take.
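The directory listing above suggests that the whole token _PARAM_, underscores included, is replaced by each value, so input_PARAM_.txt becomes inputEARTH.txt. The Python sketch below mimics that substitution on a dictionary version of the slide's JDL. It is an illustration only; expand_parametric and the dictionary layout are assumptions of this example, and the real conversion happens on the WMS side.

```python
def expand_parametric(attributes, values, token="_PARAM_"):
    """Return one attribute dict per value, with the token replaced in every string."""

    def substitute(item, value):
        if isinstance(item, str):
            return item.replace(token, str(value))
        if isinstance(item, list):
            return [substitute(element, value) for element in item]
        return item

    return [
        {name: substitute(item, value) for name, item in attributes.items()}
        for value in values
    ]


template = {
    "Executable": "/bin/cat",
    "Arguments": "input_PARAM_.txt",
    "InputSandbox": ["input_PARAM_.txt"],
    "StdOutput": "myoutput_PARAM_.txt",
    "StdError": "myerror_PARAM_.txt",
    "OutputSandbox": ["myoutput_PARAM_.txt"],
}
jobs = expand_parametric(template, ["EARTH", "MOON", "MARS"])
for job in jobs:
    print(job["Arguments"], "->", job["StdOutput"])
```

Running it prints one line per sub-job, for example inputEARTH.txt -> myoutputEARTH.txt, which matches the input files present in the directory.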

Parametric job: the parameter is a number

    [
      JobType = "Parametric";
      Executable = "myjob.exe";
      StdInput = "input_PARAM_.txt";
      StdOutput = "output_PARAM_.txt";
      StdError = "error_PARAM_.txt";
      Parameters = 100;
      ParameterStart = 1;
      ParameterStep = 1;
      InputSandbox = {"myjob.exe", "input_PARAM_.txt"};
      OutputSandbox = {"output_PARAM_.txt", "error_PARAM_.txt"};
    ]

- Parameters: the parameter is a number.
- ParameterStart: the initial value of the running parameter.
- ParameterStep: the increment of the running parameter between consecutive jobs.
Both attributes, ParameterStart and ParameterStep, can be set only if Parameters is a number.
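The slide does not say exactly which values a numeric parameter takes. The Python sketch below rests on one assumption that should be checked against the JDL attributes specification: Parameters acts as an exclusive upper bound, so the number of generated jobs is (Parameters - ParameterStart) / ParameterStep. The function name numeric_parameter_values is hypothetical.

```python
def numeric_parameter_values(parameters, start=0, step=1):
    """List the values of the running parameter for a numeric Parameters attribute.

    ASSUMPTION (not stated on the slide): `parameters` is an exclusive upper
    bound, so the job count is (parameters - start) / step. Verify this
    against the JDL attributes specification before relying on it.
    """
    if step <= 0:
        raise ValueError("ParameterStep must be a positive integer")
    return list(range(start, parameters, step))


values = numeric_parameter_values(100, start=1, step=1)
input_files = [f"input{value}.txt" for value in values]
print(len(values), input_files[0], input_files[-1])
```

Whatever the exact bound turns out to be, every value produced must have a matching input file in the InputSandbox, which is the consistency requirement stated for parametric jobs.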

MPI Overview
- Running parallel jobs is essential for many modern computing applications.
- The most widely used library support for parallel jobs is MPI (Message Passing Interface).
- At present, parallel jobs can run only inside a single Computing Element (CE); several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.
- The source code must have been compiled with the MPI libraries (mpicc).

MPI JDL

    $ cat mpi.jdl
    [
      Type = "Job";
      #Mandatory
      JobType = "MPICH";
      #The number of CPU that will be used
      NodeNumber = 2;
      Executable = "cpi";
      StdOutput = "cpi.out";
      StdError = "cpi.err";
      InputSandbox = {"cpi"};
      OutputSandbox = {"cpi.err","cpi.out"};
      RetryCount = 3;
    ]

First South Africa Grid Training in Catania

Practice...
- Compound jobs: https://grid.ct.infn.it/twiki/bin/view/GILDA/WmProxyUse
- More on JDL attributes: https://grid.ct.infn.it/twiki/bin/view/GILDA/MoreOnJDL
- JDL and data management (is your input file > 10 MB?), a Grid application case: https://grid.ct.infn.it/twiki/bin/view/GILDA/JobDataApplicationCase

References
- gLite 3 User Guide (pdf, html):
    https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.pdf
    https://edms.cern.ch/file/722398/1.2/gLite-3-UserGuide.html
- WMProxy and JDL Guide:
    https://edms.cern.ch/file/674643/1/WMPROXY-guide.pdf
    https://edms.cern.ch/file/590869/1/EGEE-JRA1-TEC-590869-JDL-Attributes-v0-9.pdf
- Documentation:
    http://trinity.datamat.it/projects/EGEE/wiki/apidoc/3.1/htmlcpp/index.html
- Examples:
    https://grid.ct.infn.it/twiki/bin/view/GILDA/WMProxyCPPAPI
    https://grid.ct.infn.it/twiki/bin/view/GILDA/ApiJavaWMProxy
- MPI:
    http://www-unix.mcs.anl.gov/mpi/

Questions…
16-26 June 2008, Catania (Italy)