Advanced services in gLite
Gergely Sipos and Peter Kacsuk, MTA SZTAKI
Grid Computing School, July 2006, Rio de Janeiro

Outline
- Advanced job types: interactive jobs, checkpointing jobs, MPI jobs
- Workflows: Condor DAGMan, the gLite workflow (DAG) job

Normal job
So far we have talked about normal jobs: a sequential program that takes input, performs a computation and writes output. The user gets the output after the execution. Other options are:
- interactive jobs
- logical checkpointing jobs
- MPI jobs
- workflows
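
For comparison with the advanced job types below, a normal job is described by a plain JDL file. The sketch below is illustrative only; the executable, arguments and sandbox file names are hypothetical placeholders, not taken from the original slides.

  [
    JobType = "Normal";
    Executable = "myapp.sh";                    // hypothetical user application
    Arguments = "input.dat";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"myapp.sh", "input.dat"};   // files shipped with the job
    OutputSandbox = {"std.out", "std.err"};     // files returned to the user
  ]

Such a description is submitted with edg-job-submit, and the output is retrieved with edg-job-get-output after the job has finished.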

Interactive Job (I)
An interactive job is a job whose standard streams are forwarded to the submitting client.
The user has to set the JDL JobType attribute to interactive.
When an interactive job is submitted, the edg-job-submit command:
- starts a Grid console shadow process in the background that listens on a port assigned by the operating system; the port can be forced through the ListenerPort attribute in the JDL;
- opens a new window to which the incoming job streams are forwarded.
The DISPLAY environment variable has to be set correctly, because an X window is opened. The user can specify the --nogui option, which makes the command provide a simple, non-graphical interaction with the running job.
It is not necessary to specify the OutputSandbox attribute in the JDL, because the output will be sent to the interactive window.
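
As an illustrative sketch only (the executable name and the port number are hypothetical, and ListenerPort is optional), an interactive job description could look like this:

  [
    JobType = "Interactive";
    Executable = "interactive-app.sh";     // hypothetical interactive application
    ListenerPort = 12345;                  // optional: force the port used by the console shadow
    InputSandbox = {"interactive-app.sh"};
    // no OutputSandbox: the output is streamed to the console window
  ]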

Interactive jobs (II)
Specified by setting JobType = "Interactive" in the JDL.
When an interactive job is executed, a window for the stdin, stdout and stderr streams is opened, giving the possibility to:
- send stdin to the job;
- see the stdout and stderr of the job while it is running.
It is also possible to open a window for the standard streams of a previously submitted interactive job with the edg-job-attach command.

Logical Checkpointing Job
A checkpointable job is a job that can be decomposed into several steps.
At every step the job state can be saved in the LB (Logging and Bookkeeping service) and retrieved later in case of failure.
The job state is a set of <attribute, value> pairs defined by the user.
The job can then start running from a previously saved state instead of from the beginning again.
The user has to set the JDL JobType attribute to checkpointable.

Logical Checkpointing Job (II)
When a checkpointable job is submitted and starts from the beginning, the user simply runs the edg-job-submit command:
- the number of steps, representing the job phases, can be specified with the JobSteps attribute, e.g. JobSteps = 2;
- alternatively, a list of labels representing the job phases can be specified with the JobSteps attribute, e.g. JobSteps = {"january", "february"};
The latest job state can be obtained with the edg-job-get-chkpt command; a specific job state can be obtained with its --cs option.
When a checkpointable job has to start from an intermediate job state, the user runs the edg-job-submit command with the --chkpt <state_file> option, where <state_file> is a valid job state file in which the state of a previously submitted job was saved.
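
To illustrate how these attributes fit together, a checkpointable job description might look like the following sketch (the executable and sandbox names are hypothetical; JobSteps is shown in its label form):

  [
    JobType = "Checkpointable";
    JobSteps = {"january", "february"};   // or a number of phases, e.g. JobSteps = 2;
    Executable = "simulation.sh";         // hypothetical application instrumented with the checkpointing API
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"simulation.sh"};
    OutputSandbox = {"std.out", "std.err"};
  ]

The job is submitted with edg-job-submit as usual; to resume from a saved state, the same description is submitted again with the --chkpt option pointing to a state file previously retrieved with edg-job-get-chkpt.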

Job checkpointing example
Example application (e.g. a HEP Monte Carlo simulation), before instrumentation:

  int main () {
    ...
    for (int i = event; i < EVMAX; i++) {
      /* process one event */ ;
    }
    ...
    exit(0);
  }

Job checkpointing example (instrumented)
User code can easily be instrumented to exploit the checkpointing framework. The user defines what the state is: a set of <attribute, value> pairs that must be "enough" to restart the computation from a previously saved state. The job first retrieves the last saved state, so it can restart from that point, and then saves its state from time to time while it runs.

  #include "checkpointing.h"

  int main () {
    JobState state(JobState::job);
    /* retrieve the last saved state: the job restarts from this point */
    event = state.getIntValue("first_event");
    PFN_of_file_on_SE = state.getStringValue("filename");
    ...
    var_n = state.getBoolValue("var_n");
    ...
    for (int i = event; i < EVMAX; i++) {
      /* process one event */ ;
      ...
      /* save the state from time to time */
      state.saveValue("first_event", i + 1);
      state.saveValue("filename", PFN_of_file_on_SE);
      ...
      state.saveValue("var_n", value_n);
      state.saveState();
    }
    ...
    exit(0);
  }

Other (most relevant) UI commands
edg-job-attach: starts an interactive session for a previously submitted interactive job (it starts a listener process on the UI machine).
edg-job-get-chkpt: allows the user to retrieve one or more checkpoint states saved by a previously submitted job.

MPI Job
There are many libraries supporting parallel jobs, but we decided to support MPICH.
An MPI job is run in parallel on several processors.
The user has to set the JDL JobType attribute to MPICH and specify the NodeNumber attribute, i.e. the required number of CPUs.
When an MPI job is submitted, the UI adds to the Requirements attribute:
- Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment), so the MPICH runtime environment must be installed on the CE;
- other.GlueCEInfoTotalCPUs >= NodeNumber, so the number of CPUs on the CE must be at least equal to the required number of nodes;
and adds to the Rank attribute:
- other.GlueCEStateFreeCPUs, so the CE with the largest number of free CPUs is chosen.
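
As a rough sketch of the effect described above (not the literal output of any tool; <user requirements> stands for whatever Requirements expression the user wrote), the expressions added by the UI combine with the user's own JDL roughly as follows:

  Requirements = Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)
                 && (other.GlueCEInfoTotalCPUs >= NodeNumber)
                 && (<user requirements>);
  Rank = other.GlueCEStateFreeCPUs;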

MPI Job: example JDL

  [
    JobType = "MPICH";
    NodeNumber = 2;
    Executable = "MPItest.sh";
    Arguments = "cpi 2";
    InputSandbox = {"MPItest.sh", "cpi"};
    OutputSandbox = "executable.out";
    Requirements = other.GlueCEInfoLRMSType == "PBS" ||
                   other.GlueCEInfoLRMSType == "LSF";
  ]

The NodeNumber entry is the number of MPI processes (CPUs) requested for the job.
The MPItest.sh script only works if PBS or LSF is the local job manager.

MPI Job: snapshot of MPItest.sh

  # $HOST_NODEFILE contains the names of the hosts allocated for the MPI job
  for i in `cat $HOST_NODEFILE`; do
    echo "Mirroring via SSH to $i"
    # create the working directory on all the nodes allocated for parallel execution
    ssh $i mkdir -p `pwd`
    # copy the needed files to all the nodes allocated for parallel execution
    /usr/bin/scp -rp ./* $i:`pwd`
    # set the permissions of the files
    ssh $i chmod 755 `pwd`/$EXE
    ssh $i ls -alR `pwd`
  done
  # execute the parallel job with mpirun
  mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE > executable.out

Important: you need shared SSH keys between the worker nodes. This avoids sharing of home directories; it is enforced in GILDA but NOT enforced in LCG-2, so the VO needs to negotiate it on a site-by-site basis.

Condor DAGMan
DAGMan (Directed Acyclic Graph Manager) allows you to specify the dependencies between your Condor jobs, so it can manage them automatically for you (e.g., "don't run job B until job A has completed successfully").

What is a DAG?
A DAG is the data structure used by DAGMan to represent these dependencies.
Each job is a "node" in the DAG.
Each node can have any number of "parent" or "child" nodes, as long as there are no loops!
[Diagram: a diamond-shaped DAG in which job A is the parent of jobs B and C, which are both parents of job D]

Defining a Condor DAG
A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

  # diamond.dag
  Job A a.sub
  Job B b.sub
  Job C c.sub
  Job D d.sub
  Parent A Child B C
  Parent B C Child D

Each node will run the Condor job specified by its accompanying Condor submit file.

Submitting a Condor DAG
To start your DAG, just run condor_submit_dag with your .dag file, and Condor will start a personal DAGMan daemon which begins running your jobs:

  % condor_submit_dag diamond.dag

condor_submit_dag submits a Scheduler Universe job with DAGMan as the executable. Thus the DAGMan daemon itself runs as a Condor job, so you don't have to baby-sit it.

Running a Condor DAG
DAGMan acts as a "meta-scheduler", managing the submission of your jobs to Condor based on the DAG dependencies.
[Diagram: DAGMan reads the .dag file and submits the first ready node, A, to the Condor job queue; B, C and D are not yet submitted]

Running a Condor DAG (cont'd)
DAGMan holds and submits jobs to the Condor queue at the appropriate times.
[Diagram: after A completes, DAGMan submits B and C to the Condor job queue; D is still held]

Running a Condor DAG (cont'd)
In case of a job failure, DAGMan continues until it can no longer make progress, and then creates a "rescue" file with the current state of the DAG.
[Diagram: node C fails, D cannot be submitted, and DAGMan writes a rescue file]

Recovering a Condor DAG
Once the failed job is ready to be re-run, the rescue file can be used to restore the prior state of the DAG.
[Diagram: using the rescue file, the failed node C is resubmitted to the Condor job queue]

Recovering a Condor DAG (cont'd)
Once that job completes, DAGMan will continue the DAG as if the failure never happened.
[Diagram: after C completes, D is submitted to the Condor job queue]

Finishing a Condor DAG
Once the DAG is complete, the DAGMan job itself is finished, and exits.
[Diagram: all nodes A, B, C and D have completed and the Condor job queue is empty]

Additional DAGMan features
DAGMan provides other handy features for job management:
- nodes can have PRE and POST scripts;
- failed nodes can be automatically re-tried a configurable number of times.

DAG Job in EGEE
A DAG job is a Directed Acyclic Graph job.
The user has to set, in the JDL, JobType = "dag", the nodes attribute (containing the description of the nodes) and the dependencies attribute.
NOTE: a plug-in has been implemented to map an EGEE DAG submission to a Condor DAG submission, and some improvements have been applied to the ClassAd API to better address WMS needs.

DAG Job in EGEE: example

  nodes = {
    cmkin1 = [ file = "bckg_01.jdl"; ],
    cmkin2 = [ file = "bckg_02.jdl"; ],
    ...
    cmkinN = [ file = "bckg_0N.jdl"; ]
  };
  dependencies = {
    {cmkin1, cmkin2},
    {cmkin2, cmkin3},
    {cmkin2, cmkin5},
    {{cmkin4, cmkin5}, cmkinN}
  }

[Diagram: the resulting DAG over the nodes cmkin1, cmkin2, cmkin3, cmkin4, cmkin5, ..., cmkinN]
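
As a sketch only (following the attributes named on the previous slide; the exact wrapper syntax may vary between WMS versions), the fragments above would sit inside a single JDL file roughly like this:

  [
    JobType = "dag";
    nodes = { ... };          // the node list shown above
    dependencies = { ... };   // the dependency list shown above
  ]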