Published by Hilary Richards. Modified over 8 years ago.
1
Job Description Language (JDL)
Marcello Iacono Manno (marcello.iacono@ct.infn.it)
Consorzio COMETA - Progetto PI2S2 (FESR), www.consorzio-cometa.it
First Grid Tutorial for the University of Palermo
Palermo, 10 December 2007
2
Palermo, Grid Tutorial for the University of Palermo, 10.12.2007
Introduction to JDL
JDL describes a job request to the Workload Management System (WMS), which makes the actual job submission possible.
Descriptors are called attributes.
Mandatory attributes are those that describe the process itself.
Resource descriptors are provided by the Information System (IS) and follow the GLUE Schema.
3
JDL generalities
JDL describes a request to the WMS.
JDL allows submission, cancellation, status query and output retrieval, with either the CLI or the GUI.
two versions:
    legacy Network Server interface (socket)
    WMProxy (web service)
job description: Condor classified advertisements (classads)
descriptors are called attributes
mandatory attributes: the fundamental descriptors of the job
resource attributes relate to the status and characteristics of grid resources, as published in the Information Service (IS) schema (GLUE for gLite)
4
JDL format
a list of entries enclosed by [ ], each terminated by ;
entry: <attribute> = <value> | <list of values> ;
attribute: a string with the name of the attribute
value:
    String          "abc" (a double-quoted string)
    Integer         1234
    Floating Point  12.34
    Boolean         "true", "false", expression (see GLUESchema)
    classad         (see Nodes)
list of values: enclosed by { } and separated by commas
    { "abc", "bcd", "def" }
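As a minimal sketch of the format rules above, the fragment below uses one attribute for several of the value types. The attribute names all appear elsewhere in this tutorial; the file names and values are illustrative assumptions, not a tested job.

```jdl
[
    // String: double-quoted
    Executable = "execprog";
    Arguments = "-out outputfile.dat";
    // Integer
    RetryCount = 3;
    // List of values: enclosed by { } and separated by commas
    InputSandbox = { "execprog", "myinp1.dat" };
    OutputSandbox = { "outputfile.dat" };
    // Expression, evaluated against GLUE Schema attributes of each CE
    Requirements = other.GlueCEInfoTotalCPUs > 2;
]
```

Every entry ends with a semicolon, and the whole description sits inside one pair of square brackets.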
5
Request types
Type =
    "Job"         a simple job (default)
    "DAG"         a Directed Acyclic Graph of dependent jobs
    "Collection"  a set of independent jobs
6
JobType
JobType =
    "Normal"          a simple job
    "Interactive"     a job whose standard streams are forwarded to the submitting client
    "MPICH"           an MPI parallel job
    "Partitionable"   a job composed of a set of independent steps/iterations that can be executed in parallel
    "Checkpointable"  a job able to save its state, so that it can be suspended and later resumed from the same point
    "Parametric"      a job with parametric attributes in its JDL, so that many similar instances are submitted with a single command (only the parameterized attributes vary)
7
Executable
Executable = the path of the command or executable file (*)
    "/usr/bin/java/j2sdk1.5.0/bin/java", "/home/user/executable.exe"
environment variables are accepted (*)
    "$JAVA/bin/java"
relative or absolute paths are accepted (*)
    "executable.exe" requires an identical file entry in the InputSandbox
remote files can be specified with gsiftp (local path exec)
mandatory for all jobs
wildcards are not allowed
arguments are given in a dedicated attribute
(*) on the executing WN
8
Arguments
Arguments = the arguments for the executable file
    "-out outputfile.dat"
together with:
    Executable = "execprog";
produces this command line on the WN:
    $ execprog -out outputfile.dat
double quotes (") inside the string must be preceded by a backslash (\):
    " -a \"quoted string\" -bcd"
becomes (with the above executable):
    $ execprog -a "quoted string" -bcd
9
StdInput, StdOutput, StdError
StdInput, StdOutput, StdError = the paths of the standard input, output and error files
same rules as the Executable attribute
they require identical entries in the Input/Output Sandboxes
StdOutput and StdError can be the same file
StdInput is not required for Interactive jobs
examples:
    StdInput = "/home/iacono/config.dat";
    StdOutput = "gsiftp://grid999.ct.infn.it:1234/tmp/file.out";
10
InputSandbox & OutputSandbox
InputSandbox, OutputSandbox = identify ...
    the input file(s) to be copied from the local UI (or a gridFTP server) to the WN before execution starts
    the output file(s) to be made available for download from the WN after the job completes (transfer: glite-job-output)
wildcards are allowed if they can be resolved locally
LFN files are not allowed (use InputData, or copy them from a script)
files in the InputSandbox must not exceed 10 MB
file names must be distinct (they share the same destination directory)
examples:
    InputSandbox = { "myinp1.dat", "data/myinp2.dat" };
11
InputSandboxBaseURI & C.
InputSandboxBaseURI, OutputSandboxBaseDestURI =
    change the root of the I/O Sandboxes to a gridFTP server
    sandbox entries are then written as if they were local files under that URI
    perform automatic retrieval upon job completion
    support for http files will be provided soon
example:
    InputSandboxBaseURI = "gsiftp://grid999.ct.infn.it:1234/tmp";
changes the meaning of
    InputSandbox = "myfile.dat";
into:
    InputSandbox = "gsiftp://grid999.ct.infn.it:1234/tmp/myfile.dat";
OutputSandboxDestURI =
    changes the destination of each entry in OutputSandbox when gsiftp is used
    requires the same cardinality as OutputSandbox
    performs automatic file download at job completion
    not compatible with OutputSandboxBaseDestURI
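The "same cardinality" rule for OutputSandboxDestURI can be sketched as follows: one destination URI per OutputSandbox entry, in the same order. The host and port are the placeholder server used in the example above, and the file names are illustrative assumptions.

```jdl
OutputSandbox = { "stdout.log", "stderr.log" };
// One destination per OutputSandbox entry, in the same order
OutputSandboxDestURI = {
    "gsiftp://grid999.ct.infn.it:1234/tmp/stdout.log",
    "gsiftp://grid999.ct.infn.it:1234/tmp/stderr.log"
};
```

Because the two attributes are not compatible, a JDL that uses this form should not also set OutputSandboxBaseDestURI.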
12
ExpiryTime & Environment
ExpiryTime =
    a job with no matching CE at submission time is held while matchmaking is retried, and is aborted after 1 day of unsuccessful retries
    ExpiryTime changes this default duration
    expressed in seconds since the epoch
    glite-job-submit accepts a more user-friendly format
Environment =
    describes the environment variables of the job
    string format: <variable>=<value>
example:
    Environment = { "JOB_LOG_FILE=/tmp/job.log", "INP_DIR=/tmp/input_files" };
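The slide gives no example for ExpiryTime. A minimal sketch, assuming the value is written in the JDL as a plain integer (which is what "seconds since epoch" suggests); the date chosen is arbitrary.

```jdl
// 1197331200 = 2007-12-11 00:00:00 UTC, in seconds since the epoch
ExpiryTime = 1197331200;
Environment = { "JOB_LOG_FILE=/tmp/job.log" };
```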
13
PerusalFileEnable, PerusalTimeInterval & PerusalFileDestURI
PerusalFileEnable = enables job file perusal support (inspection of output files at run time)
    example: PerusalFileEnable = "true";
PerusalTimeInterval = the interval, in seconds, between two successive saves
    example: PerusalTimeInterval = 10;
PerusalFileDestURI = a string with the URI of a gridFTP or https server
14
InputData
InputData =
    identifies LFNs, GUIDs, LDSs and/or queries
    the related Data Catalog is queried to retrieve the PFNs
    the queried files are provided as input in the WN current directory
    influences the WMS matchmaking decision
example:
    InputData = {
        "lfn:/grid/gilda/isospin.dat",
        "guid:135b7b23-4a6a-87e7-9d101f8c8b70",
        "lds:testfile.inp",           // LDS catalog
        "query: select my_files",     // LDS catalog
        "si-lfn:/file.inp"            /* StorageIndex catalog */
    };
when the catalog is not specified (first two entries): if the StorageIndex attribute is declared, the catalog it points to is used; otherwise the WMS tries RLS first and then DLI
15
StorageIndex & DataCatalog
StorageIndex = (*)
    the endpoint URL of the StorageIndex (SI) service used to resolve file names
    for si-lfn: or si-guid: files in InputData
    when StorageIndex is not specified, the VO default SI is used
    example:
        StorageIndex = "https://grid017.ct.infn.it:8443/gilda/glite-data-catalog-service-fr-mysql/services/SEIndex";
DataCatalog = (*)
    the endpoint URL of the RLS or DLI service to be used to resolve file names
    example:
        DataCatalog = "https://grid017.ct.infn.it:8443/gilda/glite-data-catalog-service-fr-mysql/services/FiremanCatalog";
(*) for use with InputData only
16
OutputSE & OutputData
OutputSE =
    a string with the URL of an SE where the output data are to be stored
    influences the matchmaking decision made by the RB
    example: OutputSE = "grid009.ct.infn.it";
OutputData = (*)
    a list of classads describing output files
    similar to DataRequirements
    output files are uploaded automatically upon job completion
(*) not yet supported
17
DataAccessProtocol & DataRequirements
DataAccessProtocol = the protocols to be used to retrieve files (mandatory when InputData is defined)
    example: DataAccessProtocol = { "file", "gridftp" };
DataRequirements = the data requirements of a job
    example:
        DataRequirements = {
            [
                DataCatalogType = "...";
                DataCatalog = "https://...";
                InputData = { "lfn:…", "guid:…", "lds:…", "query:…" };
            ],
            [
                DataCatalogType = "SI";
                ….
            ]
        };
18
VirtualOrganisation, RetryCount & ShallowRetryCount
VirtualOrganisation = the name of the VO
    has to match the WMS default (?)
    overridden by the --vo option of glite-job-submit
    example: VirtualOrganisation = "gilda";
RetryCount, ShallowRetryCount = how many times the job must be resubmitted after a failure caused by grid resources
    not valid for DAG / Collection
    limited by the MaxRetryCount parameter of the WMS
    a resubmission is "shallow" if the user job aborted before running
    a "deep" resubmission resets the shallow retry count
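A short sketch of how the two retry counters sit next to each other in a simple job. The counter values and the executable are illustrative assumptions; whatever values are chosen remain subject to the MaxRetryCount limit configured on the WMS.

```jdl
Type = "Job";
JobType = "Normal";
VirtualOrganisation = "gilda";
Executable = "/bin/hostname";
// Maximum number of deep resubmissions
RetryCount = 3;
// Maximum number of shallow resubmissions (job aborted before running)
ShallowRetryCount = 5;
```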
19
LBAddress & MyProxyServer
LBAddress = the address of the LB (Logging and Bookkeeping) server
    format: <host>[:<port>]
    the default is taken from the WMS configuration (port = 9000)
    example: LBAddress = "lb-grid.ct.infn.it";
MyProxyServer = the address of the proxy server
    enables automatic renewal of the proxy certificate (for long jobs)
    the port defaults to 7512
    example: MyProxyServer = "grid001.ct.infn.it:7512";
20
HLRLocation & JobProvenance
HLRLocation = the user's Home Location Register (HLR)
    the HLR manages the economic transactions for resource usage
    the job is billed to the user's account
    example: HLRLocation = "prod-hlr-01.ct.infn.it";
JobProvenance = the endpoint URL of the JobProvenance service where data about the job are to be stored
    the WMS sends the job sandbox files to this destination
21
NodeNumber & ListenerPort
NodeNumber = an integer > 1 specifying how many CPUs are needed for an MPI job
    mandatory for JobType = "MPICH"
    example: NodeNumber = 3;
ListenerPort = the port on which condor_console_shadow listens for the job standard streams
    for use with JobType = "Interactive"
    example: ListenerPort = 44000;
22
ListenerHost & ListenerPipeName
ListenerHost = the name of the host on which condor_console_shadow listens for the job standard streams
    for use with JobType = "Interactive"
    used when the submission and the interactive session are on different machines
ListenerPipeName = the absolute path of the pipes that carry the job standard streams
    example: ListenerPipeName = "/tmp/pipe";
    means: stdin=/tmp/pipe.in, stdout=/tmp/pipe.out (default=/tmp/listener/ )
23
JobSteps
JobSteps = either an integer giving the number of steps to run, or a list of strings with the labels of the steps of a partitionable or checkpointable job
    the main stepper is part of the user job (not of the WMS) and links it to the run-time checkpointing library
    if JobSteps is also specified in the JobState classad, this definition prevails
examples:
    JobSteps = 1000;                (runs 1000 steps of the main stepper)
    JobSteps = { "a", "b", "c" };   (runs sections "a", "b", "c" of the main stepper)
24
CurrentStep & JobState
CurrentStep = an integer (>0) giving the step to be taken as the initial one when submitting a checkpointable or partitionable job
    example: CurrentStep = 2;   (default=0)
JobState = a job checkpoint state to start from when submitting a checkpointable job
    example:
        JobState = [
            JobSteps = 1000;
            CurrentStep = 350;
            UserData = [ DumpPath = "gsiftp://grid999.ct.infn.it:1234/tmp/dumpfile" ]
        ]
25
GLUESchema
GLUE (Grid Laboratory Uniform Environment) Schema: an information model that describes grid sites and services independently of any specific implementation syntax.
examples:
    entity = property =
    Requirements = other.GlueCEUniqueID == "grid010.ct.infn.it:2119/jobmanager-lcgpbs-infinite"
    Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF"
26
Requirements & UserTags
Requirements = a Boolean classad expression with C-like operators
    describes the job requirements on resource (CE) attributes, following the GLUE Schema
    CE attributes published in the IS are referenced with the prefix "other."
    example: Requirements = other.GlueCEInfoTotalCPUs > 2 && other.GlueCEPolicyMaxRunningJobs < 2;
UserTags = a classad attribute that lets the user specify user-defined key/value pairs (values must be strings)
    tags can be used to query the LB
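The slide has no example for UserTags. A sketch, assuming the attribute is written as a nested classad of key/value pairs as described above; the keys and values are hypothetical.

```jdl
Requirements = other.GlueCEInfoLRMSType == "PBS" && other.GlueCEInfoTotalCPUs > 2;
// Hypothetical keys; every value is a string, even the numeric-looking one
UserTags = [
    experiment = "isospin";
    run = "452";
];
```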
27
Rank & FuzzyRank
Rank = a classad floating-point expression giving the rank of each CE that matches the Requirements expression
    the CE with the highest rank is selected for job execution
    examples:
        Rank = other.GlueCEPolicyMaxRunningJobs - other.GlueCEStateRunningJobs;
            (the CE with the greatest number of free slots)
        Rank = -other.GlueCEStateEstimatedResponseTime;
            (the CE with the shortest estimated traversal time through the local batch system queue)
FuzzyRank = a Boolean attribute that enables fuzzy ranking
    defaults to false
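How Requirements, Rank and FuzzyRank combine can be sketched as below: Requirements filters the CEs, Rank orders the survivors. GlueCEStateFreeCPUs is the attribute used in Example 1 of this tutorial. FuzzyRank is written here as an unquoted classad Boolean literal, which is an assumption, since the perusal slide shows a quoted value.

```jdl
// Only CEs that satisfy Requirements are ranked
Requirements = other.GlueCEInfoLRMSType == "PBS";
// Prefer the CE with the most free CPUs
Rank = other.GlueCEStateFreeCPUs;
// Explicitly keep the default (non-fuzzy) ranking
FuzzyRank = false;
```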
28
DAG Attributes changes
Type and VirtualOrganisation: become mandatory attributes
JobType: cannot be Partitionable or Checkpointable
HLRLocation, LBAddress, MyProxyServer and JobProvenance: must be the same for all the nodes
AllowZippedISB: creates a compressed InputSandbox for the whole DAG
PerusalFileEnable: if declared for one node, it is extended to the others
UserTags, Requirements and Rank: apply to the whole DAG only
InputSandbox, InputSandboxBaseURI and OutputSandboxBaseDestURI: values declared for the whole DAG are inherited by the single nodes
29
Nodes (1/2)
Nodes = classads that describe the nodes and their dependencies
example:
    nodes = [
        a = [                       /* node "a" */
            description = [
                JobType = "Normal";
                Executable = "a.exe";
                InputSandbox = {…};
            ];
        ];
        b = [                       /* node "b" */
            file = "node_b.jdl";
        ];
        …
    ];
description: a classad containing a JDL file to describe a node
file: a string indicating the absolute path of a JDL file describing a node
30
Nodes (2/2) & max_nodes_running
dependencies = a list describing the dependencies; the strings are node names
    format: { { { a, b }, c }, { c, d } }
    means that node "c" depends on nodes "a" and "b", and node "d" depends on node "c"
example:
    dependencies = {
        { a, b },           // node "b" depends on node "a"
        { a, c },           // node "c" depends on node "a"
        { a, d },           // node "d" depends on node "a"
        { { c, d }, e }     // node "e" depends on nodes "c" and "d"
    };
max_nodes_running = an integer > 0 giving the maximum number of nodes that DAGMan can submit simultaneously
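Put together, the five-node dependency example could look like the sketch below. It follows the layout of Example 3 later in this tutorial (dependencies placed inside the nodes classad) and separates list entries with commas, as the format slide prescribes. The node JDL file names and the max_nodes_running value are illustrative assumptions.

```jdl
[
    Type = "DAG";
    VirtualOrganisation = "gilda";
    // Lets b, c and d run at the same time once a has finished
    max_nodes_running = 3;
    nodes = [
        a = [ file = "a.jdl"; ];
        b = [ file = "b.jdl"; ];
        c = [ file = "c.jdl"; ];
        d = [ file = "d.jdl"; ];
        e = [ file = "e.jdl"; ];
        // b, c and d wait for a; e waits for both c and d
        dependencies = {
            { a, b },
            { a, c },
            { a, d },
            { { c, d }, e }
        };
    ];
]
```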
31
Partitionable Jobs
a partitionable job has 3 parts:
    pre-job (node "a")
    sub-jobs (nodes "b", "c", "d")
    post-job (node "e")
the sub-jobs are independent of each other
the WMS transforms the job into a DAG
it must also be a checkpointable job
JobSteps (see slide 23) is used to distribute M sub-jobs into N steps (with M < N; the same weight is assumed for all the sub-jobs)
32
StepWeight, PreJob & PostJob
StepWeight = a list of "weights" to help job partitioning
    example: StepWeight = { 7.5, 15, 55, 15, 7.5 };
PreJob, PostJob = classads describing the pre-job and the post-job
    example:
        PreJob = [
            Type = "Job";
            JobType = "Normal";
            VirtualOrganisation = "gilda";
            Executable = "pre-job.exe";
        ];
33
Parametric Jobs
JobType = "Parametric"
results in the submission of a set of jobs that are identical except for one parameter, which varies from one job to the next
each job receives a different jobID, so it can be tracked separately from the others
the parametric job handle, however, allows them to be treated together
a special variable (_PARAM_) marks the variable items among the JDL attributes
_PARAM_ takes either numerical values or values from a declared list (strings)
34
ParameterStart, ParameterStep, Parameters, NodesCollocation
Parameters = an integer giving the number of steps, or a list of strings (each one the name of a step)
ParameterStart, ParameterStep =
    ParameterStart gives the initial step
    ParameterStep gives the increment between two successive values of _PARAM_
NodesCollocation = if true, all the job instances are sent to the same CE
35
Job Collections
Type = "Collection"
a set of independent jobs that must be submitted, monitored and controlled as a single request
similar to a DAG, but without dependencies
all the rules for DAGs also apply to Collections
nodes are described by classads
attributes refer to the whole Collection
an inheritance mechanism is also available
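The tutorial gives no JDL for a Collection. The sketch below assumes that the nodes attribute of a Collection is a list of node classads, each pointing to a JDL file with file = "..." as in the DAG examples; this layout should be checked against the JDL attributes specification of the WMS in use. The file names are hypothetical.

```jdl
[
    Type = "Collection";
    VirtualOrganisation = "gilda";
    // Declared once for the whole Collection and inherited by its nodes
    InputSandbox = { "start_hostname.sh" };
    nodes = {
        [ file = "job1.jdl"; ],
        [ file = "job2.jdl"; ],
        [ file = "job3.jdl"; ]
    };
]
```

Unlike the DAG case, there is no dependencies entry, because the jobs of a Collection are independent.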
36
GangMatching
Requirements = anyMatch( ) | whichMatch( ) | allMatch( )
the RB uses classads to perform matchmaking
the job and the CE are usually the only entities involved
if an SE must also be considered, a more general mechanism is provided through new classad built-in functions
example:
    Requirements = anyMatch( other.storage.CloseSEs, target.GlueSAStateAvailableSpace > 200 )
forces the RB to select a CE close to an SE with more than 200 MB of available space
37
JDL EXAMPLES 1 - 2 / 4
Example 1
hostname.jdl
    Type = "Job";
    JobType = "Normal";
    Executable = "/bin/sh";
    Arguments = "start_hostname.sh";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    InputSandbox = "start_hostname.sh";
    OutputSandbox = { "stderr.log", "stdout.log" };
    RetryCount = 7;
    Requirements = other.GlueCEInfoLRMSType == "PBS";
    Rank = other.GlueCEStateFreeCPUs;
start_hostname.sh
    #!/bin/bash
    hostname -f
Example 2
mpi.jdl
    Type = "Job";
    JobType = "MPICH";
    Executable = "MPItest.sh";
    Arguments = "cpi 2";
    NodeNumber = 2;
    StdOutput = "test.out";
    StdError = "test.err";
    InputSandbox = { "MPItest.sh", "cpi" };
    OutputSandbox = { "test.err", "test.out", "executable.out" };
    Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF";
38
JDL EXAMPLES 3 - 4 / 4
Example 3
DAG.jdl
    [
        type = "dag";
        max_nodes_running = 2;
        nodes = [
            b = [ file = "hostname.jdl"; ];
            a = [ file = "hostname.jdl"; ];
            dependencies = { { a, b } }
        ];
    ]
Example 4
interactive.jdl
    Type = "Job";
    JobType = "Interactive";
    Executable = "scriptint.sh";
    InputSandbox = { "scriptint.sh" };
    ListenerPort = 24780;
scriptint.sh
    #!/bin/sh
    echo "Welcome!"
    sleep 1
    echo "What is your name?"
    read A
    echo "Bye Bye $A"
    exit 0
39
JDL EXAMPLES 1 - 2 / 2 (extra)
Example 1 extra
inputdata.jdl
    Type = "Job";
    JobType = "Normal";
    Executable = "a.sh";
    StdOutput = "a.out";
    StdError = "a.err";
    DataCatalogType = "LSF";
    DataAccessProtocol = { "file", "gsiftp" };
    InputData = "lfn:/grid/gilda/isospin452.dat";
    OutputSandbox = { "a.out", "a.err" };
Example 2 extra
parametric.jdl
    Type = "Job";
    JobType = "Parametric";
    Executable = "executable.exe";
    Parameters = 10;
    ParameterStart = 1;
    ParameterStep = 1;
    StdInput = "input_PARAM_.txt";
    StdOutput = "output_PARAM_.txt";
    StdError = "error_PARAM_.txt";
    InputSandbox = { "executable.exe", "input_PARAM_.txt" };
    OutputSandbox = { "output_PARAM_.txt", "error_PARAM_.txt" };
40
Documentation
JDL (submission via WMS Network Server)
    https://edms.cern.ch/file/555796/1/EGEE-JRA1-TEC-555796-JDL-Attributes-v0-7.doc
JDL (submission via WMS WMProxy)
    https://edms.cern.ch/file/590869/1/EGEE-JRA1-TEC-590869-JDL-Attributes-v0-5.doc
GLUESchema
    http://www.cnaf.it/~sergio/datatag/glue/v11/CE/index.htm
CONDOR classads
    http://www.cs.wisc.edu/condor/classad/