1
Job Description Language
Gergely Sipos, Péter Kacsuk (MTA SZTAKI)
EGEE is funded by the European Union under contract IST
Grid Computing School – July 2006, Rio de Janeiro
2
Job Description Language
The supported attributes are grouped in two categories:
- Job attributes
  - Define the job itself
- Resource expression attributes
  - Taken into account by the RB when it carries out the matchmaking algorithm (to choose the “best” resource to which the job is submitted)
  - Computing resource
    - Used by the user to build the expressions of the Requirements and/or Rank attributes
    - Have to be prefixed with “other.” (external) or “self.” (internal)
  - Data and storage resources
    - Input data to process, the SE where output data is stored, and the protocols spoken by the application when accessing SEs
3
JDL: some relevant attributes
- JobType
  - Normal (simple, sequential job), Interactive, MPICH, Checkpointable
  - Or a combination of them
- Executable (mandatory)
  - The command name
- Arguments (optional)
  - Job command-line arguments
- StdInput, StdOutput, StdError (optional)
  - Standard input/output/error of the job
- Environment (optional)
  - List of environment settings
- InputSandbox (optional)
  - List of files on the UI local disk needed by the job to run
  - The listed files are automatically staged to the remote resource
- OutputSandbox (optional)
  - List of files, generated by the job, which have to be retrieved
- VirtualOrganisation (optional)
  - A different way to specify the VO of the user
4
JDL: some relevant attributes II
- InputData (used by the broker; no data movement)
- DataAccessProtocol: file | gridftp | rfio (used together with InputData)
- OutputData: { OutputFile = [CE path] [ StorageElement = SE ] [ LogicalFileName = lfn:fileName ] } (real data movement in LCG, no data movement in gLite)
- OutputSE
- Rank
- Requirements
- MyProxyServer
- RetryCount
- NodeNumber
- JobSteps
5
Example of JDL file
[
  JobType = "Normal";
  Executable = "/exe/sum.exe";
  InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};
  OutputSandbox = {"sim.err", "test.out", "sim.log"};
  Requirements = (other.GlueHostOperatingSystemName == "linux") && (other.GlueCEPolicyMaxWallClockTime > 10000);
  Rank = other.GlueCEStateFreeCPUs;
]
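The bracketed attribute list in the example JDL file can also be produced programmatically. The following is a minimal Python sketch, not part of any gLite or EDG tool: the names `Expr`, `render_value` and `to_jdl` are hypothetical helpers introduced here for illustration. It assumes that string values are quoted, lists become `{...}`, and that expressions such as Requirements and Rank must be written without quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical wrapper: a raw expression that must NOT be quoted."""
    text: str


def render_value(value):
    """Render one Python value in JDL-like syntax (no quote escaping is done)."""
    if isinstance(value, Expr):
        return value.text
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(render_value(v) for v in value) + "}"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def to_jdl(attributes):
    """Render a dict of attributes as a bracketed list of 'Name = value;' lines."""
    lines = ["["]
    for name, value in attributes.items():
        lines.append(f"  {name} = {render_value(value)};")
    lines.append("]")
    return "\n".join(lines)


# The attribute values below are taken from the example JDL file on the slide.
job = {
    "JobType": "Normal",
    "Executable": "/exe/sum.exe",
    "InputSandbox": ["/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"],
    "OutputSandbox": ["sim.err", "test.out", "sim.log"],
    "Requirements": Expr(
        '(other.GlueHostOperatingSystemName == "linux") && '
        "(other.GlueCEPolicyMaxWallClockTime > 10000)"
    ),
    "Rank": Expr("other.GlueCEStateFreeCPUs"),
}

jdl_text = to_jdl(job)
print(jdl_text)
```

Keeping expressions in a separate wrapper type avoids accidentally quoting Requirements or Rank, which would turn them into plain strings.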
6
A “real world” JDL file
Job attributes part:
[
  JobType = "normal";
  Executable = "lexor_wrap.sh";
  StdOutput = "dc digit.A8_QCD._01730.job.log.3";
  StdError = "dc digit.A8_QCD._01730.job.log.3";
  OutputSandbox = {"metadata.xml", "lexor_wrap.log", "dq_337704_stagein.log", "dq_337704_stageout.log",\
    "dc digit.A8_QCD._01730.job.log.3" };
  RetryCount = 0;
  Arguments = "dc simul.A8_QCD._01730.pool.root,\
    dc digit.A8_QCD._01730.pool.root ";
  Environment = {
    "LEXOR_WRAPPER_LOG=lexor_wrap.log",
    "LEXOR_STAGEOUT_MAXATTEMPT=5",
    "LEXOR_STAGEOUT_INTERVAL=60",
    "LEXOR_LCG_GFAL_INFOSYS=atlas-bdii.cern.ch:2170",
    "LEXOR_T_RELEASE=8.0.7",
    "LEXOR_T_PACKAGE= /JobTransforms",
    "LEXOR_T_BASEDIR=JobTransforms ",
    "LEXOR_TRANSFORMATION=share/dc2.g4digit.trf",
    "LEXOR_STAGEIN_LOG=dq_337704_stagein.log",
    "LEXOR_STAGEIN_SCRIPT=dq_337704_stagein.sh",
    "LEXOR_STAGEOUT_LOG=dq_337704_stageout.log",
    "LEXOR_STAGEOUT_SCRIPT=dq_337704_stageout.sh" };
  MyProxyServer = "lxb0727.cern.ch";
  VirtualOrganisation = "atlas";
  rank = -other.GlueCEStateEstimatedResponseTime;
7
A “real world” JDL file (cont.)
Resource attributes part:
  requirements = ( Member("VO-atlas-lcg-release-0.0.2", other.GlueHostApplicationSoftwareRunTimeEnvironment)
    && (other.GlueCEStateStatus == "Production")
    && !Member("VO-atlas-has-m1", other.GlueHostApplicationSoftwareRunTimeEnvironment))
    && (other.GlueCEInfoHostName != "lcgce02.gridpp.rl.ac.uk" )
    && (other.GlueCEInfoHostName != "lcg-ce.lps.umontreal.ca" )
    && (other.GlueCEInfoHostName != "lcgce02.triumf.ca" )
    && (other.GlueCEInfoHostName != "ce-a.ccc.ucl.ac.uk" )
    && Member("VO-atlas-release-8.0.7", other.GlueHostApplicationSoftwareRunTimeEnvironment))
    && ( other.GlueCEPolicyMaxCPUTime >= (Member("LCG-2_1_0",other.GlueHostApplicationSoftwareRunTimeEnvironment) ? ( / 60 ) : ) / other.GlueHostBenchmarkSI00 ) )
    && ( other.GlueHostNetworkAdapterOutboundIP == true )
    && (other.GlueHostMainMemoryRAMSize >= 512 ) );
]
8
Requirements
- Job requirements on the resources
- Specified using GLUE attributes of resources published in the Information Service
- Its value is a boolean expression
- Only one Requirements expression can be specified (one C-like logic expression)
  - If there is more than one, only the last one is taken into account
- If not specified, the default value defined in the UI configuration file is used
  - Default: other.GlueCEStateStatus == "Production" (the resource has to be able to accept jobs and dispatch them on WNs)
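The way a Requirements expression filters resources can be sketched in Python. This is an illustration only, not the RB implementation: each CE is modelled as a plain dict of Glue attributes, the CE identifiers and attribute values are invented sample data, and a missing attribute is assumed to make the requirement fail. The default is applied only when no expression is given, as the slide describes.

```python
def default_requirements(ce):
    """Default from the slide: other.GlueCEStateStatus == "Production"."""
    return ce.get("GlueCEStateStatus") == "Production"


def user_requirements(ce):
    """Python rendering of the expression from the example JDL file:
    (other.GlueHostOperatingSystemName == "linux")
        && (other.GlueCEPolicyMaxWallClockTime > 10000)
    """
    return (
        ce.get("GlueHostOperatingSystemName") == "linux"
        and ce.get("GlueCEPolicyMaxWallClockTime", 0) > 10000
    )


def matching_ces(ces, requirements=None):
    """Return the IDs of the CEs for which the boolean expression is true."""
    check = requirements if requirements is not None else default_requirements
    return [ce["GlueCEUniqueId"] for ce in ces if check(ce)]


# Hypothetical resources; none of these hosts or values come from the slides.
ces = [
    {"GlueCEUniqueId": "ce-a.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Production",
     "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxWallClockTime": 172800},
    {"GlueCEUniqueId": "ce-b.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Draining",
     "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxWallClockTime": 172800},
    {"GlueCEUniqueId": "ce-c.example.org:2119/jobmanager-pbs-short",
     "GlueCEStateStatus": "Production",
     "GlueHostOperatingSystemName": "linux",
     "GlueCEPolicyMaxWallClockTime": 3600},
]

print(matching_ces(ces))                      # default expression
print(matching_ces(ces, user_requirements))   # user-supplied expression
```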
9
Relevant Glue Attributes 1 (State)
State (objectclass GlueCEState)
- GlueCEStateRunningJobs: number of running jobs
- GlueCEStateWaitingJobs: number of jobs not running
- GlueCEStateTotalJobs: total number of jobs (running + waiting)
- GlueCEStateStatus: queue status
  - queueing (jobs are accepted but not run)
  - production (jobs are accepted and run)
  - closed (jobs are neither accepted nor run)
  - draining (jobs are not accepted but those in the queue are run)
- GlueCEStateWorstResponseTime: worst possible time between the submission of a job and the start of its execution
- GlueCEStateEstimatedResponseTime: estimated time between the submission of a job and the start of its execution
- GlueCEStateFreeCPUs: number of CPUs available to the scheduler
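The four GlueCEStateStatus values differ in whether jobs are accepted and whether queued jobs are run. A small Python sketch of that table follows; the names `QUEUE_STATUS`, `accepts_jobs` and `runs_jobs` are illustrative only, and the case-insensitive lookup is an assumption made here for convenience.

```python
# status -> (new jobs are accepted, queued jobs are run), as listed on the slide
QUEUE_STATUS = {
    "queueing": (True, False),
    "production": (True, True),
    "closed": (False, False),
    "draining": (False, True),
}


def accepts_jobs(status):
    """True if a queue in this state accepts newly submitted jobs."""
    return QUEUE_STATUS[status.lower()][0]


def runs_jobs(status):
    """True if a queue in this state runs the jobs it holds."""
    return QUEUE_STATUS[status.lower()][1]


for name in QUEUE_STATUS:
    print(name, accepts_jobs(name), runs_jobs(name))
```

Only the production state both accepts and runs jobs, which is why it is a natural default for a Requirements expression.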
10
Relevant Glue Attributes 2 (Hardware)
Architecture (objectclass GlueHostArchitecture)
- GlueHostArchitecturePlatformType: platform description
- GlueHostArchitectureSMPSize: number of CPUs
Processor (objectclass GlueHostProcessor)
- GlueHostProcessorVendor: name of the CPU vendor
- GlueHostProcessorModel: name of the CPU model
- GlueHostProcessorVersion: version of the CPU
- GlueHostProcessorOtherProcessorDescription: other description for the CPU
[…]
11
Relevant Glue Attributes 3 (HW & Software)
Application software (objectclass GlueHostApplicationSoftware)
- GlueHostApplicationSoftwareRunTimeEnvironment: list of software installed on this host
Main memory (objectclass GlueHostMainMemory)
- GlueHostMainMemoryRAMSize: physical RAM
- GlueHostMainMemoryVirtualSize: size of the configured virtual memory
Benchmark (objectclass GlueHostBenchmark)
- GlueHostBenchmarkSI00: SpecInt2000 benchmark
- GlueHostBenchmarkSF00: SpecFloat2000 benchmark
Network adapter (objectclass GlueHostNetworkAdapter)
[…]
- GlueHostNetworkAdapterOutboundIP: permission for outbound connectivity
- GlueHostNetworkAdapterInboundIP: permission for inbound connectivity
12
Relevant Glue Attributes 4: policy of LRMS
- GlueCEPolicyMaxWallClockTime: maximum wall clock time available to jobs submitted to the CE, in seconds (previously it was in minutes)
- GlueCEPolicyMaxCPUTime: maximum CPU time available to jobs submitted to the CE, in seconds (previously it was in minutes)
- GlueCEPolicyMaxTotalJobs: maximum allowed total number of jobs in the queue
- GlueCEPolicyMaxRunningJobs: maximum allowed number of running jobs in the queue
13
Exercise: JDL Requirements
- other.GlueCEInfoLRMSType == "PBS" && other.GlueCEInfoTotalCPUs > 1
  (the resource has to use PBS as the LRMS and its WNs must have at least two CPUs)
- Member("CMSIM-133", other.GlueHostApplicationSoftwareRunTimeEnvironment)
  (a particular experiment software has to run on the resource, and this information is published in the resource environment)
  - The Member operator tests whether its first argument is a member of its second argument. It is used with multi-valued attributes.
- RegExp("cern.ch", other.GlueCEUniqueId)
  (the job has to run on the CEs in the domain cern.ch)
  - RegExp tests whether the attribute matches the regular expression
- (other.GlueHostNetworkAdapterOutboundIP == true) && Member("VO-alice-Alien", other.GlueHostApplicationSoftwareRunTimeEnvironment) && Member("VO-alice-Alien-v4-01-Rev-01", other.GlueHostApplicationSoftwareRunTimeEnvironment) && (other.GlueCEPolicyMaxWallClockTime > 86000)
  (the resource must have the packages VO-alice-Alien and VO-alice-Alien-v4-01-Rev-01 installed, and the job may run for more than 86000 wall-clock time units)
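The behaviour of the Member and RegExp operators described above can be imitated in Python. This is a sketch of the semantics as the slide states them, not the ClassAd implementation: `member` and `reg_exp` are hypothetical helper names, the CE record is invented sample data, and it is assumed that the pattern may match anywhere in the string.

```python
import re


def member(item, values):
    """Imitates Member(item, list): true if item is one of the list's values."""
    return item in values


def reg_exp(pattern, text):
    """Imitates RegExp(pattern, text), assuming a match anywhere in text."""
    return re.search(pattern, text) is not None


# Hypothetical CE record with a multi-valued software attribute.
ce = {
    "GlueCEUniqueId": "ce01.cern.ch:2119/jobmanager-lsf-grid01",
    "GlueHostNetworkAdapterOutboundIP": True,
    "GlueCEPolicyMaxWallClockTime": 90000,
    "GlueHostApplicationSoftwareRunTimeEnvironment": [
        "VO-alice-Alien",
        "VO-alice-Alien-v4-01-Rev-01",
    ],
}

software = ce["GlueHostApplicationSoftwareRunTimeEnvironment"]

# Python rendering of the last expression in the exercise.
alice_ok = (
    ce["GlueHostNetworkAdapterOutboundIP"] is True
    and member("VO-alice-Alien", software)
    and member("VO-alice-Alien-v4-01-Rev-01", software)
    and ce["GlueCEPolicyMaxWallClockTime"] > 86000
)

print(alice_ok)
print(reg_exp("cern.ch", ce["GlueCEUniqueId"]))
print(member("CMSIM-133", software))
```

In Python's `re` module an unescaped `.` matches any character, so the pattern `cern.ch` would also match a string such as `cernxch`; writing `cern\.ch` restricts it to a literal dot.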
14
Rank
- Expresses preference (how to rank resources that have already met the Requirements expression)
- It is expressed as a floating-point number
- The CE with the highest rank is the one selected (see Matchmaking later on)
- If not specified, the default value defined in the UI configuration file is used
- Example: -other.GlueCEStateEstimatedResponseTime (the lowest estimated traversal time)
  - Usually the default
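Because the CE with the highest rank wins, negating the estimated response time makes the CE with the shortest wait the preferred one. A minimal Python sketch of that selection follows; the CE names and response times are invented sample data, and `rank` and `best_ce` are hypothetical helpers.

```python
def rank(ce):
    """Python rendering of: Rank = -other.GlueCEStateEstimatedResponseTime"""
    return -float(ce["GlueCEStateEstimatedResponseTime"])


def best_ce(ces):
    """Return the CE with the highest rank value."""
    return max(ces, key=rank)


# Hypothetical CEs that are assumed to have passed the Requirements check.
candidates = [
    {"id": "ce-a.example.org", "GlueCEStateEstimatedResponseTime": 1200},
    {"id": "ce-b.example.org", "GlueCEStateEstimatedResponseTime": 300},
    {"id": "ce-c.example.org", "GlueCEStateEstimatedResponseTime": 4500},
]

chosen = best_ce(candidates)
print(chosen["id"], rank(chosen))
```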
15
WMS Matchmaking
16
The Matchmaking algorithm
- The goal of the matchmaker is to find the most suitable CE on which to execute the job
- To accomplish this task, the WMS interacts with other EGEE components (File Catalogue and Information Service)
- There are three different scenarios to deal with:
  - Direct job submission
  - Job submission without data-access requirements
  - Job submission with data-access requirements
17
The Matchmaking algorithm: direct job submission
- CE defined in the JDL
  - The WMS does not perform any matchmaking at all
  - The job is simply submitted to the specified CE
- CE defined in the edg-job-submit (glite-job-submit) command
  - If the CEId is specified, the WMS
    - Does NOT check whether the user is authorised to access the CE
    - Does NOT interact with the File Catalog to resolve file requirements
    - Only checks the JDL syntax while converting the JDL into a ClassAd
  - Syntax: edg-job-submit --resource <ce_id> <job.jdl>
    - ce_id = hostname:port/jobmanager-lsf-grid01
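Since direct submission skips matchmaking, a malformed CE identifier is only noticed at submission time, so it can help to check it first. The Python sketch below splits a ce_id of the form shown on the slide and assembles the command line without running it. The helper names, the example host and the port value are assumptions made for illustration; only the `--resource` option and the ce_id layout come from the slide.

```python
def parse_ce_id(ce_id):
    """Split 'hostname:port/queue-part' into (hostname, port, queue part)."""
    endpoint, slash, queue_part = ce_id.partition("/")
    host, colon, port = endpoint.partition(":")
    if not (slash and colon and host and queue_part and port.isdigit()):
        raise ValueError(f"not of the form hostname:port/name: {ce_id!r}")
    return host, int(port), queue_part


def direct_submit_command(ce_id, jdl_path, command="edg-job-submit"):
    """Build (but do not execute) the argument list for a direct submission."""
    parse_ce_id(ce_id)  # raises ValueError if the identifier is malformed
    return [command, "--resource", ce_id, jdl_path]


# Hypothetical CE identifier and JDL file name.
example_id = "ce.example.org:2119/jobmanager-lsf-grid01"
print(parse_ce_id(example_id))
print(direct_submit_command(example_id, "job.jdl"))
```

The resulting list could be handed to `subprocess.run` on a machine where the command is installed.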
18
The Matchmaking algorithm: job submission without data access requirements (I)
- The user JDL contains some requirements
- Once the JDL has been received by the WMS and converted into a ClassAd, the WMS invokes the matchmaker
- The matchmaker has to determine whether the characteristics and status of Grid resources match the job requirements
19
The Matchmaking algorithm: job submission without data access requirements (II)
- There are two phases of evaluation:
  - Requirements check
    - The Matchmaker contacts the BDII in order to create the set of suitable CEs: those compliant with the user requirements and where the user is authorized to submit jobs
  - Ranking phase
    - The Matchmaker contacts the BDII again to obtain the values of the attributes that appear in the rank expression
    - The CE with the maximum rank value is selected
    - If two or more CEs have the same rank, the Matchmaker selects one of them at random
- The Matchmaker can adopt a stochastic selection (enabling fuzziness)
  - The user has to set the JDL FuzzyRank attribute to true
  - The rank value determines the probability of selecting the CE: the higher the rank value, the higher the probability
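The two phases and the optional stochastic selection can be sketched in Python. This is a simplified model, not the WMS code: CEs are plain dicts, the BDII queries and the authorization check are left out, and the conversion of rank values into selection weights (shifting them so that every weight is positive) is an assumption made here so that negative ranks also work. The function name `select_ce` and the sample CEs are hypothetical.

```python
import random


def select_ce(ces, requirements, rank, fuzzy=False, rng=None):
    """Two-phase selection: filter by requirements, then pick by rank."""
    rng = rng or random.Random()

    # Phase 1: requirements check builds the set of suitable CEs.
    candidates = [ce for ce in ces if requirements(ce)]
    if not candidates:
        return None

    # Phase 2: ranking.
    ranks = [rank(ce) for ce in candidates]
    if not fuzzy:
        top = max(ranks)
        best = [ce for ce, r in zip(candidates, ranks) if r == top]
        return rng.choice(best)  # random pick among equally ranked CEs

    # Fuzzy mode (assumed weighting): higher rank means higher probability.
    lowest = min(ranks)
    weights = [r - lowest + 1.0 for r in ranks]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical CEs; free CPUs are used as the rank, as in the example JDL file.
ces = [
    {"id": "ce-a.example.org", "status": "Production", "free_cpus": 10},
    {"id": "ce-b.example.org", "status": "Production", "free_cpus": 1},
    {"id": "ce-c.example.org", "status": "Closed", "free_cpus": 50},
]


def in_production(ce):
    return ce["status"] == "Production"


def free_cpus(ce):
    return ce["free_cpus"]


print(select_ce(ces, in_production, free_cpus)["id"])

rng = random.Random(0)  # fixed seed so repeated runs draw the same sequence
draws = [select_ce(ces, in_production, free_cpus, fuzzy=True, rng=rng)["id"]
         for _ in range(2000)]
print({name: draws.count(name) for name in sorted(set(draws))})
```

With fuzzy selection the lower-ranked CE is still chosen occasionally, which spreads jobs over several resources instead of sending every job to the current top-ranked one.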
20
The Matchmaking algorithm: job submission with data access requirements (I)
- The user can specify the following attributes in the JDL
  - InputData represents the input files
    InputData = {"lfn:my-file-001"}
    (lfn = logical file name, see Data Management)
  - OutputSE represents the SE where the output file should be staged
    OutputSE = "gilda-se-01.pd.infn.it";
  - OutputData represents the output files
    OutputFile = "dummy.dat";
    StorageElement = "gilda-se-01.pd.infn.it";
    LogicalFileName = "lfn:iome_outputData";
  - DataAccessProtocol represents the protocol spoken by the application to access the file
    DataAccessProtocol = "gsiftp";
[Figure: the Matchmaker/Broker interacting with the FC and the IS]
21
The Matchmaking algorithm: job submission with data access requirements (II)
- The Matchmaker finds the most suitable CEs taking into account
  - the SEs where input data are physically stored
  - the SE where output data should be staged
- Before the requirements and ranking checks, the broker
  - Performs a pre-match processing step that interacts with the File Catalog
  - Filters CEs satisfying both data access and user authorization requirements
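The pre-match step can be pictured as a filter that runs before the requirements and rank evaluation. The Python sketch below is a simplified model under stated assumptions: the File Catalogue is a dict from logical file names to the SEs holding a replica, each CE lists the SEs it can reach and the VOs it accepts, and a CE passes only if it can reach a replica of every input file and the requested output SE. The field names, the CE hosts and the VO name are invented; the LFN and SE name are the ones used on the slides.

```python
def pre_match(ces, input_lfns, replica_catalogue, user_vo, output_se=None):
    """Return the IDs of CEs that satisfy data access and authorization."""
    selected = []
    for ce in ces:
        # Authorization: the user's VO must be accepted by the CE.
        if user_vo not in ce["allowed_vos"]:
            continue
        reachable = set(ce["reachable_ses"])
        # Data access: every input file needs a replica on a reachable SE.
        inputs_ok = all(
            reachable.intersection(replica_catalogue.get(lfn, ()))
            for lfn in input_lfns
        )
        output_ok = output_se is None or output_se in reachable
        if inputs_ok and output_ok:
            selected.append(ce["id"])
    return selected


# Stand-in for the File Catalogue lookup (LFN -> SEs holding a replica).
catalogue = {"lfn:my-file-001": ["gilda-se-01.pd.infn.it"]}

# Hypothetical CEs.
ces = [
    {"id": "ce-a.example.org", "allowed_vos": {"gilda"},
     "reachable_ses": {"gilda-se-01.pd.infn.it"}},
    {"id": "ce-b.example.org", "allowed_vos": {"gilda"},
     "reachable_ses": {"se-other.example.org"}},
    {"id": "ce-c.example.org", "allowed_vos": {"atlas"},
     "reachable_ses": {"gilda-se-01.pd.infn.it"}},
]

suitable = pre_match(ces, ["lfn:my-file-001"], catalogue, "gilda",
                     output_se="gilda-se-01.pd.infn.it")
print(suitable)
```

Only the CEs returned by this filter would then go through the requirements check and the ranking phase.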
22
The Matchmaking algorithm: job submission with data access requirements (III)
Summary
- The Matchmaker interacts with a File Catalogue and the Information Service
  - The FC is used to resolve the location of data (see the Data Management talk for more details)
- The Matchmaker finds the most suitable CEs considering
  - SEs where input data are physically stored
  - SEs where output data should be staged
- Before the requirements and ranking checks, the broker
  - Performs a pre-match processing step (accesses the FC)
  - Filters CEs satisfying both data access and user authorization requirements