Download presentation
Presentation is loading. Please wait.
Published byAgatha Heath Modified over 9 years ago
1
INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Workload Management System Mike Mineter mjm@nesc.ac.uk
2
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 2 Contents What is the Workload Management System (WMS)? How do you use it? Further information Practicals
3
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 3 Without WMS… Without the WMS, need direct interaction with nodes –Need to know resource addresses, capabilities Usually want a higher level abstraction – submit a job to a Grid not to a CE User Nodes
4
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 4 Basics Why does the Workload Management System exist? Grids have –Many users –Running many jobs – a “job” = an executable you want to run –Where many compute nodes are available –Workload Management System is a software service that makes running jobs easier for the user It builds on the basic grid services –E.g. Authorisation, Authentication, Security, Information Systems, Job submission Terminology: “Compute element”: defined as a batch queue - One cluster can have many queues
5
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 5 Which CE do you want to use? Without the WMS, use the Information System to see what’s available, then choose… lcg-infosites --vo gilda ce #CPU Free Total Jobs Running Waiting ComputingElement ---------------------------------------------------------- 10 10 1 0 1 grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-short 10 10 0 0 0 grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-long 10 10 2 0 2 grid011f.cnaf.infn.it:2119/jobmanager-lcgpbs-infinite 48 48 0 0 0 grid010.ct.infn.it:2119/jobmanager-lcgpbs-short 48 48 0 0 0 grid010.ct.infn.it:2119/jobmanager-lcgpbs-long 48 48 0 0 0 grid010.ct.infn.it:2119/jobmanager-lcgpbs-infinite …….[30% shown]. WMS does this for you! –chooses CE for each job, balances workload, manages jobs and their files
6
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 6 With WMS WMS manages jobs on users’ behalf User doesn’t decide where jobs are run User defines the job and its requiremements, WMS matches this with available CEs Effect: Easier submission Users insulated from change in Compute elements WMS – can optimise your jobs – e.g. which CE? UserWMS Compute Elements
7
Enabling Grids for E-sciencE INFSO-RI-508833 7 WMS CEs Local Workstation UI UI (user interface) has preinstalled client software WMS User describes job in text file using Job Description Language Submits job to WMS using (usually) the command-line interface ssh Workload Management System
8
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 8 Using WMS Jobs run in batch mode on grids. Steps in running a job on a gLite grid with WMS: 1.Create a text file in “Job Description Language” 2.Optional check: list the compute elements that match your requirements (“list match” command) 3.Submit the job ~ “glite-wms-job-submit myfile.jdl” Non-blocking - Each job is given an id. 4.Occasionally check the status of your job 5.When “Done” retrieve output
9
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 9 JDL-file attributes Executable – sets the name of the executable file; Arguments – command line arguments of the program; StdOutput, StdError - files for storing the standard output and error messages output; InputSandbox – set of input files needed by the program, including the executable; OutputSandbox – set of output files which will be written during the execution, including standard output and standard error output; these are sent from the CE to the WMS for you to retrieve ShallowRetryCount – in case of grid error, retry job this many times (“Shallow”: before job is running)
10
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 10 Executable = “gridTest”; StdError = “stderr.log”; StdOutput = “stdout.log”; InputSandbox = {“/home/joda/test/gridTest”}; OutputSandbox = {“stderr.log”, “stdout.log”}; Requirements = other.GlueCEPolicyMaxCPUTime > 480; ShallowRetryCount = 3; Example JDL file
11
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 11 Job states Flag Meaning SUBMITTED submission logged in the Logging & Bookkeeping service WAITjob match making for resources READYjob being sent to executing CE SCHEDULEDjob scheduled in the CE queue manager RUNNING job executing on a Worker Node of the selected CE queue DONEjob terminated without grid errors CLEAREDjob output retrieved ABORTjob aborted by middleware, check reason
12
Enabling Grids for E-sciencE INFSO-RI-508833 12 WMS: role of WMProxy CEs Local Workstation UI UI (user interface) has preinstalled client software Workload Manager Client on the UI communicates with the “WM Proxy” On UI run: glite-wms-…commands WMProxy acts on your behalf in using the WM – it needs a “delegated proxy” – hence “- a” option on commands WMProxy
13
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 13 WMProxy WMProxy is a SOAP Web service providing access to the Workload Management System (WMS) Job characteristics specified via JDL –jobRegister create id map to local user and create job dir register to L&B return id to user –input files transfer –jobStart register sub-jobs to L&B map to local user and create sub-job dir’s unpack sub-job files deliver jobs to WM
14
Enabling Grids for E-sciencE INFSO-RI-508833 14 More about WMProxy CEs Local Workstation UI UI (user interface) has preinstalled client software Workload Manager WMPProxy can manage complex jobs Before WMProxy, user had to script or create software to manage these on the UI WMProxy
15
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 15 Complex Jobs Direct Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs A Collection is a group of jobs with no dependencies –basically a collection of JDL’s –Can have common sandbox A Parametric job is a job having one or more attributes in the JDL that vary their values according to parameters It is possible to have one shot submission of a (possibly very large, up to thousands) group of jobs –Submission time reduction Single call to WMProxy server Single Authentication and Authorization process Sharing of files between jobs –Availability of both a single Job Id to manage the group as a whole and an Id for each single job in the group D C A E B
16
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 16 Status of WMProxy For simple jobs: glite-wms-… becoming the recommended way to use the WMS History: –Before the glite-wms- commands we had glite- commands used the WMS without WMProxy –Before the glite- commands we had edg- commands (edg-job-submit….) European Data Grid – project before EGEE Used the “resource broker” Still very widely used –You might see these commands still in use. Status –Complex jobs with WMProxy: not yet in routine production use –Watch for news!
17
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 17 Further information gLite Users Guide –Follow http://www.glite.org and “Documentation”http://www.glite.org GILDA wiki –We are using some of these pages –https://grid.ct.infn.it/twiki/bin/view/GILDA/https://grid.ct.infn.it/twiki/bin/view/GILDA/ EGEE Digital Library http://egee.lib.ed.ac.uk/http://egee.lib.ed.ac.uk/
18
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 18 Practicals You will need to be know UNIX (Linux) a bit – edit files and commands –Work with someone if this is new to you Follow links on the agenda page –“Practical_1a” Create a simple JDL file List the CEs that can accept it Submit it Check its status until its done Retrieve output – “Practical_1b” Uses a script – time for that one exercise only from this page –Practical 2: complex jobs - see the benefit of WMProxy
19
Enabling Grids for E-sciencE INFSO-RI-508833 Workload Management System – 29 September 2007 19 Summary Logging & Book-keeping WMSStorageElementComputingElement InformationService Job Status DataSets info Author. &Authen. Job Submit Event Job Query Job Status Input “sandbox” Input “sandbox” + Broker Info Output “sandbox” Publish SE & CE info “User interface” LCG FileCatalogue (LFC)
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.