Download presentation
Presentation is loading. Please wait.
Published byHugh Gray Modified over 9 years ago
1
EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks gLite job submission Fokke Dijkstra Donald Smits Centre for Information Technology, University of Groningen Utrecht, Grid Tutorial 2008
2
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 2 Introduction ?
3
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 3 Components in the EGEE Grid Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (BDII) LCG File Catalog (LFC) JDL User Interface (UI)
4
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 4 Workload Management System Tasks of the WMS –Find the best resource for your tasks (jobs) –Submit jobs to compute resources –Logging and book keeping –Delegated Grid credential management
5
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 5 Job preparation You need to provide –A complete (enough) job description What program? What data? Any requirements on OS, installed software, ?? –Possibly a program You’re submitting in unknown territory! Program portably! Don’t rely on hard-coded paths or special locations The program you send may not even be in $HOME! –Perhaps some input data –Perhaps instructions on what to do with the output
6
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 6 Here is a minimal job description (call it hello.jdl) We specified –The program to run and its arguments –Directed the standard error and output streams to files –Told it what to do with the output Executable = “/bin/echo”; Arguments = “Goedemiddag”; StdError = “stderr.log”; StdOutput = “stdout.log”; OutputSandbox = {“stderr.log”, “stdout.log”}; How to Write a Job Description
7
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 7 User issues a voms-proxy-init –enters his certificate’s password –Receives a valid Globus proxy User issues a: glite-wms-job-submit -a mytest.jdl and gets back from the system a unique Job Identifier (JobId) User issues a: glite-wms-job-status JobId to get logging information about the current status of his Job When the “Done” status is reached, the user can issue a glite-wms-job-output JobId and the system returns the name of the temporary directory where the job output can be found on the UI machine. Job Submission Example
8
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 8 $ voms-proxy-init --voms tutor Cannot find file or dir: /admins/fokke/.glite/vomses Enter GRID pass phrase: Your identity: /O=dutchgrid/O=users/O=rug/OU=rc/CN=Fokke Dijkstra Creating temporary proxy........................................... Done Contacting voms.grid.sara.nl:30007 [/O=dutchgrid/O=hosts/OU=sara.nl/CN=voms.grid.sara.nl] "tutor" Done Creating proxy................................................ Done Your proxy is valid until Wed Nov 5 23:11:27 2008 $ glite-wms-job-submit -a hello.jdl Connecting to the service https://wms.grid.sara.nl:7443/glite_wms_wmproxy_server ====================== glite-wms-job-submit Success ====================== The job has been successfully submitted to the WMProxy Your job identifier is: https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ ========================================================================== JobId Submitting it
9
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 9 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) Job Status submitted LCG File Catalog (LFC) User Interface (UI) waiting Job + Input sandbox A Job Submission Example readyscheduled Job + Input sandbox
10
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 10 $ glite-wms-job-status https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ ************************************************************* BOOKKEEPING INFORMATION: Status info for the Job : https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ Current Status: Scheduled Status Reason: Job successfully submitted to Globus Destination: ce.grid.rug.nl:2119/jobmanager-pbs-long Submitted: Wed Nov 5 11:12:15 2008 CET ************************************************************* Checking the status
11
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 11 Check status using browser
12
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 12 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) LCG File Catalog (LFC) Job Status submitted waiting User Interface (UI) readyscheduledrunning Output Sandbox done A Job Submission Example
13
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 13 $ glite-wms-job-output https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ Connecting to the service https://wms.grid.sara.nl:7443/glite_wms_wmproxy_server ============================================================================== JOB GET OUTPUT OUTCOME Output sandbox files for the job: https://wms.grid.sara.nl:9000/V7pw7lTR4MeFMVAz12larQ have been successfully retrieved and stored in the directory: /tmp/jobOutput/fokke_V7pw7lTR4MeFMVAz12larQ =============================================================================== $ cat /tmp/jobOutput/fokke_V7pw7lTR4MeFMVAz12larQ Goedemiddag Getting the Output
14
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 14 UI JDL Workload Management System (WMS) Storage Element (SE) Computing Element (CE) Information System (IS) LCG File Catalog (LFC) Output Sandbox cleared submitted waiting ready scheduled running done Job Status User Interface (UI) A Job Submission Example
15
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 15 Job Description Language Job Description Language based on Classified Advertisement language Lines: –Attribute = expression; –Can be multiple lines, semicolon is separator –”for strings” –# and // for comments –No blanks after ; !!
16
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 16 The supported attributes are grouped in two categories: –Job Define the job itself –Resources Taken into account by the WMS for carrying out the matchmaking algorithm Computing Resource (Attributes) Used to build expressions of Requirements and/or Rank attributes by the user Have to be prefixed with “other.” Data and Storage resources (Attributes) Input data to process, SE where to store output data, protocols spoken by application when accessing SEs Types of Attributes
17
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 17 Executable (mandatory) –The command name Arguments (optional) –Job command line arguments StdInput, StdOutput, StdErr (optional) –Standard input/output/error of the job Environment (optional) –List of environment settings InputSandbox (optional) –List of files on the UI local disk needed by the job for running –The listed files are staged from the UI to the remote CE –Wildcards allowed –Unique filenames required OutputSandbox (optional) –List of files, generated by the job, which have to be retrieved Job Definition Attributes
18
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 18 Requirements –Job requirements on computing resources –Specified using attributes of resources published in the Information System –other.GlueCEStateStatus == "Production" always included (the resource has to be in the Production grid) –Useful requirements: Wallclock time and specific sites: Requirements = other.GlueCEPolicyMaxWallClockTime > 720 && RegExp(“nikhef.nl", other.GlueCEUniqueID); Specific tag published: Requirements = Member("VO-ncf-gromacs- 3.3.2",other.GlueHostApplicationSoftwareRunTimeEnvironment); –Logical expressions: &&: and ||: or !: not Resource Attributes
19
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 19 InputData (optional) –Refers to data used as input by the job: these data are published in the Replica Catalog and stored in the SEs) –GUIDs and/or LFNs –Job must be sent to CE that has the data nearby DataAccessProtocol (mandatory if InputData specified) –The protocol or the list of protocols which the application is able to speak with for accessing InputData on a given SE Data Attributes
20
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 20 The WMS has to find the best suitable CE where the job will be executed It interacts with Data Management service and Information System The CE chosen has to match the job requirements If 2 or more CEs satisfy all the requirements, the one with the best Rank is chosen –Specified using attributes of resources published in the Information Service –If not specified, default value is used: Rank = -other.GlueCEStateEstimatedResponseTime; quickest response time WMS match making and ranking
21
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 21 Executable = “gridTest”; StdError = “stderr.log”; StdOutput = “stdout.log”; InputSandbox = {“/home/joda/test/gridTest”}; OutputSandbox = {“stderr.log”, “stdout.log”}; InputData = “lfn:/grid/tutor/testbed0-00019”; DataAccessProtocol = “gridftp”; Requirements = other.Architecture==“INTEL” && \ other.OpSys==“CentOS” && other.FreeCpus >=4; Rank = “other.GlueHostBenchmarkSF00”; Example JDL File
22
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 22 glite-wms-job-submit [-a] [-d ] [-o ] -o the generated jobId is written in the Useful for other commands, e.g.: glite-wms-job-status –i (or jobId) -i the status information about edg_jobId contained in the are displayed -a use automatic delegation -d use an existing delegated proxy at the WMS e.g. one generated using: glite-wms-job-delegate-proxy –d Job Submission
23
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 23 glite-wms-job-list-match Lists resources matching a job description Performs the matchmaking without submitting the job glite-wms-job-cancel Cancels a given job glite-wms-job-status Displays the status of the job glite-wms-job-output Returns the job-output (the OutputSandbox files) to the user glite-wms-job-logging-info Displays logging information about submitted jobs (all the events “pushed” by the various components of the WMS) Very useful for debug purposes Other WMS UI Commands
24
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 24 Why? –To avoid job failure because it outlived the validity of the initial proxy –To prevent long term proxies from lying around –Use a safe system for storing long term proxies WMS support automatic proxy renewal mechanism as long as the user credentials are handled by a proxy server. 1.Create a proxy using voms-proxy-init --voms 2.Register this proxy with the MyProxy server using myproxy-init -s [-t -c ] -d -n server is the server address (e.g. px.matrix.sara.nl) cred is the number of hours the proxy should be valid on the server proxy is the number of hours renewed proxies should be valid 3.The Proxy is automatic renewed by WMS without user intervention for all the job life Proxy Renewal
25
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 25 Advanced Job types: Job Collection Set of independent jobs –Collect jobs in single directory: glite-wms-job-submit -a --collection –Advanced collection using global set of attributes [ Type = "Collection"; InputSandbox = {"myjob.exe", "fileA"}; OutputSandboxBaseDestURI = "gsiftp://lxb0707.cern.ch/data/doe"; DefaultNodeShallowRetryCount = 5; Nodes = { [ Executable = "myjob.exe"; InputSandbox = {root.InputSandbox, "fileB"}; OutputSandbox = {"myoutput1.txt"}; Requirements = other.GlueCEPolicyMaxWallClockTime > 1440; ], [ NodeName = "mysubjob"; Executable = "myjob.exe"; OutputSandbox = {"myoutput2.txt"}; ShallowRetryCount = 3; ], [ File = "/home/doe/test.jdl"; ] } ]
26
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 26 Advanced Job types: Parametric Identical jobs, except the value of a parameter Parameters –List of items –Number ParameterStart and ParameterStep necessary –_PARAM_ replaced by parameter value [ JobType = "Parametric"; Executable = "myjob.exe"; StdInput = "input_PARAM_.txt"; StdOutput = "output_PARAM_.txt"; StdError = "error_PARAM_.txt"; Parameters = 100; ParameterStart = 1; ParameterStep = 1; InputSandbox = {"myjob.exe", "input_PARAM_.txt“}; OutputSandbox = {"output_PARAM_.txt", "error_PARAM_.txt"}; ]
27
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 27 Advanced job types: MPI For programs using the MPI parallel library JobType=”MPICH” NodeNumber = –Request n cores on the remote cluster –Scheduling is determined at remote site Submit script that starts up your program using MPI –You can use mpi-start for this Scheduling SMP nodes not yet possible
28
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 28 Other Advanced Job types Direct Acyclic Graph –Graph shows dependencies between jobs Interactive –Opens graphical window that connects to job nodeA nodeBnodeC mynode nodeD
29
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 29 Pilot jobs Send agent to a site first –Will fetch workload from central service –Both single and multi user frameworks exist Advantages –Hides problematic sites from the user –Probably less overhead per workload –Makes central scheduling possible Disadvantage –Less efficient scheduling at site –Security concerns with multiple users
30
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 30 How to get your program on the Grid? Send it with job –Binary package –Compile it on the fly Use software manager accounts –Write permission on special shared directory –Publish tags in information system Use preinstalled packages –Only possible for special collaborations Example: VL-e software stack Advantages – You will be able to run anywhere Disadvantages –Portability extremely important, otherwise jobs will fail –Overhead Advantages –Software can be validated –Central management takes burden from users Disadvantages –Sites have to support this –Lot of work for software manager Advantages –Very easy for users Disadvantages –Sites have to support it –Lot of work for software packager
31
Enabling Grids for E-sciencE EGEE-III INFSO-RI-222667 gLite job submission 31 Pointers to advanced topics Can be found in gLite user guide: http://glite.web.cern.ch/glite/documentation/ http://glite.web.cern.ch/glite/documentation/ Advanced sandbox play –Gridftp instead of local files No space on WMS needed Brokerinfo –Information about local environment (CE, SEs, etc.) Job perusal –Peek at the output while your job is running Automatic retries –RetryCount –ShallowRetryCount
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.