INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Architecture of the gLite Workload Management System Giuseppe Andronico INFN EGEE Tutorial.


INFSO-RI Enabling Grids for E-sciencE Architecture of the gLite Workload Management System Giuseppe Andronico INFN EGEE Tutorial Taipei,

Enabling Grids for E-sciencE INFSO-RI EGEE Tutorial, Seoul
Overview of gLite WMS: Job Management Services
The main services related to job management and execution are:
- Computing element: job management (job submission, job control, etc.); it must also provide information about its characteristics and status.
- Workload management: the core component, discussed in detail.
- Accounting: a special case, as it will eventually take into account computing, storage and network resources.
- Job provenance: keeps track of the definition of submitted jobs, execution conditions and environment, and important points of the job life cycle for a long period.
  - used for debugging, post-mortem analysis, and comparison of job executions
- Package manager: automates the process of installing, upgrading, configuring, and removing software packages from a shared area on a grid site.
  - an extension of a traditional package management system to a Grid

WMS Services: Architecture Overview
[Figure: architecture diagram. Labels: UI; Replica Catalog; Information System; Storage Element; Resource Broker Node (Workload Manager, WM); Logging & Bookkeeping (job status); Computing Element (Grid Interface, LRMS); LCG: Network Server, Match Maker, Job Adapter, Job Controller - CondorG; gLite: Network Server, Workload Manager, Task Queue, Match Maker, Information Supermarket, Job Submission.]

WMS's Scheduling Policies
The WM can adopt:
- eager scheduling ("push" model)
  - a job is bound to a resource as soon as possible and, once the decision has been taken, the job is passed to the selected resource for execution
- lazy scheduling ("pull" model)
  - the job is held by the WM until a resource becomes available, at which point that resource is matched against the submitted jobs and the job that fits best is passed to the resource for immediate execution
Varying degrees of eagerness (or laziness) are applicable.
At the match-making level:
- eager scheduling implies matching a job against multiple resources
- lazy scheduling implies matching a resource against multiple jobs
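The difference between the two match-making directions can be illustrated with a small Python sketch. This is not WMS code: the Job and Resource classes, the CPU-based requirement check and both ranking rules are assumptions invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    min_cpus: int  # hypothetical requirement: CPUs the job needs


@dataclass(frozen=True)
class Resource:
    name: str
    free_cpus: int  # hypothetical resource attribute


def fits(job, resource):
    """Toy requirement check: the resource has enough free CPUs."""
    return resource.free_cpus >= job.min_cpus


def eager_match(job, resources):
    """Push model: match ONE job against MANY resources and bind it now."""
    candidates = [r for r in resources if fits(job, r)]
    # Toy ranking (assumption): prefer the resource with the most free CPUs.
    return max(candidates, key=lambda r: r.free_cpus, default=None)


def lazy_match(resource, pending_jobs):
    """Pull model: match ONE available resource against MANY held jobs."""
    candidates = [j for j in pending_jobs if fits(j, resource)]
    # Toy ranking (assumption): prefer the job that uses the resource most fully.
    return max(candidates, key=lambda j: j.min_cpus, default=None)


resources = [Resource("ce-a", 2), Resource("ce-b", 8)]
jobs = [Job("small", 1), Job("large", 6)]

print(eager_match(jobs[1], resources))  # resource chosen for job "large"
print(lazy_match(resources[0], jobs))   # job chosen for resource "ce-a"
```

In the eager case the decision is driven by the arrival of a job; in the lazy case it is driven by a resource becoming available.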

WMS's Architecture
[Figure: WMS architecture diagram, annotated over several slides.] Callouts on the diagram, in order:
- Job management requests (submission, cancellation) are expressed via a Job Description Language (JDL).
- Keeps submission requests. Requests are kept for a while if no matching resources are available.
- Repository of resource information available to the matchmaker. Updated via notifications and/or active polling of sources.
- Finds an appropriate CE for each submission request, taking into account job requests and preferences, Grid status, and utilization policies on resources.
- Performs the actual job submission and monitoring.

The Information Supermarket
The ISM represents one of the most notable improvements in the WM as inherited from the EU DataGrid (EDG) project:
- decoupling between the collection of information concerning resources and its use
  - allows flexible application of different policies
The ISM basically consists of a repository of resource information that is available in read-only mode to the matchmaking engine:
- the update is the result of
  - the arrival of notifications
  - active polling of resources
  - some arbitrary combination of both
- it can be configured so that certain notifications trigger the matchmaking engine
  - improves the modularity of the software
  - supports the implementation of lazy scheduling policies

The Task Queue
The Task Queue represents the second most notable improvement in the WM internal design:
- possibility to keep a submission request for a while if no resources that match the job requirements are immediately available
  - technique used by the AliEn and Condor systems
Non-matching requests:
- will be retried either periodically
  - eager scheduling approach
- or as soon as notifications of available resources appear in the ISM
  - lazy scheduling approach
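A minimal Python sketch of the idea, assuming a toy model in which resources are a mapping from a CE name to its number of free CPUs. The TaskQueue class and its method names are hypothetical and do not reflect the actual WM implementation.

```python
from collections import deque


class TaskQueue:
    """Toy task queue: holds requests that could not be matched yet."""

    def __init__(self):
        self.pending = deque()  # (job_name, min_cpus) pairs waiting for a match

    @staticmethod
    def _match(min_cpus, resources):
        # First CE with enough free CPUs (assumption: no ranking at all).
        for ce_name, free_cpus in resources.items():
            if free_cpus >= min_cpus:
                return ce_name
        return None

    def submit(self, job_name, min_cpus, resources):
        """Try to match immediately; keep the request if nothing matches."""
        ce_name = self._match(min_cpus, resources)
        if ce_name is None:
            self.pending.append((job_name, min_cpus))
        return ce_name

    def on_resource_update(self, resources):
        """Retry held requests when new resource information arrives.

        Calling this from a timer corresponds to the periodic retry;
        calling it on an ISM notification corresponds to the lazy approach.
        """
        free = dict(resources)  # local copy so CPUs can be reserved
        dispatched, still_waiting = {}, deque()
        while self.pending:
            job_name, min_cpus = self.pending.popleft()
            ce_name = self._match(min_cpus, free)
            if ce_name is None:
                still_waiting.append((job_name, min_cpus))
            else:
                dispatched[job_name] = ce_name
                free[ce_name] -= min_cpus
        self.pending = still_waiting
        return dispatched


queue = TaskQueue()
print(queue.submit("job1", 4, {"ce-a": 2}))               # no match: request is kept
print(queue.on_resource_update({"ce-a": 2, "ce-b": 8}))   # retried after an update
```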

Job Logging & Bookkeeping
L&B tracks jobs in terms of events:
- important points of job life (submission, finding a matching CE, starting execution, etc.), gathered from various WMS components
The events are passed to a physically close component of the L&B infrastructure, the locallogger:
- this avoids network problems
- it stores them in a local disk file and takes over the responsibility to deliver them further
The destination of an event is one of the bookkeeping servers:
- assigned statically to a job upon its submission
- processes the incoming events to give a higher-level view of
  - the job states (Submitted, Running, Done)
  - various recorded attributes (JDL, destination CE name, job exit code)
Retrieval of both job states and raw events is available via legacy (EDG) and WS querying interfaces:
- users may also register to receive notifications on particular job state changes

Job Submission Services
WMS components handling the job during its lifetime and performing the submission:
- Job Adapter, responsible for
  - making the final touches to the JDL expression for a job, before it is passed to CondorC for the actual submission
  - creating the job wrapper script that creates the appropriate execution environment in the CE worker node (transfer of the input and of the output sandboxes)
- CondorC, responsible for
  - performing the actual job management operations (job submission, job removal)
- DAGMan, a meta-scheduler
  - its purpose is to navigate the graph, determine which nodes are free of dependencies, and follow the execution of the corresponding jobs
  - an instance is spawned by CondorC for each handled DAG
- Log Monitor, responsible for
  - watching the CondorC log file
  - intercepting interesting events concerning active jobs (events affecting the job state machine)
  - triggering appropriate actions

Job Preparation
Information to be specified when a job has to be submitted:
- job characteristics
- job requirements and preferences on the computing resources
  - also including software dependencies
- job data requirements
The information is specified using a Job Description Language (JDL):
- based upon Condor's CLASSified ADvertisement language (ClassAd)
  - a fully extensible language
- a ClassAd
  - is constructed with the classad construction operator []
  - is a sequence of attributes separated by semicolons
  - an attribute is a pair (key, value), where the value can be a Boolean, an Integer, a list of strings, ...
  - each attribute is written as: attribute = value;

Job Description Language (JDL)
The supported attributes are grouped into two categories:
- Job Attributes
  - define the job itself
- Resources
  - taken into account by the Workload Manager for carrying out the matchmaking algorithm (to choose the "best" resource to which the job is submitted)
  - Computing Resource
    - used to build expressions of Requirements and/or Rank attributes by the user
    - have to be prefixed with "other."
  - Data and Storage resources
    - input data to process, Storage Element (SE) where to store output data, protocols spoken by the application when accessing SEs

JDL: Relevant Attributes (1)
- JobType
  - Normal (simple, sequential job), DAG, Interactive, MPICH, Checkpointable
- Executable (mandatory)
  - the command name
- Arguments (optional)
  - job command line arguments
- StdInput, StdOutput, StdError (optional)
  - standard input/output/error of the job
- Environment
  - list of environment settings
- InputSandbox (optional)
  - list of files on the UI's local disk needed by the job for running
  - the listed files will be staged automatically to the remote resource
- OutputSandbox (optional)
  - list of files, generated by the job, which have to be retrieved

JDL: Relevant Attributes (2)
- Requirements
  - job requirements on computing resources
  - specified using attributes of resources published in the Information Service
  - if not specified, the default value defined in the UI configuration file is used
  - default: other.GlueCEStateStatus == "Production" (the resource has to be able to accept jobs and dispatch them on WNs)
- Rank
  - expresses preference (how to rank resources that have already met the Requirements expression)
  - specified using attributes of resources published in the Information Service
  - if not specified, the default value defined in the UI configuration file is used
  - default: -other.GlueCEStateEstimatedResponseTime (the lowest estimated traversal time)
  - default for parallel jobs (see later): other.GlueCEStateFreeCPUs (the highest number of free CPUs)
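The interplay of Requirements (a filter) and Rank (an ordering) can be sketched in Python. This is only an illustration of the two defaults quoted above: the CE records, their identifiers and values, and the matchmake function are invented for the example and are not the Workload Manager's code.

```python
# Hypothetical CE records; only the Glue attribute names come from the slide.
ces = [
    {"id": "ce1", "GlueCEStateStatus": "Production",
     "GlueCEStateEstimatedResponseTime": 300},
    {"id": "ce2", "GlueCEStateStatus": "Draining",
     "GlueCEStateEstimatedResponseTime": 10},
    {"id": "ce3", "GlueCEStateStatus": "Production",
     "GlueCEStateEstimatedResponseTime": 60},
]


def default_requirements(other):
    """Mirrors: other.GlueCEStateStatus == "Production"."""
    return other["GlueCEStateStatus"] == "Production"


def default_rank(other):
    """Mirrors: -other.GlueCEStateEstimatedResponseTime (higher rank is better)."""
    return -other["GlueCEStateEstimatedResponseTime"]


def matchmake(candidates, requirements=default_requirements, rank=default_rank):
    """Keep the CEs that satisfy Requirements, best Rank first."""
    eligible = [ce for ce in candidates if requirements(ce)]
    return sorted(eligible, key=rank, reverse=True)


for ce in matchmake(ces):
    print(ce["id"], default_rank(ce))
```

Note that ce2 has the shortest estimated response time but is never ranked, because Rank is only evaluated on resources that already satisfy Requirements.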

JDL: Relevant Attributes (3)
- InputData
  - refers to data used as input by the job: these data are published in the Replica Catalog and stored in the Storage Elements
  - LFNs and/or GUIDs
- InputSandbox
  - executable, files, etc. that are sent to the job
- DataAccessProtocol (mandatory if InputData has been specified)
  - the protocol or the list of protocols that the application is able to speak for accessing InputData on a given Storage Element
- OutputSE
  - the Uniform Resource Identifier of the output Storage Element
  - the RB uses it to choose a Computing Element that is compatible with the job and is close to the Storage Element
Details in the Data Management lecture.

Example of JDL File
  [
    JobType = "Normal";
    Executable = "gridTest";
    StdError = "stderr.log";
    StdOutput = "stdout.log";
    InputSandbox = {"/home/mydir/test/gridTest"};
    OutputSandbox = {"stderr.log", "stdout.log"};
    InputData = {"lfn:/glite/myvo/mylfn"};
    DataAccessProtocol = "gridftp";
    Requirements = other.GlueHostOperatingSystemNameOpSys == "LINUX"
                   && other.GlueCEStateFreeCPUs >= 4;
    Rank = other.GlueCEPolicyMaxCPUTime;
  ]

Job State Machine (1/9 to 9/9)
1. Submitted: the job has been entered by the user at the User Interface but not yet transferred to the Network Server for processing.
2. Waiting: the job has been accepted by the NS and is waiting for Workload Manager processing or is being processed by WM Helper modules.
3. Ready: the job has been processed by the WM and its Helper modules (a CE has been found) but not yet transferred to the CE (local batch system queue) via JC and CondorC. This state does not exist for a DAG, as the DAG is not subjected to matchmaking (its nodes are) but is passed directly to DAGMan.
4. Scheduled: the job is waiting in the queue on the CE. This state also does not exist for a DAG, as the DAG is not sent directly to a CE (its nodes are).
5. Running: the job is running. For a DAG this means that DAGMan has started processing it.
6. Done: the job has exited or is considered to be in a terminal state by CondorC (e.g., submission to the CE has failed in an unrecoverable way).
7. Aborted: job processing was aborted by the WMS (waiting in the WM queue or CE for too long, over-use of quotas, expiration of user credentials).
8. Cancelled: the job has been successfully cancelled on user request.
9. Cleared: the output sandbox was transferred to the user or removed due to the timeout.
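The nine states can be captured in a small Python enumeration. This sketch only records the state names and the two states that the slides say do not exist for a DAG; it deliberately does not encode transitions, since those are shown only in the slide diagram. The names JobState, DAG_SKIPPED and applicable_states are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "Submitted"
    WAITING = "Waiting"
    READY = "Ready"
    SCHEDULED = "Scheduled"
    RUNNING = "Running"
    DONE = "Done"
    ABORTED = "Aborted"
    CANCELLED = "Cancelled"
    CLEARED = "Cleared"


# States that do not exist for a DAG: the DAG itself is neither
# matched to a CE (Ready) nor queued on one (Scheduled); its nodes are.
DAG_SKIPPED = {JobState.READY, JobState.SCHEDULED}


def applicable_states(is_dag):
    """Return the states a job (or a DAG as a whole) can be reported in."""
    return [s for s in JobState if not (is_dag and s in DAG_SKIPPED)]


print([s.value for s in applicable_states(is_dag=True)])
```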

Job Submission Command Line Interface
  glite-job-submit [-r ] [-c ] [--vo ] [-o ]
- -r: the job is submitted directly to the computing element identified by the option's argument
- -c: the configuration file given as argument is used by the UI instead of the standard configuration file
- --vo: the Virtual Organisation (if the user is not happy with the one specified in the UI configuration file)
- -o: the generated edg_jobId is written to the file given as argument
  - useful for other commands, e.g.: glite-job-status -i (or jobId)

Job Resubmission
If something goes wrong, the WMS tries to reschedule and resubmit the job (possibly on a different resource satisfying all the requirements).
Maximum number of resubmissions: min(RetryCount, MaxRetryCount)
- RetryCount: JDL attribute
- MaxRetryCount: attribute in the "RB" configuration file
One can disable job resubmission for a particular job with RetryCount = 0; in the JDL file.
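The resubmission limit is simple arithmetic, sketched here in Python. The function names and the attempt callback are hypothetical; only the min(RetryCount, MaxRetryCount) rule comes from the slide.

```python
def max_resubmissions(retry_count, max_retry_count):
    """min(RetryCount from the JDL, MaxRetryCount from the RB configuration)."""
    return min(retry_count, max_retry_count)


def run_with_resubmission(attempt, retry_count, max_retry_count):
    """Call attempt(n) for the first submission (n = 0) and each resubmission.

    Returns the number of resubmissions that were needed, or None if the
    job still failed after the allowed number of resubmissions.
    """
    allowed = max_resubmissions(retry_count, max_retry_count)
    for n in range(allowed + 1):  # one initial submission plus 'allowed' retries
        if attempt(n):
            return n
    return None


# Toy job that only succeeds on its third try (n == 2).
succeeds_on_third_try = lambda n: n >= 2

print(max_resubmissions(5, 3))                            # user asks 5, RB allows 3
print(run_with_resubmission(succeeds_on_third_try, 5, 3))
print(run_with_resubmission(succeeds_on_third_try, 0, 3))  # RetryCount = 0
```

With RetryCount = 0 only the initial submission is made, which is why the last call reports a failure.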

Directed Acyclic Graphs (DAGs)
A DAG represents a set of jobs:
- Nodes = Jobs
- Edges = Dependencies
[Figure: example DAG with nodes NodeA, NodeB, NodeC, NodeD, NodeE.]

DAG: JDL Structure
  Type = "DAG"
  VirtualOrganisation = "yourVO"
  Max_Nodes_Running = int > 0
  MyProxyServer = "..."
  Requirements = "..."
  Rank = "..."
  InputSandbox = (more later)
  OutputSandbox = "..."
  Nodes = nodeX (more later)
  Dependencies = (more later)
Side labels on the slide, top to bottom: Mandatory, Optional, Mandatory.

Attribute: Nodes
The Nodes attribute is the core of the DAG description.
  ...
  Nodes = [
    nodefilename1 = [...]
    nodefilename2 = [...]
    ...
    dependencies = ...
  ]
A node can point to a JDL file:
  Nodefilename1 = [ file = "foo.jdl"; ]
  Nodefilename2 = [ file = "/home/vardizzo/test.jdl"; retry = 2; ]
or carry its own description:
  Nodefilename1 = [
    description = [
      JobType = "Normal";
      Executable = "abc.exe";
      Arguments = "1 2 3";
      OutputSandbox = [...];
      InputSandbox = [...];
      ...
    ]
    retry = 2;
  ]

Attribute: Dependencies
It is a list of lists representing the dependencies between the nodes of the DAG.
  ...
  Nodes = [
    nodefilename1 = [...]
    nodefilename2 = [...]
    ...
    dependencies = ...
  ]
Forms shown on the slide:
  dependencies = { nodefilename1, nodefilename2 }
  { nodefilename1, nodefilename2 }
  { { nodefilename1, nodefilename2 }, nodefilename3 }
  { { { nodefilename1, nodefilename2 }, nodefilename3 }, nodefilename4 }
Mandatory: yes. An empty list is written as dependencies = {};
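How a meta-scheduler can walk such a graph is sketched below in Python, using the standard library's graphlib module (Python 3.9 or later). The edge list is invented for the example, and reading each dependency entry as a (parent, child) pair is an assumption made for this sketch, not a statement about the JDL parser or about DAGMan's implementation.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies, written as (parent, child) pairs:
# the child may only start after the parent has finished.
edges = [
    ("NodeA", "NodeB"),
    ("NodeA", "NodeC"),
    ("NodeB", "NodeD"),
    ("NodeC", "NodeD"),
    ("NodeD", "NodeE"),
]


def execution_waves(dependencies):
    """Group nodes into waves; every node in a wave is free of dependencies."""
    sorter = TopologicalSorter()
    for parent, child in dependencies:
        sorter.add(child, parent)  # 'child' depends on 'parent'
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # nodes whose parents are all done
        waves.append(ready)
        sorter.done(*ready)  # pretend the jobs ran, which unlocks their children
    return waves


for number, wave in enumerate(execution_waves(edges), start=1):
    print(number, wave)
```

Nodes in the same wave have no dependency on each other, so their jobs could be submitted at the same time.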

InputSandbox & Inheritance
  Type = "DAG"
  VirtualOrganisation = "yourVO"
  Max_Nodes_Running = int > 0
  MyProxyServer = "..."
  Requirements = "..."
  Rank = "..."
  InputSandbox = { };
  Nodes = [
    nodefilename = [];
    ...
    dependencies = ...;
  ];
  NodeA = [
    description = [
      JobType = "Normal";
      Executable = "abc.exe";
      OutputSandbox = {"myout.txt"};
      InputSandbox = { "/home/vardizzo/myfile.txt", root.InputSandbox };
    ]
  ]
All nodes inherit the value of the attributes from the one specified for the DAG.
Nodes without any InputSandbox values have to contain an empty list in their description: InputSandbox = { };

Message Passing Interface (MPI)
- The MPI job is run in parallel on several processors.
- Libraries supported for parallel jobs: MPICH.
- Currently, execution of parallel jobs is supported only on single CEs.
[Figure: an MPI job sent to a CE and its WNs.]

MPI: JDL Structure
  Type = "job";
  JobType = "MPICH";
  Executable = "...";
  NodeNumber = "int > 1";
  Arguments = "...";
  Requirements = Member("MpiCH", other.GlueHostApplicationSoftwareRunTimeEnvironment)
                 && other.GlueCEInfoTotalCPUs >= NodeNumber;
  Rank = other.GlueCEStateFreeCPUs;
Side labels on the slide, top to bottom: Mandatory, Optional, Mandatory.

Logical Checkpointable Jobs
- A job that can be decomposed into several steps.
- In every step the job state can be saved in the LB and retrieved later in case of failures.
- The job can start running from a previously saved state instead of from the beginning again.
[Figure: timeline from the job's start to the job's end through STEP 1, STEP 2, STEP 3, STEP 4, with points A, B, C, D.]
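The idea of resuming from a saved step can be sketched in plain Python. This is a generic illustration, not the gLite checkpointing API: the state layout, the run_checkpointable function and the simulated failure are all assumptions made for the example.

```python
def run_checkpointable(steps, saved_state=None, fail_at=None):
    """Run 'steps' in order, saving progress after each completed step.

    saved_state: state returned by a previous, interrupted run (or None).
    fail_at:     index of a step at which to simulate a failure (or None).
    Returns (state, finished).
    """
    if saved_state is None:
        state = {"current_step": 0, "results": []}
    else:
        # Copy so the caller's saved state is not modified.
        state = {"current_step": saved_state["current_step"],
                 "results": list(saved_state["results"])}

    for index in range(state["current_step"], len(steps)):
        if index == fail_at:
            return state, False  # the last saved state survives the failure
        state["results"].append(steps[index]())
        state["current_step"] = index + 1  # "checkpoint" after each step
    return state, True


# Four toy steps, each returning its label.
steps = [lambda label=label: label for label in ("A", "B", "C", "D")]

state, finished = run_checkpointable(steps, fail_at=2)   # fails before step 3
print(state, finished)
state, finished = run_checkpointable(steps, saved_state=state)  # resumes at step 3
print(state, finished)
```

The second run does not repeat the first two steps; it starts from the step index stored in the saved state.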

Checkpointable Jobs: JDL Structure
  Type = "job";
  JobType = "checkpointable";
  Executable = "...";
  JobSteps = "list int | list string";
  CurrentStep = "int >= 0";
  Arguments = "...";
  Requirements = "...";
  Rank = "";
Side labels on the slide, top to bottom: Mandatory, Optional.

Interactive Jobs
- A job whose standard streams are forwarded to the submitting client.
- The DISPLAY environment variable has to be set correctly, because an X window is opened.
[Figure: the UI runs a listener process (X window or standard no-GUI mode) connected to the WN.]

Interactive Jobs
- Specified by setting JobType = "Interactive" in the JDL.
- When an interactive job is executed, a window for the stdin, stdout and stderr streams is opened:
  - possibility to send stdin to the job
  - possibility to have the stderr and stdout of the job while it is running
- Possibility to start a window for the standard streams of a previously submitted interactive job with the command glite-job-attach.

Interactive Jobs: JDL Structure
  Type = "job";
  JobType = "interactive";
  Executable = "...";
  Arguments = "...";
  ListenerPort = "int > 0";
  OutputSandbox = "";
  Requirements = "...";
  Rank = "";
Side labels on the slide, top to bottom: Mandatory, Optional, Mandatory.
gLite commands: glite-job-attach [options]

gLite Commands
- JDL submission: glite-job-submit -o guidfile jobCheck.jdl
- JDL status: glite-job-status -i guidfile
- JDL output: glite-job-output -i guidfile
- Get latest job state: glite-job-get-chkpt -o statefile -i guidfile
- Submit a JDL from a state: glite-job-submit -chkpt statefile -o guidfile jobCheck.jdl
See also the [options] by typing --help after the commands.

Grid Accounting
A generic Grid accounting process accumulates information on Grid usage by users/groups (VOs) and involves several successive phases:
- Metering: collection of usage metrics on computational resources.
- Accounting: storage of such metrics for further analysis.
- Usage Analysis: production of reports from the available records.
- Pricing: assign and manage prices for computational resources.
- Billing: assign a cost to user operations and charge them.
To be used:
- to track resource usage
- to discover abuses (and help avoid them)
Allows implementation of submission policies based on resource usage:
- exchange market among Grid users and Grid resource owners, which should result in market equilibrium
  - load balancing on the Grid
During the metering phase the user payload on a resource needs to be correctly measured, and unambiguously assigned to the Grid user that directly or indirectly requested it from the Grid.
  - Load Dedicated Sensors for Grid Resources
These pieces of information, when organized, form the Usage Record for the user process.
  - Grid Unique Identifier (for User, Resource, Job) plus the metrics of the resource consumption
A distributed architecture is essential, as well as reliable and fault-tolerant communication mechanisms.
Different types of users are interested in different views of the usage records.
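A toy Python sketch of the accounting, usage analysis and billing phases follows. The UsageRecord fields follow the description above (identifiers for user, resource and job plus a consumption metric), but the single cpu_seconds metric, the identifiers, the prices and both functions are assumptions for the illustration and are unrelated to any real accounting schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    user_id: str       # Grid unique identifier of the user
    resource_id: str   # Grid unique identifier of the resource
    job_id: str        # Grid unique identifier of the job
    cpu_seconds: int   # hypothetical consumption metric


def usage_by_user(records):
    """Usage analysis: total CPU seconds consumed per user."""
    totals = defaultdict(int)
    for record in records:
        totals[record.user_id] += record.cpu_seconds
    return dict(totals)


def bill(records, price_per_cpu_second):
    """Billing: cost per user, given a price for each resource."""
    costs = defaultdict(int)
    for record in records:
        costs[record.user_id] += record.cpu_seconds * price_per_cpu_second[record.resource_id]
    return dict(costs)


# Accounting: stored records produced by the metering phase (invented values).
records = [
    UsageRecord("alice", "ce-a", "job-1", 100),
    UsageRecord("alice", "ce-b", "job-2", 50),
    UsageRecord("bob", "ce-a", "job-3", 30),
]
prices = {"ce-a": 1, "ce-b": 3}  # pricing: hypothetical credits per CPU second

print(usage_by_user(records))
print(bill(records, prices))
```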

DGAS
The Data Grid Accounting System was originally developed within the EU DataGrid Project and is now being maintained and re-engineered within the EU EGEE Project.
The purpose of DGAS is to implement Resource Usage Metering, Accounting and Account Balancing (through resource pricing) in a fully distributed Grid environment. It is conceived to be distributed, secure and extensible.
The system is designed so that Usage Metering, Accounting and Account Balancing (through resource pricing) are independent layers.
[Figure: layer diagram. Labels: Usage Metering; usage records; Usage Accounting; accounting data; Account balancing, resource pricing, (billing); Usage Analysis.]

DGAS Accounting Architecture
[Figure: a simplified view of DGAS within the WMS context.]

Further information
- Workload Management
  - in particular the WMS User & Admin Guide and JDL docs
- Condor ClassAd
- Condor DAGMan

References
- gLite WMS's User Guide
- EGEE Middleware Architecture DJRA1.1
- Practical approaches to Grid workload management in the EGEE project, CHEP 2004
- Grid accounting in EGEE, current practices, Terena Network Conference 2005 (ons/show.php?pres_id=107)