INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Workflow Management in Giuseppe La Rocca INFN – Catania ICTP/INFM-Democritos Workshop on Porting.


INFSO-RI Enabling Grids for E-sciencE
Workflow Management in
Giuseppe La Rocca, INFN – Catania
ICTP/INFM-Democritos Workshop on Porting Scientific Applications on Computational GRIDs
Trieste, Italy, 06-17 February 2006

Outline
- WMProxy overview
  - New features
    - Shared sandboxes
    - 'Scattered' input sandboxes
    - 'Scattered' output sandboxes
    - 'Compressed' sandboxes
- New request types
  - DAG jobs
  - Job collections
  - Parametric jobs

WMProxy
WMProxy (Workload Manager Proxy) is a new service that provides access to the gLite Workload Management System (WMS) functionality through a simple Web Services based interface.
- It has been designed to handle a large number of job submission requests:
  - gLite 1.5: ~180 secs for 500 jobs
  - The short-term goal is ~60 secs for 1000 jobs
- It provides additional features such as bulk submission and support for shared and compressed sandboxes for compound jobs.
- It is the natural replacement for the NS (Network Server) in the move to a SOA (Service-Oriented Architecture) approach.
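To put the two submission figures quoted above on a common scale, the short calculation below converts them to jobs per second. The helper name is illustrative only; the input numbers are the ones given on the slide.

```python
def jobs_per_second(jobs: int, seconds: float) -> float:
    """Return the submission throughput in jobs per second."""
    return jobs / seconds


current = jobs_per_second(500, 180)   # gLite 1.5 figure from the slide
target = jobs_per_second(1000, 60)    # short-term goal from the slide
speedup = target / current            # how much faster the goal is

print(f"gLite 1.5: {current:.2f} jobs/s")
print(f"target:    {target:.2f} jobs/s")
print(f"speed-up:  {speedup:.1f}x")
```

In other words, the stated goal corresponds to roughly a sixfold increase in submission throughput.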

Authorization
The client must be properly authorized when it interacts with the WMProxy service.
- Either the FQAN or the DN (in the case of Globus-style proxies) of the client must be listed and authorized in the glite_wms_wmproxy.gacl file on the WMProxy machine.

Each job submitted to a WMProxy service is given the delegated credentials of the user who submitted it.
- These credentials can then be used to perform operations that require interaction with other services (e.g. submission to the CE, a GridFTP file transfer).
- There are two possible mechanisms for requesting delegation of the user credentials:
  - asking for "automatic" delegation of the credentials during the submission operation
  - asking for an "explicit" delegation

WMProxy client tools
WMProxy can be accessed through:
- The published WSDL
  - Developers can generate client stubs in their favourite language from the published WSDL.
- The C++/Java/Python API
  - Lightweight client libraries generated with gSOAP, Axis and SOAPpy respectively
  - They hide the low-level details of the WSDL/SOAP tooling
  - The Python API is available starting from gLite 1.5

WMProxy C++ client commands
The commands to interact with the WMProxy service are:
- glite-wms-job-delegate-proxy
- glite-wms-job-submit
- glite-wms-job-list-match
- glite-wms-job-cancel
- glite-wms-job-output

WMProxy client commands (cont.)
- glite-wms-job-delegate-proxy allows the user to delegate her proxy credential to the WMProxy service. This delegated credential can then be used for job submissions.
- glite-wms-job-submit submits a job to a WMProxy service. It requires a JDL file as input and returns a WMS job identifier.
- glite-wms-job-list-match lists the available resources to which the job can be submitted.
- glite-wms-job-cancel cancels one or more jobs previously submitted to a WMProxy service.
- glite-wms-job-output retrieves the output files of a job once it has finished.

glite-wms-job-delegate-proxy [options]
Options:
    --version
    --help
    --config, -c
    --vo
    --debug
    --logfile
    --noint
    --delegationid, -d
    --autm-delegation, -a
    --endpoint, -e
    --output, -o
If --delegationid is specified, the operations on the WMProxy are associated with the credential previously delegated with the idstring delegation string.

New Features

Shared sandboxes for sub-jobs of compound jobs
JDL has been extended to allow the input sandbox to be specified at the level of the compound request (i.e. DAGs, collections and parametric jobs).
- This input sandbox is transferred only once by the new WMS client commands, but can be accessed by all sub-jobs of the compound job.
- The sandbox of each sub-job can refer to a single file of the "shared sandbox", e.g.
    InputSandbox = root.InputSandbox[0];
- or to the sandboxes of other sub-jobs, e.g.
    InputSandbox = root.nodes.nodeA.description.OutputSandbox[2];
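As a rough illustration of the shared sandbox idea, the sketch below renders a collection in which the files are listed once at the root and each sub-job points at one of them with the `root.InputSandbox[i]` form shown on the slide. The function name, the file names and the overall layout of the generated JDL are assumptions made for the example, not output of any gLite tool.

```python
def shared_sandbox_jdl(shared_files, nodes):
    """Render a collection whose sub-jobs each reuse one file of the root sandbox.

    shared_files: paths listed once in the root-level InputSandbox.
    nodes: (executable, index) pairs; index selects a file in shared_files.
    """
    for _, index in nodes:
        if not 0 <= index < len(shared_files):
            raise IndexError(f"no shared file at index {index}")
    root = ", ".join(f'"{path}"' for path in shared_files)
    lines = [
        "[",
        '  Type = "Collection";',
        f"  InputSandbox = {{{root}}};",   # transferred once for the whole request
        "  nodes = {",
    ]
    node_blocks = [
        f'    [ Executable = "{exe}"; InputSandbox = root.InputSandbox[{index}]; ]'
        for exe, index in nodes
    ]
    lines.append(",\n".join(node_blocks))
    lines += ["  };", "]"]
    return "\n".join(lines)


# Hypothetical file names, used only to show the shape of the result.
example = shared_sandbox_jdl(["common.conf", "data.tar"],
                             [("/bin/cat", 0), ("/bin/cat", 1)])
print(example)
```

Each shared file name appears only once in the generated text, which mirrors the point that the sandbox is transferred once and referenced many times.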

'Scattered' input sandboxes
The input sandbox can contain:
- file paths on the UI machine (i.e. the usual way)
- URIs pointing to files on a remote GridFTP/HTTPS server
    InputSandbox = {
      "gsiftp://neo.datamat.it:2811/var/prg/sim.exe",
      " "file:///home/pacio/myconf"
    };
A base URI to be applied to all sandbox files can also be specified:
    InputSandboxBaseURI = "gsiftp://matrix.datamat.it:2811/var";
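The sketch below shows one way a client could interpret such a mixed list: entries that already carry a URI scheme are kept, and the remaining ones are joined to the base URI when one is given. That resolution rule is an assumption made for illustration (the slide only says the base URI is "applied to all sandbox files"), and `resolve_sandbox` is a hypothetical helper, not part of the WMProxy API.

```python
from urllib.parse import urlsplit


def resolve_sandbox(entries, base_uri=None):
    """Return the sandbox entries with the base URI applied to plain paths."""
    resolved = []
    for entry in entries:
        if urlsplit(entry).scheme:
            # Already a full URI (gsiftp://, https://, file://, ...): keep it.
            resolved.append(entry)
        elif base_uri is not None:
            # Assumed rule: plain paths are taken relative to the base URI.
            resolved.append(base_uri.rstrip("/") + "/" + entry.lstrip("/"))
        else:
            # No base URI: a plain path on the UI machine.
            resolved.append(entry)
    return resolved


print(resolve_sandbox(
    ["sim.exe", "file:///home/pacio/myconf"],
    base_uri="gsiftp://matrix.datamat.it:2811/var",
))
```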

'Scattered' output sandboxes
JDL has been enriched with new attributes for specifying the destinations of the files listed in the OutputSandbox attribute:
    OutputSandbox = {"jobOutput", "run1/event1", "jobError"};
    OutputSandboxDestURI = {
      "gsiftp://matrix.datamat.it/var/jobOutput",
      " "gsiftp://matrix.datamat.it/var/jobError"
    };
A base URI to be applied to all sandbox files can also be specified:
    OutputSandboxBaseDestURI = "gsiftp://neo.datamat.it/home/run1/";
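A minimal sketch of how output files could be matched to their destinations is given below. It assumes that OutputSandboxDestURI is paired with OutputSandbox by position and that a base destination URI is simply prefixed to each file name; both rules are assumptions for the example, and `output_destinations` is a hypothetical helper.

```python
def output_destinations(files, dest_uris=None, base_dest_uri=None):
    """Map each output file to a destination URI (None means default handling)."""
    if dest_uris is not None:
        # Assumed rule: the i-th destination belongs to the i-th file.
        if len(dest_uris) != len(files):
            raise ValueError("one destination URI is needed per output file")
        return dict(zip(files, dest_uris))
    if base_dest_uri is not None:
        # Assumed rule: the base URI is prefixed to every file name.
        base = base_dest_uri.rstrip("/")
        return {name: f"{base}/{name}" for name in files}
    return {name: None for name in files}


print(output_destinations(
    ["jobOutput", "jobError"],
    base_dest_uri="gsiftp://neo.datamat.it/home/run1/",
))
```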

'Compressed' sandboxes (gLite version >= 1.5)
- A compressed archive of the input sandbox files is created using the libtar and zlib libraries.
  - This is done automatically by the WMProxy client commands.
  - The mechanism can be enabled or disabled by the user through the JDL (AllowZippedISB attribute).
- The archive is transferred to the WMS instead of the single files.
- The WMProxy service untars the files into the job directories when the job is 'started', then removes the archive.
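The pack-then-unpack cycle described above can be sketched with Python's standard `tarfile` module, which stands in here for the libtar and zlib libraries named on the slide. The function names and the in-memory file contents are illustrative; this is not the code used by the WMProxy client or service.

```python
import io
import tarfile


def pack_sandbox(files):
    """Pack {name: bytes} into a single gzip-compressed tar archive (bytes)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack_sandbox(blob):
    """Extract the regular files of an archive produced by pack_sandbox."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile():
                files[member.name] = archive.extractfile(member).read()
    return files


sandbox = {"sim.exe": b"\x00\x01\x02", "myconf": b"key=value\n"}
archive_bytes = pack_sandbox(sandbox)        # one object to transfer
restored = unpack_sandbox(archive_bytes)     # what the receiving side recovers
print(sorted(restored))
```

Transferring one archive instead of many small files is the design point: the per-file transfer overhead is paid once.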

New types of request

DAG job
A DAG is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs.
- The jobs are the nodes (vertices) of the graph.
- The edges (arcs) identify the dependencies.
DAG management has been improved with:
- Shared sandboxes
- Attribute inheritance
- Attribute references between nodes and with the 'parent'
[Figure: example DAG with nodes nodeA, nodeB, nodeC, nodeF and nodeD]

DAG JDL
    [
      type = "dag";
      max_nodes_running = 4;
      nodes = [
        nodeA = [ file = "nodes/nodeA.jdl"; ];
        nodeB = [ file = "nodes/nodeB.jdl"; ];
        nodeC = [ file = "nodes/nodeC.jdl"; ];
        nodeF = [ file = "nodes/nodeF.jdl"; ];
        dependencies = {
          {nodeA, nodeB}, {nodeA, nodeC}, {nodeA, nodeF},
          { {nodeB, nodeC, nodeF}, nodeD }
        };
      ];
    ]
Node descriptions can also be given inline here instead of in separate files. Note that nodeD is referenced in the dependencies, so it also needs its own entry in nodes.
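To make the meaning of the dependency pairs concrete, the sketch below feeds the same `{parent, child}` pairs to Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later) and groups the nodes into waves that could run together. It only illustrates the ordering implied by the dependencies; it is not how the WMS schedules DAG nodes, and `execution_waves` is a name chosen for this example.

```python
from graphlib import TopologicalSorter

# The dependency pairs from the DAG JDL, written as (parents, child).
dependencies = [
    ("nodeA", "nodeB"),
    ("nodeA", "nodeC"),
    ("nodeA", "nodeF"),
    (("nodeB", "nodeC", "nodeF"), "nodeD"),
]


def execution_waves(dependencies):
    """Group nodes into successive waves; a wave only needs earlier waves."""
    sorter = TopologicalSorter()
    for parents, child in dependencies:
        if isinstance(parents, str):
            parents = (parents,)
        sorter.add(child, *parents)   # child runs after all of its parents
    sorter.prepare()                  # raises CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(execution_waves(dependencies), start=1):
    print(number, wave)
```

With these dependencies nodeA runs first, nodeB, nodeC and nodeF can run in parallel, and nodeD runs last.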

WMProxy: submission & monitoring
- To submit a job with WMProxy, credential delegation is mandatory.
- The submission/monitoring commands are slightly different, but most of the "old" options are supported.
    glite-wms-job-delegate-proxy -d del_ID_01
    glite-wms-job-submit -d del_ID_01 myjob.jdl
    glite-wms-job-status \
        sz0jpI_g
    glite-wms-job-output \
        sz0jpI_g
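The two delegation styles can be contrasted with a small helper that only builds the command lines and does not run them. The explicit form reuses the `-d` commands shown above; using `-a` on glite-wms-job-submit for automatic delegation is an assumption based on the option list in this deck, and `submit_commands` is a hypothetical name.

```python
import shlex


def submit_commands(jdl_path, delegation_id=None):
    """Return the argv lists needed to submit jdl_path.

    With a delegation_id: explicit delegation, then submission with -d.
    Without one: a single submission with -a (assumed automatic delegation).
    """
    if delegation_id is None:
        return [["glite-wms-job-submit", "-a", jdl_path]]
    return [
        ["glite-wms-job-delegate-proxy", "-d", delegation_id],
        ["glite-wms-job-submit", "-d", delegation_id, jdl_path],
    ]


for argv in submit_commands("myjob.jdl", delegation_id="del_ID_01"):
    print(shlex.join(argv))
```

Explicit delegation pays the delegation cost once and reuses the identifier for many submissions, which is why it suits bulk submission.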

Job collection
- A job collection is a set of independent jobs that the user can submit and monitor as if it were a single job.
- The jobs of a collection are submitted as DAG nodes without dependencies.
- The JDL is a list of ClassAds that describe the sub-jobs:
    [
      Type = "collection";
      nodes = { [ ], … };
      ...
    ]

Job collection example
    [
      Type = "Collection";
      RetryCount = 0;
      nodes = {
        [
          Executable = "/bin/hostname";
          Arguments = "-f";
          StdOutput = "hostname.out";
          StdError = "hostname.err";
          OutputSandbox = {"hostname.err", "hostname.out"};
        ],
        [
          Executable = "/bin/sh";
          Arguments = "start_povray_valve.sh";
          StdOutput = "povray.out";
          StdError = "povray.err";
          InputSandbox = {"start_povray_valve.sh"};
          OutputSandbox = {"povray.err", "povray.out"};
          Requirements = Member("POVRAY-3,5", other.GlueHostApplicationSoftwareRunTimeEnvironment);
        ]
      };
    ]
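When a collection has many sub-jobs, writing the JDL by hand becomes tedious. The sketch below generates a collection from a list of Python dictionaries. It is a deliberately small renderer that handles only strings, integers and lists of strings (so expressions such as Requirements are out of scope), and all function names are assumptions made for this example.

```python
def jdl_value(value):
    """Render a Python value as a JDL literal (str, int or list of those)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled by this sketch")
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return '"' + value.replace('"', '\\"') + '"'
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(item) for item in value) + "}"
    raise TypeError(f"unsupported value: {value!r}")


def render_classad(attributes, indent=""):
    """Render one ClassAd as a bracketed block of 'Name = value;' lines."""
    lines = [f"{indent}["]
    for name, value in attributes.items():
        lines.append(f"{indent}  {name} = {jdl_value(value)};")
    lines.append(f"{indent}]")
    return "\n".join(lines)


def render_collection(nodes, **common):
    """Render a collection; 'common' holds root-level attributes."""
    header = "".join(f"  {k} = {jdl_value(v)};\n" for k, v in common.items())
    body = ",\n".join(render_classad(node, indent="    ") for node in nodes)
    return f'[\n  Type = "Collection";\n{header}  nodes = {{\n{body}\n  }};\n]'


hostname_job = {
    "Executable": "/bin/hostname",
    "Arguments": "-f",
    "StdOutput": "hostname.out",
    "StdError": "hostname.err",
    "OutputSandbox": ["hostname.err", "hostname.out"],
}
collection_jdl = render_collection([hostname_job], RetryCount=0)
print(collection_jdl)
```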

Submitting the collection
- glite-wms-job-delegate-proxy allows the user to delegate her proxy credential to the WMProxy service.
    $ glite-wms-job-delegate-proxy -d myWMProxy
- glite-wms-job-submit submits a job to a WMProxy service. It requires a JDL file as input and returns a WMS job identifier.
    $ glite-wms-job-submit -d myWMProxy collection.jdl

    $ glite-wms-job-status
    *************************************************************
    BOOKKEEPING INFORMATION:
    Status info for the Job :
    Current Status: Done (Success)
    Exit code: 0
    Status Reason: Job terminated successfully
    Destination: dagman
    Submitted: Mon Jan 30 12:43: CET
    *************************************************************
    - Nodes information for:
    Status info for the Job :
    Current Status: Done (Success)
    Exit code: 0
    Status Reason: Job terminated successfully
    Destination: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-infinite
    Submitted: Mon Jan 30 12:43: CET
    Parent Job:
    *************************************************************
    Status info for the Job :
    Current Status: Done (Success)
    Exit code: 0
    Status Reason: Job terminated successfully
    Destination: dgt01.ui.savba.sk:2119/jobmanager-lcgpbs-infinite
    Submitted: Mon Jan 30 12:43: CET
    Parent Job:
    *************************************************************

glite-wms-job-output retrieves the output files of a job once it has finished.
    $ glite-wms-job-output --dir ./collection-output
    $ ll collection-output/
    drwxr-xr-x 2 larocca users larocca_p9qxZlS5yDewQnd7kyN0NA
    drwxr-xr-x 2 larocca users larocca_rgzQISTy6VLev3G8Enjefw

Parametric jobs (1/2)
- A parametric job is a job in which one or more attributes are parametric.
- The values of those attributes vary according to a parameter.
- Job monitoring and management is always done through a single jobID, as if it were a single job.
    [
      JobType = "Parametric";
      Executable = "/bin/echo";
      Arguments = "_PARAM_";
      StdOutput = "myoutput_PARAM_.txt";
      StdError = "myerror_PARAM_.txt";
      Parameters = 2500;
      ParameterStep = 100;
      ParameterStart = 1000;
      OutputSandbox = {"myoutput_PARAM_.txt"};
    ]
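The sketch below shows how the `_PARAM_` placeholder could expand for the integer case above. It assumes that the values run from ParameterStart up to, but not including, Parameters in steps of ParameterStep; that reading of the upper bound is an assumption to be checked against the JDL attributes specification, and the helper name is illustrative.

```python
def expand_integer_parameter(template, start, stop, step):
    """Substitute each parameter value for _PARAM_ in the template string.

    Assumption: 'stop' (the Parameters attribute) is an exclusive upper bound.
    """
    return [template.replace("_PARAM_", str(value))
            for value in range(start, stop, step)]


# Values taken from the JDL above: ParameterStart, Parameters, ParameterStep.
names = expand_integer_parameter("myoutput_PARAM_.txt",
                                 start=1000, stop=2500, step=100)
print(len(names), names[0], names[-1])
```

Under that assumption the single JDL stands for fifteen sub-jobs, each with its own output file name.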

Parametric jobs (2/2)
The parameter can be either an integer or a list of strings.
    [
      JobType = "Parametric";
      Executable = "/bin/cat";
      Arguments = "input_PARAM_.txt";
      InputSandbox = "input_PARAM_.txt";
      StdOutput = "myoutput_PARAM_.txt";
      StdError = "myerror_PARAM_.txt";
      Parameters = {EARTH, MOON, MARS};
      OutputSandbox = {"myoutput_PARAM_.txt"};
    ]
Input files on the UI machine:
    $ ls
    inputEARTH.txt inputMARS.txt inputMOON.txt
The JDL above expands into three sub-jobs:
    Executable = "/bin/cat";
    Arguments = "inputEARTH.txt";
    InputSandbox = "inputEARTH.txt";
    StdOutput = "myoutputEARTH.txt";
    StdError = "myerrorEARTH.txt";
    OutputSandbox = {"myoutputEARTH.txt"};

    Executable = "/bin/cat";
    Arguments = "inputMOON.txt";
    InputSandbox = "inputMOON.txt";
    StdOutput = "myoutputMOON.txt";
    StdError = "myerrorMOON.txt";
    OutputSandbox = {"myoutputMOON.txt"};

    Executable = "/bin/cat";
    Arguments = "inputMARS.txt";
    InputSandbox = "inputMARS.txt";
    StdOutput = "myoutputMARS.txt";
    StdError = "myerrorMARS.txt";
    OutputSandbox = {"myoutputMARS.txt"};
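The expansion of a string-valued parameter list can be sketched as a plain text substitution over every attribute of the job. The helper below is a hypothetical illustration of that idea using Python dictionaries; it is not the WMS implementation, and the upper-case parameter values are chosen to match the input file names listed on the slide.

```python
def expand_string_parameter(attributes, parameters):
    """Return one attribute dict per parameter, with _PARAM_ substituted."""
    def substitute(value, parameter):
        if isinstance(value, str):
            return value.replace("_PARAM_", parameter)
        if isinstance(value, list):
            return [substitute(item, parameter) for item in value]
        return value  # non-string values are copied unchanged

    return [
        {name: substitute(value, parameter) for name, value in attributes.items()}
        for parameter in parameters
    ]


template = {
    "Executable": "/bin/cat",
    "Arguments": "input_PARAM_.txt",
    "InputSandbox": "input_PARAM_.txt",
    "StdOutput": "myoutput_PARAM_.txt",
    "StdError": "myerror_PARAM_.txt",
    "OutputSandbox": ["myoutput_PARAM_.txt"],
}
sub_jobs = expand_string_parameter(template, ["EARTH", "MOON", "MARS"])
for job in sub_jobs:
    print(job["Arguments"], "->", job["StdOutput"])
```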


Questions…