Special Jobs
Valeria Ardizzone, INFN - Catania
E-infrastructure shared between Europe and Latin America (www.eu-eela.org), FP6−2004−Infrastructures−6-SSA-026409


E-infrastructure shared between Europe and Latin America FP6−2004−Infrastructures−6-SSA Special Jobs Valeria Ardizzone INFN - Catania 12th EELA Tutorial Lima,

Outline
- Overview: Jobs with Data Requirements
- Overview: MPI
  - How to create an MPI job
  - MPI job in middleware
- Overview: DAG

Overview: Jobs with Data

The user can stage input files from the UI to the WN using the InputSandbox attribute:

    InputSandbox = {"codesa.i686", "start_root.sh", "./Korba/atmbc.const",
                    "./Korba/bctran-window.3", "./Korba/codesa3d.fnames",
                    ".rootrc", "convert.C", "GraphCODESA3D.C"};

The upper limit for the InputSandbox is 10 MB!

..the question

What can I do if my job requires a huge amount of data to be processed?

..the answer /1

InputData (optional)
This is a string or a list of strings representing the Logical File Names (LFNs) or Grid Unique Identifiers (GUIDs) needed by the job as input. The RB uses the list to find the CE from which the specified files can best be accessed, and schedules the job to run there.

    InputData = {"lfn:cmstestfile", "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"};

..the answer /2

DataAccessProtocol (mandatory if InputData has been specified)
The protocol or the list of protocols which the application is able to "speak" for accessing the files listed in InputData on a given SE. The protocols currently supported in gLite are gsiftp and file.

    DataAccessProtocol = {"file", "gsiftp"};

pds2jpg-MERIS-Etna.jdl

    [
      JobType = "normal";
      Type = "Job";
      Executable = "/bin/bash";
      Arguments = "pds2jpg_install.sh \ MER_FR__2PNUPA _092534_ _00079_06145_0033";
      StdOutput = "pds2jpg.out";
      StdError = "pds2jpg.err";
      InputSandbox = {"./pds2jpg_install.sh", "./beam20.tar.gz"};
      InputData = {"lfn:/grid/gilda/MER_FR__2PNUPA _092534_ _00079_06145_0033.N1"};
      DataAccessProtocol = {"gridftp", "rfio", "gsiftp"};
      OutputSandbox = {
        "MER_FR__2PNUPA _092534_ _00079_06145_0033.jpg",
        "ENVISAT_Product_courtesy_of_European_Space_Agency",
        "pds2jpg.out",
        "pds2jpg.err"
      };
      RetryCount = 3;
    ]

pds2jpg_install.sh

    #!/bin/sh
    echo Staging Input Data \(Courtesy of European Space Agency\);
    # Skip the "/" from the argument.
    file=`echo $1 | awk -F '/' '{print $2}'`
    echo lcg-cp --vo gilda lfn:/grid/gilda/MER_FR__2PNUPA _092534_ _00079_06145_0033.N1 file:`pwd`/${file}.N1
    lcg-cp --vo gilda lfn:/grid/gilda/MER_FR__2PNUPA _092534_ _00079_06145_0033.N1 file:`pwd`/${file}.N1
    echo Staging Application;
    ls -al
    gunzip beam20.tar.gz;
    tar xvf beam20.tar;
    cd beam-2.0/bin;
    echo Starting Application;
    echo "./pds2jpg-run.sh $file;"
    ./pds2jpg-run.sh $file;
    echo "mv $file.jpg ../.."
    mv $file.jpg ../..
    touch ../../ENVISAT_Product_courtesy_of_European_Space_Agency
    echo "Input ENVISAT Product courtesy of European Space Agency" > ../../ENVISAT_Product_courtesy_of_European_Space_Agency
    echo No Output Packaging;

the output..

Input ENVISAT Product courtesy of European Space Agency

another couple of questions

The output produced by my job must be processed, as input, by some other jobs.
Q.1) How can I make this data accessible for other computations?
Q.2) Can I upload the data automatically, to be processed later?

Output data

OutputData (optional)
This attribute allows the user to ask for the automatic upload and registration of datasets produced by the job on the Worker Node (WN). It contains the following three attributes:
1. OutputFile
2. StorageElement
3. LogicalFileName

Output data

OutputFile (mandatory if OutputData has been specified)
This is a string attribute representing the name of the output file, generated by the job on the WN, which has to be automatically uploaded and registered by the WMS.

StorageElement (optional)
This is a string representing the URI of the Storage Element where the output file specified in OutputFile has to be uploaded by the WMS.

LogicalFileName (optional)
This is a string representing the LFN the user wants to associate with the output file when registering it in the Catalogue.

The automatic uploading mechanism is NOT yet supported in gLite.
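
To illustrate how the three sub-attributes fit together, the fragment below sketches one OutputData entry. It assumes that OutputData is written as a list of records, one per output file. The file name, Storage Element host, and LFN are hypothetical placeholders. Since the automatic upload is not yet supported in gLite, it should be read as a syntax sketch only.

```jdl
OutputData = {
    [
        # hypothetical file produced by the job on the WN
        OutputFile = "result.dat";
        # hypothetical Storage Element (optional)
        StorageElement = "se.example.org";
        # hypothetical LFN to register in the Catalogue (optional)
        LogicalFileName = "lfn:/grid/gilda/result.dat";
    ]
};
```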

Overview: MPI

- Execution of parallel jobs is an essential issue for modern informatics and applications.
- The most widely used library for parallel job support is MPI (Message Passing Interface).
- At the state of the art, parallel jobs can run inside a single Computing Element (CE) only;
  – several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.

How to create an MPI job

Requirements & Settings

To guarantee that an MPI job can run, the following requirements MUST BE satisfied:
– The MPICH software must be installed and placed in the PATH environment variable on all the WNs of the CE.
– The Executable specified in the JDL must not be the MPI application itself, but a wrapper script that invokes the MPI application by calling the mpirun command.

From the user's point of view, jobs to be run as MPI are specified by setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well. E.g.:

    JobType = "MPICH";
    NodeNumber = 4;

The NodeNumber attribute defines the number of CPUs required by the application.

When these two attributes are included in a JDL, the User Interface (UI) automatically adds the following expression

    (other.GlueCEInfoTotalCPUs >= NodeNumber) && Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)

to the JDL requirements expression in order to find the appropriate resources where the job can be executed.
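
For instance, with NodeNumber = 4, a user-supplied requirement would presumably end up combined with the added expression roughly as sketched below. The user requirement shown (a PBS batch system) is only an example, and the exact way the UI merges the two expressions is an assumption.

```jdl
# example user requirement, followed by the part added by the UI for an MPICH job
Requirements = (other.GlueCEInfoLRMSType == "PBS")
    && (other.GlueCEInfoTotalCPUs >= 4)
    && Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment);
```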

the problem...

Unfortunately, the LCG project was not aligned with the latter requirement, since it avoids sharing disk space among the nodes of the same CE. We therefore had to devise an ad hoc solution as an efficient workaround. The adopted solution bypasses the problem by putting some intelligence inside the script passed in the InputSandbox.

… the solution

In detail, each job has to mirror its files, via scp, on all the nodes dedicated to it. SSH host-based authentication MUST BE properly configured between all the WNs.

mpi.jdl

    [
      Type = "Job";
      JobType = "MPICH";
      Executable = "MPItest.sh";
      NodeNumber = 5;
      Arguments = "cpi 5";
      StdOutput = "test.out";
      StdError = "test.err";
      InputSandbox = {"MPItest.sh", "cpi"};
      OutputSandbox = {"test.err", "test.out", "executable.out"};
      Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF";
    ]

Currently the only Local Resource Managers supported are PBS and LSF.
The number of CPUs specified with the NodeNumber attribute matches the second argument, which is used when invoking the mpirun command.

MPItest.sh

    ... $HOST_NODEFILE
    for i in `cat $HOST_NODEFILE` ; do
      echo "Mirroring via SSH to $i"
      # creates the working directories on all the nodes allocated for parallel execution.
      ssh $i mkdir -p `pwd`
      # copies the needed files on all the nodes allocated for parallel execution.
      /usr/bin/scp -rp ./* $i:`pwd`
      # checks that all files are present on all the nodes allocated for parallel execution.
      ssh $i ls `pwd`
    done
    # execute the parallel job with mpirun.
    echo "Executing $EXE"
    chmod 755 $EXE
    mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE > executable.out

The environment variable $HOST_NODEFILE contains the list of WNs allocated for the parallel execution.

DAG job

- A DAG job is a set of jobs where the input, output, or execution of one or more jobs can depend on other jobs.
- Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs and the edges identify the dependencies.

[Figure: example DAG with nodes nodeA, nodeB, nodeC, nodeD, NodeF]

JDL structure

Attribute: Nodes

Attribute: Dependencies

DAG jdl

    [
      type = "dag";
      max_nodes_running = 4;
      nodes = [
        nodeA = [ file = "nodes/nodeA.jdl"; ];
        nodeB = [ file = "nodes/nodeB.jdl"; ];
        nodeC = [ file = "nodes/nodeC.jdl"; ];
        nodeD = [ file = "nodes/nodeD.jdl"; ];
        dependencies = { {nodeA, nodeB}, {nodeA, nodeC}, { {nodeB, nodeC}, nodeD } }
      ];
    ]

Node description could also be done here, instead of using separate files.
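
The remark about describing a node in place, rather than in a separate file, can be sketched as follows. This assumes the `description` attribute of the gLite DAG JDL. The job attributes inside it (executable and file names) are hypothetical placeholders, not the content of the original node files.

```jdl
[
  type = "dag";
  max_nodes_running = 4;
  nodes = [
    # node described inline instead of through file = "nodes/nodeA.jdl"
    nodeA = [
      description = [
        JobType = "Normal";
        Executable = "/bin/hostname";
        StdOutput = "nodeA.out";
        StdError = "nodeA.err";
        OutputSandbox = {"nodeA.out", "nodeA.err"};
      ];
    ];
    # node still described in its own file
    nodeB = [ file = "nodes/nodeB.jdl"; ];
    dependencies = { {nodeA, nodeB} };
  ];
]
```

Inline descriptions keep a small DAG in a single file, while separate files are easier to reuse across several DAGs.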

References

- Job Description Language: TEC JDL-Attributes-v0-8.pdf
- GILDA wiki:
  – Job with Data
  – MPI Job with edg commands


Thank you very much for your kind attention!