
1 Special Jobs
Valeria Ardizzone, INFN - Catania
12th EELA Tutorial, Lima, 26.09.2007
www.eu-eela.org, E-infrastructure shared between Europe and Latin America (FP6−2004−Infrastructures−6-SSA-026409)

2 Outline (Lima, 12th EELA Tutorial, 25.09.2007)
- Overview: Jobs with Data Requirements
- Overview: MPI
  - How to create an MPI job
  - MPI jobs in the middleware
- Overview: DAG

3 Overview: Jobs with Data
Users can stage input files from the UI to the WN using the InputSandbox attribute:
    InputSandbox = {"codesa.i686", "start_root.sh", "./Korba/atmbc.const", "./Korba/bctran-window.3", "./Korba/codesa3d.fnames", ".rootrc", "convert.C", "GraphCODESA3D.C"};
The upper limit for the InputSandbox is 10 MB!
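Given the 10 MB limit, it can be useful to check the total size of the sandbox files on the UI before submitting. The following is only a minimal sketch and not part of the middleware: the function names sandbox_total_bytes and sandbox_fits are hypothetical, and it assumes that 10 MB means 10 * 1024 * 1024 bytes.

```shell
# Assumption: the 10 MB limit is interpreted as 10 * 2^20 bytes.
LIMIT_BYTES=$((10 * 1024 * 1024))

# Hypothetical helper: print the summed size, in bytes, of the given files.
sandbox_total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1   # fails if a file is missing
        total=$((total + size))
    done
    echo "$total"
}

# Hypothetical helper: succeed only if the files fit within the limit.
sandbox_fits() {
    total=$(sandbox_total_bytes "$@") || return 1
    [ "$total" -le "$LIMIT_BYTES" ]
}
```

For example, `sandbox_fits codesa.i686 start_root.sh || echo "InputSandbox too large"` would warn before the job is submitted.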

4 ..the question
What can I do if my job needs to process a huge amount of data?

5 ..the answer /1: InputData
InputData (optional)
This is a string or a list of strings giving the Logical File Names (LFN) or Grid Unique Identifiers (GUID) of the files needed by the job as input. The RB uses the list to find the CE from which the specified files can best be accessed, and schedules the job to run there.
    InputData = {"lfn:cmstestfile", "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"};

6 ..the answer /2: DataAccessProtocol
DataAccessProtocol (mandatory if InputData has been specified)
The protocol, or list of protocols, that the application is able to "speak" for accessing the files listed in InputData on a given SE. The protocols currently supported in gLite are gsiftp and file.
    DataAccessProtocol = {"file", "gsiftp"};

7 pds2jpg-MERIS-Etna.jdl
    [
        JobType = "normal";
        Type = "Job";
        Executable = "/bin/bash";
        Arguments = "pds2jpg_install.sh \ MER_FR__2PNUPA20030504_092534_000000502016_00079_06145_0033";
        StdOutput = "pds2jpg.out";
        StdError = "pds2jpg.err";
        InputSandbox = {"./pds2jpg_install.sh", "./beam20.tar.gz"};
        InputData = {"lfn:/grid/gilda/MER_FR__2PNUPA20030504_092534_000000502016_00079_06145_0033.N1"};
        DataAccessProtocol = {"gridftp", "rfio", "gsiftp"};
        OutputSandbox = {
            "MER_FR__2PNUPA20030504_092534_000000502016_00079_06145_0033.jpg",
            "ENVISAT_Product_courtesy_of_European_Space_Agency",
            "pds2jpg.out",
            "pds2jpg.err"
        };
        RetryCount = 3;
    ]

8 pds2jpg_install.sh
    #!/bin/sh
    echo Staging Input Data \(Courtesy of European Space Agency\);
    # Skip the "/" from the argument.
    file=`echo $1 | awk -F '/' '{print $2}'`
    echo lcg-cp --vo gilda lfn:/grid/gilda/MER_FR__2PNUPA20030504_092534_000000502016_00079_06145_0033.N1 file:`pwd`/${file}.N1
    lcg-cp --vo gilda lfn:/grid/gilda/MER_FR__2PNUPA20030504_092534_000000502016_00079_06145_0033.N1 file:`pwd`/${file}.N1
    echo Staging Application;
    ls -al
    gunzip beam20.tar.gz;
    tar xvf beam20.tar;
    cd beam-2.0/bin;
    echo Starting Application;
    echo "./pds2jpg-run.sh $file;"
    ./pds2jpg-run.sh $file;
    echo "mv $file.jpg ../.."
    mv $file.jpg ../..
    touch ../../ENVISAT_Product_courtesy_of_European_Space_Agency
    echo "Input ENVISAT Product courtesy of European Space Agency" > ../../ENVISAT_Product_courtesy_of_European_Space_Agency
    echo No Output Packaging;

9 the output..
Input ENVISAT Product courtesy of European Space Agency

10 ..another couple of questions
The output produced by my job must be processed, as input, by some other jobs.
Q.1) How can I make this data accessible to other computations?
Q.2) Can the data be uploaded automatically, to be processed later?

11 Output data: OutputData
OutputData (optional)
This attribute allows the user to ask for the automatic upload and registration of datasets produced by the job on the Worker Node (WN). It contains the following three attributes:
1. OutputFile
2. StorageElement
3. LogicalFileName

12 Output data
OutputFile (mandatory if OutputData has been specified)
This is a string attribute giving the name of the output file, generated by the job on the WN, that has to be automatically uploaded and registered by the WMS.
StorageElement (optional)
This is a string giving the URI of the Storage Element to which the WMS has to upload the output file specified in OutputFile.
LogicalFileName (optional)
This is a string giving the LFN that the user wants to associate with the output file when registering it in the Catalogue.
The automatic uploading mechanism is NOT yet supported in gLite.
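To make the three sub-attributes concrete, the sketch below writes an illustrative OutputData fragment to a file from the shell. It is an assumption-laden example rather than a tested job description: the function name write_outputdata_jdl, the file name result.dat, the host se.example.org and the LFN are all placeholders, and, as noted above, the automatic upload was not yet supported in gLite when these slides were written.

```shell
# Hypothetical helper: write an example OutputData fragment to the file "$1".
# All values inside the fragment are placeholders, not real grid resources.
write_outputdata_jdl() {
    cat > "$1" <<'EOF'
OutputData = {
    [
        OutputFile = "result.dat";
        StorageElement = "se.example.org";
        LogicalFileName = "lfn:/grid/gilda/result.dat";
    ]
};
EOF
}
```

Only OutputFile is mandatory inside each entry; the other two attributes are the optional ones described above.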

13 Overview: MPI
- Running parallel jobs is essential for many modern computing applications.
- The most widely used library for parallel jobs is MPI (Message Passing Interface).
- At present, parallel jobs can run only inside a single Computing Element (CE).
  - Several projects are studying the possibility of executing parallel jobs on Worker Nodes (WNs) belonging to different CEs.

14 How to create an MPI Job

15 Requirements & Settings
To guarantee that an MPI job can run, the following requirements MUST BE satisfied:
- The MPICH software must be installed on all the WNs of the CE, and its location must be included in the PATH environment variable.
- The Executable specified in the JDL must not be the MPI application itself, but a wrapper script that invokes the MPI application by calling the mpirun command.

16 From the user's point of view, jobs to be run as MPI are specified by setting the JDL JobType attribute to MPICH and specifying the NodeNumber attribute as well. E.g.:
    JobType = "MPICH";
    NodeNumber = 4;
The NodeNumber attribute defines the number of CPUs required by the application.

17 When these two attributes are included in a JDL, the User Interface (UI) automatically adds the following expression
    (other.GlueCEInfoTotalCPUs >= NodeNumber) && Member("MPICH", other.GlueHostApplicationSoftwareRunTimeEnvironment)
to the JDL requirements expression in order to find the appropriate resources where the job can be executed.
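The added expression is a two-part test: the CE must publish at least NodeNumber CPUs, and "MPICH" must appear among its published runtime environment tags. The sketch below mimics that test in plain shell purely for illustration. The function ce_matches is hypothetical and is not part of any gLite tool; the real evaluation is carried out by the middleware during matchmaking.

```shell
# Hypothetical stand-in for the matchmaking test described above.
# $1 = total CPUs published by the CE
# $2 = NodeNumber requested in the JDL
# remaining arguments = runtime environment tags published by the CE
ce_matches() {
    total_cpus=$1
    node_number=$2
    shift 2
    # First condition: enough CPUs on the CE.
    [ "$total_cpus" -ge "$node_number" ] || return 1
    # Second condition: "MPICH" is a member of the tag list.
    for tag in "$@"; do
        [ "$tag" = "MPICH" ] && return 0
    done
    return 1
}
```

With NodeNumber = 4, a CE publishing 8 CPUs and the tags LCG-2 and MPICH would match (`ce_matches 8 4 LCG-2 MPICH` succeeds), while a CE with only 2 CPUs, or one without the MPICH tag, would not.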

18 the problem...
Unfortunately, the LCG project did not follow the latter requirement: disk space is not shared among the nodes inside the same CE. This led us to develop an ad hoc solution that works around the problem efficiently. The adopted solution bypasses the problem by putting some intelligence inside the script passed in the InputSandbox.

19 … the solution
In detail, each job has to mirror its files, via scp, on all the nodes dedicated to it. ssh host-based authentication MUST BE properly configured between all the WNs.

20 mpi.jdl
    [
        Type = "Job";
        JobType = "MPICH";
        Executable = "MPItest.sh";
        NodeNumber = 5;
        Arguments = "cpi 5";
        StdOutput = "test.out";
        StdError = "test.err";
        InputSandbox = {"MPItest.sh", "cpi"};
        OutputSandbox = {"test.err", "test.out", "executable.out"};
        Requirements = other.GlueCEInfoLRMSType == "PBS" || other.GlueCEInfoLRMSType == "LSF";
    ]
Currently, the only supported Local Resource Managers are PBS and LSF.
The value of the NodeNumber attribute matches the second argument in Arguments, which is used when invoking the mpirun command.

21 MPItest.sh
The environment variable $HOST_NODEFILE contains the list of WNs allocated for the parallel execution.
    # (The part of the script that sets EXE, CPU_NEEDED and HOST_NODEFILE is not shown on this slide.)
    for i in `cat $HOST_NODEFILE` ; do
        echo "Mirroring via SSH to $i"
        # creates the working directories on all the nodes allocated for parallel execution.
        ssh $i mkdir -p `pwd`
        # copies the needed files on all the nodes allocated for parallel execution.
        /usr/bin/scp -rp ./* $i:`pwd`
        # checks that all files are present on all the nodes allocated for parallel execution.
        ssh $i ls `pwd`
    done
    # execute the parallel job with mpirun.
    echo "Executing $EXE"
    chmod 755 $EXE
    mpirun -np $CPU_NEEDED -machinefile $HOST_NODEFILE `pwd`/$EXE > executable.out
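The fragment above relies on HOST_NODEFILE without showing where it comes from. The following sketch shows one plausible way to obtain it, under stated assumptions: with PBS the batch system provides the node list in the file named by $PBS_NODEFILE, and with LSF the allocated hosts are listed, space-separated, in $LSB_HOSTS. The function name build_nodefile and its optional directory argument are hypothetical and are not taken from the original script.

```shell
# Hypothetical helper: print the path of a file listing the allocated nodes.
# $1 (optional) = directory in which to create the file for the LSF case;
#                 defaults to the current working directory.
build_nodefile() {
    dir=${1:-$(pwd)}
    if [ -n "${PBS_NODEFILE:-}" ]; then
        # Assumption: PBS already provides a ready-made node file.
        echo "$PBS_NODEFILE"
    elif [ -n "${LSB_HOSTS:-}" ]; then
        # Assumption: LSF lists the hosts in LSB_HOSTS; write one per line.
        nodefile="$dir/lsf_nodefile.$$"
        : > "$nodefile"
        for host in $LSB_HOSTS; do
            echo "$host" >> "$nodefile"
        done
        echo "$nodefile"
    else
        echo "no PBS or LSF node list found" >&2
        return 1
    fi
}
```

A wrapper could then start with `EXE=$1`, `CPU_NEEDED=$2` and `HOST_NODEFILE=$(build_nodefile) || exit 1`, which is consistent with the Arguments = "cpi 5" line of mpi.jdl.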

22 DAG job
- A DAG job is a set of jobs where the input, output, or execution of one or more jobs can depend on other jobs.
- Dependencies are represented through Directed Acyclic Graphs, where the nodes are jobs and the edges identify the dependencies.
[Figure: example DAG with nodes nodeA, nodeB, nodeC, NodeF, nodeD]
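One way to see what the dependency edges imply is to feed them to the standard tsort utility, which prints an order in which the jobs could run. This sketch uses the four-node dependency set from the DAG JDL example later in this tutorial (nodeA before nodeB and nodeC, and both of those before nodeD). The function name dag_order is hypothetical, and tsort only illustrates the ordering constraint; it is not how the WMS schedules a DAG.

```shell
# Each input line is "prerequisite dependent": the first job must finish
# before the second may start.
dag_order() {
    printf '%s\n' \
        'nodeA nodeB' \
        'nodeA nodeC' \
        'nodeB nodeD' \
        'nodeC nodeD' | tsort
}

dag_order
```

Any valid order starts with nodeA and ends with nodeD. The relative order of nodeB and nodeC is not fixed, because neither depends on the other, so they can run in parallel.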

23 JDL structure

24 Attribute: Nodes

25 Attribute: Dependencies

26 DAG jdl
    [
        type = "dag";
        max_nodes_running = 4;
        nodes = [
            nodeA = [ file = "nodes/nodeA.jdl"; ];
            nodeB = [ file = "nodes/nodeB.jdl"; ];
            nodeC = [ file = "nodes/nodeC.jdl"; ];
            nodeD = [ file = "nodes/nodeD.jdl"; ];
            dependencies = {
                {nodeA, nodeB},
                {nodeA, nodeC},
                { {nodeB, nodeC}, nodeD }
            }
        ];
    ]
The node descriptions could also be written here, inline, instead of using separate files.
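As a sketch of the inline alternative, the function below writes a small DAG JDL in which nodeA is described in place while nodeB still refers to a separate file. It assumes that the inline form uses a description attribute inside the node, as we understand the JDL attributes specification listed in the references; check that document before relying on it. The function name write_dag_inline_jdl and the job attributes chosen for nodeA are illustrative only.

```shell
# Hypothetical helper: write an example DAG JDL with one inline node to "$1".
# Assumption: an inline node is written as "description = [ ... ]".
write_dag_inline_jdl() {
    cat > "$1" <<'EOF'
[
    type = "dag";
    max_nodes_running = 4;
    nodes = [
        nodeA = [
            description = [
                JobType = "Normal";
                Executable = "/bin/hostname";
                StdOutput = "nodeA.out";
                StdError = "nodeA.err";
                OutputSandbox = {"nodeA.out", "nodeA.err"};
            ];
        ];
        nodeB = [ file = "nodes/nodeB.jdl"; ];
        dependencies = { {nodeA, nodeB} }
    ];
]
EOF
}
```

Mixing both styles in one DAG keeps small nodes readable in place while larger job descriptions stay in their own files.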

27 References
- Job Description Language: https://edms.cern.ch/file/555796/1/EGEE-JRA1-TEC-555796-JDL-Attributes-v0-8.pdf
- GILDA wiki:
  - Job with Data: https://grid.ct.infn.it/twiki/bin/view/GILDA/JobData
  - MPI Job: https://grid.ct.infn.it/twiki/bin/view/GILDA/MPIJobs-withedgcommands

29 Thank you very much for your kind attention!

