Istituto Nazionale di Astrofisica Information Technology Unit INAF-SI Job with data management Giuliano Taffoni.

1 Istituto Nazionale di Astrofisica, Information Technology Unit (INAF-SI)
Job with data management
Giuliano Taffoni

2 Corso Avanzato di Calcolo Parallelo e Grid Computing - 27 Sep - Catania
Name Convention
[Diagram: Logical File Name 1 … Logical File Name n → GUID → Physical File SURL 1 … Physical File SURL n]
– Logical File Name (LFN): "lfn:cms/track1"
– Globally Unique Identifier (GUID): "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
– Site URL (SURL): "sfn://lxshare0209.cern.ch/data/alice/ntuples.dat"
– Transport URL (TURL): "gsiftp://lxshare0209.cern.ch//data/alice/ntuples.dat"
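The four name types above can be told apart by their scheme prefix. The following is a minimal sketch of that idea; `classify_grid_name` is a hypothetical helper written for illustration, not an LCG command, and it recognises only the four schemes shown on the slide.

```sh
#!/bin/sh
# Illustrative helper (not an LCG command): classify a Grid file name
# by its scheme prefix, using only the four schemes shown on the slide.
classify_grid_name() {
    case "$1" in
        lfn:*)      echo "LFN"  ;;   # logical file name
        guid:*)     echo "GUID" ;;   # globally unique identifier
        sfn://*)    echo "SURL" ;;   # site URL (physical replica)
        gsiftp://*) echo "TURL" ;;   # transport URL
        *)          echo "unknown"; return 1 ;;
    esac
}

# Classify the example names from the slide.
for name in \
    "lfn:cms/track1" \
    "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6" \
    "sfn://lxshare0209.cern.ch/data/alice/ntuples.dat" \
    "gsiftp://lxshare0209.cern.ch//data/alice/ntuples.dat"
do
    printf '%s -> %s\n' "$name" "$(classify_grid_name "$name")"
done
```

Other schemes may appear at a given site; they would fall into the "unknown" branch here and need their own case.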

3 Job Description Language
– The Job Description Language (JDL) is used to describe jobs for execution on the Grid.
– The JDL adopted is based on Condor's CLASSified Advertisement language (ClassAd).
– The supported attributes are grouped in categories. For data and storage resources these are: the input data to process, the Storage Element (SE) where output data is stored, and the protocols spoken by the application when accessing SEs.

4 Jobs and Output data
– Sandbox: output < 30 MB
– Use the catalogue for bigger output
– Use catalogue + metadata for refinements
– Requires a data "driver"
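The size rule above can be checked inside a job script before deciding where an output file goes. This is a minimal sketch: `output_channel` and `SANDBOX_LIMIT_BYTES` are hypothetical names, and reading "30 MB" as 30 * 1024 * 1024 bytes is an assumption that a site may define differently.

```sh
#!/bin/sh
# Assumption: "30 MB" is read as 30 * 1024 * 1024 bytes; override if your site differs.
SANDBOX_LIMIT_BYTES=${SANDBOX_LIMIT_BYTES:-$((30 * 1024 * 1024))}

# Print "sandbox" if the file is smaller than the limit, "catalogue" otherwise.
output_channel() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    size=$(($(wc -c < "$1")))   # arithmetic expansion strips any padding from wc
    if [ "$size" -lt "$SANDBOX_LIMIT_BYTES" ]; then
        echo "sandbox"
    else
        echo "catalogue"
    fi
}
```

A driver script could call `output_channel sim.out` and branch on the result: list the file in OutputSandbox when it is small, or copy it to an SE and register it when it is not.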

5 Data driver
Script to submit the job:
– Prepares the environment
– Executes the program(s)
– Handles data
   - Save output
   - Store comments/metadata information
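The three phases of a data driver can be sketched as three shell functions run in sequence. Everything here except the `lcg-cr` command name is hypothetical: the function names, `result.out`, the placeholder VO, SE and LFN values, and the `DRY_RUN` switch, which by default only prints the storage command so the skeleton can be tried off the Grid. Metadata handling is left out.

```sh
#!/bin/sh
# Hypothetical data-driver skeleton. Function names, result.out and the
# DRY_RUN switch are illustrative; only lcg-cr comes from the course material.
DRY_RUN=${DRY_RUN:-1}          # 1 = print the storage command instead of running it
VO=${VO:-YourVO}
SE=${SE:-your.se.example}
LFN=${LFN:-lfn:/grid/yourvo/tests/out.txt}

prepare_env() {                # 1. prepare the environment
    WORKDIR=$(mktemp -d) && cd "$WORKDIR"
}

run_program() {                # 2. execute the program(s)
    /bin/echo "Hello $1" > result.out
}

store_output() {               # 3. handle data: save and register the output
    if [ "$DRY_RUN" = 1 ]; then
        echo lcg-cr --vo "$VO" -d "$SE" -l "$LFN" "file:$PWD/result.out"
    else
        lcg-cr --vo "$VO" -d "$SE" -l "$LFN" "file:$PWD/result.out"
    fi
}

# Run the phases in order and stop at the first failure.
run_driver() {
    prepare_env && run_program "$1" && store_output
}

run_driver "${1:-world}"
```

Chaining the phases with `&&` means a failed program run never reaches the storage step, so no partial output gets registered.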

6 Data handling
Suggestion: save data in the closest SE
– VO_XXXX_DEFAULT_SE
– Protocol: rfio, gridftp, lcg-cr
Register data on the catalogue
– lcg-rf
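A job script can look up the default SE for its VO from the environment rather than hard-coding a host name. The sketch below assumes the variable follows the pattern `VO_<VO>_DEFAULT_SE` with the VO name upper-cased and every non-alphanumeric character replaced by an underscore; `default_se_for_vo` is a hypothetical helper, and the naming rule should be checked against your site's configuration.

```sh
#!/bin/sh
# Assumption: the variable is named VO_<VO>_DEFAULT_SE, with the VO name
# upper-cased and every non-alphanumeric character turned into "_".
default_se_for_vo() {
    vo_tag=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9_' '_')
    # vo_tag now holds only A-Z, 0-9 and "_", so it is safe to use in eval.
    eval "se=\${VO_${vo_tag}_DEFAULT_SE:-}"
    [ -n "$se" ] || { echo "VO_${vo_tag}_DEFAULT_SE is not set" >&2; return 1; }
    printf '%s\n' "$se"
}

# Example use inside a driver script:
#   SE=$(default_se_for_vo gilda) || exit 1
```

Failing loudly when the variable is unset avoids passing an empty destination to the storage command.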

7 A simple example
> cat scriptOutput.sh
#!/bin/sh
/bin/echo Hello $1 and Welcome to the EGEE Tutorial! > test.out
# Copy test.out to the default SE and register it in the catalogue
lcg-cr --vo YourVO -d $VO_YourVO_DEFAULT_SE -l lfn:/grid/yourvo/tests/out.txt file:`pwd`/test.out

> cat test.jdl
[
Type = "job";
JobType = "Normal";
Executable = "scriptOutput.sh";
Arguments = "pippo";
VirtualOrganisation = "YourVO";
StdOutput = "sim.out";
StdError = "sim.err";
InputSandbox = {"scriptOutput.sh"};
OutputSandbox = {"sim.out", "sim.err"};
Requirements=….
]

8 Job with input data
– "Move the computation to data rather than move data to computing power"
– "Use the closest data repository!"
– JDL + lcg-info + lcg-cp

9 JDL Attributes for data
InputData (optional)
– A string or a list of strings giving the Logical File Name (LFN) or Globally Unique Identifier (GUID) of each file the job needs as input.
– The Resource Broker (RB) uses the list to find the Computing Element (CE) from which the specified files can best be accessed, and schedules the job to run there.
InputData = {"lfn:cmstestfile", "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"};

10 JDL: Relevant Attributes
DataAccessProtocol
– The protocol, or list of protocols, that the application is able to "speak" when accessing the files listed in InputData on a given SE.
– Supported protocols in LCG-2 are currently gridftp, file and rfio.
DataAccessProtocol = {"file", "gridftp", "rfio"};
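When a JDL file is generated from a script, list attributes such as InputData and DataAccessProtocol can be rendered from shell arguments. This is a minimal sketch; `jdl_list` is a hypothetical helper, not part of any Grid middleware, and it does not escape quotes inside values.

```sh
#!/bin/sh
# Illustrative helper: render a JDL list attribute from shell arguments.
# Usage: jdl_list NAME value...
jdl_list() {
    name=$1; shift
    list=""
    for value in "$@"; do
        # Append the quoted value, with ", " before it unless it is the first.
        list="${list:+$list, }\"$value\""
    done
    printf '%s = {%s};\n' "$name" "$list"
}

jdl_list InputData "lfn:cmstestfile" "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"
jdl_list DataAccessProtocol "file" "gridftp" "rfio"
```

The two calls print the same attribute lines as the InputData and DataAccessProtocol examples on the slides, so the output can be appended to a JDL file as it is being built.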

11 Job with Input Data
[
Type = "job";
JobType = "Normal";
Executable = "scriptInput.sh";
Arguments = "Francesco";
VirtualOrganisation = "gilda";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"scriptInput.sh"};
OutputSandbox = {"std.err", "std.out"};
InputData = "lfn:myoutdata.1";
DataAccessProtocol = {"gridftp", "rfio"};
Requirements = (other.GlueCEInfoTotalCPUs > 4);
Rank = (other.GlueCEStateFreeCPUs);
]

12 HowTo run near your data
JDL (now disabled, but…)
– Identify the SE;
– You can use the Requirements;
– GlueCECloser etc…
– More replicas: "||"
With only one replica, run on the closest CE:
– glite-job-submit -r cehost

13 Job with input data example
#!/bin/sh
# Copy the input file from the Grid to the working directory
lcg-cp --vo gilda lfn:myoutdata.1 file:`pwd`/dataset1.out
echo "Before updating.."
cat dataset1.out
# Adding new entry on the dataset1.out file.
/bin/echo Hello $1 and Welcome to the EGEE Tutorial! >> dataset1.out
echo "After updating.."
cat dataset1.out

14 IO data: advanced
GRID API to read and write:
– GridFTP API;
– Catalogue API;
– GFAL API.
See the hands-on session.

15 JDL Attributes: http://server11.infn.it/workload-grid/docs/DataGrid-01-TEN-0142-0_2.pdf
LCG-2 User Guide Manual Series: http://egee-na4.ct.infn.it/documentation/LCG-2-Userguide.pdf
EDG Tutorial: http://www.dutchgrid.nl/DataGrid/introduction/edg-tutorial-handout.pdf
EDG Users' Guide: http://marianne.in2p3.fr/datagrid/documentation/EDG-User-Guide-2.0.pdf

16 Summary & Conclusions
We explained the main attributes needed to create and submit basic JDL jobs on the Grid:
– Jobs that interact with the LFC through the JDL
– Jobs that interact with the LFC using the lcg-* commands

17 Questions…

