Job with data management
Giuliano Taffoni
Istituto Nazionale di Astrofisica, Information Technology Unit (INAF-SI)

Corso Avanzato di Calcolo Parallelo e Grid Computing - 27 Sep - Catania
Name Convention
- Logical File Name (LFN): "lfn:cms/track1"
- Globally Unique Identifier (GUID): "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
- Site URL (SURL): "sfn://lxshare0209.cern.ch/data/alice/ntuples.dat"
- Transport URL (TURL): "gsiftp://lxshare0209.cern.ch//data/alice/ntuples.dat"
Diagram: Logical File Names 1..n point to one GUID, which points to Physical File SURLs 1..n.
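
The four name types can be told apart by their scheme prefix. The following is a minimal shell sketch, assuming only the prefixes shown on the slide (lfn:, guid:, sfn://, gsiftp://). The function name classify_name is hypothetical, and a real deployment may use further schemes.

```shell
#!/bin/sh
# Classify a grid file name by its scheme prefix.
# Illustrative helper only; it is not part of any grid middleware.
classify_name() {
    case "$1" in
        lfn:*)      echo "LFN"  ;;   # logical file name
        guid:*)     echo "GUID" ;;   # unique identifier
        sfn://*)    echo "SURL" ;;   # site URL of a physical replica
        gsiftp://*) echo "TURL" ;;   # transport URL (protocol + location)
        *)          echo "unknown" ;;
    esac
}

classify_name "lfn:cms/track1"   # prints LFN
```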

Job Description Language
- The Job Description Language (JDL) is used to describe jobs for execution on the Grid.
- The JDL adopted is based on Condor's CLASSified Advertisement language (ClassAd).
- The supported attributes are grouped in categories.
- Data and Storage resources: the input data to process, the Storage Element (SE) where output data is stored, and the protocols spoken by the application when accessing SEs.

Jobs and Output data
- Sandbox: for output smaller than 30 MB
- Use the catalogue for bigger output
- Use catalogue + metadata for refinements
- Requires a data "driver"
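
As a rough sketch of the 30 MB rule, a driver could check the size of an output file before deciding how to return it. The function name output_channel and the reading of 30 MB as 30 x 1024 x 1024 bytes are assumptions made for illustration.

```shell
#!/bin/sh
# Decide how to return an output file, following the sandbox size rule of thumb.
# Assumption: "30 MB" is interpreted as 30 * 1024 * 1024 bytes.
SANDBOX_LIMIT_BYTES=$((30 * 1024 * 1024))

output_channel() {
    # $1: path to a local output file
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -lt "$SANDBOX_LIMIT_BYTES" ]; then
        echo "sandbox"      # small enough for the OutputSandbox
    else
        echo "catalogue"    # store on an SE and register in the catalogue
    fi
}
```

A driver script could branch on the printed value to either leave the file for the OutputSandbox or upload and register it.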

Data driver
Script to submit as the job:
- Prepares the environment
- Executes the program(s)
- Handles data
  - Saves the output
  - Stores comments/metadata information

Data handling
- Suggestion: save data on the closest SE
  - VO_XXXX_DEFAULT_SE
  - Protocol: rfio, gridftp, lcg-cr
- Register data in the catalogue
  - lcg-rf
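
A driver script has to look up the default-SE variable for its VO. The sketch below assumes the common convention that XXXX is the VO name in upper case, with "." and "-" replaced by "_". The helper names default_se_var and default_se are hypothetical, so check which variable is actually defined on the worker node.

```shell
#!/bin/sh
# Build the name of the default-SE variable for a VO.
# Assumption: VO name upper-cased, '.' and '-' mapped to '_'.
default_se_var() {
    vo_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '.-' '__')
    printf 'VO_%s_DEFAULT_SE\n' "$vo_upper"
}

# Print the value of that variable (empty if it is not set).
default_se() {
    var=$(default_se_var "$1")
    eval "printf '%s\n' \"\${$var:-}\""
}
```

For a VO named gilda, default_se_var prints VO_GILDA_DEFAULT_SE, and default_se prints whatever that variable holds in the job environment.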

A simple example
> cat scriptOutput.sh
    #!/bin/sh
    # Write the greeting to a local file.
    /bin/echo Hello $1 and Welcome to the EGEE Tutorial! > test.out
    # Copy the file to the default SE and register it in the catalogue.
    lcg-cr --vo YourVO -d $VO_YourVO_DEFAULT_SE -l lfn:/grid/yourvo/tests/out.txt file:`pwd`/test.out
> cat test.jdl
    [
    Type = "job";
    JobType = "Normal";
    Executable = "scriptOutput.sh";
    Arguments = "pippo";
    VirtualOrganisation = "YourVO";
    StdOutput = "sim.out";
    StdError = "sim.err";
    InputSandbox = {"scriptOutput.sh"};
    OutputSandbox = {"sim.out", "sim.err"};
    Requirements = ….
    ]

Job with input data
- "Move the computation to the data rather than moving the data to the computing power"
- "Use the closest data repository!"
- JDL + lcg-info + lcg-cp

JDL Attributes for data
InputData (optional)
- A string or a list of strings representing the Logical File Names (LFN) or Grid Unique Identifiers (GUID) needed by the job as input.
- The RB uses the list to find the CE from which the specified files can best be accessed, and schedules the job to run there.
    InputData = {"lfn:cmstestfile", "guid:135b7b23-4a6a-11d7-87e7-9d101f8c8b70"};
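
When a JDL file is generated by a script, the InputData line can be assembled from a list of names. This is a minimal sketch, assuming only the two name forms described above; jdl_input_data is a hypothetical helper and not part of the middleware.

```shell
#!/bin/sh
# Print an InputData attribute built from one or more LFNs or GUIDs.
jdl_input_data() {
    [ "$#" -gt 0 ] || { echo "usage: jdl_input_data NAME..." >&2; return 1; }
    list=""
    for name in "$@"; do
        case "$name" in
            lfn:*|guid:*) ;;   # accepted name forms
            *) echo "not an LFN or GUID: $name" >&2; return 1 ;;
        esac
        # Append the quoted name, comma-separated after the first entry.
        list="${list:+$list, }\"$name\""
    done
    printf 'InputData = {%s};\n' "$list"
}
```

Calling it with lfn:cmstestfile and the GUID from the slide reproduces the attribute line shown there.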

JDL: Relevant Attributes
DataAccessProtocol
- The protocol or the list of protocols which the application is able to "speak" for accessing files listed in InputData on a given SE.
- Supported protocols in LCG-2 are currently gridftp, file and rfio.
    DataAccessProtocol = {"file", "gridftp", "rfio"};
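
Before writing a DataAccessProtocol list into a JDL file, a script could check it against the three protocols named above. This is only a sketch: the function name check_protocols is hypothetical, and the supported set is copied from the slide, so it may differ on other middleware releases.

```shell
#!/bin/sh
# Protocols listed on the slide as supported in LCG-2.
supported_protocols="gridftp file rfio"

# Return 0 if every argument is a supported protocol, 1 otherwise.
check_protocols() {
    for p in "$@"; do
        case " $supported_protocols " in
            *" $p "*) ;;   # found in the supported list
            *) echo "unsupported protocol: $p" >&2; return 1 ;;
        esac
    done
}
```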

Job with Input Data
    [
    Type = "job";
    JobType = "Normal";
    Executable = "scriptInput.sh";
    Arguments = "Francesco";
    VirtualOrganisation = "gilda";
    StdOutput = "std.out";
    StdError = "std.err";
    InputSandbox = {"scriptInput.sh"};
    OutputSandbox = {"std.err", "std.out"};
    InputData = "lfn:myoutdata.1";
    DataAccessProtocol = {"gridftp", "rfio"};
    Requirements = (other.GlueCEInfoTotalCPUs > 4);
    Rank = (other.GlueCEStateFreeCPUs);
    ]

HowTo run near your data
- JDL (now disabled, but…)
  - Identify the SE
  - You can use the Requirements attribute
  - GlueCECloser etc…
  - More replicas: combine with "||"
- Only one replica: run on the closest CE
  - glite-job-submit -r cehost

Job with input data example
    #!/bin/sh
    # Copy the input file from the grid to the working directory.
    lcg-cp --vo gilda lfn:myoutdata.1 file:`pwd`/dataset1.out
    echo "Before updating.."
    cat dataset1.out
    # Add a new entry to the dataset1.out file.
    /bin/echo Hello $1 and Welcome to the EGEE Tutorial! >> dataset1.out
    echo "After updating.."
    cat dataset1.out

IO data: advanced
- GRID API to read and write:
  - GridFTP API
  - Catalogue API
  - GFAL API
- See hands on

- JDL Attributes
- LCG-2 User Guide, Manual Series
- EDG Tutorial
- EDG Users' Guide

Summary & Conclusions
- We explained the main attributes needed to create and submit a basic JDL on the Grid.
- Jobs which interact with the LFC through the JDL.
- Jobs which interact with the LFC using the lcg-* commands.

Questions…