Published by Marcus Bennett. Modified over 9 years ago.
Workload Management meeting 07/10/2004
Federica Fanzago, INFN Padova

Grape for analysis
M. Corvo, F. Fanzago, N. Smirnov, INFN Padova

Goals
– Show how we plan to implement real analysis jobs with Grape.
– Grape has already been used to run some analysis jobs, but it still needs additional functionality (such as data discovery via PubDB and automatic retrieval of output), and its architecture has to be evaluated and tested.
– Grape was developed to run production, but we now want to concentrate only on analysis tasks.

What the user should provide...
... as information written in the grape.cfg file:
a) The analysis input parameters: dataset and owner
b) The number of events to analyze per job (job splitting)
c) The name of the ORCA executable to run on the worker node (WN)
d) The name of the output file produced by the executable (ROOT file)
e) The user's orcarc card
... and:
f) GRAPE finds the executable and the libraries in the user's SCRAM area, packs them and includes them in the JDL InputSandbox
g) GRAPE modifies the orcarc card according to the job splitting and includes it in the JDL InputSandbox
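
As an illustration of the user input listed above, the following is a minimal Python sketch of reading such parameters from a configuration file. The slides do not show the grape.cfg syntax, so the INI layout, the [analysis] section and all key names (dataset, owner, events_per_job, executable, output_file, orcarc) are assumptions made only for this example.

```python
import configparser

# Hypothetical grape.cfg content. The section and key names are
# illustrative assumptions, not the real Grape configuration schema.
EXAMPLE_CFG = """
[analysis]
dataset = my_dataset
owner = my_owner
events_per_job = 500
executable = MyAnalysis
output_file = MyHisto.root
orcarc = orcarc
"""

REQUIRED_KEYS = (
    "dataset", "owner", "events_per_job",
    "executable", "output_file", "orcarc",
)


def load_user_config(text):
    """Parse the configuration text and check that every item is present."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if "analysis" not in parser:
        raise ValueError("missing [analysis] section")
    section = parser["analysis"]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    cfg = dict(section)
    # The job-splitting size is the only numeric parameter.
    cfg["events_per_job"] = section.getint("events_per_job")
    return cfg


if __name__ == "__main__":
    print(load_user_config(EXAMPLE_CFG))
```

Failing early on a missing key keeps an incomplete configuration from reaching the packaging and submission steps.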

GRAPE workflow
1) Read the grape.cfg file
2) Create the scripts to submit:
   a) data discovery (querying PubDB)
   b) packaging of the user code
   c) modification of the orcarc card
   d) creation of the shell script to run on the WN (wrapper of the ORCA executable)
3) Create the JDL files
4) Submit the jobs to the Grid (without BOSS in the first prototype)
5) Retrieve the job output automatically

How GRAPE uses user information (1)
Data discovery
– Query the CERN PubDB to discover where the data are stored (by the RC name field). More than one site may hold the data.
– The sites storing the data are written as a requirement in the JDL file, so that the Resource Broker (RB) matches one of them as the resource to which the analysis job is submitted. The RB decides where to send the job.
– The same query also returns the location of the local catalogs (and the access protocol) for all sites.
– The local catalog information is sent with the jobs via the InputSandbox (catalogs_file).
– On the WN, catalogs_file is used to select the correct POOL catalog for the site, which is then written into the orcarc card.
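
The step of turning the discovered sites into a JDL requirement can be sketched as follows. This is only an illustration under assumptions: the slides do not say which attribute is used to match a site, so the sketch assumes that each site found in PubDB has been mapped to a computing element host name and matches it against the Glue attribute GlueCEInfoHostName. The function name and the example host names are hypothetical.

```python
def build_requirements(ce_hosts):
    """Return a JDL Requirements line that accepts any of the given hosts.

    ce_hosts: host names of the computing elements at the sites holding
    the data (assumed to be derived from the PubDB query result).
    """
    hosts = sorted(set(ce_hosts))  # drop duplicates, keep a stable order
    if not hosts:
        raise ValueError("data discovery returned no site")
    clauses = ['other.GlueCEInfoHostName == "%s"' % host for host in hosts]
    # Any one of the sites is acceptable; the Resource Broker picks among them.
    return "Requirements = " + " || ".join(clauses) + ";"


if __name__ == "__main__":
    # Placeholder host names, for illustration only.
    print(build_requirements(["ce.site-b.example", "ce.site-a.example"]))
```

Joining the clauses with a logical OR leaves the final choice among the sites to the Resource Broker, as described above.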

How GRAPE uses user information (2)
Packaging of the code and modification of the card
– The name of the analysis executable is needed to package the code and its related libraries into a tgz archive, which is sent via the InputSandbox. The environment variable $LOCALRT provides the path of the user's SCRAM area.
– The orcarc card provided by the user is modified by Grape according to the job splitting, that is, by changing FirstEvent and MaxEvents. (Open question: will PubDB publish the total number of events of a dataset-owner pair?)
Creation of the JDL to submit to the Grid
The InputSandbox is filled with:
1) the tgz archive of the user code
2) the orcarc card
3) the catalogs_file obtained from PubDB
The OutputSandbox is defined with:
1) the ROOT output file
2) std.out and std.err of the Grid job
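
The job splitting and the sandbox definitions described above can be sketched in Python as follows. This is a hedged illustration, not Grape's code: the function names are hypothetical, the total number of events is assumed to be known (the slides leave open whether PubDB will publish it), event numbering is assumed to start at 0, and the wrapper script is assumed to be shipped in the InputSandbox together with the three files listed above.

```python
def split_jobs(total_events, events_per_job, first_event=0):
    """Return one {FirstEvent, MaxEvents} pair per job."""
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    jobs = []
    start = first_event
    remaining = total_events
    while remaining > 0:
        count = min(events_per_job, remaining)  # the last job may be shorter
        jobs.append({"FirstEvent": start, "MaxEvents": count})
        start += count
        remaining -= count
    return jobs


def make_jdl(wrapper, tgz, orcarc, catalogs_file, root_output):
    """Build the JDL text for one job from the file names involved."""
    inputs = ", ".join('"%s"' % name
                       for name in (wrapper, tgz, orcarc, catalogs_file))
    outputs = ", ".join('"%s"' % name
                        for name in (root_output, "std.out", "std.err"))
    return "\n".join([
        'Executable = "%s";' % wrapper,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = {%s};" % inputs,
        "OutputSandbox = {%s};" % outputs,
    ])


if __name__ == "__main__":
    for number, job in enumerate(split_jobs(1200, 500), start=1):
        print(number, job)
    # File names below are placeholders.
    print(make_jdl("wrapper.sh", "user_code.tgz", "orcarc",
                   "catalogs_file", "MyHisto_1.root"))
```

With 1200 events and 500 events per job, the split gives three jobs, the last one covering the remaining 200 events.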

How GRAPE uses user information (3)
Creation of the script to run on the WN, which:
1) sets up the CMS environment to run ORCA in the LCG environment
2) creates the SCRAM area
3) unpacks the user code into the SCRAM area
4) overwrites InputFileCatalogURL in the orcarc card with the correct POOL file to use, selected from catalogs_file according to the site where the job is running; if needed, it also copies the local catalog (e.g. for the RFIO protocol)
5) runs the executable
6) renames the output file according to the job splitting (mv MyHisto.root MyHisto_n.root)
7) returns the produced output (ROOT file) to the user via the OutputSandbox (the output is not staged to an SE or registered in RLS)
The job is submitted to the Grid via the edg-job-submit command, possibly with BOSS. Monitoring is done via the Grid command edg-job-status. In the future we are thinking of using BOSS or GridICE (with application monitoring).
Retrieval of output
A wrapper script around the edg-job-get-output command automatically retrieves the output when the job has finished and puts the files into a directory predefined by the user.
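
Steps 4 and 6 of the wrapper can be sketched as follows. The slides describe a shell script; Python is used here only to illustrate the logic. The layout of catalogs_file (one "site catalog-URL" pair per line) and the "key = value" form of the orcarc card are assumptions made for this example, and all function names, site names and catalog URLs are hypothetical.

```python
import os


def select_catalog(catalogs_text, site):
    """Pick the catalog URL for the site where the job is running.

    Assumed catalogs_file layout: '<site> <catalog URL>' on each line.
    """
    for line in catalogs_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == site:
            return parts[1]
    raise KeyError("no catalog listed for site %s" % site)


def set_orcarc_key(orcarc_text, key, value):
    """Overwrite 'key = value' in the card, appending the key if absent."""
    lines = []
    replaced = False
    for line in orcarc_text.splitlines():
        if line.split("=", 1)[0].strip() == key:
            lines.append("%s = %s" % (key, value))
            replaced = True
        else:
            lines.append(line)
    if not replaced:
        lines.append("%s = %s" % (key, value))
    return "\n".join(lines) + "\n"


def numbered_output(name, job_number):
    """MyHisto.root with job number n becomes MyHisto_n.root."""
    stem, ext = os.path.splitext(name)
    return "%s_%d%s" % (stem, job_number, ext)


if __name__ == "__main__":
    # Placeholder sites and catalog URLs.
    catalogs = "site_a file:/data/a/catalog.xml\nsite_b file:/data/b/catalog.xml\n"
    card = "InputFileCatalogURL = placeholder\nMaxEvents = 500\n"
    url = select_catalog(catalogs, "site_b")
    print(set_orcarc_key(card, "InputFileCatalogURL", url))
    print(numbered_output("MyHisto.root", 3))
```

Numbering the output file per job keeps the ROOT files of different jobs from overwriting each other when they are collected in the same user directory.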

What is done and what remains to do
– The general architecture is already in place.
– Grape has already been used to run analysis in the LCG environment.
– We are implementing:
  - connection with PubDB and modification of the shell scripts (Nikolai and Federica)
  - software packaging and automatic output retrieval (Marco)
  - monitoring: still to do
We expect to have a running prototype by the end of next week.
We would be happy if people tried it and gave us feedback!