
A proposal for standardizing the working environment for a LCG/EGEE job

David Bouvet (dbouvet@in2p3.fr)
Fabio Hernandez (fabio@in2p3.fr)
IN2P3 Computing Centre - Lyon

HEPIX, Karlsruhe, 13/05/2005
LCG Operations Workshop, Bologna, 25/05/2005

Motivation

The problem was raised some months ago by Jeff Templon: D0 jobs encountered problems in Lyon because sites address scratch/temp disk space through different environment variables.

Standards do exist for:
- Environment variables: "IEEE Std 1003.1, 2004, POSIX Part 1: Base Definitions, Amendment 8"
  http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html
  among which: HOME, PATH, PWD, SHELL, TMPDIR, USER
- Batch environment services: "IEEE Std 1003.1, 2004, POSIX Part 2: Shell and Utilities, Amendment 1"
  http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap03.html
  PBS_ENVIRONMENT, PBS_JOBID, PBS_JOBNAME, PBS_QUEUE, PBS_O_HOME, PBS_O_HOST, PBS_O_LOGNAME, PBS_O_PATH, PBS_O_QUEUE, PBS_O_SHELL, PBS_O_WORKDIR

There is, however, no standard definition of environment variables for grid batch jobs. Hence this proposal: a common definition, for LCG/EGEE sites, of a minimal set of environment variables for grid batch jobs.

Current status

Environment variables for grid batch jobs have been checked on several LCG/EGEE sites (among which all the LCG Tier-1s). Conditions of test: ATLAS VO, short queue. A sketch of such a probe job follows the table.

    Batch system    CEs distribution    # CEs checked
    BQS                   3                   2
    CONDOR                4
    TORQUE               72                  11
    PBS                  36                  13
    LSF                   5
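Such a survey can be reproduced with a trivial probe job that prints the environment seen on the worker node. A minimal sketch in Python (the prefixes listed are drawn from the variables discussed on the following slides; this is an illustration, not the script actually used for the survey):

    import os

    # Variable families of interest for the survey; everything else is site noise.
    PREFIXES = ("PBS_", "EDG_", "LCG_", "GLITE_", "GLOBUS_", "VO_", "X509_", "GRID_")
    POSIX_BASICS = {"HOME", "PATH", "PWD", "SHELL", "TMPDIR", "USER"}

    for name in sorted(os.environ):
        if name in POSIX_BASICS or name.startswith(PREFIXES):
            print("%s=%s" % (name, os.environ[name]))

Submitted once per CE, the sorted output makes the per-batch-system differences shown in the next slides easy to compare.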

Current status: POSIX variables

Checked per batch system (BQS, CONDOR, TORQUE, PBS, LSF):
- POSIX basic: HOME, PATH, PWD, SHELL, TMPDIR, USER
- POSIX batch variables

Result: not all of these variables are defined on all CEs; some are undefined on some sites.

Current status (cont.): Globus variables

Checked per batch system (BQS, CONDOR, TORQUE, PBS, LSF):
- GLOBUS_LOCATION
- GLOBUS_PATH
- GLOBUS_TCP_PORT_RANGE
- X509_USER_PROXY
- MYPROXY_SERVER (useful for proxy renewal)

Result: even for Globus, not all the sites define the same set of environment variables.

Current status: LCG environment variables (middleware related)
(list from the LCG Users Guide)

    Variable                Definition
    EDG_LOCATION            Base of the installed EDG software
    LCG_LOCATION            Base of the installed LCG software
    EDG_WL_JOBID            Job ID (for a running job) in a WN
    EDG_WL_LOCATION         Base of the EDG's WMS software
    EDG_WL_PATH             Path for EDG's WMS commands
    EDG_WL_RB_BROKERINFO    Location of the .BrokerInfo file in a WN
    LCG_GFAL_INFOSYS        Location of the BDII for lcg-utils and GFAL
    LCG_CATALOG_TYPE        Type of file catalog used (edg or lfc) for lcg-utils and GFAL
    LFC_HOST                Location of the LFC catalog (only for catalog type lfc)
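As an illustration of how a client-side tool might consume these variables, here is a hedged Python sketch. It is not the actual lcg-utils/GFAL logic, only the semantics listed in the table above; the "edg" default when LCG_CATALOG_TYPE is unset is an assumption:

    import os

    # LCG_CATALOG_TYPE selects the file catalog; LFC_HOST only applies to "lfc".
    catalog_type = os.environ.get("LCG_CATALOG_TYPE", "edg")  # default assumed
    if catalog_type == "lfc":
        lfc_host = os.environ.get("LFC_HOST")
        if not lfc_host:
            raise SystemExit("LCG_CATALOG_TYPE=lfc but LFC_HOST is not set")
        print("Using LFC catalog at " + lfc_host)
    else:
        # The BDII location is what lcg-utils and GFAL use for service lookups.
        bdii = os.environ.get("LCG_GFAL_INFOSYS", "<undefined>")
        print("Using %s catalog; BDII at %s" % (catalog_type, bdii))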

Current status: LCG environment variables (job related)
(list from the LCG Users Guide)

    Variable                    Definition
    EDG_TMP                     Temp directory
    LCG_TMP                     Temp directory
    VO_<VO-name>_DEFAULT_SE     Default SE defined for a CE in a WN
    VO_<VO-name>_SW_DIR         Base directory of the VO's software in a WN
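A short Python sketch of how a job might use the VO-scoped variables; the VO name "atlas" and the uppercasing of the VO name inside the variable are assumptions for illustration:

    import os

    vo = "atlas"  # hypothetical VO for this example
    sw_dir = os.environ.get("VO_%s_SW_DIR" % vo.upper())
    default_se = os.environ.get("VO_%s_DEFAULT_SE" % vo.upper())

    if sw_dir:
        # Pre-installed VO software lives under this base directory on the WN.
        print("VO software base: " + sw_dir)
    if default_se:
        # Default Storage Element associated with this CE.
        print("Default SE: " + default_se)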

Current status: gLite environment variables

gLite environment variables on the WN (in configuration files and scripts), from the gLite installation guide:

    GLITE_LOCATION        /opt/glite
    GLITE_LOCATION_VAR    /var/glite
    GLITE_LOCATION_LOG    /var/log/glite
    GLITE_LOCATION_TMP    /tmp/glite

GLITE_LOCATION_TMP: yet another temp directory to clean!

Proposal for standardization

    Variable type      Name                 Definition
    POSIX              HOME                 Home directory of job user on WN
                       TMPDIR               Temp directory (currently LCG_TMP, EDG_TMP, GLITE_LOCATION_TMP)
                       PWD
                       SHELL
                       PATH
    Grid batch jobs    GRID_WORKDIR         Job working directory on WN
                       GRID_SITENAME        Site on which the job runs (same as siteName in Information Provider)
                       GRID_HOSTNAME        WN hostname on which the job runs
                       GRID_CEID            CE and queue names on which the job runs (same as GlueCEUniqueID in Information Provider)
                       GRID_LOCAL_JOBID     Job ID in the local batch system
                       GRID_GLOBAL_JOBID    Job ID on the grid (currently EDG_WL_JOBID)
                       GRID_USERID          DN of the user's certificate

A usage sketch follows the table.
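If this proposal were adopted, a job could identify itself for problem tracking using only middleware-neutral names. A hedged Python sketch, assuming a site that sets the variables above:

    import os

    def grid_identity():
        """Collect the proposed GRID_* variables for a tracking log line."""
        keys = ("GRID_SITENAME", "GRID_CEID", "GRID_HOSTNAME",
                "GRID_LOCAL_JOBID", "GRID_GLOBAL_JOBID", "GRID_USERID")
        return dict((k, os.environ.get(k, "<unset>")) for k in keys)

    # One such line per job start makes cross-site problem tracking much easier.
    print(" ".join("%s=%s" % kv for kv in sorted(grid_identity().items())))

    # All scratch files go under the job-private working directory when set.
    workdir = os.environ.get("GRID_WORKDIR") or os.environ.get("TMPDIR", "/tmp")
    if os.path.isdir(workdir):
        os.chdir(workdir)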

Proposal for standardization (cont.)

Use POSIX variables where they already exist:
- TMPDIR: POSIX variable which could also be used by the middleware for creating temporary files (currently LCG_TMP, EDG_TMP, GLITE_LOCATION_TMP)
- HOME: MPI jobs need a home directory, and some grids (like OSG) have a permanent mapping for each grid user

A fallback sketch for the current situation follows.
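Until TMPDIR is the single agreed name, a portable job has to probe the middleware-specific spellings itself. A minimal Python sketch of such a fallback (the preference order is an assumption):

    import os
    import tempfile

    # Candidate names, POSIX first; the rest are middleware-specific.
    CANDIDATES = ("TMPDIR", "LCG_TMP", "EDG_TMP", "GLITE_LOCATION_TMP")

    def scratch_dir():
        """Return the first defined, existing temp directory, else a default."""
        for name in CANDIDATES:
            path = os.environ.get(name)
            if path and os.path.isdir(path):
                return path
        return tempfile.gettempdir()  # last resort, typically /tmp

    print("Using scratch directory: " + scratch_dir())

This is exactly the per-site guesswork that standardizing on TMPDIR would remove.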

Proposal for standardization (cont.)

Minimal set of environment variables (not related to middleware). The naming convention must be independent of any grid middleware name, to ease portability of grid jobs.

- GRID_WORKDIR: job-specific working directory (file permissions 700)
  e.g. /scratch/atlas0011293.ccwl0092
- GRID_SITENAME: site on which the job runs (same as siteName in the Information System)
  e.g. IN2P3-CC
- GRID_HOSTNAME: full host name; useful to know the WN hostname for problem tracking (and parallel jobs?) [not strictly necessary, but may be convenient for users]
  e.g. ccwl0006.in2p3.fr
- GRID_CEID: CE name on which the job runs (same as GlueCEUniqueID in the Information System)
  e.g. heplnx201.pp.rl.ac.uk:2119/jobmanager-torque-short
- GRID_LOCAL_JOBID: useful for problem tracking (and parallel jobs?)
  e.g. lcg0509104420-07243
- GRID_GLOBAL_JOBID: same as EDG_WL_JOBID for LCG
  e.g. https://lxn1188.cern.ch:9000/HPMN2WVHurMlji-Fnqba0A
- GRID_USERID: DN of the user's certificate (already set on some sites)
  e.g. /O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=David Bouvet/Email=dbouvet@in2p3.fr

Proposal for standardization (cont.)

Site-specific: TMPDIR, HOME, GRID_WORKDIR, GRID_LOCAL_JOBID, GRID_HOSTNAME, GRID_SITENAME, GRID_USERID
Grid-specific: GRID_CEID, GRID_GLOBAL_JOBID

- A minimal modification to the RB is possible but not required.
- Modifications to the job managers are required => we need to identify the people responsible for maintaining them.
- Sites need some way to configure the job managers to specify the site-dependent parameters, as in the sketch below.
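One hypothetical way a job manager could inject the site-specific values is to prepend an export preamble to the user's job script. A Python sketch (the configuration names, the scratch layout, and the jobmanager integration are assumptions, not an actual jobmanager patch):

    import os

    # Hypothetical site configuration read by the job manager.
    SITE = {"GRID_SITENAME": "IN2P3-CC", "TMPDIR": "/scratch"}

    def wrapper_preamble(local_jobid):
        """Build the export lines prepended to the user's job script."""
        # Job-private working directory; would be created with mode 700.
        workdir = "%s/%s" % (SITE["TMPDIR"], local_jobid)
        env = dict(SITE,
                   GRID_LOCAL_JOBID=local_jobid,
                   GRID_WORKDIR=workdir,
                   GRID_HOSTNAME=os.uname()[1])  # Unix only
        return "\n".join('export %s="%s"' % kv for kv in sorted(env.items()))

    # Job ID format taken from the earlier examples.
    print(wrapper_preamble("lcg0509104420-07243"))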

Proposal for standardization (cont.)

Once a set of variables and a naming convention have been agreed upon, this standard should be implemented on all LCG/EGEE CEs.

A (short) document is being written, describing:
- the meaning of the variables
- the compulsory/optional character of each of them

The document will be distributed and your feedback will be welcome.

Questions/Comments

Please give us your feedback.