IN2P3 Status Report
HTASC, March 2003
Fabio HERNANDEZ et al. (CC-IN2P3), François ETIENNE
Outline
- User community
- Update on computing services
- Update on storage services
- Network status
- Grid status
IN2P3 current context
- 18 labs, 1 Computer Centre, 2500 users, 40 experiments
- CC-IN2P3 - CERN connection: 155 Mb/s (2000), 1 Gb/s (2003)
- CC-IN2P3 - SLAC connection: 30 Mb/s
RENATER current context
- Deployed in October
- Topology is more a grid than a star
- Most links are 2.4 Gbps
- Still 2 main nodes: Paris and Lyon
User community
- Experiments: LHC (ATLAS, CMS, ALICE, LHCb), BaBar (SLAC), D0 (FNAL), PHENIX (Brookhaven), astrophysics (17 experiments: EROS, SuperNovae, Auger, Virgo, ...)
- 2500 users from different countries
- Tier A for BaBar: 20% of the CPU power was consumed by non-French users in 2002
- Starting to provide services to biologists at a local/regional level (4 teams and ~3% of CPU over the last 6 months, WP10 EDG, Heaven cluster)
- User community steadily growing
Experiments CPU request (UI; 1 UI ~ 5 SI-95)
- Experiments: Aleph, Alice, AMS, Antares, Archeops, Atlas, Auger, BaBar, CLAS, CMB, CMS, D0, Delphi, Edelweiss, EROS, EUSO, GLAST, H, HESS, INDRA, LHCb, NA, NA, Nemo, Ngs-Opera, Phenix, Planck-S, Siren, SNovae, STAR, Tesla, Thémis, Virgo, WA, plus BIOLOGY (several teams)
- Total CPU request for the experiments above: ~300 Mh SI-95
Computing Services
- Supported platforms: Linux, SunOS, AIX; dropped support for HP-UX
- Currently migrating to RedHat Linux 7.2 and SunOS 5.8; waiting for the remaining users and EDG to drop support for RH 6.2
- More CPU power added over the last six months:
  - 72 dual-processor Intel Pentium 1.4 GHz, 2 GB RAM, 120 GB disk (November)
  - 192 dual-processor Intel Pentium 2.4 GHz, 2 GB RAM (February)
- Today's computing capacity (batch + interactive): Linux 920 CPUs, SunOS 62 CPUs, AIX 70 CPUs; more than 1000 CPUs in total
- Worker-node storage capacity is used for temporary data (reset after job execution)
Storage Services
- Extensive use of AFS for user and group files
- HPSS and a staging system for physics data
- Mix of several platforms/protocols: SunOS, AIX, Tru64; SCSI, FibreChannel; AFS, NFS, RFIO
- Shared disk capacity (IBM, Hitachi, Sun): ~50 TB
- AFS: user home directories, code, programs and some experimental data
- Xtage: temporary disk system for data staged from tape
Storage Services (cont.)
- Mass storage (HPSS): 250 TB now, 500 TB expected in December 2003
- Installed capacity on tape: 700 TB; throughput up to 8.8 TB/day
- Originally purchased for BaBar but now used by most experiments: BaBar Objectivity 130 TB plus 25 TB disk cache; other experiments 120 TB plus 4.4 TB
- STK 9840 drives (20 GB tapes, fast mount) and STK 9940 drives (200 GB tapes, slower mount, higher I/O)
- Accessed through RFIO, mainly rfcp; supports files larger than 2 GB
- Direct HPSS access from the network through bbFTP
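For illustration, a minimal sketch of the rfcp-based access path described above: a job stages its input out of HPSS onto local worker-node scratch space before processing it. This is a sketch under assumptions; the HPSS path, scratch location and experiment name are hypothetical, not the actual CC-IN2P3 layout.

```python
# Minimal sketch: staging a file from HPSS via rfcp (RFIO copy) before a job
# processes it locally. Paths and experiment names are hypothetical.
import subprocess
import sys

HPSS_FILE = "/hpss/in2p3.fr/group/someexpt/run1234/data.root"  # hypothetical HPSS path
SCRATCH = "/scratch/data.root"                                  # worker-node temporary space

def rfcp(src: str, dst: str) -> None:
    """Copy a file with rfcp; raises CalledProcessError on failure."""
    subprocess.run(["rfcp", src, dst], check=True)

if __name__ == "__main__":
    try:
        rfcp(HPSS_FILE, SCRATCH)   # stage the input out of HPSS
    except subprocess.CalledProcessError as err:
        sys.exit(f"staging failed: {err}")
    # ... process SCRATCH locally, then copy any output back with another rfcp call
```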
Storage Services (cont.)
- Semi-permanent storage:
  - Suited for small files (which degrade HPSS performance)
  - Accessed with NFS or the RFIO API
  - Back-up possible for experiments for which CC-IN2P3 is the "base site" (Auger, Antares)
  - Working on transparent RFIO access
- Back-up and archive: TSM (Tivoli Storage Manager)
  - For home directories, critical experimental data, HPSS metadata, Oracle data
  - TSM also allows data archival (Elliot), used to back up external data (e.g. IN2P3 administrative data, data from biology labs, etc.)
Storage Services (cont.)
Disks:
- AFS: 4 TB
- HPSS cache: 4.4 TB
- Objectivity: 25 TB
- Oracle: 0.4 TB
- Xstage: 1.2 TB
- Semi-permanent: 1.9 TB
- TSM: 0.3 TB
- Local: 10 TB
Tapes:
- 1 STK robot, 6 silos
- 12 9940B drives, 200 GB/tape (7 HPSS, 3 TSM, 2 others)
- 35 9840 drives, 20 GB/tape (28 HPSS, 4 TSM, 3 others)
- 8 IBM 3490 drives, 0.8 GB/tape (service will stop by end 2003)
- 1 DLT robot, 400 slots, 6 DLT 7000 drives
Network
- International connectivity through:
  - RENATER + GEANT to the US (600 Mbps via ESNET and ABILENE in New York) and to Europe
  - CERN to the US as an alternate path (600 Mbps)
- BaBar is using both links to the US to transfer data between SLAC and Lyon
- Specific software developed for "filling the pipe" (bbFTP), extensively used by BaBar and D0, amongst others (see the sketch after this list)
- Dedicated 1 Gb/s link between Lyon and CERN since January 2003
- LAN is a mixture of Fast Ethernet and Gigabit Ethernet; ubiquitous wireless service
- Connectivity to the other IN2P3 laboratories across the country via RENATER-3 (the French academic and research network, 2.4 Gbps links); all labs have a private connection to RENATER POPs
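As an illustration of how bbFTP "fills the pipe", the sketch below drives a multi-stream transfer from Python. It is a sketch under assumptions: the host, paths and account are hypothetical, and the bbftp options used (-u, -p, -e) should be checked against the installed bbftp version.

```python
# Minimal sketch of a bbFTP transfer driven from Python.
# Host, paths and account are hypothetical; verify the bbftp options against
# the locally installed bbftp version before use.
import subprocess

def bbftp_get(remote_host: str, remote_path: str, local_path: str,
              user: str, streams: int = 10) -> None:
    """Fetch one file over several parallel TCP streams to fill a long fat pipe."""
    cmd = [
        "bbftp",
        "-u", user,                               # remote account (assumed option)
        "-p", str(streams),                       # number of parallel streams (assumed option)
        "-e", f"get {remote_path} {local_path}",  # transfer command (assumed option)
        remote_host,
    ]
    subprocess.run(cmd, check=True)

# Hypothetical usage:
# bbftp_get("bbftp.slac.stanford.edu", "/babar/runs/file.root",
#           "/scratch/file.root", user="babarprod", streams=10)
```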
Grid-related activities
- Fully involved in the DataGRID project and partly in DataTAG (INRIA); one of the 5 major testbed sites
- Currently the whole "conventional" production environment is accessible through the grid interface
- Jobs submitted through the grid are managed by BQS, the home-grown batch management system, and can use the same pool of resources as normal jobs (~1000 CPUs); a sketch of this submission path follows this list
- Access to mass storage (HPSS) from remote sites is enabled through bbFTP
- Benefits:
  - DataGRID software is tested in a production environment and scalability tests can be performed
  - Users get exactly the same working environment and data whatever interface they choose to access our facility
  - Operational issues are detected early
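A minimal sketch of the submission path mentioned above, assuming the standard EDG user interface: the user describes the job in JDL and submits it with edg-job-submit; at CC-IN2P3 the computing element then hands the job over to BQS. The executable, arguments and sandbox file names are hypothetical.

```python
# Minimal sketch of submitting a job through the EDG grid interface, which at
# CC-IN2P3 ends up in BQS alongside conventional batch jobs. JDL contents and
# file names are hypothetical examples.
import subprocess
import textwrap

JDL = textwrap.dedent("""\
    Executable    = "myanalysis.sh";
    Arguments     = "run1234";
    StdOutput     = "stdout.log";
    StdError      = "stderr.log";
    InputSandbox  = {"myanalysis.sh"};
    OutputSandbox = {"stdout.log", "stderr.log"};
""")

with open("job.jdl", "w") as f:
    f.write(JDL)

# Submit through the EDG user interface; the resource broker dispatches the job
# to a computing element, whose gatekeeper hands it over to the local batch
# system (BQS at CC-IN2P3).
subprocess.run(["edg-job-submit", "job.jdl"], check=True)
```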
Grid-related activities (cont.)
- Disadvantages:
  - Local resources are needed to integrate the production environment (AFS, BQS, ...); more work is needed to achieve a seamless integration between the local and grid worlds
  - Users want us to provide a grid service: how do we provide a service around a "moving target" software project?
- Some experiments are already using the grid interface for "semi-production"; others have expressed interest in using it as soon as it gets more stable
- Starting from March 2003, the resource broker and associated services for the DataGRID Applications and Development testbeds will be hosted and operated in Lyon
Grid-related activities (cont.)
- Involved in several other grid projects at regional and national levels
- Cooperation agreement signed with IBM to work on grid technology:
  - Exchange of experience
  - Grid technology evaluation
  - Experiments with this technology in a production environment
  - Exploration of technologies for storage virtualization
  - ...
CNRS - IN2P3
Coordination of: WP6 Integration Testbed, WP7 Networking, WP10 Bioinformatics
- IPSL: Earth observation (Paris)
- BBE: Bioinformatics (Lyon)
- CREATIS: Imaging and signal processing (Lyon)
- RESAM: High-speed networking (Lyon)
- LIP: Parallel computing (Lyon)
- IBCP: Bioinformatics (Lyon)
- UREC: Networking (Paris, Grenoble)
- LIMOS: Bioinformatics (Clermont-Ferrand)
- LBP: Bioinformatics (Clermont-Ferrand)
- LPC: IN2P3 (Clermont-Ferrand)
- LAL: IN2P3 (Paris)
- Subatech: IN2P3 (Nantes)
- LLR-X: IN2P3 (Paris)
- ISN: IN2P3 (Grenoble)
- CC-IN2P3: IN2P3 (Lyon)
- LPNHE: IN2P3 (Paris)
- CPPM: IN2P3 (Marseille)
- LAPP: IN2P3 (Annecy)