INFN Tier1 Status report Spring HEPiX 2005 Andrea Chierici – INFN CNAF.

Introduction
Location: INFN-CNAF, Bologna (Italy)
– one of the main nodes of the GARR network
Computing facility for the INFN HENP community
– Participating in the LCG, EGEE and INFNGRID projects
Multi-experiment Tier1
– LHC experiments
– VIRGO
– CDF
– BABAR
– AMS, MAGIC, ARGO, ...
Resources assigned to experiments on a yearly plan.

Services
– Computing servers (CPU farms)
– Access to on-line data (disks)
– Mass storage/tapes
– Broad-band network access
– System administration
– Database administration
– Experiment-specific library software
– Coordination with Tier0, other Tier1s and Tier2s

Infrastructure
Hall in the basement (-2nd floor): ~1000 m² of total space
– Easily accessible by lorries from the road
– Not suitable for office use (remote control only)
Electric power
– 220 V single-phase (computers): 4 x 16 A PDUs needed per rack of 3.0 GHz Xeons
– 380 V three-phase for other devices (tape libraries, air conditioning, etc.)
– UPS: 800 kVA (~640 kW), needs a separate room (conditioned and ventilated)
– Electric generator: 1250 kVA (~1000 kW) → up to 160 racks (~100 with 3.0 GHz Xeons)
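A small sketch of the power-budget arithmetic behind the rack counts above. The ~0.8 power factor is an assumption inferred from the slide's own kVA → kW figures; the per-rack draw values are implied by the quoted rack counts, not stated explicitly on the slide.

```python
# Power-budget sketch: kVA ratings vs. usable kW and implied rack capacity.
POWER_FACTOR = 0.8  # assumption, consistent with 800 kVA ~ 640 kW on the slide

ups_kva, generator_kva = 800, 1250
ups_kw = ups_kva * POWER_FACTOR              # ~640 kW, matches the slide
generator_kw = generator_kva * POWER_FACTOR  # ~1000 kW, matches the slide
print(f"UPS: {ups_kva} kVA -> ~{ups_kw:.0f} kW; generator: {generator_kva} kVA -> ~{generator_kw:.0f} kW")

# Implied per-rack draw for the two rack counts quoted on the slide.
for racks in (160, 100):
    print(f"{racks} racks -> {generator_kw / racks:.1f} kW per rack")
# 160 racks -> 6.2 kW per rack (older CPUs)
# 100 racks -> 10.0 kW per rack (3.0 GHz Xeon racks)
```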

HW Resources
CPU:
– 320 old biprocessor boxes, … GHz
– 350 new biprocessor boxes, 3 GHz (+70 servers, +55 BABAR, +48 CDF, +30 LHCb)
– 1300 kSI2k total
Disk:
– FC, IDE, SCSI, NAS
Tapes:
– StK L… library → … TB
– StK LTO-2 with 2000 tapes → 400 TB
– 9940B with 800 tapes → 200 TB
Networking:
– 30 rack switches → 46 FE UTP + 2 GE FO each
– 2 core switches → 96 GE FO + … GE FO + 4x10 GE
– 2x1 Gbps links to WAN

Networking
– GARR-G backbone with 2.5 Gbps F/O, to be upgraded to dark fiber (Q3 2005)
– INFN Tier1 access is now 1 Gbps (+1 Gbps for SC) and will be 10 Gbps soon (September 2005)
  – The GigaPoP is co-located with INFN-CNAF
– International connectivity via GEANT: 10 Gbps access in Milan already in place

CNAF Network Setup (diagram): the GARR Italian research network is reached through the ER16 WAN access router over a 1 Gb/s production link, plus a 1 Gb/s back-door link; a dedicated 1 Gb/s path through a Summit 400 serves the Service Challenge; the internal farm and service switches (FarmSW1/2/3/4, FarmSWG1/G3, LHCBSW1, ServSW2, Catalyst 3550s and the SW-04-xx/SW-05-xx rack switches) are aggregated onto the core (BD, SSR 8860) with n x 10 Gb/s uplinks, together with the CNAF internal services.

Tier1 LAN
– Each CPU rack (36 WNs) is equipped with an FE switch with 2 x Gb uplinks to the core switch
– Disk servers are connected via GE to the core switch
– Foreseen upgrade to Gb rack switches
  – 10 Gb core switch already installed

LAN model layout (diagram): farm racks, NAS, SAN and the STK tape library attached to the core network.

Networking resources
– 30 switches (14 of them 10 Gb ready)
– 3 core switches/routers (SSR8600, ER16, BD)
  – The SSR8600 is also the WAN access router, with firewalling functions
  – A new Black Diamond is already installed with 120 GE and 12x10GE ports (it can scale up to 480 GE or 48x10GE)
  – A new access router (with 4x10GE and 4xGE interfaces) will replace the SSR8600 for WAN access (the ER16 and the Black Diamond will aggregate all of the Tier1's resources)
– 3 L2/L3 switches (with 48xGE and 2x10GE) to be used during the Service Challenge

Farming
Team composition
– 2 FTE for the general-purpose farm
– 1 FTE for hosted farms (CDF and BABAR)
– ~3 FTE is clearly not enough to manage ~800 WNs → more people needed
Tasks
– Installation & management of Tier1 WNs and servers, using Quattor (still some legacy LCFGng nodes around)
– Deployment & configuration of the OS & LCG middleware
– HW maintenance management
– Management of the batch scheduler (LSF, Torque)
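As a minimal illustration of how work reaches the batch scheduler named above, the sketch below wraps the standard LSF `bsub` client from Python. The queue name, job name and command are made-up placeholders, not taken from the slides; LSF client tools are assumed to be installed on the submitting host.

```python
import subprocess

def submit_lsf_job(queue: str, job_name: str, command: str) -> str:
    """Submit a job to LSF via bsub and return the scheduler's reply line.

    Queue and job names here are illustrative placeholders.
    """
    result = subprocess.run(
        ["bsub",
         "-q", queue,                    # target queue
         "-J", job_name,                 # job name
         "-o", f"{job_name}.%J.out",     # stdout file, %J expands to the job id
         command],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()  # e.g. "Job <1234> is submitted to queue <babar>."

if __name__ == "__main__":
    print(submit_lsf_job("babar", "test-job", "/usr/bin/env hostname"))
```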

Access to the batch system (diagram): "legacy" non-Grid access goes straight to LSF and the WNs (WN1 ... WNn); Grid access enters through the UI and the CE, with the SE providing data access.

Farming: where we were last year
Computing resources not fully used
– The batch system (Torque + Maui) turned out not to be scalable
– The current policy assignment did not allow optimal use of resources
Interoperability issues
– Full experiment integration still to be achieved
– Difficult to deal with 3 different farms

Farming: evolution (1)
Migration of the whole farm to SLC (CERN version!) almost complete
– Quattor deployment successful (more than 500 WNs)
– Standard configuration of WNs for all experiments
– To be completed in a couple of weeks
Migration from Torque + Maui to LSF (v6.0)
– LSF farm running successfully
– Process to be completed together with the OS migration
– Fair-share model for resource access
– Progressive inclusion of BABAR & CDF WNs into the general farm

Farming: evolution (2)
LCG upgrade to release …
– Installation via Quattor
  – Dropped LCFGng
  – Different packaging from YAIM
  – Deployed the upgrade to 500 nodes in one day
– Still some problems with VOMS integration to be investigated
– 1 legacy LCG CE still running
Access to resources centrally managed with Kerberos (authentication) and LDAP (authorization)

Authorization with LDAP (diagram): a directory tree rooted at c=it, o=infn, ou=cnaf, with branches such as ou=cr (containing ou=grid, ou=ui, ou=wn, ou=private, ou=public), ou=afs (AFS: infn.it) and ou=local, mapping generic CNAF users, grid users (pool accounts on the worker nodes), AFS infn.it users, local users and external access (via a bastion host) onto the computing resources (CR.WorkerNode, CR.UserInterface).
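A minimal sketch of how a service could look up authorization data in a tree like the one above, using the Python ldap3 library. The server name, base DN and attribute names are illustrative assumptions modelled on the diagram, not the actual CNAF schema.

```python
from ldap3 import Server, Connection, ALL

# Hypothetical endpoint and base DN, mirroring the tree in the diagram.
LDAP_URI = "ldap://ldap.cnaf.example"           # assumption, not the real host
BASE_DN = "ou=grid,ou=cr,ou=cnaf,o=infn,c=it"   # assumption, mirrors the diagram

def lookup_user(uid: str) -> list[str]:
    """Return the directory entries matching a pool-account uid (as JSON strings)."""
    server = Server(LDAP_URI, get_info=ALL)
    conn = Connection(server, auto_bind=True)    # anonymous bind, for the sketch only
    conn.search(BASE_DN, f"(uid={uid})",
                attributes=["cn", "uidNumber", "gidNumber"])
    entries = [entry.entry_to_json() for entry in conn.entries]
    conn.unbind()
    return entries

if __name__ == "__main__":
    print(lookup_user("griduser001"))  # placeholder pool-account name
```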

Some numbers (LSF bqueues output)
QUEUE_NAME   PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN  SUSP
alice          40  Open:Active   …
cdf            40  Open:Active   …
dteam          40  Open:Active   …
atlas          40  Open:Active   …
cms            40  Open:Active   …
lhcb           40  Open:Active   …
babar_test     40  Open:Active   …
babar          40  Open:Active   …
virgo          40  Open:Active   …
argo           40  Open:Active   …
magic          40  Open:Active   …
ams            40  Open:Active   …
infngrid       40  Open:Active   …
guest          40  Open:Active   …
test           40  Open:Active   …
… jobs running!
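A small sketch of how the summary above could be reproduced by parsing the output of the standard LSF `bqueues` command; it relies only on the default column layout shown on the slide.

```python
import subprocess

def running_jobs_per_queue() -> dict[str, int]:
    """Parse default `bqueues` output and return the RUN count per queue.

    Expects the standard header shown on the slide:
    QUEUE_NAME PRIO STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SUSP
    """
    out = subprocess.run(["bqueues"], capture_output=True, text=True, check=True).stdout
    lines = out.splitlines()
    run_col = lines[0].split().index("RUN")     # locate the RUN column in the header
    counts = {}
    for line in lines[1:]:
        fields = line.split()
        if fields:
            counts[fields[0]] = int(fields[run_col])
    return counts

if __name__ == "__main__":
    per_queue = running_jobs_per_queue()
    print(per_queue)
    print("total running:", sum(per_queue.values()))
```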

Storage & Database
Team composition
– ~1.5 FTE for general storage
– ~1 FTE for CASTOR
– ~1 FTE for DBs
Tasks
– DISK (SAN, NAS): HW/SW installation and maintenance, remote (grid SE) and local (rfiod/NFS/GPFS) access services, clustered/parallel filesystem tests, participation in the SC
  – 2 SAN systems (~225 TB)
  – 4 NAS systems (~60 TB)
– CASTOR HSM system: HW/SW installation and maintenance, GridFTP and SRM access services
  – STK library with 6 LTO-2 and 2 x 9940B drives (+4 to install)
  – 1200 LTO-2 (200 GB) tapes
  – … 9940B (200 GB) tapes
– DB (Oracle for CASTOR & RLS tests, Tier1 "global" hardware DB)

Storage setup
Physical access to the main storage (IBM FAStT900) via SAN
– Level-1 disk servers connected via FC, usually also in a GPFS cluster
  – Ease of administration
  – Load balancing and redundancy
  – Lustre under evaluation
– Level-2 disk servers can be connected to the storage via GPFS only
  – LCG and FC dependencies on the OS are decoupled
WNs are not members of the GPFS cluster (no scalability to a large number of WNs)
– Storage is available to the WNs via rfio, xrootd (BABAR) or NFS (a few cases, see next slide)
– NFS is used mainly to share experiment software on the WNs, but it is not suitable for data access
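To make the rfio access path concrete, here is a minimal sketch (in Python, wrapping the standard CASTOR/RFIO client commands rfcp and rfdir) of how a worker node could pull a file from a disk server; the host name and paths are invented placeholders.

```python
import subprocess

# Placeholder names: neither the disk server nor the paths come from the slides.
DISK_SERVER = "diskserv-01.cnaf.example"
REMOTE_PATH = "/storage/cms/dataset/file001.root"

def rfio_fetch(local_path: str) -> None:
    """Copy a remote file to the local WN scratch area using rfcp (RFIO copy)."""
    subprocess.run(["rfcp", f"{DISK_SERVER}:{REMOTE_PATH}", local_path], check=True)

def rfio_listdir(directory: str) -> str:
    """List a remote directory through rfdir, the RFIO equivalent of ls -l."""
    out = subprocess.run(["rfdir", f"{DISK_SERVER}:{directory}"],
                         capture_output=True, text=True, check=True)
    return out.stdout

if __name__ == "__main__":
    rfio_fetch("/tmp/file001.root")
    print(rfio_listdir("/storage/cms/dataset"))
```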

NFS stress test
Setup (diagram): an NFS server (NAS, reached over fibre channel) serving WN clients scheduled through LSF
Tests: 1) Connectathon (NFS/RPC test), 2) IOzone (NFS I/O test)
Parameters: client kernel, server kernel, rsize, wsize, protocol, test execution time, disk write I/O, I/O wait, server threads
Results
– Problems with the client kernel: 2.6 is better than 2.4 (even with the sdr patch)
– The UDP protocol performs better than TCP
– Best rsize is …, best wsize is 8192
– The NFS protocol may scale beyond 200 clients without aggregate performance degradation
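The sketch below is a simplified Python stand-in for the IOzone-style measurement: time a sequential write and read of one large file on an NFS mount. The mount point and file size are illustrative assumptions, not the test configuration actually used.

```python
import os, time

MOUNT_POINT = "/nfs/testarea"   # assumed NFS mount (e.g. mounted with rsize/wsize=8192)
FILE_SIZE_MB = 1024
BLOCK = b"\0" * (1024 * 1024)   # 1 MB block

def timed_io() -> None:
    path = os.path.join(MOUNT_POINT, "stress_test.dat")

    t0 = time.time()
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE_MB):
            f.write(BLOCK)
        f.flush()
        os.fsync(f.fileno())    # make sure the data actually reaches the server
    write_rate = FILE_SIZE_MB / (time.time() - t0)

    t0 = time.time()
    with open(path, "rb") as f:
        while f.read(1024 * 1024):
            pass
    # Note: reads may be served partly from the client page cache unless the
    # file is larger than the client's RAM.
    read_rate = FILE_SIZE_MB / (time.time() - t0)

    os.remove(path)
    print(f"write: {write_rate:.1f} MB/s, read: {read_rate:.1f} MB/s")

if __name__ == "__main__":
    timed_io()
```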

Disk storage tests
– Data processing for the LHC experiments at Tier1 facilities requires access to petabytes of data from thousands of nodes simultaneously, at high rate
– None of the old-fashioned storage systems can handle these requirements
– We are in the process of defining the required hardware and software components of the system
  – It is emerging that a Storage Area Network approach, with a parallel file system on top, can do the job
V. Vagnoni – LHCb Bologna

Hardware testbed
Disk storage
– 3 controllers of an IBM DS4500
– Each controller serves 2 RAID5 arrays of 4 TB each (17 x 250 GB disks + 1 hot spare)
– Each RAID5 array is further subdivided into two LUNs of 2 TB each
– 12 LUNs and 24 TB of disk space in total (102 x 250 GB disks + 8 hot spares)
File system servers
– 6 IBM xSeries 346, dual Xeon, 2 GB RAM, Gigabit NIC
– A QLogic fibre channel PCI card on each server, connected to the DS4500 via a Brocade switch
– 6 Gb/s of available bandwidth to/from the clients
Clients
– 36 dual Xeon, 2 GB RAM, Gigabit NIC
V. Vagnoni – LHCb Bologna

Parallel File Systems
We evaluated the two mainstream products on the market
– GPFS (version 2.3) by IBM
– Lustre (version 1.4.1) by CFS
Both come with advanced management and configuration features, SAN-oriented failover mechanisms and data recovery
– But they can also be used on top of standard disks and arrays
Using GPFS and Lustre, the 12 DS4500 LUNs in use were aggregated into one 24 TB file system on the servers and mounted by the clients over the Gigabit network
– Both file systems are 100% POSIX-compliant on the client side
– The file systems appear to the clients as ordinary local mount points
V. Vagnoni – LHCb Bologna

Performance (1)
A home-made benchmarking tool oriented to HEP applications has been written
– It allows simultaneous sequential reads/writes from an arbitrary number of clients and processes per client
(Plot: raw Ethernet throughput vs time, 20 x 1 GB files read simultaneously)
V. Vagnoni – LHCb Bologna
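The slides do not show the tool itself; the sketch below is a minimal Python stand-in for the same idea: several processes reading large files sequentially and reporting the aggregate rate. The file paths, count and block size are placeholders.

```python
import time
from multiprocessing import Pool

# Minimal stand-in for the home-made benchmark: N processes each read one
# large file sequentially; paths below are placeholders.
FILES = [f"/gpfs/bench/file{i:02d}.dat" for i in range(20)]   # e.g. 20 x 1 GB files
BLOCK_SIZE = 1024 * 1024                                      # 1 MB sequential reads

def read_file(path: str) -> float:
    """Sequentially read one file; return the MB read (for the aggregate rate)."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(BLOCK_SIZE)
            if not chunk:
                return total / 1e6
            total += len(chunk)

if __name__ == "__main__":
    start = time.time()
    with Pool(processes=len(FILES)) as pool:
        megabytes = sum(pool.map(read_file, FILES))
    elapsed = time.time() - start
    print(f"aggregate: {megabytes / elapsed:.1f} MB/s over {len(FILES)} readers")
```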

Performance (2)
(Plot: net throughput in Gb/s vs number of simultaneous reads/writes, for 1 GB files, each client on a different file)
V. Vagnoni – LHCb Bologna

CASTOR issues
At present: STK library with 6 x LTO-2 and 2 x 9940B drives
– 2000 x 200 GB LTO-2 tapes → 400 TB
– 800 x 200 GB 9940B tapes → 160 TB (free)
– Tender for an upgrade with 2500 x 200 GB tapes (500 TB)
In general, CASTOR performance (as with other HSM software) increases with clever pre-staging of files (ideally ~90%)
LTO-2 drives are not usable in a real production environment with the present CASTOR release
– Hangs on locate/fskip occur on every non-sequential read operation, plus checksum errors and unterminated tapes (marked RDONLY) every 50-100 GB of data written (STK assistance is also needed)
– Usable only with a mean file size of 20 MB or more
– Good reliability on optimized (sequential or pre-staged) operations
– Fixes with CASTOR v2 (Q2 2005)?
CERN and PIC NEVER reported HW problems with 9940B drives during last year's data challenges.

Service challenge (1)
WAN
– Dedicated 1 Gb/s link connected via GARR + GEANT
– 10 Gb/s link available in September '05
LAN
– Extreme Summit 400 (48xGE + 2x10GE) dedicated to the Service Challenge
(Diagram: 11 servers with internal HDs behind the Summit 400, connected towards CERN through the GARR Italian research network; 10 Gb to CERN in September)

Service challenge (2)
11 Sun Fire V20 dual Opteron (2.2 GHz)
– 2 x 73 GB U320 SCSI HDs
– 2 x Gbit Ethernet interfaces
– 2 x PCI-X slots
OS: SLC 3.0.4 (arch x86_64), kernel version …
Tests: bonnie++/IOzone on local disks → ~60 MB/s read and write
Tests: Netperf/Iperf on the LAN → ~950 Mb/s
Globus (GridFTP) v2.4.3 installed on all cluster nodes
CASTOR SRM v1.2.12, CASTOR stager v…
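To show how such a GridFTP endpoint is typically exercised, here is a small Python wrapper around the standard globus-url-copy client; the source and destination URLs and the tuning values are illustrative assumptions, and a valid grid proxy is assumed to be in place.

```python
import subprocess

# Illustrative endpoints only; a valid grid proxy (grid-proxy-init) is assumed.
SRC = "file:///data/sc/testfile.dat"
DST = "gsiftp://sc-gridftp.cnaf.example:2811/castor/sc/testfile.dat"

def gridftp_copy(src: str, dst: str, streams: int = 4) -> None:
    """Copy one file with globus-url-copy, using parallel TCP streams."""
    subprocess.run(
        ["globus-url-copy",
         "-vb",                 # print transfer performance while copying
         "-p", str(streams),    # number of parallel data streams
         "-tcp-bs", "1048576",  # 1 MB TCP buffer, a common WAN tuning knob
         src, dst],
        check=True,
    )

if __name__ == "__main__":
    gridftp_copy(SRC, DST)
```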

SC2: Transfer cluster
10 machines used as GridFTP/SRM servers, 1 as the CASTOR stager/SRM repository
– Internal disks used (70 GB x 10 = 700 GB)
– For SC3, CASTOR tape servers with IBM LTO-2 or STK 9940B drives will also be used
Load balancing is implemented by assigning the IP addresses of all 10 servers to a single CNAME, served with a round-robin algorithm
SC2 goal reached: 100 MB/s disk-to-disk sustained for 2 weeks
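A short sketch of what DNS round-robin load balancing looks like from the client side: resolving the alias returns the whole address pool, and the "first" answer rotates over repeated lookups. The alias name is a hypothetical placeholder, and local resolver caching can pin the order in practice.

```python
import socket
from collections import Counter

# Hypothetical load-balanced alias; the real SC host name is not in the slides.
TRANSFER_ALIAS = "sc-gridftp.cnaf.example"

def resolve_pool(name: str) -> list[str]:
    """Return every A record currently published for the alias."""
    _, _, addresses = socket.gethostbyname_ex(name)
    return addresses

def sample_round_robin(name: str, attempts: int = 20) -> Counter:
    """Resolve the name repeatedly and count which address comes back first.

    With DNS round-robin the name server rotates the record order, so the
    first address (the one most clients connect to) spreads over the pool.
    """
    return Counter(resolve_pool(name)[0] for _ in range(attempts))

if __name__ == "__main__":
    print("pool:", resolve_pool(TRANSFER_ALIAS))
    print("first-answer distribution:", sample_round_robin(TRANSFER_ALIAS))
```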

Sample of production: CMS (1)
CMS activity at the T1: grand summary
– Data transfer: >1 TB per day T0 → T1 in 2004 via PhEDEx
– Local MC production: >10 Mevts for >40 physics datasets
– Grid activities: official production on LCG, analysis on DSTs via grid tools

Sample of production: CMS (2)

Sample of production: CMS (3)

Summary & Conclusions
During 2004 the INFN Tier1 was deeply involved in LHC activities
– Some experiments (e.g. BABAR, CDF) are already in their data-taking phase
Main issue is the shortage of human resources
– ~10 FTE required for the Tier1, i.e. nearly doubling the staff (for both HW and SW maintenance)