INFN-T1 site report — Andrea Chierici, on behalf of INFN-T1 staff — 28th October 2009

Overview
- Infrastructure
- Network
- Farming
- Storage

Infrastructure

INFN-T1 infrastructure: 2005 vs today
- Racks: 40 → 120
- Power source: university supply → directly from supplier (15 kV)
- Power transformers: 1 (~1 MVA) → 3 (~2.5 MVA)
- UPS: 1 diesel engine/UPS (~640 kVA) → 2 rotary UPS (~3400 kVA) + 1 diesel engine (~640 kVA)
- Chillers: 1 (~530 kVA) → 7 (~2740 kVA)

[Diagram: site electrical and cooling power layout — UPS capacity up to 3.8 MW; chiller figures of 1.4 MW and 1.2 MW appear in the drawing]

Mechanical and electrical surveillance

Network

INFN CNAF Tier1 network
[Diagram: network topology; recoverable details below]
- WAN access through GARR via a Cisco 7600 (2x10 Gb/s general purpose plus a dedicated 10 Gb/s LHC-OPN link); a Cisco Nexus 7000 also appears in the layout
- LHC-OPN carries T0-T1 (CERN) and T1-T1 (PIC, RAL, TRIUMF) traffic; backup T0-T1 paths at 10 Gb/s via CNAF-KIT, CNAF-IN2P3 and CNAF-SARA
- Other T1s (BNL, FNAL, TW-ASGC, NDGF) and the T2s are reached over the CNAF general-purpose network
- Extreme BlackDiamond 8810 core switch; worker nodes attached at 2x1 Gb/s through Extreme Summit 450 switches with 4x1 Gb/s uplinks; storage servers, disk servers and CASTOR stagers behind Extreme Summit 400 switches (2x10 Gb/s uplinks), with Fibre Channel storage devices on the SAN
- In case of network congestion: rack uplink upgrade from 4x1 Gb/s to 10 Gb/s or 2x10 Gb/s

Farming

New tender
1U twin solution with these specs:
- 2 Intel Nehalem CPUs
- 24 GB RAM
- 2x 320 GB SATA disks
- 2x 1 Gbps Ethernet
118 twin units, reaching … HEP-SPEC (measured on SLC4)
Delivery and installation foreseen within …

Computing resources
- Including the machines from the new tender, INFN-T1 computing power will reach … HEP-SPEC within 2009
- A further increase within January 2010 will bring us to … HEP-SPEC
- Within May 2010 we will reach … HEP-SPEC (as pledged to WLCG)
  - This will basically triple the current computing power

Resource usage per VO

KSI2K pledged vs used

New accounting system
- Grid, local and overall job visualization
- Tier1/Tier2 separation
- Several parameters monitored
  - avg and max RSS, avg and max Vmem added in the latest release
- KSI2K/HEP-SPEC accounting
- WNoD accounting
- Available at: …
- Feedback welcome to: …
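
The KSI2K/HEP-SPEC figures above come from benchmark-normalized CPU accounting. As a purely illustrative sketch (not the INFN-T1 accounting code; the record field names are assumptions), the normalization boils down to weighting each job's CPU time by the benchmark rating of the node it ran on:

```python
# Illustrative sketch of benchmark-normalized accounting (not the INFN-T1 code).
# Assumption: each job record carries raw CPU seconds plus the per-core
# HEP-SPEC06 (or KSI2K) rating of the worker node it executed on.

def normalized_cpu(jobs, unit="HEPSPEC06"):
    """Sum CPU time weighted by the benchmark power of the executing node."""
    total = 0.0
    for job in jobs:
        rating = job["hs06_per_core"] if unit == "HEPSPEC06" else job["ksi2k_per_core"]
        total += job["cpu_seconds"] * rating
    return total / 3600.0  # HS06-hours (or KSI2K-hours)

jobs = [
    {"cpu_seconds": 7200, "hs06_per_core": 8.0, "ksi2k_per_core": 2.0},
    {"cpu_seconds": 3600, "hs06_per_core": 10.5, "ksi2k_per_core": 2.6},
]
print(normalized_cpu(jobs))           # HS06-hours
print(normalized_cpu(jobs, "KSI2K"))  # KSI2K-hours
```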

New accounting: sample picture

GPU Computing (1)
We are investigating GPU computing
- NVIDIA Tesla C1060, used for porting software and performing comparison tests
- Meeting with Bill Dally (chief scientist and vice president of NVIDIA); agenda: …py?confId=266

GPU Computing (2)
Applications currently tested:
- Bioinformatics: CUDA-based paralog filtering in Expressed Sequence Tag clusters
- Physics: implementing a second-order electromagnetic particle-in-cell code on the CUDA architecture
- Physics: spin-glass Monte Carlo simulations
The first two apps showed more than a 10x increase in performance!
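
For context on what "porting to CUDA" involves, here is a minimal, hedged sketch of offloading a data-parallel loop to a GPU. It uses Python/Numba as a modern stand-in for brevity; the applications above were real CUDA ports and none of this code comes from them:

```python
# Minimal GPU offload sketch (illustrative only): explicit host->device copies,
# a kernel over a 1D grid of threads, then a device->host copy of the result.
import numpy as np
from numba import cuda

@cuda.jit
def saxpy(a, x, y, out):
    i = cuda.grid(1)          # global thread index
    if i < x.size:            # guard against the last, partially filled block
        out[i] = a * x[i] + y[i]

n = 1 << 20
x = np.random.rand(n).astype(np.float32)
y = np.random.rand(n).astype(np.float32)

d_x = cuda.to_device(x)                 # host -> device transfers
d_y = cuda.to_device(y)
d_out = cuda.device_array_like(x)

threads = 256
blocks = (n + threads - 1) // threads
saxpy[blocks, threads](np.float32(2.0), d_x, d_y, d_out)

out = d_out.copy_to_host()              # device -> host transfer
```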

GPU Computing (3)
- We plan to buy 2 more workstations in 2010, with 2 GPUs each
  - We are waiting for the Fermi architecture, foreseen for spring 2010
- We will continue the ongoing activities and will probably test some Monte Carlo simulations for SuperB
- We plan to test selection and shared usage of GPUs via the Grid

Storage

Tenders
Disk tender requested
- Baseline: 3.3 PB raw (~2.7 PB-N)
- 1st option: 2.35 PB raw (~1.9 PB-N)
- 2nd option: 2 PB raw (~1.6 PB-N)
- Options to be requested during Q2 and Q…
- New disk in production ~end of Q…
… tapes (~4 PB) acquired with the library tender
- 4.9 PB needed at the beginning of 2010
- 7.7 PB probably needed by mid-2010
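
The PB-N (net) figures quoted above are roughly 80% of the raw capacity. A quick plausibility check under the assumption of RAID-6 parity groups of 8+2 disks, which the slides do not actually state:

```python
# Rough raw -> net (PB-N) check. Assumption: RAID-6 in 8+2 groups; the actual
# RAID layout of the tendered storage is not given in the slides.
def net_capacity(raw_pb, data_disks=8, parity_disks=2):
    return raw_pb * data_disks / (data_disks + parity_disks)

for raw in (3.3, 2.35, 2.0):
    print(f"{raw} PB raw -> ~{net_capacity(raw):.2f} PB net")
# 3.3 -> 2.64, 2.35 -> 1.88, 2.0 -> 1.60, consistent with the ~2.7 / ~1.9 / ~1.6 PB-N quoted
```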

CASTOR
- To be upgraded to …
- SRM v2.2 end-points available
  - Supported protocols: rfio, gridftp
- Still cumbersome to manage
  - Requires frequent intervention in the Oracle DB
  - Lack of management tools
- CMS migrated to StoRM for D0T1

WLCG Storage Classes at INFN-T1 today
Storage classes offer different levels of storage quality (e.g. copy on disk and/or on tape)
- DnTm = n copies on disk and m copies on tape
Implementation of 3 storage classes needed for WLCG (but usable also by non-LHC experiments):
- Disk0-Tape1 (D0T1), "custodial nearline" — currently CASTOR
  - Data migrated to tape and deleted from disk when the staging area is full
  - Space managed by the system
  - Disk is only a temporary buffer
- Disk1-Tape0 (D1T0), "replica online" — currently GPFS/TSM + StoRM
  - Data kept on disk: no tape copy
  - Space managed by the VO
- Disk1-Tape1 (D1T1), "custodial online" — currently GPFS/TSM + StoRM
  - Data kept on disk AND one copy kept on tape
  - Space managed by the VO (i.e. if disk is full, the copy fails)

YAMSS: present status
Yet Another Mass Storage System
- Scripting and configuration layer to interface GPFS & TSM
- Can work driven by StoRM or stand-alone
  - Experiments not using the SRM model can work with it
GPFS-TSM (no StoRM) interface ready
- Full support for migrations and tape-ordered recalls
StoRM
- StoRM in production at INFN-T1 and in other centres around the world for "pure" disk access (i.e. no tape)
- Integration with YAMSS for migrations and tape-ordered recalls ongoing (almost completed)
Bulk migrations and recalls tested with a typical use case (stand-alone YAMSS, without StoRM)
- Weekly production workflow of the CMS experiment
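
"Tape-ordered recalls" means grouping pending recall requests per cartridge and sorting them by position on tape, so each tape is mounted once and read sequentially. A minimal sketch of that ordering step (illustrative only; how YAMSS actually obtains tape/position metadata from TSM is not shown in the slides, so the lookup is faked here):

```python
# Sketch of tape-ordered recall scheduling (illustrative; not the YAMSS scripts).
# Assumption: for each file we can obtain (tape_label, block_position) from the
# HSM metadata; here that lookup is replaced by a hard-coded dictionary.
from collections import defaultdict

def order_recalls(requests, locate):
    """Group recall requests per tape and sort each group by on-tape position."""
    per_tape = defaultdict(list)
    for path in requests:
        tape, pos = locate(path)
        per_tape[tape].append((pos, path))
    plan = {}
    for tape, entries in per_tape.items():
        plan[tape] = [path for _, path in sorted(entries)]  # sequential read order
    return plan

# Hypothetical metadata for a handful of files
metadata = {
    "/gpfs/cms/run1/a.root": ("T10K0042", 1200),
    "/gpfs/cms/run1/b.root": ("T10K0042", 35),
    "/gpfs/cms/run2/c.root": ("T10K0077", 510),
}
plan = order_recalls(metadata.keys(), lambda p: metadata[p])
for tape, files in plan.items():
    print(tape, files)  # one mount per tape, files in on-tape order
```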

Why GPFS & TSM
Tivoli Storage Manager (developed by IBM) is a tape-oriented storage manager widely used, also in the HEP world (e.g. at FZK)
- Built-in functionality is present in both products to implement backup and archiving from GPFS
The development of an HSM solution is based on the combination of features of GPFS (since v3.2) and TSM (since v5.5)
- Since GPFS v3.2 the new concept of "external storage pool" extends policy-driven Information Lifecycle Management (ILM) to tape storage
- External pools are real interfaces to external storage managers, e.g. HPSS or TSM
  - HPSS is very complex (no benefit in this sense compared to CASTOR)

YAMSS: hardware set-up
[Diagram: test set-up; recoverable details below]
- ~500 TB for GPFS on CX… storage, attached via 20x4 Gbps FC on the SAN
- GridFTP servers (4x2 Gbps) and 6 NSD servers (6x2 Gbps) on the LAN
- 3 HSM/STA nodes between SAN and TAN (8x4 Gbps and 3x4 Gbps FC paths)
- TSM server (with its DB) on 4 Gbps FC
- 8 T10KB tape drives: 1 TB per tape, 1 Gbps per drive

YAMSS: validation tests
- Concurrent access in r/w to the MSS, for transfers and from the farm
- StoRM not used in these tests
- 3 HSM nodes serving 8 T10KB drives
  - 6 drives (at maximum) used for recalls
  - 2 drives (at maximum) used for migrations
- Of the order of 1 GB/s of aggregated traffic
  - ~550 MB/s from tape to disk
  - ~100 MB/s from disk to tape
  - ~400 MB/s from disk to the computing nodes (not shown in the graph)

Questions?