INFN-T1 site report
Andrea Chierici
On behalf of INFN-T1 staff
HEPiX Fall 2013
Outline
Network
Farming
Storage
Common services
Network
WAN Connectivity
[Diagram: Cisco 7600 and Nexus routers connect the T1 resources to GARR Bo1 and, through it, to LHCOPN, LHCONE and General IP; peer sites include RAL, PIC, TRIUMF, BNL, FNAL, TW-ASGC, NDGF, IN2P3, SARA]
20 Gb physical link (2x10 Gb) shared for LHCOPN and LHCONE
10 Gb/s for General IP connectivity
Dedicated 10 Gb/s CNAF-FNAL link for CDF (Data Preservation)
Planned for Q3-Q4 2013: General IP 20 Gb/s, LHCOPN/ONE 40 Gb/s
Farming and Storage current connection model
[Diagram: INTERNET and LHCOPN reach the core routers (Cisco 7600, BD8810, Nexus 7018); disk servers and farming switches attach to the core at 10 Gb/s, with 2x10 Gb/s up to 4x10 Gb/s uplinks; old resources (2009-2010) connect at 4x1 Gb/s]
20 worker nodes per farming switch
Core switches and routers are fully redundant (power, CPU, fabrics)
Every switch is connected with load sharing on different port modules
Core switches and routers have a strict SLA (next solar day) for maintenance
Farming
Computing resources
195K HS06 – 17K job slots
2013 tender installed in summer
– AMD CPUs, 16 job slots
Upgraded whole farm to SL6
– Per-VO and per-node approach
– Some CEs upgraded and serving only some VOs
Older Nehalem nodes got a significant boost switching to SL6 (and activating hyper-threading too)
New CPU tender
2014 tender delayed until the beginning of 2014
– Will probably also cover 2015 needs
Taking into account TCO (energy consumption), not only the sales price
10 Gbit WN connectivity
– 5 MB/s per job (minimum) required
– A 1 Gbit link is not enough to handle the traffic generated by modern multi-core CPUs (see the rough estimate below)
– Network bonding is hard to configure
Blade servers are attractive
– Cheaper 10 Gbit network infrastructure
– Cooling optimization
– OPEX reduction
– BUT: higher street price
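A rough back-of-the-envelope check of the bandwidth claim (a sketch; the per-node job count is an assumption for a modern multi-core WN, not a figure from the slide):

```python
# Estimate the aggregate network demand of one worker node, assuming
# (hypothetically) 32 concurrent jobs each needing the quoted 5 MB/s minimum.
jobs_per_node = 32           # assumed job slots on a modern multi-core WN
mb_per_s_per_job = 5         # minimum per-job rate quoted in the slide

required_mb_s = jobs_per_node * mb_per_s_per_job    # 160 MB/s
required_gbit_s = required_mb_s * 8 / 1000          # ~1.3 Gbit/s

print(f"~{required_gbit_s:.1f} Gbit/s needed, vs 1 Gbit/s available per WN")
```

Already above a single 1 Gbit link even before any other node traffic is counted.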
Monitoring & Accounting (1)
Rewrote our local resource accounting and monitoring portal
Old system was completely home-made
– Monitoring and accounting were separate things
– Adding/removing queues on LSF meant editing lines in the monitoring system code
– Hard to maintain: >4000 lines of Perl code
Monitoring & Accounting (2)
New system: monitoring and accounting share the same database
Scalable and based on open source software (plus a few Python lines)
Graphite (http://graphite.readthedocs.org)
– Time-series oriented database (metrics can be fed in as sketched below)
– Django webapp to plot on-demand graphs
– lsfmonacct module released on GitHub
Automatic queue management
Monitoring & Accounting (3)
Monitoring & Accounting (4)
Issues
Grid accounting problems starting from April 2013
– Subtle bugs affecting the log parsing stage on the CEs (DGAS urcollector), causing it to skip data
WNODeS issue when upgrading to SL6
– Code maturity problems: addressed quickly
– Now ready for production
BaBar and CDF will be using it rather soon
Potentially the whole farm can be used with WNODeS
New activities
Investigation of Grid Engine as an alternative batch system is ongoing
Testing Zabbix as a platform for monitoring computing resources
– Possible alternative to Nagios + Lemon
Dynamic WN updates, mainly to deal with kernel/cvmfs/gpfs upgrades
Evaluating APEL as an alternative to DGAS for the grid accounting system
Storage
Storage Resources
Disk space: 15.3 PB-N (net) on-line
– 7 EMC2 CX3-80 + 1 EMC2 CX4-960 (~2 PB) + 100 servers (2x1 Gb/s connections)
– 7 DDN S2A 9950 + 1 DDN SFA 10K + 1 DDN SFA 12K (~11.3 PB) + ~80 servers (10 Gb/s)
– Installation of the latest system (DDN SFA 12K, 1.9 PB-N) was completed this summer; a ~1.8 PB-N expansion is foreseen before the Christmas break
– Aggregate bandwidth: 70 GB/s
Tape library SL8500: ~16 PB on-line with 20 T10KB drives and 13 T10KC drives (3 additional drives added during summer 2013)
– 8800 x 1 TB tapes, ~100 MB/s of bandwidth per drive
– 1200 x 5 TB tapes, ~200 MB/s of bandwidth per drive
– Drives are interconnected to the library and servers via a dedicated SAN (TAN); 13 Tivoli Storage Manager HSM nodes access the shared drives
– 1 Tivoli Storage Manager (TSM) server common to all GEMSS instances
– A tender for an additional 470 x 5 TB tapes is under way
All storage systems and disk servers are on SAN (4 Gb/s or 8 Gb/s)
Storage Configuration
All disk space is partitioned into ~10 GPFS clusters served by ~170 servers
– One cluster for each main (LHC) experiment
– GPFS deployed on the SAN implements a full HA system
– The system scales to tens of PBs and can serve thousands of concurrent processes with an aggregate bandwidth of tens of GB/s
GPFS coupled with TSM offers a complete HSM solution: GEMSS
Access to storage is granted through standard interfaces (POSIX, SRM, XRootD and soon WebDAV)
– File systems are directly mounted on the WNs (see the sketch below)
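As an illustration of the POSIX access path, a job on a worker node can read data straight from the locally mounted GPFS file system; this is a minimal sketch, and the mount point and file name are hypothetical:

```python
from pathlib import Path

# Hypothetical GPFS mount point and data file on a worker node (not from the slide).
GPFS_MOUNT = Path("/storage/gpfs_example")
input_file = GPFS_MOUNT / "lhc_experiment" / "run001" / "events.dat"

# Plain POSIX I/O: no SRM or XRootD client is needed when the file system
# is mounted directly on the WN.
with input_file.open("rb") as f:
    header = f.read(64)
print(f"Read {len(header)} bytes from {input_file}")
```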
Storage research activities
Studies on more flexible and user-friendly methods for accessing storage over the WAN
– Storage federation implementation
– Cloud-like approach
We developed an integration between the GEMSS storage system and XRootD to match the requirements of CMS and ALICE, using ad-hoc XRootD modifications (see the example below)
– The CMS modification was validated by the official XRootD integration build
– This integration is currently in production
Another approach to storage federations, based on HTTP/WebDAV (ATLAS use case), is under investigation
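For illustration, a client reads a file through an XRootD federation by pointing a standard XRootD client at a redirector, which resolves the site (e.g. the GEMSS-backed storage) that actually serves the file. This is a sketch; the redirector hostname and file path are made up:

```python
import subprocess

# Hypothetical federation redirector and logical file name (not from the slide).
redirector = "xrootd-redirector.example.org"
lfn = "/store/example/dataset/file.root"

# xrdcp is the standard XRootD copy client; the redirector forwards the
# request to a site that holds the file.
subprocess.run(
    ["xrdcp", f"root://{redirector}/{lfn}", "/tmp/file.root"],
    check=True,
)
```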
LTDP
Long Term Data Preservation (LTDP) for the CDF experiment
– The FNAL-CNAF data copy mechanism is completed
The copy of the data will follow this timetable:
– end 2013 - early 2014 → all data and MC user-level n-tuples (2.1 PB)
– mid 2014 → all raw data (1.9 PB) + databases
A bandwidth of 10 Gb/s is reserved on the transatlantic link CNAF ↔ FNAL (see the rough estimate below)
The "code preservation" issue still has to be addressed
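A rough feasibility check of that timetable (a sketch; it assumes the reserved link can be driven near its nominal rate, which real transfers rarely sustain):

```python
# Back-of-the-envelope transfer time for the full CDF copy over the reserved link.
data_pb = 2.1 + 1.9                  # user-level n-tuples + raw data, in PB
link_gbit_s = 10                     # reserved transatlantic bandwidth

data_bits = data_pb * 1e15 * 8       # PB -> bits (decimal units)
seconds = data_bits / (link_gbit_s * 1e9)
print(f"~{seconds / 86400:.0f} days at full line rate")   # roughly 37 days, ignoring overheads
```

Even at a fraction of the nominal rate, the copy fits within the quoted timetable.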
Common services
Installation and configuration tools
Currently Quattor is the tool used at INFN-T1
An investigation of an alternative installation and management tool was carried out by the storage group
Integration between two tools:
– Cobbler, for the installation phase (see the sketch below)
– Puppet, for server provisioning and management operations
The results of the investigation show Cobbler + Puppet to be a viable and valid alternative
– Currently used within CNAF OpenLAB
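As a small illustration of the Cobbler side, provisioning data can be queried over Cobbler's XML-RPC interface; this is a sketch under the assumption of a reachable read-only endpoint, and the hostname is made up:

```python
import xmlrpc.client

# Hypothetical Cobbler server; /cobbler_api is Cobbler's XML-RPC endpoint.
server = xmlrpc.client.ServerProxy("http://cobbler.example.org/cobbler_api")

print("Cobbler version:", server.version())

# Read-only listing of the installation profiles Cobbler would apply during
# the installation phase (each entry is a dict of profile settings).
for profile in server.get_profiles():
    print(profile.get("name"))
```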
Grid Middleware status
EMI-3 update status
– Argus, BDII, CREAM CE, UI, WN, StoRM
– Some UIs still at SL5 (will be upgraded soon)
EMI-1 phasing out (only FTS remains)
VOBOX updated to the WLCG release