Published by Handoko Irawan. Modified over 5 years ago.
1
CHIPP - CSCS F2F meeting
CSCS, Lugano, January 25th, 2018
2
Tier 2 status and plans CSCS
Statistics
Plans
Availability/Reliability
Resources Overview
CPU usage
Storage usage
Operations
Updates
Main Issues
3
CRAY ramp up
NAS issue
dCache upgrade
Inode issue
slurm_shared NFS mount performance issue
4
11 nodes missing: 8 GPFS temp, 3 broken
5
7
Statistics
8
NAS issue
dCache upgrade
LHCb ??
Inode issue
9
Statistics – Usage per VO
10
Statistics – Storage usage
ATLAS   Total: PB   Used: PB   Free: PB
CMS     Total: PB   Used: PB   Free: PB
LHCb    Total: PB   Used: PB   Free: PB
11
Operations
12
Operations – Updates: CHIPP Services Overview
13
dCache Upgrades
No upgrades since the last F2F
Next minor upgrade will be in Feb. 2018
Next major upgrade, to the golden release (3.2), will be in mid 2018
14
Ticket report
15
LHCb issue
About 40% of the jobs were failing for a few weeks in December
We ran several debugging sessions
No further input from LHCb
The issue resolved itself (on the LHCb side)
16
Meltdown/Spectre
Kernel patch applied as per EGI requirements (only Meltdown fixed)
Performance degradation due to the patch
No firmware (FW) update, so no fix for Spectre
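As a rough illustration of how the patch status could be checked on a node, the sketch below reads the sysfs directory `/sys/devices/system/cpu/vulnerabilities`, which recent Linux kernels provide. Whether the patched kernel on the nodes exposes this directory is an assumption (the slides do not say), and the function names are illustrative.

```python
from pathlib import Path

# Sysfs directory where recent Linux kernels report CPU vulnerability status.
# Assumption: the patched kernel on the node exposes it.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def classify(status: str) -> str:
    """Map a kernel status line to a coarse state."""
    text = status.strip().lower()
    if text.startswith("mitigation"):
        return "mitigated"
    if text.startswith("vulnerable"):
        return "vulnerable"
    if text.startswith("not affected"):
        return "not affected"
    return "unknown"


def read_status(vuln_dir: Path = VULN_DIR) -> dict:
    """Return {vulnerability name: state} for every file in vuln_dir."""
    if not vuln_dir.is_dir():
        # Older kernels (or non-Linux systems) have no such directory.
        return {}
    return {
        p.name: classify(p.read_text())
        for p in sorted(vuln_dir.iterdir())
        if p.is_file()
    }


if __name__ == "__main__":
    for name, state in read_status().items():
        print(f"{name}: {state}")
```

On a node where only Meltdown is fixed, one would expect the `meltdown` entry to be reported as mitigated while the Spectre entries are not, but the exact strings depend on the kernel version.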
17
Singularity
Phoenix: all set up for CMS requirements
Daint: under investigation
Working on Shifter to enable CMS workflow
NOT NICE TO GET IMPORTANT REQUIREMENTS LIKE THIS
18
VO Rep one-on-one followups
Proposed schedule: meet 1-2 times per year
We propose 2 meetings in between the F2F meetings
VC or in person, up to the VO
We are still working on enabling VO Reps to access their own logs
19
Plans
20
IPv6 Deployment
CSCS is IPv6 ready
Present configuration
We are using an IB network
IB gateways do not support IPv6
IPv6 implementation plan (dCache: ~20 hosts)
Use IB cards + adapters + 10G twinax cables on current hosts (CHF 40.00)
10G switches are already deployed and available
Security needs to be managed
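While the dCache hosts are migrated, a small check of which hosts already resolve to both IPv4 and IPv6 addresses may be useful. The following is a minimal Python sketch using only the standard library; the function names are illustrative, and the addresses in the usage note come from the reserved documentation ranges, not from CSCS.

```python
import ipaddress
import socket


def address_families(addresses):
    """Return the set of IP versions (4 and/or 6) found in a list of address strings."""
    versions = set()
    for addr in addresses:
        # Drop an IPv6 scope id such as "%eth0" before parsing.
        versions.add(ipaddress.ip_address(addr.split("%")[0]).version)
    return versions


def is_dual_stack(addresses):
    """True when the list contains both an IPv4 and an IPv6 address."""
    return address_families(addresses) == {4, 6}


def resolve(hostname):
    """Resolve a hostname with the system resolver and return its unique addresses."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})
```

For example, `is_dual_stack(["192.0.2.10", "2001:db8::10"])` is true, while a host that only returns `"192.0.2.10"` is still IPv4-only. In practice one would call `is_dual_stack(resolve(name))` for each of the ~20 dCache hosts.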
21
Daint outages
Center-wide outage (Feb 21)
In the near future we will:
Move all LHConCRAY nodes to the same electrical group for increased performance and jitter reduction
Increase DVS nodes from 5 to 8 (same electrical group)
Deploy new nodes, expected in April
Add an additional TEST ARC-CE to Dom (the TDS of Daint) so we can evaluate updates and different utilization models (opportunistic?)
22
Future Operations
Install next Phase on Cray & SAN (Phase M)
Work on Scratch replacement
Centre-wide outage Feb 21st, June 13th, Oct 10th
dCache latest minor version (v3 not yet)
Upgrade perfSONAR
Try to stay out of RH7.4 + OFED4 upgrade in Phoenix
ATLAS storage accounting
IPv6 (evaluating how to migrate)
Access to CHIPP Monitoring
Trips: dCache workshop, HEPiX, CVMFS workshop
Follow-up with GDB on HPC
23
Resource Planning
Compute 2018
110 kHS06 installed
99 kHS06 pledged
40:40:20
Storage 2018
4 PB installed and pledged
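The compute figures can be worked through with a short calculation. The Python sketch below assumes that 40:40:20 is the ATLAS:CMS:LHCb share of the pledged compute; the slide does not say what the ratio applies to, so this reading is an assumption, and the function names are illustrative.

```python
def split_by_ratio(total, ratio):
    """Split a total across names in proportion to the given integer weights."""
    weight_sum = sum(ratio.values())
    return {name: total * weight / weight_sum for name, weight in ratio.items()}


def headroom(installed, pledged):
    """Capacity installed beyond the pledge, in the same unit as the inputs."""
    return installed - pledged


if __name__ == "__main__":
    # Figures from the slide, in kHS06.
    installed, pledged = 110, 99
    # Assumption: 40:40:20 is the ATLAS:CMS:LHCb split of the pledge.
    shares = split_by_ratio(pledged, {"ATLAS": 40, "CMS": 40, "LHCb": 20})
    for vo, share in shares.items():
        print(f"{vo}: {share:.1f} kHS06")
    print(f"Headroom over pledge: {headroom(installed, pledged)} kHS06")
```

Under that assumption the pledge splits into 39.6, 39.6 and 19.8 kHS06, and the installed capacity exceeds the pledge by 11 kHS06.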
24
Thank you for your attention.