1
NIKHEF Data Processing Facility
Status Overview per David Groep, NIKHEF
NEROC-TECH NDPF status overview
2
A historical view
  Started in 2000 with a dedicated farm for DØ
    50 dual P3-800 MHz, tower-model Dell Precision 220
    800 GByte “3ware” disk array
  [figure label: jobs]
3
Many different farms
  2001: EU DataGrid WP6 ‘Application’ test bed
  2002: addition of the ‘development’ test bed
  2003: LCG-1 production facility
  April 2004: amalgamation of all nodes into LCG-2
  September 2004: addition of
    EGEE PPS
    VL-E P4 CTB
    EGEE JRA1 LTB
4
Growth of resources
  Processor                      CPUs   Year
  Intel Pentium III 800 MHz       100   2000
  Intel Pentium III 933 MHz        40   2001
  AMD Athlon MP2000+ (~2 GHz)     132   2002
  Intel XEON 2.8 GHz               54   2003
  Intel XEON 2.8 GHz               20   2003
  Total WN resources (raw): 353 THz hr/mo, ~200 kSI2k
  Total on-line disk cache: 7 TByte
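The unit "THz hr/mo" is an aggregate clock rate multiplied by the hours in a month. The following is a minimal sketch of that arithmetic, assuming the nominal clock speeds listed on the slide, a 30-day month and full availability of every CPU. Those assumptions are not stated in the source, so the result is not expected to reproduce the quoted 353 THz hr/mo exactly.

```python
# (cpu_count, clock_ghz) pairs as listed on the slide.
# The ~2 GHz value for the Athlon MP2000+ is the slide's own approximation.
NODES = [
    (100, 0.800),  # Intel Pentium III 800 MHz (2000)
    (40, 0.933),   # Intel Pentium III 933 MHz (2001)
    (132, 2.000),  # AMD Athlon MP2000+ (2002)
    (54, 2.800),   # Intel XEON 2.8 GHz (2003)
    (20, 2.800),   # Intel XEON 2.8 GHz (2003)
]

# Assumption: a 30-day month with every CPU available all the time.
HOURS_PER_MONTH = 30 * 24


def aggregate_clock_ghz(nodes):
    """Sum of CPU count times clock speed, in GHz."""
    return sum(count * ghz for count, ghz in nodes)


def raw_thz_hours_per_month(nodes, hours=HOURS_PER_MONTH):
    """Raw capacity in THz hours per month (1 THz = 1000 GHz)."""
    return aggregate_clock_ghz(nodes) * hours / 1000.0


if __name__ == "__main__":
    print(f"aggregate clock: {aggregate_clock_ghz(NODES):.2f} GHz")
    print(f"raw capacity:    {raw_thz_hours_per_month(NODES):.1f} THz hr/mo")
```

A lower figure follows from any less optimistic input, for example a true Athlon clock below 2 GHz or nodes that are reserved for service roles.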
5
Node types
  2U “pizza” boxes
    PIII 933 MHz, 1 GByte RAM, 43 GByte disk
  1U GFRC (NCF)
    AMD MP2000+, 1 GByte RAM, 60 GByte disk
    ‘thermodynamic challenges’
  1U Halloween
    XEON 2.8 GHz, 2 GByte RAM, 80 GByte disk
    first GigE nodes
6
Connecting things together
  Collapsed backbone strategy
    Foundry Networks BigIron 15000
      14 GigE SX, 2x GigE LX
      BaseTX 48 100BaseTX
    Service nodes directly GigE connected
    Farms connected via local switches
      WN oversubscription typical 1:5 – 1:7
  Dynamic re-assignment of nodes to facilities
    DHCP Relay
    built-in NAT support (for worker nodes)
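The oversubscription figure compares the worst-case aggregate bandwidth of the worker nodes behind a local switch with the capacity of that switch's uplink. A minimal sketch, assuming 100 Mb/s worker-node links and a single 1 Gb/s uplink per farm switch; the node counts are hypothetical and chosen only to land on the quoted range.

```python
def oversubscription(n_nodes, node_mbps, uplink_mbps):
    """Ratio of worst-case aggregate node bandwidth to uplink capacity.

    A return value of 5.0 corresponds to an oversubscription of 1:5.
    """
    if uplink_mbps <= 0:
        raise ValueError("uplink capacity must be positive")
    return n_nodes * node_mbps / uplink_mbps


if __name__ == "__main__":
    # Hypothetical farm sizes; not taken from the slide.
    for n_nodes in (50, 70):
        ratio = oversubscription(n_nodes, node_mbps=100, uplink_mbps=1000)
        print(f"{n_nodes} nodes at 100 Mb/s on a 1 Gb/s uplink -> 1:{ratio:g}")
```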
7
NIKHEF Farm Network
8
Network Uplinks
  NIKHEF links
    1 Gb/s IPv4 & 1 Gb/s IPv6 SURFnet
    2 Gb/s WTCW (to SARA)
  SURFnet links:
9
NDPF Usage
  Analyzed production batch logs since May 2002
    total of 1.94 PHz hours provided in jobs
    [plot annotations: Added “Halloween”; LHC Data Challenges; Added NCF GFRC]
    experimental use and tests not shown
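A usage figure in Hz hours can be obtained from batch accounting data by weighting each job's wall-clock time with the clock speed of the node it ran on. The sketch below assumes a hypothetical, already-parsed record format of (VO name, wall-clock hours, node clock in GHz); it is not the actual NDPF log format or analysis script.

```python
from collections import defaultdict

GHZ_PER_PHZ = 1e6  # 1 PHz = 10^6 GHz


def usage_ghz_hours(records):
    """Sum wall-clock hours times node clock (GHz) per VO.

    `records` is an iterable of (vo, wall_hours, clock_ghz) tuples,
    a hypothetical format used only for this illustration.
    """
    totals = defaultdict(float)
    for vo, wall_hours, clock_ghz in records:
        totals[vo] += wall_hours * clock_ghz
    return dict(totals)


def total_phz_hours(records):
    """Total usage over all VOs, in PHz hours."""
    return sum(usage_ghz_hours(records).values()) / GHZ_PER_PHZ


if __name__ == "__main__":
    # Made-up sample jobs, not real accounting data.
    sample = [
        ("dzero", 10.0, 0.8),
        ("atlas", 5.0, 2.8),
        ("dzero", 2.0, 2.8),
    ]
    for vo, ghz_hours in sorted(usage_ghz_hours(sample).items()):
        print(f"{vo}: {ghz_hours:.1f} GHz hours")
    print(f"total: {total_phz_hours(sample):.3e} PHz hours")
```

The same per-VO totals are what a usage-versus-share comparison would start from.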
10
Usage per Virtual Organisation
  Real-time web info:
  Dzero acts as “background fill”
  Usage doesn’t (yet) reflect shares
11
Usage monitoring
  Live viewgraphs
    farm occupancy
    per-VO distribution
    network loads
  Tools
    Cricket (network)
    home-grown scripts + rrdtool
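As an illustration of how a home-grown script might feed per-VO job counts into rrdtool, the sketch below builds the argument list for an `rrdtool update` call using its `--template` option. The file name, the data-source names and the idea of one data source per VO are assumptions made for this example; the actual NDPF scripts are not described in the slides.

```python
def rrd_update_args(rrd_path, samples, timestamp="N"):
    """Build the argv for `rrdtool update` from a {ds_name: value} mapping.

    Data-source names are sorted so the template and the value string
    always line up. "N" tells rrdtool to use the current time.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    names = sorted(samples)
    template = ":".join(names)
    values = ":".join(str(samples[name]) for name in names)
    return ["rrdtool", "update", rrd_path,
            "--template", template, f"{timestamp}:{values}"]


if __name__ == "__main__":
    # Hypothetical running-job counts per VO.
    args = rrd_update_args("farm.rrd", {"dzero": 40, "atlas": 12})
    print(" ".join(args))
```

The list can be passed to `subprocess.run(args, check=True)` on a host where rrdtool is installed and the RRD file has been created with matching data sources.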
12
Central services
  VO-LDAP services
    LHC VOs
  DutchGrid CA
  “edg-testbed-stuff”:
    Torque & Maui distribution
    installation support components
13
Some of the issues
  Data access patterns in Grids
    jobs tend to clutter $CWD
    high load when shared over NFS
    shared homes required for traditional batch & MPI
  Garbage collection for “foreign” jobs
    OpenPBS & Torque transient $TMPDIR patch
  Policy management
    Maui fair-share policies
    CPU capping
    max-queued-jobs capping
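The idea behind a transient $TMPDIR is that each job gets its own scratch directory on local disk, which is removed when the job ends regardless of what the job left behind. The sketch below illustrates that idea as a plain Python wrapper; it is not the OpenPBS/Torque patch itself, and the function name and directory prefix are hypothetical.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_with_transient_tmpdir(argv, base_dir=None):
    """Run `argv` in a private scratch directory and delete it afterwards.

    The directory is exported as TMPDIR and used as the working directory,
    so files the job drops in $CWD or $TMPDIR never reach a shared home.
    Returns (exit_code, scratch_path); the path no longer exists on return.
    """
    scratch = tempfile.mkdtemp(prefix="job-", dir=base_dir)
    env = dict(os.environ, TMPDIR=scratch)
    try:
        result = subprocess.run(argv, cwd=scratch, env=env)
        return result.returncode, scratch
    finally:
        # Garbage collection: runs even if the job fails or raises.
        shutil.rmtree(scratch, ignore_errors=True)


if __name__ == "__main__":
    # Stand-in for a "foreign" job that clutters its working directory.
    job = [sys.executable, "-c", "open('junk.dat', 'w').write('x')"]
    code, path = run_with_transient_tmpdir(job)
    print(f"exit code {code}, scratch still present: {os.path.exists(path)}")
```

Doing the cleanup in the batch system rather than in the job means it does not depend on the cooperation of the submitting VO.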
14
Developments: work in progress
  Parallel Virtual File Systems
  From LCFGng to Quattor (Jeff)
  Monitoring and ‘disaster recovery’ (Davide)
15
Team