Published by Gary Reginald Wade. Modified over 9 years ago.
1
Randy Melen, April 14, 1999
Stanford Linear Accelerator Center Site Report
HEPiX@RAL, April 1999
Randy Melen, SLAC Computing Services/Systems HPC Team Leader
2
Past 12 months...
- Busy!
- Target of May 9 for BaBar detector to begin
- Challenge to get systems assembled and tested in time, to get C++ code working and sufficiently optimized, to handle 100 events/second for reconstruction and event recording
- Once BaBar data begins, more difficult to make system changes, take service outages
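The 100 events/second target can be turned into a daily event count with simple arithmetic. The sketch below assumes the rate is sustained around the clock unless a duty cycle is given; the slide states no duty cycle, and the function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def events_per_day(rate_hz, duty_cycle=1.0):
    """Events accumulated in one day at a given rate.

    duty_cycle is an assumption of this sketch: the fraction of the day
    during which events actually arrive (1.0 means continuous running).
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return rate_hz * SECONDS_PER_DAY * duty_cycle


# 100 events/second, running continuously
print(events_per_day(100))  # 8640000.0
```

At the stated rate, continuous running gives about 8.6 million events per day for reconstruction and recording.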
3
New Hardware Developments
- Increased Solaris batch systems in compute farm (from 5 Sun Ultra 2300 systems to 18 systems)
- Upgraded Sun UE6000 to 4GB memory
- Acquired 4 Sun UE4500 systems, increased to 6 systems, for HPSS data movers, total of 4TB of disk
- Acquired Sun UE10000 (24 CPUs, 12GB memory, 1.5TB disk, 2 domains)
- 4 Sun E250 systems as tape movers
- 3 IBM F50 systems as data movers
4
New Hardware Developments (cont.)
- Added 220 Sun U5 systems (256MB, 9GB IDE disk, 333MHz UltraSPARC IIi with 2MB cache, $188/SI95)
- Expect to add ~200 more U5 systems 2Q1999, probably more disk, perhaps UE10000 upgrade to 400MHz CPUs
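The per-node figures above can be aggregated across the 220 U5 systems, and the $188/SI95 figure is a price divided by a SPECint95 rating. The sketch below shows both calculations. It assumes 1 GB = 1024 MB, and the function names are illustrative. The slide does not give the unit price or the rating behind $188/SI95, so no values for them are supplied here.

```python
U5_COUNT = 220        # systems added, from the slide
U5_MEMORY_MB = 256    # memory per system, from the slide
U5_DISK_GB = 9        # IDE disk per system, from the slide


def farm_totals(count, memory_mb, disk_gb):
    """Aggregate memory (GB, assuming 1 GB = 1024 MB) and disk (GB)."""
    return {
        "memory_gb": count * memory_mb / 1024,
        "disk_gb": count * disk_gb,
    }


def dollars_per_si95(price_usd, specint95):
    """Price/performance: system price divided by its SPECint95 rating."""
    if specint95 <= 0:
        raise ValueError("specint95 must be positive")
    return price_usd / specint95


print(farm_totals(U5_COUNT, U5_MEMORY_MB, U5_DISK_GB))
# {'memory_gb': 55.0, 'disk_gb': 1980}
```

So the 220 nodes together contribute 55 GB of memory and just under 2 TB of local disk.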
5
Farm Management
- Upgraded farm master for LSF to IBM F50
- Working with Sun Auto Client software and cacheFS to centrally manage 200-400 Sun U5 systems
- Actively doing Solaris performance tuning on UE6000 and UE10000
- Adding 2 Sun E250 systems as BaBar build systems; need to be able to build 1M C++ lines of code each night (twice?)
6
Mass Storage Hardware
- Upgraded 5 STK silos to PowderHorn robots
- Added a 6th STK silo and 12 STK Eagle drives; more Eagle drives will be needed
- Need to add BaBar data import/export tape device; considering STK 9740 with DLT 7000 and RedWood drives
7
Farm Network Technology
- Currently using 3 Cisco Catalyst 5500 switches (~1.2 Gbps backplanes), everything on Fast Ethernet, single collision domains
- Migrating to 3 Cisco Catalyst 6509 switches (~16 Gbps backplanes)
- Deploying Gb Ethernet on ~16 Solaris servers
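The backplane figures above suggest why the migration matters. A rough sketch follows, assuming a deliberately simple model in which each Fast Ethernet port offers 100 Mbps and the backplane capacity is divided evenly among ports. It ignores full duplex and vendor-specific ways of counting backplane bandwidth, and the function name is illustrative.

```python
FAST_ETHERNET_MBPS = 100  # nominal Fast Ethernet port speed


def ports_at_line_rate(backplane_mbps, port_mbps=FAST_ETHERNET_MBPS):
    """Number of ports the backplane can carry at full rate simultaneously."""
    if port_mbps <= 0:
        raise ValueError("port_mbps must be positive")
    return backplane_mbps // port_mbps


print(ports_at_line_rate(1200))   # ~1.2 Gbps backplane -> 12
print(ports_at_line_rate(16000))  # ~16 Gbps backplane  -> 160
```

Under this model a ~1.2 Gbps backplane saturates with about a dozen busy Fast Ethernet ports, while a ~16 Gbps backplane leaves room for well over a hundred.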
8
HPSS Phase 3 (Porting) Ongoing
- With assistance from Sun, began moving and testing the Solaris 2.5.1 port to Solaris 2.6
- Lots of issues related to getting infrastructure pieces at correct version levels
- Began HPSS 4.1 datamover port to Solaris 2.6
- Sun and IBM signed agreement for IBM to port HPSS 4.1A; we expect to deploy ~4Q1999
9
HPSS Stage 4 (PRV0) Plans
- While Solaris port continues, use IBM F50 systems as datamovers
- Move development (porting and testing) to Solaris U250 build servers
10
Currently Supported Systems
- General Servers
  - generally Solaris 2.5.1 -> Solaris 2.6
  - AFS servers will become Sun U2300 systems for AFS 3.5 multithreading
  - AIX 4.1.5 -> 4.2.1
  - phasing out “core” NFS file server (AIX 3.2.5!) by moving binaries and home directories to AFS
- Farm Servers
  - AIX 4.2.1 now frozen, not a porting platform for BaBar as of 7/1998
  - Solaris 2.5.1 -> 2.6 completed
- Desktop
  - still NT, though much more Linux than before
11
Intel Farm Prototype
- A prototype 17-node Intel compute farm acquired 4Q1998:
  - 2-way 256MB, 9GB disk, Dell 450MHz Pentium-II
  - partnership with Accelerator Research group and NERSC
  - strong interest in MPI and developing for Cray T3E production
  - decided on Linux from RedHat
  - modest success so far for scalability
  - expect to expand to 32 nodes 3Q1999
  - Issues that remain:
    - Commercial software support (e.g., Objectivity, AFS, LSF with AFS support)
    - Manageability of large numbers of systems
    - MPI cluster vs “task farm”
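The "task farm" side of the last issue can be illustrated with a toy example. In a task farm each job is independent and workers never talk to each other, whereas an MPI program has its processes exchange messages while they run. The sketch below uses Python's standard thread pool purely as a stand-in for a batch system. The `reconstruct` workload and all names are hypothetical and not taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def reconstruct(event_id):
    """Hypothetical stand-in for one independent per-event job."""
    # Toy workload: no data is shared with, or sent to, any other job.
    return event_id, sum(i * i for i in range(event_id))


def run_task_farm(event_ids, workers=4):
    """Hand independent jobs to a pool of workers and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order; jobs may still run concurrently.
        return dict(pool.map(reconstruct, event_ids))


results = run_task_farm(range(5))
print(results)  # {0: 0, 1: 0, 2: 1, 3: 5, 4: 14}
```

Because the jobs share nothing, a farm like this scales by adding nodes and needs no low-latency interconnect, which is the main contrast with a tightly coupled MPI cluster.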