NPACI Panel on Clusters
  David E. Culler
  Computer Science Division
  University of California, Berkeley
1/29/99 NPACI Clusters (slide 2)
Clusters Have Happened
  IBM, SGI, Sun, HP, Microsoft, ...
Performance + Cost-Performance
  April 1998: T3E vs. Intel Pentium II on NAS Benchmarks
  [Chart: T3E-900 vs. PII-400 on the BT, LU, and SP benchmarks]
Berkeley (NPACI) NOW
  100 Sun Ultra2 workstations
  Intelligent network interface
    – processor + memory
  Myrinet network
    – 160 MB/s per link
    – 300 ns per hop
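The two Myrinet figures on this slide can be combined into a rough lower bound on one-way wire time. The sketch below is an illustration only. The function name `wire_time_us`, the reading of MB as 10^6 bytes, and the additive model (per-hop delay plus serialization time) are assumptions. The model also ignores host software and network-interface overhead, which the slide does not quantify.

```python
# Rough wire-time model built from the slide's two Myrinet figures.
# Assumptions (not from the slide): 1 MB = 10**6 bytes, and total time is
# per-hop switch delay plus the time to push the bytes through one link.

LINK_BANDWIDTH_MB_S = 160.0  # from the slide: 160 MB/s per link
HOP_LATENCY_NS = 300.0       # from the slide: 300 ns per hop


def wire_time_us(message_bytes: int, hops: int) -> float:
    """Return a lower bound on one-way wire time in microseconds."""
    if message_bytes < 0 or hops < 0:
        raise ValueError("message_bytes and hops must be non-negative")
    switch_delay_us = hops * HOP_LATENCY_NS / 1000.0
    # 160 MB/s is 160 bytes per microsecond under the 10**6 assumption.
    transfer_us = message_bytes / LINK_BANDWIDTH_MB_S
    return switch_delay_us + transfer_us


for size in (64, 1024, 8192):
    print(f"{size:5d} bytes over 3 hops: {wire_time_us(size, 3):8.2f} us")
```

Under this model the per-hop delay dominates only for very small messages. For anything beyond a few hundred bytes the link bandwidth term is the larger one.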
Beowulf Consortium
  Linux PCs
  Fast Ethernet
  Basic stand-alone and batch cluster cookbook
HPVM (NPACI)
  NT PCs
  Myricom network
  Fast Messages
  LSF start-up
Berkeley Millennium
  PC-based Unix and NT clusters
  Departmental and campus
  Shared as computational economy
  Gigabit Ethernet
  [Diagram labels: SIMS, C.S., E.E., M.E., BMRC, N.E., IEOR, C.E., MSME, NERSC, Transport, Business, Chemistry, Astro, Physics, Biology, Economy, Math]
What You Get off the Shelf
  Go to your favorite web site and order
    – Dual PII 450 MHz, 1 GB memory, 36 GB disk
    – $9,563 at Dell
  4400 CPU hours per quarter to yourself!
    – $1.20 per CPU hour (at 30% over 3 years)
  Buy 5?
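The arithmetic behind these figures can be checked with a short calculation. The sketch below makes several assumptions that the slide does not state: "30%" is read as average utilization, the purchase price is the only cost (no power, space, or administration), and a year is 365 days. Under those assumptions, the dual-CPU node yields about 4,380 CPU hours per quarter, and the slide's $1.20 is reproduced (about $1.21) when hours are counted per machine rather than per CPU.

```python
# Back-of-the-envelope cost model for the off-the-shelf node on the slide.
# Assumptions (not stated on the slide): "30%" means average utilization,
# hardware price is the only cost, and a year has 365 days.

PRICE_USD = 9563.0        # from the slide: price at Dell
CPUS_PER_NODE = 2         # from the slide: dual PII 450 MHz
YEARS = 3                 # from the slide: amortized over 3 years
UTILIZATION = 0.30        # assumed meaning of "at 30%"
HOURS_PER_YEAR = 365 * 24

# CPU hours available in one quarter if the whole node is yours.
cpu_hours_per_quarter = CPUS_PER_NODE * HOURS_PER_YEAR / 4

# Machine hours actually used over the amortization period.
node_hours_used = YEARS * HOURS_PER_YEAR * UTILIZATION

cost_per_node_hour = PRICE_USD / node_hours_used
cost_per_cpu_hour = cost_per_node_hour / CPUS_PER_NODE

print(f"CPU hours per quarter: {cpu_hours_per_quarter:.0f}")
print(f"Cost per used machine hour: ${cost_per_node_hour:.2f}")
print(f"Cost per used CPU hour:     ${cost_per_cpu_hour:.2f}")
```

Counting both processors separately would roughly halve the per-hour figure, so the slide's number is the more conservative reading.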
Three Kinds of Clusters
  Throughput clusters
  Availability clusters
  High-performance parallel clusters
Throughput Clusters
  Workstation / PC farms
  Provide a resource pool for large numbers of sequential jobs
  Used widely in industry
    – Toy Story on 2000 SPARCstations
    – UltraSPARC on 1000 SPARCstations
  Can also include background jobs on desktops
    – CONDOR
  Application-specific front ends are attractive
    – parametric studies, Monte Carlo
  Fill cracks in parallel clusters
  Big farms require infrastructure
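A Monte Carlo study shows why this workload suits a farm: each job is sequential, takes only a seed as input, and never talks to the others. The sketch below is a minimal illustration, not a description of any system named on the slide. The names `estimate_pi` and `run_farm` are hypothetical, and a local thread pool stands in for the batch system that would dispatch jobs to separate machines.

```python
# Independent Monte Carlo jobs farmed out to a pool of workers.
# Hypothetical names; the thread pool is only a local stand-in for a
# batch system that would run each job on a different farm node.
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(seed: int, samples: int = 50_000) -> float:
    """One sequential job: estimate pi from random points in the unit square."""
    rng = random.Random(seed)  # private generator, so jobs do not interact
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def run_farm(seeds, workers: int = 4) -> list:
    """Run one job per seed and collect the results in seed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(estimate_pi, seeds))


results = run_farm(range(8))
combined = sum(results) / len(results)
print(f"{len(results)} jobs, combined estimate: {combined:.4f}")
```

Because the jobs share nothing, adding nodes adds throughput without any change to the job itself, which is what makes application-specific front ends for such studies attractive.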
Availability Clusters
  Use system redundancy to mask faults
    – all big databases do it
  VAX Clusters => IBM Sysplex => Wolfpack
  [Diagram: Clients, Disk array A, Disk array B, Interconnect, Server A, Server B]
High-Performance Clusters
  Utilize modern system area networks and user-level communication layers to construct a general-purpose parallel machine from commodity parts
Emerging System Area Networks
  Gigabit Ethernet
    – price dropping, widely deployed
  System area networks
    – Myricom
    – ServerNet
    – Synfinity
  Virtual Interface Architecture
    – Intel/Microsoft/Compaq standard based on university research prototypes
MPI Performance
Example: NAS Parallel Benchmarks
  Better node performance than the Cray T3D
  Better scalability than the IBM SP-2
Cluster-Wide Parallel I/O
  Sustain 500 MB/s disk bandwidth and 1,000 MB/s network bandwidth by driving all the disks
Software Base Is Growing
  Technical software moving to Linux and NT
    – NAG, MATLAB, PETSc, ...
  Cluster prototypes being hardened and packaged
  Cookbooks emerging
  Few cluster-integrator companies
Summary of the State of the Art
  Cluster designs are emerging in many areas
    – throughput, availability, parallel computing
    – technology is advancing
  Still immature software base
    – strong ties to free software movement
  Many small clusters held together by spit and baling wire
  Large clusters require engineering
    – commercial components improving
  Rapid pace of change presents a system administration challenge
    – not unlike the desktop problem
  Management tools badly needed
What Does It Mean for NPACI?
  Where do clusters fit with computational science and engineering needs?
  Cycles vs. software vs. administration vs. expertise?
  What role should the center take?
  What role should partner sites have?