Berkeley Cluster Projects David E. Culler culler@cs.berkeley.edu http://now.cs.berkeley.edu/ 11/23, 1998
Goals: Make a fundamental change in how we design and construct large-scale systems. Market reality: 50%/year performance growth => cannot allow 1-2 year engineering lag. Technological opportunity: single-chip “Killer Switch” => fast, scalable communication. Highly integrated building-wide, campus-wide systems. Explore novel system design concepts in this new “cluster” paradigm.
100-node Ultra/Myrinet NOW
Fast Communication Challenge: Fast processors and fast networks. The time is spent in crossing between them. [Figure: Killer Platform nodes, each with communication software and network interface hardware, connected by a Killer Switch; time scales shown: ns, µs, ms.]
Opening: Intelligent Network Interfaces. Dedicated processing power and storage embedded in the network interface. An I/O card today. Tomorrow on chip? [Figure: Sun Ultra 170 (processor, cache, memory) with a Myricom NIC on the I/O bus (S-Bus), 50 MB/s, attached to the Myricom network, 160 MB/s.]
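The two bandwidth figures on this slide (160 MB/s for the Myricom network, 50 MB/s for the S-Bus) suggest a simple bottleneck estimate. The sketch below is a rough upper bound only; the function names are hypothetical, and it ignores protocol overhead, memory-system effects, and contention.

```python
def path_bandwidth_mb_s(stages):
    """Upper bound on end-to-end bandwidth: the slowest stage on the path."""
    if not stages:
        raise ValueError("need at least one stage")
    return min(stages.values())


def bottleneck(stages):
    """Name of the stage with the lowest bandwidth."""
    if not stages:
        raise ValueError("need at least one stage")
    return min(stages, key=stages.get)


# Figures taken from the slide; the stage labels are ours.
stages = {
    "Myricom network link": 160.0,  # MB/s
    "S-Bus I/O bus": 50.0,          # MB/s
}

limit = path_bandwidth_mb_s(stages)
print(f"bottleneck: {bottleneck(stages)} at {limit} MB/s")
```

Under these assumptions the I/O bus, not the network link, bounds what a single node can move, which is one way to read the slide's question about moving the interface on chip.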
NOW System Architecture. Applications: parallel apps, large sequential apps. Interfaces: Sockets, Split-C, MPI, HPF, vSM. Global Layer UNIX: resource management, network RAM, distributed files, process migration. Each node: UNIX workstation, communication software, network interface hardware. Interconnect: fast commercial switch (Myrinet).
Communication Performance: Direct Network Access. LogP: Latency, Overhead, and Bandwidth. Active Messages: lean layer supporting programming models. [Figure axes: Latency, 1/BW.]
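The LogP parameters named on this slide can be turned into a rough cost estimate for small messages. The sketch below uses the usual LogP quantities (L, o, g, P); the class name, method names, and all numeric values are illustrative placeholders, not measurements from the NOW cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    latency_us: float   # L: network transit time for a small message
    overhead_us: float  # o: processor time spent on each send or receive
    gap_us: float       # g: minimum interval between messages (per-message 1/bandwidth)
    procs: int          # P: number of processors

    def one_way_us(self) -> float:
        """Send overhead + network latency + receive overhead."""
        return self.overhead_us + self.latency_us + self.overhead_us

    def round_trip_us(self) -> float:
        """Request followed by a reply of the same size."""
        return 2 * self.one_way_us()

    def pipelined_us(self, n: int) -> float:
        """Time until the last of n back-to-back small messages is received."""
        if n < 1:
            raise ValueError("n must be at least 1")
        # The sender can issue a new message every max(g, o) microseconds.
        interval = max(self.gap_us, self.overhead_us)
        return (n - 1) * interval + self.one_way_us()


# Placeholder values chosen only to make the arithmetic easy to follow.
model = LogP(latency_us=5.0, overhead_us=3.0, gap_us=6.0, procs=100)
print(model.one_way_us(), model.round_trip_us(), model.pipelined_us(10))
```

In this model the overhead term is paid twice per message, so shrinking the software layer (the role the slide assigns to Active Messages) lowers the one-way time even when the network latency is unchanged.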
World-Record Disk-to-Disk Sort: Sustain 500 MB/s disk bandwidth and 1,000 MB/s network bandwidth.
Massive Cheap Storage. Basic unit: 2 PCs double-ending four SCSI chains. Currently serving Fine Art at http://www.thinker.org/imagebase/
Cluster of SMPs (CLUMPS). Four Sun E5000s: 8 processors, 3 Myricom NICs. Multiprocessor, Multi-NIC, Multi-Protocol.
Information Servers. Basic storage unit: Ultra 2, 300 GB RAID, 800 GB tape stacker, ATM; scalable backup/restore. Dedicated info servers: web, security, mail, … VLANs project into dept.
Millennium Computational Community. [Figure: campus units connected by Gigabit Ethernet: SIMS, Business, BMRC, Chemistry, C.S., E.E., Biology, Astro, NERSC, M.E., Physics, N.E., IEOR, Math, Transport, Economy, C.E., MSME.]
Millennium PC Clumps. Inexpensive, easy-to-manage cluster. Replicated in many departments. Prototype for very large PC cluster.
Proactive Infrastructure: information appliances, stationary desktops, scalable servers.