Networking Options for Beowulf Clusters
Dr. Thomas Sterling
California Institute of Technology and NASA Jet Propulsion Laboratory
Presentation to the American Physical Society, March 22, 2000
Points of Inflection in the History of Computing

Heroic Era (1950)
– technology: vacuum tubes, mercury delay lines, pulse transformers
– architecture: accumulator based
– model: von Neumann sequential instruction execution
– examples: Whirlwind, EDSAC

Mainframe (1960)
– technology: transistors, core memory, disk drives
– architecture: register-bank based
– model: reentrant concurrent processes
– examples: IBM 7042, 7090, PDP-1

Scientific Computer (1970)
– technology: earliest SSI logic gate modules
– architecture: virtual memory
– model: parallel processing
– examples: CDC 6600, Goodyear STARAN
Points of Inflection in the History of Computing

Supercomputers (1980)
– technology: ECL, semiconductor integration, RAM
– architecture: pipelined
– model: vector
– example: Cray-1

Massively Parallel Processing (1990)
– technology: VLSI, microprocessor
– architecture: MIMD
– model: communicating sequential processes, message passing
– examples: TMC CM-5, Intel Paragon

? (2000)
– trans-teraflops epoch
Punctuated Equilibrium
- Nonlinear dynamics drive to a point of inflection
- Drastic reduction in vendor support for HPC
- Component technology for PCs matches workstation capability
- PC-hosted software environments achieve the sophistication and robustness of mainframe O/S
- Low-cost network hardware and software enable balanced PC clusters
- MPPs establish a low level of expectation
- Cross-platform parallel programming model
Beowulf-Class Systems
- Cluster of PCs
  – Intel x86
  – DEC Alpha
  – Mac PowerPC
- Pure M²COTS (mass-market commodity off-the-shelf)
- Unix-like O/S with source
  – Linux, BSD, Solaris
- Message-passing programming model
  – PVM, MPI, BSP, homebrew remedies
- Single-user environments
- Large science and engineering applications
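The message-passing model the slide lists (PVM, MPI, BSP) boils down to explicit send and receive operations between cooperating processes. Real Beowulf codes use MPI or PVM across nodes; the sketch below uses Python threads and queues purely as a stand-in to show the scatter/gather pattern, and all names and values are illustrative.

```python
# A minimal sketch of the message-passing model: a coordinator scatters
# work slices to workers, each worker sends back a partial result.
# Threads and queues stand in for MPI/PVM processes and messages.
import threading
import queue

def worker(rank, inbox, outbox):
    data = inbox.get()              # blocking receive of a work slice
    outbox.put((rank, sum(data)))   # send the partial result back

nworkers = 4
chunks = [[1, 2], [3, 4], [5, 6], [7, 8]]
results = queue.Queue()
inboxes = []
for rank in range(nworkers):
    q = queue.Queue()
    threading.Thread(target=worker, args=(rank, q, results)).start()
    inboxes.append(q)

for q, chunk in zip(inboxes, chunks):
    q.put(chunk)                    # scatter the work

total = sum(results.get()[1] for _ in range(nworkers))
print(total)                        # 36
```

In MPI the same pattern would use `MPI_Send`/`MPI_Recv` (or a collective reduction) between ranks on different nodes; the point is that all communication is explicit, which is what makes the model portable across the platforms the slide names.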
Beowulf-class Systems: A New Paradigm for the Business of Computing
- Brings high-end computing to a broad range of problems
  – new markets
- Order-of-magnitude price-performance advantage
- Commodity enabled
  – no long development lead times
- Low vulnerability to vendor-specific decisions
  – companies are ephemeral; Beowulfs are forever
- Rapid-response technology tracking
- Just-in-place, user-driven configuration
  – requirement responsive
- Industry-wide, non-proprietary software environment
Have to Run Big Problems on Big Machines?
- It's work, not peak flops, that matters
  – a user's throughput over the application cycle
- Big machines yield little slices
  – due to time and space sharing
- But data-set memory requirements vary
  – a wide range of data-set needs, spanning three orders of magnitude
  – latency-tolerant algorithms enable out-of-core computation
- What is the Beowulf breakpoint for price-performance?
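Out-of-core computation, mentioned above, means streaming a data set that exceeds main memory through the machine in chunks. A minimal sketch, with a purely illustrative file and chunk size:

```python
# Out-of-core sketch: sum a file of 8-byte doubles one fixed-size chunk
# at a time, so only one chunk is ever resident in memory.
import os
import struct
import tempfile

def out_of_core_sum(path, chunk_bytes=4096):
    """Sum 8-byte doubles from a file without loading it all at once."""
    total = 0.0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_bytes):
            total += sum(struct.unpack(f"{len(chunk) // 8}d", chunk))
    return total

# usage: write 10,000 doubles to a scratch file, then sum chunk by chunk
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("10000d", *range(10000)))
result = out_of_core_sum(tmp.name)
os.remove(tmp.name)
print(result)                       # 49995000.0
```

The latency-tolerance point on the slide is that disk reads like these can be overlapped with computation on the previous chunk, hiding I/O latency the same way a cluster hides network latency.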
Throughput Turbochargers
- Recurring costs approximately 10% of MPPs'
- Rapid response to technology advances
- Just-in-place configuration, and reconfigurable
- High reliability
- Easily maintained through low-cost replacement
- Consistent, portable programming model
  – Unix, C, Fortran, message passing
- Applicable to a wide range of problems and algorithms
- Double machine-room throughput at a tenth the cost
- Provides super-linear speedup
Beowulf Project – A Brief History
- Started in late 1993 at NASA Goddard Space Flight Center
  – NASA JPL, Caltech, academic and industrial collaborators
- Sponsored by the NASA HPCC Program
- Applications: single-user science station
  – data intensive
  – low cost
- General focus:
  – single-user (dedicated) science and engineering applications
  – out-of-core computation
  – system scalability
  – Ethernet drivers for Linux
Beowulf System at JPL (Hyglac)
- 16 Pentium Pro PCs, each with a 2.5 GByte disk, 128 MBytes of memory, and a Fast Ethernet card
- Connected by a 100Base-T network through a 16-way crossbar switch
- Theoretical peak performance: 3.2 GFlop/s
- Achieved sustained performance: 1.26 GFlop/s
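The quoted 3.2 GFlop/s peak is simple per-node arithmetic. A back-of-envelope check, assuming 200 MHz Pentium Pro nodes retiring one floating-point result per cycle (the clock rate and flops-per-cycle figure are assumptions consistent with the quoted peak, not stated on the slide):

```python
# Back-of-envelope check of Hyglac's quoted 3.2 GFlop/s peak.
nodes = 16
clock_hz = 200e6          # assumed: 200 MHz Pentium Pro
flops_per_cycle = 1       # assumed: one FP result per cycle
peak = nodes * clock_hz * flops_per_cycle
print(peak / 1e9)          # 3.2 GFlop/s

sustained = 1.26e9
print(f"efficiency: {sustained / peak:.0%}")   # 39%
```

A sustained-to-peak ratio near 40% was a strong result for a commodity-network cluster of this era, which is the point of quoting both numbers.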
A 10 GFlops Beowulf
- California Institute of Technology, Center for Advanced Computing Research
- 172 Intel Pentium Pro microprocessors
Avalon architecture and price.
MIT Press: 1st printing May 1999; 2nd printing Aug. 1999
Beowulf at Work
Beowulf Scalability
Electro-dynamic FDTD Code
All timing data are in CPU seconds per simulated time step, for a global grid of size 282 × 362 × 102 distributed over 16 processors.
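The slide does not say how the grid is distributed; a common choice for FDTD codes is a 1-D slab decomposition, sketched below for the stated 282 × 362 × 102 grid on 16 processors (the decomposition axis and halo scheme are assumptions for illustration):

```python
# 1-D slab decomposition sketch for the FDTD grid on the slide.
nx, ny, nz = 282, 362, 102
nprocs = 16

# split the x axis as evenly as possible: 282 = 10*18 + 6*17
slabs = [nx // nprocs + (1 if r < nx % nprocs else 0) for r in range(nprocs)]
print(slabs[0], slabs[-1])        # 18 17

# each interior slab exchanges one y-z plane of cells with each
# neighbor per time step
halo_cells = ny * nz
print(halo_cells)                 # 36924
```

The halo-exchange size per step, not the total grid, is what the interconnect must carry, which is why the network comparisons later in the talk matter for this code.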
Network Topology Scaling: Latencies (µs)
Routed Network – Random Pattern
System Area Network Technologies
- Fast Ethernet: LAN, 100 Mbps, 100 µs
- Gigabit Ethernet: LAN/SAN, 1000 Mbps, 50 µs
- ATM: WAN/LAN, 155/620 Mbps
- Myrinet: SAN, 1250 Mbps, 20 µs
- Giganet: SAN/VIA, 1000 Mbps, 5 µs
- ServerNet II: SAN/VIA, 1000 Mbps, 10 µs
- SCI: SAN, 8000 Mbps, 5 µs
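Latency and bandwidth trade off differently with message size. A sketch of the simple cost model time = latency + size/bandwidth for two of the networks listed above, taking the quoted figures at face value (real protocol stacks add overhead on top of both terms):

```python
# Simple cost model for the networks listed above:
#   time = latency + message_size / bandwidth
# Note: N Mbps moves exactly N bits per microsecond.
def transfer_time_us(size_bytes, bandwidth_mbps, latency_us):
    return latency_us + (size_bytes * 8) / bandwidth_mbps

# Fast Ethernet (100 Mbps, 100 us) vs Myrinet (1250 Mbps, 20 us)
for size in (64, 1500, 65536):
    fe = transfer_time_us(size, 100, 100)
    myri = transfer_time_us(size, 1250, 20)
    print(f"{size:6d} B  FastE {fe:9.1f} us   Myrinet {myri:8.1f} us")
```

For small messages the fixed latency dominates, so the low-latency SANs win by an order of magnitude; only for large transfers does raw bandwidth take over. That is the crux of choosing a Beowulf interconnect.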
3Com CoreBuilder 9400 Switch and Gigabit Ethernet NIC

Lucent Cajun M770 Multifunction Switch

M2LM-SW16 16-Port Myrinet Switch with 8 SAN ports and 8 LAN ports

Dolphin Modular SCI Switch for System Area Networks

Giganet High Performance Host Adapters

Giganet High Performance Cluster Switch
The Beowulf Delta: Looking Forward 6 Years
- Clock rate: ×4
- Flops per chip: ×50 (2-4 processors/chip, 4-8-way ILP per processor)
- Number of processors: ×32
- Networking: ×32 (32-64 Gbps)
- Memory: ×10 (4 GBytes)
- Disk: ×100
- Price-performance: ×50
- System performance: 50 Tflops
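The per-chip factor of 50 roughly decomposes into the quoted clock and parallelism gains, and the 50 Tflops system figure then follows from multiplying the per-chip and processor-count factors against a year-2000 baseline. A sketch, where the midpoint choices and the ~31 GFlop/s baseline are assumptions used only to make the arithmetic concrete:

```python
# Decomposing the slide's projection factors.
clock = 4                  # x4 clock rate
procs_per_chip = 3         # assumed midpoint of "2-4 proc/chip"
ilp = 4                    # assumed low end of "4-8 way ILP/proc"
per_chip = clock * procs_per_chip * ilp
print(per_chip)            # 48, i.e. roughly the quoted x50

baseline_flops = 31.25e9   # assumed year-2000 system peak
system = baseline_flops * 50 * 32   # x50 per chip, x32 processors
print(system / 1e12)       # 50.0 Tflops
```

The compounding of the two factors (per-chip capability and processor count) is what turns a tens-of-GFlops cluster into the projected tens of Tflops.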
Million-Dollar Teraflops Beowulf?
- Today: ~$3M for a peak Tflops; by year 2002, ~$1M for a peak Tflops
- Performance efficiency is a serious challenge
- System integration
  – does vendor support of massive parallelism have to mean massive markup?
- System administration: boring but necessary
- Maintenance without vendors: how?
  – a new kind of vendor for support
- Heterogeneity will become a major aspect