Real Parallel Computers
Modular data centers
Background Information
Strohmaier, Dongarra, Meuer, Simon, "Recent trends in the marketplace of high performance computing", Parallel Computing, 2005
Short history of parallel machines
1970s: vector computers
1990s: Massively Parallel Processors (MPPs)
– Standard microprocessors, special network and I/O
2000s:
– Cluster computers (using standard PCs)
– Advanced architectures (Blue Gene)
– Comeback of vector computers (Japanese Earth Simulator)
– IBM Cell/BE
2010s:
– Multi-cores, GPUs
– Cloud data centers
Performance development and predictions
Clusters
Cluster computing:
– Standard PCs/workstations connected by a fast network
– Good price/performance ratio
– Exploit existing (idle) machines or use (new) dedicated machines
Cluster computers vs. supercomputers (MPPs):
– Processing power is similar: both are based on microprocessors
– Communication performance was the key difference
– Modern networks (Myrinet, Infiniband, 10G Ethernet) have bridged this gap
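On clusters like these, applications are typically programmed with a message-passing library such as MPI. The following is a minimal, hypothetical sketch (not code from the DAS systems): two processes exchange a buffer, and the MPI library carries the message over whatever interconnect the cluster provides (Fast Ethernet, Myrinet, Infiniband).

/* Minimal MPI message-passing sketch for a cluster: rank 0 sends a
   buffer to rank 1; the MPI library maps the transfer onto the cluster
   interconnect. Hypothetical illustration, not DAS code. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    double buf[1024] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2) {
        if (rank == 0) {
            MPI_Send(buf, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received 1024 doubles from rank 0\n");
        }
    }

    MPI_Finalize();
    return 0;
}

Compiled with mpicc and started with mpirun, the same binary runs unchanged whether the two processes land on one node or on two nodes connected by Myrinet or Infiniband; only the achieved latency and bandwidth differ.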
Overview
Cluster computers at our department:
– DAS-1: 128-node Pentium Pro / Myrinet cluster (gone)
– DAS-2: 72-node dual-Pentium-III / Myrinet-2000 cluster
– DAS-3: 85-node dual-core dual-Opteron / Myrinet-10G cluster
– DAS-4: 72-node cluster with accelerators (GPUs etc.)
Part of a wide-area system: the Distributed ASCI Supercomputer
Distributed ASCI Supercomputer (1997-2001)
DAS-2 Cluster (2002-2006)
– 72 nodes, each with 2 CPUs (144 CPUs in total)
– 1 GHz Pentium-III CPUs
– 1 GB memory per node
– 20 GB disk
– Fast Ethernet (100 Mbit/s)
– Myrinet-2000 (2 Gbit/s, crossbar)
– Operating system: Red Hat Linux
– Part of the wide-area DAS-2 system (5 clusters with 200 nodes in total)
(Node diagram: Myrinet switch, Ethernet switch)
DAS-3 Cluster (Sept. 2006)
– 85 nodes, each with 2 dual-core CPUs (340 cores in total)
– 2.4 GHz AMD Opteron CPUs (64-bit)
– 4 GB memory per node
– 250 GB disk
– Gigabit Ethernet
– Myrinet-10G (10 Gb/s, crossbar)
– Operating system: Scientific Linux
– Part of the wide-area DAS-3 system (5 clusters, 263 nodes), using the SURFnet6 optical network with 40-80 Gb/s wide-area links
DAS-3 Networks
(Network diagram)
– 85 compute nodes, each with a 1 Gb/s Ethernet link and a 10 Gb/s Myrinet link
– Nortel 5530 + 3 × 5510 Ethernet switch: 85 × 1 Gb/s Ethernet
– Myri-10G switch: 85 × 10 Gb/s Myrinet, plus a 10 Gb/s Ethernet blade with 8 × 10 Gb/s Ethernet (fiber)
– Nortel OME 6500 with DWDM blade: 80 Gb/s DWDM to SURFnet6
– Campus uplink: 1 or 10 Gb/s
– Headnode (10 TB mass storage): 10 Gb/s Myrinet and 10 Gb/s Ethernet
DAS-3 Networks (diagram: Myrinet and Nortel switches)
DAS-4 (Sept. 2010)
– 72 nodes (2 quad-core Intel Westmere Xeon E5620 CPUs, 24 GB memory, 2 TB disk per node)
– 2 fat nodes with 94 GB memory
– Infiniband network + 1 Gb/s Ethernet
– 16 NVIDIA GTX 480 graphics accelerators (GPUs)
– 2 Tesla C2050 GPUs
DAS-4 performance
Infiniband network:
– One-way latency: 1.9 microseconds
– Throughput: 22 Gbit/s
CPU performance:
– 72 nodes (576 cores): 4399.0 GFLOPS
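Numbers like the 1.9 µs latency and 22 Gbit/s throughput are typically obtained with a ping-pong microbenchmark between two nodes. The sketch below is a hypothetical example (not the benchmark actually used for DAS-4): small messages give an estimate of the one-way latency, large messages of the throughput; here a 1 MB message is used, so the printed figure approximates throughput.

/* Hypothetical MPI ping-pong sketch: run with exactly 2 processes, one
   per node. Rank 0 sends a message to rank 1, which echoes it back; half
   the average round-trip time is the one-way time. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    const int iters = 1000;
    const int bytes = 1 << 20;   /* 1 MB messages (use a few bytes to measure latency) */
    char *buf = malloc(bytes);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double one_way = (t1 - t0) / (2.0 * iters);    /* seconds */
        printf("one-way time: %.2f us, throughput: %.2f Gbit/s\n",
               one_way * 1e6, bytes * 8.0 / one_way / 1e9);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}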
Blue Gene/L Supercomputer
Blue Gene/L
– Chip: 2 processors, 2.8/5.6 GF/s, 4 MB
– Compute card: 2 chips (1×2×1), 5.6/11.2 GF/s, 1.0 GB
– Node card: 16 compute cards, 0-2 I/O cards (32 chips, 4×4×2), 90/180 GF/s, 16 GB
– Rack: 32 node cards, 2.8/5.6 TF/s, 512 GB
– System: 64 racks (64×32×32), 180/360 TF/s, 32 TB
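The peak-performance figures in this hierarchy follow from simple multiplication. The sketch below reproduces them, assuming 2.8 GF/s per processor and taking the higher number of each pair (both processors computing); the slide's figures are rounded, so the computed values differ slightly above the compute-card level.

/* Sketch checking the peak-performance scaling of the Blue Gene/L
   hierarchy, assuming 2.8 GF/s per processor (an assumption; the slide
   only gives the per-level pairs). */
#include <stdio.h>

int main(void) {
    double proc   = 2.8;            /* GF/s per processor (assumed)   */
    double chip   = 2  * proc;      /* 2 processors  ->  5.6 GF/s     */
    double card   = 2  * chip;      /* 2 chips       -> 11.2 GF/s     */
    double ncard  = 16 * card;      /* 16 cards      -> ~180 GF/s     */
    double rack   = 32 * ncard;     /* 32 node cards -> ~5.6 TF/s     */
    double system = 64 * rack;      /* 64 racks      -> ~360 TF/s     */

    printf("chip %.1f GF/s, compute card %.1f GF/s, node card %.1f GF/s\n",
           chip, card, ncard);
    printf("rack %.2f TF/s, system %.2f TF/s\n", rack / 1e3, system / 1e3);
    return 0;
}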
Blue Gene/L Networks
3-Dimensional Torus
– Interconnects all compute nodes (65,536)
– Virtual cut-through hardware routing
– 1.4 Gb/s on all 12 node links (2.1 GB/s per node)
– 1 µs latency between nearest neighbors, 5 µs to the farthest
– Communications backbone for computations
– 0.7/1.4 TB/s bisection bandwidth, 68 TB/s total bandwidth
Global Collective
– One-to-all broadcast functionality
– Reduction operations functionality
– 2.8 Gb/s of bandwidth per link
– Latency of one-way traversal: 2.5 µs
– Interconnects all compute and I/O nodes (1024)
Low-Latency Global Barrier and Interrupt
– Latency of round trip: 1.3 µs
Ethernet
– Incorporated into every node ASIC
– Active in the I/O nodes (1:8-64)
– All external communication (file I/O, control, user interaction, etc.)
Control Network
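The collective and barrier networks exist to accelerate the global operations that message-passing programs issue constantly. The hypothetical MPI sketch below shows the two operations involved: a global reduction, which can be mapped onto the collective tree, and a global barrier, which can be mapped onto the barrier/interrupt network. It is illustrative only, not Blue Gene/L system code.

/* Hypothetical MPI sketch of the operations the collective and barrier
   networks accelerate: a global sum followed by a global barrier. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank;
    double local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = (double)rank;   /* each process contributes one value */

    /* All-to-all reduction traffic maps onto the global collective tree. */
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    /* Global synchronization maps onto the barrier/interrupt network. */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %.1f\n", global);

    MPI_Finalize();
    return 0;
}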