1
BGL Photo (system)
BlueGene/L
IBM Journal of Research and Development, Vol. 49, No. 2-3.
2
Main Design Principles
Some science & engineering applications scale up to and beyond 10,000 parallel processes.
Improve computing capability while holding total system cost.
Cost/performance trade-offs considering the end use:
– Applications <> Architecture <> Packaging
Reduce complexity and size:
– ~25 kW/rack is the maximum for air cooling in a standard room.
– Need to improve the performance/power ratio.
– The 700 MHz PowerPC 440 for ASIC has excellent FLOP/Watt.
Maximize integration:
– On chip: ASIC with everything except main memory.
– Off chip: maximize the number of nodes in a rack.
Large systems require excellent reliability, availability, and serviceability (RAS).
3
Physical Layout of BG/L
4
The Compute Chip
System-on-a-chip (SoC): 1 ASIC
– 2 PowerPC processors
– L1 and L2 caches
– 4 MB embedded DRAM
– DDR DRAM interface and DMA controller
– Network connectivity hardware
– Control/monitoring equipment (JTAG)
5
Compute and Node Cards
6
Node Architecture
IBM PowerPC embedded CMOS processors, embedded DRAM, and system-on-a-chip techniques are used.
11.1 mm square die, allowing for a very high density of processing.
The ASIC uses IBM CMOS CU-11 0.13 micron technology.
The 700 MHz processor speed is close to memory speed.
Two processors per node.
The second processor is intended primarily for handling message-passing operations.
7
Midplane and Rack
1 rack holds 1024 nodes
Nodes optimized for low power
ASIC based on SoC technology
– Outperforms commodity clusters while saving on power
– Aggressive packaging of processor, memory and interconnect
– Power efficient & space efficient
– Allows for latencies and bandwidths that are significantly better than those for nodes typically used in ASC-scale supercomputers
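As a rough illustration of the power constraint, the two figures quoted on these slides (about 25 kW per air-cooled rack and 1024 nodes per rack) imply an upper bound on per-node power. The sketch below only performs that division. It assumes the entire rack budget goes to compute nodes, which ignores fans, power conversion, and I/O, so the real per-node budget would be lower. The constant and function names are illustrative, not from the slides.

```python
# Figures taken from the slides; names are illustrative.
RACK_POWER_LIMIT_W = 25_000   # ~25 kW/rack air-cooling limit
NODES_PER_RACK = 1024         # 1 rack holds 1024 nodes


def per_node_power_budget(rack_limit_w=RACK_POWER_LIMIT_W, nodes=NODES_PER_RACK):
    """Upper bound on watts per node.

    Assumption: all rack power is available to compute nodes,
    which overstates the real budget.
    """
    return rack_limit_w / nodes


if __name__ == "__main__":
    print(f"{per_node_power_budget():.1f} W per node (upper bound)")
```

Under these assumptions the bound works out to roughly 24 W per node, which is why a low-power embedded core is attractive at this packaging density.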
8
The Torus Network
64 x 32 x 32
Each compute node is connected to its six neighbors: x+, x-, y+, y-, z+, z-
Compute card is 1x2x1
Node card is 4x4x2
– 16 compute cards in 4x2x2 arrangement
Midplane is 8x8x8
– 16 node cards in 2x2x4 arrangement
Each unidirectional link is 1.4 Gb/s, or 175 MB/s.
Each node can send and receive at 1.05 GB/s.
Supports cut-through routing, along with both deterministic and adaptive routing.
Variable-sized packets of 32, 64, 96 … 256 bytes
Guarantees reliable delivery
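The torus figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not BG/L system code: it computes the six wraparound neighbors of a node in the 64 x 32 x 32 torus, the per-node bandwidth implied by six 175 MB/s links, and how the card and midplane shapes nest. The function and constant names are assumptions made for illustration.

```python
DIMS = (64, 32, 32)        # full-system torus shape, from the slide
LINK_MB_PER_S = 175        # 1.4 Gb/s per unidirectional link = 175 MB/s

# Packaging units expressed as torus sub-blocks, from the slide.
COMPUTE_CARD = (1, 2, 1)
NODE_CARD = (4, 4, 2)
MIDPLANE = (8, 8, 8)


def torus_neighbors(node, dims=DIMS):
    """Return the six neighbors in the order x+, x-, y+, y-, z+, z-.

    The modulo gives the wraparound that distinguishes a torus from a mesh.
    """
    neighbors = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def volume(shape):
    """Number of nodes in a block of the given (x, y, z) shape."""
    x, y, z = shape
    return x * y * z


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0)))
    print(6 * LINK_MB_PER_S / 1000, "GB/s per direction per node")
    print(volume(NODE_CARD) // volume(COMPUTE_CARD), "compute cards per node card")
    print(volume(MIDPLANE) // volume(NODE_CARD), "node cards per midplane")
    print(volume(DIMS) // volume(MIDPLANE), "midplanes in the full system")
```

The counts agree with the slide: 16 compute cards per node card and 16 node cards per midplane, and six links at 175 MB/s give the quoted 1.05 GB/s in each direction. A corner node such as (0, 0, 0) has (63, 0, 0) as its x- neighbor because of the wraparound.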
9
BG/L System Software
System software supports efficient execution of parallel applications
Compiler support for MPI-based C, C++, Fortran
Front-end nodes are commodity PCs running Linux
I/O nodes run a customized Linux kernel
Compute nodes: extremely lightweight custom kernel
– Space sharing, single thread per processor (dual-threaded per node)
– Flat address space, no paging
– Physical resources are memory-mapped
Service node is a single multiprocessor machine running a custom OS
10
Space Sharing
BG/L system can be partitioned into electrically isolated sets of nodes (power of 2)
Single-user, reservation-based for each partition
Faulty hardware is electrically isolated, so other nodes can continue to run in the presence of component failures.
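To make the power-of-2 constraint concrete, here is a small sketch of how a requested node count might map to a partition size. The round-up policy and both function names are hypothetical: the slides state only that partitions are power-of-2 sets of nodes, not how the real BG/L scheduler sizes them or what the minimum partition is.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (1, 2, 4, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def smallest_partition_for(requested_nodes: int) -> int:
    """Hypothetical policy: round a request up to the next power of two.

    This is an illustrative assumption, not the documented BG/L allocator.
    """
    if requested_nodes < 1:
        raise ValueError("requested_nodes must be at least 1")
    return 1 << (requested_nodes - 1).bit_length()


if __name__ == "__main__":
    for request in (1, 500, 512, 600):
        print(request, "->", smallest_partition_for(request))
```

Under this assumed policy a request for 600 nodes would occupy a 1024-node partition, so single-user reservations can leave nodes idle inside a partition.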