© 2007 IBM Corporation IBM Global Engineering Solutions IBM Blue Gene/P Blue Gene/P System Overview - Hardware

IBM Blue Gene/P System Administration Intro (BG/P and BG/L): Blue Gene/P Racks
Chip: 4 processors, 13.6 GF/s, 8 MB EDRAM
Compute Card: 1 chip, 20 DRAMs, 13.6 GF/s, 2.0 GB DDR2
Node Card: 32 compute cards (32 chips, 4x4x2), 0-2 I/O cards, 435 GF/s, 64 GB
Rack: 32 node cards, cabled 8x8x16, 13.9 TF/s, 2 TB
System: 72 racks (72x32x32), 1 PF/s, 144 TB

IBM Blue Gene/P System Administration Intro (BG/P and BG/L): Blue Gene/L Racks
Node: dual processor
Processor Card: 2 chips (1x2x1), 5.6/11.2 GF/s, 1.0 GB
Node Card: 16 compute cards (32 chips, 4x4x2), 0-2 I/O cards, 90/180 GF/s, 16 GB
Rack: 32 node cards, cabled 8x8x16, 2.8/5.6 TF/s, 512 GB
System: 64 racks (64x32x32), 180/360 TF/s, 32 TB
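
The peak-performance and memory totals on the two packaging slides above are simple products of the per-chip (or per-card) figures and the counts at each level. A minimal roll-up sketch in C, using only the slide values; the printed results reproduce the slide totals up to the rounding used on the slides:

```c
/* Roll up Blue Gene/P and Blue Gene/L peak performance and memory from the
 * packaging hierarchy quoted on the slides. Illustrative sketch only; the
 * constants are the per-chip / per-card values from the slides. */
#include <stdio.h>

int main(void)
{
    /* Blue Gene/P: 13.6 GF/s and 2.0 GB per compute chip/card */
    double bgp_chip_gflops = 13.6, bgp_chip_gb = 2.0;
    int bgp_chips_per_node_card = 32;   /* 4x4x2 */
    int bgp_node_cards_per_rack = 32;
    int bgp_racks = 72;

    double bgp_node_card_gflops = bgp_chip_gflops * bgp_chips_per_node_card;            /* ~435 GF/s */
    double bgp_rack_tflops      = bgp_node_card_gflops * bgp_node_cards_per_rack / 1e3; /* ~13.9 TF/s */
    double bgp_system_pflops    = bgp_rack_tflops * bgp_racks / 1e3;                    /* ~1 PF/s   */
    double bgp_system_tb        = bgp_chip_gb * bgp_chips_per_node_card
                                  * bgp_node_cards_per_rack * bgp_racks / 1024.0;       /* 144 TB    */

    printf("BG/P: node card %.0f GF/s, rack %.1f TF/s, system %.2f PF/s, %.0f TB\n",
           bgp_node_card_gflops, bgp_rack_tflops, bgp_system_pflops, bgp_system_tb);

    /* Blue Gene/L: 5.6/11.2 GF/s and 1.0 GB per 2-chip compute (processor) card */
    double bgl_card_gflops = 11.2, bgl_card_gb = 1.0;   /* virtual-node-mode peak */
    int bgl_cards_per_node_card = 16;
    int bgl_node_cards_per_rack = 32;
    int bgl_racks = 64;

    double bgl_rack_tflops   = bgl_card_gflops * bgl_cards_per_node_card
                               * bgl_node_cards_per_rack / 1e3;
    double bgl_system_tflops = bgl_rack_tflops * bgl_racks;
    double bgl_system_tb     = bgl_card_gb * bgl_cards_per_node_card
                               * bgl_node_cards_per_rack * bgl_racks / 1024.0;          /* 32 TB */

    printf("BG/L: rack %.1f TF/s, system %.0f TF/s, %.0f TB\n",
           bgl_rack_tflops, bgl_system_tflops, bgl_system_tb);
    return 0;
}
```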

IBM Blue Gene/P System Administration Comparison of BG/L and BG/P nodes
Feature                   | Blue Gene/L           | Blue Gene/P
Cores per node            | 2                     | 4
Core clock speed          | 700 MHz               | 850 MHz
Cache coherency           | Software managed      | SMP
Private L1 cache          | 32 KB per core        | 32 KB per core
Private L2 cache          | 14 stream prefetching | 14 stream prefetching
Shared L3 cache           | 4 MB                  | 8 MB
Physical memory per node  | 512 MB - 1 GB         | 2 GB
Main memory bandwidth     | 5.6 GB/s              | 13.6 GB/s
Peak performance          | 5.6 GFlop/s per node  | 13.6 GFlop/s per node

IBM Blue Gene/P System Administration Comparison of BG/L and BG/P nodes (2)
Feature                                   | Blue Gene/L                                | Blue Gene/P
Torus network:
  Bandwidth                               | 2.1 GB/s                                   | 5.1 GB/s
  Hardware latency (nearest neighbor)     | 200 ns (32B packet), 1.6 μs (256B packet)  | 100 ns (32B packet), 800 ns (256B packet)
Global collective network:
  Bandwidth                               | 700 MB/s                                   | 1.7 GB/s
  Hardware latency (round trip, worst case) | 5.0 μs                                   | 3.0 μs
Full system (72-rack comparison):
  Peak performance                        | 410 TFlop/s                                | ~1 PFlop/s
  Power                                   | 1.7 MW                                     | ~2.3 MW
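
A few balance figures can be derived from the two comparison tables above, for example memory and torus bandwidth per peak flop and full-system performance per watt. A small illustrative calculation, taking "~1 PFlop/s" as 1000 TFlop/s and "~2.3 MW" as 2.3 MW; all other inputs are the table values:

```c
/* Derived balance metrics from the BG/L vs. BG/P comparison tables above
 * (illustrative sketch; inputs are the slide values). */
#include <stdio.h>

int main(void)
{
    struct {
        const char *name;
        double gflops;      /* peak GFlop/s per node          */
        double mem_gbs;     /* main memory bandwidth, GB/s    */
        double torus_gbs;   /* torus bandwidth per node, GB/s */
        double sys_tflops;  /* full-system peak, TFlop/s      */
        double sys_mw;      /* full-system power, MW          */
    } sys[] = {
        { "Blue Gene/L",  5.6,  5.6, 2.1,  410.0, 1.7 },
        { "Blue Gene/P", 13.6, 13.6, 5.1, 1000.0, 2.3 },
    };

    for (int i = 0; i < 2; i++) {
        double mem_bytes_per_flop   = sys[i].mem_gbs   / sys[i].gflops;
        double torus_bytes_per_flop = sys[i].torus_gbs / sys[i].gflops;
        double mflops_per_watt      = (sys[i].sys_tflops * 1e6) / (sys[i].sys_mw * 1e6);
        printf("%s: %.2f B/flop memory, %.2f B/flop torus, ~%.0f MFlop/s per watt\n",
               sys[i].name, mem_bytes_per_flop, torus_bytes_per_flop, mflops_per_watt);
    }
    return 0;
}
```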

IBM Blue Gene/P System Administration Blue Gene/P Rack

IBM Blue Gene/P System Administration Blue Gene/L Rack

IBM Blue Gene/P System Administration BG/P and BG/L Cooling

IBM Blue Gene/P System Administration BG/P Node Card
32 compute nodes
Optional I/O card (one of 2 possible) with 10 Gb optical link
Local DC-DC regulators (6 required, 8 with redundancy)

IBM Blue Gene/P System Administration Blue Gene/L Hardware Node Card

IBM Blue Gene/P System Administration BG/P Compute Card

IBM Blue Gene/P System Administration Three types of compute/IO cards
1.07 volts: part number 44V3572
1.13 volts: part number 44V3575
1.19 volts: part number 44V3578
Voltage variants cannot be mixed within a node card.
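
The no-mixing rule amounts to a simple check: every compute/IO card plugged into a node card must carry the same voltage part number. A hypothetical sketch of such a check; the card-list representation is invented for illustration and is not part of any Blue Gene tool:

```c
/* Sketch: verify that all compute/IO cards populated in one node card use
 * the same voltage variant (part numbers are the three listed above). */
#include <stdio.h>
#include <string.h>

/* Returns 1 if every part number matches the first one, 0 otherwise. */
static int node_card_voltage_ok(const char *parts[], int n)
{
    for (int i = 1; i < n; i++)
        if (strcmp(parts[i], parts[0]) != 0)
            return 0;   /* mixed voltage variants: not allowed */
    return 1;
}

int main(void)
{
    const char *cards[] = { "44V3572", "44V3572", "44V3575" }; /* 1.07 V, 1.07 V, 1.13 V */
    printf("node card population %s\n",
           node_card_voltage_ok(cards, 3) ? "OK" : "mixes voltage variants");
    return 0;
}
```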

IBM Blue Gene/P System Administration Blue Gene/L Processor Card

IBM Blue Gene/P System Administration Blue Gene/P Hardware Link Cards

IBM Blue Gene/P System Administration Blue Gene/L Hardware Link Cards

IBM Blue Gene/P System Administration Blue Gene/P Hardware Service Card

IBM Blue Gene/P System Administration Blue Gene/L Hardware Service Card

IBM Blue Gene/P System Administration Naming conventions (1)
Racks: Rxx (first x = rack row 0-F, second x = rack column 0-F)
Bulk power supply: Rxx-B
Power modules: Rxx-B-Px (power module 0-7; 0-3 left to right facing the front, 4-7 left to right facing the rear)
Power cable: Rxx-B-C
Midplanes: Rxx-Mx (midplane 0-1; 0 = bottom, 1 = top)
Clock cards: Rxx-K
Fan assemblies: Rxx-Mx-Ax (fan assembly 0-9; 0 = bottom front, 4 = top front, 5 = bottom rear, 9 = top rear)
Fans: Rxx-Mx-Ax-Fx (fan 0-2; 0 = tailstock, 2 = midplane)

IBM Blue Gene/P System Administration Naming conventions (2)
Service cards: Rxx-Mx-S (note: the master service card for a rack is always Rxx-M0-S)
Link cards: Rxx-Mx-Lx (link card 0-3; 0 = bottom front, 1 = top front, 2 = bottom rear, 3 = top rear)
Node cards: Rxx-Mx-Nxx (node card 00-15; 00 = bottom front, 07 = top front, 08 = bottom rear, 15 = top rear)
Compute cards: Rxx-Mx-Nxx-Jxx (compute card J04 through J35)
I/O cards: Rxx-Mx-Nxx-Jxx (I/O card J00-J01)
In all names, Mx is the midplane (0 = bottom, 1 = top) and Rxx gives the rack row (0-F) and rack column (0-F).
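
Because the location strings are fully regular (hexadecimal rack row and column, then midplane, node card, and card position), they are easy to compose and parse programmatically. A small illustrative sketch; the helper functions are hypothetical and not part of the Blue Gene control system:

```c
/* Sketch: compose and parse Blue Gene/P location strings following the
 * naming convention above (Rxx-Mx-Nxx-Jxx). */
#include <stdio.h>

/* Build a compute or I/O card location, e.g. row 2, column 3, midplane 1,
 * node card 07, card position J12  ->  "R23-M1-N07-J12". */
static void card_location(char *buf, size_t len,
                          unsigned row, unsigned col,
                          int midplane, int node_card, int jxx)
{
    /* Rack row and column are single hexadecimal digits (0-F). */
    snprintf(buf, len, "R%X%X-M%d-N%02d-J%02d", row, col, midplane, node_card, jxx);
}

int main(void)
{
    char loc[32];
    card_location(loc, sizeof loc, 2, 3, 1, 7, 12);
    printf("%s\n", loc);                               /* R23-M1-N07-J12 */

    unsigned row, col;
    int midplane, node_card, jxx;
    if (sscanf(loc, "R%1x%1x-M%1d-N%2d-J%2d",
               &row, &col, &midplane, &node_card, &jxx) == 5)
        printf("rack row %u, column %u, midplane %d, node card N%02d, card J%02d\n",
               row, col, midplane, node_card, jxx);
    return 0;
}
```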

IBM Blue Gene/P System Administration Rack Naming Convention
Note the direction of the slant on the covers and the service card side. Although the illustration shows racks numbered 00 through 77, that is not the largest possible configuration; the largest configuration is 256 racks, numbered 00 through FF.

IBM Blue Gene/P System Administration Torus X-Y-Z (orientation of the torus axes)

IBM Blue Gene/P System Administration Node, Link, Service Card Names
Bottom midplane (M0): link cards L0 and L1, node cards N00-N07, service card S.
Top midplane (M1): link cards L2 and L3, node cards N08-N15.

IBM Blue Gene/P System Administration Node Card
Card positions on the node card: I/O cards J00-J01, compute cards J04-J35, Ethernet ports EN0 and EN1.

IBM Blue Gene/P System Administration Service Card
Connections and indicators: control network, clock input, rack row indicator (0-F), rack column indicator (0-F).

IBM Blue Gene/P System Administration Link Card
Link chips U00-U05 and link cable connectors J00-J15.

IBM Blue Gene/P System Administration Clock Card
Clock outputs 0-9, clock input, and master/slave configuration.

IBM Blue Gene/P System Administration Networks (BG/P and BG/L)
The Blue Gene core is reached through two networks: the service network, which connects it to the service node, and the functional network, which connects it to the front-end nodes and the file system. Users and administrators reach the front-end and service nodes over the site network.

IBM Blue Gene/P System Administration Blue Gene/P Interconnection Networks
3-Dimensional Torus
- Interconnects all compute nodes; communications backbone for computations
- Adaptive cut-through hardware routing
- 3.4 Gb/s on all 12 node links (5.1 GB/s per node)
- 0.5 µs latency between nearest neighbors, 5 µs to the farthest; MPI: 3 µs latency for one hop, 10 µs to the farthest
- 1.7/2.6 TB/s bisection bandwidth, 188 TB/s total bandwidth (72K-node machine)
Collective Network
- Interconnects all compute and I/O nodes (1152)
- One-to-all broadcast functionality
- Reduction operations functionality
- 6.8 Gb/s of bandwidth per link
- Latency of one-way tree traversal 2 µs, MPI 5 µs
- ~62 TB/s total binary tree bandwidth (72K-node machine)
Low-Latency Global Barrier and Interrupt
- Latency of one way to reach all 72K nodes 0.65 µs, MPI 1.6 µs
Other Networks
- 10 Gb functional Ethernet: I/O nodes only
- 1 Gb private control Ethernet: provides JTAG access to the hardware; accessible only from the Service Node system
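
The MPI latencies quoted above are the kind of figures a ping-pong microbenchmark reports: half of the averaged round-trip time for a small message between two ranks. A minimal, generic MPI sketch of that measurement (plain MPI in C, nothing Blue Gene-specific; run with exactly two ranks):

```c
/* Minimal MPI ping-pong latency sketch (generic MPI, run with two ranks).
 * Half of the averaged round-trip time approximates the one-way MPI latency. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    enum { MSG_SIZE = 32, ITERS = 10000 };  /* small message; the comparison slide quotes 32B packets */
    char msg[MSG_SIZE] = {0};
    MPI_Status st;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(msg, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(msg, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(msg, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(msg, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("one-way latency: %.2f us for %d-byte messages\n",
               (t1 - t0) / ITERS / 2.0 * 1e6, MSG_SIZE);

    MPI_Finalize();
    return 0;
}
```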

IBM Blue Gene/P System Administration Blue Gene/L Networks
3-Dimensional Torus
- Interconnects all compute nodes (65,536)
- Virtual cut-through hardware routing
- 1.4 Gb/s on all 12 node links (2.1 GB/s per node)
- Communications backbone for computations
- 0.7/1.4 TB/s bisection bandwidth, 67 TB/s total bandwidth
Collective Network
- Interconnects all compute and I/O nodes (1024)
- One-to-all broadcast functionality
- Reduction operations functionality
- 2.8 Gb/s of bandwidth per link; latency of tree traversal 2.5 µs
- ~23 TB/s total binary tree bandwidth (64K-node machine)
Ethernet IP
- Incorporated into every node ASIC
- Active in the I/O nodes only (1:64)
- Carries all external communication (file I/O, control, user interaction, etc.)
Control System Network
- Boot, monitoring, and diagnostics
Global Barrier and Interrupt
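
On both machines the torus wraps around in each dimension, so the shortest path along an axis is the smaller of the forward and backward distances, and the hop count between two nodes is the sum of the per-axis minima. A small illustrative calculation for the torus dimensions quoted on these slides; it computes hop counts only and says nothing about how the hardware routes packets:

```c
/* Hop count between two nodes of a wrap-around 3-D torus (illustrative). */
#include <stdio.h>
#include <stdlib.h>

/* Shortest distance along one axis of length n when the ends wrap around. */
static int axis_hops(int a, int b, int n)
{
    int d = abs(a - b);
    return d < n - d ? d : n - d;
}

static int torus_hops(const int p[3], const int q[3], const int dim[3])
{
    return axis_hops(p[0], q[0], dim[0])
         + axis_hops(p[1], q[1], dim[1])
         + axis_hops(p[2], q[2], dim[2]);
}

int main(void)
{
    int bgl_dim[3]  = {64, 32, 32};   /* 64-rack BG/L torus (slide above) */
    int origin[3]   = {0, 0, 0};
    int farthest[3] = {32, 16, 16};   /* half-way around every axis       */

    /* 32 + 16 + 16 = 64 hops; the 72x32x32 BG/P torus gives 36 + 16 + 16 = 68. */
    printf("farthest node is %d hops from the origin\n",
           torus_hops(origin, farthest, bgl_dim));
    return 0;
}
```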