Brent Gorda LBNL – SOS7 3/5/03
Planned Machines: BluePlanet
SOS7, March 5, 2003
Brent Gorda
Future Technologies Group
Lawrence Berkeley National Lab
National Energy Research Scientific Computing Center (NERSC)
– ~2000 users in ~400 projects
– Serves all disciplines of the DOE Office of Science
– Focus on large-scale computing
The Divergence Problem
The requirements of high-performance computing for science and engineering and the requirements of the commercial market are diverging.
The commercial cluster-of-SMPs approach is no longer sufficient to provide the highest level of performance:
– Lack of memory bandwidth, high memory latency
– Lack of interconnect bandwidth, high interconnect latency
– The commodity building block was the microprocessor but is now the entire server (SMP)!
The U.S. computer industry is driven by commercial applications.
The decision for NERSC-3E can be seen as an indication of the divergence problem: Power 4 had a low SSP number (a scaling problem).
Cooperative Development – NERSC/ANL/IBM Workshop
Workshops:
– Sept 2002: defining the Blue Planet architecture
– Nov: IBM gathered input for Power 6
White Paper: "Creating Science-Driven Computer Architecture: A New Path to Scientific Leadership"
What is unique…
… in the design process:
– Awareness of and sensitivity to the vendor's mainstream products
– Get back "inside the box"
… in the collaboration:
– Application teams drive the design process
… in the structure and function of the system:
– Single-core CPU design, not a previously planned product
– Additional Federation stage: 4x scale-up in links
– Virtual Vector Architecture (ViVA)
Applications to Drive the Design of New Architectures
– Combustion Simulation and Adaptive Methods
– Computational Astrophysics
– Nanoscience
– Climate Modeling
– Accelerator Modeling
– Lattice Quantum Chromodynamics
– Quantum Monte Carlo Calculations of Nuclei
– High Energy / Elementary Particle Physics
– Biochemical and Biosystems Simulations
– Advanced Simulations of Plasma Microturbulence
– Computational Environmental Molecular Science
Application needs vary widely; however, most use MPI.
Interaction based on Scientific Apps
Application areas: AMR, Coupled Climate, Astrophysics, Nanoscience
Codes: MADCAP, Cactus, FLAPW, LSMS
Sensitive to global bisection: XXXX
Sensitive to processor-to-memory latency: XXX
Sensitive to network latency: XXXXX
Sensitive to point-to-point communications: XXX
Sensitive to OS interference in frequent barriers: XX
Benefits from deep CPU pipelining: XXXXXX
Benefits from large SMP nodes: X
Slide courtesy of Peter Ungaro, IBM
What prior experience guided this choice?
– Power 4 memory bandwidth does not support 32 CPUs
– Power 4 memory latency is only 29% longer than Power 3
What prior experience guided this choice?
The majority of applications achieve a low percentage of peak
– Adding CPUs without adding bandwidth makes no sense for HPC applications
The majority of applications are data starved: memory and interconnect
Many of the applications are regular and may be able to take advantage of ViVA
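The point about adding CPUs without adding bandwidth can be illustrated with a simple bandwidth-bound estimate. The sketch below is not from the talk: the function names, the per-CPU peak, the node memory bandwidth, and the flops-per-byte figure are all hypothetical values chosen only to show the shape of the argument, and the model (attainable rate is the lesser of compute peak and what memory can feed) is a deliberate simplification.

```python
def attainable_gflops(n_cpus, peak_per_cpu_gflops, node_bw_gbs, flops_per_byte):
    """Lesser of the node's compute peak and the rate memory bandwidth can sustain."""
    compute_peak = n_cpus * peak_per_cpu_gflops
    memory_bound = node_bw_gbs * flops_per_byte
    return min(compute_peak, memory_bound)


def percent_of_peak(n_cpus, peak_per_cpu_gflops, node_bw_gbs, flops_per_byte):
    """Attainable rate expressed as a percentage of the node's nominal peak."""
    peak = n_cpus * peak_per_cpu_gflops
    attained = attainable_gflops(n_cpus, peak_per_cpu_gflops, node_bw_gbs, flops_per_byte)
    return 100.0 * attained / peak


# Hypothetical node: 5 GFLOP/s per CPU, 40 GB/s shared memory bandwidth,
# and a data-starved code doing 0.25 flops per byte moved.
PEAK_PER_CPU = 5.0
NODE_BW = 40.0
INTENSITY = 0.25

for n in (8, 16, 32):
    rate = attainable_gflops(n, PEAK_PER_CPU, NODE_BW, INTENSITY)
    pct = percent_of_peak(n, PEAK_PER_CPU, NODE_BW, INTENSITY)
    print(f"{n:2d} CPUs: {rate:.1f} GFLOP/s attainable, {pct:.2f}% of peak")
```

Under these assumed numbers the attainable rate stays flat as CPUs are added while the percentage of peak falls, which is the pattern the slide describes for data-starved applications.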
Other than your own machine, for your needs what are the best and worst machines?
– Difficult to push the envelope as a production facility
– Balance would appear to be the most important technical feature for a general-purpose system
– Stability is the most important production feature
– Both involve more than just hardware
– The best system is one that lives up to (sometimes slightly reduced) expectations