One step ahead
The Challenges of Architectures that Grow to Petascale and can be Sustained Economically
Steve Reinhardt
Principal Engineer, SGI
spr at sgi.com
SGI’s systems are evolving to enable ultrascale versions of today’s applications and enable a new type of computational science, while remaining economically sustainable.
Agenda
Besides Architecture…
Enabling Ultra-scale Applications
Enabling New Computational Science
Sustaining Economically
Besides Hardware Architecture...
Efficient execution environment
RAS
OS architecture
  – Linux scaled aggressively, with multiples in ultrascale configurations
Robust scheduling
RAS
Packaging density / heat dissipation
RAS
Agenda
Besides Architecture…
Enabling Ultra-scale Applications
Enabling New Computational Science
Sustaining Economically
Local Performance: Needed Flexibility of Memory Access
[Figure: price performance vs. absolute performance. Labels: driven by focus of engineering team; driven by cost of large engineering team; driven by parts replication cost.]
Note: Original (Jan 2003) models used for both X1 and Altix
Ideal Machine (Technical/Economic Balance)
[Figure: price performance vs. absolute performance. Labels: high, cost-effective cache bandwidth of mass-market parts; highest cost-effective memory bandwidth; design focus on gather/scatter.]
Note: For O(100KP) petascale machines, the value of an O(5X) processor performance advantage is less than today
Local Performance: Multi-Paradigm
[Figure: axes are data locality (low to high) and compute intensity (low to high). Regions: vector-like, PIM-like, scalar, application-specific.]
Ultraviolet: Concept Architecture
[Figure: components are MPU, GPU, APU, I/O, and the UV Petascale GAM.]
UV Petascale GAM: Globally Addressable. Low Latency. High Bandwidth. O(100K) Ports.
Global Performance
Communications
  – grids becoming more dynamic -> low latency essential
  – processor counts growing -> low latency essential
  – low latency -> global address space
  – in clock periods, remote memory getting further away
  – bandwidth-conserving operations needed
  – high absolute link performance
Synchronization
  – current mechanisms insufficient for ultrascale
  – optimizations will help, but maybe not enough
  – new mechanisms needed
Dynamic load balancing
  – mechanisms need to mature, and interfaces become standard
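The point that remote memory is "getting further away" in clock periods can be illustrated with a small calculation. This is a minimal sketch, not taken from the talk: the function name and all numbers (a fixed 1000 ns remote access, the list of clock rates) are hypothetical and chosen only to show the trend.

```python
def remote_latency_in_clocks(latency_ns: float, clock_ghz: float) -> float:
    """Convert a remote-memory latency in nanoseconds to processor clock periods."""
    # 1 GHz is one clock period per nanosecond, so clocks = ns * GHz.
    return latency_ns * clock_ghz


# Hypothetical value: the remote access time stays fixed while clocks get faster.
FIXED_REMOTE_LATENCY_NS = 1000.0

for clock_ghz in (0.5, 1.0, 1.5, 3.0):
    clocks = remote_latency_in_clocks(FIXED_REMOTE_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz:.1f} GHz clock: {clocks:.0f} clock periods per remote access")
```

Under these assumptions, the same physical latency costs proportionally more clock periods as the processor clock rises, which is one way to read the emphasis on low latency and bandwidth-conserving operations.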
Challenges
Clear virtual machine and performance models for these new mechanisms
Compilers/tools that exploit these mechanisms mostly automatically and accept user hints
Appropriate performance balance for typical uses
Need to gain successful experience at very large scale (10-30KP) before going to ultrascale (100KP)
Agenda
Besides Architecture…
Enabling Ultra-scale Applications
Enabling New Computational Science
Sustaining Economically
Scientific Process
Observe existing data for patterns
Hypothesize models that match the data
Test those models to understand accuracy (i.e., add new data)
“First Principles” computing; most of current HPC
“Dynamic Network Inference” computing**
Query: When we know what we want and how to ask for it
Inference: When we know only somewhat what we want
Exploration: When we know little, but anticipate more
“planned serendipity”
**Believed first coined by Scott Studham et al., PNNL
Example: Post-Genomic Biology
<10% of the human genome is known to code for proteins
Selective pressure generally removes unused genetic material
What is the other 90% of the genome doing?
  – Have the raw data (genome)
  – Need to add other types of data (e.g., protein association info)
  – Multi-petabytes of data all told
  – Probably not a purely computational problem
Differences from First Principles
Data access patterns nearly impossible to predict a priori -> low latency / global address space
New tools for data exploration needed
  – need to automatically search for new, perhaps vaguely defined, patterns (that foster new theory)
  – highly interactive/coupled with the scientist’s thought process
  – but beware difficulty of launching new languages
Contents of memory much more valuable
  – RAS
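One way to see why unpredictable access patterns push toward low latency is to compare a chain of dependent accesses with a predictable streaming pass over the same amount of data. This is a rough sketch under stated assumptions, not a model from the talk: the function names and every number (access count, 1 microsecond latency, 1 GB/s bandwidth, 8-byte items) are hypothetical.

```python
def latency_bound_time_s(n_accesses: int, latency_s: float) -> float:
    """Time for accesses where each address depends on the previous result."""
    # Dependent accesses cannot be overlapped or prefetched, so latencies add up.
    return n_accesses * latency_s


def bandwidth_bound_time_s(n_accesses: int, bytes_per_access: int,
                           bandwidth_bytes_per_s: float) -> float:
    """Time for a predictable streaming pass that is limited only by bandwidth."""
    return n_accesses * bytes_per_access / bandwidth_bytes_per_s


# Hypothetical workload and machine parameters.
N_ACCESSES = 10**9
chained = latency_bound_time_s(N_ACCESSES, latency_s=1e-6)
streamed = bandwidth_bound_time_s(N_ACCESSES, bytes_per_access=8,
                                  bandwidth_bytes_per_s=1e9)

print(f"dependent accesses: {chained:.0f} s")
print(f"streaming accesses: {streamed:.0f} s")
print(f"ratio: {chained / streamed:.0f}x")
```

In this simplified picture, bandwidth does not help the dependent case at all; only a lower per-access latency shortens it.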
“and now for something completely different”: Star-P
Developed by Alan Edelman and colleagues at MIT, etc.
Simple extensions to the MATLAB® language
  – data parallel, MIMD, and mixed
Builds on the existing base of MATLAB programs
  – broadening the market for HPC systems
New back-end server implemented for parallel execution
Preserves key MATLAB strengths:
  – very high level language
  – interactivity / exploration
  – easy visualization
“Put the fun back in supercomputing”
Agenda
Besides Architecture…
Enabling Ultra-scale Applications
Enabling New Computational Science
Sustaining Economically
Key Points
SGI retains system focus
…but uses commodity components wherever practical
  – Exploit best mass-market processors (Itanium™); augment to make suitable for a wider range of HPC apps
  – Use Linux; fully reap the cost benefits of reduced support of a proprietary Unix™ variant
  – IFB cables, EFI firmware
Innovations for ultrascale must be relevant for wider markets
  – e.g., multi-paradigm computing must accelerate ISV apps
Use new technologies to broaden the market
  – e.g., Star-P
SGI’s systems are evolving to enable ultrascale versions of today’s applications and enable a new type of computational science, while remaining economically sustainable.
“There are no technology-independent lessons in computer science.”
Butler Lampson, Xerox PARC