SGI Contributions to Supercomputing by 2010
Steve Reinhardt, Director of Engineering



Supercomputing Aspects of SGI
–HPC: scalable servers and superclusters (SGI® Origin® family, SGI® Altix™ 3000 family, SGI® NUMAflex™)
–Data Access: deliver data wherever the users are; CXFS/WAN demo at SC’02; each server reads directly, at channel speeds; biggest installed configuration 0.5PB
–Visualization: “VAN”; deliver images wherever the users are; enable collaboration
NOTE: No “enterprise” references

SGI in HPC
–Memory is the unifying theme: globally addressable up to O(PB); incorporating varied processing types; latency (-> 500ns for 10KP); bandwidth (local stride-1 B:F -> 2.0+; local gather/scatter B:F; remote bisection BW B:F -> 0.3)
–Sustained performance: differentiated scaling (latency & bandwidth); better memory interface; new synchronization substrate
–Raise the level of programming abstraction: UPC/CAF (near-term); parallel Matlab (radical)

SGI Origin® family
–MIPS processors, Irix OS
–exploit low power consumption, ISA control
SGI Altix™ family
–IPF processors, Linux OS
–exploit SGI interconnect, with industry-standard ISA and rapid OS maturation

Balancing High Innovation and Profitability
[Chart: profitability (low to high) versus differentiation (low to high)]
“Death Valley”: enough differentiation to have higher cost but not enough to have high value

System / Component Differentiation
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

Ideal Differentiation
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

SGI Origin series
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

Quadrics cluster
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

IBM SP3 system
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

SGI Altix system
[Chart: system cost and system value by component: OS, interconnect, memory, processor]

STREAM Triad Results
–World-record result for a µP-based system; fourth overall
–0.8 B:F (6.4GB/s shared by 2x4GF processors)
–Single kernel; NUMA placement support in Linux
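
The bytes-per-flop (B:F) figure quoted for STREAM Triad follows directly from the numbers on the slide: memory bandwidth divided by the aggregate peak flop rate of the processors sharing it. The sketch below reproduces that arithmetic and shows the Triad kernel itself for reference. The function and constant names are illustrative and do not come from the presentation, and the per-element byte count ignores write-allocate traffic.

```python
# Peak bytes-per-flop (B:F) for one node, using the figures quoted on the slide.
MEM_BW_GB_S = 6.4   # memory bandwidth shared by the processors (from the slide)
NUM_PROCS = 2       # processors sharing that bandwidth (from the slide)
PEAK_GFLOPS = 4.0   # peak GFLOP/s per processor (from the slide)


def bytes_per_flop(bw_gb_s, procs, gflops_per_proc):
    """Return memory bandwidth divided by aggregate peak flop rate."""
    return bw_gb_s / (procs * gflops_per_proc)


def triad(b, c, q):
    """STREAM Triad kernel: a[i] = b[i] + q * c[i]."""
    return [bi + q * ci for bi, ci in zip(b, c)]


# Per element, Triad does 2 flops (one multiply, one add) and moves three
# 8-byte doubles (read b, read c, write a), ignoring write-allocate traffic.
TRIAD_BYTES_PER_ELEM = 3 * 8
TRIAD_FLOPS_PER_ELEM = 2

if __name__ == "__main__":
    machine = bytes_per_flop(MEM_BW_GB_S, NUM_PROCS, PEAK_GFLOPS)
    kernel = TRIAD_BYTES_PER_ELEM / TRIAD_FLOPS_PER_ELEM
    print(f"machine B:F = {machine:.1f}")
    print(f"Triad demands {kernel:.0f} bytes per flop")
    print(triad([1.0, 2.0], [10.0, 20.0], 0.5))
```

Because Triad demands far more bytes per flop than any such node supplies, its result is set by memory bandwidth rather than by peak flop rate, which is why it serves as a bandwidth benchmark.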

Interconnect Scaling
[Chart: MPI bandwidth versus distance (MB/s)]
Coming soon

Altix 3000 Throughput Performance
–Throughput of 4 jobs, each 8P, crash application
–System: Altix 3000, 32P, 64GB, XVM, TP900
–Individual jobs in the throughput mix are between 0.4% and 1.8% slower than the standalone case
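
The slowdown figures on the throughput slide compare each job's run time inside the four-job mix with its run time alone on the machine. A minimal sketch of that comparison follows. The timings are hypothetical values chosen only to fall in the quoted range; they are not measurements from the presentation, and the function name is illustrative.

```python
def slowdown_percent(standalone_s, in_mix_s):
    """Percent by which a job runs slower in the mix than standalone."""
    return 100.0 * (in_mix_s - standalone_s) / standalone_s


# Hypothetical wall-clock times in seconds (NOT data from the slide).
STANDALONE_S = 1000.0
IN_MIX_S = [1004.0, 1009.0, 1013.0, 1018.0]

if __name__ == "__main__":
    for job, t in enumerate(IN_MIX_S, start=1):
        print(f"job {job}: {slowdown_percent(STANDALONE_S, t):.1f}% slower")
```

A slowdown near zero for every job in the mix indicates that the jobs interfere little with one another when sharing the system.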

Summary: SGI for HPC
Long-term directions
–Memory: globally addressable, high BW, low latency
–Strong delivered performance: differentiated scaling (latency & bandwidth); better memory interface; new synchronization substrate
–Raise the level of programming abstraction: UPC/CAF (near-term); parallel Matlab (radical)
Near-term deliverables
–Altix 3000 system: distinguished performance; rapidly maturing Open Source software base