Appro Xtreme-X Supercomputers
Appro International Inc.
Company Overview :: Corporate Snapshot
Leading developer of high performance servers, clusters and supercomputers
  Established in 1991
  Headquartered in Milpitas, CA
  Sales & Service office in Houston, TX
  Hardware manufacturing in Asia
  Global presence via strategic and channel partners
  72% profitable CAGR over the past 3 years
  Deployed the second largest supercomputer in Japan
  Six top-ranked computing systems listed in the Top 500
  Delivering balanced architecture for scalable performance
Target Markets
  Financial Services
  Government / Defense
  Manufacturing
  Oil & Gas
Strategic Partnership :: Appro & NEC Join Forces in HPC Market
  NEC has a strong presence in the EMEA HPC market, with over 20 years of experience
  This is a breakthrough for Appro's entry into the EMEA HPC market
  The partnership provides sustainable competitive advantages, enabling both companies to participate in this growing market segment
  Appro and NEC look forward to working together to offer powerful, flexible and reliable solutions to EMEA HPC markets
  A formal press announcement will go out on Tuesday, 9/16/08
HPC Experience :: Past Performance History

Date     System                    Cores   System Memory  Performance
2006.9   NOAA Cluster              1,424   2.8TB          15 TFlops
2006.11  LLNL Atlas Cluster        9,216   18.4TB         44 TFlops
2007.6   LLNL Minos Cluster        6,912   13.8TB         33 TFlops
2008.2   DE Shaw Research Cluster  4,608   9.2TB          49 TFlops
HPC Experience :: Past Performance History (continued)

Date    System                      Cores    Performance  Notes
2008.4  TLCC Cluster                48,384   426 TFlops   LLNL, LANL, SNL
2008.6  Tsukuba University Cluster  10,784   95 TFlops    Quad-rail IB
2008.7  Renault F1 CFD Cluster      4,000    38 TFlops    Dual-rail IB
2008.8  LLNL Hera Cluster           13,824   120 TFlops
HPC Challenges :: Changes in the Industry
  Petascale deployments (4000+ node deployment)
  Balanced systems (CPU/Memory/Network)
  Scalability (SW & Network)
  Reliability (Real RAS: Network, Node, SW)
  Facilities (Space, Power & Cooling)
  Integrated exotics (GPU cluster)
    Solutions still being evaluated
Petascale Deployments :: Based on a Scalable Multi-Tier Architecture
  InfiniBand: computing
  10GbE: operation
  GbE: management
[Diagram. Component labels: InfiniBand Network; Operation Network (10GbE); External Firewall/Router (GbE); Compute Node; I/O Server Group (Parallel File System Servers or Bridge); Storage Controllers (FC or GbE); Global File System (GbE or 10GbE). Link labels: 4X IB; 2x GbE per node; 2x 10GbE; N GbE Mgmt.]
Petascale Deployments :: Scalable cluster management software
Appro Cluster Engine (ACE)
  3D Torus Network Topology Support
  Stateless Operation
  Job Scheduling
  Dual Rail Networks
  BIOS Synchronization
  Middleware Hooks
  Instant SW Provisioning
  Virtual Cluster Manager
  Failover & Recovery
  IB Subnet Manager
  Standard Linux OS Support
  Remote Lights-Out Management
“Appro Cluster Engine™ software turns a cluster of servers into a functional, usable, reliable and available computing system.”
Jim Ballew, CTO, Appro
Petascale Deployments :: Innovative Cooling and Density needed
[Figure: top view of the rack layout]
  Up to 30% improvement in density with greater cooling efficiency
  Delivers cold air directly to the equipment for optimum cooling efficiency
  Delivers comfortable air temperature to the room for return to the chillers
  Back-to-back rack configuration saves floor space in the datacenter and encloses the cold aisles inside the racks
  FRU service and maintenance are done from the front side of the rack cabinet
Petascale Deployments :: Path to PetaFLOP Computing
Appro Xtreme-X Supercomputer - Modular Scalable Performance

Number of Racks       1      2       8       48       96       192
Number of Processors  128    256     1,024   5,952    11,904   23,808
Number of Cores       512    1,024   4,096   23,808   47,616   95,232
Peak Performance      6TF/s  12TF/s  49TF/s  279TF/s  558TF/s  1.1PF/s
Memory Capacity       1.5TB  3TB     12TB    72TB     143TB    286TB

  Memory BW Ratio (GB/s per GF/s): 0.68
  Memory Capacity Ratio (GB per GF/s): 0.26
  IO Fabric Interconnect: Dual-Rail QDR
  IO BW Ratio (GB/s per GF/s): 0.17
  Usable Node-Node BW: 6.4GB/s
  Node-Node Latency: <2us

Performance numbers are based on 2.93GHz Intel Nehalem processors and include only compute nodes.
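The peak performance and memory capacity ratio figures on this slide can be roughly reproduced from the core counts and the 2.93GHz clock. The sketch below is illustrative only. It assumes 4 double-precision floating-point operations per core per cycle and 1TB = 1024GB, neither of which is stated on the slide, and the function names are hypothetical.

```python
# Rough check of the slide's peak-performance figures.
# Assumptions (not stated on the slide): 4 double-precision FLOPs per
# core per cycle, and 1 TB = 1024 GB for the memory capacity ratio.

CLOCK_GHZ = 2.93      # from the slide's footnote
FLOPS_PER_CYCLE = 4   # assumed
GB_PER_TB = 1024      # assumed


def peak_tflops(cores: int) -> float:
    """Theoretical peak in TF/s for a given number of cores."""
    return cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1000.0


def memory_capacity_ratio(memory_tb: float, cores: int) -> float:
    """Memory capacity in GB per GF/s of peak performance."""
    peak_gflops = peak_tflops(cores) * 1000.0
    return memory_tb * GB_PER_TB / peak_gflops


if __name__ == "__main__":
    # Rack counts and core counts taken from the table on the slide.
    for racks, cores in [(1, 512), (2, 1024), (8, 4096),
                         (48, 23808), (96, 47616), (192, 95232)]:
        print(f"{racks:>3} racks: {peak_tflops(cores):8.1f} TF/s")
    print(f"GB per GF/s (1 rack): {memory_capacity_ratio(1.5, 512):.2f}")
```

Under these assumptions the 8-rack configuration comes out at about 48 TF/s rather than the quoted 49TF/s, so the slide may be using slightly different inputs or rounding for that column.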
Xtreme-X Supercomputer :: Possible Path to PetaFLOP GPU Computing
GPU Computing Cluster: solution still being evaluated

Number of Racks       3      5      10     18    34
Number of Blades      64     128    256    512   1024
Number of GPUs        32     64     128    256   512
Peak GPU Performance  128TF  256TF  512TF  1PF   2PF
Peak CPU Performance  6TF    12TF   24TF   48TF  96TF
Max Memory Capacity   1.6TB  3.2TB  6.4TB  13TB  26TB

  Bandwidth to GPU: 6.4GB/sec
  Node Memory Bandwidth: 32GB/sec
  Max IO Bandwidth (2 QDR X4 IB): 6.4GB/sec
  Node to Node Latency: 2us
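As a quick reading aid for the GPU table, the per-unit GPU performance and the GPU-to-CPU peak ratio can be derived from the rows above. This is a minimal sketch using only the slide's numbers. Treating 1PF and 2PF as 1000TF and 2000TF is an assumption, and the function names are hypothetical.

```python
# Derived figures from the GPU cluster table on the slide.
# Assumption: "1PF" and "2PF" are taken as 1000 TF and 2000 TF.

GPU_UNITS = [32, 64, 128, 256, 512]
GPU_PEAK_TF = [128, 256, 512, 1000, 2000]
CPU_PEAK_TF = [6, 12, 24, 48, 96]


def per_unit_gpu_tf() -> list[float]:
    """Peak GPU TF divided by the listed GPU count, per configuration."""
    return [peak / units for peak, units in zip(GPU_PEAK_TF, GPU_UNITS)]


def gpu_to_cpu_ratio() -> list[float]:
    """Ratio of peak GPU performance to peak CPU performance."""
    return [gpu / cpu for gpu, cpu in zip(GPU_PEAK_TF, CPU_PEAK_TF)]


if __name__ == "__main__":
    for units, tf, ratio in zip(GPU_UNITS, per_unit_gpu_tf(), gpu_to_cpu_ratio()):
        print(f"{units:>4} GPUs: {tf:.2f} TF per GPU, GPU/CPU peak ratio {ratio:.1f}")
```

The roughly 4TF per listed GPU suggests each entry in the "Number of GPUs" row may be a multi-GPU unit, but the slide does not say so.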
Thank you
Questions?