Addressing Future HPC Demand with Multi-core Processors
Stephen S. Pawlowski, Intel Senior Fellow; GM, Architecture and Planning; CTO, Digital Enterprise Group
September 5, 2007
2 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights. Intel may make changes to specifications, product descriptions, and plans at any time, without notice. The processors, the chipsets, or the ICH components may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available upon request. All dates provided are subject to change without notice. All dates specified are target dates, are provided for planning purposes only and are subject to change. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. * Other names and brands may be claimed as the property of others. Copyright © 2007, Intel Corporation. All rights reserved.
3 Intel Is Listening… "With multi-core processors, however, we finally get to a scheme where the HEP execution profile shines. Thanks to the fact that our jobs are embarrassingly parallel (each physics event is independent of the others), we can launch as many processes (or jobs) as there are cores on a die. However, this requires that the memory size is increased to accommodate the extra processes (making the computers more expensive). As long as the memory traffic does not become a bottleneck, we see a practically linear increase in the throughput on such a system. A ROOT/PROOF analysis demo elegantly demonstrated this during the launch of the quad-core chip at CERN." Source: "Processors size up for physics at the LHC," Sverre Jarp, CERN Courier, March 28, 2007. HEP – High Energy Physics; LHC – Large Hadron Collider.
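To make the pattern concrete, here is a minimal sketch of the embarrassingly parallel scheme described above: one worker process per core, each analyzing independent events. The event data and the analyze_event function are hypothetical stand-ins, not CERN's ROOT/PROOF code.

```python
# Minimal sketch of "one process per core, independent events".
# analyze_event() and the fake event list are illustrative placeholders only.
import os
from multiprocessing import Pool

def analyze_event(event):
    # Stand-in for per-event physics analysis; each event is independent of the others.
    return sum(x * x for x in event["hits"])

if __name__ == "__main__":
    # Hypothetical events; a real HEP job would read these from experiment data.
    events = [{"id": i, "hits": [i * 0.1, i * 0.2, i * 0.3]} for i in range(10_000)]

    n_cores = os.cpu_count() or 1
    with Pool(processes=n_cores) as pool:  # as many processes as there are cores
        results = pool.map(analyze_event, events)

    print(f"analyzed {len(results)} events on {n_cores} cores")
```

Each worker holds its own working set, which mirrors the slide's point that memory size (and eventually memory traffic) becomes the limiting factor.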
4 Accelerating Multi- and Many-core Performance Through Parallelism. [Conceptual die diagram: a mix of big cores and small cores with power delivery and management, high-bandwidth memory, reconfigurable cache, a scalable fabric, and fixed-function units.] Big cores for single-thread performance; small cores for multi-thread performance.
5 Memory Bandwidth Demand: Computational Fluid Dynamics (CFD), Splash workload. [Chart: memory bandwidth (GB/s) versus computation (GFlops) for four problem sizes: 75x50x50 at 10 fps, 150x100x100 at 10 fps, 75x50x50 at 30 fps, and 150x100x100 at 30 fps; annotations: "1 TFLOP or 1 TB/s" and "0.17 Byte/Flop". Source: Intel Labs.] For comparison, DDR3 bandwidth assuming a 2133 MHz transfer rate: 34, 68, 136, 272, 544, and 1088 GB/s for 2, 4, 8, 16, 32, and 64 DDR3 channels.
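The DDR3 figures above follow from simple per-channel arithmetic; a small sketch, assuming the standard 64-bit (8-byte) DDR3 channel width:

```python
# Peak DDR3 bandwidth versus channel count, as listed on the slide.
# Assumes a DDR3-2133 transfer rate and the standard 8-byte (64-bit) channel width.
TRANSFER_RATE_MT_S = 2133      # mega-transfers per second
BYTES_PER_TRANSFER = 8         # 64-bit data bus per channel

per_channel_gb_s = TRANSFER_RATE_MT_S * 1e6 * BYTES_PER_TRANSFER / 1e9  # ~17 GB/s

for channels in (2, 4, 8, 16, 32, 64):
    print(f"{channels:2d} channels: {channels * per_channel_gb_s:6.0f} GB/s")
# -> roughly 34, 68, 137, 273, 546, 1092 GB/s, matching the slide's 34 to 1088 GB/s
```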
6 Increasing Signaling Rate: More Bandwidth and Less Power with Differential Copper Interconnect and Silicon Photonics. [Chart: power (mW/Gbps) versus signaling rate (MHz) for single-ended interconnect; state-of-the-art DDR from 1066 to 2800 MHz.] Future state of the art with FBD (at 25 mW/Gbps and 5 Gb/s per lane): 100 GB/s is roughly 1 Tb/s = 1,000 Gb/s; 25 mW/Gb/s x 1,000 Gb/s = 25 Watts; bus width = 1,000 Gb/s / 5 Gb/s = 200 lanes, about 400 pins (differential). [Chart: power (mW/Gbps) versus signaling rate (Gbit/s) for state-of-the-art FBD, future copper interconnect, optical today, and the silicon photonics vision.]
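Restating the slide's link-budget arithmetic as a tiny sketch (the 25 mW/Gbps efficiency and 5 Gb/s lane rate are the slide's assumed figures):

```python
# Link budget from the slide: ~100 GB/s (~1 Tb/s) of memory bandwidth built
# from 5 Gb/s differential lanes at an assumed 25 mW/Gbps.
target_gbps = 1000          # ~100 GB/s, rounded up to 1 Tb/s as on the slide
mw_per_gbps = 25            # assumed signaling power efficiency
lane_rate_gbps = 5          # assumed per-lane signaling rate

power_w = target_gbps * mw_per_gbps / 1000   # 25 W total signaling power
lanes = target_gbps // lane_rate_gbps        # 200 lanes
pins = lanes * 2                             # differential signaling: ~400 pins

print(f"power: {power_w:.0f} W, lanes: {lanes}, pins: {pins}")
```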
7 Addressing Memory Bandwidth: Bringing Memory Closer to the Cores. [Diagram: memory on package, with DRAM and CPU on one package under the heat-sink; labels: last-level cache, fast DRAM.] [Diagram: 3D memory stacking, with memory stacked on the Si chip within the package.] *Future vision, does not represent a real Intel product.
8 Photonics for Memory Bandwidth and Capacity: High Performance with Remote Memory. [Diagram: integrated silicon photonic chip with Tx/Rx; the indium phosphide emits light into the silicon waveguide, and the silicon acts as the laser cavity, where light bounces back and forth and is amplified by the InP-based material.] [Diagram: a remote memory blade integrated into a system.] Integrated Tb/s optical link on a single chip.
9 Increasing Ethernet Bandwidth. [Chart: projected server I/O bandwidth demand for the TPC-H, SPECweb05, TPC-C, and SPECjApps benchmarks on a 10 to 80 Gb/s scale, showing roughly a 2x performance gain every 2 years. Source: Intel, 2006. TPC and SPEC are standard server application benchmarks.] Today, server I/O is fragmented: 1GbE performance doesn't meet current server I/O requirements, 10GbE is still too expensive, and 2/4G Fibre Channel and 10G InfiniBand are being deployed to meet the growing I/O demands for storage and HPC clusters. In 2-4 years, convergence on 10GbE looks promising: 10GBASE-T availability will drive costs down, and 10GbE will offer a compelling value proposition for LAN, SAN, and HPC cluster connections. In 7-10 years, look beyond to 40GbE and 100GbE.
10 Outside the Box: HPC Networking with Optical Cables. Benefits over copper: scalable bandwidth and throughput; longer distance (today up to 100 m); higher reliability (a bit error rate (BER) of 10⁻¹⁵ or lower); smaller and lighter form factor. [Diagram: electrical socket, optical transceiver in the plug, optical cable of up to 100 m, optical transceiver in the plug, electrical socket.] [Roadmap, 2007-2013: from 20 Gb/s toward 120 Gb/s, doubling the data rate every 2 years.]
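As a rough illustration of what a 10⁻¹⁵ BER means at these data rates, here is a back-of-the-envelope sketch; the 40 Gb/s link rate is just one point from the roadmap above:

```python
# Back-of-the-envelope: mean time between bit errors for a given BER and link rate.
ber = 1e-15                  # bit error rate claimed for the optical link
link_rate_bps = 40e9         # example: a 40 Gb/s link from the roadmap

errors_per_second = ber * link_rate_bps
hours_between_errors = 1 / errors_per_second / 3600
print(f"~one bit error every {hours_between_errors:.1f} hours")   # ~6.9 hours
```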
11 Power Aware Everywhere: from silicon through packages, heat sinks, and systems to facilities. Silicon: Moore's law, strained silicon, transistor leakage control techniques, high-k dielectric, clock gating, etc. Processor and system power: multi-core, integrated voltage regulators (IVR), fine-grain power management, etc. Facilities: efficient power conversion and cooling.
12 Reliability Challenges & Vision. Soft errors are one of many reliability challenges. [Diagram: an ion path through the depletion region of a device, with charge collected by drift and diffusion. Source: Intel.] Expect an exponential growth in FIT per chip: FIT per bit of a memory cell is expected to stay roughly constant, but Moore's law increases the bit count exponentially, 2x every 2 years. [Diagram: spectrum of soft-error mitigation on state-of-the-art processes, from micro solutions to macro solutions: process techniques, device parameter tuning, rad-hard cell creation, circuit techniques, and architectural techniques; parity, SECDED ECC, π bit, lockstepping, redundant multithreading (RMT), and redundant multi-core CPU. Source: Intel.]
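To make the FIT-per-chip scaling argument concrete, here is a small illustrative sketch; the starting FIT/bit and bit count are hypothetical placeholders, not Intel data:

```python
# Illustrative only: a constant FIT/bit combined with bit counts that double
# every 2 years yields exponentially growing FIT/chip.
# (FIT = failures in time, i.e. failures per 10^9 device-hours.)
fit_per_bit = 1e-3          # hypothetical soft-error rate per bit, held constant
bits = 2 ** 30              # hypothetical starting point: 1 Gbit of cells

for year in range(2007, 2018, 2):
    fit_per_chip = fit_per_bit * bits
    print(f"{year}: {bits // 2**30:3d} Gbit -> {fit_per_chip:.2e} FIT/chip")
    bits *= 2               # Moore's law: 2x the bit count every 2 years
```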
13 What Can We Expect?! [Chart: Top500 #1 and SUM of Top500 performance from 1993 projected out to around 2029, on a log scale running from 100 MFlops through GFlops, TFlops, PFlops, and EFlops toward 1 ZFlops: beyond petascale. Background picture from CERN OpenLab, Intel HPC Roundtable '06.] CERN: 115th of the Top500 in June 2007 (source: CERN Courier, Aug. 20, 2007).
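For a sense of the arithmetic behind the projection, a hedged extrapolation sketch; the 2008/1 PFlops anchor and the roughly 2x-per-year growth rate are assumptions chosen to match the historical Top500 trend, not figures from the slide:

```python
# If aggregate HPC performance roughly doubles every year (about the historical
# Top500 trend), how long does it take to go from 1 PFlops to 1 ZFlops?
import math

start_year, start_flops = 2008, 1e15   # assumed anchor: ~1 PFlops around 2008
target_flops = 1e21                    # 1 ZFlops
doubling_time_years = 1.0              # assumed growth rate: 2x per year

years_needed = doubling_time_years * math.log2(target_flops / start_flops)
print(f"~{years_needed:.0f} years, i.e. around {start_year + round(years_needed)}")
# -> about 20 years, landing near the slide's ~2029 marker
```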
14 What can Intel do for you?