Addressing Future HPC Demand with Multi-core Processors. Stephen S. Pawlowski, Intel Senior Fellow; GM, Architecture and Planning; CTO, Digital Enterprise Group.

Presentation transcript:

Addressing Future HPC Demand with Multi-core Processors
Stephen S. Pawlowski, Intel Senior Fellow
GM, Architecture and Planning; CTO, Digital Enterprise Group
September 5, 2007

2 Legal Disclaimer INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights. Intel may make changes to specifications, product descriptions, and plans at any time, without notice. The processors, the chipsets, or the ICH components may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available upon request. All dates provided are subject to change without notice. All dates specified are target dates, are provided for planning purposes only and are subject to change. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. * Other names and brands may be claimed as the property of others. Copyright © 2007, Intel Corporation. All rights reserved.

3 Intel Is Listening…
"With multi-core processors, however, we finally get to a scheme where the HEP execution profile shines. Thanks to the fact that our jobs are embarrassingly parallel (each physics event is independent of the others), we can launch as many processes (or jobs) as there are cores on a die. However, this requires that the memory size is increased to accommodate the extra processes (making the computers more expensive). As long as the memory traffic does not become a bottleneck, we see a practically linear increase in the throughput on such a system. A ROOT/PROOF analysis demo elegantly demonstrated this during the launch of the quad-core chip at CERN."
Source: "Processors size up for physics at the LHC," Sverre Jarp, CERN Courier, March 28, 2007
HEP – High Energy Physics; LHC – Large Hadron Collider
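The pattern the quote describes (one independent job per core, no communication between jobs) can be sketched roughly as below. This is a minimal illustration, not CERN's actual workflow; analyze_event and the synthetic event list are hypothetical placeholders for a real per-event analysis.

```python
# Minimal sketch of the "one process per core" pattern described above.
# analyze_event and the synthetic events are placeholders; a real HEP job
# would read events from files and run the actual physics analysis.
from multiprocessing import Pool, cpu_count

def analyze_event(event):
    # Each physics event is independent, so workers never communicate.
    return sum(x * x for x in event)  # stand-in for real per-event work

if __name__ == "__main__":
    events = [[i, i + 1.0, i + 2.0] for i in range(100_000)]
    with Pool(processes=cpu_count()) as pool:   # as many workers as cores
        results = pool.map(analyze_event, events, chunksize=1_000)
    print(f"processed {len(results)} events")
```

Throughput scales roughly with core count as long as each worker's data fits in memory and memory bandwidth is not saturated, which is exactly the caveat the quote raises.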

4 Accelerating Multi- and Many-core Performance Through Parallelism
On-die building blocks: power delivery and management, high-bandwidth memory, reconfigurable cache, scalable fabric, fixed-function units
Big cores for single-thread performance; small cores for multi-thread performance (illustrated below)
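A rough way to see why mixing big and small cores helps is an Amdahl-style model for asymmetric cores: the big core speeds up the serial fraction while big and small cores together run the parallel fraction. The sketch below is illustrative only; the serial fraction and per-core performance values are assumptions, not figures from the slides.

```python
# Illustrative asymmetric-multicore speedup model (Amdahl/Hill-Marty style).
# Baseline = one small core; all parameter values are assumptions.
def speedup(serial_fraction, big_core_perf, n_small, small_core_perf=1.0):
    serial_time = serial_fraction / big_core_perf
    parallel_time = (1 - serial_fraction) / (big_core_perf + n_small * small_core_perf)
    return 1.0 / (serial_time + parallel_time)

for n in (4, 16, 64):
    print(f"{n:2d} small cores: {speedup(0.05, big_core_perf=2.0, n_small=n):4.1f}x")
```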

5 Memory Bandwidth Demand
[Chart: memory bandwidth (GB/s) vs. computation (GFlops) for a Computational Fluid Dynamics (CFD) Splash workload, for 75x50x50 and 150x100x100 grids at 10 and 30 fps; a callout relates TFLOP-scale computation to roughly 1 TB/s of memory bandwidth, about 0.17 byte/flop. Source: Intel Labs]
[Chart: DDR3 bandwidth (GB/s), assuming a 2133 MHz frequency, for 1, 4, 8, 16, 32, and 64 DDR3 channels]
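As a quick sanity check on the DDR3 chart, a rough peak-bandwidth calculation is sketched below, assuming each channel is 64 bits (8 bytes) wide at 2133 MT/s; sustained bandwidth in practice would be lower than this theoretical peak.

```python
import math

# Theoretical peak DDR3 bandwidth, assuming 64-bit channels at 2133 MT/s.
transfers_per_sec = 2133e6
bytes_per_transfer = 8
per_channel_gb_s = transfers_per_sec * bytes_per_transfer / 1e9  # ~17 GB/s

for channels in (1, 4, 8, 16, 32, 64):
    print(f"{channels:2d} channels: {per_channel_gb_s * channels:6.1f} GB/s")

# Channels needed to reach the ~1 TB/s demand called out on the slide:
print("channels for ~1 TB/s:", math.ceil(1000 / per_channel_gb_s))
```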

6 Increasing Signaling Rate: More Bandwidth & Less Power with Differential Copper Interconnect & Silicon Photonics
[Charts: power (mW/Gbps) vs. signaling rate, first for state-of-the-art single-ended DDR interconnect and its future (MHz range), then for state-of-the-art FBD, future copper interconnect, optical today, and the silicon photonics vision (Gbit/s range)]
Worked example (state of the art with FBD at 5 Gb/s): 100 GB/s ~ 1 Tb/s = 1,000 Gb/s; at 25 mW/Gb/s that is 25 W; bus width = 1,000 Gb/s / 5 Gb/s = 200 lanes, about 400 pins (differential).
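The FBD example is easy to reproduce; the short calculation below simply restates the slide's own numbers (5 Gb/s per lane, 25 mW/Gbps, ~1 Tb/s target).

```python
# Reproduces the slide's example: ~100 GB/s (~1 Tb/s) over 5 Gb/s lanes.
target_bandwidth_gbps = 1000      # ~1 Tb/s aggregate
lane_rate_gbps = 5                # per-lane signaling rate
power_per_gbps_mw = 25            # link power efficiency

lanes = target_bandwidth_gbps / lane_rate_gbps
pins = lanes * 2                  # differential signaling: 2 pins per lane
power_w = target_bandwidth_gbps * power_per_gbps_mw / 1000

print(f"{lanes:.0f} lanes, ~{pins:.0f} pins, {power_w:.0f} W")
```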

7 Addressing Memory Bandwidth: Bringing Memory Closer to the Cores
[Diagrams: fast DRAM memory on package (heat sink over a package carrying the CPU, last-level cache, and DRAM) and 3D memory stacking (memory stacked on the silicon chip within the package)]
*Future vision; does not represent a real Intel product

8 Photonics for Memory BW and Capacity: High Performance with Remote Memory
The indium phosphide emits light into the silicon waveguide; the silicon acts as the laser cavity, where light bounces back and forth and is amplified by the InP-based material.
[Diagrams: integrated silicon photonic chip with Tx/Rx paths, an integrated Tb/s optical link on a single chip, and a remote memory blade integrated into a system]

9 Increasing Ethernet Bandwidth
[Chart: projected server I/O bandwidth from 10 to 80 Gb/s for standard server application benchmarks (TPC-C, TPC-H, SPECweb05, SPECjApps), roughly a 2x performance gain every 2 years. Source: Intel; TPC & SPEC are standard server application benchmarks]
Today: server I/O is fragmented; 1GbE performance doesn't meet current server I/O requirements; 10GbE is still too expensive; 2/4G Fibre Channel & 10G InfiniBand are being deployed to meet the growing I/O demands of storage & HPC clusters.
In 2-4 years: convergence on 10GbE looks promising; 10GBASE-T availability will drive costs down; 10GbE will offer a compelling value proposition for LAN, SAN, & HPC cluster connections.
In 7-10 years: look beyond to 40GbE and 100GbE.
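Compounding the "~2x every 2 years" trend from an assumed 10 Gb/s starting point shows roughly where demand lands over the horizons the slide mentions; the baseline and the time steps are illustrative assumptions, not slide data.

```python
# Illustrative projection of the "~2x performance gain every 2 years" trend.
# The 10 Gb/s starting point is an assumed baseline.
base_gbps, years_to_double = 10, 2
for years in range(0, 11, 2):
    demand = base_gbps * 2 ** (years / years_to_double)
    print(f"+{years:2d} years: ~{demand:.0f} Gb/s")
```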

10 Outside the Box: HPC Networking with Optical Cables
Benefits over copper: scalable bandwidth and throughput; longer distance (today up to 100 m); higher reliability (10⁻¹⁵ bit error rate (BER) or lower); smaller & lighter form factor.
[Diagram: an optical cable up to 100 m long with an optical transceiver in the plug at each end, terminating in electrical sockets]
[Chart: cable data rates from 40 Gb/s to 120 Gb/s, doubling the data rate every 2 years]

11 Power Aware Everywhere: silicon, packages, heat sinks, systems, facilities
Silicon: Moore's law, strained silicon, transistor leakage control techniques, high-k dielectric, clock gating, etc.
Processor and system power: multi-core, integrated voltage regulators (IVR), fine-grain power management, etc.
Facilities: efficient power conversion and cooling

12 Reliability Challenges & Vision
Soft error: one of many reliability challenges. [Diagram: an ion path striking silicon, with charge collected by drift and diffusion through the depletion region. Source: Intel]
An exponential growth in FIT/chip: the FIT/bit of a memory cell is expected to be roughly constant, but Moore's law increases the bit count exponentially, 2x every 2 years. (Source: Intel)
Solution space: process techniques (device parameter tuning, rad-hard cell creation, state-of-the-art processes), circuit techniques, and architectural techniques; micro solutions such as parity, SECDED ECC, and the π bit; macro solutions such as lockstepping, redundant multithreading (RMT), and redundant multi-core CPUs.
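The FIT/chip argument is a straightforward compounding effect, sketched below; the FIT/bit value and the starting bit count are illustrative assumptions, and only the 2x-every-2-years growth comes from the slide.

```python
# Constant FIT/bit times an exponentially growing bit count gives an
# exponentially growing FIT/chip. fit_per_bit and the starting bit count
# are illustrative assumptions, not Intel figures.
fit_per_bit = 1e-6        # assumed soft-error rate per bit, held constant
bits = 1e9                # assumed on-die memory bits in year 0
for year in range(0, 11, 2):
    fit_per_chip = fit_per_bit * bits * 2 ** (year / 2)   # 2x bits every 2 years
    print(f"year {year:2d}: {fit_per_chip:8.0f} FIT/chip")
```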

13 What Can We Expect?!
[Chart: HPC performance over time on a log scale from 100 MFlops to 1 ZFlops, showing the sum of the Top500 and the trajectory beyond petascale. Background picture from CERN OpenLab, Intel HPC Roundtable '06]
CERN: 115th on the Top500 in June 2007 (source: CERN Courier, Aug. 20, 2007)

14 What can Intel do for you?