Steve Pawlowski Intel Senior Fellow GM, Architecture and Planning CTO, Digital Enterprise Group Intel Corporation HPC: Energy Efficient Computing April 20, 2009

2 Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.

Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary, based on current expectations, and are subject to change without notice.

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

This document may contain information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.

Wireless connectivity and some features may require you to purchase additional software, services or external hardware.

Nehalem, Penryn, Westmere, Sandy Bridge and other code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services, and any such use of Intel's internal code names is at the sole risk of the user.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance.

Intel, Intel Inside, Pentium, Xeon, Core and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others. Copyright © 2009 Intel Corporation.

3 Reach Exascale by 2018
From GigaFlops to ExaFlops: Sustained GigaFlop (~1987), Sustained TeraFlop (~1997), Sustained PetaFlop (~2008), Sustained ExaFlop (~2018).
“The pursuit of each milestone has led to important breakthroughs in science and engineering.” Source: IDC, “In Pursuit of Petascale Computing: Initiatives Around the World,” 2007.
Note: Numbers are based on the Linpack benchmark. Dates are approximate.

4 What are the Challenges?
Power is gating every part of computing, and voltage is not scaling as it did in the past.
The challenge of Exascale: (Chart: power consumption in KW, climbing from MFLOP through GFLOP, TFLOP, and PFLOP systems toward a question mark at EFLOP.) An ExaFLOPS machine without power management, assuming 100pJ of communication per FLOP, 1.5nJ per Byte, ~400W per socket, and 10EB of disk, lands at roughly: Compute ~200MW, Memory ~150MW, Comm ~100MW, Disk ~10MW, plus other miscellaneous power consumption (power supply losses, cooling, etc.).
Source: Intel, for illustration and assumptions, not product representative.
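A back-of-envelope check of the slide's unmanaged budget. The energy figures are the slide's; the ~0.1 Byte/FLOP memory-traffic ratio and the ~500K socket count are assumptions of mine, chosen to reproduce the chart's bars:

    # Unmanaged ExaFLOPS power budget, recomputed from the slide's assumptions.
    EXAFLOPS = 1e18                          # sustained operations per second
    compute_mw = 500_000 * 400 / 1e6         # ~400 W/socket (slide); ~500K sockets assumed
    mem_mw  = EXAFLOPS * 0.1 * 1.5e-9 / 1e6  # 1.5 nJ/Byte (slide); ~0.1 Byte/FLOP assumed
    comm_mw = EXAFLOPS * 100e-12 / 1e6       # 100 pJ of communication per FLOP (slide)
    print(compute_mw, mem_mw, comm_mw)       # 200.0 150.0 100.0 (MW), before disk,
                                             # power-supply losses, and cooling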

5 HPC Platform Power
Data from the P3 Jet Power Calculator, V2.0: dual-processor 80W Nehalem, 48GB of memory (12 x 4GB DIMMs), single 230Vac power supply. (Chart: platform power broken down across CPU, memory, and planar & VRs.)
We need a platform view of power consumption: CPU, memory, VRs, etc.

6 Device Efficiency is Slowing
Unmanaged growth in power will reach the GigaWatt level at Exascale. (Chart: relative performance and power, with GFlops as the base.)
Power at a glance (assuming 31% of system power is CPU power):
Today's Peta: nJ/op
Today's COTS: 2nJ/op (assuming 100W / 50GFlops)
Unmanaged Exa: at 1GW, 0.31nJ/op
Source: Intel Labs
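The nJ/op figures follow directly from the stated assumptions; a minimal check:

    # Energy per operation implied by the slide's assumptions.
    cots_nj_per_op = 100 / 50e9 * 1e9            # 100 W / 50 GFlops -> 2.0 nJ/op
    exa_cpu_nj_per_op = 1e9 * 0.31 / 1e18 * 1e9  # 1 GW, 31% CPU share, 1 EFLOPS -> 0.31 nJ/op
    print(cots_nj_per_op, exa_cpu_nj_per_op)     # 2.0 0.31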

7 To Reach ExaFlops
(Chart: Flops from 1.E+06 to 1.E+15 across the Pentium®, Pentium® II, Pentium® III, and Pentium® 4 architectures and the Intel® Core™ uArch, with Giga, Tera, and Peta milestones marked; future projection. Source: Intel.)
What it takes to get to Exa: 40+ TFlops per socket, with a power goal of 200W per socket. To reach Linpack ExaFlops:
5 pJ/op per socket at 40 TFlops - 25K sockets peak, or 33K sustained; or
10 pJ/op per socket at 20 TFlops - 50K sockets peak (conservative).
Intel estimates of future trends. Intel estimates are based in part on historical capability of Intel products and projections for capability improvement. Actual capability of Intel products will vary based on actual product configurations.
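The socket arithmetic behind the two options, as a small sketch:

    # Watts per socket and socket count implied by each Exascale option.
    def exa_option(pj_per_op, tflops_per_socket):
        # pJ/op * Tops/s: the 1e-12 and 1e12 cancel, leaving watts per socket.
        watts_per_socket = pj_per_op * tflops_per_socket
        sockets_for_peak_exaflops = 1e18 / (tflops_per_socket * 1e12)
        return watts_per_socket, sockets_for_peak_exaflops

    print(exa_option(5, 40))    # (200, 25000.0): 200 W/socket, 25K sockets peak
    print(exa_option(10, 20))   # (200, 50000.0): 200 W/socket, 50K sockets peak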

8 Parallelism for Energy Efficient Performance
(Chart: relative performance over time, spanning the era of pipelined architecture (super scalar), the era of instruction-level parallelism (speculative, OOO), and the era of thread & processor level parallelism (multi-threaded, multi-core, many-core); future projection.)
Intel estimates of future trends. Intel estimates are based in part on historical capability of Intel products and projections for capability improvement. Actual capability of Intel products will vary based on actual product configurations.
Source: Intel Labs

9 Parallelism's Challenges
Current models are based on communication between sequential processes (e.g. MPI, SHMEM) and depend on checkpointing for resilience.
(Chart: from Terascale to Petascale to Exascale, the mean time between component failures falls while the mean time for a global checkpoint grows; communication-based systems break down beyond the crossover point.)
We need new, fault-resilient programming models, so computations make progress even as components fail.
Source: Intel
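A minimal model (mine, not from the deck) of why global checkpointing breaks down at the crossover: with checkpoint cost C, checkpoint interval T, and mean time between failures M, the useful-work fraction is roughly T/(T+C) * exp(-(T+C)/M). Once C approaches M, no choice of T helps:

    import math

    def useful_fraction(C, M, T):
        # Work done between checkpoints, discounted by the chance of failing
        # before the next checkpoint completes (first-order model).
        return T / (T + C) * math.exp(-(T + C) / M)

    M = 3600.0                        # hypothetical MTBF: one failure per hour
    for C in (60.0, 600.0, 3600.0):   # global checkpoint cost: 1 min, 10 min, 1 hour
        T = math.sqrt(2 * C * M)      # standard approximation of a good interval
        print(f"checkpoint cost {C:5.0f}s -> useful work ~{useful_fraction(C, M, T):.0%}")

With this model, utilization falls from roughly 75% to 37% to about 5% as checkpoint cost approaches the MTBF, which is the breakdown the chart illustrates.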

10 Software: Scaling Performance Forward
Existing software, the message-passing programming model, resiliency issues, and Exa-scale concurrency require a new hierarchical structure:
A concurrency primitives framework: specifying, assigning, executing, migrating, and debugging a hierarchy of units of computation; providing a unified foundation.
A high-level declarative coordination language: orchestrating billions of tasks written in existing serial languages; managing resiliency to fully utilize hardware capabilities.
(Diagram: an application sits atop the high-level declarative coordination language, the concurrency primitives framework, and a virtual, abstract machine, alongside today's parallel framework.)
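A toy sketch of the idea, entirely hypothetical and not any real Intel framework: serial kernels wrapped as units of computation, with a coordination layer that reassigns failed units rather than checkpointing globally:

    import random

    def serial_kernel(x):
        # Existing serial-language code, unchanged.
        return x * x

    class Task:
        # One unit of computation in the hierarchy.
        def __init__(self, fn, arg):
            self.fn, self.arg = fn, arg
        def run(self):
            if random.random() < 0.1:            # simulated component failure
                raise RuntimeError("node failed")
            return self.fn(self.arg)

    def coordinate(tasks):
        # Coordination layer: results are demanded, not sequenced; failed
        # units are reassigned, so the computation progresses despite failures.
        results, pending = {}, dict(enumerate(tasks))
        while pending:
            for i, task in list(pending.items()):
                try:
                    results[i] = task.run()
                    del pending[i]
                except RuntimeError:
                    pass                          # migrate/retry instead of global checkpoint
        return [results[i] for i in sorted(results)]

    print(coordinate([Task(serial_kernel, x) for x in range(8)]))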

11 Reduce Memory and Communication Power
Data movement is expensive:
Core-to-core: ~10pJ per Byte
Chip-to-chip: ~16pJ per Byte
Chip-to-memory: ~150pJ per Byte
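What those per-Byte energies mean at sustained bandwidth; the 100 GB/s traffic rate is an illustrative assumption:

    # Sustained power cost of data movement at each level (slide's pJ/Byte figures).
    costs_pj_per_byte = {"core-to-core": 10, "chip-to-chip": 16, "chip-to-memory": 150}
    bw = 100e9                                     # assumed bytes per second
    for path, pj in costs_pj_per_byte.items():
        print(f"{path:>14}: {bw * pj * 1e-12:6.1f} W at 100 GB/s")
    # chip-to-memory costs ~15 W per 100 GB/s -- ~15x the core-to-core cost.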

12 Technologies to Increase Bandwidth
(Chart: bandwidth projections in GB/s per socket, plotting HE-WS/HPC and traditional CPU bandwidth demand against the bandwidth trend, with curves assuming DDR3, assuming DDR4 with increasing channels, and assuming eDRAM at a 2X-per-3-years CAGR. Source: Intel forecast.)
eDRAM: replace the on-package memory controller with very fast flex links to an on-board memory controller (diagram: memory package with CPU, memory controller + buffer).
Intel estimates of future trends in bandwidth capability. Intel estimates are based in part on historical bandwidth capability of Intel products and projections for bandwidth capability improvement. Actual bandwidth capability of Intel products will vary based on actual product configurations.

13 Power Efficient High I/O Interconnect Bandwidth
HPC interconnect requirement progression, from COTS interconnect upward (chart annotations: X, 8X, 40X, 75X (Exa)), with power targets:
50 GB/s (COTS); MPI: 30M msgs/s, SHMEM: 300M msgs/s; power target <20 mW/Gb/s
200 GB/s; MPI: 75M msgs/s, SHMEM: 1G msgs/s; power target 10 mW/Gb/s
1 TB/s; MPI: 325M msgs/s, SHMEM: 5G msgs/s; power target 3 mW/Gb/s
4 TB/s (Exa); MPI: 1.25G msgs/s, SHMEM: 20G msgs/s; power target 1 mW/Gb/s, via copper and/or silicon photonics
MPI: Message Passing Interface; SHMEM: Shared Memory. Source: Intel.
Intel estimates of future trends in bandwidth capability. Intel estimates are based in part on historical bandwidth capability of Intel products and projections for bandwidth capability improvement. Actual bandwidth capability of Intel products will vary based on actual product configurations.
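Converting the power targets into watts per port (my arithmetic from the slide's numbers):

    # Watts per port implied by each tier's bandwidth and power target.
    tiers = [
        ("COTS, 50 GB/s", 50e9,  20),   # <20 mW/Gb/s
        ("200 GB/s",      200e9, 10),
        ("1 TB/s",        1e12,   3),
        ("4 TB/s (Exa)",  4e12,   1),
    ]
    for name, bytes_per_s, mw_per_gbps in tiers:
        gbps = bytes_per_s * 8 / 1e9              # bytes/s -> Gb/s
        print(f"{name:>14}: {gbps * mw_per_gbps / 1e3:6.1f} W")
    # 4 TB/s at 1 mW/Gb/s is ~32 W; at the COTS 20 mW/Gb/s it would be ~640 W.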

14 Signaling Data Rate and Energy Efficiency
(Chart: data rate in Gb/s versus signaling energy efficiency in pJ/bit, plotting DDR3 and GDDR5 (roughly the ~15-25 pJ/bit range) alongside Intel ISSCC and Intel VLSI demonstrations; proposed copper targets ~1.0 pJ/bit near term, with silicon photonics as the longer-term direction. Source: Intel Labs.)

15 Solid State Drive Future Performance and Energy Efficiency
Assume SSD capacity grows at a CAGR of about 1.5, with HDD on its historical trend. (Chart: SSD GigaBytes over time, future projection.)
Vision for 10 ExaBytes at 2018: 2 million SSDs vs. ½ million HDDs, total 2MW.
If HDD (300 IOPS) and SSD (10K IOPS) stay constant, the SSDs deliver ~140X the IOPS.
Innovations to improve IO: 2X less power with a 140X performance gain.
Source: Intel, calculations based on today's vision.
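The ~140X claim is the aggregate-IOPS ratio of the two drive populations; a quick check:

    # Aggregate IOPS of the two populations (slide's counts and per-drive IOPS).
    ssd_count, hdd_count = 2_000_000, 500_000
    ssd_iops, hdd_iops = 10_000, 300
    print((ssd_count * ssd_iops) / (hdd_count * hdd_iops))  # ~133x, roughly the slide's ~140X
    print(10e18 / 5e12)   # capacity check: 10 EB / 5 TB per SSD (slide 17) = 2,000,000 drives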

16 Increase Data Center Compute Density
Target 50% yearly improvements in performance/watt. (Chart: compute density by year, stacking contributions from silicon process, data center innovation, power management, small form factor, and new technology.)
Source: Intel, based on Intel YoY improvement with the SPECpower benchmark.
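Compounded, the 50% yearly target implies the following; the nine-year horizon is my reading of the deck's 2009-to-2018 window:

    # Compounding a 50%-per-year performance/watt improvement.
    for years in (3, 6, 9):
        print(f"after {years} years: {1.5 ** years:5.1f}x")
    # 1.5^9 ~ 38x -- the cumulative gain if the target held from ~2009 to ~2018.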

17 Revised Exascale System Power
ExaFLOPS machine without power management (as on slide 4): 10EB of disk, 100pJ com per FLOP, 1.5nJ per Byte, ~400W per socket. Compute ~200MW, Memory ~150MW, Comm ~100MW, Disk ~10MW, plus other miscellaneous power consumption (power supply losses, cooling, etc.).
ExaFLOPS machine, future vision: 10EB at 5TB/SSD, 9pJ com per FLOP, 150pJ per Byte, 50K sockets. Compute ~10MW, Memory ~9MW, Comm ~9MW, SSD ~2MW, plus ~10MW of miscellaneous power consumption (power supply losses, cooling, etc.): ~40MW total.
Source: Intel, for illustration and assumptions, not product representative.
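The future-vision bars, recomputed. All figures are the slide's except the ~0.06 Byte/FLOP memory-traffic ratio and the ~1 W per SSD, which are assumptions of mine chosen to match the ~9MW and ~2MW bars:

    # Future-vision ExaFLOPS budget with managed power.
    EXAFLOPS = 1e18
    compute_mw = 50_000 * 200 / 1e6               # 50K sockets x 200 W (slide 7) -> 10 MW
    comm_mw    = EXAFLOPS * 9e-12 / 1e6           # 9 pJ com per FLOP             -> 9 MW
    mem_mw     = EXAFLOPS * 0.06 * 150e-12 / 1e6  # 150 pJ/Byte, ~0.06 B/FLOP     -> 9 MW
    ssd_mw     = 2_000_000 * 1.0 / 1e6            # ~1 W per SSD (assumption)     -> 2 MW
    misc_mw    = 10                               # power-supply losses, cooling, etc.
    print(compute_mw + comm_mw + mem_mw + ssd_mw + misc_mw)   # ~40 MW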
