Past Practices of Conventional Core Microarchitecture Are Dead
Steve Keckler, Architecture Research Group

ROADMAP
- Backdrop
  - Brief history of graphics hardware
  - Why GPU Computing?
- Progression
  - GPU Computing 1.0 – compute pretending to be graphics
  - GPU Computing 2.0 – direct computing, CUDA
  - GPU Computing 3.0 – an emerging ecosystem
- Future
  - Driving workloads
  - GPU Computing 4.0?
Old Equations
New Equation
Where are we now?
- Today’s high-end CPUs: 1-2 nJ/Flop
- Today’s high-end GPUs: ~200 pJ/Flop
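One way to read these figures (my gloss, not on the original slide): at a fixed power budget, sustained flops per watt is simply the reciprocal of energy per flop, so the numbers above translate roughly as follows.

```latex
% Performance per watt is the reciprocal of energy per operation:
\[
  \frac{\text{flop/s}}{\text{W}} = \frac{\text{flop/s}}{\text{J/s}}
  = \frac{\text{flop}}{\text{J}} = \frac{1}{\text{energy per flop}}
\]
% Plugging in the figures above:
\[
  \text{CPU: } \frac{1}{1\text{--}2\ \text{nJ/flop}} \approx 0.5\text{--}1\ \text{GFLOPS/W}
  \qquad
  \text{GPU: } \frac{1}{200\ \text{pJ/flop}} \approx 5\ \text{GFLOPS/W}
\]
```

By this measure, today's GPU figure is roughly 5-10x better per watt than the CPU figure.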
What do things cost?

Operation                            Energy
64-bit FP operation                  10.5 pJ
Regfile access (2 read / 1 write)    5.5 pJ
Instruction RAM access               3.6 pJ
Data RAM access
On-chip wire                         18-110 fJ/bit-mm
64-bit on-chip bus                   1.2-7 pJ/mm
Standard off-chip link               30 pJ/bit
TSV (not including wire)             1-11 fJ/bit

(30nm with aggressive voltage scaling; from the DARPA Exascale Report, 2008)
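To make the table concrete, here is a back-of-envelope sketch (mine, not the deck's) that sums these costs for a single 64-bit FP operation under three assumed operand placements. The two-operand fetch, the 10 mm on-chip distance, and the 64-bit off-chip transfer are illustrative assumptions, not measurements; the point is only that moving operands quickly dwarfs the cost of the arithmetic itself.

```cuda
// Back-of-envelope energy model using the per-operation costs in the table
// above (30nm, aggressive voltage scaling, DARPA Exascale Report 2008).
// Host-side code only; compiles with nvcc or any C++ compiler.
#include <cstdio>

int main() {
    // Costs from the table, in picojoules (pJ)
    const double fp64_op     = 10.5;   // 64-bit FP operation
    const double regfile     = 5.5;    // regfile access (2 read / 1 write)
    const double iram        = 3.6;    // instruction RAM access
    const double bus_per_mm  = 7.0;    // 64-bit on-chip bus, upper end, pJ/mm
    const double offchip_bit = 30.0;   // standard off-chip link, pJ/bit

    // Illustrative scenario: one FP op plus instruction fetch and regfile
    // traffic, with its two source operands supplied from...
    const double local    = fp64_op + regfile + iram;          // ...registers
    const double on_chip  = local + 2 * bus_per_mm * 10.0;     // ...10 mm away on chip (assumed)
    const double off_chip = local + 2 * offchip_bit * 64.0;    // ...across the off-chip link

    printf("operands in regfile:    %7.1f pJ\n", local);
    printf("operands 10 mm on chip: %7.1f pJ\n", on_chip);
    printf("operands off chip:      %7.1f pJ\n", off_chip);
    return 0;
}
```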
Core microarchitecture is not dead
- But we now need to focus on core perf/W
- Limit communication and storage-access overheads
- All multicore proposals must focus on perf/W
- Drive down the overheads of data movement and tracking
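As a concrete illustration of limiting data movement (my sketch, not an example from the talk): the CUDA kernel below stages a tile of input in on-chip shared memory so that each element is read from DRAM once and then reused by neighboring threads, instead of being re-fetched from global memory for every use. The kernel name, tile size, and 3-point stencil are all assumed for illustration.

```cuda
// Minimal sketch: a 3-point averaging stencil in which each input element
// is loaded from global memory once and reused from shared memory.
// A naive version would read every element from global memory three times.
#include <cstdio>
#include <cuda_runtime.h>

constexpr int TILE = 256;

__global__ void stencil3(const float* in, float* out, int n) {
    __shared__ float tile[TILE + 2];               // +2 halo elements
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    int lid = threadIdx.x + 1;                     // offset past the left halo

    if (gid < n) tile[lid] = in[gid];              // one DRAM read per element
    if (threadIdx.x == 0)                          // left halo
        tile[0] = (gid > 0) ? in[gid - 1] : 0.0f;
    if (threadIdx.x == blockDim.x - 1)             // right halo
        tile[lid + 1] = (gid + 1 < n) ? in[gid + 1] : 0.0f;
    __syncthreads();

    if (gid < n)                                   // reuse comes from shared memory
        out[gid] = (tile[lid - 1] + tile[lid] + tile[lid + 1]) / 3.0f;
}

int main() {
    const int n = 1 << 20;                         // multiple of TILE
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = float(i);

    stencil3<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);
    cudaDeviceSynchronize();

    printf("out[1] = %f (expect 1.0)\n", out[1]);
    cudaFree(in); cudaFree(out);
    return 0;
}
```

Keeping reuse in registers and shared memory is exactly the "limit storage-access overheads" point: the 30 pJ/bit off-chip cost in the table above is paid only for the first read of each element.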
Research poster submission deadline: August 15
www.nvidia.com/gtc