Past Practices of Conventional Core Microarchitecture is Dead

Presentation transcript:

Past Practices of Conventional Core Microarchitecture is Dead

ROADMAP
Backdrop
  Brief history of graphics hardware
  Why GPU Computing?
Progression
  GPU Computing 1.0 – compute pretending to be graphics
  GPU Computing 2.0 – direct computing, CUDA
  GPU Computing 3.0 – an emerging ecosystem
Future
  Driving workloads
  GPU Computing 4.0?

Steve Keckler
Architecture Research Group

Old Equations

New Equation

Where are we now?
Today’s high-end CPUs: 1-2 nJ/Flop
Today’s high-end GPUs: ~200 pJ/Flop
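To put the per-flop figures above in perspective, the short Python sketch below converts energy per flop into sustained power. The 1 exaflop/s target is an assumption made for illustration (the slide itself states no target), and the function name `power_watts` is hypothetical.

```python
# Sustained power = (energy per flop) x (flops per second).
# ASSUMPTION: a 1 exaflop/s machine is used as the illustrative target;
# the slide only gives the energy-per-flop figures.
EXAFLOP_PER_S = 1e18


def power_watts(energy_per_flop_joules, flops_per_second=EXAFLOP_PER_S):
    """Return the power in watts needed to sustain the given flop rate."""
    return energy_per_flop_joules * flops_per_second


cpu_low_w = power_watts(1e-9)    # high-end CPU, 1 nJ/flop
cpu_high_w = power_watts(2e-9)   # high-end CPU, 2 nJ/flop
gpu_w = power_watts(200e-12)     # high-end GPU, ~200 pJ/flop

for label, watts in [("CPU (1 nJ/flop)", cpu_low_w),
                     ("CPU (2 nJ/flop)", cpu_high_w),
                     ("GPU (200 pJ/flop)", gpu_w)]:
    print(f"{label}: {watts / 1e6:.0f} MW")
```

Under that assumption, arithmetic alone would draw on the order of a gigawatt at CPU efficiency and about 200 MW at GPU efficiency, which is why the talk turns to performance per watt.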

What do things cost?

Operation                            Energy
64-bit FP operation                  10.5 pJ
Regfile access (2 read/1 write)      5.5 pJ
Instruction RAM access               3.6 pJ
Data RAM access
On-chip wire                         18-110 fJ/bit-mm
64-bit on-chip bus                   1.2-7 pJ/mm
Standard off-chip link               30 pJ/bit
TSV (not including wire)             1-11 fJ/bit

30nm with aggressive voltage scaling, from DARPA Exascale Report, 2008
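One way to read the cost figures above is to compare the energy of moving a 64-bit operand with the energy of computing on it. The sketch below uses only the values from the table; the constant and function names are hypothetical, and the comparison itself is an illustration rather than something the slide works out.

```python
# Values taken from the cost table (picojoules).
FP_OP_PJ = 10.5                 # one 64-bit floating-point operation
BUS_PJ_PER_MM = (1.2, 7.0)      # 64-bit on-chip bus, low and high estimates
OFFCHIP_PJ_PER_BIT = 30.0       # standard off-chip link
WORD_BITS = 64


def bus_energy_pj(distance_mm, pj_per_mm):
    """Energy to move one 64-bit word over the on-chip bus."""
    return distance_mm * pj_per_mm


def breakeven_mm(pj_per_mm):
    """Bus distance at which moving a word costs as much as one FP op."""
    return FP_OP_PJ / pj_per_mm


# Moving one 64-bit word across a standard off-chip link.
offchip_word_pj = OFFCHIP_PJ_PER_BIT * WORD_BITS
offchip_vs_fp = offchip_word_pj / FP_OP_PJ

for rate in BUS_PJ_PER_MM:
    print(f"bus at {rate} pJ/mm: break-even at {breakeven_mm(rate):.2f} mm")
print(f"off-chip 64-bit transfer: {offchip_word_pj:.0f} pJ, "
      f"about {offchip_vs_fp:.0f}x one FP operation")
```

With these numbers, moving an operand a few millimetres on chip costs about as much as the floating-point operation itself, and sending it off chip costs well over a hundred times more.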

Core microarchitecture is not dead
  But now need to focus on core perf/W
  Limit communication, storage access overheads
All multicore proposals must focus on Perf/W
  Drive down overheads of data movement and tracking

Research poster submission deadline: August 15 www.nvidia.com/gtc