GRAPHICS AND COMPUTING GPUS Jehan-François Pâris


Chapter Organization –Why bother? –Evolution –GPU System Architecture –Programming GPUs –…

Why bother? (I) Yesterday's fastest computer was the Sequoia supercomputer –Can crunch quadrillions of calculations per second (16.32 petaflops) –98,304 compute nodes, each of which is a 16-core PowerPC A2 processor

Why bother? (II) Today's fastest computer is the Cray XK7 –Reaches petaflops-level performance on the LINPACK benchmark –Features 560,640 processors, including 261,632 NVIDIA K20x accelerator cores (supercomputing version of the consumer-oriented GK104 GPU)

Why bother? (III) Most techniques developed for high-speed computing end up trickling down to mass markets

EVOLUTION

History (I) Up to the late 1990s –No GPUs –Much simpler VGA controller, consisting of –A memory controller –A display generator + DRAM (DRAM was either shared with the CPU or private)

History (I) By 1997 –More complex VGA controllers incorporated 3D acceleration functions in hardware –Triangle setup and rasterization –Texture mapping and shading

Rasterization Converting –An image described in a vector graphics format as a combination of shapes Lines, polygons, letters, … into –A raster image consisting of individual pixels
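
The conversion from a vector shape to pixels can be sketched in a few lines. The following is a minimal illustration, not how any particular GPU implements it: it tests each pixel center against the three edges of a triangle. The function names `edge` and `rasterize_triangle` and the 8x8 image size are assumptions made for this example.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: the sign tells on which side of edge A->B point P lies."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize_triangle(v0, v1, v2, width, height):
    """Return a height x width grid of 0/1 values marking covered pixels."""
    image = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel center
            w0 = edge(*v1, *v2, px, py)
            w1 = edge(*v2, *v0, px, py)
            w2 = edge(*v0, *v1, px, py)
            # The pixel is covered when all three edge tests agree in sign
            # (this accepts both clockwise and counter-clockwise triangles).
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                image[y][x] = 1
    return image


# Vector description: one triangle. Raster result: an 8x8 pixel grid.
image = rasterize_triangle((0, 0), (8, 0), (0, 8), 8, 8)
for row in image:
    print("".join("#" if p else "." for p in row))
```

Every pixel is tested independently of the others, which is why this step maps well onto highly parallel hardware.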

History (II) By 2000 –Single-chip graphics processors incorporated nearly all functions of the graphics pipeline of high-end workstations (beginning of the end of the high-end workstation market) –VGA controllers were renamed Graphics Processing Units (GPUs)

Current trends (I) Graphics processing standards –Well-defined APIs –OpenGL: Open standard for 3D graphics programming –DirectX: Set of Microsoft multimedia programming interfaces (Direct3D for 3D graphics) Xbox was named after it!

Current trends (II) Frequent doubling of GPU speeds –Every 12 to 18 months New paradigm: –Visual computing stands at the intersection of graphics processing and parallel computing –Can implement novel graphics algorithms –Can use GPUs for non-conventional applications

Two results Triumph of heterogeneous architectures –Combining powers of CPU and GPU GPUs become scalable parallel processors –Moving from hardware-defined pipelining architectures to more flexible programmable architectures

From GPGPU to CUDA GPGPU –General-Purpose computing on GPU –Uses traditional graphics API and graphics pipeline

From GPGPU to CUDA CUDA –Compute Unified Device Architecture –Parallel computing platform and programming model for C/C++ –Invented by NVIDIA –Single Program Multiple Data (SPMD) approach

GPU SYSTEM ARCHITECTURE

Old School Approach [Diagram: CPU, North Bridge, South Bridge, RAM, PCI bus, VGA controller, frame buffer, UART; output to VGA display]

Intel Architecture [Diagram: CPU, North Bridge, South Bridge, DDR2 RAM, GPU, GPU memory; output to display]

AMD Architecture [Diagram: CPU, North Bridge, chipset, DDR2 RAM, GPU, GPU memory; output to display]

Variations Unified Memory Architecture (UMA): –GPU shares RAM with CPU –Lower memory bandwidth, higher latency –Cheap, low-end solution Scalable Link Interface (SLI): –NVIDIA –Allows multiple GPUs –High-end solution

Integrated solutions Integrate CPU and Northbridge Integrate GPU and chipset

Game consoles –Similar architectures –Architectures evolve over time –Objective is to reduce costs while maintaining performance

GPU interfaces and drivers GPU attached to CPU via PCI-Express –Replaces older AGP Interfaces such as OpenGL and Direct3D use the GPU as a coprocessor –Send commands, programs and data to the GPU through a specific GPU device driver (these drivers are often buggy!)

Graphics logical pipeline Input Assembler, then Vertex Shader, then Geometry Shader, then Setup & Rasterizer, then Pixel Shader, then Raster Operations/Output Merger. These functions must be mapped onto a programmable GPU

Basic Unified GPU Architecture Programmable processor array –Tightly integrated with fixed-function processors for texture filtering, rasterization, raster operations –Emphasis is on a very high level of parallelism

Example architecture Tesla architecture (NVIDIA GeForce 8800) –112 streaming processor (SP) cores organized as 14 multithreaded streaming multiprocessors (SMs) –Each SP core manages 96 concurrent threads (thread state is maintained by hardware) –GPU connects with four 64-bit DRAM partitions

Example architecture Each SM has –8 SP cores –2 special function units –Separate caches for instructions and constants –A multithreaded instruction unit –Shared memory (NUMA?)
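
As a quick sanity check, the figures quoted on these slides for the example architecture (14 SMs, 8 SP cores per SM, 96 concurrent threads per SP core) can be multiplied out. The constant names below are made up for this sketch.

```python
# Figures taken from the example-architecture slides (GeForce 8800 configuration).
SM_COUNT = 14              # streaming multiprocessors
SP_CORES_PER_SM = 8        # streaming processor cores in each SM
THREADS_PER_SP_CORE = 96   # concurrent threads managed by each SP core

sp_cores = SM_COUNT * SP_CORES_PER_SM
threads_per_sm = SP_CORES_PER_SM * THREADS_PER_SP_CORE
total_threads = sp_cores * THREADS_PER_SP_CORE

print(f"SP cores:           {sp_cores}")
print(f"Threads per SM:     {threads_per_sm}")
print(f"Concurrent threads: {total_threads}")
```

With these numbers the chip has 112 SP cores, and the hardware keeps the state of more than ten thousand threads at once, which is what "very high level of parallelism" means in practice.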

PROGRAMMING GPUS Will focus on parallel computing applications

Key idea Must decompose problem into set of parallel computations –Ideally two-level to match GPU organization

Example [Diagram: data are in a big array, which is split into small arrays, which are in turn split into tiny pieces]
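
The two-level decomposition can be sketched sequentially as follows. This is only an illustration of the idea under assumed parameters: the helper `split`, the function `two_level_sum`, and the choice of 4 blocks of 8 pieces each are hypothetical, and the loops run one after the other here, whereas on a GPU the pieces would be processed in parallel.

```python
def split(seq, parts):
    """Split seq into `parts` contiguous slices of nearly equal length."""
    size, extra = divmod(len(seq), parts)
    out, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        out.append(seq[start:end])
        start = end
    return out


def two_level_sum(big_array, blocks=4, pieces_per_block=8):
    """Sum big_array by splitting it into small arrays, then into tiny pieces."""
    block_results = []
    for small_array in split(big_array, blocks):           # level 1: small arrays
        tiny_results = [sum(tiny)                          # level 2: tiny pieces
                        for tiny in split(small_array, pieces_per_block)]
        block_results.append(sum(tiny_results))            # combine within a block
    return sum(block_results)                              # combine across blocks


data = list(range(1000))
total = two_level_sum(data)
print(total)
```

The outer level corresponds to work handed to one multiprocessor and the inner level to work handed to one thread, which is why a two-level split matches the GPU organization.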

CUDA CUDA programs are written in C Provides three abstractions –Hierarchy of thread groups –Shared memory –Barrier synchronization
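
The thread hierarchy can be imitated in plain Python to show the Single Program Multiple Data style: every "thread" runs the same function and uses its block index and thread index to pick its own element. This is a sequential emulation, not CUDA itself. The names `launch` and `saxpy_kernel` are invented for this sketch; only the index formula mirrors the usual CUDA idiom `blockIdx.x * blockDim.x + threadIdx.x`.

```python
def saxpy_kernel(block_idx, thread_idx, block_dim, a, x, y, out):
    """Body executed by every thread: out[i] = a * x[i] + y[i] for its own i."""
    i = block_idx * block_dim + thread_idx   # global index of this thread
    if i < len(x):                           # guard: the last block may be partly idle
        out[i] = a * x[i] + y[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Emulate a kernel launch: run the kernel once per (block, thread) pair."""
    for block_idx in range(grid_dim):        # blocks of the grid
        for thread_idx in range(block_dim):  # threads of one block
            kernel(block_idx, thread_idx, block_dim, *args)


n = 10
x = [float(i) for i in range(n)]
y = [1.0] * n
out = [0.0] * n
block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n elements

launch(saxpy_kernel, grid_dim, block_dim, 2.0, x, y, out)
print(out)
```

On a real GPU the two loops in `launch` disappear: the hardware schedules all blocks and threads, and the kernel body is the only code the programmer writes per thread.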

Barrier synchronization Barriers let threads –Wait for the other threads to complete a computation step so they can –Exchange results –Start the next step

Example [Diagram: tiny computation, then barrier (wait for each other, exchange partial results), then tiny computation, then barrier (wait for each other, exchange partial results), then tiny computation]
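
The compute, barrier, exchange pattern can be tried out on a CPU with Python's standard `threading.Barrier`. This is a sketch under assumptions: `barrier_sum` and the four small input chunks are made up for the example, and CPU threads stand in for GPU threads.

```python
import threading


def barrier_sum(chunks):
    """Each thread sums its chunk, waits at a barrier, then reads all partial sums."""
    n = len(chunks)
    partial = [0] * n                 # one slot per thread, shared by all threads
    totals = [0] * n
    barrier = threading.Barrier(n)    # releases only when all n threads have arrived

    def worker(tid):
        partial[tid] = sum(chunks[tid])   # step 1: tiny computation
        barrier.wait()                    # wait for each other
        totals[tid] = sum(partial)        # step 2: use the exchanged partial results

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return partial, totals


partial, totals = barrier_sum([[1, 2], [3, 4], [5, 6], [7, 8]])
print(partial, totals)
```

Without the barrier, a fast thread could read `partial` before a slower thread had written its slot and compute a wrong total.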

Big fallacies GPUs –Are not good for general computation –Cannot run double-precision arithmetic –Do not do floating point correctly –Cannot speed up O(n) algorithms
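
To see why the last claim is a fallacy, consider summing n values. Sequentially this takes about n additions one after another, but a tree-shaped reduction needs only about log2(n) steps when the additions within a step run in parallel. The sketch below simulates those steps sequentially and counts them; `tree_reduce_sum` is a name chosen for this example.

```python
def tree_reduce_sum(values):
    """Sum values pairwise in a tree; return (total, number of parallel steps)."""
    data = list(values)
    steps = 0
    stride = 1
    while stride < len(data):
        # All additions inside this loop are independent of each other,
        # so a parallel machine could perform them in a single step.
        for i in range(0, len(data) - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
        steps += 1
    return (data[0] if data else 0), steps


total, steps = tree_reduce_sum(range(1000))
print(total, steps)
```

For 1000 values the sum completes in 10 parallel steps, so an O(n) computation can benefit from a large number of cores.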