Many-Core Programming with GRAMPS
Jeremy Sugerman, Kayvon Fatahalian, Solomon Boulos, Kurt Akeley, Pat Hanrahan


2 Problem Statement
- Facilitate efficient development and execution in many-/multi-core commodity systems.
- Homogeneous or heterogeneous cores.
Status Quo:
- GPUs: Easy to write GL/D3D and run it fast, hard to express anything else
- CPUs: Possible (not easy) to write anything, possible (hard) to run it fast

3 GRAMPS Background
- Resembles a GPU with software constructed pipeline.
- Not (too) radical even in a pure graphics context
- Similar story saw fixed -> programmable shading
- Now the pipeline topology is under analogous pressures: proliferation of stages and options
- And graphics is more than a GL/D3D pipeline…
- And throughput / many-core is more than graphics…

4 GRAMPS Programming Model
- Software constructs the pipeline (actually graph)
- Exposes threads, shaders, fixed function stages
  - Coprocessors exposed via ISA
- Exposes FIFOs / Queues connecting stages
- Also enables software push / re-sorting
- Exposes Buffers for memory access
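To make "software constructs the graph" concrete, here is a minimal sketch of an application wiring stages together with queues. Every name in it (Graph, Stage, Queue, add_stage, add_queue, build_raster_graph) is hypothetical and is not the GRAMPS API. The stage and queue names follow the rasterization example from the slides, while the stage kinds assigned to them are illustrative guesses.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Queue:
    """A FIFO connecting a producer stage to a consumer stage."""
    name: str
    items: deque = field(default_factory=deque)


@dataclass
class Stage:
    """A graph node; kind is 'thread', 'shader', or 'fixed'."""
    name: str
    kind: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


class Graph:
    """Application-built execution graph (hypothetical API, not GRAMPS's)."""

    def __init__(self):
        self.stages = {}
        self.queues = {}

    def add_queue(self, name):
        self.queues[name] = Queue(name)
        return self.queues[name]

    def add_stage(self, name, kind, inputs=(), outputs=()):
        # Queues must be declared before the stages that use them.
        stage = Stage(name, kind,
                      [self.queues[q] for q in inputs],
                      [self.queues[q] for q in outputs])
        self.stages[name] = stage
        return stage

    def consumers(self, queue_name):
        """Names of stages that read from the given queue."""
        return [s.name for s in self.stages.values()
                if any(q.name == queue_name for q in s.inputs)]


def build_raster_graph():
    # Stage kinds below are assumptions made for illustration only.
    g = Graph()
    g.add_queue("input_fragments")
    g.add_queue("output_fragments")
    g.add_stage("Rast", "fixed", outputs=["input_fragments"])
    g.add_stage("Shade", "shader",
                inputs=["input_fragments"], outputs=["output_fragments"])
    g.add_stage("FB Blend", "thread", inputs=["output_fragments"])
    return g
```

The point of the sketch is that the topology is ordinary application data: adding a stage or rerouting a queue is a change to the setup code, not to the runtime.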

5 GRAMPS’ Place
- Compared to GPU Pipeline: More things possible (and medium easy), still (mostly) runs fast, less hardware independent
- Compared to CPU: Easier to write things, easier to run them well, some loss of expressivity and flexibility
- Still a role for a ‘graphics pipeline’. It’s an app!
- GRAMPS is a layer, model for state machines.

6 GRAMPS and Streaming
- From some angles, GRAMPS sounds a lot like Stream Processing / Computing
- Distinctions are most visible in the target traits.
- Streaming expects predictable data creation, flow, and consumption. Intensive offline / compile-time optimization and pre-scheduling.
- GRAMPS expects dynamic data-dependent execution, (and thus) run-time scheduling
- Also, GRAMPS assumes commodity and heterogeneity.
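The contrast with pre-scheduled streaming can be illustrated with a toy run-time scheduler. This is a sketch under stated assumptions, not the GRAMPS scheduler: run, split, collect, and demo are hypothetical names, and the policy (run any stage whose input queue is non-empty) is the simplest one that works. The split stage emits a data-dependent number of items and feeds its own input queue, so how many times each stage runs cannot be fixed ahead of time.

```python
from collections import deque


def run(stages, queues):
    """Run any stage whose input queue is non-empty until all are idle.

    stages: list of (input_queue_name, function) pairs. Each function takes
    one item and returns a list of (output_queue_name, item) pairs whose
    length may depend on the data. Returns the sequence of stage names run.
    """
    trace = []
    progress = True
    while progress:
        progress = False
        for in_name, fn in stages:
            if queues[in_name]:
                item = queues[in_name].popleft()
                for out_name, out_item in fn(item):
                    queues[out_name].append(out_item)
                trace.append(fn.__name__)
                progress = True
    return trace


def split(n):
    # Data-dependent output: large items are halved and re-queued (a cycle),
    # unit items move on to the next stage.
    if n > 1:
        half = n // 2
        return [("work", half), ("work", n - half)]
    return [("done", n)]


def collect(n):
    return [("result", n)]


def demo(start=5):
    queues = {"work": deque([start]), "done": deque(), "result": deque()}
    trace = run([("work", split), ("done", collect)], queues)
    return list(queues["result"]), trace
```

Calling demo(5) breaks the initial item into five unit items; the interleaving of split and collect in the returned trace is decided while the graph runs, which is the property the slide attributes to GRAMPS.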

7 GRAMPS Examples
[Figure: two example graphs]
- Rasterization Pipeline: stages Rast, Shade, FB Blend; queues Input Fragment Queue, Output Fragment Queue; output to Frame Buffer
- Ray Tracing Pipeline: stages Camera, Intersect, Shade, FB Blend; queues Ray Queue, Sample Queue, Pixel Queue; output to Frame Buffer

8 GRAMPS Overview
- Concepts:
  - Graphs
  - Stages: thread, shader, fixed-function
  - Queues: ordered, unordered, sets (exclusion)
  - Buffers
- Components:
  - APIs: setup/driver, thread, shader
  - Scheduler: fat core, shader core, top-level
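The slide lists ordered and unordered queues without defining them. One plausible reading, shown here purely as an assumption, is that an ordered queue delivers items in the order producers claimed their slots even when parallel producers finish out of order, while an unordered queue delivers items as they arrive. The sketch below implements the ordered case with a reserve/commit scheme; OrderedQueue and its methods are hypothetical names, not the GRAMPS API.

```python
class OrderedQueue:
    """Delivers items in reservation order, even if commits arrive out of order."""

    def __init__(self):
        self._next_ticket = 0   # next slot to hand out to a producer
        self._next_out = 0      # next slot the consumer is allowed to read
        self._committed = {}    # ticket -> item, for slots already filled

    def reserve(self):
        """Claim the next slot; the ticket fixes this item's output position."""
        ticket = self._next_ticket
        self._next_ticket += 1
        return ticket

    def commit(self, ticket, item):
        """Fill a previously reserved slot."""
        self._committed[ticket] = item

    def pop_ready(self):
        """Return every item that is next in order and already committed."""
        ready = []
        while self._next_out in self._committed:
            ready.append(self._committed.pop(self._next_out))
            self._next_out += 1
        return ready
```

If a later slot is committed before an earlier one, pop_ready returns nothing until the earlier slot is filled. That stall is the cost of ordering, and an unordered queue avoids it by giving up the ordering guarantee.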

9 What We’ve Built
- Three rendering pipelines: Direct3D, Packet Tracer, D3D + Push (Hybrid)
- Simulator and Runtime for two machines:
  - GPU-like: Many threads per core, hw sched
  - CPU-like: Few threads per core, sw sched

10 Rendering Pipelines
[Figure: two pipeline graphs]
- Direct3D Pipeline (with Ray-tracing Extension): stages IA 1 … IA N, VS 1 … VS N, RO, Rast, PS, OM, Trace, PS2; queues Input Vertex Queue 1 … N, Primitive Queue 1 … N, Primitive Queue, Sample Queue Set, Fragment Queue, Ray Queue, Ray Hit Queue; buffers Vertex Buffers, Frame Buffer
- Ray-tracing Pipeline: stages Tiler, Sampler, Camera, Intersect, Shade, FB Blend; queues Tile Queue, Sample Queue, Ray Queue, Ray Hit Queue, Fragment Queue; buffer Frame Buffer
- Legend: Thread Stage, Shader Stage, Fixed-func Stage, Queue, Stage Output, Output via Push

11 Initial Results
- Measured thread occupancy, worst case total queue memory.

12 GRAMPS Vis

13 High-level Challenges
- Is GRAMPS a suitable GPU evolution?
  - Enable pipeline competitive with bare metal?
  - Enable innovation: advanced / alternative methods?
  - Is there a ‘best’ graphics pipeline on top?
- Is GRAMPS a good parallel compute model?
  - Map well to hardware, hardware trends?
  - Support important apps?
  - Concepts influence developers?

14 What’s Next?
- Low level implementation: scheduling, more accurate simulation.
- More apps: REYES, physics, likely more.
- Audit and refine model: graph modification / state change, fork-join / blocking calls, locks / barriers / synchronization primitives intra- or inter-stage
- Prototype, explore next generation graphics pipelines.