Shader generation and compilation for a programmable GPU Student: Jordi Roca Monfort Advisor: Agustín Fernández Jiménez Co-advisor: Carlos González Rodríguez.


Outline Introduction. Background. Goals. Design and implementation. Conclusions.

Introduction

ATTILA simulation framework
[Diagram: an OpenGL Application runs on the Vendor OpenGL API and Vendor Driver while GLInterceptor records an OpenGL trace; GLPlayer replays the trace on the ATTILA OpenGL API, ATTILA Driver and ATTILA Simulator, which produces Statistics.]

[Diagram: the ATTILA simulation framework again (OpenGL Application, Vendor OpenGL API, Vendor driver, GLInterceptor, OpenGL trace, GLPlayer, ATTILA OpenGL API, ATTILA Driver, ATTILA Simulator, Statistics), with a "My Work" label.]
- The ATTILA Simulator simulates the latest generation of 3D graphics boards (programmable GPUs).
- My work: extend and complete the OpenGL API so that it can execute recent, advanced 3D applications (Doom 3, Unreal Tournament, etc.).

Background

Rendering (I)
What is rendering? Generating the pixels for the set of images (frames) that form an animated scene.
Goal: compute each pixel color as fast as possible, since this determines the frame rate (FPS).
Which computations are required? Given the database (DB) of scene objects, compute the color of the projected objects in the pixel area of the screen. Each pixel color depends on the scene lighting and on the viewer (camera) position.

Rendering (II)
[Diagram: rendering data (geometry info: position, color; lighting info: position; viewer info) projected onto the screen area.]

Rendering approaches
- For each pixel (x,y), compute the physical interaction between the lights and the objects in the scene: ray tracing, radiosity, photon mapping. The per-pixel computation is very expensive, but it captures global lighting (shadows, indirect reflections among objects).
- Compute the interaction between objects and lights only at the vertices, and approximate the corresponding value for each pixel (x,y): direct rendering (3D graphics boards, 3D game consoles, etc.). Only direct illumination from the light sources is considered (each vertex color is independent).

Direct Rendering (I)
[Diagram: rendering data (geometry info: position, color; lighting info: position; viewer info) projected onto the screen area, with color interpolation between the vertices.]

Direct Rendering (II)
The higher the vertex density, the more realistic the lighting. In addition, more vertices are required to improve the level of detail of surfaces.
Thus: ▲ realism → ▲ vertices → ▲ computation → ▼ FPS
Solution: specify the surface using fewer vertices and specify the surface details using textures.

Textures
[Diagram: rendering data (geometry info: position, color; lighting info: position; viewer info; textures) projected onto the screen area.]

Texture mapping
[Diagram: a primitive in the screen area whose vertices carry the texture coordinates (0.63, 0.86), (0.26, 0.37) and (0.79, 0.10).]

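Texture mapping
[Diagram: the same screen area with vertex texture coordinates (0.63, 0.86), (0.26, 0.37) and (0.79, 0.10); the coordinate interpolator yields (0.40, 0.45) for a pixel, and the texture is sampled at that coordinate to obtain the texture sampled value.]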
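The coordinate interpolator on this slide can be illustrated with a minimal Python sketch. It assumes plain barycentric interpolation over a triangle and ignores perspective correction; the function name and the choice of weights are hypothetical, and only the three vertex coordinates come from the slide.

```python
def interpolate_texcoord(weights, texcoords):
    """Blend per-vertex (s, t) texture coordinates with barycentric weights.

    Assumption: simple linear (screen-space) interpolation, with no
    perspective correction.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    s = sum(w * tc[0] for w, tc in zip(weights, texcoords))
    t = sum(w * tc[1] for w, tc in zip(weights, texcoords))
    return (s, t)


# Vertex texture coordinates taken from the slide.
TRIANGLE_TEXCOORDS = [(0.63, 0.86), (0.26, 0.37), (0.79, 0.10)]

# A pixel exactly on the first vertex gets that vertex's coordinates.
at_vertex = interpolate_texcoord((1.0, 0.0, 0.0), TRIANGLE_TEXCOORDS)

# A pixel at the centroid gets the average of the three coordinates.
at_centroid = interpolate_texcoord((1 / 3, 1 / 3, 1 / 3), TRIANGLE_TEXCOORDS)
```

The interpolated pair is then used as the lookup position in the texture, which is the "texture sampled value" step of the diagram.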

3D Rendering Pipeline
Input: 3D scene (vertex DB, viewer info, lighting info, textures).
- Vertex processing stage (VERTEX SHADING), a parallelizable process. Computes: color, coordinates, vertex position on screen.
- RASTERIZER. Generates interpolated attributes (color, coordinates).
- Fragment processing stage (FRAGMENT SHADING), a parallelizable process. Per-pixel texture mapping.
Output: final screen.

3D RP Implementation
Implementations:
- Software: Mesa 3D Graphics Library (OpenGL).
- Software + hardware acceleration: Vendor OpenGL, Direct3D, Xbox, PlayStation, etc. The work is distributed between the CPU and the graphics board transparently to the applications.

3D accelerators evolution
- 2D accelerators (pre-Voodoo): before 1996
- 3D accelerators (3Dfx Voodoo): 1996
- Graphics Processing Units (GeForce): 1999
- Programmable GPUs (GeForce 3): 2001
[Diagram: for each generation (VGA, 3D accelerators, GPU, PGPU), the pipeline DB, VS, Rasterizer, FS, Final screen, showing how the stages are split between the CPU and the graphics board.]

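GPUs: applying 2 textures
[Diagram: the Rasterizer emits a fragment stream; fragment F1 carries (x,y), an interpolated color, texture coordinate 1 and texture coordinate 2. The fixed-function Fragment Unit 0 reads the Texture Memory and combines the values with fixed + and * operators to produce the final color.]
Uses: per-pixel lighting, shadow implementation, bump mapping.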

Programmable GPUs: 2 textures
[Diagram: the Rasterizer emits a fragment stream; fragment F1 carries (x,y), an interpolated color and two texture coordinates. Fragment Shader 0, one of the Shader Processors (ALU plus temporaries), reads the Texture Memory and produces the final color by running the program below.]
    LDTEX t1, coord1, Text1
    LDTEX t2, coord2, Text2
    ADD t1, colorIn, t1
    MUL t1, t1, t2
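To make the four-instruction program concrete, here is a small Python sketch that emulates it on the CPU. It is an illustration only: textures are assumed to be lists of rows of RGBA tuples, `ldtex` is assumed to do a nearest-texel fetch, and no clamping is applied; none of these details are stated on the slide.

```python
def ldtex(texture, coord):
    """Nearest-texel fetch (assumed behaviour of LDTEX for this sketch)."""
    height, width = len(texture), len(texture[0])
    x = min(int(coord[0] * width), width - 1)
    y = min(int(coord[1] * height), height - 1)
    return texture[y][x]


def add(a, b):
    """Component-wise vector addition (ADD)."""
    return tuple(x + y for x, y in zip(a, b))


def mul(a, b):
    """Component-wise vector multiplication (MUL)."""
    return tuple(x * y for x, y in zip(a, b))


def two_texture_shader(color_in, coord1, coord2, text1, text2):
    t1 = ldtex(text1, coord1)   # LDTEX t1, coord1, Text1
    t2 = ldtex(text2, coord2)   # LDTEX t2, coord2, Text2
    t1 = add(color_in, t1)      # ADD t1, colorIn, t1
    t1 = mul(t1, t2)            # MUL t1, t1, t2
    return t1                   # final color = (colorIn + tex1) * tex2


# One-texel textures keep the example easy to follow by hand.
TEXT1 = [[(0.25, 0.25, 0.25, 0.0)]]
TEXT2 = [[(0.5, 0.5, 0.5, 1.0)]]
final_color = two_texture_shader((0.25, 0.5, 0.75, 1.0),
                                 (0.5, 0.5), (0.5, 0.5), TEXT1, TEXT2)
```

Changing the visual effect only requires a different instruction sequence, which is the point of replacing the fixed + and * operators with a shader.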

Shader Processors
Shader Processors (SP) execute small programs (shaders), built from vector and scalar instructions, that define the computation in the following stages:
- Vertex processing (Vertex Shader): lighting computation, on-screen vertex projection, texture coordinate generation.
- Fragment processing (Fragment Shader): texture color fetch and blending, FOG.
The result is like a GPU that supports "infinite visualization effects" that previous generations of graphics boards could not offer.

Goals

Implement all the necessary modules in the OpenGL API to:
- Support new, real 3D applications that use shaders in our simulation framework.
- Also support old applications that use the Fixed Function (FF) pipeline, and applications that combine shaders and FF.
Idea: emulate the Fixed Function by generating equivalent shaders for the SP.

Things to do
- Implement shader support in our OpenGL API, using the shader programming languages most widely used by 3D applications: ARB_vertex_program and ARB_fragment_program.
- Study how to express FF functions in terms of shaders (pre-study phase).

Design and implementation

Fixed Function emulation

FF Emulation
[Diagram: DB, Vertex Shader, Rasterizer, Fragment Shader, Final screen, with one program attached to each shader stage.]

Vertex program:
    !!ARBvp1.0
    ATTRIB pos = vertex.position;
    PARAM mat[4] = { state.matrix.mvp };
    # Transform by concatenation of the
    # MODELVIEW and PROJECTION matrices.
    DP4 result.position.x, mat[0], pos;
    DP4 result.position.y, mat[1], pos;
    DP4 result.position.z, mat[2], pos;
    DP4 result.position.w, mat[3], pos;
    # Pass the primary color through
    # w/o lighting.
    MOV result.color, vertex.color;
    END

Fragment program:
    !!ARBfp1.0
    # first set of texture coordinates
    ATTRIB tex = fragment.texcoord;
    # interpolated color
    ATTRIB col = fragment.color;
    OUTPUT outColor = result.color;
    TEMP tmp;
    # sample the texture
    TEX tmp, tex, texture, 2D;
    # perform the modulation
    MUL outColor, tmp, col;
    END
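The idea of emulating the Fixed Function by generating shaders can be sketched as a small program generator. The Python below is a hypothetical illustration, not the ATTILA implementation: the function name and the single `texture_enabled` flag are assumptions, and the emitted text simply reuses the fragment program shown on this slide.

```python
def generate_ff_fragment_program(texture_enabled: bool) -> str:
    """Emit ARB fragment program text equivalent to a tiny slice of FF state.

    Hypothetical generator: a real one would look at far more state
    (texture units, combine modes, fog, and so on).
    """
    lines = [
        "!!ARBfp1.0",
        "ATTRIB col = fragment.color;",
        "OUTPUT outColor = result.color;",
    ]
    if texture_enabled:
        # Texturing on: sample the texture and modulate with the color.
        lines += [
            "ATTRIB tex = fragment.texcoord;",
            "TEMP tmp;",
            "TEX tmp, tex, texture, 2D;",
            "MUL outColor, tmp, col;",
        ]
    else:
        # Texturing off: pass the interpolated color through unchanged.
        lines.append("MOV outColor, col;")
    lines.append("END")
    return "\n".join(lines)


textured = generate_ff_fragment_program(texture_enabled=True)
untextured = generate_ff_fragment_program(texture_enabled=False)
```

The generated text would then go through the same ARB compiler as an application-supplied shader, so the simulator only ever executes shader code.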

FF emulation
Implemented functions (according to OpenGL Spec 2.0):
- Vertex Shading (85% of total):
  - Per-vertex standard OpenGL lighting: point, directional and spot lights; attenuation; local and infinite viewer.
  - Vertex transformation.
  - Automatic texture coordinate generation: Object Plane and Eye Plane; Normal Map, Reflection Map and Sphere Map.
  - FOG coordinate.
- Fragment Shading (90% of total):
  - Multi-texturing and texture combine functions.
  - FOG application: Linear, Exponential and Second Order Exponential.

FF emulation example
FOG application. Algorithm: for each pixel, interpolate linearly between the original color and the fog color, according to the distance from the object to the viewer.

FOG emulation
FOG exponential mode:
$f = e^{-\mathrm{density} \cdot \mathrm{fogcoord}} = 2^{-(\mathrm{density} \cdot \mathrm{fogcoord})/\ln 2}$, because $e = 2^{1/\ln 2}$
$\text{final color} = \text{pixel color} \cdot f + \text{fog color} \cdot (1 - f)$
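As a reference for the three fog modes listed earlier (Linear, Exponential, Second Order Exponential), the following Python sketch evaluates the fog factor and the blend. The formulas follow the standard OpenGL fog equations; the function names and the `start`/`end` defaults are assumptions made for the example.

```python
import math


def fog_factor(mode, fogcoord, density=1.0, start=0.0, end=1.0):
    """Fog factor f for the three OpenGL fog modes, clamped to [0, 1]."""
    if mode == "linear":
        f = (end - fogcoord) / (end - start)
    elif mode == "exp":
        f = math.exp(-density * fogcoord)
    elif mode == "exp2":
        f = math.exp(-((density * fogcoord) ** 2))
    else:
        raise ValueError("unknown fog mode: " + mode)
    return min(max(f, 0.0), 1.0)


def apply_fog(pixel_color, fog_color, f):
    """final color = pixel color * f + fog color * (1 - f)."""
    return tuple(p * f + g * (1.0 - f) for p, g in zip(pixel_color, fog_color))


# A red pixel half-way into blue fog.
blended = apply_fog((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.5)
```

With f = 1 the pixel keeps its own color, and with f = 0 it takes the fog color entirely.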

FOG emulation
    !!ARBfp1.0
    ATTRIB fogCoord = fragment.fogcoord;
    OUTPUT oColor = result.color;
    PARAM fogColor = state.fog.color;
    PARAM fogParams = program.local[0];   # fogParams.x : density/ln(2)
    TEMP fragmentColor, fogFactor;
    # Texture applications....
    # Fog Factor computing...
    MUL fogFactor.x, fogParams.x, fogCoord.x;   # fogFactor.x = density*fogcoord/ln(2)
    EX2_SAT fogFactor.x, -fogFactor.x;          # fogFactor.x = 2^-(fogFactor.x)
    # Fog color interpolation
    LRP oColor, fogFactor.x, fragmentColor, fogColor;
    END
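The three fog instructions can be traced step by step in Python. This is a sketch under assumptions: `ex2_sat` and `lrp` are hand-written stand-ins for the EX2 (with saturation) and LRP instructions, and the driver is assumed to have stored density/ln(2) in the first component of `program.local[0]`, as the program comment says.

```python
import math


def ex2_sat(x):
    """EX2 with the _SAT modifier: 2^x clamped to [0, 1]."""
    return min(max(2.0 ** x, 0.0), 1.0)


def lrp(a, b, c):
    """LRP: a * b + (1 - a) * c, with the scalar a applied to every component."""
    return tuple(a * bi + (1.0 - a) * ci for bi, ci in zip(b, c))


def fog_factor_instr(density, fogcoord):
    fog_params_x = density / math.log(2.0)  # value assumed in program.local[0].x
    fog_factor = fog_params_x * fogcoord    # MUL fogFactor.x, fogParams.x, fogCoord.x
    return ex2_sat(-fog_factor)             # EX2_SAT fogFactor.x, -fogFactor.x


def fog_program(fragment_color, fog_color, density, fogcoord):
    f = fog_factor_instr(density, fogcoord)
    return lrp(f, fragment_color, fog_color)  # LRP oColor, fogFactor.x, ...


# The base-2 form agrees with the base-e definition of exponential fog.
difference = abs(fog_factor_instr(0.5, 3.0) - math.exp(-0.5 * 3.0))
```

Folding the 1/ln(2) constant into a program parameter lets the shader use the base-2 exponential instruction without an extra multiply per fragment.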

ARB compilers

Vertex program:
    !!ARBvp1.0
    ATTRIB pos = vertex.position;
    PARAM mat[4] = { state.matrix.mvp };
    # Transform by concatenation of the
    # MODELVIEW and PROJECTION matrices.
    DP4 result.position.x, mat[0], pos;
    DP4 result.position.y, mat[1], pos;
    DP4 result.position.z, mat[2], pos;
    DP4 result.position.w, mat[3], pos;
    # Pass the primary color through
    # w/o lighting.
    MOV result.color, vertex.color;
    END

Fragment program:
    !!ARBfp1.0
    # first set of texture coordinates
    ATTRIB tex = fragment.texcoord;
    # interpolated color
    ATTRIB col = fragment.color;
    OUTPUT outColor = result.color;
    TEMP tmp;
    # sample the texture
    TEX tmp, tex, texture, 2D;
    # perform the modulation
    MUL outColor, tmp, col;
    END

The compilers common architecture
Input program:
    !!ARBvp1.0
    PARAM arr[5] = { program.env[0..4] };
    #ADDRESS addr;
    ATTRIB v1 = vertex.attrib[1];
    PARAM par1 = program.local[0];
    OUTPUT oPos = result.position;
    OUTPUT oCol = result.color.front.primary;
    OUTPUT oTex = result.texcoord[2];
    ARL addr.x, v1.x;
    MOV res, arr[addr.x - 1];
    END
Compiler stages: Lexical - Syntactic Analysis (Flex + Bison) → IR → Semantic Analysis (with Symbol table) → Code generation (Generic, then GPU Specific).
[Figure: hexadecimal dump of the generated GPU-specific code, 16 bytes (By0 to ByF) per line.]

Intermediate Representation
Example:
    !!ARBvp1.0
    ATTRIB pos = vertex.position;
    PARAM mat[4] = { state.matrix.mvp };
    # Transform by concatenation of the
    # MODELVIEW and PROJECTION matrices.
    DP4 result.position.x, mat[0], pos;
    DP4 result.position.y, mat[1], pos;
    DP4 result.position.z, mat[2], pos;
    DP4 result.position.w, mat[3], pos;
    # Pass the primary color through
    # w/o lighting.
    MOV result.color, vertex.color;
    END

Resulting IR tree:
    IRProgram
      header: "!!ARBvp1.0"
      Program Statements:
        IRVP1ATTRIBStatement
          name: pos
          attrib: vertex.position
        IRInstruction
          opcode: DP4
          destination:
            IRDstOperand
              destination: result.position
              writeMask: x
              isResultRegister: true
          sources:
            IRSrcOperand
              source: mat
              swizzleMask: xyzw
              isInputRegister: false
            IRSrcOperand
              source: pos
              swizzleMask: xyzw
              isInputRegister: false
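The IR tree above maps naturally onto plain data classes. The Python sketch below mirrors the node and field names from the slide (converted to snake_case); the class layout itself is an assumption for illustration and is not taken from the ATTILA sources.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IRSrcOperand:
    source: str
    swizzle_mask: str = "xyzw"
    is_input_register: bool = False


@dataclass
class IRDstOperand:
    destination: str
    write_mask: str = "xyzw"
    is_result_register: bool = False


@dataclass
class IRInstruction:
    opcode: str
    destination: IRDstOperand
    sources: List[IRSrcOperand]


@dataclass
class IRAttribStatement:
    name: str
    attrib: str


@dataclass
class IRProgram:
    header: str
    statements: list = field(default_factory=list)


def build_example() -> IRProgram:
    """IR for the ATTRIB line and the first DP4 of the example program."""
    program = IRProgram(header="!!ARBvp1.0")
    program.statements.append(IRAttribStatement("pos", "vertex.position"))
    program.statements.append(
        IRInstruction(
            opcode="DP4",
            destination=IRDstOperand("result.position", write_mask="x",
                                     is_result_register=True),
            sources=[IRSrcOperand("mat"), IRSrcOperand("pos")],
        )
    )
    return program


example_ir = build_example()
```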

Semantic analysis and generic code generation Features: Implemented using the visitor pattern. Decouples IR from the different operations involved in each compiler phase. Allows using a common analyzer and a common code generator for both program types.
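A minimal Python sketch of the visitor pattern shows how the IR stays free of phase-specific logic. All class names here are hypothetical and the IR is reduced to two node kinds; the point is only that the semantic check and the code generator are separate visitors over the same tree, regardless of the program header.

```python
class IRNode:
    def accept(self, visitor):
        # Dispatch to visit_<ClassName> on whichever visitor is passed in.
        return getattr(visitor, "visit_" + type(self).__name__)(self)


class IRProgram(IRNode):
    def __init__(self, header, statements):
        self.header = header
        self.statements = statements


class IRDeclaration(IRNode):
    def __init__(self, name):
        self.name = name


class IRInstruction(IRNode):
    def __init__(self, opcode, dest, sources):
        self.opcode = opcode
        self.dest = dest
        self.sources = sources


class SemanticCheck:
    """Visitor 1: reports identifiers used before being declared."""

    def __init__(self):
        self.declared = set()
        self.errors = []

    def visit_IRProgram(self, node):
        for statement in node.statements:
            statement.accept(self)
        return self.errors

    def visit_IRDeclaration(self, node):
        self.declared.add(node.name)

    def visit_IRInstruction(self, node):
        for name in [node.dest] + node.sources:
            if name not in self.declared:
                self.errors.append("undeclared identifier: " + name)


class CodeGenerator:
    """Visitor 2: emits one generic instruction tuple per IR instruction."""

    def visit_IRProgram(self, node):
        emitted = [statement.accept(self) for statement in node.statements]
        return [code for code in emitted if code is not None]

    def visit_IRDeclaration(self, node):
        return None  # declarations produce no code in this sketch

    def visit_IRInstruction(self, node):
        return (node.opcode, node.dest, tuple(node.sources))


program = IRProgram("!!ARBvp1.0", [
    IRDeclaration("pos"),
    IRDeclaration("out"),
    IRInstruction("MOV", "out", ["pos"]),
    IRInstruction("MOV", "res", ["pos"]),  # "res" was never declared
])
errors = program.accept(SemanticCheck())
code = program.accept(CodeGenerator())
```

Adding a new compiler phase means adding a new visitor class; the IR node classes stay unchanged.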

Code generation
- Phase 1: generate architecture-independent generic code, assuming unbounded machine resources.
- Phase 2: translate it to specific code, taking into account the constraints of the concrete GPU architecture.
[Diagram: Generic Code (a list of GenericInstruction) plus a Machine File Descriptor are translated into Specific Code (a list of GPUInstruction).]
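The two phases can be sketched in Python as follows. This is an assumption-laden illustration: virtual temporaries are written `%0`, `%1`, ..., the machine descriptor holds only a temporary-register count, and phase 2 is reduced to a simple register assignment that reuses a register once its temporary is no longer read. The real translation covers many more constraints.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GenericInstruction:
    opcode: str
    dest: str
    sources: Tuple[str, ...]


@dataclass(frozen=True)
class MachineDescriptor:
    temp_registers: int  # hypothetical single constraint


def is_virtual(name):
    return name.startswith("%")


def lower(generic_code, machine):
    """Phase 2: map unbounded virtual temporaries onto physical registers."""
    last_use = {}
    for index, ins in enumerate(generic_code):
        for name in (ins.dest,) + ins.sources:
            if is_virtual(name):
                last_use[name] = index

    free = ["r%d" % i for i in range(machine.temp_registers)]
    mapping = {}
    specific = []
    for index, ins in enumerate(generic_code):
        sources = tuple(mapping.get(s, s) for s in ins.sources)
        # A temporary read for the last time frees its register.
        for s in ins.sources:
            if s in mapping and last_use[s] == index:
                free.insert(0, mapping.pop(s))
        dest = ins.dest
        if is_virtual(dest):
            if dest not in mapping:
                if not free:
                    raise ValueError("out of temporary registers")
                mapping[dest] = free.pop(0)
            dest = mapping[dest]
        specific.append((ins.opcode, dest, sources))
    return specific


# Phase 1 output: generic code that uses three virtual temporaries.
GENERIC = [
    GenericInstruction("TEX", "%0", ("coord1",)),
    GenericInstruction("TEX", "%1", ("coord2",)),
    GenericInstruction("ADD", "%2", ("%0", "colorIn")),
    GenericInstruction("MUL", "out", ("%2", "%1")),
]
specific_code = lower(GENERIC, MachineDescriptor(temp_registers=2))
```

Keeping the generic phase free of resource limits means that only `lower` and the machine descriptor need to change when targeting a different GPU configuration.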

Conclusions

Achieved goals
The OpenGL API implementation now supports:
- Fixed Function emulation of almost the entire set of functions of the VS and FS stages (the most important ones).
- Shader compilation for the ARB_vertex_program and ARB_fragment_program specifications. Both compilers share most of the implementation, with a clear separation between generic and specific stages.

Future work
- Support other parts of the 3D rendering pipeline (e.g. interpolation) as programmable stages, to reduce hardware complexity and power consumption (embedded systems).
- Implement compilers for high-level shading languages (GLSlang, HLSL).

End of the presentation