GPU Programming Yanci Zhang Game Programming Practice.

Presentation transcript:

GPU Programming Yanci Zhang Game Programming Practice

Outline
  Parallel computing
  GPU overview
  OpenGL shading language overview
  Vertex / Geometry / Fragment shader
  Using GLSL in OpenGL
  Application: Per-pixel shading

Why Parallel Computing?
  CPU performance increased about 50% per year from 1986 to 2002
    To get more performance, one could simply wait for the next generation of CPU
  Single-processor performance improvement has slowed to about 20% per year since 2002
  The road to rapidly increasing performance lies in the direction of parallelism
    Put multiple processors on a single circuit rather than developing ever-faster monolithic processors

What is GPU?
  GPU: Graphics Processing Unit
  Developed rapidly from primitive drawing devices into major computing resources
  Extremely powerful and flexible processor
    Tremendous memory bandwidth and computational power
    High-level languages have emerged
  Capable of general-purpose computation beyond graphics applications
Notes: The GPU has evolved into an extremely powerful and flexible processor. The latest graphics architectures provide tremendous memory bandwidth and computational power, with fully programmable vertex and pixel processing units that support vector operations up to 32-bit floating-point precision. High-level languages have emerged for graphics hardware, making this computational power accessible. Architecturally, GPUs are highly parallel streaming processors optimized for vector operations, with both MIMD (vertex) and SIMD (pixel) pipelines. Not surprisingly, these processors are capable of general-purpose computation beyond the graphics applications for which they were designed.

Motivation
  In many respects the GPU is more powerful than the CPU
    Computational power: FLOPS (floating-point operations per second)
    Parallelism
    Bandwidth
    Performance growth rate

Floating Point Calculation
  FLOPS: a common benchmark measurement for rating the speed of an FPU
    CPU: Intel Core i7 980 XE (six-core): 107.55 GFLOPS
    GPU: NVIDIA GeForce GTX 480: 2.02 TFLOPS
  Modern GPUs support high-precision 32-bit floating point throughout the pipeline
    No support for a double precision format
Notes: FLOPS is a common benchmark measurement for rating the speed of microprocessors. Floating-point operations include any operations that involve fractional numbers.

Parallelism
  Parallelism: allows multiple operations to execute at the same time
  CPU
    Does not adequately exploit parallelism
    Dual-core, quad-core
  GPU
    GeForce GTX 480: 480 CUDA cores
Notes: CPU programming models are generally serial ones that do not adequately expose data parallelism in their applications. They do an admirable job of taking advantage of instruction parallelism (IP) and allow some data-parallel (DP) execution, but the degree of parallelism exploited by a CPU is much less than that of a GPU.

Bandwidth
  Peak performance of computer systems is often far in excess of actual application performance
  The bandwidth between key components ultimately dictates system performance
    CPU: 64-bit DDR3-2133: about 17 GB/s per channel (about 34 GB/s dual-channel)
    GPU: GeForce GTX 480: 384-bit, 177.4 GB/s
Notes: Peak performance of computer systems is often far in excess of actual application performance, due to the memory gap problem: the mismatch of memory and processor performance. In data-intensive applications, the processing elements (PEs) often spend most of the time waiting for data. GPUs have traditionally been optimized for high data throughput, with wide data buses (256 bit) and the latest memory technology (GDDR3).
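The peak bandwidth figures on this slide follow from bus width multiplied by transfer rate. The C sketch below shows that arithmetic. The function name `peak_bandwidth_gbs` is hypothetical, and the 3696 MT/s effective transfer rate used for the GTX 480 in the usage note is an assumption that the slides do not state.

```c
/* Peak memory bandwidth in GB/s (1 GB = 1e9 bytes).
 * bandwidth = (bus width in bytes) * (transfers per second).
 * mega_transfers_per_s is the effective data rate in MT/s,
 * e.g. 2133 for DDR3-2133. */
double peak_bandwidth_gbs(int bus_width_bits, double mega_transfers_per_s)
{
    double bytes_per_transfer = bus_width_bits / 8.0;
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9;
}
```

For example, `peak_bandwidth_gbs(64, 2133.0)` gives roughly 17 GB/s for one 64-bit DDR3-2133 channel, and `peak_bandwidth_gbs(384, 3696.0)` gives roughly 177.4 GB/s under the assumed GTX 480 data rate.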

Getting Faster and Faster
  CPU
    Annual growth ~1.5x -> decade growth ~60x
    Moore's law
  GPU
    Annual growth ~2.0x -> decade growth > 1000x
    Faster than Moore's law
  The multi-billion-dollar video game market is a pressure cooker that drives innovation
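The decade figures are annual growth compounded over ten years. A small C sketch of that arithmetic follows; the function name `compound_growth` is hypothetical.

```c
/* Total growth factor after `years` of compounding,
 * where each year multiplies performance by `annual_factor`. */
double compound_growth(double annual_factor, int years)
{
    double total = 1.0;
    for (int i = 0; i < years; ++i)
        total *= annual_factor;
    return total;
}
```

With these inputs, `compound_growth(1.5, 10)` is about 57.7 (the "~60x" on the slide) and `compound_growth(2.0, 10)` is 1024 (the "> 1000x").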

Keys to High-Perf. Computing
  Efficient computation
    Maximize the hardware devoted to computation
    Allow parallelism
      Task parallelism
      Data parallelism
      Instruction parallelism
    Ensure each computation unit operates at maximum efficiency
Notes: We can envision several ways to exploit parallelism and permit simultaneous execution. Task parallelism (TP): run tasks on different data at the same time. Data parallelism (DP): within a stage, if we are running a task on several data elements, we may be able to evaluate them at the same time. Instruction parallelism (IP): within the complex evaluation of a single data element, we may be able to evaluate several simple operations at the same time.

Keys to High-Perf. Computing
  Efficient communication
    Simply providing large amounts of computation is not sufficient
    Processing elements (PEs) often spend most of the time waiting for data
    Minimize off-chip communication
Notes: As both clock speeds and chip sizes increase, the amount of time it takes for a signal to travel across an entire chip, measured in clock cycles, is also increasing. On today's fastest processors, sending a signal from one side of a chip to another typically requires multiple clock cycles, and this amount of time increases with each new process generation.

Stream Programming Model
  A programming model allowing high efficiency in computation and communication
  Two basic components
    Stream
      All data is represented as a stream
      An ordered set of data of the same data type
    Kernels: operations on streams
      Applications are constructed by chaining multiple kernels together
Notes: Part of the reason that CPUs are poorly suited to many of these high-performance applications is their serial programming model, which does not expose the parallelism and communication patterns in the application. In the stream programming model, all data is represented as a stream, which we define as an ordered set of data of the same data type. That data type can be simple (a stream of integers or floating-point numbers) or complex (a stream of points or triangles or transformation matrices). While a stream can be any length, we will see that operations on streams are most efficient if streams are long.

Kernel
  Operates on entire streams of elements and produces new streams
  Within a kernel, computations on one stream element are never dependent on computations on another element
    Input elements and intermediate computed data are stored locally
    Fits perfectly onto data-parallel hardware
Notes: A kernel operates on entire streams, taking one or more streams as inputs and producing one or more streams as outputs. The defining characteristic of a kernel is that it operates on entire streams of elements as opposed to individual elements. The most typical use of a kernel is to evaluate a function on each element of an input stream (a "map" operation); for example, a transformation kernel may project each element of a stream of points into a different coordinate system. Kernel outputs are functions only of their kernel inputs, and within a kernel, computations on one stream element are never dependent on computations on another element. These restrictions have two major advantages. First, the data required for kernel execution is completely known when the kernel is written (or compiled). Kernels can thus be highly efficient when their input elements and their intermediate computed data are stored locally or are carefully controlled global references. Second, requiring independence of computation on separate stream elements within a single kernel allows mapping what appears to be a serial kernel calculation onto data-parallel hardware.
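The "map" kernel described in the notes can be sketched on the CPU in C. This is only an illustration of the model, not GPU code; the names `Point3`, `scale_point`, and `scale_kernel` are hypothetical. Each output element depends only on the matching input element, so the loop iterations could run in any order or all at once.

```c
#include <stddef.h>

/* One element of a stream of points. */
typedef struct { float x, y, z; } Point3;

/* Per-element function: reads only its own input element. */
static Point3 scale_point(Point3 p, float s)
{
    Point3 out = { p.x * s, p.y * s, p.z * s };
    return out;
}

/* "Kernel": consumes a whole input stream and produces a new
 * output stream of the same length. No iteration reads a value
 * written by another iteration, which is what lets data-parallel
 * hardware process the elements simultaneously. */
void scale_kernel(const Point3 *in, Point3 *out, size_t n, float s)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = scale_point(in[i], s);
}
```

Chaining several such kernels, with each one's output stream feeding the next, gives the application structure the stream model describes.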

Efficient Computation (1)
  Use of transistors can be divided into three categories:
    Control: direct the computation
    Datapath: perform computation
    Storage: store data

Efficient Computation (2)
  Only simple control flow in kernel execution
    Devote most transistors to datapath hardware rather than control hardware
  Streams expose parallelism in the application
  Allows a hardware implementation to specialize hardware
Notes: Because kernels operate on entire streams, stream elements can be processed in parallel using data-parallel hardware. Long streams with many elements allow this data-level parallelism to be highly efficient. Within the processing of a single element, we can exploit instruction-level parallelism. And because applications are constructed from multiple kernels, multiple kernels can be deeply pipelined and processed in parallel, using task-level parallelism. Dividing the application of interest into kernels allows a hardware implementation to specialize hardware for one or more kernels' execution. Special-purpose hardware, with its superior efficiency over programmable hardware, can thus be used appropriately in this programming model. Finally, allowing only simple control flow in kernel execution (such as the data-parallel evaluation of a function on each input element) permits hardware implementations to devote most of their transistors to datapath hardware rather than control hardware.

Efficient Communication
  Off-chip communication is efficient
  Intermediate results between kernels are kept on-chip to minimize off-chip communication
  High degree of latency tolerance
Notes: First, off-chip (global) communication is more efficient when entire streams, rather than individual elements, are transferred to or from memory, because the fixed cost of initiating a transfer can be amortized over an entire stream rather than a single element. Next, structuring applications as chains of kernels allows the intermediate results between kernels to be kept on-chip and not transferred to and from memory. Efficient kernels attempt to keep their inputs and their intermediate computed data local within kernel execution units; therefore, data references within kernel execution do not go off-chip or across a chip to a data cache, as would typically happen in a CPU. And finally, deep pipelining of execution allows hardware implementations to continue to do useful work while waiting for data to return from global memories. This high degree of latency tolerance allows hardware implementations to optimize for throughput rather than latency.

Instruction-Stream-Based (CPU)
  Prescribes both the operation to be executed and the required data
  Only a limited prefetch of the input data can occur
    Jumps are expected in the instruction stream
  L2 cache consumes many of the transistors in a CPU
Notes: CPU programming models are generally serial ones that do not adequately expose data parallelism in their applications. They do an admirable job of taking advantage of instruction parallelism (IP) and allow some data-parallel (DP) execution, but the degree of parallelism exploited by a CPU is much less than that of a GPU. One reason parallel hardware is less prevalent in CPUs is the designers' decision to devote more transistors to control hardware. CPU programs have more complex control requirements than GPU programs, so a large fraction of a CPU's transistors and wires implements complex control functionality such as branch prediction and out-of-order execution.

Data-Stream-Based (GPU)
  Separates two tasks:
    Configuring PEs
    Controlling data flow to and from PEs
  Data elements can be assembled from memory before processing
  Uses only small caches and devotes the majority of transistors to computation

Mapping Pipeline to Stream Model
  The stream formulation of the graphics pipeline
    All data as streams
    All computation as kernels
  Both user-programmable and nonprogrammable stages can be expressed as kernels
Notes: The graphics pipeline is a good match for the stream model for several reasons. The graphics pipeline is traditionally structured as stages of computation connected by data flow between the stages. This structure is analogous to the stream and kernel abstractions of the stream programming model. Data flow between stages in the graphics pipeline is highly localized, with data produced by a stage immediately consumed by the next stage; in the stream programming model, streams passed between kernels exhibit similar behavior. And the computation involved in each stage of the pipeline is typically uniform across different primitives, allowing these stages to be easily mapped to kernels.

Fixed vs. Programmable
  Fixed
    Very fast
    Cannot modify the pipeline; can only turn some functions on or off
    Hard to implement advanced techniques on the GPU
  Programmable
    Allows programmers to write shaders to change the pipeline
Notes: Implementing the pipeline in hardware makes processing polygons much faster, but the developer cannot modify the pipeline.

Basic Programmable Graphics Hardware
  Three programmable kernels in the pipeline
    Vertex shader
    Geometry shader
    Pixel shader
  Load shaders through the graphics API
  The fixed pipeline is replaced by shaders

OpenGL 4.3 Pipelines
  [Figure: the GPGPU programming pipeline and the graphics rendering pipeline, including the Tessellation Evaluation Shader stage]

Vertex Processor
  MIMD: Multiple Instruction stream, Multiple Data stream
  A number of processors that function asynchronously and independently

Vertex Shader: Basic Function
  Operates on a single input vertex and produces a single output vertex
  Replaces the transformation & lighting unit
  Now you have to do everything yourself
    Transformation
    Lighting
    Texture coordinate generation
  At a minimum, a vertex shader must output the vertex position in homogeneous clip space
Notes: The vertex-shader (VS) stage processes vertices from the input assembler, performing per-vertex operations such as transformations, skinning, morphing, and per-vertex lighting. Vertex shaders always operate on a single input vertex and produce a single output vertex. Instead of setting parameters to control the pipeline, you write a vertex shader program that executes on the graphics hardware. A vertex shader is a graphics processing function used to add special effects to objects in a 3D environment by performing mathematical operations on the objects' vertex data. Each vertex can be defined by many different variables. For instance, a vertex is always defined by its location in a 3D environment using the x-, y-, and z-coordinates. Vertices may also be defined by colors, textures, and lighting characteristics. Vertex shaders don't actually change the type of data; they simply change the values of the data, so that a vertex emerges with a different color, different textures, or a different position in space. At a minimum, a vertex shader must output the vertex position in homogeneous clip space. Optionally, the vertex shader can output texture coordinates, vertex color, vertex lighting, fog factors, and so on.

Vertex Shader: Advanced Function
  What else can we do?
    Displacement mapping
    Object deformation
    Vertex blending

Vertex Shader: Limitations
  We cannot
    Add or delete any vertices
    Change the primitive type
    Change the order in which vertices form the primitives
  No knowledge of the primitive type or of neighboring vertices

Fragment Processor
  SIMD: Single Instruction, Multiple Data
  Achieves data-level parallelism
  "get this pixel, get the next one" -> "get lots of pixels"

Fragment Shader: Basic Function
  Invoked once for each fragment covered by the primitive
  Computes the final pixel color and depth
  Can output up to eight 32-bit, 4-component values for the current pixel location

Fragment Shader: Advanced Function
  Enables rich shading techniques
    Per-pixel lighting, bump mapping, normal mapping
    Fluid simulation
    …

Fragment Shader: Limitations
  Dynamic branching is less efficient than on the vertex processor
  Cannot change the screen coordinates of a fragment
  No arbitrary memory writes

Geometry Shader
  New for 2007
  Executed after vertex shaders
  Input: whole primitive, possibly with adjacency information
  Invoked once for every primitive
  Output: multiple vertices forming a single selected topology (tristrip, linestrip, pointlist)
  Output may be fed to the rasterizer and/or to a vertex buffer in memory
Notes: The geometry-shader (GS) stage runs application-specified shader code with vertices as input and the ability to generate vertices on output. Unlike vertex shaders, which operate on a single vertex, the geometry shader's inputs are the vertices for a full primitive (two vertices for lines, three vertices for triangles, or a single vertex for a point). Geometry shaders can also bring in the vertex data for the edge-adjacent primitives as input (an additional two vertices for a line, an additional three for a triangle). The geometry-shader stage is capable of outputting multiple vertices forming a single selected topology (GS stage output topologies available are: tristrip, linestrip, and pointlist). The number of primitives emitted can vary freely within any invocation of the geometry shader, though the maximum number of vertices that could be emitted must be declared statically. When a geometry shader is active, it is invoked once for every primitive passed down or generated earlier in the pipeline. Each invocation of the geometry shader sees as input the data for the invoking primitive, whether that is a single point, a single line, or a single triangle. A triangle strip from earlier in the pipeline would result in an invocation of the geometry shader for each individual triangle in the strip (as if the strip were expanded out into a triangle list). All the input data for each vertex in the individual primitive is available (i.e. 3 vertices for a triangle), plus adjacent vertex data if applicable/available. A geometry shader outputs data one vertex at a time by appending vertices to an output stream object. The topology of the streams is determined by a fixed declaration.

Geometry Shader: Applications
  Point sprite expansion
  Single-pass render-to-cubemap
  Dynamic particle systems
  Fur/fin generation
  Shadow volume generation

Programmable GPUs: Applications
  Graphics applications
    Per-pixel lighting
    Ray tracing
    Deformation
  GPGPU
    Computer vision
    Physically-based simulation
    Image processing
    Database queries

GPGPU
  General-purpose computation on GPUs
  Capable of performing more than the specific graphics computations
  Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor
  Example applications range from in-game physics simulation to conventional computational science
Notes: With the increasing programmability of commodity graphics processing units (GPUs), these chips are capable of performing more than the specific graphics computations for which they were designed. They are now capable coprocessors, and their high speed makes them useful for a variety of applications. The goal of this page is to catalog the current and historical use of GPUs for general-purpose computation.

Shading Language
  Production rendering
    Geared towards maximum image quality
    Example: RenderMan
  Real-time rendering
    GLSL: OpenGL Shading Language
    HLSL: DirectX High-Level Shading Language
    Cg: C for Graphics, NVIDIA

OpenGL Shading Language
  High-level shading language based on C
  Not a hardware-specific language
    Cross-platform compatibility across multiple operating systems
    Each hardware vendor includes a GLSL compiler in its driver

Before Using GLSL
  Check whether your GPU supports GLSL
    GLSL is part of OpenGL 2.0
    If OpenGL 2.0 is not available, use OpenGL extensions

Extensions Required
  GL_ARB_shader_objects
    Adds the API calls that are necessary to manage shader objects and program objects
  GL_ARB_fragment_shader
    Adds functionality to define fragment shader objects
  GL_ARB_vertex_shader
    Adds functionality to define vertex shader objects

GLEW 1/2
  GLEW: The OpenGL Extension Wrangler Library (http://glew.sourceforge.net/)
  Initialize GLEW
    #include <GL/glew.h>
    #include <GL/glut.h>
    ...
    glutInit(&argc, argv);
    glutCreateWindow("GLEW Test");
    GLenum err = glewInit();
    if (GLEW_OK != err)
    {
        /* Problem: glewInit failed, something is seriously wrong. */
        fprintf(stderr, "Error: %s\n", glewGetErrorString(err));
        ...
    }

GLEW 2/2
  Check extensions
    if (GLEW_ARB_vertex_shader)
    {
        /* It is safe to use the GL_ARB_vertex_shader extension here. */
    }
  Check core OpenGL functionality
    if (GLEW_VERSION_2_0)
    {
        /* Yay! OpenGL 2.0 is supported! */
    }

Data Types
  Scalar
    bool, int, float
  Vector
    Supports 2D, 3D, 4D vectors: vec{2,3,4}, ivec{2,3,4}, bvec{2,3,4}
  Matrix
    Square matrix: mat2, mat3, mat4
    mat2x3, mat2x4, mat3x2, mat3x4, mat4x2, mat4x3
  Texture
    sampler1D, sampler2D, sampler3D
    samplerCube
    sampler1DShadow, sampler2DShadow

Variables 1/3
  Pretty much the same as in C
    float a, b;                    // two float variables (comments work as in C)
    int c = 2;                     // initialize a variable when declaring it
    vec3 g = vec3(1.0, 2.0, 3.0);  // declare and initialize a vector
  Flexible when initializing variables using other variables
    vec2 a = vec2(1.0, 2.0);
    vec2 b = vec2(3.0, 4.0);
    vec4 c = vec4(a, b);           // c = vec4(1.0, 2.0, 3.0, 4.0)

Variables 2/3
  Flexible when accessing a vector
    {x, y, z, w}: accessing vectors that represent points or normals
    {r, g, b, a}: accessing vectors that represent colors
    {s, t, p, q}: accessing vectors that represent texture coordinates

Variables 3/3
  Accessing components beyond those declared for the vector type is an error
    vec4 a = vec4(1.0, 2.0, 3.0, 4.0);
    float posX = a.x;    // posX = 1.0
    float posY = a[1];   // posY = 2.0
    float depth = a.w;   // depth = 4.0
    vec3 b = a.xxy;      // b = vec3(1.0, 1.0, 2.0)
    vec3 c = a.bra;      // c = vec3(3.0, 1.0, 4.0)
    vec2 t = vec2(1.0, 2.0);
    float tt = t.z;      // incorrect! vec2 has no z component

Vector and Matrix Operations
  Operations are component-wise (vector-matrix and matrix-matrix products follow linear algebra)
    vec3 u, v, w;
    float f;
    mat3 a1, a2, a3;
    u = v + f;     // u.x = v.x + f; u.y = v.y + f; u.z = v.z + f;
    u = v + w;     // u.x = v.x + w.x; u.y = v.y + w.y; u.z = v.z + w.z;
    u = v * a1;    // u.x = dot(v, a1[0]); u.y = dot(v, a1[1]); u.z = dot(v, a1[2]);
    a1 = a2 * a3;  // matrix product
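To make the expansions on this slide concrete, here is a minimal CPU-side emulation in C of `u = v + f` and `u = v * a1`. It is a sketch rather than GLSL. The type and function names (`Vec3`, `Mat3`, `vec_add_scalar`, `vec_times_mat`) are hypothetical, and `col[i]` stands in for the GLSL column access `a1[i]`.

```c
typedef struct { float x, y, z; } Vec3;

/* Column-major 3x3 matrix: col[i] plays the role of GLSL's a1[i]. */
typedef struct { Vec3 col[3]; } Mat3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* u = v + f : the scalar is added to every component. */
Vec3 vec_add_scalar(Vec3 v, float f)
{
    Vec3 u = { v.x + f, v.y + f, v.z + f };
    return u;
}

/* u = v * a1 : v is treated as a row vector, so each result
 * component is the dot product of v with one matrix column. */
Vec3 vec_times_mat(Vec3 v, Mat3 m)
{
    Vec3 u = { dot3(v, m.col[0]), dot3(v, m.col[1]), dot3(v, m.col[2]) };
    return u;
}
```

Note that `v * a1` and `a1 * v` generally differ: the first multiplies by the columns as shown here, while the second would use the rows.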

Control Flow Statements
  Selection (if-else)
  Iteration (for, while, and do-while)
  Jumps (discard, return, break, and continue)
    discard is only allowed within fragment shaders
    discard causes the fragment to be discarded, and no updates to any buffers will occur
    if (depth > 0.5) discard;

Function Definition
  The function main() is used as the entry point to a shader executable
    returnType functionName(type0 arg0, type1 arg1, ..., typen argn)
    {
        // do some computation
        return returnValue;
    }

Important Built-in Variables 1/2
  gl_Position (vec4)
    Output of the vertex shader
    Homogeneous vertex position
    The vertex shader must write a value into this variable
  gl_FragCoord (vec4)
    Holds the window-relative coordinates x, y, z, and 1/w values for the fragment
    Read-only variable in the fragment shader

Important Built-in Variables 2/2
  gl_FragColor (vec4)
    Output of the fragment shader
    Writing to gl_FragColor specifies the fragment color
  gl_FragDepth (float)
    Default value: gl_FragCoord.z
    If you write to gl_FragDepth anywhere in the shader, it is your responsibility to write it on every execution path

Built-in Functions
  Angle and trigonometry functions
    sin, cos, asin, acos …
  Exponential functions
    pow, exp, sqrt …
  Common functions
    abs, clamp, smoothstep …
  Geometric functions
    length, dot, cross …

Built-in Functions
  Matrix functions
    outerProduct, transpose …
  Vector relational functions
    lessThan, equal …
  Texture lookup functions
    texture2D, texture2DLod …
  Fragment processing functions
  Noise functions

Important Built-in Functions
  ftransform()
    For vertex shaders only
    Produces exactly the same result as OpenGL's fixed-functionality transform
    gl_Position = ftransform();
  reflect(vec3 I, vec3 N)
    Computes the reflection vector from the incident vector I and the normal vector N
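As a sketch of what `reflect` computes, the C function below evaluates I - 2 * dot(N, I) * N, which is how the GLSL specification defines the built-in. It assumes N is already normalized. The names `Vec3`, `dot3`, and `reflect3` are hypothetical and are not part of any OpenGL API.

```c
typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* CPU-side equivalent of GLSL reflect(I, N):
 *   R = I - 2 * dot(N, I) * N
 * I points toward the surface; N must be unit length. */
Vec3 reflect3(Vec3 i, Vec3 n)
{
    float k = 2.0f * dot3(n, i);
    Vec3 r = { i.x - k * n.x, i.y - k * n.y, i.z - k * n.z };
    return r;
}
```

For example, an incident direction (1, -1, 0) hitting a surface whose normal is (0, 1, 0) is reflected to (1, 1, 0): the component along the normal flips and the tangential component is unchanged.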

First Example
  Vertex shader
    void main()
    {
        gl_Position = ftransform();
    }
  Fragment shader
    void main()
    {
        gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
    }

Have Fun with the Fragment Shader
    void main()
    {
        vec4 t = vec4(1.0, 0.6, 0.3, 0.0);
        gl_FragColor = t.xxxx;  // flexible vector accessing
    }

    void main()
    {
        gl_FragColor = vec4(gl_FragCoord.zzz, 1.0);  // let's view the depth map
    }

    void main()
    {
        if (gl_FragCoord.x > 320.0) discard;  // try discard
        gl_FragColor = vec4(1.0, 1.0, 1.0, 1.0);
    }

More Built-in Variables
  Vertex shader built-in attributes
    gl_Vertex, gl_Normal, gl_Color, gl_MultiTexCoord[] …
  Vertex shader built-in output variables
    gl_FrontColor, gl_TexCoord[] …
  Fragment shader built-in input variables
    gl_Color, gl_TexCoord[] …
  Built-in uniform state
    gl_ModelViewMatrix, gl_ProjectionMatrix …

Example: Using Built-in Matrices
  Three vertex shaders that apply the same transformation
    void main()
    {
        gl_Position = ftransform();
    }

    void main()
    {
        gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
    }

    void main()
    {
        gl_Position = gl_ModelViewMatrix * gl_Vertex;
        gl_Position = gl_ProjectionMatrix * gl_Position;
    }

Example: Using Colors
  Vertex shader
    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = gl_Color;
    }
  Fragment shader
    void main()
    {
        gl_FragColor = gl_Color;
    }

Example: Using Texture Coordinates
  Vertex shader
    void main()
    {
        gl_Position = ftransform();
        gl_TexCoord[0] = vec4(gl_MultiTexCoord0.xy, 1.0, 0.0);
    }
  Fragment shader
    void main()
    {
        gl_FragColor = gl_TexCoord[0];
    }

gl_NormalMatrix
  Important for per-vertex and per-pixel lighting
  Transpose of the inverse of the upper-left 3x3 of gl_ModelViewMatrix
  Converts a normal vector from object space to eye space
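The "transpose of the inverse" can be computed directly for a 3x3 matrix: it equals the cofactor matrix divided by the determinant. The C sketch below illustrates that calculation on the CPU. The function name `normal_matrix` and the row-major `m[row][col]` layout are assumptions made for the example and do not describe how any driver implements gl_NormalMatrix.

```c
/* Computes out = transpose(inverse(m)) for a 3x3 matrix.
 * m and out are row-major: m[row][col].
 * Returns 1 on success, 0 if m is singular (det == 0). */
int normal_matrix(float m[3][3], float out[3][3])
{
    /* Cofactors of m; inverse-transpose = cofactor matrix / det. */
    float c[3][3];
    c[0][0] = m[1][1] * m[2][2] - m[1][2] * m[2][1];
    c[0][1] = m[1][2] * m[2][0] - m[1][0] * m[2][2];
    c[0][2] = m[1][0] * m[2][1] - m[1][1] * m[2][0];
    c[1][0] = m[0][2] * m[2][1] - m[0][1] * m[2][2];
    c[1][1] = m[0][0] * m[2][2] - m[0][2] * m[2][0];
    c[1][2] = m[0][1] * m[2][0] - m[0][0] * m[2][1];
    c[2][0] = m[0][1] * m[1][2] - m[0][2] * m[1][1];
    c[2][1] = m[0][2] * m[1][0] - m[0][0] * m[1][2];
    c[2][2] = m[0][0] * m[1][1] - m[0][1] * m[1][0];

    /* Determinant by expansion along the first row. */
    float det = m[0][0] * c[0][0] + m[0][1] * c[0][1] + m[0][2] * c[0][2];
    if (det == 0.0f)
        return 0;

    for (int r = 0; r < 3; ++r)
        for (int k = 0; k < 3; ++k)
            out[r][k] = c[r][k] / det;
    return 1;
}
```

For a model-view matrix that scales x by 2, the result scales x by 0.5, which is why normals cannot simply be multiplied by the model-view matrix under non-uniform scaling. For a pure rotation the result equals the rotation itself.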

View Normal Vectors
  Vertex shader (object-space normal)
    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = vec4(gl_Normal, 1.0);
    }
  Vertex shader (eye-space normal)
    void main()
    {
        gl_Position = ftransform();
        gl_FrontColor = vec4(gl_NormalMatrix * gl_Normal, 1.0);
    }
  Fragment shader
    void main()
    {
        gl_FragColor = gl_Color;
    }

Communications
  Communication between OpenGL and shader
    One-way communication
    Use the uniform qualifier when declaring variables
  Communication between vertex and fragment shader
    Use the varying qualifier when declaring variables

Uniform
  Used to declare global variables
  Variable values are the same across the entire primitive being processed
  Read-only
  Initialized externally, either at link time or through the API
    uniform vec4 lightPosition;
    uniform vec3 color = vec3(0.7, 0.7, 0.2);  // value assigned at link time

OpenGL Setup

Creating Shader Object
    _ShaderID = glCreateShader(GL_VERTEX_SHADER);
    if (_ShaderID == 0)  // glCreateShader() returns 0 if it fails to create a shader object
    {
        printf("Failed to create shader object!\n");
        exit(-1);
    }
    // load the shader source file to a string _pShaderSource
    glShaderSource(_ShaderID, 1, (const GLchar **)&_pShaderSource, &fileLen);
    CheckGLError(__FILE__, __LINE__);
    glCompileShader(_ShaderID);
    glGetShaderiv(_ShaderID, GL_COMPILE_STATUS, &ShaderStatus);
    if (ShaderStatus == GL_FALSE)
        printf("Failed to compile the shader: %s\n", vFileName);
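The slide leaves out the step "load the shader source file to a string". One possible way to do it in plain C is sketched below. The function name `load_text_file` is hypothetical and is not part of OpenGL or GLEW, and the slide's own loader may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads a whole file into a NUL-terminated heap buffer.
 * On success returns the buffer (caller must free it) and, if
 * out_len is non-NULL, stores the byte count there.
 * Returns NULL if the file cannot be opened or read. */
char *load_text_file(const char *path, long *out_len)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return NULL; }
    long len = ftell(fp);
    if (len < 0) { fclose(fp); return NULL; }
    rewind(fp);

    char *buf = malloc((size_t)len + 1);
    if (!buf) { fclose(fp); return NULL; }

    size_t got = fread(buf, 1, (size_t)len, fp);
    fclose(fp);
    if (got != (size_t)len) { free(buf); return NULL; }

    buf[len] = '\0';
    if (out_len)
        *out_len = len;
    return buf;
}
```

The returned pointer can serve as `_pShaderSource`. Because glShaderSource expects the length as a `GLint`, the `long` length would need converting first; alternatively, since the buffer is NUL-terminated, the length argument can be passed as NULL.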

Creating Program Object
    _ProgramID = glCreateProgram();
    if (_ProgramID == 0)
    {
        printf("Failed to create shader program object!\n");
        exit(-1);
    }
    glAttachShader(_ProgramID, VertexShaderID);  // attach vertex shader
    CheckGLError(__FILE__, __LINE__);
    glAttachShader(_ProgramID, FragShaderID);    // attach fragment shader
    glLinkProgram(_ProgramID);
    glGetProgramiv(_ProgramID, GL_LINK_STATUS, &ProgramStatus);
    if (ProgramStatus == GL_FALSE)
        printf("Failed to link the program!\n");
    glUseProgram(_ProgramID);

Initialize Uniform Variables
  Suppose a uniform variable is declared in the shader:
    uniform vec3 u_Color;
  Initialize the uniform variable from OpenGL:
    loc = glGetUniformLocation(_ProgramID, "u_Color");
    if (loc == -1)
    {
        cout << "Error: can't find uniform variable! \n";
    }
    glUniform3f(loc, v0, v1, v2);

Application: Per-Pixel Shading
  Three types of light in OpenGL
    Ambient light
    Diffuse light
    Specular light
  Fixed pipeline conducts vertex-based shading
    Fast but poor quality
  Per-pixel shading is possible by using the programmability of modern GPUs
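As a rough illustration of the per-pixel computation, the C sketch below evaluates an ambient term plus a Lambert diffuse term for one fragment. It is a CPU-side model of the arithmetic, not the course's shader. It assumes both vectors are already unit length, and the names `Vec3`, `diffuse_term`, and `shade_intensity` are hypothetical. In per-pixel shading this calculation runs in the fragment shader with the interpolated normal, whereas the fixed pipeline evaluates lighting only at the vertices and interpolates the resulting colors.

```c
typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Lambert diffuse factor: cosine of the angle between the unit
 * surface normal and the unit direction to the light, clamped
 * to zero for surfaces facing away from the light. */
float diffuse_term(Vec3 unit_normal, Vec3 unit_to_light)
{
    float d = dot3(unit_normal, unit_to_light);
    return d > 0.0f ? d : 0.0f;
}

/* Ambient plus diffuse intensity for one fragment.
 * ambient: constant ambient contribution
 * kd:      diffuse reflectance of the material */
float shade_intensity(float ambient, float kd,
                      Vec3 unit_normal, Vec3 unit_to_light)
{
    return ambient + kd * diffuse_term(unit_normal, unit_to_light);
}
```

The specular term is deliberately omitted here.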

Assignment
  Add specular light