1
Status – Week 279 Victor Moya
2
Rasterization
Setup triangles (calculate slope values).
Fill triangle: interpolate parameters.
Parameters: R, G, B, z, r, s, t, q.
3
Pixel Planes
Calculate 3 edge functions: if all the edge functions are positive at a point (x, y), the point is inside the triangle.
E(x, y) = (x - X)dY - (y - Y)dX
E(x, y) > 0 if (x, y) is on the “right” side.
E(x, y) = 0 if (x, y) is exactly on the line.
E(x, y) < 0 if (x, y) is on the “left” side.
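The edge-function test can be sketched in a few lines of Python. This is only an illustration of the formula on this slide: the helper names `edge` and `inside` and the triangle `TRI` are made up for the example, and the vertex order is assumed to be clockwise with y pointing up so that the interior lies on the positive ("right") side of every edge.

```python
def edge(x, y, X, Y, dX, dY):
    """E(x, y) = (x - X) dY - (y - Y) dX for the edge through (X, Y) with direction (dX, dY)."""
    return (x - X) * dY - (y - Y) * dX


def inside(px, py, tri):
    """True if (px, py) is strictly on the positive side of all three edges of tri."""
    for i in range(3):
        X, Y = tri[i]
        next_x, next_y = tri[(i + 1) % 3]
        if edge(px, py, X, Y, next_x - X, next_y - Y) <= 0:
            return False
    return True


# Hypothetical triangle, listed clockwise (assuming y points up).
TRI = [(0, 0), (0, 4), (4, 0)]
```

With this triangle, `inside(1, 1, TRI)` holds, while a point such as (3, 3) fails the test on the second edge.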
4
Edge Functions
5
Classification (1)
A polygon defined by N vertices: (xi, yi), 0 < i <= N, with (x0, y0) = (xN, yN).
The incremental classification of the points around a polygon can be calculated as follows.
Initial values:
dXi = Xi - X(i-1)
dYi = Yi - Y(i-1)
Ei(Xs, Ys) = (Xs - Xi) dYi - (Ys - Yi) dXi for 0 < i <= N
6
Classification (2)
Incremental computation for a unit step along the X and Y axes:
Ei(x + 1, y) = Ei(x, y) + dYi
Ei(x - 1, y) = Ei(x, y) - dYi
Ei(x, y + 1) = Ei(x, y) - dXi
Ei(x, y - 1) = Ei(x, y) + dXi
A fragment is inside the triangle if: Ei >= 0 for all i: 0 < i <= N
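A minimal sketch of the incremental evaluation, assuming the edge definition Ei(x, y) = (x - Xi) dYi - (y - Yi) dXi from the previous slide. A unit step in x therefore adds dYi and a unit step in y subtracts dXi. The function names are illustrative only.

```python
def edge_setup(xs, ys, Xi, Yi, dXi, dYi):
    """Initial value Ei(Xs, Ys), computed once per edge with two multiplications."""
    return (xs - Xi) * dYi - (ys - Yi) * dXi


def scan_row(e_start, dYi, width):
    """Edge values for `width` consecutive pixels, using one addition per step in x."""
    values = []
    e = e_start
    for _ in range(width):
        values.append(e)
        e += dYi  # Ei(x + 1, y) = Ei(x, y) + dYi
    return values


def next_row(e_start, dXi):
    """Edge value at the start of the next row: Ei(x, y + 1) = Ei(x, y) - dXi."""
    return e_start - dXi
```

After setup, no multiplications are needed while walking the pixels; each step costs one addition per edge.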
7
Classification
8
Traversing the Polygon
9
Clipping
10
Parallel Rasterization
E(x + L, y) = E(x, y) + L dY
Allows a group of interpolators, each responsible for one pixel within a block of L contiguous pixels, to simultaneously compute the edge function of an adjacent block in a single cycle.
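The block step can be sketched as follows, assuming L interpolators that each hold the edge value of one pixel in a horizontal block. The list stands in for the parallel hardware registers, and the function names are hypothetical.

```python
def block_edges(e0, dY, L):
    """Edge values for L contiguous pixels, starting at the pixel whose value is e0."""
    return [e0 + k * dY for k in range(L)]


def step_block(block, dY):
    """Move every interpolator L pixels along x: E(x + L, y) = E(x, y) + L * dY."""
    L = len(block)
    return [e + L * dY for e in block]
```

Each interpolator adds the same precomputed constant L * dY, so the whole adjacent block is obtained with one addition per interpolator.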
11
Olano and Greer
Triangle Scan Conversion using 2D Homogeneous Coordinates
Based on the Pixel Planes and Pineda approach (edge functions) but using homogeneous coordinates.
Avoids the need for clipping.
Adds a hither edge function for user clipping.
Perspective correct interpolation.
12
Interpolation function
A parameter varies linearly across a triangle in 3D: u = aX + bY + cZ
The 3D position (X, Y, Z) projects to 2D, using 2DH coords (x = X, y = Y, w = Z).
The equation in 2DH space: u = ax + by + cw
2D perspective correct function (division by w): u/w = a x/w + b y/w + c = a X + b Y + c, where X = x/w and Y = y/w now denote the screen-space coordinates.
u/w is a linear function in screen space (X, Y).
13
Interpolation function
If each vertex has a value for u, we can solve for [a b c] using this equation:
[u0 u1 u2] = [a b c] M, where the columns of M are the 2DH vertices (xi, yi, wi), so [a b c] = [u0 u1 u2] M^-1
14
Scan conversion
Edge function parameters: [1 0 0], [0 1 0], [0 0 1].
1/w interpolation parameter: [1 1 1].
Zero-area and back-facing triangles: the inverse of the 3x3 matrix M only exists if the determinant of M is not 0. The determinant is a function of the area of the triangle.
15
Arbitrary clip planes
To add arbitrary clip planes (user clip planes) we need to add new clip edge functions:
clip edge function = dot product test * M^-1
16
Algorithm
To summarize the algorithm:
setup:
  three edge functions = M^-1 = inverse of 2D homogeneous vertex matrix
  for each clip edge
    clip edge function = dot product test * M^-1
  interpolation function for 1/w = sum of rows of M^-1
  for each parameter
    interpolation function = parameter vector * M^-1
pixel processing:
  interpolate linear edge and parameter functions
  where all edge functions are positive
    w = 1/(1/w)
    for each parameter
      perspective-correct parameter = parameter * w
17
Cost
Setup:
  Calculate the interpolation coefficients and slopes.
  1 matrix inversion (1 division, multiple multiplications/additions).
  1 matrix-vector multiplication for each parameter. This includes the edge and clip edge functions, the 1/w value and the other parameters (R, G, B, z, s, t, r) (3x3 matrix/vector multiplication: 9 Mul + 6 Add).
  Calculate the X and Y slopes (derivatives) for each parameter and the initial value at the first pixel (2 Mul + 2 Add per parameter).
18
Cost (2)
Per pixel:
  Interpolate parameters: 1 addition per parameter.
  Determine if the 3 edge functions are positive (3 sign tests).
  Determine if the clip edge functions are positive (n sign tests).
Per pixel inside the triangle:
  w = 1/(1/w) (1 division?)
  For each parameter, perspective correct parameter value: u = (u/w) * w (1 multiplication for each parameter).
19
Rasterization/Fragments
Calculate the final color value of the fragment:
  Texture read.
  Color sum.
  Fog.
20
OpenGL Rasterization
21
Per fragment (tests)
Determine the visibility of the fragment:
  Ownership test.
  Scissor test.
  Alpha test.
  Stencil test.
  Depth buffer test.
Final pixel color:
  Blending.
  Dithering.
  Logic operation.
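The ordered tests can be sketched as a short chain that stops at the first failure. This is a simplified illustration, not the OpenGL state machine: the ownership and stencil tests are omitted, the alpha test is assumed to use a "greater than" comparison, the depth test is assumed to use "less than", and the `frag` and `state` dictionaries are hypothetical.

```python
def scissor_test(frag, state):
    """Pass if the fragment lies inside the scissor rectangle (x0, y0, width, height)."""
    x0, y0, width, height = state["scissor"]
    return x0 <= frag["x"] < x0 + width and y0 <= frag["y"] < y0 + height


def alpha_test(frag, state):
    """Pass if the fragment alpha is greater than the reference (assumed comparison)."""
    return frag["rgba"][3] > state["alpha_ref"]


def depth_test(frag, state):
    """Pass if the fragment is nearer than the stored depth (assumed comparison)."""
    stored = state["depth"].get((frag["x"], frag["y"]), 1.0)
    return frag["z"] < stored


# Tests run in the order given on the slide; ownership and stencil are left out.
TESTS = [("scissor", scissor_test), ("alpha", alpha_test), ("depth", depth_test)]


def first_failed_test(frag, state):
    """Name of the first test that rejects the fragment, or None if it survives."""
    for name, test in TESTS:
        if not test(frag, state):
            return name
    return None
```

A fragment that returns None would go on to blending, dithering and the logic operation.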
22
OpenGL per fragment
23
OpenGL Multitexture
24
Z-Buffer
Visibility test.
1 read from the Z-buffer (24 bits).
If the test fails, the fragment is discarded.
Otherwise, 1 write to the Z-buffer (24 bits).
Early Z test (avoids useless work).
Hierarchical Z-Buffer: reduces bandwidth.
Z-Buffer compression: reduces bandwidth and memory usage.
Fast Z clear.
Pixel shaders that change pixel depth (Z) disable the early Z test.
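A minimal sketch of the read/compare/write pattern with an early Z test, assuming a "less than" depth comparison and a far value of 1.0. The `ZBuffer` class, its counters and the `draw` helper are hypothetical; the counters only make the memory traffic visible.

```python
class ZBuffer:
    def __init__(self, width, height, far=1.0):
        self.width = width
        self.depth = [far] * (width * height)
        self.reads = 0
        self.writes = 0

    def test_and_update(self, x, y, z):
        """One read per fragment; one write only when the fragment is visible."""
        idx = y * self.width + x
        self.reads += 1
        if z >= self.depth[idx]:
            return False  # hidden: the fragment is discarded
        self.depth[idx] = z
        self.writes += 1
        return True


def draw(zbuf, fragments, shade):
    """Early Z: test before shading, so hidden fragments are never shaded.

    Only valid when the shader does not change the fragment depth.
    """
    shaded = []
    for x, y, z in fragments:
        if zbuf.test_and_update(x, y, z):
            shaded.append(shade(x, y, z))
    return shaded
```

Three fragments at the same pixel with depths 0.5, 0.7 and 0.2 cost three reads but only two writes, and the shading function runs twice.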
25
Hierarchical Z, Z Compression and Fast Z-Clear
26
Textures
Original: additional color (material) information per pixel. It is used to compensate for the lack of geometry information.
Current: color, normals or any kind of information. Different formats (access) supported by hardware (1D, 2D, 3D, cubemap).
Dependent reads are supported (information from one texture is used as the address to access another texture).
Minification, magnification.
MIP mapping (multum in parvo): multiple levels of detail for a single texture.
Filtering: bilinear (4 accesses to the same mipmap), trilinear (8 accesses to two mipmaps), anisotropic (up to 128 accesses, 16x trilinear).
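The access counts for bilinear and trilinear filtering follow directly from how the filters are built, as this sketch shows. It assumes single-channel textures stored as lists of rows, coordinates given in texel units with texel centers at half-integer positions, clamp-to-edge addressing, and a coarser mip level at half the resolution. All names are illustrative.

```python
import math


def texel(texture, ix, iy):
    """Read one texel with clamp-to-edge addressing."""
    iy = min(max(iy, 0), len(texture) - 1)
    ix = min(max(ix, 0), len(texture[0]) - 1)
    return texture[iy][ix]


def bilinear(texture, u, v):
    """Weighted average of the 4 texels around (u, v): 4 accesses to one mipmap."""
    x, y = u - 0.5, v - 0.5
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = texel(texture, x0, y0) * (1 - fx) + texel(texture, x0 + 1, y0) * fx
    bottom = texel(texture, x0, y0 + 1) * (1 - fx) + texel(texture, x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy


def trilinear(fine, coarse, u, v, frac):
    """Blend two bilinear samples from adjacent mip levels: 8 accesses in total."""
    a = bilinear(fine, u, v)
    b = bilinear(coarse, u / 2, v / 2)
    return a * (1 - frac) + b * frac
```

Anisotropic filtering, as described on the slide, takes up to 16 such trilinear samples, which gives the figure of 128 accesses.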
27
Register combiners
Multitexture: multiple textures can be read per cycle (multiple texture units per pipe, up to 4 in Matrox Parhelia). Also multiple textures per pass (loop mode, up to 16 in DX9 hardware).
The output of those textures is combined (*, +, ...) with the pixel interpolated color.
First implementation of pixel shaders (not really instructions for a processor, but a configuration for the hardware).
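The idea of "configuration rather than instructions" can be illustrated with a toy combiner. This is not the NVIDIA register combiner model: the operation table, the configuration format and the clamping of sums to 1.0 are assumptions made for the example.

```python
# Fixed set of operations the hypothetical hardware offers.
OPS = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: min(a + b, 1.0),  # assumed clamp to the [0, 1] color range
}


def combine(color, textures, config):
    """Apply the configured operations, in order, to the interpolated color.

    color: interpolated fragment color as a tuple of channels.
    textures: dict name -> sampled texture color.
    config: list of (operation name, texture name) pairs.
    """
    result = list(color)
    for op, name in config:
        fn = OPS[op]
        result = [fn(c, t) for c, t in zip(result, textures[name])]
    return result
```

For example, the configuration `[("mul", "tex0"), ("add", "tex1")]` modulates the interpolated color by the first texture and then adds the second one.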
28
GeForce256 Register Combiners
[Diagram: Texture Fetching and the Register Set (Fragment Color, Specular Color, Texture 0, Texture 1, Fog Color/Factor, Spare 0) feed General Combiner 0 and General Combiner 1 (each with 4 RGB inputs, 4 alpha inputs, 3 RGB outputs, 3 alpha outputs) and the Final Combiner (6 RGB inputs, 1 alpha input).]
29
GeForce 3/4 Register Combiners
32
Texture Effects
There is a large number of new graphics effects that can be achieved with those extended texture functions:
  Cubemap (lighting, shadows).
  Bump mapping (per-pixel lighting/shading).
  Others?
33
Pixel Shaders
DX9 pixel shaders are true processors. Based on vertex shaders but without branching. They replace (or complement) the register combiner stage.
Most instructions of the vertex shader are present in the pixel shader (except branches). Condition codes, swizzle, negate, absolute value, mask, conditional mask (NV30).
Additional instructions (NV30):
  Texture read: TEX, TEXP, TXD.
  Partial derivatives: DDX, DDY.
  Pack/Unpack: PK2H, PK2US, PK4B, PK4UB, PK4UBG, UP2H, UP2US, UP4B, UP4UB, UP4UBG.
  Fragment conditional kill: KIL.
  Extra math: LRP (linear interpolation), X2D (2D coordinate transform), RFL (reflection), POW (exponentiation).
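Two of the extra math instructions can be described with scalar and vector arithmetic. The sketch below follows the short descriptions on the slide (linear interpolation and reflection); the operand order and the exact formulas are assumptions based on the usual definitions, not taken from the NV30 documentation.

```python
def lrp(f, a, b):
    """Linear interpolation: f * a + (1 - f) * b (assumed operand order)."""
    return f * a + (1 - f) * b


def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def rfl(axis, direction):
    """Reflect `direction` about `axis`: 2 * (axis . dir) / (axis . axis) * axis - dir."""
    scale = 2 * dot(axis, direction) / dot(axis, axis)
    return tuple(scale * n - d for n, d in zip(axis, direction))
```

For example, `lrp(0.25, 8.0, 4.0)` blends one quarter of the first value with three quarters of the second, and reflecting (1, 0, 1) about the z axis flips its x component.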
34
R300 Pixel Shader
35
Pixel Shader
Inputs: 1 position (x, y, z, 1/w), 2 colors (4-component RGBA vectors), 8 texture coordinates, 1 fog coordinate.
Outputs: fragment color (RGBA), optionally a new fragment depth. NV30/R300 can also output to 4 RGBA textures.
Temporaries (NV30): 32 32-bit registers (64 16-bit registers).
Constants (NV30): unlimited? (maybe memory?). Accessed by ‘name’ (label). Also literal constants (embedded).
R300: 12 temporary registers, 32 constants.
16 samplers and 8 texture coordinates (DX9).
36
Pixel Shader
R300: 64 ALU instructions, 32 texture instructions, 4 levels of dependent read. Up to 96 instructions (?).
R300:
  ALU instructions: ADD, MOV, MUL, MAD, DP3, DP4, FRAC, RCP, RSQ, EXP, LOG, CMP.
  Texture: TEXLD, TEXLDP, TEXLDBIAS, TEXKILL.
NV30: up to 1024 instructions.
37
Others
Fog.
Scissor and ownership test.
Stencil test.
Alpha test.
Blending.
Antialiasing.
Etc.