Status – Week 240 Victor Moya


Summary
- Post Geometry Pipeline.
- Rasterization.
- Triangle Setup.
- Triangle Traversal.
- Interpolation.
- Current status.

Post Geometry Pipeline
- Divide by w?
- Clipping?
  - NVidia doesn't seem to have geometric clipping.
  - Alpha kill in NV2x for user clip planes.
  - ATI seems to have geometric clipping.
  - Proper user clipping.
  - No support for transformed and lit vertex clipping.
- What do we do?

Post Geometry Pipeline
- Clipping:
  - 6 frustum clip planes.
  - At least 6 user clip planes.
- Hardware requirements:
  - Plane-edge intersection (?).
  - Generates new vertices (1 or 2 per triangle).
    - Interpolate output attributes at the new vertex.
  - Can generate new triangles (1 per triangle).
    - Affects primitive assembly.
- At least frustum clipping should be fast.
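The hardware requirements above (plane-edge intersection, new vertices with interpolated attributes, possibly one extra triangle) can be sketched in software as a single Sutherland-Hodgman step. This is only an illustration, not the design chosen for the simulator: the function name, the vertex layout (clip-space position first, attributes after) and the plane convention are assumptions.

```python
def clip_triangle_against_plane(verts, plane):
    """Clip one triangle against one plane (a single Sutherland-Hodgman step).

    verts: three vertices, each a tuple of numbers whose first four entries
           are the clip-space position (x, y, z, w); any further entries are
           output attributes, interpolated linearly along with the position.
    plane: (a, b, c, d); a vertex is inside when a*x + b*y + c*z + d*w >= 0.
    Returns a list with 0, 1 or 2 triangles.
    """
    def dist(v):
        # Signed distance-like value of the vertex position to the plane.
        return sum(p * c for p, c in zip(plane, v[:4]))

    out = []
    for i in range(3):
        cur, nxt = verts[i], verts[(i + 1) % 3]
        d_cur, d_nxt = dist(cur), dist(nxt)
        if d_cur >= 0:
            out.append(cur)
        if (d_cur >= 0) != (d_nxt >= 0):
            # Plane-edge intersection: new vertex with interpolated attributes.
            t = d_cur / (d_cur - d_nxt)
            out.append(tuple(a + t * (b - a) for a, b in zip(cur, nxt)))

    # 3 output vertices give one triangle, 4 give two (a fan).
    return [(out[0], out[i], out[i + 1]) for i in range(1, len(out) - 1)]


# One vertex outside the left frustum plane (x + w >= 0): two triangles result.
v0 = (-2.0, 0.0, 0.0, 1.0, 0.0)   # outside, attribute 0.0
v1 = (0.0, -1.0, 0.0, 1.0, 1.0)   # inside, attribute 1.0
v2 = (0.0, 1.0, 0.0, 1.0, 1.0)    # inside, attribute 1.0
tris = clip_triangle_against_plane([v0, v1, v2], (1.0, 0.0, 0.0, 1.0))
```

In this example the clipped polygon has four vertices, so primitive assembly would receive one extra triangle, which is the effect noted in the list above.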

Post Geometry Pipeline
- Viewport Transformation
  - Delay to the end of rasterization (at the conversion of fragment attributes from fixed point to floating point).
  - Use fixed-point device coordinates [-1, 1] for rasterization.
- Rasterization.
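As a rough sketch of the idea above, rasterization can work on fixed-point device coordinates in [-1, 1] and apply the viewport transformation only when a value is converted back to floating point. The number of fractional bits and the function names below are assumptions made for the example, not values from the design.

```python
FRAC_BITS = 16  # hypothetical fixed-point precision


def to_fixed(v):
    """Convert a device coordinate in [-1, 1] to fixed point."""
    return round(v * (1 << FRAC_BITS))


def viewport_from_fixed(fx, origin, size):
    """Convert a fixed-point device coordinate to a floating-point window
    coordinate, applying the viewport transformation at the same time."""
    ndc = fx / (1 << FRAC_BITS)              # back to [-1, 1] as float
    return origin + (ndc + 1.0) * 0.5 * size


# A 640-pixel-wide viewport starting at x = 0.
left = viewport_from_fixed(to_fixed(-1.0), 0, 640)
centre = viewport_from_fixed(to_fixed(0.0), 0, 640)
right = viewport_from_fixed(to_fixed(1.0), 0, 640)
```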

[Figure: pipeline block diagram. Boxes: MC, StF, StOC, StC, PA, TS, TT, Int, StL, Shader. Link labels: "A*TL+L", "2111".]
Legend:
- MC: Memory Controller
- StF: Streamer Fetch
- StL: Streamer Loader
- StOC: Streamer Output Cache
- StC: Streamer Commit
- Shader: Vertex Shader
- PA: Primitive Assembly
- TS: Triangle Setup
- TT: Triangle Traversal
- Int: Interpolation

Rasterization
We can divide it into three phases:
- Setup.
  - Calculate linear equation coefficients, start values and slopes.
  - Perform area and face culling.
- Traversal.
  - Traverse the triangle, generating the fragments inside it.
  - Clip fragments against the frustum and user clip planes.
- Interpolation.
  - Interpolate all fragment attributes for each generated fragment.

Triangle Setup
- Use 2DH rasterization setup.
  - Create the matrix (inverse, or just the adjoint?) from the three vertex 2DH positions.
  - Calculate the determinant.
  - Cull on sign (face culling) and on zero (zero area).
- Send the edge equation coefficients and/or start and slope values to Triangle Traversal.
- Optional: send other equations (1/w, clip planes, interpolators ...).

Triangle Setup
- Adjoint rasterization matrix $\mathrm{adj}(M)$:
  - First level: 18 muls.
  - Second level: 9 adds.

$a_0 = y_1 w_2 - y_2 w_1$
$a_1 = y_2 w_0 - y_0 w_2$
$a_2 = y_0 w_1 - y_1 w_0$
$b_0 = x_2 w_1 - x_1 w_2$
$b_1 = x_0 w_2 - x_2 w_0$
$b_2 = x_1 w_0 - x_0 w_1$
$c_0 = x_1 y_2 - x_2 y_1$
$c_1 = x_2 y_0 - x_0 y_2$
$c_2 = x_0 y_1 - x_1 y_0$

Triangle Setup
- Matrix determinant $\det(M)$:
  - 1 DP3: $\{w_0, w_1, w_2\} \cdot \{c_0, c_1, c_2\}$
- Inverse matrix $M^{-1}$ (not needed?):
  - First level: 1 reciprocal: $1/\det(M)$.
  - Second level: 9 muls.
- Edge equations:
  - Rows of $M^{-1}$.
  - $E_0 = [a_0, b_0, c_0]$
  - $E_1 = [a_1, b_1, c_1]$
  - $E_2 = [a_2, b_2, c_2]$
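The setup steps above can be sketched as follows. The coefficient formulas and the determinant are the ones listed on the slides; the function name, the choice to return the unscaled adjoint rows rather than $M^{-1}$, and the convention that a negative determinant means a back face are assumptions for the example.

```python
def triangle_setup(v0, v1, v2, cull_back_faces=True):
    """2DH triangle setup from three (x, y, w) vertex positions.

    Returns (edges, det), where edges[i] = (a_i, b_i, c_i) is row i of
    adj(M), or None when the triangle is culled.
    """
    (x0, y0, w0), (x1, y1, w1), (x2, y2, w2) = v0, v1, v2

    # First level: 18 multiplications; second level: 9 subtractions.
    a = (y1 * w2 - y2 * w1, y2 * w0 - y0 * w2, y0 * w1 - y1 * w0)
    b = (x2 * w1 - x1 * w2, x0 * w2 - x2 * w0, x1 * w0 - x0 * w1)
    c = (x1 * y2 - x2 * y1, x2 * y0 - x0 * y2, x0 * y1 - x1 * y0)

    # Determinant as one DP3: {w0, w1, w2} . {c0, c1, c2}.
    det = w0 * c[0] + w1 * c[1] + w2 * c[2]

    if det == 0:
        return None                      # zero-area triangle
    if cull_back_faces and det < 0:
        return None                      # assumed back-face convention

    edges = [(a[i], b[i], c[i]) for i in range(3)]
    return edges, det


edges, det = triangle_setup((0, 0, 1), (4, 0, 1), (0, 4, 1))
```

With only the adjoint, each edge equation is scaled by $\det(M)$ compared with the corresponding row of $M^{-1}$, so sign tests give the same answer as long as the determinant is positive.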

Triangle Setup
- 1/w equation:
  - Sum of rows (param vector $\{1, 1, 1\}$).
  - Can be calculated as the sum of the edge equations.
- Additional equations:
  - param vector $\{u_0, u_1, u_2\} \cdot M^{-1}$: 3 DP3.
- Frustum/Viewport clip (viewport origin $(x_0, y_0)$, width $w$, height $h$):
  - $D_0 = [1, 0, -x_0]$
  - $D_1 = [-1, 0, x_0 + w]$
  - $D_2 = [0, 1, -y_0]$
  - $D_3 = [0, -1, y_0 + h]$
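The viewport clip equations above have the same $[a, b, c]$ form as the edge equations, so a fragment can be tested against them with the same sign test. The sketch below assumes they are evaluated at window coordinates $(x, y)$ as $a x + b y + c$ and that points on the boundary count as inside; both are choices made for the example, and the function names are hypothetical.

```python
def viewport_clip_equations(x0, y0, w, h):
    """Return D0..D3 as (a, b, c) coefficients for a viewport with
    origin (x0, y0), width w and height h."""
    return [(1, 0, -x0), (-1, 0, x0 + w), (0, 1, -y0), (0, -1, y0 + h)]


def inside_viewport(equations, x, y):
    """Sign test: the point is inside when every equation is non-negative."""
    return all(a * x + b * y + c >= 0 for a, b, c in equations)


eqs = viewport_clip_equations(10, 20, 100, 50)
corner_inside = inside_viewport(eqs, 10, 20)
left_outside = inside_viewport(eqs, 9, 20)
```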

[Figure: DP3 datapath built from multipliers (*) and adders (+)]

Triangle Traversal
- Different algorithms:
  - I don't know which is better.
  - Scanline.
  - Centerline (PixelVision).
  - Tiled (Neon, McCormack).
  - Incremental and Hierarchical Hilbert Order (McCool).
  - Others?

Triangle Traversal
- Traversal algorithm effects:
  - Can improve the texture access pattern (Neon, Hilbert).
  - Can improve framebuffer memory access (Neon).
- Traversal algorithm requirements:
  - Must produce at least 2x2 fragments per cycle, or a multiple of that (two 2x2 blocks, three 2x2 blocks, etc.).
  - Must be efficient and generate as few fragments outside the triangle as possible.
  - Antialiasing?

Triangle Traversal
- Uses the edge equation coefficients and/or the start and slope values calculated from them to walk the triangle.
- One 'step' per cycle.
  - Fixed-point arithmetic: integer addition.
  - Requires saving state (2 to 3 saved states) or walking back (which costs cycles).
- Tests the sign of the edge equation values at n positions per cycle.
- May test frustum and znear/zfar clip at the same time.
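A minimal sketch of incremental traversal, assuming integer edge coefficients and a plain bounding-box scan. It is not one of the algorithms listed earlier (scanline, centerline, tiled, Hilbert), it tests one position per step rather than a 2x2 block, and it applies no fill rule for shared edges. It only illustrates the three ingredients named above: stepping by integer addition, one saved state (the start of the current row), and sign tests on the edge values. The function name and the bounding-box parameters are hypothetical.

```python
def traverse_bounding_box(edges, x_min, y_min, x_max, y_max):
    """Return the (x, y) positions whose three edge values are all >= 0.

    edges: three (a, b, c) tuples; the edge value at (x, y) is a*x + b*y + c.
    """
    fragments = []
    # Saved state: edge values at the first position of the current row.
    row_start = [a * x_min + b * y_min + c for a, b, c in edges]
    for y in range(y_min, y_max + 1):
        current = list(row_start)
        for x in range(x_min, x_max + 1):
            if all(v >= 0 for v in current):          # sign test
                fragments.append((x, y))
            # Step one position in x: add the 'a' coefficient of each edge.
            current = [v + a for v, (a, _, _) in zip(current, edges)]
        # Step one position in y from the saved state: add 'b'.
        row_start = [v + b for v, (_, b, _) in zip(row_start, edges)]
    return fragments


# Edge equations of the triangle (0,0), (4,0), (0,4) with w = 1.
edges = [(-4, -4, 16), (4, 0, 0), (0, 4, 0)]
frags = traverse_bounding_box(edges, 0, 0, 4, 4)
```

The bounding-box scan visits many positions outside the triangle, which is exactly the inefficiency the listed traversal algorithms try to reduce.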

Triangle Traversal
- Hardware requirements:
  - Multiple fixed-point adders.
  - Multiple sign testers.
  - Registers for current (at least 3 for each edge equation) and saved states.
  - Registers for edge slopes/increments (as many as fragments generated per cycle and edge equations?).

Traversal Algorithm
[Figure: traversal algorithm diagram with a block labelled "TEST"]

Interpolation
- Using the barycentric method:
  - Use the edge equation results (McCool):
    $F_0(x,y) = E_0$
    $F_1(x,y) = E_1$
    $F_2(x,y) = E_2$
  - Calculate the sum of the edge equations at the fragment:
    $R'(x,y) = F_0(x,y) + F_1(x,y) + F_2(x,y)$
  - Calculate the reciprocal:
    $r = 1/R'(x,y)$
  - Interpolate the attribute at the fragment:
    $p_k(x,y) = p_{k0}\, r F_0(x,y) + p_{k1}\, r F_1(x,y) + p_{k2}\, r F_2(x,y)$
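The barycentric formulas above translate almost directly into code. In this sketch the edge equations are taken as $(a, b, c)$ coefficient tuples evaluated as $a x + b y + c$, and floating point is used for clarity; the function name and argument layout are assumptions.

```python
def interpolate_barycentric(edges, attr, x, y):
    """Interpolate one scalar attribute at fragment (x, y).

    edges: three (a, b, c) edge equations E_0, E_1, E_2.
    attr:  the attribute values (p_k0, p_k1, p_k2) at the three vertices.
    """
    # F_i(x, y): edge equation results at the fragment.
    f = [a * x + b * y + c for a, b, c in edges]
    # R'(x, y) and its reciprocal r (fixed cost per fragment).
    r = 1.0 / (f[0] + f[1] + f[2])
    # Per attribute: p_k0 * r * F_0 + p_k1 * r * F_1 + p_k2 * r * F_2.
    return sum(p * r * fi for p, fi in zip(attr, f))


# Edge equations of the triangle (0,0), (4,0), (0,4) with w = 1.
edges = [(-4, -4, 16), (4, 0, 0), (0, 4, 0)]
value = interpolate_barycentric(edges, (10.0, 20.0, 30.0), 1, 1)
```

Because the result is divided by the sum of the edge values, it does not matter whether the edge equations come from $M^{-1}$ or from the unscaled adjoint.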

Interpolation
- Alternative (Olano & Greer):
  - At setup:
    - Use the 2DH method and calculate coefficients for all the attributes.
    - Calculate the 1/w (sum of rows) coefficients.
    - Requires a vector-matrix mul per attribute.
  - At traversal/interpolation:
    - Interpolate 1/w and the attributes using fixed-point incremental arithmetic.
    - Calculate the reciprocal of 1/w.
    - Multiply the interpolated attribute by the reciprocal of 1/w.
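A sketch of the setup/interpolation split described above, walking a single row of fragments. It takes the three edge equations as input and uses floating point instead of fixed point for readability; the function names and the row-based walk are assumptions made for the example.

```python
def setup_equation(edges, u):
    """Setup: vector-matrix mul {u0, u1, u2} . [E_0; E_1; E_2] (3 DP3)."""
    return tuple(sum(u[i] * edges[i][j] for i in range(3)) for j in range(3))


def interpolate_row(edges, u, y, x_start, count):
    """Interpolate attribute u along one row, starting at (x_start, y)."""
    a_u, b_u, c_u = setup_equation(edges, u)           # attribute equation
    a_w, b_w, c_w = setup_equation(edges, (1, 1, 1))   # 1/w: sum of rows

    # Start values at the first fragment of the row.
    u_over_w = a_u * x_start + b_u * y + c_u
    one_over_w = a_w * x_start + b_w * y + c_w

    values = []
    for _ in range(count):
        # Fixed cost per fragment: one reciprocal; per attribute: one mul.
        values.append(u_over_w * (1.0 / one_over_w))
        # Incremental step in x: one addition per equation.
        u_over_w += a_u
        one_over_w += a_w
    return values


# Edge equations of the triangle (0,0), (4,0), (0,4) with w = 1.
edges = [(-4, -4, 16), (4, 0, 0), (0, 4, 0)]
row = interpolate_row(edges, (10.0, 20.0, 30.0), 1, 0, 3)
```

The per-fragment work is one addition and one multiplication per attribute, at the price of a vector-matrix multiplication per attribute at setup and extra state for the current values and increments.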

Interpolation
- Barycentric coordinates (McCool):
  - No cost at setup.
  - Store the parameter values at the three triangle vertices.
  - Fixed: 1 addition, 1 reciprocal and 3 muls.
  - Per parameter: 1 DP3.
- Interpolation using Olano & Greer:
  - Vector-matrix mul at setup per parameter and for 1/w: 3 DP3.
  - Store the current state and slope increment for all the parameters and 1/w.
  - Fixed: 1 addition, 1 reciprocal.
  - Per parameter: 1 addition, 1 mul.

Interpolation
- How many attributes/parameters can be interpolated per cycle?
- XBOX:
  - 5 interpolators?
  - General interpolator: color diffuse + color specular (shared).
  - Texture interpolators: 4?
  - Note: each of those interpolators is for a 4D vector.

[Figure: interpolator datapath from VERTEX ATTRIBUTES to FRAGMENT ATTRIBUTES, built from an adder (+), a reciprocal unit (1/x) and multipliers (*)]

Current Status
- Implemented the Primitive Assembly box (with trivial degenerate triangle rejection).
- Added the GPU_VERTEX_OUTPUT_ATTRIBUTE register.
  - Boolean vector of MAX_VERTEX_ATTRIBUTES entries that records whether a vertex output register is written in the shader (and therefore must be transmitted).
  - The transmission latency for a vertex between the Shader and Streamer Commit, and between Streamer Commit and Primitive Assembly, is now determined by the number of output attributes.
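A toy model of how such a boolean vector could drive the transmission latency. Only the register name and MAX_VERTEX_ATTRIBUTES come from the slides; the value 16, the one-cycle-per-attribute cost and the function name are hypothetical.

```python
MAX_VERTEX_ATTRIBUTES = 16  # hypothetical value, not taken from the slides


def transmission_latency(output_written, cycles_per_attribute=1):
    """Cycles needed to transmit one vertex, given which output
    registers the shader writes (one boolean per output attribute)."""
    if len(output_written) != MAX_VERTEX_ATTRIBUTES:
        raise ValueError("expected one flag per vertex output attribute")
    written = sum(1 for flag in output_written if flag)
    return written * cycles_per_attribute


# Example: a shader that writes position, one color and one texture coordinate.
gpu_vertex_output_attribute = [False] * MAX_VERTEX_ATTRIBUTES
for index in (0, 1, 4):
    gpu_vertex_output_attribute[index] = True

latency = transmission_latency(gpu_vertex_output_attribute)
```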

Current Status
- Started the Triangle Setup box and support classes.

Current Status
- Comments:
  - Should the Streamer Loader to Shader transmission also have a transmission latency penalty?
  - Where are the vertex output attributes stored?
  - How many times must we pay the vertex transmission penalty?

Current Status
- Signal Analyzer:
  - Already works with large traces.

References
- Triangle Scan Conversion using 2D Homogeneous Coordinates, Marc Olano, Trey Greer.
- Tiled Polygon Traversal Using Half-Plane Edge Functions, Joel McCormack, Robert McNamara.
- Incremental and Hierarchical Hilbert Order Edge Equation Polygon Rasterization, Michael D. McCool, Chris Wales, Kevin Moule.
- A Parallel Algorithm for Polygon Rasterization, Juan Pineda.