Status – Week 239, Victor Moya


1 Status – Week 239 Victor Moya

2 Summary
- Primitive Assembly.
- Clipping triangle rejection.
- Rasterization.
- Triangle Setup.
- Early Z.
- Current status.

3 Primitive Assembly
- Works as an LRU cache.
- Asks the Post T&L cache for missing vertices.
- Checks whether any of the new vertices are already in the primitive assembly cache.
- Three vertices stored (2 for triangles, 3 for quads).
- The last vertex is always bypassed directly to Triangle Setup.
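The LRU behaviour described on this slide can be sketched as a tiny index cache. This is only an illustration: `SmallLruCache`, `countFetches` and the use of vertex indices as cache tags are hypothetical, and the bypass of the last vertex to Triangle Setup is not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical tiny LRU cache keyed by vertex index (most recent at the front).
class SmallLruCache {
public:
    explicit SmallLruCache(std::size_t capacity) : capacity_(capacity) {}

    // Returns true on a hit. On a miss the index is inserted,
    // evicting the least recently used entry when the cache is full.
    bool access(std::uint32_t index) {
        std::deque<std::uint32_t>::iterator it =
            std::find(entries_.begin(), entries_.end(), index);
        if (it != entries_.end()) {
            entries_.erase(it);          // refresh: move the entry to the front
            entries_.push_front(index);
            return true;
        }
        if (capacity_ == 0) return false;
        if (entries_.size() == capacity_) entries_.pop_back();
        entries_.push_front(index);
        return false;
    }

private:
    std::size_t capacity_;
    std::deque<std::uint32_t> entries_;
};

// Counts how many vertices would have to be requested from the
// Post T&L cache for a stream of vertex indices.
inline int countFetches(const std::uint32_t* indices, std::size_t count,
                        std::size_t capacity) {
    SmallLruCache cache(capacity);
    int fetches = 0;
    for (std::size_t i = 0; i < count; ++i)
        if (!cache.access(indices[i])) ++fetches;
    return fetches;
}
```

With three entries, two triangles sharing an edge need four fetches instead of six.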

4 Clipping Rejection
- Check clipping per vertex.
- Apply results per primitive.
- Reject full primitives.
- DP3 clip plane equation with vertex homogeneous coordinates.
  - Gives the signed distance between the vertex and the plane.
- Reject the primitive when all of its vertices are on the negative side of the same plane.
- Problem: triangles with all vertices outside the clip volume, but with a region inside.
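A minimal sketch of the per-vertex test and per-primitive decision follows. It assumes an OpenGL-style clip volume (-w <= x, y, z <= w), so each signed distance is w plus or minus one coordinate; `Vec4`, `outcode` and `triviallyRejected` are names invented for the example.

```cpp
#include <array>
#include <cstdint>

// Homogeneous clip-space position (hypothetical type for illustration).
struct Vec4 { float x, y, z, w; };

// Bit i is set when the vertex has a negative signed distance to plane i.
// Assumed planes: w+x, w-x, w+y, w-y, w+z, w-z >= 0.
inline std::uint8_t outcode(const Vec4& v) {
    const float d[6] = { v.w + v.x, v.w - v.x,
                         v.w + v.y, v.w - v.y,
                         v.w + v.z, v.w - v.z };
    std::uint8_t code = 0;
    for (int i = 0; i < 6; ++i)
        if (d[i] < 0.0f) code |= static_cast<std::uint8_t>(1u << i);
    return code;
}

// The primitive is rejected only when all three vertices are
// outside the SAME plane, i.e. the AND of the outcodes is non-zero.
inline bool triviallyRejected(const std::array<Vec4, 3>& tri) {
    return (outcode(tri[0]) & outcode(tri[1]) & outcode(tri[2])) != 0;
}
```

A triangle whose vertices are each outside a different plane is not rejected by this test, which is the problem case named in the last bullet: it may still cover part of the clip volume.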

5 Rasterization
[Diagram: boxes Primitive Assembly, Triangle Setup, Traversal and Interpolation, with a Rasterizer Emulator offering Setup(vattrib[3]), nextFragment() and Interpolate(fr).]

6 Rasterization
- Boxes only carry timing.
  - Latency and throughput for the setup, traversal and interpolation operations.
- Rasterizer Emulator performs the actual work:
  - Setup algorithm.
  - Traversal algorithm.
  - Interpolation algorithm.
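The idea that a box only carries timing can be sketched as a fixed-latency signal: the box writes an item in one cycle and can read it back only after the configured latency, while the functional work is delegated elsewhere. `LatencySignal`, its `write`/`read` methods and the cycle numbering are hypothetical, not the simulator's actual classes.

```cpp
#include <deque>
#include <utility>

// Hypothetical fixed-latency signal: a value written at cycle c
// becomes readable at cycle c + latency, in FIFO order.
template <typename T>
class LatencySignal {
public:
    explicit LatencySignal(unsigned long latency) : latency_(latency) {}

    void write(unsigned long cycle, const T& value) {
        queue_.push_back(std::make_pair(cycle + latency_, value));
    }

    // Returns true and fills 'out' when the oldest value has arrived.
    bool read(unsigned long cycle, T& out) {
        if (queue_.empty() || queue_.front().first > cycle) return false;
        out = queue_.front().second;
        queue_.pop_front();
        return true;
    }

private:
    unsigned long latency_;
    std::deque<std::pair<unsigned long, T> > queue_;
};
```

Changing the latency or the number of items accepted per cycle then alters the timing model without touching the rasterization algorithm.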

7 Rasterization
- Timing and rasterization algorithm are independent.
- Rasterization boxes can simulate as many 'stages' as needed without worrying about functionality.
- Rasterizer emulator offers an interface for all the rasterization operations:
  - Setup(), Area(), AreaSign(), GenerateNextFragment(), GenerateNextTile(), InterpolateFragment(), InterpolateFragmentAttribute(), etc.

8 Rasterization
- Setup Box:
  - Get the triangle vertex positions and attributes.
  - Send to internal signal 'setup' -> simulates setup latency.
  - Read internal signal 'setup'.
  - RastEmu::setup(vattrib[3]).
  - RastEmu::getArea().
  - Check area sign and face culling method:
    - Reject if area is zero or near zero.
    - Reject if face culling enabled and wrong sign.
    - Invert coefficient signs if front face culling.
  - Issue triangle to triangle traversal.
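The area and culling checks in the Setup box could look roughly like the sketch below. It assumes that a positive signed area means front-facing and that every surviving back-facing triangle has its coefficients inverted (the slide only mentions the front-face-culling case); `CullMode`, `SetupResult`, `classifyTriangle` and the `eps` threshold are hypothetical.

```cpp
#include <cmath>

enum class CullMode { None, Front, Back };
enum class SetupResult { Rejected, Accepted, AcceptedInverted };

// Decides what the Setup box does with a triangle given its signed area.
// Assumption: area > 0 means front-facing.
inline SetupResult classifyTriangle(float area, CullMode mode,
                                    float eps = 1e-6f) {
    // Zero or near-zero area: degenerate triangle, nothing to rasterize.
    if (std::fabs(area) < eps) return SetupResult::Rejected;

    const bool front = area > 0.0f;
    if (mode == CullMode::Back && !front) return SetupResult::Rejected;
    if (mode == CullMode::Front && front) return SetupResult::Rejected;

    // A surviving back-facing triangle has negative edge equations inside,
    // so the coefficient signs are inverted before traversal.
    return front ? SetupResult::Accepted : SetupResult::AcceptedInverted;
}
```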

9 Rasterization
- Traversal Box:
  - Read triangles from Setup box.
  - Set start point: RastEmu::setStart().
    - Optional?
    - Algorithm dependent?
  - Ask for next fragment/fragment tile: write to internal signal 'next fragment'. Simulates fragment generation latency.
  - Read generated fragment: read 'next fragment' signal.
  - RastEmu::nextFragment().
  - Send fragment to interpolation.

10 Rasterization
- Traversal Box:
  - Other algorithms might not provide a fragment per cycle, or might have variable latency for each generated fragment.
    - RastEmu::nextFragment() could return a boolean.
    - RastEmu::nextFragment() could return the number of generated fragments (or a mask for a tile).
    - RastEmu::nextFragment() could return the 'amount of work'.
    - Additional interface functions for fragment generation and triangle traversal.
  - Fragment culling is done in the rasterizer emulator?
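As an illustration of the boolean-returning variant, here is a sketch of a bounding-box traversal whose `nextFragment()` reports whether a fragment was produced. It is not the emulator's algorithm: `BoxTraversal` and `Fragment` are invented names, samples are taken at integer pixel coordinates, the vertices are assumed to give non-negative edge functions inside the triangle, and no fill rule is applied on shared edges.

```cpp
#include <algorithm>

struct Fragment { int x, y; };

// Hypothetical traversal: scans the triangle's bounding box row by row.
class BoxTraversal {
public:
    BoxTraversal(const int vx[3], const int vy[3]) {
        for (int i = 0; i < 3; ++i) { vx_[i] = vx[i]; vy_[i] = vy[i]; }
        minX_ = std::min(vx[0], std::min(vx[1], vx[2]));
        maxX_ = std::max(vx[0], std::max(vx[1], vx[2]));
        minY_ = std::min(vy[0], std::min(vy[1], vy[2]));
        maxY_ = std::max(vy[0], std::max(vy[1], vy[2]));
        x_ = minX_;
        y_ = minY_;
    }

    // Returns true and fills 'out' with the next covered pixel,
    // or false once the whole bounding box has been visited.
    bool nextFragment(Fragment& out) {
        while (y_ <= maxY_) {
            while (x_ <= maxX_) {
                const int x = x_++;
                if (inside(x, y_)) { out.x = x; out.y = y_; return true; }
            }
            x_ = minX_;
            ++y_;
        }
        return false;
    }

private:
    // A pixel is covered when all three edge functions are non-negative.
    bool inside(int x, int y) const {
        for (int i = 0; i < 3; ++i) {
            const int j = (i + 1) % 3;
            const long long e =
                static_cast<long long>(vx_[j] - vx_[i]) * (y - vy_[i]) -
                static_cast<long long>(vy_[j] - vy_[i]) * (x - vx_[i]);
            if (e < 0) return false;
        }
        return true;
    }

    int vx_[3], vy_[3];
    int minX_, maxX_, minY_, maxY_;
    int x_, y_;
};
```

A timing box can keep calling `nextFragment()` once per simulated cycle and stop when it returns false, regardless of how many pixels were skipped internally.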

11 Rasterization
- Interpolation box:
  - Read fragments from Traversal box.
  - Interpolate -> write to 'interpolate' signal.
    - per fragment, or
    - per attribute
  - Read 'interpolate' signal.
  - RastEmu::interpolate().
  - Repeat if per attribute/group of attributes.
  - Send to fragment FIFO.

12 Triangle Setup
- Using hardware equivalent to a vertex shader.
- Use multithreading to hide dependency latencies.
  - Same as shaders.
  - Multiple triangles in setup at the same time.
- Minimum setup latency:
  - 6 cycles (just adj(M) using McCool method).
- Minimum initialization latency:
  - 1 cycle using multithreading and enough registers.

13 Triangle Setup
- Registers:
  - rA, rB, rC -> Edge equations a, b and c coefficients (adj(M) and M^-1 matrix rows).
  - rX, rY, rW -> the 3 vertices x, y and w coordinates (M columns).
  - rD, rI -> matrix determinant and its reciprocal.
  - rR -> 1/w equation coefficients.
  - rU -> parameter values at the three vertices.
  - rP -> parameter equation coefficients.

14 Triangle Setup
- Adj(M): (at least 6 cycles + dep. lat.)
    mul rC.xyz, rX.yzx, rY.zxy
    mul rB.xyz, rX.zxy, rW.yzx
    mul rA.xyz, rY.yzx, rW.zxy
    mad rC.xyz, rX.zxy, rY.yzx, -rC
    mad rB.xyz, rX.yzx, rW.zxy, -rB
    mad rA.xyz, rY.zxy, rW.yzx, -rA
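The six instructions can be mirrored in C++ to check what they compute. The sketch assumes vertex-program style semantics (component-wise `mul`, `mad d, a, b, -c` meaning a*b - c, swizzles reordering components) and uses `rW.zxy` in the second `mad`, which is what makes rB orthogonal to rX and rW. `Vec3`, `SetupRows` and the helper names are hypothetical. Each register comes out as the negated cross product of two columns of M, and `dp3(rC, rW)` is negated in the same way, so the signs cancel once the rows are scaled by the reciprocal.

```cpp
struct Vec3 { float x, y, z; };

// Swizzles used by the listing.
inline Vec3 yzx(const Vec3& v) { Vec3 r = { v.y, v.z, v.x }; return r; }
inline Vec3 zxy(const Vec3& v) { Vec3 r = { v.z, v.x, v.y }; return r; }

// mul d, a, b : component-wise product.
inline Vec3 mul(const Vec3& a, const Vec3& b) {
    Vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z };
    return r;
}

// mad d, a, b, -c : component-wise a * b - c.
inline Vec3 madNeg(const Vec3& a, const Vec3& b, const Vec3& c) {
    Vec3 r = { a.x * b.x - c.x, a.y * b.y - c.y, a.z * b.z - c.z };
    return r;
}

inline float dp3(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

struct SetupRows { Vec3 rA, rB, rC; };

// rX, rY, rW hold the x, y and w coordinates of the three vertices.
inline SetupRows adjugateRows(const Vec3& rX, const Vec3& rY, const Vec3& rW) {
    Vec3 rC = mul(yzx(rX), zxy(rY));
    Vec3 rB = mul(zxy(rX), yzx(rW));
    Vec3 rA = mul(yzx(rY), zxy(rW));
    rC = madNeg(zxy(rX), yzx(rY), rC);
    rB = madNeg(yzx(rX), zxy(rW), rB);
    rA = madNeg(zxy(rY), yzx(rW), rA);
    SetupRows rows = { rA, rB, rC };
    return rows;
}
```

For the vertices (1, 2, 1), (4, 1, 1) and (2, 5, 1), each row is orthogonal to two of the columns and has the same dot product with the third, which is the property an adjugate row needs.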

15 Triangle Setup
- det(M): (1 cycle)
    dp3 rD.x, rC, rW
- M^-1: (4 cycles + dep. lat.)
    rcc rI.x, rD.x
    mul rC, rC, rI
    mul rB, rB, rI
    mul rA, rA, rI

16 Triangle Setup
- 1/w coefficients: (2 cycles + dep. lat.)
    add rR, rA, rB
    add rR, rR, rC
- Parameter coefficients: (3 cycles)
    dp3 rU.x, rP, rA
    dp3 rU.y, rP, rB
    dp3 rU.z, rP, rC

17 Early Z
- Could be implemented before interpolation.
  - Interpolate the triangle Z (z/w) first.
  - Could save some calculations.
  - Would it save time?
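A minimal sketch of the idea: test the interpolated Z against the depth buffer first and interpolate the remaining attributes only for fragments that pass. The `DepthBuffer` class and the less-than compare are assumptions for illustration, and the sketch ignores cases where later stages could change or discard the depth.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical depth buffer with a fixed "less than" depth test.
class DepthBuffer {
public:
    DepthBuffer(int width, int height, float clearValue)
        : width_(width),
          depth_(static_cast<std::size_t>(width) * height, clearValue) {}

    // Returns true (and stores z) when the fragment is visible;
    // only such fragments need their other attributes interpolated.
    bool testAndUpdate(int x, int y, float z) {
        float& stored = depth_[static_cast<std::size_t>(y) * width_ + x];
        if (z < stored) {
            stored = z;
            return true;
        }
        return false;
    }

private:
    int width_;
    std::vector<float> depth_;
};
```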

18 Current Status
- (to be done)

