Status – Week 228 Victor Moya

Summary
Hierarchical Z-Buffer.

Hierarchical Z-Buffer
Two Level Hierarchical Z-Buffer for 3D Graphics Hardware. Cheng-Hsien Chen, Chen-Yi Lee.
System, Method and Apparatus for Multi-Level Hierarchical Z Buffering. US Patent Application 2003/

Hierarchical Z-Buffer
[Diagram: pipeline boxes PA, CLIP, RAST, INT, PS, FTEST, with HZ and ZTEST]

Hierarchical Z-Buffer
[Diagram: HZ MEM, Z CACHE, DECODE, ENCODE, COMBINING CACHE, MEMORY (Z-BUFFER), MERGE LOGIC]

Hierarchical Buffer
Model from ATI patent application.
See also two other ATI patents about early implementations of the HZ.
Two level HZ.
– but only one used?
Early Z access is performed in two phases:
– access to the HZ L2.
– if passes, access the Z cache.
– if miss, access memory (fetch new Z cache line).
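The access sequence above (HZ first, then the Z cache, then memory on a miss) can be sketched roughly as follows. This is only an illustrative model, not the scheme from the patent application: the class `EarlyZ`, its dictionaries, the one-pixel cache line, the depth range [0, 1] and the "smaller Z is nearer" convention are all assumptions made for the example.

```python
FAR = 1.0  # assumption: depth in [0, 1], smaller Z is nearer (less-than depth test)

class EarlyZ:
    """Toy model of the early Z path: HZ lookup, then Z cache, then memory."""

    def __init__(self, tile):
        self.tile = tile      # tile edge in pixels (assumed square tiles)
        self.hz = {}          # (tile_x, tile_y) -> farthest Z stored for the tile
        self.z_cache = {}     # (x, y) -> Z; a cache "line" is one pixel here
        self.z_memory = {}    # (x, y) -> Z; stands in for the Z-buffer in memory

    def test(self, x, y, z):
        # Phase 1: a fragment at or behind the tile's farthest Z cannot be visible.
        farthest = self.hz.get((x // self.tile, y // self.tile), FAR)
        if z >= farthest:
            return ("discard", "hz")
        # Phase 2: the fragment passed HZ, so look it up in the Z cache.
        if (x, y) in self.z_cache:
            path = "hit"
        else:
            # Miss: fetch the line from memory into the cache.
            self.z_cache[(x, y)] = self.z_memory.get((x, y), FAR)
            path = "miss"
        result = "pass" if z < self.z_cache[(x, y)] else "discard"
        return (result, path)
```

The sketch only reads depth values; Z writes and the HZ update are left out on purpose.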

Hierarchical Z-Buffer
Z Cache
– small lines? 32 bits per pixel, 4 pixels per line?
– data is compressed in the Z-Buffer.
– decode/decompress at line fetch.
– encode/compress at line evict.
– compress mechanism is also used to calculate the farthest Z value in the line.
– size?
– replacement policy?

Hierarchical Z-Buffer
HZ Memory
– each position stores the farthest Z value in an NxM tile of the original Z-Buffer.
– data precision? 8 bits? 16 bits? 32 bits?
– combining cache builds the tile's farthest value.
– number of HZ levels?
– L1 on die, L2 on cache/memory.
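As a rough illustration of what one HZ position holds, the sketch below reduces a full-resolution Z-buffer to one farthest value per NxM tile, and shows one way a reduced-precision entry could be stored. The function names, the list-of-rows layout, the [0, 1] depth range and the "larger Z is farther" convention are assumptions for the example. Rounding up when quantizing keeps the stored value conservative (never nearer than the true farthest Z).

```python
import math

def build_hz(zbuf, n, m):
    """Return one farthest (maximum) Z per n-wide, m-high tile of zbuf.

    zbuf is a list of rows of depth values; larger Z is assumed to be farther.
    """
    height, width = len(zbuf), len(zbuf[0])
    hz = []
    for ty in range(0, height, m):
        row = []
        for tx in range(0, width, n):
            row.append(max(zbuf[y][x]
                           for y in range(ty, min(ty + m, height))
                           for x in range(tx, min(tx + n, width))))
        hz.append(row)
    return hz

def quantize_far(z, bits):
    """Quantize z in [0, 1] to an integer of the given width, rounding toward far."""
    top = (1 << bits) - 1
    return min(math.ceil(z * top), top)

zbuf = [[0.1, 0.2, 0.9, 0.8],
        [0.3, 0.4, 0.7, 0.6],
        [0.5, 0.5, 0.2, 0.1],
        [0.5, 0.6, 0.3, 0.0]]
hz = build_hz(zbuf, 2, 2)   # [[0.4, 0.9], [0.6, 0.3]]
```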

Hierarchical Z-Buffer
HZ memory
– combining cache size?
– replacement policy? FIFO?
– update mechanism

Hierarchical Z-Buffer
HZ Memory
– size?
Example:
– 8x8 tiles
– 8 bits per value
– 2048x2048 resolution
– 64KB
– a second level can be implemented using pointers.
– LARGE!
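The 64KB figure in the example follows from one value per tile: (2048/8) x (2048/8) = 65,536 tiles at 8 bits each. A small helper makes it easy to try other tile sizes and precisions; the function name and parameters are illustrative only, and a single HZ level is assumed.

```python
def hz_size_bytes(width, height, tile_w, tile_h, bits_per_value):
    """Storage for a single-level HZ memory with one value per tile."""
    tiles_x = -(-width // tile_w)    # ceiling division
    tiles_y = -(-height // tile_h)
    total_bits = tiles_x * tiles_y * bits_per_value
    return (total_bits + 7) // 8

size = hz_size_bytes(2048, 2048, 8, 8, 8)
print(size // 1024, "KB")  # prints "64 KB"
```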

Hierarchical Z-Buffer
[Diagram: Bit-mask Cache, HZ Buffer, Z write]

Hierarchical Z-Buffer
[Diagram: TMPZ, COVERAGE MASK, COVERAGE?]

Hierarchical Z-Buffer
Light weight Z-Buffer?
HZ Buffer:
– 2 bit pointer (L2 HZ).
– 4 x 8 bit Z values (L1 HZ).
– ~49 KB for 1024x768.
HZ test per primitive.
HZ test per fragment.

Hierarchical Z-Buffer
Bit-mask cache:
– builds HZ blocks.
– holds the current farthest Z for the block.
– 1 bit per block pixel: covered.
– FIFO replacement policy.
– if the full block is covered, update the HZ buffer.
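The bit-mask cache described above could be modelled along these lines. This is a sketch under assumptions, not the design from the paper: the class and method names are invented, partially covered blocks are simply dropped on eviction, depth is assumed to lie in [0, 1] with larger Z farther, and the HZ buffer keeps the nearer of its old and new bound.

```python
from collections import OrderedDict

FAR = 1.0  # assumption: depth in [0, 1], larger Z is farther

class BitMaskCache:
    """Accumulates per-block coverage and farthest Z from Z writes."""

    def __init__(self, entries, block_pixels):
        self.entries = entries
        self.full_mask = (1 << block_pixels) - 1
        self.lines = OrderedDict()   # block -> [coverage mask, farthest Z], FIFO order
        self.hz_buffer = {}          # block -> farthest Z

    def z_write(self, block, pixel, z):
        if block not in self.lines:
            if len(self.lines) == self.entries:
                # FIFO replacement: drop the oldest, still partial, block.
                self.lines.popitem(last=False)
            self.lines[block] = [0, 0.0]
        line = self.lines[block]
        line[0] |= 1 << pixel        # mark this pixel as covered
        line[1] = max(line[1], z)    # a stale farther value stays conservative
        if line[0] == self.full_mask:
            # Every pixel was written: the farthest Z bounds the whole block.
            self.hz_buffer[block] = min(self.hz_buffer.get(block, FAR), line[1])
            del self.lines[block]
```

For instance, writing all four pixels of a 4-pixel block with depths 0.2, 0.6, 0.4 and 0.1 leaves 0.6 in the HZ buffer entry for that block.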

Hierarchical Z-Buffer
[Flowchart: access HZ (cull: discard; passes test: access Z Cache); Z Cache hit: test (cull: discard; pass: pass); Z Cache miss: access memory, replace, test]

Hierarchical Z-Buffer
[Diagram, HZ test boxes: Triangle Traversal, Hierarchical Z, Interpolator, fragment test, Z test?, Z Cache, Memory Controller]

Hierarchical Z-Buffer
[Diagram, Z test and Z and HZ update boxes: Z test, fragment, Z cache, Memory Controller, Hierarchical Z]

Cache simulation
[Timing diagram, cycles c1 to c5, rows box1 and box2: accesses acc1, acc2, acc3 and results res1, res2, res3]
Latency 2, throughput 1.
Box2 = cache.

Cache simulation
[Timing diagram, cycles c1 to c5, row box1: accesses acc1, acc2, acc3 and results res1, res2, res3]
Latency 1 cycle, throughput 1.
Box1 => included cache.
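The two timing options can be compared with a small cycle-driven model. This is a sketch only: `simulate` and its event log are invented for the example, one access is issued per cycle (throughput 1), and latency is taken to mean the number of cycles between issuing an access and receiving its result.

```python
from collections import deque

def simulate(latency, n_accesses):
    """Return a list of (cycle, event) pairs for a pipelined cache access stream."""
    in_flight = deque()   # (ready_cycle, access_id), in issue order
    log = []
    cycle, next_id = 1, 1
    while next_id <= n_accesses or in_flight:
        # Deliver the result that becomes ready this cycle, if any.
        if in_flight and in_flight[0][0] == cycle:
            _, done = in_flight.popleft()
            log.append((cycle, f"res{done}"))
        # Issue at most one new access per cycle.
        if next_id <= n_accesses:
            in_flight.append((cycle + latency, next_id))
            log.append((cycle, f"acc{next_id}"))
            next_id += 1
        cycle += 1
    return log

separate_box = simulate(2, 3)   # cache modelled as its own box
in_box = simulate(1, 3)         # cache included in the requesting box
```

Under these assumptions, three accesses complete at cycle 5 with the cache in a separate box and at cycle 4 with the in-box cache, while both sustain one access per cycle.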

Cache simulation
Always use 2+ cycle access caches?
Always use in-box caches?
Shared cache?