N-Buffers for efficient depth map query Xavier Décoret Artis GRAVIR/IMAG INRIA.


Context
Real-time rendering
Visibility culling
–quickly reject what’s not visible
  what won’t affect any pixel in final image
Many methods available [COCSD02,PT02]

Occlusion maps
Select potential occluders [LG95,KCCO00]
–project and rasterize them
–store distance to closest one at each pixel
  Z buffer / occlusion map / depth map
Traverse potential occludees
–project and rasterize them
–test visibility of each fragment
  depth comparison against depth map
  - use bounding volumes
  - do it hierarchically

Optimizations
Reduce number of pixels tested
–Hierarchical Z Buffer [ZMHH97]
–Lazy Occlusion Grid [HTP01]
–Summed Area Tables [HW99]
Use hardware Z buffer
–implemented for hidden face removal with optimizations [Mor00, AMN03]
–exposed through Occlusion Queries

Occlusion queries
Return the # of pixels that would pass the z test if some geometry were rendered in the current framebuffer
Hardware-assisted culling [HSLM02,BWPP04]
Other applications [TPK01]
–culling & clamping of shadow volumes [LWGM04]
–LOD selection [ASVNB00]

Motivation for N-Buffers
Query depth map within GPU
–Advantages
  reduce communication with CPU
  allow geometry to be discarded/optimized on the GPU
–Constraints
  limited # of operations
  complex data structures unavailable
  –no pointers or lists
  “complex” algorithms prohibited
  –branching and indirections costly

Task at hand
For a given object, find the maximum depth covered by its projection
Depth map accessed as a texture
–Lookups give information at one pixel
–We need information over a region
Use texture to encode depth over a region
–proximity grids

The data structure
Sequence of depth maps (levels)
At level i, a texel stores the maximum depth in a neighborhood of size i
–various neighborhoods/sizes possible
–we choose squares of size 2^i x 2^i with lower left corner on the texel

The data structure
[Figure: the depth map at level 0, level 1, level 2 and level 3. Callout: that texel stores maximum depth within that region.]

The data structure
Like an image pyramid but...
–all levels have same resolution
–level 0 (depth map) can have any dimensions
  not limited to power of 2
# of levels is log2 of largest dimension
–but we might build only the first levels
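The definition above can be written down as a brute-force reference, useful mainly for checking a faster construction. This is only a sketch: the names `region_max` and `top_level` are hypothetical, the depth map is assumed to be a list of rows indexed as `depth[y][x]`, and squares that stick out of the map are assumed to be clipped to it.

```python
def region_max(depth, level, x, y):
    """Brute-force value of the level-`level` texel at column x, row y.

    Assumption: the 2^level x 2^level square starting at (x, y) is
    clipped to the depth map when it extends past the border.
    """
    height, width = len(depth), len(depth[0])
    size = 1 << level  # side of the square neighborhood: 2^level
    return max(
        depth[j][i]
        for j in range(y, min(y + size, height))
        for i in range(x, min(x + size, width))
    )


def top_level(width, height):
    """Smallest level whose square side 2^level spans the largest dimension."""
    return (max(width, height) - 1).bit_length()


depth = [
    [0.1, 0.2, 0.3],
    [0.4, 0.9, 0.5],
    [0.6, 0.7, 0.8],
]
print(region_max(depth, 0, 0, 0))  # 0.1 (level 0 is the depth map itself)
print(region_max(depth, 1, 0, 0))  # 0.9 (max over the 2x2 square at (0, 0))
print(top_level(640, 480))         # 10, since 2^10 = 1024 >= 640
```

Every level has the same resolution as level 0, so `region_max` is defined for every texel at every level.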

Construction
Level i+1 obtained from level i
[Figure: level 0, level 1, level 2]
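One way to obtain level i+1 from level i that is consistent with the square neighborhoods chosen earlier is to take the maximum of four level-i texels offset by 2^i. The slides do not spell out the formula, so the CPU sketch below is an assumption, as are the name `build_n_buffers` and the clamp-to-edge handling at the border.

```python
def build_n_buffers(depth, num_levels):
    """Return [L0, L1, ..., L_num_levels], all with the resolution of `depth`.

    Assumed rule: L[i+1](x, y) is the max of L[i] at (x, y), (x + 2^i, y),
    (x, y + 2^i) and (x + 2^i, y + 2^i), with coordinates clamped to the
    map, which mimics a clamp-to-edge texture lookup.
    """
    height, width = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]  # level 0 is a copy of the depth map
    for i in range(num_levels):
        step = 1 << i
        prev = levels[-1]
        nxt = []
        for y in range(height):
            y2 = min(y + step, height - 1)
            row = []
            for x in range(width):
                x2 = min(x + step, width - 1)
                row.append(max(prev[y][x], prev[y][x2],
                               prev[y2][x], prev[y2][x2]))
            nxt.append(row)
        levels.append(nxt)
    return levels


depth = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
levels = build_n_buffers(depth, 2)
print(levels[1][0][0])  # 5: max of the 2x2 square at (0, 0)
print(levels[2][0][0])  # 9: the 4x4 square at (0, 0) covers the whole map
```

Each pass reads only the previous level and does four lookups per texel, which is why the cost of these passes depends on resolution and not on scene complexity.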

Construction
Can be done on the GPU
–render scene offscreen
–copy depth to texture L[0]
–for i = 1 to n
  set up fragment program
  render a quad
  –covering viewport
  –with unit texcoords
  –with fragment program
  copy depth to texture L[i]
[Callout: standard z-buffer]

Construction
Similar to matrix reduction...
–Buck and Purcell, GPU Gems, p 626
...but we keep full resolution
–gives us locality

Construction
Complexity
–first step depends on scene complexity
–other steps depend only on resolution
Computation cost
–~10 ms for 640x480
–no read back
GeForce FX 6800

Query
Naive approach
–project occludee
–get screen space bbox extents + z min
–get bounding neighborhood
–do one lookup in matching level at lower left corner
–compare z min and z max
[Figure: top view of the viewport with levels 0 to 5; the bounding neighborhood is 2^5 x 2^5 and the lookup returns z max.]
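The naive query can be sketched on the CPU as follows. The names `make_lookup` and `naive_query` are hypothetical, the lookup is a brute-force stand-in for reading a precomputed level texture, and treating the occludee as hidden when z min is strictly greater than z max is an assumption about the comparison.

```python
def make_lookup(depth):
    """Brute-force stand-in for a texture fetch in level `level` at (x, y)."""
    height, width = len(depth), len(depth[0])

    def lookup(level, x, y):
        size = 1 << level
        return max(depth[j][i]
                   for j in range(y, min(y + size, height))
                   for i in range(x, min(x + size, width)))

    return lookup


def naive_query(lookup, x0, y0, width, height, z_min):
    """True if the occludee is hidden behind everything in its neighborhood.

    (x0, y0) is the lower left texel of the screen space bbox,
    width x height its extent in texels, z_min its nearest depth.
    """
    # Smallest level whose 2^level square contains the bbox.
    level = (max(width, height) - 1).bit_length()
    z_max = lookup(level, x0, y0)  # one lookup at the lower left corner
    return z_min > z_max


# 8x8 depth map: occluders at depth 0.5 everywhere, one far texel at (6, 6).
depth = [[0.5] * 8 for _ in range(8)]
depth[6][6] = 0.9
lookup = make_lookup(depth)

print(naive_query(lookup, 1, 1, 3, 3, 0.6))  # True
print(naive_query(lookup, 3, 3, 3, 3, 0.6))  # False
```

In the second call the 3x3 bbox only covers texels at depth 0.5, but its 4x4 bounding neighborhood also contains the far texel at (6, 6), so the occludee is conservatively reported as visible.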

Query
Naive approach
Overly conservative
–(bvolume of occludee)
–screenspace bbox
–bounding neighborhood
Need a tighter coverage
[Figure: top view of the viewport with levels 0 to 5 and the 2^5 x 2^5 bounding neighborhood]

4 tiles coverage
depth max in region ≥ depth max in sub-region
[Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood, covered by four 2^4 x 2^4 tiles with maximum depths z1, z2, z3, z4.]
z ≤ z max, with z max = max(z1, z2, z3, z4)
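A possible CPU sketch of the 4 tiles coverage: four squares one level below the bounding neighborhood, pushed into the corners of the bbox. The slides mention several ways of placing the squares, so the corner placement here is only one assumed arrangement, and `make_lookup` and `four_tile_query` are hypothetical names.

```python
def make_lookup(depth):
    """Brute-force stand-in for a texture fetch in level `level` at (x, y)."""
    height, width = len(depth), len(depth[0])

    def lookup(level, x, y):
        size = 1 << level
        return max(depth[j][i]
                   for j in range(y, min(y + size, height))
                   for i in range(x, min(x + size, width)))

    return lookup


def four_tile_query(lookup, x0, y0, width, height, z_min):
    """True if the occludee is hidden, using four overlapping tiles."""
    bounding_level = (max(width, height) - 1).bit_length()
    tile_level = max(bounding_level - 1, 0)  # one level below, never negative
    size = 1 << tile_level
    # Two tile origins per axis; together they span the bbox because
    # the bbox extent is at most 2 * size.
    xs = (x0, max(x0, x0 + width - size))
    ys = (y0, max(y0, y0 + height - size))
    z_max = max(lookup(tile_level, x, y) for y in ys for x in xs)
    return z_min > z_max


# 8x8 depth map: occluders at depth 0.5 everywhere, one far texel at (6, 6).
depth = [[0.5] * 8 for _ in range(8)]
depth[6][6] = 0.9
lookup = make_lookup(depth)

print(lookup(2, 3, 3))                           # 0.9: single bounding lookup
print(four_tile_query(lookup, 3, 3, 3, 3, 0.6))  # True: tiles avoid (6, 6)
```

For this 3x3 bbox the four 2x2 tiles cover exactly the bbox, so the far texel that spoils the single bounding lookup is no longer included.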

4 tiles coverage
5 ways of covering with 4 squares
Measure of the gain on over-conservativity

Applications
Occlusion culling
Particles
Shadow volume clamping

Occlusion Culling
N-Buffers vs. Occlusion Queries
–walkthrough in city-like scene
–occluders at frame n = visible at frame n-1
Measured the number of depth tests
–testing each building
–using a hierarchy of bounding volumes

Occlusion Culling
Occlusion queries are faster
–hardware implementation, available API
–N-Buffers penalized
  computation of 4 tiles coverage on CPU
  use of glReadPixels to query levels
Occlusion queries can be interleaved with rendering [BWPP04]

Occlusion Culling
# of depth tests smaller with N-Buffers
–4 tests/occludee << nb of pixels rasterized
N-Buffers always benefit from hierarchy
–testing A cheaper than testing children(A)
–not the case for OQ
[Figure: a node rasterizing n pixels and its children rasterizing n1 and n2 pixels, with n > n1 + n2]

Hardware implementation?
Extra memory to store levels
Dedicated component for level updates
–not all levels?
–lazy updates?
Faster than OQ for large objects
Fixed (4) number of operations
–simple implementation
–good for parallelism

Particles
Particles rendered using ARB_point_sprite
–no need to compute quad on CPU
Particles animated within GPU
–up to a million particles in real-time
How to cull unseen particles?
–cannot use OQ!

Particles
Using N-Buffers
–for 16x16 point sprites
  compute 4 first levels only
  do one texture lookup in vertex program
Not implementable yet
–vertex program lookups require LUMINANCE_FLOAT32_ATI
–N-Buffers require DEPTH_COMPONENT
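Since 16 = 2^4, a 16x16 sprite matches the level 4 neighborhood exactly, which is why a single lookup is enough. The CPU sketch below illustrates the test a vertex program would perform. The names are hypothetical, and it assumes the sprite covers the texels from center - 8 to center + 7 and that the particle center lies inside the viewport.

```python
def make_lookup(depth):
    """Brute-force stand-in for a texture fetch in level `level` at (x, y)."""
    height, width = len(depth), len(depth[0])

    def lookup(level, x, y):
        size = 1 << level
        return max(depth[j][i]
                   for j in range(y, min(y + size, height))
                   for i in range(x, min(x + size, width)))

    return lookup


def particle_hidden(lookup, cx, cy, z, sprite_size=16):
    """True if a sprite centered on texel (cx, cy) at depth z is hidden."""
    level = (sprite_size - 1).bit_length()  # 16 -> level 4
    half = sprite_size // 2
    # Lower left corner of the sprite, clamped to the depth map.
    x0 = max(cx - half, 0)
    y0 = max(cy - half, 0)
    return z > lookup(level, x0, y0)  # one lookup per particle


# 32x32 depth map: occluders at depth 0.5 everywhere, one far texel at (20, 20).
depth = [[0.5] * 32 for _ in range(32)]
depth[20][20] = 0.9
lookup = make_lookup(depth)

print(particle_hidden(lookup, 8, 8, 0.7))    # True: everything in front of it
print(particle_hidden(lookup, 16, 16, 0.7))  # False: far texel inside sprite
```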

Shadow volumes clamping
Ignore unseen or fully shadowed casters
Clamp shadow volume to shadowed area [LWGM04]

Shadow volumes clamping
From light’s view, what part of the (visible) scene does a shadow volume encompass?
[Figure: light, camera, scene]

The litmap
Light view of what’s seen by viewer
[Figure: camera’s view, light’s view]

Shadow volumes clamping
From light’s view, what part of the (visible) scene does a shadow volume encompass?
Minimum/maximum depth covered by a shadow caster

Shadow volumes clamping
Compute two litmaps
–furthest visible parts
–closest visible parts
Compute N-Buffers for both
For each shadow caster
–use N-Buffers to look up min/max visible parts
–cull and clamp accordingly
Can be done in vertex program [BS03]
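The cull and clamp step could look like the sketch below. This is one plausible reading of the slide, not the author's stated rule: depths are assumed to be measured from the light, the min/max visible depths are assumed to come from N-Buffer lookups over the caster's projection in the two litmaps, and the function name is hypothetical.

```python
def clamp_shadow_volume(caster_z_min, visible_z_min, visible_z_max):
    """Return None to cull the caster, else the (start, end) depth range
    to which its shadow volume can be clamped.

    Assumption: all depths are measured from the light, and the shadow
    volume extends from the caster away from the light.
    """
    if visible_z_max < caster_z_min:
        # Everything the viewer sees in this region lies between the light
        # and the caster, so the shadow cannot touch it.
        return None
    # No need to start before the first visible part,
    # nor to extend past the last one.
    return (max(caster_z_min, visible_z_min), visible_z_max)


print(clamp_shadow_volume(0.3, 0.5, 0.8))  # (0.5, 0.8)
print(clamp_shadow_volume(0.9, 0.2, 0.6))  # None
```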

Shadow volumes clamping

Simpler than CC shadow volumes [LWGM04]
–single slice
–not optimized (no hardware support)
Reduces the fill rate by more than 80%

Conclusion
Novel representation for depth maps
–for encoding depth information over a region
–fast to compute
  possible implementation in hardware
–fixed number of operations for query
  queries available in vertex/fragment programs
Applications
–can’t beat (yet) hardware optimized approaches
–more a proof of concept

Future work
Not limited to culling
–depth maps used for relief [OB00,PNC05]
Other neighborhood basis
–RIP maps [KLK00]
Link to theory of Wavelet Zoom [Mal01]

Questions
