Published by Tanya Britten. Modified over 9 years ago.
1
N-Buffers for efficient depth map query. Xavier Décoret, Artis, GRAVIR/IMAG, INRIA
2
Context: Real-time rendering. Visibility culling –quickly reject what’s not visible, i.e. what won’t affect any pixel in the final image. Many methods available [COCSD02,PT02]
3
Occlusion maps. Select potential occluders [LG95,KCCO00] –project and rasterize them –store distance to closest one at each pixel (Z buffer / occlusion map / depth map). Traverse potential occludees –project and rasterize them –test visibility of each fragment by depth comparison against the depth map. Refinements: use bounding volumes; do it hierarchically.
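A minimal sketch of the per-fragment test described on this slide, assuming a depth map stored as a list of rows where larger values are farther away, and an occludee reduced to a screen-space bounding box plus its nearest depth. The function name and the `(x0, y0, x1, y1)` box layout are illustrative choices, not part of the talk.

```python
def occludee_is_hidden(depth_map, bbox, z_min):
    """Return True if every pixel under bbox is covered by a closer occluder.

    depth_map[y][x] holds the occluder depth at pixel (x, y); larger = farther.
    bbox is (x0, y0, x1, y1), inclusive, assumed to lie inside the map.
    z_min is the nearest depth of the occludee (conservative for the whole box).
    """
    x0, y0, x1, y1 = bbox
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            # If the occludee's nearest point is not behind the occluder here,
            # this pixel might show the occludee: it cannot be rejected.
            if z_min <= depth_map[y][x]:
                return False
    return True
```

The cost is one depth comparison per covered pixel, which is the quantity the optimizations on the following slides try to reduce.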
5
Optimizations Reduce number of pixels tested –Hierarchical Z Buffer [ZMHH97] –Lazy Occlusion Grid [HTP01] –Summed Area Tables [HW99] Use hardware Z buffer –implemented for hidden face removal with optimizations [Mor00, AMN03] –exposed through Occlusion Queries
6
Occlusion queries: return the # of pixels that would pass the z test if some geometry were rendered in the current framebuffer. Hardware-assisted culling [HSLM02,BWPP04]. Other applications [TPK01] –culling & clamping of shadow volumes [LWGM04] –LOD selection [ASVNB00]
7
Motivation for N-Buffers: query the depth map within the GPU. –Advantages: reduces communication with CPU; allows discarding/optimizing geometry on GPU. –Constraints: limited # of operations; complex data structures unavailable (no pointers and lists); “complex” algorithms prohibited (branching and indirections costly).
8
Task at hand For a given object, find the maximum depth covered by its projection Depth map accessed as a texture –Lookups give information at one pixel –We need information over a region Use texture to encode depth over a region –proximity grids
9
The data structure: sequence of depth maps (levels). At level i a texel stores the maximum depth in a neighborhood of size i –various neighborhoods/sizes possible –we choose squares of size 2^i x 2^i with lower left corner on the texel
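The definition above can be written down directly as a brute-force reference. This is only a sketch of the definition, not the construction used in the talk; it assumes `depth[y][x]` with row 0 at the bottom, and it clips neighborhoods at the map border, which is an assumption the slides do not spell out.

```python
def nbuffer_level_bruteforce(depth, i):
    """Level i of an N-Buffer, computed straight from its definition.

    Texel (x, y) of the result holds the maximum of depth over the
    2**i x 2**i square whose lower-left corner is (x, y), clipped to the map.
    """
    h, w = len(depth), len(depth[0])
    s = 1 << i  # side of the neighborhood at this level
    level = []
    for y in range(h):
        row = []
        for x in range(w):
            row.append(max(depth[yy][xx]
                           for yy in range(y, min(y + s, h))
                           for xx in range(x, min(x + s, w))))
        level.append(row)
    return level
```

Level 0 is the depth map itself, and every level keeps the resolution of the depth map.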
13
The data structure: sequence of depth maps (levels). At level i a texel stores the maximum depth in a neighborhood of size i. [Figure: depth map (level 0), level 1, level 2, level 3; at each level the highlighted texel stores the maximum depth within the highlighted region]
14
The data structure: like an image pyramid but... –all levels have the same resolution –level 0 (depth map) can have any dimensions, not limited to powers of 2. # of levels is the log (base 2) of the largest dimension –but we might build only the first levels
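As a small worked example of the level count, assuming "log of largest dimension" means the base-2 logarithm rounded up (the rounding is an assumption), and counting level 0 separately. Both helper names are hypothetical.

```python
def num_levels(width, height):
    """Levels above level 0 needed so the last neighborhood spans the map.

    Smallest n with 2**n >= max(width, height), computed with integers
    to avoid floating-point rounding on exact powers of two.
    """
    return (max(width, height) - 1).bit_length()


def texel_count(width, height):
    """Total texels stored when all levels (including level 0) are built."""
    return (num_levels(width, height) + 1) * width * height
```

For a 640x480 depth map this gives 10 levels above level 0, since 2^10 = 1024 is the first power of two that reaches 640.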
16
Construction: level i+1 is obtained from level i. [Figure: level 0, level 1, level 2]
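The figure for this slide is not recoverable from the transcript. One recurrence consistent with 2^i x 2^i squares anchored at their lower-left corner is the following, where L_i(x, y) is a hypothetical name for the texel of level i at (x, y) and lookups past the border are assumed to be clamped to the map edge:

```latex
L_{i+1}(x, y) = \max\bigl( L_i(x, y),\; L_i(x + 2^i, y),\; L_i(x, y + 2^i),\; L_i(x + 2^i, y + 2^i) \bigr)
```

Each of the four terms covers a 2^i x 2^i square, and together they tile the 2^{i+1} x 2^{i+1} square anchored at (x, y).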
17
Construction
Can be done on the GPU
–render scene offscreen (standard z-buffer)
–copy depth to texture L[0]
–for i = 1 to n
  –set up fragment program
  –render a quad covering the viewport, with unit texcoords, with the fragment program
  –copy depth to texture L[i]
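A CPU sketch of the loop above, with the fragment program replaced by a plain Python loop. It assumes each pass combines four texels of the previous level at offsets 0 and 2^i in x and y, with clamp-to-edge addressing; that choice is an assumption consistent with the lower-left-anchored squares, not something the transcript states.

```python
def build_nbuffers(depth, n_levels):
    """Build levels 0..n_levels of an N-Buffer from a depth map.

    depth[y][x] is the depth map (level 0). Each pass plays the role of one
    full-viewport quad: every texel of the new level is the max of four
    texels of the previous level.
    """
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]  # level 0 is a copy of the depth map
    for i in range(n_levels):
        prev = levels[-1]
        off = 1 << i  # side of the squares stored in the previous level
        cur = []
        for y in range(h):
            y2 = min(y + off, h - 1)  # clamp-to-edge lookup
            row = []
            for x in range(w):
                x2 = min(x + off, w - 1)
                row.append(max(prev[y][x], prev[y][x2],
                               prev[y2][x], prev[y2][x2]))
            cur.append(row)
        levels.append(cur)
    return levels
```

Every pass does a fixed four lookups per texel, so the cost of the passes depends only on resolution and on how many levels are built.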
21
Construction: similar to matrix reduction... –Buck and Purcell, GPU Gems, p 626 ...but we keep full resolution –gives us locality
22
Construction: Complexity –first step depends on scene complexity –other steps depend only on resolution. Computation cost –~10ms for 640x480 –no read back (measured on a GeForce FX 6800)
28
Query: naive approach –project occludee –get screen space bbox extents + z min –get bounding neighborhood (2^5 x 2^5 in the example) –do one lookup in the matching level at the lower left corner, giving z max –compare z min and z max. [Figure: top view of viewport; levels 0 to 5]
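The steps of the naive query can be sketched as follows, assuming `levels[k][y][x]` holds the maximum depth over the 2^k x 2^k square with lower-left corner (x, y), and that the occludee has already been projected to an inclusive screen-space box and a nearest depth. The function name and argument layout are illustrative.

```python
def naive_query(levels, bbox, z_min):
    """Return True if the occludee is certainly hidden (one lookup).

    bbox is (x0, y0, x1, y1), inclusive, in pixels; z_min is the occludee's
    nearest depth; larger depth = farther.
    """
    x0, y0, x1, y1 = bbox
    size = max(x1 - x0 + 1, y1 - y0 + 1)
    k = (size - 1).bit_length()  # smallest k with 2**k >= size
    if k >= len(levels):
        # The needed level was not built: stay conservative, report "visible".
        return False
    z_max = levels[k][y0][x0]  # max occluder depth over the bounding square
    return z_min > z_max
```

A box of 17 pixels on a side is rounded up to a 32 x 32 neighborhood here, which is where the over-conservativeness noted on the next slide comes from.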
29
Query: the naive approach is overly conservative –(bounding volume of occludee) –screenspace bbox –bounding neighborhood (2^5 x 2^5). Need a tighter coverage. [Figure: top view of viewport; levels 0 to 5]
32
4 tiles coverage: depth max in region ≥ depth max in sub-region. Cover the screenspace bbox with four 2^4 x 2^4 tiles instead of the 2^5 x 2^5 bounding neighborhood, and use z max = max(z1, z2, z3, z4), which is ≤ the z max of the bounding neighborhood.
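A sketch of one way to place the four tiles, assuming they are pinned to the corners of the screen-space box and may overlap. The talk mentions several covering configurations; this is just one simple choice, and the function name is hypothetical. `levels[k][y][x]` is assumed to hold the maximum depth over the 2^k x 2^k square with lower-left corner (x, y).

```python
def four_tile_query(levels, bbox, z_min):
    """Return True if the occludee is certainly hidden (four lookups).

    Uses tiles one level below the bounding neighborhood: side s = 2**k
    with 2*s >= the larger side of the box, so four overlapping tiles
    anchored at the box corners cover it.
    """
    x0, y0, x1, y1 = bbox
    size = max(x1 - x0 + 1, y1 - y0 + 1)
    k = max((size - 1).bit_length() - 1, 0)
    if k >= len(levels):
        return False  # level not built: conservative answer
    s = 1 << k
    # Lower-left corners of the right/top tiles, never left of / below the box.
    xb = max(x0, x1 - s + 1)
    yb = max(y0, y1 - s + 1)
    level = levels[k]
    z_max = max(level[y0][x0], level[y0][xb], level[yb][x0], level[yb][xb])
    return z_min > z_max
```

For a 3 x 3 box the naive query reads a 4 x 4 neighborhood, while this version reads four 2 x 2 tiles that cover exactly the box, so a far occluder pixel just outside the box no longer prevents culling.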
33
4 tiles coverage 5 ways of covering with 4 squares Measure of the gain on over-conservativity
34
Applications Occlusion culling Particles Shadow volume clamping
36
Occlusion Culling N-Buffer vs. Occlusion Queries –walkthrough in city-like scene –occluders at frame n = visible at frame n-1 Measured the number of depth tests –testing each building –using a hierarchy of bounding volumes
37
Occlusion Culling: occlusion queries are faster –hardware implementation, available API –N-Buffers are penalized by the computation of the 4 tiles coverage on the CPU and the use of glReadPixels to query levels. Occlusion queries can be interleaved with rendering [BWPP04]
39
Occlusion Culling: # of depth tests smaller with N-Buffers –4 tests/occludee << number of pixels rasterized. N-Buffers always benefit from hierarchy –testing A is cheaper than testing children(A) –not the case for OQ: a parent may rasterize n pixels with n > n1 + n2, where n1 and n2 are the pixels rasterized for its children
40
Hardware implementation? Extra memory to store levels Dedicated component for level updates –not all levels? –lazy updates? Faster than OQ for large objects Fixed (4) number of operations –simple implementation –good for parallelism
41
Applications Occlusion culling Particles Shadow volume clamping
43
Particles: particles rendered using ARB_point_sprite –no need to compute quad on CPU. Particles animated within the GPU –up to a million particles in real-time. How to cull unseen particles? –cannot use OQ!
44
Particles: using N-Buffers –for 16x16 point sprites, compute the first 4 levels only –do one texture lookup in the vertex program. Not implementable yet –vertex program lookups require LUMINANCE_FLOAT32_ATI –N-Buffers require DEPTH_COMPONENT
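A CPU sketch of the single-lookup test, assuming the sprite is a square centered on the projected particle and that the supplied level stores the maximum depth over a square of exactly the sprite's size anchored at its lower-left corner (level 4 for 16x16 sprites). The tuple layout `(cx, cy, z)` and the function name are illustrative; on the GPU this would be the one texture lookup in the vertex program.

```python
def cull_point_sprites(level, particles, sprite_size=16):
    """Keep only the particles that may be visible.

    level[y][x]: max depth over the sprite_size x sprite_size square with
    lower-left corner (x, y). particles: iterable of (cx, cy, z) with the
    sprite center in pixels and its depth; larger depth = farther.
    """
    h, w = len(level), len(level[0])
    half = sprite_size // 2
    visible = []
    for cx, cy, z in particles:
        # Lower-left corner of the sprite, clamped into the map.
        x = min(max(cx - half, 0), w - 1)
        y = min(max(cy - half, 0), h - 1)
        if z <= level[y][x]:  # not behind every occluder under the sprite
            visible.append((cx, cy, z))
    return visible
```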
45
Applications Occlusion culling Particles Shadow volume clamping
46
Shadow volumes clamping Ignore unseen or fully shadowed casters Clamp shadow volume to shadowed area [LWGM04]
47
Shadow volumes clamping: from the light’s view, what part of the (visible) scene does a shadow volume encompass? [Figure: light, camera, scene]
49
The litmap: light’s view of what’s seen by the viewer. [Figure: camera’s view, light’s view]
51
Shadow volumes clamping: from the light’s view, what part of the (visible) scene does a shadow volume encompass? The minimum/maximum depth covered by a shadow caster
52
Shadow volumes clamping Compute two litmaps –furthest visible parts –closest visible parts Compute N-Buffers for both For each shadow caster –use N-Buffers to lookup min/max visible parts –cull and clamp accordingly Can be done in vertex program [BS03]
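A sketch of the "cull and clamp accordingly" step, assuming the two N-Buffer lookups have already produced the closest and furthest visible depths under the caster's footprint, all measured from the light. The function name, the return convention, and the use of an inverted range to mean "nothing visible here" are assumptions, not details given in the talk.

```python
def clamp_shadow_volume(z_caster, z_near_visible, z_far_visible):
    """Return the (start, end) depth range to extrude, or None to cull.

    z_caster: depth of the shadow caster from the light.
    z_near_visible / z_far_visible: closest and furthest depths of visible
    scene parts under the caster's footprint in the litmaps.
    """
    # No visible scene under the footprint (assumed encoded as near > far).
    if z_near_visible > z_far_visible:
        return None
    # Everything visible lies in front of the caster: it shadows nothing seen.
    if z_far_visible <= z_caster:
        return None
    # Extrude only across the span where visible receivers can exist.
    start = max(z_caster, z_near_visible)
    return (start, z_far_visible)
```

A caster at depth 0.2 over visible parts spanning 0.5 to 0.8 would have its volume clamped to that span instead of being extruded to infinity.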
53
Shadow volumes clamping
54
Simpler than CC shadow volumes [LWGM04] –single slice –not optimized (no hardware support). Reduces the fill rate by more than 80%
55
Conclusion Novel representation for depth maps –for encoding depth information over a region –fast to compute possible implementation on hardware –fixed number of operations for query queries available in vertex/fragment programs Applications –can’t beat (yet) hardware optimized approaches –more a proof of concept
56
Future work: not limited to culling –depth maps used for relief [OB00,PNC05]. Other neighborhood bases –RIP maps [KLK00]. Link to the theory of Wavelet Zoom [Mal01]
57
Questions