N-Buffers for efficient depth map query Xavier Décoret Artis GRAVIR/IMAG INRIA.

1 N-Buffers for efficient depth map query Xavier Décoret Artis GRAVIR/IMAG INRIA

2 Context Real-time rendering Visibility culling –quickly reject what’s not visible, i.e. what won’t affect any pixel in the final image Many methods available [COCSD02,PT02]

3 Occlusion maps Select potential occluders [LG95,KCCO00] –project and rasterize them –store distance to closest one at each pixel (Z buffer / occlusion map / depth map) Traverse potential occludees –project and rasterize them –test visibility of each fragment by depth comparison against the depth map –use bounding volumes –do it hierarchically
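The per-fragment depth comparison described on this slide can be sketched on the CPU as follows. This is only an illustration, not the author's implementation: the names `depth_map`, `fragments` and `is_occluded` are hypothetical, and the sketch assumes that larger depth values are farther from the viewer.

```python
def is_occluded(depth_map, fragments):
    """Return True if every fragment lies behind the stored occluder depth.

    depth_map[y][x] holds the distance to the closest occluder at that pixel.
    fragments is an iterable of (x, y, z) tuples obtained by rasterizing the
    occludee (or its bounding volume). Assumes larger z means farther away.
    """
    return all(z > depth_map[y][x] for (x, y, z) in fragments)


# 2x2 depth map: an occluder at depth 0.3 everywhere except one empty pixel
# (1.0 stands for the far plane, i.e. nothing was drawn there).
depth_map = [[0.3, 0.3],
             [0.3, 1.0]]

hidden = is_occluded(depth_map, [(0, 0, 0.5), (1, 0, 0.6)])
visible = is_occluded(depth_map, [(0, 0, 0.5), (1, 1, 0.6)])
```

The cost of this test grows with the number of fragments, which is what the optimizations on the following slides try to reduce.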

4 Optimizations Reduce number of pixels tested –Hierarchical Z Buffer [ZMHH97]

5 Optimizations Reduce number of pixels tested –Hierarchical Z Buffer [ZMHH97] –Lazy Occlusion Grid [HTP01] –Summed Area Tables [HW99] Use hardware Z buffer –implemented for hidden face removal with optimizations [Mor00, AMN03] –exposed through Occlusion Queries

6 Occlusion queries Return the # of pixels that would pass the z test if some geometry were rendered in the current framebuffer Hardware-assisted culling [HSLM02,BWPP04] Other applications [TPK01] –culling & clamping of shadow volumes [LWGM04] –LOD selection [ASVNB00]

7 Motivation for N-Buffers Query depth map within GPU –Advantages: reduces communication with CPU; allows geometry to be discarded/optimized on the GPU –Constraints: limited # of operations; complex data structures unavailable (no pointers and lists); “complex” algorithms prohibited (branching and indirections costly)

8 Task at hand For a given object, find the maximum depth covered by its projection Depth map accessed as a texture –Lookups give information at one pixel –We need information over a region Use texture to encode depth over a region –proximity grids

9 The datastructure Sequence of depth maps (levels) At level i a texel stores maximum depth in a neighborhood of size i –various neighborhoods/sizes possible –we choose squares of size 2^i x 2^i with lower left corner on the texel

10 The datastructure Sequence of depth maps (levels) At level i a texel stores maximum depth in a neighborhood of size i [Figure: level 0 (the depth map)]

11 The datastructure Sequence of depth maps (levels) At level i a texel stores maximum depth in a neighborhood of size i [Figure: level 0 (the depth map) and level 1; a texel stores the maximum depth within its region]

12 The datastructure Sequence of depth maps (levels) At level i a texel stores maximum depth in a neighborhood of size i [Figure: levels 0 (the depth map), 1 and 2; a texel stores the maximum depth within its region]

13 The datastructure Sequence of depth maps (levels) At level i a texel stores maximum depth in a neighborhood of size i [Figure: levels 0 (the depth map), 1, 2 and 3; a texel stores the maximum depth within its region]

14 The datastructure Like an image pyramid but... –all levels have same resolution –level 0 (depth map) can have any dimensions (not limited to powers of 2) # of levels is log2 of the largest dimension –but we might build only the first levels
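As a reference for what each level contains, here is a brute-force sketch of the definition. It is written for clarity, not speed. The function names are hypothetical, rows are indexed by y, squares that overhang the map border are simply clipped, and the level count is rounded up for dimensions that are not powers of two (the slides only say "log of largest dimension").

```python
import math


def num_levels(width, height):
    """Number of levels above level 0 needed for a square to span the map."""
    return math.ceil(math.log2(max(width, height)))


def level_texel(depth, i, x, y):
    """Value stored at texel (x, y) of level i.

    It is the maximum depth over the 2^i x 2^i square whose corner sits on
    (x, y), clipped to the borders of the depth map.
    """
    h, w = len(depth), len(depth[0])
    size = 1 << i
    return max(depth[v][u]
               for v in range(y, min(y + size, h))
               for u in range(x, min(x + size, w)))


# A 3x3 depth map: dimensions need not be powers of two.
depth = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

levels_640x480 = num_levels(640, 480)
corner_level1 = level_texel(depth, 1, 0, 0)   # max over a 2x2 square
whole_map = level_texel(depth, 2, 0, 0)       # 4x4 square clipped to 3x3
```

Every level has the same resolution as level 0, so the memory cost grows linearly with the number of levels that are actually built.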

15 Construction Level i+1 obtained from level i [Figure: level 0, level 1, level 2]

17 Construction Can be done on the GPU –render scene offscreen (standard z-buffer) –copy depth to texture L[0] –for i = 1 to n: set up fragment program; render a quad (covering viewport, with unit texcoords, with fragment program); copy depth to texture L[i]
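The loop above can be mimicked on the CPU to check the idea. The slides do not list the fragment program, so the per-texel operation below is an assumption: level i+1 takes the maximum of four level-i texels offset by 2^i, with coordinates clamped to the border (similar to clamp-to-edge texture addressing). The name `build_nbuffers` is hypothetical.

```python
def build_nbuffers(depth, n):
    """Return the list [L0, L1, ..., Ln] of full-resolution max-depth levels."""
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]        # L[0] is a copy of the depth map
    for i in range(n):
        prev = levels[-1]
        step = 1 << i                           # side of the level-i squares
        cur = []
        for y in range(h):
            y2 = min(y + step, h - 1)           # clamp to the last row
            row = []
            for x in range(w):
                x2 = min(x + step, w - 1)       # clamp to the last column
                # Four level-i squares tile one level-(i+1) square.
                row.append(max(prev[y][x], prev[y][x2],
                               prev[y2][x], prev[y2][x2]))
            cur.append(row)
        levels.append(cur)                      # plays the role of "copy to L[i+1]"
    return levels


depth = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
levels = build_nbuffers(depth, 2)
```

Each pass reads only the previous level and does a fixed amount of work per texel, which fits the constraint of a limited number of operations in a fragment program.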

21 Construction Similar to matrix reduction... –Buck and Purcell, GPU Gems, p 626 ...but we keep full resolution –gives us locality

22 Construction Complexity –first step depends on scene complexity –other steps depend only on resolution Computation cost –~10ms for 640x480 on GeForce FX 6800 –no read back

23 Query Naive approach [Figure: top view and viewport, levels 0 to 5]

24 Query Naive approach –project occludee [Figure: top view and viewport, levels 0 to 5]

25 Query Naive approach –project occludee –get screen space bbox extents + z min [Figure: top view and viewport, levels 0 to 5]

26 Query Naive approach –project occludee –get screen space bbox extents + z min –get bounding neighborhood [Figure: top view and viewport, levels 0 to 5, bounding neighborhood of 2^5 x 2^5]

27 Query Naive approach –project occludee –get screen space bbox extents + z min –get bounding neighborhood –do one lookup in matching level at lower left corner to get z max [Figure: top view and viewport, levels 0 to 5, bounding neighborhood of 2^5 x 2^5]

28 Query Naive approach –project occludee –get screen space bbox extents + z min –get bounding neighborhood –do one lookup in matching level at lower left corner to get z max –compare z min and z max [Figure: top view and viewport, levels 0 to 5, bounding neighborhood of 2^5 x 2^5]
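The steps of the naive query can be sketched as below. To keep the example self-contained, `lookup` is a brute-force stand-in for a single texture fetch in level i; in the real structure that value is precomputed. All names are hypothetical, the bbox is given as inclusive pixel extents, and larger depth is assumed to mean farther away.

```python
def lookup(depth, i, x, y):
    """Stand-in for one fetch in level i: max depth over the 2^i square at (x, y)."""
    h, w = len(depth), len(depth[0])
    s = 1 << i
    return max(depth[v][u]
               for v in range(y, min(y + s, h))
               for u in range(x, min(x + s, w)))


def naive_query(depth, bbox, z_min):
    """Return True if the occludee is certainly hidden (conservative test)."""
    x0, y0, x1, y1 = bbox                        # inclusive screen-space extents
    extent = max(x1 - x0 + 1, y1 - y0 + 1)
    i = (extent - 1).bit_length()                # smallest i with 2^i >= extent
    z_max = lookup(depth, i, x0, y0)             # one lookup at the lower left corner
    return z_min > z_max                         # hidden if behind the farthest occluder


# 8x8 depth map: occluders at depth 0.4, except an empty column at x = 7.
depth = [[1.0 if x == 7 else 0.4 for x in range(8)] for y in range(8)]

hidden = naive_query(depth, (1, 1, 3, 3), 0.5)
missed = naive_query(depth, (4, 1, 6, 3), 0.5)
```

In the second call the 3x3 bbox is entirely behind occluders, but its 4x4 bounding neighborhood reaches the empty column, so the occludee is reported as potentially visible.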

29 Query Naive approach Overly conservative –(bvolume of occludee) –screenspace bbox –bounding neighborhood Need a tighter coverage [Figure: top view and viewport, levels 0 to 5, bounding neighborhood of 2^5 x 2^5]

30 4 tiles coverage depth max in region ≥ depth max in sub-region [Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood, covered by 2^4 x 2^4 tiles; z ≤ z max]

31 4 tiles coverage depth max in region ≥ depth max in sub-region [Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood, covered by 2^4 x 2^4 tiles; z ≤ z max]

32 4 tiles coverage depth max in region ≥ depth max in sub-region z max = max(z1, z2, z3, z4) [Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood, covered by four 2^4 x 2^4 tiles with depths z1, z2, z3, z4; z ≤ z max]
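A possible sketch of the four-lookup query follows. The tile placement used here (four squares one level below the bounding neighborhood, anchored at the corners of the bbox) is an assumption chosen for simplicity, not necessarily the configuration used by the author. `lookup` is again a brute-force stand-in for a fetch in a precomputed level, and all names are hypothetical.

```python
def lookup(depth, i, x, y):
    """Stand-in for one fetch in level i: max depth over the 2^i square at (x, y)."""
    h, w = len(depth), len(depth[0])
    s = 1 << i
    return max(depth[v][u]
               for v in range(y, min(y + s, h))
               for u in range(x, min(x + s, w)))


def four_tile_query(depth, bbox, z_min):
    """Return True if the occludee is certainly hidden, using four lookups."""
    x0, y0, x1, y1 = bbox                        # inclusive screen-space extents
    extent = max(x1 - x0 + 1, y1 - y0 + 1)
    i = (extent - 1).bit_length()                # bounding neighborhood: 2^i x 2^i
    k = max(i - 1, 0)                            # tiles: 2^k x 2^k
    s = 1 << k
    xs = (x0, max(x0, x1 - s + 1))               # left-anchored and right-anchored
    ys = (y0, max(y0, y1 - s + 1))               # bottom-anchored and top-anchored
    z_max = max(lookup(depth, k, x, y) for y in ys for x in xs)
    return z_min > z_max


# 8x8 depth map: occluders at depth 0.4, except an empty column at x = 7.
depth = [[1.0 if x == 7 else 0.4 for x in range(8)] for y in range(8)]

hidden = four_tile_query(depth, (4, 1, 6, 3), 0.5)
visible = four_tile_query(depth, (5, 1, 7, 3), 0.5)
```

For the bbox (4, 1, 6, 3) the four 2x2 tiles cover exactly columns 4 to 6, so the empty column at x = 7 no longer affects the result, whereas a single lookup in the 4x4 bounding neighborhood would include it.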

33 4 tiles coverage 5 ways of covering with 4 squares Measure of the gain on over-conservativity

34 Applications Occlusion culling Particles Shadow volume clamping

36 Occlusion Culling N-Buffer vs. Occlusion Queries –walkthrough in city-like scene –occluders at frame n = visible at frame n-1 Measured the number of depth tests –testing each building –using a hierarchy of bounding volumes

37 Occlusion Culling Occlusion queries are faster –hardware implementation, available API –N-Buffers penalized by computation of 4 tiles coverage on CPU and use of glReadPixels to query levels Occlusion queries can be interleaved with rendering [BWPP04]

38 Occlusion Culling # of depth tests smaller with N-Buffers –4 tests/occludee << # of pixels rasterized N-Buffers always benefit from hierarchy –testing A cheaper than testing children(A) –not the case for OQ

39 Occlusion Culling # of depth tests smaller with N-Buffers –4 tests/occludee << # of pixels rasterized N-Buffers always benefit from hierarchy –testing A cheaper than testing children(A) –not the case for OQ [Figure: a node covering n pixels and its two children covering n1 and n2 pixels; n > n1 + n2]

40 Hardware implementation? Extra memory to store levels Dedicated component for level updates –not all levels? –lazy updates? Faster than OQ for large objects Fixed (4) number of operations –simple implementation –good for parallelism

41 Applications Occlusion culling Particles Shadow volume clamping

42 Particles Particles rendered using ARB_point_sprite –no need to compute quad on CPU Particles animated within GPU –up to a million particles in real-time

43 Particles Particles rendered using ARB_point_sprite –no need to compute quad on CPU Particles animated within GPU –up to a million particles in real-time How to cull unseen particles? –cannot use OQ!

44 Particles Using N-Buffers –for 16x16 point sprites: compute the first 4 levels only; do one texture lookup in vertex program Not implementable yet –vertex program lookups require LUMINANCE_FLOAT32_ATI –N-Buffers require DEPTH_COMPONENT

45 Applications Occlusion culling Particles Shadow volume clamping

46 Shadow volumes clamping Ignore unseen or fully shadowed casters Clamp shadow volume to shadowed area [LWGM04]

47 Shadow volumes clamping From light’s view, what part of the (visible) scene does a shadow volume encompass? [Figure: light, camera, scene]

49 The litmap Light’s view of what’s seen by the viewer [Figure: camera’s view, light’s view]

51 Shadow volumes clamping From light’s view, what part of the (visible) scene does a shadow volume encompass? Minimum/maximum depth covered by a shadow caster

52 Shadow volumes clamping Compute two litmaps –furthest visible parts –closest visible parts Compute N-Buffers for both For each shadow caster –use N-Buffers to lookup min/max visible parts –cull and clamp accordingly Can be done in vertex program [BS03]
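One way to read the cull-and-clamp step is sketched below. It is an interpretation, not the author's code: the brute-force `region_extreme` stands in for lookups in a min N-Buffer and a max N-Buffer, depths are distances from the light, every litmap texel is assumed to hold a valid value, and all names are hypothetical.

```python
def region_extreme(litmap, bbox, pick):
    """Min or max of a litmap over an inclusive bbox (stand-in for N-Buffer lookups)."""
    x0, y0, x1, y1 = bbox
    return pick(litmap[y][x]
                for y in range(y0, y1 + 1)
                for x in range(x0, x1 + 1))


def clamp_caster(near_map, far_map, bbox, z_caster):
    """Return None to cull the caster, else the (start, end) depth range to keep.

    near_map / far_map: closest / furthest visible depth seen from the light.
    bbox: extents of the caster in the light's view; z_caster: its depth.
    """
    z_near = region_extreme(near_map, bbox, min)
    z_far = region_extreme(far_map, bbox, max)
    if z_caster >= z_far:
        return None                       # all visible geometry is in front of the caster
    return (max(z_caster, z_near), z_far) # shadow volume only matters in this range


near_map = [[2.0, 2.5],
            [3.0, 2.0]]
far_map = [[4.0, 4.5],
           [5.0, 4.0]]

clamped = clamp_caster(near_map, far_map, (0, 0, 1, 1), 1.0)
culled = clamp_caster(near_map, far_map, (0, 0, 1, 1), 6.0)
```

A caster located behind everything the viewer can see casts no visible shadow and is culled; otherwise its shadow volume only needs to span the depth range that contains visible geometry.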

53 Shadow volumes clamping

54 Simpler than CC shadow volumes [LWGM04] –single slice –not optimized (no hardware support) Reduces the fill rate by more than 80%

55 Conclusion Novel representation for depth maps –for encoding depth information over a region –fast to compute possible implementation on hardware –fixed number of operations for query queries available in vertex/fragment programs Applications –can’t beat (yet) hardware optimized approaches –more a proof of concept

56 Future work Not limited to culling –depth maps used for relief [OB00,PNC05] Other neighborhood bases –RIP maps [KLK00] Link to theory of Wavelet Zoom [Mal01]

57 Questions

