N-Buffers for efficient depth map query Xavier Décoret Artis GRAVIR/IMAG INRIA.


Context
Real-time rendering
Visibility culling
– quickly reject what’s not visible
  - what won’t affect any pixel in the final image
Many methods available [COCSD02,PT02]

Occlusion maps
Select potential occluders [LG95,KCCO00]
– project and rasterize them
– store distance to closest one at each pixel
  - Z buffer / occlusion map / depth map
Traverse potential occludees
– project and rasterize them
– test visibility of each fragment
  - depth comparison against depth map
  - use bounding volumes
  - do it hierarchically
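The per-fragment test described above can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the depth map is a plain list of rows, larger values are farther from the viewer, and the function name `is_occluded_per_pixel` is hypothetical rather than taken from the talk.

```python
from typing import List

DepthMap = List[List[float]]  # depth[y][x]; assumption: larger value = farther away


def is_occluded_per_pixel(depth: DepthMap, x0: int, y0: int,
                          x1: int, y1: int, z_min: float) -> bool:
    """Conservative test on a screen-space bounding box [x0, x1] x [y0, y1].

    The occludee is reported hidden only if its nearest depth z_min lies
    behind the occluder depth stored at every pixel the box covers.
    """
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if z_min <= depth[y][x]:
                return False  # this pixel could show the occludee
    return True


# 4x4 occlusion map: an occluder at depth 0.3 covers the two left columns,
# the two right columns are empty (far plane at depth 1.0).
occlusion_map = [[0.3, 0.3, 1.0, 1.0] for _ in range(4)]
behind_occluder = is_occluded_per_pixel(occlusion_map, 0, 0, 1, 3, z_min=0.5)
partly_uncovered = is_occluded_per_pixel(occlusion_map, 1, 0, 2, 3, z_min=0.5)
print(behind_occluder, partly_uncovered)
```

The cost of this test grows with the number of pixels covered by the box, which is what the optimizations on the following slides try to reduce.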

Optimizations
Reduce number of pixels tested
– Hierarchical Z Buffer [ZMHH97]
– Lazy Occlusion Grid [HTP01]
– Summed Area Tables [HW99]
Use hardware Z buffer
– implemented for hidden face removal, with optimizations [Mor00, AMN03]
– exposed through Occlusion Queries

Occlusion queries
Return the # of pixels that would pass the z test if some geometry were rendered in the current framebuffer
Hardware-assisted culling [HSLM02,BWPP04]
Other applications [TPK01]
– culling & clamping of shadow volumes [LWGM04]
– LOD selection [ASVNB00]

Motivation for N-Buffers
Query depth map within GPU
– Advantages
  - reduce communication with CPU
  - allow geometry to be discarded/optimized on GPU
– Constraints
  - limited # of operations
  - complex data structures unavailable (no pointers and lists)
  - “complex” algorithms prohibited (branching and indirections costly)

Task at hand
For a given object, find the maximum depth covered by its projection
Depth map accessed as a texture
– lookups give information at one pixel
– we need information over a region
Use texture to encode depth over a region
– proximity grids

The datastructure
Sequence of depth maps (levels)
At level i, a texel stores the maximum depth in a neighborhood whose size depends on i
– various neighborhoods/sizes possible
– we choose squares of size 2^i x 2^i with lower left corner on the texel

[Figure: level 0 (depth map), level 1, level 2, level 3; at each level, the highlighted texel stores the maximum depth within the highlighted region]

The datastructure
Like an image pyramid but...
– all levels have the same resolution
– level 0 (depth map) can have any dimensions, not limited to powers of 2
# of levels is log (base 2) of largest dimension
– but we might build only the first levels
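To make the definition concrete, here is a brute-force sketch that computes one level directly from the definition, on a depth map whose dimensions are not powers of 2. The names `num_levels` and `level_by_definition` are hypothetical, and clipping squares at the image border is an assumption the slides do not spell out.

```python
from typing import List

Grid = List[List[float]]  # grid[y][x]


def num_levels(width: int, height: int) -> int:
    """Smallest n such that a 2^n x 2^n square spans the largest dimension."""
    return (max(width, height) - 1).bit_length()


def level_by_definition(depth: Grid, i: int) -> Grid:
    """Level i: each texel holds the max depth over the 2^i x 2^i square whose
    lower left corner is that texel (assumption: clipped at the border)."""
    h, w = len(depth), len(depth[0])
    s = 1 << i
    return [
        [max(depth[yy][xx]
             for yy in range(y, min(y + s, h))
             for xx in range(x, min(x + s, w)))
         for x in range(w)]
        for y in range(h)
    ]


# A 5x3 depth map: every level keeps the same 5x3 resolution.
depth = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 0],
    [2, 4, 6, 8, 1],
]
level1 = level_by_definition(depth, 1)
level2 = level_by_definition(depth, 2)
print(num_levels(5, 3), level1[0][0], level2[0][0])
```

This direct evaluation touches up to 2^i x 2^i texels per output texel; it is meant as a reference for checking faster constructions, not as the way the levels would be built in practice.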

Construction
Level i+1 obtained from level i
[Figure: level 0, level 1, level 2]

Construction
Can be done on the GPU
– render scene offscreen (standard z-buffer)
– copy depth to texture L[0]
– for i = 1 to n
  - setup fragment program
  - render a quad covering viewport, with unit texcoords, with fragment program
  - copy depth to texture L[i]

Construction
Similar to matrix reduction...
– Buck and Purcell, GPU Gems, p
... but we keep full resolution
– gives us locality
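The construction pass can be emulated on the CPU as a sketch. It assumes that each texel of level i+1 is the maximum of four texels of level i offset by 2^i, with coordinates clamped to the image edge; the talk only states that level i+1 is obtained from level i, so the exact sampling pattern here is an assumption and `build_nbuffers` is a hypothetical name.

```python
from typing import List

Grid = List[List[float]]  # grid[y][x]


def build_nbuffers(depth: Grid, n: int) -> List[Grid]:
    """Return levels 0..n, all at the resolution of the input depth map."""
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]  # level 0 is the depth map itself
    for i in range(n):
        prev, s = levels[-1], 1 << i  # s = side of the squares stored in level i
        levels.append([
            [max(prev[y][x],
                 prev[y][min(x + s, w - 1)],                   # right neighbor square
                 prev[min(y + s, h - 1)][x],                   # upper neighbor square
                 prev[min(y + s, h - 1)][min(x + s, w - 1)])   # diagonal neighbor square
             for x in range(w)]
            for y in range(h)
        ])
    return levels


depth = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 0],
    [2, 4, 6, 8, 1],
]
levels = build_nbuffers(depth, 3)
print(levels[1][0][0], levels[2][0][0], levels[3][2][4])
```

Each pass reads four texels per output texel whatever the level, so the cost of the passes after the first depends only on the resolution and the number of levels.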

Construction
Complexity
– first step depends on scene complexity
– other steps depend only on resolution
Computation cost
– ~10ms for 640x480 (GeForce FX 6800)
– no read back

Query
Naive approach
– project occludee
– get screen space bbox extents + z_min
– get bounding neighborhood
– do one lookup in matching level at lower left corner
– compare z_min and z_max
[Figure: top view and viewport; levels 0 to 5; the bounding neighborhood is 2^5 x 2^5 and the lookup in level 5 returns z_max]
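A CPU sketch of the naive query follows. It assumes the bounding neighborhood is the smallest 2^k x 2^k square anchored at the lower left corner of the bounding box, that larger depth means farther away, and that all levels were built; `build_nbuffers` and `naive_query` are hypothetical names.

```python
from typing import List

Grid = List[List[float]]  # grid[y][x]; assumption: larger depth = farther away


def build_nbuffers(depth: Grid, n: int) -> List[Grid]:
    """Levels 0..n; level i holds the max depth over 2^i x 2^i squares."""
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]
    for i in range(n):
        prev, s = levels[-1], 1 << i
        levels.append([
            [max(prev[y][x], prev[y][min(x + s, w - 1)],
                 prev[min(y + s, h - 1)][x],
                 prev[min(y + s, h - 1)][min(x + s, w - 1)])
             for x in range(w)]
            for y in range(h)
        ])
    return levels


def naive_query(levels: List[Grid], x0: int, y0: int,
                x1: int, y1: int, z_min: float) -> bool:
    """True if the occludee with bbox [x0, x1] x [y0, y1] is surely hidden."""
    side = max(x1 - x0 + 1, y1 - y0 + 1)
    k = (side - 1).bit_length()        # smallest k with 2^k >= side
    if k >= len(levels):
        raise ValueError("not enough levels built for this bounding box")
    z_max = levels[k][y0][x0]          # one lookup at the lower left corner
    return z_min > z_max


# 8x8 depth map: occluder at depth 0.3 over columns 0..4, far plane elsewhere.
depth = [[0.3] * 5 + [1.0] * 3 for _ in range(8)]
levels = build_nbuffers(depth, 3)
hidden_a = naive_query(levels, 1, 1, 3, 3, z_min=0.5)  # 4x4 square covers columns 1..4
hidden_b = naive_query(levels, 2, 1, 4, 3, z_min=0.5)  # 4x4 square spills onto column 5
print(hidden_a, hidden_b)
```

In the second call the box itself lies entirely behind the occluder, but the 4x4 bounding neighborhood reaches an uncovered column, so the query cannot report it as hidden.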

Query
Naive approach
Overly conservative
– (bvolume of occludee)
– screenspace bbox
– bounding neighborhood
Need a tighter coverage
[Figure: top view and viewport; levels 0 to 5; bounding neighborhood of 2^5 x 2^5]

4 tiles coverage
depth max in region ≥ depth max in sub-region
z_max = max(z1, z2, z3, z4)
[Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood (z ≤ z_max), then covered by four 2^4 x 2^4 tiles with depths z1, z2, z3, z4]
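One way to realize the 4 tiles coverage is sketched below: four lookups one level below the bounding neighborhood, with one tile pushed into each corner of the bounding box. The tile placement is an assumption (the slides mention several ways of covering), and `build_nbuffers` and `four_tile_query` are hypothetical names.

```python
from typing import List

Grid = List[List[float]]  # grid[y][x]; assumption: larger depth = farther away


def build_nbuffers(depth: Grid, n: int) -> List[Grid]:
    """Levels 0..n; level i holds the max depth over 2^i x 2^i squares."""
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]
    for i in range(n):
        prev, s = levels[-1], 1 << i
        levels.append([
            [max(prev[y][x], prev[y][min(x + s, w - 1)],
                 prev[min(y + s, h - 1)][x],
                 prev[min(y + s, h - 1)][min(x + s, w - 1)])
             for x in range(w)]
            for y in range(h)
        ])
    return levels


def four_tile_query(levels: List[Grid], x0: int, y0: int,
                    x1: int, y1: int, z_min: float) -> bool:
    """True if the occludee with bbox [x0, x1] x [y0, y1] is surely hidden."""
    side = max(x1 - x0 + 1, y1 - y0 + 1)
    k = max((side - 1).bit_length() - 1, 0)  # one level below the bounding neighborhood
    if k >= len(levels):
        raise ValueError("not enough levels built for this bounding box")
    s = 1 << k                                # tile side; 2 * s >= side
    xs = (x0, max(x0, x1 - s + 1))            # left tile and right tile
    ys = (y0, max(y0, y1 - s + 1))            # bottom tile and top tile
    z_max = max(levels[k][y][x] for y in ys for x in xs)  # max(z1, z2, z3, z4)
    return z_min > z_max


# 8x8 depth map: occluder at depth 0.3 over columns 0..4, far plane elsewhere.
depth = [[0.3] * 5 + [1.0] * 3 for _ in range(8)]
levels = build_nbuffers(depth, 3)
hidden_tight = four_tile_query(levels, 2, 1, 4, 3, z_min=0.5)  # tiles stay on columns 2..4
hidden_open = four_tile_query(levels, 3, 1, 5, 3, z_min=0.5)   # box really reaches column 5
print(hidden_tight, hidden_open)
```

For the 3x3 box on columns 2..4, the four 2x2 tiles cover exactly the box, so it is reported hidden, whereas a single 4x4 neighborhood anchored at its lower left corner would also include column 5.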

4 tiles coverage
5 ways of covering with 4 squares
Measure of the gain on over-conservativity

Applications
Occlusion culling
Particles
Shadow volume clamping

Occlusion Culling
N-Buffer vs. Occlusion Queries
– walkthrough in city-like scene
– occluders at frame n = visible at frame n-1
Measured the number of depth tests
– testing each building
– using a hierarchy of bounding volumes

Occlusion Culling
Occlusion queries are faster
– hardware implementation, available API
– N-Buffers penalized by
  - computation of 4 tiles coverage on CPU
  - use of glReadPixels to query levels
Occlusion queries can be interleaved with rendering [BWPP04]

Occlusion Culling
# of depth tests smaller with N-Buffers
– 4 tests/occludee << nb of pixels rasterized
N-Buffers always benefit from hierarchy
– testing A cheaper than testing children(A)
– not the case for OQ: n > n1 + n2

Hardware implementation?
Extra memory to store levels
Dedicated component for level updates
– not all levels?
– lazy updates?
Faster than OQ for large objects
Fixed number (4) of operations
– simple implementation
– good for parallelism

Particles
Particles rendered using ARB_point_sprite
– no need to compute quad on CPU
Particles animated within GPU
– up to a million particles in real time
How to cull unseen particles?
– cannot use OQ!

Particles
Using N-Buffers
– for 16x16 point sprites
  - compute 4 first levels only
  - do one texture lookup in vertex program
Not implementable yet
– vertex program lookups require LUMINANCE_FLOAT32_ATI
– N-Buffers require DEPTH_COMPONENT
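The single-lookup idea for 16x16 point sprites can be emulated on the CPU as a sketch. It assumes sprites are centered on the particle, that level 4 (16x16 squares) is read at the sprite's lower left corner, and that larger depth means farther away; `build_nbuffers` and `cull_particles` are hypothetical names, and this is not the vertex program the talk has in mind.

```python
from typing import List, Tuple

Grid = List[List[float]]            # grid[y][x]; assumption: larger depth = farther away
Particle = Tuple[int, int, float]   # (center x, center y, depth)

SPRITE = 16   # point sprite side in pixels
LEVEL = 4     # 2^4 = 16, so level 4 squares match the sprite footprint


def build_nbuffers(depth: Grid, n: int) -> List[Grid]:
    """Levels 0..n; level i holds the max depth over 2^i x 2^i squares."""
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]
    for i in range(n):
        prev, s = levels[-1], 1 << i
        levels.append([
            [max(prev[y][x], prev[y][min(x + s, w - 1)],
                 prev[min(y + s, h - 1)][x],
                 prev[min(y + s, h - 1)][min(x + s, w - 1)])
             for x in range(w)]
            for y in range(h)
        ])
    return levels


def cull_particles(level4: Grid, particles: List[Particle]) -> List[Particle]:
    """Keep the particles that might be visible, using one lookup each."""
    h, w = len(level4), len(level4[0])
    kept = []
    for cx, cy, z in particles:
        # Lower left corner of the sprite, clamped into the image.
        x0 = min(max(cx - SPRITE // 2, 0), w - 1)
        y0 = min(max(cy - SPRITE // 2, 0), h - 1)
        if z <= level4[y0][x0]:
            kept.append((cx, cy, z))
    return kept


# 32x32 depth map: a wall at depth 0.3 over columns 0..19, far plane elsewhere.
depth = [[0.3] * 20 + [1.0] * 12 for _ in range(32)]
level4 = build_nbuffers(depth, LEVEL)[LEVEL]   # only the first 4 levels are built
particles = [(8, 16, 0.6), (16, 16, 0.6), (8, 16, 0.2)]
visible = cull_particles(level4, particles)
print(visible)
```

Because the sprite and the level 4 square have the same footprint, one lookup is exact for on-screen sprites; clamping at the border only makes the test more conservative.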

Shadow volumes clamping
Ignore unseen or fully shadowed casters
Clamp shadow volume to shadowed area [LWGM04]

Shadow volumes clamping
From the light’s view, what part of the (visible) scene does a shadow volume encompass?
[Figure: light, camera, scene]

The litmap
Light view of what’s seen by viewer
[Figure: camera’s view, light’s view]

Shadow volumes clamping
From the light’s view, what part of the (visible) scene does a shadow volume encompass?
Minimum/maximum depth covered by a shadow caster

Shadow volumes clamping
Compute two litmaps
– furthest visible parts
– closest visible parts
Compute N-Buffers for both
For each shadow caster
– use N-Buffers to look up min/max visible parts
– cull and clamp accordingly
Can be done in vertex program [BS03]
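The steps above can be sketched on the CPU as follows. Several choices here are assumptions rather than details from the talk: depths are measured from the light, empty litmap texels are encoded as -inf (furthest map) and +inf (closest map), a min N-Buffer mirrors the max N-Buffer, regions are read with four tiles, and the culling rule only covers casters with nothing visible behind them. All function names are hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Grid = List[List[float]]  # grid[y][x], depths measured from the light
Reduce = Callable[[Iterable[float]], float]
INF = float("inf")


def build_nbuffers(base: Grid, n: int, op: Reduce) -> List[Grid]:
    """Levels 0..n; level i reduces 2^i x 2^i squares with op (max or min)."""
    h, w = len(base), len(base[0])
    levels = [[row[:] for row in base]]
    for i in range(n):
        prev, s = levels[-1], 1 << i
        levels.append([
            [op((prev[y][x], prev[y][min(x + s, w - 1)],
                 prev[min(y + s, h - 1)][x],
                 prev[min(y + s, h - 1)][min(x + s, w - 1)]))
             for x in range(w)]
            for y in range(h)
        ])
    return levels


def region_reduce(levels: List[Grid], x0: int, y0: int,
                  x1: int, y1: int, op: Reduce) -> float:
    """Reduce over a region that conservatively covers the bbox, with 4 lookups."""
    side = max(x1 - x0 + 1, y1 - y0 + 1)
    k = max((side - 1).bit_length() - 1, 0)
    if k >= len(levels):
        raise ValueError("not enough levels built for this bounding box")
    s = 1 << k
    xs = (x0, max(x0, x1 - s + 1))
    ys = (y0, max(y0, y1 - s + 1))
    return op(levels[k][y][x] for y in ys for x in xs)


def clamp_shadow_volume(far_levels: List[Grid], near_levels: List[Grid],
                        bbox: Tuple[int, int, int, int],
                        z_caster: float) -> Optional[Tuple[float, float]]:
    """Return the (start, end) depth range to extrude, or None to cull the caster."""
    z_far = region_reduce(far_levels, *bbox, max)
    z_near = region_reduce(near_levels, *bbox, min)
    if z_far == -INF or z_caster >= z_far:
        return None  # nothing visible behind the caster in this region
    return (max(z_caster, z_near), z_far)


# 8x8 litmaps: visible receivers between depths 2 and 5 over columns 0..3.
furthest = [[5.0] * 4 + [-INF] * 4 for _ in range(8)]
closest = [[2.0] * 4 + [INF] * 4 for _ in range(8)]
far_levels = build_nbuffers(furthest, 3, max)
near_levels = build_nbuffers(closest, 3, min)
clamped = clamp_shadow_volume(far_levels, near_levels, (1, 1, 2, 2), z_caster=1.0)
behind_all = clamp_shadow_volume(far_levels, near_levels, (1, 1, 2, 2), z_caster=6.0)
off_scene = clamp_shadow_volume(far_levels, near_levels, (5, 1, 6, 2), z_caster=1.0)
print(clamped, behind_all, off_scene)
```

A caster in front of the visible receivers gets its volume clamped to the depth range they span, while casters behind everything visible, or over regions the viewer does not see, are culled.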

Shadow volumes clamping
Simpler than CC shadow volumes [LWGM04]
– single slice
– not optimized (no hardware support)
Reduces the fill rate by more than 80%

Conclusion
Novel representation for depth maps
– for encoding depth information over a region
– fast to compute
  - possible implementation on hardware
– fixed number of operations for query
  - queries available in vertex/fragment programs
Applications
– can’t beat (yet) hardware optimized approaches
– more a proof of concept

Future work
Not limited to culling
– depth maps used for relief [OB00,PNC05]
Other neighborhood basis
– RIP maps [KLK00]
Link to theory of Wavelet Zoom [Mal01]

Questions