Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful

Presentation transcript:

Slide 1: Coherent Hierarchical Culling: Hardware Occlusion Queries Made Useful

Jiri Bittner (1), Michael Wimmer (1), Harald Piringer (2), Werner Purgathofer (1)
(1) Vienna University of Technology
(2) VRVis Vienna

Slide 2: Coherent Hierarchical Culling: Motivation

Michael Wimmer, Vienna University of Technology

[Figure: typical hardware occlusion culling scenario. CPU and GPU timelines over time; legend: R = render, Q = occlusion query, C = cull. The gaps between operations are labeled as waiting time.]

Slide 3: Occlusion Culling: Offline vs. Online

Offline
  * Global information about visibility (from region)
  - Difficult to implement
  - Accuracy and maintenance problems
  + No runtime overhead

Online
  * Local information about visibility (from point)
  + Easier to implement
  + Greater accuracy, easy maintenance
  - Runtime overhead

Slide 4: Online Occlusion Culling

Object space methods
  - Need complex geometric calculations (hard to handle detailed scenes)
  + Do not require rasterization

Image space methods
  + No geometric calculations (easier to handle detailed scenes)
  - Require rasterization

Slide 5: Hardware Occlusion Culling

Hardware is good at rasterization!
Hardware counts rasterized fragments
  * But need not update frame buffer
NV/ARB_occlusion_query
  * Asynchronous
  * Allows multiple simultaneous occlusion queries
General algorithm idea:
  * Render simple approximation first (bbox)
      invisible: cull object
      visible: render object
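The general idea on this slide can be illustrated without a GPU. The following is a minimal sketch, assuming a tiny software depth buffer stands in for the hardware: `occlusion_query` counts the fragments of a bounding rectangle that would pass the depth test without writing anything, and the object is rendered only when that count is non-zero. All names (`make_depth_buffer`, `draw_rect`, `occlusion_query`, `render_with_culling`) and the scene are hypothetical and are not part of NV/ARB_occlusion_query.

```python
# Software stand-in for a hardware occlusion query (illustrative only).
# The depth buffer is a small 2D grid; a smaller depth is closer to the camera.

def make_depth_buffer(width, height, far=1.0):
    return [[far] * width for _ in range(height)]

def draw_rect(depth, x0, y0, x1, y1, z):
    """Render an axis-aligned rectangle at depth z, updating the depth buffer."""
    for y in range(y0, y1):
        for x in range(x0, x1):
            if z < depth[y][x]:
                depth[y][x] = z

def occlusion_query(depth, x0, y0, x1, y1, z):
    """Count fragments of the rectangle that pass the depth test.
    The buffer is left untouched, mirroring a query with writes disabled."""
    return sum(1 for y in range(y0, y1) for x in range(x0, x1) if z < depth[y][x])

def render_with_culling(depth, objects):
    """objects: front-to-back list of (name, bbox, z). Returns the names rendered."""
    rendered = []
    for name, (x0, y0, x1, y1), z in objects:
        if occlusion_query(depth, x0, y0, x1, y1, z) == 0:
            continue  # bounding box fully hidden: cull the object
        draw_rect(depth, x0, y0, x1, y1, z)  # stand-in for rendering the real geometry
        rendered.append(name)
    return rendered

scene = [
    ("wall", (0, 0, 8, 4), 0.2),     # large occluder close to the camera
    ("hidden", (2, 1, 5, 3), 0.6),   # entirely behind the wall
    ("partial", (6, 2, 8, 6), 0.5),  # sticks out below the wall
]
print(render_with_culling(make_depth_buffer(8, 8), scene))
```

In this sketch the query result is available immediately; on real hardware it arrives with a delay, which is the problem the following slides address.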

Slide 6: Hardware Occlusion Culling

Advantages
  * Pixel-exact
  * No explicit occluder rendering
  * Exploit rasterization power of GPU
  * Easy to use (API calls)

Problems
  * Delay in availability of the results
  * Time to execute queries
  * If fill-bound: only useful if several objects culled

Slide 7: Hierarchical Stop&Wait (S&W)

Front-to-back hierarchy traversal
  1. Issue visibility query for node
  2. Stop and wait for result
     * Invisible: cull the subtree
     * Visible: render or continue with step 1 recursively

Advantage:
  * Hierarchy can cull huge subtrees

Problems:
  * Waiting causes CPU stalls and GPU starvation
  * Huge rasterization costs (especially for large interior nodes)
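The stop-and-wait traversal can be sketched as a plain recursion. This is an illustration under stated assumptions rather than the authors' code: the `Node` class, its `distance` and `occluded` fields, and `query_and_wait` are hypothetical, and the query result is read from a ground-truth flag instead of a GPU.

```python
class Node:
    def __init__(self, name, distance, children=(), occluded=False):
        self.name = name
        self.distance = distance        # hypothetical distance from the viewpoint
        self.children = list(children)
        self.occluded = occluded        # ground truth consulted by the fake query

def query_and_wait(node, stats):
    """Stand-in for issuing a query and blocking until its result arrives."""
    stats["queries"] += 1
    stats["waits"] += 1                 # every query costs one full round trip
    return not node.occluded

def stop_and_wait(node, stats, rendered):
    if not query_and_wait(node, stats):
        return                          # invisible: cull the whole subtree
    if not node.children:
        rendered.append(node.name)      # visible leaf: render it
        return
    # visible interior node: recurse into the children, nearest first
    for child in sorted(node.children, key=lambda c: c.distance):
        stop_and_wait(child, stats, rendered)

a = Node("A", 1.0, [Node("a1", 1.0), Node("a2", 2.0)])
b = Node("B", 5.0, [Node("b1", 5.0), Node("b2", 6.0)], occluded=True)
root = Node("root", 0.0, [a, b])

stats, rendered = {"queries": 0, "waits": 0}, []
stop_and_wait(root, stats, rendered)
print(rendered, stats)
```

The leaves under B are never queried because B is culled as a whole, but every one of the remaining queries blocks the traversal, including the queries for the large interior nodes.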

Slide 8: CPU Stalls and GPU Starvation

[Figure: CPU and GPU timelines for hierarchical stop-and-wait. Legend: Rx = render object x, Qx = query object x, Cx = cull object x. Operation sequence shown: R1 Q2, R2 Q3, C3 Q4, R4. The gaps are labeled as waiting time.]

Slide 9: Solution: Coherent Hierarchical Culling

Scheduling based on temporal coherence
  * Skipping certain visibility tests
  * Immediate rendering of certain geometry
Clever interleaving of queries and rendering
  * Maintaining a queue of running occlusion queries
Design goal: easy implementation

Slide 10: Coherent Hierarchical Culling (CHC)

[Figure: CPU and GPU timelines for CHC. Legend: Rx = render object x, Qx = query object x, Cx = cull object x. Operation sequence shown: R1 Q2, R2 Q3, C3 Q4, R4. Annotations: "visible in previous frame" and "assume independent occlusion".]

Slide 11: CHC Algorithm Outline

Front-to-back hierarchy traversal
  1. Node handling
     * Interior node
         Previously invisible: issue visibility query
         Previously visible: continue with step 1 recursively
     * Leaf
         Issue visibility query
         Previously visible: render immediately
  2. Check availability of query results
         Invisible: propagate visibility change
         Visible: render or continue with step 1 recursively
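The outline above can be sketched as a small simulation. This is a hedged illustration, not the authors' implementation: `FakeGPU` is a hypothetical stand-in that delivers query results only after a fixed number of ticks, visibility comes from a ground-truth `occluded` flag, view frustum culling is omitted, and propagating a visible result up to the ancestors is one assumed way to realize the "propagate visibility change" step.

```python
from collections import deque

class Node:
    def __init__(self, name, distance=0.0, children=(), occluded=False):
        self.name, self.distance, self.occluded = name, distance, occluded
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self
        self.visible = False        # classification from the last visit
        self.last_visited = -1      # frame in which the node was last traversed

class FakeGPU:
    """Delivers query results only after `latency` ticks (illustrative stand-in)."""
    def __init__(self, latency=3):
        self.latency, self.tick = latency, 0
    def issue(self, node):
        return (node, self.tick + self.latency)
    def available(self, query):
        return self.tick >= query[1]
    def result(self, query):
        self.tick = max(self.tick, query[1])   # blocks if the result is not ready
        return not query[0].occluded

def pull_up_visibility(node):
    while node is not None and not node.visible:
        node.visible = True
        node = node.parent

def chc_frame(root, frame_id, gpu):
    stack, queries = [root], deque()
    rendered, stats = [], {"queries": 0, "stalls": 0}

    def traverse(node):
        if not node.children:
            if node.name not in rendered:      # render each leaf once per frame
                rendered.append(node.name)
        else:  # push far children first so the nearest one is popped first
            stack.extend(sorted(node.children, key=lambda c: -c.distance))

    while stack or queries:
        # Step 2: consume finished queries; block only if no other work is left
        while queries and (gpu.available(queries[0]) or not stack):
            query = queries.popleft()
            if not gpu.available(query):
                stats["stalls"] += 1
            if gpu.result(query):
                pull_up_visibility(query[0])
                traverse(query[0])
        # Step 1: node handling in front-to-back order
        if stack:
            node = stack.pop()
            was_visible = node.visible and node.last_visited == frame_id - 1
            node.visible, node.last_visited = False, frame_id
            if not was_visible or not node.children:
                queries.append(gpu.issue(node))   # leaf or previously invisible
                stats["queries"] += 1
            if was_visible:
                traverse(node)                    # render or descend without waiting
        gpu.tick += 1
    return rendered, stats

def build_scene():
    a = Node("A", 1.0, [Node("a1", 1.0), Node("a2", 2.0)])
    b = Node("B", 5.0, [Node("b1", 5.0), Node("b2", 6.0)], occluded=True)
    return Node("root", 0.0, [a, b])

root, gpu = build_scene(), FakeGPU(latency=3)
for frame_id in range(2):
    print(frame_id, chc_frame(root, frame_id, gpu))
```

In the first frame nothing is known about visibility, so the traversal behaves much like stop-and-wait. In the second frame the previously visible interior nodes are traversed without a query and the visible leaves are rendered while their queries are still in flight, so the simulation issues fewer queries and blocks less often.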

Slide 12: Why Interleaving Works…

Processing a node only depends on…
  1. Front to back order
  2. Results of queries for processed nodes where:

  Previous frame: processed node -> current node                       S&W   CHC
  visible -> visible                                                   yes   no
  visible -> invisible                                                 yes   no
  invisible -> visible                                                 yes   no
  invisible -> invisible (different subtrees)                          yes   no
  invisible -> invisible (parent -> child, refinement of visibility)   yes   yes

Slide 13: CHC: Hierarchy Traversal

[Figure: hierarchy traversed in front-to-back order, with nodes marked as previously visible or previously invisible. Annotations: no queries for previously visible interior nodes; assume no query dependencies; hidden regions: queries depend on parents.]

Slide 14: CHC Features

Reduction of CPU stalls and GPU starvation
  * Interleaving queries with rendering previously visible geometry
Reduction of the number of queries
  * Avoids expensive redundant queries for interior nodes
  * Size of tested regions adapts to visibility
      pull-up: occluded region growing
      pull-down: visible region growing
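How the tested regions adapt can be shown on a toy hierarchy. The sketch below is an assumption-laden illustration: `query_set` is a hypothetical helper that lists the nodes that would receive a query, given the classification left over from the previous frame (visible interior nodes are skipped, traversal stops at leaves and at invisible nodes). The visibility flags are set by hand to mimic pull-up and pull-down.

```python
class Node:
    def __init__(self, name, children=(), visible=False):
        self.name, self.children, self.visible = name, list(children), visible

def query_set(node):
    """Names of the nodes that would be queried, given last frame's classification."""
    if node.visible and node.children:
        # previously visible interior node: no query, descend instead
        return [name for child in node.children for name in query_set(child)]
    return [node.name]  # leaf or previously invisible node: one query, stop here

a1, a2 = Node("a1", visible=True), Node("a2")
A = Node("A", [a1, a2], visible=True)
B = Node("B", [Node("b1"), Node("b2")])
root = Node("root", [A, B], visible=True)

print(query_set(root))            # ['a1', 'a2', 'B']

# pull-up: a1 becomes hidden too, so A is classified invisible as a whole
a1.visible = A.visible = False
print(query_set(root))            # ['A', 'B']

# pull-down: B becomes visible, so its children are tested individually
B.visible = True
print(query_set(root))            # ['A', 'b1', 'b2']
```

After the pull-up a single query on A replaces the two leaf queries, and after the pull-down the tested region under B is refined to its children.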

Slide 15: Implementation Issues

Front-to-back traversal
  * Priority queue: allows various hierarchical data structures
Checking query results
  * glGetOcclusionQueryivNV
  * GL_PIXEL_COUNT_AVAILABLE_NV
  * Very cheap operation
Queries for previously visible nodes
  * Use actual geometry as occludee (instead of bounding box)
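A priority queue keyed on distance gives a front-to-back order for any hierarchy whose nodes carry bounding volumes. The following is a minimal sketch using Python's `heapq`; the `Node` layout, the axis-aligned boxes, and the choice of "distance to the closest point of the box" as the priority are assumptions made for illustration.

```python
import heapq
import itertools
import math

class Node:
    def __init__(self, name, box_min, box_max, children=()):
        self.name, self.box_min, self.box_max = name, box_min, box_max
        self.children = list(children)

def distance_to_box(point, box_min, box_max):
    """Euclidean distance from point to the closest point of an axis-aligned box."""
    return math.sqrt(sum(max(lo - p, 0.0, p - hi) ** 2
                         for p, lo, hi in zip(point, box_min, box_max)))

def front_to_back(root, viewpoint):
    """Yield nodes in order of increasing distance from the viewpoint."""
    counter = itertools.count()  # tie-breaker so heapq never compares Node objects
    heap = [(distance_to_box(viewpoint, root.box_min, root.box_max), next(counter), root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        yield node
        for child in node.children:
            key = distance_to_box(viewpoint, child.box_min, child.box_max)
            heapq.heappush(heap, (key, next(counter), child))

r1 = Node("r1", (6, 0, 0), (8, 1, 1))
r2 = Node("r2", (8, 0, 0), (10, 1, 1))
right = Node("right", (6, 0, 0), (10, 1, 1), [r1, r2])
left = Node("left", (0, 0, 0), (4, 1, 1))
root = Node("root", (0, 0, 0), (10, 1, 1), [left, right])

order = [node.name for node in front_to_back(root, (12.0, 0.5, 0.5))]
print(order)
```

Because the traversal only needs a distance per node, the same loop works for kD-trees, octrees, or bounding volume hierarchies.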

Slide 16: Further Optimizations

Conservative visibility testing
  * Assume visible node remains visible n frames
  + Saves additional occlusion queries
Approximate visibility
  * #visible pixels < threshold -> node invisible
  + Saves rendered geometry
  - Produces image errors
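Both optimizations reduce to small decision rules. The sketch below is one possible reading, with hypothetical names throughout: `needs_query` skips the query for a node that was found visible within the last `assumed_frames` frames, and `is_visible` treats a node as invisible when its visible pixel count falls below the threshold (a threshold of 1 corresponds to exact testing).

```python
def needs_query(was_visible, frames_since_query, assumed_frames):
    """Conservative testing: a visible node is not re-tested for `assumed_frames` frames."""
    if not was_visible:
        return True                      # invisible nodes are always tested
    return frames_since_query > assumed_frames

def is_visible(visible_pixels, threshold=1):
    """Approximate visibility: fewer than `threshold` visible pixels counts as invisible."""
    return visible_pixels >= threshold

def query_frames(num_frames, assumed_frames):
    """Frames in which a permanently visible node would be queried."""
    queried, last_query, was_visible = [], None, False
    for frame in range(num_frames):
        since = frame - last_query if last_query is not None else 0
        if needs_query(was_visible, since, assumed_frames):
            queried.append(frame)
            last_query, was_visible = frame, True
    return queried

print(query_frames(7, assumed_frames=2))   # [0, 3, 6]
print(query_frames(4, assumed_frames=0))   # [0, 1, 2, 3]
print(is_visible(10, threshold=25))        # False
```

With two assumed-visible frames the node is tested in every third frame only. With a 25-pixel threshold a node covering 10 pixels is dropped, which is where the image errors mentioned on the slide come from.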

Slide 17: Results – Test Scenes

  Scene         Triangles   kD-tree nodes
  Teapots       11.5M       21k
  City          1M          33k
  Power plant   12.7M       18.7k

Slide 18: Results – Speedup

Ideal: zero overhead (render only visible geometry)

Slide 19: Results – Summary

Comparison to hierarchical S&W
  * #queries reduced by almost 2
  * Times for stalls reduced by 20-60x (to 0.18–1.31 ms)
Close to ideal algorithm!
  * Only 2–9 ms slower
  * Overhead due to query time

Slide 20: Results – Teapot

Slide 21: Results – City

Slide 22: Results – Powerplant

Slide 23: Optimization Results

Conservative culling, 2 frames assumed visible
  * Good for deep hierarchies with simple leaf geometry
  * Further speedup up to 21%
Approximate culling, 25 pixels threshold
  * Good for scenes with complex visible geometry
  * Further speedup up to 33%

Slide 24: Conclusion

Efficient scheduling of hardware occlusion queries
  * Greatly reduces CPU stalls and GPU starvation
  * Reduces number of required queries
Simple to implement
Arbitrary hierarchical data structure
Speedup ~4 over VFC (view frustum culling)
Close to ideal solution for tested scenes
Watch out for GPU Gems II

Slide 25: Thanks for Your Attention

Slide 26: CHC: Example

[Figure: step-by-step CHC traversal of an example hierarchy, showing the query queue, the issued queries, and the GPU command stream. Operations shown include R4, Q5, Q6/R6, Q7, Q8, R7, Q10/R10, and Q11. Step annotations: previously visible: continue with step 1 recursively; previously visible: render; previously visible: issue query + render; previously invisible: query; query result available: continue with step 1 recursively; query result available: render; query result available: cull; query result available: mark visible; pull-up invisibility; final classification.]