1
A Hierarchical Shadow Volume Algorithm
Timo Aila (1,2), Tomas Akenine-Möller (3)
(1) Helsinki University of Technology, (2) Hybrid Graphics, (3) Lund University
2
Outline
- Brief intro to shadow volumes
  - fillrate problem, existing solutions
- Our solution
  - idea
  - implementation
- Results
- Q&A
3
Shadow volumes [Crow77]
- Shadow volumes define closed volumes of space that are in shadow
[Illustration labels: infinitesimal light source; shadow caster = light cap; dark cap; extruded side quads]
4
Is point inside shadow volume?
- Pick reference point R outside shadow volume
  - any such point is OK
- Span line from R to point to be classified
- Compute sum of enter (+1) and exit (-1) events
[2D illustration: reference point R, shadow volume, points P1, P2, P3]
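The counting rule on this slide can be sketched in 2D. This is a minimal illustration, not the authors' code: the shadow volume is assumed to be a closed counter-clockwise polygon, `signed_crossings` is a hypothetical helper name, and degenerate cases (the line passing exactly through a vertex or along an edge) are ignored. A nonzero sum means the point is inside the volume.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def signed_crossings(r: Point, p: Point, volume: List[Point]) -> int:
    """Sum +1 for each boundary edge crossed while entering the volume
    and -1 for each edge crossed while exiting, walking from r to p.
    Assumes `volume` is a closed polygon in counter-clockwise order."""
    total = 0
    n = len(volume)
    for i in range(n):
        a, b = volume[i], volume[(i + 1) % n]
        d1, d2 = cross(a, b, r), cross(a, b, p)
        d3, d4 = cross(r, p, a), cross(r, p, b)
        if d1 * d2 < 0 and d3 * d4 < 0:  # segment r->p properly crosses edge a->b
            # For a CCW polygon the interior lies to the left of each edge,
            # so ending up on the left side means we entered.
            total += 1 if d2 > 0 else -1
    return total

square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
r = (-5.0, 0.3)                                   # reference point outside
inside = signed_crossings(r, (1.0, 1.0), square)  # one enter event
beyond = signed_crossings(r, (5.0, 1.3), square)  # enter, then exit
```

Here `inside` is nonzero (the point is in shadow) and `beyond` is zero (the enter and exit events cancel).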
5
Using graphics hardware
- R at ∞ behind pixel (z-fail) [Bilodeau&Songy, Carmack]
  - infinity is always outside SVs, hence robust
  - must not clip to far plane of view frustum
- sum hidden events to stencil buffer, sign from backface culling
[2D illustration: camera, view frustum, shadow volume, visible samples (or pixels), R, + and - events]
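A per-pixel sketch of the z-fail counting described above, written as plain Python rather than graphics API calls. It assumes larger depth means farther from the camera, and that each shadow volume fragment covering the pixel is given as a hypothetical `(depth, front_facing)` pair. Only fragments hidden behind the visible surface are counted: back faces add +1, front faces add -1.

```python
from typing import Iterable, Tuple

def zfail_stencil(visible_depth: float,
                  sv_fragments: Iterable[Tuple[float, bool]]) -> int:
    """Return the stencil count for one pixel using z-fail counting.

    sv_fragments holds (depth, front_facing) for every shadow volume
    face that covers the pixel. A fragment is 'hidden' when it fails
    the depth test, i.e. it lies behind the visible surface."""
    stencil = 0
    for depth, front_facing in sv_fragments:
        if depth > visible_depth:              # depth test fails: hidden event
            stencil += -1 if front_facing else 1
    return stencil

# One shadow volume whose front face is at depth 3 and back face at depth 8.
volume = [(3.0, True), (8.0, False)]
in_volume = zfail_stencil(5.0, volume)    # surface between the two faces
behind = zfail_stencil(10.0, volume)      # surface behind the whole volume
in_front = zfail_stencil(1.0, volume)     # surface in front of the whole volume
```

A nonzero result marks the pixel as shadowed; only `in_volume` is nonzero here.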
6
Amount of pixel processing
Adapted from [Chan and Durand 2004]
7
Fillrate problem
- 50+ fps without shadows on ATI Radeon 9800XT at 1280x1024, 1 sample/pixel
- 1 fps when shadow volumes rasterized
  - 2.2 billion pixels per frame
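To put the 2.2 billion figure in perspective, the short calculation below divides it by the number of screen pixels. It is a back-of-the-envelope sketch that treats the count as an average over the whole 1280x1024 screen, which the slide does not state.

```python
width, height = 1280, 1024
screen_pixels = width * height          # 1,310,720 pixels on screen
sv_pixels_per_frame = 2.2e9             # figure quoted on the slide

# Average number of shadow volume layers rasterized over each screen pixel.
average_overdraw = sv_pixels_per_frame / screen_pixels
print(f"{average_overdraw:.0f} shadow volume fragments per screen pixel")
```

That is on the order of 1,700 shadow volume fragments per visible pixel, which is consistent with the frame rate collapsing to about 1 fps.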
8
Existing solutions (1/2)
- CC shadow volumes [Lloyd et al. 2004]
  - draw SVs only where receivers exist
  - good when lots of empty space
- Hybrid shadow maps and volumes [Chan&Durand 2004]
  - use SVs only at shadow boundaries
  - boundary pixels determined using shadow map
  - artifacts due to limited shadow map resolution
9
Existing solutions (2/2)
- Depth bounds [Nvidia 2003]
  - application supplies min & max depth values separately for each shadow volume
  - rasterize shadow volume only when visible geometry between [min,max]
  - optimal bounds hard to compute
[2D illustration: camera, shadow volume, min, max, visible pixels]
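The depth bounds idea can be sketched as a simple per-pixel filter. This is an illustration in plain Python, not the hardware test itself; `pixels_to_rasterize` is a hypothetical helper, and depths are assumed to be normalized to [0, 1].

```python
from typing import List, Sequence

def pixels_to_rasterize(stored_depths: Sequence[float],
                        zmin: float, zmax: float) -> List[int]:
    """Return indices of pixels whose already-rendered depth lies inside
    the application-supplied [zmin, zmax] interval of a shadow volume.
    Shadow volume fragments at all other pixels can be skipped."""
    return [i for i, z in enumerate(stored_depths) if zmin <= z <= zmax]

# Depth buffer values under four pixels covered by one shadow volume.
depths = [0.1, 0.4, 0.5, 0.9]
kept = pixels_to_rasterize(depths, zmin=0.3, zmax=0.6)
```

Tighter bounds skip more pixels, which is why the slide notes that optimal bounds are hard to compute.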
11
Reference image
12
Shadow volume algorithm executed once per 8x8 pixel tile
13
Green tiles may contain shadow boundary - other tiles were correct
14
Low-res (gray) + per-pixel computed boundaries (dark)
15
How to detect shadow boundaries?
- Two facts about shadow volumes
  - always closed
  - SV triangles mark potential shadow boundaries
- If a 3D volume in the scene is not intersected by shadow volume triangles
  - it is fully lit or fully in shadow
  - a single sample classifies the entire volume
17
Detecting boundary tiles
- Bound tile with axis-aligned bounding box
  - 8x8 pixel region
  - Zmin, Zmax
- Triangle vs. AA Box intersection test
  - low-resolution rasterization
  - Zmin and Zmax tests
[Illustration: box spanning 8 x 8 pixels between Zmin and Zmax]
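The depth part of the tile test can be sketched as an interval comparison. This is a simplified, conservative stand-in for an exact triangle versus box test. It assumes the low-resolution rasterizer has already found that the shadow volume triangle overlaps the tile in screen space, and that `tri_zmin`/`tri_zmax` (hypothetical inputs) bound the triangle's depth within the tile. Larger depth is assumed to mean farther away.

```python
from enum import Enum

class TileClass(Enum):
    IN_FRONT = "triangle entirely in front of the tile's geometry"
    BEHIND = "triangle entirely behind the tile's geometry"
    BOUNDARY = "triangle may intersect the tile's bounding box"

def classify_tile(tri_zmin: float, tri_zmax: float,
                  tile_zmin: float, tile_zmax: float) -> TileClass:
    """Compare a triangle's depth range over a tile with the tile's
    [Zmin, Zmax] range of visible depths."""
    if tri_zmax < tile_zmin:
        return TileClass.IN_FRONT   # visible for the whole tile
    if tri_zmin > tile_zmax:
        return TileClass.BEHIND     # hidden for the whole tile
    return TileClass.BOUNDARY       # depth ranges overlap: needs per-pixel work

front = classify_tile(0.10, 0.20, tile_zmin=0.40, tile_zmax=0.50)
hidden = classify_tile(0.70, 0.90, tile_zmin=0.40, tile_zmax=0.50)
mixed = classify_tile(0.45, 0.60, tile_zmin=0.40, tile_zmax=0.50)
```

Because overlapping depth ranges are necessary for an intersection, this sketch may flag a tile as boundary when an exact test would not, but it does not miss a true boundary tile.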
18
Fast update of non-boundary tiles
- Copy low-res shadows to stencil buffer
  - writing 64 per-pixel values would be slow
- Two-level stencil buffer saves the day
  - maintain [Smin, Smax] per tile
  - always test the higher level first
  - often no need to validate per-pixel values
  - stencil values of non-boundary tiles are constant
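A minimal sketch of a two-level stencil tile, under stated assumptions: `StencilTile` and its methods are hypothetical names, stencil values are modelled as signed Python integers (real stencil buffers are small unsigned counters), and a nonzero value means "in shadow". A constant tile is stored as `smin == smax` with no per-pixel storage, so filling it costs one write instead of 64.

```python
class StencilTile:
    """One 8x8 tile of a two-level stencil buffer (illustrative only)."""

    def __init__(self, size: int = 8) -> None:
        self.size = size
        self.smin = 0
        self.smax = 0
        self.pixels = None          # per-pixel level, created only on demand

    def fill(self, value: int) -> None:
        """Set the whole tile to one value without touching per-pixel data."""
        self.smin = self.smax = value
        self.pixels = None

    def add_pixel(self, x: int, y: int, delta: int) -> None:
        """Per-pixel stencil update, used for boundary tiles."""
        if self.pixels is None:     # expand the constant value lazily
            self.pixels = [[self.smin] * self.size for _ in range(self.size)]
        self.pixels[y][x] += delta
        value = self.pixels[y][x]
        # Conservative bounds: they always contain every pixel value.
        self.smin = min(self.smin, value)
        self.smax = max(self.smax, value)

    def in_shadow(self, x: int, y: int) -> bool:
        """Test the tile level first; read the pixel only if undecided."""
        if self.smin == 0 and self.smax == 0:
            return False            # every pixel is zero: lit
        if self.smin > 0 or self.smax < 0:
            return True             # every pixel is nonzero: shadowed
        return self.pixels[y][x] != 0

tile = StencilTile()
tile.fill(1)                        # non-boundary tile fully in shadow
constant_hit = tile.in_shadow(3, 3)
tile.fill(0)
tile.add_pixel(2, 2, 1)             # boundary tile: one pixel shadowed
```

After `fill`, queries are answered from `[smin, smax]` alone. After `add_pixel`, the range `[0, 1]` is undecided, so the per-pixel value is consulted.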
19
Implementation – Stage 1
[Diagram labels: SV triangles; Low-resolution rasterizer; Per-tile operations; buffers "Boundary?" and "Low-res shadows"]
- Buffers built separately for each shadow volume
- Classifications ready when entire SV processed
  - application marks begin/end of shadow volumes
20
Implementation – Stage 2
[Diagram labels: buffers "Boundary?" and "Low-res shadows"; SV triangles; Low-resolution rasterizer; decision "boundary tile?"]
- boundary tile? No: Copy to 2-level stencil
- boundary tile? Yes: Per-pixel rasterizer, Stencil ops, Update 2-level stencil
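The Stage 2 decision can be sketched as a loop over tiles. Everything here is an illustrative assumption rather than the hardware design: `stage2` is a hypothetical function, the Stage 1 results are modelled as a set of boundary tile ids plus a dictionary of per-tile low-res shadow values, and "copy to 2-level stencil" is modelled as adding that value to a per-tile stencil entry.

```python
from typing import Callable, Dict, Iterable, Set

def stage2(tiles: Iterable[int],
           boundary: Set[int],
           lowres_shadow: Dict[int, int],
           tile_stencil: Dict[int, int],
           rasterize_per_pixel: Callable[[int], None]) -> int:
    """Process one shadow volume's tiles using the Stage 1 buffers.

    Returns the number of tiles that needed per-pixel rasterization."""
    per_pixel_tiles = 0
    for tile in tiles:
        if tile in boundary:
            # Boundary tile: fall back to ordinary per-pixel stencil updates.
            rasterize_per_pixel(tile)
            per_pixel_tiles += 1
        else:
            # Non-boundary tile: one tile-level update replaces 64 pixel writes.
            tile_stencil[tile] = tile_stencil.get(tile, 0) + lowres_shadow.get(tile, 0)
    return per_pixel_tiles

rasterized = []                     # records which tiles went per-pixel
stencil: Dict[int, int] = {}
count = stage2(tiles=[0, 1, 2],
               boundary={1},
               lowres_shadow={0: 1, 2: 0},
               tile_stencil=stencil,
               rasterize_per_pixel=rasterized.append)
```

In this example only tile 1 reaches the per-pixel path; tiles 0 and 2 are resolved at tile level.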
21
Alternative implementations
- Two pass
  - Pass 1 = Stage 1
  - Pass 2 = Stage 2
  - How to keep pixel units busy during Stage 1?
    - maybe assign per-tile operations to pixel shaders?
- Single pass
  - Separate stages using delay stream [Aila et al. 2003]
  - Stage 2 of current SV executes simultaneously with next SV's Stage 1
22
Hardware resources
- Two-level stencil buffer
- Per-tile operations
- Optionally delay stream *
- duplicate low-res rasterizer & Zmin/Zmax units *
- cache for per shadow volume buffers
  - multiple buffers for pipelined operation
  - allocate from external memory
* If not already there for occlusion culling purposes
24
Results – Simple scene (1280x1024)
                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     1.1            12.7           11.5
Ratio in bandwidth   1.03           17.6           17.2
25
Results – Knights (1280x1024)
                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     2.6            7.4            2.8
Ratio in bandwidth   2.4            5.6
26
Results – Powerplant (1280x1024)
                     Depth bounds   Hierarchical   Improvement
Ratio in #pixels     2.4            22.9           9.5
Ratio in bandwidth   2.3            16.0           6.9
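The "Improvement" column in the three results slides appears to be the hierarchical ratio divided by the depth bounds ratio. The slides do not state this, so treat it as an assumption; the check below only confirms that the quotients of the rounded slide values land close to the listed improvements. Rows without a listed improvement are left out.

```python
# (depth bounds, hierarchical, improvement) as listed on the slides.
results = {
    ("Simple scene", "#pixels"):   (1.1, 12.7, 11.5),
    ("Simple scene", "bandwidth"): (1.03, 17.6, 17.2),
    ("Knights", "#pixels"):        (2.6, 7.4, 2.8),
    ("Powerplant", "#pixels"):     (2.4, 22.9, 9.5),
    ("Powerplant", "bandwidth"):   (2.3, 16.0, 6.9),
}

# Difference between hierarchical / depth bounds and the listed improvement.
deviations = {
    key: abs(hierarchical / depth_bounds - improvement)
    for key, (depth_bounds, hierarchical, improvement) in results.items()
}

for (scene, metric), deviation in deviations.items():
    print(f"{scene:12s} {metric:10s} deviation {deviation:.2f}")
```

The small remaining differences are what one would expect if the slide values were rounded from more precise measurements.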
27
Summary
- Hierarchical rendering method for shadow volumes
  - significant fillrate savings compared to other hardware methods
  - also works for soft shadow volumes
- Future work
  - would it make sense to extend programmability to per-tile operations?
  - how many pipeline bubbles are created?
    - requires chip-level simulations
28
Thank you! Questions?
Acknowledgements
- Ville Miettinen, Jacob Ström, Eric Haines, Ulf Assarsson, Lauri Savioja, Jonas Svensson, Ulf Borgenstam, Karl Schultz, 3DR group at Helsinki University of Technology
- The National Technology Agency of Finland, Hybrid Graphics, Bitboys, Nokia and Remedy Entertainment
- ATI for granting fellowship to Timo