Hardware-Accelerated Adaptive EWA Volume Splatting
Wei Chen (ZJU), Liu Ren (CMU), Matthias Zwicker (MIT), Hanspeter Pfister (MERL)
2 Volume Splatting: object-order method; a 3D reconstruction kernel (elliptical Gaussian) is centered at each voxel; each voxel's contribution is a 2D footprint (color, opacity); weighted footprints are accumulated into the image. Figure labels: voxel kernels, screen, 2D footprints = splats.
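The accumulation step on slide 2 can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the presented renderer: footprints are isotropic rather than elliptical Gaussians, each voxel carries a single scalar weight instead of color and opacity, and the names `splat_footprints`, `sigma`, and `radius` are hypothetical.

```python
import math

def splat_footprints(voxels, width, height, sigma=1.0, radius=3):
    """Accumulate Gaussian footprints into a 2D image (list of rows).

    voxels: iterable of (cx, cy, value), where (cx, cy) is the projected
    screen position of a voxel and value is its scalar weight.
    Assumption: isotropic footprints with standard deviation sigma,
    truncated to a square of half-width radius.
    """
    image = [[0.0] * width for _ in range(height)]
    for cx, cy, value in voxels:
        x0, y0 = int(round(cx)), int(round(cy))
        # Visit only the pixels covered by the truncated footprint.
        for y in range(max(0, y0 - radius), min(height, y0 + radius + 1)):
            for x in range(max(0, x0 - radius), min(width, x0 + radius + 1)):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                image[y][x] += value * math.exp(-0.5 * d2 / sigma ** 2)
    return image
```

Because the method is object-order, the outer loop runs over voxels and each one writes into the image, rather than each pixel gathering samples along a ray.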
3 Related Work. [Figure: prior splatting methods and our work arranged by speed and quality. Labels: software, texture splats, axis-aligned, EWA, Swan, image-aligned, fast splats, OpenGL ex. Cited work: Westover 1989, Crawfis 1993, Swan 1997, Mueller 1999, Huang 2000, Zwicker 2001, Xue 2003.]
4 Outline: EWA volume splatting; adaptive EWA splatting; GPU implementation; results and conclusions.
5 EWA Volume Splatting: compensates for aliasing artifacts due to perspective projection. EWA filter = low-pass filter ⊗ warped reconstruction filter. Figure labels: volume, projection, W, low-pass filter, convolution, EWA volume resampling filter.
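Since the convolution of two Gaussians is again a Gaussian whose covariance is the sum of the two covariances, the resampling filter on slide 5 can be sketched in a few lines. This is an illustrative sketch only: the warped reconstruction kernel is assumed to be given already as a 2x2 screen-space covariance, the low-pass filter is assumed isotropic with variance `h_var`, and all function names are hypothetical.

```python
import math

def ewa_covariance(warped_cov, h_var=1.0):
    """Covariance of (warped reconstruction Gaussian) convolved with
    (isotropic low-pass Gaussian of variance h_var)."""
    (a, b), (c, d) = warped_cov
    return [[a + h_var, b], [c, d + h_var]]

def gaussian_2d(cov, x, y):
    """Evaluate a normalized, zero-mean 2D Gaussian with covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Quadratic form [x y] * inverse(cov) * [x y]^T, inverse written explicitly.
    q = (d * x * x - (b + c) * x * y + a * y * y) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))
```

For example, `ewa_covariance([[4.0, 0.0], [0.0, 0.25]], 1.0)` widens a kernel that is narrow along one screen axis so that it never falls below the low-pass footprint, which is how the combined filter suppresses aliasing under minification.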
6 EWA Volume Splatting (512x512x3). First comparison: reconstruction filter only, 6.25 fps; EWA filter, 4.97 fps. Second comparison: low-pass filter only, 6.14 fps; EWA filter, 3.79 fps.
7 Analysis of EWA Filter. Figure labels: warped reconstruction kernel, low-pass filter, resampling filter; cases shown: minification, magnification.
8 Analysis of EWA Filter: the shape of the EWA splat depends on the distance from the view plane. Symbols: r_k, reconstruction filter radius; u_2, distance to the view plane; r_h, low-pass filter radius. Figure label: EWA splat. Note that
9 Adaptive EWA Filtering (warped reconstruction kernel, low-pass filter, resampling filter): if u_2 > B, use the low-pass filter; if u_2 < A, use the reconstruction filter; if A < u_2 < B, use the EWA filter.
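The selection rule on slide 9 can be sketched as a three-way branch on the distance u_2. The sketch assumes thresholds A < B, that distant voxels (large u_2, minification) take the low-pass filter, and that near voxels (small u_2, magnification) take the reconstruction filter. The function name and the string labels are hypothetical, and the slides do not say how A and B are derived.

```python
def select_filter(u2, a, b):
    """Pick the cheapest adequate filter for a voxel at view-plane distance u2.

    Assumptions: a < b; values exactly on a threshold fall back to the
    full EWA filter, which is always valid but the most expensive.
    """
    if a >= b:
        raise ValueError("expected thresholds with a < b")
    if u2 < a:
        return "reconstruction"  # magnification: low-pass term is negligible
    if u2 > b:
        return "low-pass"        # minification: warped kernel is negligible
    return "ewa"                 # transition zone: both terms matter
```

For instance, with `a=1.0` and `b=4.0`, distances 0.5, 2.0, and 10.0 select the reconstruction, EWA, and low-pass filters respectively.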
10 Patch Processing: process an 8 x 8 patch of voxels at a time; filter selection is based on the four corners of each patch (choose smallest). Figure labels: traversal order, patch, distance.
11 Adaptive EWA Volume Splatting (512x512x3). First comparison: adaptive EWA filter, 6.88 fps; EWA filter, 4.83 fps. Second comparison: adaptive EWA filter, 1.84 fps; EWA filter, 1.75 fps.
12 Outline: EWA volume splatting; adaptive EWA splatting; GPU implementation; results and conclusions.
13 Object-Space EWA Splatting: object-space EWA splatting with texture mapping [Ren et al. Eurographics 2002]. Figure labels: texture (unit Gaussian); unit quad with corners (0,0), (1,0), (0,1), (1,1); texture mapping; textured quad; vertex shader computation; projection; EWA splat (elliptical Gaussian).
14 Proxy Geometry Template. Rectilinear volumes: use one proxy geometry template for all slices in each direction. Store vertex indices in AGP memory... Figure labels: regularity, voxel geometry, proxy geometry template, quad geometry.
15 Vertex Compression: compress each vertex to 32 bits; decompress on the fly in programmable hardware. Video memory needed to store the vertex information of a 256x256x256 volume: 2,048 MBytes without compression, 12 MBytes with compression. Retained-mode hardware acceleration becomes feasible.
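The slide does not give the 32-bit layout, so the following is only a sketch of the general idea under an assumed layout: four 8-bit fields holding the voxel's x and y position within a slice, its density, and an index into a quantized normal table. The field choice and both function names are hypothetical; on the GPU the unpacking would happen in the vertex shader rather than in Python.

```python
def pack_vertex(x, y, density, normal_index):
    """Pack four 8-bit fields into one 32-bit word (assumed layout)."""
    for name, value in (("x", x), ("y", y), ("density", density),
                        ("normal_index", normal_index)):
        if not 0 <= value <= 255:
            raise ValueError(f"{name} must fit in 8 bits, got {value}")
    return (x << 24) | (y << 16) | (density << 8) | normal_index

def unpack_vertex(word):
    """Inverse of pack_vertex: recover (x, y, density, normal_index)."""
    return ((word >> 24) & 0xFF, (word >> 16) & 0xFF,
            (word >> 8) & 0xFF, word & 0xFF)
```

A round trip such as `unpack_vertex(pack_vertex(12, 200, 97, 5))` returns the original four fields, so no information is lost as long as each field fits in 8 bits.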
16 Retained vs. Immediate Mode
Table columns: Data, #Total Splats, #Rendered Splats, Immediate Mode, Retained Mode.
Retained-mode frame rates: Bonsai 7.53 fps; Engine 10.28 fps; Lobster 10.60 fps; Head 2.86 fps.
Factor of ~10 improvement.
17 Interactive Classification: Opacity Culling. Hardware-accelerated list-based traversal: for each slice, and for each 32 x 32 patch of voxels (smaller indices), the indices of the proxy geometry are organized into iso-value lists using bucket sort; the CPU merges lists on-line. Render only iso-value lists with visible voxels. Figure label: patch.
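The list-based traversal on slide 17 can be sketched for a single patch as follows. The sketch assumes 8-bit densities (256 iso-values) and a transfer function given as a table of opacities indexed by iso-value; `build_iso_lists` and `visible_indices` are hypothetical names, and the real system would submit the merged index list to the GPU instead of returning it.

```python
def build_iso_lists(densities, num_bins=256):
    """Bucket-sort voxel indices of one patch by iso-value.

    densities: integer iso-value in [0, num_bins) for each voxel of the patch.
    This is done once, independent of the transfer function.
    """
    buckets = [[] for _ in range(num_bins)]
    for index, density in enumerate(densities):
        buckets[density].append(index)
    return buckets

def visible_indices(buckets, opacity_tf):
    """Merge only the lists whose iso-value has non-zero opacity.

    opacity_tf: opacity per iso-value under the current transfer function.
    This is the cheap per-frame step when the transfer function changes.
    """
    merged = []
    for iso_value, bucket in enumerate(buckets):
        if bucket and opacity_tf[iso_value] > 0.0:
            merged.extend(bucket)
    return merged
```

The design point is that changing the transfer function only changes which lists are merged; the sorted lists themselves are reused, so classification can stay interactive.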
18 Interactive Classification: Opacity Culling
Data | List-based opacity culling | Standard opacity culling
Head | 2.80 fps | 0.3 fps
Engine | 10.18 fps | 0.8 fps
Bonsai | 7.23 fps | 0.8 fps
Lobster | 10.30 fps | 1.1 fps
Includes changes to TF every frame. Factor of ~10 improvement.
19 Deferred Shading: volume texture access is only possible in fragment programs*; however, per-fragment shading is expensive. Solution: deferred shading in two passes. (* Newer GPUs allow texture access in vertex programs.)
20 Deferred Shading. Pass one: 3D texture access, classification and illumination in vertex shader, render one pixel per voxel. Pass two: reuse the pixel data from the first pass to shade the 2D footprint. Performance gain: 5%-10% speedup. Figure labels: pass one, pass two, final result.
21 Experiments: P4 2.4 GHz; ATI 9800 Pro with 256 MB RAM; Direct3D 9.0b with VS 2.0 and PS 2.0. Table: vertex shader instructions by volume type (rows: Regular, Rectilinear; columns: EWA, Reconstruction Only, Low-pass Only).
22 Sheet-buffer Composition: axis-aligned traversal, addition in sheet buffers, then blending front-to-back. Frame rates shown: 0.80 fps, 3.00 fps, 3.45 fps.
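The two stages named on slide 22 (addition within a sheet, then front-to-back blending across sheets) can be sketched for a single pixel. This is a sketch under assumptions not stated on the slide: colors are scalar and premultiplied by alpha, a sheet whose summed alpha exceeds 1 is rescaled to alpha 1, and the function name is hypothetical.

```python
def composite_sheets(sheets):
    """Composite one pixel from sheets ordered front to back.

    sheets: list of sheets; each sheet is a list of (color, alpha) splat
    contributions for this pixel, with color premultiplied by alpha.
    Returns the accumulated (color, alpha).
    """
    acc_color, acc_alpha = 0.0, 0.0
    for splats in sheets:
        # Stage 1: splats within one sheet are simply added.
        sheet_color = sum(color for color, _ in splats)
        sheet_alpha = sum(alpha for _, alpha in splats)
        if sheet_alpha > 1.0:
            # Assumed normalization so a sheet never exceeds full opacity.
            sheet_color /= sheet_alpha
            sheet_alpha = 1.0
        # Stage 2: front-to-back "over" blending of the finished sheet.
        transmittance = 1.0 - acc_alpha
        acc_color += transmittance * sheet_color
        acc_alpha += transmittance * sheet_alpha
    return acc_color, acc_alpha
```

Once the accumulated alpha reaches 1, later sheets contribute nothing, which is the usual argument for front-to-back order.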
23 UNC Head: 208x256x fps #Rendered splats: 2,955, M splats / sec
24 Bonsai: 256x256x fps #Rendered splats: 274,866 2M splats / sec
25 Engine: 256x256x fps #Rendered splats: 247, M splats / sec
26 Lobster: 301x324x fps #Rendered splats: 555, M splats / sec
Video
28 Our Contributions: adaptive EWA computation; volume data compression; retained-mode hardware acceleration; interactive opacity culling; deferred two-pass shading.
29 Future Work: image-aligned EWA volume splatting; irregular volume splatting; point sprites in OpenGL; floating-point textures; vertex texture for classification.
30 Acknowledgements: Jessica Hodgins (CMU); Markus Gross (ETH). http://graphics.cs.cmu.edu/projects/adpewa/index.html