1
GPU-Based Frequency Domain Volume Rendering
Ivan Viola, Armin Kanitsar, and Meister Eduard Gröller
Institute of Computer Graphics and Algorithms
Vienna University of Technology
2
Motivation
- volume rendering is time consuming
- computational complexity is O(N^3)
- our goal: fastest volume rendering
- GPUs
  - very fast fragment processor
  - very fast memory access
- Fourier Volume Rendering (FVR)
  - theoretically fastest volume rendering
3
Frequency Domain Volume Rendering
[Pipeline diagram; surviving labels: GPU, CPU]
4
FVR Characteristics
Pros
- computational complexity O(N^2 log N)
- renders the whole volume, not iso-surfaces
- very fast rendering stage:
  - slicing in frequency domain
  - inverse 2D Fourier transform
Cons
- rendering produces X-ray images
- time-consuming preprocessing
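The two rendering steps listed above (slice the spectrum, then inverse-transform the slice) rest on the Fourier projection-slice theorem. The sketch below checks the theorem one dimension lower, on a small 2D image whose projection is a 1D signal. It uses a naive DFT for readability. The helper names `dft` and `dft2` and the test image are illustrative assumptions, not part of the presentation.

```python
import cmath

def dft(signal, inverse=False):
    """Naive O(N^2) DFT; the inverse applies the 1/N scale."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(signal[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                  for j in range(n))
        out.append(acc / n if inverse else acc)
    return out

def dft2(image):
    """2D DFT of a square image: transform rows, then columns."""
    n = len(image)
    rows = [dft(row) for row in image]                  # rows[y][u]
    cols = [dft([rows[y][u] for y in range(n)]) for u in range(n)]  # cols[u][v]
    return [[cols[u][v] for u in range(n)] for v in range(n)]       # F[v][u]

# Hypothetical 4x4 "volume" (one dimension lower than in FVR).
image = [[1, 2, 0, 0],
         [0, 3, 1, 0],
         [0, 0, 4, 1],
         [2, 0, 0, 5]]

# Spatial-domain projection: integrate along y for every x.
projection = [sum(image[y][x] for y in range(4)) for x in range(4)]

# Frequency-domain route: take the central slice (v = 0) and invert it.
spectrum = dft2(image)
recovered = [c.real for c in dft(spectrum[0], inverse=True)]
```

The forward transform of the whole data set is the expensive preprocessing step. Each new view only needs a slice and a lower-dimensional inverse transform, which is where the O(N^2 log N) rendering cost comes from.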
5
Rendering Stage 1: Slicing
- stage with the highest speed-up
- nearest neighbor interpolation
  - supported by GPU
- tri-linear interpolation
- tri-cubic interpolation
- windowed sinc of width four
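A minimal CPU sketch of the slicing step with nearest neighbor interpolation follows. It assumes the frequency volume is stored as a nested list `volume[z][y][x]` with the zero frequency at index 0, so negative frequencies are reached through modulo addressing, similar to a repeating texture wrap mode. The function names and the plane parametrisation by two orthonormal vectors are assumptions made for illustration.

```python
import math

def sample_nearest_wrapped(volume, pos):
    """Nearest-neighbor fetch with wrap-around addressing."""
    n = len(volume)
    x, y, z = (int(math.floor(c + 0.5)) % n for c in pos)
    return volume[z][y][x]

def extract_slice(volume, u_axis, v_axis):
    """Resample an n x n plane through the origin of the spectrum.

    u_axis and v_axis are orthonormal 3-vectors spanning the plane
    perpendicular to the viewing direction.
    """
    n = len(volume)
    half = n // 2
    result = []
    for j in range(n):
        row = []
        for i in range(n):
            s, t = i - half, j - half          # centred frequency indices
            pos = tuple(s * u_axis[a] + t * v_axis[a] for a in range(3))
            row.append(sample_nearest_wrapped(volume, pos))
        result.append(row)
    return result

# Hypothetical 4^3 volume whose values encode their own address.
volume = [[[100 * z + 10 * y + x for x in range(4)]
           for y in range(4)] for z in range(4)]
axis_aligned = extract_slice(volume, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

The higher-quality filters listed above replace only `sample_nearest_wrapped`; the plane traversal stays the same.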
6
Tri-Linear Interpolation
- not natively supported by graphics hardware
- can be computed using the LRP instruction
[Figure labels: [0,0], [1,1], [X,Y], frac(8X)]
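The idea of building tri-linear interpolation from a linear-interpolation instruction can be sketched on the CPU as seven nested lerps. The helper `lrp` below follows the operand order of the LRP instruction (weight first, result `t * a + (1 - t) * b`). The volume layout `volume[z][y][x]`, the wrap-around fetch, and the function names are assumptions for illustration, not the presented shader code.

```python
import math

def lrp(t, a, b):
    """Same operand order as the LRP instruction: t * a + (1 - t) * b."""
    return t * a + (1.0 - t) * b

def trilinear(volume, x, y, z):
    """Tri-linear sample at continuous voxel coordinates using 7 lerps."""
    n = len(volume)
    x0, y0, z0 = math.floor(x), math.floor(y), math.floor(z)
    # Fractional parts play the role of frac(N * X) for normalised X.
    fx, fy, fz = x - x0, y - y0, z - z0

    def fetch(i, j, k):
        return volume[k % n][j % n][i % n]

    # Four lerps along x ...
    c00 = lrp(fx, fetch(x0 + 1, y0, z0), fetch(x0, y0, z0))
    c10 = lrp(fx, fetch(x0 + 1, y0 + 1, z0), fetch(x0, y0 + 1, z0))
    c01 = lrp(fx, fetch(x0 + 1, y0, z0 + 1), fetch(x0, y0, z0 + 1))
    c11 = lrp(fx, fetch(x0 + 1, y0 + 1, z0 + 1), fetch(x0, y0 + 1, z0 + 1))
    # ... two along y, and one along z.
    c0 = lrp(fy, c10, c00)
    c1 = lrp(fy, c11, c01)
    return lrp(fz, c1, c0)

# Hypothetical test volume: a linear ramp is reproduced exactly.
volume = [[[x + 2 * y + 3 * z for x in range(4)]
           for y in range(4)] for z in range(4)]
value = trilinear(volume, 1.25, 0.5, 2.0)
```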
7
Cubic Interpolation & Windowed sinc
- not natively supported by graphics hardware
- no equivalent to LRP instruction
- filter kernel stored in textures [Hadwiger et al. VMV'01]
  - separability of 3D kernel
  - filters of width four stored in RGBA 1D texture
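The following sketch shows how a separable width-four kernel could be tabulated once and reused per axis. Each table entry holds four weights, which is what fits into one RGBA texel. The Catmull-Rom cubic is chosen here only as an example kernel; the slides do not say which cubic or which sinc window was used. The table size and all names are assumptions.

```python
import math

def catmull_rom_weights(t):
    """Four weights for the samples at offsets -1, 0, 1, 2 (0 <= t < 1)."""
    t2, t3 = t * t, t * t * t
    return (0.5 * (-t3 + 2 * t2 - t),
            0.5 * (3 * t3 - 5 * t2 + 2),
            0.5 * (-3 * t3 + 4 * t2 + t),
            0.5 * (t3 - t2))

def build_kernel_texture(size):
    """1D table: entry i holds the RGBA-like weight tuple for t = i / size."""
    return [catmull_rom_weights(i / size) for i in range(size)]

def tricubic(volume, kernel_texture, x, y, z):
    """Separable 4x4x4 filter: the 3D weight is the product of 1D weights."""
    n = len(volume)
    size = len(kernel_texture)
    base, weights = [], []
    for coord in (x, y, z):
        whole = math.floor(coord)
        base.append(whole)
        # Nearest table entry for the fractional part of this coordinate.
        weights.append(kernel_texture[int((coord - whole) * size)])
    wx, wy, wz = weights
    total = 0.0
    for k in range(4):
        for j in range(4):
            for i in range(4):
                sample = volume[(base[2] + k - 1) % n][(base[1] + j - 1) % n][(base[0] + i - 1) % n]
                total += wx[i] * wy[j] * wz[k] * sample
    return total

# Hypothetical 6^3 linear ramp; Catmull-Rom reproduces linear data exactly.
volume = [[[x + 2 * y + 3 * z for x in range(6)]
           for y in range(6)] for z in range(6)]
texture = build_kernel_texture(64)
value = tricubic(volume, texture, 1.25, 2.5, 1.75)
```

Separability means only one 1D table is needed for all three axes. A windowed sinc of width four would change `catmull_rom_weights` only.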
8
Rendering Stage 2: Inverse 2D FFT
- 1D FFT consists of two parts
  - scrambling
  - butterfly operation
9
Fast Fourier Transform in 1D
[Diagram: input a0 a1 a2 a3 a4 a5 a6 a7 is scrambled into a0 a4 a2 a6 a1 a5 a3 a7; butterfly stages, labeled with twiddle factors W_N^k (W_8^0, W_8^2, W_8^4, W_8^6 and W_8^0 through W_8^7), produce the output A0 A1 A2 A3 A4 A5 A6 A7]
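The scramble and butterfly structure of the diagram corresponds to a standard radix-2 decimation-in-time FFT, sketched below in plain Python. This is a generic textbook formulation under the assumption that the input length is a power of two. The function names are illustrative.

```python
import cmath

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def scramble(values):
    """Reorder the input so that element i comes from bit-reversed index i."""
    n = len(values)
    bits = n.bit_length() - 1
    return [values[bit_reverse(i, bits)] for i in range(n)]

def fft(values):
    """Radix-2 FFT: one scramble followed by log2(N) butterfly stages."""
    data = [complex(v) for v in scramble(values)]
    n = len(data)
    span = 2
    while span <= n:
        half = span // 2
        for start in range(0, n, span):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / span)  # twiddle W_span^k
                p = data[start + k]
                q = w * data[start + k + half]
                data[start + k] = p + q           # upper butterfly output
                data[start + k + half] = p - q    # lower butterfly output
        span *= 2
    return data

order = scramble(list(range(8)))  # index order after the scramble step
```

For eight elements the scramble yields the order a0 a4 a2 a6 a1 a5 a3 a7 shown on the slide.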
10
Fast Fourier Transform on the GPU
- two buffers: ping-pong rendering
- two-channel rendering buffers required
- scramble pass
  - 1D lookup
- butterfly passes
  - log2(N) passes
  - texture encodes
    - W_N^k
    - p and q coordinates
    - butterfly sign
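The pass structure above can be imitated on the CPU with two lists standing in for the ping-pong buffers and one lookup table per butterfly pass. In this sketch every output element gathers its two sources `p` and `q`, the twiddle factor and the sign from the table, roughly as a fragment would read them from a texture. The exact table layout used in the presented implementation is not given on the slide, so the tuple layout and all names here are assumptions.

```python
import cmath

def scramble_table(n):
    """1D lookup for the scramble pass: bit-reversed source index per output."""
    bits = n.bit_length() - 1
    return [int(format(i, "0{}b".format(bits))[::-1], 2) for i in range(n)]

def build_butterfly_tables(n):
    """One table per pass; entry i = (p, q, twiddle, sign) for output i."""
    tables = []
    span = 2
    while span <= n:
        half = span // 2
        table = []
        for i in range(n):
            base = i - (i % span)      # start of the butterfly group
            k = i % half               # position inside the half group
            sign = 1.0 if (i % span) < half else -1.0
            twiddle = cmath.exp(-2j * cmath.pi * k / span)
            table.append((base + k, base + k + half, twiddle, sign))
        tables.append(table)
        span *= 2
    return tables

def fft_ping_pong(values):
    """Forward FFT with a scramble pass and log2(N) gather passes."""
    n = len(values)                    # assumed to be a power of two, n >= 2
    src = [complex(values[j]) for j in scramble_table(n)]
    dst = [0j] * n
    for table in build_butterfly_tables(n):
        for i, (p, q, twiddle, sign) in enumerate(table):
            dst[i] = src[p] + sign * twiddle * src[q]
        src, dst = dst, src            # swap the two buffers
    return src

passes = len(build_butterfly_tables(8))
```

For the inverse transform needed in the rendering stage, the twiddle factors are conjugated and the result is scaled by 1/N. A 2D transform applies the 1D passes along rows and then along columns.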
11
Hartley Transform - Alternative to FFT
- real input is transformed into real output
  - ½ memory requirements
- scrambling the same as in FFT
- double-butterfly operation
  - three source values, cos and sin
- HT not separable
  - additional correction pass required
- GPU implementation not faster than FFT
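The non-separability point can be made concrete with a small check. Applying 1D Hartley transforms along rows and columns yields a transform with kernel cas(a)·cas(b), whereas the true 2D Hartley transform uses cas(a + b), with cas(x) = cos(x) + sin(x). The identity 2·cas(a + b) = cas(a)cas(b) + cas(-a)cas(b) + cas(a)cas(-b) - cas(-a)cas(-b) gives one possible correction pass. Whether the presented implementation uses exactly this form is an assumption. The sketch uses naive O(N^2) transforms and illustrative names.

```python
import math

def cas(angle):
    return math.cos(angle) + math.sin(angle)

def dht_1d(values):
    """Naive 1D discrete Hartley transform (real in, real out)."""
    n = len(values)
    return [sum(values[j] * cas(2 * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def separable_pass(image):
    """Row then column 1D DHTs of a square image; not yet the 2D DHT."""
    n = len(image)
    rows = [dht_1d(row) for row in image]
    cols = [dht_1d([rows[y][u] for y in range(n)]) for u in range(n)]
    return [[cols[u][v] for u in range(n)] for v in range(n)]

def correction_pass(t):
    """Combine four mirrored entries to obtain the true 2D DHT."""
    n = len(t)
    return [[0.5 * (t[v][u] + t[v][-u % n] + t[-v % n][u] - t[-v % n][-u % n])
             for u in range(n)] for v in range(n)]

def dht_2d_direct(image):
    """Reference: 2D DHT evaluated directly from its definition."""
    n = len(image)
    return [[sum(image[y][x] * cas(2 * math.pi * (u * x + v * y) / n)
                 for y in range(n) for x in range(n))
             for u in range(n)] for v in range(n)]

# Hypothetical 4x4 test image.
image = [[1.0, 2.0, 0.0, 0.0],
         [0.0, 3.0, 1.0, 0.0],
         [0.0, 0.0, 4.0, 1.0],
         [2.0, 0.0, 0.0, 5.0]]
corrected = correction_pass(separable_pass(image))
reference = dht_2d_direct(image)
```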
12
Fast Hartley Transform on the GPU
- similar to FFT: ping-pong rendering
- only one-channel rendering buffers required
- scrambling the same
- double-butterfly
  - two lookup textures
    - addresses of source values (3 channels)
    - cos and sin terms (2 channels)
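A CPU sketch of the double-butterfly passes is given below. Each output reads three source values: one from the even half and two mirrored ones from the odd half, weighted by a cos and a sin term. This follows the radix-2 relation H(k) = H_even(k) + cos(φ)·H_odd(k) + sin(φ)·H_odd(-k), with φ = 2πk/span. The two per-pass tables mirror the two lookup textures on the slide (three addresses, two trigonometric terms). Folding the butterfly sign into the cos and sin entries is an assumption of this sketch, as are all names.

```python
import math

def bit_reverse(index, bits):
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def build_fht_tables(n):
    """Per pass: (addresses, trig) with 3 source addresses and 2 terms per output."""
    passes = []
    span = 2
    while span <= n:
        half = span // 2
        addresses, trig = [], []
        for i in range(n):
            base = i - (i % span)
            k = i % half
            sign = 1.0 if (i % span) < half else -1.0
            angle = 2 * math.pi * k / span
            addresses.append((base + k,                         # even half, index k
                              base + half + k,                  # odd half, index k
                              base + half + (half - k) % half)) # odd half, index -k
            trig.append((sign * math.cos(angle), sign * math.sin(angle)))
        passes.append((addresses, trig))
        span *= 2
    return passes

def fht_ping_pong(values):
    """Fast Hartley transform with real-valued ping-pong buffers."""
    n = len(values)                    # assumed to be a power of two
    bits = n.bit_length() - 1
    src = [float(values[bit_reverse(i, bits)]) for i in range(n)]
    dst = [0.0] * n
    for addresses, trig in build_fht_tables(n):
        for i in range(n):
            a, b, c = addresses[i]
            cos_term, sin_term = trig[i]
            dst[i] = src[a] + cos_term * src[b] + sin_term * src[c]
        src, dst = dst, src            # swap the two buffers
    return src

signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 4.0]
hartley = fht_ping_pong(signal)
```

Because every value stays real, one channel per buffer element suffices, in contrast to the two channels (real and imaginary) needed for the FFT.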
13
Results
Frame rates for ATI Radeon 9800 XT
(NN: nearest neighbor, TL: tri-linear, TC: tri-cubic)

Resolution   NN     TL     TC    IFFT 2D
256x256      1450   1050   180   153
512x512      500    350    45    35
14
Demo
15
Conclusions
- rendering stage of FVR very fast on GPU
  - slicing: high performance gain
  - wrap-around is "for free"
  - speed-up also for inverse FFT
- nearest neighbour: very poor quality
- tri-linear interpolation: high performance
- tri-cubic interpolation: high quality
16
Thank You!
viola@cg.tuwien.ac.at