Memory Efficient Acceleration Structures and Techniques for CPU-based Volume Raycasting of Large Data S. Grimm, S. Bruckner, A. Kanitsar and E. Gröller.

1 Memory Efficient Acceleration Structures and Techniques for CPU-based Volume Raycasting of Large Data. S. Grimm, S. Bruckner, A. Kanitsar and E. Gröller. Institute of Computer Graphics and Algorithms, Vienna University of Technology, Vienna, Austria.

2 Sören Grimm, Vienna University of Technology. Motivation (1/3). Direct volume rendering is an important tool in medical environments. CT angiography run-offs (> 1000 slices) are clinical practice. Scanner resolutions are getting higher (1024x1024 per slice). An efficient data memory layout is essential!

3 Motivation (2/3) - Goals. Interactive rendering and handling of large datasets up to 1 GB. Support of heterogeneous PC hardware environments. Smart combination and modification of known methods! Test hardware specifications: notebook with Pentium M 1.6 GHz, 1 GB RAM, GeForce 4 GO (32 MB).

4 Motivation (3/3). The memory hierarchy consists of successively larger but slower memory technologies: CPU, L1 cache, L2 cache, main memory, hard disk. Avoid frequent access to slower levels. Exploit spatial and temporal locality.

5 Outline: Memory Layout & Data Processing Scheme; Gradient Caching; Empty Space Skipping; Parallelization & Features; Results.

6 Linear Memory Layout (1/2). [Figure: 2D slice with cells numbered 0 to 15; the memory storage order begins 0 1 2 3 4 5 6 7.]

7 Linear Memory Layout (2/2). The volume is stored as a stack of 2D images (slices). [Figure: rays traversing the volume.] View-dependent cache behavior!

8 Bricked Memory Layout (1/2). [Figure: 2D slice with cells numbered 0 to 15; the memory storage order begins 0 1 4 5 2 3 6 7.]

9 Bricked Memory Layout (2/2). The volume is stored as a set of equally sized cubes (bricks). [Figure: rays traversing the volume.] Constant cache behavior!
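The two storage orders shown on the layout slides can be reproduced with a short sketch. It assumes a 4x4 slice and 2x2 bricks, which is consistent with the bricked order 0 1 4 5 2 3 6 7 from the slide; the function names are illustrative and not taken from the presentation.

```python
def linear_order(width, height):
    """Cell ids in row-major (linear) storage order."""
    return [y * width + x for y in range(height) for x in range(width)]


def bricked_order(width, height, brick):
    """Cell ids in bricked storage order.

    Bricks are visited row by row, and cells are stored row-major inside
    each brick (an assumption that matches the order shown on the slide).
    """
    order = []
    for by in range(0, height, brick):
        for bx in range(0, width, brick):
            for y in range(by, min(by + brick, height)):
                for x in range(bx, min(bx + brick, width)):
                    order.append(y * width + x)
    return order


def bricked_offset(x, y, width, brick):
    """Memory offset of cell (x, y); assumes width and height are multiples of brick."""
    bricks_per_row = width // brick
    brick_index = (y // brick) * bricks_per_row + (x // brick)
    inner = (y % brick) * brick + (x % brick)
    return brick_index * brick * brick + inner
```

With this addressing, neighboring cells in both directions stay close together in memory, which is why the access pattern no longer depends on the viewing direction. The same idea extends to 3D by adding a z term to both the brick index and the inner offset.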

10 Brick-wise Processing. Processing of all resample locations is done brick-wise. [Figure: rays traversing the volume.] High cache coherence!

11 Outline: Memory Layout & Data Processing Scheme; Gradient Caching; Empty Space Skipping; Parallelization & Features; Results.

12 Gradient Caching (1/3). Pre-computed gradients: high performance, but for sufficient quality the memory requirements are at least doubled. Computing gradients on-the-fly: no additional storage requirement, but the calculation is expensive.

13 Gradient Caching (2/3). To accelerate the calculation, gradients are cached. Brick-wise traversal allows the use of a brick-sized gradient cache, which can be re-used for each brick. [Figure: cell.]

14 Gradient Caching (3/3). One brick-sized gradient cache. Constant, very small memory requirement. [Figure: rays traversing the volume, with the gradient cache covering one brick.]
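A minimal sketch of a brick-sized gradient cache follows. The slides do not say which gradient estimator is used, so central differences are an assumption here, as are the class name, the nested-list volume indexed as volume[z][y][x], and the clamping at the volume border.

```python
class BrickGradientCache:
    """One brick-sized gradient cache that is reset and re-used for each brick."""

    def __init__(self, volume, brick):
        self.volume = volume          # scalar values, indexed volume[z][y][x]
        self.brick = brick            # brick edge length in voxels
        self.origin = (0, 0, 0)
        self.slots = [None] * brick ** 3
        self.computed = 0             # gradient evaluations, counted for illustration

    def start_brick(self, origin):
        """Invalidate all slots before traversing the brick that starts at origin."""
        self.origin = origin
        for i in range(len(self.slots)):
            self.slots[i] = None

    def _value(self, x, y, z):
        # Clamp at the volume border (an assumption of this sketch).
        v = self.volume
        z = min(max(z, 0), len(v) - 1)
        y = min(max(y, 0), len(v[0]) - 1)
        x = min(max(x, 0), len(v[0][0]) - 1)
        return v[z][y][x]

    def gradient(self, x, y, z):
        """Central-difference gradient at voxel (x, y, z) inside the current brick."""
        ox, oy, oz = self.origin
        b = self.brick
        slot = ((z - oz) * b + (y - oy)) * b + (x - ox)
        if self.slots[slot] is None:
            self.slots[slot] = (
                (self._value(x + 1, y, z) - self._value(x - 1, y, z)) / 2.0,
                (self._value(x, y + 1, z) - self._value(x, y - 1, z)) / 2.0,
                (self._value(x, y, z + 1) - self._value(x, y, z - 1)) / 2.0,
            )
            self.computed += 1
        return self.slots[slot]
```

Because the same slot array serves every brick, the extra memory is bounded by one brick's worth of gradients regardless of the volume size, and a gradient shared by several resample locations is computed only once per brick.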

15 Outline: Memory Layout & Data Processing Scheme; Gradient Caching; Empty Space Skipping; Parallelization & Features; Results.

16 Empty Space Skipping (brick-level). The min-max information contained in each brick is used for discarding empty regions. Template-based brick projection is used to rasterize depth values. In software, this is very fast for orthographic projections.
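The brick-level test can be sketched as follows. This is only the min-max part; the template-based projection is not covered. It assumes integer voxel values and an opacity lookup table indexed by value, and the function names are hypothetical.

```python
def brick_min_max(volume, origin, brick):
    """Minimum and maximum scalar value inside one brick of volume[z][y][x]."""
    ox, oy, oz = origin
    values = [
        volume[z][y][x]
        for z in range(oz, oz + brick)
        for y in range(oy, oy + brick)
        for x in range(ox, ox + brick)
    ]
    return min(values), max(values)


def brick_is_empty(min_max, opacity):
    """True if the opacity table is zero for every value in [min, max]."""
    lo, hi = min_max
    return all(opacity[v] == 0.0 for v in range(lo, hi + 1))
```

The min-max pair depends only on the data, so it can be computed once per brick; only the cheap `brick_is_empty` check has to be repeated when the transfer function changes.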

17 Empty Space Skipping (octree-level). Each brick contains a three-level octree. Caching of classification information. Stored in a linearized octree using hierarchy compression. The octree goes down to 4x4x4 voxels. Template-based projection. [Figure: octree nodes labeled transparent, inhomogeneous, opaque.] Min-max and classification caching increase the memory requirements by approx. 4%.
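The three node classes from the figure can be illustrated with a small recursive sketch. The exact definitions are not given on the slide, so this assumes that "transparent" means zero opacity for the node's whole value range and "opaque" means full opacity for the whole range. It builds a plain nested dictionary rather than the linearized, hierarchy-compressed octree mentioned above, and all names are hypothetical.

```python
TRANSPARENT, INHOMOGENEOUS, OPAQUE = "transparent", "inhomogeneous", "opaque"


def classify(lo, hi, opacity):
    """Classify the value range [lo, hi] against an opacity lookup table."""
    alphas = [opacity[v] for v in range(lo, hi + 1)]
    if all(a == 0.0 for a in alphas):
        return TRANSPARENT
    if all(a == 1.0 for a in alphas):
        return OPAQUE
    return INHOMOGENEOUS


def build_octree(volume, origin, size, opacity, leaf_size=4):
    """Min-max octree over a cubic region of volume[z][y][x], down to leaf_size voxels."""
    ox, oy, oz = origin
    values = [
        volume[z][y][x]
        for z in range(oz, oz + size)
        for y in range(oy, oy + size)
        for x in range(ox, ox + size)
    ]
    node = {
        "origin": origin,
        "size": size,
        "class": classify(min(values), max(values), opacity),
        "children": [],
    }
    # Only inhomogeneous nodes need a finer subdivision.
    if node["class"] == INHOMOGENEOUS and size > leaf_size:
        half = size // 2
        for dz in (0, half):
            for dy in (0, half):
                for dx in (0, half):
                    child_origin = (ox + dx, oy + dy, oz + dz)
                    node["children"].append(
                        build_octree(volume, child_origin, half, opacity, leaf_size)
                    )
    return node
```

Transparent nodes can be skipped entirely during projection, while inhomogeneous nodes are refined until the 4x4x4 leaf size is reached.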

18 Cell Invisibility Cache (1/2). [Figure: example ray passing through regions skipped by the octree and regions not skipped by the octree.]

19 Cell Invisibility Cache (2/2). [Flowchart: stages labeled Classification, Re-sampling, Gradient Estimation, Compositing, Shading and Advance ray, with decision nodes labeled CIC and Visible and YES/NO branches.] The CIC increases the memory requirements by approx. 6%.
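One way to read the flowchart is sketched below. The control flow is an assumption, since the slide text does not preserve the arrows: a cell is treated as invisible when all eight of its corner voxels have zero opacity, that result is remembered in one flag per cell, and later resample locations in the same cell are skipped without repeating the check. The bytearray storage and all names are hypothetical.

```python
import math


class CellInvisibilityCache:
    """One flag per cell, set once the cell is known to be invisible."""

    def __init__(self, volume, opacity):
        self.volume = volume      # integer values, indexed volume[z][y][x]
        self.opacity = opacity    # opacity lookup table indexed by value
        # A grid of n voxels per axis has n - 1 cells per axis.
        self.dims = (len(volume[0][0]) - 1, len(volume[0]) - 1, len(volume) - 1)
        nx, ny, nz = self.dims
        self.invisible = bytearray(nx * ny * nz)
        self.corner_checks = 0    # counted for illustration

    def cell_visible(self, cx, cy, cz):
        nx, ny, _ = self.dims
        index = (cz * ny + cy) * nx + cx
        if self.invisible[index]:
            return False          # cached result, no voxel access needed
        self.corner_checks += 1
        corners = (
            self.volume[cz + dz][cy + dy][cx + dx]
            for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)
        )
        if all(self.opacity[v] == 0.0 for v in corners):
            self.invisible[index] = 1
            return False
        return True


def march(cache, start, step, count):
    """Return the resample positions along one ray that survive the CIC test."""
    kept = []
    for i in range(count):
        p = tuple(s + i * d for s, d in zip(start, step))
        cell = tuple(int(math.floor(c)) for c in p)
        if cache.cell_visible(*cell):
            # Re-sampling, gradient estimation, shading and compositing would run here.
            kept.append(p)
    return kept
```

In this sketch the flags must be cleared whenever the transfer function changes, because cell visibility depends on the opacity table.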

20 Empty Space Skipping. Project all non-transparent bricks onto the image plane to find the first entry points of rays. For finer resolution, use a min-max octree per brick and project the octree. Cell Invisibility Cache. All these acceleration techniques increase the memory requirements by just 10%.

21 Outline: Memory Layout & Data Processing Scheme; Gradient Caching; Empty Space Skipping; Parallelization & Features; Results.

22 Parallelization / Hyper-Threading. [Figure: advancing ray-front over a 1D screen [Law and Yagel 1996], divided between physical CPU 1 and physical CPU 2, each with logical CPU 1 and logical CPU 2.]

23 Features. High quality. Multiple segmented objects and transfer functions. Transfer functions on clipping planes. View-aligned and axis-aligned cutting planes.

24 Outline: Memory Layout & Data Processing Scheme; Gradient Caching; Empty Space Skipping; Parallelization & Features; Results.

25 Results (1/3) - Bricking. Linear vs. bricked memory layout. [Figure: speedup factor plotted against brick size in kilobytes, with annotations for the linear volume layout, the optimal brick size, and cache thrashing + bricking overhead.] Speedup: 2.8. Cache size: 512 KB.

26 Results (2/3) - Gradient Caching. Speedup: 3.4. Speedup: 2.7. Test system: Pentium M 1.6 GHz, 1 GB RAM.

27 Results (3/3) - Performance. Test system: Pentium M 1.6 GHz, 1 GB RAM.

28 Conclusions. Sub-second frame rates for large datasets on a standard notebook. Fully interactive volume visualization of large data on commodity hardware is within reach. Alternative memory layouts are the key to handling large datasets.

29 Questions? [Demo: Visible Male (587 x 341 x 1878), Intel Pentium M 1600 MHz (software capture).]

30 Thank you for your attention. Institute of Computer Graphics and Algorithms. Sponsored by Tiani MedGraph AG.

