Hardware-Assisted Visibility Sorting for Tetrahedral Volume Rendering
Steven Callahan, Milan Ikits, João Comba, Cláudio Silva
Scientific Computing and Imaging Institute, University of Utah
Overview: Introduction; Previous Work; Hardware-Assisted Visibility Sorting; Results; Future Work; Conclusion
Research Goal: Real-time volume rendering; scalable (machine performance); data of arbitrary size; simple and robust implementations
Volume Rendering: Regular, Irregular
Why Irregular Grids? Unstructured grids are the preferred data type in scientific computations. Level-of-detail (LOD) techniques intrinsically need unstructured grids (El-Sana et al., Ben-Gurion).
Optical Models: Absorption plus emission [figure labels: Light, s]
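For reference, the absorption-plus-emission model named on this slide is commonly written as the integral below. This is the standard textbook form and not necessarily the notation of the original slide; the symbols $I_0$ (intensity entering the volume), $\tau$ (extinction coefficient), $g$ (emission term), and $D$ (ray length) are assumptions introduced here.

```latex
I(D) = I_0 \, e^{-\int_0^D \tau(t)\,dt} + \int_0^D g(s)\, e^{-\int_s^D \tau(t)\,dt}\, ds
```

The first term is the incoming light attenuated along the whole ray; the second sums the light emitted at each position s, attenuated over the remaining distance.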
Compositing: Front-to-back [figure: samples I0, I1, I2, each paired with a value subscripted 0, 1, 2; the first two samples merge into I01]
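The front-to-back rule can be sketched in a few lines of Python. This is a minimal illustration, assuming each sample is an (intensity, opacity) pair whose intensity is already weighted by its own opacity; the function name `composite_front_to_back` and the sample values are hypothetical.

```python
def composite_front_to_back(samples):
    """Composite (intensity, alpha) samples that are ordered nearest-first.

    Assumption: each intensity is already weighted by its own alpha.
    """
    acc_i, acc_a = 0.0, 0.0
    for intensity, alpha in samples:
        weight = 1.0 - acc_a          # how much light still passes through
        acc_i += weight * intensity
        acc_a += weight * alpha
    return acc_i, acc_a


samples = [(0.2, 0.25), (0.3, 0.5), (0.1, 0.5)]
# Merging the first two samples and then adding the third gives the same
# result as compositing all three in one pass.
merged = composite_front_to_back(samples[:2])
result = composite_front_to_back([merged, samples[2]])
```

Because the operator is associative, neighbouring samples can be merged into a single sample before the rest are composited, which is what the merged sample I01 in the figure suggests.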
Volume Rendering: (Intersection) Sampling + Sorting
Sampling: Triangle-Based Approach. Projected Tetrahedra [Shirley-Tuchman 1990]: Class 1 (+, +, +, -); Class 2 (+, +, -, -)
Sorting [pipeline: Application, Rasterization, Display]. Object-Space Sorting (as opposed to Image Space), i.e., let’s sort the geometry!
Cell-Projection [figure: cells A and B and a point p; A < B]
Object-Space Sorting: Williams’ MPVO. Idea: Define ordering relations by looking at shared faces. [Figure: cells A to F along the viewing direction.] Relations: B < A, A < C, B < E, C < E, C < D, E < F, D < F
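Once the pairwise relations are known, a visibility order is any topological order of the relation graph. The sketch below applies Kahn's algorithm to the seven relations listed on the slide. It is an illustration only, not Williams' implementation: the function name `visibility_order` is hypothetical, and it assumes the relations are acyclic.

```python
from collections import defaultdict, deque


def visibility_order(relations):
    """Kahn's topological sort over (before, after) cell pairs."""
    successors = defaultdict(list)
    indegree = defaultdict(int)
    cells = set()
    for before, after in relations:
        successors[before].append(after)
        indegree[after] += 1
        cells.update((before, after))

    # Start from the cells that nothing precedes (sorted for determinism).
    ready = deque(sorted(c for c in cells if indegree[c] == 0))
    order = []
    while ready:
        cell = ready.popleft()
        order.append(cell)
        for nxt in successors[cell]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(cells):
        raise ValueError("cyclic relations: no visibility order exists")
    return order


# Relations from the slide, written as (X, Y) for "X < Y".
relations = [("B", "A"), ("A", "C"), ("B", "E"), ("C", "E"),
             ("C", "D"), ("E", "F"), ("D", "F")]
order = visibility_order(relations)
```

If a relation is missing, two cells may come out in either order, which is the limitation the next slide points at.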
MPVO Limitations: Missing relations!
XMPVO. Idea: Use ray-shooting queries to complement the ordering relations. [Figure: cells A to D along the viewing direction.] Relations: A < C, A < B, B < D
Sorting [pipeline: Application, Rasterization, Display]. Image-Space Sorting (as opposed to Object Space), i.e., let’s sort the pixels!
Image-Space Sorting: A-Buffer [Carpenter 1984]. Idea: Keep a list of intersections for each pixel.
Cell-Projection With An A-Buffer
Cell-Projection With An A-Buffer: Not sorted!
Cell-Projection With An A-Buffer: Sorted!
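The A-buffer idea from these slides (collect every intersection per pixel, then sort each list) can be sketched as follows. This is a minimal CPU illustration, not Carpenter's implementation; the class name `ABuffer` and its methods are hypothetical.

```python
from collections import defaultdict


class ABuffer:
    """Per-pixel lists of (depth, value) intersections."""

    def __init__(self):
        self.pixels = defaultdict(list)   # (x, y) -> [(depth, value), ...]

    def insert(self, x, y, depth, value):
        # Fragments arrive in arbitrary order while cells are projected.
        self.pixels[(x, y)].append((depth, value))

    def resolve(self):
        """Return every pixel's list sorted by depth, nearest first."""
        return {pixel: sorted(frags, key=lambda f: f[0])
                for pixel, frags in self.pixels.items()}


buf = ABuffer()
buf.insert(0, 0, 3.0, "c")   # unsorted arrival order
buf.insert(0, 0, 1.0, "a")
buf.insert(0, 0, 2.0, "b")
sorted_pixels = buf.resolve()
```

Every fragment is stored until the final sort, so both memory and sorting time grow with the number of intersections.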
A-Buffer Limitations. For c cells and n x n pixels, the number of intersections is O(c n^2). Problems: Time: sorting takes too long. Memory: storage too high.
Sorting [pipeline: Application, Rasterization, Display]: Image-Space Sorting, Object-Space Sorting
Approximate Object-Space Sorting [animation frames: labels 1, 2, 3 appear in turn]
Approximate Object-Space Sorting. A solution: Use an insertion-sort A-buffer!
Approximate Object-Space Sorting. Use a conservative bound on the intersections. What about the space problem?
Hardware-Assisted Visibility Sorting (HAVS): Sort in both image space and object space. First, do an approximate object-space sort of the cells on the CPU (i.e., sort by face centroid). Then complete the sort in image space using a fixed-depth A-buffer (called a k-buffer) implemented on the GPU. The method can handle non-convex meshes, has a low memory overhead, and requires minimal pre-processing of data.
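The CPU step (an approximate sort by face centroid) might look like the sketch below. It is an illustration only: the helper names `centroid` and `sort_faces_by_centroid` are hypothetical, faces are assumed to be tuples of 3D vertex coordinates, and squared distance to the eye is used as the sort key.

```python
def centroid(face):
    """Average of a face's vertices; each vertex is an (x, y, z) tuple."""
    n = len(face)
    return tuple(sum(v[i] for v in face) / n for i in range(3))


def sort_faces_by_centroid(faces, eye):
    """Order faces nearest-first by squared centroid distance to the eye.

    The result is only approximately correct: large or overlapping faces
    can still be misordered, which the image-space pass must repair.
    """
    def dist2(face):
        c = centroid(face)
        return sum((c[i] - eye[i]) ** 2 for i in range(3))

    return sorted(faces, key=dist2)


far_face = ((0.0, 0.0, 5.0), (1.0, 0.0, 5.0), (0.0, 1.0, 5.0))
near_face = ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0))
ordered = sort_faces_by_centroid([far_face, near_face], eye=(0.0, 0.0, 0.0))
```

A sort on one scalar key per face is cheap and needs no mesh connectivity, which fits the low pre-processing cost stated on the slide.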
HAVS Overview
k-buffer: A fixed-size A-buffer of depth k that acts as a fragment stream sorter. It stores k entries for each pixel; each entry consists of the fragment’s scalar value and its distance to the viewpoint. An incoming fragment replaces the entry that is closest to the eye (front-to-back compositing). Given a sequence of fragments in which each fragment is within k positions of its position in sorted order, it outputs the fragments in sorted order.
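For a single pixel, the stream-sorting behaviour can be sketched on the CPU with a small heap. This is an illustration, not the GPU implementation: the function name `k_buffer_sort` is hypothetical, and fragments are assumed to be (distance, scalar) tuples.

```python
import heapq


def k_buffer_sort(fragments, k):
    """Yield (distance, scalar) fragments using only k stored entries.

    Each incoming fragment is compared with the k stored entries; the
    nearest of those k + 1 candidates is emitted and the rest are kept.
    """
    entries = []
    for frag in fragments:
        heapq.heappush(entries, frag)
        if len(entries) > k:
            yield heapq.heappop(entries)   # nearest fragment leaves the buffer
    while entries:                         # flush what is still stored
        yield heapq.heappop(entries)


# Each fragment is at most one position away from its sorted position.
nearly_sorted = [(2.0, 0.1), (1.0, 0.2), (4.0, 0.3),
                 (3.0, 0.4), (6.0, 0.5), (5.0, 0.6)]
repaired = list(k_buffer_sort(nearly_sorted, k=2))

# A stream that is too disordered for the chosen k is not repaired.
too_disordered = [(3.0, 0.0), (2.0, 0.0), (1.0, 0.0)]
not_repaired = [d for d, _ in k_buffer_sort(too_disordered, k=1)]
```

The first stream comes out fully sorted, while the second shows the failure mode: when a fragment is further from its sorted position than the buffer can absorb, the output order is wrong for that pixel.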
k-buffer: Hardware Implementation. Use the multiple-render-target capability of ATI graphics cards (ATI_draw_buffers in OpenGL). Use a P-buffer to accumulate color and opacity and three Aux buffers for the k-buffer entries. Buffer layout (four channels r, g, b, a each): P-buffer: r comp, g comp, b comp, a comp; Aux 0: v1, v2, d1, d2; Aux 1: v3, v4, d3, d4; Aux 2: v5, v6, d5, d6.
Fragment Shader Overview
Details: Fix incorrect screen-space texture coordinates caused by perspective-correct interpolation. [Figure panels: Perspective interpolation; Projecting vertices to find tex coords; Projecting tex coords in shader]
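One reading of "projecting tex coords in shader" is that the clip-space position is passed along and the perspective divide is done per fragment, so the lookup coordinate is exact at every pixel. The arithmetic is sketched below in Python rather than shader code. The function name `screen_tex_coords` is hypothetical, and an OpenGL-style convention (normalized device coordinates in [-1, 1], texture coordinates in [0, 1]) is assumed.

```python
def screen_tex_coords(clip_pos):
    """Map a clip-space position (x, y, z, w) to [0, 1]^2 screen coordinates."""
    x, y, _, w = clip_pos
    if w == 0:
        raise ValueError("w must be non-zero")
    # Perspective divide gives NDC in [-1, 1]; scale and bias to [0, 1].
    return (0.5 * (x / w) + 0.5, 0.5 * (y / w) + 0.5)


center = screen_tex_coords((0.0, 0.0, 0.0, 1.0))
corner = screen_tex_coords((2.0, -2.0, 0.0, 2.0))
```

Dividing per fragment avoids interpolating an already-divided coordinate across the triangle, which is where perspective-correct interpolation would introduce the error.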
Details: Simultaneously reading from and writing to a buffer is undefined when fragments are rasterized in parallel.
Details: The buffers are initialized and flushed using k screen-aligned rectangles with negative scalar values. Handling non-convex objects requires tagging the exterior faces with a negative distance d and tracking whether we are inside or outside the mesh using the sign of the scalar value v.
Details: Early ray termination reads the accumulated opacity and kills the fragment if it is over a given threshold. The early z-test is currently not available on the ATI 9800 when using multiple render targets.
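The early-termination test can be sketched for one pixel as follows. This is a CPU illustration under stated assumptions: the function name `composite_until_opaque` is hypothetical, the threshold of 0.99 is an arbitrary example value, and samples are opacity-weighted (color, alpha) pairs in front-to-back order.

```python
def composite_until_opaque(samples, threshold=0.99):
    """Front-to-back compositing that stops once opacity exceeds a threshold.

    Returns the accumulated color, accumulated alpha, and the number of
    samples that were actually composited.
    """
    acc_c, acc_a = 0.0, 0.0
    composited = 0
    for color, alpha in samples:
        if acc_a > threshold:
            break   # opacity never decreases, so later fragments are killed too
        weight = 1.0 - acc_a
        acc_c += weight * color
        acc_a += weight * alpha
        composited += 1
    return acc_c, acc_a, composited


opaque_first = composite_until_opaque([(1.0, 1.0), (0.5, 0.5), (0.5, 0.5)])
translucent = composite_until_opaque([(0.1, 0.1), (0.1, 0.1)])
```

In the first call only one sample contributes, since the pixel is already opaque; in the second, both samples are composited.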
Pre-Integrated Transfer Function: Previous Work. Volume density optical model: Williams and Max 1992. Pre-integration on GPU: Roettger et al., 5 s to update a 128x128x128 table. Incremental pre-integration on CPU: Weiler et al., 1.5 s to update a 128x128x128 table.
Pre-Integrated Transfer Function: Williams and Max [figure labels: l, S_f, S_b]
Pre-Integrated Transfer Function: Roettger et al. [figure labels: S_f, S_b, n = 0…l max, T 3D]
Pre-Integrated Transfer Function: Weiler et al. [figure labels: l, l’, S_f, S_b, S_p]
Pre-Integrated Transfer Function: Our Approach. Incremental pre-integration of the 3D transfer function completely on the GPU. Compute the base slice using [Roettger et al.]. Compute the other slices using the base slice and the previously computed slice [Weiler et al.]. s to update a 128x128x128 table. This allows interactive updates to the colormap and transfer function opacity.
Experiments. Environment: 3.0 GHz Pentium, MB RAM, Windows XP, ATI Radeon 9800 Pro. Results: k-buffer analysis; performance results.
k-buffer Analysis: Accuracy analysis. Analysis of the k depth required to correctly render datasets. Max values from 14 fixed viewpoints. Table columns: Dataset, Max A, Max k, k > 2, k > 6. Rows: Spx , ; Torso ,3171,683; Fighter
k-buffer Analysis: Distribution analysis. Shows the actual pixels that require large k depths to render correctly for each viewpoint. [Figure legend: k 6 (red)]
Results: Performance. Average values from 14 fixed viewpoints. Does not include the partial sort on the CPU. 512 x 512 viewport with a 128 x 128 x 128 pre-integrated transfer function. Table columns: Dataset, Cells, k = 2 Fps, k = 2 Tets/s, k = 6 Fps, k = 6 Tets/s. Rows: Spx2, 0.8 M, K, K; Torso, 1.1 M, K, K; Fighter, 1.4 M, K, K
Image: Blunt Fin
Image: Spx
Image: Torso
Image: Fighter
Future Work: Optimize the partial sort on the CPU; develop techniques to refine datasets to respect a given k (subdivide degenerate tets); incorporate isosurface rendering; parallel techniques; proper hole handling; dynamic data; use the early z-test
Conclusion: Renders up to 6 million tets/sec when using a linear transfer function; handles arbitrary non-convex meshes; requires minimal pre-processing of data; maximum data size is bounded by main memory; uses simple vertex and fragment shaders