Introduction to Massive Model Visualization Patrick Cozzi Analytical Graphics, Inc.
Contents Minimal computer graphics background Culling Level of Detail (LOD) Memory Management Videos throughout
Computer Graphics Background Goal: Convert 3D model to pixels 3D models are composed of triangles
Computer Graphics Background 1 triangle = 3 vertices Gross oversimplification: 3 floats per vertex (x0, y0, z0) (x1, y1, z1) (x2, y2, z2)
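The oversimplification above already gives a rough feel for raw geometry size. A minimal sketch, assuming 32-bit floats and no vertex sharing between triangles (both are assumptions; real models usually index and share vertices). The function name is illustrative only.

```python
FLOAT_BYTES = 4            # assumption: 32-bit floats
FLOATS_PER_VERTEX = 3      # x, y, z only, as on the slide
VERTICES_PER_TRIANGLE = 3

def raw_triangle_bytes(triangle_count):
    """Bytes needed to store unindexed triangles (no vertex sharing)."""
    return triangle_count * VERTICES_PER_TRIANGLE * FLOATS_PER_VERTEX * FLOAT_BYTES

# Round-number examples: one million and one billion triangles.
for count in (1_000_000, 1_000_000_000):
    print(count, "triangles:", raw_triangle_bytes(count) / 1e6, "MB")
```

Under these assumptions each triangle costs 36 bytes, so positions alone for a billion triangles exceed the memory of most GPUs before normals, colors, or textures are counted.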
Computer Graphics Background Triangles go through the graphics pipeline to become pixels. View parameters define the size and shape of the visible world. Figure labels: viewer, near plane, far plane
Computer Graphics Background CPU, PCIe Bus, GPU, Monitor. GPU stages: Vertex Processing, Geometry Processing, Fragment Processing. Framebuffer: Color, Depth, Stencil
Computer Graphics Background Visible Surfaces
Example Massive Models Procedurally generated model of Pompeii: ~1.4 billion polygons. Image from [Mueller06]
Example Massive Models Boeing 777 model: ~350 million polygons. Image from
Trends No upper bound on model complexity – Procedural generation – Laser scans – Aerial imagery Image from [Lakhia06]
Trends High GPU throughput – At least million triangles per second Widening gap between processor and memory performance CPU – GPU bottleneck
Goal output-sensitivity: performance as a function of the number of pixels rendered, not the size of the model
View Frustum Culling Can be slower than brute force. When? Figure labels: culled, rendered
Bounding Volumes Spheres Axis-aligned bounding boxes Oriented bounding boxes Hybrids
View Frustum Culling
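View frustum culling with bounding spheres can be sketched as a sphere-versus-plane test. This is a minimal illustration, assuming each frustum plane is stored as (normal, d) with a unit-length normal pointing into the frustum; the plane representation and function names are assumptions, not taken from the talk.

```python
def signed_distance(plane, point):
    """Distance from point to plane; positive on the inside of the frustum."""
    (nx, ny, nz), d = plane
    x, y, z = point
    return nx * x + ny * y + nz * z + d

def sphere_outside_frustum(planes, center, radius):
    """True if the sphere lies entirely behind at least one plane (cull it)."""
    return any(signed_distance(p, center) < -radius for p in planes)

# Hypothetical stand-in for a frustum: the cube from -1 to 1 on each axis.
box_planes = [
    ((1, 0, 0), 1), ((-1, 0, 0), 1),
    ((0, 1, 0), 1), ((0, -1, 0), 1),
    ((0, 0, 1), 1), ((0, 0, -1), 1),
]

print(sphere_outside_frustum(box_planes, (0, 0, 0), 0.5))    # inside
print(sphere_outside_frustum(box_planes, (1.5, 0, 0), 1.0))  # straddles a plane
print(sphere_outside_frustum(box_planes, (3, 0, 0), 1.0))    # fully outside
```

The test is conservative: a sphere near a frustum corner can be reported as visible even though it is outside, which costs some rendering time but never drops visible geometry.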
Demo
Occlusion Culling Effective in scenes with high depth complexity. Figure label: culled
Occlusion Culling From-region or from-point Most are conservative Occluder Fusion Difficult for general scenes with arbitrary occluders. So make simplifying assumptions: – [Wonka00] – urban environments – [Ohlarik08] – planets and satellites
Hardware Occlusion Queries From-point visibility that handles general scenes with arbitrary occluders and occluder fusion How? – Use the GPU
Hardware Occlusion Queries Render occluders Render object’s BV using HOQ Render full object based on result
Hardware Occlusion Queries CPU stalls and GPU starvation. Figure: two CPU/GPU timelines. Labels in the first: Draw o1, Draw o2, Draw o3 on both CPU and GPU. Labels in the second: Query o1, Draw o1, stall (CPU), starve (GPU)
Hardware Occlusion Queries Demo
Is Culling Enough?
Level of Detail Generation: fewer triangles, simpler shader Selection: distance, pixel size Switching: avoid popping Discrete, Continuous, Hierarchical
Simplification Operations edge collapse Also Vertex Merge Vertex Removal Cell Collapse See [Luebke01]
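An edge collapse on an indexed triangle list can be sketched as follows. This is a minimal illustration that only rewrites indices; it assumes the surviving vertex keeps its position and ignores error metrics and collapse ordering, which a real simplifier needs. The function name is hypothetical.

```python
def collapse_edge(triangles, keep, remove):
    """Collapse edge (keep, remove): replace 'remove' with 'keep' everywhere
    and drop triangles that become degenerate (two equal indices)."""
    result = []
    for tri in triangles:
        new_tri = tuple(keep if v == remove else v for v in tri)
        if len(set(new_tri)) == 3:
            result.append(new_tri)
    return result

# Two triangles share edge (1, 2); a third only touches vertex 2.
mesh = [(0, 1, 2), (1, 3, 2), (2, 3, 4)]
collapsed = collapse_edge(mesh, keep=1, remove=2)
print(collapsed)
```

The two triangles that shared the collapsed edge disappear, and the remaining triangle is re-pointed at the surviving vertex.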
Discrete LOD 3,086 Triangles 52,375 Triangles 69,541 Triangles
Discrete LOD Not enough detail up close Too much detail in the distance
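Distance-based selection among a fixed set of discrete LODs can be sketched like this. The triangle counts are the three from the Discrete LOD slide; the switch distances are made-up values for illustration.

```python
import bisect

LOD_SWITCH_DISTANCES = [100.0, 500.0]          # hypothetical thresholds
LOD_TRIANGLE_COUNTS = [69_541, 52_375, 3_086]  # finest to coarsest

def select_lod(distance):
    """Return the LOD index for a viewer distance (0 = most detailed)."""
    return bisect.bisect_right(LOD_SWITCH_DISTANCES, distance)

for d in (50.0, 250.0, 1000.0):
    lod = select_lod(d)
    print(d, "-> LOD", lod, "with", LOD_TRIANGLE_COUNTS[lod], "triangles")
```

With only a handful of fixed levels, a large model is rarely at the right detail everywhere at once, which is the limitation noted above.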
Continuous LOD edge collapse vertex split Image from [Luebke01]
Hierarchical LOD 1 Node 3,086 Triangles 4 Nodes 9,421 Triangles 16 Nodes 77,097 Triangles
Hierarchical LOD Node Refinement
visit(node) {
  if (computeSSE(node) < pixelTolerance) {
    render(node);
  } else {
    foreach (child in node.children)
      visit(child);
  }
}
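A runnable sketch of the refinement loop above. The slide does not define computeSSE, so this assumes a common screen-space error formula for a perspective view: geometric error times viewport height, divided by 2 x distance x tan(fov/2). The Node fields, the default viewport and field of view, and the extra rule that leaves are always rendered are all assumptions.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    geometric_error: float   # object-space error of this node's geometry
    distance: float          # hypothetical: precomputed distance to the viewer
    children: list = field(default_factory=list)

def compute_sse(node, viewport_height=1080, fov_y=math.radians(60)):
    """Project the node's geometric error to pixels (assumed formula)."""
    return (node.geometric_error * viewport_height) / (
        2.0 * node.distance * math.tan(fov_y / 2.0))

def visit(node, pixel_tolerance, rendered):
    # Leaves are rendered unconditionally: there is nothing finer to refine to.
    if not node.children or compute_sse(node) < pixel_tolerance:
        rendered.append(node.name)
    else:
        for child in node.children:
            visit(child, pixel_tolerance, rendered)

def make_tree(distance):
    kids = [Node("a", 0.0, distance), Node("b", 0.0, distance)]
    return Node("root", 10.0, distance, kids)

near, far = [], []
visit(make_tree(100.0), 2.0, near)      # close: root error is many pixels
visit(make_tree(10000.0), 2.0, far)     # distant: root error is under 2 pixels
print(near, far)
```

Close to the viewer the root is refined into its children; far away the coarse root alone satisfies the pixel tolerance.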
Demo
Hierarchical LOD Easy to – Add view frustum culling – Add occlusion culling via HOQs Render front to back – Use VMSSE to drive refinement Requires preprocessing Is Culling + HLOD enough?
Memory Management Out-of-Core Compression Cache Coherent Layouts
Out-of-Core HLOD
visit(node) {
  if ((computeSSE(node) < pixelTolerance) || (not all children resident)) {
    render(node);
    foreach (child in node.children)
      requestResidency(child);
  } else {
    foreach (child in node.children)
      visit(child);
  }
}
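A runnable sketch of the out-of-core traversal above. To stay short it assumes each node carries a precomputed screen-space error, records requests in a list instead of issuing disk reads, renders leaves unconditionally, and only requests children that are not yet resident. All of these are simplifying assumptions, and the names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    sse: float                 # hypothetical: precomputed screen-space error
    resident: bool = False     # is the node's geometry loaded in memory?
    children: list = field(default_factory=list)

def visit(node, pixel_tolerance, rendered, requests):
    all_resident = all(c.resident for c in node.children)
    if node.sse < pixel_tolerance or not node.children or not all_resident:
        # Render the coarser node now; ask for the finer data in the background.
        rendered.append(node.name)
        requests.extend(c.name for c in node.children if not c.resident)
    else:
        for child in node.children:
            visit(child, pixel_tolerance, rendered, requests)

a = Node("a", 0.0, resident=True)
b = Node("b", 0.0, resident=False)
root = Node("root", 50.0, resident=True, children=[a, b])

first_render, first_requests = [], []
visit(root, 2.0, first_render, first_requests)

b.resident = True              # pretend the background load finished
second_render, second_requests = [], []
visit(root, 2.0, second_render, second_requests)
print(first_render, first_requests, second_render, second_requests)
```

The first frame falls back to the root and requests the missing child; once that child is resident, the next frame refines without ever blocking on disk.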
Out-of-Core HLOD Multithreaded – Disk reads – Decompression, normal generation, etc Cache management – Prioritize reads, e.g. distance from viewer – Replacement policy Skeleton in memory? – BV, error metric, parent – child relationships
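The two cache-management bullets above can be sketched with standard-library containers: a heap that orders pending reads by distance from the viewer, and a least-recently-used (LRU) replacement policy. LRU is one possible policy chosen here for illustration; the talk does not name one. Class and function names are hypothetical.

```python
import heapq
from collections import OrderedDict

def prioritize_reads(requests):
    """requests: iterable of (distance_to_viewer, node_name). Nearest first."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

class NodeCache:
    """Fixed-capacity geometry cache with LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._nodes = OrderedDict()

    def get(self, key):
        if key not in self._nodes:
            return None                      # cache miss
        self._nodes.move_to_end(key)         # mark as most recently used
        return self._nodes[key]

    def put(self, key, data):
        """Insert a node; return the keys evicted to make room."""
        self._nodes[key] = data
        self._nodes.move_to_end(key)
        evicted = []
        while len(self._nodes) > self.capacity:
            old_key, _ = self._nodes.popitem(last=False)
            evicted.append(old_key)
        return evicted

order = prioritize_reads([(50.0, "b"), (10.0, "a"), (30.0, "c")])
cache = NodeCache(capacity=2)
cache.put("a", "mesh-a")
cache.put("b", "mesh-b")
cache.get("a")                               # "a" is now more recent than "b"
evicted = cache.put("c", "mesh-c")
print(order, evicted)
```

A production cache would typically budget by bytes rather than node count and avoid evicting nodes needed for the current frame.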
Out-of-Core Prefetching Reduce geometry cache misses Predict and load required nodes
Out-of-Core Prefetching Predict camera position [Correa03] v v’ f f’
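One simple way to predict a camera position is linear extrapolation from the last two frames. This sketch is only an illustration of the idea and is not claimed to be the method of [Correa03]; the function names and the constant-velocity assumption are mine.

```python
def estimate_velocity(previous_position, current_position, frame_seconds):
    """Finite-difference velocity from two consecutive camera positions."""
    return tuple((c - p) / frame_seconds
                 for p, c in zip(previous_position, current_position))

def predict_position(position, velocity, lookahead_seconds):
    """Extrapolate the camera position assuming constant velocity."""
    return tuple(p + v * lookahead_seconds
                 for p, v in zip(position, velocity))

velocity = estimate_velocity((0.0, 0.0, 0.0), (1.0, 0.0, 2.0), 0.5)
predicted = predict_position((1.0, 0.0, 2.0), velocity, 0.25)
print(velocity, predicted)
```

Nodes visible from the predicted position can then be requested ahead of time, so they are more likely to be resident when the camera arrives.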
Out-of-Core Prefetching [Varadhan02] – Coherence of Front – Prefetch ascendants/descendants – Prefetch with enlarged view frustum – Prioritize
Compression “Size is Speed” Geometric – Vertices, Indices – I/O and Rendering Performance Texture – Performance or Quality Figure labels: Render, Disk, De/re-compress
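As one example of geometric compression, vertex coordinates can be quantized to 16-bit integers relative to a bounding range, halving the size of 32-bit floats. Quantization is a common technique picked here for illustration; the talk does not say which scheme it has in mind, and the function names are hypothetical.

```python
def quantize(value, lo, hi, bits=16):
    """Map a float in [lo, hi] to an integer in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    return round((value - lo) / (hi - lo) * levels)

def dequantize(q, lo, hi, bits=16):
    """Recover an approximate float from its quantized integer."""
    levels = (1 << bits) - 1
    return lo + q / levels * (hi - lo)

lo, hi = 0.0, 10.0           # hypothetical bounding range along one axis
q = quantize(3.7, lo, hi)
restored = dequantize(q, lo, hi)
print(q, restored, abs(restored - 3.7))
```

The round-trip error is bounded by half a quantization step, (hi - lo) / (2**bits - 1) / 2, so tighter bounding ranges per node give better precision for the same bit count.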
Cache Coherent Layouts Reorder vertices and indices to maximize GPU cache hits. Figure labels: GPU Main Memory, Pre VS Cache, Vertex Shader, Post VS Cache, Primitive Assembly; Reorder Triangles, Reorder Vertices
Cache Coherent Layouts Minimize ACMR – Average Cache Miss Ratio Cache Oblivious [Yoon05] Linear Time [Sander07]
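ACMR can be measured by replaying an index buffer through a simulated post-vertex-shader cache and dividing the misses by the triangle count. This sketch assumes a FIFO cache of 16 entries; real hardware cache sizes and policies vary, and the function name is illustrative.

```python
from collections import deque

def acmr(indices, cache_size=16):
    """Cache misses per triangle for an indexed triangle list."""
    cache = deque(maxlen=cache_size)   # FIFO: oldest entry drops out when full
    misses = 0
    for index in indices:
        if index not in cache:
            misses += 1
            cache.append(index)
    return misses / (len(indices) // 3)

# Four triangles laid out like a strip, sharing vertices with their neighbors.
strip_like = [0, 1, 2,  2, 1, 3,  2, 3, 4,  4, 3, 5]
# Two triangles that share nothing: every vertex is a miss.
unshared = [0, 1, 2,  3, 4, 5]

print(acmr(strip_like), acmr(strip_like, cache_size=1), acmr(unshared))
```

An ACMR of 3.0 means every vertex of every triangle is processed again; reordering triangles so that shared vertices are reused while still cached pushes the ratio down.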
Not Covered Today Dynamic Scenes Clustered backface culling IBR, Mapping Sorting Batching Ray Tracing
Summary Combine culling, LOD, and out-of-core techniques Keep the CPU and GPU busy Exploit Coherence: Spatial and Temporal
For More Information [Gobbetti08] – Technical Strategies for Massive Model Visualization [Luebke01] – A Developer’s Survey of Polygonal Simplification Algorithms My
About Analytical Graphics, Inc COTS software for national security and space professionals Rated best place to work: rd in 2007.
Questions ?