1
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL
Cache-Oblivious Mesh Layouts
Sung-Eui Yoon (1), Peter Lindstrom (2), Valerio Pascucci (2), Dinesh Manocha (1)
(1) University of North Carolina - Chapel Hill
(2) Lawrence Livermore National Laboratory
http://gamma.cs.unc.edu/COL
2
Goal
Compute cache-coherent layouts of polygonal meshes
♦ For geometric processing and visualization
♦ Handle any kind of polygonal model (e.g., irregular geometry)
3
Motivation
High growth rate of computational power of CPUs and GPUs
[Figure: growth rate during 1993 – 2004. Courtesy: http://www.hcibook.com/e3/online/moores-law/]
4
Memory Hierarchies and Caches
[Diagram: CPU or GPU, fast memory or cache, slow memory, and disk, connected by block transfers]
Access time: 10^0 ns (fast memory or cache), 10^2 ns (slow memory), 10^6 ns (disk)
5
Cache-Coherent Layouts
Cache-Aware
♦ Optimized for particular cache parameters (e.g., block size)
Cache-Oblivious
♦ Minimizes data access time without any knowledge of cache parameters
♦ Directly applicable to various hardware and memory hierarchies
6
CAD Model – Double Eagle Tanker Model
82 million triangles
Irregular distribution of geometry
7
Isosurface and Scanned Models
Isosurface: 100M triangles
St. Matthew: 372M triangles
8
Main Contribution
Algorithm to compute cache-oblivious layouts of polygonal meshes
Cache-oblivious metric
Multilevel optimization framework
Applicable to hierarchical representations
9
Live Demo – View-Dependent Rendering (VDR)
GeForce Go 6800 Ultra
Based on multiresolution hierarchy
♦ Dynamically computes simplification
♦ Cache-oblivious layout is used to minimize GPU vertex cache misses
10
Related Work
Cache-coherent algorithms
Mesh layouts
11
Cache-Coherent Algorithms
Cache-aware [Coleman and McKinley 95, Vitter 01, Sen et al. 02]
Cache-oblivious [Frigo et al. 99, Arge et al. 04]
Focus on specific problems such as sorting and linear algebra computations
12
Mesh Layouts
Rendering sequences
♦ Triangle strips
♦ [Deering 95, Hoppe 99, Bogomjakov and Gotsman 02]
Processing sequences
♦ [Isenburg and Gumhold 03, Isenburg and Lindstrom 04]
Assume that access pattern globally follows the layout order!
13
Mesh Layouts
Space-filling curves
♦ [Sagan 94, Velho and Gomes 91, Pascucci and Frank 01, Lindstrom and Pascucci 01, Gopi and Eppstein 04]
Assume geometric regularity!
14
Outline
Overview
Cache-oblivious metric
Results
15
Outline
Overview
Cache-oblivious metric
Results
16
Overview
Multilevel optimization
Cache-oblivious metric
Local permutations
[Figure: input graph on vertices v_a, v_b, v_c, v_d and the resulting 1D layout]
17
Graph-based Representation
Undirected graph, G = (V, E)
♦ Represents access patterns of applications
Vertex
♦ Data element (e.g., mesh vertex or mesh triangle)
Edge
♦ Connects two vertices if they are likely to be accessed sequentially
[Figure: example graph on vertices v_a, v_b, v_c, v_d]
18
Problem Statement
Vertex layout of G = (V, E)
♦ One-to-one mapping of vertices to indices in the 1D layout
Compute a layout that minimizes the expected number of cache misses
[Figure: graph on vertices v_a, v_b, v_c, v_d mapped to layout indices 1, 2, 3, 4]
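A minimal sketch of the layout definition above, not taken from the talk: a layout is stored as an array that maps each vertex id to its position in the 1D order. The names `Layout`, `is_valid_layout`, and `vertices_in_layout_order` are hypothetical, and positions are 0-based here although the slide numbers them from 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// position[v] is the index of vertex v in the 1D layout (0-based in this sketch).
using Layout = std::vector<std::size_t>;

// Returns true if the layout is a one-to-one mapping onto 0..n-1,
// i.e. every position is in range and used exactly once.
bool is_valid_layout(const Layout& position) {
    std::vector<bool> used(position.size(), false);
    for (std::size_t p : position) {
        if (p >= position.size() || used[p]) return false;
        used[p] = true;
    }
    return true;
}

// Inverts a valid layout: result[p] is the vertex stored at position p.
std::vector<std::size_t> vertices_in_layout_order(const Layout& position) {
    assert(is_valid_layout(position));
    std::vector<std::size_t> order(position.size());
    for (std::size_t v = 0; v < position.size(); ++v) order[position[v]] = v;
    return order;
}
```

Storing both directions of the mapping is a convenience: the position array answers "where is this vertex?", and the inverse gives the order in which the data would be written to memory or disk.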
19
Local Permutation
[Figure: vertex layout]
20
Terminology
Edge span of (v_a, v_b)
Layout mapping
21
Terminology
♦ Set of edges having edge span i in the layout
22
Terminology
Edge span distribution
♦ Number of edges having each edge span i, where i is in [1, n]
[Figure: histogram of number of edges versus edge span]
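The terminology above can be sketched in a few lines, assuming the edge span of an edge is the absolute difference between the layout positions of its two endpoints. The formulas themselves did not survive in this transcript, so treat that definition and the names `Edge`, `edge_span`, and `edge_span_distribution` as assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;  // two vertex ids

// Edge span: distance between the layout positions of the two endpoints.
std::size_t edge_span(const std::vector<std::size_t>& position, const Edge& e) {
    const std::size_t a = position[e.first];
    const std::size_t b = position[e.second];
    return a > b ? a - b : b - a;
}

// Edge span distribution: counts[i] is the number of edges with span i.
// Index 0 stays empty as long as no edge joins a vertex to itself.
std::vector<std::size_t> edge_span_distribution(
        const std::vector<std::size_t>& position,
        const std::vector<Edge>& edges) {
    std::vector<std::size_t> counts(position.size(), 0);
    for (const Edge& e : edges) {
        assert(e.first < position.size() && e.second < position.size());
        ++counts[edge_span(position, e)];
    }
    return counts;
}
```

For four vertices in the order 0, 1, 2, 3 with edges (0,1), (1,2), (2,3), (0,3), and (0,2), the distribution has three edges of span 1, one of span 2, and one of span 3.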
23
Cache Miss Ratio Function (CMRF)
Probability of a cache miss for a given edge span i
[Plot: cache miss ratio (0 to 1) versus edge span (1 to n-1); cache miss ratio = probability of having a cache miss]
24
Number of Cache Misses at Runtime
Estimated by multiplying two factors
♦ Runtime edge span distribution
♦ CMRF
[Figure: 1D layout with edges of span 2, 4, and 2, whose contributions are summed]
25
Number of Cache Misses at Runtime
[Figure: 1D layout with edges of span 2, 4, and 2, whose contributions are summed; the two factors are labeled as the runtime edge span distribution and the CMRF]
26
Expected Number of Cache Misses
♦ Approximate the runtime edge span distribution with that of the layout
[Equation annotations: edge span distribution of the layout; the number of vertices]
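As a rough illustration of "multiplying two factors", the sketch below sums, over every edge span i, the number of layout edges with that span times the miss probability for that span. The exact equation is missing from this transcript, so the unnormalized sum, the function name `expected_cache_misses`, and the callable `cmrf` are assumptions, not the authors' formula.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// counts[i] = number of edges with span i in the layout.
// cmrf(i)   = assumed probability of a cache miss when crossing an edge of span i.
// Returns the sum over all spans of counts[i] * cmrf(i).
double expected_cache_misses(const std::vector<std::size_t>& counts,
                             const std::function<double(std::size_t)>& cmrf) {
    double total = 0.0;
    for (std::size_t i = 1; i < counts.size(); ++i) {
        total += static_cast<double>(counts[i]) * cmrf(i);
    }
    return total;
}
```

With three edges of span 1, one of span 2, and one of span 3, a step-shaped CMRF that misses only for spans of 2 or more yields 2 expected misses, while a constant CMRF of 0.5 yields 2.5.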
27
Outline
Overview
Cache-oblivious metric
Results
28
Cache-Oblivious Metric
Decides if a local permutation reduces the number of cache misses
♦ Probabilistic formulation
♦ Reduces to geometric volume computation
29
Does a Local Permutation Decrease Cache Misses?
30
Does a Local Permutation Decrease Cache Misses?
31
Monotonicity of CMRF
Assume CMRF is a monotonically increasing function of edge span
[Plot: cache miss ratio (0 to 1) versus edge span (1 to ∞)]
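One concrete function with this shape, offered only as an illustration and not as the talk's model: if the layout is cut into blocks of B consecutive elements at a uniformly random offset, two elements that are i positions apart land in different blocks with probability min(i/B, 1). The block size parameter and both function names below are hypothetical. Note that this example is non-decreasing rather than strictly increasing, since it saturates at 1.

```cpp
#include <algorithm>
#include <cstddef>

// Probability that a block boundary separates two elements `span` positions
// apart, for blocks of `block_size` elements placed at a random offset.
double block_crossing_probability(std::size_t span, std::size_t block_size) {
    return std::min(1.0, static_cast<double>(span) /
                             static_cast<double>(block_size));
}

// Checks that the probability never decreases as the span grows from 1 to max_span.
bool is_monotone_nondecreasing(std::size_t max_span, std::size_t block_size) {
    for (std::size_t i = 1; i < max_span; ++i) {
        if (block_crossing_probability(i + 1, block_size) <
            block_crossing_probability(i, block_size)) {
            return false;
        }
    }
    return true;
}
```

A cache-oblivious method does not get to know the block size, which is why the slides rely only on the monotone shape and not on a specific function like this one.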
32
Exact Cache-Oblivious Metric
[Equation annotations: all the possible cache configurations; monotonicity of CMRF]
33
Geometric Formulation
[Figure: half hyperspace and closed hyperspace, each drawn over axes p_1 and p_2]
34
Geometric Volume Computation
Assume each CMRF to be equally likely
Half hyperspace (blue area)
♦ Space of CMRFs that reduce cache misses
[Figure: half hyperspace over axes p_1 and p_2]
35
Geometric Volume Computation
Time complexity
♦ Exact: [Lasserre and Zeron 01]
♦ Approximate: [Kannan et al. 97]
36
Fast and Approximate Volume Comparison
Define a top polytope in closed hyperspace
Compute the centroid, C, of the top polytope
[Figure: top polytope and its centroid, C, over axes p_1 and p_2]
37
Fast and Approximate Volume Comparison
Use the centroid for approximate volume comparison
♦ The volume containing the centroid is likely to be larger
[Figure: centroid, C, over axes p_1 and p_2]
38
Bound of Approximation
0.1% ~ 0.3% compared to the exact metric
39
Final Approximate Metric
[Equation annotations: centroid; pack non-zero to 1,…, m]
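The final equation is not recoverable from this transcript, so the following is one plausible reading of the centroid test, built on stated assumptions. Let d_i be the change in the number of edges with span i caused by a local permutation. Keep only the non-zero d_i in increasing span order (packed to 1, …, m), take the region 0 ≤ p_1 ≤ … ≤ p_m ≤ 1 as the space of monotone CMRFs, and use its centroid (1/(m+1), …, m/(m+1)) as a representative CMRF. The permutation is accepted when the dot product of d and the centroid is negative. The function name and the choice of region are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// spans_before / spans_after: edge spans of the edges touched by a local
// permutation, measured before and after it is applied.
// Returns true if the centroid CMRF predicts fewer cache misses afterwards.
bool permutation_reduces_misses(const std::vector<std::size_t>& spans_before,
                                const std::vector<std::size_t>& spans_after) {
    std::map<std::size_t, long> delta;  // span -> change in edge count
    for (std::size_t s : spans_before) --delta[s];
    for (std::size_t s : spans_after) ++delta[s];

    // Pack the non-zero changes to indices 1..m, in increasing span order.
    std::vector<long> packed;
    for (const auto& entry : delta) {
        if (entry.second != 0) packed.push_back(entry.second);
    }

    // Dot product with the assumed centroid (1/(m+1), ..., m/(m+1)).
    const double m = static_cast<double>(packed.size());
    double change = 0.0;
    for (std::size_t k = 0; k < packed.size(); ++k) {
        change += static_cast<double>(packed[k]) *
                  static_cast<double>(k + 1) / (m + 1.0);
    }
    return change < 0.0;
}
```

Replacing an edge of span 4 with an edge of span 1 is accepted, the reverse is rejected, and a permutation that leaves the span distribution unchanged is not counted as an improvement.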
40
Layout Optimization
Find an optimal layout that minimizes our metric
♦ Combinatorial optimization problem
41
Multilevel Minimization
Step 1: Coarsening
42
Multilevel Minimization
Step 2: Ordering of coarsest graph
43
Multilevel Minimization
Step 3: Refinement and local optimization
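The slides give only the step names, so this sketch covers just the local optimization part of Step 3, and it swaps in a simpler cost: the total edge span, used here as a stand-in for the cache-oblivious metric. It tries every permutation of a small window of consecutive layout positions and keeps the cheapest one. The names `total_edge_span` and `optimize_window` are hypothetical, and coarsening and refinement are not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;  // two vertex ids

// Stand-in cost: sum of edge spans when vertices are stored in `order`
// (order[p] is the vertex at position p).
std::size_t total_edge_span(const std::vector<std::size_t>& order,
                            const std::vector<Edge>& edges) {
    std::vector<std::size_t> position(order.size());
    for (std::size_t p = 0; p < order.size(); ++p) position[order[p]] = p;
    std::size_t total = 0;
    for (const Edge& e : edges) {
        const std::size_t a = position[e.first];
        const std::size_t b = position[e.second];
        total += a > b ? a - b : b - a;
    }
    return total;
}

// Local permutation: exhaustively reorders the vertices at positions
// [begin, begin + window) and keeps the arrangement with the lowest cost.
void optimize_window(std::vector<std::size_t>& order,
                     const std::vector<Edge>& edges,
                     std::size_t begin, std::size_t window) {
    const std::size_t end = std::min(order.size(), begin + window);
    if (begin >= end) return;
    const auto first = order.begin() + static_cast<std::ptrdiff_t>(begin);
    const auto last = order.begin() + static_cast<std::ptrdiff_t>(end);

    std::vector<std::size_t> best = order;
    std::size_t best_cost = total_edge_span(order, edges);
    std::sort(first, last);  // start the enumeration from the smallest permutation
    do {
        const std::size_t cost = total_edge_span(order, edges);
        if (cost < best_cost) {
            best_cost = cost;
            best = order;
        }
    } while (std::next_permutation(first, last));
    order = best;
}
```

Exhaustive search is only practical for small windows, since a window of k vertices has k! arrangements. Solving a small problem exactly and refining the result at finer levels is the usual reason for a multilevel scheme.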
44
Outline
Overview
Cache-oblivious layouts
Results
45
Layout Computation Time
Processes 70 million vertices per hour
♦ Takes 2.6 hours to lay out St. Matthew model (372 million triangles)
♦ 2.4 GHz Pentium 4 PC with 1 GB of main memory
46
Edge Span Distributions of Different Layouts
[Plot: number of edges versus edge span for the cache-oblivious layout, spectral layout, and original layout]
47
Applications
View-dependent rendering
Collision detection
Isocontour extraction
48
View-Dependent Rendering
Lay out vertices and triangles of CHPM [Yoon et al. 04]
♦ Reduce misses of GPU vertex cache
49
View-Dependent Rendering
Model                 # of Tri.   Our layout   Simplification layout [Yoon et al. 04]
St. Matthew           372M        106 M/s      23 M/s
Isosurface            100M        90 M/s       20 M/s
Double Eagle Tanker   82M         47 M/s       22 M/s
Annotated speedups: 4.5X, 2.1X
Peak performance: 145 M tri / s on GeForce 6800 Ultra
50
Realtime Captured Video – St. Matthew Model
51
Comparison with Other Rendering Sequences
[Plot: our layout versus universal rendering sequences [Bogomjakov and Gotsman 2002]]
52
Comparison with Other Rendering Sequences
[Plot: our layout (optimized for no particular cache size) versus [Hoppe 99] (optimized for 16 vertex cache size with FIFO replacement)]
53
Performance during View-Dependent Rendering
[Plot labels: our layout; [Hoppe 99]; optimized for various resolutions; optimized for full resolution]
54
Comparison with Space Filling Curve on Power Plant Model
[Plot: our layout versus space filling curve (Z-curve)]
55
Collision Detection
Bounding volume hierarchies
♦ Widely used to accelerate collision detection
♦ Traversed to find contacting area
♦ Uses pre-computed layouts of OBB trees [Gottschalk et al. 96]
56
Rigid Body Simulation
57
Collision Detection Time
2X improvement on average
[Plot: depth-first layout versus cache-oblivious layout]
58
Isocontour Extraction
Contour tree [van Kreveld et al. 97]
Use mesh as the input graph
Extract an isocontour that is orthogonal to z-axis
[Figure: Puget Sound, 134 M triangles; isocontour z(x,y) = 500m]
59
Comparison – First Extraction of Z(x,y) = 500m
Relative performance over Z-axis sorted layout
Nearly optimized for particular isocontour
[Bar chart values: 2, 21, 13, 1]
Disk access time is bottleneck
60
Comparison – Second Extraction of Z(x,y) = 500m
Relative performance over Z-axis sorted layout
[Bar chart values: 2, 21, 13, 379, 212, 1, 0.8]
Memory and L1/L2 cache access times are bottleneck
61
Limitations
Assumptions on CMRF
♦ May not work well for all applications
Does not compute global optimum
♦ Greedy solution
62
Advantages
General
♦ Applicable to all kinds of polygonal models
♦ Works well for various applications
Cache-oblivious
♦ Can benefit all levels, from CPU/GPU cache to memory and disk
No modification of runtime application
♦ Only layout computation
63
OpenCCL: Cache-Coherent Layouts of Graphs and Meshes
Source code for computing a cache-coherent layout
Easy to use
    // Graph with NumVertex vertices
    CLayoutGraph Graph (NumVertex);
    // One edge per pair of connected vertices (here: a triangle on vertices 0, 1, 2)
    Graph.AddEdge (0, 1);
    Graph.AddEdge (0, 2);
    Graph.AddEdge (1, 2);
    // Receives the computed ordering
    int Order [NumVertex];
    Graph.ComputeOrdering (Order);
[Figure: triangle with vertices 0, 1, 2]
Google “Cache Oblivious Mesh Layout” or http://gamma.cs.unc.edu/COL
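Once an ordering has been computed, the mesh data has to be rewritten in that order. The sketch below shows one way to do this. It assumes that `order[k]` holds the old index of the vertex that should be stored at new position k; the slide does not say which direction OpenCCL's `Order` array uses, so check the library before relying on this. The types `Vertex` and `Triangle` and the function `apply_vertex_order` are hypothetical and not part of OpenCCL.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vertex { float x, y, z; };
using Triangle = std::array<int, 3>;  // three vertex indices

// Assumption: order[k] is the old index of the vertex to store at position k.
// Moves the vertices into the new order and rewrites the triangle indices
// so that each triangle still refers to the same three vertices.
void apply_vertex_order(const std::vector<int>& order,
                        std::vector<Vertex>& vertices,
                        std::vector<Triangle>& triangles) {
    assert(order.size() == vertices.size());
    std::vector<Vertex> reordered(vertices.size());
    std::vector<int> new_index(vertices.size());
    for (std::size_t k = 0; k < order.size(); ++k) {
        const std::size_t old = static_cast<std::size_t>(order[k]);
        reordered[k] = vertices[old];
        new_index[old] = static_cast<int>(k);
    }
    for (Triangle& t : triangles) {
        for (int& i : t) i = new_index[static_cast<std::size_t>(i)];
    }
    vertices.swap(reordered);
}
```

Because only the stored order changes, the runtime application can keep reading the mesh exactly as before, which matches the "no modification of runtime application" point made earlier in the talk.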
64
Conclusion
Novel algorithm for computing cache-oblivious mesh layouts
♦ Cast the problem as an optimization
♦ Probabilistically compute the expected number of cache misses
♦ Achieve significant improvements (2 to 20X) without modifying runtime applications
65
Ongoing and Future Work
Apply to other applications
♦ Simplification and approximate collision detection [Yoon et al. 04]
♦ Shortest path computation, etc.
Investigate optimality
66
Ongoing and Future Work
Cache-Oblivious Layouts of Bounding Volume Hierarchies [Yoon and Manocha 05]
♦ Tech. Report, University of North Carolina at Chapel Hill
67
Acknowledgements
Anonymous donor
♦ Power plant model
Digital Michelangelo Project
♦ St. Matthew model at Stanford University
LLNL ASCI VIEWS
♦ Isosurface model
Newport News Shipbuilding
♦ Double Eagle tanker
68
Acknowledgements
Army Research Office
DARPA
Intel Corporation
Lawrence Livermore Nat’l Lab.
National Science Foundation
Office of Naval Research
RDECOM
69
Acknowledgements
Martin Isenburg
Dawoon Jung
Brandon Lloyd
Elise London
Brian Salomon
Avneesh Sud
70
Questions?
Project URL: http://gamma.cs.unc.edu/COL