1
Optimization of Mesh Locality for Transparent Vertex Caching
Hugues Hoppe, Microsoft Research
SIGGRAPH 99
2
Triangle meshes
3
System architecture
[Diagram labels: mesh in memory (geometry: vertices, faces; texture image), CPU with L1 and L2 caches, bus (e.g. AGP) marked as the bottleneck, graphics processor (geometric processing, rasterization, texture cache), frame buffer, and a box marked "vertices ??"]
4
Previous work
- 16-entry FIFO buffer [Deering95, Chow97]
- stack buffer [BarYehuda-Gotsman96]
- mesh compression [Taubin-Rossignac98] [Gumhold-Strasser98] …
[Diagram labels: compressed geometry stream (shown as v1 c v2-v1 c v3-v2 c v4-v3 c v5-v4 c c c), bus, parsing logic, mesh buffer, geometric processing, graphics processor]
5
Previous work
Drawbacks:
  - only static geometry
  - new API
  - not backward compatible
[Diagram labels: compressed geometry stream, bus, parsing logic, mesh buffer, geometric processing, graphics processor]
6
Our approach
- Optimize ordering!
- No explicit cache management
[Diagram labels: traditional mesh API (vertex array, indexed strips), bus, graphics processor (vertex cache, geometric processing, rasterization, texture cache), texture image]
7
Transparent vertex caching
- Pros:
  - animated geometry
  - application program unchanged
  - backward compatible on legacy hardware
- Cons:
  - less compression (but still a factor ~2)
[Diagram labels: application, traditional mesh API (vertex array, indexed strips), graphics system]
8
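Indexed triangle strips
- Vertex array v1 to v7; each vertex record holds position (x y z), normal (nx ny nz), color (rgba), texture1 (u v), texture2 (u v): ~32 bytes
- Each strip entry is a vertex index: ~2 bytes
[Figure: a mesh with vertices numbered 1 to 7, shown with its vertex array and strip index list]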
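As an illustration of what an indexed strip encodes, here is a minimal Python sketch that expands one strip into triangles. The function name `strip_to_triangles` and the winding convention (every other triangle flipped, as in common graphics APIs) are assumptions for this example and do not come from the slides.

```python
def strip_to_triangles(strip):
    """Expand one indexed triangle strip into a list of index triples.

    Each index after the first two adds one triangle, so a strip with
    n + 2 indices describes n triangles.
    """
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue  # skip degenerate triangles (repeated index)
        # Flip every other triangle so all triangles keep the same winding.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles


print(strip_to_triangles([1, 2, 3, 4, 5]))
```

Because each additional triangle costs only one ~2-byte index, the ~32-byte vertex records dominate the data sent over the bus unless they can be reused.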
9
Cache parameters
- size?
- replacement policy?
Vertex cache: 16 entries, FIFO
10
Vertex data access
- traditional strips: transfer ~1.0 vertex/tri
- with caching: transfer ~0.5 vertex/tri
[Figure legend: cache hit, cache miss, "assume in cache"]
11
Vertex data access
- traditional strips: transfer ~1.0 vertex/tri
- with caching: transfer ~0.5 vertex/tri
[Figure legend: # misses: 0, 1, 2, 3]
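The miss counts in these figures can be reproduced with a small cache simulation. The sketch below assumes a FIFO cache in which a hit does not change the eviction order; the name `count_fifo_misses` is hypothetical.

```python
from collections import deque


def count_fifo_misses(indices, cache_size=16):
    """Count vertex-cache misses for an index stream under a FIFO policy."""
    cache = deque()  # oldest entry on the left
    misses = 0
    for v in indices:
        if v in cache:
            continue  # hit: FIFO order is not updated
        misses += 1
        cache.append(v)
        if len(cache) > cache_size:
            cache.popleft()  # evict the oldest entry
    return misses


# A strip whose indices are all distinct misses on every index,
# i.e. roughly one vertex transferred per triangle.
print(count_fifo_misses(list(range(10))))
```

Dividing the miss count by the number of triangles gives the vertices-per-triangle figure quoted on the slide.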
12
Example
- before optimization
- after optimization: 47% bandwidth reduction
[Figure legend: # misses: 0, 1, 2, 3]
13
Optimization problem
Given mesh, find strips minimizing bus bandwidth (strips correspond to ordering of faces F)
Bandwidth depends on: cache miss rate, # vertex indices
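One way to read this objective is as a byte count combining the two terms. The following sketch assumes the approximate sizes from the "Indexed triangle strips" slide (32 bytes per vertex, 2 bytes per index), a FIFO cache, and no extra cost for restarting a strip; the exact cost function used in the talk is not given on the slide, and the names here are hypothetical.

```python
from collections import deque

VERTEX_BYTES = 32  # assumed approximate vertex record size
INDEX_BYTES = 2    # assumed approximate index size


def bandwidth_bytes(strips, cache_size=16):
    """Estimate bus traffic for a list of indexed strips.

    Every index is sent; a full vertex record is sent only on a cache miss.
    The cache persists across strips.
    """
    cache = deque(maxlen=cache_size)  # appending to a full deque drops the oldest
    misses = 0
    num_indices = 0
    for strip in strips:
        for v in strip:
            num_indices += 1
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses * VERTEX_BYTES + num_indices * INDEX_BYTES


strips = [[0, 1, 2, 3], [2, 3, 4, 5]]  # 4 triangles in total
print(bandwidth_bytes(strips) / 4, "bytes/triangle")
```

Different face orderings of the same mesh produce different strips and therefore different values of this estimate, which is what the optimization tries to minimize.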
14
Two reordering techniques
- Greedy strip-growing
  - fast: 40,000 faces/sec
- Local optimization
  - improve initial greedy solution
  - very slow
15
Greedy strip-growing
- Inspired by [Chow97]
- To decide when to restart strip, perform lookahead cache simulations
[Figure: labels 1, 2, 3, 4]
16
When to restart strip?
- good strip length
[Figure: numbered cache entries 1 to 4 shown along the strip (cache size 4)]
17
When to restart strip?
- good strip length
- strip too long: jump in miss rate!
[Figure: numbered cache entries 1 to 4 shown along both strips (cache size 4)]
18
Lookahead simulations
- Perform s simulations:
  (a) restart immediately, after 0 faces
  (b) restart after 0 < i < s faces
- If (a) is best, restart strip
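The restart decision can be sketched as follows. This is a simplified illustration, not the talk's implementation: it assumes the caller has already enumerated the face sequences for each option (`candidates[0]` for an immediate restart, `candidates[i]` for restarting after i more faces), treats each face as a triple of vertex indices, uses a FIFO cache, and restarts only when option (a) is strictly best. All names are hypothetical.

```python
from collections import deque


def simulated_misses(faces, cache, cache_size):
    """Simulate a FIFO cache over a face sequence without modifying `cache`."""
    sim = deque(cache, maxlen=cache_size)  # work on a copy of the cache state
    misses = 0
    for face in faces:
        for v in face:
            if v not in sim:
                misses += 1
                sim.append(v)
    return misses


def should_restart(cache, candidates, cache_size=16):
    """Return True if restarting immediately has the lowest miss rate."""
    rates = [simulated_misses(seq, cache, cache_size) / len(seq)
             for seq in candidates]
    # Assumption: ties are resolved in favor of continuing the strip.
    return all(rates[0] < r for r in rates[1:])


cache = [1, 2, 3, 4]
restart_now = [(3, 4, 5), (4, 5, 6)]         # reuses cached vertices
restart_later = [(10, 11, 12), (3, 4, 5)]    # flushes the cache first
print(should_restart(cache, [restart_now, restart_later], cache_size=4))
```

Comparing miss rates rather than raw miss counts lets options that cover different numbers of faces be compared on the same footing.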
19
Result
traditional long strips
[Figure legend: face order within strip, strip restart]
20
Result
[Figure: traditional long strips vs. greedy strip-growing]
21
Result
- before: 45.8 bytes/triangle
- after: 25.5 bytes/triangle
22
Local optimization
Apply perturbations to face ordering if cost is lowered:
- Initial order: F = F_{1..x-1} F_{x..y} F_{y+1..m}
- F' = Reflect_{x,y}(F) = F_{1..x-1} F_{y..x} F_{y+1..m}
- F' = Insert1_{x,y}(F) = F_{1..x-1} F_y F_{x..y-1} F_{y+1..m}
- F' = Insert2_{x,y}(F) = F_{1..x-1} F_{y-1..y} F_{x..y-2} F_{y+1..m}
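The three perturbations are easy to state on a Python list. In this sketch x and y are 1-based and inclusive, as on the slide, and `cost` is a hypothetical caller-supplied function (for example a bandwidth estimate); the acceptance helper only illustrates the "if cost is lowered" rule and is not the talk's search procedure.

```python
def reflect(F, x, y):
    """Reverse the run F_x..F_y."""
    return F[:x - 1] + F[x - 1:y][::-1] + F[y:]


def insert1(F, x, y):
    """Move the single face F_y in front of F_x."""
    return F[:x - 1] + [F[y - 1]] + F[x - 1:y - 1] + F[y:]


def insert2(F, x, y):
    """Move the pair F_{y-1}, F_y in front of F_x (requires y > x)."""
    return F[:x - 1] + F[y - 2:y] + F[x - 1:y - 2] + F[y:]


def accept_if_better(F, perturb, x, y, cost):
    """Keep the perturbed ordering only if it lowers the cost."""
    candidate = perturb(F, x, y)
    return candidate if cost(candidate) < cost(F) else F


F = list("abcdefg")
print(reflect(F, 2, 5))  # run b..e reversed
print(insert1(F, 2, 5))  # e moved before b
print(insert2(F, 2, 5))  # d, e moved before b
```

Each operation returns a new list, so a rejected perturbation leaves the current ordering untouched.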
23
Result
- greedy strip-growing: 25.5 bytes/triangle
- local optimization: 24.2 bytes/triangle
24
Bandwidth results
Improvement by a factor of 1.6 to 1.9
25
Choice of cache size
Size 16 is sufficient for most of the gain
26
Cache replacement policy
FIFO vs. LRU: all is OK
[Figure: numbered cache entries 1 to 4 shown along the strip for each policy (cache size 4)]
27
Cache replacement policy
FIFO vs. LRU: strips twice as long
[Figure: numbered cache entries 1 to 4 shown along the longer strips for each policy (cache size 4)]
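To experiment with the two policies, both can be simulated on the same index stream. This is a minimal sketch with hypothetical function names; the short stream in the usage lines is a toy input chosen only to show that the policies can disagree, not a mesh, so it says nothing about which policy suits strip-ordered geometry.

```python
from collections import OrderedDict, deque


def fifo_misses(indices, size):
    """FIFO: evict the entry that was inserted first; hits change nothing."""
    cache = deque(maxlen=size)
    misses = 0
    for v in indices:
        if v not in cache:
            misses += 1
            cache.append(v)
    return misses


def lru_misses(indices, size):
    """LRU: evict the entry that was referenced least recently."""
    cache = OrderedDict()
    misses = 0
    for v in indices:
        if v in cache:
            cache.move_to_end(v)  # a hit refreshes the entry
        else:
            misses += 1
            cache[v] = True
            if len(cache) > size:
                cache.popitem(last=False)  # drop the least recently used
    return misses


stream = [1, 2, 3, 1, 4, 1]
print(fifo_misses(stream, 3), lru_misses(stream, 3))
```

The only difference between the two functions is whether a hit reorders the cache, which is what changes the eviction pattern along a strip.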
28
Comparison
[Figure: FIFO vs. LRU]
29
Comparison
[Figure: FIFO vs. LRU]
30
Summary
- Vertex caching reduces geometry bandwidth by factor of 1.6 to 1.9
- Transparent to application: simply pre-process the models (fast)
- Still efficient on legacy hardware
- Supports dynamic geometry
31
Future work
- Issue of cache size
  - Find face ordering good for all sizes?
  - Standardize on size 16?
  - Reprocess mesh at load time
- Interaction with texture caching
- Cache efficiency during runtime LOD