Download presentation
Presentation is loading. Please wait.
Published byJuniper Evans Modified over 9 years ago
1
Maximizing Parallelism in the Construction of BVHs, Octrees, and k-d Trees Tero Karras NVIDIA Research
2
Trees 2
3
Trees Path tracing Pharr & Humphreys Real-time ray tracing NVIDIA Collision detection Ageia Particle simulation NVIDIA Voxel-based global illumination Crassin et al. Surface reconstruction Amenta et al. Photon mapping Uchida BetterFaster 3
4
Outline Fastest existing methods are sequential Parallelize within each hierarchy level But not between levels 4
5
Outline Fastest existing methods are sequential Parallelize within each hierarchy level But not between levels Lack of parallelism Small workloads bottlenecked by top levels Sub-linear scaling of performance 5
6
Outline Novel way to build the entire tree in parallel Two algorithmic “building blocks” Fast, scalable 6
7
Outline Novel way to build the entire tree in parallel Two algorithmic “building blocks” Fast, scalable Main focus: BVHs Point-based octrees and k-d trees also covered in the paper 7
8
Bounding volume hierarchy 8
9
9
10
LBVH - Lauterbach et al. [2009] 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes p p x = 0. p y = 0. p z = 0. 1010 0111 1100 10
11
0 1 0 LBVH - Lauterbach et al. [2009] 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes p p x = 0. p y = 0. p z = 0. 110 011 110 1010 0111 1100 11
12
LBVH - Lauterbach et al. [2009] 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes p p x = 0. p y = 0. p z = 0. 1010 0111 1100 10101100code = 12
13
LBVH - Lauterbach et al. [2009] 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes 13
14
LBVH - Lauterbach et al. [2009] 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes 14
15
Binary radix tree 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 15
16
Binary radix tree 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 0 0 16
17
Binary radix tree 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 223344556677 00 000000 00000000 0 0 0011 17
18
Binary radix tree 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 44 00 000000 000000 0011 00100010 00100010 2233 1 1 11 77 11001100 11001100 5566 18
19
Binary radix tree 1 1 11 00 11001100 11001100 00100010 00100010 000000 000000 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 n n-1 19
20
Longest common prefix 1 1 11 00 11001100 11001100 00100010 00100010 000000 000000 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 20
21
Longest common prefix 00 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100111001 1100111001 66 1111011110 1111011110 77 1100011000 1100011000 55 δ(5,6) = 4 4 11001100 11001100 1100011000 1100011000 55 1100111001 1100111001 66 4 21
22
Longest common prefix 2 00 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100111001 1100111001 66 1111011110 1111011110 77 1100011000 1100011000 55 δ(5,6) = 4 11001100 11001100 δ(0,3) = ? 0000100001 0000100001 00 0010100101 0010100101 33 δ(0,3) = 2 2 4 22
23
Garanzha et al. [2011] Level 31 node Level 2 3 nodes Level 01 node 0001000010 0001000010 0000100001 0000100001 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 Level 12 nodes 1122 00 445533 0011223344556677 66 23
24
Our method 0001000010 0001000010 0000100001 0000100001 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0011223344556677 24
25
Our method Define a numbering scheme for the nodes Gain some knowledge of their identity Establish a connection with the keys 25
26
Our method Define a numbering scheme for the nodes Gain some knowledge of their identity Establish a connection with the keys Find the children of a given node Only look at node index and nearby keys 26
27
Our method Define a numbering scheme for the nodes Gain some knowledge of their identity Establish a connection with the keys Find the children of a given node Only look at node index and nearby keys Do this for all nodes in parallel 27
28
Numbering scheme 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 3344 28
29
Numbering scheme 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 3344 55 29
30
Numbering scheme 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 11 66 11 66 3333 0000 4444 55552222 30
31
Numbering scheme 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 3344 1155 66 22 31
32
Algorithm 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 3344 1155 66 22 32
33
δ(2,3) = 4 Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 0010000100 0010000100 22 0010100101 0010100101 33 0010100101 0010100101 33 1001110011 1001110011 44 δ(3,4) = 0 33 δ(2,3) = 4 33
34
Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 0010100101 0010100101 33 1001110011 1001110011 44 δ(3,4) = 0 33 δ(2,3) = 4 δ(1,3) = 2 δ(0,3) = 2 34
35
Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 33 δ(0,3) = 2 δ(2,3) = 4δ(1,3) = 2 35
36
Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 33 δ(0,3) = 2 2 2 1122 36
37
For each node i=0..n-2 in parallel: 1.Determine direction of the range 2.Expand the range as far as possible 3.Find where to split the range 4.Identify children Algorithm Binary search 37
38
Duplicate keys The algorithm only works with unique keys Duplicates are common in practice 38
39
Duplicate keys The algorithm only works with unique keys Duplicates are common in practice Trick: Augment each key with its index Distinguishes between duplicates Keys are still in lexicographical order 39
40
Duplicate keys The algorithm only works with unique keys Duplicates are common in practice Trick: Augment each key with its index Distinguishes between duplicates Keys are still in lexicographical order Tie-break when evaluating δ(i,j) 40
41
LBVH 1.Assign Morton codes 2.Sort primitives 3.Generate hierarchy 4.Fit bounding boxes 41
42
Lauterbach et al. [2009] 42
43
Our method Need a different approach How many levels are there? Which nodes are located on a given level? 43
44
Our method Need a different approach How many levels are there? Which nodes are located on a given level? Traverse paths in the tree in parallel Start from leaves, advance toward the root Terminate threads using per-node atomic flags 44
45
Our method 45
46
Results Evaluate performance on GTX 480 (Fermi) CUDA, 30-bit Morton codes 46
47
Results Evaluate performance on GTX 480 (Fermi) CUDA, 30-bit Morton codes Compare against Garanzha et al. [2011] Identical tree (top-level SAH splits disabled) 47
48
Results Evaluate performance on GTX 480 (Fermi) CUDA, 30-bit Morton codes Compare against Garanzha et al. [2011] Identical tree (top-level SAH splits disabled) Simulate large GPUs N times as many cores N times the memory bandwidth 48
49
Results 49 Our method Garanzha et al. Fairy Forest 174K triangles milliseconds
50
Results 50 Our method Garanzha et al. Fairy Forest 174K triangles milliseconds
51
Results 51 Our method Turbine Blade 1.77M triangles milliseconds Garanzha et al.
52
Results 52 Stanford Dragon 871K triangles milliseconds Our method milliseconds
53
Acknowledgements Timo Aila Samuli Laine David Luebke Jacopo Pantaleoni Jaakko Lehtinen For helpful suggestions and proofreading. 53
54
Thank You Questions
55
Backup slides
56
Pseudocode 56
57
Results 57
58
Algorithm 0001000010 0001000010 0010000100 0010000100 0010100101 0010100101 1001110011 1001110011 1100011000 1100011000 1100111001 1100111001 1111011110 1111011110 0000100001 0000100001 0011223344556677 00 3344 112255 66 58
59
Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 22 0001000010 0001000010 11 0010000100 0010000100 22 0010000100 0010000100 22 0010100101 0010100101 33 δ(1,2) = 2δ(2,3) = 4 59
60
δ(2,4) = 0 Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 0001000010 0001000010 11 0010000100 0010000100 22 δ(2,3) = 4 δ(1,2) = 2 22 60
61
δ(2,3) = 4 Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 22 61
62
δ(2,3) = 4 Algorithm 0000100001 0000100001 00 0001000010 0001000010 11 0010000100 0010000100 22 0010100101 0010100101 33 1001110011 1001110011 44 1100011000 1100011000 55 1100111001 1100111001 66 1111011110 1111011110 77 22 11 62
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.