© 2012 Elsevier, Inc. All rights reserved.
Chapter 06
Figure 6.1 Placing 2D threads into linear order.
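As a rough illustration of the linear ordering named in Figure 6.1, the sketch below maps a 2D thread index to a linear position with x varying fastest, then to a warp. The helper names `linear_thread_id` and `warp_of` are hypothetical, and the 32-thread warp size is an assumption, not something stated on the slide.

```cpp
#include <cstddef>

// Hypothetical helper: linear position of thread (tx, ty) inside a block
// that is dim_x threads wide. The x index varies fastest.
std::size_t linear_thread_id(std::size_t tx, std::size_t ty, std::size_t dim_x) {
    return ty * dim_x + tx;
}

// Hypothetical helper: warp that a linearized thread belongs to,
// assuming warps are formed from consecutive linear IDs.
std::size_t warp_of(std::size_t linear_id, std::size_t warp_size = 32) {
    return linear_id / warp_size;
}
```

For a 16-wide block, thread (3, 2) lands at linear position 35, which falls into the second warp (index 1) under the 32-thread assumption.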
Figure 6.2 A simple sum reduction kernel.
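The figure itself is not reproduced in this transcript. As a stand-in, here is a host-side C++ emulation of one common form of sum reduction, in which thread `t` is active when `t` is a multiple of `2 * stride`. It is a sketch under assumptions, not the kernel from the slide: the inner loop over `t` plays the role of the threads in one block, each pass of the outer loop plays the role of one barrier-separated step, and the name `reduce_interleaved` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Host-side emulation of an interleaved sum reduction.
// Assumption: partial.size() is a power of two (or zero).
float reduce_interleaved(std::vector<float> partial) {
    const std::size_t n = partial.size();
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        // Each value of t stands in for one thread of the block.
        for (std::size_t t = 0; t < n; ++t) {
            // Only threads whose index is a multiple of 2 * stride do work,
            // so active and idle threads are interleaved.
            if (t % (2 * stride) == 0) {
                partial[t] += partial[t + stride];
            }
        }
    }
    return n ? partial[0] : 0.0f;
}
```

Because the active threads are spread across the whole index range, neighbouring threads take different branches in every step, which is the source of divergence in this scheme.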
Figure 6.3 Execution of the sum reduction kernel.
Figure 6.4 A kernel with less thread divergence.
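One common way to reduce divergence is to keep the active threads contiguous: in each step only threads with `t < stride` do work, and the stride halves each time. The host-side C++ sketch below emulates that scheme and counts how many warps would contain a mix of active and idle threads at a given stride. It is an illustration under assumptions (power-of-two input, 32-thread warps), not the kernel from the slide, and the names `reduce_contiguous` and `divergent_warps` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Host-side emulation of a reduction whose active threads are contiguous.
// Assumption: partial.size() is a power of two (or zero).
float reduce_contiguous(std::vector<float> partial) {
    const std::size_t n = partial.size();
    for (std::size_t stride = n / 2; stride > 0; stride /= 2) {
        // Threads 0 .. stride-1 are active; the rest are idle in this step.
        for (std::size_t t = 0; t < stride; ++t) {
            partial[t] += partial[t + stride];
        }
    }
    return n ? partial[0] : 0.0f;
}

// Counts warps in which some, but not all, threads satisfy t < stride.
std::size_t divergent_warps(std::size_t block_size, std::size_t stride,
                            std::size_t warp_size = 32) {
    std::size_t count = 0;
    for (std::size_t begin = 0; begin < block_size; begin += warp_size) {
        std::size_t end = begin + warp_size;
        if (end > block_size) end = block_size;
        // The warp is mixed only if the boundary falls strictly inside it.
        if (begin < stride && stride < end) ++count;
    }
    return count;
}
```

With a 512-thread block, no warp is mixed while the stride is 32 or larger; a mixed warp appears only once the stride drops below the warp size.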
Figure 6.5 Execution of the revised algorithm.
Figure 6.6 Placing matrix elements into linear order.
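For reference, C stores 2D arrays in row-major order, so element (row, col) of a matrix with `width` columns sits at offset `row * width + col`. The sketch below shows that mapping and its inverse; the function names are hypothetical.

```cpp
#include <cstddef>
#include <utility>

// Offset of element (row, col) in a row-major matrix with `width` columns.
std::size_t row_major_offset(std::size_t row, std::size_t col, std::size_t width) {
    return row * width + col;
}

// Inverse mapping: recover (row, col) from a linear offset.
// Assumption: width is nonzero.
std::pair<std::size_t, std::size_t> row_col_of(std::size_t offset, std::size_t width) {
    return {offset / width, offset % width};
}
```

In a 4-column matrix, element (2, 1) is at offset 9, and offset 9 maps back to row 2, column 1.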
Figure 6.7 Memory access patterns in C 2D arrays for coalescing.
Figure 6.8 A coalesced access pattern.
Figure 6.9 An uncoalesced access pattern.
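The difference between the two patterns can be sketched by listing the offsets that adjacent threads touch in the same loop step of a row-major matrix. In the first helper each thread walks down its own column, so adjacent threads read adjacent words; in the second each thread walks along its own row, so adjacent threads are `width` words apart. All function names here are hypothetical and the code runs on the host purely as an illustration.

```cpp
#include <cstddef>
#include <vector>

// Thread t reads element (k, t): offsets are consecutive across threads.
std::vector<std::size_t> column_walk_offsets(std::size_t width, std::size_t k,
                                             std::size_t threads) {
    std::vector<std::size_t> out;
    for (std::size_t t = 0; t < threads; ++t) out.push_back(k * width + t);
    return out;
}

// Thread t reads element (t, k): offsets are `width` apart across threads.
std::vector<std::size_t> row_walk_offsets(std::size_t width, std::size_t k,
                                          std::size_t threads) {
    std::vector<std::size_t> out;
    for (std::size_t t = 0; t < threads; ++t) out.push_back(t * width + k);
    return out;
}

// True when every offset is exactly one past the previous one.
bool is_contiguous(const std::vector<std::size_t>& offsets) {
    for (std::size_t i = 1; i < offsets.size(); ++i) {
        if (offsets[i] != offsets[i - 1] + 1) return false;
    }
    return true;
}
```

For a 4-wide matrix at step `k = 1`, the column walk touches offsets 4, 5, 6, 7, which hardware can typically combine into one transaction, while the row walk touches 1, 5, 9, 13.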
Figure 6.10 Using shared memory to enable coalescing.
Figure 6.11 Tiled matrix multiplication kernel using shared memory.
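The kernel shown on the slide is not reproduced in this transcript. The host-side C++ sketch below mirrors the general structure of tiled matrix multiplication: for each output tile, a load phase copies one tile of each input into small scratch buffers (standing in for shared memory), and a compute phase accumulates partial dot products from those buffers. It assumes square row-major matrices whose width is a multiple of the tile size; `tiled_matmul`, `Ms`, and `Ns` are illustrative names, and the loops over `ty` and `tx` stand in for the threads of a block.

```cpp
#include <cstddef>
#include <vector>

// Host-side emulation of tiled multiplication P = M * N.
// Assumptions: M and N are width x width, row-major, and width % tile == 0.
std::vector<float> tiled_matmul(const std::vector<float>& M,
                                const std::vector<float>& N,
                                std::size_t width, std::size_t tile) {
    std::vector<float> P(width * width, 0.0f);
    std::vector<float> Ms(tile * tile), Ns(tile * tile);  // scratch tiles
    for (std::size_t by = 0; by < width; by += tile) {
        for (std::size_t bx = 0; bx < width; bx += tile) {
            for (std::size_t ph = 0; ph < width; ph += tile) {
                // Load phase: each (ty, tx) copies one element per input.
                for (std::size_t ty = 0; ty < tile; ++ty)
                    for (std::size_t tx = 0; tx < tile; ++tx) {
                        Ms[ty * tile + tx] = M[(by + ty) * width + ph + tx];
                        Ns[ty * tile + tx] = N[(ph + ty) * width + bx + tx];
                    }
                // Compute phase: accumulate this tile's partial products.
                for (std::size_t ty = 0; ty < tile; ++ty)
                    for (std::size_t tx = 0; tx < tile; ++tx) {
                        float acc = 0.0f;
                        for (std::size_t k = 0; k < tile; ++k)
                            acc += Ms[ty * tile + k] * Ns[k * tile + tx];
                        P[(by + ty) * width + bx + tx] += acc;
                    }
            }
        }
    }
    return P;
}
```

In the load phase, consecutive `tx` values read consecutive offsets of both `M` and `N`, which is the access shape that allows coalescing on a GPU.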
Figure 6.12 Increased thread granularity with rectangular tiles.