© 2012 Elsevier, Inc. All rights reserved.
Chapter 05
Figure 5.1 A simple matrix–matrix multiplication kernel using one thread to compute each d_P element (copied from Figure 4.7).
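The listing for this figure is not reproduced in this transcript. What follows is a minimal sketch of a kernel of the kind the caption describes, where each thread computes one d_P element; it is not the book's figure. The kernel name, the bounds check, the 4 x 4 test size, the 2 x 2 block shape, and the identity-matrix check in main are all assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed kernel name. Each thread computes one element of d_P.
// All matrices are square (Width x Width) and stored row-major.
__global__ void MatrixMulKernel(const float* d_M, const float* d_N,
                                float* d_P, int Width) {
    int Row = blockIdx.y * blockDim.y + threadIdx.y;
    int Col = blockIdx.x * blockDim.x + threadIdx.x;
    if (Row < Width && Col < Width) {   // guard threads outside the matrix
        float Pvalue = 0.0f;
        for (int k = 0; k < Width; ++k) {
            // Every operand is read from global memory.
            Pvalue += d_M[Row * Width + k] * d_N[k * Width + Col];
        }
        d_P[Row * Width + Col] = Pvalue;
    }
}

int main() {
    const int Width = 4;                       // illustrative size (assumption)
    const size_t bytes = Width * Width * sizeof(float);
    float h_M[Width * Width], h_N[Width * Width], h_P[Width * Width];
    for (int i = 0; i < Width * Width; ++i) {
        h_M[i] = static_cast<float>(i);
        h_N[i] = (i / Width == i % Width) ? 1.0f : 0.0f;  // identity matrix
    }

    float *d_M, *d_N, *d_P;
    cudaMalloc(&d_M, bytes);
    cudaMalloc(&d_N, bytes);
    cudaMalloc(&d_P, bytes);
    cudaMemcpy(d_M, h_M, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_N, h_N, bytes, cudaMemcpyHostToDevice);

    dim3 block(2, 2);                          // illustrative block shape
    dim3 grid((Width + block.x - 1) / block.x, (Width + block.y - 1) / block.y);
    MatrixMulKernel<<<grid, block>>>(d_M, d_N, d_P, Width);
    cudaMemcpy(h_P, d_P, bytes, cudaMemcpyDeviceToHost);

    // Multiplying by the identity should return M unchanged.
    bool ok = true;
    for (int i = 0; i < Width * Width; ++i) ok = ok && (h_P[i] == h_M[i]);
    printf("%s\n", ok ? "match" : "mismatch");

    cudaFree(d_M); cudaFree(d_N); cudaFree(d_P);
    return ok ? 0 : 1;
}
```

In this sketch each iteration of the inner loop performs two global memory reads for one multiply and one add, which is the access pattern the later figures on shared memory and tiling set out to improve.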
Figure 5.2 Overview of the CUDA device memory model.
Figure 5.3 Memory versus registers in a modern computer based on the von Neumann model.
Figure 5.4 Shared memory versus registers in a CUDA device SM.
Figure 5.5 A small example of matrix multiplication. For brevity, we show d_M[y*Width+x], d_N[y*Width+x], and d_P[y*Width+x] as M_{y,x}, N_{y,x}, and P_{y,x}, respectively.
Figure 5.6 Global memory accesses performed by threads in block_{0,0}.
Figure 5.7 Reducing traffic congestion in highway systems.
Figure 5.8 Carpooling requires synchronization among people.
Figure 5.9 Tiled algorithms require synchronization among threads.
Figure 5.10 Tiling M and N matrices to utilize shared memory.
Figure 5.11 Execution phases of a tiled matrix multiplication.
Figure 5.12 Tiled matrix multiplication kernel using shared memory.
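The listing for this figure is likewise missing from the transcript. The following is a hedged sketch of how a tiled kernel using shared memory might be written; it is not the book's listing. It assumes that Width is a multiple of TILE_WIDTH and that each block is launched with TILE_WIDTH x TILE_WIDTH threads. The kernel name, the array names Mds and Nds, the tile size of 2, and the test harness in main are illustrative choices.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TILE_WIDTH 2   // illustrative tile size (assumption)

// Assumes Width % TILE_WIDTH == 0 and blockDim == (TILE_WIDTH, TILE_WIDTH).
__global__ void TiledMatrixMulKernel(const float* d_M, const float* d_N,
                                     float* d_P, int Width) {
    __shared__ float Mds[TILE_WIDTH][TILE_WIDTH];
    __shared__ float Nds[TILE_WIDTH][TILE_WIDTH];

    int tx = threadIdx.x, ty = threadIdx.y;
    int Row = blockIdx.y * TILE_WIDTH + ty;
    int Col = blockIdx.x * TILE_WIDTH + tx;

    float Pvalue = 0.0f;
    // One phase per pair of tiles along the shared dimension.
    for (int ph = 0; ph < Width / TILE_WIDTH; ++ph) {
        // Each thread loads one element of the M tile and one of the N tile.
        Mds[ty][tx] = d_M[Row * Width + ph * TILE_WIDTH + tx];
        Nds[ty][tx] = d_N[(ph * TILE_WIDTH + ty) * Width + Col];
        __syncthreads();   // wait until both tiles are fully loaded

        for (int k = 0; k < TILE_WIDTH; ++k) {
            Pvalue += Mds[ty][k] * Nds[k][tx];
        }
        __syncthreads();   // wait before the tiles are overwritten
    }
    d_P[Row * Width + Col] = Pvalue;
}

int main() {
    const int Width = 4;                       // multiple of TILE_WIDTH
    const size_t bytes = Width * Width * sizeof(float);
    float h_M[Width * Width], h_N[Width * Width], h_P[Width * Width];
    for (int i = 0; i < Width * Width; ++i) {
        h_M[i] = static_cast<float>(i);
        h_N[i] = (i / Width == i % Width) ? 1.0f : 0.0f;  // identity matrix
    }

    float *d_M, *d_N, *d_P;
    cudaMalloc(&d_M, bytes);
    cudaMalloc(&d_N, bytes);
    cudaMalloc(&d_P, bytes);
    cudaMemcpy(d_M, h_M, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_N, h_N, bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE_WIDTH, TILE_WIDTH);
    dim3 grid(Width / TILE_WIDTH, Width / TILE_WIDTH);
    TiledMatrixMulKernel<<<grid, block>>>(d_M, d_N, d_P, Width);
    cudaMemcpy(h_P, d_P, bytes, cudaMemcpyDeviceToHost);

    // Multiplying by the identity should return M unchanged.
    bool ok = true;
    for (int i = 0; i < Width * Width; ++i) ok = ok && (h_P[i] == h_M[i]);
    printf("%s\n", ok ? "match" : "mismatch");

    cudaFree(d_M); cudaFree(d_N); cudaFree(d_P);
    return ok ? 0 : 1;
}
```

With this structure each element loaded into shared memory is reused by TILE_WIDTH threads, so global memory reads drop by roughly a factor of TILE_WIDTH compared with a kernel that reads every operand from global memory. The two __syncthreads() calls correspond to the synchronization requirement named in the Figure 5.9 caption.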
Figure 5.13 Calculation of the matrix indices in tiled multiplication.