Download presentation
Presentation is loading. Please wait.
1
An Analytical Model for a GPU
2
Overview
3
SVM Kernel Behavior: Need for other metrics
4
Degree of Parallelism zGPU Architecture yEach SM executes multiple warps in a time-sharing fashion while one or more are waiting for memory values xHiding the execution cost of warps that are executed concurrently. yHow many memory requests can be serviced and how many warps can be executed together while one warp is waiting for memory values.
5
MWP and CWP zMemory Warp: yThe warp that is waiting for memory values zMemory Warp waiting period: yThe time period from right after one warp sent memory requests until all the memory requests from the same warp are serviced. zCWP (Computation Warp Parallelism) yrepresents the number of warps that the SM processor can execute during one memory warp waiting period plus one. zMWP (Memory Warp Parallelism) yrepresents the maximum number of warps per SM that can access the memory simultaneously during memory warp waiting period
6
Relationship between MWP and CWP: CWP > MWP What is getting hidden? Total execution time (a) 8 warps (b) 4 warps
7
What is going on here?
8
Relationship between MWP and CWP: MWP > CWP Total execution time (a) 8 warps (b) 4 warps
9
Relationship between MWP and CWP: MWP > CWP Total execution time (8 warps)
10
Not Enough Warps Running
11
CPI PTX instruction set
12
Model
13
Example Blocks: 80 Threads: 128 5 blocks per SM
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.