1
GPU [1] Speaker 高崇閔
2
Exceeding the limit
Error message: line 47, Exercise 2
3
CPU vs. GPU
Question: If we can speed up the computation on the CPU, is there no use for the GPU?
Reply: In Table II, we spend twice as much time transferring data as computing. But this does not mean the CPU can take the place of the GPU. We transfer the data to the GPU only once, and can then run several computations on it. As a result, the time spent on the transfer is a necessary evil.
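The reply's argument can be made concrete with a small sketch. The timings below are hypothetical (transfer costs twice one computation, as the reply states), and the function name transfer_share is an assumption introduced for illustration, not something from the slides.

```python
def transfer_share(t_transfer, t_compute, runs):
    """Fraction of total time spent on one host-to-GPU transfer
    when the transferred data is reused for `runs` computations."""
    total = t_transfer + runs * t_compute
    return t_transfer / total

# Hypothetical timings: the transfer costs twice as much as one computation.
T_TRANSFER, T_COMPUTE = 2.0, 1.0

for runs in (1, 2, 8):
    share = transfer_share(T_TRANSFER, T_COMPUTE, runs)
    print(f"{runs} computation(s): transfer is {share:.1%} of total time")
```

Under these assumed numbers the share of the transfer shrinks as the same data is reused for more computations, which is the point of the reply.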
4
Table II (Exercise 4)

Number of blocks   Size          GPU          Device -> Host   CPU
16                 32 KB         1.194096     0.126273         0
32                 64 KB         1.228648     0.181308         0
64                 128 KB        1.217473     0.378819         0
128                256 KB        1.250997     0.498387         0
256                512 KB        1.295416     0.999848         0
512                1.024 MB      1.357435     1.561372         0
1024               2.048 MB      1.463314     2.815442         0
2048               4.096 MB      1.780115     4.651988         16
4096               8.192 MB      2.215086     9.067375         15
8192               16.384 MB     3.217448     18.717743        31
16384              32.768 MB     5.030807     38.326660        63
32768              65.536 MB     8.637410     65.668404        125
65536              131.072 MB    15.667075    135.295959       266

Threads per block = 512
N = number of blocks * threads
Size = N * sizeof(float) bytes
Experimental platform: GeForce 8800 GT
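The size column follows from the two formulas given with the table. The following is a minimal sketch of that arithmetic, assuming a 4-byte float; the helper name data_size_bytes is hypothetical.

```python
THREADS_PER_BLOCK = 512   # "Thread=512" in Table II
FLOAT_BYTES = 4           # assumed sizeof(float)

def data_size_bytes(num_blocks, threads=THREADS_PER_BLOCK):
    """Size = N * sizeof(float), with N = number of blocks * threads."""
    n = num_blocks * threads
    return n * FLOAT_BYTES

for blocks in (16, 512):
    kb = data_size_bytes(blocks) // 1024
    print(f"{blocks} blocks -> {kb} KB")
```

For 16 blocks this gives 32 KB and for 512 blocks 1024 KB, matching the first and sixth rows of the table.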
5
Computation

Vector addition
A : 1*n, B : 1*n. A + B : 1*n
The data we need to transfer is (2*n)*sizeof(float) bytes.
The number of operations is n additions.

Matrix multiplication
A : m*n, B : n*p. A * B : m*p
The data we need to transfer is n*(p+m)*sizeof(float) bytes.
The number of operations is (m*p*n) multiplications + (m*p*(n-1)) additions.

Ratio of matrix multiplication to vector addition
The ratio of data transfer is n*(p+m) / (2*n) = (p+m)/2.
The ratio of computation is [(m*p*n) multiplications + (m*p*(n-1)) additions] / [n additions].
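To compare the two workloads numerically, the sketch below counts the floats to transfer and the arithmetic operations for each. It assumes the standard count for naive matrix multiplication (each of the m*p output entries needs n multiplications and n-1 additions); all function names are hypothetical.

```python
def transfer_floats_matmul(m, n, p):
    """Floats sent to the GPU: A has m*n entries, B has n*p entries."""
    return n * (m + p)

def transfer_floats_vecadd(n):
    """Floats sent to the GPU: two vectors of length n."""
    return 2 * n

def ops_matmul(m, n, p):
    """Naive product: n multiplications and n-1 additions per output entry."""
    return {"mul": m * p * n, "add": m * p * (n - 1)}

def ops_vecadd(n):
    """One addition per element."""
    return {"mul": 0, "add": n}

m = n = p = 4
ratio = transfer_floats_matmul(m, n, p) / transfer_floats_vecadd(n)
print("transfer ratio:", ratio)
print("matmul ops:", ops_matmul(m, n, p))
print("vecadd ops:", ops_vecadd(n))
```

With m = n = p = 4, the matrix product moves 32 floats against 8 for the vector addition, while performing 64 multiplications and 48 additions against 4 additions, so the work per transferred float is much higher for the matrix product.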
6
Remaining questions
How to process many data elements with a finite number of threads.
How to compute products between a matrix and a vector (inner product and outer product).