By Yequn Zhang, Yu Zhang. Contents Introduction Problem Analysis Proposed Algorithm Evaluation.


1 By Yequn Zhang, Yu Zhang

2 Contents Introduction Problem Analysis Proposed Algorithm Evaluation

3 Contents Introduction Problem Analysis Proposed Algorithm Evaluation

4 Gaussian Elimination: Forward Elimination, Back Substitution
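As a reference point for the two phases named on this slide, here is a minimal serial sketch in Python. It assumes partial pivoting (suggested by the later "Find Max Row" and "Swap" slides) and a dense square system; the function name `gaussian_solve` is hypothetical and not taken from the presentation.

```python
def gaussian_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Serial reference sketch only; the slides describe a CPU/GPU split
    that is not reproduced here.
    """
    n = len(a)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    # Forward elimination: reduce the matrix to upper-triangular form.
    for k in range(n):
        # Pick the row with the largest absolute value in column k.
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[pivot] = m[pivot], m[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]

    # Back substitution: solve from the last row upwards.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x
```

For example, `gaussian_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])` yields a solution close to `[2, 3, -1]`, up to floating-point rounding.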

5 Contents Introduction Problem Analysis Proposed Algorithm Evaluation

6 Problem Analysis. The data size used by the kernels changes continuously, so it is difficult to find a block size that avoids divergence. Block-based approach: assign part of the computation to the CPU, leaving the irregularity to the CPU; manually make the data size change in steps of the block size; the number of blocks per grid is then easy to set.
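The arithmetic behind this slide can be sketched as follows. The helper names `idle_threads` and `grid_sizes` are hypothetical, and the sketch assumes the matrix dimension is a multiple of the block size; the slides do not say how other sizes are handled.

```python
def idle_threads(width, block_size):
    """Threads left without work in the last block when `width` elements
    are covered by whole blocks of `block_size` threads."""
    return (-width) % block_size


def grid_sizes(n, block_size):
    """List (remaining_width, blocks_per_grid) for each outer step when the
    data size shrinks by one full block at a time.

    Assumption: n is a multiple of block_size.
    """
    if n % block_size != 0:
        raise ValueError("n must be a multiple of block_size")
    plan = []
    remaining = n
    while remaining > 0:
        # Every block is completely full, so no thread sits idle.
        plan.append((remaining, remaining // block_size))
        remaining -= block_size
    return plan
```

With a block size of 64, a width of 255 leaves one idle thread in the last block, while every width produced by `grid_sizes(256, 64)` (256, 192, 128, 64) fills its blocks exactly, which is why the block count per grid becomes trivial to set.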

7 Contents Introduction Problem Analysis Proposed Algorithm Evaluation

8 Forward Elimination: a block-based approach. Try to avoid divergence; try to use the GPU; try to be fine-grained.

9 K1 Find Max Row
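The slides do not show how the max-row search is implemented. One common way to do such a search in parallel is a tree reduction, emulated serially in the Python sketch below. This is an assumption, not the authors' kernel; the name `argmax_abs_reduction` is hypothetical, and each value of `t` in the inner loop stands in for one thread.

```python
def argmax_abs_reduction(values):
    """Return the index of the entry with the largest absolute value,
    using a tree reduction pattern (serial emulation)."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")

    # Pad to a power of two with sentinels that can never win.
    size = 1
    while size < n:
        size *= 2
    mag = [abs(v) for v in values] + [-1.0] * (size - n)
    idx = list(range(n)) + [-1] * (size - n)

    # Halve the active range each round, keeping the larger candidate.
    stride = size // 2
    while stride > 0:
        for t in range(stride):  # one "thread" per surviving slot
            if mag[t + stride] > mag[t]:
                mag[t] = mag[t + stride]
                idx[t] = idx[t + stride]
        stride //= 2
    return idx[0]
```

Applied to the candidate pivot column (rows k to n-1), the returned index is relative to the start of that slice, so the caller would add k to get the matrix row to swap.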

10 Swap on CPU. Now start to eliminate the block of data on the CPU.

11 Calculate coefficients

12 Elimination on CPU

13 K1 Calculate Coefficients

14 K2 Elimination on CPU

15 Swap on GPU K3

16 Elimination on GPU K4

17 Elimination on GPU K5

18 Intra-block loop

19 Inter-block loop

20 Last inter-block loop processed on CPU
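One possible reading of the intra-block and inter-block loops is sketched below in Python. It is an interpretation, not the authors' code: the names `update_columns` and `blocked_forward_elimination` are hypothetical, the "GPU" path is ordinary Python that only marks where a kernel launch would go, and the right-hand side is assumed to be updated on the CPU.

```python
def update_columns(a, k, factors, col_start, col_end):
    """Subtract factor * pivot row k from each row below k,
    restricted to columns col_start..col_end-1."""
    for offset, f in enumerate(factors):
        row = a[k + 1 + offset]
        for j in range(col_start, col_end):
            row[j] -= f * a[k][j]


def blocked_forward_elimination(a, b, block_size):
    """Reduce a (n x n, list of float lists) and b to upper-triangular
    form in place. Returns the widths handed to the stand-in GPU path."""
    n = len(a)
    gpu_widths = []
    for start in range(0, n, block_size):        # inter-block loop
        end = min(start + block_size, n)
        for k in range(start, end):              # intra-block loop
            pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
            if a[pivot][k] == 0.0:
                raise ValueError("matrix is singular")
            a[k], a[pivot] = a[pivot], a[k]
            b[k], b[pivot] = b[pivot], b[k]
            factors = [a[i][k] / a[k][k] for i in range(k + 1, n)]

            # Irregular part: width shrinks by one per step (CPU).
            update_columns(a, k, factors, k, end)
            for offset, f in enumerate(factors):
                b[k + 1 + offset] -= f * b[k]

            # Regular part: width is fixed within a block (GPU stand-in).
            # It is empty for the last block, which stays on the CPU.
            if end < n:
                update_columns(a, k, factors, end, n)
                gpu_widths.append(n - end)
    return gpu_widths
```

For a 4 x 4 system with `block_size=2`, the stand-in GPU path sees a width of 2 for both pivots of the first block and nothing for the last block, matching the idea that the trailing width changes only in steps of the block size.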

21 Back Substitution. Launch a kernel when the number of coefficients per row exceeds four times the block size (64*4=256). This is a fine-grained scheme similar to forward elimination, with part of the work on the CPU and part on the GPU.
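The dispatch rule on this slide can be sketched in Python as follows. The sketch assumes that "number of coefficients per row" means the number of already-solved unknowns that enter a row's sum; the names `back_substitute`, `BLOCK_SIZE`, and `GPU_THRESHOLD` are hypothetical, and the "gpu" label only marks where a kernel launch would occur.

```python
BLOCK_SIZE = 64                  # block size quoted on the slide
GPU_THRESHOLD = 4 * BLOCK_SIZE   # 64 * 4 = 256


def back_substitute(u, y, threshold=GPU_THRESHOLD):
    """Solve the upper-triangular system u x = y.

    Returns (x, path), where path records, in processing order
    (last row first), which side would handle each row.
    """
    n = len(u)
    x = [0.0] * n
    path = []
    for i in range(n - 1, -1, -1):
        terms = n - 1 - i  # coefficients multiplying known unknowns
        path.append("gpu" if terms > threshold else "cpu")
        # Same arithmetic on both paths in this serial sketch.
        known = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - known) / u[i][i]
    return x, path
```

Because the sum grows by one term per row, the bottom rows are short and stay on the CPU, and only rows above the threshold would justify the cost of a kernel launch.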

22 Contents Introduction Problem Analysis Proposed Algorithm Evaluation

23 Block size effect

24 The contribution of swap and find max row. Is it necessary to implement every part on the GPU?

25 Performance breakdown: contribution of each part to the total performance, including the kernels as well as the CPU part

26 Speedup

27 Questions?

