Published by Wesley Baker. Modified over 8 years ago.
1
Autonomic scheduling of tasks from data parallel patterns to CPU/GPU core mixes
Published in: 2013 International Conference on High Performance Computing and Simulation (HPCS)
2013/12/19
2
Outline
  Introduction
  Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes
  Evaluating The CPU/GPU Tradeoff
  Experimental Results
  Conclusions
4
Introduction
  The ubiquitous presence of multiple cores (including at least one GPU)
  Efficient parallelism exploitation
5
Introduction
  Motivation: determine how to divide the workload between CPU and GPU
  An analytical performance model for scheduling tasks among CPU and GPU cores, such that the global execution time of the overall data parallel pattern is minimized
7
Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes
  Deciding whether the parallelism exhibited by the application is suitable for GPUs: this can be solved by looking only at the parallel patterns that fit the GPU execution model, that is, by considering data parallel patterns only
  Deciding how to use the CPU while the GPU is computing
8
Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes
  Figuring out whether or not it is beneficial to split a data parallel computation among CPU and GPU cores
  Figuring out the percentage of tasks to be run on CPU cores and on GPU cores
9
Autonomic Management of Data Parallel Computations Targeting CPU/GPU Mixes
  MAPE loop: Monitor, Analyze, Plan, Execute
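The four MAPE phases can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names, the `Measurement` type, and above all `monitor`, which fakes timings from fixed per-side costs instead of measuring a real CPU/GPU run.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    cpu_time: float  # time the CPU cores needed for their share
    gpu_time: float  # time the GPU needed for its share


def monitor(gpu_fraction, cpu_cost=1.0, gpu_cost=0.25):
    """Monitor: stand-in for real timing (assumed costs, proportional to share)."""
    return Measurement((1.0 - gpu_fraction) * cpu_cost, gpu_fraction * gpu_cost)


def analyze(m, tolerance=0.05):
    """Analyze: True when the two sides finish at noticeably different times."""
    slowest = max(m.cpu_time, m.gpu_time)
    return slowest > 0 and abs(m.cpu_time - m.gpu_time) / slowest > tolerance


def plan(gpu_fraction, m):
    """Plan: estimate each side's cost per unit of work and rebalance the split."""
    cpu_rate = m.cpu_time / (1.0 - gpu_fraction)
    gpu_rate = m.gpu_time / gpu_fraction
    balanced = cpu_rate / (cpu_rate + gpu_rate)
    # Keep both sides busy so the next round can still measure both rates.
    return min(0.99, max(0.01, balanced))


def execute(new_fraction):
    """Execute: a real system would reconfigure the pattern; here we just adopt it."""
    return new_fraction


def mape_loop(gpu_fraction=0.5, max_rounds=10):
    for _ in range(max_rounds):
        m = monitor(gpu_fraction)
        if not analyze(m):
            break  # both sides finish together, nothing to adapt
        gpu_fraction = execute(plan(gpu_fraction, m))
    return gpu_fraction
```

With the assumed costs (the GPU four times cheaper per unit of work), calling `mape_loop()` moves the split from an even 0.5 toward sending about 80% of the work to the GPU.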
11
Evaluating The CPU/GPU Tradeoff
  Two nodes
    CPU + main memory
    GPU + GPU memory
  The first system owns the data, part of which must be sent to the second system
  Data copy between main memory and GPU memory: setup and data transmission
  One core of the CPU → K cores
12
Evaluating The CPU/GPU Tradeoff
13
Evaluating The CPU/GPU Tradeoff
14
Evaluating The CPU/GPU Tradeoff
  CPU processing time
  GPU processing time
  Total execution time
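The formulas for these three quantities did not survive in this transcript. The sketch below is therefore not the paper's model but a simple stand-in built from the ingredients the slides name (K CPU cores, a transfer setup cost, per-task data transmission). It assumes the two sides run concurrently, so the total time is the larger of the two. All parameter names are hypothetical.

```python
def cpu_time(p, n, t_cpu, k):
    """Time for K CPU cores to process the (1 - p) share of n tasks."""
    return (1.0 - p) * n * t_cpu / k


def gpu_time(p, n, t_setup, t_xfer, t_gpu):
    """Transfer setup + per-task transmission + per-task GPU compute."""
    if p == 0:
        return 0.0  # nothing is sent, so no setup cost is paid
    return t_setup + p * n * (t_xfer + t_gpu)


def total_time(p, n, t_cpu, k, t_setup, t_xfer, t_gpu):
    """CPU and GPU are assumed to work concurrently: the slower side decides."""
    return max(cpu_time(p, n, t_cpu, k), gpu_time(p, n, t_setup, t_xfer, t_gpu))


def best_gpu_fraction(n, t_cpu, k, t_setup, t_xfer, t_gpu):
    """Fraction p of tasks for the GPU at which both sides finish together.

    Solves (1 - p) * n * t_cpu / k = t_setup + p * n * (t_xfer + t_gpu) for p
    and clamps to [0, 1]; 0 means the split is not beneficial in this model.
    """
    p = (n * t_cpu / k - t_setup) / (n * (t_cpu / k + t_xfer + t_gpu))
    return min(1.0, max(0.0, p))
```

For example, with 1000 tasks, 4 CPU cores at 1.0 time units per task, a setup cost of 10, and 0.01 + 0.04 per task on the GPU side, the balanced split sends 80% of the tasks to the GPU. With only 10 tasks the setup cost dominates and the function returns 0, meaning the CPU should do all the work.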
15
Evaluating The CPU/GPU Tradeoff
  (input size → computation → output size)
  N → O(N) → N
  Matrix multiplication: 2N^2 → O(N^3) → N^2
16
Evaluating The CPU/GPU Tradeoff
  CPU processing time
  GPU processing time
17
Evaluating The CPU/GPU Tradeoff
  CPU processing time
  GPU processing time
19
Experimental Results
  Experiment platform
20
Experimental Results
  Benchmark b1
    Computes the matrix whose elements are the squares of the corresponding elements in the input matrix
    N → O(N) → N
  Benchmark b2
    The simplest matrix multiplication algorithm (three nested loops, no blocking, no further optimization)
    2N^2 → O(N^3) → N^2
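The two benchmark kernels are simple enough to sketch directly. This is a plain sequential Python rendering for illustration only. The function names are assumptions, and the code used in the experiments is not shown in the slides.

```python
def square_elements(matrix):
    """b1: each output element is the square of the matching input element."""
    return [[x * x for x in row] for row in matrix]


def naive_matmul(a, b):
    """b2: textbook matrix product, three nested loops, no blocking."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
    return c
```

The contrast between the two is the ratio of data moved to work done: b1 does one operation per element transferred, while b2 does O(N^3) work on only 2N^2 input elements. That is why the transfer cost weighs differently on the CPU/GPU split for each benchmark.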
21
Experimental Results
22
Experimental Results
23
Experimental Results
  Reduce sum
  Reduce min
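A reduce split between two devices can be sketched as follows: each side reduces its own share and the two partial results are combined at the end. This is a sequential illustration under assumptions. `split_reduce` and `gpu_fraction` are hypothetical names, both shares are reduced on the CPU here, and the operator must be associative for the split to be valid.

```python
from functools import reduce
import operator


def split_reduce(data, op, gpu_fraction):
    """Reduce a non-empty list in two shares, then combine the partial results."""
    cut = round(len(data) * gpu_fraction)
    gpu_part, cpu_part = data[:cut], data[cut:]  # share sizes follow the ratio
    # Skip an empty share (gpu_fraction of 0 or 1) instead of reducing nothing.
    partials = [reduce(op, part) for part in (gpu_part, cpu_part) if part]
    return reduce(op, partials)


values = [5, 3, 8, 1, 9, 2]
total = split_reduce(values, operator.add, 0.5)  # reduce sum
smallest = split_reduce(values, min, 0.5)        # reduce min
```

Sum and min both satisfy the associativity requirement, so the result is the same for any split ratio.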
24
Experimental Results
25
Experimental Results
26
Experimental Results
  P, 0.8P, 0.9P, 1.1P, 1.2P
28
Conclusions
  The main contributions of this work
    Computing the ratio between the number of tasks to be executed on CPU cores and on GPU cores so as to optimize the completion time
    Classical map and reduce patterns that use CPU and GPU cores according to the ratio computed by the model, with combined execution of tasks on GPU and CPU cores
29
Q&A
30
Thank you for listening