Published by Allyson Watkins. Modified over 8 years ago.
1
Speedup for Multi-Level Parallel Computing
School of Computer Engineering, Nanyang Technological University
21st May 2012
Shanjiang Tang, Bu-Sung Lee, Bingsheng He
2
Outline
–Background & Motivation
–Multi-Level Parallel Speedup
–Evaluation
–Conclusion
3
Multi-Level Computing Architecture and Paradigm
4
MPI+OpenMP, MPI+CUDA, MPI+OpenMP+CUDA, …
5
Multi-Level Parallel Computing Model
[Figure: processing elements arranged in levels L1 to Lm: PE 1,1 at level 1; PE 2,1 and PE 2,2 at level 2; PE 3,1 to PE 3,8 at level 3. Legend: sequential part, parallel part.]
6
Parallel Speedup
–Definition
–Classification: Absolute Speedup, Relative Speedup
7
Relative Speedup Model
–Fixed-size speedup: Amdahl’s Law
–Fixed-time speedup: Gustafson’s Law
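The two classical laws named on this slide can be written in a few lines. The sketch below is illustrative and not taken from the slides; the function names and the symbols `f` (parallel fraction of the workload) and `n` (number of processing elements) are chosen here for the example.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Fixed-size speedup: the total workload stays constant as n grows."""
    # Sequential part (1 - f) is unchanged; parallel part f is divided by n.
    return 1.0 / ((1.0 - f) + f / n)


def gustafson_speedup(f: float, n: int) -> float:
    """Fixed-time speedup: the parallel part of the workload grows with n."""
    # f is the parallel fraction measured on the n-element run.
    return (1.0 - f) + f * n


if __name__ == "__main__":
    for n in (1, 8, 64):
        print(n, amdahl_speedup(0.9, n), gustafson_speedup(0.9, n))
```

Under Amdahl's Law the speedup is bounded by 1 / (1 - f) no matter how large n becomes, whereas under Gustafson's Law it grows linearly in n.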
8
Motivation Example—NAS Benchmark (MPI+OpenMP)
9
Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing
10
Outline
–Background & Motivation
–Multi-Level Parallel Speedup
–Evaluation
–Conclusion
11
E-Amdahl’s Law: Awareness of Different Grained-Level Parallelism
[Figure: processing elements arranged in levels L1 to Lm: PE 1,1 at level 1; PE 2,1 and PE 2,2 at level 2; PE 3,1 to PE 3,8 at level 3. Legend: sequential part, parallel part.]
12
E-Amdahl’s Law: Two-Level Parallelism Speedup Model (MPI+OpenMP)
[Equation and its symbols not recovered in this transcript.] The model has four parameters:
–the parallel fraction of coarse-grained (MPI-level) parallelism;
–the parallel fraction of fine-grained (OpenMP-level) parallelism;
–the number of processes spawned;
–the number of threads spawned per process.
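The slide's formula did not survive extraction, so the following is only a sketch under a stated assumption: that the two-level model nests Amdahl's Law, with the coarse-grained parallel work divided among processes and each process's share in turn split into a sequential part and a part spread over its threads. The function name and the symbols `alpha`, `beta`, `p`, and `t` are chosen here for illustration and may not match the authors' notation.

```python
def e_amdahl_speedup(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level fixed-size speedup (ASSUMED nested-Amdahl form).

    alpha: parallel fraction at the coarse-grained (process) level
    beta:  parallel fraction at the fine-grained (thread) level
    p:     number of processes
    t:     number of threads per process
    """
    # Within one process, only the fraction beta is sped up by t threads.
    per_process_time = (1.0 - beta) + beta / t
    # The coarse-grained fraction alpha is divided among p processes.
    return 1.0 / ((1.0 - alpha) + alpha * per_process_time / p)


def amdahl_flat(alpha: float, cores: int) -> float:
    """Single-level Amdahl's Law that treats all cores as equivalent."""
    return 1.0 / ((1.0 - alpha) + alpha / cores)


if __name__ == "__main__":
    # Same 64 cores, viewed as one level versus as 8 processes x 8 threads.
    print(amdahl_flat(0.9, 64))
    print(e_amdahl_speedup(0.9, 0.5, 8, 8))
```

When `beta` is 1 this form reduces to single-level Amdahl's Law on `p * t` cores. For any smaller `beta` it predicts a lower speedup than the flat model, which is one way to see why a single-level law can overestimate hybrid MPI+OpenMP performance.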
13
E-Gustafson’s Law: Awareness of Different Grained-Level Parallelism
[Figure: processing elements arranged in levels L1 to Lm: PE 1,1 at level 1; PE 2,1 and PE 2,2 at level 2; PE 3,1 to PE 3,8 at level 3. Legend: sequential part, parallel part.]
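No formula for E-Gustafson's Law is present in this transcript. As a hedged illustration, the sketch below assumes a two-level form that nests Gustafson's Law in the same way the fixed-size case nests Amdahl's Law: the coarse-grained parallel work scales with the number of processes, and within each process the fine-grained parallel work scales with the number of threads. The function name and the symbols `alpha`, `beta`, `p`, and `t` are assumptions made for this example.

```python
def e_gustafson_speedup(alpha: float, beta: float, p: int, t: int) -> float:
    """Two-level fixed-time speedup (ASSUMED nested-Gustafson form).

    alpha: parallel fraction at the coarse-grained (process) level
    beta:  parallel fraction at the fine-grained (thread) level
    p:     number of processes
    t:     number of threads per process
    """
    # Scaled work done by one process: its sequential part plus t-fold parallel part.
    per_process_work = (1.0 - beta) + beta * t
    # Scaled total work: sequential part plus p processes' worth of coarse-grained work.
    return (1.0 - alpha) + alpha * p * per_process_work


if __name__ == "__main__":
    print(e_gustafson_speedup(0.9, 0.5, 8, 8))
```

With `beta` equal to 1 this reduces to single-level Gustafson's Law on `p * t` processing elements.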
14
Outline
–Background & Motivation
–Multi-Level Parallel Speedup
–Evaluation
–Conclusion
15
Experiment Setup
Platform and Configuration
–A Linux cluster consisting of eight computing nodes, each with two quad-core chips
–Configuration: one thread per CPU core
Benchmarks
–NAS Parallel Benchmark (NPB) Multi-Zone (MZ) version:
–BT-MZ (unbalanced workload partitioning)
–SP-MZ (balanced workload partitioning)
–LU-MZ (balanced workload partitioning)
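The platform described above has 8 × 2 × 4 = 64 cores. The slides do not list which process/thread combinations were actually run, so the sketch below only enumerates the combinations that would use every core with one thread per core, under the added assumption that a process never spans more than one node. The names `hybrid_configs`, `NODES`, `CHIPS_PER_NODE`, and `CORES_PER_CHIP` are hypothetical.

```python
NODES = 8            # computing nodes in the cluster
CHIPS_PER_NODE = 2   # quad-core chips per node
CORES_PER_CHIP = 4


def hybrid_configs(nodes: int = NODES,
                   cores_per_node: int = CHIPS_PER_NODE * CORES_PER_CHIP):
    """Return (processes, threads_per_process) pairs that fill every core.

    ASSUMPTION: each process stays within one node, so the thread count
    must divide the number of cores per node.
    """
    configs = []
    for t in range(1, cores_per_node + 1):
        if cores_per_node % t == 0:
            p = nodes * (cores_per_node // t)
            configs.append((p, t))
    return configs


if __name__ == "__main__":
    for p, t in hybrid_configs():
        print(f"{p} processes x {t} threads = {p * t} cores")
```

Each pair keeps the total at 64 cores, so differences in measured speedup across pairs reflect how the work is split between the two levels rather than a change in resources.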
16
Performance Prediction
17
Prediction Result Comparison
18
Outline
–Background & Motivation
–Multi-Level Parallel Speedup
–Evaluation
–Conclusion
19
Traditional speedup models are unsuitable for multi-level parallelism
–They do not account for the different granularities of parallelism in multi-level parallel computing.
Multi-level Parallelism Model
–A guidance model for multi-level optimization.
–A prediction model for multi-level parallelism.
21
Argument Estimation
22
Speedup Under E-Amdahl’s Law
23
Speedup Under E-Gustafson’s Law