Parallelism and Amdahl's Law Eric Shook Department of Geography Kent State University
Parallel computing Image sources: intel.com, http://www.nasa.gov/audience/foreducators/k-4/features/F_ESSEA_Course_K-4.html
Inter-process Communication
Shared Memory: the memory space is shared between processing core 0 and processing core 1, so both cores can access the same copy of a value such as [40.742, -74.245].
Message Passing: processing core 0 and processing core 1 each have a private memory space, so each core holds its own copy of a value such as [40.742, -74.245] and the value must be sent from one core to the other.
Parallel Programming Paradigms
Functional Parallelism: processing core 0 runs Task A while processing core 1 runs Task B, each with its data.
Data Parallelism: the data is split in half. Processing core 0 and processing core 1 run the same tasks (Task A, then Task B), each on its own half of the data, with equivalent processing times.
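The two paradigms can be sketched in a few lines of Python. This is only an illustration, not code from the slides: `task_a`, `task_b`, and the sample data are hypothetical, and a thread pool stands in for the two processing cores (in CPython, threads do not speed up CPU-bound work, so a process pool or MPI would be the more realistic choice).

```python
from concurrent.futures import ThreadPoolExecutor

def task_a(values):
    """Hypothetical Task A: sum a list of numbers."""
    return sum(values)

def task_b(values):
    """Hypothetical Task B: find the largest number."""
    return max(values)

data = list(range(1, 9))  # illustrative data: 1..8

with ThreadPoolExecutor(max_workers=2) as pool:
    # Functional parallelism: two different tasks run at the same time.
    future_a = pool.submit(task_a, data)
    future_b = pool.submit(task_b, data)
    functional_result = (future_a.result(), future_b.result())

    # Data parallelism: the same task runs on each half of the data.
    half = len(data) // 2
    halves = [data[:half], data[half:]]
    partial_sums = list(pool.map(task_a, halves))
    data_parallel_result = sum(partial_sums)  # combine the partial results

print(functional_result, partial_sums, data_parallel_result)
```

In the data-parallel case the partial results have to be combined at the end, which is a step the functional-parallel case does not need.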
Spatial Domain Decomposition
Strategies: Row or Column, Quadtree, Recursive Bisection, Grid
Source: Ding, Y., & Densham, P. J. (1996). Spatial strategies for parallel spatial modelling. International Journal of Geographical Information Systems, 10(6), 669-698.
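As a minimal sketch of the simplest of these strategies, row decomposition, the function below splits the rows of a raster into contiguous bands, one per core. The function name `row_decomposition` and the 10-row, 4-core example are assumptions made for illustration; they are not taken from Ding and Densham (1996).

```python
def row_decomposition(n_rows, n_cores):
    """Split row indices 0..n_rows-1 into n_cores contiguous bands.

    Returns a list of (start, stop) pairs, with stop exclusive.
    Band sizes differ by at most one row.
    """
    base, extra = divmod(n_rows, n_cores)
    bands = []
    start = 0
    for core in range(n_cores):
        # The first `extra` cores each take one additional row.
        size = base + (1 if core < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands

bands = row_decomposition(10, 4)
for core, (start, stop) in enumerate(bands):
    print(f"core {core}: rows {start} to {stop - 1}")
```

Equal row counts only balance the work when every row costs about the same to process; the quadtree and recursive bisection strategies exist for the cases where it does not.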
Challenges for Parallelism: Load-Imbalance
Uneven amount of data for processing: both processing cores run Task A, but core 0 has much less data than core 1, so core 0 will finish processing much sooner than core 1.
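Load-Imbalance: Bad for Performance
Imbalanced Workload (20% / 80%): the core with 20% of the work finishes early and is then doing nothing, but could be processing, while the overloaded core works through its 80%. All of that idle time is lost due to imbalance.
Balanced Workload (50% / 50%): both cores carry the same share of the work.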
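The cost of the 20% / 80% split can be put into numbers with a short sketch. It assumes, for illustration only, that the whole job is 100 equal units of work, that each unit takes the same time on either core, and that there is no communication overhead; the function names are hypothetical.

```python
def finish_time(shares):
    """The parallel run ends when the most heavily loaded core finishes."""
    return max(shares)

def idle_times(shares):
    """Time each core spends waiting for the slowest core."""
    finish = finish_time(shares)
    return [finish - share for share in shares]

def speedup(shares):
    """Serial time (all work on one core) over parallel finish time."""
    return sum(shares) / finish_time(shares)

imbalanced = [20, 80]  # work units on core 0 and core 1
balanced = [50, 50]

for name, shares in (("imbalanced", imbalanced), ("balanced", balanced)):
    print(name, "finish:", finish_time(shares),
          "idle:", idle_times(shares),
          "speedup:", speedup(shares))
```

Under these assumptions the imbalanced run leaves core 0 idle for 60 of the 80 time units and reaches a speedup of only 1.25 on two cores, while the balanced run reaches 2.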
Challenges: Not Enough Parallelism
Not Enough Task Parallelism: Task A, Task B, Task C, Task D, Task E
Data too small for Data Parallelism
Measuring Parallel Performance: Speedup
Speedup is commonly used to assess the performance of a parallel program. It is defined as the execution time on a single core ($T_1$) divided by the execution time on $p$ cores ($T_p$) (Amdahl, 1967): $S_p = T_1 / T_p$. Linear, or ideal, speedup is reached when $S_p = p$.
[Figure: linear speedup versus actual speedup. x-axis: number of cores; y-axis: speedup.]
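The definition translates directly into code. In the sketch below the timings are made-up values chosen for illustration, not measurements from the slides, and `speedup` is a hypothetical helper name.

```python
def speedup(t1, tp):
    """Speedup S_p = T_1 / T_p for single-core time t1 and p-core time tp."""
    if tp <= 0:
        raise ValueError("tp must be positive")
    return t1 / tp

t1 = 120.0                             # hypothetical single-core time (seconds)
timings = {2: 65.0, 4: 40.0, 8: 30.0}  # hypothetical times on p cores

for p, tp in timings.items():
    s = speedup(t1, tp)
    print(f"p={p}: speedup={s:.2f} (linear speedup would be {p})")
```

With these illustrative numbers the gap between actual and linear speedup widens as cores are added, which is the pattern the speedup plot is meant to show.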
Amdahl's Law: Theoretical Speedup
Serial Portion: Task A, Task B
Parallel Portion: Task C, Task D, Task E
Assume P is the parallel portion of a parallel program; then (1-P) is the portion that cannot be made parallel (the serial portion). Amdahl's law states that the maximum speedup on N processors is:
$$S(N) = \frac{1}{(1-P) + \frac{P}{N}}$$
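A worked instance may make the formula more concrete. The values P = 0.9 and N = 10 are assumed here purely for illustration; they do not come from the slides.

```latex
\begin{align*}
S(N)  &= \frac{1}{(1 - P) + \frac{P}{N}} \\
S(10) &= \frac{1}{(1 - 0.9) + \frac{0.9}{10}}
       = \frac{1}{0.1 + 0.09}
       = \frac{1}{0.19} \approx 5.26
\end{align*}
```

Ten processors give a speedup of only about 5.26, because the serial term (1-P) = 0.1 stays the same no matter how many processors are added.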
Amdahl's Law: Examples
$$S(N) = \frac{1}{(1-P) + \frac{P}{N}}$$
As N tends to infinity, S(N) tends to 1/(1-P).

Parallel Portion    Maximum Speedup*
99%                 100
95%                 20
90%                 10
75%                 4
50%                 2
25%                 1.3

* Even if we have one million processing cores!
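The limits in the table can be checked with a short script. This is a sketch with hypothetical function names; it evaluates Amdahl's law at one million cores and compares the result with the limit 1/(1-P).

```python
def amdahl_speedup(parallel_portion, n_cores):
    """Amdahl's law: S(N) = 1 / ((1 - P) + P / N)."""
    serial_portion = 1.0 - parallel_portion
    return 1.0 / (serial_portion + parallel_portion / n_cores)

def amdahl_limit(parallel_portion):
    """Limit of S(N) as N tends to infinity: 1 / (1 - P). Requires P < 1."""
    return 1.0 / (1.0 - parallel_portion)

print("Parallel portion   S(1,000,000)   Limit")
for p in (0.99, 0.95, 0.90, 0.75, 0.50, 0.25):
    at_million = amdahl_speedup(p, 1_000_000)
    print(f"{p:>16.0%}   {at_million:>12.1f}   {amdahl_limit(p):>5.1f}")
```

Even at one million cores the speedup stays just below the limit in every row, since the serial portion sets the ceiling regardless of core count.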