1
Performance Evaluation of Parallel Processing
2
Why Performance?
3
Models of Speedup
Speedup
Scaled Speedup ◦ Parallel processing gain over sequential processing, where problem size scales up with computing power (having sufficient workload/parallelism)
4
Speedup
Ts = time for the best serial algorithm
Tp = time for the parallel algorithm using p processors
S(p) = Ts / Tp
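The definition can be made concrete with a tiny helper (a sketch; the timing values are illustrative, taken from the 100-time-unit example that follows):

```python
# Speedup: ratio of the best serial time to the parallel time on p processors.

def speedup(t_serial: float, t_parallel: float) -> float:
    """S(p) = Ts / Tp."""
    return t_serial / t_parallel

# Example: a serial run of 100 time units vs. 25 units on 4 processors.
print(speedup(100, 25))  # 4.0
```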
5
Example
[Figure: (a) one processor completes the work in 100 time units; (b) four processors complete it in 25 time units each; (c) four processors take 35 time units each]
6
Example (cont.)
[Figure: (d) four processors with unbalanced loads of 30, 20, 40, and 10 time units; (e) four processors taking 50 time units each]
7
What Is “Good” Speedup?
Linear speedup: S(p) = p
Superlinear speedup: S(p) > p
Sub-linear speedup: S(p) < p
8
Speedup
[Plot: speedup as a function of the number of processors p]
9
Ideal Speedup in a Multiprocessor System
Linear speedup ─ the execution time of a program on an n-processor system would be 1/n-th of its execution time on a one-processor system
10
Limitations
Interprocessor communication
Synchronization
Load balancing
11
Limitations: Interprocessor Communication
Whenever one processor computes a value that is needed by the fraction of the program running on another processor, that value must be communicated to the processors that need it, which takes time
On a uniprocessor system, the entire program runs on one processor, so no time is lost to interprocessor communication
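The cost can be sketched with a toy model (my own illustration, not from the slides): each processor computes 1/p of the work, but pays a per-partner communication cost that a uniprocessor run avoids, so speedup eventually peaks and then declines.

```python
# Toy model: t_compute is divided evenly over p processors, but each
# processor spends t_comm time units exchanging data with each of the
# other p - 1 processors. The constants are illustrative assumptions.

def speedup_with_comm(t_compute: float, p: int, t_comm: float) -> float:
    t_parallel = t_compute / p + t_comm * (p - 1)
    return t_compute / t_parallel

for p in (1, 2, 4, 8, 16):
    print(p, round(speedup_with_comm(100.0, p, 1.0), 2))
```

With these constants the speedup rises up to 8 processors and then drops at 16, because communication grows while the per-processor compute share shrinks.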
12
Limitations: Synchronization
It is often necessary to synchronize the processors to ensure that they have all completed some phase of the program before any processor begins working on the next phase of the program
13
Load Balancing
In many parallel applications, it is difficult to divide the program across the processors so that each processor works for the same amount of time
When this is not possible, some of the processors complete their tasks early and are then idle, waiting for the others to finish
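A short sketch of this effect, reusing the unbalanced timings (30, 20, 40, 10) from the earlier example: the parallel phase ends only when the slowest processor finishes, so speedup is bounded by the largest per-processor time.

```python
# Load imbalance: the parallel run takes max(per-processor times),
# so idle processors buy nothing.

def speedup_imbalanced(t_serial: float, per_proc_times: list[float]) -> float:
    return t_serial / max(per_proc_times)

print(speedup_imbalanced(100, [30, 20, 40, 10]))  # 2.5 instead of the ideal 4.0
print(speedup_imbalanced(100, [25, 25, 25, 25]))  # 4.0 with a balanced split
```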
14
Superlinear Speedups
Achieving a speedup greater than n on an n-processor system
Each of the processors in an n-processor multiprocessor completes its fraction of the program in less than 1/n-th of the program’s execution time on a uniprocessor
15
Factors That Limit Speedup
● Software Overhead Even with a completely equivalent algorithm, software overhead arises in the concurrent implementation
● Load Balancing Speedup is generally limited by the speed of the slowest node, so an important consideration is to ensure that each node performs the same amount of work
● Communication Overhead Assuming that communication and calculation cannot be overlapped, any time spent communicating data between processors directly degrades the speedup
16
CS546 Lecture 5, Page 16
Degradations of Parallel Processing
Unbalanced Workload
Communication Delay
Overhead Increases with the Ensemble Size
17
Degradations of Distributed Computing
Unbalanced Computing Power and Workload
Shared Computing and Communication Resource
Uncertainty, Heterogeneity, and Overhead Increases with the Ensemble Size
18
Causes of Superlinear Speedup
Cache size increased
Overhead reduced
Latency hidden
Randomized algorithms
Mathematical inefficiency of the serial algorithm
Higher memory access cost in sequential processing
X.H. Sun and J. Zhu, "Performance Considerations of Shared Virtual Memory Machines," IEEE Trans. on Parallel and Distributed Systems, Nov. 1995
19
Efficiency
● Speedup does not measure how efficiently the processors are being used
● Is it worth using 100 processors to get a speedup of 2?
● Efficiency is defined as the ratio of the speedup to the number of processors required to achieve it
● Efficiency is given by E(P, N) = S(P, N) / P
20
If the best known serial algorithm takes 8 seconds, i.e. Ts = 8, while a parallel algorithm takes 2 seconds using 5 processors, then S = Ts / Tp = 8 / 2 = 4 and E = S / P = 4 / 5 = 0.8
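The arithmetic above can be checked with a minimal sketch:

```python
# Efficiency E = S / P, using the slide's numbers: Ts = 8 s,
# Tp = 2 s on P = 5 processors.

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    return (t_serial / t_parallel) / p

print(efficiency(8, 2, 5))  # 0.8 — each processor is 80% usefully busy
```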
21
Say we have a program containing 100 operations, each of which takes 1 time unit. If 80 operations can be done in parallel, i.e. P = 80, and 20 operations must be done sequentially, i.e. S = 20, then using 80 processors find the speedup
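A short sketch working this exercise: the sequential 20 operations still take 20 time units, while the 80 parallel operations take 80/80 = 1 unit on 80 processors, so the speedup is 100/21 ≈ 4.76, far below the ideal 80.

```python
# The 100-operation example: 20 sequential ops, 80 parallelizable ops,
# each taking 1 time unit, run on 80 processors.

serial_ops = 20
parallel_ops = 80
p = 80

t_serial = serial_ops + parallel_ops        # 100 time units on one processor
t_parallel = serial_ops + parallel_ops / p  # 20 + 1 = 21 time units
print(t_serial / t_parallel)                # ≈ 4.76
```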
22
Speedup Metrics
Three performance models based on three speedup metrics are commonly used:
Amdahl’s law -- fixed-size speedup
Gustafson’s law -- fixed-time speedup
Sun-Ni’s law -- memory-bounded speedup
Three approaches to scalability analysis are based on maintaining:
a constant efficiency,
a constant speed, and
a constant utilization
23
Amdahl’s Law
The performance improvement that can be gained by a parallel implementation is limited by the fraction of time parallelism can actually be used in an application
Let f = fraction of the program (algorithm) that is serial and cannot be parallelized. For instance:
◦ Loop initialization
◦ Reading/writing to a single disk
◦ Procedure call overhead
Parallel run time is given by Tp = f·T1 + (1 − f)·T1 / p
24
Amdahl’s Law
Amdahl’s law gives a limit on speedup in terms of f:
S(p) = 1 / (f + (1 − f)/p) ≤ 1/f as p → ∞
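The bound can be seen numerically (a minimal sketch; the serial fraction 0.05 is an illustrative choice):

```python
# Amdahl's law: S(p) = 1 / (f + (1 - f)/p), where f is the serial
# fraction. As p grows, the speedup saturates at 1/f.

def amdahl_speedup(f: float, p: int) -> float:
    return 1.0 / (f + (1.0 - f) / p)

for p in (10, 100, 1000):
    print(p, round(amdahl_speedup(0.05, p), 2))
# With f = 0.05 the speedup approaches 1 / 0.05 = 20 but never exceeds it.
```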
25
Fixed-Size Speedup (Amdahl’s Law, 1967)
[Figure: with the problem size fixed, the sequential work W1 stays constant while the parallel work Wp is shared among 1–5 processors; correspondingly, the sequential time T1 stays constant while the parallel time Tp shrinks as p grows]
26
Consider the effect of the serial fraction F on the speedup produced for N = 10 and N = 1024.
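This exercise can be tabulated with a short script (the values of F are illustrative choices of mine):

```python
# Amdahl speedup for several serial fractions F, at N = 10 and N = 1024
# processors, as the slide suggests.

def amdahl(f: float, n: int) -> float:
    return 1.0 / (f + (1.0 - f) / n)

print("F      N=10    N=1024")
for f in (0.0, 0.01, 0.05, 0.1):
    print(f"{f:<6} {amdahl(f, 10):<7.2f} {amdahl(f, 1024):.2f}")
```

Even a 1% serial fraction caps 1024 processors at a speedup near 91, while N = 10 barely notices it; the larger the machine, the more the serial fraction dominates.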
28
Comments on Amdahl’s Law
The Amdahl fraction in practice depends on the problem size n and the number of processors p
An effective parallel algorithm has a serial fraction that shrinks as the problem grows: f(n) → 0 as n → ∞
For such a case, even if one fixes p, we can get linear speedups by choosing a suitably large problem size (scalable speedup)
Practically, the problem size that we can run for a particular problem is limited by the time and memory of the parallel computer
29
Gustafson’s Law
Gustafson defined two “more relevant” notions of speedup:
» Scaled speedup
» Fixed-time speedup
» And renamed Amdahl’s version “fixed-size” speedup
30
Gustafson’s Law
S(p) = p − f·(p − 1), where f is the serial fraction of the execution time measured on the parallel system
32
Gustafson’s Law: Scaling for Higher Accuracy
Under Amdahl’s assumption, the problem size (workload) is fixed and cannot scale to match the available computing power as the machine size increases. Thus, Amdahl’s law leads to diminishing returns when a larger system is employed to solve a small problem
The sequential bottleneck in Amdahl’s law can be alleviated by removing the restriction of a fixed problem size
Gustafson proposed a fixed-time concept that achieves an improved speedup by scaling the problem size with the increase in machine size
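The contrast with Amdahl's fixed-size model can be sketched numerically. Gustafson's scaled speedup is S(p) = p − f·(p − 1), where f is the serial fraction of the time on the parallel machine; because the workload grows with p, the speedup grows nearly linearly instead of saturating at 1/f.

```python
# Gustafson's fixed-time (scaled) speedup. The serial fraction 0.05 is
# an illustrative choice, matching the earlier Amdahl example.

def gustafson_speedup(f: float, p: int) -> float:
    return p - f * (p - 1)

for p in (10, 100, 1024):
    print(p, round(gustafson_speedup(0.05, p), 2))
# At f = 0.05, 1024 processors give a scaled speedup near 973,
# versus the Amdahl fixed-size cap of 20 for the same fraction.
```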