4. Performance Analysis of Parallel Programs
Performance Evaluation of Computer Systems
1. CPU time (response time metric): depends on the program and on the efficiency of the compiler.
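For reference, the classic decomposition of CPU time (a standard formula, filled in here since the slide does not spell it out): with instruction count n_instr, average cycles per instruction CPI, and cycle time t_cycle (clock rate f_clock = 1/t_cycle),

    T_{CPU} = n_{instr} \cdot CPI \cdot t_{cycle} = \frac{n_{instr} \cdot CPI}{f_{clock}}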
2. MIPS and MFLOPS (throughput metrics): depend on the program and on the type of instructions executed; in particular, MFLOPS makes no differentiation between cheap floating-point operations such as an add and expensive ones such as a divide.
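The standard definitions (not spelled out on the slide) are, for a program executing n_instr instructions and n_flops floating-point operations in time T_exec:

    MIPS = \frac{n_{instr}}{T_{exec} \cdot 10^6}, \qquad MFLOPS = \frac{n_{flops}}{T_{exec} \cdot 10^6}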
3. Performance of processors with a memory hierarchy: the execution time must additionally account for cache and main-memory access times, in particular in the case of multiple cache levels.
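For a two-level cache, the usual textbook expression for the average memory access time (filling in the equation the slide leaves out; the notation is assumed) is

    T_{access} = T_{hit}(L1) + m(L1) \cdot \bigl( T_{hit}(L2) + m(L2) \cdot T_{mem} \bigr)

where m(Li) is the miss rate of cache level Li and T_mem is the main-memory access time.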
4. Benchmark programs:
- Synthetic benchmarks: small artificial programs that represent a large class of real applications, such as Whetstone and Dhrystone.
- Kernel benchmarks: small but relevant parts of real applications, such as the Livermore Loops, or toy programs like quicksort.
- Real application benchmarks: collections of entire programs that reflect the workload of a typical user, called benchmark suites, such as the SPEC benchmarks (Standard Performance Evaluation Corporation) and the EEMBC benchmarks (EDN Embedded Microprocessor Benchmark Consortium).
Performance Metrics for Parallel Programs
Parallel runtime T_p(n): the time between the start of the program and the end of the execution on all participating processors. It comprises:
- the execution of local computations on each participating processor;
- the exchange of data between processors of a distributed address space;
- the synchronization of the participating processors when accessing shared data structures;
- waiting times occurring because of an unequal load distribution;
- parallelization overhead.
A common way to write this decomposition as a formula is sketched below.
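One common decomposition (the notation is assumed here, not given on the slide): the parallel runtime splits into computation, communication, synchronization, and idle time,

    T_p(n) = T_{comp}(n, p) + T_{comm}(n, p) + T_{sync}(n, p) + T_{wait}(n, p)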
Cost of a parallel program C_p(n): the total amount of work performed by all processors, C_p(n) = p \cdot T_p(n). A parallel program is called cost-optimal if C_p(n) = \Theta(T^*(n)), where T^*(n) is the runtime of the fastest sequential algorithm.
Speedup: S_p(n) = \frac{T^*(n)}{T_p(n)}
Efficiency: E_p(n) = \frac{S_p(n)}{p} = \frac{T^*(n)}{p \cdot T_p(n)}
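As a quick numeric illustration (the numbers are assumed, not from the slides): with T^*(n) = 100 s, p = 4 processors, and T_p(n) = 30 s, the speedup is S_4(n) = 100/30 ≈ 3.33, the efficiency E_4(n) ≈ 0.83, and the cost C_4(n) = 4 \cdot 30 s = 120 s, i.e. the four processors together spend 20% more work than the fastest sequential algorithm.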
Speedup limit: Amdahl's law. When a (constant) fraction f, 0 ≤ f ≤ 1, of a parallel program must be executed sequentially, the parallel runtime is at least T_p(n) \ge f \cdot T^*(n) + \frac{1-f}{p} \cdot T^*(n), so the attainable speedup is bounded by

    S_p(n) = \frac{T^*(n)}{T_p(n)} \le \frac{1}{f + (1-f)/p} \le \frac{1}{f}

For example, if 20% of a program must be executed sequentially (f = 0.2), the attainable speedup is limited to 1/f = 5, regardless of the number of processors.
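A minimal sketch of this bound, assuming the idealized model above (the function and variable names are illustrative, not from the slides), shows how the speedup saturates at 1/f:

    def amdahl_speedup(f: float, p: int) -> float:
        """Attainable speedup for sequential fraction f on p processors."""
        return 1.0 / (f + (1.0 - f) / p)

    # A sequential fraction of 20% caps the speedup at 1/f = 5,
    # no matter how many processors are added.
    for p in (1, 2, 4, 16, 256, 4096):
        print(f"p = {p:5d}: speedup = {amdahl_speedup(0.2, p):.3f}")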
Scalability: For a fixed problem size n, a saturation of the speedup can be observed when the number p of processors is increased. Efficiency can, however, be kept constant if both the number p of processors and the problem size n are increased: larger problems can be solved in the same time as smaller problems if a sufficiently large number of processors is employed. Gustafson's law captures the special case in which the sequential program part has a constant execution time, independent of the problem size; see the formula below.
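Stated as a formula (the standard form from the literature; τ_f for the constant sequential part and τ_v(n, 1) for the sequential runtime of the parallelizable part are assumed notation, not given on the slide), with a perfectly parallelizable part τ_v(n, p) = τ_v(n, 1)/p, the scaled speedup is

    S_p(n) = \frac{\tau_f + \tau_v(n, 1)}{\tau_f + \tau_v(n, 1)/p}

which approaches p as the problem size n, and with it τ_v(n, 1), grows.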