4. Assessing and Understanding Performance
4. Performance
  4.1 Introduction
  4.2 CPU Performance and Its Factors
  4.3 Evaluating Performance
  4.4 Real Stuff: Two SPEC Benchmarks and the Performance of Recent Intel Processors
  4.5 Fallacies and Pitfalls
  4.6 Concluding Remarks
  4.7 Historical Perspective and Further Reading
  4.8 Exercises
4.1 Introduction
  Defining performance
  How to measure, report, and summarize performance

Defining Performance
  An analogy: Figure 4.1

  Airplane            Passenger   Cruising range   Cruising speed   Passenger throughput
                      capacity    (miles)          (m.p.h.)         (passengers x m.p.h.)
  Boeing 777          375         4630             610              228,750
  Boeing 747          470         4150             610              286,700
  BAC/Sud Concorde    132         4000             1350             178,200
  Douglas DC-8-50     146         8720             544              79,424

  Passenger throughput = passenger capacity x cruising speed
Performance of a Computer
  Response time (= execution time)
    The time between the start and completion of a task
  Throughput
    The total amount of work done in a given time
  Performance and execution time
    Performance_X = 1 / Execution time_X
    "X is n times faster than Y" means
    Performance_X / Performance_Y = Execution time_Y / Execution time_X = n
Measuring Performance
  Definitions of time
    Wall-clock time = response time = elapsed time
      Total time to complete a task
      Includes disk accesses, memory accesses, I/O activities, OS overhead, etc.
    CPU execution time = CPU time
      The time the CPU spends computing for this task
      CPU time = user CPU time + system CPU time
    UNIX time command: 90.7u 12.9s 2:39 65%
      User CPU time 90.7 seconds, system CPU time 12.9 seconds, elapsed time 2 minutes 39 seconds
      CPU time is 65% of elapsed time
  Definitions of performance
    System performance: based on elapsed time
    CPU performance: based on user CPU time
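The 65% reported by the time command can be checked by hand. A minimal sketch in Python, assuming the fields are user seconds, system seconds, and elapsed time in minutes:seconds; the function name cpu_utilization is hypothetical and only illustrates the arithmetic.

```python
def cpu_utilization(user_s, system_s, elapsed):
    """Fraction of elapsed (wall-clock) time that was CPU time.

    elapsed is a "minutes:seconds" string, as in the slide's 2:39.
    """
    minutes, seconds = elapsed.split(":")
    elapsed_s = int(minutes) * 60 + float(seconds)
    cpu_s = user_s + system_s  # CPU time = user CPU time + system CPU time
    return cpu_s / elapsed_s


# Values from the slide: 90.7u 12.9s 2:39
share = cpu_utilization(90.7, 12.9, "2:39")
print(f"{share:.0%}")  # 103.6 s of CPU time over 159 s elapsed -> 65%
```

The remaining 35% of the elapsed time went to I/O waits or to other programs.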
4.2 CPU Performance and Its Factors
  CPU execution time = CPU clock cycles x clock cycle time
                     = CPU clock cycles / clock rate
  Example: Improving Performance
    Same instruction set
    Computer A: 4 GHz, 10 seconds
    Computer B: ? GHz, 6 seconds
    B requires 1.2 times as many clock cycles as A.
[Answer]
  CPU time_A = CPU clock cycles_A / clock rate_A
  10 seconds = CPU clock cycles_A / (4 x 10^9 cycles/second)
  CPU clock cycles_A = 10 seconds x 4 x 10^9 cycles/second = 40 x 10^9 cycles

  CPU time_B = CPU clock cycles_B / clock rate_B
             = 1.2 x CPU clock cycles_A / clock rate_B
  6 seconds = 1.2 x 40 x 10^9 cycles / clock rate_B
  clock rate_B = 1.2 x 40 x 10^9 cycles / 6 seconds = 8 x 10^9 cycles/second = 8 GHz
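The same calculation can be sketched in Python. The function name required_clock_rate is hypothetical; the numbers are the ones from the example.

```python
def required_clock_rate(cycles_a, cycle_ratio, target_time_s):
    """Clock rate (Hz) needed to finish in target_time_s when the new
    machine needs cycle_ratio times as many clock cycles as machine A."""
    cycles_b = cycle_ratio * cycles_a
    return cycles_b / target_time_s  # clock rate = clock cycles / CPU time


clock_rate_a = 4e9                 # 4 GHz
time_a = 10.0                      # seconds
cycles_a = time_a * clock_rate_a   # 40 x 10^9 cycles

clock_rate_b = required_clock_rate(cycles_a, cycle_ratio=1.2, target_time_s=6.0)
print(f"{clock_rate_b / 1e9:.1f} GHz")  # 8.0 GHz
```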
Hardware Software Interface
  CPU clock cycles = IC x CPI
    IC (Instruction Count)
      Dependent on compilers and architectures
    CPI (Cycles Per Instruction)
      Dependent on implementations
  Performance equation
    Execution time = IC x CPI x clock cycle time
                   = (IC x CPI) / clock rate
Example: Using the Performance Equation
  Same instruction set architecture, same program
  Clock cycle time_A = 250 ps, CPI_A = 2.0
  Clock cycle time_B = 500 ps, CPI_B = 1.2
  Which is faster, and by how much?
[Answer]
  Let I = instruction count for the program (the same on A and B).
  CPU time_A = I x CPI_A x clock cycle time_A = I x 2.0 x 250 ps = 500 x I ps
  CPU time_B = I x CPI_B x clock cycle time_B = I x 1.2 x 500 ps = 600 x I ps
  Then
    CPU performance_A / CPU performance_B = CPU time_B / CPU time_A
                                          = (600 x I ps) / (500 x I ps) = 1.2
  Thus, A is 1.2 times faster than B for this program.
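A short Python sketch of the performance equation applied to this example. The function name cpu_time_ps and the instruction count of one million are assumptions for illustration; the ratio does not depend on the count, since it cancels.

```python
def cpu_time_ps(instruction_count, cpi, clock_cycle_time_ps):
    """Performance equation: execution time = IC x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_time_ps


instruction_count = 1_000_000  # hypothetical; any positive value gives the same ratio

time_a = cpu_time_ps(instruction_count, cpi=2.0, clock_cycle_time_ps=250)
time_b = cpu_time_ps(instruction_count, cpi=1.2, clock_cycle_time_ps=500)

# Performance is the reciprocal of execution time,
# so performance_A / performance_B = time_B / time_A.
speedup_a_over_b = time_b / time_a
print(f"A is {speedup_a_over_b:.1f} times faster than B")
```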
The Big Picture
Example: Comparing Code Segments
  Which will be faster?
  What is the CPI for each sequence?
[Answer]
  instruction count_1 = 2 + 1 + 2 = 5 and instruction count_2 = 4 + 1 + 1 = 6
  Thus (1) executes fewer instructions.
  CPU clock cycles_1 = 2x1 + 1x2 + 2x3 = 10 and
  CPU clock cycles_2 = 4x1 + 1x2 + 1x3 = 9
  Thus (2) is faster.
  CPI_1 = CPU clock cycles_1 / instruction count_1 = 10 / 5 = 2
  CPI_2 = CPU clock cycles_2 / instruction count_2 = 9 / 6 = 1.5
  (2) has the lower CPI.
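The answer can be reproduced with a few lines of Python. This is a sketch that assumes three instruction classes with CPIs of 1, 2, and 3, as read off the cycle sums in the answer; the names CPI_PER_CLASS and summarize are hypothetical.

```python
# CPI of each instruction class, taken from the cycle sums in the answer
CPI_PER_CLASS = (1, 2, 3)


def summarize(counts, cpi_per_class=CPI_PER_CLASS):
    """Return (instruction count, clock cycles, average CPI) for one sequence.

    counts[i] is the number of instructions of class i in the sequence.
    """
    instruction_count = sum(counts)
    clock_cycles = sum(n * cpi for n, cpi in zip(counts, cpi_per_class))
    return instruction_count, clock_cycles, clock_cycles / instruction_count


seq1 = summarize((2, 1, 2))  # (5, 10, 2.0)
seq2 = summarize((4, 1, 1))  # (6, 9, 1.5)
print(seq1, seq2)
```

Sequence 2 executes more instructions but needs fewer clock cycles, so instruction count alone does not predict execution time.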
4.3 Evaluating Performance
  Benchmarking
    The process of comparing the performance of two or more systems by measurement
  Benchmark
    Programs specifically chosen to measure performance
    A workload that the user hopes will predict the performance of the actual workload
  Compiler tricks
    Optimizations in either the architecture or compiler
Compiler Tricks by IBM
Comparing and Summarizing Performance
  Difficulties with summarizing performance
    A is 10 times faster than B for program 1.
    B is 10 times faster than A for program 2.
  Total execution time: a consistent summary measure
    AM (arithmetic mean) = (1/n) x sum over i = 1..n of Time_i
    Weighted arithmetic mean = sum over i = 1..n of Weight_i x Time_i, where the weights sum to 1
  Figure 4.4
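A small Python sketch of the two summary measures. The execution times below are hypothetical values chosen only to match the two "10 times faster" statements, and the function names are illustrative.

```python
def arithmetic_mean(times):
    """AM = (1/n) x sum of the execution times."""
    return sum(times) / len(times)


def weighted_arithmetic_mean(times, weights):
    """Sum of weight_i x time_i; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * t for w, t in zip(weights, times))


# Hypothetical execution times in seconds for (program 1, program 2)
times_a = [1.0, 1000.0]   # A is 10 times faster than B on program 1
times_b = [10.0, 100.0]   # B is 10 times faster than A on program 2

total_a, total_b = sum(times_a), sum(times_b)   # 1001.0 and 110.0
am_a, am_b = arithmetic_mean(times_a), arithmetic_mean(times_b)

# Equal weights reproduce the plain arithmetic mean.
wam_a = weighted_arithmetic_mean(times_a, [0.5, 0.5])
print(total_a, total_b, am_a, am_b, wam_a)
```

With these numbers B has the smaller total execution time even though each machine wins one program by the same factor, which is why total time (or its mean) is the consistent summary.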
4.6 Concluding Remarks
  Three design criteria
    High-performance design
      Supercomputers and high-end servers
    Low-cost design
      Embedded systems
    Cost/performance design
      Desktop computers
  Execution time of real programs as the metric