Published by Jack Gibson. Modified over 9 years ago.
1
Lecture 2a: Performance Measurement
2
Goals of Performance Analysis
The goal of performance analysis is to provide quantitative information about the performance of a computer system.
3
Goals of Performance Analysis
- Compare alternatives: provide quantitative information when purchasing a new computer system.
- Determine the impact of a feature: provide a before-and-after comparison when designing a new system or upgrading an existing one.
- System tuning: find the parameter values that produce the best overall performance.
- Identify relative performance: quantify performance relative to previous generations.
- Performance debugging: identify performance problems and correct them.
- Set expectations: determine the expected capabilities of the next generation.
4
Performance Evaluation
Performance evaluation steps:
1. Measurement / Prediction
   - What to measure? How to measure?
   - Modeling for prediction
     - Simulation
     - Analytical modeling
2. Analysis & Reporting
   - Performance metrics
5
Performance Measurement: Interval Timers
- Hardware timers
- Software timers
6
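Performance Measurement: Hardware Timers
- The counter value is read from a memory location.
- [Figure: a clock with period $T_c$ drives an n-bit counter, which is connected to the processor memory bus.]
- Time is calculated as $\text{Time} = (x_2 - x_1) \times T_c$, where $x_1$ and $x_2$ are the counter values read before and after the event.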
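The hardware-timer calculation can be sketched in C. This is a minimal sketch, assuming a 32-bit counter whose two readings are already available as variables; the function name `counter_elapsed_seconds` is hypothetical and not from the slides.

```c
#include <stdint.h>

/* Elapsed time from two readings of a free-running hardware counter.
 * x1, x2: counter values read before and after the event.
 * tc:     clock period T_c in seconds.
 * Unsigned subtraction wraps modulo 2^32, so the result is still
 * correct if the counter rolled over at most once between the reads. */
double counter_elapsed_seconds(uint32_t x1, uint32_t x2, double tc)
{
    uint32_t ticks = x2 - x1;   /* (x2 - x1) mod 2^32 */
    return (double)ticks * tc;
}
```

For example, readings of 100 and 600 with a 10 ns clock period correspond to 500 ticks, or about 5 microseconds.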
7
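Performance Measurement: Software Timers
- Interrupt-based: when the interrupt occurs, the interrupt-service routine increments the timer value, which is then read by a program.
- [Figure: a clock with period $T_c$ feeds a prescaling counter, whose output with period $T'_c$ drives the processor interrupt input.]
- Time is calculated as $\text{Time} = (x_2 - x_1) \times T'_c$, where $x_1$ and $x_2$ are the timer values read before and after the event.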
8
Performance Measurement: Timer Rollover
- Rollover occurs when an n-bit counter undergoes a transition from its maximum value $2^n - 1$ to zero.
- There is a trade-off between rollover time and accuracy.

Time until rollover:

    T'_c     32-bit       64-bit
    10 ns    42 s         5850 years
    1 µs     1.2 hours    0.5 million years
    1 ms     49 days      0.5 x 10^9 years
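The rollover times follow from the counter width and the tick period: an n-bit counter wraps after $2^n$ ticks. A minimal sketch of that arithmetic, with the hypothetical function name `rollover_seconds`:

```c
/* Time in seconds until an n-bit counter, incremented once every
 * tc seconds, wraps from 2^n - 1 back to zero: 2^n * tc. */
double rollover_seconds(unsigned n_bits, double tc)
{
    double ticks = 1.0;
    for (unsigned i = 0; i < n_bits; i++)
        ticks *= 2.0;            /* powers of two are exact in a double */
    return ticks * tc;
}
```

With a 10 ns tick, `rollover_seconds(32, 1e-8)` is about 43 seconds; with a 1 ms tick, `rollover_seconds(32, 1e-3) / 86400.0` is about 49.7 days.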
9
Timers
Solutions to rollover:
1. Use a 64-bit integer (rollover takes over half a million years).
2. Have the timer return two values:
   - one represents seconds;
   - one represents microseconds since the last second.
   With a 32-bit seconds value, rollover takes over 100 years.
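POSIX `gettimeofday()` is one timer that returns this two-value form, as a `struct timeval` with `tv_sec` and `tv_usec` fields. The sketch below shows how two such readings might be subtracted; the helper name `timeval_diff_seconds` is hypothetical.

```c
#include <sys/time.h>

/* Difference end - start, in seconds, between two (seconds,
 * microseconds) readings. The microsecond field borrows from the
 * seconds field when it would otherwise go negative. */
double timeval_diff_seconds(struct timeval start, struct timeval end)
{
    long sec  = (long)(end.tv_sec - start.tv_sec);
    long usec = (long)(end.tv_usec - start.tv_usec);
    if (usec < 0) {
        sec  -= 1;
        usec += 1000000L;
    }
    return (double)sec + (double)usec * 1e-6;
}
```

In use, each reading would come from a call such as `gettimeofday(&t, NULL)` placed before and after the event being timed.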
10
Performance Measurement: Interval Timers

    T0 = read current time
    event being timed ();
    T1 = read current time

Time for the event is T1 - T0.
11
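Performance Measurement: Timer Overhead
Timeline of one measurement:
- $T_1$: from initiating the first read_time until the current time is read.
- $T_2$: from the time being read until the event begins.
- $T_3$: from the start of the event to its end.
- $T_4$: from the end of the event, when the second read_time is initiated, until the current time is read.
Measured time: $T_m = T_2 + T_3 + T_4$
Desired measurement: $T_e = T_m - (T_2 + T_4) = T_m - (T_1 + T_2)$, since $T_1 = T_4$
Timer overhead: $T_{ovhd} = T_1 + T_2$
$T_e$ should be 100 to 1000 times greater than $T_{ovhd}$.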
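One way to estimate the timer overhead is to time a large number of back-to-back timer reads and divide by the count. The sketch below assumes POSIX `gettimeofday()` as the timer; the names `now_seconds` and `timer_overhead_seconds` are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in seconds. */
double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Average cost of one timer read, estimated by timing `calls`
 * consecutive reads with nothing in between. */
double timer_overhead_seconds(long calls)
{
    double start = now_seconds();
    for (long i = 0; i < calls; i++)
        (void)now_seconds();
    double end = now_seconds();
    return (end - start) / (double)calls;
}
```

The result can then be compared with the measured event time to check that the event is 100 to 1000 times longer than the overhead.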
12
Performance Measurement: Timer Resolution
- Resolution is the smallest change that can be detected by an interval timer.
- $nT'_c < T_e < (n+1)T'_c$
- If $T'_c$ is large relative to the event being measured, it may be impossible to measure the duration of the event.
13
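Performance Measurement: Measuring Short Intervals ($T_e < T'_c$)
[Figure: two events of duration $T_e$ shown against a clock of period $T'_c$; one measurement yields 1 and the other yields 0.]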
14
Performance Measurement: Measuring Short Intervals
Solution: repeat the measurement n times.
- The outcomes approximate a binomial distribution.
- Average execution time: $T'_e = (m/n) \times T'_c$
- m: number of 1s measured
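The estimate can be illustrated with a small simulation. This is a sketch under stated assumptions: time is in integer units, the event is shorter than one clock period, and the n start times are spread evenly over one period instead of being random, so the result is deterministic. The name `estimate_short_event` is hypothetical.

```c
/* Simulate timing an event of length event_len with a timer that only
 * advances once every clock_period units (event_len < clock_period).
 * Each trial reads 1 if the event spans a clock tick and 0 otherwise;
 * the estimate is (m / n) * clock_period, where m counts the 1s. */
double estimate_short_event(long event_len, long clock_period, long n)
{
    long m = 0;
    for (long k = 0; k < n; k++) {
        long start = (k * clock_period) / n;   /* start phase within a period */
        long end   = start + event_len;
        /* Timer reading is the number of whole periods elapsed. */
        m += end / clock_period - start / clock_period;
    }
    return ((double)m / (double)n) * (double)clock_period;
}
```

With an event of 300 units, a clock period of 1000 units and 1000 trials, 300 of the trials span a tick, so the estimate is 300 units.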
15
Performance Measurement: Measuring Short Intervals
Solution: repeat the event n times and measure the total execution time ($T_t$).
- Average execution time: $T'_e = (T_t / n) - h$
- $T_t$: total execution time of n repetitions
- h: repetition overhead
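A minimal sketch of the repetition method, assuming POSIX `gettimeofday()` as the timer and estimating the repetition overhead h by timing the same loop with no event in it. The names `now_seconds`, `time_repetitions` and `average_event_time` are hypothetical.

```c
#include <stddef.h>
#include <sys/time.h>

/* Current wall-clock time in seconds. */
double now_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1e-6;
}

/* Total time T_t for n repetitions of event().
 * Pass NULL to time the loop alone, which gives n * h. */
double time_repetitions(void (*event)(void), long n)
{
    volatile long i;                 /* volatile keeps the empty loop alive */
    double start = now_seconds();
    for (i = 0; i < n; i++)
        if (event != NULL)
            event();
    return now_seconds() - start;
}

/* T'_e = T_t / n - h */
double average_event_time(double total, long n, double h)
{
    return total / (double)n - h;
}
```

A caller would first compute `h = time_repetitions(NULL, n) / n`, then pass the total time for the real event to `average_event_time` together with that h.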
16
Performance Measurement: Time
- Elapsed time / wall-clock time / response time: latency to complete a task, including disk access, memory access, I/O, operating system overhead, and everything else (includes time consumed by other programs in a time-sharing system).
- CPU time: the time the CPU spends computing, not including I/O time or waiting time.
  - User time / user CPU time: CPU time spent in the program.
  - System time / system CPU time: CPU time spent in the operating system performing tasks requested by the program.
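On POSIX systems, `getrusage()` reports user and system CPU time separately for the calling process. The sketch below reads both; the `cpu_times` struct and the function name `read_cpu_times` are hypothetical.

```c
#include <sys/resource.h>
#include <sys/time.h>

typedef struct {
    double user;   /* CPU seconds spent in the program itself */
    double sys;    /* CPU seconds spent in the OS on the program's behalf */
} cpu_times;

/* Read the user and system CPU time consumed so far by this process.
 * Both fields are left at 0.0 if getrusage() fails. */
cpu_times read_cpu_times(void)
{
    struct rusage ru;
    cpu_times t = { 0.0, 0.0 };
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        t.user = (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec * 1e-6;
        t.sys  = (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec * 1e-6;
    }
    return t;
}
```

Calling it before and after a section of code and subtracting the fields gives the CPU time of that section, which can differ widely from its elapsed time on a busy system.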
17
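Performance Measurement: UNIX time command
Example output: 90.7u 12.9s 2:39 65%
- 90.7u: user time (seconds)
- 12.9s: system time (seconds)
- 2:39: elapsed time (minutes:seconds)
- 65%: CPU time (user + system) as a percentage of elapsed time
Drawbacks:
- Resolution is in milliseconds.
- Different sections of the code cannot be timed.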
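The percentage field can be checked from the other three. A minimal sketch of the arithmetic, with the hypothetical function name `cpu_percent`:

```c
/* CPU time (user + system) as a percentage of elapsed time.
 * All arguments are in seconds. */
double cpu_percent(double user_s, double sys_s, double elapsed_s)
{
    return 100.0 * (user_s + sys_s) / elapsed_s;
}
```

For the example output, the elapsed time 2:39 is 159 seconds, so `cpu_percent(90.7, 12.9, 159.0)` is about 65.2, matching the reported 65%.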
18
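Timers
A timer is a function, subroutine, or program that returns the amount of time spent in a section of code. Two calling styles are shown.
Style 1: the timer returns the current time.

    t0 = timer();
    ...
    t1 = timer();
    time = t1 - t0;

Style 2: the timer returns the current time minus its argument.

    zero = 0.0;
    t0 = timer(&zero);
    ...
    t1 = timer(&t0);
    time = t1;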
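The slides do not show how `timer` is implemented. One possible sketch of the pointer-argument form, assuming POSIX `gettimeofday()` as the underlying clock (an assumption, not the lecture's stated choice):

```c
#include <stddef.h>
#include <sys/time.h>

/* Returns the current wall-clock time in seconds minus *t0.
 * With *t0 == 0.0 the result is the current time; with *t0 set to an
 * earlier reading, the result is the time elapsed since that reading. */
double timer(double *t0)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return ((double)tv.tv_sec + (double)tv.tv_usec * 1e-6) - *t0;
}
```

The choice of clock determines what is measured: `gettimeofday()` gives elapsed time, while a version built on `clock()` or `getrusage()` would give CPU time.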
19
Timers
Reading: Wadleigh and Crawford, pp. 130-136, for time, clock, gettimeofday, etc.
20
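Timers: Measuring Timer Resolution

    #include <stdio.h>

    double timer(double *t0);   /* returns the current time minus *t0 */
    int foo(int n);

    int main(void)
    {
        double zero, t0, t1;
        int j;

        zero = 0.0;
        t0 = timer(&zero);
        t1 = 0.0;
        j = 0;
        /* Grow the workload until the timer first reports a nonzero time. */
        while (t1 == 0.0) {
            j++;
            zero = 0.0;
            t0 = timer(&zero);
            foo(j);
            t1 = timer(&t0);
        }
        printf("It took %d iterations for a nonzero time\n", j);
        if (j == 1)
            printf("timer resolution <= %13.7f seconds\n", t1);
        else
            printf("timer resolution is %13.7f seconds\n", t1);
        return 0;
    }

    /* Workload whose run time grows with n. */
    int foo(int n)
    {
        volatile int i = 0;   /* volatile keeps the loop from being optimized away */
        int k;
        for (k = 0; k < n; k++)
            i++;
        return i;
    }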
21
Timers: Measuring Timer Resolution (results)
Using clock():

    It took 682 iterations for a nonzero time
    timer resolution is 0.0200000 seconds

Using times():

    It took 720 iterations for a nonzero time
    timer resolution is 0.0200000 seconds

Using getrusage():

    It took 7374 iterations for a nonzero time
    timer resolution is 0.0002700 seconds
22
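Timers: Spin Loops
- Used for codes that take less time to run than the resolution of the timer.
- The first call to a function may require an inordinate amount of time, so the minimum of all times may be desired.

    #include <stdio.h>
    #include <stdlib.h>

    #define min(a, b) ((a) < (b) ? (a) : (b))

    double timer(double *t0);   /* returns the current time minus *t0 */
    void foo(int n);

    int main(int argc, char *argv[])
    {
        double zero = 0.0, t0, t1, t2 = 100000.0;
        int j, n = (argc > 1) ? atoi(argv[1]) : 0;   /* repetition count from the caller */

        if (n <= 0)
            return 1;
        for (j = 0; j < n; j++) {
            t0 = timer(&zero);
            foo(n);              /* each timed call spins n times */
            t1 = timer(&t0);
            t2 = min(t2, t1);    /* keep the fastest of the n trials */
        }
        t2 = t2 / n;             /* time per repetition inside foo */
        printf("Minimum time is %13.7f seconds\n", t2);
        return 0;
    }

    void foo(int n)
    {
        /* ... run the code being timed n times ... */
    }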
23
Profilers
- A profiler automatically inserts timing calls into applications.
- It is used to identify the portions of a program that consume the largest fraction of the total execution time.
- It may also be used to find system-level bottlenecks in a multitasking system.
- Profilers may alter the timing of a program's execution.
24
Profilers
Data collection techniques:
- Sampling-based
  - These profilers use a predefined clock; at every multiple of the clock tick the program is interrupted and its state information is recorded.
  - They give a statistical profile of the program's behavior.
  - They may miss some important events.
- Event-based
  - Events are defined (e.g. entry into a subroutine) and data about these events are collected.
  - The collected information shows the exact execution frequencies.
  - They have a substantial run-time overhead and memory requirement.
Information kept:
- Trace-based: the profiler keeps all the information it collects.
- Reductionist: only statistical information is collected.
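An event-based, reductionist profiler can be sketched as a wrapper that records every entry into a routine and keeps only summary counters. This is a minimal illustration, not how any particular profiler works; all names (`profile_entry`, `profiled_call`, `busy_work`, `demo_event_counts`) are hypothetical.

```c
#include <time.h>

typedef struct {
    const char   *name;    /* routine being profiled */
    unsigned long calls;   /* exact number of entries (the event count) */
    clock_t       ticks;   /* CPU time accumulated inside the routine */
} profile_entry;

/* Event: entry into fn. Count it and charge its CPU time to e.
 * Only the running totals are kept, not a trace of each call. */
void profiled_call(profile_entry *e, void (*fn)(void))
{
    clock_t start = clock();
    fn();
    e->ticks += clock() - start;
    e->calls += 1;
}

/* Placeholder workload. */
void busy_work(void)
{
    volatile int i;
    for (i = 0; i < 1000; i++)
        ;
}

/* Profile n calls of busy_work and return the recorded call count. */
unsigned long demo_event_counts(int n)
{
    profile_entry e = { "busy_work", 0UL, 0 };
    for (int k = 0; k < n; k++)
        profiled_call(&e, busy_work);
    return e.calls;
}
```

The call count is exact, as the event-based description states, and the two extra `clock()` calls per entry are an example of the run-time overhead such instrumentation adds.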
25
Performance Evaluation
Performance evaluation steps:
1. Measurement / Prediction
   - What to measure? How to measure?
   - Modeling for prediction
     - Simulation
     - Analytical modeling
       - Queuing theory
2. Analysis & Reporting
   - Performance metrics
26
Predicting Performance
- Performance of simple kernels can be predicted to a high degree.
- Theoretical performance and peak performance must be close.
- It is preferred that the measured performance is over 80% of the theoretical peak performance.
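The 80% guideline amounts to a simple ratio check. A minimal sketch, assuming measured and peak performance are given in the same unit (for example operations per second); the names `fraction_of_peak` and `meets_peak_target` are hypothetical.

```c
/* Measured performance as a fraction of the theoretical peak. */
double fraction_of_peak(double measured, double peak)
{
    return measured / peak;
}

/* Nonzero if the measured performance is over 80% of the peak. */
int meets_peak_target(double measured, double peak)
{
    return fraction_of_peak(measured, peak) > 0.8;
}
```

A kernel measured at 850 units against a peak of 1000 passes the check, while one measured at 700 does not.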
27
Homework 1
Write a C program to measure the execution time (elapsed time) of an addition operation (i.e. a=b+c). Run your program on both Windows and Linux systems. Use a timer that has at least µs resolution.
Prepare a one-page report and explain the following:
- Your method to measure time
- Your code
- Specifications of the system on which you ran your code (processor, clock speed, etc.)
- Your measurement results
- Comments on your results