Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.

2 Chapter Overview Topics covered: How to measure, report, and summarize the performance of a computer The major factors that determine the performance of a computer

3 Chapter Overview Topics covered by section: Section 2.1: A description of the different ways in which performance can be determined Section 2.2: A description of the metrics for measuring performance from the viewpoint of both a computer user and a designer Section 2.3: A look at how these metrics are related and a presentation of the classical processor performance equation

4 Chapter Overview Topics covered by section: Section 2.4: A description of how best to choose benchmarks to evaluate machines Section 2.5: A description of how to accurately summarize the performance of a group of programs Section 2.6: A description of one set of commonly used CPU benchmarks; an examination of measurements for a variety of Intel processors using those benchmarks Section 2.7: An examination of some of the many pitfalls that have trapped designers and those who analyze and report performance

5 Defining Computer Performance As an individual user –The faster of two computers is the one that finishes running your program first. –Reducing response time, or execution time, the time between the start and completion of a task, is your main interest.

6 Defining Computer Performance As a computer center manager servicing many users –The faster of two timeshared computers is the one that completes the most jobs in one day. –Increasing throughput, the total amount of work done in a given time, is your main interest.

7 Timesharing, or Multitasking, Operating System A timesharing operating system is a sophisticated system that allows many users to “share” the computer simultaneously. How it works: –Users access the computer through terminals. –Each user has a separate program, or job, in memory.

8 Timesharing, or Multitasking, Operating System How it works, continued: –The computer starts to execute one user’s job. –When that job needs to wait (for keyboard input, for example), the operating system switches to and executes another user’s job. –When the second job needs to wait, the CPU is switched to another job, and so on.

9 Timesharing, or Multitasking, Operating System How it works, continued: –This switching takes place so quickly that the CPU may execute a portion of each user’s job several times per second. –It appears, therefore, that multiple programs are running simultaneously.

10 Example Problem (p. 56) Do the following changes to a computer system increase throughput, decrease response time, or both? 1) Replacing the processor in a computer with a faster version 2) Adding additional processors to a system that uses multiple processors for separate tasks (e.g., an airline reservation system)

11 Solution to Problem Case 1: –Both response time and throughput are improved. Case 2: –Only throughput increases, since no one task gets work done faster. –However, if job requests queue up, increasing throughput could also improve response time, since it would reduce the waiting time in the queue.
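
The two cases can be illustrated with a small queueing sketch. It is a deliberately simplified model and is not taken from the text: the function `simulate`, the job count, and the service times are assumptions chosen for illustration, and every job is assumed to be waiting in the queue at time zero.

```python
import heapq

def simulate(num_jobs, service_time, processors):
    """Model num_jobs identical jobs, all queued at t = 0.

    Each job needs `service_time` seconds on one processor.
    Returns (throughput in jobs per second, mean response time in seconds).
    """
    free_at = [0.0] * processors      # time at which each processor becomes free
    heapq.heapify(free_at)
    finish_times = []
    for _ in range(num_jobs):
        start = heapq.heappop(free_at)    # earliest available processor
        finish = start + service_time
        finish_times.append(finish)
        heapq.heappush(free_at, finish)
    throughput = num_jobs / max(finish_times)
    mean_response = sum(finish_times) / num_jobs
    return throughput, mean_response

baseline = simulate(num_jobs=8, service_time=1.0, processors=1)
faster_cpu = simulate(num_jobs=8, service_time=0.5, processors=1)   # case 1
more_cpus = simulate(num_jobs=8, service_time=1.0, processors=2)    # case 2

for name, (tput, resp) in [("baseline", baseline),
                           ("faster processor", faster_cpu),
                           ("second processor", more_cpus)]:
    print(f"{name}: {tput:.2f} jobs/s, mean response {resp:.2f} s")
```

In this model both changes double throughput, but only the faster processor shortens the time a single job spends executing. The second processor improves mean response time purely by shortening the wait in the queue.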

12 The Relationship Between Performance and Response Time, or Execution Time, for Some Task To maximize performance, we must minimize execution time. For a machine X: $\text{Performance}_X = \dfrac{1}{\text{Execution time}_X}$

13 The Relationship Between Performance and Response Time, or Execution Time, for Some Task For a machine Y: $\text{Performance}_Y = \dfrac{1}{\text{Execution time}_Y}$

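14 The Relationship Between Performance and Response Time, or Execution Time, for Some Task For two machines X and Y, if X is faster than Y, then: $\text{Performance}_X > \text{Performance}_Y$ Substituting performance = 1 / execution time, we get: $\dfrac{1}{\text{Execution time}_X} > \dfrac{1}{\text{Execution time}_Y}$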

15 The Relationship Between Performance and Response Time, or Execution Time, for Some Task Or, equivalently: $\text{Execution time}_Y > \text{Execution time}_X$ Therefore, the execution time on Y is greater than that on X, if X is faster than Y.

16 Relating the Performance and Execution Time of Two Different Machines Quantitatively The phrase “X is n times faster than Y” means: $\dfrac{\text{Performance}_X}{\text{Performance}_Y} = n$

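17 Relating the Performance and Execution Time of Two Different Machines Quantitatively If X is n times faster than Y, then the execution time on Y is n times longer than it is on X: $\dfrac{\text{Performance}_X}{\text{Performance}_Y} = \dfrac{\text{Execution time}_Y}{\text{Execution time}_X} = n$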

18 Example Problem (p. 57) If machine A runs a program in 10 seconds and machine B runs the same program in 15 seconds, how much faster is A than B? Your answer?
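
One way to check an answer to this problem is to apply the definition directly: "X is n times faster than Y" means n equals the execution time on Y divided by the execution time on X. The helper name `times_faster` below is an assumption made for this sketch.

```python
def times_faster(time_x_s, time_y_s):
    """Return n such that X is n times faster than Y.

    n = Performance_X / Performance_Y = Execution time_Y / Execution time_X
    """
    return time_y_s / time_x_s

# Machine A runs the program in 10 s, machine B in 15 s.
n = times_faster(time_x_s=10.0, time_y_s=15.0)
print(f"A is {n:.1f} times faster than B")
```

Running the sketch prints `A is 1.5 times faster than B`.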

19 Related Problems to Do for Homework On page 90 of your text: –Exercises 2.1, 2.2, and 2.5 Moving right along...

20 Terminology Used When Comparing Machine Performance We will use the terminology faster than (and avoid the use of slower than) when comparing machines quantitatively. We will say “improve performance” or “improve execution time” when we mean “increase performance” and “decrease execution time”.

22 Varying Definitions of Time Time is the measure of computer performance. Time, however, can be measured in different ways, depending on what we count. –Response time, or elapsed time The total time to complete a task –CPU execution time, or CPU time The time the CPU spends computing for this task Response time includes everything: disk accesses, memory accesses, I/O activities, etc.; CPU time excludes time spent waiting for I/O or running other programs.

23 CPU Time CPU time can be further divided into: –User CPU time The CPU time spent in the program –System CPU time The CPU time spent in the operating system performing tasks on behalf of the program

24 Program Execution Time When system CPU time is ignored (e.g., when comparing the performance of machines with different operating systems): –Program execution time = user CPU time When system CPU time is not ignored: –Program execution time = user CPU time + system CPU time

25 Terminology Used When Measuring Performance System performance –Refers to elapsed time CPU performance –Refers to user CPU time
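
The distinction between elapsed time, user CPU time, and system CPU time can be observed on a running program. The sketch below uses Python's standard `time.perf_counter()` and `os.times()`; the helper `measure` and the workload `busy_work` are assumptions made for illustration, and the numbers reported will vary from machine to machine and may be coarse on some systems.

```python
import os
import time

def measure(func):
    """Run func() once and return (elapsed, user_cpu, system_cpu) in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    func()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    elapsed = wall_end - wall_start                     # response (elapsed) time
    user_cpu = cpu_end.user - cpu_start.user            # CPU time spent in the program
    system_cpu = cpu_end.system - cpu_start.system      # CPU time spent in the OS for it
    return elapsed, user_cpu, system_cpu

def busy_work():
    """A purely computational, hypothetical workload."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total

elapsed, user_cpu, system_cpu = measure(busy_work)
print(f"elapsed time:    {elapsed:.4f} s")
print(f"user CPU time:   {user_cpu:.4f} s")
print(f"system CPU time: {system_cpu:.4f} s")
```

For a compute-bound workload like this one, user CPU time should account for most of the elapsed time; a workload that waits on I/O would show a larger gap between the two.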

26 A Different Metric for Measuring Performance Computer users like to think about performance in terms of time. Computer designers, on the other hand, may prefer to think about performance in terms of how fast the hardware can perform basic functions.

27 The System Clock The system clock runs at a constant rate, thus establishing the speed at which the computer can transport data and execute instructions. It emits a pulse at regular intervals; each interval is referred to as a clock cycle, or tick. The clock period, or clock cycle time, is the time required for one complete clock cycle; it is typically measured in nanoseconds. The clock speed, or clock rate, is the number of clock cycles per second; it is the inverse of the clock period and is typically measured in megahertz. For example, a 400 MHz clock has a clock period of 2.5 ns.

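29 Relating Clock Cycles and Clock Cycle Time to CPU Execution Time $\text{CPU execution time} = \text{CPU clock cycles} \times \text{Clock cycle time}$ Alternatively, because clock rate and clock cycle time are inverses: $\text{CPU execution time} = \dfrac{\text{CPU clock cycles}}{\text{Clock rate}}$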

30 Relating Clock Cycles and Clock Cycle Time to CPU Execution Time It can be deduced from looking at this formula that performance can be improved by: –Reducing the length of the clock cycle, or clock cycle time –Reducing the number of clock cycles required for a program

31 Example Problem A program runs in 10 seconds on computer A, which has a 400 MHz clock. We want to build a machine B, that will run this program in 6 seconds. It has been determined that an increase in clock rate is possible, but this increase will cause machine B to require 1.2 times as many clock cycles as machine A for this program. What clock rate should we tell the designer to aim for? Your solution?
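
One possible way to work this problem uses the relation CPU execution time = CPU clock cycles / clock rate. The function and parameter names below are assumptions made for this sketch.

```python
def required_clock_rate_hz(time_a_s, clock_rate_a_hz, cycle_factor, target_time_b_s):
    """Clock rate B needs in order to finish in target_time_b_s seconds.

    cycles_A = time_A * rate_A
    cycles_B = cycle_factor * cycles_A
    rate_B   = cycles_B / time_B
    """
    cycles_a = time_a_s * clock_rate_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_b_s

# Machine A: 10 s at 400 MHz. Machine B: 6 s target, 1.2 times as many cycles.
rate_b_hz = required_clock_rate_hz(10.0, 400e6, 1.2, 6.0)
print(f"Target clock rate for B: {rate_b_hz / 1e6:.0f} MHz")
```

Machine A needs 4 × 10^9 cycles, so machine B needs 4.8 × 10^9 cycles in 6 seconds, which gives a target of 800 MHz, twice the clock rate of A.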

32 Relating the Number of Program Instructions to CPU Clock Cycles CPU execution time for a program is dependent upon the number of instructions in a program. Clock cycles per instruction (CPI) = the average number of clock cycles each instruction takes to execute, so that: $\text{CPU clock cycles} = \text{Instruction count} \times \text{CPI}$

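33 Basic Processor Performance Equation (in Terms of Instruction Count) $\text{CPU execution time} = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}$ Or: $\text{CPU execution time} = \dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$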

34 Example Problem Suppose we have two implementations of the same instruction set architecture. Machine A has a clock cycle time of 1 ns and a CPI of 2.0 for some program. Machine B has a clock cycle time of 2 ns and a CPI of 1.2 for the same program. Which machine is faster for this program, and by how much? Your solution?
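
A possible way to work this problem: because both machines implement the same instruction set architecture, they execute the same number of instructions for the program, so it is enough to compare the average time per instruction, CPI × clock cycle time. The names in the sketch below are assumptions made for illustration.

```python
def time_per_instruction_ns(cpi, cycle_time_ns):
    """Average time per instruction = CPI * clock cycle time."""
    return cpi * cycle_time_ns

# The instruction count is the same on both machines, so it cancels in the ratio.
time_a_ns = time_per_instruction_ns(cpi=2.0, cycle_time_ns=1.0)  # machine A
time_b_ns = time_per_instruction_ns(cpi=1.2, cycle_time_ns=2.0)  # machine B

# "A is n times faster than B" means n = time_B / time_A.
speedup_a_over_b = time_b_ns / time_a_ns
print(f"A: {time_a_ns:.1f} ns per instruction, B: {time_b_ns:.1f} ns per instruction")
print(f"A is {speedup_a_over_b:.1f} times faster than B")
```

Machine A averages 2.0 ns per instruction and machine B 2.4 ns, so A is 1.2 times faster for this program even though its CPI is higher.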

35 A Reliable Measure of Computer Performance Important note: –Always keep in mind that the only complete and reliable measure of computer performance is time, and not necessarily the instruction count.

36 Determining the Values Used in the Basic Performance Equation CPU execution time: can be determined by running the program Clock rate & clock cycle time: generally included in the documentation for a machine Instruction count: can be measured using available software tools or hardware counters CPI: harder to obtain, since it depends on design details in the machine, as well as the mix of instruction types executed in an application
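
To see why CPI depends on the mix of instruction types, it can be computed as a weighted average over instruction classes. The sketch below is illustrative only: the class names, per-class cycle counts, and instruction counts are hypothetical values, not measurements of any real machine and not the figures from the textbook's examples.

```python
def average_cpi(instruction_counts, cpi_per_class):
    """Overall CPI = total clock cycles / total instructions executed."""
    total_cycles = sum(instruction_counts[c] * cpi_per_class[c]
                       for c in instruction_counts)
    total_instructions = sum(instruction_counts.values())
    return total_cycles / total_instructions

# Hypothetical per-class cycle counts for one machine.
cpi_per_class = {"alu": 1, "load_store": 2, "branch": 3}

# Two hypothetical programs with the same instruction count but different mixes.
mix_1 = {"alu": 50, "load_store": 30, "branch": 20}
mix_2 = {"alu": 80, "load_store": 15, "branch": 5}

cpi_1 = average_cpi(mix_1, cpi_per_class)
cpi_2 = average_cpi(mix_2, cpi_per_class)
print(f"CPI for mix 1: {cpi_1:.2f}")
print(f"CPI for mix 2: {cpi_2:.2f}")
```

With these assumed values the first mix gives a CPI of 1.70 and the second 1.25, although both programs execute 100 instructions on the same hardware.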

37 Example Problem (Pages 64-65) A compiler designer is trying to decide … Which code sequence executes the most instructions? Which will be faster? What is the CPI for each sequence?

39 Benchmarks Benchmarks are programs specifically chosen to measure computer performance. Benchmarks form a workload, the set of programs run. To evaluate a computer system, measurements are made of the performance of the workload on that particular machine. It is hoped that these measurements will predict the performance of a user’s actual workload.

40 The Type of Programs Used for Benchmarks The best type of programs to use for benchmarks are real applications. –Applications that a user uses regularly, or … –Applications that are typical of what a user employs, such as a compiler for a user community of software development engineers

41 Why Real Applications Are Used As Benchmarks Using real applications as benchmarks makes it harder to find trivial ways to speed up the execution of the benchmark. In addition, when techniques are found to improve the performance of a program being used as a benchmark, those techniques are more likely to help other programs.

42 Why Real Applications Aren’t Used As Benchmarks by Everyone Why use small programs as benchmarks? –They are useful in the early stages of the design process, since they are easy to compile and simulate. –They can be more easily standardized than large programs.

43 When Small Programs Should Not Be Used As Benchmarks Small programs should not be used as benchmarks to evaluate working computer systems since the resulting measurements may be misleading.

44 Reporting Performance Measurements Guidelines for writing a performance report: List everything another experimenter would need to duplicate the results. The list should include information about the following: –Operating system –Compilers –Input –Machine configuration

46 Response Time or Throughput In addition to choosing programs to use as benchmarks, we must also decide what we want to measure: 1) Response time or 2) Throughput

47 Summarizing the Performance of a Group of Benchmarks Now that all other necessary decisions have been made, we must decide how to summarize the performance of a group of benchmarks. Marketers and users often prefer to have a single number to compare performance, in spite of the fact that a single value provides less information regarding the performance measurements made.

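48 Computing Performance The simplest approach to summarizing relative performance is to use the total amount of time it took to execute the group of programs on each machine. Using the example on page 70 (total execution time of 1001 seconds on computer A and 110 seconds on computer B): –Computer B appears to be the faster machine since it has the lower total execution time. –Therefore: $\dfrac{\text{Performance}_B}{\text{Performance}_A} = \dfrac{\text{Execution time}_A}{\text{Execution time}_B} = \dfrac{1001}{110} \approx 9.1$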

49 Summarizing Performance Thus, if the workload consists of running programs 1 and 2 an equal number of times, the statement “B is 9.1 times faster than A for programs 1 and 2 together” would predict the relative execution times for this workload on each machine. This summary is directly proportional to the execution time. This relationship does not hold, however, if the programs in the workload are not each run the same number of times.

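50 Summarizing Performance Another way that a factor of 9.1 could have been arrived at would have been by computing the ratio of the average execution times for machines A and B: $\dfrac{\text{Average execution time}_A}{\text{Average execution time}_B} = \dfrac{1001 / 2}{110 / 2} = \dfrac{500.5}{55} \approx 9.1$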

51 Summarizing Performance It is a variation on this latter method of computation that must be used when the programs comprising the workload are not each run the same number of times. The next slide shows how relative performance would be computed if runs of program 1 made up 20% of the workload, and runs of program 2 made up 80% of the workload.

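52 Summarizing Performance With program 1 taking 1 second on A and 10 seconds on B, and program 2 taking 1000 seconds on A and 100 seconds on B: $\text{Weighted average}_A = 0.2 \times 1 + 0.8 \times 1000 = 800.2$ and $\text{Weighted average}_B = 0.2 \times 10 + 0.8 \times 100 = 82$, so $\dfrac{\text{Weighted average}_A}{\text{Weighted average}_B} = \dfrac{800.2}{82} \approx 9.8$ In this case, machine B is 9.8 times faster than machine A.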
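
The equal-weight and 20%/80% summaries can be reproduced with a weighted arithmetic mean of execution times. The per-program times in this sketch (1 s and 1000 s on A, 10 s and 100 s on B) are recalled from the textbook example and should be treated as assumptions; they are consistent with the factors of 9.1 and 9.8 quoted on the slides. The function name `weighted_mean_time` is also an assumption.

```python
def weighted_mean_time(times_s, weights):
    """Weighted arithmetic mean of execution times; weights should sum to 1."""
    return sum(weights[program] * times_s[program] for program in times_s)

# Assumed execution times in seconds for the two programs on each machine.
times_a = {"program 1": 1.0, "program 2": 1000.0}
times_b = {"program 1": 10.0, "program 2": 100.0}

equal_weights = {"program 1": 0.5, "program 2": 0.5}
skewed_weights = {"program 1": 0.2, "program 2": 0.8}

ratio_equal = (weighted_mean_time(times_a, equal_weights)
               / weighted_mean_time(times_b, equal_weights))
ratio_skewed = (weighted_mean_time(times_a, skewed_weights)
                / weighted_mean_time(times_b, skewed_weights))

print(f"Equal weights: B is {ratio_equal:.1f} times faster than A")
print(f"20%/80% weights: B is {ratio_skewed:.1f} times faster than A")
```

With equal weights the ratio matches the ratio of total execution times; changing the weights changes the summary, which is why the workload mix has to be stated alongside any single-number comparison.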
