EECE476: Computer Architecture Lecture 11: Understanding and Assessing Performance Chapter 4.1, 4.2 The University of British Columbia EECE 476 © 2005 Guy.


1 EECE476: Computer Architecture
Lecture 11: Understanding and Assessing Performance
Chapter 4.1, 4.2
The University of British Columbia, EECE 476, © 2005 Guy Lemieux

2 Questions
Why do we want high performance?
How do you measure performance?
Why measure it?

3 Measurement and Evaluation
Architecture is an iterative process, searching the space of possible designs, at all levels of computer systems.
[Figure labels: Design, Analysis, Creativity, Cost / Performance Analysis; Good Ideas, Mediocre Ideas, Bad Ideas]
We need a way to measure performance so we can find the good ideas!!!

4 Performance Trends
[Figure: log of performance vs. year, 1970 to 1995, with series for microprocessors, minicomputers, mainframes, and supercomputers]

5 Performance! How did we obtain this?
[Figure: performance vs. year]

6 How to Obtain Performance? Through Transistors?
[Figure; source: Intel]

7 How to Obtain Performance? Through Clock Speed?
[Figure: clock speed of Intel CPUs (cycles per second) by year]

8 How to Obtain Performance? Through Power?

9 Performance
How to obtain performance?
–We can’t really answer this until we understand how to measure performance!
How to measure performance?
–This is a fundamental question!
–Buying a Car: Top Speed? Fuel Economy? Range? Turning radius?
–Buying a Computer: Clock Speed? Power? Battery Life? Boot-up time?

10 Airplanes! Which has Greater Performance?

Airplane          Passenger Capacity (ppl)   Range (km)   Speed (km/h)   Throughput (ppl*km/h)
Boeing 777        375                        7,450        980            367,500
Boeing 747        470                        6,680        980            460,600
Concorde          132                        6,440        2,170          286,440
Douglas DC-8-50   146                        14,030       875            127,750
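The throughput column in the airplane table is capacity multiplied by speed. A minimal Python sketch, with an illustrative dictionary layout and helper names that are not part of the lecture, recomputes it and shows that the "best" airplane depends on which metric is chosen:

```python
# Values copied from the airplane table; the field names are illustrative.
AIRPLANES = {
    "Boeing 777":      {"capacity": 375, "range_km": 7450,  "speed_kmh": 980},
    "Boeing 747":      {"capacity": 470, "range_km": 6680,  "speed_kmh": 980},
    "Concorde":        {"capacity": 132, "range_km": 6440,  "speed_kmh": 2170},
    "Douglas DC-8-50": {"capacity": 146, "range_km": 14030, "speed_kmh": 875},
}


def throughput(plane):
    """Passenger throughput in ppl*km/h: capacity times speed."""
    return plane["capacity"] * plane["speed_kmh"]


def best_by(metric):
    """Return the name of the airplane that maximizes the given metric."""
    return max(AIRPLANES, key=lambda name: metric(AIRPLANES[name]))


if __name__ == "__main__":
    print("Highest throughput:", best_by(throughput))
    print("Highest speed:", best_by(lambda p: p["speed_kmh"]))
    print("Longest range:", best_by(lambda p: p["range_km"]))
```

Each metric picks a different winner, which is why a single number rarely settles the question.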

11 Performance: Two Fundamental Concepts
1. Throughput (aka bandwidth)
–Total amount of work done in a given time
–Examples: Boeing 747; laundromat with many washers & dryers
–Important for computer data centres
2. Response time (aka latency)
–Time from start to end of a given task
–Examples: Concorde; one fast, modern laundry machine at home
–Important for personal computers
Which is more important for this course?
–Mostly response time!
–Better response time usually implies higher throughput (but not vice versa)

12 Defining Performance (Response Time)
Given a computer architecture X, define: Performance_X = 1 / ExecutionTime_X
Suppose X is “faster” than Y: Performance_X > Performance_Y
Implies: 1 / ExecutionTime_X > 1 / ExecutionTime_Y, or ExecutionTime_Y > ExecutionTime_X

13 Relative Performance
X is n times faster than Y means: n = Performance_X / Performance_Y = ExecutionTime_Y / ExecutionTime_X
Example: how much faster is A than B? Machine A: 10 seconds. Machine B: 15 seconds. 15/10 = 1.5. Hence, A is 1.5 times faster than B.
Try to be clear: IMPROVE performance, don’t increase it!!!
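The relative-performance definition can be checked with a few lines of Python. This is only a sketch; the function names are illustrative and not from the lecture.

```python
def performance(execution_time):
    """Performance is defined as the reciprocal of execution time."""
    return 1.0 / execution_time


def times_faster(time_x, time_y):
    """How many times faster X is than Y: n = ExecutionTime_Y / ExecutionTime_X."""
    return time_y / time_x


if __name__ == "__main__":
    # Machine A takes 10 seconds, Machine B takes 15 seconds.
    print("A is", times_faster(10, 15), "times faster than B")
```

Dividing the two performance values gives the same ratio as dividing the execution times in the opposite order.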

14 Measuring Execution Time
Three possible ways of measuring response time:
1. Wall-clock Time
–Start to finish, includes everything (e.g., other programs, I/O)
–Very non-deterministic!
2. CPU Time (System + User)
–User Time = your program (directly)
–System Time = in OS on behalf of your program (excludes I/O)
–System Time difficult to ascertain, other programs may affect it
–Can vary greatly, depending on quality of OS!
–Non-deterministic!
3. CPU Time (User only)
–User’s program, excluding I/O, excluding OS
–Fairly deterministic
Which is better? Either 2 or 3 …
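As a rough illustration of the three measurements, the Python sketch below times a function using the standard library: `time.perf_counter()` for wall-clock time and `os.times()` for the user and system CPU time of the current process. The `measure` helper and the `busy_sum` workload are hypothetical names introduced for this example, and the timer resolution depends on the operating system.

```python
import os
import time


def busy_sum(n):
    """A small CPU-bound workload used only for demonstration."""
    return sum(i * i for i in range(n))


def measure(func, *args):
    """Run func(*args) and report wall-clock, user, and system time in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    result = func(*args)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return {
        "result": result,
        "wall": wall_after - wall_before,             # 1. wall-clock time
        "user": cpu_after.user - cpu_before.user,     # 3. user CPU time only
        "system": cpu_after.system - cpu_before.system,
    }


if __name__ == "__main__":
    timing = measure(busy_sum, 1_000_000)
    print("wall-clock:   ", timing["wall"])
    print("user + system:", timing["user"] + timing["system"])  # 2. CPU time
    print("user only:    ", timing["user"])
```

Repeating the run a few times typically shows the wall-clock figure moving around more than the user-time figure, which matches the determinism ranking above.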

15 Poor Choices for Performance Metrics
Why is each of these metrics bad?
–Number of instructions: static instruction count, dynamic instruction count. How much work is done in each instruction?
–Number of instructions per second. MIPS: millions of instructions per second. MIPS: meaningless indicator of processor speed (!)
–Number of clock cycles
–Clock speed (clock rate)
Taken together, we may have something here…
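One way to see why MIPS misleads: the Python sketch below compares two hypothetical machines running the same program. The instruction counts and run times are invented for illustration and do not come from the lecture. Machine A executes many simple instructions; Machine B executes fewer, more capable ones.

```python
def mips(instruction_count, seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (seconds * 1_000_000)


# Hypothetical figures for the SAME program compiled for two machines.
MACHINE_A = {"instructions": 8_000_000_000, "seconds": 4}
MACHINE_B = {"instructions": 3_000_000_000, "seconds": 3}

if __name__ == "__main__":
    for name, m in (("A", MACHINE_A), ("B", MACHINE_B)):
        print(name, "MIPS:", mips(m["instructions"], m["seconds"]),
              "time (s):", m["seconds"])
```

Under these assumed numbers, A has the higher MIPS rating yet B finishes the program sooner, because MIPS says nothing about how much work each instruction does.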

16 Performance Equation (1)
Simplified version: CPUTime = #ClockCycles * CycleTime = #ClockCycles / ClockRate
#ClockCycles encapsulates two things:
–Number of instructions in a program
–Complexity of each instruction
CycleTime = 1 / ClockRate
Clock Rate is the clock speed (in MHz or GHz) of the CPU
Cycle Time is the clock period (in ns) of the CPU
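The simplified equation translates directly into code. The Python sketch below uses illustrative function names and a made-up cycle count to show that the two forms of the equation agree.

```python
def cycle_time(clock_rate_hz):
    """CycleTime = 1 / ClockRate, in seconds."""
    return 1.0 / clock_rate_hz


def cpu_time(clock_cycles, clock_rate_hz):
    """CPUTime = #ClockCycles / ClockRate, in seconds."""
    return clock_cycles / clock_rate_hz


if __name__ == "__main__":
    cycles = 7_200_000_000   # hypothetical cycle count for some program
    rate = 3_600_000_000     # 3.6 GHz
    print("via ClockRate:", cpu_time(cycles, rate))
    print("via CycleTime:", cycles * cycle_time(rate))
```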

17 Processor Speed
Which is faster?
A) 3.6 GHz Pentium 4
B) 2.0 GHz Pentium M

18 CycleTime
CycleTime == clock period
3.6 GHz Pentium 4 processor is fast!
–0.2778 ns cycle time
–SPECint_base2000 benchmark: 1510 (15.1 times faster than UltraSPARC)
–http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040621-03127.html
2.0 GHz Pentium M processor is faster!
–0.5 ns cycle time
–SPECint_base2000 benchmark: 1528 (15.28 times faster than UltraSPARC)
–http://www.spec.org/cpu2000/results/res2004q2/cpu2000-20040614-03081.html
Huh???? Clock speed alone is not a good indicator of processor speed
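The cycle times and the benchmark comparison on this slide can be reproduced with a short Python sketch. The clock rates and SPECint_base2000 scores are the ones quoted on the slide; the data layout and function names are illustrative.

```python
# name: (clock rate in GHz, SPECint_base2000 score), as quoted on the slide.
PROCESSORS = {
    "Pentium 4": (3.6, 1510),
    "Pentium M": (2.0, 1528),
}


def cycle_time_ns(clock_rate_ghz):
    """Clock period in nanoseconds: 1 / (rate in GHz)."""
    return 1.0 / clock_rate_ghz


def clock_ratio(a, b):
    """How many times higher the clock rate of a is than that of b."""
    return PROCESSORS[a][0] / PROCESSORS[b][0]


def benchmark_ratio(a, b):
    """How many times higher the benchmark score of a is than that of b."""
    return PROCESSORS[a][1] / PROCESSORS[b][1]


if __name__ == "__main__":
    for name, (ghz, score) in PROCESSORS.items():
        print(name, "cycle time (ns):", round(cycle_time_ns(ghz), 4),
              "score:", score)
    print("P4 clock advantage:", clock_ratio("Pentium 4", "Pentium M"))
    print("PM benchmark advantage:", benchmark_ratio("Pentium M", "Pentium 4"))
```

The Pentium 4 has the much higher clock rate, yet the Pentium M posts the slightly higher benchmark score.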

19 3.6 GHz Pentium 4

20 2.0 GHz Pentium M

21 #ClockCycles (part 1)
Could assume each instruction takes one cycle:
[Figure: timeline of 1st, 2nd, 3rd, 4th, 5th, 6th, ... instructions along a time axis]
This assumption is incorrect: different instructions take different amounts of time on different machines. Why? Hint: these are machine instructions, not lines of C code.

22 #ClockCycles (part 2)
Reality: each instruction can take a different number of cycles!
1. Multiplication is slower than addition
2. Floating point operations are slower than integer operations
3. Accessing memory is slower than accessing registers
Important point: changing the cycle time often changes the number of cycles required for various instructions (more later)
[Figure: instruction timeline along a time axis]

23 #ClockCycles (part 3)
MIPS or InstrCount alone is meaningless
#ClockCycles alone is meaningless
CycleTime alone is meaningless
… need to tie all three together…
InstrCount (instructions per program)
CPI (cycles per instruction)
CycleTime (time per cycle)

24 Performance Equation (2)
Put the pieces together… CPUTime = InstrCount * CPI * CycleTime
Dimensional analysis
–Check the units… time/prog = (instr/prog) * (cycle/instr) * (time/cycle)
–The instr and cycle units cancel, leaving time/prog
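A small Python sketch of this equation follows. The instruction count and CPI values are hypothetical, chosen only to show how a lower clock rate can still win when CPI is lower; they are not measurements of any real processor.

```python
def cpu_time(instr_count, cpi, cycle_time_s):
    """CPUTime = InstrCount * CPI * CycleTime, in seconds."""
    return instr_count * cpi * cycle_time_s


# Hypothetical machines running the same 1-billion-instruction program.
INSTR_COUNT = 1_000_000_000
FAST_CLOCK = {"cpi": 2.0, "cycle_time_s": 1 / 3.6e9}   # 3.6 GHz, assumed CPI
SLOW_CLOCK = {"cpi": 1.1, "cycle_time_s": 1 / 2.0e9}   # 2.0 GHz, assumed CPI

if __name__ == "__main__":
    for name, m in (("3.6 GHz", FAST_CLOCK), ("2.0 GHz", SLOW_CLOCK)):
        print(name, "CPU time (s):",
              cpu_time(INSTR_COUNT, m["cpi"], m["cycle_time_s"]))
```

With these assumed CPIs, the 2.0 GHz machine finishes marginally sooner, which is the kind of outcome that clock rate alone cannot predict.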

25 Performance Equation (3)
Full version: CPUTime = Σ_i (InstrCount_i * CPI_i) * CycleTime
InstrCount_i: count of instructions of type i
CPI_i: cycles per instruction of type i
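The full version sums over instruction types. The Python sketch below uses an invented instruction mix; the type names, counts, and CPI values are assumptions for illustration only.

```python
# Hypothetical instruction mix: type -> (InstrCount_i, CPI_i).
MIX = {
    "alu":    (500_000_000, 1),
    "memory": (300_000_000, 3),
    "branch": (200_000_000, 2),
}


def total_cycles(mix):
    """Sum over instruction types of InstrCount_i * CPI_i."""
    return sum(count * cpi for count, cpi in mix.values())


def cpu_time(mix, cycle_time_s):
    """CPUTime = sum_i(InstrCount_i * CPI_i) * CycleTime, in seconds."""
    return total_cycles(mix) * cycle_time_s


def average_cpi(mix):
    """Overall CPI: total cycles divided by total instruction count."""
    return total_cycles(mix) / sum(count for count, _ in mix.values())


if __name__ == "__main__":
    cycle = 0.5e-9  # 0.5 ns, i.e. a 2.0 GHz clock
    print("total cycles:", total_cycles(MIX))
    print("average CPI: ", average_cpi(MIX))
    print("CPU time (s):", cpu_time(MIX, cycle))
```

The average CPI computed here is the single CPI value that would make the simpler three-term equation give the same answer for this mix.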

26 Quickie Quiz
Give the 2 most important concepts of performance measurement
Give 3 ways of measuring performance
Explain what is wrong with the following performance metrics:
–Instructions per second
–Clock speed
–Cycles per instruction
–Number of transistors
–Power
What performance metric is used in this course?
What is the performance equation? What does it mean? Why is it used?

