1
Moore's law
Gordon Moore, cofounder of Intel
1965: transistors per chip double every year
After 1970: transistors per chip double every 1.5 years
2
Growth in CPU transistor count
3
Consequences of Moore's law
The cost of a chip stays essentially unchanged as density grows, so the cost per transistor goes down.
Electrical path lengths are shortened, which increases operating speed.
Computers become smaller.
Power requirements are reduced.
More circuitry on each chip means fewer inter-chip connections, which makes the system more reliable.
4
Chap. 4 The Role of Performance
Jen-Chang Liu, Spring 2005
5
Hardware performance is often key to the effectiveness of an entire system of hardware and software. What do we mean by saying one computer has better performance than another?
6
Example: performance of airplanes
7
Performance of a hardware system
What do we mean by better performance? Higher speed?
Response time (execution time): the time between the start and the completion of a task, i.e. the time needed to finish the job.
Throughput: the total amount of work done in a given time, i.e. the work completed per unit of time. Ex.: a multi-user system.
8
Performance measure
Quantitative relation between performance and execution time on machine X:
$$\text{Performance}_X = \frac{1}{\text{Execution time}_X}$$
Relative performance:
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = n$$
Machine A is n times faster than machine B.
Ex.: machine A runs a program in 10 s and machine B runs the same program in 15 s:
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{15}{10} = 1.5$$
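The definition above can be checked with a few lines of Python. This is only a sketch; the function names `performance` and `relative_performance` are illustrative and do not come from the slides.

```python
def performance(execution_time_s: float) -> float:
    """Performance is defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def relative_performance(time_a_s: float, time_b_s: float) -> float:
    """Return n such that machine A is n times faster than machine B."""
    return performance(time_a_s) / performance(time_b_s)


# Slide example: machine A takes 10 s, machine B takes 15 s.
n = relative_performance(10.0, 15.0)
print(f"A is {n:.1f} times faster than B")
```

Because performance is a reciprocal, the ratio of performances equals the inverse ratio of execution times.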
9
Problems with the previous definition of performance
How should execution time be defined? What if multiple tasks run concurrently?
Which programs should be used to evaluate the performance of a computer?
10
Execution time?
Definition of execution time from the user's viewpoint: the total time to complete a task (response time, elapsed time).
This includes disk access, memory access, I/O, OS overhead, …
In a timeshared system such as Unix, a processor works on several programs.
[Figure: timeline of Program A, swap, Program B, I/O, Program A; the whole span is the response time for A]
11
CPU time
CPU execution time does not include time spent waiting for I/O or running other programs.
CPU execution time = user CPU time + system CPU time
User CPU time: CPU time spent in the program itself.
System CPU time: CPU time spent in the OS on behalf of our program.
12
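Example: CPU time
Unix command: time
Output: 90.7u 12.9s 2:39 65%
90.7u is the user CPU time (s), 12.9s is the system CPU time (s), 2:39 is the elapsed time (159 s), and 65% is the share of the elapsed time spent as CPU time:
$$\frac{90.7 + 12.9}{159} \approx 0.65$$
The following discussion focuses on CPU performance, i.e. user CPU time.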
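The percentage reported by `time` can be reproduced from the other three fields. The sketch below assumes the elapsed time is given as a `min:sec` string as in the slide; the helper names are hypothetical.

```python
def elapsed_to_seconds(elapsed: str) -> float:
    """Convert an elapsed-time string such as '2:39' (min:sec) to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_fraction(user_s: float, system_s: float, elapsed: str) -> float:
    """Fraction of the elapsed time spent as CPU time (user + system)."""
    return (user_s + system_s) / elapsed_to_seconds(elapsed)


# Values from the slide: 90.7u 12.9s 2:39
frac = cpu_fraction(90.7, 12.9, "2:39")
print(f"CPU time is {frac:.0%} of the elapsed time")
```

The remaining share of the elapsed time went to I/O waits and to other programs.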
13
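Unit of time
Time is measured in seconds or in clock cycles.
Ex.: clock cycle time = 2 ns, so
$$\text{Clock rate} = \frac{1}{2 \times 10^{-9}\ \text{s}} = 500\ \text{MHz}$$
CPU time for a program:
$$\text{CPU time} = \text{CPU clock cycles for a program} \times \text{Clock cycle time}$$
$$\text{CPU time} = \text{Instructions for a program} \times \text{CPI} \times \text{Clock cycle time}$$
where CPI is the average number of clock cycles per instruction.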
14
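Example 1
Machines A and B have the same ISA and run the same program.
Machine A: clock cycle time = 1 ns, CPI = 2
Machine B: clock cycle time = 2 ns, CPI = 1.2
With instruction count I and times in ns:
$$\text{CPU time}_A = \text{Inst. count} \times \text{CPI} \times \text{Clock cycle time} = I \times 2 \times 1 = 2I$$
$$\text{CPU time}_B = I \times 1.2 \times 2 = 2.4I$$
$$\frac{\text{Performance}_A}{\text{Performance}_B} = \frac{\text{Execution time}_B}{\text{Execution time}_A} = \frac{2.4I}{2I} = 1.2$$
A is 1.2 times faster than B.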
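Example 1 can be reproduced numerically. In this sketch the instruction count is an arbitrary assumed value, since it cancels in the ratio; `cpu_time_ns` is an illustrative name.

```python
def cpu_time_ns(instruction_count: float, cpi: float, clock_cycle_ns: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * clock_cycle_ns


# Assumed instruction count; any positive value gives the same ratio.
instruction_count = 1_000_000

time_a = cpu_time_ns(instruction_count, cpi=2.0, clock_cycle_ns=1.0)
time_b = cpu_time_ns(instruction_count, cpi=1.2, clock_cycle_ns=2.0)

# Relative performance of A over B is the inverse ratio of execution times.
speedup = time_b / time_a
print(f"A is {speedup:.1f} times faster than B")
```

A lower CPI alone does not make B faster, because its clock cycle is twice as long.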
15
Quiz 5/9
Program P runs in 10 s on computer A, which has a 4 GHz clock.
We want to run program P in 6 s. We design computer B with a faster clock rate, but it requires 1.2 times as many clock cycles as computer A.
What clock rate should computer B use?
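One way to work the quiz is to count clock cycles first and then divide by the target time. The sketch below follows the relation CPU time = clock cycles / clock rate; the function name and parameters are illustrative.

```python
def required_clock_rate_hz(time_a_s: float, clock_a_hz: float,
                           cycle_factor: float, target_time_s: float) -> float:
    """Clock rate needed by B to finish in target_time_s.

    cycles_A = time_A x clock_rate_A
    cycles_B = cycle_factor x cycles_A
    clock_rate_B = cycles_B / target_time
    """
    cycles_a = time_a_s * clock_a_hz
    cycles_b = cycle_factor * cycles_a
    return cycles_b / target_time_s


# Quiz values: 10 s at 4 GHz on A, 1.2x the cycles on B, target 6 s.
rate_b = required_clock_rate_hz(10.0, 4e9, 1.2, 6.0)
print(f"Computer B needs a {rate_b / 1e9:.1f} GHz clock")
```

Under these numbers B needs twice the clock rate of A to gain a 10/6 speedup, because part of the faster clock is spent on the extra cycles.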
16
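Example 2
A compiler generates two different code sequences.

Instruction class    A   B   C
CPI                  1   2   3

Instruction counts   A   B   C   Total inst.
Code 1               2   1   2   5
Code 2               4   1   1   6

$$\text{CPU clock cycles}_1 = 2 \times 1 + 1 \times 2 + 2 \times 3 = 10\ \text{cycles}$$
$$\text{CPU clock cycles}_2 = 4 \times 1 + 1 \times 2 + 1 \times 3 = 9\ \text{cycles}$$
Which is faster? Code 2, even though it executes more instructions.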
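The cycle counts in Example 2 are weighted sums of instruction counts by per-class CPI. A short sketch, with dictionary layout and names chosen only for illustration:

```python
# Cycles per instruction for each instruction class (from the slide).
CPI = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts: dict) -> int:
    """Sum of (instruction count x CPI) over all instruction classes."""
    return sum(counts[cls] * CPI[cls] for cls in counts)


code1 = {"A": 2, "B": 1, "C": 2}
code2 = {"A": 4, "B": 1, "C": 1}

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    instructions = sum(counts.values())
    cycles = total_cycles(counts)
    # Average CPI of the sequence = total cycles / total instructions.
    print(f"{name}: {instructions} instructions, {cycles} cycles, "
          f"average CPI {cycles / instructions:.2f}")
```

Code 2 has the lower average CPI (1.5 versus 2.0), which more than offsets its extra instruction.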
17
Short conclusion
Computer performance (software and hardware)
  Response time
    CPU time
      Instruction count
      CPI
      Clock cycle length
    I/O, other programs
How do we optimize them in a hardware design?
18
Problems with the previous definition of performance
How should execution time be defined? What if multiple tasks run concurrently?
Which programs should be used to evaluate the performance of a computer?
19
Choose programs to evaluate performance
Benchmarks: programs chosen to measure performance.
SPEC (System Performance Evaluation Cooperative) suite of benchmarks
Started in 1989
http://open.specbench.org/
SPEC95, used in the textbook, has been retired …
Each SPECx release contains a set of benchmark programs.
20
SPEC – money …
21
SPEC95 benchmarks
Integer benchmarks: written in C
Floating-point benchmarks: written in Fortran 77
22
Summarize performance
Which is faster?

                    Computer A   Computer B
Program 1 (sec)              1           10
Program 2 (sec)           1000          100
Total time (sec)          1001          110

$$\frac{\text{Performance}_B}{\text{Performance}_A} = \frac{\text{Execution time}_A}{\text{Execution time}_B} = \frac{1001}{110} = 9.1$$
* Assumes the programs occur with equal probability.
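The comparison by total execution time can be sketched as follows, using the times from the table. The per-program ratios are printed as well, since they point in opposite directions; variable names are illustrative.

```python
# Execution times in seconds for [Program 1, Program 2] (from the table).
times = {"A": [1, 1000], "B": [10, 100]}

# Total time weights every program equally (each runs once).
totals = {machine: sum(t) for machine, t in times.items()}
ratio = totals["A"] / totals["B"]
print(f"Totals: {totals}; B is {ratio:.1f} times faster than A overall")

# Per-program view: the faster machine depends on the program.
for i, (t_a, t_b) in enumerate(zip(times["A"], times["B"]), start=1):
    faster = "A" if t_a < t_b else "B"
    factor = max(t_a, t_b) / min(t_a, t_b)
    print(f"Program {i}: {faster} is {factor:.0f} times faster")
```

Total time is dominated by Program 2, which is why the equal-probability assumption matters for the summary.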
23
SPEC ratio
The execution time of each benchmark program is normalized, i.e. compared to that of a baseline system.
$$\text{SPEC ratio} = \frac{\text{Exec. time on Sun SPARCstation 10/40}}{\text{Exec. time on the measured machine}}$$
Summary measures: SPECint95, SPECfp95
SPECint95 = geometric mean of the SPEC ratios
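The normalization and the geometric mean can be sketched as below. The execution times here are made-up illustrative numbers, not real SPEC95 results, and the function names are hypothetical.

```python
import math


def spec_ratios(reference_times, measured_times):
    """SPEC ratio per benchmark = reference time / measured time."""
    return [ref / meas for ref, meas in zip(reference_times, measured_times)]


def geometric_mean(values):
    """n-th root of the product, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical times in seconds for three benchmarks (NOT real SPEC data).
reference = [100.0, 200.0, 400.0]   # baseline machine
measured = [50.0, 50.0, 50.0]       # machine under test

ratios = spec_ratios(reference, measured)
print("SPEC ratios:", ratios)
print(f"Geometric mean: {geometric_mean(ratios):.2f}")
```

A useful property of the geometric mean is that the ranking of two machines does not depend on which machine is chosen as the baseline.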
24
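Example: SPECint95 for Pentium and Pentium Pro
[Figure: SPECint95 results for Pentium and Pentium Pro, with points 1 and 2 marked]
Performance improvement from point 1 to point 2: clock rate ×2, SPECint ×1.7?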
25
Amdahl's law in computing
$$\text{CPU time for a program} = \text{CPU clock cycles for a program} \times \frac{1}{\text{Clock rate}}$$
By this formula, doubling the clock rate should halve the CPU time.
* Improving one aspect of a machine does not increase performance by the same ratio (partial improvement), as in the previous example.
* Ex.: the bottleneck in the memory system does not improve.
$$\text{Exec. time after improvement} = \frac{\text{Exec. time affected by improvement}}{\text{Amount of improvement}} + \text{Exec. time unaffected}$$
26
Example: Amdahl's law
A program takes 100 s to run: 20% multiplication, 50% memory operations, 30% others.
What is the speedup if multiplication becomes 4 times faster?
$$\frac{100}{20/4 + 50 + 30} = 1.18$$
What is the speedup if memory access becomes 2 times faster?
$$\frac{100}{20 + 50/2 + 30} = 1.33$$
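Both speedups follow from the execution-time form of Amdahl's law. A minimal sketch, with illustrative function names:

```python
def time_after_improvement(affected_s: float, unaffected_s: float,
                           improvement: float) -> float:
    """Exec. time after = affected time / improvement + unaffected time."""
    return affected_s / improvement + unaffected_s


def speedup(total_s: float, affected_s: float, improvement: float) -> float:
    """Overall speedup when only `affected_s` of the total is improved."""
    unaffected_s = total_s - affected_s
    return total_s / time_after_improvement(affected_s, unaffected_s, improvement)


# Slide example: 100 s total, 20 s multiplication, 50 s memory operations.
print(f"Multiply 4x faster: speedup {speedup(100.0, 20.0, 4.0):.2f}")
print(f"Memory 2x faster:   speedup {speedup(100.0, 50.0, 2.0):.2f}")
```

The smaller improvement factor gives the larger overall speedup here because it applies to a larger share of the execution time.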
27
MIPS as a measurement (not good …)
MIPS = million instructions per second
Does a higher MIPS mean a faster machine?
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$$
Pitfalls:
MIPS cannot be used to compare computers with different instruction sets, because the instruction counts differ.
MIPS varies between programs on the same computer, so there is no single MIPS rating for a machine.
28
Example: MIPS?
A 500 MHz machine; two compilers generate code for the same source program.

Instruction class    A   B   C
CPI                  1   2   3

Instruction count (×10^9) for each instruction class
          A    B    C
Code 1    5    1    1
Code 2   10    1    1
29
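Example: MIPS?
$$\text{Exec. time}_1 = \frac{(5 \times 1 + 1 \times 2 + 1 \times 3) \times 10^{9}\ \text{cycles}}{500 \times 10^{6}\ \text{cycles/sec}} = 20\ \text{sec}$$
$$\text{Exec. time}_2 = \frac{(10 \times 1 + 1 \times 2 + 1 \times 3) \times 10^{9}\ \text{cycles}}{500 \times 10^{6}\ \text{cycles/sec}} = 30\ \text{sec}$$
$$\text{MIPS}_1 = \frac{\text{Inst. count}}{\text{Exec. time} \times 10^{6}} = \frac{(5 + 1 + 1) \times 10^{9}}{20 \times 10^{6}} = 350$$
$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^{9}}{30 \times 10^{6}} = 400$$
Exec. time 1 < Exec. time 2, yet MIPS 1 < MIPS 2: the code with the higher MIPS rating is the slower one.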
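The MIPS pitfall can be reproduced with the numbers from the example. This is a sketch; the constants come from the slides, while the function and variable names are illustrative.

```python
CLOCK_HZ = 500e6                    # 500 MHz machine
CPI = {"A": 1, "B": 2, "C": 3}      # cycles per instruction, per class


def exec_time_s(counts: dict) -> float:
    """Execution time = total clock cycles / clock rate."""
    cycles = sum(counts[cls] * CPI[cls] for cls in counts)
    return cycles / CLOCK_HZ


def mips(counts: dict) -> float:
    """MIPS = instruction count / (execution time x 10^6)."""
    return sum(counts.values()) / (exec_time_s(counts) * 1e6)


# Instruction counts per class for the code from each compiler.
code1 = {"A": 5e9, "B": 1e9, "C": 1e9}
code2 = {"A": 10e9, "B": 1e9, "C": 1e9}

for name, counts in (("Code 1", code1), ("Code 2", code2)):
    print(f"{name}: {exec_time_s(counts):.0f} s, {mips(counts):.0f} MIPS")
```

Code 2 earns the higher MIPS rating only because it executes many more cheap class-A instructions, while taking 10 s longer to finish.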