1
ENGS 116 Lecture 2
Performance and Quantitative Principles
Vincent H. Berk
September 26th, 2008
Reading for today: Chapter 1.1 - 1.4, Amdahl article
Reading for Monday: Chapter 1.5 - 1.11, Mazor article
Homework for Wednesday: 1.1, 1.3, 1.6, 1.7, 1.13
2
Review
Task of Computer Designers
– Determine which attributes are important for a new machine
– Design a machine to maximize performance without violating cost/power/functionality constraints
3 Components of “Architecture”
– Instruction set design
– Organization
– Hardware
3
Benchmarking Games
Different configurations used to run the same workload on two systems.
Compiler customized to optimize the workload.
Workload arbitrarily picked to skew results.
Test specification written to be biased toward one machine.
4
Design benchmarks for:
– Industrial and design
– Consumer electronics
– Networking, routers
– Office applications
– Telecommunications
– Weapon systems
5
Execution time
Weighted arithmetic mean: the sum, over all programs run, of each program's execution time multiplied by its relative frequency.
Normalized execution time: pick a reference machine, set its execution time to 1, and express every other machine's execution time relative to it.
Geometric mean of normalized execution times: the ratio between two machines' geometric means is the same whichever reference machine is chosen, so the reference becomes irrelevant and the ratios can be compared directly.
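The three summaries above can be sketched in a few lines of Python. The machine names, execution times, and equal weights below are hypothetical values chosen only for illustration; they do not come from the lecture.

```python
import math

# Hypothetical execution times in seconds (illustrative numbers, not from the lecture).
# Each list holds the times for program 1 and program 2 on that machine.
times = {
    "A": [1.0, 1000.0],
    "B": [10.0, 100.0],
    "C": [20.0, 20.0],
}
weights = [0.5, 0.5]  # assumed relative frequencies of the two programs


def weighted_arithmetic_mean(ts, ws):
    """Sum of each program's time multiplied by its relative frequency."""
    return sum(t * w for t, w in zip(ts, ws))


def normalized(ts, reference):
    """Execution times divided, program by program, by the reference machine's."""
    return [t / r for t, r in zip(ts, reference)]


def geometric_mean(xs):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def gm_normalized(machine, reference):
    """Geometric mean of one machine's times normalized to a reference machine."""
    return geometric_mean(normalized(times[machine], times[reference]))


for name, ts in times.items():
    print(name, weighted_arithmetic_mean(ts, weights), gm_normalized(name, "A"))

# The C-to-B ratio is the same whichever machine serves as the reference.
ratio_ref_a = gm_normalized("C", "A") / gm_normalized("B", "A")
ratio_ref_b = gm_normalized("C", "B") / gm_normalized("B", "B")
print(ratio_ref_a, ratio_ref_b)
```

With these made-up numbers the weighted arithmetic mean ranks the machines by total time, while the geometric mean only carries information about ratios. That is why the choice of reference cancels out.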
6
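Amdahl’s Law
$\text{Execution time after improvement} = \dfrac{\text{Execution time affected by improvement}}{\text{Amount of improvement}} + \text{Execution time unaffected}$
Make the common case fast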
7
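Amdahl’s Law
Speedup due to enhancement E:
$\text{Speedup}(E) = \dfrac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \dfrac{\text{Performance w/ } E}{\text{Performance w/o } E}$
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected:
$\text{ExTime}(E) = \text{ExTime w/o } E \times \left[ (1 - F) + \dfrac{F}{S} \right]$
$\text{Speedup}(E) = \dfrac{1}{(1 - F) + \dfrac{F}{S}}$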
8
Amdahl’s Law
9
Amdahl’s Law
Example: Floating point instructions improved to run 2X as fast, but only 10% of actual instructions are FP
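A minimal sketch of this example, assuming the 10% figure can be read as the fraction of execution time spent in floating point work (Amdahl's Law is stated in terms of time, not instruction counts). The function name `amdahl_speedup` is chosen here for illustration.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the original time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Floating point work: assumed 10% of execution time, made 2X as fast.
fp_speedup = amdahl_speedup(0.10, 2.0)

# Upper bound: even an infinitely fast FP unit leaves the other 90% untouched.
fp_limit = 1.0 / (1.0 - 0.10)

print(f"speedup with 2X FP: {fp_speedup:.3f}")
print(f"limit as factor grows: {fp_limit:.3f}")
```

Under that assumption the overall gain is about 5%, and it can never exceed about 11% however fast the floating point unit becomes.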
10
Corollary: Make The Common Case Fast
All instructions require an instruction fetch; only a fraction require a data fetch/store.
– Optimize instruction access over data access
Programs exhibit locality.
– Spatial locality
– Temporal locality
Access to small memories is faster.
– Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories.
Storage hierarchy: Disk/Tape, Memory, Cache, Registers
11
Metrics of Performance
System levels shown: Application, Programming Language, Compiler, ISA, Datapath, Control, Function Units, Transistors, Wires, Pins
Metrics:
– Answers per month
– Operations per second
– Millions of instructions per second: MIPS
– Millions of FP operations per second: MFLOPS
– Megabytes per second
– Cycles per second (clock rate)
12
Marketing Metrics
Machines with different instruction sets?
Programs with different instruction mixes?
– Dynamic frequency of instructions
Uncorrelated with performance
Machine dependent
Often not where time is spent
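The point about different instruction sets can be illustrated with a small sketch. It assumes the usual definition of MIPS as instruction count divided by execution time in microseconds; the two machines and all of their numbers are hypothetical.

```python
def exec_time(instr_count, cpi, clock_hz):
    """Execution time in seconds: instructions x cycles per instruction / clock rate."""
    return instr_count * cpi / clock_hz


def mips(instr_count, seconds):
    """Millions of instructions executed per second."""
    return instr_count / (seconds * 1e6)


# Hypothetical machines running the same program (numbers invented for illustration).
# Machine B has a richer instruction set: fewer instructions, but more cycles each.
machines = {
    "A": {"instr_count": 10e9, "cpi": 1.0, "clock_hz": 1e9},
    "B": {"instr_count": 4e9, "cpi": 2.0, "clock_hz": 1e9},
}

results = {}
for name, m in machines.items():
    seconds = exec_time(**m)
    results[name] = (seconds, mips(m["instr_count"], seconds))
    print(f"{name}: {seconds:.1f} s, {results[name][1]:.0f} MIPS")
```

In this made-up case machine A reports the higher MIPS rating yet takes longer to finish the program, which is the sense in which the metric can be uncorrelated with performance.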
13
Aspects of CPU Performance
                   Instr. Count   CPI   Clock Rate
Program
Compiler
Instruction Set
Organization
Technology
14
Aspects of CPU Performance
                   Instr. Count   CPI   Clock Rate
Program                 X
Compiler                X         (X)
Instruction Set         X          X
Organization                       X        X
Technology                                  X
15
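Average Cycles per Instruction
Cycles Per Instruction:
$\text{CPI} = \dfrac{\text{CPU Time} \times \text{Clock Rate}}{\text{Instruction Count}} = \dfrac{\text{Cycles}}{\text{Instruction Count}}$
$\text{CPU time} = \text{Cycle Time} \times \sum_{i=1}^{n} \text{CPI}_i \times I_i$, where $I_i$ is the number of executed instructions of type $i$
Instruction Frequency:
$\text{CPI} = \sum_{i=1}^{n} \text{CPI}_i \times F_i$, where $F_i = \dfrac{I_i}{\text{Instruction Count}}$
Invest resources where time is spent!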
16
Example: Calculating CPI
Base Machine (Reg / Reg), Typical Mix
Op       Freq   Cycles   CPI(i)   (% Time)
ALU      50%    1        .5       (33%)
Load     20%    2        .4       (27%)
Store    10%    2        .2       (13%)
Branch   20%    2        .4       (27%)
Total                    1.5
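The table can be reproduced with a short sketch that weights each operation's cycle count by its frequency. The frequencies and cycle counts are taken from the slide; the variable names are chosen here for illustration.

```python
# (frequency, cycles) per operation class, from the base machine table.
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}

# Contribution of each class to the average CPI: frequency x cycles.
cpi_part = {op: freq * cycles for op, (freq, cycles) in mix.items()}

# Average CPI is the sum of the per-class contributions.
cpi = sum(cpi_part.values())

# Share of execution time spent in each class.
time_share = {op: part / cpi for op, part in cpi_part.items()}

print(f"CPI = {cpi:.1f}")
for op in mix:
    print(f"{op:7s} CPI(i) = {cpi_part[op]:.1f}  time = {time_share[op]:.0%}")
```

The time shares, not the raw frequencies, show where optimization effort pays off: loads and branches are each 20% of the instructions but 27% of the time.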
17
Example
Want to add register / memory operations:
– One source operand in memory
– One source operand in register
– Cycle count of 2
Side effect: Branch cycle count will increase to 3.
What fraction of the loads must be eliminated for this to pay off?
Base Machine (Reg / Reg)
Op       Freq   Cycles
ALU      50%    1
Load     20%    2
Store    10%    2
Branch   20%    2
18
Example Solution
$\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$ (Clock = clock cycle time)
           Old                       New
Op         Freq   Cycles   CPI       Freq   Cycles   CPI
ALU        .50    1        .5
Load       .20    2        .4
Store      .10    2        .2
Branch     .20    2        .4
Reg/Mem
Total      1.00            1.5
19
Example Solution
$\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
           Old                       New
Op         Freq   Cycles   CPI       Freq     Cycles   CPI
ALU        .50    1        .5        .5 - X   1        .5 - X
Load       .20    2        .4        .2 - X   2        .4 - 2X
Store      .10    2        .2        .1       2        .2
Branch     .20    2        .4        .2       3        .6
Reg/Mem                              X        2        2X
Total      1.00            1.5       1 - X             (1.7 - X) / (1 - X)
CPI New must be normalized to the new instruction frequency: the new cycle total, 1.7 - X, is divided by the new instruction total, 1 - X.
20
Example Solution
$\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
(Same table as slide 19.)
$\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$
21
Example Solution
$\text{Exec Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock}$
(Same table as slide 19.)
$\text{Instr Cnt}_{\text{Old}} \times \text{CPI}_{\text{Old}} \times \text{Clock}_{\text{Old}} = \text{Instr Cnt}_{\text{New}} \times \text{CPI}_{\text{New}} \times \text{Clock}_{\text{New}}$
The clock is unchanged, so it cancels on both sides:
$1.00 \times 1.5 = (1 - X) \times \dfrac{1.7 - X}{1 - X}$
$1.5 = 1.7 - X$
$0.2 = X$
Loads are only 0.20 of the original instructions, so X = 0.2 means ALL loads must be eliminated for this to be a win!