August 30, 2000 Prof. John Kubiatowicz

Slides:



Advertisements
Similar presentations
Performance Evaluation of Architectures Vittorio Zaccaria.
Advertisements

2-1 ECE 361 ECE C61 Computer Architecture Lecture 2 – performance Prof. Alok N. Choudhary
Chapter 1 CSF 2009 Computer Performance. Defining Performance Which airplane has the best performance? Chapter 1 — Computer Abstractions and Technology.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
ENGS 116 Lecture 21 Performance and Quantitative Principles Vincent H. Berk September 26 th, 2008 Reading for today: Chapter , Amdahl article.
CIS629 Fall Lecture Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two important.
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Aug 26, 2002.
Computer Performance Evaluation: Cycles Per Instruction (CPI)
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Sep 1, 2004 Lecture 3 (continuation of Lecture 2)
CS430 – Computer Architecture Lecture - Introduction to Performance
CIS429/529 Winter 07 - Performance - 1 Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two.
Lecture 2: Benchmarks, Performance Metrics, Cost, Instruction Set Architecture Professor Alvin R. Lebeck Computer Science 220 Fall 2001.
1 Computer Performance: Metrics, Measurement, & Evaluation.
Where Has This Performance Improvement Come From? Technology –More transistors per chip –Faster logic Machine Organization/Implementation –Deeper pipelines.
Lecture 2: Computer Performance
CENG 450 Computer Systems & Architecture Lecture 3 Amirali Baniasadi
Memory/Storage Architecture Lab Computer Architecture Performance.
Operation Frequency No. of Clock cycles ALU ops % 1 Loads 25% 2
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
PerformanceCS510 Computer ArchitecturesLecture Lecture 3 Benchmarks and Performance Metrics Lecture 3 Benchmarks and Performance Metrics.
CDA 3101 Fall 2013 Introduction to Computer Organization Computer Performance 28 August 2013.
Digital System Architecture 1 28 ต.ค ต.ค ต.ค ต.ค ต.ค. 58 Lecture 2a Computer Performance and Cost Pradondet Nilagupta.
1 Seoul National University Performance. 2 Performance Example Seoul National University Sonata Boeing 727 Speed 100 km/h 1000km/h Seoul to Pusan 10 hours.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
CS252/Patterson Lec 1.1 1/17/01 CMPUT429/CMPE382 Winter 2001 Topic2: Technology Trend and Cost/Performance (Adapted from David A. Patterson’s CS252 lecture.
Cost and Performance.
Morgan Kaufmann Publishers
Performance Enhancement. Performance Enhancement Calculations: Amdahl's Law The performance enhancement possible due to a given design improvement is.
Performance Performance
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Performance 9 ways to fool the public Old Chapter 4 New Chapter 1.4.
Chapter 1 — Computer Abstractions and Technology — 1 Uniprocessor Performance Constrained by power, instruction-level parallelism, memory latency.
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
Jan. 5, 2000Systems Architecture II1 Machine Organization (CS 570) Lecture 2: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed. Oct. 4,
Computer Architecture CSE 3322 Web Site crystal.uta.edu/~jpatters/cse3322 Send to Pramod Kumar, with the names and s.
Chapter 1 Performance & Technology Trends. Outline What is computer architecture? Performance What is performance: latency (response time), throughput.
June 20, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 1: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed.
Computer Architecture & Operations I
CpE 442 Introduction to Computer Architecture The Role of Performance
Measuring Performance II and Logic Design
CSCI206 - Computer Organization & Programming
Computer Architecture & Operations I
CS161 – Design and Architecture of Computer Systems
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
ECE 4100/6100 Advanced Computer Architecture Lecture 1 Performance
Performance Performance The CPU Performance Equation:
Review: Instruction Set Evolution
How do we evaluate computer architectures?
Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance Ayman Alharbi.
Lecture 4: Performance (conclusion) & Instruction Set Architecture
Lecture 2: Intro to Computer Architecture
Uniprocessor Performance
Morgan Kaufmann Publishers
CSCE 212 Chapter 4: Assessing and Understanding Performance
Computer Architecture CSCE 350
Chapter 1 Fundamentals of Computer Design
CS2100 Computer Organisation
CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof. John Kubiatowicz
Measurement & Evaluation
Chapter 1 Computer Abstractions & Technology Performance Evaluation
CSCI206 - Computer Organization & Programming
Performance of computer systems
Arrays versus Pointers
Performance of computer systems
A Question to Ponder On [from last lecture]
CS 704 Advanced Computer Architecture
Parameters that affect it How to improve it and by how much
CS2100 Computer Organisation
Presentation transcript:

August 30, 2000 Prof. John Kubiatowicz CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 30, 2000 Prof. John Kubiatowicz

Instruction Set Architecture (ISA) software instruction set hardware

Evolution of Instruction Sets Single Accumulator (EDSAC 1950) Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953) Separation of Programming Model from Implementation High-level Language Based Concept of a Family (B5000 1963) (IBM 360 1964) General Purpose Register Machines Complex Instruction Sets Load/Store Architecture (CDC 6600, Cray 1 1963-76) (Vax, Intel 432 1977-80) RISC (Mips,Sparc,HP-PA,IBM RS6000, . . .1987)

Amdahl's Law Speedup due to enhancement E: ExTime w/o E Performance w/ E Speedup(E) = ------------- = ------------------- ExTime w/ E Performance w/o E Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected

Amdahl’s Law Best you could ever hope to do:

Amdahl’s Law Floating point instructions improved to run 2X; but only 10% of actual instructions are FP ExTimenew = Speedupoverall =

Amdahl’s Law Floating point instructions improved to run 2X; but only 10% of actual instructions are FP ExTimenew = ExTimeold x (0.9 + .1/2) = 0.95 x ExTimeold 1 Speedupoverall = = 1.053 0.95

Aspects of CPU Performance CPU time = Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle Inst Count CPI Clock Rate Program X Compiler X (X) Inst. Set. X X Organization X X Technology X

Cycles Per Instruction (Throughput) “Average Cycles per Instruction” CPI = (CPU Time * Clock Rate) / Instruction Count = Cycles / Instruction Count “Instruction Frequency” Invest Resources where time is Spent!

Example: Calculating CPI Base Machine (Reg / Reg) Op Freq Cycles CPI(i) (% Time) ALU 50% 1 .5 (33%) Load 20% 2 .4 (27%) Store 10% 2 .2 (13%) Branch 20% 2 .4 (27%) 1.5 Typical Mix

SPEC: System Performance Evaluation Cooperative First Round 1989 10 programs yielding a single number (“SPECmarks”) Second Round 1992 SPECInt92 (6 integer programs) and SPECfp92 (14 floating point programs) Compiler Flags unlimited. March 93 of DEC 4000 Model 610: spice: unix.c:/def=(sysv,has_bcopy,”bcopy(a,b,c)= memcpy(b,a,c)” wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200 nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas Third Round 1995 new set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point) “benchmarks useful for 3 years” Single flag setting for all programs: SPECint_base95, SPECfp_base95

How to Summarize Performance Arithmetic mean (weighted arithmetic mean) tracks execution time: (Ti)/n or (Wi*Ti) Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time: n/(1/Ri) or n/(Wi/Ri) Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10) But do not take the arithmetic mean of normalized execution time, use the geometric mean: (  Tj / Nj )1/n

Performance Evaluation “For better or worse, benchmarks shape a field” Good products created when have: Good benchmarks Good ways to summarize performance Given sales is a function in part of performance relative to competition, investment in improving product as reported by performance summary If benchmarks/summary inadequate, then choose between improving product for real programs vs. improving product to get more sales; Sales almost always wins! Execution time is the measure of computer performance!

Integrated Circuits Costs Die Cost goes roughly with die area4

Real World Examples Chip Metal Line Wafer Defect Area Dies/ Yield Die Cost layers width cost /cm2 mm2 wafer 386DX 2 0.90 $900 1.0 43 360 71% $4 486DX2 3 0.80 $1200 1.0 81 181 54% $12 PowerPC 601 4 0.80 $1700 1.3 121 115 28% $53 HP PA 7100 3 0.80 $1300 1.0 196 66 27% $73 DEC Alpha 3 0.70 $1500 1.2 234 53 19% $149 SuperSPARC 3 0.70 $1700 1.6 256 48 13% $272 Pentium 3 0.80 $1500 1.5 296 40 9% $417 From "Estimating IC Manufacturing Costs,” by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15