CS252/Patterson Lec 1.1 1/17/01 CMPUT429/CMPE382 Winter 2001 Topic2: Technology Trend and Cost/Performance (Adapted from David A. Patterson’s CS252 lecture.

Slides:



Advertisements
Similar presentations
Performance Evaluation of Architectures Vittorio Zaccaria.
Advertisements

Slide 1 Fundamentals of Computer Design CSCE430/830 Computer Architecture Instructor: Hong Jiang Courtesy of Prof. Yifeng U. of Maine Fall, 2007.
1 CIS775: Computer Architecture Chapter 1: Fundamentals of Computer Design.
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
100 Performance ENGR 3410 – Computer Architecture Mark L. Chang Fall 2006.
2-1 ECE 361 ECE C61 Computer Architecture Lecture 2 – performance Prof. Alok N. Choudhary
Computer Organization and Architecture 18 th March, 2008.
ENGS 116 Lecture 21 Performance and Quantitative Principles Vincent H. Berk September 26 th, 2008 Reading for today: Chapter , Amdahl article.
CIS629 Fall Lecture Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two important.
Computer Performance Evaluation: Cycles Per Instruction (CPI)
CS252/Kubiatowicz Lec 1.1 8/25/03 CS252 Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance August 25, 2003 Prof.
1  1998 Morgan Kaufmann Publishers and UCB Performance CEG3420 Computer Design Lecture 3.
CIS429.S00: Lec2- 1 Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two important quantitative.
ECE 232 L4 perform.1 Adapted from Patterson 97 ©UCBCopyright 1998 Morgan Kaufmann Publishers ECE 232 Hardware Organization and Design Lecture 4 Performance,
ENGS 116 Lecture 11 ENGS 116 / COSC 107 Computer Architecture Introduction Vincent H. Berk September 21, 2005 Reading for Friday: Chapter 1.1 – 1.4, Amdahl.
CS61C L221 Performance © UC Regents 1 CS61C - Machine Structures Lecture 22 - Introduction to Performance November 17, 2000 David Patterson
CS430 – Computer Architecture Lecture - Introduction to Performance
CIS429/529 Winter 07 - Performance - 1 Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two.
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
CMSC 611: Advanced Computer Architecture Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Lecture 2: Technology Trends and Performance Evaluation Performance definition, benchmark, summarizing performance, Amdahl’s law, and CPI.
Lecture 1: Course Introduction, Technology Trends, Performance Professor Alvin R. Lebeck Computer Science 220 Fall 2001.
Chapter 1 - Fundamentals1 Computer Architecture Chapter 1 Fundamentals Prof. Jerry Breecher CSCI 240 Fall 2001.
ECE 4436ECE 5367 Introduction to Computer Architecture and Design Ji Chen Section : T TH 1:00PM – 2:30PM Prerequisites: ECE 4436.
1 Computer Performance: Metrics, Measurement, & Evaluation.
Where Has This Performance Improvement Come From? Technology –More transistors per chip –Faster logic Machine Organization/Implementation –Deeper pipelines.
Lecture 2: Computer Performance
CENG 450 Computer Systems & Architecture Lecture 3 Amirali Baniasadi
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
B0111 Performance Anxiety ENGR xD52 Eric VanWyk Fall 2012.
PerformanceCS510 Computer ArchitecturesLecture Lecture 3 Benchmarks and Performance Metrics Lecture 3 Benchmarks and Performance Metrics.
Lecture 1: Course Introduction, Technology Trends, Performance Professor Alvin R. Lebeck Compsci 220 / ECE 252 Fall 2004 Slides based on those of: Sorin,
1 CS/EE 362 Hardware Fundamentals Lecture 9 (Chapter 2: Hennessy and Patterson) Winter Quarter 1998 Chris Myers.
1 Acknowledgements Class notes based upon Patterson & Hennessy: Book & Lecture Notes Patterson’s 1997 course notes (U.C. Berkeley CS 152, 1997) Tom Fountain.
Advanced Computer Architecture Fundamental of Computer Design Instruction Set Principles and Examples Pipelining:Basic and Intermediate Concepts Memory.
Digital System Architecture 1 28 ต.ค ต.ค ต.ค ต.ค ต.ค. 58 Lecture 2a Computer Performance and Cost Pradondet Nilagupta.
1 CS465 Performance Revisited (Chapter 1) Be able to compare performance of simple system configurations and understand the performance implications of.
1 CS/COE0447 Computer Organization & Assembly Language CHAPTER 4 Assessing and Understanding Performance.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
Computer Architecture CPSC 350
EEL5708/Bölöni Lec 1.1 August 21, 2006 Lotzi Bölöni Fall 2006 EEL 5708 High Performance Computer Architecture Lecture 1 Introduction.
Cost and Performance.
Morgan Kaufmann Publishers
S.J.Lee 1 컴퓨터 구조 강좌개요 순천향대학교 컴퓨터학부 이 상 정. S.J.Lee 2 교 재교 재 J.L.Hennessy & D.A.Patterson Computer Architecture a Quantitative Approach, Second Edition.
Performance Performance
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
EEL5708/Bölöni Lec 2.1 Fall 2004 August 27, 2004 Lotzi Bölöni Fall 2004 EEL 5708 High Performance Computer Architecture Lecture 2 Introduction: the big.
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Lec2.1 Computer Architecture Chapter 2 The Role of Performance.
Performance Analysis Topics Measuring performance of systems Reasoning about performance Amdahl’s law Systems I.
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
EEL5708/Bölöni Lec 3.1 Fall 2004 Sept 1, 2004 Lotzi Bölöni Fall 2004 EEL 5708 High Performance Computer Architecture Lecture 3 Review: Instruction Sets.
Jan. 5, 2000Systems Architecture II1 Machine Organization (CS 570) Lecture 2: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed. Oct. 4,
EEL-4713 Ann Gordon-Ross.1 EEL-4713 Computer Architecture Performance.
VU-Advanced Computer Architecture Lecture 1-Introduction 1 Advanced Computer Architecture CS 704 Advanced Computer Architecture Lecture 1.
June 20, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 1: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed.
CpE 442 Introduction to Computer Architecture The Role of Performance
How do we evaluate computer architectures?
Graduate Computer Architecture Lecture 1 Review of Technology Trends and Cost/Performance Ayman Alharbi.
Morgan Kaufmann Publishers
Computer Architecture CSCE 350
CS775: Computer Architecture
Measurement & Evaluation
Computer Architecture
Performance of computer systems
August 30, 2000 Prof. John Kubiatowicz
Performance of computer systems
A Question to Ponder On [from last lecture]
Presentation transcript:

CS252/Patterson Lec 1.1 1/17/01 CMPUT429/CMPE382 Winter 2001 Topic2: Technology Trend and Cost/Performance (Adapted from David A. Patterson’s CS252 lecture slides at Berkeley)

CS252/Patterson Lec 1.2 1/17/01 Technology Trends: Microprocessor Capacity CMOS improvements: Die size: 2X every 3 yrs Line width: halve / 7 yrs “Graduation Window” Alpha 21264: 15 million Pentium Pro: 5.5 million PowerPC 620: 6.9 million Alpha 21164: 9.3 million Sparc Ultra: 5.2 million Moore’s Law

CS252/Patterson Lec 1.3 1/17/01 Memory Capacity (Single Chip DRAM) year size(Mb)cyc time ns ns ns ns ns ns ns

CS252/Patterson Lec 1.4 1/17/01 Technology Trends (Summary) CapacitySpeed (latency) Logic2x in 3 years2x in 3 years DRAM4x in 3 years2x in 10 years Disk4x in 3 years2x in 10 years

CS252/Patterson Lec 1.5 1/17/01 Processor Performance Trends Microprocessors Minicomputers Mainframes Supercomputers Year

CS252/Patterson Lec 1.6 1/17/01 Processor Performance (1.35X before, 1.55X now) 1.54X/yr

CS252/Patterson Lec 1.7 1/17/01 Performance Trends (Summary) Workstation performance (measured in Spec Marks) improves roughly 50% per year (2X every 18 months) Improvement in cost performance estimated at 70% per year

CS252/Patterson Lec 1.8 1/17/01 Computer Architecture Topics Instruction Set Architecture Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation, Vector, DSP Addressing, Protection, Exception Handling L1 Cache L2 Cache DRAM Disks, WORM, Tape Coherence, Bandwidth, Latency Emerging Technologies Interleaving Bus protocols RAID VLSI Input/Output and Storage Memory Hierarchy Pipelining and Instruction Level Parallelism

CS252/Patterson Lec 1.9 1/17/01 Computer Architecture Topics M Interconnection Network S PMPMPMP ° ° ° Topologies, Routing, Bandwidth, Latency, Reliability Network Interfaces Shared Memory, Message Passing, Data Parallelism Processor-Memory-Switch Multiprocessors Networks and Interconnections

CS252/Patterson Lec /17/01 Course Focus Technology Programming Languages Operating Systems History Applications Interface Design (ISA) Measurement & Evaluation Parallelism Computer Architecture: Instruction Set Design Organization Hardware

CS252/Patterson Lec /17/01 Measurement Tools Benchmarks, Traces, Mixes Hardware: Cost, delay, area, power estimation Simulation (many levels) –ISA, RT, Gate, Circuit Queueing Theory Rules of Thumb Fundamental “Laws”/Principles

CS252/Patterson Lec /17/01 Which is faster? Time to run the task (ExTime) –Execution time, response time, latency Tasks per day, hour, week, sec, ns … (Performance) –Throughput, bandwidth Plane Boeing 747 BAD/Sud Concodre Speed 610 mph 1350 mph DC to Paris 6.5 hours 3 hours Passengers Throughput (pmph) 286, ,200

CS252/Patterson Lec /17/01 Definitions Performance is in units of things per sec –bigger is better If we are primarily concerned with response time –performance(x) = 1 execution_time(x) " X is n times faster than Y" means Performance(X) Execution_time(Y) n == Performance(Y) Execution_time(X)

CS252/Patterson Lec /17/01 Cycles Per Instruction IC = Instruction Count CPI = Clock Per Instruction

CS252/Patterson Lec /17/01 Cycles Per Instruction We may separate the contribution of each type of instruction to the execution time defining:

CS252/Patterson Lec /17/01 Example: Calculating CPI Typical Mix of instruction types in program Base Machine (Reg / Reg) OpFreqCyclesCPI(i)(% Time) ALU50%1.5(33%) Load20%2.4(27%) Store10%2.2(13%) Branch20%2.4(27%) 1.5

CS252/Patterson Lec /17/01 Aspects of CPU Performance (CPU Law) CPU time= Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle CPU time= Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle Inst Count CPIClock Rate Program X Compiler X (X) Inst. Set. X X Organization X X Technology X

CS252/Patterson Lec /17/01 Amdahl's Law Speedup due to enhancement E: Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected

CS252/Patterson Lec /17/01 Amdahl’s Law

CS252/Patterson Lec /17/01 Amdahl’s Law Example: Floating point instructions improved to run 2X; but only 10% of actual instructions are FP

CS252/Patterson Lec /17/01 Metrics of Performance Compiler Programming Language Application Datapath Control TransistorsWiresPins ISA Function Units (millions) of Instructions per second: MIPS (millions) of (FP) operations per second: MFLOP/s Cycles per second (clock rate) Megabytes per second Answers per month Operations per second

CS252/Patterson Lec /17/01 SPEC: System Performance Evaluation Cooperative First Round 1989 –10 programs yielding a single number (“SPECmarks”) Second Round 1992 –SPECInt92 (6 integer programs) and SPECfp92 (14 floating point programs) »Compiler Flags unlimited. March 93 of DEC 4000 Model 610: spice: unix.c:/def=(sysv,has_bcopy,”bcopy(a,b,c)= memcpy(b,a,c)” wave5: /ali=(all,dcom=nat)/ag=a/ur=4/ur=200 nasa7: /norecu/ag=a/ur=4/ur2=200/lc=blas Third Round 1995 –new set of programs: SPECint95 (8 integer programs) and SPECfp95 (10 floating point) –“benchmarks useful for 3 years” –Single flag setting for all programs: SPECint_base95, SPECfp_base95

CS252/Patterson Lec /17/01 How to Summarize Performance Arithmetic mean (weighted arithmetic mean) tracks execution time: (T i )/n or (W i *T i ) Harmonic mean (weighted harmonic mean) of rates (e.g., MFLOPS) tracks execution time: n/(1/R i ) or n/(W i /R i ) Normalized execution time is handy for scaling performance (e.g., X times faster than SPARCstation 10) But do not take the arithmetic mean of normalized execution time, use the geometric i )^1/n)

CS252/Patterson Lec /17/01 Performance Evaluation “For better or worse, benchmarks shape a field” Good products created when have: –Good benchmarks –Good ways to summarize performance Given sales is a function in part of performance relative to competition, investment in improving product as reported by performance summary If benchmarks/summary inadequate, then choose between improving product for real programs vs. improving product to get more sales; Sales almost always wins! Execution time is the measure of computer performance!

CS252/Patterson Lec /17/01 Instruction Set Architecture (ISA) instruction set software hardware

CS252/Patterson Lec /17/01 Interface Design A good interface: Lasts through many implementations (portability, compatability) Is used in many differeny ways (generality) Provides convenient functionality to higher levels Permits an efficient implementation at lower levels Interface imp 1 imp 2 imp 3 use time

CS252/Patterson Lec /17/01 Summary, #1 Designing to Last through Trends CapacitySpeed Logic2x in 3 years2x in 3 years DRAM4x in 3 years2x in 10 years Disk4x in 3 years2x in 10 years 6yrs to graduate => 16X CPU speed, DRAM/Disk size Time to run the task –Execution time, response time, latency Tasks per day, hour, week, sec, ns, … –Throughput, bandwidth “X is n times faster than Y” means ExTime(Y) Performance(X) = ExTime(X)Performance(Y)

CS252/Patterson Lec /17/01 Summary, #2 Amdahl’s Law: CPI Law: Execution time is the REAL measure of computer performance! Good products created when have: –Good benchmarks, good ways to summarize performance Die Cost goes roughly with die area 4 Can PC industry support engineering/research investment? Speedup overall = ExTime old ExTime new = 1 (1 - Fraction enhanced ) + Fraction enhanced Speedup enhanced CPU time= Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle CPU time= Seconds = Instructions x Cycles x Seconds Program Program Instruction Cycle