June 2005Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation.

Slides:



Advertisements
Similar presentations
Computer Organization Lab 1 Soufiane berouel. Formulas to Remember CPU Time = CPU Clock Cycles x Clock Cycle Time CPU Clock Cycles = Instruction Count.
Advertisements

CS1104: Computer Organisation School of Computing National University of Singapore.
Performance What differences do we see in performance? Almost all computers operate correctly (within reason) Most computers implement useful operations.
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
100 Performance ENGR 3410 – Computer Architecture Mark L. Chang Fall 2006.
Jan. 2011Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation.
Oct. 2014Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation.
Computer Organization and Architecture 18 th March, 2008.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
June 2005Computer Architecture, Background and MotivationSlide Signals, Logic Operators, and Gates Figure 1.1 Some basic elements of digital logic.
Chapter 4 Assessing and Understanding Performance Bo Cheng.
Jan. 2007Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation.
Performance D. A. Patterson and J. L. Hennessey, Computer Organization & Design: The Hardware Software Interface, Morgan Kauffman, second edition 1998.
Feb 11, 2009 Lecture 3 sequential circuit design (example) computer system performance measures (Ch 4 of [P], Ch1 of [HP]) instruction set architecture.
Assessing and Understanding Performance B. Ramamurthy Chapter 4.
1 Lecture 11: Digital Design Today’s topics:  Evaluating a system  Intro to boolean functions.
Computer Architecture Lecture 2 Instruction Set Principles.
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.
Chapter 4 Assessing and Understanding Performance
1 Lecture 10: FP, Performance Metrics Today’s topics:  IEEE 754 representations  FP arithmetic  Evaluating a system Reminder: assignment 4 due in a.
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
CPU Performance Assessment As-Bahiya Abu-Samra *Moore’s Law *Clock Speed *Instruction Execution Rate - MIPS - MFLOPS *SPEC Speed Metric *Amdahl’s.
INTRODUCTION Jehan-François Pâris An evolving field Computer architectures keep changing –Building faster computers Supercomputers and.
Computer Organization and Design Performance Montek Singh Mon, April 4, 2011 Lecture 13.
1 Computer Performance: Metrics, Measurement, & Evaluation.
Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation.
I Background Provide motivation, paint the big picture, introduce tools: Review components used in building digital circuits Present an overview of computer.
B0111 Performance Anxiety ENGR xD52 Eric VanWyk Fall 2012.
CDA 3101 Fall 2013 Introduction to Computer Organization Computer Performance 28 August 2013.
10/19/2015Erkay Savas1 Performance Computer Architecture – CS401 Erkay Savas Sabanci University.
1 CS/EE 362 Hardware Fundamentals Lecture 9 (Chapter 2: Hennessy and Patterson) Winter Quarter 1998 Chris Myers.
1 Acknowledgements Class notes based upon Patterson & Hennessy: Book & Lecture Notes Patterson’s 1997 course notes (U.C. Berkeley CS 152, 1997) Tom Fountain.
Performance.
1. 2 Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft.
Computer Performance Computer Engineering Department.
1 CS465 Performance Revisited (Chapter 1) Be able to compare performance of simple system configurations and understand the performance implications of.
1 CS/COE0447 Computer Organization & Assembly Language CHAPTER 4 Assessing and Understanding Performance.
FAMU-FSU College of Engineering 1 Computer Architecture EEL 4713/5764, Spring 2006 Dr. Michael Frank Module #5 – Computer Performance.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
1 COMS 361 Computer Organization Title: Performance Date: 10/02/2004 Lecture Number: 3.
1  1998 Morgan Kaufmann Publishers How to measure, report, and summarize performance (suorituskyky, tehokkuus)? What factors determine the performance.
Performance Performance
1 Lecture 2: Performance, MIPS ISA Today’s topics:  Performance equations  MIPS instructions Reminder: canvas and class webpage:
September 10 Performance Read 3.1 through 3.4 for Wednesday Only 3 classes before 1 st Exam!
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
4. Performance 4.1 Introduction 4.2 CPU Performance and Its Factors
Lec2.1 Computer Architecture Chapter 2 The Role of Performance.
L12 – Performance 1 Comp 411 Computer Performance He said, to speed things up we need to squeeze the clock Study
EGRE 426 Computer Organization and Design Chapter 4.
Performance Computer Organization II 1 Computer Science Dept Va Tech January 2009 © McQuain & Ribbens Defining Performance Which airplane has.
CSE 340 Computer Architecture Summer 2016 Understanding Performance.
June 20, 2001Systems Architecture II1 Systems Architecture II (CS ) Lecture 1: Performance Evaluation and Benchmarking * Jeremy R. Johnson Wed.
Performance. Moore's Law Moore's Law Related Curves.
Measuring Performance II and Logic Design
CSCI206 - Computer Organization & Programming
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
How do we evaluate computer architectures?
Defining Performance Which airplane has the best performance?
CSCE 212 Chapter 4: Assessing and Understanding Performance
CS2100 Computer Organisation
Part I Background and Motivation
CSCI206 - Computer Organization & Programming
CMSC 611: Advanced Computer Architecture
CMSC 611: Advanced Computer Architecture
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
Computer Performance Read Chapter 4
Performance.
Computer Organization and Design Chapter 4
CS2100 Computer Organisation
Presentation transcript:

June 2005Computer Architecture, Background and MotivationSlide 1 Part I Background and Motivation

June 2005Computer Architecture, Background and MotivationSlide 2 I Background and Motivation Topics in This Part Chapter 1 Combinational Digital Circuits Chapter 2 Digital Circuits with Memory Chapter 3 Computer System Technology Chapter 4 Computer Performance Provide motivation, paint the big picture, introduce tools: Review components used in building digital circuits Present an overview of computer technology Understand the meaning of computer performance ( or why a 2 GHz processor isn’t 2  as fast as a 1 GHz model)

June 2005Computer Architecture, Background and MotivationSlide 3 4 Computer Performance Performance is key in design decisions; also cost and power It has been a driving force for innovation Isn’t quite the same as speed (higher clock rate) Topics in This Chapter 4.1 Cost, Performance, and Cost/Performance 4.2 Defining Computer Performance 4.3 Performance Enhancement and Amdahl’s Law 4.4 Performance Measurement vs Modeling 4.5 Reporting Computer Performance 4.6 The Quest for Higher Performance

June 2005Computer Architecture, Background and MotivationSlide Cost, Performance, and Cost/Performance Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of cited range of values. AircraftPassengers Range (km) Speed (km/h) Price ($M) Airbus A Boeing Boeing Boeing Concorde DC

June 2005Computer Architecture, Background and MotivationSlide 5 Cost Effectiveness: Cost/Performance Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of cited range of values. AircraftPassen- gers Range (km) Speed (km/h) Price ($M) A B B B Concorde DC Cost / Performance Throughput (M P km/hr) Smaller values better Larger values better

June 2005Computer Architecture, Background and MotivationSlide 6 Different Views of performance Performance from the viewpoint of a passenger: Speed Note, however, that flight time is but one part of total travel time. Also, if the travel distance exceeds the range of a faster plane, a slower plane may be better due to not needing a refueling stop Performance from the viewpoint of an airline: Throughput Measured in passenger-km per hour (relevant if ticket price were proportional to distance traveled, which in reality is not) Airbus A  895 = M passenger-km/hr Boeing  980 = M passenger-km/hr Boeing  885 = M passenger-km/hr Boeing  980 = M passenger-km/hr Concorde130  2200 = M passenger-km/hr DC  875 = M passenger-km/hr Performance from the viewpoint of FAA: Safety

June 2005Computer Architecture, Background and MotivationSlide 7 The Vanishing Computer Cost

June 2005Computer Architecture, Background and MotivationSlide 8 Figure 4.1 Performance improvement as a function of cost. Cost/Performance

June 2005Computer Architecture, Background and MotivationSlide Defining Computer Performance Figure 4.2 Pipeline analogy shows that imbalance between processing power and I/O capabilities leads to a performance bottleneck.

June 2005Computer Architecture, Background and MotivationSlide 10 Concepts of Performance and Speedup Performance = 1 / Execution time is simplified to Performance = 1 / CPU execution time (Performance of M 1 ) / (Performance of M 2 ) = Speedup of M 1 over M 2 = (Execution time of M 2 ) / (Execution time M 1 ) Terminology:M 1 is x times as fast as M 2 (e.g., 1.5 times as fast) M 1 is 100(x – 1)% faster than M 2 (e.g., 50% faster) CPU time = Instructions  (Cycles per instruction)  (Secs per cycle) = Instructions  CPI / (Clock rate) Instruction count, CPI, and clock rate are not completely independent, so improving one by a given factor may not lead to overall execution time improvement by the same factor.

June 2005Computer Architecture, Background and MotivationSlide 11 Figure 4.3 Faster steps do not necessarily mean shorter travel time. Faster Clock  Shorter Running Time

June 2005Computer Architecture, Background and MotivationSlide Performance Enhancement: Amdahl’s Law Figure 4.4 Amdahl’s law: speedup achieved if a fraction f of a task is unaffected and the remaining 1 – f part runs p times as fast. s =  min(p, 1/f) 1 f + (1 – f)/p f = fraction unaffected p = speedup of the rest

June 2005Computer Architecture, Background and MotivationSlide 13 Example 4.1 Amdahl’s Law Used in Design A processor spends 30% of its time on flp addition, 25% on flp mult, and 10% on flp division. Evaluate the following enhancements, each costing the same to implement: a.Redesign of the flp adder to make it twice as fast. b.Redesign of the flp multiplier to make it three times as fast. c.Redesign the flp divider to make it 10 times as fast. Solution a.Adder redesign speedup = 1 / [ / 2] = 1.18 b.Multiplier redesign speedup = 1 / [ / 3] = 1.20 c.Divider redesign speedup = 1 / [ / 10] = 1.10 What if both the adder and the multiplier are redesigned?

June 2005Computer Architecture, Background and MotivationSlide Performance Measurement vs. Modeling Figure 4.5 Running times of six programs on three machines.

June 2005Computer Architecture, Background and MotivationSlide 15 Performance Benchmarks Example 4.3 You are an engineer at Outtel, a start-up aspiring to compete with Intel via its new processor design that outperforms the latest Intel processor by a factor of 2.5 on floating-point instructions. This level of performance was achieved by design compromises that led to a 20% increase in the execution time of all other instructions. You are in charge of choosing benchmarks that would showcase Outtel’s performance edge. a.What is the minimum required fraction f of time spent on floating-point instructions in a program on the Intel processor to show a speedup of 2 or better for Outtel? Solution a.We use a generalized form of Amdahl’s formula in which a fraction f is speeded up by a given factor (2.5) and the rest is slowed down by another factor (1.2): 1 / [1.2(1 – f) + f / 2.5]  2  f  0.875

June 2005Computer Architecture, Background and MotivationSlide 16 Performance Estimation Average CPI =  All instruction classes (Class-i fraction)  (Class-i CPI) Machine cycle time = 1 / Clock rate CPU execution time = Instructions  (Average CPI) / (Clock rate) Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications. Application  Instr’n class  Data compression C language compiler Reactor simulation Atomic motion modeling A: Load/Store B: Integer C: Shift/Logic D: Float E: Branch F: All others8964

June 2005Computer Architecture, Background and MotivationSlide 17 MIPS Rating Can Be Misleading Example 4.5 Two compilers produce machine code for a program on a machine with two classes of instructions. Here are the number of instructions: ClassCPICompiler 1Compiler 2 A 1 600M 400M B 2 400M 400M a.What are run times of the two programs with a 1 GHz clock? b.Which compiler produces faster code and by what factor? c.Which compiler’s output runs at a higher MIPS rate? Solution a.Running time 1 (2) = (600M  M  2) / 10 9 = 1.4 s (1.2 s) b.Compiler 2’s output runs 1.4 / 1.2 = 1.17 times as fast c.MIPS rating 1, CPI = 1.4 (2, CPI = 1.5) = 1000 / 1.4 = 714 (667)

June 2005Computer Architecture, Background and MotivationSlide Reporting Computer Performance Table 4.4 Measured or estimated execution times for three programs. Time on machine X Time on machine Y Speedup of Y over X Program A Program B Program C All 3 prog’s Analogy: If a car is driven to a city 100 km away at 100 km/hr and returns at 50 km/hr, the average speed is not ( ) / 2 but is obtained from the fact that it travels 200 km in 3 hours.

June 2005Computer Architecture, Background and MotivationSlide 19 Table 4.4 Measured or estimated execution times for three programs. Time on machine X Time on machine Y Speedup of Y over X Program A Program B Program C Geometric mean does not yield a measure of overall speedup, but provides an indicator that at least moves in the right direction Comparing the Overall Performance Speedup of X over Y Arithmetic mean Geometric mean

June 2005Computer Architecture, Background and MotivationSlide The Quest for Higher Performance State of available computing power ca. the early 2000s: Gigaflops on the desktop Teraflops in the supercomputer center Petaflops on the drawing board Note on terminology (see Table 3.1) Prefixes for large units: Kilo = 10 3, Mega = 10 6, Giga = 10 9, Tera = 10 12, Peta = For memory: K = 2 10 = 1024, M = 2 20, G = 2 30, T = 2 40, P = 2 50 Prefixes for small units: micro = 10  6, nano = 10  9, pico = 10  12, femto = 10  15

June 2005Computer Architecture, Background and MotivationSlide 21 Figure 4.7 Exponential growth of supercomputer performance. Supercom- puters

June 2005Computer Architecture, Background and MotivationSlide 22 Figure 4.8 Milestones in the DOE’s Accelerated Strategic Computing Initiative (ASCI) program with extrapolation up to the PFLOPS level. The Most Powerful Computers

June 2005Computer Architecture, Background and MotivationSlide 23 Figure 25.1 Trend in energy consumption per MIPS of computational power in general- purpose processors and DSPs. Performance is Important, But It Isn’t Everything