2016/5/26\ELEG323-08F\Topic4.ppt1 Topics 4: Performance Measurement Introduction to Computer Systems Engineering (CPEG 323)

Presentation transcript:

Reading List
- Slides: Topic4
- Hennessy & Patterson: Chapter 4
- Other papers as assigned in class or homework

Performance
An attempt to quantify how well a particular computer can perform a user’s applications.
Problems:
- Essentially a software+hardware issue
- Different machines have different strengths and weaknesses
- There is an enormous amount of hype and outright deception in the market, so be wary

Conflicting Goals
- User: find the most suitable machine to get the job done at the lowest cost → application-oriented metrics
- Vendor: persuade you to buy their machine regardless of your needs → hardware-oriented metrics

Why Study Performance?
Know the vocabulary and understand the issues, so that:
- As a user/buyer, you can make better purchasing decisions
- As an engineer, you can make better hardware/software design decisions

Summary of Metrics
- Latency and throughput
- CPU time, CPI, clock rate and instruction count
- MIPS, relative MIPS
- SPEC ratio and rate
- Benchmarks

Latency vs. Throughput
These are two very different metrics!
Latency: How long does it take to get a particular task done?
- Also called execution time or running time
- Usually measured in time (e.g., microseconds)
Throughput: How many tasks can you perform in a unit of time?
- Also related to bandwidth (communication channels, storage)
- Usually measured in units per time (e.g., megabytes/second)
Relationship between them
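
The slide does not spell out the relationship, so the following is only a sketch under a simplifying assumption: every task has the same latency and a fixed number of tasks can be in flight at once. Under that assumption, steady-state throughput is the number of overlapping tasks divided by the latency. The function name and the 2 ms figure are illustrative and do not come from the course material.

```python
def throughput_per_s(latency_s: float, tasks_in_flight: int = 1) -> float:
    """Steady-state tasks per second, assuming equal-latency tasks that overlap."""
    if latency_s <= 0 or tasks_in_flight < 1:
        raise ValueError("latency must be positive and at least one task in flight")
    return tasks_in_flight / latency_s


# Hypothetical numbers: each task takes 2 ms from start to finish.
serial = throughput_per_s(0.002)                         # one task at a time
overlapped = throughput_per_s(0.002, tasks_in_flight=4)  # four tasks overlap

print(f"serial:     {serial:.0f} tasks/s")
print(f"overlapped: {overlapped:.0f} tasks/s (latency is still 2 ms per task)")
```

Overlapping work raises throughput without shortening any single task, which is why the two metrics have to be reported separately.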

Performance Expressed as Time
Absolute time measures
- Difference between the start and finish of an operation
- Synonyms: running time, elapsed time, completion time, execution time, response time, latency
Relative (normalized) time measures
- Running time normalized to some reference time

Choosing a Time-Based Performance Metric
Guiding principle: choose performance measures that track running time.
$$\text{Performance} = \frac{1}{\text{Execution time}}$$
Higher performance means it takes less time to run the application, so bigger is better.
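
As a small illustration of the definition above, the sketch below compares two machines by the ratio of their performances, which reduces to the inverse ratio of their execution times. The function names and the 10 s and 15 s timings are made up for the example.

```python
def performance(execution_time_s: float) -> float:
    """Performance defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def relative_performance(time_a_s: float, time_b_s: float) -> float:
    """How many times faster machine A is than machine B."""
    return performance(time_a_s) / performance(time_b_s)


# Hypothetical timings: the same program takes 10 s on A and 15 s on B.
ratio = relative_performance(10.0, 15.0)
print(f"A is {ratio:.2f} times faster than B")
```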

The Nature of Execution Time
Execution time on a computer is typically divided into:
- User time: time spent executing instructions in the user code
- System time: time spent executing instructions in the kernel on behalf of the user code (e.g., opening files)
- Other: time when the system is idle or executing other programs
Use the “time” and “top” commands in Unix to see these.
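
The same three-way split can be observed from inside a program. The sketch below is one possible way to do it in Python, using `os.times()` for user and system CPU time and `time.perf_counter()` for wall-clock time. The helper names `measure` and `busy_loop` are hypothetical, and the workload is only a placeholder.

```python
import os
import time


def measure(fn):
    """Run fn() and return real, user, and system time in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    fn()
    cpu_end = os.times()
    wall_end = time.perf_counter()
    return {
        "real": wall_end - wall_start,
        "user": cpu_end.user - cpu_start.user,
        "sys": cpu_end.system - cpu_start.system,
    }


def busy_loop(n: int = 200_000) -> int:
    """Placeholder workload that spends its time in user code."""
    total = 0
    for i in range(n):
        total += i * i
    return total


timings = measure(busy_loop)
# "Other" is whatever wall-clock time is not accounted for as CPU time.
# It can come out slightly negative because the two clocks have
# different resolutions.
timings["other"] = timings["real"] - timings["user"] - timings["sys"]

for name, seconds in timings.items():
    print(f"{name:5s} {seconds:.4f} s")
```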

Illustration of Execution Time
[Figure: a timeline divided into user time, system time, and other/idle time]
“Real” or “wall clock” time is the sum of all three.
Warning: file access delays are sometimes counted as “idle” even though they’re yours.

CPU Time vs. Latency
- CPU time is the time the CPU spends computing the given task, not including time spent waiting for I/O or running other programs. It is also known as CPU execution time.
- It consists of user CPU time and system CPU time.
- User CPU time: total time the CPU spends in the task itself
- System CPU time: total time the CPU spends in the operating system on behalf of the task

Application Metrics vs. Hardware Metrics
How do you relate the application-oriented performance measurements to what is going on inside the machine?
Most processors are synchronous, so we can use the clock as a basis.

Clock Cycles
- Clock “ticks” refer to clock edges (rising or falling)
- Cycle time (period) = time between ticks = seconds per cycle
- Clock rate (frequency) = cycles per second (1 Hz = 1 cycle/sec)
A 2GHz clock has a cycle time of
$$\text{Clock period} = \frac{1}{2000 \times 10^{6}\ \text{cycles/sec}} \times \frac{10^{9}\ \text{nsec}}{1\ \text{sec}} = 0.5\ \text{nsec}$$
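
The conversion between clock rate and cycle time is a reciprocal, which the short sketch below makes explicit. The function names are illustrative only.

```python
def cycle_time_ns(clock_rate_hz: float) -> float:
    """Clock period in nanoseconds for a clock rate given in Hz."""
    return 1e9 / clock_rate_hz


def clock_rate_hz(cycle_time_in_ns: float) -> float:
    """Clock rate in Hz for a clock period given in nanoseconds."""
    return 1e9 / cycle_time_in_ns


period = cycle_time_ns(2e9)   # the 2GHz clock from the slide
print(f"2 GHz clock: {period} ns per cycle")
```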

Measuring Time
If you’re lucky, you can count clock cycles directly; some CPUs have a built-in counter which increments every clock cycle.
If you’re not, you have to use a slower clock. Most systems have extra hardware which generates a regular tick; many operating systems will count these ticks for you.
Timing accuracy is limited by the resolution of the clock: you get less accurate readings off a 1Hz clock than a 1MHz clock!
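
To see what resolution a given system actually offers, one option is to ask the runtime. The sketch below uses Python's `time.get_clock_info()` to read the advertised resolution of several clocks, then estimates the smallest step that `time.perf_counter_ns()` is observed to take. The helper name `smallest_observed_tick_ns` is hypothetical, and the result depends on the machine and operating system.

```python
import time

# Advertised resolution of the clocks the Python runtime exposes.
for name in ("time", "monotonic", "perf_counter", "process_time"):
    info = time.get_clock_info(name)
    print(f"{name:13s} resolution = {info.resolution:.3e} s")


def smallest_observed_tick_ns(samples: int = 1000) -> int:
    """Smallest non-zero step seen between consecutive clock readings."""
    smallest = None
    for _ in range(samples):
        start = time.perf_counter_ns()
        end = time.perf_counter_ns()
        while end == start:          # spin until the clock advances
            end = time.perf_counter_ns()
        delta = end - start
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest


print(f"smallest observed tick: {smallest_observed_tick_ns()} ns")
```

Any interval shorter than the observed tick cannot be timed reliably with that clock; the usual workaround is to repeat the operation many times and divide.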

Cycles and Instructions
In almost all processors, a single instruction (executing one line of assembly code) requires more than one clock cycle. Either:
- One instruction must finish before the next can begin, or
- Consecutive instructions may overlap (“pipelining”)
In most processors, different types of instructions may take different numbers of cycles (e.g., integer vs. floating point).

Relating Cycles and Instructions
So we can add the following to our vocabulary:
- Cycles per instruction (CPI): smaller is better
- Instructions per cycle (IPC): bigger is better
If the number of cycles needed to execute one instruction varies with the instruction type, then the average CPI or IPC of a program depends on how many instructions of each type are executed.

Clock, CPI and Instruction Count
- Clock rate: hardware technology and organization
- CPI: instruction set architecture
- Instruction count: instruction set architecture and compiler technology
- CPI should be measured instead of looked up in “manuals”. Why? It is affected by many factors, e.g., cache and memory behavior.
- The most important measure is time: a lower instruction count may increase the CPI or the clock cycle time.

Example
A program executes 100 million instructions on a processor that averages 2 CPI and has a 2GHz clock. How much time will the program take?

Answer
$$\text{CPU time} = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}$$
$$1 \times 10^{8}\ \text{instructions} \times \frac{2\ \text{cycles}}{\text{instruction}} \times \frac{1\ \text{second}}{2 \times 10^{9}\ \text{cycles}} = 0.1\ \text{seconds}$$
Or you can work backwards from a known execution time and clock rate to calculate the CPI for a given program.
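
The calculation, and the "work backwards" direction, can be sketched as two small functions. The function names are illustrative; the numbers are the ones from the example (100 million instructions, 2 CPI, 2GHz).

```python
def cpu_time_s(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def cpi_from_time(time_s: float, instruction_count: float, clock_rate_hz: float) -> float:
    """Work backwards: CPI = CPU time * clock rate / instruction count."""
    return time_s * clock_rate_hz / instruction_count


elapsed = cpu_time_s(1e8, 2, 2e9)
print(f"CPU time: {elapsed} s")

recovered_cpi = cpi_from_time(elapsed, 1e8, 2e9)
print(f"CPI recovered from the time: {recovered_cpi}")
```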

How to Improve the Performance?
- Reduce the number of instructions to execute
- Increase the number of instructions per cycle
- Concurrent execution of instructions
- Increase clock rate

Weighted CPI
Sometimes it is useful in designing the CPU to calculate the total number of CPU clock cycles as
$$\text{CPU clock cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times I_i)$$

Weighted CPI Cont’d
Here $I_i$ represents the number of times an instruction of type $i$ is executed in a program, and $\text{CPI}_i$ represents the average number of clock cycles for an instruction of type $i$. This form can be used to express CPU time as
$$\text{CPU time} = \frac{\sum_{i=1}^{n} (\text{CPI}_i \times I_i)}{\text{Clock rate}}$$
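
A short sketch of the weighted sum follows. The instruction classes, their CPI values, the counts, and the 1 GHz clock are all invented for illustration; only the formulas come from the slides.

```python
# Hypothetical instruction mix: class name -> (CPI_i, I_i)
mix = {
    "alu":        (1, 50_000_000),
    "load_store": (2, 30_000_000),
    "branch":     (3, 20_000_000),
}


def total_cycles(mix: dict) -> int:
    """CPU clock cycles = sum over classes of CPI_i * I_i."""
    return sum(cpi * count for cpi, count in mix.values())


def average_cpi(mix: dict) -> float:
    """Weighted average CPI = total cycles / total instruction count."""
    instructions = sum(count for _, count in mix.values())
    return total_cycles(mix) / instructions


def cpu_time_from_mix(mix: dict, clock_rate_hz: float) -> float:
    """CPU time = total cycles / clock rate."""
    return total_cycles(mix) / clock_rate_hz


print(f"cycles:      {total_cycles(mix)}")
print(f"average CPI: {average_cpi(mix):.2f}")
print(f"CPU time:    {cpu_time_from_mix(mix, 1e9):.3f} s at an assumed 1 GHz")
```

Changing the counts while leaving the per-class CPI values alone changes the average CPI, which is the point of the weighted form.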

CPI Should Be Measured
CPI should be measured and not just calculated from a table in the back of a reference manual.
Always bear in mind that the real measure of computer performance is time.

Hardware-Oriented Metrics
Clock rate and IPC are often combined into various figures of merit:
- MIPS (Millions of Instructions Per Second), pronounced “mips”
- MOPS (Millions of Operations Per Second), pronounced “mops”
- MFLOPS (Millions of Floating-point Operations Per Second), pronounced “megaflops” and sometimes written “megaFLOPS”
Replace the first letter with K (kilo), G (giga), T (tera), P (peta), or even E (exa) or Z (zetta), as appropriate.

Problems with Hardware-Oriented Metrics
- Processors with different ISAs may require a different number of instructions to perform the same task, so MIPS ratings are hard to compare
- MOPS and MFLOPS are a somewhat better measure
- How do you count floating-point divides?
- Vendors usually report “peak” rates

MIPS Calculation
One alternative to time as the metric is MIPS, or million instructions per second. For a given program, MIPS is simply
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$$
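
Both forms of the formula should give the same number for the same program. The sketch below checks this using the earlier example's figures (100 million instructions, 2 CPI, 2GHz clock, 0.1 seconds); the function names are illustrative.

```python
def mips_from_time(instruction_count: float, execution_time_s: float) -> float:
    """MIPS = instruction count / (execution time * 10^6)."""
    return instruction_count / (execution_time_s * 1e6)


def mips_from_clock(clock_rate_hz: float, cpi: float) -> float:
    """MIPS = clock rate / (CPI * 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


by_time = mips_from_time(1e8, 0.1)
by_clock = mips_from_clock(2e9, 2)
print(f"MIPS from time:  {by_time:.0f}")
print(f"MIPS from clock: {by_clock:.0f}")
```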

Limitations of MIPS
- Meaningful only for comparing machines with the same ISA, same program, and same input
- Instruction capability is not considered
- May vary inversely with performance!
- Instruction count is an absolute number that does not consider the frequency of each instruction class

MIPS - What May Go Wrong with It?
A number of popular measures have been adopted in the quest for a standard measure of computer performance, with the result that a few innocent terms have been twisted from their well-defined environment and forced into a service for which they were never intended.

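Misleading Performance Measurement
Clock rate: 500 MHz
Instruction class     A    B    C
CPI                   1    2    3
Instruction counts (in billions) for each instruction class:
Code from             A    B    C
Compiler 1            5    1    1
Compiler 2           10    1    1
$$\text{Execution time} = \frac{\sum_i (\text{CPI}_i \times I_i)}{\text{Clock rate}}$$
$$\text{Execution time}_1 = \frac{(1 \times 5 + 2 \times 1 + 3 \times 1) \times 10^{9}}{500 \times 10^{6}} = 20\ \text{s}$$
$$\text{Execution time}_2 = \frac{(1 \times 10 + 2 \times 1 + 3 \times 1) \times 10^{9}}{500 \times 10^{6}} = 30\ \text{s}$$
$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}}$$
$$\text{MIPS}_1 = \frac{(5 + 1 + 1) \times 10^{9}}{20 \times 10^{6}} = 350$$
$$\text{MIPS}_2 = \frac{(10 + 1 + 1) \times 10^{9}}{30 \times 10^{6}} = 400$$
The code from Compiler 2 has the higher MIPS rating but the longer execution time.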
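
The figures in this example can be checked with a few lines of code. The sketch below uses the slide's CPI values, instruction counts, and 500 MHz clock; the variable and function names are illustrative.

```python
CLOCK_RATE_HZ = 500e6
CPI = {"A": 1, "B": 2, "C": 3}

# Instruction counts in billions, per instruction class.
counts_billions = {
    "Compiler 1": {"A": 5, "B": 1, "C": 1},
    "Compiler 2": {"A": 10, "B": 1, "C": 1},
}


def execution_time_s(counts: dict) -> float:
    """Sum of CPI_i * I_i over all classes, divided by the clock rate."""
    cycles = sum(CPI[cls] * n * 1e9 for cls, n in counts.items())
    return cycles / CLOCK_RATE_HZ


def mips(counts: dict) -> float:
    """Instruction count / (execution time * 10^6)."""
    instructions = sum(counts.values()) * 1e9
    return instructions / (execution_time_s(counts) * 1e6)


results = {
    name: (execution_time_s(counts), mips(counts))
    for name, counts in counts_billions.items()
}

for name, (seconds, rating) in results.items():
    print(f"{name}: {seconds:.0f} s, {rating:.0f} MIPS")
```

Ranking the two compilers by MIPS and ranking them by execution time give opposite answers, which is the slide's warning in concrete form.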

Key: Execution Time of Real Programs
The authors’ position is that the only consistent and reliable measure of performance is the execution time of real programs, and that all proposed alternatives to time as the metric or to real programs as the items measured have eventually led to misleading claims or even mistakes in computer design.

What is MIPS?
“Meaningless Indication of Processor Speed”
- Bob Estall Computer, 1987

MIPS Is Not A Multidimensional Measure
- A computer system is multidimensional, and should therefore be measured by some “vector”
- MIPS is a scalar, so it measures only one dimension
- MIPS is a very useful measure within its dimension