CPU Performance Assessment As-Bahiya Abu-Samra 4-3-2014 - *Moore’s Law *Clock Speed *Instruction Execution Rate - MIPS - MFLOPS *SPEC Speed Metric *Amdahl’s.

Slides:



Advertisements
Similar presentations
Performance What differences do we see in performance? Almost all computers operate correctly (within reason) Most computers implement useful operations.
Advertisements

Computer Abstractions and Technology
Computer Architecture and Organization
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
Example (1) Two computer systems have been tested using three benchmarks. Using the normalized ratio formula and the following tables below, find which.
Chapter 1 CSF 2009 Computer Performance. Defining Performance Which airplane has the best performance? Chapter 1 — Computer Abstractions and Technology.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
CIS629 Fall Lecture Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two important.
Computational Astrophysics: Methodology 1.Identify astrophysical problem 2.Write down corresponding equations 3.Identify numerical algorithm 4.Find a computer.
Computer Performance Evaluation: Cycles Per Instruction (CPI)
Assessing and Understanding Performance B. Ramamurthy Chapter 4.
Gordon Moore Gordon Moore, cofounder of Intel 1965: 2 x trans. per chip/year After 1970: 2 x trans. per chip/1.5year 摩爾定律.
Fall 2001CS 4471 Chapter 2: Performance CS 447 Jason Bakos.
1 Lecture 10: FP, Performance Metrics Today’s topics:  IEEE 754 representations  FP arithmetic  Evaluating a system Reminder: assignment 4 due in a.
CIS429/529 Winter 07 - Performance - 1 Performance Overview Execution time is the best measure of performance: simple, intuitive, straightforward. Two.
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
CMSC 611: Advanced Computer Architecture Benchmarking Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Chapter 1 Section 1.4 Dr. Iyad F. Jafar Evaluating Performance.
Computer Performance Computer Engineering Department.
Lecture 2b: Performance Metrics. Performance Metrics Measurable characteristics of a computer system: Count of an event Duration of a time interval Size.
Parallel Processing - introduction  Traditionally, the computer has been viewed as a sequential machine. This view of the computer has never been entirely.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
CDA 3101 Fall 2013 Introduction to Computer Organization Computer Performance 28 August 2013.
10/19/2015Erkay Savas1 Performance Computer Architecture – CS401 Erkay Savas Sabanci University.
Performance.
CDA 3101 Discussion Section 09 CPU Performance. Question 1 Suppose you wish to run a program P with 7.5 * 10 9 instructions on a 5GHz machine with a CPI.
Computer Organization and Architecture Tutorial 1 Kenneth Lee.
1. 2 Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft.
From lecture slides for Computer Organization and Architecture: Designing for Performance, Eighth Edition, Prentice Hall, 2010 CS 211: Computer Architecture.
Chapter 2 Computer Evolution and Performance. ENIAC - background Electronic Numerical Integrator And Computer – world’s 1 st general purpose electronic.
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
Computer Organization (1) تنظيم الحاسبات (1)
CSE2021: Computer Organization Instructor: Dr. Amir Asif Department of Computer Science York University Handout # 2: Measuring Performance Topics: 1. Performance:
Performance Performance
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
1 Lecture 2: Performance, MIPS ISA Today’s topics:  Performance equations  MIPS instructions Reminder: canvas and class webpage:
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Lec2.1 Computer Architecture Chapter 2 The Role of Performance.
Computer Engineering Rabie A. Ramadan Lecture 2. Table of Contents 2 Architecture Development and Styles Performance Measures Amdahl’s Law.
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
CSE 340 Computer Architecture Summer 2016 Understanding Performance.
Computer Organization CS345 David Monismith Based upon notes by Dr. Bill Siever and from the Patterson and Hennessy Text.
Measuring Performance II and Logic Design
William Stallings Computer Organization and Architecture 6th Edition
Lecture 2: Performance Evaluation
Performance Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
How do we evaluate computer architectures?
Parallel Processing - introduction
Morgan Kaufmann Publishers
Architecture & Organization 1
William Stallings Computer Organization and Architecture 8th Edition
CSCE 212 Chapter 4: Assessing and Understanding Performance
Architecture & Organization 1
CMSC 611: Advanced Computer Architecture
AKT211 – CAO 02 – Computer Evolution and Performance
Computer Evolution and Performance
William Stallings Computer Organization and Architecture 8th Edition
William Stallings Computer Organization and Architecture 8th Edition
William Stallings Computer Organization and Architecture 8th Edition
William Stallings Computer Organization and Architecture 8th Edition
CMSC 611: Advanced Computer Architecture
Performance.
Presentation transcript:

CPU Performance Assessment As-Bahiya Abu-Samra *Moore’s Law *Clock Speed *Instruction Execution Rate - MIPS - MFLOPS *SPEC Speed Metric *Amdahl’s Law *Examples sheet

Moore’s Law Increased density of components on chip. Gordon Moore – cofounder of Intel. Number of transistors on a chip will double every year. Since 1970’s development has slowed a little. ▫ Number of transistors doubles every 18 months. Cost of a chip has remained almost unchanged. Higher packing density means shorter electrical paths, giving higher performance. Smaller size gives increased flexibility. Reduced power and cooling requirements. Fewer interconnections increases reliability.

Growth in CPU Transistor Count This figure reflects the famous Moore’s law.

Performance Assessment Clock Speed Key parameters ▫ Performance, cost, size, security, reliability, power consumption System clock speed ▫ In Hz or multiples of ▫ Clock rate, clock cycle, clock tick, cycle time Signals in CPU take time to settle down to 1 or 0 Signals may change at different speeds Operations need to be synchronised Instruction execution in discrete steps ▫ Fetch, decode, load and store, arithmetic or logical ▫ Usually require multiple clock cycles per instruction Pipelining gives simultaneous execution of instructions So, clock speed is not the whole story

Instruction Execution Rate Millions of instructions per second (MIPS) Millions of floating point instructions per second (MFLOPS) Heavily dependent on instruction set, compiler design, processor implementation, cache & memory hierarchy

INSTRUCTION EXECUTION RATE A processor is driven by a clock with a constant frequency f or, equivalently, a constant cycle time T, where T = 1/f Define the instruction count, Ic, for a program as the number of machine instructions executed for that program until it runs to completion or for some defined time. interval Let CPI i be the number of cycles required for instruction type i. and Ii be the number of executed instructions of type I for a given program. Then we can calculate an overall CPI as follows: CPI = The processor time T needed to execute a given program T = Ic * CPI * t

performance A common measure of performance for a processor is the rate at which instructions are executed, expressed as millions of instructions per second (MIPS), referred to as the MIPS rate. We can express the MIPS rate in terms of the clock rate and CPI as follows MIPS rate =Ic /T * 10 6 = f / CPI * 10 6

performance For example, consider the execution of a program which results in the execution of 2 million instructions on a 400-MHz processor. The program consists of four major types of instructions. The instruction mix and the CPI for each instruction type are given below based on the result of a program trace experiment: Instruction Type CPI Instruction Mix Arithmetic and logic 1 60% Load/store with cache hit 2 18% Branch 4 12% Memory reference with cache miss 8 10% T = Ic * CPI * t The average CPI when the program is executed on a uniprocessor with the above trace results is CPI= 0.6 +(2 *0.18) +(4* 0.12) +(8 *0.1) =2.24. The corresponding MIPS rate is =(400* 106) /(2.24 *106) =178.

MFLOPS Another common performance measure deals. Floating-point performance is expressed as millions of floating-point operations per second (MFLOPS), defined as follows only with floating-point instructions. Number of executed floating-point operations in a program MFLOPS = _______________________________________________ execution time * 10 6

Benchmarks Programs designed to test performance Written in high level language ▫ Portable Represents style of task ▫ Systems, numerical, commercial Easily measured Widely distributed E.g. System Performance Evaluation Corporation (SPEC) ▫ CPU2006 for computation bound  17 floating point programs in C, C++, Fortran  12 integer programs in C, C++  3 million lines of code ▫ Speed and rate metrics  Single task and throughput

SPEC Speed Metric Single task Base runtime defined for each benchmark using reference machine Results are reported as ratio of reference time to system run time ▫ Tref i execution time for benchmark i on reference machine ▫ Tsut i execution time of benchmark i on test system Overall performance calculated by averaging ratios for all 12 integer benchmarks —Use geometric mean –Appropriate for normalized numbers such as ratios,, ∏x i = x 1 ∙x 2 ∙...∙x n

SPEC Rate Metric Measures throughput or rate of a machine carrying out a number of tasks Multiple copies of benchmarks run simultaneously ▫ Typically, same as number of processors Ratio is calculated as follows: ▫ Tref i reference execution time for benchmark i ▫ N number of copies run simultaneously ▫ Tsuti elapsed time from start of execution of program on all N processors until completion of all copies of program ▫ Again, a geometric mean is calculated

Amdahl’s Law Gene Amdahl [AMDA67] Potential speed up of program using multiple processors Concluded that: ▫ Code needs to be parallelizable ▫ Speed up is bound, giving diminishing returns for more processors Task dependent ▫ Servers gain by maintaining multiple connections on multiple processors ▫ Databases can be split into parallel tasks

Amdahl’s Law Formula Conclusions ▫ When f is small, the use of parallel processors has little effect. ▫ N ->∞, speedup is bound by 1/(1 – f)  So, diminishing returns for using more processors It deals with the potential speedup of a program using multiple processors compared to a single processor. For program running on single processor —Fraction f of code infinitely parallelizable with no scheduling overhead —Fraction (1-f ) of code inherently serial —T is total execution time for program on single processor —N is number of processors that fully exploit parallel portions of code

References AMDA67 Amdahl, G. “Validity of the Single- Processor Approach to Achieving Large-Scale Computing Capability”, Proceedings of the AFIPS Conference, 1967.