BITS Pilani, Pilani Campus Today’s Agenda Role of Performance.

Presentation transcript:

Today’s Agenda: Role of Performance

Definition of Performance
For some program running on machine X:
\[ \text{Performance}_X = \frac{1}{\text{Execution time}_X} \]
"X is n times faster than Y" means:
\[ \frac{\text{Performance}_X}{\text{Performance}_Y} = n \]
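The definition above can be sketched in a few lines of Python. The function names and the sample execution times are illustrative assumptions, not values from the slides.

```python
def performance(execution_time_s: float) -> float:
    """Performance is the reciprocal of execution time."""
    if execution_time_s <= 0:
        raise ValueError("execution time must be positive")
    return 1.0 / execution_time_s


def times_faster(time_x_s: float, time_y_s: float) -> float:
    """Return n such that machine X is n times faster than machine Y."""
    return performance(time_x_s) / performance(time_y_s)


# Hypothetical timings: X runs the program in 10 s, Y in 15 s.
n = times_faster(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```

Because performance is the reciprocal of time, the ratio of performances is the same as the inverse ratio of execution times.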

Performance Equation I
\[ \text{CPU execution time for a program} = \text{CPU clock cycles for a program} \times \text{Clock cycle time} \]
or, equivalently,
\[ \text{CPU execution time for a program} = \frac{\text{CPU clock cycles for a program}}{\text{Clock rate}} \]
So, to improve performance one can either:
– reduce the number of cycles for a program, or
– reduce the clock cycle time, or, equivalently,
– increase the clock rate
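As a minimal sketch, the two equivalent forms of Performance Equation I can be checked against each other. The cycle count and the 2 GHz clock below are hypothetical numbers chosen only for illustration.

```python
import math


def cpu_time_from_cycle_time(clock_cycles: float, cycle_time_s: float) -> float:
    """CPU time = clock cycles x clock cycle time."""
    return clock_cycles * cycle_time_s


def cpu_time_from_clock_rate(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time = clock cycles / clock rate."""
    return clock_cycles / clock_rate_hz


# Hypothetical program: 10 billion cycles on a 2 GHz clock.
cycles = 10e9
clock_rate_hz = 2e9
cycle_time_s = 1.0 / clock_rate_hz  # cycle time is the reciprocal of clock rate

t_from_cycle_time = cpu_time_from_cycle_time(cycles, cycle_time_s)
t_from_clock_rate = cpu_time_from_clock_rate(cycles, clock_rate_hz)

# Both forms describe the same quantity, so they agree up to rounding.
assert math.isclose(t_from_cycle_time, t_from_clock_rate)
print(f"CPU time: {t_from_clock_rate:.2f} s")
```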

Performance Equation II
\[ \text{CPU execution time for a program} = \text{Instruction count} \times \text{Average CPI} \times \text{Clock cycle time} \]
Exercise: derive the above equation from Performance Equation I.
Average CPI:
– It provides one way to compare two different implementations of the same ISA, since the instruction count for a program remains the same.
– It differs with processor design.
– It also depends on the mix of instruction types executed in an application.
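One way to carry out this derivation is sketched below. It assumes the usual textbook definition of average CPI as total clock cycles divided by instruction count, which the slide does not state explicitly.

\[
\begin{aligned}
\text{Average CPI} &= \frac{\text{CPU clock cycles}}{\text{Instruction count}} \\
\Rightarrow\quad \text{CPU clock cycles} &= \text{Instruction count} \times \text{Average CPI} \\
\text{CPU execution time} &= \text{CPU clock cycles} \times \text{Clock cycle time} \\
&= \text{Instruction count} \times \text{Average CPI} \times \text{Clock cycle time}
\end{aligned}
\]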

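CPI Example
Computer A: cycle time = 250 ps, CPI = 2.0
Computer B: cycle time = 500 ps, CPI = 1.2
Same ISA, so both computers execute the same instruction count IC.
Which is faster, and by how much?
\[ \text{CPU time}_A = IC \times CPI_A \times \text{Cycle time}_A = IC \times 2.0 \times 250\,\text{ps} = 500 \times IC\ \text{ps} \]
\[ \text{CPU time}_B = IC \times CPI_B \times \text{Cycle time}_B = IC \times 1.2 \times 500\,\text{ps} = 600 \times IC\ \text{ps} \]
A is faster, by this much:
\[ \frac{\text{CPU time}_B}{\text{CPU time}_A} = \frac{600 \times IC\ \text{ps}}{500 \times IC\ \text{ps}} = 1.2 \]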

CPI for Instruction Class
CPU clock cycles can be computed by looking at the different types of instructions and using their individual clock cycle counts. Formula:
\[ \text{CPU clock cycles} = \sum_{i=1}^{n} \left( CPI_i \times C_i \right) \]
where C_i is the count of instructions of class i and CPI_i is the number of cycles per instruction for that instruction class.

CPI Example
Alternative compiled code sequences using instructions in classes A, B, C:

Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1

Sequence 1: IC = 5, Clock cycles = 2×1 + 1×2 + 2×3 = 10, Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6, Clock cycles = 4×1 + 1×2 + 1×3 = 9, Avg. CPI = 9/6 = 1.5
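The per-class calculation in this example can be reproduced with a short Python sketch. The dictionary layout and function names are assumptions made for illustration; the numbers are the ones given in the example.

```python
def clock_cycles(cpi_by_class: dict, count_by_class: dict) -> int:
    """Sum CPI_i x C_i over all instruction classes."""
    return sum(cpi_by_class[c] * count_by_class[c] for c in count_by_class)


def average_cpi(cpi_by_class: dict, count_by_class: dict) -> float:
    """Average CPI = total clock cycles / total instruction count."""
    total_instructions = sum(count_by_class.values())
    return clock_cycles(cpi_by_class, count_by_class) / total_instructions


# Values taken from the example: CPI per class and instruction counts.
CPI = {"A": 1, "B": 2, "C": 3}
SEQUENCE_1 = {"A": 2, "B": 1, "C": 2}
SEQUENCE_2 = {"A": 4, "B": 1, "C": 1}

for name, seq in (("Sequence 1", SEQUENCE_1), ("Sequence 2", SEQUENCE_2)):
    print(name, clock_cycles(CPI, seq), average_cpi(CPI, seq))
```

Sequence 2 executes more instructions but needs fewer clock cycles, which is why instruction count alone is not a reliable measure of performance.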

Performance Summary
Performance depends on:
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI
– Clock cycle time and CPI are determined by the implementation of the processor.

Evaluating Performance of a Machine
One way is to compare the execution time of the same workload on two different computers. A better way is to evaluate a computer using a set of benchmark programs specially chosen to measure performance. The best programs to use as benchmarks are real applications, for example typical engineering or scientific applications. For software development engineers, benchmarks may include programs such as compilers or word-processing systems.

Benchmarks
Benchmark suites:
– Perfect Club: set of application codes
– Livermore Loops: 24 loop kernels
– Linpack: linear algebra package
– SPEC: mix of code from an industry organization

SPEC (Standard Performance Evaluation Corporation)
– Sponsored by industry but independent and self-managed; trusted by code developers and machine vendors
– Clear guides for testing, see
– Regular updates (benchmarks are dropped and new ones added periodically according to relevance)
– Specialized benchmarks for particular classes of applications

SPEC History
First Round: SPEC CPU89
– 10 programs yielding a single number
Second Round: SPEC CPU92
– SPEC CINT92 (6 integer programs) and SPEC CFP92 (14 floating point programs)
– compiler flags can be set differently for different programs
Third Round: SPEC CPU95
– new set of programs: SPEC CINT95 (8 integer programs) and SPEC CFP95 (10 floating point)
– single flag setting for all programs
Fourth Round: SPEC CPU2000
– new set of programs: SPEC CINT2000 (12 integer programs) and SPEC CFP2000 (14 floating point)
– single flag setting for all programs
– programs in C, C++, Fortran 77, and Fortran 90

CINT2000 (Integer component of SPEC CPU2000)

Program       Language   What It Is
164.gzip      C          Compression
175.vpr       C          FPGA Circuit Placement and Routing
176.gcc       C          C Programming Language Compiler
181.mcf       C          Combinatorial Optimization
186.crafty    C          Game Playing: Chess
197.parser    C          Word Processing
252.eon       C++        Computer Visualization
253.Perlbmk   C          PERL Programming Language
254.gap       C          Group Theory, Interpreter
255.Vortex    C          Object-oriented Database
256.bzip2     C          Compression
300.twolf     C          Place and Route Simulator

CFP2000 (Floating point component of SPEC CPU2000)

Program        Language     What It Is
168.wupwise    Fortran 77   Physics / Quantum Chromodynamics
171.swim       Fortran 77   Shallow Water Modeling
172.mgrid      Fortran 77   Multi-grid Solver: 3D Potential Field
173.applu      Fortran 77   Parabolic / Elliptic Differential Equations
177.mesa       C            3-D Graphics Library
178.Galgel     Fortran 90   Computational Fluid Dynamics
179.art        C            Image Recognition / Neural Networks
183.equake     C            Seismic Wave Propagation Simulation
187.facerec    Fortran 90   Image Processing: Face Recognition
188.ammp       C            Computational Chemistry
189.lucas      Fortran 90   Number Theory / Primality Testing
191.fma3d      Fortran 90   Finite-element Crash Simulation
200.sixtrack   Fortran 77   High Energy Physics Accelerator Design
301.apsi       Fortran 77   Meteorology: Pollutant Distribution

Amdahl's Law
Amdahl's law, also known as Amdahl's argument, is named after computer architect Gene Amdahl. The law is used to find the maximum expected improvement to an overall system when only part of the system is improved.

Amdahl's Law
\[ \text{Execution time after improvement} = \text{Execution time unaffected} + \frac{\text{Execution time affected}}{\text{Amount of improvement}} \]
[Figure: time before improvement compared with time after improvement]

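Amdahl’s Law
\[ \text{Speed-up} = \frac{\text{Perf}_{new}}{\text{Perf}_{old}} = \frac{\text{Exec\_time}_{old}}{\text{Exec\_time}_{new}} = \frac{1}{(1 - f) + \dfrac{f}{P}} \]
Here f is the fraction of the old execution time that can use the faster mode and P is the speed-up of that mode.
The performance improvement from using a faster mode is limited by the fraction of time the faster mode can be applied.
[Figure: T_old split into f and (1 - f); T_new split into (1 - f) and f / P]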
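A minimal Python sketch of this speed-up formula follows. It assumes f is the fraction of the original execution time that benefits from the faster mode and P is the factor by which that part is sped up; the function names and the sample values are illustrative only.

```python
def amdahl_speedup(f: float, p: float) -> float:
    """Overall speed-up when a fraction f of the time is sped up by factor p."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if p <= 0:
        raise ValueError("p must be positive")
    return 1.0 / ((1.0 - f) + f / p)


def max_speedup(f: float) -> float:
    """Upper bound on speed-up as p grows without limit: 1 / (1 - f)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    return float("inf") if f == 1.0 else 1.0 / (1.0 - f)


# Hypothetical case: half of the run time is made twice as fast.
print(f"speed-up: {amdahl_speedup(0.5, 2.0):.3f}")
print(f"limit for f = 0.5: {max_speedup(0.5):.1f}")
```

However large P becomes, the overall speed-up stays below the bound returned by `max_speedup`, which is the sense in which the unimproved fraction limits the gain.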

Example
"Suppose a program runs in 100 seconds on a machine, with multiply responsible for 80 seconds of this time. How much do we have to improve the speed of multiplication if we want the program to run 4 times faster?"
How about making it 5 times faster? There is no amount by which multiplication can be enhanced to achieve a 5 times increase in performance: the 20 seconds not spent on multiplication already equal the target time of 100/5 = 20 seconds. Thus the performance enhancement possible with a given improvement is limited by the amount that the improved feature is used.
Principle: make the common case fast.
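The question in this example can be worked through numerically by rearranging "time after = unaffected time + affected time / improvement" to solve for the improvement. The sketch below uses the numbers from the example; the function name and the choice to return `None` for an unreachable target are assumptions made for illustration.

```python
from typing import Optional


def required_improvement(total_s: float, affected_s: float,
                         target_speedup: float) -> Optional[float]:
    """Return the factor by which the affected part must be sped up,
    or None if the target overall speed-up cannot be reached."""
    unaffected_s = total_s - affected_s
    target_time_s = total_s / target_speedup
    remaining_s = target_time_s - unaffected_s  # time budget left for the affected part
    if remaining_s <= 0:
        return None  # the unaffected part alone uses up the whole budget
    return affected_s / remaining_s


# Numbers from the example: 100 s total, 80 s spent in multiplication.
print(required_improvement(100.0, 80.0, 4.0))  # 16.0
print(required_improvement(100.0, 80.0, 5.0))  # None
```

For the 4 times target, the program must finish in 25 seconds, leaving 5 seconds for multiplication, so multiplication must become 16 times faster. For the 5 times target, no time budget is left for multiplication at all.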

Remember
Performance is specific to a particular program or set of programs:
– Total execution time is a consistent summary of performance
For a given architecture, performance increases come from:
– increases in clock rate (without adverse CPI effects)
– improvements in processor organization that lower CPI
– compiler enhancements that lower CPI and/or instruction count

Processor Design: Introduction

Implementing MIPS: Fetch/Execute cycle
