CDA 3101 Discussion Section 09 CPU Performance

Question 1
Suppose you wish to run a program P with 7.5 * 10^9 instructions on a 5 GHz machine with a CPI of 0.8.
a. What is the expected CPU time?
b. When you run P, it takes 3 seconds of wall-clock time to complete. What percentage of the CPU time did P receive?

Question 1
a. The expected CPU time:
CPU Time = IC * CPI * Clock cycle time = 7.5 * 10^9 * 0.8 * 1/(5 * 10^9) = 1.2 seconds
b. The percentage of the CPU time P received:
1.2 seconds / 3 seconds = 40%
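The Question 1 arithmetic can be checked with a few lines of Python. This is only a sketch: the function names `cpu_time` and `cpu_share` are made up for illustration and do not come from the slides.

```python
def cpu_time(instruction_count, cpi, clock_hz):
    """CPU time in seconds: IC * CPI * clock cycle time (cycle time = 1 / clock rate)."""
    return instruction_count * cpi / clock_hz


def cpu_share(cpu_seconds, wall_seconds):
    """Fraction of the elapsed wall-clock time that the program spent on the CPU."""
    return cpu_seconds / wall_seconds


# Values from Question 1: 7.5 * 10^9 instructions, CPI 0.8, 5 GHz, 3 s wall clock.
expected = cpu_time(7.5e9, 0.8, 5e9)
share = cpu_share(expected, 3.0)
print(f"CPU time: {expected:.2f} s, share of CPU: {share:.0%}")
```

Dividing by the clock rate is the same as multiplying by the clock cycle time used in the slide's formula.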

Question 2
Consider program P, which runs on a 1 GHz machine M in 10 seconds. An optimization is made to P, replacing every multiplication of a value by 4 (mult X,X,4) with two instructions that each set X to X + X (add X,X; add X,X). Call this new optimized program P'. The CPI of a multiply instruction is 4, and the CPI of an add is 1. After recompiling, the program runs in 9 seconds on machine M. How many multiplies were replaced by the new compiler?

Question 2
The number of multiplies that were replaced by the new compiler:
Let X = number of multiplies replaced by the new compiler
Number of cycles these operations take with the old compiler = 4X
Number of cycles these operations take with the new compiler = 2X (two adds at 1 cycle each)
Cycles saved = 4X - 2X = 2X
Total difference in cycles between P and P' = 10 * 10^9 - 9 * 10^9 = 10^9
2X = 10^9 => X = 5 * 10^8
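The same reasoning for Question 2, written as a small Python sketch. The function name `multiplies_replaced` and its parameters are hypothetical, chosen here only to mirror the quantities in the problem.

```python
def multiplies_replaced(clock_hz, old_seconds, new_seconds,
                        cpi_mult, cpi_add, adds_per_mult):
    """Number of multiplies replaced, given the runtime saved by the replacement.

    Each replaced multiply costs cpi_mult cycles before the change and
    adds_per_mult * cpi_add cycles after it.
    """
    cycles_saved_total = (old_seconds - new_seconds) * clock_hz
    cycles_saved_each = cpi_mult - adds_per_mult * cpi_add
    return cycles_saved_total / cycles_saved_each


# Values from Question 2: 1 GHz, 10 s -> 9 s, mult CPI 4, add CPI 1, two adds per mult.
replaced = multiplies_replaced(1e9, 10, 9, cpi_mult=4, cpi_add=1, adds_per_mult=2)
print(f"Multiplies replaced: {replaced:.1e}")
```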

Question 3
For a typical workload, the percentages of four groups of instructions and their average CPI are given in the following table.

Instruction       Percentage (%)   CPI
Integer           50               1
Branch            5                2
Load/Store        30               4
Floating-point    15               10

Question 3
a. Calculate the overall average CPI for the typical workload.
b. There are three possible ways to improve performance. First, an enhanced compiler can halve the number of floating-point instructions, at the cost of increasing the number of integer instructions by 15% of the original integer count. Second, a new pipelining technique can reduce the average CPI of floating-point instructions from 10 to 4, at the cost of a 4% longer clock cycle time. Third, an improved caching technique can reduce the CPI of Load/Store instructions from 4 to 3. Calculate and compare the performance improvement of the three solutions.

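Question 3
a. Average CPI = 50%*1 + 5%*2 + 30%*4 + 15%*10 = 3.3
b. Old scheme: CPU time = 3.3 * IC * Cycle Time
First: (50%*IC*115%*1 + 5%*IC*2 + 30%*IC*4 + 15%*IC/2*10) * Cycle Time = 2.625 * IC * Cycle Time
Second: (50%*IC*1 + 5%*IC*2 + 30%*IC*4 + 15%*IC*4) * 1.04 * Cycle Time = 2.496 * IC * Cycle Time
Third: (50%*IC*1 + 5%*IC*2 + 30%*IC*3 + 15%*IC*10) * Cycle Time = 3 * IC * Cycle Time
Comparison: the second solution gives the lowest CPU time (speedup 3.3/2.496, about 1.32), followed by the first (3.3/2.625, about 1.26) and the third (3.3/3 = 1.10).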
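A short Python sketch of the Question 3 comparison follows. It expresses every scheme as cycles per original instruction, scaled by any change in cycle time, so the results are in units of IC * Cycle Time. The dictionary keys and the helper name `cycles_per_original_instruction` are illustrative assumptions, not part of the slides.

```python
def cycles_per_original_instruction(mix):
    """mix maps a group name to (instruction count as a fraction of the original IC, CPI)."""
    return sum(fraction * cpi for fraction, cpi in mix.values())


base = {
    "integer": (0.50, 1),
    "branch": (0.05, 2),
    "load_store": (0.30, 4),
    "floating_point": (0.15, 10),
}

# First: half the floating-point instructions, 15% more integer instructions.
first = {**base, "integer": (0.50 * 1.15, 1), "floating_point": (0.15 / 2, 10)}
# Second: floating-point CPI drops to 4, but the cycle time grows by 4%.
second = {**base, "floating_point": (0.15, 4)}
# Third: Load/Store CPI drops to 3.
third = {**base, "load_store": (0.30, 3)}

# Relative CPU times, in units of IC * Cycle Time.
t_base = cycles_per_original_instruction(base)
times = {
    "first": cycles_per_original_instruction(first),
    "second": cycles_per_original_instruction(second) * 1.04,
    "third": cycles_per_original_instruction(third),
}
speedups = {name: t_base / t for name, t in times.items()}

for name in times:
    print(f"{name}: time = {times[name]:.3f}, speedup = {speedups[name]:.2f}")
```

Keeping the instruction fractions relative to the original IC is what lets the first scheme, which changes the instruction count, be compared directly with the other two.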

Question 4
Suppose a computer runs at 1 GHz and a typical application has the following distribution of instructions.

Instructions      Percentage (%)   CPI
Floating point    20               3
Load              30               2
Branches          15               4
Integer           10               1
Store             20               1
Other             5                1.5

a. What is the average CPI?
b. If the runtime for an application is 4.35 s, how many floating-point instructions are executed?

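Question 4
a. Average CPI = 0.2*3 + 0.3*2 + 0.15*4 + 0.1*1 + 0.2*1 + 0.05*1.5 = 2.175
b. Total cycles = 4.35 s * 10^9 cycles/s = 4.35 * 10^9
Total instructions = 4.35 * 10^9 / 2.175 = 2 * 10^9
Floating-point instructions = 2 * 10^9 * 20% = 4 * 10^8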
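The Question 4 steps can be reproduced with the Python sketch below. The variable names and dictionary keys are chosen for illustration only.

```python
CLOCK_HZ = 1e9      # 1 GHz machine
RUNTIME_S = 4.35    # measured runtime in seconds

# Group name -> (fraction of all instructions, CPI), taken from the Question 4 table.
mix = {
    "floating_point": (0.20, 3),
    "load": (0.30, 2),
    "branches": (0.15, 4),
    "integer": (0.10, 1),
    "store": (0.20, 1),
    "other": (0.05, 1.5),
}

# a. Weighted average CPI over the instruction mix.
avg_cpi = sum(fraction * cpi for fraction, cpi in mix.values())

# b. Runtime -> total cycles -> total instructions -> floating-point share.
total_cycles = RUNTIME_S * CLOCK_HZ
total_instructions = total_cycles / avg_cpi
fp_instructions = mix["floating_point"][0] * total_instructions

print(f"Average CPI: {avg_cpi:.3f}")
print(f"Floating-point instructions: {fp_instructions:.2e}")
```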