The University of Adelaide, School of Computer Science

Slides:



Advertisements
Similar presentations
CDA 3101 Fall 2010 Discussion Section 08 CPU Performance
Advertisements

CSE431 Chapter 7A.1Irwin, PSU, 2008 CSE 431 Computer Architecture Fall 2008 Chapter 7A: Intro to Multiprocessor Systems Mary Jane Irwin (
Computer Organization Lab 1 Soufiane berouel. Formulas to Remember CPU Time = CPU Clock Cycles x Clock Cycle Time CPU Clock Cycles = Instruction Count.
CS2100 Computer Organisation Performance (AY2014/2015) Semester 2.
Computer Abstractions and Technology
Performance Evaluation of Architectures Vittorio Zaccaria.
TU/e Processor Design 5Z032 1 Processor Design 5Z032 The role of Performance Henk Corporaal Eindhoven University of Technology 2009.
Computer Organization and Architecture 18 th March, 2008.
Example (1) Two computer systems have been tested using three benchmarks. Using the normalized ratio formula and the following tables below, find which.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Sep 5, 2005 Lecture 2.
CSCE 212 Quiz 4 – 2/16/11 *Assume computes take 1 clock cycle, loads and stores take 10 cycles and branches take 4 cycles and that they are running on.
9/16/2004Comp 120 Fall September 16 Assignment 4 due date pushed back to 23 rd, better start anywayAssignment 4 due date pushed back to 23 rd, better.
Computer Architecture Lecture 2 Instruction Set Principles.
Chapter 4 Assessing and Understanding Performance
Techniques for Efficient Processing in Runahead Execution Engines Onur Mutlu Hyesoon Kim Yale N. Patt.
Fall 2001CS 4471 Chapter 2: Performance CS 447 Jason Bakos.
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
CMSC 611: Advanced Computer Architecture Performance Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some material adapted.
Lecture 2: Technology Trends and Performance Evaluation Performance definition, benchmark, summarizing performance, Amdahl’s law, and CPI.
Lecture 2 Quantifying Performance
Lecture 1: Performance EEN 312: Processors: Hardware, Software, and Interfacing Department of Electrical and Computer Engineering Spring 2013, Dr. Rozier.
Sogang University Advanced Computing System Chap 1. Computer Architecture Hyuk-Jun Lee, PhD Dept. of Computer Science and Engineering Sogang University.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 1 Computer Abstractions and Technology Sections 1.5 – 1.11.
Compiled by Maria Ramila Jimenez
From lecture slides for Computer Organization and Architecture: Designing for Performance, Eighth Edition, Prentice Hall, 2010 CS 211: Computer Architecture.
Lecture 8: 9/19/2002CS170 Fall CS170 Computer Organization and Architecture I Ayman Abdel-Hamid Department of Computer Science Old Dominion University.
CEN 316 Computer Organization and Design Assessing and Understanding Performance Mansour AL Zuair.
Morgan Kaufmann Publishers
Processor Structure and Function Chapter8:. CPU Structure  CPU must:  Fetch instructions –Read instruction from memory  Interpret instructions –Instruction.
Performance Performance
TEST 1 – Tuesday March 3 Lectures 1 - 8, Ch 1,2 HW Due Feb 24 –1.4.1 p.60 –1.4.4 p.60 –1.4.6 p.60 –1.5.2 p –1.5.4 p.61 –1.5.5 p.61.
Performance – Last Lecture Bottom line performance measure is time Performance A = 1/Execution Time A Comparing Performance N = Performance A / Performance.
Computer Engineering Rabie A. Ramadan Lecture 2. Table of Contents 2 Architecture Development and Styles Performance Measures Amdahl’s Law.
CMSC 611: Advanced Computer Architecture Performance & Benchmarks Some material adapted from Mohamed Younis, UMBC CMSC 611 Spr 2003 course slides Some.
Performance Computer Organization II 1 Computer Science Dept Va Tech January 2009 © McQuain & Ribbens Defining Performance Which airplane has.
1 Lecture 3: Pipelining Basics Today: chapter 1 wrap-up, basic pipelining implementation (Sections C.1 - C.4) Reminders:  Sign up for the class mailing.
1 Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Performance. Moore's Law Moore's Law Related Curves.
Measuring Performance II and Logic Design
CS203 – Advanced Computer Architecture
Lecture 2: Performance Today’s topics:
Lecture 2: Performance Evaluation
Software Architecture in Practice
September 2 Performance Read 3.1 through 3.4 for Tuesday
ECE 4100/6100 Advanced Computer Architecture Lecture 1 Performance
How do we evaluate computer architectures?
Defining Performance Which airplane has the best performance?
CSC 4250 Computer Architectures
Morgan Kaufmann Publishers
CSCE 212 Chapter 4: Assessing and Understanding Performance
Chapter 1 Fundamentals of Computer Design
CS2100 Computer Organisation
Chapter 3: Principles of Scalable Performance
You have a system that contains a special processor for doing floating-point operations. You have determined that 50% of your computations can use the.
CMSC 611: Advanced Computer Architecture
The University of Adelaide, School of Computer Science
PERFORMANCE MEASURES. COMPUTATIONAL MODELS Equal Duration Model:  It is assumed that a given task can be divided into n equal subtasks, each of which.
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
The University of Adelaide, School of Computer Science
Potential for parallel computers/parallel programming
CMSC 611: Advanced Computer Architecture
Potential for parallel computers/parallel programming
The University of Adelaide, School of Computer Science
The University of Adelaide, School of Computer Science
Utsunomiya University
Chapter 2: Performance CS 447 Jason Bakos Fall 2001 CS 447.
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Computer Organization and Design Chapter 4
CS2100 Computer Organisation
Presentation transcript:

The University of Adelaide, School of Computer Science Computer Architecture A Quantitative Approach, Fifth Edition The University of Adelaide, School of Computer Science 25 April 2019 EEL 6764 Principles of Computer Architecture Review – Assignment 1 Instructor: Hao Zheng Department of Computer Science & Engineering University of South Florida Tampa, FL 33620 Email: haozheng@usf.edu Phone: (813)974-4757 Fax: (813)974-5456 Chapter 2 — Instructions: Language of the Computer

1.8 You are designing a system for real-time application in which specific deadlines must be met. Finishing the computation faster gains nothing. You find that your system can execute the necessary code, in the worst case, twice as fast as necessary. (a) how much energy do you save if you execute at the current speed and turn off the system when the computation is complete?

1.8 You are designing a system for real-time application in which specific deadlines must be met. Finishing the computation faster gains nothing. You find that your system can execute the necessary code, in the worst case, twice as fast as necessary. (b) how much energy do you save if you set the voltage and frequency to be half as much?

1. 9 Server farms provide enough capacity for the highest request rate 1.9 Server farms provide enough capacity for the highest request rate. When the servers operate at 60% of their capacity, they consume 90% of max power. Servers usually operate at 60% of their capacity. The servers could be turned off, but they would take too long to restart in response to more load. A new system has been proposed that allows for a quick restart but requires 20% of max power while in this “barely alive” state (a) how much power savings would be achieved by turning off 60% of servers?

1. 9 Server farms provide enough capacity for the highest request rate 1.9 Server farms provide enough capacity for the highest request rate. When the servers operate at 60% of their capacity, they consume 90% of max power. The servers could be turned off, but they would take too long to restart in response to more load. A new system has been proposed that allows for a quick restart but requires 20% of max power while in this “barely alive” state (b) how much power savings would be achieved by placing 60% of servers in the “barely alive” state?

1. 9 Server farms provide enough capacity for the highest request rate 1.9 Server farms provide enough capacity for the highest request rate. When the servers operate at 60% of their capacity, they consume 90% of max power. The servers could be turned off, but they would take too long to restart in response to more load. A new system has been proposed that allows for a quick restart but requires 20% of max power while in this “barely alive” state (c) how much power savings would be achieved by reducing voltage by 20% and frequency by 40%?

1. 9 Server farms provide enough capacity for the highest request rate 1.9 Server farms provide enough capacity for the highest request rate. When the servers operate at 60% of their capacity, they consume 90% of max power. The servers could be turned off, but they would take too long to restart in response to more load. A new system has been proposed that allows for a quick restart but requires 20% of max power while in this “barely alive” state (d) how much power savings would be achieved by placing 30% of servers in the “barely alive” state and 30% off?

1. 12 Assume we are enhancing a machine by adding encryption engine 1.12 Assume we are enhancing a machine by adding encryption engine. When computing encryption, it is 20 times faster than the normal mode of execution. We define percentage of encryption as the percentage of time in the original execution that is spent performing encryption. This specialized HW increases power by 2%. (b) With what percentage of encryption will adding encryption HW result in a speedup of 2?

1. 12 Assume we are enhancing a machine by adding encryption engine 1.12 Assume we are enhancing a machine by adding encryption engine. When computing encryption, it is 20 times faster than the normal mode of execution. We define percentage of encryption as the percentage of time in the original execution that is spent performing encryption. This specialized HW increases power by 2%. (c) What percentage of time in the new execution will be spent on encryption if speedup of 2 is achieved?

1. 12 Assume we are enhancing a machine by adding encryption engine 1.12 Assume we are enhancing a machine by adding encryption engine. When computing encryption, it is 20 times faster than the normal mode of execution. We define percentage of encryption as the percentage of time in the original execution that is spent performing encryption. This specialized HW increases power by 2%. (d) Suppose the percentage of encryption is 50%. Assume that 90% of encryption can be parallelized. What is speedup if 2 or 4 encryption HW units are added?

1.14 When making changes to optimize part of a processor, it is often the case that speeding up one type of instruction comes at the cost of slowing down something else. For example, if we put in a complicated fast floating-point unit, that takes space, and something might have to be moved farther away from the middle to accommodate it, adding an extra cycle in delay to reach that unit. The basic Amdahl’s Law equation does not take into account this trade-off.

(a) If the new fast floating-point unit speeds up floating-point operations by, on average, 2x, and floating-point operations take 20% of the original program’s execution time, what is the overall speedup (ignoring the penalty to any other instructions)?

(b) Now assume that speeding up the floating-point unit slowed down data cache accesses, resulting in a 1.5x slowdown (or 2/3 speedup). Data cache accesses consume 10% of the execution time. What is the overall speedup now?

(c) After implementing the new floating-point operations, what percentage of execution time is spent on floating-point operations? What percentage is spent on data cache accesses?

1.6 When parallelizing an application, the ideal speedup is speeding up by the number of processors. This is limited by two things: percentage of the application that can be parallelized and the cost of communication. Amdahl’s Law takes into account the former but not the latter. (a) What is the speedup with N processors if 80% of the application is parallelizable, ignoring the cost of communication?

1.6 When parallelizing an application, the ideal speedup is speeding up by the number of processors. This is limited by two things: percentage of the application that can be parallelized and the cost of communication. Amdahl’s Law takes into account the former but not the latter. (b) What is the speedup with eight processors if, for every processor added, the communication overhead is 0.5% of the original execution time.