CS 704 Advanced Computer Architecture

Presentation transcript:

CS 704 Advanced Computer Architecture
Lecture 3: Quantitative Principles … Cont’d, Design for Performance
Prof. Dr. M. Ashraf Chughtai
Welcome to the third lecture of the series on Advanced Computer Architecture.
MAC/VU-Advanced Computer Architecture Lecture 3 - Performance... Cont'd

Today’s Topics
Recap; I/O performance; laws and principles; performance enhancement; concluding the quantitative principles; homework; summary.
After a quick review of the previous two lectures on computer design, we will continue the discussion of the quantitative principles of computer design. The second lecture introduced processor performance, which is the key to designing a computer for performance. Today we will talk about I/O performance measures, the laws and principles of performance measurement, computer performance enhancement, and the concluding quantitative principles of computer design, followed by the homework.

Recap: Lectures 1-2
Topics: computer architecture versus organization; technological developments; computer design cycle; performance metrics (time versus throughput); price-performance design; benchmarks (performance evaluation).
Architecture versus organization: distinguishing between the architecture and the organization of processors, we concluded that the architecture of the members of a processor family is the same, whereas the organization of the same architecture may differ between members of the family.
Technological developments: the progression from vacuum tubes to VLSI circuits, together with dynamic memory and network technology, gave birth to four generations of computers.
Computer design cycle: the decisive factors behind rapid change in computer development have been performance enhancement, price reduction and functional improvement.
Performance metrics: the performance of two designs is often compared by a factor n, where machine X is n times faster than machine Y when the execution time of Y is n times the execution time of X. Time is the key measurement of performance. However, throughput, the number of tasks completed in a specified time, cannot be ignored. A desktop user may define the performance of his or her machine in terms of the time it takes to execute a program, whereas a computer center manager running a large server system may define performance in terms of the number of jobs completed in a specified time.
Price-performance design: the relationship between cost and price is a complex one, and computer designers must understand it because it affects the selling of their design. The cost is the total amount spent to produce a product; the price is the amount for which the finished good is sold. Cost is driven by factors such as die yield and production volume.
Growth in processor performance: the supercomputers and mainframes that prevailed in the early 1970s, costing millions of dollars and occupying a very large space, have been replaced by very low-cost microprocessor-based machines in the form of desktop personal computers (PCs), workstations and massively parallel processing machines.
Benchmarks: a benchmark is a program developed to evaluate the performance of a computer. Good products are created when we have good benchmarks and good ways to summarize performance.
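The comparison factor n from the recap can be illustrated with a minimal Python sketch. The function name `times_faster` and the two execution times are hypothetical values chosen for illustration; they do not come from the lecture.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that machine X is n times faster than machine Y."""
    if exec_time_x <= 0 or exec_time_y <= 0:
        raise ValueError("execution times must be positive")
    # Performance is the reciprocal of execution time, so
    # n = Performance_X / Performance_Y = ExTime_Y / ExTime_X.
    return exec_time_y / exec_time_x


# Hypothetical measurements: X runs the program in 10 s, Y in 15 s.
n = times_faster(10.0, 15.0)
print(f"X is {n:.2f} times faster than Y")
```

With these assumed numbers, Y takes 1.5 times as long as X, so X is 1.5 times faster than Y.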

Computer I/O System: Producer-Server Model
Producer: the device that generates requests to be serviced
Queue: the area where tasks accumulate, waiting to be serviced
Server: the device performing the requested service
Response time: the time a task takes from the moment it is placed in the buffer to the moment the server finishes it
[Figure: arrivals from the producer enter the queue; the server (I/O device/controller) takes tasks from the queue; finished tasks leave as departures]
An I/O system works on the principle of the producer-server model. The producer creates tasks to be processed and places them in a FIFO buffer, the queue. The server takes tasks from the buffer and performs them. The response time is the time a task takes from the moment it arrives in the buffer to the moment the server finishes it.
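The producer-server model can be sketched as a small single-server FIFO simulation in Python. This is an illustrative sketch only: the function name `simulate_fifo` and the arrival and service times are assumptions, not values from the lecture.

```python
from collections import deque


def simulate_fifo(arrivals, service_times):
    """Simulate one FIFO queue feeding one server.

    arrivals: arrival time of each task, in non-decreasing order
    service_times: time the server needs for each task
    Returns each task's response time (finish time minus arrival time).
    """
    queue = deque(zip(arrivals, service_times))
    server_free_at = 0.0
    response_times = []
    while queue:
        arrived, service = queue.popleft()        # FIFO: oldest task first
        start = max(arrived, server_free_at)      # wait if the server is busy
        finish = start + service
        response_times.append(finish - arrived)   # queueing delay + service
        server_free_at = finish
    return response_times


# Hypothetical workload: four tasks produced at these times (seconds).
arrivals = [0.0, 1.0, 2.0, 6.0]
service = [2.0, 2.0, 2.0, 1.0]
responses = simulate_fifo(arrivals, service)
print(responses)
print(sum(responses) / len(responses))  # mean response time
```

In this assumed workload the second and third tasks arrive while the server is busy, so their response times include time spent waiting in the queue as well as the service time.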

I/O Performance Parameters
Diversity: which I/O devices can connect to the CPU
Capacity: how many I/O devices can connect to the CPU
Latency: overall response time to complete a task
Bandwidth: number of tasks completed in a specified time (throughput)
Diversity and capacity are I/O performance measures that have no counterpart in CPU performance metrics. Latency (response time) and bandwidth (throughput) apply to the I/O system as well as to the CPU. An I/O system is said to be in an equilibrium state when the rate at which I/O requests from the CPU arrive at the input of the I/O queue (buffer) equals the rate at which requests depart the queue after being fulfilled by the I/O device.

I/O Transaction Time
The interaction time, or transaction time, of a computer is the sum of three times:
Entry time: the time for the user to enter a command (average 0.25 sec; from keyboard 4.0 sec)
System response time: the time between when the user enters the command and when the system responds
Think time: the time from reception of the response until the user enters the next command

Throughput versus Response Time: Performance Measures .. Cont’d
[Figure: response time (latency, ms) plotted against percentage of maximum throughput (bandwidth), 0% to 100%]
The minimum response time achieves only 10% of the throughput.
At 100% throughput, the response time is 7 to 8 times the minimum response time.
The knee of the curve is the region where a little more throughput results in a much longer response time.

Response Time and Throughput Calculation
[Figure: arrivals entering the system and departures leaving it]
If the system is in steady state, the number of tasks entering the system must equal the number of tasks leaving the system.
Little’s Law: Mean number of tasks in system = Mean response time x Arrival rate

Little’s Law: A Little Queuing Theory
Assume that we observe a system for a period $\text{Time}_{\text{observe}}$ and find that $\text{Number}_{\text{tasks}}$ tasks are completed in this period, and that the sum of the times each task spends in the system is $\text{Time}_{\text{accumulated}}$. Then:
$$\text{Mean number of tasks in system} = \frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}}$$
$$\text{Mean response time} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}}$$
$$\text{Arrival rate } \lambda = \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$
The arrival rate λ is the average number of tasks arriving per unit time. The expression for the mean number of tasks may be written as:
$$\frac{\text{Time}_{\text{accumulated}}}{\text{Time}_{\text{observe}}} = \frac{\text{Time}_{\text{accumulated}}}{\text{Number}_{\text{tasks}}} \times \frac{\text{Number}_{\text{tasks}}}{\text{Time}_{\text{observe}}}$$
that is, Mean number of tasks = Mean response time x Arrival rate.
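The three ratios above can be checked numerically with a short Python sketch. The function name `littles_law_terms` and the observation log are hypothetical and serve only to show that the product of mean response time and arrival rate reproduces the mean number of tasks.

```python
def littles_law_terms(tasks, time_observe):
    """Compute the three quantities in Little's Law from an observation log.

    tasks: list of (arrival, departure) pairs, all inside the observation window
    time_observe: length of the observation window, in the same time unit
    """
    number_tasks = len(tasks)
    # Sum of the time each task spends in the system (queue plus service).
    time_accumulated = sum(departure - arrival for arrival, departure in tasks)
    mean_tasks = time_accumulated / time_observe
    mean_response = time_accumulated / number_tasks
    arrival_rate = number_tasks / time_observe
    return mean_tasks, mean_response, arrival_rate


# Hypothetical log: four tasks observed over a 10-second window.
log = [(0.0, 2.0), (1.0, 4.0), (2.0, 6.0), (6.0, 7.0)]
mean_tasks, mean_response, arrival_rate = littles_law_terms(log, 10.0)
print(mean_tasks, mean_response, arrival_rate)
print(mean_response * arrival_rate)  # matches mean_tasks
```

Because all three quantities are built from the same two totals, the identity holds for any log in which every counted task arrives and departs inside the observation window.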

Amdahl's Law
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
[Figure: the original execution time of the task compared with the execution time after fraction F is enhanced by factor S]

Amdahl's Law
Speedup due to enhancement E:
$$\text{Speedup}(E) = \frac{\text{ExTime without } E}{\text{ExTime with } E} = \frac{\text{Performance with } E}{\text{Performance without } E}$$

Amdahl’s Law
$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$

Amdahl’s Law
$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$
$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$
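The overall speedup formula can be written as a small Python function. This is a sketch: the function name `amdahl_speedup` and the choice of an enhanced fraction of 0.5 in the sweep are assumptions made for illustration.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when a fraction of the task is sped up by a given factor."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # Normalized new execution time: unaffected part plus shrunken enhanced part.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


# Hypothetical sweep: half of the task is enhanced by growing factors.
for s in (2, 4, 8, 16):
    print(s, round(amdahl_speedup(0.5, s), 3))
```

The sweep shows diminishing returns: however large the enhancement factor becomes, the overall speedup for this assumed fraction stays below 1 / (1 - 0.5) = 2, because the unaffected half of the task still takes its original time.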

Amdahl’s Law: Example
Floating-point (FP) instructions are improved to run 2X faster, but only 10% of the actual instructions are FP. Find $\text{ExTime}_{\text{new}}$ and $\text{Speedup}_{\text{overall}}$.

Amdahl’s Law: Solution
Floating-point (FP) instructions are improved to run 2X faster, but only 10% of the actual instructions are FP.
$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left( 0.9 + \frac{0.1}{2} \right) = 0.95 \times \text{ExTime}_{\text{old}}$$
$$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$
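The floating-point example can be cross-checked by accounting for the time directly. In this Python sketch the old execution time of 100 seconds and the function name `enhanced_time` are assumptions; only the 10% fraction and the 2X factor come from the example.

```python
def enhanced_time(old_time: float, fraction_enhanced: float,
                  speedup_enhanced: float) -> float:
    """Execution time after speeding up one fraction of the task."""
    unaffected = old_time * (1.0 - fraction_enhanced)             # runs as before
    enhanced = old_time * fraction_enhanced / speedup_enhanced    # runs faster
    return unaffected + enhanced


old_time = 100.0                                  # assumed, in seconds
new_time = enhanced_time(old_time, 0.10, 2.0)     # 90 s unaffected + 5 s FP
speedup = old_time / new_time
print(new_time, round(speedup, 3))

# Limiting case: FP made infinitely fast, so only the unaffected 90% remains.
best_case = old_time / enhanced_time(old_time, 0.10, float("inf"))
print(round(best_case, 3))
```

Under these assumptions the new time is 95 seconds and the speedup rounds to 1.053, in agreement with the solution. The limiting case gives about 1.111, which suggests that no floating-point improvement alone can do better than roughly 11% when only 10% of the work is floating point.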
