Compiled by Maria Ramila Jimenez

Amdahl’s law

Amdahl’s Law implies that the overall performance improvement is limited by the fraction of the execution time that the optimization actually affects. Consider the following equations: Speedup = (original execution time) / (new execution time); New execution time = time_unaffected + time_affected / (improvement factor); Execution time = (instruction count) × CPI × (clock cycle time)

CPI = (execution_time × clock_rate) / (instruction count). To determine which of two machines is faster, compare execution times: speedup of A over B = cpu_ex_time_B / cpu_ex_time_A (A is faster when this ratio is greater than 1).

Amdahl’s law If a program currently takes 100 seconds to execute and loads and stores account for 20% of the execution time, how long will the program take if loads and stores are made 30% faster? For this, you can use Amdahl's law or you can reason it out step by step. Doing it step by step gives: (1) Before the improvement, loads and stores take 20 seconds. (2) If loads and stores are made 30 percent faster, they will take 20/1.3 = 15.385 seconds, which is 4.615 seconds less. (3) Thus, the final program will take 100 − 4.615 = 95.385 seconds.
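The step-by-step reasoning on this slide can be checked with a short calculation (variable names are ours, not from the slide):

```python
# A 100-second program where loads and stores take 20% of the time
# and are made 30% faster.
total_time = 100.0               # seconds
affected = 0.20 * total_time     # time spent in loads/stores: 20 s
improved = affected / 1.3        # 30% faster => divide by 1.3
new_total = (total_time - affected) + improved

print(round(improved, 3))        # 15.385
print(round(new_total, 3))       # 95.385
```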

Amdahl’s law Amdahl's law, named after computer architect Gene Amdahl, is used to find the maximum expected improvement to an overall system when only part of the system is improved. Amdahl's law can be interpreted more technically, but in simplest terms it means that it is the algorithm that decides the speedup, not the number of processors. Wikipedia

Amdahl’s law Amdahl's law is a demonstration of the law of diminishing returns: while one could speed up part of a computer a hundred-fold or more, if the improvement only affects 12% of the overall task, the best the speedup could possibly be is 1 / (1 − 0.12) ≈ 1.136 times faster. Wikipedia

Amdahl’s law More technically, the law is concerned with the speedup achievable from an improvement to a computation that affects a proportion P of that computation, where the improvement has a speedup of S. Wikipedia

Amdahl’s law For example, if an improvement can speed up 30% of the computation, P will be 0.3; if the improvement makes the portion affected twice as fast, S will be 2. Amdahl's law states that the overall speedup of applying the improvement will be 1 / ((1 − P) + P/S). Wikipedia
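The general formula can be sketched as a small helper (the function name is ours, not from the slides):

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of the work is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# The slide's example: 30% of the computation made twice as fast.
print(round(amdahl_speedup(0.3, 2), 4))       # 1.1765
# The earlier 12% example: even a huge speedup of that part caps out near 1.136.
print(round(amdahl_speedup(0.12, 1e12), 3))   # 1.136
```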

Amdahl’s law To see how this formula was derived, assume that the running time of the old computation was 1, for some unit of time. The running time of the new computation will be the length of time the unimproved fraction takes (which is 1 − P) plus the length of time the improved fraction takes. The length of time for the improved part of the computation is the length of the improved part's former running time divided by the speedup, making the length of time of the improved part P/S. Wikipedia

Amdahl’s law The final speedup is computed by dividing the old running time by the new running time, which is what the above formula does. Wikipedia

Amdahl’s law Here's another example. We are given a task which is split up into four parts: P1 = .11 or 11%, P2 = .18 or 18%, P3 = .23 or 23%, P4 = .48 or 48%, which add up to 100%. Then we say P1 is not sped up, so S1 = 1 or 100%, P2 is sped up 5x, so S2 = 5 or 500%, P3 is sped up 20x, so S3 = 20 or 2000%, and P4 is sped up 1.6x, so S4 = 1.6 or 160%. Wikipedia

Amdahl’s law By using the formula we find the new running time is 0.11/1 + 0.18/5 + 0.23/20 + 0.48/1.6 = 0.4575, or a little less than 1/2 the original running time, which we know is 1. Notice how the 20x and 5x speedups don't have much effect on the overall speed boost and running time when over half of the task is only sped up 1x (i.e. not sped up) or 1.6x. Therefore the overall speed boost is 1/0.4575 ≈ 2.186, or a little more than double the original speed. Wikipedia
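The four-part example works out as follows (a direct transcription of the slide's numbers):

```python
# Fractions P of the task and the per-part speedups S.
parts = [(0.11, 1), (0.18, 5), (0.23, 20), (0.48, 1.6)]

new_time = sum(p / s for p, s in parts)   # old running time is 1
speedup = 1 / new_time

print(round(new_time, 4))   # 0.4575
print(round(speedup, 3))    # 2.186
```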

Amdahl’s law Parallelization In the special case of parallelization, Amdahl's law states that if F is the fraction of a calculation that is sequential (i.e. cannot benefit from parallelization), and (1 − F) is the fraction that can be parallelized, then the maximum speedup that can be achieved by using N processors is 1 / (F + (1 − F)/N).

Amdahl’s law As an example, if F is only 10%, the problem can be sped up by a factor of at most 10, no matter how large the value of N used. For this reason, parallel computing is only useful for either small numbers of processors or problems with very low values of F: so-called embarrassingly parallel problems. A great part of the craft of parallel programming consists of attempting to reduce F to the smallest possible value.

Amdahl’s law Speedup = 1 / (f + (1 − f)/N), where f is the fraction of the problem that must be computed sequentially and N is the number of processors.

Amdahl's Law Amdahl's law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used.

Amdahl's Law Consider the following example program:
Loop 1: 200 lines, 10% of total execution time
Loop 2: 10 lines, 90% of total execution time

Amdahl's Law Overall speedup = 1 / ((1 − f) + f/s), where f is the fraction of a program that is enhanced and s is the speedup of the enhanced portion. http://www.cs.iastate.edu/~prabhu/Tutorial/CACHE/amdahl.html
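Applying this formula to the two-loop example above shows why the small Loop 2 is the one worth optimizing. The 10x per-loop speedup here is an assumed figure for illustration, not given on the slides:

```python
def overall_speedup(f, s):
    """Overall speedup when the enhanced fraction f is sped up by factor s."""
    return 1 / ((1 - f) + f / s)

# Speeding up Loop 2 (90% of execution time) by an assumed 10x:
print(round(overall_speedup(0.9, 10), 2))   # 5.26
# Speeding up Loop 1 (10% of execution time) by the same 10x barely helps:
print(round(overall_speedup(0.1, 10), 2))   # 1.1
```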