INTRODUCTION Jehan-François Pâris

An evolving field Computer architectures keep changing –Building faster computers Supercomputers and data centers –Building cheaper, smaller computers Laptops, notebooks, netbooks, smartbooks –Putting computer systems everywhere Cars, cell phones, HDTV: embedded computers

An analogy Electrical motors –Replaced the single steam engine powering many machines through transmission belts and pulleys –One electrical motor per machine –Domestic appliances, car starters, … –Power tools –Power windows, electrical toothbrushes, …

The coming revolution Cannot increase CPU clock frequency above 2 GHz without running into unsolvable heat dissipation problems –Switch to multicore architectures Two, four, eight, … CPUs per chip –Creates new problems Hardware: cache synchronization Software: programming these beasts Ouch!

Other challenges Reducing power consumption of data centers –Often contain archival data that are very rarely accessed Finding new ways to keep increasing magnetic disk capacity Dealing with physical limits to SDRAM density –Will never get 8 TB SODIMM modules Finding a replacement for hard drives

Classical computer components Input Output Memory Datapath Control – Datapath + Control = Processor Storage subsystem is missing!

A laptop motherboard

The course philosophy Showing you how computers work is fine Showing you how to make them faster is better!

PERFORMANCE ISSUES Defining performance Measuring it –Not an easy task Evaluating the impact of –Amount of work done by each instruction –Time they take to run –CPU clock speed

Measuring Performance Inverse of execution time of a benchmark Performance = 1/Execution Time If computers A and B are such that Execution Time A < Execution Time B for the same benchmark, then Performance A > Performance B

SPEC CPU Benchmark SPEC CPU2006 –Set of 12 integer and 17 floating-point benchmarks –Results are normalized: Execution time on a reference processor / Execution time on the benchmarked processor –Single value is the geometric mean of these ratios

How it is computed (I) Two new processors P and Q compared to a reference processor R Execution times for n benchmarks – $P_1, P_2, \ldots, P_n$ – $Q_1, Q_2, \ldots, Q_n$ – $R_1, R_2, \ldots, R_n$

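How it is computed (II) SPEC value for processor P is $\mathrm{SPEC}(P) = \left( \prod_{i=1}^{n} \frac{R_i}{P_i} \right)^{1/n}$ Observe that (property of geometric mean) $\frac{\mathrm{SPEC}(P)}{\mathrm{SPEC}(Q)} = \left( \prod_{i=1}^{n} \frac{Q_i}{P_i} \right)^{1/n}$, so the comparison between P and Q does not depend on the choice of reference processor R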
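The geometric-mean property can be checked numerically. The sketch below assumes the SPEC value is the geometric mean of the ratios (reference time / measured time), as the SPEC CPU slide describes. The function name `spec_ratio_mean` and all execution times are hypothetical, chosen only for illustration.

```python
import math

def spec_ratio_mean(reference_times, measured_times):
    """Geometric mean of reference_time / measured_time over all benchmarks."""
    if len(reference_times) != len(measured_times) or not reference_times:
        raise ValueError("need two non-empty lists of equal length")
    ratios = [r / m for r, m in zip(reference_times, measured_times)]
    # Average the logarithms, then exponentiate: same result as the
    # n-th root of the product, without building a huge product.
    return math.exp(sum(math.log(x) for x in ratios) / len(ratios))

# Hypothetical execution times in seconds for n = 3 benchmarks.
R = [100.0, 200.0, 400.0]  # reference processor
P = [50.0, 80.0, 100.0]    # processor P
Q = [40.0, 100.0, 200.0]   # processor Q

spec_p = spec_ratio_mean(R, P)
spec_q = spec_ratio_mean(R, Q)

# Geometric mean of Q_i / P_i, computed without any reference processor.
direct = spec_ratio_mean(Q, P)

print(f"SPEC(P) = {spec_p:.3f}, SPEC(Q) = {spec_q:.3f}")
print(f"SPEC(P) / SPEC(Q) = {spec_p / spec_q:.3f}, direct = {direct:.3f}")
```

The last two printed values agree, which is why the reference processor drops out when two SPEC values are compared.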

Impact of Instruction Set Execution Time = Number of Instructions × Mean Instruction Execution Time –Gave birth to the idea of more complex instruction sets Each does more Fewer instructions

Impact of Clock Speed Execution Time = Number of Clock Cycles × Clock Cycle Time same as Execution Time = Number of Clock Cycles / Clock Frequency

Putting everything together Execution Time = Number of Instructions × Number of Clock Cycles per Instruction × Clock Cycle Time Gives us three ways to reduce program execution time
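The three factors can be explored with a small calculation. This is a minimal sketch of the formula above; the function name `execution_time` and the instruction counts, CPI values, and clock rates are made-up numbers, not measurements of any real processor.

```python
def execution_time(instruction_count, cycles_per_instruction, clock_hz):
    """Execution time in seconds = instructions x CPI x clock cycle time."""
    clock_cycle_time = 1.0 / clock_hz
    return instruction_count * cycles_per_instruction * clock_cycle_time

# Hypothetical baseline: 2 billion instructions, CPI of 1.5, 2 GHz clock.
baseline = execution_time(2e9, 1.5, 2e9)

# The three ways to reduce execution time, one factor changed at a time.
fewer_instructions = execution_time(1.5e9, 1.5, 2e9)
faster_clock = execution_time(2e9, 1.5, 3e9)
lower_cpi = execution_time(2e9, 1.0, 2e9)

print(f"baseline:           {baseline:.3f} s")
print(f"fewer instructions: {fewer_instructions:.3f} s")
print(f"faster clock:       {faster_clock:.3f} s")
print(f"lower CPI:          {lower_cpi:.3f} s")
```

In practice the factors are not independent: a richer instruction set may lower the instruction count while raising the cycles per instruction.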

1. Using fewer instructions VAX –Super minicomputer designed in late 70’s –Had a complicated instruction set (CISC) –Idea was to use more powerful instructions in order to reduce the number of instructions used to perform most frequent tasks –Poor pipelining performance

2. Using a faster clock Major reason for explosion of CPU performance in the 80’s and 90’s –IBM PC (1981): Intel 8088 at 4.77 MHz –IBM PC AT (1984): Intel 80286 at 6 and 8 MHz – Nowadays up to 3 GHz Cannot get much higher!

3. Using better instructions Best strategy is to reduce the average number of clock cycles per instruction –Privileging fast instructions –Using fixed-size instructions to allow pipelining –Trying to execute as many tasks as possible in parallel

Amdahl’s Law (I) Examples: –Supersonic jet Could fly from Houston to Washington in thirty minutes Total travel time would be dominated by travel time to the airport and check-in procedures –Today's laptops: Disk access times are the bottleneck

Amdahl’s Law (II) Assume that we have a technique for improving the performance of some part of a system. Let –$T_o$ be the time originally spent in the part of the system that can be improved –$T_i$ be the time spent in that part once the improvement has been applied –$T_n$ be the time spent in the part of the system that remains unaffected

Amdahl’s Law (III) The total speedup for the whole system will be $S = \frac{T_o + T_n}{T_i + T_n}$ The maximum possible speedup, reached when $T_i \to 0$, is $S_{\max} = \frac{T_o + T_n}{T_n}$

An example Flying to Washington National Airport takes three hours Going to the airport and waiting for the flight takes a minimum of two hours Going from the airport to downtown Washington takes a minimum of 30 minutes What is the maximum speedup that could be achieved using much faster planes? 5h30 / 2h30 = 2.2

Answer Current travel time: –To airport and wait: 2 hours –Plane: 3 hours –To downtown by DC metro: 30 minutes –Total: 5 hours 30 minutes

Answer Assume plane travels at speed of light : –To airport and wait: 2 hours –Plane: negligible –To downtown by DC metro: 30 minutes –Total: 2 hours 30 minutes Maximum speedup would be 5h30 / 2h30 = 2.2
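The travel example can be reproduced in a few lines. The sketch below assumes the usual form of Amdahl's Law, speedup = (T_o + T_n) / (T_i + T_n), using the three times defined earlier. The function names and the "plane twice as fast" case are illustrative additions, not part of the slides.

```python
def amdahl_speedup(t_o, t_i, t_n):
    """Overall speedup when the improvable part shrinks from t_o to t_i.

    t_n is the time spent in the part that is not affected.
    """
    return (t_o + t_n) / (t_i + t_n)

def max_speedup(t_o, t_n):
    """Limit of the speedup as the improved part takes no time at all."""
    return (t_o + t_n) / t_n

# Times in hours, taken from the travel example.
flight = 3.0            # part that faster planes can improve
ground = 2.0 + 0.5      # airport wait plus trip to downtown, unaffected

# Hypothetical case: a plane twice as fast halves the flight time.
twice_as_fast = amdahl_speedup(flight, flight / 2, ground)
limit = max_speedup(flight, ground)

print(f"plane twice as fast: speedup = {twice_as_fast:.3f}")
print(f"infinitely fast plane: speedup = {limit:.1f}")
```

Doubling the plane's speed only yields a speedup of about 1.4, because the 2.5 hours on the ground do not shrink.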

Trains and buses Commuter trains and city buses spend a significant amount of trip time debarking and embarking travelers – Have wide doors Not true for Amtrak trains and intercity buses – Fewer, narrower doors

A problem Assume we have a technique to improve the speed of floating-point operations by 20 percent What will be the overall CPU speedup if we expect it to spend 10 percent of its time executing floating point operations? How would that speedup be affected if the CPU spends 30 percent of its time executing floating point operations?

Solution (I) First case: –Baseline time = 0.9 × 1 + 0.1 × 1 = 1 –After improvement = 0.9 × 1 + 0.1 × 0.8 = 0.98 –Speedup = 1/0.98 = 1.02 A 2 percent improvement!

Solution (II) Second case: –Baseline time = 0.7 × 1 + 0.3 × 1 = 1 –After improvement = 0.7 × 1 + 0.3 × 0.8 = 0.94 –Speedup = 1/0.94 = 1.064 A 6.4 percent improvement!
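Both cases follow the same pattern and can be computed with one helper. The sketch below follows the solution's reading of "20 percent faster", namely that floating-point time drops to 0.8 of its original value. The function name `overall_speedup` and its parameter names are hypothetical.

```python
def overall_speedup(improved_fraction, remaining_time_factor):
    """Speedup when a fraction of the run time is scaled by a factor.

    improved_fraction: share of the baseline time spent in the improved part.
    remaining_time_factor: new time / old time for that part (0.8 here).
    """
    baseline = 1.0
    after = (1.0 - improved_fraction) + improved_fraction * remaining_time_factor
    return baseline / after

# Assumption taken from the solution: floating-point time shrinks to 0.8x.
for fp_share in (0.10, 0.30):
    s = overall_speedup(fp_share, 0.8)
    print(f"FP share {fp_share:.0%}: speedup = {s:.3f} "
          f"({(s - 1) * 100:.1f} percent improvement)")
```

Tripling the share of floating-point work roughly triples the gain, yet the overall improvement stays well below the 20 percent applied to the floating-point part alone.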

REVIEW PROBLEMS

Problem Consider a huge program that consists of a purely sequential part that takes two hours and another part that takes eight hours. What is the maximum speedup we can achieve by parallelizing the second part of the program?

Answer Current run time: –Sequential part: 2 hours –Other part: 8 hours –Total: 10 hours Minimum run time: –Sequential part: 2 hours –Other part: negligible –Total: 2 hours Maximum speedup: 10/2 = 5
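To see how quickly the limit of 5 is approached, the run time can be tabulated for a growing number of processors. This sketch assumes the eight-hour part divides perfectly among the processors with no overhead, which is an idealization the problem does not state. The function name `parallel_speedup` is hypothetical.

```python
def parallel_speedup(sequential_hours, parallel_hours, processors):
    """Speedup if the parallel part is split evenly over the processors."""
    original = sequential_hours + parallel_hours
    improved = sequential_hours + parallel_hours / processors
    return original / improved

# 2 sequential hours and 8 parallelizable hours, as in the problem.
for n in (1, 2, 4, 8, 64, 1024):
    print(f"{n:5d} processors: speedup = {parallel_speedup(2.0, 8.0, n):.3f}")
```

Even with 1024 processors the speedup stays just under 5, since the two sequential hours never shrink.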

Problem Server motherboard A has a SPEC CPU2006 rating of 31.4 while server motherboard B has a rating of Which one of the two motherboards is faster?

Answer Server motherboard A has a SPEC CPU2006 rating of 31.4 while server motherboard B has a rating of Which one of the two motherboards is faster? Motherboard A because a higher SPEC value is better

Fun problem Shanghai maglev train runs at 268 mph How does it compare to airplane for going between Houston and Washington, DC?

Fun answer Current travel time: –To airport and wait: 2 hours –Plane: 3 hours –To downtown by DC metro: 30 minutes –Total: 5 hours 30 minutes With maglev: –To station: one hour –Train to downtown DC: 6 hours 30 minutes –Total: 7 hours 30 minutes Plane is still faster for very long trips