Ch1. Fundamentals of Computer Design: 1. Formulas. ECE562 Advanced Computer Architecture, Prof. Honggang Wang, ECE Department, University of Massachusetts Dartmouth, 285 Old Westport Rd., North Dartmouth, MA. Slides based on the PowerPoint presentations created by David Patterson as part of the Instructor Resources for the textbook by Hennessy & Patterson; updated by Honggang Wang.
Outline Technology Trends Careful, quantitative comparisons: Formulas 1.Define and quantify power 2.Define and quantify relative cost 3.Define and quantify dependability 2
Moore’s Law: 2X transistors / “year”. “Cramming More Components onto Integrated Circuits” – Gordon Moore, Electronics, 1965. # of transistors on a cost-effective integrated circuit doubles every N months (12 ≤ N ≤ 24). 3
Technology Performance Trends Drill down into 4 technologies: – Disks, – Memory, – Network, – Processors Compare ~1980 Archaic vs. ~2000 Modern – Performance Milestones in each technology Bandwidth vs. Latency improvements in performance over time Bandwidth: number of events per unit time – E.g., M bits / second over network, M bytes / second from disk Latency: elapsed time for a single event – E.g., one-way network delay in microseconds, average disk access time in milliseconds 4
Disks: Archaic v. Modern. CDC Wren I, 1983 (Sunstar): 3600 RPM; 0.03 GBytes capacity; Tracks/Inch: 800; Bits/Inch: 9550; Three 5.25” platters; Bandwidth: 0.6 MBytes/sec; Latency: 48.3 ms; Cache: none. Seagate: 15000 RPM (4X); 73.4 GBytes (2500X); Tracks/Inch: 64,000 (80X); Bits/Inch: 533,000 (60X); Four 2.5” platters (in 3.5” form factor); Bandwidth: 86 MBytes/sec (140X); Latency: 5.7 ms (8X); Cache: 8 MBytes. 5
Latency Lags Bandwidth (for last ~20 years) Performance Milestones. Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x). (latency = simple operation w/o contention; BW = best-case) 6
Memory: Archaic v. Modern. Archaic: 1980 DRAM (asynchronous); 0.06 Mbits/chip; 16-bit data bus per module, 16 pins/chip; 13 MBytes/sec; Latency: 225 ns; (no block transfer). Modern: 2000 Double Data Rate Synchronous (clocked) DRAM; 256 Mbits/chip (4000X); 64-bit data bus per DIMM, 66 pins/chip (4X); 1600 MBytes/sec (120X); Latency: 52 ns (4X); Block transfers (page mode). 7
Latency Lags Bandwidth (last ~20 years) Performance Milestones. Memory Module: 16-bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x). Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x). (latency = simple operation w/o contention; BW = best-case) 8
LANs: Archaic v. Modern. Archaic Ethernet: Year of Standard: ; 10 Mbits/s link speed; Latency: 3000 μsec; Shared media; Coaxial cable. Modern Ethernet 802.3ae: Year of Standard: ; 10,000 Mbits/s (1000X) link speed; Latency: 190 μsec (15X); Switched media; Category 5 copper wire. Coaxial Cable: Copper core, Insulator, Braided outer conductor, Plastic Covering. Twisted Pair: “Cat 5” is 4 twisted pairs in a bundle; copper, 1mm thick, twisted to avoid antenna effect. 9
Latency Lags Bandwidth (last ~20 years) Performance Milestones. Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x). Memory Module: 16-bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x). Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x). (latency = simple operation w/o contention; BW = best-case) 10
CPUs: Archaic v. Modern. Archaic: 1982 Intel 80286; 12.5 MHz; 2 MIPS (peak); Latency 320 ns; 16-bit data bus, 68 pins; Microcode interpreter, separate FPU chip; (no caches). Modern: 2001 Intel Pentium 4; 1500 MHz (120X); 4500 MIPS (peak) (2250X); Latency 15 ns (20X); 64-bit data bus, 423 pins; 3-way superscalar, Dynamic translate to RISC, Superpipelined (22 stage), Out-of-Order execution; On-chip 8KB Data cache, 96KB Instr. Trace cache, 256KB L2 cache. 11
Latency Lags Bandwidth (last ~20 years) Performance Milestones. Processor: ‘286, ‘386, ‘486, Pentium, Pentium Pro, Pentium 4 (21x, 2250x). Ethernet: 10Mb, 100Mb, 1000Mb, 10000 Mb/s (16x, 1000x). Memory Module: 16-bit plain DRAM, Page Mode DRAM, 32b, 64b, SDRAM, DDR SDRAM (4x, 120x). Disk: 3600, 5400, 7200, 10000, 15000 RPM (8x, 143x). CPU high, Memory low (“Memory Wall”) 12
Rule of Thumb for Latency Lagging BW In the time that bandwidth doubles, latency improves by no more than a factor of 1.2 to 1.4 Stated alternatively: Bandwidth improves by more than the square of the improvement in Latency 13
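Expressed as a relation (a paraphrase of the rule above, checked against the milestone ratios quoted on the preceding slides):

\[
\frac{\mathrm{BW}_{\mathrm{new}}}{\mathrm{BW}_{\mathrm{old}}} \gtrsim \left(\frac{\mathrm{Latency}_{\mathrm{old}}}{\mathrm{Latency}_{\mathrm{new}}}\right)^{2}
\]

e.g., disk 143x vs. 8² = 64; network 1000x vs. 16² = 256; memory 120x vs. 4² = 16; processor 2250x vs. 21² = 441.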
6 Reasons Latency Lags Bandwidth. 1. Moore’s Law helps BW more than latency. Faster transistors, more transistors, more pins help Bandwidth: MPU Transistors: 0.130 vs. 42 M xtors (300X); DRAM Transistors: 0.064 vs. 256 M xtors (4000X); MPU Pins: 68 vs. 423 pins (6X); DRAM Pins: 16 vs. 66 pins (4X). Smaller, faster transistors, but they communicate over (relatively) longer lines, which limits latency: Feature size: 1.5 to 3 vs. 0.18 micron (8X, 17X); MPU Die Size: 35 vs. 204 mm² (sqrt of ratio ⇒ ~2X); DRAM Die Size: 47 vs. 217 mm² (sqrt of ratio ⇒ ~2X). 14
6 Reasons Latency Lags Bandwidth. 2. Distance limits latency improvement: size of DRAM block ⇒ long bit and word lines ⇒ increased DRAM access time. 3. Bandwidth easier to sell (“bigger = better”): e.g., 10 Gbits/s Ethernet (“10 Gig”) vs. 10 μsec latency Ethernet; 4400 MB/s DIMM (“PC4400”) vs. 50 ns latency. Even if just marketing, customers are now trained. Since bandwidth sells, more resources are thrown at bandwidth, which further tips the balance. 15
6 Reasons Latency Lags Bandwidth. 4. Latency helps BW, but not vice versa: spinning the disk faster improves both bandwidth and rotational latency: 3600 RPM ⇒ 15000 RPM = 4.2X; average rotational latency: 8.3 ms ⇒ 2.0 ms; other things being equal, this also helps BW by 4.2X (see the sketch below). Lower DRAM latency ⇒ more accesses/second (higher bandwidth). Higher linear density helps disk BW (and capacity), but not disk Latency. 16
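As a quick check on the rotational-latency numbers above, average rotational latency is half a revolution (a standard relation, not taken from the slide itself):

\[
t_{\mathrm{rot}} = \frac{1}{2}\cdot\frac{60\ \mathrm{s}}{\mathrm{RPM}},\qquad
t_{3600} = \frac{1}{2}\cdot\frac{60}{3600}\ \mathrm{s} \approx 8.3\ \mathrm{ms},\qquad
t_{15000} = \frac{1}{2}\cdot\frac{60}{15000}\ \mathrm{s} = 2.0\ \mathrm{ms}
\]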
6 Reasons Latency Lags Bandwidth (cont’d) 5. Bandwidth hurts latency Queues help Bandwidth, hurt Latency (Queuing Theory) Adding chips to widen a memory module increases Bandwidth but higher fan-out on address lines may increase Latency 6. Operating System overhead hurts Latency more than Bandwidth Long messages amortize overhead; overhead bigger part of short messages 17
Summary of Technology Trends. For disk, LAN, memory, and microprocessor, bandwidth improves by roughly the square of the latency improvement – in the time that bandwidth doubles, latency improves by no more than 1.2X to 1.4X. Lags are probably even larger in real systems, as bandwidth gains are multiplied by replicated components: Multiple processors in a cluster or even in a chip; Multiple disks in a disk array; Multiple memory modules in a large memory; Simultaneous communication in switched LAN. HW and SW developers should innovate assuming Latency Lags Bandwidth: If everything improves at the same rate, then nothing really changes; When rates vary, real innovation is required. 18
Outline Review Technology Trends: Careful, quantitative comparisons: Formulas 1.Define and quantify power 2.Define and quantify relative cost 3.Define and quantify dependability 19
Define and quantify power (1/2). For CMOS chips, the traditionally dominant energy consumption has been in switching transistors, called dynamic power. The capacitive load is a function of the number of transistors connected to an output and of the technology, which determines the capacitance of wires and transistors. Dropping voltage helps both power and energy, e.g., supply voltage dropped from 5V to 1V. For mobile devices, energy is the better metric. For a fixed task, slowing the clock rate (frequency switched) reduces power, but not energy. To save energy & dynamic power, most CPUs now turn off the clock of inactive modules. 20
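The dynamic power and energy relations these bullets rely on (the standard CMOS formulas, as echoed on the Summary slide) are:

\[
\mathrm{Power}_{\mathrm{dynamic}} = \tfrac{1}{2}\times \mathrm{Capacitive\ load}\times \mathrm{Voltage}^{2}\times \mathrm{Frequency\ switched}
\]
\[
\mathrm{Energy}_{\mathrm{dynamic}} = \mathrm{Capacitive\ load}\times \mathrm{Voltage}^{2}
\]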
Example of quantifying power Suppose 15% reduction in voltage results in a 15% reduction in frequency. What is impact on dynamic power? 21
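A worked sketch of the answer, assuming dynamic power scales as capacitance × voltage² × frequency:

\[
\frac{\mathrm{Power}_{\mathrm{new}}}{\mathrm{Power}_{\mathrm{old}}}
= \frac{(0.85\,V)^{2}\times(0.85\,F)}{V^{2}\times F}
= 0.85^{3} \approx 0.61
\]

so dynamic power drops by roughly 39%.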
Define and quantify power (2/2). Because leakage current flows even when a transistor is off, static power is now important too. Leakage current increases in processors with smaller transistor sizes. Increasing the number of transistors increases power even if they are turned off. In 2006, the goal for leakage was 25% of total power consumption; high-performance designs were at 40%. Very low power systems even gate the voltage to inactive modules to control loss due to leakage. 22
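The corresponding relation for static power (the standard leakage formula, not shown explicitly on the slide) is:

\[
\mathrm{Power}_{\mathrm{static}} = \mathrm{Current}_{\mathrm{static}}\times \mathrm{Voltage}
\]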
Outline Review Technology Trends: Culture of tracking, anticipating and exploiting advances in technology Careful, quantitative comparisons: Formulas 1.Define and quantify power 2.Define and quantify relative cost 3.Define and quantify dependability 23
Define and quantify relative cost. To understand the cost of computers, one must understand the cost of an IC, where: 1. Cost of die = Cost of wafer / (Dies per wafer × Die Yield); 2. Testing cost; 3. Packaging cost. Critical question: what is the fraction of good dies on a wafer (Die Yield)? Assume the wafer yield is 100%; the parameter in the yield formula corresponds roughly to the # of critical masking levels, a measure of manufacturing complexity, estimated at 4.0 for multilevel metal CMOS processes.
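The die-yield formula referred to above (the textbook's form; the complexity parameter is written here as α, a symbol assumed for this sketch) is:

\[
\mathrm{Die\ yield} = \mathrm{Wafer\ yield}\times\left(1 + \frac{\mathrm{Defects\ per\ unit\ area}\times \mathrm{Die\ area}}{\alpha}\right)^{-\alpha}
\]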
Define and quantify relative cost (cont). Example: Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a side, assuming a defect density of 0.4 per cm² and a complexity parameter of 4. Answer: For the larger die the die area is 1.5 × 1.5 = 2.25 cm²; for the smaller die, substitute 2.25 with 1.00. (The worked numbers follow below.)
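Plugging these values into the die-yield formula above gives (a worked sketch of the answer):

\[
\mathrm{Larger\ die:}\ \left(1 + \frac{0.4\times 2.25}{4}\right)^{-4} = 1.225^{-4} \approx 0.44
\qquad
\mathrm{Smaller\ die:}\ \left(1 + \frac{0.4\times 1.00}{4}\right)^{-4} = 1.1^{-4} \approx 0.68
\]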
Outline Review Technology Trends: Culture of tracking, anticipating and exploiting advances in technology Careful, quantitative comparisons: Formulas 1.Define and quantify power 2.Define and quantify relative cost 3.Define and quantify dependability 26
Define and quantify dependability (1/3) How do we decide when a system is operating properly? Infrastructure providers now offer Service Level Agreements (SLAs) to guarantee that their networking or power service will be dependable. Systems alternate between 2 states of service with respect to an SLA: 1. Service accomplishment, where the service is delivered as specified in the SLA 2. Service interruption, where the delivered service is different from the SLA. Failure = transition from state 1 to state 2; Restoration = transition from state 2 to state 1. 27
Define and quantify dependability (2/3) Module reliability = measure of continuous service accomplishment (or time to failure). 2 metrics: 1. Mean Time To Failure (MTTF) measures Reliability 2. Failures In Time (FIT) = 1/MTTF, the rate of failures, traditionally reported as failures per billion hours of operation. Mean Time To Repair (MTTR) measures Service Interruption – Mean Time Between Failures (MTBF) = MTTF + MTTR. Module availability measures service as it alternates between the 2 states of accomplishment and interruption (a number between 0 and 1, e.g. 0.9): Module availability = MTTF / (MTTF + MTTR). 28
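A small illustrative calculation (the MTTR value here is assumed for illustration, not from the slide): a module with MTTF = 1,000,000 hours and MTTR = 24 hours has

\[
\mathrm{FIT} = \frac{10^{9}}{1{,}000{,}000} = 1000,\qquad
\mathrm{Availability} = \frac{1{,}000{,}000}{1{,}000{,}000 + 24} \approx 0.99998
\]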
Example calculating reliability If modules have exponentially distributed lifetimes (age of module does not affect probability of failure), overall failure rate is the sum of failure rates of the modules Calculate FIT and MTTF for 10 disks (1M hour MTTF per disk), 1 disk controller (0.5M hour MTTF), and 1 power supply (0.2M hour MTTF): 29
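A worked sketch of the answer, assuming the component failure rates simply add as stated above:

\[
\mathrm{FailureRate}_{\mathrm{system}} = 10\times\frac{1}{1{,}000{,}000} + \frac{1}{500{,}000} + \frac{1}{200{,}000}
= \frac{10 + 2 + 5}{1{,}000{,}000\ \mathrm{hours}} = 17{,}000\ \mathrm{FIT}
\]
\[
\mathrm{MTTF}_{\mathrm{system}} = \frac{10^{9}}{17{,}000\ \mathrm{FIT}} \approx 59{,}000\ \mathrm{hours}
\]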
Outline Review Technology Trends: Culture of tracking, anticipating and exploiting advances in technology Careful, quantitative comparisons: Formulas 1.Define and quantify power 2.Define and quantify relative cost 3.Define and quantify dependability 31
Summary. Expect Bandwidth in disks, DRAM, network, and processors to improve by at least as much as the square of the improvement in Latency. Quantify Cost (vs. Price) – IC cost ≈ f(Area²) + learning curve, volume, commodity, margins. Quantify dynamic and static power – Capacitance × Voltage² × frequency; Energy vs. power. Quantify dependability – Reliability (MTTF vs. FIT), Availability (MTTF / (MTTF + MTTR)). 32
Appendix 33
Disk Concept A platter from a 5.25" hard disk, with 20 concentric tracks drawn over the surface. Each track is divided into 16 imaginary sectors 34