Slide 1 Fundamentals of Computer Design CSCE430/830 Computer Architecture Instructor: Hong Jiang Courtesy of Prof. Yifeng U. of Maine Fall, 2007.

Presentation transcript:

Slide 1 Fundamentals of Computer Design CSCE430/830 Computer Architecture Instructor: Hong Jiang Courtesy of Prof. Yifeng U. of Maine Fall, 2007 Portions of these slides are derived from: Dave Patterson © UCB

Slide 2 Motivations and Introduction Phenomenal growth in computer industry/technology: X2/18mo in 20yr. => multi-GFLOPs processors, largely due to –Micro-electronics technology –Computer Design innovations. We have come a long way in a short time of 60 years since the 1st general purpose computer in 1946: Instruction Set Architecture: An Introduction

Slide 3 Motivations and Introduction Past (Milestones): –First electronic computer ENIAC in 1946: 18,000 vacuum tubes, 3,000 cubic feet, 20 2-foot 10-digit registers, 5 KIPs (thousand additions per second); –First microprocessor (a CPU on a single IC chip) Intel 4004 in 1971: 2,300 transistors, 60 KIPs, $200; –Virtual elimination of assembly language programming reduced the need for object-code compatibility; –The creation of standardized, vendor-independent operating systems, such as UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture; –RISC instruction set architecture paved the way for drastic design innovations that focused on two critical performance techniques: instruction-level parallelism and use of caches.

Slide 4 Motivations and Introduction Present (State of the art): –Microprocessors approaching/surpassing 10 GFLOPS; –A high-end microprocessor ( $10million) ten years ago; –While technology advancement contributes a sustained annual growth of 35%, innovative computer design accounts for another 25% annual growth rate => a factor of 15 in performance gains!

Slide 5 Technology Trend Big Fish Eating Little Fish In reality:

Slide 6 Technology Trend 1988 Computer Food Chain [Figure: PC, Workstation, Minicomputer, Mainframe, Mini-supercomputer, Supercomputer, Massively Parallel Processors]

Slide 7 Technology Trend 1998 Computer Food Chain [Figure: PC, Workstation, Server, Minicomputer, Mainframe, Mini-supercomputer, Supercomputer, Clusters] Now who is eating whom?

Slide 8 Parallel Computing Architectures in Top Nov [Figure: Symmetric Multiprocessing (SMP) with CPUs on a memory bus/crossbar; Massively Parallel Processor (MPP) with CPUs and memories (M); PC network cluster] Supercomputer Trends in Top 500 [Figure: MPP, Cluster, SMP, Constellations, SIMD, Single processor]

Slide 9 Why Such Changes in 10 years? Performance –Technology Advances »CMOS VLSI dominates older technologies (TTL, ECL) in cost AND performance –Computer architecture advances improve low-end »RISC, superscalar, RAID, … Price: Lower costs due to … –Simpler development »CMOS VLSI: smaller systems, fewer components –Higher volumes »CMOS VLSI: same dev. cost 10,000 vs. 10,000,000 units –Lower margins by class of computer, due to fewer services Function –Rise of networking/local interconnection technology

Slide 10 Amazing Underlying Technology Change In 1965, Gordon Moore sketched out his prediction of the pace of silicon technology. Moore's Law: The number of transistors incorporated in a chip will approximately double every 24 months. Decades later, Moore's Law remains true. From Intel

Slide 11 Technology Trends: Moore's Law Gordon Moore (co-founder of Intel) observed in 1965 that the number of transistors on a chip doubles about every 24 months. In fact, the number of transistors on a chip doubles about every 18 months. From Intel

Slide 12 Technology Trends CPU speed has increased dramatically, but memory and disk speeds have increased only a little. This has led to dramatic changes in architecture, operating systems, and programming practices.

Slide 13 Technology => dramatic change Processor –transistor number in a chip: about 55% per year –clock rate: about 20% per year Memory –DRAM capacity: about 60% per year (4x every 3 years) –Memory speed: about 10% per year –Cost per bit: improves about 25% per year Disk –capacity: about 60% per year –Total use of data: 100% per 9 months! Network Bandwidth –10 years: 10Mb => 100Mb –5 years: 100Mb => 1Gb
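The yearly rates on this slide compound, which is how "about 60% per year" becomes "4x every 3 years". The sketch below only illustrates that arithmetic; the function names `growth_factor` and `annualized` are hypothetical helpers, not part of the slides.

```python
def growth_factor(annual_rate: float, years: float) -> float:
    """Cumulative multiplier after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years


def annualized(multiplier: float, months: float) -> float:
    """Equivalent per-year multiplier for a `multiplier` achieved every `months` months."""
    return multiplier ** (12.0 / months)


# DRAM capacity: 60% per year compounds to roughly 4x over 3 years.
dram_3yr = growth_factor(0.60, 3)

# Total use of data: doubling (100% growth) every 9 months is about 2.5x per year.
data_per_year = annualized(2.0, 9)

print(f"DRAM capacity over 3 years: {dram_3yr:.2f}x")
print(f"Data use per year: {data_per_year:.2f}x")
```

Comparing `growth_factor(0.20, 10)` for clock rate against `growth_factor(0.10, 10)` for memory speed shows how quickly the two diverge over a decade.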

Slide 14 Technology => dramatic change From IBM

Slide 15 Computer Architecture Is … "the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." Amdahl, Blaauw, and Brooks, 1964

Slide 16 Computer Architecture's Changing Definition 1950s to 1960s: Computer Architecture Course: Computer Arithmetic. 1970s to mid 1980s: Computer Architecture Course: Instruction Set Design, especially ISA appropriate for compilers. 1990s: Computer Architecture Course: Design of CPU, memory system, I/O system, Multiprocessors, Networks. 2010s: Computer Architecture Course: Self adapting systems? Self organizing structures? DNA Systems/Quantum Computing?

Slide 17 CSCE430/830 Course Focus Understanding the design techniques, machine structures, technology factors, evaluation methods that will determine the form of computers in the 21st Century. [Figure: Computer Architecture (Instruction Set Design, Organization, Hardware/Software Boundary) surrounded by Technology, Programming Languages, Operating Systems, History, Applications, Interface Design (ISA), Measurement & Evaluation, Parallelism, Compilers]

Slide 18 Computer Engineering Methodology Architecture design is an iterative process: Searching the space of possible designs at all levels of computer systems. [Figure: cycle of Evaluate Existing Systems for Bottlenecks, Simulate New Designs and Organizations, Implement Next Generation System, with inputs Technology Trends, Benchmarks, Workloads, Implementation Complexity]

Slide 19 Summary 1. Moore's Law: The number of transistors incorporated in a chip will approximately double every 18 months. 2. CPU speed increases dramatically, but the speed of memory, disk and network increases slowly. 3. Architecture design is an iterative process. Measure performance: Benchmarks

Slide 20 Quantitative Principles Performance Metrics: How do we conclude that System-A is “better” than System-B? Amdahl's Law: Relates total speedup of a system to the speedup of some portion of that system. Topics: (Sections 1.1, 1.2, 1.5, 1.6) – Metrics for different market segments – Benchmarks to measure performance – Quantitative principles of computer design

Slide 21 Importance of Measurement Architecture design is an iterative process: Search the possible design space, make selections, evaluate the selections made. [Figure: Cost / Performance Analysis sorting designs into Good Ideas, Mediocre Ideas, Bad Ideas] Good measurement tools are required to accurately evaluate the selection.

Slide 22 Two notions of “performance” Plane: Boeing 747, BAC/Sud Concorde. Speed: 610 mph, 1350 mph. DC to Paris: 6.5 hours, 3 hours. Passengers Throughput (pmph) 286, ,200 Time to do the task (Execution Time) – execution time, response time, latency, etc. Tasks per day, hour, week, sec, ns... (Performance) – throughput, bandwidth, etc. Which has higher performance?

Slide 23 Performance Definitions Performance is in units of things-per-second. –bigger is better Execution time is the reciprocal of performance. –performance(x) = 1 / execution_time(x) "X is n times faster than Y" means n = execution_time(Y) / execution_time(X) = performance(X) / performance(Y). When is throughput more important than execution time? When is execution time more important than throughput?

Slide 24 Performance Terminology “X is n% faster than Y” means: ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y) = 1 + n/100, so n = 100 x (Performance(X) - Performance(Y)) / Performance(Y). Example: Y takes 15 seconds to complete a task, X takes 10 seconds. What % faster is X than Y? n = 100 x (ExTime(Y) - ExTime(X)) / ExTime(X) = 100 x (15 - 10) / 10 = 50%
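The "n% faster" definition can be checked numerically with the slide's own example (Y takes 15 seconds, X takes 10 seconds). This is a minimal sketch; the function names `times_faster` and `percent_faster` are illustrative and not from the slides.

```python
def times_faster(extime_x: float, extime_y: float) -> float:
    """Ratio n such that "X is n times faster than Y": ExTime(Y) / ExTime(X)."""
    return extime_y / extime_x


def percent_faster(extime_x: float, extime_y: float) -> float:
    """Percentage n such that "X is n% faster than Y"."""
    return 100.0 * (extime_y - extime_x) / extime_x


ratio = times_faster(10.0, 15.0)        # ExTime(Y) / ExTime(X)
percent = percent_faster(10.0, 15.0)    # 100 * (15 - 10) / 10

print(f"X is {ratio:.2f} times faster than Y, i.e. {percent:.0f}% faster")
```

Note that the percentage is taken relative to X's execution time, which is the same as taking it relative to Y's performance.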

Slide 25 Quantitative Design: Amdahl's Law Amdahl's Law gives a quick way to find the speedup from some enhancement. Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Speedup due to enhancement E: Speedup(E) = ExTime without E / ExTime with E = 1 / ((1 - F) + F/S)

Slide 26 Quantitative Design: Amdahl's Law ExTime new = ExTime old x ((1 - Fraction enhanced) + Fraction enhanced / Speedup enhanced) Speedup overall = ExTime old / ExTime new = 1 / ((1 - Fraction enhanced) + Fraction enhanced / Speedup enhanced)

Slide 27 Pictorial Depiction of Amdahl's Law Enhancement E accelerates fraction F of original execution time by a factor of S. Before: Execution Time without enhancement E (shown normalized to 1 = (1 - F) + F): Unaffected fraction: (1 - F), Affected fraction: F. After: Execution Time with enhancement E: Unaffected fraction: (1 - F) (unchanged), Affected fraction: F/S. Speedup(E) = Execution Time without enhancement E / Execution Time with enhancement E = 1 / ((1 - F) + F/S)

Slide 28 Quantitative Design: Amdahl's Law Floating point (FP) instructions improved to run 2X; but only 10% of actual instructions are FP. Suppose the old execution time is ExTime old. What are the current execution time and speedup? ExTime new = ExTime old x (0.9 + 0.1/2) = 0.95 x ExTime old. Speedup overall = ExTime old / ExTime new = 1 / ((1 - Fraction enhanced) + Fraction enhanced / Speedup enhanced) = 1 / ((1 - 0.1) + 0.1/2) = 1 / 0.95 = 1.053
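The floating-point example can be reproduced with a few lines of code. The sketch below assumes only the formula stated on the slides; `amdahl_speedup` is a hypothetical helper name, and the input checks are an added convenience.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution time
    is accelerated by a factor of `speedup_enhanced` (Amdahl's Law)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Slide example: FP instructions run 2x faster, but only 10% of the work is FP.
fp_speedup = amdahl_speedup(0.10, 2.0)

# Upper bound for the same fraction: even an enormous FP speedup cannot beat 1 / (1 - F).
fp_limit = amdahl_speedup(0.10, 1e12)

print(f"Overall speedup: {fp_speedup:.3f}")
print(f"Limit as S grows: {fp_limit:.3f}")
```

The second call shows why the common case matters: with only 10% of the work affected, the overall speedup stays below about 1.11 no matter how fast the FP unit becomes.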

Slide 29 Computer Clocks A computer clock runs at a constant rate and determines when events take place in hardware. The clock cycle time is the amount of time for one clock period to elapse (e.g. 5 ns). The clock rate is the inverse of the clock cycle time. For example, if a computer has a clock cycle time of 5 ns, the clock rate is: 1 / (5 x 10^-9 sec) = 200 MHz
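The reciprocal relationship between clock cycle time and clock rate is easy to verify with the slide's 5 ns example. This is only a sketch; the helper names are hypothetical.

```python
def clock_rate_hz(cycle_time_s: float) -> float:
    """Clock rate in Hz for a given clock cycle time in seconds."""
    return 1.0 / cycle_time_s


def cycle_time_s(rate_hz: float) -> float:
    """Clock cycle time in seconds for a given clock rate in Hz."""
    return 1.0 / rate_hz


rate_mhz = clock_rate_hz(5e-9) / 1e6     # 5 ns cycle time, expressed in MHz
period_ns = cycle_time_s(200e6) * 1e9    # 200 MHz clock, expressed in ns

print(f"5 ns cycle time -> about {rate_mhz:.0f} MHz")
print(f"200 MHz clock   -> about {period_ns:.0f} ns per cycle")
```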

Slide 30 Computing CPU time The time to execute a given program is CPU time = CPU clock cycles for a program x clock cycle time. Since clock cycle time and clock rate are reciprocals, CPU time = CPU clock cycles for a program / clock rate. CPI: clock cycles per instruction. CPI = CPU clock cycles for a program / Instruction count

Slide 31 Computing CPU time The time to execute a given program is CPU time = CPU clock cycles for a program x clock cycle time. Since clock cycle time and clock rate are reciprocals, CPU time = CPU clock cycles for a program / clock rate. The number of CPU clock cycles can be determined by CPU clock cycles = (instructions/program) x (clock cycles/instruction) = Instruction count x CPI, which gives CPU time = Instruction count x CPI x clock cycle time and CPU time = Instruction count x CPI / clock rate. The units for this are seconds/program = (instructions/program) x (clock cycles/instruction) x (seconds/clock cycle)

Slide 32 Example of Computing CPU time If a computer has a clock rate of 2 GHz, how long does it take to execute a program with 1,000,000 instructions, if the CPI for the program is 3.5?

Slide 33 Example of Computing CPU time If a computer has a clock rate of 2 GHz, how long does it take to execute a program with 1,000,000 instructions, if the CPI for the program is 3.5? Using the equation CPU time = Instruction count x CPI / clock rate gives CPU time = 10^6 x 3.5 / (2 x 10^9) = 1.75 x 10^-3 sec. If a computer's clock rate increases from 200 MHz to 250 MHz and the other factors remain the same, how many times faster will the computer be? CPU time old / CPU time new = clock rate new / clock rate old = 250 MHz / 200 MHz = 1.25. What simplifying assumptions did we make?
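Both questions on this slide follow directly from CPU time = Instruction count x CPI / clock rate. The sketch below reruns them; `cpu_time` is a hypothetical helper, and the second calculation assumes, as the slide does, that instruction count and CPI stay the same when the clock rate changes.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds: Instruction count x CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# 1,000,000 instructions, CPI of 3.5, 2 GHz clock.
t = cpu_time(1_000_000, 3.5, 2e9)

# Raising the clock from 200 MHz to 250 MHz with IC and CPI held fixed.
speedup = cpu_time(1_000_000, 3.5, 200e6) / cpu_time(1_000_000, 3.5, 250e6)

print(f"CPU time: {t * 1e3:.2f} ms")
print(f"Speedup from 200 MHz to 250 MHz: {speedup:.2f}x")
```

In practice the held-fixed assumption is the weak point: a faster clock can raise the effective CPI when memory does not speed up with it.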

Slide 34 Performance Example Two computers M1 and M2 with the same instruction set. For a given program, we have M1: clock rate 50 MHz, CPI 2.8; M2: clock rate 75 MHz, CPI 3.2. How many times faster is M2 than M1 for this program? ExTime M1 / ExTime M2 = (IC M1 x CPI M1 / Clock Rate M1) / (IC M2 x CPI M2 / Clock Rate M2) = (2.8/50) / (3.2/75) = 1.31
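The M1 versus M2 comparison can be sketched as follows. The `Machine` class and the instruction count of 1,000,000 are illustrative assumptions; because both machines run the same program on the same instruction set, the instruction count cancels in the ratio and any positive value gives the same answer.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical record of the two parameters given on the slide."""
    name: str
    clock_rate_mhz: float
    cpi: float

    def execution_time(self, instruction_count: float) -> float:
        """Execution time in seconds: IC x CPI / clock rate."""
        return instruction_count * self.cpi / (self.clock_rate_mhz * 1e6)


m1 = Machine("M1", clock_rate_mhz=50, cpi=2.8)
m2 = Machine("M2", clock_rate_mhz=75, cpi=3.2)

ic = 1_000_000  # assumed instruction count; it cancels in the ratio below
m2_over_m1 = m1.execution_time(ic) / m2.execution_time(ic)

print(f"M2 is {m2_over_m1:.2f} times faster than M1")
```

M2 wins despite its higher CPI because its clock rate advantage (1.5x) outweighs the CPI penalty (3.2 / 2.8, about 1.14x).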

Slide 35 Aspects of CPU Performance CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle). What affects each factor (Inst Count, CPI, Cycle Time): Program: Inst Count. Compiler: Inst Count, (CPI). Inst. Set: Inst Count, CPI. Organization: CPI, Cycle Time. Technology: Cycle Time.

Slide 36 Performance Summary Two performance metrics: execution time and throughput. Amdahl's Law: Speedup(E) = Execution Time without enhancement E / Execution Time with enhancement E = 1 / ((1 - F) + F/S). When trying to improve performance, look at what occurs frequently => make the common case fast. CPU time: CPU time = Instruction count x CPI x clock cycle time; CPU time = Instruction count x CPI / clock rate