Presentation transcript:

Slide 1 Motivations and Introduction
Phenomenal growth in the computer industry/technology: 2× every 18 months over 20 years → multi-GFLOPS processors, largely due to:
–Micro-electronics technology
–Computer design innovations
We have come a long way in the short span of 56 years since the first general-purpose computer in 1946:
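As a rough check on the scale of the growth quoted on this slide, the sketch below compounds a doubling every 18 months over 20 years. It assumes the doubling is steady and exact, which the slide does not claim; the function name is illustrative.

```python
def compound_growth(doubling_period_years: float, span_years: float) -> float:
    """Total growth factor for a quantity that doubles every doubling_period_years."""
    return 2.0 ** (span_years / doubling_period_years)


# Assumption: a steady doubling every 18 months (1.5 years) for 20 years.
factor = compound_growth(1.5, 20.0)
print(f"growth over 20 years: about {factor:,.0f}x")
```

Under that assumption the total comes to roughly four orders of magnitude.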

Slide 2 Motivations and Introduction
Past (Milestones):
–First electronic computer, ENIAC, in 1946: 18,000 vacuum tubes, 3,000 cubic feet, twenty 2-foot, 10-digit registers, 5 KIPS (thousand additions per second);
–First microprocessor (a CPU on a single IC chip), the Intel 4004, in 1971: 2,300 transistors, 60 KIPS, $200;
–The virtual elimination of assembly language programming reduced the need for object-code compatibility;
–The creation of standardized, vendor-independent operating systems, such as UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture;
–The RISC instruction set architecture paved the way for drastic design innovations that focused on two critical performance techniques: instruction-level parallelism and the use of caches.

Slide 3 Motivations and Introduction
Present (State of the art):
–Microprocessors approaching/surpassing 10 GFLOPS;
–A high-end microprocessor ([…] $10 million) ten years ago;
–While technology advancement contributes a sustained annual growth of 35%, innovative computer design accounts for another 25% annual growth rate → a factor of 15 in performance gains!
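The slide does not state the time span behind the factor of 15, so the sketch below works backwards from it. It tries two readings of "another 25%", both of which are assumptions: the 25% is added to the 35% (60% total versus 35% from technology alone), or it compounds on top of it (an extra 1.25× per year). The function name is illustrative.

```python
import math


def years_to_reach(factor: float, fast_rate: float, slow_rate: float) -> float:
    """Years until growth at fast_rate leads growth at slow_rate by `factor`."""
    yearly_ratio = (1.0 + fast_rate) / (1.0 + slow_rate)
    return math.log(factor) / math.log(yearly_ratio)


# Reading 1 (assumption): 35% + 25% = 60% per year with design innovation.
additive = years_to_reach(15.0, 0.60, 0.35)

# Reading 2 (assumption): design multiplies the technology rate by 1.25 each year.
compounded = years_to_reach(15.0, 1.35 * 1.25 - 1.0, 0.35)

print(f"additive reading:   {additive:.1f} years to a 15x lead")
print(f"compounded reading: {compounded:.1f} years to a 15x lead")
```

Either way, the factor of 15 corresponds to somewhere between about 12 and 16 years of sustained growth.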

Slide 4 Motivations and Introduction
Present (State of the art):
–Three different computing markets (fig. 1.3):
»Desktop Computing: driven by price-performance (from a few hundred dollars to over $10K);
»Servers: availability driven (as distinguished from reliability), providing sustained high performance (fig. 1.2);
»Embedded Computers: the fastest-growing portion of the computer market, driven by real-time performance, with a need to minimize memory and power, as well as ASIC.

Slide 5 Motivations and Introduction
Present (State of the art):
–The Task of the Computer Designer (Fig. 1.4):
»Instruction Set Architecture (the traditional view of what computer architecture is): the boundary between software and hardware;
»Organization: the high-level aspects of a computer’s design, such as the memory system, the bus structure, and the internal design of the CPU, based on a given instruction set architecture;
»Hardware: the specifics of a machine, including its detailed logic design and packaging technology.
Future (Technology Trends):
–A truly successful instruction set architecture (ISA) should last for decades. For the ISA to survive and cope with rapidly changing technology, however, the computer architect must observe and understand those changes closely:

Slide 6 Motivations and Introduction
Future (Technology Trends):
»IC logic technology: transistor count on a chip grows at a 55% annual rate (35% density growth rate + 10% to 20% die size growth) while device speed scales more slowly;
»Semiconductor DRAM: density grows at 60% annually while cycle time improves very slowly (decreasing by one-third in ten years). Bandwidth per chip increases twice as fast as latency decreases;
»Magnetic disk technology: density has increased at a 100% annual rate since 1990 while access time improves by about a third every ten years;
»Network technology: both latency and bandwidth have been improving, with more focus on bandwidth of late; the increasing importance of networking has led to faster improvement in performance than before. Internet bandwidth doubles every year in the U.S. → challenge & opportunity for the computer designer;
»Scaling of transistor performance: while transistor density increases quadratically with a linear decrease in feature size, transistor performance increases only roughly linearly with the decrease in feature size → challenge & opportunity for the computer designer!
»Wires and power in IC: propagation delay and power needs?
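To put the annual rates on this slide on a common footing, the sketch below converts each one to a doubling time and then illustrates the quadratic-versus-linear scaling of transistor density and performance. The rates are the slide's; the 0.7× feature-size shrink is only an illustrative value, and the function names are hypothetical.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a steady annual growth rate (0.55 means 55%)."""
    return math.log(2.0) / math.log(1.0 + annual_growth)


def scaling_gains(shrink: float) -> tuple[float, float]:
    """Return (density gain, performance gain) when feature size is multiplied by `shrink`.

    Density is modeled as growing with 1/shrink**2 and performance with 1/shrink,
    following the slide's "quadratic" and "roughly linear" description.
    """
    return 1.0 / shrink**2, 1.0 / shrink


# Annual growth rates as quoted on the slide.
rates = {
    "transistors per chip": 0.55,
    "DRAM density": 0.60,
    "magnetic disk density": 1.00,
}
for name, rate in rates.items():
    print(f"{name}: doubles every {doubling_time_years(rate):.2f} years")

# Illustrative assumption: one process step shrinks feature size to 0.7x.
density_gain, perf_gain = scaling_gains(0.7)
print(f"0.7x shrink: density x{density_gain:.2f}, performance x{perf_gain:.2f}")
```

The gap between the two gains for the same shrink is the designer's problem the slide points to: there are many more transistors to use, but each one is only modestly faster.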

Slide 7 Motivations and Introduction
Cost, Price and Their Trends:
–Understanding the cost and pricing structure of the industry and market is key to cost-sensitive computer design;
–The Learning Curve: manufacturing costs decrease over time (Figs. 1.5 and 1.6), best measured by the change in yield → helps project costs over a product’s life.

Slide 8 Motivations and Introduction
Cost, Price and Their Trends:
–Cost of an IC (Fig. 1.8)

Slide 9 Motivations and Introduction
Cost, Price and Their Trends:
–Cost of an IC: the die-yield formula has been obtained empirically, where α corresponds inversely to the number of masking levels (a measure of manufacturing complexity). For today’s metal CMOS processes, α is estimated at 4.0.
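The cost and yield formulas on Slides 8 and 9 were figures and did not survive in the transcript. The sketch below uses the standard textbook model from Hennessy and Patterson, which matches the slide's description of α; treat it as an assumption about what the slides showed. The wafer size, die area, defect density, and wafer cost are illustrative values, not taken from the slides.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Wafer area divided by die area, minus dies lost along the round edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2.0 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2: float, die_area_cm2: float,
              alpha: float = 4.0, wafer_yield: float = 1.0) -> float:
    """Empirical yield model: wafer_yield * (1 + defects * area / alpha) ** -alpha."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost: float, wafer_diameter_cm: float,
             die_area_cm2: float, defects_per_cm2: float) -> float:
    """Cost of one good die: wafer cost spread over the dies that work."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2))
    return wafer_cost / good_dies


# Illustrative inputs (assumptions): 30 cm wafer, 0.49 cm^2 die,
# 0.6 defects per cm^2, and a hypothetical $5,000 wafer.
print(f"dies per wafer: {dies_per_wafer(30.0, 0.49):.0f}")
print(f"die yield:      {die_yield(0.6, 0.49):.2f}")
print(f"cost per die:   ${die_cost(5000.0, 30.0, 0.49, 0.6):.2f}")
```

Because yield falls as die area grows while dies per wafer also fall, die cost in this model rises faster than linearly with area.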

Slide 10 Motivations and Introduction
Distribution of Cost in a System
Cost vs. Price (Fig. 1.10)

Slide 11 Motivations and Introduction
Cost vs. Price (Fig. 1.10)
–Component cost (CC): the original cost from a designer’s point of view;
–Direct cost (DC, 20% of CC): the cost of making a product (labor, scrap, warranty, etc.), not including service and maintenance;
–Gross margin (GM, 33% of CC+DC): indirect cost, i.e. overhead: R&D, marketing, sales, manufacturing equipment maintenance, building rental, cost of financing, pretax profits, and taxes;
Average selling price (ASP) = CC + DC + GM
–Average discount (AD, 33% of ASP): volume discounts by manufacturers;
List price = ASP + AD
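The price build-up on this slide can be sketched as below. The code reads the slide's percentages literally, with each markup taken as a fraction of the base the slide names; if the percentages are instead meant as shares of the resulting price (for example, gross margin as 33% of ASP), the markups would need to be computed differently. The $100 component cost and the function name are illustrative.

```python
def price_breakdown(component_cost: float,
                    direct_rate: float = 0.20,
                    margin_rate: float = 0.33,
                    discount_rate: float = 0.33) -> dict[str, float]:
    """Build list price from component cost using the slide's stated bases."""
    direct_cost = direct_rate * component_cost                     # DC = 20% of CC
    gross_margin = margin_rate * (component_cost + direct_cost)    # GM = 33% of CC+DC
    asp = component_cost + direct_cost + gross_margin              # ASP = CC + DC + GM
    average_discount = discount_rate * asp                         # AD = 33% of ASP
    list_price = asp + average_discount                            # List = ASP + AD
    return {
        "component_cost": component_cost,
        "direct_cost": direct_cost,
        "gross_margin": gross_margin,
        "average_selling_price": asp,
        "average_discount": average_discount,
        "list_price": list_price,
    }


# Illustrative assumption: $100 of components.
for item, value in price_breakdown(100.0).items():
    print(f"{item:>22}: ${value:8.2f}")
```

On this reading, $100 of components ends up at a list price a little over twice the component cost.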

Slide 12 Performances & Quantitative Principles
“X is n times faster than Y” → Performance (throughput) is inversely proportional to execution time:
n = Execution time of Y / Execution time of X = Performance of X / Performance of Y
Definition of time:
–wall-clock time: response time or elapsed time;
–CPU time: the accumulated time during which the CPU is computing:
»user CPU time
»system CPU time
–An example from UNIX: 90.7u 12.9s 2:39 65%
»90.7u: user CPU time (seconds)
»12.9s: system CPU time (seconds)
»2:39 (159 sec): elapsed time
»65%: CPU time as a percentage of elapsed time
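The UNIX timing example can be checked with a few lines of arithmetic. The sketch below recomputes the 65% figure from the slide's numbers and shows the "n times faster" ratio; the 15 s and 10 s execution times are hypothetical, and the function names are illustrative.

```python
def cpu_fraction(user_s: float, system_s: float, elapsed_s: float) -> float:
    """Share of elapsed (wall-clock) time spent as user plus system CPU time."""
    return (user_s + system_s) / elapsed_s


def times_faster(exec_time_y: float, exec_time_x: float) -> float:
    """n such that X is n times faster than Y (ratio of execution times)."""
    return exec_time_y / exec_time_x


# Numbers from the slide: 90.7u 12.9s 2:39
elapsed_seconds = 2 * 60 + 39
fraction = cpu_fraction(90.7, 12.9, elapsed_seconds)
print(f"CPU time is {fraction:.0%} of elapsed time")

# Hypothetical example: Y takes 15 s and X takes 10 s on the same program.
print(f"X is {times_faster(15.0, 10.0)} times faster than Y")
```

The remaining share of the elapsed time is time the program spent waiting (for I/O or for other programs) rather than computing.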

Slide 13 Performances & Quantitative Principles
Workload Representations (in decreasing accuracy):
–Real applications: most accurate, but inflexible and poorly portable
–Modified/scripted applications: scripts to stimulate (or highlight) certain features and to enhance portability
–Kernels: extracted from real programs, good for isolating the performance of individual features of a machine
–Toy benchmarks: simple and able to run on almost all computers, good for beginning programming assignments
–Synthetic benchmarks: artificially created to match an “average” execution profile; they do not reward optimizations of behavior that occurs in real programs but is absent from the benchmark, and vice versa, so they can be misleading

Slide 14 Performances & Quantitative Principles
Benchmark Suites: collections of kernels, real programs, and benchmark programs, lessening the weakness of any one benchmark by the presence of the others (fig. 1.11).
–Desktop Benchmark Suites:
»CPU-intensive benchmarks: SPEC (Standard Performance Evaluation Corporation): SPEC89 → SPEC92 → SPEC95 → SPEC2000 (11 integer in CINT2000 and 14 floating-point in CFP2000, fig. 1.12): real programs modified for portability and to highlight the CPU
»Graphics-intensive benchmarks: SPECviewperf for systems supporting the OpenGL graphics library, SPECapc for applications with intensive use of graphics
–Server Benchmark Suites:
»CPU-throughput benchmarks: SPEC CPU2000 → SPECrate
»I/O-intensive benchmarks: SPECSFS for file servers, SPECWeb for web servers
»Transaction-processing (TP) benchmarks: TPC (Transaction Processing Performance Council): TPC-A (85) → TPC-C (complex query) → TPC-H (ad hoc decision support) → TPC-R (business decision support) → TPC-W (web-oriented)
–Embedded Benchmarks: EEMBC (“embassy suites”, fig. 1.13)
