CS 5513 Computer Architecture Lecture 2 – More Introduction, Measuring Performance.

Review: Computer Architecture brings
- Other fields often borrow ideas from architecture
- Quantitative Principles of Design:
  1. Take advantage of parallelism
  2. Principle of locality
  3. Focus on the common case
  4. Amdahl's Law
  5. The Processor Performance Equation
- Careful, quantitative comparisons:
  – Define, quantify, and summarize relative performance
  – Define and quantify relative cost
  – Define and quantify dependability
  – Define and quantify power
- Culture of anticipating and exploiting advances in technology
- Culture of well-defined interfaces that are carefully implemented and thoroughly checked

Outline
- Review
- Technology Trends
- Define, quantify, and summarize relative performance

Moore's Law: 2X transistors / "year"
- "Cramming More Components onto Integrated Circuits"
  – Gordon Moore, Electronics, 1965
- # of transistors per cost-effective integrated circuit doubles every N months (12 ≤ N ≤ 24)

Tracking Technology Performance Trends
- Drill down into 4 technologies: disks, memory, network, processors
- Compare ~1980 Archaic (Nostalgic) vs. ~2011 Modern (Newfangled)
  – Performance milestones in each technology
- Compare bandwidth vs. latency improvements in performance over time
- Bandwidth: number of events per unit time
  – E.g., Mbits/second over a network, MBytes/second from a disk
- Latency: elapsed time for a single event
  – E.g., one-way network delay in microseconds, average disk access time in milliseconds

Disks: Archaic (Nostalgic) vs. Modern (Newfangled)
- CDC Wren I, 3600 RPM
  – 0.03 GBytes capacity
  – Tracks/Inch: 800
  – Bits/Inch: 9,550
  – Three 5.25" platters
  – Bandwidth: 0.6 MBytes/sec
  – Latency: 48.3 ms
  – Cache: none
- Seagate ST…AS, 7200 RPM (2X)
  – 2 TB capacity (66,666X)
  – Tracks/Inch: 1,800,000 (2250X)
  – Bits/Inch: 14,924,000 (1562X)
  – Five 2.5" platters (in 3.5" form factor)
  – Bandwidth: 149 MBytes/sec (248X)
  – Latency: 4.2 ms (11.5X)
  – Cache: 64 MBytes
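The improvement factors quoted in parentheses are just ratios of the old and new figures; the same check applies to the memory and CPU slides that follow. A minimal sketch (Python; numbers taken from this slide):

```python
# Quick check of the disk ratios above: bandwidth improved far more
# than latency over ~28 years ("latency lags bandwidth").
old_bw, new_bw = 0.6, 149.0    # MBytes/sec: CDC Wren I vs. modern Seagate
old_lat, new_lat = 48.3, 4.2   # ms: average access latency

bw_gain = new_bw / old_bw      # ~248X (bigger is better)
lat_gain = old_lat / new_lat   # ~11.5X (smaller is better, so invert)

print(f"Bandwidth improvement: {bw_gain:.0f}X")
print(f"Latency improvement:   {lat_gain:.1f}X")
```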

Memory: Archaic (Nostalgic) vs. Modern (Newfangled)
- 1980 DRAM (asynchronous)
  – 0.06 Mbits/chip
  – 64,000 xtors, 35 mm²
  – 16-bit data bus per module, 16 pins/chip
  – 13 MBytes/sec
  – Latency: 225 ns
  – No block transfer
- 2011 DDR3-2133 synchronous (clocked) DRAM
  – 32,768 Mbits/chip (546,133X)
  – > 1 billion xtors, 495 mm²
  – 64-bit data bus per DIMM, 240 pins/DIMM (4X)
  – 17,066 MBytes/sec (1312X)
  – Latency: 13 ns (17X)
  – Block transfers (page mode)

CPUs: Archaic (Nostalgic) vs. Modern (Newfangled)
- 1982 Intel 80286, 12.5 MHz
  – 2 MIPS (peak)
  – Latency: 320 ns
  – 134,000 xtors, 47 mm²
  – 16-bit data bus, 68 pins
  – Microcode interpreter, separate FPU chip, no caches
- 2011 Intel Xeon E…, 2400 MHz (192X)
  – 7200 MIPS (peak) (3600X)
  – Latency: 0.42 ns (768X)
  – 2.6 billion xtors, 513 mm²
  – 64-bit data bus, 423 pins
  – 3-way superscalar, multithreaded, dynamic translation to RISC operations, deep pipeline, out-of-order execution
  – On-chip 32 KB data caches, 64 KB instruction cache, 30 MB last-level cache

Outline
- Review
- Technology Trends: culture of tracking, anticipating, and exploiting advances in technology
- Define, quantify, and summarize relative performance

Definition: Performance
- Performance is in units of things per second, so bigger is better
- If we are primarily concerned with response time:

  Performance(X) = 1 / Execution_time(X)

- "X is n times faster than Y" means:

  n = Performance(X) / Performance(Y) = Execution_time(Y) / Execution_time(X)
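A small worked example of the definition (Python; the execution times are hypothetical, for illustration only):

```python
# "X is n times faster than Y": n = exec_time(Y) / exec_time(X).
time_x = 2.0   # seconds to run a program on machine X (invented)
time_y = 3.0   # seconds to run the same program on machine Y (invented)

n = time_y / time_x                        # 1.5: X is 1.5 times faster than Y
perf_x, perf_y = 1 / time_x, 1 / time_y
assert abs(n - perf_x / perf_y) < 1e-12    # same ratio either way

print(f"X is {n:.2f} times faster than Y")
```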

Performance: What to Measure
- Usually rely on benchmarks rather than real workloads
- To increase predictability, collections of benchmark applications, called benchmark suites, are popular
- SPEC CPU: popular desktop benchmark suite
  – CPU only, split between integer and floating-point programs
  – SPECint2000 has 12 integer programs, SPECfp2000 has 14 floating-point programs
  – SPEC CPU2006 to be announced Spring 2006
  – SPECSFS (NFS file server) and SPECWeb (web server) added as server benchmarks
- Transaction Processing Council measures server performance and cost-performance for databases
  – TPC-C: complex queries for online transaction processing
  – TPC-H: models ad hoc decision support
  – TPC-W: a transactional web benchmark
  – TPC-App: application server and web services benchmark

How to Summarize Suite Performance (1/5)
- Arithmetic average of the execution times of all programs?
  – But they vary by 4X in speed, so some would count more than others in an arithmetic average
- Could add a weight per program, but how to pick the weights?
  – Different companies want different weights for their products
- SPECRatio: normalize execution times to a reference computer, yielding a ratio proportional to performance:

  SPECRatio = time on reference computer / time on computer being rated
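A minimal sketch of the normalization (Python; the benchmark names and times below are invented, not real SPEC data):

```python
# SPECRatio = reference time / measured time, so bigger is better.
ref_times = {"gzip": 1400.0, "gcc": 1100.0, "mcf": 1800.0}  # reference machine, sec
my_times  = {"gzip":  700.0, "gcc":  500.0, "mcf":  900.0}  # machine under test, sec

spec_ratios = {p: ref_times[p] / my_times[p] for p in ref_times}
print(spec_ratios)   # {'gzip': 2.0, 'gcc': 2.2, 'mcf': 2.0}
```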

How to Summarize Suite Performance (2/5)
- If a program's SPECRatio on Computer A is 1.25 times bigger than on Computer B, then:

  1.25 = SPECRatio_A / SPECRatio_B
       = (ExecutionTime_reference / ExecutionTime_A) / (ExecutionTime_reference / ExecutionTime_B)
       = ExecutionTime_B / ExecutionTime_A
       = Performance_A / Performance_B

- Note that when comparing 2 computers as a ratio, execution times on the reference computer drop out, so the choice of reference computer is irrelevant
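The cancellation is easy to see numerically. A small sketch (Python; all times invented) showing that any reference machine yields the same A-vs-B ratio:

```python
# The reference machine's time cancels when comparing two machines.
time_a, time_b = 400.0, 500.0          # seconds on machines A and B (invented)

for ref in (1000.0, 2000.0, 12345.0):  # three different "reference" times
    ratio = (ref / time_a) / (ref / time_b)   # SPECRatio_A / SPECRatio_B
    print(ratio)                       # always 1.25 = time_b / time_a
```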

How to Summarize Suite Performance (3/5)
- Since SPECRatios are ratios, the proper mean is the geometric mean (SPECRatios are unitless, so an arithmetic mean is meaningless):

  Geometric mean = (SPECRatio_1 × SPECRatio_2 × … × SPECRatio_n)^(1/n)

  1. The geometric mean of the ratios is the same as the ratio of the geometric means
  2. Ratio of geometric means = geometric mean of performance ratios ⇒ choice of reference computer is irrelevant!
- These two points make the geometric mean of ratios attractive for summarizing performance
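A minimal sketch of the computation (Python; the SPECRatios are invented). Taking the n-th root via logarithms avoids overflow for long suites:

```python
import math

# Geometric mean of SPECRatios: the n-th root of their product,
# computed in the log domain for numerical stability.
spec_ratios = [2.0, 2.2, 2.0, 1.8]   # invented values

gm = math.exp(sum(math.log(r) for r in spec_ratios) / len(spec_ratios))
print(f"Geometric mean: {gm:.3f}")   # ~1.995
```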

How to Summarize Suite Performance (4/5)
- Does a single mean summarize the performance of the programs in a benchmark suite well?
- Can decide whether the mean is a good predictor by characterizing the variability of the distribution using the standard deviation
- Like the geometric mean, the geometric standard deviation is multiplicative rather than additive
- Can simply take the logarithms of the SPECRatios, compute the arithmetic mean and standard deviation in the log domain, and then exponentiate to convert back:

  Geometric mean = exp( (1/n) × Σ ln(SPECRatio_i) )
  Geometric standard deviation = exp( StDev( ln(SPECRatio_i) ) )
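The same recipe in code (Python, standard library only; invented SPECRatios, population standard deviation assumed):

```python
import math

# Geometric mean and multiplicative (geometric) standard deviation:
# work in the log domain, then exponentiate back.
spec_ratios = [2.0, 2.2, 2.0, 1.8]   # invented values

logs = [math.log(r) for r in spec_ratios]
mu = sum(logs) / len(logs)
sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))

gm = math.exp(mu)       # geometric mean
gsd = math.exp(sigma)   # geometric (multiplicative) standard deviation
print(f"GM = {gm:.3f}, multiplicative StDev = {gsd:.3f}")
```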

How to Summarize Suite Performance (5/5)
- The standard deviation is more informative if we know the distribution has a standard form:
  – Bell-shaped normal distribution, whose data are symmetric around the mean
  – Lognormal distribution, where the logarithms of the data (not the data itself) are normally distributed, i.e., symmetric on a logarithmic scale
- For a lognormal distribution, we expect that:
  – 68% of samples fall in the range [GM / gstdev, GM × gstdev]
  – 95% of samples fall in the range [GM / gstdev², GM × gstdev²]
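A small sketch of those ranges (Python; the GM and multiplicative StDev values are invented for illustration):

```python
# 68% / 95% ranges for a lognormal distribution: one and two
# multiplicative standard deviations around the geometric mean.
gm, gstdev = 2000.0, 1.5   # invented values

print(f"68% range: [{gm / gstdev:.0f}, {gm * gstdev:.0f}]")        # [1333, 3000]
print(f"95% range: [{gm / gstdev**2:.0f}, {gm * gstdev**2:.0f}]")  # [889, 4500]
```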

Example Standard Deviation (1/2)
- GM and multiplicative StDev of SPECfp2000 for Itanium 2 [per-benchmark chart not included in the transcript]

Example Standard Deviation (2/2)
- GM and multiplicative StDev of SPECfp2000 for AMD Athlon [per-benchmark chart not included in the transcript]