Lecture 1: Course Introduction, Technology Trends, Performance Professor Alvin R. Lebeck Computer Science 220 Fall 2001.



2 © Alvin R. Lebeck 2001 CPS 220 Administrative Office Hours Office: D304 LSRC Hours: Mon 10:00-11:00, Thurs 2:00-3:00, or by appointment Teaching Assistant Fareed Zaffar Office: D125 LSRC Hours: Tuesday 10:00-11:00, Wednesday 1:00-2:00

3 © Alvin R. Lebeck 2001 CPS 220 Administrative (Grading) 30% Homeworks –6 Homeworks –5 points per day late, for first 10 days –Always do the homework (better late than never) 30% Examinations (Midterm + Final) 30% Research Project (work in pairs) 10% Class Participation This course requires hard work.

4 © Alvin R. Lebeck 2001 Administrative (Continued) Midterm Exam: In class (75 min) Closed book Final Exam: (3 hours) closed book This is a “Quals” Course. –Quals pass based on Midterm and Final exams only

5 © Alvin R. Lebeck 2001 Administrative (Continued) Course Web Page – –Lectures posted there after class (pdf) –Homework posted there Course News Group –duke.cs.cps220 –Use it to 1) read announcements/comments on class or homework, 2) ask questions (help), 3) communicate with each other Need Duke CS account –Duke ID, ACPUB account name (see HW #0)

6 © Alvin R. Lebeck 2001 SPIDER: Systems Seminar Systems & Architecture Seminar –Wednesdays 3:45-5:00 in D344 –duke.cs.os-research (spider newsgroup) Presentations on current work –Practice talks for conferences –Discussion on recent papers –Your own research Why should you go? –If you want to work in Systems/Architecture… –Good time to practice public speaking in front of a friendly crowd –Learn about current topics

7 © Alvin R. Lebeck 2001 Assignment Homework #0 (Background, due Thursday) Read Chapters 1 & 2

8 © Alvin R. Lebeck 2001 CPS 220 CPS 220 Course Focus Understanding the design techniques, machine structures, technology factors, evaluation methods that will determine the form of computers in 21st Century Technology Programming Languages Operating Systems History Applications Interface Design (ISA) Measurement & Evaluation Parallelism Computer Architecture: Instruction Set Design Organization Hardware Power

9 © Alvin R. Lebeck 2001 Related Courses Prerequisites CPS 104: Basic Machine Organization CPS 110: Basic Operating System Functions This course: focus on why, analysis, evaluation –Cost/performance –Power budget Follow on Courses CPS 221: Advanced Computer Architecture II –Parallel computer architecture

10 © Alvin R. Lebeck 2001 CPS 220 Computer Architecture Is … the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation. Amdahl, Blaauw, and Brooks, 1964

11 © Alvin R. Lebeck 2001 CPS 220 Topic Coverage
Textbook: Hennessy and Patterson, Computer Architecture: A Quantitative Approach, 2nd Ed.
–Fundamentals of Computer Architecture (Chapter 1)
–Instruction Set Architecture (Chapter 2, Appendix C&D)
–Pipelining (Chapter 3)
–Advanced Pipelining and ILP (Chapter 4)
–Memory Hierarchy (Chapter 5)
–Input/Output and Storage (Chapter 6)
–Networks and Interconnection Technology (Chapter 7)
–Multiprocessors (Chapter 8)
–Vectors (Appendix)
–New Architectures/trends (papers)
–Power (papers)

12 © Alvin R. Lebeck 2001 CPS 220 Computer Architecture Topics Instruction Set Architecture Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation Addressing, Protection, Exception Handling L1 Cache L2 Cache DRAM Disks, WORM, Tape Coherence, Bandwidth, Latency Emerging Technologies Interleaving Bus protocols RAID VLSI Input/Output and Storage Memory Hierarchy Pipelining and Instruction Level Parallelism

13 © Alvin R. Lebeck 2001 CPS 220 Computer Architecture Topics (CPS 221) [Diagram: processors (P) and memories (M) attached through a switch (S) to an Interconnection Network] Topologies, Routing, Bandwidth, Latency, Reliability Network Interfaces Shared Memory, Message Passing, Data Parallel Processor-Memory-Switch Multiprocessors Networks and Interconnections

14 © Alvin R. Lebeck 2001 Computer Engineering Methodology Technology Trends

15 © Alvin R. Lebeck 2001 Computer Engineering Methodology Technology Trends Evaluate Existing Systems for Bottlenecks Benchmarks

16 © Alvin R. Lebeck 2001 Computer Engineering Methodology Technology Trends Evaluate Existing Systems for Bottlenecks Benchmarks Simulate New Designs and Organizations Workloads

17 © Alvin R. Lebeck 2001 Computer Engineering Methodology Technology Trends Evaluate Existing Systems for Bottlenecks Benchmarks Simulate New Designs and Organizations Workloads Implement Next Generation System Implementation Complexity

18 © Alvin R. Lebeck 2001 CPS 220 Context for Designing New Architectures
Application Area
–Special Purpose (e.g., DSP) / General Purpose
–Scientific (FP intensive) / Commercial (Mainframe)
–Portable (Power matters)
Level of Software Compatibility
–Object Code/Binary Compatible (cost HW vs. SW; IBM S/360)
–Assembly Language (dream to be different from binary)
–Programming Language; Why not?

19 © Alvin R. Lebeck 2001 CPS 220 Context for Designing New Architectures
OS Requirements for General Purpose Apps
–Size of Address Space
–Memory Management/Protection
–Context Switch
–Interrupts and Traps
–Communication
Standards: Innovation vs. Competition
–IEEE 754 Floating Point
–I/O Bus
–Networks
–Operating Systems / Programming Languages...

20 © Alvin R. Lebeck 2001 Technology Trends: Microprocessor Capacity
CMOS improvements:
–Die size: 2X every 3 yrs
–Line width: halve / 7 yrs
“Graduation Window”
Pentium Pro: 5.5 million
Sparc Ultra: 5.2 million
PowerPC 620: 6.9 million
Alpha 21164: 9.3 million
Alpha 21264: 15 million
Pentium III: 28 million
Pentium 4: 42 million
Alpha 21364: 100 million
Alpha 21464: 250 million

21 © Alvin R. Lebeck 2001 DRAM Capacity (single chip)
year   size     cyc time
...    ... Kb   250 ns
...    ... Kb   220 ns
...    ... Mb   190 ns
...    ... Mb   165 ns
...    ... Mb   145 ns
...    ... Mb   104 ns
...    ... Mb
...    ... Gb

22 © Alvin R. Lebeck 2001 CPS 220 Technology Trends (Summary)
        Capacity        Speed
Logic   2x in 3 years   2x in 3 years
DRAM    4x in 3 years   1.4x in 10 years
Disk    2x in 3 years   1.4x in 10 years
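The "Nx in M years" figures in the summary table can be restated as compound annual growth rates, which makes the gap between capacity and speed trends easier to compare. The sketch below is illustrative only: the function name `annual_growth` and the dictionary layout are assumptions, not part of the lecture.

```python
def annual_growth(factor: float, years: float) -> float:
    """Compound annual growth rate implied by a `factor`-fold gain over `years` years."""
    return factor ** (1.0 / years) - 1.0


# Rates taken from the summary table; labels are for display only.
trends = {
    "Logic capacity (2x in 3 years)": (2.0, 3),
    "DRAM capacity (4x in 3 years)": (4.0, 3),
    "DRAM speed (1.4x in 10 years)": (1.4, 10),
}

for name, (factor, years) in trends.items():
    print(f"{name}: {annual_growth(factor, years):.1%} per year")
```

Under these figures DRAM capacity grows by roughly 59% per year while DRAM speed improves by only about 3% per year.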

23 © Alvin R. Lebeck 2001 CPS 220 Processor Performance

24 © Alvin R. Lebeck 2001 Alpha SPECint and SPECfp

25 © Alvin R. Lebeck 2001 Chip Area Reachable in One Clock Cycle [Figure: Fraction of Chip Reached vs. Nanometers]

26 © Alvin R. Lebeck 2001 Power Density [Figure: Power Density (W/cm^2) vs. Microns]

27 © Alvin R. Lebeck 2001 Processor Perspective
Putting performance growth in perspective:
          Pentium-III      Cray YMP
Type      Personal Comp.   Supercomputer
Year      1998             1988
MIPS      > 400 MIPS       < 50 MIPS
Linpack   140 MFLOPS       160 MFLOPS
Cost      $3,000           $1M ($1.6M in 1994$)
Clock     400 MHz          167 MHz
Cache     512 KB           0.25 KB
Memory    128 MB           256 MB
1988 supercomputer in 1998 personal computer!

28 © Alvin R. Lebeck 2001 CPS 220 Measurement and Evaluation Design Analysis Architecture is an iterative process: Searching the space of possible designs At all levels of computer systems Bad Ideas Good Ideas Creativity Mediocre Ideas Cost / Performance Analysis

29 © Alvin R. Lebeck 2001 CPS 220 Measurement Tools How do I evaluate an idea? Performance, Cost, Die Area, Power Estimation Benchmarks, Traces, Mixes Simulation (many levels) –ISA, RT, Gate, Circuit Queuing Theory Rules of Thumb Fundamental Laws Question: What is “better” Boeing 747 or Concorde?

30 © Alvin R. Lebeck 2001 CPS 220 The Bottom Line: Performance (and Cost)
Time to run the task (ExTime) –Execution time, response time, latency
Tasks per day, hour, week, sec, ns … (Performance) –Throughput, bandwidth
Plane              Speed      DC to Paris   Passengers   Throughput (pmph)
Boeing 747         610 mph    6.5 hours     ...          286,...
BAC/Sud Concorde   1350 mph   3 hours       ...          ...,200

31 © Alvin R. Lebeck 2001 CPS 220 The Bottom Line: Performance (and Cost) "X is n times faster than Y" means
$$ n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} $$
Speed of Concorde vs. Boeing 747
Throughput of Boeing 747 vs. Concorde

32 © Alvin R. Lebeck 2001 CPS 220 Performance Terminology “X is n% faster than Y” means:
$$ \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = 1 + \frac{n}{100} $$
$$ n = \frac{100\,\bigl(\text{Performance}(X) - \text{Performance}(Y)\bigr)}{\text{Performance}(Y)} $$
Example: Y takes 15 seconds to complete a task, X takes 10 seconds. What % faster is X?

33 © Alvin R. Lebeck 2001 CPS 220 Example
$$ \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{15}{10} = 1.5 $$
$$ n = \frac{100\,(1.5 - 1.0)}{1.0} = 50\% $$
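The two definitions ("n times faster" and "n% faster") can be checked with a short calculation. This is a minimal sketch; the function names `times_faster` and `percent_faster` are made up for illustration and do not come from the slides.

```python
def times_faster(extime_y: float, extime_x: float) -> float:
    """Return n such that X is n times faster than Y (ratio of execution times)."""
    return extime_y / extime_x


def percent_faster(extime_y: float, extime_x: float) -> float:
    """Return n such that X is n% faster than Y."""
    return 100.0 * (extime_y / extime_x - 1.0)


# Slide example: Y takes 15 seconds, X takes 10 seconds.
print(times_faster(15, 10))    # 1.5
print(percent_faster(15, 10))  # 50.0
```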

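34 © Alvin R. Lebeck 2001 CPS 220 Amdahl's Law Speedup due to enhancement E:
$$ \text{Speedup}(E) = \frac{\text{ExTime w/o E}}{\text{ExTime w/ E}} = \frac{\text{Performance w/ E}}{\text{Performance w/o E}} $$
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected, then:
$$ \text{ExTime}(E) = \text{ExTime w/o E} \times \left[(1 - F) + \frac{F}{S}\right] $$
$$ \text{Speedup}(E) = \frac{1}{(1 - F) + \dfrac{F}{S}} $$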

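35 © Alvin R. Lebeck 2001 CPS 220 Amdahl’s Law
$$ \text{ExTime}_{new} = \text{ExTime}_{old} \times \left[(1 - \text{Fraction}_{enhanced}) + \frac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}\right] $$
$$ \text{Speedup}_{overall} = \frac{\text{ExTime}_{old}}{\text{ExTime}_{new}} = \frac{1}{(1 - \text{Fraction}_{enhanced}) + \dfrac{\text{Fraction}_{enhanced}}{\text{Speedup}_{enhanced}}} $$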

36 © Alvin R. Lebeck 2001 CPS 220 Amdahl’s Law Floating point instructions improved to run 2X; but only 10% of actual instruction execution time is FP
ExTime new = ?
Speedup overall = ?

37 © Alvin R. Lebeck 2001 CPS 220 Amdahl’s Law Floating point instructions improved to run 2X; but only 10% of actual instruction execution time is FP
$$ \text{ExTime}_{new} = \text{ExTime}_{old} \times \left(0.9 + \frac{0.1}{2}\right) = 0.95 \times \text{ExTime}_{old} $$
$$ \text{Speedup}_{overall} = \frac{1}{0.95} = 1.053 $$
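Amdahl's Law is easy to experiment with in code. The following is a minimal sketch, not course material: the function name `amdahl_speedup` and its argument checks are assumptions. It reproduces the floating-point example and shows the upper bound reached if FP time were eliminated entirely.

```python
import math


def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of execution time is sped up
    by a factor of `speedup_enhanced` and the rest is unaffected."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# Slide example: FP is 10% of execution time and becomes 2x faster.
print(round(amdahl_speedup(0.10, 2.0), 3))       # 1.053

# Limit: even an infinitely fast FP unit cannot beat 1 / (1 - 0.10).
print(round(amdahl_speedup(0.10, math.inf), 3))  # 1.111
```

The second call illustrates why the unenhanced fraction bounds the achievable speedup.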

38 © Alvin R. Lebeck 2001 CPS 220 Corollary: Make The Common Case Fast All instructions require an instruction fetch, only a fraction require a data fetch/store. –Optimize instruction access over data access Programs exhibit locality Spatial Locality Temporal Locality Access to small memories is faster –Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories. Reg's Cache Memory Disk / Tape

39 © Alvin R. Lebeck 2001 CPS 220 Occam's Toothbrush The simple case is usually the most frequent and the easiest to optimize! Do simple, fast things in hardware and be sure the rest can be handled correctly in software

40 © Alvin R. Lebeck 2001 CPS 220 Metrics of Performance Compiler Programming Language Application Datapath Control TransistorsWiresPins ISA Function Units (millions) of Instructions per second: MIPS (millions) of (FP) operations per second: MFLOP/s Cycles per second (clock rate) Megabytes per second Answers per month Operations per second

41 © Alvin R. Lebeck 2001 CPS 220 Aspects of CPU Performance
$$ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} $$
               Instr. Cnt   CPI   Clock Rate
Program
Compiler
Instr. Set
Organization
Technology

42 © Alvin R. Lebeck 2001 CPS 220 Aspects of CPU Performance
$$ \text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}} $$
               Inst Count   CPI   Clock Rate
Program        X
Compiler       X            (X)
Inst. Set      X            X
Organization                X     X
Technology                        X
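The CPU time equation can be sketched as a one-line function. The numbers in the usage example (one billion instructions, CPI of 1.5, 400 MHz clock) are hypothetical values chosen for illustration, and the name `cpu_time` is an assumption rather than anything defined in the lecture.

```python
def cpu_time(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * seconds_per_cycle


# Hypothetical program: 1e9 instructions, CPI 1.5, 400 MHz clock.
baseline = cpu_time(1e9, 1.5, 400e6)
print(f"baseline: {baseline:.2f} s")

# Halving CPI (an organization change) halves CPU time if the other terms hold.
improved = cpu_time(1e9, 0.75, 400e6)
print(f"improved: {improved:.2f} s")
```

Because the three terms multiply, a change that improves one term only helps if it does not make another term proportionally worse.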

43 © Alvin R. Lebeck 2001 CPS 220 Marketing Metrics
Machines with different instruction sets?
Programs with different instruction mixes?
–Dynamic frequency of instructions
Uncorrelated with performance
Machine dependent
Often not where time is spent
Normalized: add, sub, compare, mult = 1; divide, sqrt = 4; exp, sin, ... = 8

44 © Alvin R. Lebeck 2001 Cycles Per Instruction Invest Resources where time is Spent! “Average Cycles Per Instruction” “Instruction Frequency”

45 © Alvin R. Lebeck 2001 CPS 220 Organizational Trade-offs Instruction Mix Cycle Time CPI Compiler Programming Language Application Datapath Control TransistorsWiresPins ISA Function Units

46 © Alvin R. Lebeck 2001 CPS 220 Example: Calculating CPI
Typical Mix, Base Machine (Reg / Reg)
Op       Freq   Cycles   CPI_i   (% Time)
ALU      50%    1        0.5     (33%)
Load     20%    2        0.4     (27%)
Store    10%    2        0.2     (13%)
Branch   20%    2        0.4     (27%)
Total                    1.5
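The table above can be reproduced by weighting each operation's cycle count by its frequency. This is an illustrative sketch; the names `base_mix`, `average_cpi`, and `time_share` are assumptions introduced here.

```python
# Instruction mix from the slide: op -> (frequency, cycles per instruction).
base_mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}


def average_cpi(mix: dict) -> float:
    """Average CPI = sum over ops of frequency x cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())


def time_share(mix: dict) -> dict:
    """Fraction of execution time spent in each op class."""
    cpi = average_cpi(mix)
    return {op: freq * cycles / cpi for op, (freq, cycles) in mix.items()}


print(f"CPI = {average_cpi(base_mix):.2f}")
for op, share in time_share(base_mix).items():
    print(f"{op}: {share:.0%} of time")
```

The time shares, not the raw frequencies, indicate where optimization effort pays off.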

47 © Alvin R. Lebeck 2001 CPS 220 Example
Base Machine (Reg / Reg)
Op       Freq   Cycles
ALU      50%    1
Load     20%    2
Store    10%    2
Branch   20%    2
Add register / memory operations to traditional RISC:
–One source operand in memory
–One source operand in register
–Cycle count of 2
Branch cycle count to increase to 3.
What fraction of the loads must be eliminated for this to pay off?
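One way to approach the question is to count total cycles per original instruction before and after the change. The sketch below rests on assumptions that the slide does not state explicitly: each eliminated load is folded into an ALU operation that becomes a 2-cycle register/memory operation, the clock cycle time is unchanged, and store frequency is unaffected. The function names are hypothetical.

```python
BASE_CYCLES = 0.50 * 1 + 0.20 * 2 + 0.10 * 2 + 0.20 * 2  # 1.5 cycles per instruction


def new_cycles(load_fraction_eliminated: float) -> float:
    """Cycles per ORIGINAL instruction on the register/memory machine.

    Assumes each eliminated load turns one reg/reg ALU op into a
    2-cycle reg/mem ALU op, and that branches now cost 3 cycles.
    """
    removed = 0.20 * load_fraction_eliminated  # loads removed, per original instruction
    alu_reg_reg = 0.50 - removed               # 1 cycle each
    alu_reg_mem = removed                      # 2 cycles each
    loads = 0.20 - removed                     # 2 cycles each
    stores = 0.10                              # 2 cycles each
    branches = 0.20                            # 3 cycles each
    return alu_reg_reg * 1 + alu_reg_mem * 2 + loads * 2 + stores * 2 + branches * 3


for f in (0.0, 0.5, 1.0):
    print(f"{f:.0%} of loads eliminated: {new_cycles(f):.2f} cycles vs. base {BASE_CYCLES:.2f}")
```

Under these assumptions the total simplifies to 1.7 - 0.2f for a fraction f of loads eliminated, so the new design only breaks even with the base machine's 1.5 cycles when f = 1, that is, when every load is eliminated.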

48 © Alvin R. Lebeck 2001 CPS 220 Next Time Benchmarks Performance Metrics Cost Instruction Set Architectures TODO Read Chapters 1 & 2 Do Homework #0