Jan. 5, 2000Systems Architecture II1 Machine Organization (CS 570) Lecture 1: Overview of High Performance Processors * Jeremy R. Johnson Wed. Sept. 27,

Presentation transcript:

Machine Organization (CS 570)
Lecture 1: Overview of High Performance Processors *
Jeremy R. Johnson
Wed. Sept. 27, 2000
*This lecture was derived from material in the text (HPC Chap. 1-2).

Introduction
Objective: To review recent developments in the design of high performance microprocessors and to indicate how these features affect program performance. An example program will be used to illustrate benchmarking techniques and the effect of compiler optimizations and code organization on performance. We will indicate how changes in software can improve performance by better utilizing the underlying hardware. Our goal for the course is to understand this behavior.
Topics
– pipelining
– instruction level parallelism, superscalar and out-of-order execution
– memory hierarchy: cache, virtual memory

RISC vs. CISC
CISC: the instruction set is made up of powerful instructions close to primitives in a high-level language such as C or FORTRAN.
RISC: low-level instructions are emphasized.
RISC is a label most commonly used for a set of instruction set architecture characteristics chosen to ease the use of aggressive implementation techniques found in high-performance processors (John Mashey).
Prevalence began in the mid-1980s (an earlier example is the CDC 6600), when more transistors and better compilers became available.
Trade complex instructions for a faster clock rate and more room for extra registers, cache, and advanced performance techniques.

Characterizing RISC
Instruction pipelining
Pipelining floating point execution
Uniform instruction length
Delayed branching
Load/Store architecture
Simple addressing modes

Pipelining
Instruction pipelining
– Instruction Fetch (IF)
– Instruction Decode (ID)
– Operand Fetch (F)
– Execute (E)
– Writeback (W)
[Diagram: three instructions overlapped in the pipeline, each passing through the stages IF, ID, F, E, W]
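The overlap of the five stages can be sketched in a few lines of Python. This is a minimal illustration, assuming an ideal pipeline in which every stage takes one cycle and nothing stalls; the function names are hypothetical and not taken from the lecture.

```python
# Stage abbreviations as used on the slide: IF, ID, F (operand fetch), E, W.
STAGES = ["IF", "ID", "F", "E", "W"]


def pipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles to finish all instructions when a new one starts every cycle."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


def unpipelined_cycles(num_instructions, num_stages=len(STAGES)):
    """Cycles when each instruction must finish before the next one starts."""
    return num_stages * num_instructions


def schedule(num_instructions, stages=STAGES):
    """One text row per instruction; column c shows the stage used in cycle c."""
    width = max(len(s) for s in stages) + 1
    rows = []
    for i in range(num_instructions):
        # Instruction i enters the pipeline i cycles after the first one.
        cells = [""] * i + list(stages)
        rows.append("".join(cell.ljust(width) for cell in cells).rstrip())
    return rows


for row in schedule(3):
    print(row)
print(pipelined_cycles(3), "cycles pipelined vs", unpipelined_cycles(3), "unpipelined")
```

Under these assumptions the pipelined cycle count grows as stages + instructions - 1, so for long instruction streams the throughput approaches one instruction per cycle.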

Branches and Hazards
If a branch is executed, the pipeline may need to be flushed, since the wrong instructions may have been started.
[Diagram: successive instructions overlapped in the pipeline (stages IF, ID, F, E, W), with the labels "guess" and "sure"]
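The cost of flushing can be estimated with a simple cycles-per-instruction model. The sketch below is a back-of-the-envelope calculation, not a figure from the lecture: the branch frequency, wrong-guess rate, and flush penalty are all hypothetical inputs.

```python
def effective_cpi(base_cpi, branch_fraction, wrong_guess_rate, flush_penalty_cycles):
    """Average cycles per instruction once pipeline flushes are included.

    base_cpi             -- CPI with no flushes (1.0 for an ideal pipeline)
    branch_fraction      -- fraction of instructions that are branches
    wrong_guess_rate     -- fraction of branches for which the wrong path was started
    flush_penalty_cycles -- cycles lost each time the pipeline is flushed
    """
    return base_cpi + branch_fraction * wrong_guess_rate * flush_penalty_cycles


# Hypothetical workload: 20% branches, 10% guessed wrong, 3 cycles lost per flush.
print(effective_cpi(1.0, 0.20, 0.10, 3))
```

The model shows why deeper pipelines make good branch guesses more valuable: the penalty term scales with the number of stages that must be discarded.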

Advanced Techniques
Superscalar Processors
– issue more than one instruction per cycle
– can’t have dependencies or hardware conflicts
– for example, can execute an add simultaneously with a mult
Superpipelining
– more stages in the pipeline
Out-of-order and speculative execution
– maintain semantics but allow instructions to be computed in a different order
– may need to guess which instruction to execute
– depends on the difference between computation and execution
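The superscalar issue condition (no dependencies, no hardware conflict) can be illustrated with a small check. This is a deliberately conservative sketch: the instruction representation, the register names, and the mapping of operations to functional units are assumptions made for the example, and real processors use more refined rules.

```python
from collections import namedtuple

# Hypothetical instruction record: operation, destination register, source registers.
Instr = namedtuple("Instr", ["op", "dest", "srcs"])

# Assumed hardware: one adder and one multiplier.
UNIT_FOR_OP = {"add": "adder", "sub": "adder", "mult": "multiplier"}


def can_issue_together(first, second):
    """True if `second` may issue in the same cycle as the earlier `first`."""
    read_after_write = first.dest in second.srcs   # second needs first's result
    write_after_write = first.dest == second.dest  # both write the same register
    write_after_read = second.dest in first.srcs   # second overwrites first's input
    unit_conflict = UNIT_FOR_OP[first.op] == UNIT_FOR_OP[second.op]
    return not (read_after_write or write_after_write
                or write_after_read or unit_conflict)


add = Instr("add", "r1", ("r2", "r3"))
independent_mult = Instr("mult", "r4", ("r5", "r6"))
dependent_mult = Instr("mult", "r4", ("r1", "r6"))

print(can_issue_together(add, independent_mult))  # add and mult on separate units
print(can_issue_together(add, dependent_mult))    # mult reads r1 written by add
```

The first pair matches the slide's example of an add executing alongside a mult; the second pair is blocked because the mult depends on the add's result.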

Post-RISC Pipeline
[Diagram: pipeline stages IF, ID, IRB, E, RR, R, where IRB = Instruction Reorder Buffer and RR = Rename Registers]

Memory Hierarchy
SRAM vs. DRAM
– small fast memory vs. large slow memory
– principle of locality
Registers
Cache (level 1)
Cache (level 2)
Main memory
Disk

Memory Access Speed on DEC Alpha
Clock Speed 500 MHz (= 2 ns clock period)
Registers (2 ns)
L1 On-Chip (4 ns)
L2 On-Chip (5 ns)
L3 Off-Chip (30 ns)
Memory (220 ns)
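To see how these latencies combine, the sketch below computes an average access time. The latencies are the ones listed on the slide; the hit rates are hypothetical, and the model assumes each access costs the listed latency of the first level that holds the data.

```python
# Latencies from the slide, in nanoseconds, ordered from fastest to slowest.
LATENCY_NS = [("L1", 4.0), ("L2", 5.0), ("L3", 30.0), ("Memory", 220.0)]


def average_access_time_ns(local_hit_rates, latencies=LATENCY_NS):
    """Expected access time given one local hit rate per cache level.

    local_hit_rates[i] is the fraction of accesses reaching level i that hit
    there. Main memory is assumed to always hold the data.
    """
    reach_probability = 1.0  # fraction of accesses that get this far down
    total = 0.0
    rates = list(local_hit_rates) + [1.0]  # memory always "hits"
    for (_, latency), hit_rate in zip(latencies, rates):
        total += reach_probability * hit_rate * latency
        reach_probability *= 1.0 - hit_rate
    return total


# Hypothetical hit rates: 90% in L1, 80% of the rest in L2, 50% of the rest in L3.
print(average_access_time_ns([0.90, 0.80, 0.50]))
```

Even with only 1% of accesses reaching main memory in this example, those accesses contribute about a third of the average time, which is why locality matters so much.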

Cache Organization
Since the cache is smaller than memory, more than one address must map to the same line in the cache.
Direct-Mapped Cache
– address mod cache size (only one location to which a given memory address can be mapped)
Fully Associative Cache
– address can be mapped anywhere in the cache
– need a tag and an associative search to find whether an element is in the cache
Set-Associative Cache
– compromise between the two extremes
– element can map to several locations
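The three organizations differ only in how many lines share a set, which the following sketch makes concrete. The cache geometry (32-byte lines, 256 lines) and the function name are assumptions chosen for illustration, not parameters from the lecture.

```python
def cache_location(address, line_size, num_lines, ways):
    """Return (set_index, tag) for a byte address.

    ways == 1          -> direct-mapped (each set holds exactly one line)
    ways == num_lines  -> fully associative (a single set holds every line)
    anything between   -> set-associative
    """
    num_sets = num_lines // ways
    block_number = address // line_size   # which memory block the byte is in
    set_index = block_number % num_sets   # the "address mod cache size" step
    tag = block_number // num_sets        # identifies the block within its set
    return set_index, tag


LINE_SIZE, NUM_LINES = 32, 256  # hypothetical 8 KB cache

for ways in (1, 2, NUM_LINES):
    a = cache_location(0x0000, LINE_SIZE, NUM_LINES, ways)
    b = cache_location(0x2000, LINE_SIZE, NUM_LINES, ways)
    print(f"{ways}-way: 0x0000 -> {a}, 0x2000 -> {b}")
```

In the direct-mapped case the two addresses land in the same set with different tags, so they evict each other; with two or more ways both can be resident at once.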

Virtual Memory
Decouple physical addresses (memory locations) from the addresses used by a program. The programmer sees a large memory with the same virtual addresses, independent of where the program is actually placed in memory.
– Virtual-to-physical mapping is performed via a page table.
– Since page tables can be in virtual memory, there could be several table lookups for a single memory reference.
– The TLB (translation lookaside buffer) is a cache that stores commonly used virtual-to-physical mappings.
Page Fault
– when a page is not in memory it must be brought in (from disk)
– very slow (usually occurs with OS intervention)
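The translation path (TLB first, then page table, then page fault) can be sketched as follows. This is a simplified model under stated assumptions: a 4 KB page size, a single-level page table held in a dictionary, and a tiny TLB with least-recently-used replacement. The class and exception names are hypothetical.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the page is not in memory; an OS would load it from disk."""


class Translator:
    def __init__(self, page_table, tlb_entries=2):
        self.page_table = dict(page_table)  # virtual page number -> physical frame
        self.tlb = OrderedDict()            # small cache of recent translations
        self.tlb_entries = tlb_entries
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.tlb_hits += 1
            self.tlb.move_to_end(page)      # mark as most recently used
            frame = self.tlb[page]
        else:
            self.tlb_misses += 1
            if page not in self.page_table:
                raise PageFault(f"virtual page {page} is not in memory")
            frame = self.page_table[page]   # the slower page-table lookup
            self.tlb[page] = frame
            if len(self.tlb) > self.tlb_entries:
                self.tlb.popitem(last=False)  # evict least recently used entry
        return frame * PAGE_SIZE + offset


translator = Translator({0: 5, 1: 9})
print(translator.translate(0x0010))  # TLB miss, filled from the page table
print(translator.translate(0x0020))  # same page, so this one hits in the TLB
```

The offset within the page is never translated; only the page number is looked up, which is what lets one TLB entry serve every address on that page.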

Improving Memory Performance
Larger and wider caches
Cache bypass
Interleaved and pipelined memory systems
Prefetching
Post-RISC effects on memory
New memory trends