On-Chip Cache Analysis: A Parameterized Cache Implementation for a System-on-Chip RISC CPU


On-Chip Cache Analysis: A Parameterized Cache Implementation for a System-on-Chip RISC CPU

Presentation Outline
- Informal Introduction
- Underpin Design – xr16
- Cache Design Issues
- Implementation Details
- Results & Conclusion
- Future Work
- Questions

Informal Introduction
- Field Programmable Gate Arrays (FPGAs)
- Verilog HDL
- System-on-Chip (SoC)
- Reduced Instruction Set Computer (RISC)
- Caches
- Project Theme

Underpin Design – xr16
- Classical pipelined RISC
- Big-endian, von Neumann architecture
- Sixteen 16-bit registers
- Forty-two 16-bit instructions
- Result forwarding, branch annulment, interlocked instructions

Underpin Design – xr16 (cont’d)
- Internal and external buses (CPU clocked)
- Pipelined memory interface
- Single-cycle read, 3-cycle write
- DMA and interrupt handling support
- Ported compiler and assembler
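These read/write latencies set the cost of every cache miss. As a back-of-the-envelope illustration (not taken from the slides), average memory access time (AMAT) can be estimated from hit time, miss rate, and miss penalty; all numbers below are hypothetical:

```python
def amat(hit_time_cycles, miss_rate, miss_penalty_cycles):
    """Average memory access time in CPU cycles:
    hit time + miss rate * miss penalty."""
    return hit_time_cycles + miss_rate * miss_penalty_cycles

# Hypothetical figures: single-cycle hit, 5% miss rate,
# multi-cycle line refill from external RAM.
print(amat(1, 0.05, 8))
```

Shrinking either the miss rate (bigger cache, better placement policy) or the miss penalty (wider refill, write buffering) lowers this figure, which is what the design-space parameters below trade off.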

Underpin Design – xr16 (cont’d) Block Diagram

Underpin Design – xr16 (cont’d) Datapath

Underpin Design – xr16 (cont’d) Memory Preferences

Underpin Design – xr16 (cont’d) RAM Interface

Cache Design Issues
- Cache Size *
- Line Size
- Fetch Algorithm
- Placement Policy *
- Replacement Policy *
- Split vs. Unified Cache
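Cache size, line size, and placement policy together fix how an address is carved into tag, set index, and byte offset. The sketch below is a minimal behavioral model (plain Python, not the project's Verilog) of that split; the function name and geometry values are illustrative:

```python
def split_address(addr, cache_bytes, line_bytes, ways):
    """Break an address into (tag, set index, byte offset) for a cache
    of the given geometry. All sizes must be powers of two; ways=1 is
    direct mapped, ways=num_lines would be fully associative."""
    num_sets = cache_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = num_sets.bit_length() - 1      # log2(num_sets)
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset

# 256-byte direct-mapped cache with 8-byte lines: 32 sets,
# so 3 offset bits and 5 index bits.
print(split_address(0x1234, 256, 8, 1))  # (18, 6, 4)
```

Raising associativity shrinks the index field and grows the tag, which is why the placement policy interacts with tag-store cost in the table below.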

Cache Design Issues (cont’d)
- Write Back Strategy *
- Write Allocate Policy *
- Blocking vs. Non-Blocking
- Pipelined Transactions
- Virtually Addressed Caches
- Multilevel Caches

Cache Design Issues (cont’d)
Cache Size: 32 – 256K data bits
Placement Policy: Direct Mapped, Set Associative, Fully Associative
Replacement Policy: FIFO, Random *
Write Back Strategy: Write Back, Write Through
Write Allocate Policy: Write Allocate, Write No Allocate

Implementation Details
Configurable Parameters:
- Cache Size
- Placement Strategy
- Write Back Policy
- Write Allocate Policy
- Replacement Policy
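One way to picture this parameter bundle is as a single configuration record. The field and option names below are illustrative stand-ins, not the actual Verilog parameters of the implementation:

```python
from dataclasses import dataclass

@dataclass
class CacheConfig:
    """Illustrative bundle of the configurable parameters listed above."""
    size_bits: int = 2048             # total data bits, e.g. 32 up to 256K
    placement: str = "direct"         # "direct" | "set_assoc" | "full_assoc"
    write_policy: str = "write_back"  # or "write_through"
    write_allocate: bool = True       # allocate a line on write miss
    replacement: str = "fifo"         # or "random"

# A hypothetical 64 Kbit set-associative configuration.
cfg = CacheConfig(size_bits=65536, placement="set_assoc")
print(cfg.write_policy)  # write_back
```

Keeping every policy choice in one record is what makes the design parameterized: the same cache datapath can be regenerated for any point in the design space above.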

Implementation Details (cont’d)

1. Miss → Read → Replacement NOT Required
Let the memory operation complete and place the data fetched from memory in the cache.

Implementation Details (cont’d)
2. Miss → Read → Replacement Required
Initiate a memory write to write back the set being replaced, then initiate a read for the desired data.

Implementation Details (cont’d)
3. Miss → Write → No Allocate
Let the memory operation complete and do nothing else.

Implementation Details (cont’d)
4. Miss → Write → Allocate → Write-Through
Let the memory operation complete and place the new data in the cache.

Implementation Details (cont’d)
5. Miss → Write → Allocate → Write-Back → Replacement NOT Required
Cancel the memory operation, update only the cache, and mark the data dirty.

Implementation Details (cont’d)
6. Miss → Write → Allocate → Write-Back → Replacement Required
Instead of writing the data that caused the write miss, write back the set being replaced, then update the cache with the data that caused the miss.

Implementation Details (cont’d)
7. Hit → Read
Cancel the memory operation and supply the data for the instruction fetch or data load.

Implementation Details (cont’d)
8. Hit → Write → Write-Through
Let the memory operation complete and update the cache when it finishes.

Implementation Details (cont’d)
9. Hit → Write → Write-Back
Cancel the memory operation and update the cache.

Implementation Details (cont’d)
1. Read Hit
2. Write Hit
3. Read Miss (replacement)
4. Read Miss (no replacement)
5. Write Miss (replacement)
6. Write Miss (no replacement)
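The nine cases above collapse into one decision procedure. The sketch below is a behavioral model in Python (the project itself is implemented in Verilog); as an assumption of this sketch, "replacement required" is modeled as the victim line being valid and dirty:

```python
def cache_action(hit, is_read, write_allocate, write_back, dirty_victim):
    """Return which of the nine cases (previous slides) an access falls into.

    hit            -- tag match for the accessed line
    is_read        -- True for instruction fetch / data load, False for store
    write_allocate -- allocate a cache line on a write miss
    write_back     -- True for write-back, False for write-through
    dirty_victim   -- line to be replaced holds dirty data (assumption:
                      this is what makes a replacement write-back required)
    """
    if hit:
        if is_read:
            return 7                      # cancel memory op, serve from cache
        return 9 if write_back else 8     # write hit
    # Miss path.
    if is_read:
        return 2 if dirty_victim else 1   # read miss, maybe write back victim
    if not write_allocate:
        return 3                          # write miss, no allocate
    if not write_back:
        return 4                          # write-through allocate
    return 6 if dirty_victim else 5       # write-back allocate

# A read hit cancels the memory operation (case 7).
print(cache_action(True, True, True, True, False))  # 7
```

Because only case 2 and case 6 touch memory twice (victim write-back plus the demand access), this table also makes it easy to see where a victim buffer (listed under Future Work) would pay off.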

Results & Conclusion
- Proof of Concept
- Rigid Design Parameters
- R&D Options
- Architecture Innovation

Future Work
- LRU Implementation
- Victim Cache Buffer
- Split Caches
- Level 2 Cache
- Pipeline Enrichment
- Multiprocessor Support

Questions