Cache (CSIT 301, Blum)
Based in part on Chapter 9 in Computer Architecture (Nicholas Carter)

Slide 2: Bad Direct Mapping Scenario Recalled

With a direct-mapped cache, the loop involves memory locations that share the same cache address. With a set-associative cache, the loop involves memory locations that share the same set of cache addresses. It is thus possible with a set-associative cache that each of these memory locations is cached to a different member of the set, so the iterations can proceed without repeated cache misses.
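To make the scenario concrete, here is a minimal C sketch (not from the slides; the 4 KiB cache size and the back-to-back placement of the two arrays are assumptions) of a loop that ping-pongs on a single line of a direct-mapped cache but whose lines can coexist in a 2-way set-associative cache:

```c
#include <stdio.h>

#define CACHE_BYTES 4096   /* assumed: a 4 KiB direct-mapped cache */

/* If the linker places a[] and b[] back to back, a[i] and b[i] sit
 * CACHE_BYTES apart, so they map to the same line of a direct-mapped
 * cache and evict each other on every iteration (conflict misses).
 * A 2-way set-associative cache can keep both lines resident. */
static int a[CACHE_BYTES / sizeof(int)];
static int b[CACHE_BYTES / sizeof(int)];

int main(void) {
    long sum = 0;
    for (size_t i = 0; i < CACHE_BYTES / sizeof(int); i++) {
        sum += a[i] + b[i];   /* alternating accesses to the two arrays */
    }
    printf("%ld\n", sum);
    return 0;
}
```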

Slide 3: The Problem with Fully Associative Cache

All of those comparators are made of transistors, and they take up room on the die. Any space lost to comparators has to be taken away from the data array.
– After all, we are talking about thousands of comparators.
ASSOCIATIVITY LOWERS CAPACITY!

Slide 4: Set-Associative Caches: The Compromise

For example, instead of having the 1000-to-1 mapping we had with direct mapping, we could elect to have an 8000-to-8 mapping. That is, a given memory location can be cached into any of 8 cache locations, but the set of memory locations sharing those cache locations has also gone up by a factor of 8. This would be called an 8-way set-associative cache.

Slide 5: A Happy Medium

A 4- or 8-way set-associative cache provides enough flexibility, under most circumstances, to keep the memory locations needed by an iterative procedure in cache and so get the desired effect of caching.
– I.e., it minimizes cache misses.
But it requires only 4 or 8 comparators instead of the thousands required for fully associative caches.

Slide 6: Set-Associative Cache

Again the memory address is broken into three parts:
– One part determines the position in the line.
– One part determines, this time, a set of cache addresses.
– The last part is compared to what is stored in the tags of that set of cache locations.
– The lookup otherwise proceeds as before.
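As an illustration (a sketch, not from the slides; the 32-bit address and the 8-way, 64-set, 64-byte-line geometry are assumed), here is how the three parts of an address would be extracted:

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed geometry: 8 ways x 64 sets x 64-byte lines = 32 KiB cache. */
#define LINE_BYTES  64u    /* -> 6 offset bits */
#define NUM_SETS    64u    /* -> 6 index bits  */
#define OFFSET_BITS 6u
#define INDEX_BITS  6u

int main(void) {
    uint32_t addr = 0x12345678u;   /* an arbitrary example address */

    uint32_t offset = addr & (LINE_BYTES - 1);                /* position in the line     */
    uint32_t set    = (addr >> OFFSET_BITS) & (NUM_SETS - 1); /* which set of 8 locations */
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);     /* compared to the 8 ways' tags */

    printf("tag=0x%X set=%u offset=%u\n", tag, set, offset);
    return 0;
}
```

Only the tags of the 8 lines in the selected set need to be compared, which is why an 8-way cache needs just 8 comparators.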

Slide 7: PCGuide.com Comparison Table

[This slide reproduced a comparison table of cache mapping schemes from PCGuide.com; the table is not preserved in the transcript.] To it we add that full associativity has an adverse effect on capacity.

Slide 8: Cache Misses

When a cache miss occurs, several factors have to be considered. For example:
– We want the new memory location written into the cache, but where?
– Can we continue attempting other cache interactions, or should we wait?
– What if the cached data has been modified?
– Should we do anything with the data we are taking out of the cache?

Slide 9: Replacement Policy

Upon a cache miss, the memory that was not found in cache will be written to cache, but where?
– In direct mapping there is no choice; it can only be written to the cache address it is mapped to.
– In associative and set-associative caches there is a choice of what to replace.

Slide 10: Replacement Policy (Cont.)

Least Recently Used (LRU)
– Track the order in which the lines in cache were used, and replace the line that is last in that order, i.e., the least recently used.
– This is best in keeping with the locality-of-reference notion behind cache, but it requires a fair amount of overhead. This can be too much overhead even in a set-associative cache, where there may be only eight places under consideration.
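As a sketch of the bookkeeping this overhead refers to (illustrative only, not from the slides), here is one way to track LRU order within a 4-way set using per-way age counters; real designs often fall back on cheaper pseudo-LRU bits for exactly the overhead reason given above:

```c
#include <stdint.h>

#define WAYS 4

/* Per-set LRU state: age[w] orders the ways by recency of use.
 * Age 0 is the most recently used way; the largest age is the LRU way. */
typedef struct {
    uint8_t age[WAYS];
} lru_set_t;

/* Called on every access that hits 'way' in this set. */
void lru_touch(lru_set_t *s, int way) {
    for (int w = 0; w < WAYS; w++) {
        if (s->age[w] < s->age[way]) {
            s->age[w]++;            /* ways more recent than 'way' get older */
        }
    }
    s->age[way] = 0;                /* 'way' is now the most recently used */
}

/* Called on a miss to pick the line to replace. */
int lru_victim(const lru_set_t *s) {
    int victim = 0;
    for (int w = 1; w < WAYS; w++) {
        if (s->age[w] > s->age[victim]) {
            victim = w;             /* oldest age = least recently used */
        }
    }
    return victim;
}
```

Even this small amount of state (and the update on every access) is extra hardware per set, which is the overhead the slide refers to.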

Slide 11: Replacement Policy (Cont.)

Least Frequently Used (LFU)
– Similar to the above: track how often each item in cache is used, and replace the item with the lowest frequency.
Not-Most-Recently Used
– Another approach is to choose a line at random, except that one protects the line (from the set) that has been used most recently.
– Less overhead.

Slide 12: Blocking or Non-Blocking Cache

Replacement requires interacting with a slower type of memory (a lower level of cache or main memory). Do we allow the processor to continue to access the cache during this procedure or not? This is the distinction between blocking and non-blocking cache.
– In a blocking cache, all cache transactions must wait until the cache has been updated.
– In a non-blocking cache, other cache transactions are possible.

Slide 13: Cache Write Policy

The data cache may not only be read but may also be written to. But cache is just standing in as a handy representative for main memory; that is really where one wants to write, and doing so is relatively slow, just as reading from main memory is relatively slow. The rules about when one does this writing to memory are called one's write policy. One reason for separating data cache and instruction cache is that the instruction cache does not require a write policy.
– Recall the separation is known as the Harvard cache.

Slide 14: Write-Back Cache

Because writing to memory is slow, in a write-back cache, a.k.a. a "copy-back" cache, one waits until the line of cache is being replaced to write any values back to memory.
– Main memory and cache are inconsistent, but the cache value will always be used.
– In such a case the memory is said to be "stale."

Slide 15: Dirty Bit

Since writing back to main memory is slow, one only wants to do it if necessary, that is, if some part of the line has been updated. Each line of cache has a "dirty bit" which tells the cache controller whether or not the line has been updated since it was last replaced.
– Only if the dirty bit is set does one need to write back.
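Here is a sketch of how a controller might use the dirty bit on replacement (illustrative only; the line layout and the memory_read_line/memory_write_line helpers are hypothetical stubs, not a real API):

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64

typedef struct {
    uint32_t tag;
    bool     valid;
    bool     dirty;    /* set whenever the CPU writes into the line */
    uint8_t  data[LINE_BYTES];
} cache_line_t;

/* Stubs standing in for the slow DRAM transactions. */
static void memory_write_line(uint32_t tag, uint32_t set, const uint8_t *data) {
    (void)tag; (void)set; (void)data;    /* real hardware writes a full line to DRAM */
}
static void memory_read_line(uint32_t tag, uint32_t set, uint8_t *data) {
    (void)tag; (void)set;
    memset(data, 0, LINE_BYTES);         /* pretend memory returns zeroed data */
}

/* On replacement, only a dirty line costs the slow write back to memory. */
void replace_line(cache_line_t *line, uint32_t set, uint32_t new_tag) {
    if (line->valid && line->dirty) {
        memory_write_line(line->tag, set, line->data);   /* bring stale memory up to date */
    }
    memory_read_line(new_tag, set, line->data);          /* fetch the new line */
    line->tag   = new_tag;
    line->valid = true;
    line->dirty = false;    /* a freshly fetched line is clean */
}
```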

Slide 16: Pros and Cons of Write-Back

Pro: Write-back takes advantage of the locality-of-reference concept. If a line of cache is written to, it is likely to be written to again soon (before it is replaced).
Con: When one writes back to main memory, one must write the entire line.

Slide 17: Write-Through Cache

With a write-through cache, one writes the value back to memory every time a cache line is updated.
– Con: Effectively, a write-through cache is being used as a cache (a fast stand-in for memory) only for purposes of reading and not for writing.
– Pro: When one writes, one is only writing a byte instead of a line. (That is not much of an advantage, given the efficiency of burst/page reading/writing when the cache interacts with memory.)
– Pro: Integrity: cache and memory always agree.

Slide 18: Comparing Policies

Write-back is more efficient; write-through maintains integrity. Integrity is not so much an issue at the SRAM/DRAM interface in the memory hierarchy, since both are volatile.
– The issue is more important at the next interface down, main memory/virtual memory, since virtual memory (disk) is non-volatile.

Slide 19: Victim Cache

Other than writing modified data back to memory, what do we do with the data that is being replaced? One answer is nothing. Another possibility is to store it in a buffer that is faster than the next lower level, effectively introducing another small level of cache. This is known as the victim cache or victim buffer. Monitoring the victim cache can lead to improved replacement policies.
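A sketch of the idea (illustrative only; the 4-entry size and FIFO replacement within the buffer are assumptions): on a main-cache miss, a small fully associative victim buffer is checked before going to the next lower level, and evicted lines are parked there instead of being dropped.

```c
#include <stdbool.h>
#include <stdint.h>

#define VICTIM_ENTRIES 4

/* A tiny fully associative buffer holding recently evicted lines. */
typedef struct {
    uint32_t line_addr;    /* full line address (tag plus set bits) */
    bool     valid;
} victim_entry_t;

static victim_entry_t victim[VICTIM_ENTRIES];
static int next_slot = 0;  /* simple FIFO replacement within the buffer */

/* On a main-cache miss, a hit here avoids the trip to the next level. */
bool victim_lookup(uint32_t line_addr) {
    for (int i = 0; i < VICTIM_ENTRIES; i++) {
        if (victim[i].valid && victim[i].line_addr == line_addr) {
            return true;   /* swap this line back into the main cache */
        }
    }
    return false;
}

/* Lines evicted from the main cache are parked here, not dropped. */
void victim_insert(uint32_t line_addr) {
    victim[next_slot].line_addr = line_addr;
    victim[next_slot].valid = true;
    next_slot = (next_slot + 1) % VICTIM_ENTRIES;
}
```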

Slide 20: References

– Computer Architecture, Nicholas Carter
– PCGuide.com