

Cache Memory By Ed Martinez

What is Cache Memory?  The fastest and most expensive memory in a computer system, used to store collections of data.  Takes very little time to access recently and frequently used data, which improves computing performance by decreasing data access time.  With the speeds of modern CPUs ever increasing, accessing main memory has become a major performance bottleneck. Effective use of cache helps to minimize this problem.

Cache Levels  Caches are referred to in levels, such as L1 or L2.  The level describes the connection or physical proximity of the cache to the CPU.  Traditionally, L1 cache, usually the smaller of the two, is located on the processor and runs at the same speed as the processor, while L2 cache sits outside but near the processor and runs at the speed of the motherboard or front-side bus (FSB).  Recent designs integrate L2 cache onto the processor card or into the CPU chip itself, like L1 cache. This speeds access to the larger L2 cache, which improves the computer's performance.

CPU Roadmap This photo shows a road map of the inside of the CPU. Notice the different areas and their functions. Can you find the L1 cache?

Cache Memory This photo shows Level 2 cache memory on the processor board, beside the CPU.

Cache Mapping  Associative  Direct  Set Associative

Associative Mapping  Sometimes known as fully associative mapping; any block of memory can be mapped to any line in the cache.  The block number is stored with the line as a tag.  A valid bit indicates whether the line holds meaningful data.  Difficult and expensive to implement, because a lookup must compare the tags of every line in parallel.
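As a minimal sketch of the idea (not from the original slides; the class and field names are illustrative), a fully associative lookup in Python might look like this:

    class FullyAssociativeCache:
        # Minimal model: any memory block may occupy any cache line.
        def __init__(self, num_lines):
            # Each line holds a valid bit and a tag; the data itself is
            # omitted to keep the sketch focused on the mapping.
            self.lines = [{"valid": False, "tag": None} for _ in range(num_lines)]

        def lookup(self, block_number):
            # Every line's tag must be checked. In hardware this is a
            # parallel comparison across all lines, which is what makes
            # fully associative caches difficult and expensive to build.
            for line in self.lines:
                if line["valid"] and line["tag"] == block_number:
                    return True   # hit
            return False          # miss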

Direct Mapping  Each block in main memory can be loaded into only one line in cache (typically line = block number mod number of cache lines).  Each cache line contains a tag to identify which block of main memory currently occupies it.  Easy to implement, but inflexible: two frequently used blocks that map to the same line will keep evicting each other.
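A corresponding sketch for direct mapping (again illustrative, assuming the line index is the block number mod the number of lines):

    def direct_mapped_lookup(lines, block_number):
        # The low-order bits of the block number select the one line the
        # block can occupy; the remaining bits are kept as the tag.
        num_lines = len(lines)
        index = block_number % num_lines
        tag = block_number // num_lines
        line = lines[index]
        return line["valid"] and line["tag"] == tag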

Set-Associative Mapping  Each block maps to one set of lines in the cache and can occupy any line within that set.  With sets of 2 or 4 lines, the cache is usually referred to as "two-way" or "four-way" set associative.  LRU is the most popular replacement policy used with this mapping.  More flexible than direct mapping, but easier to implement than fully associative mapping.
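A sketch of the set-associative lookup, assuming the set index is the block number mod the number of sets (names are illustrative):

    def set_associative_lookup(cache_sets, block_number):
        # The block maps to exactly one set; within that set it may sit
        # in any of the 2 or 4 ways, so only a few tags are compared.
        num_sets = len(cache_sets)
        set_index = block_number % num_sets
        tag = block_number // num_sets
        for way in cache_sets[set_index]:
            if way["valid"] and way["tag"] == tag:
                return True   # hit somewhere in the set
        return False          # miss: a replacement policy picks a victim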

Why use a cache replacement policy?  Cache is small compared to available system memory  Cache maintains copies of recently referenced data items to speed access time  Once full, the cache must use a replacement policy to decide how to handle additional data items  The replacement policy can either ignore the new data or decide which of the old data items to remove to make room

Cache Replacement Policy (cont)  Two main issues in determining which policy to use:  Increase the hit ratio by keeping the items that are referenced most frequently  Policy should be inexpensive to implement

Cache Replacement Policies  FIFO  LRU  Random

FIFO Replacement  FIFO (first in, first out): the oldest data item in the cache is replaced.  Easy to implement and fast in operation.  Since it maintains no reference statistics, an unlucky access sequence can cause it to repeatedly evict and reload the same block of memory.
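A small Python sketch of FIFO replacement (illustrative; the cache is modeled as a set of resident items plus a queue recording arrival order):

    from collections import deque

    def fifo_access(resident, arrival_order, item, capacity):
        # Returns True on a hit. FIFO keeps no usage statistics, so a
        # hit does not change the eviction order.
        if item in resident:
            return True
        if len(resident) == capacity:
            resident.discard(arrival_order.popleft())  # evict the oldest item
        resident.add(item)
        arrival_order.append(item)
        return False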

LRU Replacement  Least Recently Used: the item that was accessed least recently is replaced with the new data item.  Conceptually simple; follows the idea that items which have been used recently are likely to be used again.  Keeps a list of the data items currently in the cache; referenced data items are moved to the front of the list  Data items at the back of the list are removed as space is needed  Primary drawback is that the reference order must be updated on every access, and maintaining it requires additional cache space
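An illustrative LRU sketch using Python's OrderedDict as the recency list the slide describes, with the front of the dict holding the least recently used item:

    from collections import OrderedDict

    def lru_access(cache, item, capacity):
        # Returns True on a hit; 'cache' is an OrderedDict.
        if item in cache:
            cache.move_to_end(item)        # referenced: now most recently used
            return True
        if len(cache) == capacity:
            cache.popitem(last=False)      # evict the least recently used
        cache[item] = None                 # data payload omitted
        return False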

Random Replacement  Random: a randomly selected line in the cache is replaced.  Has no need to store statistics, so it doesn't spend time analyzing anything.  Performance is close to that of the other policies, and the implementation is fairly simple.
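And a sketch of random replacement, which needs no bookkeeping beyond the resident set itself:

    import random

    def random_access(resident, item, capacity):
        # Returns True on a hit; on a miss with a full cache an
        # arbitrary victim is chosen -- no statistics are kept.
        if item in resident:
            return True
        if len(resident) == capacity:
            resident.discard(random.choice(tuple(resident)))
        resident.add(item)
        return False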

Cache Replacement Policies  Let's assume a computer system has an associative, a direct-mapped, or a two-way set-associative cache of 8 bytes.  The CPU accesses the following locations in order (the subscript is the low-order 3 bits of each address in physical memory): A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
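The traces on the following slides can be reproduced with a few lines of Python. This sketch (names are illustrative, not from the slides) replays the sequence through the fully associative FIFO cache of 8 one-byte lines:

    from collections import deque

    trace = ["A", "B", "C", "A", "D", "B", "E", "F", "A",
             "C", "D", "B", "G", "C", "H", "I", "A", "B"]

    resident, arrival_order, hits = set(), deque(), 0
    for ref in trace:
        if ref in resident:
            hits += 1
        else:
            if len(resident) == 8:                         # cache full
                resident.discard(arrival_order.popleft())  # FIFO eviction
            resident.add(ref)
            arrival_order.append(ref)

    print(f"Hit ratio = {hits}/{len(trace)}")              # prints 7/18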

Associative Cache (FIFO Replacement Policy)

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  H  M  H  M  M  H  H  H  H  M  H  M  M  M  M

The first eight distinct items (A through H) fill the cache; I6 then evicts A (the oldest entry), so the final A0 evicts B and the final B0 evicts C. Hit ratio = 7/18

Direct-Mapped Cache

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  M  M  M  M  M  M  H  H  M  M  H  M  M  M  M

A and B both map to line 0 and keep evicting each other, so only C2 (twice) and D1 hit. Hit ratio = 3/18

Two-Way Set-Associative Cache (LRU Replacement Policy)

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  H  M  H  M  M  M  H  H  M  M  H  M  M  H  H

A, B, and E all compete for set 0, so E4 evicts A and the returning A0 evicts B; with two ways per set, the final references to A and B both hit. Hit ratio = 7/18

Replacement with 2-Byte Line Size Now let's consider data lines of two bytes, so each miss loads a pair of bytes. The data pairs that make up the lines are:  A and J; B and D; C and G; E and F; and I and H.

Associative Cache with 2-Byte Line Size (FIFO Replacement Policy)
(line pairs: A/J, B/D, C/G, E/F, I/H)

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  H  H  H  M  H  H  H  H  H  H  H  M  H  M  M

Loading a byte also brings in its line partner, so D1, F5, and G3 hit on lines fetched for B0, E4, and C2. The cache now holds only four lines, so H7 evicts the oldest line (A/J) and the final A0 and B0 miss again. Hit ratio = 11/18

Direct-Mapped Cache with 2-Byte Line Size
(line pairs: A/J, B/D, C/G, E/F, I/H)

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  M  M  H  M  H  M  H  M  H  H  H  M  H  M  M

The A/J and B/D lines still collide on line 0, but the wider lines mean F5, G3, and I6 hit on data brought in with E4, C2, and H7, and B0 hits whenever D1 has just reloaded their shared line. Hit ratio = 7/18

Two-Way Set-Associative Cache with 2-Byte Line Size (LRU Replacement Policy)
(line pairs: A/J, B/D, C/G, E/F, I/H)

Access: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Result: M  M  M  H  H  H  M  H  M  H  M  H  H  H  M  H  H  H

The A/J, B/D, and E/F lines all map to set 0, so E4 evicts the least recently used line (A/J), the returning A0 then evicts B/D, and D1 must reload it. Hit ratio = 11/18

Cache Performance  The primary reason for including cache memory in a computer is to improve system performance by reducing the time needed to access data in memory.  Cache hit ratio = hits / (hits + misses).  A well-designed cache will have a hit ratio close to 1: hits far outnumber misses.  The more data that is retrieved from the faster cache memory rather than from main memory, the better the system performance.

Cache Performance (cont)  h = hit ratio; T_M = average memory access time, the weighted average of the cache and main-memory access times: T_M = h × T_cache + (1 − h) × T_main  As the hit ratio increases, the average memory access time decreases. [Table: hit ratios vs. average memory access times in ns]
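To make the weighted average concrete, here is a short sketch; the 10 ns and 100 ns timings are assumed example values, not figures from the original table:

    def average_access_time(h, t_cache, t_main):
        # Weighted average: a fraction h of accesses are served at cache
        # speed, the remaining (1 - h) at main-memory speed.
        return h * t_cache + (1 - h) * t_main

    # Assumed timings: 10 ns cache, 100 ns main memory.
    for h in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"h = {h:.2f} -> {average_access_time(h, 10, 100):.1f} ns")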
