Set-Associative Cache


Set-Associative Cache Chapter 7

Cache Configuration The set of blocks in a cache can be configured as: direct mapped, N-way set associative, or fully associative. Associative caches have been shown to lower the miss rate. As N increases, the number of index bits decreases, reaching 0 for a fully associative cache. Let's look at a 4-way set-associative cache (Fig. 7.17).
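The relationship between associativity and index bits above can be sketched numerically. This is a minimal illustration, assuming a hypothetical 4 KiB cache with 16-byte blocks (these parameters are not from the slides):

```python
# Sketch: how the number of index bits shrinks as associativity grows.
# Hypothetical parameters: 4 KiB cache, 16-byte blocks -> 256 blocks.

CACHE_BYTES = 4096
BLOCK_BYTES = 16
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES  # 256 blocks total

def index_bits(associativity):
    """Index bits needed to select a set in an N-way set-associative cache."""
    num_sets = NUM_BLOCKS // associativity
    return num_sets.bit_length() - 1  # log2 of a power of two

for n in (1, 2, 4, NUM_BLOCKS):  # direct mapped ... fully associative
    print(f"{n:3}-way: {index_bits(n)} index bits")
```

With these assumed sizes, a direct-mapped (1-way) cache needs 8 index bits, and the fully associative case needs 0, matching the statement that index bits vanish as N grows.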

Cache Configuration Direct mapped: each main-memory block maps to exactly one possible cache block. An N-way set-associative cache has N candidate blocks per main-memory block; for example, a 2-way set-associative cache has two possible locations for every main-memory block. See Fig. 7.14. The tag-comparator complexity increases with N.
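The mapping rule above can be sketched in a few lines. This is an illustrative sketch, assuming a hypothetical 8-block cache (the block count and the example address are not from the slides):

```python
# Sketch: where a main-memory block may live in the cache.
# Direct mapped (1-way) leaves exactly one legal slot; an N-way
# cache leaves N candidate slots within a single set.
# Hypothetical parameter: a cache with 8 blocks in total.

NUM_BLOCKS = 8

def set_for_block(block_addr, ways):
    """Return the set index a memory block maps to in an N-way cache."""
    num_sets = NUM_BLOCKS // ways
    return block_addr % num_sets

# Memory block 19, direct mapped: the single slot at set 19 % 8 = 3.
print(set_for_block(19, 1))
# Same block, 2-way: set 19 % 4 = 3, but two candidate slots, so the
# hardware must compare two tags in parallel -- the comparator cost
# that grows with N.
print(set_for_block(19, 2))
```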

Performance with Cache (pp. 505-506) Consider a processor with a base CPI of 1 (assuming a 100% hit rate in the primary cache) and a clock rate of 5 GHz. Main-memory access time is 100 ns, including all miss handling. The miss rate per instruction at the primary cache is 2%. How much faster will the processor be if we add a second level of cache with the following characteristics? The secondary cache takes 5 ns for a hit or miss, and reduces the miss rate to main memory to 0.5%.

Solution: Performance measured in CPI With only the primary cache: 1 + (2/100) × (5 GHz × 100 ns) = 1 + 0.02 × 500 = 11 CPI With the secondary cache: base CPI of 1, plus the L1 miss cost (2/100) × (5 GHz × 5 ns) = 0.5, plus the L2 miss cost (0.5/100) × (5 GHz × 100 ns) = 2.5, giving 1 + 0.5 + 2.5 = 4 CPI The processor with the secondary cache is therefore 11/4 = 2.75 times faster.
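The CPI arithmetic above can be checked directly. All numbers below come from the example itself (5 GHz clock, 100 ns memory, 5 ns L2, 2% and 0.5% miss rates per instruction):

```python
# Arithmetic check of the two-level cache example: at 5 GHz a cycle
# is 0.2 ns, so each miss penalty converts to cycles as ns x GHz.

CLOCK_GHZ = 5                     # cycles per nanosecond
mem_penalty = 100 * CLOCK_GHZ     # 100 ns -> 500 cycles to main memory
l2_penalty = 5 * CLOCK_GHZ        # 5 ns  -> 25 cycles to the L2 cache

# Primary cache only: every L1 miss (2% per instruction) pays the
# full main-memory penalty.
cpi_l1_only = 1 + 0.02 * mem_penalty          # 1 + 10 = 11 CPI

# With L2: 2% of instructions pay the L2 penalty, and only 0.5%
# still go all the way to main memory.
cpi_two_level = 1 + 0.02 * l2_penalty + 0.005 * mem_penalty
                                              # 1 + 0.5 + 2.5 = 4 CPI

speedup = cpi_l1_only / cpi_two_level         # 11 / 4 = 2.75
print(cpi_l1_only, cpi_two_level, speedup)
```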