1 CMPE 421 Advanced Computer Architecture Caching with Associativity PART2.


2 Other Cache Organizations

Direct Mapped: each address has only one possible location, selected by the index (entries hold V | Tag | Data, indexes 0: through 15: in the diagram). Address = Tag | Index | Block offset.

Fully Associative: no index; a block may be placed in any entry (V | Tag | Data), so everything above the block offset is the tag. Address = Tag | Block offset.

3 Fully Associative Cache

4 A Compromise

2-Way set associative: each address has two possible locations with the same index (indexes 0: through 7:, each holding V | Tag | Data). One fewer index bit: 1/2 the indexes. Address = Tag | Index | Block offset.

4-Way set associative: each address has four possible locations with the same index (indexes 0: through 3:). Two fewer index bits: 1/4 the indexes. Address = Tag | Index | Block offset.

5 Range of Set Associative Caches

Address fields: Tag | Index | Block offset | Byte offset. The tag is used for the tag compare, the index selects the set, and the block offset selects the word in the block.

Increasing associativity: fewer index bits, larger tags. At the extreme, a fully associative cache has only one set, so there is no index and the tag is all the bits except the block and byte offsets.

Decreasing associativity: more index bits, smaller tags. At the extreme, a direct-mapped cache has only one way.
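The field widths above can be computed directly. The sketch below (cache parameters are illustrative, not from the slides) shows how each doubling of associativity moves one bit from the index to the tag, until the fully associative case has no index at all:

```python
# Sketch: compute tag/index/offset widths for a cache as associativity
# varies, holding total capacity and block size fixed.

def field_widths(capacity_bytes, block_bytes, ways, addr_bits=32):
    num_blocks = capacity_bytes // block_bytes
    num_sets = num_blocks // ways                   # fully associative => 1 set
    offset_bits = (block_bytes - 1).bit_length()    # block + byte offset bits
    index_bits = (num_sets - 1).bit_length() if num_sets > 1 else 0
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

# Example: a 4 KB cache with 16-byte blocks (256 blocks total).
for ways in (1, 2, 4, 256):      # 256-way == fully associative here
    print(ways, field_widths(4096, 16, ways))
```

Running this shows the index shrinking from 8 bits (direct mapped) to 0 bits (fully associative) while the tag grows correspondingly.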

6 Set Associative Cache

Example: a two-way set-associative cache (Way 0 and Way 1, each entry holding V | Tag | Data) with one-word blocks, and a main memory with block addresses 0000xx through 1111xx; the two low-order bits define the byte in the 32-bit word.

Q1: How do we find it? Use the next low-order memory address bit to determine which cache set the block maps to: set = (block address) modulo (# sets in the cache).

Q2: Is it there? Compare all the cache tags in the set to the high-order 3 memory address bits to tell if the memory block is in the cache.

7 Set Associative Cache Organization FIGURE 7.17 The implementation of a four-way set-associative cache requires four comparators and a 4-to-1 multiplexor. The comparators determine which element of the selected set (if any) matches the tag. The output of the comparators is used to select the data from one of the four blocks of the indexed set, using a multiplexor with a decoded select signal. In some implementations, the Output enable signals on the data portions of the cache RAMs can be used to select the entry in the set that drives the output. The Output enable signal comes from the comparators, causing the element that matches to drive the data outputs.
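The hardware in Figure 7.17 can be mirrored in software as a rough sketch: the index selects one set, the tags of all ways in that set are compared (the hardware does this in parallel with four comparators; here it is a loop), and the data of the matching way is selected (the role of the 4-to-1 multiplexor). Class and parameter names below are illustrative, not from the text:

```python
# Software analogue of a four-way set-associative lookup.
class SetAssocCache:
    def __init__(self, num_sets, ways, block_bytes):
        self.num_sets = num_sets
        self.block_bytes = block_bytes
        # each set is a list of (valid, tag, data) entries, one per way
        self.sets = [[(False, None, None)] * ways for _ in range(num_sets)]

    def split(self, addr):
        block_addr = addr // self.block_bytes
        return block_addr % self.num_sets, block_addr // self.num_sets

    def fill(self, addr, data, way):
        index, tag = self.split(addr)
        self.sets[index][way] = (True, tag, data)

    def lookup(self, addr):
        index, tag = self.split(addr)         # index selects the set
        for valid, t, data in self.sets[index]:   # the "four comparators"
            if valid and t == tag:            # tag match on a valid way
                return data                   # the "mux" selects this way
        return None                           # miss

cache = SetAssocCache(num_sets=4, ways=4, block_bytes=4)
cache.fill(0x40, data="block@0x40", way=0)
print(cache.lookup(0x40))   # hit
print(cache.lookup(0x80))   # miss -> None
```

Note that real hardware performs all the tag comparisons simultaneously; the loop only models the logical behavior.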

8 Remember the Example for Direct Mapping (Ping-Pong Effect)

Consider the main memory word reference string 0 4 0 4 0 4 0 4, starting with an empty direct-mapped cache (all blocks initially marked as not valid). Words 0 and 4 map to the same cache block, so each reference evicts the other: 8 requests, 8 misses.

This ping-pong effect is due to conflict misses: two memory locations map into the same cache block.

9 Solution: Use a Set-Associative Cache

Consider the same reference string 0 4 0 4 0 4 0 4, starting with an empty 2-way set-associative cache (all blocks initially marked as not valid). Word 0 misses, word 4 misses, and then both stay resident in the same set: 8 requests, 2 misses.

This solves the ping-pong effect seen in the direct-mapped cache, since two memory locations that map into the same cache set can now co-exist.
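The two slides above can be replayed with a small simulator. This sketch (parameters chosen to match the example: a four-block cache of one-word blocks, LRU replacement assumed) counts misses for the direct-mapped and 2-way organizations:

```python
# Sketch: count misses for the reference string 0 4 0 4 ... under
# different associativities, with LRU replacement.
from collections import OrderedDict

def misses(refs, num_sets, ways):
    sets = [OrderedDict() for _ in range(num_sets)]  # tag -> None, LRU order
    miss_count = 0
    for block in refs:
        s, tag = block % num_sets, block // num_sets
        if tag in sets[s]:
            sets[s].move_to_end(tag)         # hit: refresh LRU position
        else:
            miss_count += 1
            if len(sets[s]) == ways:
                sets[s].popitem(last=False)  # evict least recently used
            sets[s][tag] = None
    return miss_count

refs = [0, 4] * 4                            # the reference string 0 4 0 4 ...
print(misses(refs, num_sets=4, ways=1))      # direct mapped: 8 misses
print(misses(refs, num_sets=2, ways=2))      # 2-way set assoc.: 2 misses
```

With 4 sets and 1 way, blocks 0 and 4 both map to set 0 and keep evicting each other; with 2 sets and 2 ways they share set 0 and co-exist after the two compulsory misses.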

10 Set Associative Example

Address fields for the example: Byte offset (2 bits), Block offset (2 bits), Index (1-3 bits), Tag (3-5 bits) — as associativity increases, the index shrinks and the tag grows.

The same reference sequence is replayed against three organizations: Direct-Mapped (indexes 000: through 111:), 2-Way Set Assoc. (indexes 00: through 11:), and 4-Way Set Assoc. (indexes 0: and 1:). Higher associativity converts some of the direct-mapped cache's conflict misses into hits.

11 New Performance Numbers

Miss rates for the DEC 3100 (a MIPS machine) with separate 64KB instruction/data caches:

Benchmark  Associativity  Instruction miss rate  Data miss rate  Combined miss rate
gcc        Direct         2.0%                   1.7%            1.9%
gcc        2-way          1.6%                   1.4%            1.5%
gcc        4-way          1.6%                   1.4%            1.5%
spice      Direct         0.3%                   0.6%            0.4%
spice      2-way          0.3%                   0.6%            0.4%
spice      4-way          0.3%                   0.6%            0.4%

12 Benefits of Set Associative Caches

The largest gains come from going from direct mapped to 2-way (a 20%+ reduction in miss rate). The choice of direct mapped versus set associative depends on the cost of a miss versus the cost of implementation. (Data from Hennessy & Patterson, Computer Architecture, 2003.)

Virtual Memory (32-bit system): 8KB page size, 16MB Mem

The virtual address splits into a virtual page number and a page offset. The page table, indexed by virtual page number, holds for each entry a valid bit (V) and either a physical page number or a disk address. The physical address is the physical page number concatenated with the page offset.

Page table size: 4GB / 8KB = 512K entries (2^19 = 512K).
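The slide's sizing arithmetic can be checked in a few lines (this just reproduces the 4GB / 8KB calculation, nothing more):

```python
# Sketch of the page-table sizing arithmetic for a 32-bit system
# with 8 KB pages.
page_size = 8 * 1024                              # 8 KB = 2**13 bytes
virtual_space = 2 ** 32                           # 4 GB virtual address space

page_offset_bits = (page_size - 1).bit_length()   # 13 offset bits
vpn_bits = 32 - page_offset_bits                  # 19 virtual page number bits
entries = virtual_space // page_size              # 4GB / 8KB = 512K entries

print(page_offset_bits, vpn_bits, entries)        # 13 19 524288
```

So the table needs 2^19 = 512K entries, matching the slide.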

Virtual Memory Example

System with a 20-bit virtual address, 16KB pages, and 256KB of physical memory. The page offset takes 14 bits, leaving 6 bits for the virtual page number (V.P.N.) and 4 bits for the physical page number (P.P.N.).

The page table, indexed by V.P.N., holds for each entry a valid bit and either a P.P.N. (valid = 1) or a disk sector address (valid = 0, e.g. sector 4323).

Access to a valid page: the table gives PPN = 0010, and the physical address is 0010 followed by the page offset.

Access to an invalid page: page fault. Pick a page to "kick out" of memory (use LRU); assume the LRU page for this example occupies PPN 1010. Invalidate the victim's entry, recording its disk sector, then read the faulting page's data from sector 1239 into PPN 1010 and mark the faulting entry valid with PPN = 1010.
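The fault-handling steps in the example can be sketched as a small translation routine. All table contents and VPN values below are illustrative stand-ins (the slide leaves the exact page numbers unspecified); only the disk sector 1239 and PPN pattern come from the example:

```python
# Sketch: page-table translation with a page fault. On a fault, an LRU
# victim's frame is reclaimed and the faulting page is (conceptually)
# read in from its disk sector.

def translate(page_table, vpn, offset, lru_victim):
    entry = page_table[vpn]
    if not entry["valid"]:                        # page fault
        victim = page_table[lru_victim]
        ppn = victim["ppn"]                       # reuse the victim's frame
        victim["valid"], victim["ppn"] = False, None  # victim now on disk
        # (a real OS would read entry["disk"] into frame ppn here)
        entry["valid"], entry["ppn"] = True, ppn
    return (entry["ppn"], offset)                 # physical page + offset

# Hypothetical page table: one resident page, one paged out to sector 1239.
page_table = {
    0b000011: {"valid": True,  "ppn": 0b0010, "disk": None},
    0b000100: {"valid": False, "ppn": None,   "disk": 1239},
}
# Faulting access: evicts the LRU page (VPN 0b000011) and reuses its frame.
print(translate(page_table, 0b000100, 0x123, lru_victim=0b000011))
```

After the fault, the previously resident page is marked invalid and the faulting page owns its physical frame, mirroring the slide's "kick out, then read data in" sequence.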