Realistic Memories and Caches – Part II
Li-Shiuan Peh
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
April 2, 2012 (L14-1)


Data Cache – Interface (0,n)

    interface DataCache;
      method ActionValue#(Bool) req(MemReq r);
      method ActionValue#(Maybe#(Data)) resp;
    endinterface

[Figure: Processor, cache (req, accepted, resp), ReqRespMem, DRAM]

req: the returned Bool denotes whether the request is accepted this cycle or not.
resp: should be invoked every cycle so that no values are dropped.

Data Cache – Internals (new)

[Figure: Processor, Cache (req, accepted, resp; hit wire, miss wire, evict wire, missFifo, TagDataStore), ReqRespMem, DRAM]

    interface TagDataStore;
      method ActionValue#(HitOrMiss) tagLookup(Addr addr);
      method ActionValue#(CacheResp) dataAccess(CacheDataReq req);
    endinterface

Direct mapped caches

Data Cache – code structure

    module mkDataCache#(ReqRespMem mem)(DataCache);
      TagDataStore store <- mkTagDataStore;
      FIFOF#(Tuple2#(CacheIdentifier, MemReq)) missFifo <- mkSizedFIFOF(missFifoSize);
      RWire#(Tuple2#(CacheIdentifier, MemReq)) hitWire <- mkRWire;
      RWire#(Tuple2#(CacheIdentifier, Addr)) evictWire <- mkRWire;
      RWire#(Tuple2#(CacheIdentifier, MemReq)) missWire <- mkRWire;

      function CacheDataReq getCacheDataReqWord(CacheIdentifier id, MemReq r);
      endfunction

      method ActionValue#(Bool) req(MemReq r) … endmethod
      method ActionValue#(Maybe#(Data)) resp … endmethod
    endmodule

TagDataStore array of a direct-mapped cache

    module mkTagDataStore(TagDataStore);
      RegFile#(LineIndex, Tag) tagArray <- mkRegFileFull;
      Vector#(DCacheRows, Reg#(Bool)) valid <- replicateM(mkReg(False));
      Vector#(DCacheRows, Reg#(Bool)) dirty <- replicateM(mkReg(False));
      RegFile#(LineIndex, Line) dataArray <- mkRegFileFull;

TagDataStore: tagLookup

    method ActionValue#(HitOrMiss) tagLookup(Addr addr);
      let lineIndex = getLineIndex(addr);
      let tag = tagArray.sub(lineIndex);
      if (valid[lineIndex] && tag == getTag(addr))
        return HitOrMiss{hit: True,
                         id: CacheIdentifier{lineIndex: lineIndex,
                                             wordIndex: getWordIndex(addr)},
                         addr: ?,
                         dirty: ?};
      else
        return HitOrMiss{hit: False,
                         id: CacheIdentifier{lineIndex: lineIndex,
                                             wordIndex: ?},
                         addr: getAddr(tag, lineIndex, 0),
                         dirty: valid[lineIndex] && dirty[lineIndex]};
    endmethod
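The lookup above can be mirrored in a small behavioral model. This is a Python sketch, not Bluespec, and the field widths (4-byte words, 4 words per line, 8 rows) are assumptions picked for illustration since the slides do not state them. The helper names only echo getTag, getLineIndex, getWordIndex and getAddr, and fill is a hypothetical stand-in for the FillLine request.

```python
# Behavioral sketch of a direct-mapped tag lookup. All sizes are assumed.
WORD_BITS = 2       # assumed: 4-byte words (byte-offset bits)
WORD_IDX_BITS = 2   # assumed: 4 words per line
LINE_IDX_BITS = 3   # assumed: 8 cache rows
ROWS = 1 << LINE_IDX_BITS


def get_word_index(addr):
    return (addr >> WORD_BITS) & ((1 << WORD_IDX_BITS) - 1)


def get_line_index(addr):
    return (addr >> (WORD_BITS + WORD_IDX_BITS)) & (ROWS - 1)


def get_tag(addr):
    return addr >> (WORD_BITS + WORD_IDX_BITS + LINE_IDX_BITS)


def get_addr(tag, line_index, word_index):
    # Reassemble a byte address from its three fields.
    line_number = (tag << LINE_IDX_BITS) | line_index
    word_number = (line_number << WORD_IDX_BITS) | word_index
    return word_number << WORD_BITS


class TagStore:
    def __init__(self):
        self.tag = [0] * ROWS
        self.valid = [False] * ROWS
        self.dirty = [False] * ROWS

    def fill(self, addr, dirty=False):
        # Hypothetical helper: install the line holding addr.
        idx = get_line_index(addr)
        self.tag[idx] = get_tag(addr)
        self.valid[idx] = True
        self.dirty[idx] = dirty

    def tag_lookup(self, addr):
        # Returns a dict shaped like HitOrMiss; None plays the role of '?'.
        idx = get_line_index(addr)
        if self.valid[idx] and self.tag[idx] == get_tag(addr):
            return {"hit": True, "line_index": idx,
                    "word_index": get_word_index(addr),
                    "addr": None, "dirty": None}
        # On a miss, report the address of the resident line so that it
        # can be written back if it is dirty.
        return {"hit": False, "line_index": idx, "word_index": None,
                "addr": get_addr(self.tag[idx], idx, 0),
                "dirty": self.valid[idx] and self.dirty[idx]}


store = TagStore()
store.fill(0x1234, dirty=True)
same_line = store.tag_lookup(0x1234)
conflict = store.tag_lookup(0x1234 + ROWS * 16)  # same row, different tag
```

Under these assumed widths, the second lookup maps to the same row as 0x1234 but carries a different tag, so it misses and reports the dirty resident line at 0x1230 as the writeback candidate.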

TagDataStore: dataAccess

    method ActionValue#(CacheResp) dataAccess(CacheDataReq req);
      let id = req.id;
      case (req.reqType) matches
        tagged ReadWord: …
        tagged WriteWord .w: …
        tagged ReadLine: …
        tagged FillLine {.l, .addr, .d}: …
        tagged InvalidateLine: …
      endcase
    endmethod

Data Cache requests

    method ActionValue#(Bool) req(MemReq r);
      $display("mem r %d %h %h", r.op, r.addr, r.data);
      let hitOrMiss <- store.tagLookup(r.addr);
      if (hitOrMiss.hit) begin
        hitWire.wset(tuple2(hitOrMiss.id, r));
        $display("Hit %d %h %h", r.op, r.addr, r.data);
        return True;
      end
      else if (hitOrMiss.dirty) begin
        $display("Miss");
        if (mem.reqNotFull) begin
          $display("eviction");
          evictWire.wset(tuple2(hitOrMiss.id, hitOrMiss.addr));
        end
        return False;
      end

Data Cache requests (continued)

      else if (mem.reqNotFull && missFifo.notFull) begin
        $display("fill");
        missWire.wset(tuple2(hitOrMiss.id, r));
        return True;
      end
      else
        return False;
    endmethod
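The branch order of the req method can be summarized as a small decision function. The sketch below is a Python model with hypothetical names; it returns the accepted flag together with the wire that would be set in that cycle, and it ignores the $display calls.

```python
def req_decision(hit, dirty, mem_req_not_full, miss_fifo_not_full):
    """Return (accepted, wire) in the same branch order as the req method.

    wire is "hit", "evict", "miss", or None when no wire is set.
    """
    if hit:
        return True, "hit"
    if dirty:
        # Dirty victim: start the writeback if memory can take it, but do
        # not accept the request; the processor must retry later.
        return False, ("evict" if mem_req_not_full else None)
    if mem_req_not_full and miss_fifo_not_full:
        # Clean miss with room to track it: accept and fetch the line.
        return True, "miss"
    return False, None


# A miss on a dirty line takes at least two attempts: the first triggers
# the eviction, and the retry sees a clean miss and is accepted.
first_try = req_decision(hit=False, dirty=True,
                         mem_req_not_full=True, miss_fifo_not_full=True)
retry = req_decision(hit=False, dirty=False,
                     mem_req_not_full=True, miss_fifo_not_full=True)
```

The retry is modeled as a clean miss on the assumption that the eviction invalidated the line in between, which is what the InvalidateLine request in the resp method suggests.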

Data Cache responses (hitWire)

    method ActionValue#(Maybe#(Data)) resp;
      if (hitWire.wget matches tagged Valid {.id, .req}) begin
        $display("hit response");
        let word <- store.dataAccess(getCacheDataReqWord(id, req));
        return tagged Valid (word.Word);
      end

Data Cache responses (evictWire)

      else if (evictWire.wget matches tagged Valid {.id, .addr}) begin
        $display("resp evict");
        let line <- store.dataAccess(CacheDataReq{id: id, reqType: InvalidateLine});
        $display("resp writeback");
        mem.reqEnq(MemReq{op: St, addr: addr, data: unpack(pack(line.Line))});
        return tagged Invalid;
      end

Data Cache responses (missWire)

      else if (missWire.wget matches tagged Valid {.id, .r}) begin
        $display("resp miss");
        missFifo.enq(tuple2(id, r));
        let req = r;
        req.op = Ld;
        mem.reqEnq(req);
        return tagged Invalid;
      end

Data Cache responses (memory responds)

      else if (mem.respNotEmpty && missFifo.notEmpty) begin
        $display("resp process");
        mem.respDeq;
        missFifo.deq;
        match {.id, .req} = missFifo.first;
        let line = req.op == Ld ? unpack(pack(mem.respFirst)) : unpack(pack(req.data));
        let dirty = req.op == Ld ? False : True;
        let l <- store.dataAccess(CacheDataReq{id: id,
                   reqType: tagged FillLine (tuple3(line, req.addr, dirty))});
        return tagged Valid line;
      end

Non-blocking to blocking: Modify which component?

[Figure: Processor, Cache (req, accepted, resp; hit wire, miss wire, evict wire, missFifo, TagDataStore), ReqRespMem, DRAM]

method req:

    if (missFifo.notEmpty) return False;
    else // the rest of the original code

Writeback to writethrough: Modify which component?

[Figure: Processor, Cache (req, accepted, resp; hit wire, miss wire, evict wire, missFifo, TagDataStore), ReqRespMem, DRAM]

Components to modify:
    TagDataStore’s dataAccess
    DataCache’s req
    DataCache’s resp
Result: no evict wire and no dirty bits
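One way to see what the two policies trade off is to count the stores that reach memory. The following Python sketch is an illustrative model, not the lecture's code: it assumes one-word lines, write-allocate in both modes, and it does not count dirty lines still resident at the end of the trace. The function name and trace format are hypothetical.

```python
def memory_writes(trace, rows=4, write_through=False):
    """Count stores sent to memory by a direct-mapped cache.

    trace is a list of (op, line_addr) pairs with op "Ld" or "St".
    Assumes one-word lines and write-allocate in both modes.
    """
    tags = [None] * rows
    dirty = [False] * rows
    writes = 0
    for op, addr in trace:
        idx = addr % rows
        if tags[idx] != addr:        # miss: replace the resident line
            if dirty[idx]:           # only ever true in write-back mode
                writes += 1          # write the dirty victim back
            tags[idx] = addr
            dirty[idx] = False
        if op == "St":
            if write_through:
                writes += 1          # every store goes straight to memory
            else:
                dirty[idx] = True    # defer the write until eviction
    return writes


# Three stores to line 0, then a load of line 4 that maps to the same row.
trace = [("St", 0), ("St", 0), ("St", 0), ("Ld", 4)]
wb_writes = memory_writes(trace, write_through=False)
wt_writes = memory_writes(trace, write_through=True)
```

In the write-through run the dirty array is never set, which matches the observation that write-through needs neither dirty bits nor an eviction path.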

2-way set associative

Types of misses

Compulsory misses (cold start)
    Cold fact of life
    First time data is referenced
    Become insignificant after running billions of instructions
Capacity misses
    Working set is larger than cache size
    Solution: increase cache size
Conflict misses
    Multiple memory locations mapped to the same cache location
    One set fills up, but there is space in other cache sets
    Solution: set-associative caches
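The three categories can be separated mechanically with a common textbook convention, which the slides do not spell out and which is assumed here: a miss is compulsory if the block was never referenced before, capacity if a fully associative LRU cache of the same size would also miss, and conflict otherwise. The Python sketch below applies that convention to a direct-mapped cache with one block per line; the function name is hypothetical.

```python
from collections import OrderedDict


def classify_misses(trace, num_lines):
    """Classify the misses of a direct-mapped cache with num_lines lines.

    trace is a list of block numbers.
    """
    seen = set()              # blocks referenced so far
    fa = OrderedDict()        # fully associative LRU cache of the same size
    dm = [None] * num_lines   # the direct-mapped cache under study
    counts = {"compulsory": 0, "capacity": 0, "conflict": 0}
    for block in trace:
        # Reference the fully associative cache.
        fa_hit = block in fa
        if fa_hit:
            fa.move_to_end(block)
        else:
            if len(fa) == num_lines:
                fa.popitem(last=False)   # evict the least recently used
            fa[block] = True
        # Reference the direct-mapped cache.
        idx = block % num_lines
        dm_hit = dm[idx] == block
        dm[idx] = block
        if not dm_hit:
            if block not in seen:
                counts["compulsory"] += 1
            elif not fa_hit:
                counts["capacity"] += 1
            else:
                counts["conflict"] += 1
        seen.add(block)
    return counts


# Blocks 0 and 4 fight over row 0 although three rows stay empty.
ping_pong = classify_misses([0, 4, 0, 4], num_lines=4)
# Five distinct blocks do not fit into four lines.
too_big = classify_misses([0, 1, 2, 3, 4, 0], num_lines=4)
```

The first trace shows conflict misses in the sense of the slide: one row is oversubscribed while the rest of the cache is unused. The second shows a capacity miss, since even a fully associative cache of four lines would have evicted block 0.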

Reduce Conflict Misses

Memory time = Hit time + Prob(miss) × Miss penalty

Associativity: allow blocks to go to several frames in cache
    2-way set associative: each block maps to either of 2 cache frames
    Fully associative: each block maps to any cache frame
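A worked example makes the formula concrete. The Python sketch below measures the miss rate of a tiny cache at different associativities and plugs it into the memory-time equation. The hit time of 1 cycle, the miss penalty of 20 cycles, the 4-frame cache, and the access pattern are all assumed values for illustration, and LRU replacement within a set is an assumption as well.

```python
def memory_time(hit_time, miss_prob, miss_penalty):
    # Memory time = Hit time + Prob(miss) * Miss penalty
    return hit_time + miss_prob * miss_penalty


def miss_rate(trace, num_frames, ways):
    """Miss rate of a cache with num_frames block frames grouped into
    sets of `ways` frames each, with LRU replacement inside a set."""
    num_sets = num_frames // ways
    sets = [[] for _ in range(num_sets)]   # each list is ordered LRU -> MRU
    misses = 0
    for block in trace:
        frames = sets[block % num_sets]
        if block in frames:
            frames.remove(block)           # hit: re-appended below as MRU
        else:
            misses += 1
            if len(frames) == ways:
                frames.pop(0)              # evict the least recently used
        frames.append(block)
    return misses / len(trace)


trace = [0, 4] * 8                         # two blocks that collide when direct-mapped
direct_mapped = miss_rate(trace, num_frames=4, ways=1)
two_way = miss_rate(trace, num_frames=4, ways=2)

time_direct = memory_time(1, direct_mapped, 20)
time_two_way = memory_time(1, two_way, 20)
```

With these assumed numbers the direct-mapped cache misses on every reference, while the 2-way cache misses only on the first touch of each block, so the memory time drops from 21 cycles to 3.5 cycles for this particular pattern.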

2-Way Set-Associative Cache

Data Cache – code structure: Should this change from direct-mapped to n-way set associative?

    module mkDataCache#(ReqRespMem mem)(DataCache);
      TagDataStore store <- mkTagDataStore;
      FIFOF#(Tuple2#(CacheIdentifier, MemReq)) missFifo <- mkSizedFIFOF(missFifoSize);
      RWire#(Tuple2#(CacheIdentifier, MemReq)) hitWire <- mkRWire;
      RWire#(Tuple2#(CacheIdentifier, Addr)) evictWire <- mkRWire;
      RWire#(Tuple2#(CacheIdentifier, MemReq)) missWire <- mkRWire;

      function CacheDataReq getCacheDataReqWord(CacheIdentifier id, MemReq r);
      endfunction

      method ActionValue#(Bool) req(MemReq r) … endmethod
      method ActionValue#(Maybe#(Data)) resp … endmethod
    endmodule

Data Cache – Building the TagDataStore array: Should this change from a direct-mapped cache to an n-way set-associative one?

    module mkTagDataStore(TagDataStore);
      RegFile#(LineIndex, Tag) tagArray <- mkRegFileFull;
      Vector#(DCacheRows, Reg#(Bool)) valid <- replicateM(mkReg(False));
      Vector#(DCacheRows, Reg#(Bool)) dirty <- replicateM(mkReg(False));
      RegFile#(LineIndex, Line) dataArray <- mkRegFileFull;

n-way set-associative TagDataStore

    typedef 2 NumWays;
    typedef Bit#(TLog#(NumWays)) WaysIndex;

    module mkTagDataStore(TagDataStore);
      Vector#(NumWays, RegFile#(LineIndex, Tag)) tagArray <- replicateM(mkRegFileFull);
      Vector#(NumWays, Vector#(DCacheRows, Reg#(Bool))) valid <- replicateM(replicateM(mkReg(False)));
      Vector#(NumWays, Vector#(DCacheRows, Reg#(Bool))) dirty <- replicateM(replicateM(mkReg(False)));
      Vector#(NumWays, RegFile#(LineIndex, Line)) dataArray <- replicateM(mkRegFileFull);
      Vector#(DCacheRows, Reg#(WaysIndex)) lru <- replicateM(mkReg(0));

TagDataStore’s tagLookup: Should this change from direct-mapped to n-way set associative?

    method ActionValue#(HitOrMiss) tagLookup(Addr addr);
      let lineIndex = getLineIndex(addr);
      let tag = tagArray.sub(lineIndex);
      if (valid[lineIndex] && tag == getTag(addr))
        return HitOrMiss{hit: True,
                         id: CacheIdentifier{lineIndex: lineIndex,
                                             wordIndex: getWordIndex(addr)},
                         addr: ?,
                         dirty: ?};
      else
        return HitOrMiss{hit: False,
                         id: CacheIdentifier{lineIndex: lineIndex,
                                             wordIndex: ?},
                         addr: getAddr(tag, lineIndex, 0),
                         dirty: valid[lineIndex] && dirty[lineIndex]};
    endmethod

N-way set-associative TagDataStore’s tagLookup

Updating of LRU:

    if (hit matches tagged Valid .i)
      lru[lineIndex] <= fromInteger(1-i);
    else
      lru[lineIndex] <= 1-lru[lineIndex];
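With two ways, a single bit per set is enough to track the least recently used way, which is what the update above relies on. The Python sketch below models one set; the class and method names are hypothetical, and it simplifies the hardware by filling the victim way in the same step as the miss rather than through a separate fill request.

```python
class TwoWaySet:
    """One set of a 2-way set-associative cache.

    lru holds the index of the way that will be replaced next.
    """

    def __init__(self):
        self.tags = [None, None]
        self.lru = 0

    def access(self, tag):
        """Reference tag in this set and return True on a hit."""
        for way in range(2):
            if self.tags[way] == tag:
                self.lru = 1 - way      # hit: the other way becomes the victim
                return True
        victim = self.lru
        self.tags[victim] = tag         # miss: fill the least recently used way
        self.lru = 1 - victim           # the way just filled is now most recent
        return False


s = TwoWaySet()
results = [s.access(tag) for tag in (10, 20, 10, 30, 10, 20)]
```

In this sequence tag 10 survives both replacements because every hit on it flips the bit to point at the other way, so tags 20 and 30 keep evicting each other.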

TagDataStore’s dataAccess: Should this change from direct-mapped to n-way set associative?

    method ActionValue#(CacheResp) dataAccess(CacheDataReq req);
      let id = req.id;
      case (req.reqType) matches
        tagged ReadWord: …
        tagged WriteWord .w: …
        tagged ReadLine: …
        tagged FillLine {.l, .addr, .d}: …
        tagged InvalidateLine: …
      endcase
    endmethod