Constructive Computer Architecture: Store Buffers and Non-blocking Caches. Arvind, Computer Science & Artificial Intelligence Lab., Massachusetts Institute of Technology.

Presentation transcript:

Constructive Computer Architecture
Store Buffers and Non-blocking Caches
Arvind
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
November 4, 2013 http://csg.csail.mit.edu/6.S195 L18-1

Contributors to the course material
Arvind, Rishiyur S. Nikhil, Joel Emer, Muralidaran Vijayaraghavan
Staff and students in (Spring 2013), 6.S195 (Fall 2012), 6.S078 (Spring 2012): Asif Khan, Richard Uhler, Sang Woo Jun, Abhinav Agarwal, Myron King, Kermin Fleming, Ming Liu, Li-Shiuan Peh
External:
Prof Amey Karkare & students at IIT Kanpur
Prof Jihong Kim & students at Seoul National University
Prof Derek Chiou, University of Texas at Austin
Prof Yoav Etsion & students at Technion

Non-blocking cache
The completion buffer controls the entry of requests and ensures that departures take place in order even if loads complete out of order.
Requests to the backend have to be tagged.
[Figure: block diagram. Labels: Processor, req, resp, cbuf, req proc, cache, FIFO responses, OOO responses, mReq, mResp, mReqQ, mRespQ.]

Completion buffer: Interface

    interface CBuffer#(type t);
      method ActionValue#(Token) getToken;
      method Action put(Token tok, t d);
      method ActionValue#(t) getResult;
    endinterface

[Figure: cbuf with ports getToken, put (result & token), and getResult.]
Concurrency requirement: getToken < put < getResult

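Non-blocking FIFO Cache

    module mkNBFifoCache(Cache);
      CBuffer cBuf <- mkCompletionBuffer;
      NBCache nbCache <- mkNBtaggedCache;

      // Forward each (possibly out-of-order) backend response to the completion buffer.
      // x carries both the result and its token.
      rule nbCacheResponse;
        let x <- nbCache.resp;
        cBuf.put(x);
      endrule

      // Reserve a token, then send the tagged request to the backend.
      method Action req(MemReq x);
        let tok <- cBuf.getToken;
        nbCache.req(TaggedMemReq{req:x, tag:tok});
      endmethod

      // Responses leave in request order. getResult is an ActionValue,
      // so this method has to be an ActionValue as well.
      method ActionValue#(MemResp) resp;
        let x <- cBuf.getResult;
        return x;
      endmethod
    endmodule

[Figure: req and resp ports, with cbuf.]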
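The wrapper above can be modelled in software to see the reordering at work. This is a minimal Python sketch, not the course's code: the class name `FifoFrontEnd`, its method names, and the use of `None` for "not ready" are assumptions made for illustration. Tokens are handed out in issue order, the backend may answer in any order, and `resp` only releases the oldest request.

```python
from collections import deque


class FifoFrontEnd:
    """Tags requests with tokens and returns responses in request order."""

    def __init__(self, size):
        self.size = size
        self.pending = deque()  # tokens in issue order, oldest first
        self.done = {}          # token -> response, for completed requests
        self.next_tok = 0

    def req(self, addr):
        """Reserve a token and return the tagged request for the backend."""
        if len(self.pending) == self.size:
            raise RuntimeError("no free token")
        tok = self.next_tok
        self.next_tok = (self.next_tok + 1) % self.size
        self.pending.append(tok)
        return (tok, addr)

    def backend_resp(self, tok, data):
        """Record a backend response; the tag says which request it answers."""
        self.done[tok] = data

    def resp(self):
        """Return the oldest request's response, or None if it is not back yet."""
        if self.pending and self.pending[0] in self.done:
            return self.done.pop(self.pending.popleft())
        return None


fe = FifoFrontEnd(4)
tagged = [fe.req(a) for a in (0x10, 0x20, 0x30)]
# The backend finishes the third request first, then the first, then the second.
for i in (2, 0, 1):
    tok, addr = tagged[i]
    fe.backend_resp(tok, "data@%#x" % addr)
in_order = [fe.resp() for _ in range(3)]
```

Here `in_order` lists the responses in the order the requests were issued, regardless of the order in which the backend completed them.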

Non-blocking Cache
[Figure: cache array with state (V/D/I/W), Tag and Data fields, surrounded by St Q, Ld Buff, Wait Q, wbQ, hitQ, mReqQ and mRespQ, with req and resp ports.]
St Q: holds St reqs that have not been sent to cache/memory
Ld Buff: load reqs before the req for data has been made
Wait Q: waiting load reqs after the req for data has been made
Cache state (V/D/I/W): an extra bit in the cache to indicate if the data for a line is present
Behavior to be described by 4 concurrent FSMs
St req goes in StQ; Ld req searches: (1) StQ (2) Cache (3) LdQ

Incoming req
Type of request?
  st: put in stQ
  ld: In stQ?
    yes: bypass hit
    no: in cache?
      yes: with data?
        yes: hit
        no: put in waitQ (data on the way from mem)
      no: in ldBuf?
        yes: put in waitQ
        no: put in ldBuf & waitQ
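The decision tree for an incoming request can be written as a single function. The following Python sketch is illustrative only: the data layout (a dict from line address to a word list, with `None` for "tag present, data missing"), the line size `LINE_WORDS`, and the choice that the youngest matching store wins the bypass are all assumptions, not details given on the slide.

```python
LINE_WORDS = 4  # hypothetical line size in words


def handle_incoming(req, st_q, cache, ld_buf, wait_q):
    """Classify one request and update the queues.

    req    : ("st", addr, data) or ("ld", addr)
    st_q   : list of (addr, data) stores not yet sent to cache/memory
    cache  : dict line -> list of words, or None if the data is still missing
    ld_buf : list of lines for which no fill request has been made yet
    wait_q : list of load addresses waiting for data
    """
    kind, addr = req[0], req[1]
    if kind == "st":
        st_q.append((addr, req[2]))
        return "put in stQ"
    # Load: search (1) stQ, (2) cache, (3) ldBuf.
    for st_addr, st_data in reversed(st_q):  # assumption: youngest store wins
        if st_addr == addr:
            return ("bypass hit", st_data)
    line = addr // LINE_WORDS
    if line in cache:
        if cache[line] is not None:
            return ("hit", cache[line][addr % LINE_WORDS])
        wait_q.append(addr)
        return "put in waitQ (data on the way from mem)"
    if line in ld_buf:
        wait_q.append(addr)
        return "put in waitQ"
    ld_buf.append(line)
    wait_q.append(addr)
    return "put in ldBuf & waitQ"
```

Each return value names the leaf of the tree that was reached, so a trace of calls can be compared directly against the slide.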

Store buffer processing the oldest entry
Tag in cache?
  yes: Data in cache?
    yes: update cache
    no: wait
  no: wbReq (no-allocate-on-write-miss policy)
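A small Python sketch of this store-buffer step follows, under stated assumptions: the cache is a dict from line address to a word list (`None` when the tag is present but the data has not arrived), `LINE_WORDS` is a hypothetical line size, and the write-back queue simply records `(addr, data)` pairs. Whether an updated line is marked dirty or written through is not modelled.

```python
from collections import deque

LINE_WORDS = 4  # hypothetical line size in words


def process_oldest_store(st_q, cache, wb_q):
    """Handle the store at the head of st_q and report the action taken."""
    if not st_q:
        return "idle"
    addr, data = st_q[0]
    line = addr // LINE_WORDS
    if line in cache:
        if cache[line] is None:
            return "wait"  # tag present, data on the way: store stays at the head
        cache[line][addr % LINE_WORDS] = data
        st_q.popleft()
        return "update cache"
    # Tag not in cache: send the store to memory without allocating a line.
    wb_q.append((addr, data))
    st_q.popleft()
    return "wbReq"


cache = {0: [0, 0, 0, 0], 1: None}
st_q = deque([(2, 7), (5, 8), (9, 9)])
wb_q = []
first = process_oldest_store(st_q, cache, wb_q)   # line 0 has data
second = process_oldest_store(st_q, cache, wb_q)  # line 1 is still being filled
```

In this run the first call updates word 2 of line 0, and the second call leaves the store to address 5 at the head of the queue because its line has no data yet.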

Load buffer processing the oldest entry
Evacuation needed?
  yes: Wb Req; replace tag; data missing; fill Req
  no: replace tag; data missing; fill Req
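The load-buffer step can be sketched in Python as below. Several details are assumptions made to keep the example concrete: a direct-mapped cache with `NUM_SLOTS` lines, a per-slot `dirty` flag, and "evacuation needed" meaning that the slot currently holds a dirty line with data. Replacing a slot whose own fill is still outstanding is not handled.

```python
from collections import deque

NUM_SLOTS = 4  # hypothetical direct-mapped cache size, in lines


def process_oldest_load(ld_buf, slots, wb_q, m_req_q):
    """Handle the line at the head of ld_buf and report the action taken."""
    if not ld_buf:
        return "idle"
    line = ld_buf.popleft()
    idx = line % NUM_SLOTS
    old = slots[idx]
    # Assumption: only a dirty line that actually holds data must be written back.
    evacuate = old is not None and old["dirty"] and old["data"] is not None
    if evacuate:
        wb_q.append((old["line"], old["data"]))               # Wb Req
    slots[idx] = {"line": line, "data": None, "dirty": False}  # replace tag, data missing
    m_req_q.append(line)                                       # fill Req
    return "wb + fill" if evacuate else "fill"


slots = [None] * NUM_SLOTS
slots[1] = {"line": 1, "data": [1, 2, 3, 4], "dirty": True}
ld_buf = deque([5, 6])
wb_q, m_req_q = [], []
first = process_oldest_load(ld_buf, slots, wb_q, m_req_q)   # line 5 maps to dirty slot 1
second = process_oldest_load(ld_buf, slots, wb_q, m_req_q)  # line 6 maps to empty slot 2
```

After both calls the slots for lines 5 and 6 hold the new tags with data marked missing, which is the condition under which later loads to those lines go to the waitQ.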

Mem Resp (line)
Update cache
Process all req in waitQ for the addresses in the line
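The two steps of the memory-response handler can be sketched as follows. This is a Python model under assumptions: `LINE_WORDS` is a hypothetical line size, the cache is a dict from line address to a word list, and the waitQ is a plain list of load addresses that is scanned in order.

```python
LINE_WORDS = 4  # hypothetical line size in words


def process_mem_resp(line, words, cache, wait_q):
    """Install a returned line and answer every waiting load that falls in it."""
    cache[line] = list(words)  # update cache: the data is now present
    served, still_waiting = [], []
    for addr in wait_q:
        if addr // LINE_WORDS == line:
            served.append((addr, words[addr % LINE_WORDS]))
        else:
            still_waiting.append(addr)
    wait_q[:] = still_waiting  # loads for other lines keep waiting, in order
    return served


cache = {3: None}
wait_q = [12, 5, 14]
served = process_mem_resp(3, [40, 41, 42, 43], cache, wait_q)
```

Loads to addresses 12 and 14 fall in line 3 and are answered together; the load to address 5 belongs to another line and remains in the waitQ.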

Completion buffer: Implementation
[Figure: buf, a row of entries marked I or V, with pointers iidx and ridx and counter cnt.]
A circular buffer with two pointers iidx and ridx, and a counter cnt
Elements are of Maybe type

    module mkCompletionBuffer(CompletionBuffer#(size));
      Vector#(size, EHR#(Maybe#(t))) cb <- replicateM(mkEHR(Invalid));
      Reg#(Bit#(TAdd#(TLog#(size),1))) iidx <- mkReg(0);
      Reg#(Bit#(TAdd#(TLog#(size),1))) ridx <- mkReg(0);
      EHR#(Bit#(TAdd#(TLog#(size),1))) cnt <- mkEHR(0);
      Integer vsize = valueOf(size);
      Bit#(TAdd#(TLog#(size),1)) sz = fromInteger(vsize);
      // rules and methods...
    endmodule

Completion Buffer cont

    // Reserve the next slot; blocked while the buffer is full.
    method ActionValue#(Token) getToken() if (cnt.r0 != sz);
      cb[iidx].w0(Invalid);
      iidx <= iidx == sz-1 ? 0 : iidx + 1;
      cnt.w0(cnt.r0 + 1);
      return iidx;
    endmethod

    // Fill the slot named by the token; may happen in any order.
    method Action put(Token idx, t data);
      cb[idx].w1(tagged Valid data);
    endmethod

    // Release the oldest slot; blocked until that slot is Valid.
    method ActionValue#(t) getResult() if (cnt.r1 != 0 &&& cb[ridx].r2 matches tagged Valid .x);
      cb[ridx].w2(Invalid);
      ridx <= ridx == sz-1 ? 0 : ridx + 1;
      cnt.w1(cnt.r1 - 1);
      return x;
    endmethod

getToken < put < getResult
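A sequential Python model of the same circular buffer helps check the pointer and counter arithmetic. It is a sketch, not a translation of the EHR-based hardware: it ignores same-cycle concurrency, represents Invalid as `None` and Valid data as a one-element tuple, and exposes the method guards as `can_get_token` and `can_get_result`, all of which are choices made here for illustration.

```python
class CompletionBuffer:
    """Circular buffer with insert pointer iidx, remove pointer ridx, counter cnt."""

    def __init__(self, size):
        self.size = size
        self.buf = [None] * size  # None models Invalid; (data,) models Valid data
        self.iidx = 0             # next slot to hand out as a token
        self.ridx = 0             # slot of the oldest outstanding request
        self.cnt = 0              # number of outstanding tokens

    def can_get_token(self):
        return self.cnt != self.size

    def get_token(self):
        if not self.can_get_token():
            raise RuntimeError("guard false: buffer full")
        tok = self.iidx
        self.buf[tok] = None
        self.iidx = 0 if self.iidx == self.size - 1 else self.iidx + 1
        self.cnt += 1
        return tok

    def put(self, tok, data):
        self.buf[tok] = (data,)

    def can_get_result(self):
        return self.cnt != 0 and self.buf[self.ridx] is not None

    def get_result(self):
        if not self.can_get_result():
            raise RuntimeError("guard false: oldest entry is not Valid")
        (data,) = self.buf[self.ridx]
        self.buf[self.ridx] = None
        self.ridx = 0 if self.ridx == self.size - 1 else self.ridx + 1
        self.cnt -= 1
        return data
```

With a buffer of size 2, taking two tokens and filling only the second leaves `can_get_result()` false; the result for the second token becomes visible only after the first has been filled and removed.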