Tutorial 4: RISCV modules Constructive Computer Architecture

Slides:



Advertisements
Similar presentations
Constructive Computer Architecture: Data Hazards in Pipelined Processors Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.
Advertisements

EHRs: Designing modules with concurrent methods Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 27,
Computer Architecture: A Constructive Approach EHRs: Designing modules with concurrent methods Teacher: Yoav Etsion Taken (with permission) from Arvind.
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
Modeling Processors Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L07-1
Computer Architecture: A Constructive Approach Branch Direction Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Tutorial 5: Programming SMIPS: Single-Core Assembly Andy Wright TA October 10, 2014http://csg.csail.mit.edu/6.175T04-1.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.
Computer Architecture: A Constructive Approach Branch Prediction - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.
Computer Architecture: A Constructive Approach Next Address Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts.
Constructive Computer Architecture: Control Hazards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October.
February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
1 Tutorial: Lab 4 Again Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,
October 22, 2009http://csg.csail.mit.edu/korea Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Constructive Computer Architecture Tutorial 6: Five Details of SMIPS Implementations Andy Wright 6.S195 TA October 7, 2013http://csg.csail.mit.edu/6.s195T05-1.
Computer Architecture: A Constructive Approach Data Hazards and Multistage Pipelines Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*,
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
Modeling Processors Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 1, 2010
Elastic Pipelines: Concurrency Issues
EHRs: Designing modules with concurrent methods
EHRs: Designing modules with concurrent methods
6.175: Constructive Computer Architecture Tutorial 5 Epochs, Debugging, and Caches Quan Nguyen (Troubled by the two biggest problems in computer science…
Control Hazards Constructive Computer Architecture: Arvind
Bluespec-6: Modeling Processors
Bluespec-6: Modules and Interfaces
Scheduling Constraints on Interface methods
Tutorial 7: SMIPS Epochs Constructive Computer Architecture
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
Constructive Computer Architecture Tutorial 6: Discussion for lab6
Multistage Pipelined Processors and modular refinement
in Pipelined Processors
Performance Specifications
Pipelining combinational circuits
Constructive Computer Architecture: Lab Comments & RISCV
Pipelining combinational circuits
Modular Refinement Arvind
Modular Refinement Arvind
EHR: Ephemeral History Register
Constructive Computer Architecture Tutorial 5 Epoch & Branch Predictor
Lab 4 Overview: 6-stage SMIPS Pipeline
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
in Pipelined Processors
Constructive Computer Architecture Tutorial 3 RISC-V Processor
Control Hazards Constructive Computer Architecture: Arvind
Modeling Processors: Concurrency Issues
Pipelining combinational circuits
Bypassing Computer Architecture: A Constructive Approach Joel Emer
Multistage Pipelined Processors and modular refinement
Modular Refinement Arvind
Elastic Pipelines: Concurrency Issues
Modular Refinement - 2 Arvind
in Pipelined Processors
Elastic Pipelines: Concurrency Issues
Pipelined Processors Arvind
Control Hazards Constructive Computer Architecture: Arvind
Pipelined Processors Constructive Computer Architecture: Arvind
Modeling Processors Arvind
Modeling Processors Arvind
Modular Refinement Arvind
Control Hazards Constructive Computer Architecture: Arvind
Implementing for Correct Concurrency
Modular Refinement Arvind
Tutorial 7: SMIPS Labs and Epochs Constructive Computer Architecture
Bluespec-8: Modules and Interfaces
Pipelined Processors: Control Hazards
Presentation transcript:

Tutorial 4: RISCV modules Constructive Computer Architecture Thomas Bourgeat 6.175 TA 4/13/2019 http://csg.csail.mit.edu/6.s195

Normal Register File module mkRFile(RFile); Vector#(32,Reg#(Data)) rfile <- replicateM(mkReg(0)); method Action wr(RIndx rindx, Data data); if(rindx!=0) rfile[rindx] <= data; endmethod method Data rd1(RIndx rindx) = rfile[rindx]; method Data rd2(RIndx rindx) = rfile[rindx]; endmodule {rd1, rd2} < wr 4/13/2019 http://csg.csail.mit.edu/6.s195 2

Bypass Register File using EHR module mkBypassRFile(RFile); Vector#(32,Ehr#(2, Data)) rfile <- replicateM(mkEhr(0)); method Action wr(RIndx rindx, Data data); if(rindex!=0) (rfile[rindex])[0] <= data; endmethod method Data rd1(RIndx rindx) = (rfile[rindx])[1]; method Data rd2(RIndx rindx) = (rfile[rindx])[1]; endmodule wr < {rd1, rd2} 4/13/2019 http://csg.csail.mit.edu/6.s195 3

Searchable FIFO To build a scoreboard 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Interface interface SFifo#(numeric type n, type dt, type st); method Bool notFull; method Action enq(dt x); method Bool notEmpty; method dt first; method Action deq; method Action clear; Bool search(st x); endinterface 4/13/2019 http://csg.csail.mit.edu/6.s195

Scoreboard implementation using searchable Fifos function Bool isFound (Maybe#(RIndx) dst, Maybe#(RIndx) src); return isValid(dst) && isValid(src) && (fromMaybe(?,dst)==fromMaybe(?,src)); endfunction module mkCFScoreboard(Scoreboard#(size)); SFifo#(size, Maybe#(RIndx), Maybe#(RIndx)) f <- mkCFSFifo(isFound); method insert = f.enq; method remove = f.deq; method search1 = f.search1; method search2 = f.search2; endmodule 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Internal States Standard FIFO states: Reg#(Bit#(TLog#(n))) enqP <- mkReg(0); Reg#(Bit#(TLog#(n))) deqP <- mkReg(0); Reg#(Bool) full <- mkReg(False); Reg#(Bool) empty <- mkReg(Empty); Need any more? 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Method Calls {notFull, enq} R: full, enqP, deqP W: full, empty, enqP, data {notEmpty, deq, first} R: empty, enqP, deqP, data W: full, empty, deqP search R: (empty or full), enqP, deqP, data clear W: empty, full, enq, deqP 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Potential Conflicts {notFull, enq} R: full, enqP, deqP W: full, empty, enqP, data {notEmpty, deq, first} R: empty, enqP, deqP, data W: full, empty, deqP search R: (empty or full), enqP, deqP, data clear W: empty, full, enq, deqP deq < enq enq < deq enq C deq Same as FIFO Search is read-only -> it can always come first Clear is write-only -> it can always come last 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Implementation 1 mkCFFifo with a search method Schedule: search < {notFull, enq, notEmpty, deq, first} < clear {notFull, enq} CF {notEmpty, deq, first} 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Implementation 1 module mkSFifo1(SFifo#(n, t, t)) provisos(Eq#(t)); // mkCFFifo implementation method Bool search(t x); Bool found = False; for(Integer i = 0; i < valueOf(n); i = i+1) begin Bool validEntry = full[0] || (enqP[0]>deqP[0] && i>=deqP[0] && i<enqP[0]) || (enqP[0]<deqP[0] && (i>=deqP[0] || i<enqP[0])); if(validEntry && (data[i] == x)) found = True; end return found; endmethod endmodule 4/13/2019 http://csg.csail.mit.edu/6.s195

Searchable FIFO Custom Search Function module mkSFifo1( SFifo#(n, dt, st) ifc); // mkCFFifo implementation method Bool search(st x); Bool found = False; for(Integer i = 0; i < valueOf(n); i = i+1) begin Bool validEntry = full[0] || (enqP[0]>deqP[0] && i>=deqP[0] && i<enqP[0]) || (enqP[0]<deqP[0] && (i>=deqP[0] || i<enqP[0]); if(validEntry && isFound(data[i], x)) found = True; end return found; endmethod endmodule function Bool isFound( dt x, st y ), 4/13/2019 http://csg.csail.mit.edu/6.s195

Scoreboard When using a SFifo for a scoreboard, the following functions are used together: {search, notFull, enq} {notEmpty, deq} Are enq and deq still commutative like in the CFFifo case? No! Search has to be able to be done with enq, and search is not commutative with deq 4/13/2019 http://csg.csail.mit.edu/6.s195

Bypass Register File with external bypassing move rd module mkBypassRFile(BypassRFile); RFile rf <- mkRFile; Fifo#(1, Tuple2#(RIndx, Data)) bypass <- mkBypassSFifo; rule move; begin rf.wr(bypass.first); bypass.deq end; endrule method Action wr(RIndx rindx, Data data); if(rindex!=0) bypass.enq(tuple2(rindx, data)); endmethod method Data rd1(RIndx rindx) = return (!bypass.search1(rindx)) ? rf.rd1(rindx) : bypass.read1(rindx); method Data rd2(RIndx rindx) = return (!bypass.search2(rindx)) ? rf.rd2(rindx) : bypass.read2(rindx); endmodule wr < {rd1, rd2} 4/13/2019 http://csg.csail.mit.edu/6.s195 14

Epoch Tutorial 4/13/2019 http://csg.csail.mit.edu/6.s195

Handling Multiple Epochs If only one epoch changes, it acts just like the case where there is only one epoch. First we are going to look at the execute epoch and the decode epoch separately. 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 6 5 4 3 2 1 Mispredicted Write Back 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 1 6 5 4 3 2 Poisoning Write Back 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 2 1 6 5 4 3 Poisoning Write Back 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 3 2 1 6 5 4 Poisoning Killing 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 4 3 2 1 6 5 Executing Killing 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 5 4 3 2 1 6 Executing Killing 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Execute IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch fEpoch PC Redirect 6 5 4 3 2 1 Executing Write Back 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 6 5 4 3 2 1 Mispredicted Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 1 6 5 4 3 2 Killing Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 2 1 5 4 3 Decoding Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 3 2 1 5 4 Decoding Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 4 3 2 1 5 Decoding Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 5 4 3 2 1 Decoding Stall dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Correcting PC in Decode IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem fEpoch PC Redirect 6 5 4 3 2 1 Decoding Write Back dEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States What if execute sees a misprediction, then decode sees one in the next cycle? The decode instruction will be a wrong path instruction, so it will not redirect the PC 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Scoreboard DMem IMem PC Redirect IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem PC Redirect Assume this instruction is a mispredicted jump instruction. 6 5 4 3 2 1 Decoding Mispredicted Write Back dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Scoreboard DMem IMem PC Redirect IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem PC Redirect 1 6 5 4 3 2 Killing Poisoning Write Back dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Scoreboard DMem IMem PC Redirect IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem PC Redirect 1 1 5 4 3 Decoding Poisoning Write Back dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 3 2 1 5 4 IFetch Decode RFetch Exec Memory WB Decoding Stalling Killing PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 4 3 2 1 5 IFetch Decode RFetch Exec Memory WB Decoding Executing Killing PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Stalling PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 6 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States What if decode sees a misprediction, then execute sees one in the next cycle? The decode instruction will be a wrong path instruction, but it won’t be known to be wrong path until later 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem Assume this instruction is a mispredicted branch instruction. Register File Redirect Scoreboard 6 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Mispredicted Executing Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 1 6 5 4 3 2 IFetch Decode RFetch Exec Memory WB Killing Mispredicted Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 1 1 5 4 3 IFetch Decode RFetch Exec Memory WB Killing Poisoning Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 2 1 5 4 IFetch Decode RFetch Exec Memory WB Decoding Stalling Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 3 2 1 5 IFetch Decode RFetch Exec Memory WB Decoding Stalling Killing PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Stalling PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Stalling PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Global Epoch States Register File Redirect Scoreboard PC IMem DMem 6 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Write Back PC IMem DMem dEpoch eEpoch 4/13/2019 http://csg.csail.mit.edu/6.s195

Implementing Global Epoch States How do you implement this? There are multiple ways to do this, but the easiest way is to use EHRs 4/13/2019 http://csg.csail.mit.edu/6.s195

Implementing Global Epoch States with EHRs IFetch Decode WB RFetch Exec Memory Register File Scoreboard DMem IMem eEpoch[k] eEpoch[i] PC Redirect Each stage looks at a different port of each Epoch EHR. dEpoch[i] 6 5 4 3 2 1 Decoding Executing Write Back dEpoch[j] eEpoch[j] There’s still a problem! PC redirection and epoch update needs to be atomic! 4/13/2019 http://csg.csail.mit.edu/6.s195

Implementing Global Epoch States with EHRs Register File dEpoch[i] Scoreboard eEpoch[i] 6 5 4 3 2 1 IFetch Decode RFetch Exec Memory WB Decoding Executing Write Back PC[i] IMem PC[j] dEpoch[j] PC [k] eEpoch[k] DMem eEpoch[j] Make PC an EHR and have each pipeline stage redirect the PC directly 4/13/2019 http://csg.csail.mit.edu/6.s195

How Does EHR Port Ordering Change Things? Originally we had redirect FIFOs from Decode and Execute to Instruction Fetch. What ordering is this? Fetch – 2 Decode – 0 or 1 Execute – 0 or 1 (not the same as Decode) Does the order between Decode and Execute matter? Not much... Having Fetch use ports after Decode and Execute increase the length of combinational logic The order between Decode/Execute and Fetch matters most! (both for length of combinational logic and IPC) 4/13/2019 http://csg.csail.mit.edu/6.s195

Questions? 4/13/2019 http://csg.csail.mit.edu/6.s195