Computer Architecture: A Constructive Approach EHRs: Designing modules with concurrent methods Teacher: Yoav Etsion Taken (with permission) from Arvind.

Slides:



Advertisements
Similar presentations
An EHR based methodology for Concurrency management Arvind (with Asif Khan) Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Advertisements

Constructive Computer Architecture: Multirule systems and Concurrent Execution of Rules Arvind Computer Science & Artificial Intelligence Lab. Massachusetts.
EHRs: Designing modules with concurrent methods Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 27,
Constructive Computer Architecture: FIFO Lab Comments Revisiting CF FIFOs Andy Wright TA October 20, 2014http://csg.csail.mit.edu/6.175L14-1.
Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology October 13, 2009http://csg.csail.mit.edu/koreaL12-1.
February 21, 2007http://csg.csail.mit.edu/6.375/L07-1 Bluespec-4: Architectural exploration using IP lookup Arvind Computer Science & Artificial Intelligence.
March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L06-1.
December 12, 2006http://csg.csail.mit.edu/6.827/L24-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Pipelining combinational circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology February 20, 2013http://csg.csail.mit.edu/6.375L05-1.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
Elastic Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 28, 2011L08-1http://csg.csail.mit.edu/6.375.
Constructive Computer Architecture Sequential Circuits - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture: Control Hazards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology October.
February 20, 2009http://csg.csail.mit.edu/6.375L08-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,
Constructive Computer Architecture Store Buffers and Non-blocking Caches Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute.
Simple Inelastic and Folded Pipelines Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 14, 2011L04-1.
Computer Architecture: A Constructive Approach Multi-Cycle and 2 Stage Pipelined SMIPS Implementations Teacher: Yoav Etsion Taken (with permission) from.
Computer Architecture: A Constructive Approach Pipelining combinational circuits Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*, Massachusetts.
Computer Architecture: A Constructive Approach Bluespec execution model and concurrent rule scheduling Teacher: Yoav Etsion Taken (with permission) from.
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
Elastic Pipelines: Concurrency Issues
EHRs: Designing modules with concurrent methods
EHRs: Designing modules with concurrent methods
Concurrency properties of BSV methods and rules
Bluespec-6: Modeling Processors
Scheduling Constraints on Interface methods
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
Caches-2 Constructive Computer Architecture Arvind
in Pipelined Processors
Pipelining combinational circuits
Multirule Systems and Concurrent Execution of Rules
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Pipelining combinational circuits
EHR: Ephemeral History Register
Constructive Computer Architecture: Well formed BSV programs
Blusepc-5: Dead cycles, bubbles and Forwarding in Pipelines Arvind
in Pipelined Processors
Constructive Computer Architecture Tutorial 3 RISC-V Processor
Control Hazards Constructive Computer Architecture: Arvind
Modeling Processors: Concurrency Issues
Modules with Guarded Interfaces
Pipelining combinational circuits
Elastic Pipelines: Concurrency Issues
Multirule systems and Concurrent Execution of Rules
IP Lookup: Some subtle concurrency issues
Computer Science & Artificial Intelligence Lab.
in Pipelined Processors
Elastic Pipelines: Concurrency Issues
Pipelined Processors Arvind
Elastic Pipelines and Basics of Multi-rule Systems
Constructive Computer Architecture: Guards
Elastic Pipelines and Basics of Multi-rule Systems
Pipelined Processors Constructive Computer Architecture: Arvind
Multirule systems and Concurrent Execution of Rules
IP Lookup: Some subtle concurrency issues
Bluespec-5: Scheduling & Rule Composition
Tutorial 4: RISCV modules Constructive Computer Architecture
Control Hazards Constructive Computer Architecture: Arvind
Implementing for Correct Concurrency
Constructive Computer Architecture: Well formed BSV programs
Caches-2 Constructive Computer Architecture Arvind
Pipelined Processors: Control Hazards
Presentation transcript:

Computer Architecture: A Constructive Approach EHRs: Designing modules with concurrent methods Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*, Massachusetts Institute of Technology Derek Chiou, The University of Texas at Austin * Joel Emer, Li-Shiuan Peh, Murali Vijayaraghavan, Asif Khan, Abhinav Agarwal, Myron King 1

Contents Concurrent module methods Limitations of the Bluespec model Extending the model with EHRs: Ephemeral History Registers Coding FIFOs with different concurrency properties using EHRs Simultaneous access to several modules 2

Elastic pipeline x fifo1inQ f1f2f3 fifo2outQ rule stage1 if (True); fifo1.enq(f1(inQ.first()); inQ.deq();endrule rule stage2 if (True); fifo2.enq(f2(fifo1.first()); fifo1.deq();endrule rule stage3 if (True); outQ.enq(f3(fifo2.first()); fifo2.deq();endrule L05-3 Can these rules fire concurrently? Need a way to discuss the concurrent properties of method calls

Notation For a group of actions {a1,a2,…} and {b1,b2,…} {a1, a2,…} CF {b1, b2,…} iff i,j. (ai CF bj) {a1, a2,…} < {b1, b2,…} iff i,j. ((ai < bj) or (ai CF bj)) and i,j. ai < bj 4

Concurrency analysis rule stage1 if (True); fifo1.enq(f1(inQ.first)); inQ.deq; endrule rule stage2 if (True); fifo2.enq(f2(fifo1.first)); fifo1.deq; endrule L05-5 Can these rules fire concurrently? For concurrent scheduling of stage1 and stage2, {fifo1.enq, inQ.first, inQ.deq} CF/ {fifo1.first, fifo1.deq, fifo2.enq}  either {fifo1.enq} CF {fifo1.first, fifo1.deq} or {fifo1.enq} < {fifo1.first, fifo1.deq} or {fifo1.enq} > {fifo1.first, fifo1.deq} Need to analyze FIFOs!

module mkCFFifo (Fifo#(1, t)); Reg#(t) data <- mkRegU; Reg#(Bool) full <- mkReg(False); method Action enq(t x) if (!full); full <= True; data <= x; endmethod method Action deq if (full); full <= False; endmethod method t first if (full); return (data); endmethod endmodule One-Element FIFO enq and deq cannot even be enabled together much less fire concurrently! n not empty not full rdy enab rdy enab enq deq Fifo module 6

module mkCFFifo (Fifo#(2, t)); Reg#(t) da <- mkRegU(); Reg#(Bool) va <- mkReg(False); Reg#(t) db <- mkRegU(); Reg#(Bool) vb <- mkReg(False); method Action enq(t x) if (!vb); if va then begin db <= x; vb <= True; end else begin da <= x; va <= True; end endmethod method Action deq if (va); if vb then begin da <= db; vb <= False; end else begin va <= False; end endmethod method t first if (va); return da; endmethod endmodule Two-Element FIFO Assume, if there is only one element in the FIFO it resides in da db da enq and deq can be enabled together Do enq and deq conflict? yes 7 Both read/write the same elements and affect each others functionality!

Bluespec model needs to be extended to express FIFOs with concurrent methods EHR: Ephemeral History Registers 8

EHR: Register with a bypass Interface r0 < w0 DQ 0 1 r0 w0.data w0.en w0 < r1 r1 – if write is not enabled it returns the current state and if write is enabled it returns the value being written r1 normal bypass 9

Ephemeral History Register (EHR) r0 < w0 < r1 < w1 < …. DQ 0 1 w0.data w0.en 0 1 w1.data w1.en w[i+1] takes precedence over w[i] Dan Rosenband [MEMOCODE’04] r0 r1 10

Designing FIFOs using EHRs 11

One-Element Pipelined FIFO module mkPipelineFifo(Fifo#(1, t)) provisos(Bits#(t, tSz)); Reg#(t) data <- mkRegU; Ehr#(2, Bool) full <- mkEhr(False); method Action enq(t x) if(!full.r1); data <= x; full.w1 <= True; endmethod method Action deq if(full.r0); full.w0 <= False; endmethod method t first if(full.r0); return data; endmethod endmodule first < enq deq < enq first < deq 12 One can enq into a full Fifo provided someone is trying to deq from it simultaneously. enq and deq calls can be from the same rule or different rules.

One-Element Bypass FIFO using EHRs module mkBypassFifo(Fifo#(1, t)) provisos(Bits#(t, tSz)); Ehr#(2, t) data <- mkEhr(?); Ehr#(2, Bool) full <- mkEhr(False); method Action enq(t x) if(!full.r0); data.w0 <= x; full.w0 <= True; endmethod method Action deq if(full.r1); full.w1 <= False; endmethod method t first if(full.r1); return data.r1; endmethod endmodule enq < first enq < deq first < deq 13 One can deq from an empty Fifo provided someone is trying to enq into it simultaneously. enq and deq calls can be from the same rule or different rules

module mkCFFifo(Fifo#(2, t)) provisos(Bits#(t, tSz)); Ehr#(2, t) da <- mkEhr(?); Ehr#(2, Bool) va <- mkEhr(False); Ehr#(2, t) db <- mkEhr(?); Ehr#(2, Bool) vb <- mkEhr(False); rule canonicalize if(vb.r1 && !va.r1); da.w1 <= db.r1; va.w1 <= True; vb.w1 <= False; endrule method Action enq(t x) if(!vb.r0); db.w0 <= x; vb.w0 <= True; endmethod method Action deq if (va.r0); va.w0 <= False; endmethod method t first if(va.r0); return da.r0; endmethod endmodule Two-Element Conflict-free FIFO Assume, if there is only one element in the FIFO it resides in da db da 14 first CF enq deq CF enq first < deq enq and deq, even if performed simultaneously, can’t see each other’s effects until the next cycle

N-element Conflict-free FIFO an enq updates enqP and puts the old value of enqP and enq data into oldEnqP and newData, respectively. It also sets enqEn to false to prevent further enqueues a deq updates deqP and sets deqEn to false to prevent further dequeues Canonicalize rule calculates the new count and puts the new data into the array and sets the enqEn and deqEn bits appropriately 15 enqPdeqP enqEndeqEn oldEnqP newData locks for preventing accesses until canonicalization temp storage to hold values until canonicalization

Pointer comparison enqP and deqP contain indices for twice the size of the FIFO, to distinguish between full and empty conditions Full: enqP == deqP + FIFO_size Empty: enqP == deqP L7-16

N-element Conflict-free FIFO module mkCFFifo(Fifo#(n, t)) provisos(Bits#(t, tSz), Add#(n, 1, n1), Log#(n1, sz), Add#(sz, 1, sz1)); Integer ni = valueOf(n); Bit#(sz1) nb = fromInteger(ni); Bit#(sz1) n2 = 2*nb; Vector#(n, Reg#(t)) data <- replicateM(mkRegU); Ehr#(2, Bit#(sz1)) enqP <- mkEhr(0); Ehr#(2, Bit#(sz1)) deqP <- mkEhr(0); Ehr#(2, Bool) enqEn <- mkEhr(True); Ehr#(2, Bool) deqEn <- mkEhr(False); Ehr#(2, t) newData <- mkEhr(?); Ehr#(2, Maybe#(Bit#(sz1))) oldEnqP <- mkEhr(Invalid); 17

N-element Conflict-free FIFO continued-1 rule canonicalize; Bit#(sz1) cnt = enqP[1] >= deqP[1]? enqP[1] - deqP[1]: (enqP[1]%nb + nb) - deqP[1]%nb; if(!enqEn[1] && cnt != nb) enqEn[1] <= True; if(!deqEn[1] && cnt != 0) deqEn[1] <= True; if(isValid(oldEnqP[1])) begin data[validValue(oldEnqP[1])] <= newData[1]; oldEnqP[1] <= Invalid; end endrule 18

N-element Conflict-free FIFO continued-2 method Action enq(t x) if(enqEn[0]); newData[0] <= x; oldEnqP[0] <= Valid (enqP[0]%nb); enqP[0] <= (enqP[0] + 1)%n2; enqEn[0] <= False; endmethod method Action deq if(deqEn[0]); deqP[0] <= (deqP[0] + 1)%n2; deqEn[0] <= False; endmethod method t first if(deqEn[0]); return newData[deqP[0]%nb]; endmethod endmodule 19

N-element searchable FIFO search CF {deq, enq} module mkCFSFifo#(function Bool isFound(t v, st k))(SFifo#(n, t, st)) provisos(Bits#(t, tSz), Add#(n, 1, n1), Log#(n1, sz), Add#(sz, 1, sz1)); Integer ni = valueOf(n); Bit#(sz1) nb = fromInteger(ni); Bit#(sz1) n2 = 2*nb; Vector#(n, Reg#(t)) data <- replicateM(mkRegU); Ehr#(2, Bit#(sz1)) enqP <- mkEhr(0); Ehr#(2, Bit#(sz1)) deqP <- mkEhr(0); Ehr#(2, Bool) enqEn <- mkEhr(True); Ehr#(2, Bool) deqEn <- mkEhr(False); Ehr#(2, t) temp <- mkEhr(?); Ehr#(2, Maybe#(Bit#(sz1))) tempEnqP <- mkEhr(Invalid); Ehr#(2, Maybe#(Bit#(sz1))) tempDeqP <- mkEhr(Invalid); 20 Need a tempDeqP also to avoid scheduling constraints between deq and Search

N-element searchable FIFO search CF {deq, enq} continued-1 Bit#(sz1) cnt0 = enqP[0] >= deqP[0]? enqP[0] - deqP[0]: (enqP[0]%nb + nb) - deqP[0]%nb; Bit#(sz1) cnt1 = enqP[1] >= deqP[1]? enqP[1] - deqP[1]: (enqP[1]%nb + nb) - deqP[1]%nb; rule canonicalize; if(!enqEn[1] && cnt1 != nb) enqEn[1] <= True; if(!deqEn[1] && cnt1 != 0) deqEn[1] <= True; if(isValid(tempEnqP[1])) begin data[validValue(tempEnqP[1])] <= temp[1]; tempEnqP[1] <= Invalid; end if(isValid(tempDeqP[1])) begin deqP[0] <= validValue(tempDeqP[1]); tempDeqP[1]<=Invalid; end endrule 21

N-element searchable FIFO search CF {deq, enq} continued-2 method Action enq(t x) if(enqEn[0]); temp[0] <= x; tempEnqP[0] <= Valid (enqP[0]%nb); enqP[0] <= (enqP[0] + 1)%n2; enqEn[0] <= False; endmethod method Action deq if(deqEn[0]); tempDeqP[0] <= Valid ((deqP[0] + 1)%n2); deqEn[0] <= False; endmethod method t first if(deqEn[0]); return data[deqP[0]%nb]; endmethod 22

N-element searchable FIFO search CF {deq, enq} continued-3 method Bool search(st s); Bool ret = False; for(Bit#(sz1) i = 0; i < nb; i = i + 1) begin let ptr = (deqP[0] + i)%nb; if(isFound(data[ptr], s) && i < cnt0) ret = True; end return ret; endmethod endmodule 23

Scheduling constraints due to multiple modules rule ra; aTob.enq(fa(x)); if (bToa.nonEmpty) begin x <= ga(bToa.first); bToa.deq; end endrule rule rb; bToa.enq(fb(y)); y <= gb(aTob.first); aTob.deq; endrule aTobbToa Concurrent fifofifo scheduling?CF CFpipeline CFbypass pipelineCFpipeline pipelinebypass bypassCF bypasspipelinebypass x   x Can ra and rb be scheduled concurrently? ra rb aTob bToa x y fifos are initially empty      L7-24

Register File: normal and bypass Normal rf: {rd1, rd2} < wr; the effect of a register update can only be seen a cycle later, consequently, reads and writes are conflict-free Bypass rf: wr < {rd1, rd2}; in case of concurrent reads and write, check if rd1==wr or rd2==wr then pass the new value as the result and update the register file, otherwise the old value in the rf is read 25

Normal Register File module mkRFile(RFile); Vector#(32,Reg#(Data)) rfile <- replicateM(mkReg(0)); method Action wr(Rindx rindx, Data data); if(rindx!=0) rfile[rindx] <= data; endmethod method Data rd1(Rindx rindx) = rfile[rindx]; method Data rd2(Rindx rindx) = rfile[rindx]; endmodule {rd1, rd2} < wr 26

Bypass Register File using EHR module mkBypassRFile(RFile); Vector#(32,EHR#(2, Data)) rfile <- replicateM(mkEHR(0)); method Action wr(Rindx rindx, Data data); if(rindex!==0) rfile[rindex].w0 <= data; endmethod method Data rd1(Rindx rindx) = rfile[rindx].r1; method Data rd2(Rindx rindx) = rfile[rindx].r1; endmodule wr < {rd1, rd2} 27