Implementing for Correct Concurrency

Implementing for Correct Concurrency
Nirav Dave
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology
March 9, 2011
http://csg.csail.mit.edu/6.375

Dealing with Conflicts

- When do conflicts arise?
- How do we analyze them?
- How do we fix them?
- How do we make sure we’re okay?

SFIFO

    interface SFIFO#(type t, type tr, type v);
      method Action enq(t);        // enqueue an item
      method Action deq();         // remove oldest entry
      method t first();            // inspect oldest item
      method Action clear();       // make FIFO empty
      method Maybe#(v) find(tr);   // search FIFO
    endinterface

n = # of bits needed to represent the values of type “t”
m = # of bits needed to represent the values of type “tr”
v = # of bits needed to represent the values of type “v”

[Figure: SFIFO module ports. enq: n-bit data, enab, rdy (not full). deq: enab, rdy (not empty). first: n-bit data, rdy (not empty). clear: enab. find: m-bit search input, v-bit result.]

Processor Example

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

- 5-stage processor, with 1-element FIFOs in between stages
- Let’s add bypassing

Decode Rule

Decode is also correct anytime it’s allowed to execute. It searches through each place in the design (rf, m2wQ, e2mQ) for the operand values.

    rule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ));
      let fetInst = f2dQ.first();
      f2dQ.deq();
      match {.ra, .rb} = getRARB(fetInst);
      // fromMaybe(default, maybe): a hit in a younger stage overrides an older value
      let va0 = rf[ra];
      let va1 = fromMaybe(va0, m2wQ.find(ra));
      let va2 = fromMaybe(va1, e2mQ.find(ra));
      let vb0 = rf[rb];
      let vb1 = fromMaybe(vb0, m2wQ.find(rb));
      let vb2 = fromMaybe(vb1, e2mQ.find(rb));
      let newInst = case (fetInst) match
                      Add: return (DAdd .va2 .vb2);
                      …
                    endcase;
      d2eQ.enq(newInst);
    endrule

When do we want it to execute?

Some insight into concurrent rule firing

[Figure: rules Ri, Rj, Rk shown as individual rule steps in the rule semantics, and grouped into clocks in the HW]

- There are more intermediate states in the rule semantics (a state after each rule step)
- In the HW, states change only at clock edges

Parallel execution reorders reads and writes

[Figure: in the rule semantics, each rule step is a reads/writes pair; in the HW, the reads and writes of several rules are grouped within each clock]

- In the rule semantics, each rule sees (reads) the effects (writes) of previous rules
- In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks

Correctness

[Figure: rules Ri, Rj, Rk as rule steps in the rule semantics versus clocks in the HW]

- Rules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution
- Consequence: the HW can never reach a state unexpected in the rule semantics
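The correctness condition above can be checked mechanically on small examples. The following is a minimal Python sketch, not BSV: the three rules and the helper names (`run_sequential`, `run_parallel`, `equivalent_orders`) are hypothetical, rules are assumed to write disjoint registers, and the check covers one starting state rather than all states.

```python
from itertools import permutations

# Hypothetical rules: each maps a state snapshot to the register writes it performs.
def rule_inc_x(s):
    return {"x": s["x"] + 1}

def rule_copy_x_to_y(s):
    return {"y": s["x"]}

def rule_copy_y_to_x(s):
    return {"x": s["y"]}

def run_sequential(state, rules):
    """Rule semantics: each rule sees the writes of the rules before it."""
    s = dict(state)
    for r in rules:
        s.update(r(s))
    return s

def run_parallel(state, rules):
    """Clocked HW: every rule reads the pre-clock state; writes land at the clock edge.
    Assumes the rules write disjoint registers."""
    writes = {}
    for r in rules:
        writes.update(r(state))
    return {**state, **writes}

def equivalent_orders(state, rules):
    """Sequential orders whose net effect equals firing all rules in one clock."""
    hw = run_parallel(state, rules)
    return [tuple(r.__name__ for r in order)
            for order in permutations(rules)
            if run_sequential(state, order) == hw]

start = {"x": 1, "y": 5}
# Firing both in one clock looks like copy_x_to_y followed by inc_x.
print(equivalent_orders(start, [rule_inc_x, rule_copy_x_to_y]))
# A register swap matches no sequential order, so these two must not fire together.
print(equivalent_orders(start, [rule_copy_x_to_y, rule_copy_y_to_x]))
```

An empty result means the parallel firing reaches a state that no sequential execution of the same rules reaches from that starting state, which is exactly what the scheduler has to rule out.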

Upshot

- Given the concurrency of the methods/rules in a system, we can determine viable schedules
  - Some variation due to applicability, BUT we know what schedule we want (mostly)
- We should be able to back-propagate results to submodules

Determining Concurrency Properties

Processor: Concurrencies

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

- In-order: F < D < E < M < W
- Pipelined: W < M < E < D < F

Concurrency requirements for Full Pipelining: Reg File

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

- In-order RF: (D calls sub) < (W calls upd)
- Pipelined RF: (W calls upd) < (D calls sub)

Concurrency requirements for Full Pipelining: FIFOs

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

In-order FIFOs:
1. m2wQ, e2mQ: find < enq < first < deq
2. d2eQ: find < enq < first < deq, clear

Pipelined FIFOs:
3. m2wQ, e2mQ: first < deq < enq < find
4. d2eQ: first < deq < find < enq
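The FIFO orderings above follow from the rule order once we know which rule calls which method. A minimal Python sketch of that back-propagation, under an assumed call pattern (decode searches the queues and enqueues into d2eQ, memory enqueues into m2wQ, execute and write-back dequeue); the function `method_order` and the two tables are hypothetical, and `clear` is left out:

```python
def method_order(rule_order, calls):
    """Order a module's methods by the position of the rule that calls them.

    rule_order: rule names, earliest in the schedule first.
    calls: rule name -> methods that rule calls on the module, listed in the
           order the rule needs them (a guard's find before the body's enq).
    """
    ordered = []
    for rule in rule_order:
        for m in calls.get(rule, []):
            if m not in ordered:
                ordered.append(m)
    return " < ".join(ordered)

IN_ORDER = ["F", "D", "E", "M", "W"]
PIPELINED = list(reversed(IN_ORDER))

# Assumed call patterns, not taken verbatim from the slides.
M2WQ_CALLS = {"D": ["find"], "M": ["enq"], "W": ["first", "deq"]}
D2EQ_CALLS = {"D": ["find", "enq"], "E": ["first", "deq"]}

for name, calls in [("m2wQ", M2WQ_CALLS), ("d2eQ", D2EQ_CALLS)]:
    print(name, "in-order: ", method_order(IN_ORDER, calls))
    print(name, "pipelined:", method_order(PIPELINED, calls))
```

Under these assumptions the sketch reproduces orderings 1 through 4 (minus clear); a different call pattern would give different requirements.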

Constructing Appropriately Concurrent Submodules

From Analysis to Design

- We need to create modules which behave as needed
- Construct modules using “unsafe” primitives to have “safe” behaviors

Three major concepts:
- Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs)
- Add RWires for forwarding values intra-cycle
- Reason carefully to assure that execution appears “atomic”

ConfigReg and RWire

- mkReg requires that read < write
- mkConfigReg is a Reg without this restriction
  - Allows us to read stale values (dangerous)
- RWire is a “wire”
  - wset :: a -> Action writes a value
  - wget :: Maybe#(a) returns the written value if a write happened
  - wset happens before wget each cycle
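To make the two primitives concrete, here is a small Python model of their per-cycle behavior. It is a sketch under stated assumptions: the class names mirror the BSV primitives, but `end_of_cycle` is a hypothetical stand-in for the clock edge, and `None` stands in for an Invalid value.

```python
class RWire:
    """Toy model of an RWire: holds a value only within the current cycle."""

    def __init__(self):
        self._value = None          # None models tagged Invalid

    def wset(self, value):
        self._value = value

    def wget(self):
        return self._value          # the written value, or None if no wset yet

    def end_of_cycle(self):
        self._value = None          # a wire has no storage across clock edges


class ConfigReg:
    """Toy model of a ConfigReg: a read after a same-cycle write sees the old value."""

    def __init__(self, init):
        self._cur = init
        self._next = init

    def read(self):
        return self._cur            # possibly stale within the cycle

    def write(self, value):
        self._next = value

    def end_of_cycle(self):
        self._cur = self._next


w = RWire()
w.wset(7)
print(w.wget())     # value forwarded within the cycle
w.end_of_cycle()
print(w.wget())     # nothing survives the clock edge

r = ConfigReg(0)
r.write(5)
print(r.read())     # still the old value until the clock edge
```

The model shows why the pair is useful together: the ConfigReg removes the read < write ordering, and the RWire carries the information a later method would otherwise miss.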

Let’s implement some modules

Processor Redux

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

- In-order: F < D < E < M < W
- Pipelined: W < M < E < D < F

Concurrency: RegFile

- The standard library RegFile is implemented with concurrency (sub < upd)
- This handles the in-order case
- We need to build a RegisterFile for the pipelined case

BypassRegFile

    module mkBypassRegFile#(a l, a h)(RegFile#(a,d))
        provisos (Bits#(a,asz), Bits#(d,dsz), Eq#(a));   // Eq#(a) is needed for wa == x
      RegFile#(a,d) rfInt <- mkRegFileWCF(l,h);
      RWire#(Tuple2#(a,d)) curWrite <- mkRWire();

      method Action upd(a x, d v);
        rfInt.upd(x,v);
        curWrite.wset(tuple2(x,v));   // forward this cycle's write
      endmethod

      method d sub(a x);
        case (curWrite.wget()) matches
          tagged Valid {.wa, .wd} &&& (wa == x): return wd;   // bypass hit
          default: return rfInt.sub(x);
        endcase
      endmethod
    endmodule
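The effect of the bypass can be seen in a short Python model of one cycle, in which upd is called before sub (the pipelined ordering). This is only a sketch: the class, its `end_of_cycle` method, and the zero-initialized storage are assumptions for illustration.

```python
class BypassRegFile:
    """Toy model of a register file where a same-cycle upd is visible to sub."""

    def __init__(self, size):
        self.regs = [0] * size
        self.cur_write = None           # models the RWire: (index, value) or None

    def upd(self, idx, val):
        self.cur_write = (idx, val)     # committed to storage at end_of_cycle

    def sub(self, idx):
        if self.cur_write is not None and self.cur_write[0] == idx:
            return self.cur_write[1]    # bypass hit: forward this cycle's write
        return self.regs[idx]

    def end_of_cycle(self):
        if self.cur_write is not None:
            idx, val = self.cur_write
            self.regs[idx] = val
        self.cur_write = None


rf = BypassRegFile(4)
rf.upd(2, 99)           # write-back writes r2 ...
print(rf.sub(2))        # ... and decode reads the new value in the same cycle
rf.end_of_cycle()
print(rf.sub(2))        # the value is now in the register storage itself
```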

Processor Redux

[Figure: CPU with pc, rf, iMem, dMem and the stages fetch, decode, execute, memory, write-back]

- In-order: F < D < E < M < W
- Pipelined: W < M < E < D < F

One Element SFIFO (Naïve)

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)    data <- mkRegU();
      Reg#(Bool) full <- mkReg(False);

      method Action enq(t x) if (!full);
        full <= True; data <= x;
      endmethod
      method Action deq() if (full);
        full <= False;
      endmethod
      method t first() if (full);
        return data;
      endmethod
      method Maybe#(v) find(tr r);
        return (full ? findf(r, data) : tagged Invalid);
      endmethod
      // clear() is not shown on the slide
    endmodule

Concurrency: find < first < (enq C deq), where C means the two methods conflict

One Element SFIFO (In-Order d2eQ #1)

find < first < enq < deq

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)    data <- mkConfigRegU();
      Reg#(Bool) full <- mkConfigReg(False);
      RWire#(t)  enqv <- mkRWire();

      method Action enq(t x) if (!full);
        full <= True; data <= x; enqv.wset(x);
      endmethod
      method Action deq() if (full || isValid(enqv.wget()));
        full <= False;
      endmethod
      method t first() if (full);
        return data;
      endmethod
      method Maybe#(v) find(tr r);
        return full ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

One Element SFIFO (In-Order e2mQ, m2wQ #2)

find < enq < first < deq

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)    data <- mkRegU();
      Reg#(Bool) full <- mkConfigReg(False);
      RWire#(t)  enqv <- mkRWire();

      method Action enq(t x) if (!full);
        full <= True; data <= x; enqv.wset(x);
      endmethod
      method Action deq() if (full || isValid(enqv.wget()));
        full <= False;
      endmethod
      method t first() if (full || isValid(enqv.wget()));
        // fromMaybe(default, maybe): a value enqueued this cycle overrides data
        return fromMaybe(data, enqv.wget());
      endmethod
      method Maybe#(v) find(tr r);
        return full ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

One Element Searchable SFIFO (Pipelined #3)

first < deq < enq < find

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)      data <- mkConfigRegU();
      Reg#(Bool)   full <- mkConfigReg(False);
      RWire#(void) deqw <- mkRWire();
      RWire#(t)    enqw <- mkRWire();   // carries the value enqueued this cycle

      method Action enq(t x) if (!full || isValid(deqw.wget()));
        full <= True; data <= x; enqw.wset(x);
      endmethod
      method Action deq() if (full);
        full <= False; deqw.wset(?);
      endmethod
      method t first() if (full);
        return data;
      endmethod
      method Maybe#(v) find(tr r);
        return (full && !isValid(deqw.wget())) ? findf(r, data)
             : isValid(enqw.wget())            ? findf(r, fromMaybe(?, enqw.wget()))
             :                                   tagged Invalid;
      endmethod
    endmodule

One Element Searchable SFIFO (Pipelined #4)

first < deq < find < enq

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)      data <- mkConfigRegU();
      Reg#(Bool)   full <- mkConfigReg(False);
      RWire#(void) deqw <- mkRWire();

      method Action enq(t x) if (!full || isValid(deqw.wget()));
        full <= True; data <= x;
      endmethod
      method Action deq() if (full);
        full <= False; deqw.wset(?);
      endmethod
      method t first() if (full);
        return data;
      endmethod
      method Maybe#(v) find(tr r);
        return (full && !isValid(deqw.wget())) ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule

One Element Searchable SFIFO (Pipelined #4), with the deq wire named

first < deq < find < enq

    module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v));
      Reg#(t)      data  <- mkRegU();
      Reg#(Bool)   full  <- mkConfigReg(False);
      RWire#(void) deqEN <- mkRWire();
      Bool deqp = isValid(deqEN.wget());   // True if deq fired this cycle

      method Action enq(t x) if (!full || deqp);
        full <= True; data <= x;
      endmethod
      method Action deq() if (full);
        full <= False; deqEN.wset(?);
      endmethod
      method t first() if (full);
        return data;
      endmethod
      method Maybe#(v) find(tr r);
        return (full && !deqp) ? findf(r, data) : tagged Invalid;
      endmethod
    endmodule
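A Python model of the pipelined variant (first < deq < find < enq) helps check the key property: a full one-element FIFO can be dequeued and refilled in the same cycle. This is a hedged sketch, not the BSV module: `PipelineSFIFO1`, `end_of_cycle`, and the tuple-based `findf` are hypothetical, methods are expected to be called in schedule order within a cycle, and a failed guard is modeled as an assertion.

```python
class PipelineSFIFO1:
    """Toy model of a one-element searchable FIFO with first < deq < find < enq."""

    def __init__(self, findf):
        self.findf = findf
        self.full = False           # register values as seen at the start of the cycle
        self.data = None
        self.deq_called = False     # models the deq RWire
        self._next = None           # pending (full, data) register writes

    def first(self):
        assert self.full, "first: guard (full) is false"
        return self.data

    def deq(self):
        assert self.full, "deq: guard (full) is false"
        self.deq_called = True
        self._next = (False, self.data)

    def find(self, r):
        if self.full and not self.deq_called:
            return self.findf(r, self.data)
        return None                 # None models tagged Invalid

    def enq(self, x):
        assert (not self.full) or self.deq_called, "enq: guard is false"
        self._next = (True, x)      # enq is last in the schedule, so its write wins

    def end_of_cycle(self):
        if self._next is not None:
            self.full, self.data = self._next
        self._next = None
        self.deq_called = False


# Entries are (destination register, value) pairs; find matches on the register.
q = PipelineSFIFO1(lambda r, entry: entry[1] if entry[0] == r else None)
q.enq((3, 10))
q.end_of_cycle()
q.deq()                 # the FIFO is full, so deq fires ...
q.enq((4, 20))          # ... and enq is still allowed in the same cycle
q.end_of_cycle()
print(q.find(4))        # the refilled entry is searchable in the next cycle
```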

Up-Down Counter

Counter Module Interface

    interface Counter;
      method Action up();
      method Action down();
      method Bit#(32) _read();
    endinterface

Concurrency: up and down should be independent

Naïve Counter Example

    module mkCounter(Counter);
      Reg#(Bit#(32)) r <- mkReg(0);

      method Bit#(32) _read();
        return r;
      endmethod
      method Action up();
        r <= r + 1;
      endmethod
      method Action down();
        r <= r - 1;
      endmethod
    endmodule

Counter Example

    module mkCounter(Counter);
      Reg#(Bit#(32)) r     <- mkConfigReg(0);
      RWire#(void)   upW   <- mkRWire();
      RWire#(void)   downW <- mkRWire();

      // rules must precede methods in a BSV module
      rule updateR (True);
        r <= r + (isValid(upW.wget())   ? 1 : 0)
               - (isValid(downW.wget()) ? 1 : 0);
      endrule

      method Bit#(32) _read();
        return r;
      endmethod
      method Action up();
        upW.wset(?);
      endmethod
      method Action down();
        downW.wset(?);
      endmethod
    endmodule

What if we want to call up and then _read?
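The closing question can be explored with a Python model of this counter. In the sketch below (an assumption-laden model, with `end_of_cycle` standing in for rule updateR firing at the clock edge), up and down no longer conflict, but a read in the same cycle still returns the old count.

```python
class Counter:
    """Toy model of the RWire-based counter: up/down set wires, a rule sums them."""

    def __init__(self):
        self.r = 0
        self.up_w = False       # models upW
        self.down_w = False     # models downW

    def read(self):
        return self.r           # value from the start of the cycle

    def up(self):
        self.up_w = True

    def down(self):
        self.down_w = True

    def end_of_cycle(self):
        # models rule updateR followed by the wires going back to Invalid
        self.r += int(self.up_w) - int(self.down_w)
        self.up_w = False
        self.down_w = False


c = Counter()
c.up()
c.down()                # both may fire in one cycle; the net change is zero
c.end_of_cycle()
c.up()
print(c.read())         # same-cycle read does not yet see the increment
c.end_of_cycle()
print(c.read())         # the increment is visible one cycle later
```

Making a same-cycle up visible to _read would need forwarding from the wire into the read path, in the same spirit as the bypass register file.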

Completion Buffer

Completion buffer: Interface

[Figure: cbuf with ports getToken, put (result & token), getResult]

    interface CBuffer#(type t);
      method ActionValue#(Token) getToken();
      method Action put(Token tok, t d);
      method ActionValue#(t) getResult();
    endinterface

    typedef Bit#(TLog#(n)) TokenN#(numeric type n);
    typedef TokenN#(16) Token;

IP-Lookup module with the completion buffer

[Figure: enter feeds RAM and fifo; the "done?" check recirculates on no and goes to cbuf on yes; cbuf provides getToken and getResult]

    module mkIPLookup(IPLookup);
      rule recirculate … ;
      rule exit … ;

      method Action enter(IP ip);
        Token tok <- cbuf.getToken();
        ram.req(ip[31:16]);
        fifo.enq(tuple2(tok, ip[15:0]));
      endmethod
      method ActionValue#(Msg) getResult();
        let result <- cbuf.getResult();
        return result;
      endmethod
    endmodule

For enter and getResult to execute simultaneously, cbuf.getToken and cbuf.getResult must execute simultaneously.

IP Lookup rules with completion buffer

    rule recirculate (!isLeaf(ram.peek()));
      match {.tok, .rip} = fifo.first();
      fifo.enq(tuple2(tok, (rip << 8)));
      ram.req(ram.peek() + rip[15:8]);
      fifo.deq();
      ram.deq();
    endrule

    rule exit (isLeaf(ram.peek()));
      // put takes the token as well as the result; the token travels in the fifo
      cbuf.put(tpl_1(fifo.first()), ram.peek());
      fifo.deq();
      ram.deq();
    endrule

For rule exit and method enter to execute simultaneously, cbuf.put and cbuf.getToken must execute simultaneously.

⇒ For no dead cycles, cbuf.getToken, cbuf.put and cbuf.getResult must all be able to execute simultaneously.

Naïve Completion Buffer

    module mkCBuffer(CBuffer#(t));
      Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));   // one bit per Token value
      RegFile#(Token, t) data <- mkRegFileFull();
      Reg#(Token) rdP <- mkReg(0);
      Reg#(Token) wrP <- mkReg(0);
      Reg#(Token) cnt <- mkReg(0);

      method ActionValue#(Token) getToken() if (cnt < Max);
        cnt <= cnt + 1;
        rdP <= nextPointer(rdP);
        valids[rdP] <= False;
        return rdP;
      endmethod
      method Action put(Token tok, t d);
        valids[tok] <= True;
        data.upd(tok, d);
      endmethod
      method ActionValue#(t) getResult() if (valids[wrP]);
        cnt <= cnt - 1;
        wrP <= nextPointer(wrP);
        return data.sub(wrP);
      endmethod
    endmodule
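Before looking at concurrency, it helps to pin down what the buffer is supposed to do when its methods run one at a time: tokens are handed out in order, results arrive in any order, and getResult returns them in token order. A minimal Python sketch of that sequential behavior follows. The names are hypothetical, a false guard is modeled by returning `None`, and the `cnt == 0` check in `get_result` is an added safeguard that is not on the slide.

```python
class CompletionBuffer:
    """Toy sequential model: in-order tokens, out-of-order put, in-order results."""

    def __init__(self, size):
        self.size = size
        self.valids = [False] * size
        self.data = [None] * size
        self.next_token = 0     # slot handed out by the next get_token
        self.next_result = 0    # slot returned by the next get_result
        self.cnt = 0            # tokens currently outstanding

    def get_token(self):
        if self.cnt == self.size:
            return None         # guard false: every slot is in use
        tok = self.next_token
        self.valids[tok] = False
        self.next_token = (tok + 1) % self.size
        self.cnt += 1
        return tok

    def put(self, tok, d):
        self.valids[tok] = True
        self.data[tok] = d

    def get_result(self):
        # cnt == 0 check: added here so a stale valid bit is never returned
        if self.cnt == 0 or not self.valids[self.next_result]:
            return None         # guard false: oldest result not ready
        d = self.data[self.next_result]
        self.next_result = (self.next_result + 1) % self.size
        self.cnt -= 1
        return d


cb = CompletionBuffer(2)
t0 = cb.get_token()
t1 = cb.get_token()
cb.put(t1, "second")        # the younger request finishes first
print(cb.get_result())      # not ready: the oldest slot is still empty
cb.put(t0, "first")
print(cb.get_result(), cb.get_result())
```

Any concurrent implementation has to preserve this behavior while letting getToken, put and getResult fire in the same cycle.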

Completion buffer: Interface Requirements

[Figure: cbuf with ports getToken, put (result & token), getResult]

- Rules and methods concurrency requirement to avoid dead cycles: exit < getResult < enter
- ⇒ cbuf methods’ concurrency: cbuf.getResult < cbuf.put < cbuf.getToken

Completion Buffer

getResult < put < getToken

    module mkCBuffer(CBuffer#(t));
      Vector#(16, Reg#(Bool)) valids <- replicateM(mkReg(False));   // one bit per Token value
      RegFile#(Token, t) data <- mkRegFileFull();
      Reg#(Token) rdP <- mkConfigReg(0);
      Reg#(Token) wrP <- mkConfigReg(0);
      Counter cnt <- mkCounter();

      method ActionValue#(Token) getToken() if (cnt < Max);
        cnt.up();
        rdP <= rdP + 1;
        valids[rdP] <= False;
        return rdP;
      endmethod
      method Action put(Token tok, t d);
        valids[tok] <= True;
        data.upd(tok, d);
      endmethod
      method ActionValue#(t) getResult() if (valids[wrP]);
        cnt.down();
        wrP <= wrP + 1;
        return data.sub(wrP);
      endmethod
    endmodule

Is valids okay? Is the ordering correct?