380 likes | 539 Views
Implementing for Correct Concurrency Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology. http://csg.csail.mit.edu/6.375. Dealing with Conflicts. When do conflicts arise? How do we Analyze them? How do we fix them?
E N D
Implementing for Correct Concurrency Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
Dealing with Conflicts • When do conflicts arise? • How do we Analyze them? • How do we fix them? • How do we make sure we’re okay? http://csg.csail.mit.edu/6.375
SFIFO m V n interface SFIFO#(type t, type tr, type v); method Action enq(t); // enqueue an item method Action deq(); // remove oldest entry method t first(); // inspect oldest item method Action clear(); // make FIFO empty method Maybe#(v) find(tr); // search FIFO endinterface n = # of bits needed to represent the values of type “t“ m = # of bits needed to represent the values of type “tr“ v = # of bits needed to represent the values of type “v“ enab enq rdy not full enab SFIFO module rdy deq not empty n first rdy not empty enab clear bool find http://csg.csail.mit.edu/6.375
Processor Example execute decode write- back memory rf pc fetch dMem iMem CPU 5 – stage Processor. 1 element FIFOs in between stages Let’s add bypassing http://csg.csail.mit.edu/6.375
Decode Rule Decode is also correct correct anytime it’s allowed to execute rule decode (!newStallFunc(instr, d2eQ, e2mQ, m2wQ)); let fetInst = f2dQ.first(); f2dQ.deq(); match {.ra, .rb} = getRARB(fetInst); let va0 = rf[ra]; let va1 = fromMaybe (m2wQ.find(ra), va0); let va2 = fromMaybe (e2mQ.find(ra), va1); let vb0 = rf[rb]; let vb1 = fromMaybe (m2wQ.find(rb), vb0); let vb2 = fromMaybe (e2mQ.find(rb), vb1); let newInst = case (fetInst) match Add: return (DAdd .va2 .vb2); … endcase; d2eQ.enq(newInst); endrule Search through each place in design When do we want it to execute? http://csg.csail.mit.edu/6.375
some insight intoConcurrent rule firing rule steps Ri Rj Rk Rules Rj HW Rk clocks Ri http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 There are more intermediate states in the rule semantics (a state after each rule step) In the HW, states change only at clock edges
Parallel executionreorders reads and writes Rules rule steps reads writes reads writes reads writes reads writes reads writes reads writes reads writes clocks HW http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 In the rule semantics, each rule sees (reads) the effects (writes) of previous rules In the HW, rules only see the effects from previous clocks, and only affect subsequent clocks
Correctness rule steps Ri Rj Rk Rules Rj HW Rk clocks Ri http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 Rules are allowed to fire in parallel only if the net state change is equivalent to sequential rule execution Consequence: the HW can never reach a state unexpected in the rule semantics
Upshot • Given the concurrency of method/rules in a system we can determine viable schedules • Some variation do to applicability • BUT we know what schedule we want (mostly) • We should be able to back propagate results to submodules http://csg.csail.mit.edu/6.375
Determining Concurrency Properties http://csg.csail.mit.edu/6.375
Processor: Concurrencies execute decode write- back memory rf pc fetch dMem iMem CPU http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 In-order: F < D < E < M < W Pipelined W < M < E < D < F
execute decode rf pc fetch write- back memory imem dMem CPU Concurrency requirements for Full Pipelining – Reg File • In-Order RF: • (D calls sub) < (W calls upd) • Pipelined RF: • (W calls upd) < (D calls sub) http://csg.csail.mit.edu/6.375
Concurrency requirements for Full Pipelining – FIFOs In-Order FIFOs: 1. m2wQ, e2mQ: find < enq < first < deq 2. d2eQ: find < enq < first < deq, clear Pipeline FIFOs: 3. m2wQ, e2mQ : first < deq < enq < find 4. d2eQ : first < deq < find < enq execute decode rf pc fetch write- back memory imem dMem CPU http://csg.csail.mit.edu/6.375
Constructing Appropriately concurrent submodules http://csg.csail.mit.edu/6.375
From Analysis to Design • We need to create modules which behave as needed • Construct modules using “unsafe” primitives to have “safe” behaviors • Three major concepts: • Use primitives which remove “false” concurrency orderings (e.g. ConfigRegs vs. Regs) • Add RWires for forwarding values intra-cycle • Reason carefully to assure that execution appears “atomic” http://csg.csail.mit.edu/6.375
ConfigReg and RWire • mkConfigReg is a Reg without this restriction • mkReg requires that read < write • Allows us to read stale values (dangerous) • RWire is a “wire” • wset :: a -> Action writes • wget :: Maybe#(a) returns written value if read happened. • wset happens before wget each cycle http://csg.csail.mit.edu/6.375
Let’s implement some modules http://csg.csail.mit.edu/6.375
Processor Redux execute decode write- back memory rf pc fetch dMem iMem CPU http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 In-order: F < D < E < M < W Pipelined W < M < E < D < F
Concurrency: RegFile • The standard library regfile is implemented using with concurrency (sub < upd) • This handles the in-order case • We need to build a RegisterFile for the pipelined case http://csg.csail.mit.edu/6.375
BypassRegFile module mkBypassRegFile(RegFile#(a,d)) #(d l, d h) provisos#(Bits(a,asz), Bits#(d,dsz)); RegFile#(a,d) rfInt <- mkRegFileWCF(l,h); RWire#(Tuple2#(a,d)) curWrite <- mkRWire(); method Action upd(a x, d v); rfInternal.upd(x,v); curWrite.wset(tuple2(x,v)); endmethod method d sub(a x); case (curWrite.wget()) matches tagged Valid {.wa, .wd} &&& wa == a: return wd; default: return rfInternal.sub(a); endcase endmethod endmodule http://csg.csail.mit.edu/6.375
Processor Redux execute decode write- back memory rf pc fetch dMem iMem CPU http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375 In-order: F < D < E < M < W Pipelined W < M < E < D < F
One Element SFIFO (Naïve) module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkReg(False); method Action enq(t x) if (!full); full <= True; data <= x; endmethod method Action deq() if (full); full <= False; endmethod method t first() if (full); return (data); endmethod method Maybe#(v) find(tr r); return (full ? findf(r, data): Nothing); endmethod endmodule Concurrency: find < first < (enq C deq) http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
One Element SFIFO (In-Order d2eQ #1) find < first < enq < deq module mkSFIFO1#(function Maybe#(v) findf(tr r, t x)) (SFIFO#(t,tr,v)); Reg#(t) data <- mkConfigRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(t) enqv <- mkRWire(); method Action enq(t x) if (!full); full <= True; data <= x; enqv.wset(x); endmethod method Action deq() if (full || isValid(enqv.wget())); full <= False; endmethod method t first() if (full); return data; endmethod method Maybe#(v) find(tr r); return full ? findf(r,data): Nothing; endmethod endmodule http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
One Element SFIFO (In-Order e2mQ, m2wQ #2) find < enq < first < deq module mkSFIFO1#(function Bool findf(tr r, t x)) (SFIFO#(t,tr)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(t) enqv <- mkRWire(); method Action enq(t x) if (!full); full <= True; data <= x; enqv.wset(x); endmethod method Action deq() if (full || isValid(enqv.wget())); full <= False; endmethod method t first() if (full || isValid(enqv.wget())); return (fromMaybe(enqv.wget(), data)); endmethod method Maybe#(v) find(tr r); return full ? findf(r,data): Nothing; endmethod endmodule http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
One Element Searchable SFIFO (Pipelined #3) first < deq < enq < find module mkSFIFO1#(function Bool findf(tr r, t x)) (SFIFO#(t,tr)); Reg#(t) data <- mkConfigRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(void) deqw <- mkRWire(); RWire#(void) enqw <- mkRWire(); method Action enq(t x) if (!full || isValid(deqw.wget()); full <= True; data <= x; enqw.wset(x); endmethod method Action deq() if (full); full <= False; deqw.wset(?); endmethod method t first() if (full); return (data); endmethod method Maybe#(v) find(tr r); return (full&&!isValid(deqw.wget()) ? findf(r,data) : isValid(enqw.wget()) ? findf(r, fromMaybe(enqw.wget(),?)): Nothing; endmethod endmodule http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
One Element Searchable SFIFO (Pipelined #4) first < deq < find < enq module mkSFIFO1#(function Bool findf(tr r, t x)) (SFIFO#(t,tr)); Reg#(t) data <- mkConfigRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(void) deqw <- mkRWire(); method Action enq(t x) if (!full || isValid(deqw.wget()); full <= True; data <= x; endmethod method Action deq() if (full); full <= False; deqw.wset(?); endmethod method t first() if (full); return (data); endmethod method Maybe#(v) find(tr r); return (full&&!isValid(deqw.wget()) ? findf(r, data): Nothing; endmethod endmodule http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
One Element Searchable SFIFO (Pipelined #4) first < deq < find < enq module mkSFIFO1#(function Bool findf(tr r, t x)) (SFIFO#(t,tr)); Reg#(t) data <- mkRegU(); Reg#(Bool) full <- mkConfigReg(False); RWire#(void) deqEN <- mkRWire(); Bool deqp = isValid (deqEN.wget())); method Action enq(t x) if (!full|| deqp); full <= True; data <= x; 12endmethod method Action deq() if (full); full <= False; deqEN.wset(?); endmethod method t first() if (full); return (data); endmethod method Maybe#(v) find(tr r); return (full&&!deqp) ? findf(r, data): Nothing; endmethod endmodule http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
Up-Down Counter http://csg.csail.mit.edu/6.375
Counter Module Interface interface Counter method Action up(); method Action down(); method Bit#(32) _read(); endinterface Concurrency: up and down should be independent http://csg.csail.mit.edu/6.375
Naïve Counter Example module mkCounter(Counter); Reg#(int) r <- mkReg(); method int _read(); return r; endmethod method Action up(); r <= r + 1; endmethod method Action down(); c <= r – 1; endmethod endmodule http://csg.csail.mit.edu/6.375
Counter Example module mkCounter(Counter); Reg#(int) r <- mkConfigReg(); RWire#(void) upW <- mkRWire(); RWire#(void) downW <- mkRWire(); method int _read(); return r; endmethod method Action up(); upW.wset(); endmethod method Action down(); downW.wset(); endmethod rule updateR(True); r <= r + (isValid( upW.wget()) ? 1 : 0) - (isValid(downW.wget()) ? 1 : 0); endrule endmodule What if want to call up then _read? http://csg.csail.mit.edu/6.375
Completion Buffer http://csg.csail.mit.edu/6.375
Completion buffer: Interface cbuf getToken getResult put (result & token) interface CBuffer#(type t); methodActionValue#(Token) getToken(); methodAction put(Token tok, t d); methodActionValue#(t) getResult(); endinterface typedef Bit#(TLog#(n)) TokenN#(numeric type n); typedef TokenN#(16) Token; http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
IP-Lookup module with the completion buffer enter getResult getToken cbuf yes done? RAM no fifo module mkIPLookup(IPLookup); rule recirculate… ; rule exit …; method Action enter (IP ip); Token tok <- cbuf.getToken(); ram.req(ip[31:16]); fifo.enq(tuple2(tok,ip[15:0])); endmethod method ActionValue#(Msg) getResult(); let result <- cbuf.getResult(); return result; endmethod endmodule for enter and getResult to execute simultaneously, cbuf.getToken and cbuf.getResult must execute simultaneously http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
IP Lookup rules with completion buffer rule recirculate (!isLeaf(ram.peek())); match{.tok,.rip} = fifo.first(); fifo.enq(tuple2(tok,(rip << 8))); ram.req(ram.peek() + rip[15:8]); fifo.deq(); ram.deq(); endrule rule exit (isLeaf(ram.peek())); cbuf.put(ram.peek()); fifo.deq(); ram.deq(); endrule For rule exit and method enter to execute simultaneously, cbuf.put and cbuf.getToken must execute simultaneously For no dead cycles cbuf.getToken and cbuf.put and cbuf.getResult must be able to execute simultaneously http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
Naïve Completion Buffer module mkCBuffer(CBuffer#(a)); Vector#(Reg#(Bool)) valids <- replicateM(mkReg(False)); RegFile#(Token, t) data <- mkRegFile(); Reg#(Token) rdP <- mkReg(0); Reg#(Token) wrP <- mkReg(0); Reg#(Token) cnt <- mkReg(0); method ActionValue#(Token) getToken() if (cnt < Max); cnt <= cnt + 1; rdP <= nextPointer(rdP); valids[rdP] <= False; return rdp; endmethod method Action put(Token tok, t d); valids[tok] <= True; data.upd(tok, d); endmethod method ActionValue#(t) getResult() if (valids[wrP]) cnt <= cnt -1; wrP <= nextPointer(wrP); return (data.sub(wrP)); endmethod endmodule http://csg.csail.mit.edu/6.375
Completion buffer: Interface Requirements cbuf getToken getResult put (result & token) Rules and methods concurrency requirement to avoid dead-cycles: exit < getResult < enter cbuf methods’ concurency: cbuf.getResult < cbuf.put < cbuf.getToken http://csg.csail.mit.edu/6.375 http://csg.csail.mit.edu/6.375
Completion Buffer getResult < put < getToken module mkCBuffer(CBuffer#(a)); Vector#(Reg#(Bool)) valids <- replicateM(mkReg(False)); RegFile#(Token, t) data <- mkRegFile(); Reg#(Token) rdP <- mkConfigReg(0); Reg#(Token) wrP <- mkConfigReg(0); Counter cnt <- mkCounter(); method ActionValue#(Token) getToken() if (cnt < Max); cnt.up(); rdP <= rdP + 1; valids[rdP] <= False; return rdp; endmethod method Action put(Token tok, t d); valids[tok] <= True; data.upd(tok, d); endmethod method ActionValue#(t) getResult() if (valids[wrP]) cnt.down(); wrP <= wrP + 1; return (data.sub(wrP)); endmethod endmodule Is valids okay? Is the ordering correct? http://csg.csail.mit.edu/6.375