Modular Refinement Arvind Computer Science & Artificial Intelligence Lab

Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology http://csg.csail.mit.edu/korea

Successive refinement & Modular Structure rf pc CPU rf pc writeback fetch memory execute decode dMem iMem CPU fetch & decode execute bu Can we derive the 5-stage pipeline by successive refinement of a 2-stage pipeline? http://csg.csail.mit.edu/korea

A Modular organization Suppose we include rf and pc in Fetch and bu in Execute Fetch delivers decoded instructions to Execute and needs to consult Execute for the stall condition Execute writes back data in rf and supplies the pc value in case of a branch misprediction CPU FIFO bu dMem iMem RFile rf WB execute setPc pc stall enqIt modules call each other (recursive) fetch & decode http://csg.csail.mit.edu/korea

Interface definitions:Fetch and Execute interface Fetch; method Action setPC (Iaddress cpc); method Action writeback (RName dst, Value v); endinterface interface Execute; method Action enqIt(InstTemplate it); method Bool stall(Instr instr) endinterface http://csg.csail.mit.edu/korea

Recursive modular organization module mkCPU#(IMem iMem, DMem dMem)(Empty); module mkFix#(Tuple2#(Fetch, Execute) fe) • (Tuple2#(Fetch, Execute));match{.f, .e} = fe; Fetch fetch <- mkFetch(iMem,e); Execute execute <- mkExecute(dMem,f);return(tuple2(fetch,execute)); • endmodulematch {.fetch, .execute} <- moduleFix(mkFix);endmodule http://csg.csail.mit.edu/korea

Fetch Module module mkFetch#(IMem iMem, Execute execute) (Fetch); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; Reg#(Iaddress) pc <- mkReg(0); RegFile#(RName, Bit#(32)) rf <- mkBypassRegFile(); rule fetch_and_decode (!execute.stall(instr)); execute.enqIt(newIt(instr,rf)); pc <= predIa; endrule methodAction writeback(RName rd, Value v); rf.upd(rd,v); endmethod methodAction setPC(Iaddress newPC); pc <= newPC; endmethod endmodule http://csg.csail.mit.edu/korea

Execute Module module mkExecute#(DMem dMem, Fetch fetch) (Execute); SFIFO#(InstTemplate) bu <- mkSPipelineFifo(findf); InstTemplate it = bu.first; rule execute … methodActionenqIt(InstTemplate it); bu.enq(it); endmethod method Bool stall(Instr instr); return (stallFunc(instr, bu)); endmethod endmodule no change http://csg.csail.mit.edu/korea

Execute Module Rule rule execute (True); case (it) matchestagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin fetch.writeback(rd, va+vb); bu.deq(); end tagged EBz {cond:.cv,addr:.av}: if (cv == 0) then begin fetch.setPC(av);bu.clear(); end else bu.deq();tagged ELoad{dst:.rd,addr:.av}: begin fetch.writeback(rd, dMem.read(av)); bu.deq(); end tagged EStore{value:.vv,addr:.av}: begin dMem.write(av, vv); bu.deq(); end endcase endrule http://csg.csail.mit.edu/korea

Subtle Architecture Issues After setPC is called the next instruction enqueued via enqIt must correspond to iMem(cpc) stall and writeback methods are closely related; writeback affects the results of subsequent stalls the effect of writeback must be reflected immediately in the decoded instructions interface Fetch; method Action setPC (Iaddress cpc); method Action writeback (RName dst, Value v); endinterface interface Execute; method Action enqIt(InstTemplate it); method Bool stall(Instr instr) endinterface Any modular refinement must preserve these extra-linguistic semantic properties http://csg.csail.mit.edu/korea

Modular refinement: Separating Fetch and Decode http://csg.csail.mit.edu/korea

Fetch Module RefinementSeparating Fetch and Decode module mkFetch#(IMem iMem, Execute execute) (Fetch); FIFO#(Instr) fetch2DecodeQ <- mkPipelineFIFO(); Instr instr = iMem.read(pc); Iaddress predIa = pc + 1; … rulefetch(True); pc <= predIa; fetch2DecodeQ.enq(instr); endrule ruledecode (!execute.stall(fetch2DecodeQ.first())); execute.enqIt(newIt(fetch2DecodeQ.first(),rf)); fetch2DecodeQ.deq(); endrule method Action setPC … method Action writeback … endmodule Are any changes needed in the methods? http://csg.csail.mit.edu/korea

Fetch Module Refinement modulemkFetch#(IMemiMem, Execute execute) (Fetch); FIFO#(Instr) fetchDecodeQ <- mkFIFO(); … rule fetch … rule decode … methodActionwriteback(RName rd, Value v); rf.upd(rd,v); endmethod methodActionsetPC(IaddressnewPC); pc <= newPC; fetch2DecodeQ.clear(); endmethod Endmodule no change The stall signal definition guarantee that any order for the execution of the decode rule and writeback method is valid. http://csg.csail.mit.edu/korea

Modular refinement: Replace magic memory by multicycle memory http://csg.csail.mit.edu/korea

Original Fetch and Decode Instr instr = iMem.read(pc); rulefetch_and_decode(!execute.stall(instr)); execute.enqIt(newIt(instr,rf)); pc <= predIa; endrule iMem.read(pc) needs to be split into a pair of request-response actions: method Action iMem.req(Addrpc) method Actionvalue#(Inst) iMem.res() But we can’t call these methods within a rule to achieve the desired behavior http://csg.csail.mit.edu/korea

The desired behavior • Now imem read may have a side effect (req and res change state) rulefetch_and_decode(True); instr <- actionvalue imem.req(pc) <$> i <- imem.resp(); return i; endactionvalue execute.enqIt(newIt(instr,rf)) when (!execute.stall(instr)); pc <= predIa; endrule We need a guard to prevent us from repeatedly requesting stalling instr http://csg.csail.mit.edu/korea

Action Connectives:Par vs. Seq • Parallel compositions (a1;a2) • Neither action observes others’ updates • Writes are disjoint • Natural in hardware • Sequential Connective (a1 <$> a2) • a2 observes a1’s updates • Still atomic • Not present in BSV because of implementation complications (r1 <= r2 ; r2 <= r1) swaps r1 & r2 We need to split the rule to get rid of <$> http://csg.csail.mit.edu/korea

Predicating Actions:Guards vs. Ifs • Guards affect their surroundings (a1 when p1) ; a2 ==> (a1 ; a2) when p1 • The effect of an “if” is local (if p1 then a1) ; a2 ==> if p1 then (a1 ; a2) else a2 p1 has no effect on a2 http://csg.csail.mit.edu/korea

Splitting the rule rulefetch(True); imem.req(pc) pc <= predIa; endrule rule decode(True); let instr <- imem.resp(); execute.enqIt(newIt(instr,rf)) when (!execute.stall(instr)); endrule Suppose the PC was also needed to decode the instruction http://csg.csail.mit.edu/korea

Passing data from Fetch to Decode All data between actions passed through state (imem + fet2decQ) rulefetch(True); imem.req(pc); fet2decQ.enq(pc); pc <= predIa; endrule rule decode(True); let instr <- imem.resp(); let pc = fet2decQ.first(); fet2decQ.deq(); execute.enqIt(newIt(instr,rf,pc)) when (!execute.stall(instr)); endrule http://csg.csail.mit.edu/korea

Methods of Fetch module To finish we need to change method setPC so that the next instruction sent to the Execute module is the correct one method setPC(Iaddress npc); pc <= npc; fet2decQ.clear(); endmethod http://csg.csail.mit.edu/korea

Multicycle memory:Refined Execute Module Rule rule execute (True); case (it) matchestagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin fetch.writeback(rd, va+vb); bu.deq();end tagged EBz {cond:.cv,addr:.av}: if (cv == 0) then begin fetch.setPC(av);bu.clear(); end else bu.deq();tagged ELoad{dst:.rd,addr:.av}: begin let val <- actionvalue dMem.req(Read {av}) <$> v <- dMem.resp(); return v; endactionvalue; fetch.writeback(rd, val); bu.deq(); end tagged EStore{value:.vv,addr:.av}: begin dMem.req(Write {av, vv}); <$> let val <- dMem.resp(); bu.deq(); end endcase endrule http://csg.csail.mit.edu/korea

Splitting the Backend Rules:The execute rule rule execute (True); bu.deq(); case (it) matchestagged EAdd{dst:.rd,src1:.va,src2:.vb}: begin exec2wbQ.enq ( WWB {dst: rd, val: va+vb});end tagged EBz {cond:.cv,addr:.av}: if (cv == 0) then begin fetch.setPC(av);bu.clear(); end;tagged ELoad{dst:.rd,addr:.av}: begin dMem.req(Read {addr:av}); exec2wbQ.enq(WLd {dst:rd}); end tagged EStore{value:.vv,addr:.av}: begin dMem.req(Write {addr:av, val:vv}); exec2wbQ.enq(WSt {}); end endcase endrule rule writeback(True); …. http://csg.csail.mit.edu/korea

Splitting the Backend Rules The writeback rule rule execute (True);… endrule rulewriteback(True); exec2wbQ.deq(); case exec2wbQ.first() matches tagged WWB {dst: .rd, val: .v}: fetch.writeback(rd,v); taggedWLd {dst: .rd}: beginlet v <- dMem.resp(); fetch.writeback(rd,v); end taggedWSt {} : begin let ack <- dMem.resp(); end default: noAction; endcase endrule http://csg.csail.mit.edu/korea

Final step • Stepwise refinement makes the verification task easier but does not eliminate it • We still need to prove that each step is correct http://csg.csail.mit.edu/korea

Splitting the Backend Rules:The execute rule Stall condition prevents fetched instruction from reading stale data rule execute (True); bu.deq(); case (it) matchestaggedEAdd{dst:.rd,src1:.va,src2:.vb}: begin exec2wbQ.enq ( WWB {dst: rd, val: va+vb});end taggedEBz {cond:.cv,addr:.av}: if (cv == 0) then begin fetch.setPC(av);bu.clear(); end;taggedELoad{dst:.rd,addr:.av}: begin dMem.req(Read {addr:av}); exec2wbQ.enq(WLd {dst:rd}); end taggedEStore{value:.vv,addr:.av}: begin dMem.req(Write {addr:av, val:vv}); exec2wbQ.enq(WSt {});end endcaseendrule rulewriteback(True); …. What about exec2wbQ? Should it be cleared? No because all instructions in exec2wbQ are non speculative and must commit http://csg.csail.mit.edu/korea

Modular Refinement Arvind Computer Science & Artificial Intelligence Lab