
Computer Architecture: A Constructive Approach Branch Direction Prediction – Six Stage Pipeline

Joel Emer, Computer Science & Artificial Intelligence Lab., Massachusetts Institute of Technology.


Presentation Transcript


  1. Computer Architecture: A Constructive Approach. Branch Direction Prediction – Six Stage Pipeline. Joel Emer, Computer Science & Artificial Intelligence Lab., Massachusetts Institute of Technology. http://csg.csail.mit.edu/6.S078

  2. NA pred with decode feedback
     [Pipeline diagram. Stages: Fetch (F), Decode (D), RegRead (R), Execute (X), Memory (M), Write-back (W). Inter-stage FIFOs: fr, dr, rr, xr, mr. Feedback paths: xf, df. Block label: Next Address Prediction.]

  3. Decode detected mispredicts
     Can we do better than PC+4?
     • Non-branch
       • When nextPC != PC+4 => use PC+4
     • Unconditional branch, target known at decode
       • When nextPC != known target => use known target
     • Conditional branch
       • When nextPC is neither PC+4 nor the decoded target => use PC+4
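
The three decode-time cases above can be sketched as a small behavioral model. This is Python rather than the lecture's Bluespec, and `BrClass`, `decode_check`, and the argument names are hypothetical names chosen for illustration; the sketch assumes decode always knows the target of a conditional branch.

```python
from enum import Enum, auto


class BrClass(Enum):
    NON_BRANCH = auto()
    UNCOND_KNOWN = auto()   # unconditional, target computable at decode
    COND_BRANCH = auto()
    OTHER = auto()          # e.g. indirect jumps: decode cannot check these


def decode_check(pc, next_pc_pred, br_class, known_target=None):
    """Return (redirect_needed, next_pc) following the slide's three cases."""
    pc_plus4 = pc + 4
    if br_class is BrClass.NON_BRANCH:
        plausible, fallback = {pc_plus4}, pc_plus4
    elif br_class is BrClass.UNCOND_KNOWN:
        plausible, fallback = {known_target}, known_target
    elif br_class is BrClass.COND_BRANCH:
        # Either direction is possible; only a third address is provably wrong.
        plausible, fallback = {pc_plus4, known_target}, pc_plus4
    else:
        return False, next_pc_pred
    if next_pc_pred in plausible:
        return False, next_pc_pred
    return True, fallback
```

For example, `decode_check(0x100, 0x200, BrClass.NON_BRANCH)` asks for a redirect to `0x104`, while a conditional branch predicted to go to either `0x104` or its decoded target is left alone.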

  4. Dynamic Branch Prediction
     • Branch direction prediction:
       • Learn and predict the direction a branch will go
     • Standard prediction principles:
       • Temporal correlation
         • The way a branch resolves may be a good predictor of the way it will resolve at the next execution
       • Spatial correlation
         • Several branches may resolve in a highly correlated manner (a preferred path of execution)

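  5. One-bit predictor
     Predict that the branch will go the same direction it went last time.
     [Figure: k bits of the fetch PC, above the two low-order zero bits, form the BHT index into a 2^k-entry BHT with 1 bit/entry, which yields Taken/¬Taken. The I-cache supplies the instruction; its opcode answers "Branch?" and its offset is added to the PC to give the target PC. Stage labels: Fetch, Decode.]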

  6. One-bit predictor
         // Interface
         interface DirectionPred;
           method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
           method Action train(DirInfo dirInfo, Bool taken);
         endinterface
         // Feedback information
         typedef 64 BPRows;
         typedef Bit#(TLog#(BPRows)) DirLineIndex;
         typedef DirLineIndex DirInfo;

  7. One-bit predictor (continued)
     • Array of prediction bits
     • Return prediction saved in array
     • Update array with last actual behavior
     When should we train?
         module mkDirectionPredictor(DirectionPred);
           RegFile#(DirLineIndex, Bool) dirArray <- mkRegFileFull();
           method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
             DirLineIndex index = truncate(addr >> 2);
             return tuple2(dirArray.sub(index), index);
           endmethod
           method Action train(DirInfo dirInfo, Bool taken);
             DirLineIndex index = dirInfo;
             dirArray.upd(index, taken);
           endmethod
         endmodule
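
To get a feel for how the one-bit scheme behaves, here is a minimal Python model of the module above (a sketch, not the lecture's code). It assumes every entry starts as not-taken and that training happens immediately after each prediction, which ignores pipeline delay; `OneBitPredictor` and `count_mispredicts` are illustrative names.

```python
class OneBitPredictor:
    def __init__(self, rows=64):
        self.rows = rows
        # Assumption: every entry starts as "not taken".
        self.table = [False] * rows

    def predict(self, addr):
        # Mirrors truncate(addr >> 2): drop the byte offset, keep log2(rows) bits.
        idx = (addr >> 2) % self.rows
        return self.table[idx], idx

    def train(self, idx, taken):
        # Remember only the most recent outcome.
        self.table[idx] = taken


def count_mispredicts(predictor, addr, outcomes):
    misses = 0
    for taken in outcomes:
        pred, idx = predictor.predict(addr)
        if pred != taken:
            misses += 1
        predictor.train(idx, taken)  # idealized: train right away
    return misses


# A loop-closing branch: taken three times, then falls through; loop run 3 times.
loop_branch = [True, True, True, False] * 3
misses = count_mispredicts(OneBitPredictor(), 0x40, loop_branch)
```

Under these assumptions the predictor misses twice per pass through the loop: once on the exit, and once more on re-entry because the exit overwrote the bit.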

  8. Two-bit Predictor (Smith, 1981)
     How well does the one-bit predictor do on short trip count loops?
     • Assume 2 direction prediction bits per instruction
     Implement using a saturating counter.

  9. Saturating Counter
     How do we determine prediction from counter?
         typedef Bit#(2) Counter;
         function Counter updateCounter(Bool dir, Counter counter);
           return dir ? saturatingInc(counter) : saturatingDec(counter);
         endfunction
         function Counter saturatingInc(Counter counter);
           let plusOne = counter + 1;
           return (plusOne == 0) ? counter : plusOne;
         endfunction
         function Counter saturatingDec(Counter counter);
           return (counter == 0) ? 0 : counter - 1;
         endfunction
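
The counter functions above can be mirrored in Python to see how a two-bit counter handles a short loop. This is a sketch under stated assumptions: the prediction is read from the counter's high bit (as the deck does later), the counter starts at 0 unless told otherwise, and updates are applied immediately. The function names are illustrative.

```python
def saturating_inc(counter, bits=2):
    # Stop at the maximum value instead of wrapping to 0.
    return min(counter + 1, (1 << bits) - 1)


def saturating_dec(counter):
    # Stop at 0 instead of wrapping to the maximum.
    return max(counter - 1, 0)


def update_counter(taken, counter):
    return saturating_inc(counter) if taken else saturating_dec(counter)


def predict_taken(counter):
    # High bit of a 2-bit counter: values 2 and 3 predict taken.
    return (counter >> 1) == 1


def count_mispredicts(outcomes, counter=0):
    misses = 0
    for taken in outcomes:
        if predict_taken(counter) != taken:
            misses += 1
        counter = update_counter(taken, counter)
    return misses


# Loop-closing branch: taken three times, then falls through.
one_pass = [True, True, True, False]
cold_start = count_mispredicts(one_pass * 3)             # counter starts at 0
warmed_up = count_mispredicts(one_pass * 10, counter=3)  # counter starts at 3
```

Once warmed up, the single not-taken outcome at loop exit only moves the counter from 3 to 2, so the next entry into the loop is still predicted taken and only the exit itself is mispredicted.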

  10. Two-bit predictor
      [Figure: k bits of the fetch PC, above the two low-order zero bits, form the BHT index into a 2^k-entry BHT with 2 bits/entry, which yields Taken/¬Taken.]

  11. Two-bit predictor
      • Feedback state for training
          typedef 64 BPRows;
          typedef Bit#(TLog#(BPRows)) DirLineIndex;
          // DirInfo data
          typedef struct {
            DirLineIndex index;
            Counter counter;
          } DirInfo deriving(Bits, Eq);
          module mkDirectionPredictor(DirectionPred);
            // Direction predictor state
            RegFile#(DirLineIndex, Counter) cntArray <- mkRegFileFull();

  12. Two-bit predictor (continued)
      • Training information is index and counter
      • Prediction is high bit of counter
      • Train by updating counter
            method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
              DirInfo info = ?;
              info.index = truncate(addr >> 2);
              info.counter = cntArray.sub(info.index);
              // High bit of the 2-bit counter is the prediction
              Bit#(1) pred = truncate(info.counter >> 1);
              Bool taken = (pred == 1);
              return tuple2(taken, info);
            endmethod
            method Action train(DirInfo info, Bool taken);
              cntArray.upd(info.index, updateCounter(taken, info.counter));
            endmethod
          endmodule

  13. Exploiting Spatial Correlation (Yeh and Patt, 1992)
          if (x[i] < 7) then y += 1;
          if (x[i] < 5) then c -= 4;
      If the first condition is false, the second condition is also false.
      Also works well for short trip count loops.
      Implemented with a history register, ‘hist’, that records the direction of the last N branches executed by the processor.
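
A small Python model can illustrate the idea: a table of two-bit counters indexed only by a global history register. This is a sketch under several assumptions that are not from the slides: the recorded "direction" is simply whether each condition is true, counters start at 0, history is 2 bits, and the history is updated with the actual outcome right away (no speculation). The input sequence is deliberately regular, and all names are illustrative.

```python
class GlobalHistoryPredictor:
    def __init__(self, hist_bits=2):
        self.mask = (1 << hist_bits) - 1
        self.hist = 0                           # last hist_bits outcomes
        self.counters = [0] * (1 << hist_bits)  # 2-bit counters, start at 0

    def predict(self):
        # High bit of the counter selected by the history.
        return self.counters[self.hist] >= 2

    def update(self, taken):
        c = self.counters[self.hist]
        self.counters[self.hist] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the actual outcome into the history register.
        self.hist = ((self.hist << 1) | int(taken)) & self.mask


def run_correlated_branches(xs, hist_bits=2):
    """Return (total misses, misses on the 2nd branch when the 1st was false)."""
    p = GlobalHistoryPredictor(hist_bits)
    total = second_after_false = 0
    for x in xs:
        first = x < 7
        if p.predict() != first:
            total += 1
        p.update(first)
        second = x < 5
        if p.predict() != second:
            total += 1
            if not first:
                second_after_false += 1
        p.update(second)
    return total, second_after_false


result = run_correlated_branches([8, 3, 9, 2, 8, 4, 9, 1])
```

On this input the only misses occur while the counters warm up; the second branch is never mispredicted on iterations where the first condition was false, because the history bit recording the first outcome selects a counter that has only ever seen "false" there. With less regular data, aliasing between the two branches in the shared table can still cause misses.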

  14. Ghist predictor
          typedef 64 BPRows;
          typedef Bit#(TLog#(BPRows)) DirLineIndex;
          typedef Bit#(2) Counter;
          // DirInfo data
          typedef struct {
            DirLineIndex hist;
            Counter counter;
          } DirInfo deriving(Bits, Eq);
          module mkDirectionPredictor(DirectionPred);
            // Direction predictor state
            Reg#(DirLineIndex) hist <- mkReg(0);
            RegFile#(DirLineIndex, Counter) cntArray <- mkRegFileFull();

  15. Global history predictor
      • Calculate feedback information
      • Shift new prediction into history register
      How good are predictions while waiting for training?
            method ActionValue#(Tuple2#(Bool, DirInfo)) predict(Addr addr);
              DirInfo info = ?;
              info.hist = hist;
              info.counter = cntArray.sub(hist);
              Bit#(1) pred = truncate(info.counter >> 1);
              hist <= truncate(hist << 1 | zeroExtend(pred));
              return tuple2((pred == 1), info);
            endmethod

  16. Global history predictor
      • Restore history to the state it would be in after the desired prediction
      What is the state of ‘hist’ after redirects from decode and execute?
            method Action train(DirInfo info, Bool taken);
              // cntArray is the counter table declared with the predictor state
              cntArray.upd(info.hist, updateCounter(taken, info.counter));
            endmethod
            method Action repair(DirInfo info, Bool taken);
              hist <= truncate((info.hist << 1) | zeroExtend(pack(taken)));
            endmethod
          endmodule
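
The effect of `repair` can be sketched in Python: each prediction speculatively shifts its predicted bit into the history and hands back the pre-shift value as feedback; a redirect rebuilds the history from that saved value plus the actual outcome. The class and method names below are hypothetical, and the 6-bit width simply matches a 64-row table.

```python
HIST_BITS = 6
MASK = (1 << HIST_BITS) - 1


def shift_in(hist, bit):
    return ((hist << 1) | int(bit)) & MASK


class SpeculativeHistory:
    def __init__(self):
        self.hist = 0

    def predict(self, predicted_taken):
        # Save the history this prediction was made with, then speculate.
        snapshot = self.hist
        self.hist = shift_in(self.hist, predicted_taken)
        return snapshot

    def repair(self, snapshot, taken):
        # Discard everything shifted in after the mispredicted branch.
        self.hist = shift_in(snapshot, taken)


h = SpeculativeHistory()
s1 = h.predict(True)    # branch A predicted taken
s2 = h.predict(True)    # branch B (younger) predicted taken
s3 = h.predict(False)   # branch C (younger still) predicted not taken
before_repair = h.hist
h.repair(s1, False)     # branch A resolved not taken: B and C were wrong-path
after_repair = h.hist
```

After the repair the history holds only branch A's actual outcome on top of the pre-A state, so the bits contributed by the wrong-path branches B and C are gone.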

  17. NA pred with decode feedback
      [Pipeline diagram. Stages: Fetch (F), Decode (D), RegRead (R), Execute (X), Memory (M), Write-back (W). Inter-stage FIFOs: fr, dr, rr, xr, mr. Feedback paths: xf, df. Block labels: Next Address Prediction, Direction Prediction.]

  18. Direction prediction recipe
      • Execute
        • Send redirects on mispredicts (unchanged)
        • Send direction prediction training
      • Decode
        • Check if next address matches direction pred
        • Send redirect if different
      • Fetch
        • Generate prediction
        • Learn from feedback
        • Accept redirects from later stages

  19. Add direction feedback
      • Feedback needs information for training direction predictor
          typedef struct {
            Bool correct;
            NaInfo naPredInfo;
            Addr nextAddr;
            DirInfo dirPredInfo;
            Bool taken;
          } Feedback deriving (Bits, Eq);
          FIFOF#(Tuple3#(Epoch, Epoch, Feedback)) decFeedback <- mkFIFOF;
          FIFOF#(Tuple2#(Epoch, Feedback)) execFeedback <- mkFIFOF;

  20. Execute (branch analysis)
      • Recall: nextAddrPred may have been set in decode
      • Always send feedback
          // after executing instruction...
          let nextEeEpoch = eeEpoch;
          let cond = execData.execInst.cond;
          let nextPc = cond ? execData.execInst.addr : execData.pc + 4;
          if (nextPc != execData.nextAddrPred) nextEeEpoch = nextEeEpoch + 1;
          eeEpoch <= nextEeEpoch;
          execFeedback.enq(tuple2(nextEeEpoch,
              Feedback{correct: (nextPc == execData.nextAddrPred),
                       taken: cond,
                       dirPredInfo: execData.dirPredInfo,
                       naPredInfo: execData.naPredInfo,
                       nextAddr: nextPc}));
          // enqueue instruction to next stage

  21. Decode with mispredict detect
      • Determine if epoch of incoming instruction is on good path
          rule doDecode;
            let decData = newDecData(fr.first);
            let correctPath = (decData.execEpoch != deEpoch)   // new exec epoch
                           || (decData.decEpoch == ddEpoch);   // same dec epoch
            let instResp = decData.fInst.instResp;
            let pcPlus4 = decData.pc + 4;
            if (correctPath) begin
              decData.decInst = decode(instResp, pcPlus4);
              let target = knownTargetAddr(decData.decInst);
              let brClass = getBrClass(decData.decInst);
              let predTarget = decData.nextAddrPred;
              let predDir = decData.takenPred;

  22. Decode with mispredict detect
      • Calculate target as best as decode can
      • Wrong next addr?
      • New dec epoch
      • Tell exec addr of next instruction!
      • Send feedback
      • Enqueue to next stage on correct path
              let decodedTarget = case (brClass)
                NonBranch: pcPlus4;
                UncondKnown: target;
                CondBranch: (predDir ? target : pcPlus4);
                default: decData.nextAddrPred;
              endcase;
              if (decodedTarget != predTarget) begin
                decData.decEpoch = decData.decEpoch + 1;
                decData.nextAddrPred = decodedTarget;
                decFeedback.enq(tuple3(decData.execEpoch, decData.decEpoch,
                    Feedback{correct: False,
                             naPredInfo: decData.naPredInfo,
                             nextAddr: decodedTarget,
                             dirPredInfo: decData.dirPredInfo,
                             taken: decData.takenPred}));
              end
              dr.enq(decData);
            end // of correct path

  23. Decode with mispredict detect
      • Preserve current epoch if instruction on incorrect path
      • decData.*Epoch have been set properly, so we always save them
            else begin // incorrect path
              decData.decEpoch = ddEpoch;
              decData.execEpoch = deEpoch;
            end
            ddEpoch <= decData.decEpoch;
            deEpoch <= decData.execEpoch;
            fr.deq;
          endrule

  24. Handling redirect from execute
      • Train and repair on redirect
      • Just train on correct prediction
          if (execFeedback.notEmpty) begin
            match {.execEpoch, .fb} = execFeedback.first;
            execFeedback.deq;
            if (!fb.correct) begin
              dirPred.repair(fb.dirPredInfo, fb.taken);
              dirPred.train(fb.dirPredInfo, fb.taken);
              naPred.repair(fb.naPredInfo, fb.nextAddr);
              naPred.train(fb.naPredInfo, fb.nextAddr);
              feEpoch <= execEpoch;
              fetchPc <= fb.nextAddr;
            end else begin
              dirPred.train(fb.dirPredInfo, fb.taken);
              naPred.train(fb.naPredInfo, fb.nextAddr);
              enqInst;
            end
          end

  25. Handling redirect from decode
      • Just repair; never train on feedback from decode
          else if (decFeedback.notEmpty) begin
            decFeedback.deq;
            match {.execEpoch, .decEpoch, .fb} = decFeedback.first;
            if (execEpoch == feEpoch) begin
              if (!fb.correct) begin // epoch unchanged
                fdEpoch <= decEpoch;
                dirPred.repair(fb.dirPredInfo, fb.taken);
                naPred.repair(fb.naPredInfo, fb.nextAddr);
                fetchPc <= fb.nextAddr;
              end
              else // dec feedback on correct prediction
                enqInst;
            end
            else // dec feedback, but fetch is in a new exec epoch
              enqInst;
          end
          else // no feedback
            enqInst;
