Module 3: Branch Prediction

Control Dependencies
[Figure: dependency taxonomy. Labels: Dependencies; Data, Name, Control, Structural; Anti, Output]
• Control dependencies determine the execution order of instructions
• Instructions are control dependent on a branch instruction
• Why do we preserve control dependencies?
  • Correctness
  • Exception behavior and dataflow
• Goal: maximize utilization of instruction fetch bandwidth → branch prediction
• How do we improve prediction accuracy and reduce penalties?

Impact of Branches
[Figure: pipeline diagram. Labels: IF, ID, EX INT, EX FP, EX BR, MEM, WB; instruction issue]
• In general pipelines, branch penalties occur for two reasons
  • Branch target address generation
    • PC-relative address generation “can” occur after instruction fetch
  • Branch condition resolution
    • Unconditional branches do not incur this penalty
    • In what cycle is the condition known?
      • ID? → testing the contents of a register, as in BNEZ R1, Loop
      • EX? → testing equality, as in BNE R1, R2, Loop

Branch Prediction
• Dominated by history-based predictors: is past behavior a good indicator of future behavior?
• Design issues
  • How is history maintained?
  • How are decisions made based on this history?
• Significant analysis of benchmark behavior is used in the design of predictors

Dynamic Branch Prediction Strategies
[Figure: general prediction model. Labels: shift register, bits n-1 … 0; last branch behavior, i.e., taken or not taken; "How do we capture this history?"; "How do we predict?"; prediction]
• Use the history of behavior, taken vs. not-taken, of a single branch
• Predict the next decision that will be taken by this branch, i.e., taken vs. not-taken
• A general model (shown above) captures history and uses it to make predictions
From Ref: “Modern Processor Design: Fundamentals of Superscalar Processors,” J. Shen and M. Lipasti
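
The history-capture half of the general model can be sketched as a small shift register. This is only an illustration: the class name, the choice of bit 0 for the most recent outcome, and the "repeat the last outcome" decision rule are assumptions, not something the slides specify.

```python
class HistoryRegister:
    """n-bit shift register holding the most recent outcomes of one branch."""

    def __init__(self, n_bits: int):
        self.mask = (1 << n_bits) - 1
        self.bits = 0  # assumption: bit 0 holds the most recent outcome

    def record(self, taken: bool) -> None:
        # Shift older outcomes toward bit n-1 and insert the new one at bit 0.
        self.bits = ((self.bits << 1) | int(taken)) & self.mask

    def predict_last_outcome(self) -> bool:
        # Simplest possible decision function: repeat the last outcome.
        return bool(self.bits & 1)


history = HistoryRegister(n_bits=4)
for outcome in [True, True, False, True]:  # hypothetical branch trace
    history.record(outcome)
print(format(history.bits, "04b"), history.predict_last_outcome())
```

Any decision function over the stored bits could replace `predict_last_outcome`; the register itself only answers the "how do we capture this history?" question.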

Predicting the Outcome of a Single Branch
• n-bit predictors
  • Prediction bit addressed by the k LSBs of the address of the branch instruction
  • Prediction bit set by an n-bit history: 2-bit is the most common
  • Useful when the branch address is known before the branch condition is known, so as to support pre-fetching
• Performance parameters: prediction accuracy, penalties, branch frequency
• Example: how does this work in the pipeline? Impact on CPI.
[Figure: 1-bit predictor indexed using the address LSBs; change to a 2-bit predictor; generalize to an n-bit predictor]
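
As a rough sketch of the n-bit scheme, the table below keeps one saturating counter per entry and indexes it with k low-order bits of the branch address. The class and function names, the all-zero initial state, and the 4-byte instruction alignment are assumptions made for the example.

```python
class NBitPredictor:
    """Table of n-bit saturating counters indexed by k low-order PC bits."""

    def __init__(self, k: int, n: int = 2):
        self.index_mask = (1 << k) - 1
        self.max_count = (1 << n) - 1   # e.g. 3 for a 2-bit counter
        self.threshold = 1 << (n - 1)   # upper half of the range predicts taken
        self.table = [0] * (1 << k)     # assumption: counters start at "not taken"

    def _index(self, pc: int) -> int:
        # Assumption: 4-byte instructions, so the two lowest PC bits are dropped.
        return (pc >> 2) & self.index_mask

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.table[i] = min(self.table[i] + 1, self.max_count)
        else:
            self.table[i] = max(self.table[i] - 1, 0)


def count_mispredictions(predictor, pc, outcomes):
    """Predict, compare, then train on each outcome; return the miss count."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses


# Hypothetical loop branch: taken 9 times, then not taken; the loop runs 3 times.
loop_outcomes = ([True] * 9 + [False]) * 3
one_bit_misses = count_mispredictions(NBitPredictor(k=4, n=1), 0x40, loop_outcomes)
two_bit_misses = count_mispredictions(NBitPredictor(k=4, n=2), 0x40, loop_outcomes)
print(one_bit_misses, two_bit_misses)
```

On this trace the 1-bit predictor misses twice per loop execution (first iteration and loop exit), whereas the 2-bit predictor, once warmed up, misses only the loop exit.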

Correlating Predictions Across Multiple Branches
• Instead of having a predictor for a single branch, have a predictor for the most recent history of branch decisions
• For each branch history sequence, use an n-bit predictor
[Figure: correlating across two successive branches]
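
One way to picture correlation is a table in which each branch entry holds one n-bit counter per global history pattern. The sketch below assumes m bits of global history, 2-bit counters by default, 4-byte instructions, and a made-up pair of branches where the second always follows the first; none of these specifics come from the slides.

```python
class CorrelatingPredictor:
    """Per-branch rows of n-bit counters, selected by m bits of global history."""

    def __init__(self, k: int, m: int, n: int = 2):
        self.index_mask = (1 << k) - 1
        self.history_mask = (1 << m) - 1
        self.max_count = (1 << n) - 1
        self.threshold = 1 << (n - 1)
        self.history = 0  # outcomes of the m most recent branches (any branch)
        self.table = [[0] * (1 << m) for _ in range(1 << k)]

    def _row(self, pc: int):
        # Assumption: 4-byte instructions, so the two lowest PC bits are dropped.
        return self.table[(pc >> 2) & self.index_mask]

    def predict(self, pc: int) -> bool:
        return self._row(pc)[self.history] >= self.threshold

    def update(self, pc: int, taken: bool) -> None:
        row = self._row(pc)
        count = row[self.history]
        row[self.history] = min(count + 1, self.max_count) if taken else max(count - 1, 0)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


def second_branch_misses(m: int) -> int:
    """Count misses on a hypothetical branch b2 that always matches b1."""
    predictor = CorrelatingPredictor(k=4, m=m)
    misses = 0
    for i in range(16):
        first = (i % 2 == 0)            # b1 alternates taken / not taken
        predictor.update(0x40, first)   # b1 resolves and enters the history
        if predictor.predict(0x44) != first:
            misses += 1
        predictor.update(0x44, first)   # b2 has the same outcome as b1
    return misses


print(second_branch_misses(m=0), second_branch_misses(m=1))
```

With m=0 the structure degenerates to a plain per-branch counter, which cannot track the alternating second branch; one bit of global history lets the counter selected by "previous branch taken" learn the pattern.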

Performance Comparison
• Size and resolution of predictors are established empirically

Multi-level Predictors
• Use multiple predictors and choose between them
• Employ predictors based on local and global information → state of the art
• Adaptive, multi-level predictors
• Substantial work throughout the 1990s, starting with the seminal work of Yeh & Patt (1992)
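
Choosing between a local and a global predictor can be sketched with a per-branch chooser counter that drifts toward whichever component was right when the two disagree. This is a simplified illustration, not the specific design from the literature the slide cites: table sizes, 2-bit counters everywhere, the chooser policy, and all names are assumptions.

```python
def bump(counter: int, up: bool, max_count: int = 3) -> int:
    """Saturating increment/decrement of a 2-bit counter."""
    return min(counter + 1, max_count) if up else max(counter - 1, 0)


class TournamentPredictor:
    def __init__(self, k: int = 4, m: int = 2):
        self.index_mask = (1 << k) - 1
        self.history_mask = (1 << m) - 1
        self.local = [0] * (1 << k)         # counters indexed by branch address
        self.global_table = [0] * (1 << m)  # counters indexed by global history
        self.chooser = [0] * (1 << k)       # 0-1: trust local, 2-3: trust global
        self.history = 0

    def _components(self, pc: int):
        i = (pc >> 2) & self.index_mask     # assumption: 4-byte instructions
        return i, self.local[i] >= 2, self.global_table[self.history] >= 2

    def predict(self, pc: int) -> bool:
        i, local_pred, global_pred = self._components(pc)
        return global_pred if self.chooser[i] >= 2 else local_pred

    def update(self, pc: int, taken: bool) -> None:
        i, local_pred, global_pred = self._components(pc)
        if local_pred != global_pred:
            # Train the chooser only when the components disagree.
            self.chooser[i] = bump(self.chooser[i], global_pred == taken)
        self.local[i] = bump(self.local[i], taken)
        self.global_table[self.history] = bump(self.global_table[self.history], taken)
        self.history = ((self.history << 1) | int(taken)) & self.history_mask


predictor = TournamentPredictor()
misses = 0
for step in range(40):
    taken = (step % 2 == 0)  # hypothetical branch alternating taken / not taken
    if predictor.predict(0x80) != taken:
        misses += 1
    predictor.update(0x80, taken)
print(misses, predictor.chooser[0])
```

For this alternating trace the local counter never leaves the not-taken half, so all the misses occur during warm-up, before the chooser switches the branch over to the history-indexed component.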

Misprediction Recovery
• What actions must be taken on a misprediction?
  • Remove “predicted” instructions
  • Start fetching from the correct branch target(s)
• What information is necessary to recover from a misprediction?
  • Address information for the non-predicted branch target address
  • Identification of those instructions that are “predicted”
    • To be invalidated and prevented from completion
  • Association between “predicted” instructions and a specific branch
    • When that branch is mispredicted, only those instructions must be squashed
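
The bookkeeping listed above can be sketched by tagging every fetched instruction with the unresolved branches it was fetched under and saving each branch's non-predicted target. The data structures and names here are hypothetical; real hardware would use tag bits in the instruction window rather than Python sets.

```python
from dataclasses import dataclass


@dataclass
class InFlight:
    name: str
    branch_tags: frozenset  # unresolved branches this instruction was fetched under
    valid: bool = True


class SpeculationTracker:
    def __init__(self):
        self.window = []    # in-flight instructions, oldest first
        self.pending = {}   # unresolved branch name -> non-predicted target address

    def fetch(self, name: str) -> None:
        self.window.append(InFlight(name, frozenset(self.pending)))

    def fetch_branch(self, name: str, alternate_pc: int) -> None:
        self.fetch(name)                    # the branch depends on older branches
        self.pending[name] = alternate_pc   # saved for possible recovery

    def resolve(self, name: str, mispredicted: bool):
        """Return the address to refetch from, or None if the prediction held."""
        alternate_pc = self.pending.pop(name)
        if not mispredicted:
            return None
        for ins in self.window:
            if name in ins.branch_tags:
                ins.valid = False           # squash: prevented from completing
        squashed = {ins.name for ins in self.window if not ins.valid}
        # Younger branches that were squashed no longer need recovery state.
        self.pending = {b: pc for b, pc in self.pending.items() if b not in squashed}
        return alternate_pc


tracker = SpeculationTracker()
tracker.fetch("add")
tracker.fetch_branch("b1", alternate_pc=0x200)
tracker.fetch("mul")
tracker.fetch_branch("b2", alternate_pc=0x300)
tracker.fetch("sub")
redirect = tracker.resolve("b2", mispredicted=True)
surviving = [ins.name for ins in tracker.window if ins.valid]
print(hex(redirect), surviving)
```

Only `sub` carries the `b2` tag, so it is the only instruction squashed; `mul` was fetched under `b1` alone and survives until `b1` itself resolves.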

Branch Target Buffers
• Store the branch instruction address (PC) and the corresponding target address in a small associative cache
• Miss on the first access to a branch instruction
• Access in parallel with the instruction cache
• Hit produces the branch target address
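
A branch target buffer can be modeled as a tiny cache from branch PC to target address. The sketch assumes a fully associative organization with LRU replacement and a 4-entry capacity; the slides only say "small associative cache", so those details and the method names are illustrative.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Fully associative PC -> target cache with LRU replacement (assumed policy)."""

    def __init__(self, entries: int = 4):
        self.entries = entries
        self.store = OrderedDict()

    def lookup(self, pc: int):
        """Return the cached target on a hit, or None on a miss."""
        if pc not in self.store:
            return None
        self.store.move_to_end(pc)          # mark as most recently used
        return self.store[pc]

    def install(self, pc: int, target: int) -> None:
        """Record a resolved taken branch, evicting the LRU entry if full."""
        self.store[pc] = target
        self.store.move_to_end(pc)
        if len(self.store) > self.entries:
            self.store.popitem(last=False)


btb = BranchTargetBuffer(entries=4)
first_access = btb.lookup(0x1000)           # first access to a branch: miss
btb.install(0x1000, 0x2000)                 # filled in once the branch resolves
second_access = btb.lookup(0x1000)          # later fetch of the same PC: hit
print(first_access, hex(second_access))
```

In a pipeline the lookup would happen in the fetch stage alongside the instruction cache access, so a hit supplies the next fetch address before the instruction is even decoded.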

Branch Target Buffers: Operation
• Couple speculative generation of the branch target address with branch prediction
  • Continue to fetch and resolve the branch condition
  • Take appropriate action if wrong
• Any of the preceding history-based techniques can be used for branch condition speculation
• Example: impact on CPI
• Store prediction information, e.g., n-bit predictors, along with the BTB entry
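
The CPI impact can be estimated with a first-order model built from the performance parameters named earlier: branch frequency, prediction accuracy, and penalty. The function below is a simplified sketch that charges one fixed penalty per misprediction and ignores BTB misses on correctly predicted branches; the numbers are hypothetical, not measurements.

```python
def effective_cpi(base_cpi: float, branch_frequency: float,
                  prediction_accuracy: float, misprediction_penalty: float) -> float:
    """First-order model: every mispredicted branch adds a fixed stall penalty."""
    mispredict_rate = 1.0 - prediction_accuracy
    return base_cpi + branch_frequency * mispredict_rate * misprediction_penalty


# Hypothetical workload: 20% branches, 90% accuracy, 3-cycle penalty.
cpi = effective_cpi(base_cpi=1.0, branch_frequency=0.20,
                    prediction_accuracy=0.90, misprediction_penalty=3)
print(round(cpi, 3))
```

Under these assumed values the branch stalls add 0.2 × 0.1 × 3 = 0.06 cycles per instruction; raising accuracy or shortening the penalty shrinks that term proportionally.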

Some Other Techniques
• Static prediction techniques
  • Opcode-based: offline frequency analysis guides prediction
  • Static prediction recorded in the branch instruction
    • Off-line prediction (Motorola 8810)
  • Offset-based prediction → a negative target address offset triggers a branch-taken prediction
    • Motivated by the behavior of loops (IBM RS 6000)
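
Offset-based static prediction reduces to a sign test on the branch displacement. A minimal sketch, with a hypothetical function name and made-up offsets:

```python
def predict_taken_from_offset(target_offset: int) -> bool:
    """Predict taken for backward branches (negative offset), not taken otherwise."""
    return target_offset < 0


loop_back_edge = predict_taken_from_offset(-24)   # branch back to the loop top
forward_skip = predict_taken_from_offset(16)      # branch over a forward block
print(loop_back_edge, forward_skip)
```

A loop's closing branch jumps backward and is taken on every iteration except the last, which is why the sign of the offset is a useful hint without any run-time history.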

Concluding Remarks
• A key challenge in keeping the execution core fed is handling control flow
• Prediction and recovery mechanisms are key to keeping the pipeline active
• Superscalar datapaths increase the pressure for better, more innovative techniques to keep pace with the technology-enabled appetite for instruction-level parallelism
• What next?
