Branch Prediction using Advanced Neural Methods


Presentation Transcript


  1. Branch Prediction using Advanced Neural Methods Sunghoon Kim CS252 Project

  2. Introduction • Dynamic branch prediction • No doubt about its importance to speculation performance • Given a history of branch behavior, predict branch behavior at the next step • Common solutions: gshare, bi-mode, hybrid… • Can saturating counters be replaced with neural methods?

  3. Neural methods • Capable of classification (predicting into which of a set of classes a particular instance falls) • Learn correlations between inputs and outputs, and generalize that learning to other inputs • Potential to solve the problems of most two-level predictors

  4. Simulation Models – Gshare • 20-bit global history shift register • Per-address history table (PHT) with 2-bit saturating counters • [Diagram: the GHR and BranchPC>>2 together index the PHT; the counter gives the prediction and is updated afterward]
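A minimal Python sketch of this gshare model, assuming the standard XOR of history and PC bits for the index; the counter initialization is illustrative:

```python
# Minimal gshare sketch: a 20-bit global history register XORed with the
# branch PC indexes a table of 2-bit saturating counters.
HIST_BITS = 20
TABLE_SIZE = 1 << HIST_BITS

ghr = 0                               # global history shift register
pht = [1] * TABLE_SIZE                # 2-bit counters, start weakly not-taken

def predict(branch_pc):
    idx = ((branch_pc >> 2) ^ ghr) & (TABLE_SIZE - 1)
    return pht[idx] >= 2              # counter >= 2 means predict taken

def update(branch_pc, taken):
    global ghr
    idx = ((branch_pc >> 2) ^ ghr) & (TABLE_SIZE - 1)
    if taken:
        pht[idx] = min(3, pht[idx] + 1)   # saturate at 3 (strongly taken)
    else:
        pht[idx] = max(0, pht[idx] - 1)   # saturate at 0 (strongly not taken)
    ghr = ((ghr << 1) | int(taken)) & (TABLE_SIZE - 1)
```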

  5. Simulation Models – Perceptron • 14-bit global history shift register • Per-address history table with 8-bit weights and a bias • Indexed gshare-style or by BranchPC alone • [Diagram: BranchPC>>2 (optionally hashed with the GHR) indexes the PHT; predict, then train the weights & bias]
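A sketch of the perceptron model, indexing by BranchPC alone for simplicity; the table size and the training threshold (the usual margin rule from the perceptron-predictor literature) are assumptions:

```python
# Perceptron-predictor sketch: each table entry holds 8-bit signed weights
# (one per history bit) plus a bias; the dot product with the bipolar
# history decides the prediction.
HIST_BITS = 14
TABLE_SIZE = 1024                     # assumed table size
THETA = int(1.93 * HIST_BITS + 14)    # assumed training threshold

ghr = [1] * HIST_BITS                 # bipolar history: +1 taken, -1 not taken
table = [[0] * (HIST_BITS + 1) for _ in range(TABLE_SIZE)]  # [bias, w1..wn]

def clamp8(w):                        # keep weights in the signed 8-bit range
    return max(-128, min(127, w))

def output(branch_pc):
    entry = table[(branch_pc >> 2) % TABLE_SIZE]
    return entry[0] + sum(w * x for w, x in zip(entry[1:], ghr))

def predict(branch_pc):
    return output(branch_pc) >= 0

def update(branch_pc, taken):
    y = output(branch_pc)
    t = 1 if taken else -1
    entry = table[(branch_pc >> 2) % TABLE_SIZE]
    if (y >= 0) != taken or abs(y) <= THETA:   # train on mispredict or low margin
        entry[0] = clamp8(entry[0] + t)
        for i in range(HIST_BITS):
            entry[i + 1] = clamp8(entry[i + 1] + t * ghr[i])
    ghr.pop(0)
    ghr.append(t)
```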

  6. Simulation Models – Backpropagation • 10-bit GHR • Sigmoid transfer function • Floating-point computation • Floating-point weights and biases • One hidden layer of 20 neurons • [Diagram: BranchPC>>2 (optionally hashed with the GHR) indexes the PHT; predict, then train the weights & biases]
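A sketch of the backpropagation model for a single table entry, matching the slide's 10-bit GHR and 20 sigmoid hidden neurons; the learning rate and weight initialization are assumptions:

```python
import math, random

# One-hidden-layer sigmoid network trained with backpropagation.
HIST_BITS, HIDDEN, LR = 10, 20, 0.1   # LR is an assumed learning rate
random.seed(0)

w1 = [[random.uniform(-0.5, 0.5) for _ in range(HIST_BITS)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
b2 = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(hist):                    # hist: list of 0/1 history bits
    h = [sigmoid(sum(w * x for w, x in zip(row, hist)) + b)
         for row, b in zip(w1, b1)]
    y = sigmoid(sum(w * a for w, a in zip(w2, h)) + b2)
    return h, y

def predict(hist):
    return forward(hist)[1] >= 0.5

def train(hist, taken):
    global b2
    h, y = forward(hist)
    t = 1.0 if taken else 0.0
    dy = (y - t) * y * (1.0 - y)      # output delta (sigmoid derivative)
    for j in range(HIDDEN):
        dh = dy * w2[j] * h[j] * (1.0 - h[j])   # hidden delta, pre-update w2
        w2[j] -= LR * dy * h[j]
        b1[j] -= LR * dh
        for i in range(HIST_BITS):
            w1[j][i] -= LR * dh * hist[i]
    b2 -= LR * dy
```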

  7. Simulation Models – Radial Basis Networks • Transfer function for a radial basis neuron: exp(-n^2) • Distance function between an input vector and a weight vector • [Diagram: same indexing, prediction, and training flow as above]
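A sketch of the radial basis model: each hidden neuron fires exp(-n^2), where n is the distance between the history vector and that neuron's weight vector. The neuron count, center initialization, and the LMS rule on the output layer are assumptions:

```python
import math, random

# Radial-basis sketch: Gaussian hidden layer, linear output layer.
HIST_BITS, NEURONS, LR = 10, 20, 0.1  # assumed sizes and learning rate
random.seed(0)

centers = [[random.choice([0.0, 1.0]) for _ in range(HIST_BITS)]
           for _ in range(NEURONS)]
w_out = [0.0] * NEURONS

def radial(hist, center):
    n2 = sum((x - c) ** 2 for x, c in zip(hist, center))  # squared distance
    return math.exp(-n2)              # the slide's exp(-n^2) transfer function

def forward(hist):
    acts = [radial(hist, c) for c in centers]
    return acts, sum(w * a for w, a in zip(w_out, acts))

def predict(hist):
    return forward(hist)[1] >= 0.5

def train(hist, taken):
    acts, y = forward(hist)
    err = (1.0 if taken else 0.0) - y
    for j in range(NEURONS):          # LMS update on the linear output layer
        w_out[j] += LR * err * acts[j]
```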

  8. Simulation Models – Elman Networks • Feedback from the hidden layer outputs to the first layer • [Diagram: same indexing, prediction, and training flow as above]
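A sketch of the Elman idea: the previous step's hidden activations are fed back as context inputs alongside the history bits, giving the network a recurrent memory. Sizes and learning rate are assumptions, and for brevity only the output layer is trained here (full backpropagation through the recurrent weights is omitted):

```python
import math, random

# Elman-network sketch: hidden outputs loop back as extra inputs.
HIST_BITS, HIDDEN, LR = 10, 20, 0.1   # assumed sizes and learning rate
random.seed(0)

IN = HIST_BITS + HIDDEN               # inputs = history bits + context units
w1 = [[random.uniform(-0.5, 0.5) for _ in range(IN)] for _ in range(HIDDEN)]
w2 = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN)]
context = [0.0] * HIDDEN              # hidden outputs from the previous step

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(hist):
    global context
    x = list(hist) + context          # concatenate history with context
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    context = h                       # feed hidden outputs back next step
    return h, sigmoid(sum(w * a for w, a in zip(w2, h)))

def predict_and_train(hist, taken):
    h, y = forward(hist)
    err = (1.0 if taken else 0.0) - y
    for j in range(HIDDEN):           # delta rule on the output layer only
        w2[j] += LR * err * y * (1.0 - y) * h[j]
    return y >= 0.5
```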

  9. Simulation Models – Learning Vector Quantization Networks • Distance function as in the radial basis networks, but without biases • Competitive function outputs one for the winning input (largest value) and zero for the others • [Diagram: same indexing, prediction, and training flow as above]
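A sketch of the LVQ model: labeled codebook vectors compete, the nearest one wins (output one, all others zero), and its label is the prediction. The codebook size, label assignment, and learning rate are assumptions:

```python
import random

# LVQ sketch: winner-take-all over taken/not-taken codebook vectors.
HIST_BITS, CODEBOOK, LR = 10, 20, 0.1  # assumed sizes and learning rate
random.seed(0)

vectors = [[random.random() for _ in range(HIST_BITS)] for _ in range(CODEBOOK)]
labels = [i % 2 == 0 for i in range(CODEBOOK)]  # alternate taken/not-taken

def winner(hist):                      # competitive layer: closest vector wins
    def dist2(v):
        return sum((x - c) ** 2 for x, c in zip(hist, v))
    return min(range(CODEBOOK), key=lambda i: dist2(vectors[i]))

def predict(hist):
    return labels[winner(hist)]

def train(hist, taken):
    i = winner(hist)
    sign = 1.0 if labels[i] == taken else -1.0  # attract if correct, repel if not
    for k in range(HIST_BITS):
        vectors[i][k] += sign * LR * (hist[k] - vectors[i][k])
```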

  10. Simulation Environment • SimpleScalar toolset • A subset of the SPEC2000 benchmarks • Execute 100,000,000 instructions and dump the conditional branch histories • 5,000 branch instructions are used for training • All PHT hardware budgets are made the same • Floating-point values cost 4 bytes each
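To make the equal-budget constraint concrete, here is the arithmetic in a short sketch; the table shapes follow the earlier slides, but treating them as the project's exact layouts is an assumption:

```python
# Illustrative equal-budget arithmetic: every predictor's tables must fit
# in the same byte budget, with floating-point values costing 4 bytes each.
BUDGET = (1 << 20) * 2 // 8           # gshare: 2^20 two-bit counters = 256 KiB

perceptron_entry = (14 + 1) * 1       # 14 8-bit weights + 1 bias, 1 byte each
backprop_entry = (10 * 20 + 20 + 20 + 1) * 4  # 4-byte float weights + biases

print(BUDGET // perceptron_entry)     # perceptron entries that fit the budget
print(BUDGET // backprop_entry)       # backprop entries that fit the budget
```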

  11. Results

  12. Hardware constraints • Predictors must deliver a prediction within one (or a few) cycles • Gshare: easy to achieve • Perceptron: needs only integer adders, a feasible alternative; more accurate with more layers • Other advanced neural nets: hard to implement, require floating-point functional units

  13. Future Work • Replace floating-point weights and biases with scaled integer ones? • Replace the floating-point transfer function with an approximately equivalent integer function using a Taylor series? • Without budget constraints, what is the best achievable performance of the advanced neural network methods? • Review the code carefully for mistakes
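One way the first two items might look, as a hedged sketch: weights scaled to integers by 2^SHIFT, and the sigmoid replaced by its truncated Taylor series sigmoid(x) ≈ 1/2 + x/4 - x^3/48 (accurate near zero). The scale factor and truncation point are assumptions:

```python
import math

# Fixed-point sigmoid sketch: all arithmetic is integer; values are
# scaled by 2^SHIFT. Scale and truncation order are assumptions.
SHIFT = 8
SCALE = 1 << SHIFT

def to_fixed(x):                      # float -> scaled integer
    return int(round(x * SCALE))

def fixed_sigmoid(x_fixed):
    x3 = (x_fixed * x_fixed // SCALE) * x_fixed // SCALE   # x^3 in fixed point
    y = SCALE // 2 + x_fixed // 4 - x3 // 48               # truncated Taylor series
    return min(SCALE, max(0, y))      # clamp to [0, 1] in fixed point

# Example: compare against the floating-point sigmoid at x = 0.5
print(fixed_sigmoid(to_fixed(0.5)) / SCALE)   # ~0.625
print(1.0 / (1.0 + math.exp(-0.5)))           # ~0.622
```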

  14. Conclusions • There is little benefit to advanced neural networks at the same budget as gshare; they are sometimes worse • Among the neural methods, Elman networks perform best • Hard to implement in hardware unless floating-point computation is cheap • Neural networks can be viable alternative predictors if well designed
