
Perceptron Branch Prediction and Its Recent Developments

Mostly based on “Dynamic Branch Prediction with Perceptrons”

by Daniel A. Jiménez and Calvin Lin

Presented by Shugen Li

Introduction
  • With deeper pipelines and faster clock cycles, modern computer architectures increasingly rely on speculation to boost instruction-level parallelism.
  • Machine learning techniques offer the possibility of further improving performance by increasing prediction accuracy.
Introduction (cont’)
  • Figure 1. A conceptual system model for branch prediction

Adapted from I. K. Chen, J. T. Coffey, and T. N. Mudge, “Analysis of branch prediction via data compression”.

Introduction (cont’)
  • We can improve accuracy by replacing these traditional predictors with neural networks, which provide good predictive capabilities.
  • The perceptron is one of the simplest possible neural networks: it is easy to understand, simple to implement, and has several attractive properties.
Why perceptrons ?
  • The major benefit of perceptrons is that by examining their weights, i.e., the correlations that they learn, it is easy to understand the decisions that they make.
  • For many other neural networks it is difficult or impossible to determine exactly how the network is making its decision.
  • A perceptron’s decision-making process, by contrast, is easy to understand as the result of a simple mathematical formula.
Perceptron Model
  • Input xi: the bits of the global branch history shift register, each encoded as 1 (taken) or -1 (not taken).
  • w0…wn: the weights vector.
  • y: the output of the perceptron; y > 0 means the prediction is taken, otherwise not taken (see the sketch below).
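As a point of reference, here is a minimal C sketch of this model; HIST_LEN, the integer types, and the array layout are illustrative choices, not the paper’s tuned values:

```c
#define HIST_LEN 12  /* n: number of global history bits (illustrative) */

/* Perceptron output: y = w0 + sum_{i=1..n} xi * wi, where each xi is
 * +1 (taken) or -1 (not taken) and w[0] is the bias weight. */
int perceptron_output(const int w[HIST_LEN + 1], const int x[HIST_LEN])
{
    int y = w[0];
    for (int i = 0; i < HIST_LEN; i++)
        y += x[i] * w[i + 1];   /* dot product of history and weights */
    return y;                   /* y > 0 predicts taken */
}
```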
Perceptron Training
  • Let the branch outcome t be -1 if the branch was not taken, or 1 if it was taken, and let θ be the threshold, a parameter to the training algorithm used to decide when enough training has been done. The update rule is sketched below.

These two slides and figures are adapted from F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms.
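A sketch of this training rule, reusing the names from the previous sketch; weight saturation, which a hardware implementation needs, is omitted here:

```c
/* Train once the branch resolves: t = +1 if taken, -1 if not taken.
 * Update when the prediction was wrong or |y| <= theta (i.e., not
 * enough training yet); each weight moves toward the correlation t*xi. */
void perceptron_train(int w[HIST_LEN + 1], const int x[HIST_LEN],
                      int y, int t, int theta)
{
    int abs_y = y < 0 ? -y : y;
    if ((y > 0) != (t > 0) || abs_y <= theta) {
        w[0] += t;                  /* bias input is implicitly 1 */
        for (int i = 0; i < HIST_LEN; i++)
            w[i + 1] += t * x[i];   /* strengthen or weaken correlation */
    }
}
```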

Perceptron Limitations
  • A perceptron is only capable of learning linearly separable functions.
  • This means a perceptron can learn the logical AND of two inputs, but not the exclusive-OR (XOR), as the short derivation below shows.
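A short derivation of the XOR case, using the ±1 input encoding from the model slide:

```latex
% A perceptron computing XOR of x_1, x_2 \in \{-1,+1\} would need
% y = w_0 + w_1 x_1 + w_2 x_2 with y > 0 exactly when x_1 \neq x_2:
\begin{align*}
(x_1,x_2) = (-1,+1):&\quad w_0 - w_1 + w_2 > 0\\
(x_1,x_2) = (+1,-1):&\quad w_0 + w_1 - w_2 > 0\\
(x_1,x_2) = (-1,-1):&\quad w_0 - w_1 - w_2 < 0\\
(x_1,x_2) = (+1,+1):&\quad w_0 + w_1 + w_2 < 0
\end{align*}
% Adding the first two inequalities forces w_0 > 0; adding the last two
% forces w_0 < 0. The contradiction shows no weights realize XOR.
```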
Experimental result
  • Uses the SPEC2000 integer benchmarks, comparing against gshare and bi-mode.
  • Also compares against a hybrid gshare/perceptron predictor.
  • The perceptron’s advantage comes from its ability to make use of longer history lengths.
  • It does well when the branch being predicted exhibits linearly separable behavior.
Implementation
  • Computing the perceptron output:
    • There is no need to compute a full dot product.
    • Instead, simply add the weight when the input bit is 1 and subtract it (add its two’s complement) when the input bit is -1.
    • This is similar to the work performed by multiplication circuits, which must find the sum of partial products that are each a function of an integer and a single bit.
  • Furthermore, only the sign bit of the result is needed to make a prediction, so the other bits of the output can be computed more slowly without delaying the prediction (see the sketch below).
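A sketch of the same prediction in this multiplication-free form, with the history kept as a packed bit vector (widths and types simplified):

```c
/* Each "multiply" by xi in {-1,+1} becomes adding wi or its two's
 * complement; only the sign of the sum is needed for the prediction. */
int perceptron_predict_fast(const int w[HIST_LEN + 1], unsigned history)
{
    int y = w[0];
    for (int i = 0; i < HIST_LEN; i++)
        y += ((history >> i) & 1) ? w[i + 1] : -w[i + 1];
    return y > 0;   /* in hardware, just the sign bit of y */
}
```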
Limitations
  • Delay: latency remains high even with the simplified computation.
  • Low accuracy on branches that are not linearly separable.
  • Aliasing and hardware cost.
Recent development (1): Low-power Perceptrons (selective weights), by Kaveh Aasaraai and Amirali Baniasadi
  • Non-Effective (NE): weights whose sign is opposite to the sign of the dot-product value. We refer to the summation of the NE weights as NE-SUM.
  • Semi-Effective (SE): weights having the sign of the dot-product value, but with an absolute value less than NE-SUM.
  • Highly-Effective (HE): weights having the same sign as the dot-product value and an absolute value greater than NE-SUM. (A classification sketch follows this list.)
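A hedged sketch of this classification in code; the names (wclass, cls) and the exact comparison details are mine, and the paper’s selection logic may differ:

```c
enum wclass { NE, SE, HE };

/* Classify each contribution c[i] = xi * wi against the sign of the
 * dot-product value y; ne_sum first collects the magnitudes of the
 * opposing (NE) contributions. */
void classify_weights(const int c[], int n, int y, enum wclass cls[])
{
    int ne_sum = 0;
    for (int i = 0; i < n; i++)
        if ((c[i] >= 0) != (y >= 0))
            ne_sum += c[i] >= 0 ? c[i] : -c[i];   /* |c[i]| */
    for (int i = 0; i < n; i++) {
        int mag = c[i] >= 0 ? c[i] : -c[i];
        if ((c[i] >= 0) != (y >= 0))
            cls[i] = NE;
        else
            cls[i] = (mag > ne_sum) ? HE : SE;
    }
}
```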
Recent development (2): The Combined Perceptron Branch Predictor, by Matteo Monchiero and Gianluca Palermo
  • The predictor consists of two concurrent perceptron-like neural networks: one uses branch history information as its inputs, the other program counter bits (a loose sketch follows).
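A loose sketch of this organization; the slide does not say how the two outputs are combined, so summing them before taking the sign is only an assumption here (it reuses perceptron_output from the earlier sketch):

```c
/* Two concurrent perceptron-like networks: one fed global-history bits,
 * one fed PC bits. Combining by summing the two outputs is an assumed
 * detail, not taken from the slide. */
int combined_predict(const int wh[HIST_LEN + 1], const int xh[HIST_LEN],
                     const int wp[HIST_LEN + 1], const int xp[HIST_LEN])
{
    int y = perceptron_output(wh, xh)    /* history-driven network */
          + perceptron_output(wp, xp);   /* PC-bit-driven network */
    return y > 0;
}
```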
Recent development (3): Path-based neural prediction, by Daniel A. Jiménez
  • In an N-branch path-based neural predictor, the prediction for a branch is initiated N branches ahead, and the predictions for the next N branches are computed in parallel.
  • A row of N counters is read using the current instruction block address. On blocks featuring a branch, one of the counters read is added to each of the N partial sums (see the sketch below).
  • The prediction delay is thus the perceptron table read delay followed by a single multiply-add delay.
  • A caveat: this does not account for the table read delay itself, nor for the misprediction penalty.
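A rough sketch of the pipelined partial sums; DEPTH, the staging discipline, and the indexing are simplifications (the real predictor also checkpoints the sums so it can recover from mispredictions):

```c
#define DEPTH 8   /* N: a prediction is begun N blocks ahead (illustrative) */

static int partial[DEPTH];   /* partial[j] completes in j more blocks */

/* Per instruction block: read one row of counters with the block address,
 * fold one counter into each in-flight sum, and take the sign of the
 * oldest (now complete) sum as this block's prediction. */
int path_based_predict(const int row[DEPTH])
{
    int prediction = (partial[0] + row[0]) >= 0;   /* completed sum */
    for (int j = 0; j < DEPTH - 1; j++)
        partial[j] = partial[j + 1] + row[j + 1];  /* advance pipeline */
    partial[DEPTH - 1] = 0;                        /* start a fresh sum */
    return prediction;
}
```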
Recent development (4): Revisiting the perceptron predictor, by A. Seznec
  • The accuracy of perceptron predictors is further improved with the following extensions:
    • using pseudo-tags to reduce aliasing impact,
    • skewing perceptron weight tables to improve table utilization,
    • introducing redundant history to handle linearly inseparable data sets.
  • The nonlinear redundant history also leads to a more efficient representation of perceptron weights, Multiply-Add Contributions (MAC).
  • The cost is increased hardware complexity.
Recent development (5): the O-GEometric History Length (O-GEHL) branch predictor, by A. Seznec
  • The GEHL predictor features M distinct predictor tables Ti.
  • The predictor tables store predictions as signed saturated counters.
  • A single counter C(i) is read from each predictor table Ti (1 ≤ i ≤ M).
  • The prediction is computed as the sign of the sum S of the M counters C(i) (the first equation): S = Σ C(i), for i = 1…M.
  • The prediction is taken when S is positive or null and not taken when S is negative (see the sketch below).
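A minimal sketch of this computation (M and the counter width are illustrative; the per-table indexing hashes are omitted):

```c
#define M 8   /* number of predictor tables (illustrative) */

/* GEHL prediction: sum the M signed saturated counters, one per table,
 * and predict taken when the sum S is positive or null. */
int gehl_predict(const signed char c[M])
{
    int s = 0;
    for (int i = 0; i < M; i++)
        s += c[i];
    return s >= 0;
}
```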
Recent development (5), cont’: the O-GEometric History Length branch predictor, by A. Seznec
  • The history lengths used in the indexing functions for tables Ti form a geometric series (the second equation): L(i) = α^(i-1) × L(1), as sketched below.
  • The counters in each table T(i) are easy to train, much like the weights in the perceptron predictor.
  • The result is low hardware cost and better latency.
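A sketch of the geometric series behind the predictor’s name; α and L(1) below are illustrative, not the paper’s tuned values:

```c
#include <math.h>

/* History length used to index table i (1 <= i <= M):
 * L(i) = alpha^(i-1) * L(1), rounded. E.g. alpha = 2 and L(1) = 2
 * give lengths 2, 4, 8, 16, ... */
int history_length(int i, double alpha, int L1)
{
    return (int)(pow(alpha, i - 1) * L1 + 0.5);
}
```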
Conclusion
  • Perceptrons are attractive because they can use long history lengths without requiring exponential resources.
  • Their weakness is the increased computational complexity and the resulting latency and hardware cost.
  • As a new idea, perceptron prediction can be combined with traditional methods to obtain better performance.
  • Several methods are being developed to reduce the latency and handle mispredictions.
  • This technology will become more practical as hardware costs continue to fall.
  • There is still room for further development.
References
  • [1] D. Jiménez and C. Lin, “Dynamic branch prediction with perceptrons”, Proc. of the 7th Int. Symp. on High Performance Computer Architecture (HPCA-7), 2001.
  • [2] D. Jiménez and C. Lin, “Neural methods for dynamic branch prediction”, ACM Trans. on Computer Systems, 2002.
  • [3] A. Seznec, “Revisiting the perceptron predictor”, Technical Report, IRISA, 2004.
  • [4] A. Seznec, “An optimized 2bcgskew branch predictor”, Technical Report, IRISA, Sep. 2003.
  • [5] G. Loh, “The frankenpredictor”, The 1st JILP Championship Branch Prediction Competition (CBP-1), 2004.
  • [6] K. Aasaraai and A. Baniasadi, “Low-power Perceptrons”.
  • [7] A. Seznec, “The O-GEometric History Length branch predictor”.
  • [8] M. Monchiero and G. Palermo, “The Combined Perceptron Branch Predictor”.
  • [9] F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms, Spartan, 1962.

Thank You!

Questions?
