Speculative instruction validation for performance-reliability trade-off

Sumeet Kumar (skumar1@binghamton.edu) and Aneesh Aggarwal (aneesh@binghamton.edu), SUNY Binghamton, Binghamton, NY 13902.

Presentation Transcript


  1. Speculative instruction validation for performance-reliability trade-off. Sumeet Kumar, SUNY Binghamton, Binghamton, NY 13902, skumar1@binghamton.edu. Aneesh Aggarwal, SUNY Binghamton, Binghamton, NY 13902, aneesh@binghamton.edu. caps.cs.binghamton.edu

  2. What are Soft Errors? [Figure labels: Cosmic/Alpha Radiation; CLK; Latch; Logic; Latch; Soft Error; bit values 0001, 0000, 0000]

  3. Micro-architectural Techniques to Detect Soft Errors
     • Execute multiple copies of a program
     • Redundant Multi-Threading (RMT)
     • Probabilistic fault detection techniques
     • Errors are flagged if the program behavior is out of the ordinary (i.e., unpredictable)
     • Probabilistic techniques may raise many false alarms, e.g. when instructions do not have predictable behavior

  4. Simultaneous and Redundantly Threaded (SRT)
     • SRT is an implementation of RMT in an SMT environment
     • Two copies of a program run simultaneously on a single core
     • Slack is provided between the two copies for better performance
     • The thread running ahead is known as the Main thread; the one running behind is known as the Redundant thread
     • Provides complete fault coverage
     • Has a considerable performance impact (our experiments show a 25% performance impact)

  5. Schematic Diagram of SRT [Figure: pipeline stages Fetch, Decode, Rename, Writeback, Compare, Commit, with Fetch Buffer, Map Tables, Issue Queue, FU, ROB, Register File, Arch Register Files, LSQ, Data Cache, LVQ, and SVQ. Legend: M – Main Thread; R – Redundant Thread; LVQ – Load Value Queue; SVQ – Store Value Queue; ECC Protected]

  6. Performance-Reliability Trade-off in RMT
     • Reducing redundancy by reacting to processor state
     • Avoiding redundancy in high-IPC phases (PER-IRTR)
     • RMT toggling
     • Reducing redundancy by exploiting instruction properties
     • Instruction Reuse concept (DIE-IRB)
     • Removing backward slices of silent stores, dead values (SS-mod) or predictable stores (SlicK)

  7. SpecIV (Speculative Instruction Validation)

  8. Basic Idea
     • An instruction validator (similar to a data value predictor) is used to store the expected result values of the main thread instructions
     • Instructions producing values that match the stored value are known as successfully validated instructions
     • Successfully validated instructions are not redundantly executed
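The basic idea can be sketched in a few lines of Python. This is only an illustration, not the authors' hardware design: the direct-mapped organization, the PC-based indexing, the 4-byte instruction size, the update-on-every-result policy, and the names `InstructionValidator` and `needs_redundant_execution` are all assumptions made for the example.

```python
class InstructionValidator:
    """Hypothetical direct-mapped table of expected result values, indexed by PC."""

    def __init__(self, entries=4096):
        self.entries = entries
        self.table = [None] * entries  # None marks an untrained entry

    def _index(self, pc):
        # Assumes 4-byte instructions; real indexing may differ.
        return (pc >> 2) % self.entries

    def validate(self, pc, result):
        """Return True if the main thread's result matches the stored value."""
        idx = self._index(pc)
        matched = self.table[idx] == result
        self.table[idx] = result  # assumed policy: always train with the latest result
        return matched


def needs_redundant_execution(validator, pc, result):
    """A successfully validated instruction is not redundantly executed."""
    return not validator.validate(pc, result)
```

With this sketch, the first execution of an instruction always fails validation (the entry is untrained), and a repeat with the same result is validated and skipped by the redundant thread.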

  9. Schematic Diagram for SpecIV [Figure: pipeline stages Fetch, Decode, Rename, Compare, Commit, with Fetch Buffer, Issue Queue, Physical Register File, Arch Register Files, LVQ, SVQ, CVQ, Instruction Validator, OFB, and a Re-execute bit-vector (0 1 0). Legend: M – Main Thread; R – Redundant Thread; OFB – Operand Forward Buffer; CVQ – Commit Value Queue; Dependent on Non-executing redundant Instruction; Dependent on Executing redundant Instruction; Redundant Instruction dropped]

  10. Undetected Errors in SpecIV
     • Inst X: Correct Value 10, Erroneous Value 11, Validator Value 11: Undetected Error
     • Inst X: Correct Value 10, Erroneous Value 11, Validator Value ≠11: Error Detected
     • Inst X: Correct Value 10, Erroneous Value 11, Validator Value 10 11 (Erroneous Values): Undetected Error
     • Only interested in Single Event Upsets
     • Errors in OFB and CVQ will be detected, as they are used by the redundant thread only
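The first two cases on this slide can be expressed as a small decision function. The sketch below is an interpretation, not taken from the presentation: it assumes a single event upset corrupts only the main thread's result, and that a failed validation leads to redundant execution and a comparison that exposes the error. The name `classify` is hypothetical.

```python
def classify(correct, produced, validator_value):
    """Classify one main-thread result under the single-upset assumption."""
    if produced == correct:
        return "no error"
    if produced == validator_value:
        # The erroneous value happens to match the validator entry, so the
        # instruction is validated and never redundantly executed.
        return "undetected error"
    # Validation fails, the instruction is redundantly executed, and the
    # comparison against the redundant result exposes the error.
    return "error detected"
```

For the slide's numbers, `classify(10, 11, 11)` gives `"undetected error"`, and any validator value other than 11 gives `"error detected"`.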

  11. Fault Injection to Measure Vulnerability [Figure: pipeline stages Fetch, Decode, Rename, Writeback, Commit, with Issue Queue, FU, LSQ, ROB, Map Table, Register File, and Arch. Register File. Labels: Source Architectural Register; Source Physical Register; Operand Value; Result Value; Arch. Register File Decoder; Register File Decoder; Rename Table Decoder]

  12. Hardware Setup for Experimental Results
     • ROB – 164 Entries
     • Physical Register File – 128 Int / 128 Float
     • Fetch/Decode/Commit Width – 8
     • Issue Width – 5 Int / 3 Float
     • Issue Queue – 48 Int / 32 Float
     • Branch Predictor – Bimodal, 4K entries

  13. Performance Results for SpecIV (Instruction Validator Size – 4K Entries) [Chart: IPC]

  14. Instruction Redundancy Reduction; Reliability Results for SpecIV [Two charts, each with an "Average" label]

  15. Sensitivity to Validator Size [Charts: Performance Impact Reduction; Error Rates]

  16. Performance-Reliability Trade-Off Exploration with SpecIV [Diagram: Performance – Reliability Trade-Offs, split into Performance Trade-Off for Better Reliability (Low Performance Impact; High Performance Impact) and Reliability Trade-Off for Better Performance (Low Reliability Impact; High Reliability Impact). Techniques shown: Avoiding Redundancy for Producers of Successful Validations; Avoiding Low Confidence Validations; Multi-Value Validator; Result Width & Stride Width Validation; Partial Result Validation]

  17. Avoiding Low Confidence Validations (Low Performance Impact)
     • By stopping validations for entries with no stride, the total error rate falls from 0.45% to 0.23% with negligible performance impact
     • No additional hardware is required to implement this technique
     [Chart labels: Non-Control Instructions; Average]
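One way to picture this filter is a stride-based validator entry that refuses to validate until a stride has been seen to repeat. This is a sketch under a stated assumption: the slide does not define "entries with no stride", and here it is read as "entries whose stride has not yet been confirmed by two consecutive equal deltas". The names `StrideEntry` and `validate_with_confidence` are hypothetical.

```python
class StrideEntry:
    """Hypothetical validator entry tracking the last value and the last delta."""

    def __init__(self):
        self.last = None        # last result seen
        self.stride = None      # last observed delta between results
        self.confirmed = False  # True once the same delta is seen twice in a row


def validate_with_confidence(entry, result):
    """Validate only when the entry has a confirmed stride, then train the entry."""
    validated = entry.confirmed and result == entry.last + entry.stride
    if entry.last is not None:
        delta = result - entry.last
        entry.confirmed = (delta == entry.stride)
        entry.stride = delta
    entry.last = result
    return validated
```

Feeding the results 10, 14, 18, 22, 23 into one entry validates only the 22: the stride of 4 is confirmed by the third result, predicts the fourth, and is broken by the fifth. Every result that is not validated is left to normal redundant execution.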

  18. Avoiding Redundancy for Producers of Successfully Validated Instructions (Low Reliability Impact)
     [Figure labels: RBIT 0; Inst A, R3 op IMM, R1, 2 1, Validation Unsuccessful; Inst B, R1 op IMM, R30, 30 31, Validation Successful; Re-execute Bit Vector 1 0 0]
     • Redundant execution reduced by 69%
     • Performance impact reduction increases to 58%
     • Undetected error rate increases to 0.5%
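The figure on this slide is only partially recoverable, so the following is a loose sketch of the idea in the title rather than the authors' mechanism. It assumes Inst A writes R1 and Inst B reads it, that only direct producers are exempted, and that the re-execute decision is computed over a window of main-thread instructions in program order. The names `Inst` and `reexecute_bits` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inst:
    name: str
    dest: str        # destination register
    srcs: tuple      # source registers
    validated: bool  # outcome of validation in the main thread


def reexecute_bits(window):
    """Return one bit per instruction: True means redundantly execute it."""
    # Baseline SpecIV rule: re-execute whatever failed validation.
    bits = [not inst.validated for inst in window]
    last_writer = {}  # register -> index of its most recent producer in the window
    for i, inst in enumerate(window):
        if inst.validated:
            # Assumed extension: a validated consumer exempts its direct producers.
            for src in inst.srcs:
                producer = last_writer.get(src)
                if producer is not None:
                    bits[producer] = False
        last_writer[inst.dest] = i
    return bits
```

In this sketch, an unvalidated Inst A (R3 to R1) followed by a validated Inst B (R1 to R30) yields no redundant execution for either instruction, while an unrelated unvalidated instruction is still re-executed.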

  19. Conclusion
     • We propose SpecIV as an effective scheme to achieve a performance-reliability trade-off
     • SpecIV significantly reduces redundant execution, which substantially improves the performance of the SRT technique
     • SpecIV has a very small undetected error rate
     • We also explore the performance-reliability trade-off design space with schemes based on SpecIV, obtaining further performance as well as reliability gains

  20. Thank You. Sumeet Kumar, skumar1@binghamton.edu. Aneesh Aggarwal, aneesh@binghamton.edu.
