
Designing and Testing Fault-Tolerant Techniques for SRAM-based FPGAs

Paper by F.L. Kastensmidt, G. Neuberger, L. Carro, R. Reis. Talk by Nick Boyd.



  1. Designing and Testing Fault-Tolerant Techniques for SRAM-based FPGAs • Paper by F.L. Kastensmidt, G. Neuberger, L. Carro, R. Reis • Talk by Nick Boyd

  2. Overview of the paper • Exploring techniques for detecting and dealing with radiation-induced faults in FPGAs • Why? • Drive to use commercial off-the-shelf (COTS) parts to minimize cost and development time (for space applications) • As process technology shrinks, radiation becomes an issue even at ground level

  3. Background – Radiation and Transistors • Incident radiation deposits energy • Creation of electron-hole pairs and secondary ionizations produces a transient current pulse • Can change a ‘0’ to a ‘1’ or vice versa, often called a “bit flip” • In combinational logic: Single Event Transient (SET) • In sequential logic (or memory): Single Event Upset (SEU)

  4. Background – Transistor Faults in FPGAs • In FPGAs there are further considerations • SEU in the configuration SRAM (logic, routing) • SET in combinational FPGA fabric • SEU in BlockRAM

  5. Background – Transistor Faults in FPGAs • Effects of SEU in configuration fabric

  6. Techniques – TMR • TMR = Triple Modular Redundancy • Logic is triplicated and results are accepted by majority vote • Everything is tripled, including combinational logic, sequential logic, routing and I/O
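The majority vote described on this slide can be sketched in a few lines of Python. This is only an illustration of the voting idea, not the paper's hardware voter: the names `majority_vote` and `flip_bit`, and the sample word, are made up for the example, and the upset is modelled as a single inverted bit in one of the three replicas.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant module outputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of a word."""
    return word ^ (1 << bit)


golden = 0b10110010                 # output every replica should produce
upset = flip_bit(golden, 4)         # assume exactly one replica is hit
voted = majority_vote(golden, upset, golden)
```

Because the vote is taken bit by bit, a single corrupted replica is outvoted regardless of which bit was flipped; two replicas corrupted in the same bit would defeat the vote.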

  7. Techniques – TMR • Benefits • Able to detect and correct SEU/SET anywhere in the FPGA • No performance penalty • Drawbacks • Very large area/resource penalty (particularly problematic for I/O pads)

  8. Techniques – DMR-CED • A new technique proposed by the authors of this paper • DMR-CED: Double Modular Redundancy with Concurrent Error Detection • Motivation: Want to find a way that is as reliable as TMR in detecting/correcting errors with less area overhead

  9. Background: CED • CED = Concurrent Error Detection • Exploits some property of the logic block to find an error • Time-redundant examples: • bit-wise inversion • re-computing with shifted operands (RESO) • re-computing with swapped operands (RESWO)

  10. Background: CED • Result is calculated from the direct input and stored • Input is then encoded, a new result is calculated and decoded • The two outputs are compared – they should be equal
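As a rough illustration of the compute, encode, recompute, decode, compare sequence above, here is a minimal Python sketch of RESO applied to an adder. It assumes the fault is a stuck-at bit on the adder's output (a stand-in for a persistent configuration upset) and that the unit is wide enough to hold the shifted operands; `adder` and `reso_check` are hypothetical names, not taken from the paper.

```python
def adder(a: int, b: int, stuck_bit=None, stuck_value: int = 0) -> int:
    """Adder whose output bit `stuck_bit` is optionally forced to `stuck_value`."""
    total = a + b
    if stuck_bit is not None:
        if stuck_value:
            total |= 1 << stuck_bit       # stuck-at-1
        else:
            total &= ~(1 << stuck_bit)    # stuck-at-0
    return total


def reso_check(a: int, b: int, unit):
    """Run the unit on direct and shifted operands and compare the results."""
    direct = unit(a, b)                       # first pass: direct inputs
    recomputed = unit(a << 1, b << 1) >> 1    # encode (shift left), decode (shift right)
    return direct, recomputed, direct == recomputed


healthy = reso_check(100, 57, adder)
faulty = reso_check(100, 57, lambda a, b: adder(a, b, stuck_bit=2))
```

Shifting the operands moves each result bit onto a different physical output line, so a fault on one line corrupts different bits in the two passes and the comparison fails. A fault that happens not to change either result for a given input goes unnoticed, which is harmless for that input.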

  11. Back to DMR-CED • How can we use CED? • Only duplicate combinational logic • Use CED to determine the faulty module only if there is disagreement
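One way to read the bullets above is the following Python sketch: two copies of the logic run in parallel, and a CED check is consulted only when they disagree. The block diagram from the talk is not reproduced in this transcript, so the selection logic here is an assumption rather than the authors' circuit; it also assumes at most one copy is faulty and uses a RESO-style check on an adder as the CED. All function names are hypothetical.

```python
from typing import Callable, Optional

Unit = Callable[[int, int], int]


def make_adder(stuck_bit: Optional[int] = None, stuck_value: int = 0) -> Unit:
    """Build an adder, optionally with one output bit stuck at a fixed value."""
    def unit(a: int, b: int) -> int:
        total = a + b
        if stuck_bit is not None:
            if stuck_value:
                total |= 1 << stuck_bit
            else:
                total &= ~(1 << stuck_bit)
        return total
    return unit


def ced_passes(unit: Unit, a: int, b: int) -> bool:
    """RESO-style check: direct result must equal the shifted recomputation."""
    return unit(a, b) == unit(a << 1, b << 1) >> 1


def dmr_ced(unit0: Unit, unit1: Unit, a: int, b: int) -> int:
    """Accept agreeing results; on disagreement use CED to pick the good copy."""
    r0, r1 = unit0(a, b), unit1(a, b)
    if r0 == r1:
        return r0                 # no disagreement: CED is not consulted
    if ced_passes(unit0, a, b):
        return r0                 # copy 0 is self-consistent, so trust it
    return r1                     # single-fault assumption: copy 1 must be good


good = make_adder()
bad = make_adder(stuck_bit=2)     # hypothetical upset in one copy
result_a = dmr_ced(bad, good, 100, 57)
result_b = dmr_ced(good, bad, 100, 57)
```

The extra recomputation pass is where the time overhead of the scheme comes from, in exchange for needing only two copies of the combinational logic instead of three.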

  12. Back to DMR-CED

  13. Evaluating DMR-CED Effectiveness: Methodology • Three sample sequential circuits tested • 8-bit multiplier • 8-bit ALU • FIR filter • Sample circuits were generated, then each node was replaced with a multiplexer that passes either the ‘correct’ value, ‘0’, or ‘1’ • Able to simulate every possible SEU fault
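The injection scheme on this slide (a multiplexer at every node that passes the correct value, ‘0’, or ‘1’) can be mimicked in software. The sketch below uses a one-bit full adder as a small stand-in circuit, not one of the three circuits from the paper, and the node names and the `fault_campaign` helper are invented for the example.

```python
from itertools import product

NODES = ("x1", "a1", "a2", "sum", "cout")   # internal nets of the stand-in circuit


def full_adder(a, b, cin, fault=None):
    """One-bit full adder; `fault` is None or a (node_name, forced_value) pair."""
    def node(name, value):
        # Injection multiplexer: pass the correct value unless this node is faulted.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    x1 = node("x1", a ^ b)
    a1 = node("a1", a & b)
    a2 = node("a2", x1 & cin)
    s = node("sum", x1 ^ cin)
    cout = node("cout", a1 | a2)
    return s, cout


def fault_campaign():
    """For each (node, stuck value), count inputs whose outputs become wrong."""
    results = {}
    for name, stuck in product(NODES, (0, 1)):
        results[(name, stuck)] = sum(
            full_adder(a, b, c, (name, stuck)) != full_adder(a, b, c)
            for a, b, c in product((0, 1), repeat=3)
        )
    return results


campaign = fault_campaign()
```

Enumerating every node, both forced values, and every input vector is feasible only for small circuits like this one; it is shown here simply to make the multiplexer idea concrete.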

  14. Evaluating DMR-CED Effectiveness: Results

  15. Evaluating DMR-CED Effectiveness: Results

  16. Evaluating DMR-CED • Benefits • Reduces area required for combinational logic (by a significant amount in some cases) • Drawbacks • Significantly more complicated due to CED • The CED scheme must be chosen and tuned for each combinational circuit being protected • Speed reduced by as much as 50%

  17. Comments on the original paper • Reasonably well written and complete • Necessary to read the references to understand the minutiae of underlying principles • DMR-CED probably only useful under very specific conditions
