
Ketil Røed 1,2,3 Johan Alme 2 , Dominik Fehlker 2 , H. Helstrup 1 ,

Radiation Tolerance of an SRAM based FPGA Used in a Large Tracking Detector. Ketil Røed 1,2,3, Johan Alme 2, Dominik Fehlker 2, H. Helstrup 1, Matthias Richter 2, Kjetil Ullaland 2, Dieter Röhrich 2. Affiliations: 1. Bergen University College, 2. University of Bergen, 3. CERN.





Presentation Transcript


  1. Radiation Tolerance of an SRAM based FPGA Used in a Large Tracking Detector • Ketil Røed 1,2,3, Johan Alme 2, Dominik Fehlker 2, H. Helstrup 1, Matthias Richter 2, Kjetil Ullaland 2, Dieter Röhrich 2 • Affiliations: 1. Bergen University College, 2. University of Bergen, 3. CERN

  2. Outline • Main focus: reconfiguration solution applied to reduce the probability of functional failures due to SEUs. • Introduction & background • System description • Testing & Results

  3. ALICE: A Large Ion Collider Experiment TPC RCU

  4. Challenge • Make use of commercial SRAM based FPGAs for data readout in the TPC radiation environment. • Physics: Nuclear Interaction • Effect: Single Event Upset (SRAM cell value 1 → 0 or 0 → 1) • Consequence: Functional Failure

  5. Failure prediction • Various SEU cross section results (29*, 63**, 180 MeV p***, mixed E n****): 2 - 4 x 10^-14 cm^2/bit • FPGAs exposed to a hadron flux of 100 - 200 particles/(cm^2 s)* (n, π, p; E > 10 MeV) • Failure prediction for all 216 FPGAs and a 4 hour run • SEUs: 20 - 80 • Conservatively only 1 out of every 10 config. bits are used***** • Functional failures: 2 - 8 • References: * K. Røed, Bergen University College, PhD thesis, to be published. ** H. Quinn, Radiation-induced multi-bit upsets in SRAM-based FPGAs, IEEE Transactions on Nuclear Science, 52(6):2455-2461, Dec. 2005. *** G. Tröger, KIP, Uni. Heidelberg, PhD thesis, to be published. **** Lesea et al., The Rosetta Experiment, IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, 2005. ***** Using an SEUPI (Single Event Upset Probability Impact) = 10***
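
The prediction above follows from multiplying flux, per-bit cross section, number of configuration bits, run time and number of FPGAs. The sketch below is only an illustration of that arithmetic. The slide does not state the number of configuration bits per FPGA, so `BITS_PER_FPGA` is an assumed value chosen so that the result matches the quoted range; the function name is hypothetical.

```python
# Rough SEU estimate: SEUs = flux * cross_section_per_bit * bits * time * n_fpgas
N_FPGAS = 216             # from the slide
RUN_SECONDS = 4 * 3600    # 4 hour run, from the slide
SEUPI = 10                # 1 in 10 upsets assumed to cause a functional failure (slide value)
BITS_PER_FPGA = 3.2e6     # ASSUMPTION: not given on the slide, picked to reproduce its range


def expected_seus(flux, sigma_bit, bits=BITS_PER_FPGA,
                  seconds=RUN_SECONDS, n_fpgas=N_FPGAS):
    """Expected number of upsets for a flux in particles/(cm^2 s) and a
    cross section in cm^2/bit, summed over all FPGAs for one run."""
    return flux * sigma_bit * bits * seconds * n_fpgas


low = expected_seus(flux=100, sigma_bit=2e-14)
high = expected_seus(flux=200, sigma_bit=4e-14)

print(f"SEUs per run: {low:.0f} - {high:.0f}")
print(f"Functional failures per run: {low / SEUPI:.0f} - {high / SEUPI:.0f}")
```

With the assumed bit count this gives roughly 20 - 80 SEUs and 2 - 8 functional failures per run, in line with the numbers on the slide.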

  6. Repeated Outline • A system solution is developed both to reduce the probability of failure and to offer additional testing functionality • Introduction & background • System description • Testing & Results

  7. Readout Control Unit (RCU) • RCU main FPGA controls readout of detector data • Keep data path intact by correcting SEUs (Task of Support FPGA) • Reconfiguration solution based on Active Partial Reconfiguration

  8. Active Partial Reconfiguration (APR) Rewriting a subset (frame) of the configuration memory of an FPGA while the user design is operating. Source: UG012 - Virtex-II Pro and Virtex-II Pro X FPGA User Guide

  9. Support FPGA Configuration Controller Memory Mapped Interface to Detector Control System Configuration Interface

  10. Frame by frame Readback, Verification and Correction Memory Mapped Interface to Detector Control System Original frame data Frame Readback Reconfigure frame
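
The readback, verification and correction cycle can be sketched in software as a loop over frames. This is a simplified model, not the Support FPGA implementation: the configuration memory is represented as a list of byte arrays, and `frvc_pass` and the frame sizes are hypothetical names and values used only for illustration.

```python
def frvc_pass(config_memory, golden_frames):
    """One FRVC cycle: read back each frame, compare it with the original
    frame data, and rewrite only the frames that differ.
    Returns the indices of the corrected frames."""
    corrected = []
    for index, golden in enumerate(golden_frames):
        readback = bytes(config_memory[index])        # frame readback
        if readback != golden:                        # verification
            config_memory[index] = bytearray(golden)  # reconfigure this frame only
            corrected.append(index)
    return corrected


# Toy configuration memory: 4 frames of 8 bytes each (sizes are illustrative).
golden = [bytes([i] * 8) for i in range(4)]
memory = [bytearray(frame) for frame in golden]

memory[2][5] ^= 0x10              # emulate a single event upset in frame 2
fixed = frvc_pass(memory, golden)
print("corrected frames:", fixed)
```

Because only mismatching frames are rewritten, the rest of the configuration is left untouched, which matches the idea of correcting upsets while the user design keeps running.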

  11. Repeated Outline • Introduction & background • System description • Testing & Results

  12. Testing • Irradiation testing (physical) • Errors (SEUs) are injected into the configuration memory using a proton beam • Fault injection (software) • Errors ("SEUs") are injected into the configuration memory through manipulation of the configuration bitstream • Alternative to irradiation testing? • Main Objectives • Validate implementation of Support FPGA configuration controller and fault injection solution • Investigate effect of mitigation approach
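
Software fault injection amounts to flipping selected bits in a copy of the configuration data before it is written to the device. A minimal sketch follows, assuming the bitstream is available as a plain byte array; the helper names are hypothetical, and real bitstreams also contain headers and commands that this model ignores.

```python
import random


def inject_bit_flip(bitstream, bit_index):
    """Flip one bit in place, emulating a single event upset."""
    byte_index, bit_in_byte = divmod(bit_index, 8)
    bitstream[byte_index] ^= 1 << bit_in_byte


def inject_random_faults(bitstream, count, seed=None):
    """Flip `count` distinct, randomly chosen bits and return their positions."""
    rng = random.Random(seed)
    positions = rng.sample(range(len(bitstream) * 8), count)
    for position in positions:
        inject_bit_flip(bitstream, position)
    return positions


frame = bytearray(16)                         # all-zero toy frame
flipped = inject_random_faults(frame, count=3, seed=1)
print("flipped bit positions:", sorted(flipped))
```

Recording the injected positions is what makes it possible to relate each fault to the observed behaviour of the design, which irradiation alone does not offer.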

  13. FPGA test design • Basic shift register extended with a configurable TMR solution (on/off) • Can reconfiguration and TMR reduce the failure probability? • Example shift register content shown: 1 1 0 1
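
The idea of a shift register with switchable TMR can be illustrated with a small behavioural model: three copies of the register are shifted in parallel and a majority voter selects the output. This is a sketch under simplifying assumptions, not the actual test design. It flips a stored register bit as a stand-in for an upset, whereas real configuration upsets alter logic or routing. The class and function names are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-out-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)


class TmrShiftRegister:
    """Three parallel shift registers with an optional voter on the output."""

    def __init__(self, length, tmr_enabled=True):
        self.tmr_enabled = tmr_enabled
        self.copies = [[0] * length for _ in range(3)]

    def shift(self, bit_in):
        outputs = []
        for register in self.copies:
            outputs.append(register.pop())   # oldest bit leaves the register
            register.insert(0, bit_in)       # new bit enters at the front
        # Without TMR only the first copy drives the output.
        return majority(*outputs) if self.tmr_enabled else outputs[0]


def run(tmr_enabled, pattern=(1, 1, 0, 1)):
    """Load a pattern, upset one copy, then read the pattern back out."""
    register = TmrShiftRegister(len(pattern), tmr_enabled)
    for bit in pattern:
        register.shift(bit)
    register.copies[0][-1] ^= 1              # emulated upset in one copy
    return [register.shift(0) for _ in pattern]


print("TMR on: ", run(True))
print("TMR off:", run(False))
```

With the voter enabled the single corrupted copy is outvoted and the pattern comes out intact; with it disabled the upset appears directly at the output.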

  14. Test procedure • Timeline markers: start (Tstart), T1, T2, end (Tend) • Irradiation or fault injection, with continuous checking of shift register output • Mitigation procedure: None, FRVC, TMR • FRVC: Frame by frame Readback, Verification and Correction

  15. Irradiation test results (1) • Reconfiguration (FRVC) • corrects and prevents accumulation of SEUs • reduces the lifetime of functional failures • Additional mitigation (TMR) • masks out functional failures due to individual SEUs • Plot panels: No mitigation, FRVC enabled, FRVC + TMR enabled

  16. Irradiation test results (2) • Only a fraction of the SEUs leads to functional failure (as expected) • Reconfiguration alone does not reduce the failure probability • Must be combined with mitigation at user design level to be effective • Fault injection reproduces irradiation test results

  17. Fault injection results • Only a fraction of the SEUs leads to functional failure (as expected) • Reconfiguration alone does not reduce the failure probability • Must be combined with mitigation at user design level to be effective • Fault injection reproduces irradiation test results

  18. Distribution of sensitive bits • I/O and clock resources (no mitigation implemented) • Voter + shift register • Only shift register 2 3 1 No mitigation FRVC + TMR

  19. Summary • Successful implementation of reconfiguration network • Allows us to use COTS SRAM FPGAs in radiation environments. • Prevents accumulation of SEUs by continuous reconfiguration, but mitigation at the level of user design is also needed. • The combination will significantly reduce the probability of functional failures during operation. • System allows monitoring of SEUs during operation • Fault injection implemented as alternative test method • Locate sensitive bits → optimize mitigation approach • To do: Predict the failure probability of the final design

  20. Acknowledgements • Gerd Tröger, University of Heidelberg • Luciano Musa, Blahoslav Pastircák, CERN • Austin Lesea, Xilinx • Alexander Prokofiev, TSL University of Uppsala • Jon Wikne, Eivind Olsen, OCL University of Oslo

  21. Backup

  22. Irradiation test results 1 2 3 1 1+2 FRVC enabled TMR enabled No action

  23. FLASH mode SelectMAP mode Normal mode RCU support FPGA

  24. Some numbers

  25. General Fault Injection Flow • Steps shown: inject errors from software, inject bit error, check design, readback and correct (1 cycle FRVC, if requested), store result • Timings shown: 150 ms (1 frame), 60 - 120 ms, 15 ms (96 frames) • FRVC: Frame by frame Readback, Verification and Correction
