
Mahdi Fazeli, Seyed Ghassem Miremadi, Hossein Asadi, Seyed Nematollah Ahmadian. A Fast and Accurate Multi-Cycle Soft Error Rate Estimation Approach to Resilient Embedded Systems Design. Presenter: Saman Aliari, University of Illinois at Urbana-Champaign.


Mahdi Fazeli, Seyed Ghassem Miremadi, Hossein Asadi, Seyed Nematollah Ahmadian

A Fast and Accurate Multi-Cycle Soft Error Rate Estimation Approach to Resilient Embedded Systems Design



University of Illinois at Urbana-Champaign

Department of Computer Engineering

Sharif University of Technology

Tehran, IRAN

Speech Outlines

  • Soft Errors

  • SER Modeling in Multi-Cycle Operation

  • SER Modeling in Single Cycle Operation

  • Proposed SER Modeling in Multi Cycle Operation

  • Tool Overview

  • Experimental Results and Discussions

  • Conclusions

What Is a Soft Error?






  • Transient Faults

    • Due to radiation events

    • 1 → 0 or 0 → 1

    • Alpha particles or Neutrons

    • Memory, Flip-flops, Combinational Logic







Evidence of Particle Strikes

  • 2000 [Forbes Magazine’00]

    • Sun Enterprise servers crash due to a cache problem

  • 2001 [ITRS’01]

    • Soft errors as a major issue in chip design

  • 2003 [EE Times’04]

    • Cisco router failures due to soft errors

  • 2004[]

    • Xilinx FPGAs highly sensitive to soft errors

  • 2005[]

    • Soft error workshop (70% industry attendees)

  • 2011 [ZeroSoft’06]

    • 70% of chips expected to fail in a year

Multi-Cycle Soft Error Propagation

First Cycle: The SET does not propagate to the Primary Output (PO)

Second Cycle: The error propagates to the Primary Output (PO)

SER Modeling in Single Cycle

  • SER = Nominal FIT × Logical Derating × Timing Derating × Electrical Derating

  • Nominal FIT:

    • Occurrence rate of cosmic rays at error site

    • Computed once for library characterization

  • Logical Derating

  • Timing Derating

  • Electrical Derating
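
The product of the nominal FIT and the three derating factors can be sketched in a few lines. This is a minimal illustration, not the authors' tool: the function name `node_ser` and the numeric values are assumptions chosen only to show the arithmetic for a single error site.

```python
def node_ser(nominal_fit: float, logical: float, timing: float, electrical: float) -> float:
    """SER contribution of one error site: nominal FIT scaled by three derating factors."""
    factors = (("logical", logical), ("timing", timing), ("electrical", electrical))
    for name, value in factors:
        # Each derating factor is a probability, so it must lie in [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} derating must lie in [0, 1], got {value}")
    return nominal_fit * logical * timing * electrical


# Hypothetical numbers, for illustration only.
print(node_ser(nominal_fit=100.0, logical=0.12, timing=0.5, electrical=0.9))
```

Because every derating factor is at most 1, the derated rate can never exceed the nominal FIT of the site.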




Logical Derating Modeling

  • The Main Idea:

    • Traversing structural paths from SEU site to POs and FFs

    • Using Signal Probabilities (SP) for off-path signals

      • SP_A: probability of gate “A” having logic value “1”

      • Effective techniques available for SP computation

[Figure: example circuit with the off-path signals labeled]

  • EPP(AD) = SPB = 0.2 EPP: Error Propagation Probability

  • EPP(AE) = EPP(AD)(1-SPC) = 0.20.6 = 0.12

Propagation Rules: On-Path Gates

  • Reconvergent Paths

    • Error propagated to two or more inputs of a gate

      • Polarity of propagated error matters!

  • Four logic values are needed to represent the state of each line

    • 0, 1: no error propagation (error masked)

    • a: error propagation with the same polarity as the error site

    • ā: error propagation with the opposite polarity to the error site

  • Pa(Ui), Pā(Ui), P1(Ui), P0(Ui): probabilities that line Ui carries a, ā, 1, or 0

  • Developed Error Propagation Probability (EPP) Rules

    • For all logic gates

Propagation Rules

  • On-path gates: Pa(Ui) + Pā(Ui) + P1(Ui) + P0(Ui) = 1

  • Off-path gates: P1(Ui) + P0(Ui) = 1
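
The four-value bookkeeping can be sketched as follows. The gate rules below are derived directly from the definitions of 0, 1, a, and ā under the assumption that gate inputs are statistically independent; they are an illustration, not a copy of the authors' rule tables, and the names `LineState`, `off_path`, `and_gate`, `or_gate`, and `not_gate` are hypothetical.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class LineState:
    p0: float      # P0(Ui): line is 0 (error masked)
    p1: float      # P1(Ui): line is 1 (error masked)
    pa: float      # Pa(Ui): error present, same polarity as the error site
    pa_bar: float  # Pā(Ui): error present, opposite polarity


ERROR_SITE = LineState(0.0, 0.0, 1.0, 0.0)


def off_path(sp: float) -> LineState:
    """Off-path line with signal probability sp, so P1 + P0 = 1."""
    return LineState(1.0 - sp, sp, 0.0, 0.0)


def and_gate(*ins: LineState) -> LineState:
    p1 = prod(i.p1 for i in ins)                        # all inputs are 1
    pa = prod(i.p1 + i.pa for i in ins) - p1            # only 1s and a's, at least one a
    pa_bar = prod(i.p1 + i.pa_bar for i in ins) - p1    # only 1s and ā's, at least one ā
    return LineState(1.0 - p1 - pa - pa_bar, p1, pa, pa_bar)


def or_gate(*ins: LineState) -> LineState:
    p0 = prod(i.p0 for i in ins)                        # all inputs are 0
    pa = prod(i.p0 + i.pa for i in ins) - p0
    pa_bar = prod(i.p0 + i.pa_bar for i in ins) - p0
    return LineState(p0, 1.0 - p0 - pa - pa_bar, pa, pa_bar)


def not_gate(x: LineState) -> LineState:
    """Inversion swaps 0 with 1 and a with ā."""
    return LineState(x.p1, x.p0, x.pa_bar, x.pa)


d = and_gate(ERROR_SITE, off_path(0.2))
e = or_gate(d, off_path(0.4))
masked = and_gate(ERROR_SITE, not_gate(ERROR_SITE))  # reconvergence: a AND ā is always 0
print(e, masked)
```

If D is taken to be an AND gate and E an OR gate with SP_C = 0.4 (both assumptions), `e.pa` matches the 0.12 computed for EPP(A→E) in the logical derating example. The `masked` case shows why polarity matters: an error that reconverges with its own inverse at an AND gate is always masked.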

Timing Derating Modeling

  • Find all possible propagated waveforms

    • Enhanced static timing analysis

      • Record all possible transitions at each reachable gate

        • Due to glitch at error site

  • How?

    • Create glitch of width w

      • Represented by two events: (a,t), (ā,t+w)

        • For both positive and negative glitches

    • Inject two events (a,t), (ā,t+w) at error site

    • Find all events at the outputs of all on-path gates

    • Calculate the error propagation probabilities Pa, Pā for each event

    • The propagation is done until reaching a PO or FF.

    • Error propagation probabilities for all possible waveforms are computed

    • For each waveform, Latching Probability is computed as follows:

    • S: Setup Time, H: Hold Time, W: Glitch Width, T: Clock Period

  • Different glitches may propagate to the POs or FFs due to reconvergent fan-out
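
The latching-probability expression itself is not preserved in this transcript. As a sketch only, the code below represents a glitch as the two events described above and applies one simple latching-window model, which is an assumption and not necessarily the expression used on the slide: a glitch is captured only if it covers the whole setup-plus-hold window, giving a probability of (W − (S + H)) / T, clamped to [0, 1]. The function names are hypothetical.

```python
def glitch_events(t: float, w: float):
    """A glitch of width w injected at time t, as two events: (a, t) and (ā, t + w)."""
    if w <= 0:
        raise ValueError("glitch width must be positive")
    return [("a", t), ("a_bar", t + w)]


def latching_probability(w: float, s: float, h: float, t_clk: float) -> float:
    """ASSUMED model: the glitch must span the full setup + hold window to be latched."""
    if t_clk <= 0:
        raise ValueError("clock period must be positive")
    usable = max(0.0, w - (s + h))    # glitches narrower than the window are never latched
    return min(1.0, usable / t_clk)   # fraction of the clock period in which capture occurs


# Hypothetical timing values in picoseconds, for illustration only.
print(glitch_events(0.0, 50.0))
print(latching_probability(w=50.0, s=10.0, h=10.0, t_clk=1000.0))
```

Under this model, a wider glitch or a shorter clock period raises the chance that a flip-flop captures the transient.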

Electrical Derating Modeling

  • Algorithm: Computing electrical masking while propagating events

  • Vomin(Gj, inputk): Minimum voltage of input k of Gj

  • Vomax(Gj, inputk): Maximum voltage of input k of Gj

  • Vomin(Gj): Minimum voltage of Gj output

  • Vomax(Gj): Maximum voltage of Gj output

  • PWo: Output pulse width

  • For each gate Gj in List(Gi) do

    • For each valid waveform (Wl) in Event List(Gj) do

      • Vomin(inputs) = Max(Vomin of gate inputs on waveform Wl);

      • Vomax(inputs) = Min(Vomax of gate inputs on waveform Wl);

      • Compute Vomin(Gj)

      • Compute Vomax(Gj)

      • Compute PWo using the computed Vomin(Gj) and Vomax(Gj)

    • end

  • end

A Case Study: Error Propagation for Two Clock Cycles

All three deratings may occur

Only logical derating may occur

The Tool: MLET (Multi-Cycle Logical-Electrical-Timing Derating)

Experimental Results: Run Time

  • On average, 4 orders of magnitude faster than Monte Carlo (MC) based simulation

  • The time required to compute SPs is also 5 orders of magnitude less than that of MC-based simulation

Execution times for MC simulation approach, SP computation, and MLET approach

Experimental Results: Accuracy

  • MLET has an accuracy of about 97% compared to the MC fault injection approach

Difference of derating factors obtained by MLET using various SP variances compared to MC simulations (for an injected pulse width of 50 ps)

Multi-Cycle SERs

Multi-cycle SER estimation of s820 and s832 ISCAS’89 circuits using MLET

Conclusions & Future Work

  • SER Estimation is very challenging as it requires dynamic analysis of transients.

  • Existing SER estimation approaches investigate error propagation probabilities for only a single cycle, resulting in inaccurate system failure rates.

  • We have proposed a fast and accurate analytical approach, called MLET, which has four main features:

    • It runs very fast.

    • All three masking factors are considered.

    • The effects of error propagation in re-convergent fan-outs are modeled.

    • The effect of multi-cycle error propagation on overall circuit SER is considered.

Conclusions & Future Work Cont’d

  • Experimental results for several ISCAS’89 benchmark circuits show that MLET:

    • Is 4 orders of magnitude faster than the MC simulation based fault injection method

    • Has an accuracy of about 97%

  • Future work: estimating the SER of a circuit in the presence of Multiple Event Transients (METs), a reliability concern in ultra-deep sub-micron technologies

Related Work: SER Modeling

  • Circuit/Logic-Level Approach

    • Fault injection

      • SERA by Zhang et al. [ICCAD’04]

      • SEAT-LA by Rajaraman et al. [VLSID’06]

      • Mohanram et al. [ITC’03]

      • Maheshwari et al. [DFT’03]

      • Asadi et al. [DSN’03] [PRDC’04]

      • Seifert et al. [TDMR’04]

    • Probabilistic Transfer Matrices (PTM)

      • Krishnaswamy et al. [DATE’05]

    • Binary Decision Diagram (BDD)

      • FASER by Zhang et al. [ISQED’06] [SELSE’05]