On the Threat of Metastability in an Asynchronous Fault-Tolerant Clock Generation Scheme

## On the Threat of Metastability in an Asynchronous Fault-Tolerant Clock Generation Scheme


Gottfried Fuchs, Matthias Függer and Andreas Steininger
Vienna University of Technology, Embedded Computing Systems Group
{fuchs, fuegger, steininger}@ecs.tuwien.ac.at

**Outline**

- Asynchronous fault-tolerant algorithm
- Investigate its susceptibility to metastability
- In this context: study Sutherland's micropipeline

**Clocking in SoCs**

- GALS
  - (+) no single point of failure
  - (-) no common time across chip
- DARTS
  - (+) no single point of failure
  - (+) common time across chip (< small # of ticks)
- Synchronous SoC
  - (-) single point of failure (Seifert et al.)
  - (+) common time across chip (< 1 tick)

**SoC with Common Time**

[Figure: tick timelines tick(2) … tick(5) in the local clock domains of nodes p and q; in the example shown, π(t) = 2 and #ticks(Δ) = 3.]

- Precision: at any t, π(t) is bounded.
- Accuracy: l(Δ) < #ticks in any Δ < u(Δ).

Common time eases solving other problems (replica determinism, …).

**DARTS Hardware Implementation**

Common time property proved in [EDCC06, PODC09].

- Initially:
  - send tick(0) to all; clock := 0;
- If received tick(m) from at least f+1 remote nodes and m > clock:
  - send tick(clock+1), …, tick(m) to all; clock := m;
- If received tick(m) from at least 2f+1 remote nodes and m >= clock:
  - send tick(m+1) to all; clock := m+1;

But: the proofs cover digital behavior only. What about metastability (during non-normal operation)?

**Potential for metastability (1)**

- TG-Alg has
  - (a) stable state
  - (b) fault non-closed (unrestricted) environment (no stability condition as in QDI)
- Hence there exists a malicious input pulse.
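The tick generation rules listed under DARTS Hardware Implementation can be sketched in Python as follows. This is a minimal simulation sketch under stated assumptions, not the authors' hardware design: the class `TickNode`, the helper `simulate`, the FIFO message queue, and the demo configuration (n = 5 nodes, f = 1, one silent node) are all hypothetical. The sketch also assumes that a sender whose highest tick is k counts as having sent every tick(m) with m ≤ k, and that the relay rule is evaluated before the increment rule.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TickNode:
    """One node running the tick generation rules (illustrative model)."""
    node_id: int
    f: int                      # assumed upper bound on the number of faulty nodes
    clock: int = 0              # "clock := 0" from the initial rule
    highest: dict = field(default_factory=dict)  # sender id -> highest tick seen

    def support(self, m):
        # Assumption: a sender whose highest tick is >= m has also sent tick(m).
        return sum(1 for h in self.highest.values() if h >= m)

    def on_tick(self, sender, m):
        """Process tick(m) from a remote sender; return the ticks to broadcast."""
        self.highest[sender] = max(self.highest.get(sender, -1), m)
        sent = []
        # Relay rule: tick(m) from at least f+1 remote nodes and m > clock.
        for level in sorted(set(self.highest.values()), reverse=True):
            if level > self.clock and self.support(level) >= self.f + 1:
                sent.extend(range(self.clock + 1, level + 1))
                self.clock = level
                break
        # Increment rule: tick(m) from at least 2f+1 remote nodes and m >= clock.
        # After relaying, only m == clock can still have that much support.
        if self.support(self.clock) >= 2 * self.f + 1:
            self.clock += 1
            sent.append(self.clock)
        return sent


def simulate(n=5, f=1, silent=frozenset({4}), max_tick=3):
    """Run the rules over a FIFO queue; nodes in `silent` never send anything."""
    nodes = {i: TickNode(i, f) for i in range(n) if i not in silent}
    queue = deque((i, 0) for i in nodes)        # every node sends tick(0) to all
    while queue:
        sender, m = queue.popleft()
        for j, node in nodes.items():
            if j == sender:                     # only remote ticks are counted
                continue
            for t in node.on_tick(sender, m):
                if t <= max_tick:               # stop broadcasting after max_tick
                    queue.append((j, t))
    return {i: node.clock for i, node in nodes.items()}


if __name__ == "__main__":
    print(simulate())
```

The model is purely digital: like the proofs mentioned on the slide, it says nothing about what happens when a pulse arrives at an electrically marginal moment, which is the question the remaining slides address.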
Make sure metastability does not propagate across the ECR boundary.

**Existence of metastability barrier?**

[Figure: micropipeline (Sutherland).]

**Does a micropipeline "synchronize"?**

[Figure: waveforms in(t), out(t) and malicious out(t), with pulse edges tE1 and tE2.]

- Critical pulse window size (2 stages) = tE2 - tE1
- Critical pulse window size (4 stages): [value shown in figure only]

**Metastability decay in a C-Element (1)**

[Figure: latch model and C-element model side by side.]

- Latch: decay towards LO/HI, MTBU formula.
- C-element: do equivalent formulas exist?

**Metastability decay in a C-Element (2)**

- a, b: inputs (b = armed); z: output; x: feedback
- [Figure: a(t) and f(a,b,x)(t), with x0 marked at tE.]
- For t > tE: consider the homogeneous solution, with f(a,b,x)(t) = x(t).
- Near the metastability point: [equation missing]; with the assumption x0 = "midway", this yields [equation missing].
- Remember the latch: strong indication for synchronizing behavior.

**Simulation Setup**

- 4-stage pipeline, simulated with MATLAB's stiff ODE solver
- Parameters: CMOS 180 nm, but G = 1.66 (numeric resolution)
- Choose Tmaxcorr = 3 Tnom
- [Figure: malicious out(t).]

**Simulation Results (1)**

Dependence on RC constants: the critical window size shows an approximately linear dependence only.

**Simulation Results (2)**

Dependence on #stages: critical window size ~10^-1 per stage.

**Simulation Results (3)**

Dependence on G: ~10^-7 / 1.

In the case of DARTS, simulation indicates that the critical pulse window size is < 1 fs.

**Conclusions**

- Example of a fault-tolerant asynchronous algorithm: DARTS.
- Identified the micropipeline as a metastability barrier.
- Characterized its synchronizing behavior.

Open research:

- Refined C-element models (yield results for larger G).
- Extend analysis to incorporate masking effects and calculate metastability upset probability.
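For reference, the latch behavior that the C-element slides compare against ("decay towards LO/HI", "MTBU formula") is usually written with the textbook first-order model below. This is a sketch of that standard model, not a reconstruction of the formulas on the slides. The symbols are assumptions introduced here: $v$ is the deviation from the metastable point, $v_0$ its initial value, $\tau$ the regeneration time constant, $V_{\mathrm{th}}$ the deviation at which the output counts as resolved, $t_{\mathrm{avail}}$ the resolution time granted, $T_0$ a technology-dependent window parameter, and $f_{\mathrm{clk}}$, $f_{\mathrm{data}}$ the clock and data rates.

```latex
% Textbook first-order latch model (assumed; not taken from the slides).
\begin{align}
  \tau \,\frac{dv}{dt} &= v(t)
    && \text{linearized dynamics near the metastable point} \\
  v(t) &= v_0 \, e^{t/\tau}
    && \text{exponential departure from the metastable point} \\
  t_{\mathrm{need}}(v_0) &= \tau \,\ln\frac{V_{\mathrm{th}}}{|v_0|}
    && \text{time needed to reach a valid logic level} \\
  \delta(t_{\mathrm{avail}}) &= T_0 \, e^{-t_{\mathrm{avail}}/\tau}
    && \text{input window that is still unresolved after } t_{\mathrm{avail}} \\
  \mathrm{MTBU} &= \frac{e^{t_{\mathrm{avail}}/\tau}}{T_0 \, f_{\mathrm{clk}} \, f_{\mathrm{data}}}
    && \text{mean time between upsets}
\end{align}
```

In this model, every additional resolution time $\Delta t$ shrinks the critical window by the constant factor $e^{-\Delta t/\tau}$. A fixed shrink factor per pipeline stage, as reported in Simulation Results (2), would be qualitatively consistent with that, although this reading is an interpretation and is not stated on the slides.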