EE 587 SoC Design & Test Partha Pande School of EECS Washington State University email@example.com
Why Model Faults? • I/O function tests inadequate for manufacturing (functionality versus component and interconnect testing) • Real defects (often mechanical) too numerous and often not analyzable • A fault model identifies targets for testing • A fault model makes analysis possible • Effectiveness measurable by experiments
Functional Vs Structural Testing • Consider testing of a ten-input AND function • We apply an input pattern 0101010101 • Output is 0 • More than one inference is possible • Functional test is necessary for verification • The purpose of manufacturing test is to find any faults caused by manufacturing defects
Common Fault Models • Single stuck-at faults • Transistor open and short faults • Memory faults • PLA faults (stuck-at, cross-point, bridging) • Functional faults (processors) • Delay faults (transition, path) • Analog faults
Stuck-at Fault • The circuit is modeled as an interconnection of Boolean gates • Each connecting line can have two types of faults • Stuck-at-1 (s-a-1) & Stuck-at-0 (s-a-0) • A circuit with n lines can have 3^n-1 possible stuck line combinations • An n-line circuit can have at most 2n single stuck-at faults
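The two counts above can be checked by brute-force enumeration. This is a minimal sketch; the four line names are hypothetical and stand for any circuit with n = 4 lines.

```python
from itertools import product

def single_stuck_at_faults(lines):
    """Every (line, stuck_value) pair: two single faults per line."""
    return [(line, value) for line in lines for value in (0, 1)]

def count_stuck_line_combinations(n):
    """Each line is fault-free, s-a-0 or s-a-1; the all-fault-free case is excluded."""
    states = ("ok", "sa0", "sa1")
    return sum(1 for combo in product(states, repeat=n)
               if any(s != "ok" for s in combo))

lines = ["a", "b", "c", "z"]                  # hypothetical 4-line circuit
print(len(single_stuck_at_faults(lines)))     # 2n = 8
print(count_stuck_line_combinations(4))       # 3^4 - 1 = 80
```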
Single Stuck-at Fault • Three properties define a single stuck-at fault • Only one line is faulty • The faulty line is permanently set to 0 or 1 • The fault can be at an input or output of a gate • Example: XOR circuit has 12 fault sites and 24 single stuck-at faults [Figure: XOR circuit with lines a through k and output z, showing good and faulty circuit values for the test vector that detects the h s-a-0 fault]
Fault Equivalence • Number of fault sites in a Boolean gate circuit = #PI + #gates + #(fan-out branches). • Fault equivalence: Two faults f1 and f2 are equivalent if all tests that detect f1 also detect f2, and vice versa. • If faults f1 and f2 are equivalent then the corresponding faulty functions are identical. • Fault collapsing: All single faults of a logic circuit can be divided into disjoint equivalence subsets, where all faults in a subset are mutually equivalent. A collapsed fault set contains one fault from each equivalence subset.
Equivalence Rules [Figure: sa0/sa1 fault equivalences marked on the inputs and outputs of WIRE, AND, OR, NOT, NAND, NOR and FANOUT]
Equivalence Example [Figure: example circuit with sa0 and sa1 faults on every line; faults removed by equivalence collapsing are marked] • Collapse ratio = 20/32 = 0.625
Fault Dominance • If all tests of some fault F1 detect another fault F2, then F2 is said to dominate F1. • Dominance fault collapsing: If fault F2 dominates F1, then F2 is removed from the fault list. • When dominance fault collapsing is used, it is sufficient to consider only the input faults of Boolean gates. See the next example. • If two faults dominate each other then they are equivalent.
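Dominance Example [Figure: gate with faults F1 s-a-1 and F2 s-a-1. All tests of F2: 001, 110, 010, 000, 101, 100, 011; one of these is the only test of F1. A second panel shows a dominance collapsed fault set with three s-a-1 faults and one s-a-0 fault.]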
Dominance Example [Figure: example circuit with sa0 and sa1 faults on every line; faults in blue removed by equivalence collapsing, faults in green removed by dominance collapsing] • Collapse ratio = 15/32 = 0.47
Dominance Fault Collapsing • An n-input Boolean gate requires (n+1) single stuck-at faults to be modeled. • To collapse faults of a gate, all faults from the output can be eliminated, retaining one type (s-a-1 for AND and NAND; s-a-0 for OR and NOR) of fault on each input and the other type (s-a-0 for AND and NAND; s-a-1 for OR and NOR) on any one of the inputs. • The output faults of the NOT gate and the wire can be removed as long as both faults on the input are retained.
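The collapsing rule for a single gate can be written out and checked by simulation. This is a sketch under stated assumptions: the function names and the output name "z" are hypothetical, and one detecting vector is picked per retained fault to confirm that the resulting tests also detect every dropped fault of the gate.

```python
from itertools import product

def dominance_collapsed_faults(gate_type, inputs):
    """n+1 faults: one type on every input, the other type on one input."""
    if gate_type in ("AND", "NAND"):
        each_input, one_input = 1, 0
    elif gate_type in ("OR", "NOR"):
        each_input, one_input = 0, 1
    else:
        raise ValueError("unsupported gate type")
    return [(name, each_input) for name in inputs] + [(inputs[0], one_input)]

def simulate(gate_type, vector, inputs, fault=None):
    """Gate output for one vector, with an optional stuck-at fault (line, value)."""
    values = dict(zip(inputs, vector))
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]
    if gate_type in ("AND", "NAND"):
        z = int(all(values.values()))
    else:
        z = int(any(values.values()))
    if gate_type in ("NAND", "NOR"):
        z ^= 1
    if fault and fault[0] == "z":
        z = fault[1]
    return z

def covers_all_faults(gate_type, inputs):
    """Pick one test per collapsed fault, then check every gate fault is detected."""
    vectors = list(product((0, 1), repeat=len(inputs)))
    detects = lambda v, f: simulate(gate_type, v, inputs) != simulate(gate_type, v, inputs, f)
    tests = [next(v for v in vectors if detects(v, f))
             for f in dominance_collapsed_faults(gate_type, inputs)]
    all_faults = [(line, s) for line in list(inputs) + ["z"] for s in (0, 1)]
    return all(any(detects(v, f) for v in tests) for f in all_faults)

inputs = ["a", "b", "c"]
print(dominance_collapsed_faults("AND", inputs))   # 4 faults for a 3-input gate
print(covers_all_faults("AND", inputs))            # True
```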
Multiple Stuck-at Faults • A multiple stuck-at fault means that any set of lines is stuck-at some combination of (0,1) values. • The total number of single and multiple stuck-at faults in a circuit with k single fault sites is 3^k-1. • A single fault test can fail to detect the target fault if another fault is also present; however, such masking of one fault by another is rare. • Statistically, single fault tests cover a very large number of multiple faults.
Transistor (Switch) Faults • MOS transistor is considered an ideal switch and two types of faults are modeled: • Stuck-open -- a single transistor is permanently stuck in the open state. • Stuck-short -- a single transistor is permanently shorted irrespective of its gate voltage. • Detection of a stuck-open fault requires two vectors. • Detection of a stuck-short fault requires the measurement of quiescent current (IDDQ).
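Stuck-Open Example • Vector 1: test for A s-a-0 (initialization vector) • Vector 2: test for A s-a-1 • Two-vector s-op test can be constructed by ordering two s-at tests [Figure: CMOS gate with pMOS FETs connected to VDD and nMOS FETs below, inputs A and B, output C, and one stuck-open transistor; input bits 0 0 and 1 0 for the two vectors; output C = 0, then 1 in the good circuit (Z in the faulty circuit)]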
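Why two vectors are needed can be shown with a switch-level model. The sketch assumes the gate is a 2-input CMOS NOR whose pMOS transistor driven by A is stuck-open; the gate type is inferred from the output values and is not stated in the source. A floating output (Z) is modeled as retaining its previous value.

```python
def nor_cmos(a, b, previous, pmos_a_open=False):
    """Switch-level 2-input NOR; a floating output keeps its previous value."""
    pull_up = (a == 0 and b == 0) and not pmos_a_open   # series pMOS path to VDD
    pull_down = (a == 1 or b == 1)                      # parallel nMOS path to ground
    if pull_up:
        return 1
    if pull_down:
        return 0
    return previous   # high impedance: node charge is retained

def apply_vectors(vectors, initial, pmos_a_open):
    """Apply vectors in order and return the output after the last one."""
    out = initial
    for a, b in vectors:
        out = nor_cmos(a, b, out, pmos_a_open)
    return out

two_vector_test = [(1, 0), (0, 0)]   # initialize output to 0, then try to pull it up
print(apply_vectors(two_vector_test, 1, pmos_a_open=False))  # 1 (good circuit)
print(apply_vectors(two_vector_test, 1, pmos_a_open=True))   # 0 (fault detected)

# Vector 2 alone misses the fault if the output node happened to hold 1
print(apply_vectors([(0, 0)], 1, pmos_a_open=True))          # 1 (same as good)
```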
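Stuck-Short Example • Test vector for A s-a-0 [Figure: CMOS gate with pMOS FETs connected to VDD and nMOS FETs below, inputs A and B with values 1 and 0, and one stuck-short transistor; output C = 0 in the good circuit, X in the faulty circuit; the IDDQ path in the faulty circuit is marked]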
Basic Principle of IDDQ Testing • Measure IDDQ current through the Vss bus
SCAN DESIGN We already know this!
BIST Motivation • Useful for field test and diagnosis (less expensive than a local automatic test equipment) • Software tests for field test and diagnosis: • Low hardware fault coverage • Low diagnostic resolution • Slow to operate • Hardware BIST benefits: • Lower system test effort • Improved system maintenance and repair • Improved component repair • Better diagnosis
Costly Test Problems Alleviated by BIST • Increasing chip logic-to-pin ratio – harder observability • Increasingly dense devices and faster clocks • Increasing test generation and application times • Increasing size of test vectors stored in ATE • Expensive ATE needed for 1 GHz clocking chips • Hard testability insertion – designers unfamiliar with gate-level logic, since they design at behavioral level • In-circuit testing no longer technically feasible • Shortage of test engineers • Circuit testing cannot be easily partitioned
Typical Quality Requirements • 98% single stuck-at fault coverage • 100% interconnect fault coverage • Reject ratio – 1 in 100,000
Economics – BIST Costs • Chip area overhead for: • Test controller • Hardware pattern generator • Hardware response compactor • Testing of BIST hardware • Pin overhead -- At least 1 pin needed to activate BIST operation • Performance overhead – extra path delays due to BIST • Yield loss – due to increased chip area or more chips in system because of BIST • Reliability reduction – due to increased area • Increased BIST hardware complexity – happens when BIST hardware is made testable
BIST Benefits • Faults tested: • Single combinational / sequential stuck-at faults • Delay faults • Single stuck-at faults in BIST hardware • BIST benefits • Reduced testing and maintenance cost • Lower test generation cost • Reduced storage / maintenance of test patterns • Simpler and less expensive ATE • Can test many units in parallel • Shorter test application times • Can test at functional system speed
Pattern Generation • Store in ROM – too expensive • Exhaustive • Pseudo-exhaustive • Pseudo-random (LFSR) – Preferred method • Binary counters – use more hardware than LFSR • Modified counters • Test pattern augmentation • LFSR combined with a few patterns in ROM
Pseudo-Random Pattern Generation • Standard Linear Feedback Shift Register (LFSR) • Produces patterns algorithmically – repeatable • Has most of the desirable properties of random numbers • Need not cover all 2^n input combinations • Long sequences needed for good fault coverage
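Matrix Equation for Standard LFSR

$$
\begin{bmatrix} X_0(t+1) \\ X_1(t+1) \\ \vdots \\ X_{n-3}(t+1) \\ X_{n-2}(t+1) \\ X_{n-1}(t+1) \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & 1 \\
1 & h_1 & h_2 & \cdots & h_{n-2} & h_{n-1}
\end{bmatrix}
\begin{bmatrix} X_0(t) \\ X_1(t) \\ \vdots \\ X_{n-3}(t) \\ X_{n-2}(t) \\ X_{n-1}(t) \end{bmatrix}
$$

X(t + 1) = Ts X(t) (Ts is companion matrix)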
LFSR Implements a Galois Field • Galois field (mathematical system): • Addition operator is XOR (⊕) • Ts companion matrix: • 1st column 0, except nth element which is always 1 (X0 always feeds Xn-1) • Rest of row n – feedback coefficients hi • Rest is identity matrix I – means a right shift • Near-exhaustive (maximal length) LFSR • Cycles through 2^n – 1 states (excluding all-0) • 1 pattern of n 1’s, one of n-1 consecutive 0’s
Standard n-Stage LFSR Implementation • Autocorrelation – any shifted sequence same as original in 2^(n-1) – 1 bits, differs in 2^(n-1) bits • If hi = 0, that XOR gate is deleted
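The autocorrelation property can be checked numerically on a small case. This sketch assumes a 3-stage maximal-length LFSR whose feedback is X2(t+1) = X0(t) XOR X1(t), seeded with 100; the choice of n = 3 and of X0 as the observed output bit are assumptions made to keep the example short.

```python
def lfsr_output_bits(length):
    """Output stream (bit X0) of a 3-stage LFSR with X2 <- X0 xor X1, seed 100."""
    x0, x1, x2 = 1, 0, 0
    bits = []
    for _ in range(length):
        bits.append(x0)
        x0, x1, x2 = x1, x2, x0 ^ x1
    return bits

n = 3
period = 2**n - 1                      # 7 bits in one full cycle
sequence = lfsr_output_bits(period)

results = []
for shift in range(1, period):
    shifted = sequence[shift:] + sequence[:shift]     # cyclic shift
    agree = sum(a == b for a, b in zip(sequence, shifted))
    results.append((shift, agree, period - agree))

print(sequence)   # [1, 0, 0, 1, 0, 1, 1]
print(results)    # every shift: 3 agreements (2^(n-1) - 1), 4 disagreements (2^(n-1))
```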
LFSR Theory • Cannot initialize to all 0’s – hangs • If X is initial state, progresses through states X, Ts X, Ts^2 X, Ts^3 X, … • Described by characteristic polynomial: f(x) = |Ts – I x| = 1 + h1 x + h2 x^2 + … + hn-1 x^(n-1) + x^n
Example External XOR LFSR • Characteristic polynomial f(x) = 1 + x + x^3 (read taps from right to left)
External XOR LFSR • Pattern sequence for example LFSR (earlier), states (X0 X1 X2): 100, 001, 010, 101, 011, 111, 110, 100, 001

$$
\begin{bmatrix} X_0(t+1) \\ X_1(t+1) \\ X_2(t+1) \end{bmatrix}
=
\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\begin{bmatrix} X_0(t) \\ X_1(t) \\ X_2(t) \end{bmatrix}
$$

• Always have 1 and x^n terms in polynomial • Never repeat an LFSR pattern more than 1 time – repeats same error vector, cancels fault effect …
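The pattern sequence of this example can be regenerated by repeated multiplication with the companion matrix over GF(2). A minimal sketch, assuming the 3-stage LFSR with f(x) = 1 + x + x^3 (h1 = 1, h2 = 0) and seed state 100; the helper name `step` is hypothetical.

```python
# Companion matrix Ts for n = 3, h1 = 1, h2 = 0
TS = [
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
]

def step(matrix, state):
    """One clock: matrix-vector product with AND as multiply and XOR as add."""
    return [sum(m & s for m, s in zip(row, state)) % 2 for row in matrix]

state = [1, 0, 0]
patterns = []
for _ in range(8):
    patterns.append("".join(str(bit) for bit in state))
    state = step(TS, state)

print(patterns)
# ['100', '001', '010', '101', '011', '111', '110', '100']

print(step(TS, [0, 0, 0]))   # [0, 0, 0]: the all-0 state never changes
```

The sequence returns to 100 after 2^3 - 1 = 7 clocks, and the all-0 state maps to itself, which is why the register must not be initialized to all 0's.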
Modular Internal XOR LFSR • Described by companion matrix Tm = Ts^T (transpose of Ts) • Internal XOR LFSR – XOR gates in between D flip-flops • Equivalent to standard External XOR LFSR • With a different state assignment • Faster – usually does not matter • Same amount of hardware • X(t + 1) = Tm × X(t) • f(x) = |Tm – I x| = 1 + h1 x + h2 x^2 + … + hn-1 x^(n-1) + x^n
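Modular LFSR Matrix

$$
\begin{bmatrix} X_0(t+1) \\ X_1(t+1) \\ X_2(t+1) \\ \vdots \\ X_{n-3}(t+1) \\ X_{n-2}(t+1) \\ X_{n-1}(t+1) \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & 1 \\
1 & 0 & 0 & \cdots & 0 & 0 & h_1 \\
0 & 1 & 0 & \cdots & 0 & 0 & h_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 0 & h_{n-3} \\
0 & 0 & 0 & \cdots & 1 & 0 & h_{n-2} \\
0 & 0 & 0 & \cdots & 0 & 1 & h_{n-1}
\end{bmatrix}
\begin{bmatrix} X_0(t) \\ X_1(t) \\ X_2(t) \\ \vdots \\ X_{n-3}(t) \\ X_{n-2}(t) \\ X_{n-1}(t) \end{bmatrix}
$$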
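The relation between the two forms can be checked on a small case. This sketch assumes n = 3 with h1 = 1 and h2 = 0 (f(x) = 1 + x + x^3) and builds the modular matrix as the transpose of the standard one; the helper names are hypothetical. Both registers cycle through the same 7 nonzero states, in a different order.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def step(matrix, state):
    """One clock: matrix-vector product over GF(2)."""
    return [sum(m & s for m, s in zip(row, state)) % 2 for row in matrix]

def cycle(matrix, start):
    """States visited from `start` until the register returns to it."""
    states = [tuple(start)]
    state = step(matrix, start)
    while tuple(state) != tuple(start):
        states.append(tuple(state))
        state = step(matrix, state)
    return states

TS = [[0, 1, 0],
      [0, 0, 1],
      [1, 1, 0]]          # standard (external XOR) form
TM = transpose(TS)        # modular (internal XOR) form

external = cycle(TS, [1, 0, 0])
modular = cycle(TM, [1, 0, 0])

print(len(external), len(modular))     # 7 7
print(set(external) == set(modular))   # True: same states
print(external == modular)             # False: different state order
```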
Example Modular LFSR • f(x) = 1 + x^2 + x^7 + x^8 • Read LFSR tap coefficients from left to right
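VHDL Implementation

    library ieee;
    use ieee.std_logic_1164.all;

    -- Entity name and port list are assumed; the slide shows only the architecture body.
    entity lfsr is
      port (
        clock    : in  std_ulogic;
        reset    : in  std_ulogic;
        data_out : out std_ulogic_vector(9 downto 0)
      );
    end lfsr;

    architecture modular of lfsr is
      signal lfsr_reg : std_ulogic_vector(9 downto 0);
    begin
      process (clock)
        variable lfsr_tap : std_ulogic;
      begin
        if clock'EVENT and clock = '1' then
          if reset = '1' then
            -- Synchronous reset to all 1's (an all-0 state would hang the LFSR)
            lfsr_reg <= (others => '1');
          else
            -- Feedback bit: XOR of stages 6 and 9
            lfsr_tap := lfsr_reg(6) xor lfsr_reg(9);
            -- Shift left by one and insert the feedback bit at the LSB
            lfsr_reg <= lfsr_reg(8 downto 0) & lfsr_tap;
          end if;
        end if;
      end process;
      data_out <= lfsr_reg;
    end modular;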
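A quick way to sanity-check the register update is a bit-level software model. The Python sketch below mirrors the shift and feedback of the VHDL process (tap = bit 6 XOR bit 9, shifted in at bit 0) and counts the clocks needed to return to the all-1 reset state; the function names are hypothetical and the model ignores the reset input.

```python
MASK = 0x3FF   # 10-bit register

def lfsr10_step(reg):
    """One clock: tap = bit6 xor bit9, shift left, insert tap at bit 0."""
    tap = ((reg >> 6) ^ (reg >> 9)) & 1
    return ((reg << 1) & MASK) | tap

def lfsr10_period(seed=MASK):
    """Number of clocks until the register returns to the seed state."""
    reg = lfsr10_step(seed)
    count = 1
    while reg != seed:
        reg = lfsr10_step(reg)
        count += 1
    return count

print(hex(lfsr10_step(0x3FF)))   # 0x3fe: first state after reset
print(lfsr10_period())           # 1023 = 2^10 - 1, a maximal-length sequence
print(lfsr10_step(0))            # 0: the all-0 state hangs
```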