Testing and Diagnosis of Interconnect Faults in Cluster-Based FPGA Architectures

David Mohabir

University of Arizona

March 19th, 2012


Testing and diagnosis of interconnect faults in cluster-based FPGA architectures

Section 1


Motivation

  • Quickly identify faulty components

  • Design new, efficient testing methodologies to offset the complexity of FPGA testing as compared to ASIC testing

    • Knowing where a defect is located matters because an FPGA can be reconfigured to avoid the faulty resources

    • Increased test generation complexity

    • Increased test application time

    • Multiple configurations to test assortment of switch settings


Limitations

  • High complexity for test generation

  • Increased test application time

  • Need for external controllability and observability

  • Multiple configurations to test assortment of switch settings, compared to a single configuration for an ASIC

    • As FPGAs have more programmable switch points, this becomes a bigger issue


Previous and related work

  • FPGA testing has been divided into interconnect testing and FPGA logic testing

  • Reduction in the need for I/O pads for testing

    • Several configurations are required to ensure all FPGA logic is tested in some configuration

  • Unutilized FPGA logic and routing are being used to implement modular redundancy

  • Faults can be targeted for the entire FPGA structure, or those that are application-specific


Related work (cont'd)

  • Need for external controllability and observability has also been reduced using iterative logic array (ILA) test architecture

    • one-dimensional configuration with one direction for signal propagation

    • A complete array of m x m LUT/RAM modules requires 4 test configurations independent of size of array and of modules [11]

  • Problems of defining a set of test configurations for cluster-based architectures and diagnosis


Related work (cont'd)

  • The use of LUTs with logic checkers to implement testing schemes in interconnects

  • Using LUTs to form shift registers to easily check the output of the test pattern

  • Built-in Self Test (BIST) architecture to locate any single faulty PLB and most combinations of multiple faulty PLBs

    • This tests FPGA logic

  • Cluster-based FPGA test methodologies

    • Do not cover specific extra-cluster faults


Geometric Scaling

  • Increased defect rates

  • Increased device variation

  • Increased change in device parameters

  • Increased single die capacity

  • Increased susceptibility to transient upsets


Defect Tolerance

  • If device failure renders a bitop or an interconnect unusable, the device should be reconfigured to avoid these failing areas

    • Substitute good resources for bad ones

    • As defect rates increase, spare resources should be strategically reserved


Interchangeable LUTs


Interchangeability

  • Not all unused units will be substitutable, as location strongly affects interconnections to other logic blocks

  • Preferable to have fewer large pools of mostly interchangeable resources


Cluster-based architectures

  • Primitive logic components are grouped into coarse-grained clusters

  • Richness of internal connectivity means large range of potential interconnect patterns

  • External access to internal test points becomes increasingly difficult as device sizes scale

  • Cluster I/O are the input and output pins of the cluster

  • Tile I/O pins include the endpoint of wire segments which can connect to a neighboring tile via programmable interconnect points


Structure


Built-in Self Test

  • BIST overhead is not an issue

    • Test logic is easily inserted and removed by reconfiguration

  • Test logic inside the FPGA enables test access to internal components

  • Each BISTER is composed of

    • Test pattern generator

    • Output response analyzer

    • Two blocks under test


BISTER test structure


BISTER


BIST strategy

  • To guarantee testing of all tiles, the FPGA is reconfigured to shift the BISTERs across the entire array

    • All tiles will be tested by acting as a BUT

  • Perimeter tiles are tested by using the I/O pads to access the periphery

  • Total test application time is related to the area of the TPG/ORA logic

  • Decomposes the problem into many identical problems of a size which is determined by the test requirements for a single tile
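
The shifting idea can be illustrated with a toy model. The sketch below assumes a single row of tiles and a BISTER laid out as TPG, BUT, BUT, ORA; that layout, the constant `ROLES`, and both function names are hypothetical and are not taken from the slides. It ignores perimeter tiles, which the slide says are reached through the I/O pads.

```python
# Hypothetical 1-D layout: each BISTER spans four tiles in the order
# TPG, BUT, BUT, ORA. This order is an assumption made for illustration.
ROLES = ("TPG", "BUT", "BUT", "ORA")


def roles_for_shift(num_tiles, shift):
    """Role of every tile in a row when the BISTER pattern is shifted by `shift` tiles."""
    return [ROLES[(tile - shift) % len(ROLES)] for tile in range(num_tiles)]


def shifts_needed(num_tiles):
    """Apply shifts 0, 1, 2, ... until every tile has acted as a BUT; return the shifts used."""
    tested = set()
    used = []
    for shift in range(len(ROLES)):
        used.append(shift)
        roles = roles_for_shift(num_tiles, shift)
        tested |= {tile for tile, role in enumerate(roles) if role == "BUT"}
        if len(tested) == num_tiles:
            break
    return used


print(roles_for_shift(4, 0))  # ['TPG', 'BUT', 'BUT', 'ORA']
print(shifts_needed(8))       # [0, 1, 2]
```

In this toy model the number of reconfigurations depends only on the BISTER pattern, not on the row length, which is the sense in which the problem decomposes into identical per-tile problems.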


Interconnect Fault Detection

  • High density of internal cluster interconnect makes test access difficult

  • Must test intra-cluster interconnect and extra-cluster interconnect

  • Four classes of faults

    • Permanent connection

      • Detected with the PIP configured off

    • Permanent disconnection

      • Detected with the PIP configured on

    • Stuck-at 0

    • Stuck-at 1


Detection and Diagnosis

  • Defines testability and diagnosis requirements of each fault and fault pair

  • Some test pattern must exist to detect each fault and differentiate each fault pair

  • All LUTs are configured as 4 input XOR gates

  • The detectability of each fault can be expressed as a function of the tile I/O


Fault Detection Conditions

  • Faulty line segment s1 must be both controllable by at least one tile input and observable by at least one tile output
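
A minimal sketch of this condition, assuming each segment is described by the set of tile inputs that control it and the set of tile outputs that observe it in a given configuration. The `Segment` type and the pin names are illustrative, not taken from the paper.

```python
from typing import FrozenSet, NamedTuple


class Segment(NamedTuple):
    name: str
    control: FrozenSet[str]  # tile inputs that can set the segment's value
    observe: FrozenSet[str]  # tile outputs where a fault effect on the segment appears


def stuck_at_detectable(seg: Segment) -> bool:
    """A stuck-at fault needs at least one controlling input and one observing output."""
    return len(seg.control) > 0 and len(seg.observe) > 0


s1 = Segment("s1", frozenset({"in0"}), frozenset({"out2"}))
s2 = Segment("s2", frozenset({"in1"}), frozenset())  # driven, but never observed

print(stuck_at_detectable(s1))  # True
print(stuck_at_detectable(s2))  # False
```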


Fault Detection Conditions (cont'd)

  • Both segments of a faulty pair must be controllable, separately controllable from each other, and observable

  • The PIP between the two segments must be switched off


Fault Detection Conditions (cont'd)

  • If s2 is the floating segment, then the non-floating segment must be controllable and the floating segment must be observable

  • PIP between the two segments must be switched on
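
The two PIP-related conditions can be written as predicates over the same control and observe sets. This is a sketch under one simplifying assumption: with every LUT configured as an XOR, two segments are treated as separately controllable when their control sets differ. The `Segment` type and function names are illustrative.

```python
from typing import FrozenSet, NamedTuple


class Segment(NamedTuple):
    name: str
    control: FrozenSet[str]  # tile inputs that can set the segment's value
    observe: FrozenSet[str]  # tile outputs that observe the segment


def permanent_connection_detectable(s1: Segment, s2: Segment, pip_on: bool) -> bool:
    """PIP configured off; both segments controllable, separately controllable, observable."""
    both_controllable = bool(s1.control) and bool(s2.control)
    # Assumption: different control sets let the two segments take opposite values.
    separately_controllable = s1.control != s2.control
    both_observable = bool(s1.observe) and bool(s2.observe)
    return (not pip_on) and both_controllable and separately_controllable and both_observable


def permanent_disconnection_detectable(driver: Segment, floating: Segment, pip_on: bool) -> bool:
    """PIP configured on; driving segment controllable, floating segment observable."""
    return pip_on and bool(driver.control) and bool(floating.observe)


a = Segment("a", frozenset({"in0"}), frozenset({"out0"}))
b = Segment("b", frozenset({"in1"}), frozenset({"out1"}))
c = Segment("c", frozenset({"in0"}), frozenset({"out1"}))  # same control set as a

print(permanent_connection_detectable(a, b, pip_on=False))    # True
print(permanent_connection_detectable(a, c, pip_on=False))    # False
print(permanent_disconnection_detectable(a, b, pip_on=True))  # True
```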


Interconnect Fault Equivalence

  • Equivalent faults cannot be differentiated

  • Fault equivalence is determined by the FPGA configuration

    • Faults that are equivalent in one configuration may not be equivalent in another

  • Maximum diagnostic resolution is achieved when every pair of faults is non-equivalent in at least one configuration

  • Two faults are equivalent if their corresponding faulty machines produce the same output with all possible test patterns, at all outputs of the circuit

  • Two segments are test equivalent in a configuration if the segments have identical control sets and identical observe sets
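
As an illustration of the last two ideas, the sketch below checks segment test equivalence per configuration and then measures how many segment pairs are separated by at least one configuration. It uses segment pairs as a stand-in for fault pairs, and the data layout (a dict from segment name to a control-set/observe-set tuple) is an assumption made for the example.

```python
from itertools import combinations


def is_test_equivalent(config, a, b):
    """Segments are test equivalent if control sets and observe sets are both identical."""
    return config[a] == config[b]


def differentiated_pairs(configs, segments):
    """Pairs that are non-equivalent in at least one configuration."""
    return [
        pair
        for pair in combinations(segments, 2)
        if any(not is_test_equivalent(cfg, *pair) for cfg in configs)
    ]


def diagnostic_resolution(configs, segments):
    """Fraction of segment pairs differentiated across all configurations."""
    total = len(list(combinations(segments, 2)))
    return len(differentiated_pairs(configs, segments)) / total


fs = frozenset
config_a = {
    "s1": (fs({"in0"}), fs({"out0"})),
    "s2": (fs({"in0"}), fs({"out0"})),  # equivalent to s1 in this configuration
    "s3": (fs({"in1"}), fs({"out0"})),
}
config_b = {
    "s1": (fs({"in0"}), fs({"out0"})),
    "s2": (fs({"in0"}), fs({"out1"})),  # observed elsewhere, so no longer equivalent to s1
    "s3": (fs({"in1"}), fs({"out0"})),
}
segments = ["s1", "s2", "s3"]

print(differentiated_pairs([config_a], segments))             # [('s1', 's3'), ('s2', 's3')]
print(diagnostic_resolution([config_a, config_b], segments))  # 1.0
```

Adding `config_b` separates the one pair that `config_a` could not, which is the point of using multiple configurations for diagnosis.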


Interconnect Fault Equivalence (cont'd)

  • Two segments are test equivalent when they are controlled by the same set of tile inputs and observed by the same set of tile outputs


Interconnect Fault Equivalence (cont'd)

  • Each segment in a faulty segment pair must be test equivalent to a segment in the other faulty segment pair


Interconnect Fault Equivalence (cont'd)

  • Pair of faults may be equivalent if a segment which is not driven by a signal floats to a ‘v’ value

  • The two faults are equivalent if the floating segment is test equivalent to the segment associated with the stuck-at ‘v’ fault

  • The segment with the stuck-at fault and the floating segment must be controlled by the same set of tile inputs and observed by the same set of tile outputs


Interconnect Fault Equivalence (cont'd)

  • The pair of segments involved in one fault are test equivalent to the pair of segments involved in the other fault

  • Each segment in a faulty segment pair must be test equivalent to a segment in the other faulty segment pair


Test Configurations

  • Identifies a set of configurations for the tiles acting as BUTs in a BISTER

  • Size of configuration should be minimized to reduce test application time

  • Intra-cluster configurations are defined separately from extra-cluster configurations


Intra-Cluster Configurations

  • Fault effect on a cluster input must propagate to at least one cluster output

  • Cluster outputs must be separately controllable


BLE configurations

  • Observability of cluster inputs and BLE output branches must be achieved by propagating fault effects

  • Controllability of the BLE outputs must be achieved through the BLEs

  • Each BLE is composed of a LUT and a multiplexer

    • Both must be configured

    • Each LUT acts as a 4-input XOR gate

    • Good controllability because output value can be determined by controlling any single input

    • Good observability because a fault effect on any input will propagate to output

  • Majority of test configurations bypass the flip-flop

    • A single configuration will test the interconnect associated with the flip-flops
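
The observability claim for XOR-configured LUTs can be checked exhaustively. The sketch below flips each input of a 4-input function for every input vector and reports whether the output always changes; the AND gate is included only as a contrast case and is not part of the method described in the slides.

```python
from itertools import product


def xor_lut(inputs):
    """4-input XOR: the output is the parity of the inputs."""
    out = 0
    for bit in inputs:
        out ^= bit
    return out


def and_lut(inputs):
    """4-input AND, used only as a contrast case."""
    return int(all(inputs))


def always_propagates(lut, width=4):
    """True if flipping any single input changes the output for every input vector."""
    for vector in product((0, 1), repeat=width):
        for i in range(width):
            flipped = list(vector)
            flipped[i] ^= 1
            if lut(flipped) == lut(vector):
                return False
    return True


print(always_propagates(xor_lut))  # True
print(always_propagates(and_lut))  # False
```

The same property gives controllability: because any single input toggles the XOR output, the output can be set to either value through whichever input is reachable.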


BLE input multiplexer configurations

  • Input muxes determine controllability of BLE outputs by determining the function which defines the output of each BLE ‘n’

  • BLE output function:

    • All inputs XORed together

  • Multiplexers are not configured to create loops

  • All BLE outputs are separately controllable from each other, and from all cluster inputs

  • Each input multiplexer is configured to select data from each of its inputs in at least one configuration

  • There is a sensitized path from each cluster input stem to a cluster output in every configuration
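
One of these requirements, that each input multiplexer selects every one of its inputs in at least one configuration, can be met with a simple round-robin schedule. The sketch below is not the presentation's Algorithm 1, which the transcript does not reproduce; it enforces only the coverage requirement and does not check for loops, separate controllability, or sensitized paths. The multiplexer and signal names are made up.

```python
def round_robin_mux_configs(mux_sources):
    """Build configurations in which configuration k selects source k (mod fan-in) on each mux."""
    num_configs = max(len(sources) for sources in mux_sources.values())
    return [
        {mux: sources[k % len(sources)] for mux, sources in mux_sources.items()}
        for k in range(num_configs)
    ]


def every_input_selected(mux_sources, configs):
    """Check that each mux selects each of its sources in at least one configuration."""
    return all(
        {cfg[mux] for cfg in configs} == set(sources)
        for mux, sources in mux_sources.items()
    )


# Hypothetical cluster fragment: two input muxes with different fan-in.
mux_sources = {
    "ble0.a": ["I0", "I1", "ble1.out"],
    "ble0.b": ["I2", "I3"],
}
configs = round_robin_mux_configs(mux_sources)

print(len(configs))                                # 3
print(every_input_selected(mux_sources, configs))  # True
```

Under this schedule the number of configurations equals the largest multiplexer fan-in, which gives a rough lower bound on how many intra-cluster configurations the coverage requirement alone demands.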


Algorithm 1


Input Multiplexer configurations


Extra-Cluster Configurations

  • Defines current flow paths through the extra-cluster interconnect

  • Modeled as a flow graph

  • Create flow paths between tile I/O nodes which allow the detection criteria of each fault to be satisfied in at least one configuration

  • Flow paths are created from tile I/Os to every cluster input, and from every cluster output to tile I/Os
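
A flow path in such a graph can be found with an ordinary breadth-first search. The sketch below assumes nodes are wire segments or pins and that each directed edge is a PIP that may be switched on; the graph, node names, and `find_flow_path` are illustrative and do not reproduce the presentation's Algorithms 2 and 3.

```python
from collections import deque


def find_flow_path(graph, sources, target):
    """Breadth-first search from any source node to the target; returns the node path or None."""
    parent = {src: None for src in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical extra-cluster fragment: each edge stands for one PIP.
graph = {
    "tile_in_W0": ["h_track0"],
    "h_track0": ["v_track1", "cluster_in0"],
    "v_track1": ["cluster_in1"],
}

path = find_flow_path(graph, ["tile_in_W0"], "cluster_in1")
print(path)  # ['tile_in_W0', 'h_track0', 'v_track1', 'cluster_in1']

# The PIPs to switch on are the consecutive node pairs along the path.
pips_on = list(zip(path, path[1:]))
```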


Transparent Extra-Cluster Configuration


Algorithm 2


Algorithm 3


Results

  • Assumptions

    • Cluster inputs and outputs are equally distributed around the sides of the cluster

    • Each cluster I/O on the north face may connect to all horizontal tracks via a set of PIPs

    • West face I/O connects to all vertical tracks

    • Cluster I/O for east and south faces connect directly to tracks in neighboring tiles

  • Results

    • Intra-cluster configurations, and two sets of extra-cluster configurations

      • Extra-Cluster (specific) is for when the fault independent algorithm has reached its coverage limit

    • By using the fault specific extra-cluster configuration algorithm, 100% fault coverage can be guaranteed

      • At a cost of increased number of configurations

    • Fault Coverage Achieved

    • Percent of fault pairs which are differentiated across all configurations

  • A small set of test configurations can detect and diagnose nearly all targeted interconnect faults


Results


Summary

  • Approach is comprehensive and can guarantee 100% fault detection

    • Does require a good deal of computation time for the extra-cluster configurations

  • Does a good job of describing fault classes

    • I personally believe they could have described it using less mathematical jargon, so that it would make more sense to a digital logic engineer

  • Algorithms are described neatly in pseudocode

  • All details are covered


Discussion topics

Section 2


Discussion #1

  • Let’s discuss the logical ways to test circuitry for the various faults

    • Permanent open

    • Permanent closed

    • Stuck-at 0

    • Stuck-at 1

  • How could you design test patterns without access to all internal signals?


Discussion #2

  • Algorithms

    • Intra-cluster

    • Extra-cluster


Discussion #3

  • Defect mapping

  • Annealing placers

    • Mark the physical locations of defective units as

      • Costly

      • Invalid

  • Routers

    • Mark wires and switches that are defective as

      • In use

      • High cost

  • Avoids these defective components of the FPGA
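
For the router side, a minimal sketch of both marking strategies using a shortest-path search. Defective edges are either skipped (treated as "in use") or given an added penalty ("high cost"). The graph, node names, and the `route` function are hypothetical; a production FPGA router is considerably more involved.

```python
import heapq


def route(graph, src, dst, defective=frozenset(), defect_cost=None):
    """Dijkstra over graph[node] = [(neighbor, cost), ...].

    Edges listed in `defective` are skipped when defect_cost is None ("in use"),
    or penalized by defect_cost otherwise ("high cost").
    """
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph.get(u, ()):
            if (u, v) in defective:
                if defect_cost is None:
                    continue
                cost += defect_cost
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


graph = {
    "A": [("B", 1), ("C", 2)],
    "B": [("D", 1)],
    "C": [("D", 2)],
}

print(route(graph, "A", "D"))                                            # ['A', 'B', 'D']
print(route(graph, "A", "D", defective={("A", "B")}))                    # ['A', 'C', 'D']
print(route(graph, "A", "D", defective={("A", "B")}, defect_cost=100))   # ['A', 'C', 'D']
```

Skipping guarantees the defective resource is never used; the penalty variant still permits it when no other route exists, which may or may not be acceptable depending on the defect.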


Discussion #4

  • Parity

