Fault Tolerant FPGA Co-processing Toolkit

Douglas Michael DiSabello

Oral defense in partial fulfillment of the requirements for the degree of

Master of Science 2006


Overview

  • Computation in space radiation environments is slow compared to terrestrial computation

  • FPGAs can close and even exceed this gap for specific applications

  • The Fault Tolerant FPGA Co-processing Toolkit provides a general FPGA co-processing capability for space-based applications


Outline

  • The Space Computation Performance Gap

  • Ionizing Radiation and FPGA Background

  • Development Board

  • Fault Tolerant Co-Processing Toolkit

    • Support System

    • Fault Tolerant Support System

    • End User Environment


The Space Computation Performance Gap

  • Radiation Hardened Microprocessor

    • BAE Systems 133MHz PowerPC

    • Special and proprietary design techniques

  • Laptop running this presentation

    • 1,500 MHz PowerPC


Computation Gap Problems

  • Bare minimum of computations performed in space

  • Large and slow data transfers to limited number of receiving stations


Field Programmable Gate Arrays

  • 200- to 1600-fold speedups

    • Exploits fine grain parallelism of algorithms

  • Speed up computations in space beyond what can even be offered by normal terrestrial computers

  • Fast development cycles

  • In flight reprogramming to adapt to changing mission requirements


FPGA Details

  • SRAM Based

    • Configuration memory stores and implements design

    • Programmed using JTAG or SelectMap interfaces

  • Configurable Logic Blocks

    • Look Up Tables and supporting logic

  • Input/Output Blocks

    • Used for all general I/O package pins

  • Block RAMs

  • General Routing Matrix


Ionizing Radiation

  • Low Earth Orbit contains ionizing particles trapped in the Van Allen Belts

  • Particles cause direct and secondary nuclear reactions in silicon substrate

  • Decreasing feature sizes and shrinking threshold voltages increase probability of these interactions causing errors in circuits


Ionizing Radiation Definitions

  • Single Event Transient

    • Temporary change in logic value

  • Single Event Upset

    • SET that is latched into a memory

  • Single Event Functional Interrupt

    • Component stops service


Ionizing Radiation and FPGAs

  • Architecture

    • SEUs in configuration memory

    • Instantiated design changes

    • Usually results in a SEFI

  • Data

    • SEUs in flip-flops, latches, BRAMs, etc.

    • Incorrect computation results and/or SEFIs

  • Off Chip Communication

    • Non-dedicated, configurable input/output package pins

    • SEUs to input/output blocks can disable a package pin


Typical FPGA Cross Section

  • Configuration Memory accounts for 91% of a typical FPGA cross section

    • 78% - 84.8% Routing Structure

    • 20% Control bits and CLB LUT values

Michael Caffrey, Paul Graham, Eric Johnson, Michael Wirthlin, Nathan Rollins, and Carl Carmichael, “Single-Event Upsets in SRAM FPGAs,” MAPLD, Sep. 2002


Fault Mitigation Techniques

  • Scrubbing fixes architectural upsets

    • Continuously rewrite static portions of configuration memory

    • Active partial reconfiguration bitstream

    • SEUs are corrected at the scrub rate, so the scrub interval should be shorter than the expected time between upsets for the given orbit


Fault Mitigation Techniques

  • Triple Modular Redundancy

    • Allows continuous service when architectural upsets occur

    • Majority Voters determine final output

    • Inherent data redundancy
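
The majority voting described above can be modeled in software as a bitwise 2-of-3 function. The following is a minimal C++ sketch for illustration only, not the toolkit's VHDL voter; the names `majority` and `vote_streams` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bitwise 2-of-3 majority: each output bit takes the value that at least
// two of the three redundant inputs agree on.
constexpr std::uint8_t majority(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    return static_cast<std::uint8_t>((a & b) | (a & c) | (b & c));
}

// Votes three redundant byte streams element by element.
inline std::vector<std::uint8_t> vote_streams(const std::vector<std::uint8_t>& a,
                                              const std::vector<std::uint8_t>& b,
                                              const std::vector<std::uint8_t>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    std::vector<std::uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = majority(a[i], b[i], c[i]);
    }
    return out;
}
```

Because the vote is taken per bit, upsets in two different copies are still masked as long as they hit different bit positions.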


Hardware Development Board

  • Design developed by the Naval Postgraduate School

  • Modified by the Naval Research Laboratory to include a Virtex II FPGA

  • Designed for Configurable Fault Tolerant Computing


Hardware Development Board

[Block diagram. Labels: Support FPGA, Co-processing FPGA, Xilinx Virtex FPGA, Xilinx Virtex II FPGA, SelectMap (two instances), General I/O, EEPROM, FLASH, JTAG, PC104/ISA Bus, Embedded x86 PC, "Only off-chip memory"]


Development Board Radiation Testing

  • NPS and NRL conducted tests at the Crocker Nuclear Lab, U.C. Davis

  • Protons were emitted from a cyclotron to interact with the FPGAs

  • Both the Virtex and Virtex II were irradiated


Development Board Radiation Testing

  • Results:

    • 1 upset every 5 days for the Virtex in orbit

    • An order of magnitude higher upset rate for the Virtex II in orbit

    • Scrubbing was proven to repair configuration memory upsets

    • TMR was proven to allow continuous service between SEU scrubbing intervals

James C. Coudeyras, “Radiation Testing of the Configurable Fault Tolerant Processor (CFTP) for Space-Based Applications,” Thesis, United States Naval Postgraduate School, 2005
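
To get a feel for how the scrub interval relates to the upset rate quoted above, the sketch below estimates the chance that two or more upsets land inside a single scrub interval. It assumes upsets arrive as a Poisson process and treats the quoted rate as a whole-device rate; both are assumptions of this example, not statements from the thesis, and `prob_multiple_upsets` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Probability that two or more upsets arrive within one scrub interval,
// assuming upsets follow a Poisson process (an assumption of this sketch).
inline double prob_multiple_upsets(double upsets_per_day, double scrub_interval_s) {
    assert(upsets_per_day >= 0.0 && scrub_interval_s >= 0.0);
    // Expected number of upsets in one scrub interval.
    const double mean = upsets_per_day * scrub_interval_s / 86400.0;
    // P(N >= 2) = 1 - e^{-m} (1 + m), written with expm1 to limit
    // rounding error when m is very small.
    return -std::expm1(-mean) - mean * std::exp(-mean);
}
```

With 0.2 upsets per day (1 upset every 5 days) and a hypothetical 60-second scrub interval, the result is on the order of 1e-8 per interval, which illustrates why a scrub interval much shorter than the mean time between upsets leaves TMR with at most one faulty copy nearly all the time.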


The Toolkit Objective

  • Build a suite of VHDL designs, C++ software, and tools to give a general FPGA co-processing capability

  • Modular design for easy integration into new hardware platforms and with new HDL modules

  • Allow designers to concentrate mainly on the co-processing algorithm and design instead of Fault Tolerance


How Objective was Achieved

  • Create a support system to interface between all components

  • Modify the support system into a Fault Tolerant version

  • Provide co-processing designer templates and interfaces


Support System

  • Interpret and execute all commands from a host embedded computer

  • Route data between all components on the board

  • Program FPGAs

  • Software for embedded x86 to interact with support system instantiated into an FPGA


Support System

[Block diagram. Labels: Virtex SelectMap Interface, Virtex II SelectMap Interface, Flash Interface, Flash Arbitrator, Flash Control Interface, Inter-FPGA Communication Interface, PC104/ISA Bus Interfaces]


PC104 / ISA Bus Interface

  • Responsible for all data transfers on and off the FPGA board

  • Two addresses from the host PC are used:

    • Data address

      • 8 Bit data words

    • Control address

      • A write causes the interface to reset and send a reset to any other modules

      • A read gives the status of the buffer FIFOs

  • Each main support module uses a copy of this interface


PC104 / ISA Bus Interface

[Block diagram. Blocks: Bus Control Logic, Input FIFO, Output FIFO. Signals: Write Enable, Read Enable, Full, Empty, Address, Bus Read, Bus Write, Bus Data, AEN, Data In, Data Out]


Flash Components

  • Interface

    • Translates commands and data into a series of signals to interact with a Flash chip

  • Control

    • Facilitates interaction between the flash interface and the Bus Interface

  • Arbitrator

    • Each module that requires Flash access is given a priority number

    • A module must relinquish control before another module can be given access
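
The arbitration rule above (fixed priorities, and no new grant until the current owner relinquishes control) can be modeled as follows. This is a behavioral C++ sketch, not the toolkit's VHDL; the class name `FlashArbiter` and the convention that a lower index means higher priority are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Behavioral model of a fixed-priority, non-preemptive arbiter.
// Assumption: module index 0 has the highest priority.
class FlashArbiter {
public:
    enum { kNoOwner = -1 };

    explicit FlashArbiter(std::size_t module_count)
        : requesting_(module_count, false) {}

    // A module raises its request line.
    void request(std::size_t module) {
        assert(module < requesting_.size());
        requesting_[module] = true;
    }

    // A module drops its request; if it owned the Flash, the Flash becomes free.
    void release(std::size_t module) {
        assert(module < requesting_.size());
        requesting_[module] = false;
        if (owner_ == static_cast<int>(module)) owner_ = kNoOwner;
    }

    // One arbitration step: grant the Flash only when it is free, to the
    // highest-priority requester. The current owner is never preempted.
    int step() {
        if (owner_ == kNoOwner) {
            for (std::size_t m = 0; m < requesting_.size(); ++m) {
                if (requesting_[m]) {
                    owner_ = static_cast<int>(m);
                    break;
                }
            }
        }
        return owner_;
    }

private:
    std::vector<bool> requesting_;
    int owner_ = kNoOwner;
};
```

In this model a higher-priority request that arrives while a lower-priority module owns the Flash simply waits until `release` is called, which mirrors the relinquish-before-grant rule.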


SelectMap Interfaces

[Block diagram of the SelectMap Interface. Signals: Flash Address, SelectMap Data Out <7:0>, Flash Interface CMD, Write, Flash Data, Chip Select, Flash Data Valid, Flash Interface Busy, Configuration Clock, Flash Control Request, Current Flash Owner]

  • Control configuration data flow to the FPGAs' configuration interfaces

  • Configuration Clock is used to allow non-uniform data loading

  • Actual FPGA configuration commands are contained in the configuration data

  • Two versions: Virtex and Virtex II (and Virtex 4)


SelectMap Interface Commands

  • Load Start Address

  • Load Stop Address

  • Program Using Flash

  • Load Bus Word Number

  • Program Using Bus

  • Scrub using Flash

  • Abort (Virtex version only)


Inter-FPGA Communication Interface

  • ISA Bus interface modified to account for delays between physical FPGA chips

  • Co-processing FPGA has direct access to BUS and own memory space

  • Designed specifically to use resources of support FPGA


x86 Host PC Programs

  • Flash program

  • Flash verify

  • SelectMap Configuration

  • Scrub On and Scrub Off

  • Co-processing echo check program


Fault Tolerant Support System

  • Add fault tolerance to original support system

  • Specialized fault mitigation techniques

  • FPGA configuration, scrubbing, and BitStream manipulation

  • Support System End User Environment


Support System Main Fault Tolerance Methods

  • Triple Modular Redundancy

    • All modules are made in triplicate

    • Majority voter determines correct output

    • Between HDL modules, three voters are used to preserve the redundancy of signal paths

    • Place and Route to keep redundant modules separate


Majority Voters

  • Tri-State Buffers

    • Not made from SRAM material

    • Only interconnects are susceptible and correct operation still results

    • Takes multiple SEUs for incorrect function

Carl Carmichael, “Triple Modular Redundancy Design Techniques for Virtex FPGAs,” Xilinx Application Note 197, 2001


FSMs and TMR

  • Method keeps FSM synchronized if SEU occurs in state register

  • Outputs are also majority voted

[Diagram. Labels: INPUTS; three copies of Next State Logic; three voters (V); three copies of Current State Register]
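
The resynchronization property can be illustrated with a small software model: each copy of the next-state logic is driven by the voted current state, so a register that has been upset is overwritten with the correct state on the next clock. The example machine (a 2-bit counter) and all names here are hypothetical; this is a sketch of the idea, not the toolkit's HDL.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bitwise 2-of-3 majority vote.
constexpr std::uint8_t vote(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    return static_cast<std::uint8_t>((a & b) | (a & c) | (b & c));
}

// Hypothetical example machine: a 2-bit counter that advances when
// `enable` is set.
constexpr std::uint8_t next_state(std::uint8_t state, bool enable) {
    return enable ? static_cast<std::uint8_t>((state + 1) & 0x3) : state;
}

struct TmrFsm {
    std::array<std::uint8_t, 3> regs{{0, 0, 0}};  // three state registers

    // Voted view of the current state.
    std::uint8_t voted() const { return vote(regs[0], regs[1], regs[2]); }

    // One clock edge: every copy computes its next state from the voted
    // current state, so all three registers reload with the same value.
    void clock(bool enable) {
        std::array<std::uint8_t, 3> next{{0, 0, 0}};
        for (std::size_t i = 0; i < regs.size(); ++i) {
            next[i] = next_state(voted(), enable);
        }
        regs = next;
    }

    // Inject an SEU into one state register by flipping the masked bits.
    void upset(std::size_t copy, std::uint8_t bit_mask) {
        assert(copy < regs.size());
        regs[copy] ^= bit_mask;
    }
};
```

After `upset` corrupts one register, `voted()` still returns the correct state, and a single call to `clock` brings all three registers back into agreement.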


Block RAM Fault Mitigation

  • TMR

    • Necessary for routing structure

    • Inherent data redundancy

    • Method is fine for short term data storage

    • SEUs are not corrected during long term data storage in individual BRAM


BRAM TMR w/ Refresh

  • All BRAMs are dual ported

  • Second ports are used to constantly read data values from three copies, vote, and rewrite the values

  • Data write collision avoidance

  • Not needed for support system, but useful for co-processing applications

Carl Carmichael, “Triple Modular Redundancy Design Techniques for Virtex FPGAs,” Xilinx Application Note 197, 2001
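
The refresh scheme can be modeled as a background pass that walks the address space, votes the three copies, and rewrites the voted value. The sketch below is a behavioral C++ model under stated assumptions: one refresh address per step, and a refresh that is simply skipped when it would collide with a user write to the same address. The class `TmrBram` and its methods are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Behavioral model of three redundant memories with a background refresh.
class TmrBram {
public:
    explicit TmrBram(std::size_t depth) : depth_(depth) {
        assert(depth > 0);
        for (auto& copy : copies_) copy.assign(depth, std::uint8_t{0});
    }

    // User port: a write goes to all three copies.
    void write(std::size_t addr, std::uint8_t value) {
        assert(addr < depth_);
        for (auto& copy : copies_) copy[addr] = value;
        pending_write_ = true;
        pending_addr_ = addr;
    }

    // User port: a read returns the voted value.
    std::uint8_t read(std::size_t addr) const {
        assert(addr < depth_);
        return majority(copies_[0][addr], copies_[1][addr], copies_[2][addr]);
    }

    // Second port: refresh one address per step (read, vote, rewrite).
    // Assumed collision rule: skip the refresh if the user port wrote the
    // same address since the previous step.
    void refresh_step() {
        const bool collision = pending_write_ && pending_addr_ == refresh_addr_;
        if (!collision) {
            const std::uint8_t good = read(refresh_addr_);
            for (auto& copy : copies_) copy[refresh_addr_] = good;
        }
        pending_write_ = false;
        refresh_addr_ = (refresh_addr_ + 1) % depth_;
    }

    // Inject an SEU into one copy.
    void upset(std::size_t copy, std::size_t addr, std::uint8_t bit_mask) {
        assert(copy < copies_.size() && addr < depth_);
        copies_[copy][addr] ^= bit_mask;
    }

    // Unvoted contents of one copy, for inspection.
    std::uint8_t raw(std::size_t copy, std::size_t addr) const {
        assert(copy < copies_.size() && addr < depth_);
        return copies_[copy][addr];
    }

private:
    static std::uint8_t majority(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
        return static_cast<std::uint8_t>((a & b) | (a & c) | (b & c));
    }

    std::size_t depth_;
    std::array<std::vector<std::uint8_t>, 3> copies_;
    std::size_t refresh_addr_ = 0;
    bool pending_write_ = false;
    std::size_t pending_addr_ = 0;
};
```

Without the refresh, an upset stays in its copy indefinitely and a second upset at the same address in another copy could defeat the vote; the background pass removes the first upset before that can accumulate.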


Specialized Techniques

  • Off FPGA transfers

    • TMR of package pins to a single trace

    • TMR of package pins not available on development board

Carl Carmichael, “Triple Modular Redundancy Design Techniques for Virtex FPGAs,” Xilinx Application Note 197, 2001


Specialized Techniques

  • Inter-FPGA transfers

    • Not enough pins available to triplicate all signals

    • Hamming code with 8 data bits and 4 redundant bits used for data

    • Double Error Detection and Single Error Correction

    • Triplication of all other signals


Specialized Techniques

  • PC104 / ISA Bus transfers

    • Updated fault tolerant module allows for extended data transfer sizes

    • All data sent is encoded in an (8,4) Hamming code

    • No redundancy available for other signals
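
The slides do not give the bit layout of the Hamming code used on the bus. As an illustration of single error correction with double error detection, the sketch below implements the standard extended Hamming (8,4) code, which protects 4 data bits with 4 check bits. The bit ordering, the per-nibble framing, and all names are assumptions of this example rather than the toolkit's actual encoding.

```cpp
#include <cassert>
#include <cstdint>

enum class Status { Ok, Corrected, DoubleError };

struct Decoded {
    std::uint8_t nibble;  // recovered 4 data bits
    Status status;
};

// XOR of all eight bits: 1 when the number of set bits is odd.
inline unsigned parity8(unsigned w) {
    w ^= w >> 4;
    w ^= w >> 2;
    w ^= w >> 1;
    return w & 1u;
}

// Codeword layout (bit index = Hamming position):
//   bit 0: overall parity, bits 1, 2, 4: check bits, bits 3, 5, 6, 7: data.
inline std::uint8_t encode(std::uint8_t nibble) {
    assert(nibble < 16);
    const unsigned d1 = nibble & 1u;
    const unsigned d2 = (nibble >> 1) & 1u;
    const unsigned d3 = (nibble >> 2) & 1u;
    const unsigned d4 = (nibble >> 3) & 1u;
    const unsigned p1 = d1 ^ d2 ^ d4;  // covers positions 3, 5, 7
    const unsigned p2 = d1 ^ d3 ^ d4;  // covers positions 3, 6, 7
    const unsigned p3 = d2 ^ d3 ^ d4;  // covers positions 5, 6, 7
    unsigned word = (p1 << 1) | (p2 << 2) | (d1 << 3) | (p3 << 4) |
                    (d2 << 5) | (d3 << 6) | (d4 << 7);
    word |= parity8(word);  // bit 0 makes the overall parity even
    return static_cast<std::uint8_t>(word);
}

inline Decoded decode(std::uint8_t received) {
    unsigned word = received;
    // The syndrome is the XOR of the positions of all set bits; it is zero
    // for a valid codeword and equals the error position for a single error.
    unsigned syndrome = 0;
    for (unsigned pos = 1; pos < 8; ++pos) {
        if ((word >> pos) & 1u) syndrome ^= pos;
    }
    const bool odd = parity8(word) != 0;
    Status status = Status::Ok;
    if (odd) {
        // Odd overall parity: single-bit error. A zero syndrome means the
        // overall parity bit (bit 0) itself was hit.
        word ^= 1u << syndrome;
        status = Status::Corrected;
    } else if (syndrome != 0) {
        // Even overall parity with a nonzero syndrome: two bits flipped.
        status = Status::DoubleError;
    }
    const unsigned nibble = ((word >> 3) & 1u) | (((word >> 5) & 1u) << 1) |
                            (((word >> 6) & 1u) << 2) | (((word >> 7) & 1u) << 3);
    return Decoded{static_cast<std::uint8_t>(nibble), status};
}
```

A receiver would typically accept `Ok` and `Corrected` words and request a retransmission on `DoubleError`, since a double error is detected but cannot be repaired.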


Flash Memory Data Reads

  • CRC values are embedded after every 512 16-bit flash words by the Flash program

  • The fault tolerant Flash interface uses a BRAM buffer cache to hold each 512-word block of data when data is requested

  • If the CRC value is incorrect, the data cache is flushed and the data is read again
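
The read-check-retry flow above can be sketched in C++ as follows. The slides do not name the CRC polynomial, the position of the CRC word, or a retry limit, so this example assumes CRC-16/CCITT-FALSE, a CRC word stored directly after each 512-word block, and a caller-supplied attempt limit. `BlockReader` stands in for the Flash interface and is a hypothetical type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

constexpr std::size_t kBlockWords = 512;  // 16-bit words per CRC-protected block

// One byte of CRC-16/CCITT-FALSE (polynomial 0x1021, MSB first).
inline std::uint16_t crc16_update(std::uint16_t crc, std::uint8_t byte) {
    crc ^= static_cast<std::uint16_t>(byte << 8);
    for (int bit = 0; bit < 8; ++bit) {
        const bool msb = (crc & 0x8000) != 0;
        crc = static_cast<std::uint16_t>(crc << 1);
        if (msb) crc ^= 0x1021;
    }
    return crc;
}

inline std::uint16_t crc16(const std::vector<std::uint8_t>& bytes) {
    std::uint16_t crc = 0xFFFF;  // initial value for CCITT-FALSE
    for (std::uint8_t b : bytes) crc = crc16_update(crc, b);
    return crc;
}

// CRC over 16-bit words, high byte first (an assumed byte order).
inline std::uint16_t crc16_words(const std::vector<std::uint16_t>& words) {
    std::uint16_t crc = 0xFFFF;
    for (std::uint16_t w : words) {
        crc = crc16_update(crc, static_cast<std::uint8_t>(w >> 8));
        crc = crc16_update(crc, static_cast<std::uint8_t>(w & 0xFF));
    }
    return crc;
}

// Hypothetical raw read: returns kBlockWords data words followed by one CRC word.
using BlockReader = std::function<std::vector<std::uint16_t>(std::size_t)>;

// Fills a cache with one block, checks its CRC, and rereads on mismatch.
inline bool read_block_checked(const BlockReader& read_raw, std::size_t block,
                               int max_attempts, std::vector<std::uint16_t>& out) {
    assert(max_attempts > 0);
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        std::vector<std::uint16_t> cache = read_raw(block);
        if (cache.size() != kBlockWords + 1) continue;  // malformed read
        const std::uint16_t stored = cache.back();
        cache.pop_back();
        if (crc16_words(cache) == stored) {
            out = std::move(cache);
            return true;
        }
        // CRC mismatch: the cache is discarded and the block is read again.
    }
    return false;
}
```

Bounding the number of attempts is a design choice of this sketch; it keeps a permanently corrupted block from stalling the caller forever.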


Configuration and Scrubbing

  • Three modes of configuration determined by the bit file created using Xilinx Bitgen tool

    • Initial Configuration

      • Contains startup commands

    • Reconfiguration

      • Contains shutdown and startup commands

      • Doesn’t require a power cycle, but the FPGA is taken out of service

    • Active Partial Reconfiguration

      • Used for scrubbing

      • Rewrites static portions of bitstream

      • Removes initial BRAM contents

      • Any portion of design could be masked out


Support System End User Environment

  • Flash program and verify host programs

    • Store initial co-processing, co-processing scrubbing, and support scrubbing bitstreams at designated flash addresses

  • Scrub On and Off

    • SelectMap interfaces begin scrubbing at given periods using flash data

    • Configuration can also still be done across the bus

  • C++ example programs and header files for data transfers to the co-processing FPGA


Toolkit User Environment

  • Fault Tolerant Support System

    • Controls all data flow

    • Modular design for quick integration to different physical systems with varying number of FPGAs

    • C++ programs and header files

  • Co-Processing Tools

    • Templates for HDL co-processing components

    • HDL “hooks” for data transfers

    • C++ programs and header files


Future Toolkit Additions

  • Different bus modules for interaction with radiation hardened microprocessors

  • Co-processing library

  • Use support system on single and multiple FPGA boards

