DSP for FPGA

SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications

Miodrag Bolic

Objectives
  • Comparison between PDSP and FPGA
  • Virtex II Pro
  • Altera Stratix FPGA
  • Stratix DSP Block and its configuration
  • Altera design flow
What Is an FPGA?
  • Field Programmable Gate Array
  • Device that Has a Regular Architecture (Set of Blocks) that Can Be Programmed for Various Functions
      • “Glue” Logic
      • Customizable Hardware Solution
      • Configurable Processors
Why Use FPGAs in DSP Applications?
  • 10x More DSP Throughput Than DSP Processors
    • Parallel vs. Serial Architecture
  • Cost-Effective for Multi-Channel Applications
  • Flexible Hardware Implementation
  • Single-Chip Solution
    • System (Hardware/Software) Integration Benefits

[Figure: a DSP system partitioned as a software DSP plus an FPGA, alongside a single FPGA with a software embedded processor]

DSP Processors vs. FPGAs

[Figure: arrays of MAC blocks illustrating the number of parallel MAC units available in each architecture]

High Speed DSP Processor
  • 1-8 Multipliers
    • Needs looping for more than 8 multiplications
  • Needs multiple clock cycles because of serial computation
    • A 200-tap FIR filter would need 25+ clock cycles per sample on a processor with 8 MAC units

High Level of Parallel Processing in FPGA
  • Can implement hundreds of MAC functions in an FPGA
  • Parallel implementation allows for faster throughput
    • A 200-tap FIR filter would need 1 clock cycle per sample
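
The cycle counts quoted for the 200-tap FIR filter follow from dividing the number of taps by the number of MAC units working in parallel. A minimal Python sketch of that arithmetic is given here; the function names are illustrative and not from the slides, and the model ignores loop and memory overhead, which is presumably why the slide says "25+".

```python
import math


def cycles_per_sample(taps: int, mac_units: int) -> int:
    """Clock cycles for one FIR output when `mac_units` MACs run in parallel.

    Idealized model: one multiply-accumulate per MAC unit per cycle,
    no loop or memory-access overhead.
    """
    return math.ceil(taps / mac_units)


def fir_output(coeffs, window):
    """One FIR output sample: the sum of coefficient * delayed-input products."""
    return sum(c * x for c, x in zip(coeffs, window))


print(cycles_per_sample(200, 8))    # DSP processor with 8 MAC units
print(cycles_per_sample(200, 200))  # FPGA with one MAC per tap
```

Each tap contributes one multiply-accumulate in `fir_output`, so the number of taps sets the MAC workload per sample and the degree of parallelism sets how many cycles that workload takes.
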
Extending Range of Altera Reconfigurable DSP Solutions

[Figure: performance (MMACs/sec), with axis marks at 100 and 600, across three implementation styles: Embedded Processors; Embedded Processors with Hardware Acceleration; Complete Hardware Implementation. One region is marked "New!".]

TriMatrix™ Memory [1]
  • M512 Blocks: 512 bits per block + parity
    • Small FIFOs
    • Shift Register
    • Rake Receiver Correlator
    • FIR Filter Delay Line
  • M4K Blocks: 4 Kbits per block + parity
    • Header / Cell Storage
    • Channelized Functions
    • ATM Cell-Packet Processing
    • Nios Program Memory
    • Look-Up Schemes
    • Packet & Cell Buffering
    • Cache
  • M-RAM: 512 Kbits per block + parity
    • Packet / Data Storage
    • Nios Program Memory
    • System Cache
    • Video Frame Buffers
    • Echo Canceller Data Storage
  • Dedicated External Memory Interface
  • Smaller blocks: More Data Ports for Greater Memory Bandwidth
  • Larger blocks: More Bits for Larger Memory Buffering
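
The three block sizes make it easy to estimate how many blocks a given buffer would occupy. The sketch below is a rough capacity calculation only: it assumes 1 Kbit = 1024 bits, ignores the parity bits, and ignores the width and depth configurations that a real device imposes. The buffer sizes are hypothetical examples.

```python
import math

# Data bits per block as listed on the slide (parity bits not counted).
# Assumption: 1 Kbit = 1024 bits.
BLOCK_BITS = {"M512": 512, "M4K": 4 * 1024, "M-RAM": 512 * 1024}


def blocks_needed(buffer_bits: int, block_type: str) -> int:
    """Minimum number of blocks of one type needed to hold `buffer_bits`."""
    return math.ceil(buffer_bits / BLOCK_BITS[block_type])


# Hypothetical buffers, for illustration only.
fir_delay_line_bits = 200 * 18   # 200 taps of 18-bit samples
video_frame_bits = 640 * 480 * 8  # one 8-bit 640x480 frame

print(blocks_needed(fir_delay_line_bits, "M512"))
print(blocks_needed(video_frame_bits, "M-RAM"))
```
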

Logic Element (LE) [2]

[Figure: functional diagram of a logic element. Inputs: data1, data2, data3, data4, addnsub, cin, LUT Chain Input, Register Chain Input, Register Control Signals. Blocks: 4-Input LUT, Sync Load & Clear Logic, D register. Outputs and paths: Row, Column & DirectLink Routing; Local Routing; LUT Chain Output; Register Chain Output; Register Feedback.]

  • Notes:
    • Functional diagram only. Please see the datasheet for more details.
    • addnsub & data1 are connected via XOR logic
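
The 4-input LUT at the core of the LE can be thought of as a 16-entry truth table indexed by data1 to data4. The following is a behavioural sketch in Python, not a description of the actual circuit; the bit ordering of the index and the helper names are assumptions made for illustration.

```python
def make_lut4(func):
    """Build the 16-entry truth table of a 4-input Boolean function."""
    table = []
    for index in range(16):
        d1 = (index >> 0) & 1
        d2 = (index >> 1) & 1
        d3 = (index >> 2) & 1
        d4 = (index >> 3) & 1
        table.append(func(d1, d2, d3, d4) & 1)
    return table


def lut4_eval(table, data1, data2, data3, data4):
    """Look up the output bit; data1 is taken as the least significant index bit."""
    index = data1 | (data2 << 1) | (data3 << 2) | (data4 << 3)
    return table[index]


# Any 4-input function fits in one LUT, for example a 4-input XOR (parity).
xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
print(lut4_eval(xor4, 1, 0, 0, 0))
```

Programming the device amounts to filling in such tables, which is why a single LE can implement any Boolean function of four inputs.
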
Dynamic Arithmetic Mode

[Figure: functional diagram of a logic element in dynamic arithmetic mode. Inputs: data1, data2, data3, addnsub, LAB Carry-In, Carry-In0, Carry-In1, Register Chain Input, Register Control Signals. Blocks: Carry-In Logic, Sum Calculator, Carry Calculator, Carry-Out Logic, Sync Load & Clear Logic, D register. Outputs: Row, Column & DirectLink Routing; Local Routing; Register Chain Output; Carry-Out0; Carry-Out1.]

Note: Functional diagram only. Please see the datasheet for more details.
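
The paired Carry-In0/Carry-In1 and Carry-Out0/Carry-Out1 signals suggest a carry-select arrangement: each LE works out its result for both possible carry-in values, and the LAB Carry-In picks one. The sketch below models that idea in Python under that assumption. It is a simplification (the addnsub control is omitted and the function names are illustrative), so the datasheet remains the reference for the real behaviour.

```python
def full_add(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def le_slice(a, b, carry_in0, carry_in1, lab_carry_in):
    """One LE: compute both candidate results, select the sum with LAB carry-in."""
    sum0, carry_out0 = full_add(a, b, carry_in0)  # chain assuming carry-in = 0
    sum1, carry_out1 = full_add(a, b, carry_in1)  # chain assuming carry-in = 1
    sum_bit = sum1 if lab_carry_in else sum0
    return sum_bit, carry_out0, carry_out1


def add_bits(x, y, width, lab_carry_in):
    """Add two `width`-bit numbers using a chain of LE slices."""
    c0, c1 = 0, 1  # the two speculative carry chains
    result = 0
    for i in range(width):
        a, b = (x >> i) & 1, (y >> i) & 1
        s, c0, c1 = le_slice(a, b, c0, c1, lab_carry_in)
        result |= s << i
    carry_out = c1 if lab_carry_in else c0
    return result, carry_out


print(add_bits(5, 6, 4, 0))
```

Because both chains are evaluated before the real carry-in arrives, the final selection is only a multiplexer delay, which is the usual motivation for carry-select adders.
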

Logic Array Blocks (LAB) [2]
  • 10 LEs
  • Local Interconnect
  • LAB-Wide Control Signals

[Figure: a LAB containing LE1 through LE10. The Local Interconnect, fed by 30 LAB Input Lines and 10 LE Feedback Lines, drives groups of 4 lines into each LE. Control Signals are distributed to the LEs.]


Avalon Switch Fabric Contents
  • The Avalon Switch Fabric provides the following to the peripherals it connects:
    • Data-Path Multiplexing
    • Address Decoding
    • Wait-State Generation
    • Dynamic Bus Sizing
    • Interrupt-Priority Assignment
    • Latent Transfer Capabilities
    • Streaming Read and Write Capabilities
  • The Avalon Switch Fabric tailors transactions to the characteristics of the attached peripherals
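
Dynamic bus sizing is one example of tailoring a transaction to a peripheral: a wide master can read from a narrower slave because the fabric issues several narrow reads and assembles the result. The sketch below illustrates the idea for a 32-bit master and a 16-bit slave. The function, the memory contents, and the little-endian assembly order are assumptions for illustration, not details taken from the slides.

```python
def read_wide(slave_read, slave_width_bits, master_width_bits, word_address):
    """Assemble one master-width word from several narrower slave reads.

    Assumption: the first narrow word read holds the least significant bits.
    """
    ratio = master_width_bits // slave_width_bits
    value = 0
    for i in range(ratio):
        part = slave_read(word_address * ratio + i)
        value |= part << (i * slave_width_bits)
    return value


# Hypothetical contents of a 16-bit wide memory.
flash_words = [0x1234, 0xABCD, 0x0001, 0x0002]

word0 = read_wide(lambda addr: flash_words[addr], 16, 32, 0)
print(hex(word0))
```

The master sees a single 32-bit read; the two 16-bit accesses happen inside the fabric.
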
SOPC Design Example
  • The Avalon Switch Fabric allows masters and slaves to communicate without knowledge of each other's interface details

[Figure: SOPC block diagram built around the Avalon Switch Fabric, with the following labeled components]
  • CPU 32 Bit
    • Inst Master
    • Data Master
  • DMA Controller With Streaming
    • Control Port (Slave)
    • Read Port (Master – Streaming)
    • Write Port (Master – Streaming)
  • Instruction Memory, 32-bit data path
  • Data Memory, 32-bit data path
  • UART
  • VGA Controller
  • Avalon Tri-State Bridge
  • External FLASH, 1 MB, 16-bit data path
  • External SRAM, 256 KB, 32-bit data path

Data Path Multiplexing & Slave Arbitration

[Figure: SOPC system (CPU 32 Bit with Inst and Data Masters, DMA Controller With Streaming, Instruction Memory, Data Memory, UART, VGA Controller, Avalon Tri-State Bridge, External FLASH, External SRAM) with three functions of the Avalon Switch Fabric marked:]
1- Data-Path Multiplexing (MUX)
2- Slave Arbitration (Arbiter)
3- Address Decoding
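
Address decoding and slave arbitration can be illustrated with a small behavioural model. In this sketch the address map is hypothetical (only the 1 MB and 256 KB sizes come from the slide), and round-robin is used purely as an example policy; the slides do not say which arbitration scheme the Avalon Switch Fabric applies.

```python
# Hypothetical address map: (base address, size in bytes, slave name).
ADDRESS_MAP = [
    (0x0000_0000, 0x0010_0000, "flash"),  # 1 MB
    (0x0010_0000, 0x0004_0000, "sram"),   # 256 KB
    (0x0020_0000, 0x0000_0020, "uart"),
]


def decode(address):
    """Address decoding: return (slave name, offset within slave) or None."""
    for base, size, name in ADDRESS_MAP:
        if base <= address < base + size:
            return name, address - base
    return None


class RoundRobinArbiter:
    """Grants one requesting master at a time, rotating priority after each grant."""

    def __init__(self, masters):
        self.masters = list(masters)
        self.next_index = 0

    def grant(self, requests):
        n = len(self.masters)
        for offset in range(n):
            i = (self.next_index + offset) % n
            if self.masters[i] in requests:
                self.next_index = (i + 1) % n
                return self.masters[i]
        return None  # no master is requesting this slave


arbiter = RoundRobinArbiter(["cpu_data", "dma_read"])
print(decode(0x0010_0004))
print(arbiter.grant({"cpu_data", "dma_read"}))
```

One arbiter instance would sit in front of each slave that is shared by several masters, while a multiplexer steers the granted master's data path to that slave.
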

DSP Blocks
  • Eight 9 × 9 bit multipliers
  • Four 18 × 18 bit multipliers
  • One 36 × 36 bit multiplier
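
The ratio of four 18 × 18 multipliers to one 36 × 36 multiplier matches the standard decomposition of a wide product into four half-width partial products. The sketch below shows that decomposition for unsigned operands. It illustrates the arithmetic only; the slides do not describe how the DSP block implements the 36 × 36 mode internally, and the function names are illustrative.

```python
MASK18 = (1 << 18) - 1


def mul18(a, b):
    """An 18 x 18 unsigned multiplier; operands must fit in 18 bits."""
    assert 0 <= a <= MASK18 and 0 <= b <= MASK18
    return a * b


def mul36(a, b):
    """36 x 36 unsigned multiply built from four 18 x 18 partial products."""
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18
    return (
        (mul18(a_hi, b_hi) << 36)
        + ((mul18(a_hi, b_lo) + mul18(a_lo, b_hi)) << 18)
        + mul18(a_lo, b_lo)
    )


a, b = 0x9_8765_4321, 0x1_2345_6789  # arbitrary 36-bit test operands
print(mul36(a, b) == a * b)
```
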
DSP Blocks (cont.)

The DSP block consists of

  • A multiplier block
  • An adder/subtractor/accumulator block
  • A summation block
  • An output interface
  • Output registers
  • Routing and control signals
Stratix DSP Blocks
  • High Performance Dedicated Multiplier Circuitry
    • 18x18 Functions at 280 MHz
  • Variable Operand Widths with Full Precision Outputs
    • 9x9 (8 Max.)
    • 18x18 (4 Max.)
    • 36x36 (1 Max.)
  • Add, Accumulate or Subtract
    • Signed & Unsigned Operations
    • Dynamically Change between Add & Subtract
    • Supports DSP Requirements Including Complex Numbers

[Figure: DSP block data path: Input Register Unit, multipliers with Optional Pipelining, two add/subtract/accumulate units, a summation unit, Output Multiplexer, Output Register Unit]
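
Complex multiplication shows why a block with four multipliers plus add and subtract units is a natural fit: (a + jb)(c + jd) = (ac − bd) + j(ad + bc) needs four real products, one subtraction, and one addition. A short Python sketch of that mapping follows; the function name is illustrative.

```python
def complex_multiply(a, b, c, d):
    """(a + jb) * (c + jd) using four real multiplies, one subtract, one add."""
    real = a * c - b * d  # two products combined by a subtractor
    imag = a * d + b * c  # two products combined by an adder
    return real, imag


print(complex_multiply(1, 2, 3, 4))
```
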
Resource Savings with DSP Blocks
  • DSP Block
    • Reduces LE Usage
    • Reduces Routing Congestion
    • Reduces Power
    • Maintains Performance
  • 90% of your problems are hidden under the surface!
  • SAVES 652 ROUTING NETS!

[Figure: multiplier and adder tree with 18-bit operands, 36-bit products, and a 38-bit final sum]
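
The bus widths in the figure (18, 36, 38) follow from two standard bit-growth rules: a product needs the sum of its operand widths, and a sum of K equal-width terms needs ceil(log2 K) extra bits. The sketch below applies those rules, assuming the adder tree sums four products, which is consistent with the 38-bit result.

```python
def product_width(a_bits: int, b_bits: int) -> int:
    """Bits needed for the full-precision product of two operands."""
    return a_bits + b_bits


def sum_width(term_bits: int, num_terms: int) -> int:
    """Bits needed to add `num_terms` values of `term_bits` bits without overflow."""
    growth = (num_terms - 1).bit_length()  # equals ceil(log2(num_terms))
    return term_bits + growth


p = product_width(18, 18)
print(p, sum_width(p, 4))
```
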

Design Flow Overview
  • Create Design in Simulink Using Altera Libraries
  • Simulate in Simulink
  • Add SignalCompiler to Model
  • Create HDL Code & Generate Testbench
  • Perform RTL Simulation
  • Synthesize HDL Code & Place & Route
  • Program Device
  • SignalTap II Logic Analyzer
Step 1 - Create Design in Simulink Using Altera Libraries
  • Drag & Drop Library Blocks into Simulink Design & Parameterize Each Block
Step 3 - Add SignalCompiler to Model to Generate HDL Code
  • Device family
    • APEX20K/E/C
    • APEX II
    • Stratix & Stratix GX
    • Cyclone & ACEX 1K
    • Mercury
    • FLEX10K & FLEX 6000
    • DSP Boards
  • Synthesis tool
    • Leonardo Spectrum
    • Synplify
    • Quartus II
  • Optimization: Speed vs. Area
  • Testbench Generation
  • Message Window

Step 4 - Create HDL Code & Generate Testbench
  • Enable the "Generate Stimuli for VHDL Testbench" button
  • The Simulink model AltrFir32.mdl is converted to the HDL file AltrFir32.vhd

DSP Builder Report File
  • Lists All Converted Blocks
    • Port Widths
    • Sampling Frequencies
    • Warnings & Messages
Step 5 - Perform RTL Simulation (ModelSim)
  • Set working directory (File => Change Directory)
  • Run Tcl file (Tools => Execute Macro)
Perform Verification

[Figure: ModelSim vs. Simulink output comparison]
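
Verification at this step comes down to checking that the RTL simulation produces the same samples as the Simulink model. The sketch below shows one way such a check could be scripted, assuming both outputs have already been exported as lists of integers. The export step, the function name, and the sample values are all hypothetical and not part of the flow shown on the slides.

```python
def first_mismatch(reference, rtl):
    """Index of the first differing sample, or None if the traces are identical."""
    for i, (ref, out) in enumerate(zip(reference, rtl)):
        if ref != out:
            return i
    if len(reference) != len(rtl):
        return min(len(reference), len(rtl))  # one trace ended early
    return None


# Hypothetical sample traces, for illustration only.
simulink_out = [0, 3, 7, 12, 7, 3, 0]
modelsim_out = [0, 3, 7, 12, 7, 3, 0]

mismatch = first_mismatch(simulink_out, modelsim_out)
print("bit-exact match" if mismatch is None else f"first mismatch at sample {mismatch}")
```
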

Step 6 - Synthesize HDL & Place & Route
  • Synthesis: Leonardo Spectrum, Synplify, or Quartus II
  • Place & Route: Quartus II Fitter

Step 7 - Program Device
  • Download Design to DSP Development Kits

Stratix DSP Development Board

[Figure: board photograph with the following labeled components]
  • Nios Expansion Prototype Connector
  • MAX 7000 Device
  • Prototyping Area
  • D/A Converters
  • A/D Converters
  • Analog SMA Connectors
  • Mictor-Type Connectors for HP Logic Analyzers
  • 40-Pin Connectors for Analog Devices
  • Texas Instruments Connectors on Underside of Board

Stratix DSP Board – Key Features
  • Stratix EP1S25F780C5 Device (Starter Version)
  • Stratix EP1S80B956C7 Device (Professional Version)
  • Analog I/O
    • Two 12-bit, 125 MHz A/D Converters
    • Two 14-bit, 165 MHz D/A Converters
  • Digital I/O
    • Two 40-pin Connectors for Analog Devices A/D Converter Evaluation Boards
    • Connector for TI TMS320 Cross-Platform Daughter Card
    • 3.3V Expansion/Prototype Headers
    • RS-232 Serial Port
  • Memory
    • 2 Mbytes of 7.5-ns Synchronous SRAM
    • 32 Mbytes of FLASH
Step 8 - SignalTap II Logic Analyzer
  • Embedded Logic Analyzer
    • Downloads into Device with Design
    • Captures State of Internal Nodes
    • Uses JTAG for Communication
SignalTap II Logic Analyzer
  • Analysis of Imported Data

[Figure: imported SignalTap II data and the corresponding imported plot]