DSP for FPGA - PowerPoint PPT Presentation

Presentation Transcript

  1. DSP for FPGA. SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications. Miodrag Bolic

  2. Objectives • Comparison between PDSP and FPGA • Virtex II Pro • Altera Stratix FPGA • Stratix DSP Block and its configuration • Altera design flow

  3. What Is an FPGA? • Field Programmable Gate Array • Device that Has a Regular Architecture (Set of Blocks) that Can Be Programmed for Various Functions • “Glue” Logic • Customizable Hardware Solution • Configurable Processors

  4. Why Use FPGAs in DSP Applications? • 10x More DSP Throughput Than DSP Processors • Parallel vs. Serial Architecture • Cost-Effective for Multi-Channel Applications • Flexible Hardware Implementation • Single-Chip Solution • System (Hardware/Software) Integration Benefits [Diagram labels: DSP System: Software, DSP, FPGA; FPGA: Software, Embedded Processor]

  5. DSP Processors vs. FPGAs. High Speed DSP Processor: • 1-8 Multipliers • Needs looping for more than 8 multiplications • Needs multiple clock cycles because of serial computation • 200 Tap FIR Filter would need 25+ clock cycles per sample with an 8 MAC unit processor. High Level of Parallel Processing in FPGA: • Can implement hundreds of MAC functions in an FPGA • Parallel implementation allows for faster throughput • 200 Tap FIR Filter would need 1 clock cycle per sample [Diagram: a few MAC units in the processor versus an array of MAC units in the FPGA]
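The cycle counts quoted on this slide follow from dividing the number of taps by the number of MAC units available in each clock cycle. The sketch below reproduces that arithmetic; the function name `cycles_per_sample` is illustrative and not from the slides, and the model ignores loop overhead and memory stalls, which is presumably why the slide says "25+".

```python
import math

def cycles_per_sample(taps: int, mac_units: int) -> int:
    """Minimum clock cycles to compute one FIR output sample.

    Assumes each MAC unit completes one multiply-accumulate per cycle
    and that there is no loop or memory-access overhead.
    """
    return math.ceil(taps / mac_units)

# 200-tap FIR on a processor with 8 MAC units (serial looping required).
dsp_cycles = cycles_per_sample(200, 8)

# 200-tap FIR on an FPGA with one MAC per tap (fully parallel).
fpga_cycles = cycles_per_sample(200, 200)

print(dsp_cycles, fpga_cycles)  # prints "25 1"
```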

  6. Extending Range of Altera Reconfigurable DSP Solutions [Chart: Performance (MMACs/sec) axis with ticks at 100 and 600; regions labeled Embedded Processors, Embedded Processors with Hardware Acceleration, and Complete Hardware Implementation; one region marked "New!"]

  7. Comparison of DSP Devices

  8. Objectives • Comparison between PDSP and FPGA • Virtex II Pro • Altera Stratix FPGA • Stratix DSP Block and its configuration • Altera design flow

  9. Stratix EP1S10 [2]

  10. TriMatrix™ Memory [1] • M-RAM (512 Kbits per block + parity; Dedicated External Memory Interface): Packet / Data Storage, Nios Program Memory, System Cache, Video Frame Buffers, Echo Canceller Data Storage • M4K Blocks (4 Kbits per block + parity): Header / Cell Storage, Channelized Functions, ATM cell–packet processing, Nios Program Memory, Look-Up Schemes, Packet & Cell Buffering, Cache • M512 Blocks (512 bits per block + parity): Small FIFOs, Shift Register, Rake Receiver Correlator, FIR Filter Delay Line • Trade-off: More Bits For Larger Memory Buffering (toward M-RAM) versus More Data Ports for Greater Memory Bandwidth (toward M512)

  11. Memory Bandwidth Summary: Stratix Device Family [1]

  12. Logic Element (LE) [2] [Functional diagram labels: 4-Input LUT with inputs data1, data2, data3, data4, addnsub, and cin; Sync Load & Clear Logic; register (D, DATA); LUT Chain Input and LUT Chain Output; Register Chain Input and Register Chain Output; Register Control Signals; Register Feedback; outputs to Row, Column & DirectLink Routing and to Local Routing] • Note: Functional Diagram Only. Please See Datasheet for more Details. • addnsub & data1 connected via XOR logic

  13. Dynamic Arithmetic Mode [Functional diagram labels: inputs data1, data2, data3, and addnsub; LAB Carry-In, Carry-In0, and Carry-In1 feeding Carry-In Logic; Sum Calculator; Carry Calculator; Carry-Out Logic producing Carry-Out0 and Carry-Out1; Sync Load & Clear Logic; register (D, DATA); Register Chain Input and Register Chain Output; Register Control Signals; outputs to Row, Column & DirectLink Routing and to Local Routing] • Note: Functional Diagram Only. Please See Datasheet for more Details.

  14. Logic Array Blocks (LAB) [2] • 10 LEs • Local Interconnect • LAB-Wide Control Signals [Diagram: LE1 through LE10 connected to the Local Interconnect, which carries 30 LAB Input Lines and 10 LE Feedback Lines; Control Signals are shared across the LAB]

  15. Avalon Switch Fabric Contents • Avalon Switch Fabric provides the following to the peripherals it connects: • Data-Path Multiplexing • Address Decoding • Wait-State Generation • Dynamic Bus Sizing • Interrupt-Priority Assignment • Latent Transfer Capabilities • Streaming Read and Write Capabilities • Avalon Switch Fabric tailors transactions to the characteristics of the peripherals that are attached
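Dynamic bus sizing, one of the services listed above, lets a wide master read from a narrower slave without knowing the slave's width. The following is a minimal behavioral sketch, not the fabric's actual implementation: it assumes little-endian assembly (the low part at the lower address), and the helper `read32_from_narrow_slave` and the example memory contents are hypothetical.

```python
def read32_from_narrow_slave(read_narrow, address, slave_width_bits=16):
    """Assemble one 32-bit word from several narrower slave reads.

    read_narrow(addr) returns the slave-width value stored at byte
    address addr. Little-endian ordering is an assumption of this sketch.
    """
    step = slave_width_bits // 8           # bytes covered by one narrow read
    mask = (1 << slave_width_bits) - 1
    word = 0
    for i in range(32 // slave_width_bits):
        part = read_narrow(address + i * step) & mask
        word |= part << (i * slave_width_bits)
    return word

# Hypothetical 16-bit memory contents, keyed by byte address.
narrow_memory = {0x0: 0xBEEF, 0x2: 0xDEAD}

value = read32_from_narrow_slave(narrow_memory.__getitem__, 0x0)
print(hex(value))  # prints "0xdeadbeef"
```

The master issues a single 32-bit read; the two 16-bit accesses happen inside the helper, which plays the role of the fabric here.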

  16. SOPC Design Example • Avalon Switch Fabric allows Masters and Slaves to communicate without knowledge of each other's interface details [Diagram labels: CPU, 32 Bit (Inst Master, Data Master); DMA Controller With Streaming (Control Port (Slave), Read Port (Master – Streaming), Write Port (Master – Streaming)); Avalon Switch Fabric; Instruction Memory, 32-bit Data path; Data Memory, 32-bit Data path; UART; Avalon Tri-State Bridge; VGA Controller; External FLASH, 1 MB, 16-bit Datapath; External SRAM, 256 KB, 32-bit Datapath]

  17. Data Path Multiplexing & Slave Arbitration • 1- Data-Path Multiplexing (MUX) • 2- Slave Arbitration (Arbiter) • 3- Address Decoding [Diagram: the SOPC design example from the previous slide, with the MUX and Arbiter drawn inside the Avalon Switch Fabric between the CPU and DMA Controller masters and the Instruction Memory, Data Memory, UART, Avalon Tri-State Bridge, VGA Controller, External FLASH, and External SRAM]
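Address decoding and slave arbitration can be illustrated with a small behavioral model. The sketch below is an assumption-laden illustration rather than the Avalon implementation: the base addresses in `MEMORY_MAP` are invented (only the 1 MB and 256 KB sizes come from the slide), and a plain round-robin policy is assumed for the arbiter.

```python
from collections import deque

# Hypothetical memory map: (base address, size in bytes, slave name).
# Only the FLASH and SRAM sizes come from the slide; bases are made up.
MEMORY_MAP = [
    (0x0000_0000, 0x0010_0000, "flash"),  # 1 MB
    (0x0010_0000, 0x0004_0000, "sram"),   # 256 KB
    (0x0020_0000, 0x0000_0020, "uart"),
]

def decode(address):
    """Address decoding: return the slave whose range contains address."""
    for base, size, name in MEMORY_MAP:
        if base <= address < base + size:
            return name
    return None

class RoundRobinArbiter:
    """Slave-side arbiter: grants one requesting master per transfer."""

    def __init__(self, masters):
        self.order = deque(masters)

    def grant(self, requesting):
        # Walk the masters in priority order; each visited master moves to
        # the back so it does not win again before the others get a turn.
        for _ in range(len(self.order)):
            master = self.order[0]
            self.order.rotate(-1)
            if master in requesting:
                return master
        return None

arbiter = RoundRobinArbiter(["cpu_data", "dma_read"])
both = {"cpu_data", "dma_read"}
print(decode(0x0010_0004))                      # prints "sram"
print(arbiter.grant(both), arbiter.grant(both)) # prints "cpu_data dma_read"
```

Placing one arbiter per slave, as the model does, means two masters only contend when they target the same slave in the same cycle.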

  18. Objectives • Comparison between PDSP and FPGA • Virtex II Pro • Altera Stratix FPGA • Stratix DSP Block and its configuration • Altera design flow

  19. Eight 9 × 9 bit multipliers Four 18 × 18 bit multipliers One 36 × 36 bit multiplier DSP Blocks

  20. DSP Blocks (cont.) The DSP block consists of • A multiplier block • An adder/subtractor/accumulator block • A summation block • An output interface • Output registers • Routing and control signals

  21. Stratix DSP Blocks • High Performance Dedicated Multiplier Circuitry • 18x18 Functions at 280 MHz • Variable Operand Widths with Full Precision Outputs: 9x9 (8 Max.), 18x18 (4 Max.), 36x36 (1 Max.) • Add, Accumulate or Subtract • Signed & Unsigned Operations • Dynamically Change between Add & Subtract • Supports DSP Requirements Including Complex Numbers [Diagram labels: Input Register Unit; Optional Pipelining; two + - Σ units; + unit; Output Multiplexer; Output Register Unit]
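"Full precision outputs" means no bits are dropped as data moves through the multipliers and adders. A short sketch of the standard bit-growth arithmetic follows; the function names are illustrative, not Altera terminology. It reproduces the widths that appear later in the deck's resource-savings diagram: 18-bit operands, 36-bit products, and a 38-bit sum of four products.

```python
def product_width(a_bits: int, b_bits: int) -> int:
    """Bits needed to hold the full-precision product of two operands."""
    return a_bits + b_bits

def sum_width(term_bits: int, num_terms: int) -> int:
    """Bits needed to sum num_terms values of term_bits each without overflow.

    Adding N terms grows the width by ceil(log2(N)) bits, computed here
    with integer arithmetic as (N - 1).bit_length().
    """
    return term_bits + (num_terms - 1).bit_length()

p = product_width(18, 18)  # one 18x18 multiplier
s = sum_width(p, 4)        # sum of the four 18x18 products in one DSP block
print(p, s)                # prints "36 38"
```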

  22. DSP Block for 18 x 18-bit Mode

  23. Shift Register Chain

  24. Adder/Output Block

  25. Time-Domain Multiplexed FIR Filters

  26. Operation of TDM Filter
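The transcript keeps only the titles of the two TDM slides, so the following is a sketch of one common reading of time-domain multiplexing, not the slide's own design: a small number of multipliers runs on a fast clock and is reused across several taps for each input sample. The function names and the `multipliers` parameter are hypothetical.

```python
def fir_direct(coeffs, samples):
    """Fully parallel reference: all taps are multiplied in the same cycle."""
    history = [0] * len(coeffs)
    out = []
    for x in samples:
        history = [x] + history[:-1]  # shift the delay line
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out

def fir_tdm(coeffs, samples, multipliers=1):
    """Time-multiplexed FIR: reuse a few multipliers over several fast cycles.

    Returns the output samples and the number of fast-clock cycles used.
    """
    history = [0] * len(coeffs)
    out = []
    fast_cycles = 0
    for x in samples:
        history = [x] + history[:-1]
        acc = 0
        for start in range(0, len(coeffs), multipliers):
            # One fast-clock cycle: up to `multipliers` products are
            # formed and added into the accumulator.
            stop = min(start + multipliers, len(coeffs))
            for k in range(start, stop):
                acc += coeffs[k] * history[k]
            fast_cycles += 1
        out.append(acc)
    return out, fast_cycles

coeffs = [1, 2, 3, 4]
samples = [1, 0, 0, 0, 5]
tdm_out, cycles = fir_tdm(coeffs, samples, multipliers=1)
print(tdm_out == fir_direct(coeffs, samples), cycles)  # prints "True 20"
```

With one multiplier and four taps, each input sample costs four fast-clock cycles, so the multiplier must run at four times the sample rate to keep up.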

  27. Resource Savings with DSP Blocks • DSP Block • Reduces LE Usage • Reduces Routing Congestion • Reduces Power • Maintains Performance • 90% of your problems are hidden under the surface! • SAVES 652 ROUTING NETS! [Diagram: 18-bit inputs feed multipliers (X) that produce 36-bit products; adders (+) combine the products into a 38-bit result]

  28. Design Flow

  29. Design Flow Overview • Create Design in Simulink Using Altera Libraries • Simulate in Simulink • Add SignalCompiler to Model • Create HDL Code & Generate Testbench • Perform RTL Simulation • Synthesize HDL Code & Place & Route • Program Device • Signal Tap II Logic Analyzer

  30. Step 1- Create Design in Simulink Using Altera Libraries • Drag & Drop Library Blocks into Simulink Design & Parameterize Each Block

  31. Parameterization of IP Megacores

  32. Step 2 - Simulate in Simulink

  33. Step 3 - Add “Signal Compiler” to Model to Generate HDL Code. Device options: • APEX20K/E/C • APEX II • Stratix & Stratix GX • Cyclone & ACEX 1K • Mercury • FLEX10K & FLEX 6000 • DSP Boards. Synthesis tool options: • Leonardo Spectrum • Synplify • Quartus II. [Other screenshot callouts: Speed vs. Area, Testbench Generation, Message Window]

  34. Step 4 - Create HDL Code & Generate Testbench • Enable "Generate Stimuli for VHDL Testbench" Button [Screenshot labels: AltrFir32.mdl, AltrFir32.vhd]

  35. HDL Code Generation

  36. DSP Builder Report File • Lists All Converted Blocks • Port Widths • Sampling Frequencies • Warnings & Messages

  37. Step 5 – Perform RTL Simulation (ModelSim) • Set working directory (File => Change Directory) • Run TCL file (Tools => Execute Macro)

  38. Perform Verification ModelSim vs Simulink

  39. Step 6 - Synthesize HDL & Place & Route • Leonardo Spectrum • Synplify • Quartus II – Synthesis – Quartus II Fitter

  40. Step 7 – Program Device Download Design to DSP Development Kits

  41. Stratix DSP Development Board [Board photo labels: Nios Expansion Prototype Connector; MAX 7000 Device; Prototyping Area; D/A Converters; Mictor-Type Connectors for HP Logic Analyzers; A/D Converters; Analog SMA Connectors; 40-Pin Connectors for Analog Devices; Texas Instruments Connectors on Underside of Board]

  42. Stratix DSP Board – Key Features • Stratix EP1S25F780C5 Device (Starter Version) • Stratix EP1S80B956C7 Device (Professional Version) • Analog I/O: Two 12-bit, 125 MHz A/D Converters; Two 14-bit, 165 MHz D/A Converters • Digital I/O: Two 40-pin Connectors for Analog Devices A/D Converter Evaluation Boards; Connector for TI TMS320 Cross-Platform Daughter Card; 3.3V Expansion/Prototype Headers; RS-232 Serial Port • Memory: 2 Mbytes of 7.5-ns Synchronous SRAM; 32 Mbytes of FLASH

  43. Step 8 - SignalTap II Logic Analyzer • Embedded Logic Analyzer • Downloads into Device with Design • Captures State of Internal Nodes • Uses JTAG for Communication

  44. SignalTap II Logic Analyzer: Analysis of Imported Data [Screenshot labels: Imported Data, Imported Plot]