FFT in Hardware and Software


Background

  • Core Algorithm

  • Original Algorithm, the DFT, O(n^2) complexity

  • New Algorithm, the FFT (Fast Fourier Transform), O(n log2(n)) complexity, depending on implementation.


DFT Computation

  • A summation over the whole input array for every single element in the output array.

  • Computationally very inefficient: n outputs, each a sum over n inputs, gives O(n^2) operations.
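The double loop described above can be sketched directly. This is a minimal illustration, not the presenters' code; the function name `dft` and the sign convention exp(-2*pi*i*k*t/n) are assumptions, since the slides do not state either.

```python
import cmath

def dft(x):
    """Direct DFT: every output element is a sum over the whole input array."""
    n = len(x)
    out = []
    for k in range(n):                  # one pass per output element
        acc = 0j
        for t in range(n):              # full sweep over the input array
            # Assumed convention: forward transform uses a negative exponent.
            acc += x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
        out.append(acc)
    return out

if __name__ == "__main__":
    # A constant input puts all the energy in output bin 0.
    print(dft([1, 1, 1, 1]))
```

The two nested loops over `n` are where the O(n^2) cost comes from.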


FFT Computation

  • A much more computationally efficient algorithm

  • Works using the divide and conquer principle.

  • Published by Cooley and Tukey in 1965 [2]
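A minimal sketch of the divide-and-conquer idea, assuming a radix-2 decimation-in-time split and a power-of-two input length. The function name `fft` and the twiddle convention exp(-2*pi*i*k/n) are illustrative choices, not taken from the slides.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT sketch; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])                 # transform of even-indexed samples
    odd = fft(x[1::2])                  # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # assumed twiddle convention
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

if __name__ == "__main__":
    print(fft([1, 2, 3, 4]))
```

Each level of recursion does O(n) work and there are log2(n) levels, which is where the O(n log2(n)) figure comes from.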


DFT vs. FFT (Number of Operations)
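As a rough illustration of the gap in operation counts, the sketch below tabulates n^2 against n*log2(n) for a few power-of-two sizes. These are order-of-magnitude counts only; constant factors are ignored, and the function name `operation_counts` is hypothetical.

```python
def operation_counts(n):
    """Return (n^2, n*log2(n)) for a power-of-two n as rough DFT/FFT costs."""
    log2_n = n.bit_length() - 1         # exact log2 for powers of two
    return n * n, n * log2_n

if __name__ == "__main__":
    for n in (8, 64, 1024):
        dft_ops, fft_ops = operation_counts(n)
        print(f"n={n:5d}  DFT~{dft_ops:8d}  FFT~{fft_ops:6d}  "
              f"ratio~{dft_ops / fft_ops:.1f}")
```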


DFT vs. FFT


FFT Butterfly Operations

  • Butterfly arrangement of computations

  • Repeated on successive pairs of input data

  • Then half as many times on alternating pairs

  • Then half again as many times on every fourth element


The Butterfly

[Figure: butterfly diagram. Inputs xe[n] and xo[n]; outputs $X[n] = x_e[n] + W_N^n x_o[n]$ and $X[n+N/2] = x_e[n] - W_N^n x_o[n]$.]

  • Simple operations repeated many times
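A single butterfly can be sketched as one multiply followed by one add and one subtract. The sketch assumes the usual definition W_N = exp(-2*pi*i/N), which the slides do not spell out; the function name `butterfly` and its argument order are illustrative.

```python
import cmath

def butterfly(xe, xo, n, N):
    """One radix-2 butterfly: returns (X[n], X[n + N/2])."""
    w = cmath.exp(-2j * cmath.pi * n / N)   # twiddle factor W_N^n (assumed convention)
    t = w * xo                              # the single multiplication
    return xe + t, xe - t                   # the addition and the subtraction

if __name__ == "__main__":
    print(butterfly(1, 1, 0, 8))
```

The product `w * xo` is computed once and reused for both outputs, which is what makes the operation cheap to repeat many times.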


8-point FFT Demonstration: The Entire Calculation

[Figure: signal-flow diagram from Input Array to Output; legend: multiplication by W factor, addition.]
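The layered calculation can be sketched as an iterative, in-place FFT. This version assumes a radix-2 decimation-in-time layout with bit-reversed input ordering, so an 8-point transform runs three layers of four butterflies each. The names `bit_reverse` and `fft_iterative` are hypothetical, and the exact layout in the original diagram is not recoverable.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_iterative(x):
    """Iterative radix-2 FFT sketch; len(x) must be a power of two."""
    n = len(x)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("length must be a power of two")
    # Reorder the input so each layer works on adjacent blocks.
    a = [complex(x[bit_reverse(i, bits)]) for i in range(n)]
    size = 2
    while size <= n:                        # one iteration per layer
        half = size // 2
        for start in range(0, n, size):     # each block in this layer
            for k in range(half):           # each butterfly in the block
                w = cmath.exp(-2j * cmath.pi * k / size)
                t = w * a[start + k + half]     # multiplication by W factor
                u = a[start + k]
                a[start + k] = u + t            # addition
                a[start + k + half] = u - t
        size *= 2
    return a

if __name__ == "__main__":
    print(fft_iterative([1, 1, 1, 1, 1, 1, 1, 1]))
```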


Why Hardware?

  • Even more speed for FFT

  • Extremely parallelizable

  • A whole layer can be done in two FPGA clock cycles

    • 1 multiply cycle

    • 1 add cycle

    • (Assuming sufficient multipliers)
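A back-of-the-envelope sketch of the fully parallel case, assuming a radix-2 FFT, one complex multiplier per butterfly, and layers executed one after another without pipelining. Both function names are hypothetical, and the multiplier count is an upper bound because some twiddle factors are trivial.

```python
def parallel_fft_cycles(n, cycles_per_layer=2):
    """Cycles for an n-point FFT when every layer completes in a fixed number of cycles."""
    layers = n.bit_length() - 1         # log2(n) layers for a power-of-two n
    return layers * cycles_per_layer

def multipliers_needed(n):
    """Complex multipliers needed to run one whole layer at once (upper bound)."""
    return n // 2                       # one butterfly per pair of points

if __name__ == "__main__":
    for n in (8, 1024):
        print(n, parallel_fft_cycles(n), multipliers_needed(n))
```

Under these assumptions an 8-point transform takes 6 cycles with 4 multipliers, while a 1024-point transform takes 20 cycles but needs 512 multipliers, which is why the "sufficient multipliers" caveat matters.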


Hardware Problems

  • Complexity

  • Input speed

  • Output speed

  • If the FPGA computes the result in 24.4 ns but needs 20 s to transfer the input data, what gain is there?

    • i.e. 20 s in + 24.4 ns compute + 20 s out ≈ 40 s!


Mitigation of Hardware Problems

  • Use a faster bus

    • AMD Opteron’s HyperTransport

      • 20.8 GB/s (166.4 Gb/s) per link (version 3)

      • Modules that fit into an AMD 64-bit Opteron Socket

      • http://www.drccomputer.com/pages/modules.html - xilinx based module

      • http://www.xtremedatainc.com/xd1000_brief.html - altera based module
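To get a feel for what the quoted link rate means, the sketch below estimates the time to move one block of samples at 20.8 GB/s. The sample format (8 bytes per complex point) and the function name `transfer_seconds` are assumptions, and the result is a best case that ignores latency and protocol overhead.

```python
def transfer_seconds(num_points, bytes_per_point, bandwidth_bytes_per_s=20.8e9):
    """Best-case time to move num_points samples over a link at peak bandwidth."""
    return num_points * bytes_per_point / bandwidth_bytes_per_s

if __name__ == "__main__":
    # Assumption: 1024 complex points, two 32-bit values each (8 bytes per point).
    t = transfer_seconds(1024, 8)
    print(f"{t * 1e9:.0f} ns")
```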


Mitigation of Hardware Problems

  • Put the FPGA on the die with the DSP

    • Need silicon vendor support

    • FPGA can access memory on a very wide bus (e.g. 128 bits per cycle)

  • Implement the entire project in FPGA

    • Time consuming to program

    • Possibly insufficient room on the FPGA


8-point FFT Demonstration: In Hardware

[Figure: signal-flow diagram from Input Array to Output; legend: multiplication by W factor, addition.]


Why Not Software?

  • Each butterfly must be done sequentially

  • Only slight parallelism enabled by a DSP like the TigerSHARC

  • Each butterfly can be done in 2 cycles (after optimization).
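The sequential cost can be estimated with a short sketch, assuming a radix-2 FFT with (n/2)*log2(n) butterflies executed strictly one after another at 2 cycles each. The function name `sequential_fft_cycles` is hypothetical, and real DSP code also spends cycles on loads, stores, and loop overhead.

```python
def sequential_fft_cycles(n, cycles_per_butterfly=2):
    """Cycle estimate when every butterfly of an n-point FFT runs one at a time."""
    layers = n.bit_length() - 1         # log2(n) for a power-of-two n
    butterflies = (n // 2) * layers     # n/2 butterflies in each layer
    return butterflies * cycles_per_butterfly

if __name__ == "__main__":
    for n in (8, 1024):
        print(n, sequential_fft_cycles(n))
```

For 8 points this gives 24 cycles and for 1024 points 10240 cycles, so the cost grows with the number of butterflies rather than with the number of layers.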


Results of Testing

  • Linear Profiling of FFT Algorithm in C++


Results of Testing

  • Profiling of VHDL on FPGA

  • Butterfly takes 24.377ns to execute

    • 62% is computational, 38% is routing on FPGA


Product Offerings

  • Most DSP Vendors

  • Many FPGA Vendors (IP – Intellectual Property)

  • Microcontroller Vendors (e.g. Blackfin)

  • FFTW – The Fastest Fourier Transform in the West

  • AMD Math Core Library

  • Intel Library

  • Highly Optimized for the expected hardware


Published Results

  • The Radix 4 version delivers a 1 K points complex processing time of 25 microseconds at 200-MHz system speeds and uses only about 10 percent of the resources in a mid-range Stratix device. The Radix 2 is half the size of the Radix 4 and offers a 1 K points complex processing time of 50 microseconds at 200-MHz system speeds. Additional versions of the new cores are under development. [6]
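The quoted processing times can be converted to clock cycles as a sanity check. This is plain arithmetic on the published figures; the function name `cycles` is hypothetical, and the radix-2 butterfly count assumes the standard (n/2)*log2(n) structure rather than anything known about the cores' internals.

```python
def cycles(time_s, clock_hz):
    """Number of clock cycles that fit in time_s at clock_hz."""
    return round(time_s * clock_hz)

if __name__ == "__main__":
    clock = 200e6                           # 200 MHz system speed
    radix4 = cycles(25e-6, clock)           # 1K-point Radix 4 core
    radix2 = cycles(50e-6, clock)           # 1K-point Radix 2 core
    radix2_butterflies = (1024 // 2) * 10   # (n/2) * log2(n) for n = 1024
    print(radix4, radix2, radix2 / radix2_butterflies)
```

The Radix 2 figure works out to 10000 cycles for 5120 butterflies, a little under 2 cycles per butterfly.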


References

[1] Signals Systems and Transforms

[2] James W. Cooley and John W. Tukey, "An algorithm for the machine calculation of complex Fourier series," Math. Comput. 19, 297–301 (1965).

[3] http://www.drccomputer.com/pages/modules.html - Xilinx-based module

[4] http://www.xtremedatainc.com/xd1000_brief.html - Altera-based module

[5] http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_2353,00.html

[6] http://www.us.design-reuse.com/news/news5650.html

[7] http://www.4dsp.com/fft.htm

