
FFT in Hardware and Software





Background

  • Core Algorithm

  • Original algorithm: the DFT (Discrete Fourier Transform), O(n^2) complexity

  • Newer algorithm: the FFT (Fast Fourier Transform), O(n log2(n)) complexity; the exact operation count depends on the implementation


DFT Computation

  • Every element of the output array requires a summation over the whole input array.

  • For n inputs this costs O(n^2) operations, which makes the direct DFT very inefficient to compute for large n.
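
As a minimal sketch of the summation described on this slide (not code from the presentation), the direct DFT can be written as two nested loops. The function name `dft` and the forward-transform sign convention exp(-2πi·k·t/n) are assumptions chosen for illustration.

```python
import cmath

def dft(x):
    """Direct DFT: every output element sums over the whole input array."""
    n = len(x)
    out = []
    for k in range(n):              # one pass per output element
        acc = 0j
        for t in range(n):          # full summation over the input
            acc += x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
        out.append(acc)
    return out

# A unit impulse transforms to a flat spectrum of ones.
spectrum = dft([1, 0, 0, 0])
```

The two nested loops over n make the n·n cost visible directly.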


FFT Computation

  • A much more computationally efficient algorithm

  • Works on the divide-and-conquer principle: the transform is split into smaller transforms whose results are then combined.

  • First published by Cooley and Tukey in 1965 [2].
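
The divide-and-conquer idea can be sketched as a recursive radix-2 decimation-in-time FFT. This is an illustrative sketch, not the presenters' implementation; the name `fft`, the power-of-two length restriction, and the exp(-2πi·k/n) sign convention are assumptions.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])             # transform of even-indexed samples
    odd = fft(x[1::2])              # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t        # upper butterfly output
        out[k + n // 2] = even[k] - t   # lower butterfly output
    return out

result = fft([1, 2, 3, 4])          # approximately [10, -2+2j, -2, -2-2j]
```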


DFT vs. FFT (Number of Operations)
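
The comparison originally shown on this slide is not present in the transcript. As a rough stand-in, the sketch below tabulates textbook estimates of complex multiplications: n·n for the direct DFT and (n/2)·log2(n) for a radix-2 FFT. These formulas and the function names are assumptions for illustration, not the figures from the original slide.

```python
def dft_multiplies(n):
    """Complex multiplications in a direct DFT: n * n."""
    return n * n

def fft_multiplies(n):
    """Complex multiplications in a radix-2 FFT: (n / 2) * log2(n)."""
    stages = n.bit_length() - 1     # log2(n) for a power of two
    return (n // 2) * stages

for n in (8, 64, 1024):
    print(n, dft_multiplies(n), fft_multiplies(n))
# 8 64 12
# 64 4096 192
# 1024 1048576 5120
```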



FFT Butterfly Operations

  • Butterfly arrangement of computations

  • Repeated on successive pairs of input data

  • Then on alternating pairs (elements two apart), in half as many groups

  • Then on every fourth element (elements four apart), in half as many groups again
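
To make the pairing pattern concrete, the sketch below lists which array positions each butterfly combines at every stage, assuming a decimation-in-time layout whose input has already been reordered (bit-reversed). The helper name `butterfly_pairs` is hypothetical. Every stage still performs n/2 butterflies; what changes is the distance between paired elements and the number of independent groups.

```python
def butterfly_pairs(n):
    """Return, for each stage, the index pairs joined by a butterfly."""
    stages = []
    span = 1                        # distance between paired elements
    while span < n:
        pairs = []
        for start in range(0, n, 2 * span):   # one group per block
            for j in range(span):
                pairs.append((start + j, start + j + span))
        stages.append(pairs)
        span *= 2
    return stages

for stage in butterfly_pairs(8):
    print(stage)
# [(0, 1), (2, 3), (4, 5), (6, 7)]
# [(0, 2), (1, 3), (4, 6), (5, 7)]
# [(0, 4), (1, 5), (2, 6), (3, 7)]
```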


The Butterfly

[Figure: butterfly diagram with inputs xe[n] and xo[n], twiddle factors W_N^n and -W_N^n, and outputs X[n] and X[n+N/2].]

  • Simple operations repeated many times
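
Reading the diagram labels as X[n] = xe[n] + W_N^n·xo[n] and X[n+N/2] = xe[n] - W_N^n·xo[n], a single butterfly can be sketched as below. The function name `butterfly` and the definition W_N^n = exp(-2πi·n/N) are assumptions; the slide does not spell out the twiddle factor.

```python
import cmath

def butterfly(xe, xo, n, N):
    """One butterfly: a single multiply followed by an add and a subtract."""
    w = cmath.exp(-2j * cmath.pi * n / N)   # twiddle factor W_N^n (assumed form)
    t = w * xo                               # the only multiplication
    return xe + t, xe - t                    # X[n], X[n + N/2]

upper, lower = butterfly(1, 1, 0, 2)         # W = 1, so this gives (2, 0)
```

The product is computed once and reused with opposite signs, which is why one multiply and two additions suffice per butterfly.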


8-point FFT Demonstration: The Entire Calculation

[Figure: signal-flow diagram of the 8-point FFT from the input array to the output. Legend: multiplication by W factor; addition. Twelve further slides titled "8-point FFT Demonstration" use the same diagram.]
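
A software analogue of the 8-point signal-flow diagram is the iterative in-place FFT sketched below: the input is reordered by bit reversal, then three stages of butterflies are applied. This is an illustrative sketch under the usual radix-2 decimation-in-time assumptions; the function names and the sign convention are not taken from the slides.

```python
import cmath

def bit_reverse_permute(x):
    """Reorder x so that element i moves to the bit-reversed index of i."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r = int(format(i, f"0{bits}b")[::-1], 2) if bits else 0
        out[r] = x[i]
    return out

def fft_iterative(x):
    """Iterative radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = bit_reverse_permute(x)
    span = 1
    while span < n:                              # one pass per stage
        for start in range(0, n, 2 * span):
            for j in range(span):
                w = cmath.exp(-2j * cmath.pi * j / (2 * span))
                t = w * a[start + j + span]      # multiplication by W factor
                u = a[start + j]
                a[start + j] = u + t             # addition
                a[start + j + span] = u - t      # subtraction
        span *= 2
    return a

spectrum = fft_iterative([0, 1, 0, 0, 0, 0, 0, 0])   # 8-point example
```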


Why Hardware?

  • Even more speed for FFT

  • Extremely parallelizable

  • A whole layer can be done in two FPGA clock cycles

    • 1 multiply cycle

    • 1 add cycle

    • (Assuming sufficient multipliers)
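
Taking the two-cycles-per-layer figure at face value, a rough cycle estimate for a fully parallel n-point FFT follows from the number of layers, log2(n). The sketch below ignores I/O, pipelining, and routing delays; the function names and the n/2 multipliers-per-layer count are assumptions for illustration.

```python
def parallel_fft_cycles(n, cycles_per_layer=2):
    """Clock cycles if every layer runs fully in parallel."""
    layers = n.bit_length() - 1      # log2(n) for a power of two
    return layers * cycles_per_layer

def multipliers_per_layer(n):
    """One multiplier per butterfly, n / 2 butterflies per layer."""
    return n // 2

cycles_8 = parallel_fft_cycles(8)            # 3 layers * 2 cycles = 6
multipliers_8 = multipliers_per_layer(8)     # 4
```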


Hardware Problems

  • Complexity

  • Input speed

  • Output speed

  • If the FPGA computes in 24.4 ns but the input data takes 20 s to transfer, what is gained?

    • e.g. 24.4 ns + 20 s (input) + 20 s (output) ≈ 40 s
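
The arithmetic on this slide can be checked with a few lines. The sketch uses the slide's numbers literally (24.4 ns of computation, 20 s per transfer direction); the function name `total_time` is hypothetical.

```python
def total_time(compute_s, transfer_in_s, transfer_out_s):
    """End-to-end time: input transfer + computation + output transfer."""
    return compute_s + transfer_in_s + transfer_out_s

compute = 24.4e-9                       # seconds spent in the FPGA
total = total_time(compute, 20.0, 20.0) # dominated by the transfers
compute_share = compute / total         # fraction of time spent computing
print(f"total = {total:.1f} s, compute share = {compute_share:.1e}")
```

With these numbers the computation is a negligible fraction of the total, so a faster FFT core alone would not shorten the end-to-end time.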


Mitigation of Hardware Problems

  • Use a faster bus

    • AMD Opteron’s HyperTransport

      • 20.8 GB/s (166.4 Gb/s) per link (version 3)

      • Modules that fit into an AMD 64-bit Opteron socket

      • http://www.drccomputer.com/pages/modules.html - Xilinx-based module

      • http://www.xtremedatainc.com/xd1000_brief.html - Altera-based module


Mitigation of Hardware Problems

  • Put the FPGA on the die with the DSP

    • Needs silicon vendor support

    • The FPGA can access memory on a very wide bus (e.g. 128 bits per cycle)

  • Implement the entire project in the FPGA

    • Time-consuming to program

    • Possibly insufficient room on the FPGA


8-point FFT Demonstration: In Hardware

[Figure: the 8-point FFT signal-flow diagram in hardware, from the input array to the output. Legend: multiplication by W factor; addition. Three further slides with the same title use the same diagram.]


Why Not Software?

  • Butterflies must be executed sequentially

  • A DSP such as the TigerSHARC enables only slight parallelism

  • Each butterfly can be done in 2 cycles (after optimization)
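
Under the slide's figure of 2 cycles per butterfly, a sequential cycle count follows from the number of butterflies in a radix-2 FFT, (n/2)·log2(n). This is a back-of-the-envelope sketch that ignores memory access and loop overhead; the function name is hypothetical.

```python
def sequential_fft_cycles(n, cycles_per_butterfly=2):
    """Cycles when butterflies run one after another."""
    stages = n.bit_length() - 1          # log2(n) for a power of two
    butterflies = (n // 2) * stages
    return butterflies * cycles_per_butterfly

cycles_8 = sequential_fft_cycles(8)          # 12 butterflies -> 24 cycles
cycles_1024 = sequential_fft_cycles(1024)    # 5120 butterflies -> 10240 cycles
```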


Results of Testing

  • Linear profiling of the FFT algorithm in C++


Results of Testing

  • Profiling of VHDL on the FPGA

  • A butterfly takes 24.377 ns to execute

    • 62% is computation, 38% is routing on the FPGA


Product Offerings

  • Most DSP vendors

  • Many FPGA vendors (IP: Intellectual Property cores)

  • Microcontroller vendors (e.g. Blackfin)

  • FFTW: the Fastest Fourier Transform in the West

  • AMD Math Core Library

  • Intel Library

  • Highly optimized for the expected hardware


Published Results

  • The Radix 4 version delivers a 1 K points complex processing time of 25 microseconds at 200-MHz system speeds and uses only about 10 percent of the resources in a mid-range Stratix device. The Radix 2 is half the size of the Radix 4 and offers a 1 K points complex processing time of 50 microseconds at 200-MHz system speeds. Additional versions of the new cores are under development. [6]
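
The quoted processing times can be converted to clock cycles, since microseconds multiplied by megahertz gives cycles. The sketch below only restates the quoted figures; the helper name `cycles` is hypothetical.

```python
def cycles(time_us, clock_mhz):
    """Clock cycles elapsed: microseconds * megahertz."""
    return time_us * clock_mhz

radix4_cycles = cycles(25, 200)   # 5000 cycles for 1 K complex points
radix2_cycles = cycles(50, 200)   # 10000 cycles for 1 K complex points
```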


References

[1] Signals, Systems, and Transforms

[2] James W. Cooley and John W. Tukey, "An algorithm for the machine calculation of complex Fourier series," Math. Comput. 19, 297–301 (1965).

[3] http://www.drccomputer.com/pages/modules.html - Xilinx-based module

[4] http://www.xtremedatainc.com/xd1000_brief.html - Altera-based module

[5] http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_2353,00.html

[6] http://www.us.design-reuse.com/news/news5650.html

[7] http://www.4dsp.com/fft.htm

