FFT: Accelerator Project

Rohit Prakash

Anand Silodia

Work done till now
  • Studied various FFT algorithms
  • Implemented radix-4, recursive and iterative algorithms
  • Optimized these implementations
  • Compared the results with FFTW
  • Result: FFTW fares better than our implementation
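
For reference, a recursive radix-4 FFT of the kind listed above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the names `fft_radix4` and `dft_naive` are hypothetical, the input length is assumed to be a power of 4, and no optimizations are applied.

```python
import cmath

def dft_naive(x):
    """Direct O(n^2) DFT, used here only as a correctness reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n == 0 or n % 4:
        raise ValueError("length must be a power of 4")
    q = n // 4
    # Transform the four interleaved subsequences x[r], x[r+4], x[r+8], ...
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor W_n^k
        a = subs[0][k]
        b = w * subs[1][k]
        c = w * w * subs[2][k]
        d = w * w * w * subs[3][k]
        # Radix-4 butterfly: a 4-point DFT of (a, b, c, d)
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out

# Compare against the direct DFT on a 16-point input.
signal = [complex(i, -i) for i in range(16)]
max_error = max(abs(u - v) for u, v in zip(fft_radix4(signal), dft_naive(signal)))
```

Each butterfly uses three twiddle multiplications and eight complex additions, which is where the per-stage operation counts discussed later come from.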
Current Objectives
  • Validate the number of complex operations in our implementation against the theoretical operation count
  • Document the work done till now
  • Make a website of the project
  • Study FFTW code (also figure out the reasons for its efficiency)
  • Compile and run the code with the Intel compiler (icc) / Visual C++
Validating the computations
  • Incorrect theoretical formula found on cnx.org
  • Theoretical formula (for the number of complex computations):
    • (11/4) * n * log4(n) = 8960 (correct)
    • (3/4) * n * log4(n) = 3840 (incorrect)
  • Actual count: 8960
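
One way to validate a count of this kind is to derive it from the recursion itself and compare it with the closed form. The sketch below is an illustration under a stated assumption: it counts only the twiddle-factor multiplications of a radix-4 decomposition (3 per butterfly, n/4 butterflies per stage). The slides do not say which operations their figures include, so this convention may differ from theirs. `twiddle_mults` and `closed_form` are hypothetical names.

```python
def twiddle_mults(n):
    """Count twiddle multiplications via the recurrence T(n) = 4*T(n/4) + 3*(n/4)."""
    if n == 1:
        return 0
    if n < 1 or n % 4:
        raise ValueError("n must be a power of 4")
    return 4 * twiddle_mults(n // 4) + 3 * (n // 4)

def closed_form(n):
    """Evaluate (3/4) * n * log4(n) in integer arithmetic for n a power of 4."""
    stages = 0
    m = n
    while m > 1:
        m //= 4
        stages += 1
    return 3 * n * stages // 4

# The recurrence and the closed form should agree for every power of 4.
for size in (4, 16, 64, 256):
    assert twiddle_mults(size) == closed_form(size)
```

The same pattern applies to other conventions: change the per-butterfly cost in the recurrence and the coefficient in the closed form together, then check that they still agree.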

Documentation and website
  • Website of the project –
    • www.cse.iitd.ac.in/~cs1030186/btp
  • Includes the details and results of our experiments (up to last week)
Running on the Intel compiler (icc)
  • No improvement
  • Possible reasons –
    • Tested on Intel Pentium Mobile
    • This processor does not support optimizations such as those that exploit SSE3 instructions (-fast flag)
FFTW code
  • 56,489+ LOC (written in OCaml and C)
  • We decided to study why FFTW is so fast (before going into the code itself)
  • Texts we came across in this context –
    • The Design and Implementation of FFTW3 (Matteo Frigo and Steven G. Johnson)
    • Documentation of FFTW
Why is FFTW fast?
  • The transform is computed by an executor, composed of highly optimized, composable blocks of C code called codelets
    • At runtime, a ‘planner’ finds an efficient way to compose codelets: it measures the speed of different plans and chooses the best using a dynamic programming algorithm
    • The executor interprets the plan with negligible overhead
    • Codelets are generated automatically and are fast
  • The executor implements the recursive divide-and-conquer Cooley-Tukey FFT algorithm
  • Basically, it adapts to hardware in order to maximize performance
  • ‘Performance has little to do with the number of operations. Fast code must exploit the instruction-level parallelism of the processor. It is important to write the code in such a way that the C compiler can schedule it efficiently’
  • It uses some tricky optimizations like –
  • It also exploits SIMD instructions
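
The "measure and choose" idea behind the planner can be illustrated with a toy version. This is only a sketch of the principle, not FFTW's planner: FFTW searches over compositions of codelets with dynamic programming, whereas the code below simply times two complete implementations and keeps the faster one. `make_plan`, `dft_naive` and `fft_radix2` are hypothetical names, and the input length is assumed to be a power of 2.

```python
import cmath
import time

def dft_naive(x):
    """Direct O(n^2) DFT."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out

def make_plan(n, candidates, repeats=3):
    """Time each candidate on a probe input of length n and return the fastest."""
    probe = [complex(i % 7, 0) for i in range(n)]
    best, best_time = None, float("inf")
    for candidate in candidates:
        start = time.perf_counter()
        for _ in range(repeats):
            candidate(probe)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = candidate, elapsed
    return best

# Plan once for a given size, then reuse the chosen routine for every transform.
plan = make_plan(64, [dft_naive, fft_radix2])
spectrum = plan([1, 0, 0, 0] * 16)
```

The planning cost is paid once per transform size on the target machine, which is how this approach adapts to the hardware it runs on.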
Further plan?
  • Since FFTW supports MPI and adapts itself to the given hardware architecture, we may use it as is.
  • www.fftw.org
  • The Design and Implementation of FFTW3 (Matteo Frigo and Steven G. Johnson)
  • The Fastest Fourier Transform in the West (Matteo Frigo and Steven G. Johnson)