The FFT on a GPU. Graphics Hardware 2003, July 27, 2003. Kenneth Moreland (Sandia National Labs) and Edward Angel (U. of New Mexico).

The FFT on a GPU

Graphics Hardware 2003

July 27, 2003

Kenneth Moreland, Sandia National Labs

Edward Angel, U. of New Mexico

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.


Overview

  • Introduction

    • Motivation, FFT review.

  • FFT Techniques

    • Exploitable FFT properties.

  • Implementation

  • Results

    • Performance, applications, conclusions.



Motivation

  • The Fourier transform is a principal tool for digital image processing.

    • Filtering.

    • Correction.

    • Compression.

    • Classification.

    • Generation.

  • As such, should not our graphics hardware support such a tool?



The Discrete Fourier Transform

  • Converts data in the spatial or temporal domain into the frequencies that make up the data.



The Discrete Fourier Transform

[Equations: definitions of the DFT and the inverse DFT (IDFT).]

  • 2D transform can be computed by applying the transform in one direction, then the other.
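The row-then-column idea can be sketched in a few lines of Python. This is a minimal illustration, not the presentation's code: the function names `dft` and `dft2d` are hypothetical, and the choice to apply the 1/N scale on the inverse transform is an assumed convention (the slides' own equations are not preserved in this transcript).

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for u in range(n):
        total = 0j
        for x in range(n):
            total += seq[x] * cmath.exp(sign * 2j * cmath.pi * u * x / n)
        # Assumed convention: only the inverse transform is scaled by 1/N.
        out.append(total / n if inverse else total)
    return out

def dft2d(image, inverse=False):
    """2D DFT: transform every row, then every column of the result."""
    rows = [dft(row, inverse) for row in image]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(row) for row in zip(*cols)]

if __name__ == "__main__":
    spectrum = dft2d([[1, 2], [3, 4]])
    print(spectrum)
```

Because the two directions are independent 1D transforms, the same 1D routine serves both passes; only the direction of traversal changes.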



The Fast Fourier Transform

  • Divide and Conquer Algorithm

    • Input sequence is divided into subsequences consisting of values from even and odd indices, respectively.
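The even/odd split described above is the classic radix-2 decimation-in-time recursion. The sketch below assumes the input length is a power of two and uses the unscaled forward transform; `fft_recursive` is an illustrative name, not something taken from the presentation.

```python
import cmath

def fft_recursive(seq):
    """Radix-2 FFT; len(seq) is assumed to be a power of two."""
    n = len(seq)
    if n == 1:
        return [complex(seq[0])]
    # Divide: transform the even-indexed and odd-indexed subsequences.
    even = fft_recursive(seq[0::2])
    odd = fft_recursive(seq[1::2])
    # Conquer: combine the two half-size transforms with twiddle factors.
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

if __name__ == "__main__":
    print(fft_recursive([0, 1, 0, 0]))
```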



Index Magic

  • Do not use recursion.

    • Use dynamic programming: iterate over entire array computing all values for each recursive depth together, like mergesort.

  • Indexing is non-obvious.

    • Unlike mergesort, recursive step does not divide array into contiguous chunks.

    • At any iteration, what partition does a given index belong to, and where can one find the applicable values of the sub-partitions?



Index Magic

  • Common solution: rearrange data by reversing the bits of indices.

    • FFT can occur with contiguous partitions.

    • Requires an extra data copy.

  • Our solution: determine indexing in place.

Note that the paper has a typo.
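For reference, the "common solution" mentioned above (bit-reversing the indices first, then iterating over contiguous partitions) can be sketched as follows. This is the textbook approach, not the in-place indexing scheme the authors propose, and the names `bit_reverse` and `fft_iterative` are hypothetical.

```python
import cmath

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def fft_iterative(seq):
    """Non-recursive radix-2 FFT; len(seq) is assumed to be a power of two."""
    n = len(seq)
    bits = n.bit_length() - 1
    assert 1 << bits == n, "length must be a power of two"
    # The extra data copy: reorder the input by bit-reversed index.
    data = [complex(seq[bit_reverse(i, bits)]) for i in range(n)]
    size = 2
    while size <= n:  # one iteration per recursive depth
        half = size // 2
        for start in range(0, n, size):  # contiguous partitions
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)
                a = data[start + k]
                b = w * data[start + k + half]
                data[start + k] = a + b
                data[start + k + half] = a - b
        size *= 2
    return data

if __name__ == "__main__":
    print(fft_iterative([0, 1, 0, 0]))
```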



Fourier Symmetry of Real Sequences

  • In general, even the frequency spectra of real functions contain imaginary values.

    • Captures magnitude and phase shift of sinusoids.

  • Brute force FFT doubles computation and storage costs.

  • But Fourier transforms of real functions have symmetry: F(u) is the complex conjugate of F(N – u).

    • Values at u = 0 and u = N/2 are real (because they are conjugates with themselves).
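The symmetry is easy to check numerically. The sketch below uses a naive DFT for clarity; `dft` and `is_conjugate_symmetric` are illustrative helpers, and the tolerance value is an arbitrary assumption.

```python
import cmath

def dft(seq):
    """Naive unscaled forward DFT."""
    n = len(seq)
    return [
        sum(seq[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
        for u in range(n)
    ]

def is_conjugate_symmetric(spectrum, tol=1e-9):
    """True if spectrum[u] equals conj(spectrum[N - u]) for every u."""
    n = len(spectrum)
    return all(
        abs(spectrum[u] - spectrum[(n - u) % n].conjugate()) < tol
        for u in range(n)
    )

if __name__ == "__main__":
    real_input = [1.0, 2.0, 0.5, -3.0, 4.0, 0.0, 1.5, 2.5]
    spectrum = dft(real_input)
    print(is_conjugate_symmetric(spectrum))
    # The entries at u = 0 and u = N/2 pair with themselves, so they are real.
    print(spectrum[0].imag, spectrum[len(spectrum) // 2].imag)
```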



Fourier Transform of Real Functions

  • Pick two functions, let them be f(x) and g(x).

  • Let h(x) = f(x) + j g(x).

    • Note that there is no loss of information.

  • The FFT of h can be computed in half the time it takes to perform the brute-force FFT of f and g individually.

    • Simply point to one row of image as real components and another as imaginary components.

[Figure: two image rows, labeled f and g.]


Untangling Fourier Transform Pairs

  • Fourier transform is linear.

    • H(u) = F(u) + j G(u)

  • We can “untangle” using symmetry of F and G.

    • Add and subtract H(u) and H(N – u) to cancel out conjugate terms of F and G.
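Putting the tangle and untangle steps together: since F(u) and G(u) are each conjugate-symmetric, conj(H(N – u)) = F(u) – j G(u), so adding it to H(u) leaves 2 F(u) and subtracting it leaves 2j G(u). A minimal sketch follows, with a naive DFT standing in for the FFT; `fft_pair` is a hypothetical name.

```python
import cmath

def dft(seq):
    """Naive unscaled forward DFT (stand-in for a real FFT)."""
    n = len(seq)
    return [
        sum(seq[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
        for u in range(n)
    ]

def fft_pair(f, g):
    """Transform two real sequences with one complex transform."""
    n = len(f)
    # Tangle: f becomes the real part, g the imaginary part.
    h = [complex(a, b) for a, b in zip(f, g)]
    H = dft(h)
    F, G = [], []
    for u in range(n):
        hc = H[(n - u) % n].conjugate()  # equals F(u) - j G(u)
        F.append((H[u] + hc) / 2)        # conjugate terms of G cancel
        G.append((H[u] - hc) / 2j)       # conjugate terms of F cancel
    return F, G

if __name__ == "__main__":
    F, G = fft_pair([1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0])
    print(F)
    print(G)
```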





Packing Transforms of Real Functions

[Figure: packed array layout with regions labeled Real Values and Imaginary Values.]

  • We can store Fourier transform in an array the same size as the input.

    • Throw away conjugate duplicates.

    • Throw away imaginary values known to be zero.
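One way to realize this packing is sketched below. The specific layout (real parts at indices 0 through N/2, imaginary parts mirrored at the end of the array) is an assumption chosen for illustration and is not necessarily the layout shown on the slide; N is assumed to be even, and the function names are hypothetical.

```python
def pack_real_spectrum(spectrum):
    """Pack a conjugate-symmetric spectrum of even length N into N reals."""
    n = len(spectrum)
    packed = [0.0] * n
    # Real parts of F(0) .. F(N/2); F(0) and F(N/2) have no imaginary part.
    for u in range(n // 2 + 1):
        packed[u] = spectrum[u].real
    # Imaginary parts of F(1) .. F(N/2 - 1), stored mirrored at index N - u.
    for u in range(1, n // 2):
        packed[n - u] = spectrum[u].imag
    return packed

def unpack_real_spectrum(packed):
    """Rebuild the full complex spectrum from the packed array."""
    n = len(packed)
    spectrum = [0j] * n
    spectrum[0] = complex(packed[0], 0.0)
    spectrum[n // 2] = complex(packed[n // 2], 0.0)
    for u in range(1, n // 2):
        value = complex(packed[u], packed[n - u])
        spectrum[u] = value
        spectrum[n - u] = value.conjugate()  # the discarded duplicate
    return spectrum

if __name__ == "__main__":
    spectrum = [10 + 0j, 1 - 2j, 3 + 0j, 1 + 2j]
    packed = pack_real_spectrum(spectrum)
    print(packed)
    print(unpack_real_spectrum(packed) == spectrum)
```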



Column-wise FFT

  • We have two columns with real values.

    • Use same “tangled” approach.

  • All other columns are complex numbers.

    • Use regular FFT.

[Figure: two columns labeled Real, paired for the tangled FFT; the remaining columns labeled Complex.]


Packing 2D Transforms of Real Functions

  • Rows transformed from complex values are already packed appropriately.

  • The two rows transformed from real values are untangled and packed to follow suit.

[Figure: packed 2D layout with regions labeled Real Values and Imaginary Values.]


Available Resources

  • nVidia GeForce FX 5800 Ultra.

    • Full 32-bit floating point pipeline and frame buffers.

    • Fully programmable vertex and fragment units.

  • Cg

    • High level language for vertex and fragment programs.

  • Traditional CPU: 1.7 GHz Intel Xeon

    • Freely available high performance FFT implementations.



Implementation

  • Using a SIMD model for parallel computation.

    • Draw quadrilateral parallel to screen.

    • Rasterizer invokes the same fragment program “in parallel” over all pixels covered by quadrilateral.

    • Inputs and outputs depend on the location of the pixel on which the fragment program is running.

  • We require many rendering passes.

    • Use “render to texture” extension.

    • Use two frame buffers: one for retrieving values of last pass and one for storing results of current computation.
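The multipass scheme can be mimicked on the CPU to show its shape. In the sketch below, `butterfly_fragment` plays the role of a fragment program: it computes one output element from its own index and reads only from the previous pass's buffer. Two lists stand in for the two frame buffers and are swapped after every pass. All names are hypothetical, and for simplicity the sketch starts with a bit-reversal reordering rather than the in-place indexing the authors describe.

```python
import cmath

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

def butterfly_fragment(src, i, size):
    """Output element i of one FFT pass; gathers its inputs from src."""
    half = size // 2
    k = i % size
    if k < half:
        w = cmath.exp(-2j * cmath.pi * k / size)
        return src[i] + w * src[i + half]
    w = cmath.exp(-2j * cmath.pi * (k - half) / size)
    return src[i - half] - w * src[i]

def fft_ping_pong(seq):
    """Radix-2 FFT as log2(N) passes over two alternating buffers."""
    n = len(seq)
    bits = n.bit_length() - 1
    assert 1 << bits == n, "length must be a power of two"
    front = [complex(seq[bit_reverse(i, bits)]) for i in range(n)]
    back = [0j] * n
    size = 2
    while size <= n:
        # One rendering pass: every element is independent of the others,
        # so this loop could run in parallel.
        for i in range(n):
            back[i] = butterfly_fragment(front, i, size)
        front, back = back, front  # swap source and destination buffers
        size *= 2
    return front

if __name__ == "__main__":
    print(fft_ping_pong([0, 1, 0, 0]))
```

Writing each pass as a pure gather keeps reads and writes in separate buffers, which is why two frame buffers are enough regardless of the number of passes.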



Implementation

[Figure: data-flow diagram between Images and Frequency Spectra. Each direction runs through the stages FFT, Untangle, FFT, Untangle. Intermediate buffers are labeled as the real and imaginary components of F and G, tangled or untangled; the remaining stage labels are Scale and Pass.]


Fragment Programs

  • Written in Cg, compiled for GeForce FX.



Applications

  • Digital image filtering.



Applications

  • Texture generation.

  • Volume rendering.



Performance

  • Computation speed: 2.5 GigaFLOPS

  • Texture read rate: 3.4 GB/sec



Conclusions

  • The Fourier transform on the GPU has many potential applications.

  • A well-established FFT on the CPU (FFTW) still has an edge over the GPU implementation.

    • Both the software and the hardware of the GPU are first generation.

    • Room for improvement.



Get the Cg Code

  • http://www.cgshaders.org ?

  • http://www.cs.unm.edu/~kmorel/documents/fftgpu

  • [email protected]



Questions?
