The FFT on a GPU



The FFT on a GPU. Graphics Hardware 2003, July 27, 2003. Kenneth Moreland (Sandia National Labs) and Edward Angel (U. of New Mexico).


Presentation Transcript


The FFT on a GPU

Graphics Hardware 2003

July 27, 2003

Kenneth Moreland, Sandia National Labs

Edward Angel, U. of New Mexico

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.


Overview

  • Introduction

    • Motivation, FFT review.

  • FFT Techniques

    • Exploitable FFT properties.

  • Implementation

  • Results

    • Performance, applications, conclusions.


Motivation

  • The Fourier transform is a principal tool for digital image processing.

    • Filtering.

    • Correction.

    • Compression.

    • Classification.

    • Generation.

  • As such, should not our graphics hardware support such a tool?


The Discrete Fourier Transform

  • Converts data in the spatial or temporal domain into the frequencies that the data comprise.

The Discrete Fourier Transform

[Equations: DFT and IDFT definitions]

  • 2D transform can be computed by applying the transform in one direction, then the other.
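The row-then-column idea can be sketched in plain Python. This is a minimal illustration, not the presenters' code: the function names `dft` and `dft2d` are hypothetical, and the normalization (unscaled forward transform, 1/N on the inverse) is an assumption, since the slide's own equations did not survive.

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(N^2) DFT of a 1D sequence.

    Assumed convention: forward transform unscaled, inverse scaled by 1/N.
    """
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for u in range(n):
        total = sum(seq[x] * cmath.exp(sign * 2j * cmath.pi * u * x / n)
                    for x in range(n))
        out.append(total / n if inverse else total)
    return out

def dft2d(image, inverse=False):
    """2D DFT: transform every row, then every column of the result."""
    rows = [dft(row, inverse) for row in image]
    # zip(*rows) yields the columns of the row-transformed data.
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    # Transpose back so the result is indexed as [row][column].
    return [list(row) for row in zip(*cols)]
```

Applying `dft2d` and then `dft2d(..., inverse=True)` should return the original image up to floating-point error.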


The Fast Fourier Transform

  • Divide and Conquer Algorithm

    • Input sequence is divided into subsequences consisting of values from even and odd indices, respectively.
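The even/odd split can be sketched as a textbook recursive radix-2 FFT. This is an illustrative sketch that assumes the input length is a power of two and an unscaled forward transform; `fft_recursive` is a hypothetical name, and the recursion is shown only to make the divide step concrete.

```python
import cmath

def fft_recursive(seq):
    """Radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(seq)
    if n == 1:
        return [complex(seq[0])]
    even = fft_recursive(seq[0::2])  # values from even indices
    odd = fft_recursive(seq[1::2])   # values from odd indices
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half before combining.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```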


Index Magic

  • Do not use recursion.

    • Use dynamic programming: iterate over entire array computing all values for each recursive depth together, like mergesort.

  • Indexing is non-obvious.

    • Unlike mergesort, recursive step does not divide array into contiguous chunks.

    • At any iteration, what partition does a given index belong to, and where can one find the applicable values of the sub-partitions?


Index Magic

  • Common solution: rearrange data by reversing the bits of indices.

    • FFT can occur with contiguous partitions.

    • Requires an extra data copy.

  • Our solution: determine indexing in place.

Note that the paper has a typo.
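For reference, the "common solution" mentioned above can be sketched as follows. This illustrates bit reversal followed by an iterative FFT over contiguous partitions; it is not the presenters' in-place indexing scheme, whose formula is not reproduced in this transcript. The names `bit_reverse` and `fft_iterative` are hypothetical, and the length is assumed to be a power of two.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_iterative(seq):
    """Iterative radix-2 FFT: reorder by bit-reversed index, then combine."""
    n = len(seq)
    bits = n.bit_length() - 1
    # This reordering is the extra data copy the slide refers to.
    data = [complex(seq[bit_reverse(i, bits)]) for i in range(n)]
    span = 2
    while span <= n:
        half = span // 2
        # After reordering, every partition is a contiguous chunk.
        for base in range(0, n, span):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / span)
                a = data[base + k]
                b = w * data[base + k + half]
                data[base + k] = a + b
                data[base + k + half] = a - b
        span *= 2
    return data
```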


Fourier Symmetry of Real Sequences

  • In general, the frequency spectra of real functions contain imaginary values.

    • Captures magnitude and phase shift of sinusoids.

  • Brute force FFT doubles computation and storage costs.

  • But Fourier transforms of real functions have conjugate symmetry: F(u) = F*(N – u).

    • Values at u = 0 and u = N/2 are real (because they are their own conjugates).
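The symmetry is easy to check numerically. The sketch below uses a naive DFT and an arbitrary made-up real signal; the helper name `dft` and the sample values are illustrative assumptions only.

```python
import cmath

def dft(seq):
    """Naive unscaled forward DFT, used here only for checking."""
    n = len(seq)
    return [sum(seq[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            for u in range(n)]

# Arbitrary real-valued test signal (made up for illustration).
signal = [3.0, 1.0, -2.0, 5.0, 0.5, -1.0, 4.0, 2.0]
n = len(signal)
spectrum = dft(signal)

# F(u) should equal the conjugate of F(N - u) for every u.
symmetric = all(abs(spectrum[u] - spectrum[(n - u) % n].conjugate()) < 1e-9
                for u in range(n))

# The entries at u = 0 and u = N/2 pair with themselves, so they must be real.
self_conjugate_real = (abs(spectrum[0].imag) < 1e-9
                       and abs(spectrum[n // 2].imag) < 1e-9)
```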


Fourier Transform of Real Functions

  • Pick two functions, let them be f(x) and g(x).

  • Let h(x) = f(x) + j g(x).

    • Note that there is no loss of information.

  • The FFT of h can be performed in half the time of the brute-force FFTs of f and g individually.

    • Simply point to one row of image as real components and another as imaginary components.

[Figure labels: f, g]

Untangling Fourier Transform Pairs

  • Fourier transform is linear.

    • H(u) = F(u) + j G(u)

  • We can “untangle” using symmetry of F and G.

    • Add and subtract H(u) and H(N – u) to cancel out conjugate terms of F and G.
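The pack, transform, and untangle sequence can be sketched as below. This is a minimal illustration using a naive DFT in place of a real FFT; `dft` and `transform_pair` are hypothetical names, and an unscaled forward transform is assumed.

```python
import cmath

def dft(seq):
    """Naive unscaled forward DFT standing in for the FFT."""
    n = len(seq)
    return [sum(seq[x] * cmath.exp(-2j * cmath.pi * u * x / n) for x in range(n))
            for u in range(n)]

def transform_pair(f, g):
    """Transform two real sequences with one complex transform.

    Builds h = f + j*g, transforms it once, then untangles F and G
    using the conjugate symmetry of transforms of real sequences.
    """
    n = len(f)
    h = [complex(a, b) for a, b in zip(f, g)]
    H = dft(h)
    F, G = [], []
    for u in range(n):
        hc = H[(n - u) % n].conjugate()  # equals F(u) - j*G(u)
        F.append((H[u] + hc) / 2)        # adding cancels the G terms
        G.append((H[u] - hc) / 2j)       # subtracting cancels the F terms
    return F, G
```

Note that the conjugate of H(N – u) is what gets added and subtracted; the conjugation is what turns F(N – u) and G(N – u) back into F(u) and G(u).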


Untangling Fourier Transform Pairs
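The equations for this slide did not survive in the transcript. The following is a reconstruction from the definitions on the previous slide, assuming f and g are real so that F(u) = F*(N – u) and G(u) = G*(N – u); it is a derivation sketch, not a copy of the original slide.

```latex
\begin{aligned}
H(u) &= F(u) + j\,G(u) \\
H^{*}(N-u) &= F^{*}(N-u) - j\,G^{*}(N-u) = F(u) - j\,G(u) \\
F(u) &= \tfrac{1}{2}\bigl[H(u) + H^{*}(N-u)\bigr] \\
G(u) &= \tfrac{1}{2j}\bigl[H(u) - H^{*}(N-u)\bigr]
\end{aligned}
```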

Packing Transforms of Real Functions

[Figure: packed array layout, with regions labeled Real Values and Imaginary Values]

  • We can store Fourier transform in an array the same size as the input.

    • Throw away conjugate duplicates.

    • Throw away imaginary values known to be zero.
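One way to pack such a spectrum into N real numbers is sketched below. The layout used here (real parts for u = 0..N/2, followed by imaginary parts for u = 1..N/2 – 1) is an assumption chosen for illustration; the exact layout used by the presenters is not recoverable from this transcript. The function names are hypothetical and N is assumed to be even.

```python
def pack_real_spectrum(spectrum):
    """Pack a conjugate-symmetric spectrum of length N into N real values.

    Assumed layout: [Re F(0), Re F(1..N/2-1), Re F(N/2), Im F(1..N/2-1)].
    """
    n = len(spectrum)
    half = n // 2
    packed = [spectrum[u].real for u in range(0, half + 1)]
    # Im F(0) and Im F(N/2) are known to be zero, so they are not stored.
    packed += [spectrum[u].imag for u in range(1, half)]
    return packed

def unpack_real_spectrum(packed):
    """Rebuild the full complex spectrum from the packed layout above."""
    n = len(packed)
    half = n // 2
    spectrum = [0j] * n
    spectrum[0] = complex(packed[0], 0.0)
    spectrum[half] = complex(packed[half], 0.0)
    for u in range(1, half):
        value = complex(packed[u], packed[half + u])
        spectrum[u] = value
        spectrum[n - u] = value.conjugate()  # the discarded duplicate
    return spectrum
```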


Column-wise FFT

  • We have two columns with real values.

    • Use same “tangled” approach.

  • All other columns are complex numbers.

    • Use regular FFT.

[Figure labels: Real, Real (paired), Complex]

Packing 2D Transforms of Real Functions

  • Rows transformed from complex values are already packed appropriately.

  • The two rows transformed from real values are untangled and packed to follow suit.

[Figure: packed 2D layout, with regions labeled Real Values and Imaginary Values]

Available Resources

  • nVidia GeForce FX 5800 Ultra.

    • Full 32-bit floating point pipeline and frame buffers.

    • Fully programmable vertex and fragment units.

  • Cg

    • High level language for vertex and fragment programs.

  • Traditional CPU: 1.7 GHz Intel Xeon

    • Freely available high performance FFT implementations.


Implementation

  • Using a SIMD model for parallel computation.

    • Draw quadrilateral parallel to screen.

    • Rasterizer invokes the same fragment program “in parallel” over all pixels covered by quadrilateral.

    • Inputs and outputs depend on the location of the pixel for which the fragment program is running.

  • We require many rendering passes.

    • Use “render to texture” extension.

    • Use two frame buffers: one for retrieving values of last pass and one for storing results of current computation.
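The multi-pass, two-buffer pattern can be imitated on the CPU. In the sketch below, `butterfly_fragment` is a hypothetical stand-in for a fragment program: it computes exactly one output value from the previous pass's buffer, based only on its own index. The bit-reversed initial ordering is an assumption made to keep the example short; it is not the indexing scheme described in the talk.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def butterfly_fragment(src, i, span):
    """Compute output element i of one pass, reading only from src."""
    half = span // 2
    pos = i % span
    if pos < half:
        w = cmath.exp(-2j * cmath.pi * pos / span)
        return src[i] + w * src[i + half]
    w = cmath.exp(-2j * cmath.pi * (pos - half) / span)
    return src[i - half] - w * src[i]

def fft_ping_pong(seq):
    """FFT as a series of passes over two buffers (length a power of two)."""
    n = len(seq)
    bits = n.bit_length() - 1
    read = [complex(seq[bit_reverse(i, bits)]) for i in range(n)]
    write = [0j] * n
    span = 2
    while span <= n:
        # On the GPU, the rasterizer would run these "in parallel".
        for i in range(n):
            write[i] = butterfly_fragment(read, i, span)
        # Swap roles: this pass's output is the next pass's input.
        read, write = write, read
        span *= 2
    return read
```

Because each output depends only on the read buffer, no pass ever reads a value that the same pass has already overwritten, which is the reason for keeping two buffers.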

Implementation

[Figure: data flow between Images and Frequency Spectra. Stage labels: FFT, Untangle, FFT, Untangle (shown twice). Intermediate buffers hold the real and imaginary parts of F and G in tangled and untangled form, with Scale and Pass steps.]


Fragment Programs

  • Written in Cg, compiled for GeForce FX.


Applications

  • Digital image filtering.


Applications

  • Texture generation.

  • Volume rendering.


Performance

  • Computation speed: 2.5 GigaFLOPS

  • Texture read rate: 3.4 GB/sec


Conclusions

  • The Fourier transform on the GPU has many potential applications.

  • A well-established FFT on the CPU (FFTW) still has an edge over the GPU implementation.

    • Both the software and the hardware of the GPU are first generation.

    • Room for improvement.


Get the Cg Code

  • http://www.cgshaders.org ?

  • http://www.cs.unm.edu/~kmorel/documents/fftgpu

  • [email protected]


Questions?
