1 / 31

Parallel FFT in Julia

Parallel FFT in Julia. Review of FFT. Review of FFT (cont.). Review of FFT (cont.). Sequential FFT Pseudocode. Recursive-FFT ( array ) arrayEven = even indexed elements of array arrayOdd = odd indexed elements of array Recursive-FFT ( arrayEven ) Recursive-FFT ( arrayOdd )

hibbard
Download Presentation

Parallel FFT in Julia

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Parallel FFT in Julia

  2. Review of FFT

  3. Review of FFT (cont.)

  4. Review of FFT (cont.)

  5. Sequential FFT Pseudocode Recursive-FFT ( array ) • arrayEven = even indexed elements of array • arrayOdd = odd indexed elements of array • Recursive-FFT ( arrayEven ) • Recursive-FFT ( arrayOdd ) • Combine results using Cooley-Tukey butterfly • Bit reversal, could be done either before, after or in between

  6. Parallel FFT Algorithms • Binary Exchange Algorithm • Transpose Algorithm

  7. Binary Exchange Algorithm

  8. Binary Exchange Algorithm

  9. Binary Exchange Algorithm

  10. Parallel Transpose Algorithm

  11. Parallel Transpose Algorithm

  12. Julia implementations • Input is represented as distributed arrays. • Assumptions: N, P are powers of 2 for sake of simplicity • More focus on minimizing communication overhead versus computational optimizations

  13. Easy Recursive-FFT ( array ) • …………. • @spawn Recursive-FFT ( arrayEven ) • @spawn Recursive-FFT ( arrayOdd ) • …………. Same as FFTW parallel implementation for 1D input using Cilk.

  14. Not so fast Too much unnecessary overhead because of random spawning. Better: Recursive-FFT ( array ) • …………. • @spawnat owner Recursive-FFT ( arrayEven ) • @spawnat owner Recursive-FFT ( arrayOdd ) • ………….

  15. Binary Exchange Implementation • FFT_Parallel ( array ) • Bit reverse input array and distribute • @spawnat owner Recursive-FFT ( first half ) • @spawnat owner Recursive-FFT ( last half ) • Combine results

  16. Binary Exchange Implementation

  17. Binary Exchange Implementation

  18. Binary Exchange Implementation

  19. Binary Exchange Implementation

  20. Binary Exchange Implementation

  21. Alternate approach – Black box • Initially similar to parallel transpose method: data is distributed so that each sub-problem is locally contained within one node • FFT_Parallel ( array ) • Bit reverse input array and distribute equally • for each processor • @spawnat proc FFT-Sequential( local array ) • Redistribute data and combine locally

  22. Alternate approach

  23. Alternate approach – Black box Pros: • Eliminates needs for redundant spawning which is expensive • Can leverage black box packages such as FFTW for local sequential run, or black box combiners • Warning: Order of input to sub-problems is important Note: • Have not tried FFTW due to configuration issues

  24. Benchmark Caveats array = redist(array, 1, 5)

  25. Benchmark Results

  26. Benchmark Results

  27. Communication issues

  28. New Results

  29. New Results

  30. More issues and considerations • Communication cost: where and how. Better redistribution method. • Leverage of sequential FFTW on black box problems • A separate algorithm, better data distribution?

  31. Questions

More Related