Scalable Multi-core Sonar Beamforming with Computational Process Networks

John F. Bridgman, III, Gregory E. Allen, and Brian L. Evans
Applied Research Laboratories and Dept. of Electrical and Computer Engineering
The University of Texas at Austin, Austin, Texas



Motivation

• Sonar beamforming requires significant computation and input/output
• Beamforming is traditionally done with custom hardware
• We would like to use inexpensive commodity computer hardware
• To achieve real-time performance, a parallel implementation is required
• OpenMP and other fork-and-join models do not scale as well as we would like
• We use Computational Process Networks for more scalability
• This allows more efficient use of current multi-core computer hardware

Algorithm

• Inputs to the beamformer are complex basebanded 16-bit elements
• The beamformer is separated into vertical and horizontal components
• The vertical beamformer produces three sets of vertical output beams
• The vertical beamformer is implemented as a four-tap FIR filter
• Three horizontal beamformers concurrently produce the final beam output
• The horizontal beamformer uses circular convolution with an FFT
• Geometric symmetry is exploited to reduce the number of calculations

[Figure: Beamformer block diagram]
[Figure: Calculation for vertical beamformer]

Sonar Beamforming

• A beamformer is a spatial filter that steers an array in a desired direction
• Beamforming is often implemented as a weighted delay-and-sum of sensors
• Delays are the distance to a plane perpendicular to the steering direction
• This array is cylindrical with 12 vertical elements at each horizontal position
• There are 256 horizontal positions regularly spaced around a circle
• The horizontal gaps provide space for mechanical structures

[Figure: Calculation for horizontal beamformer]
[Figure: Simulated beam pattern]
[Figure: Steps of the horizontal beamformer]
[Figure: Top view of half the array, with projections onto a plane for steering]

Implementation

• The horizontal kernel uses FFTW; both the horizontal and vertical kernels use SSE3
• Each kernel uses OpenMP internally for data parallelism
• We run tests on dual quad-core 2.4 GHz Intel Nehalem processors with Hyper-Threading
• We use RedHat Enterprise Linux Server 5.5 and GCC 4.1.2
• We enable an increasing number of cores to evaluate scalability for several cases
• OpenMP provides “active” (busy wait, the default) and “passive” (OS assisted) waiting
• We compare the system composed with OpenMP to the system composed with CPN
• We measure throughput in samples per second of the entire system

Computational Process Networks (CPN)

• Kahn Process Networks are a formal model of concurrency
• This model provides provably deterministic behavior, but assumes unbounded queues
• Processes and queues are represented by a directed graph
• The directed graph is similar to the block diagram of the system
• CPN is a model and framework for high-throughput signal processing
• CPN uses Parks’ bounded scheduling of process networks
• CPN has enhancements for high performance: multi-token transactions, multi-channel queues and firing thresholds
• The CPN framework exploits both SMP and cluster parallelism

Results

• The default OpenMP setting (“active”) hinders performance in both cases
• The plateau is caused by the transition to Hyper-Threaded cores
• The CPN version is 13.2% faster than the OpenMP-only version at 8 cores
• At the peak, the CPN version operates at 27.3 GFLOPS
• The CPN framework increases beamformer scalability and performance
• The CPN framework can trivially provide a distributed implementation

[Figure: Average throughput versus number of cores]
[Figure: Beamformer realization in CPN]

CPN available at http://webspace.utexas.edu/gallen/

This work was supported by the Independent Research and Development Program at Applied Research Laboratories: The University of Texas at Austin.
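The weighted delay-and-sum described under Sonar Beamforming can be sketched for a single ring of elements. This is a minimal illustration, not the poster's implementation: the function names, the 1 m radius, the 1500 m/s sound speed, the 1 kHz tone, and the narrowband phase-shift approximation are all assumptions, and the gaps in the real array are ignored.

```python
import cmath
import math


def steering_delays(num_positions, radius_m, steer_rad, sound_speed=1500.0):
    """Delay in seconds from each element to a plane perpendicular to the
    steering direction. The plane is taken tangent to the circle at the
    steering angle (an assumption), so every delay is non-negative."""
    delays = []
    for k in range(num_positions):
        phi = 2.0 * math.pi * k / num_positions  # element angle on the circle
        distance = radius_m * (1.0 - math.cos(phi - steer_rad))
        delays.append(distance / sound_speed)
    return delays


def delay_and_sum(samples, delays, weights, freq_hz):
    """Narrowband weighted delay-and-sum: each delay is applied as a phase
    shift that undoes the propagation lag before the weighted sum."""
    total = 0j
    for x, tau, w in zip(samples, delays, weights):
        total += w * x * cmath.exp(2j * math.pi * freq_hz * tau)
    return total


if __name__ == "__main__":
    n, freq = 256, 1000.0
    delays = steering_delays(n, 1.0, 0.0)
    # Simulated plane wave arriving from the steering direction.
    wave = [cmath.exp(-2j * math.pi * freq * t) for t in delays]
    weights = [1.0 / n] * n
    print(abs(delay_and_sum(wave, delays, weights, freq)))
```

When the simulated wave arrives from the steered direction the phase shifts cancel and the output magnitude is close to 1; steering elsewhere leaves the phases misaligned and the magnitude drops.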
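The Algorithm section states that the horizontal beamformer uses circular convolution with an FFT. The sketch below shows that general technique only: it multiplies the transforms of two equal-length sequences and inverts the product. The poster uses FFTW; the small recursive radix-2 FFT here is a stand-in written for self-containment, and all function names are hypothetical.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT. The inverse is left unscaled; the caller
    divides by the length."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def circular_convolve_fft(a, b):
    """Circular convolution through the frequency domain."""
    n = len(a)
    product = [p * q for p, q in zip(fft(a), fft(b))]
    return [v / n for v in fft(product, inverse=True)]


def circular_convolve_direct(a, b):
    """Reference O(n^2) circular convolution, used to check the FFT path."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


if __name__ == "__main__":
    x = [complex(i, -i) for i in range(8)]
    w = [complex(1, i % 3) for i in range(8)]
    fast = circular_convolve_fft(x, w)
    slow = circular_convolve_direct(x, w)
    print(max(abs(f - s) for f, s in zip(fast, slow)))
```

The FFT route costs on the order of n log n operations against n squared for the direct sum, which is one plausible reason to prefer it when many beams around a ring are formed from the same weights.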
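The process-network model in the CPN section (processes connected by queues, with deterministic results) can be illustrated with threads and bounded blocking queues from the Python standard library. This is not the CPN framework or its API: the names are hypothetical, the four filter taps are placeholders, and the queues here have a fixed bound rather than the growth policy of Parks' bounded scheduling. The middle process is a four-tap FIR filter, loosely echoing the vertical beamformer.

```python
import queue
import threading

_END = object()  # sentinel marking the end of a stream


def source(values, out_q):
    """Process that writes a stream of values to its output queue."""
    for v in values:
        out_q.put(v)  # blocks while the bounded queue is full
    out_q.put(_END)


def fir_filter(taps, in_q, out_q):
    """Process that applies an FIR filter to the stream it reads."""
    history = [0.0] * len(taps)
    while True:
        v = in_q.get()  # blocking read, as in a Kahn process network
        if v is _END:
            out_q.put(_END)
            return
        history = [v] + history[:-1]
        out_q.put(sum(t * h for t, h in zip(taps, history)))


def run_network(values, taps, queue_size=2):
    """Wire source -> FIR -> sink and run each process in its own thread."""
    q1 = queue.Queue(maxsize=queue_size)
    q2 = queue.Queue(maxsize=queue_size)
    results = []

    def sink():
        while True:
            v = q2.get()
            if v is _END:
                return
            results.append(v)

    threads = [
        threading.Thread(target=source, args=(values, q1)),
        threading.Thread(target=fir_filter, args=(taps, q1, q2)),
        threading.Thread(target=sink),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_network([1, 2, 3, 4, 5], [1, 1, 1, 1]))
```

Because every process only performs blocking reads and writes on its own queues, the output sequence does not depend on thread scheduling or on the queue bound, which is the determinism property the model is valued for.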
