GPU Cluster for Scientific Computing and Large-Scale Simulation
Presentation Transcript

GPU Cluster for Scientific Computing and Large-Scale Simulation

Zhe Fan, Feng Qiu, Arie Kaufman, Suzanne Yoakum-Stover

Center for Visual Computing and Department of Computer Science, Stony Brook University

http://www.cs.sunysb.edu/~vislab/projects/gpgpu/GPU_Cluster/GPU_Cluster.html


  • Stony Brook Visual Computing Cluster

    • GPU Cluster

    • 35 nodes with nVIDIA GeForce FX 5800 Ultra

    • Gigabit Ethernet

    • 70 Pentium Xeon 2.4 GHz CPUs

    • 35 VolumePro 1000 volume rendering boards

    • 9 HP Sepia-2A compositing boards with ServerNet II

  • Scale up LBM to the GPU Cluster

    • Each GPU computes a sub-lattice

    • Particles stream out of the sub-lattice

      • Gather particle distributions in a texture

      • Read out from GPU in a single operation

      • Transfer through Gigabit Ethernet via MPI

      • Write into neighboring GPU nodes (a sketch of this exchange follows this list)
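
A minimal sketch of that exchange, assuming a 1-D decomposition along z and MPI over the Gigabit Ethernet interconnect. The 80 x 80 face, the five-distribution slab (a D3Q19 assumption), and the gather/write steps marked in comments stand in for the texture operations; only the MPI calls are meant literally.

    #include <mpi.h>
    #include <vector>

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        // Neighbors in a 1-D decomposition along z; MPI_PROC_NULL turns
        // the transfers at the two ends of the domain into no-ops.
        int up   = (rank + 1 < nprocs) ? rank + 1 : MPI_PROC_NULL;
        int down = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;

        // One boundary slab of an 80 x 80 x 80 sub-lattice; the factor 5
        // assumes the five D3Q19 distributions that cross a z face.
        const int slab = 80 * 80 * 5;
        std::vector<float> send_up(slab), send_down(slab);
        std::vector<float> recv_up(slab), recv_down(slab);

        // (1) gather outgoing distributions into a texture on the GPU;
        // (2) read them out in a single operation into send_*; then:
        MPI_Sendrecv(send_up.data(),   slab, MPI_FLOAT, up,   0,
                     recv_down.data(), slab, MPI_FLOAT, down, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(send_down.data(), slab, MPI_FLOAT, down, 1,
                     recv_up.data(),   slab, MPI_FLOAT, up,   1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        // (3) write recv_* into this node's boundary layer on the GPU.

        MPI_Finalize();
    }

MPI_Sendrecv pairs each send with the opposite receive, so the two directions cannot deadlock regardless of the order in which ranks reach the call.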

  • Network performance optimization (a non-blocking MPI sketch follows this list):

    • Conduct network transfer while computing

    • Schedule transfers to reduce the likelihood of interrupting computation

    • Simplify the connection pattern
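
One way to realize that overlap, sketched with non-blocking MPI: post the transfer first, compute the interior of the sub-lattice (which needs no remote data) while the network is busy, and let only the boundary pass wait. update_interior and update_boundary are hypothetical stand-ins for the GPU passes, and only one transfer direction is shown.

    #include <mpi.h>
    #include <vector>

    // Hypothetical stand-ins for the two GPU passes of one LBM step.
    static void update_interior() { /* sites needing no remote data   */ }
    static void update_boundary() { /* sites fed by the received slab */ }

    int main(int argc, char** argv)
    {
        MPI_Init(&argc, &argv);
        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        int up   = (rank + 1 < nprocs) ? rank + 1 : MPI_PROC_NULL;
        int down = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;

        const int slab = 80 * 80 * 5;
        std::vector<float> send_up(slab), recv_down(slab);

        // Post the transfer before computing, so it proceeds in the
        // background instead of interrupting the computation.
        MPI_Request reqs[2];
        MPI_Irecv(recv_down.data(), slab, MPI_FLOAT, down, 0,
                  MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(send_up.data(),   slab, MPI_FLOAT, up,   0,
                  MPI_COMM_WORLD, &reqs[1]);

        update_interior();                 // overlaps the network transfer
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        update_boundary();                 // only this pass waits on the wire

        MPI_Finalize();
    }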

  • Times Square Area of NYC

    (Figure: flow streamlines)

    • 0.31 seconds per step on 30 GPUs

    • 4.6 times faster than the software version on 30 CPUs

  • 1.66 km × 1.13 km area

  • 91 blocks

  • 851 buildings

  • 480 × 400 × 80 lattice
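
An observation the poster does not state explicitly: 480 × 400 × 80 divides evenly into a 6 × 5 × 1 grid of 80 × 80 × 80 sub-lattices, i.e. exactly one per GPU in the 30-GPU run, and the same sub-lattice size as in the speedup benchmark below.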

  • LBM on the GPU

    Application: large-scale CFD simulations using the Lattice Boltzmann Model (LBM)

    LBM Computation:

    • Particles stream along lattice links

    • Particles collide when they meet at a site

      Map to GPU:

    • Pack 3D lattice states into a series of 2D textures

    • Update the lattice with fragment programs (the slice packing is sketched after this list)
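
A sketch of that packing, under assumptions: the 80 z slices of one node's 80 × 80 × 80 sub-lattice (the size quoted in the speedup section) are tiled into a single 2-D texture, so that one rasterized pass, the fragment program, touches every 3-D site. The 10 × 8 tile layout and the CPU loop standing in for the fragment program are illustrative, not the authors' exact scheme.

    #include <vector>

    // Per-node sub-lattice and the tile grid inside one 2-D texture.
    const int NX = 80, NY = 80, NZ = 80;
    const int TX = 10, TY = 8;          // 10 * 8 tiles = 80 z slices
    const int TEX_W = TX * NX;          // 800 texels wide
    const int TEX_H = TY * NY;          // 640 texels high

    // Map a 3-D lattice site to its texel in the flattened texture.
    inline int texel(int x, int y, int z)
    {
        int u = (z % TX) * NX + x;      // tile column, then offset
        int v = (z / TX) * NY + y;      // tile row, then offset
        return v * TEX_W + u;
    }

    int main()
    {
        // One packed state variable; an LBM keeps one such texture
        // (or texture channel) per particle distribution.
        std::vector<float> tex(TEX_W * TEX_H, 0.0f);

        // What the fragment program does: a single pass over the 2-D
        // texture updates every 3-D lattice site exactly once.
        for (int z = 0; z < NZ; ++z)
            for (int y = 0; y < NY; ++y)
                for (int x = 0; x < NX; ++x)
                    tex[texel(x, y, z)] += 1.0f;   // placeholder update
    }

Streaming then becomes a texture fetch from the neighboring site's texel; links that cross a slice boundary simply read from a different tile.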

  • GPU Cluster / CPU Cluster Speedup

    • Each node computes an 80 x 80 x 80 sub-lattice

    • GeForce FX 5800 Ultra vs. Pentium Xeon 2.4 GHz

(Figure: dispersion plume)

  • Acknowledgements

  • NSF CCR-0306438

  • Department of Homeland Security, Environmental Measurements Laboratory

  • HP

  • TeraRecon

