Stratified Magnetohydrodynamics Accelerated Using GPUs: SMAUG

The Sheffield Advanced Code

  • The Sheffield Advanced Code (SAC) is a novel fully non-linear MHD code based on the Versatile Advection Code (VAC)
    • designed for simulations of linear and non-linear wave propagation
    • in gravitationally strongly stratified magnetised plasma.
    • Shelyag, S., Fedun, V. and Erdélyi, R., Astronomy and Astrophysics, Vol. 486, Issue 2, 2008, pp. 655-662
Numerical Diffusion
  • Central differencing can generate numerical instabilities
  • It is difficult to find solutions for shocked systems
  • We define a hyperviscosity parameter as the ratio of the third-order to the first-order forward difference of a variable (see the sketch below)
  • By tracking the evolution of the hyperviscosity we can identify numerical noise and apply smoothing where necessary
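As a rough illustration of this noise indicator, here is a minimal sketch in C; the function name, the exact stencil and the normalisation are assumptions made for illustration rather than the SMAUG formulation.

#include <math.h>

/* Sketch: ratio of the third-order to the first-order forward difference
   of a field u at index i. Smooth data gives a small ratio; grid-scale
   noise drives the ratio up, flagging where smoothing is needed. */
double hyperviscosity_ratio(const double *u, int i, int n)
{
    if (i < 0 || i + 3 >= n)
        return 0.0;                                    /* stencil needs u[i..i+3] */
    double d1 = fabs(u[i + 1] - u[i]);                 /* first-order forward difference */
    double d3 = fabs(u[i + 3] - 3.0 * u[i + 2]
                   + 3.0 * u[i + 1] - u[i]);           /* third-order forward difference */
    return d1 > 0.0 ? d3 / d1 : 0.0;
}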
Why MHD Using GPUs?
  • Consider a simplified 2D problem
  • Solving the flux equation
    • Derivatives computed using central differencing (a minimal kernel is sketched after this list)
    • Time stepping using Runge-Kutta
  • Excellent scaling with GPUs, but
  • Central differencing requires numerical stabilisation
  • Stabilisation with GPUs is trickier, requiring
    • A reduction/maximum routine
    • An additional and larger mesh
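To make the stencil concrete, a minimal CUDA kernel for the central-difference x-derivative of a 2D field is sketched below; the kernel name, the row-major layout and the boundary handling are assumptions, not the SMAUG implementation.

// Sketch: central-difference x-derivative of a 2D field stored row-major.
__global__ void deriv_x(const double *u, double *dudx,
                        int nx, int ny, double dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;     // column index
    int j = blockIdx.y * blockDim.y + threadIdx.y;     // row index
    if (j < ny && i > 0 && i < nx - 1)                 // interior points only
        dudx[j * nx + i] = (u[j * nx + i + 1] - u[j * nx + i - 1]) / (2.0 * dx);
}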
Halo Messaging
  • Each processor has a “ghost” layer
    • Used in the calculation of the update
    • Obtained from the neighbouring left and right processors
    • Top and bottom layers are passed to the neighbouring processors
      • These become the neighbours’ ghost layers
  • Rows are distributed over processors, N/nproc rows per processor
    • Every processor stores all N columns
  • SMAUG-MPI implements messaging using a 2D halo model for 2D problems and a 3D halo model for 3D problems
  • Consider a 2D model – for simplicity, distribute the layers over a line of processes (illustrated in the figure below, followed by a decomposition sketch)
[Figure: halo exchange along a line of four processors. Rows 1 to N+1 are distributed so that each processor holds rows pimin to pimax; each processor sends its top and bottom layers to its two neighbours and receives the adjacent layers into its own ghost rows (e.g. Processor 2 sends its top layer and receives a bottom layer, while Processor 3 sends its bottom layer and receives a top layer).]
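As a sketch of this decomposition, the allocation below gives each rank its N/nproc owned rows plus two ghost rows; the function name and the assumption that N divides evenly by nproc are illustrative only.

#include <mpi.h>
#include <stdlib.h>

/* Sketch: each rank stores N/nproc owned rows plus one ghost row above and
   one below; every row holds all M columns. Assumes N is divisible by nproc. */
double *allocate_local_grid(int N, int M, MPI_Comm comm, int *nlocal)
{
    int rank, nproc;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nproc);
    *nlocal = N / nproc;                                        /* rows owned by this rank */
    return calloc((size_t)(*nlocal + 2) * M, sizeof(double));   /* +2 ghost rows */
}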

MPI Implementation
  • Based on the halo messaging technique employed in the SAC code

void exchange_halo(vector v) {
    // gather halo data from v into gpu_buffer1
    cudaMemcpy(host_buffer1, gpu_buffer1, ...);      // stage the device buffer on the host
    MPI_Isend(host_buffer1, ..., destination, ...);
    MPI_Irecv(host_buffer2, ..., source, ...);
    MPI_Waitall(...);
    cudaMemcpy(gpu_buffer2, host_buffer2, ...);      // copy the received halo back to the device
    // scatter halo data from gpu_buffer2 to halo regions in v
}
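The “gather” and “scatter” steps above can be written as small packing kernels on the device; the sketch below packs the first owned row of a row-major 2D field into a contiguous send buffer, with the names and layout assumed for illustration.

// Sketch: pack the first owned row (row 1, just inside the bottom ghost row)
// of a row-major 2D field into a contiguous buffer ready for MPI_Isend.
__global__ void pack_bottom_row(const double *v, double *gpu_buffer1, int M)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // column index
    if (i < M)
        gpu_buffer1[i] = v[1 * M + i];               // row 0 is the ghost row
}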

Halo Messaging with GPU Direct
  • Simpler, faster call structure

void exchange_halo(vector v) {
    // gather halo data from v into gpu_buffer1
    MPI_Isend(gpu_buffer1, ..., destination, ...);   // MPI reads directly from device memory
    MPI_Irecv(gpu_buffer2, ..., source, ...);
    MPI_Waitall(...);
    // scatter halo data from gpu_buffer2 to halo regions in v
}
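The gain here is that the two staging cudaMemcpy calls disappear: with a CUDA-aware MPI library the device pointers are handed straight to MPI_Isend/MPI_Irecv. As a hedged example, assuming Open MPI is the MPI library in use, run-time support can be checked like this:

#include <mpi.h>
#include <mpi-ext.h>      /* Open MPI extension header; availability is an assumption */

/* Returns 1 if device buffers can be passed directly to MPI calls. */
int cuda_aware_mpi_available(void)
{
#if defined(MPIX_CUDA_AWARE_SUPPORT) && MPIX_CUDA_AWARE_SUPPORT
    return MPIX_Query_cuda_support();
#else
    return 0;             /* library was not built with CUDA support */
#endif
}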

Progress with MPI Implementation
  • Successfully running two-dimensional models under GPU Direct on
    • the Wilkes GPU cluster at the University of Cambridge
    • the N8 GPU facility, Iceberg
  • The 2D MPI version has been verified
  • Currently optimising communications performance under GPU Direct
  • The 3D MPI version is implemented but still requires testing

Orszag-Tang Test

200x200 Model at t=0.1, t=0.26, t=0.42 and t=0.58s
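For reference, one common normalisation of the Orszag-Tang vortex initial conditions on the unit square (gamma = 5/3) is sketched below; the SMAUG setup may use a different scaling, so treat this as an illustration only.

#include <math.h>

/* Sketch: a widely used Orszag-Tang initial condition at point (x, y) in
   [0,1]x[0,1]; not necessarily the normalisation used in the SMAUG runs. */
void orszag_tang_init(double x, double y,
                      double *rho, double *p,
                      double *vx, double *vy,
                      double *bx, double *by)
{
    const double PI = 3.14159265358979323846;
    const double B0 = 1.0 / sqrt(4.0 * PI);
    *rho = 25.0 / (36.0 * PI);                       /* uniform density */
    *p   = 5.0 / (12.0 * PI);                        /* uniform pressure */
    *vx  = -sin(2.0 * PI * y);                       /* vortical velocity field */
    *vy  =  sin(2.0 * PI * x);
    *bx  = -B0 * sin(2.0 * PI * y);                  /* magnetic field with an X-point */
    *by  =  B0 * sin(4.0 * PI * x);
}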

A Model of Wave Propagation in the Magnetised Solar Atmosphere

The model features a flux tube with a torsional driver, embedded in a fully stratified quiet solar atmosphere based on VALIIIC

The grid size is 128x128x128, representing a box in the solar atmosphere of dimensions 1.5x2x2 Mm

The flux tube has a magnetic field strength of 1000 G

The driver amplitude is 200 km/s (an illustrative driver of this kind is sketched below)
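The slide does not give the functional form of the driver, so the sketch below is purely illustrative: a torsional driver of this kind imposes an azimuthal velocity about the tube axis, localised in radius and height and periodic in time.

#include <math.h>

/* Purely illustrative torsional driver: azimuthal velocity about the tube
   axis (x0, y0), Gaussian-localised in radius and height, sinusoidal in
   time. The amplitude A corresponds to the 200 km/s quoted above; every
   other parameter and the functional form itself are assumptions. */
void torsional_driver(double x, double y, double z, double t,
                      double x0, double y0, double z0,
                      double A, double dr, double dz, double period,
                      double *vx, double *vy)
{
    const double PI = 3.14159265358979323846;
    double rx = x - x0, ry = y - y0;
    double r  = sqrt(rx * rx + ry * ry) + 1e-30;     /* avoid division by zero on the axis */
    double envelope = exp(-(r * r) / (dr * dr)
                          - (z - z0) * (z - z0) / (dz * dz));
    double vphi = A * envelope * sin(2.0 * PI * t / period);
    *vx = -vphi * ry / r;                            /* components of the azimuthal unit vector */
    *vy =  vphi * rx / r;
}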

Performance Results (Hyperdiffusion disabled)
  • Timings in seconds for 100 iterations (Orszag-Tang test)
Performance Results (With Hyperdiffusion enabled)
  • Timings in seconds for 100 iterations (Orszag-Tang test)
Conclusions
  • We have demonstrated that we can successfully compute large problems by distributing across multiple GPUs
  • For 2D problems the performance using messaging with and without GPU Direct is similar.
    • This is expected to change when 3D models are tested
  • It is likely that much of the communications overhead arises from the routines used to transfer data within GPU memory
    • Performance enhancements are possible through modification of the application architecture
  • Further work is needed with larger models for comparison with the x86 MPI implementation
  • The algorithm has been implemented in 3D; testing of 3D models will be undertaken over the coming weeks