Efficient Parallelization for AMR MHD Multiphysics Calculations. Implementation in AstroBEAR. Outline. Motivation Better memory management Sweep updates Distributed tree Improved parallelization Level threading Dynamic load balancing. Outline. Motivation – HPC
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
Efficient Parallelization for AMR MHD Multiphysics Calculations
Implementation in AstroBEAR
Sweep Method for Unsplit Schemes
Distributed Tree
Level 0 base grid
Level 0 node
Parent-Child
Level 1 nodes are children of level 0 node
Level 1 grids nested within parent level 0 grid
Neighbor
Neighbor
Conservative methods require synchronization of fluxes between neighboring grids as well as the exchanging of ghost zones
Level 2 nodes are children of level 1 nodes
Level 2 grids nested within parent level 1 grids
Since the mesh is adaptive there are successive generations of grids and nodes
New generations of grids need access to data from the previous generation of grids on the same level that physically overlap
Current generation
Current generation
Previous generation
Current generation
Previous generation
Current generation
Level 1 Overlaps
Previous generation
Current generation
Level 2 Overlaps
Previous generation
Current generation
AMR Tree
Previous generation
While the AMR grids and the associated computations are normally distributed across multiple processors, the tree is often not.
Current generation
Previous generation
Each processor only needs to maintain its local sub-tree with connections to each of its grid’s surrounding nodes (parent, children, overlaps, neighbors)
For example processor 1 only needs to know about the following sub-tree
And the sub-tree for processor 2
And the sub-tree for processor 3
Tree starts at a single root node that persists from generation to generation
Processor 2
Processor 3
Processor 1
Root node creates next generation of level 1 nodes
Processor 2
Processor 3
Processor 1
Root node creates next generation of level 1 nodes
It next identifies overlaps between new and old children
Processor 2
Processor 3
Processor 1
Root node creates next generation of level 1 nodes
It next identifies overlaps between new and old children
As well as neighbors among new children
Processor 2
Processor 3
Processor 1
New nodes along with their connections are then communicated to the assigned child processor.
Processor 2
Processor 3
Processor 1
Processor 2
Processor 3
Processor 1
Processor 2
Processor 3
Processor 1
Level 1 nodes then locally create new generation of level 2 nodes
Processor 2
Processor 3
Processor 1
Level 1 nodes then locally create new generation of level 2 nodes
Processor 2
Processor 3
Processor 1
Processor 2
Processor 3
Processor 1
Level 1 grids communicate new children to their overlaps so that new overlap connections on level 2 can be determined
Processor 2
Processor 3
Processor 1
Level 1 grids communicate new children to their neighbors so that new neighbor connections on level 2 can be determined
Processor 2
Processor 3
Processor 1
Level 1 grids and their local trees are then distributed to child processors.
Processor 2
Processor 3
Processor 1
Each processor now has the necessary tree information to update their local grids.