Some Experiences on Parallel Finite Element Computations Using IBM/SP2


  1. Some Experiences on Parallel Finite Element Computations Using IBM/SP2
  Yuan-Sen Yang and Shang-Hsien Hsieh
  National Taiwan University, Taipei, Taiwan, R.O.C.

  2. Contents
  • Parallel Substructure Method
  • Three Issues:
    • Mesh Partitioning
    • Nodal Renumbering within Substructures
    • Solution of Interface DOFs
  • Conclusions

  3. Parallel Substructure Method
  • Partition a structure into several substructures.
  • Assign each substructure to a processor.
  • Perform matrix assembly and static condensation within each substructure.

  4. Parallel Substructure Method (cont.)
  • Solve for the displacements of the interface DOFs.
  • Solve for the displacements of the internal DOFs in each substructure.
  • Perform force recovery in each substructure.
  (A sketch of the condensation and recovery steps appears below.)
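To make the condensation and recovery steps concrete, here is a minimal dense-matrix sketch of static condensation (the Schur complement) for one substructure. The function names, the dense NumPy representation, and the index-array interface are illustrative assumptions for this sketch, not the implementation used in this work.

```python
import numpy as np

def condense(K, f, internal, interface):
    """Static condensation of one substructure (Schur complement).

    K and f are the substructure stiffness matrix and load vector;
    `internal` and `interface` are index arrays for the two DOF sets.
    Returns the condensed interface stiffness and load vector.
    (Hypothetical helper for illustration; not the paper's code.)
    """
    Kii = K[np.ix_(internal, internal)]
    Kib = K[np.ix_(internal, interface)]
    Kbi = K[np.ix_(interface, internal)]
    Kbb = K[np.ix_(interface, interface)]
    X = np.linalg.solve(Kii, Kib)            # Kii^{-1} * Kib
    y = np.linalg.solve(Kii, f[internal])    # Kii^{-1} * f_i
    return Kbb - Kbi @ X, f[interface] - Kbi @ y

def recover_internal(K, f, internal, interface, u_b):
    """Back-substitute for the internal displacements once u_b is known."""
    Kii = K[np.ix_(internal, internal)]
    Kib = K[np.ix_(internal, interface)]
    return np.linalg.solve(Kii, f[internal] - Kib @ u_b)
```

The condensation of each substructure is independent of the others, which is what lets the method assign one substructure per processor and defer all coupling to the interface system.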

  5. Mesh Partitioning
  • Requirements:
    • Automatic partitioning.
    • Handling of regular and irregular meshes.
    • Balanced distribution of the number of elements.
    • Minimization of the number of interface nodes.
  (A toy partitioner illustrating these goals is sketched below.)
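For illustration only, the following is a toy greedy graph-growing partitioner over an element adjacency graph. It balances element counts per part, which, as the next slide notes, does not by itself balance the condensation work. The function name and data layout are assumptions for this sketch; this is not the GR, RST, or METIS code actually used in the work.

```python
from collections import deque

def greedy_partition(adjacency, n_parts):
    """Toy greedy graph-growing partitioner (illustrative sketch).

    `adjacency` maps each element id to a list of its neighbours. Each
    part is grown breadth-first from a seed element until it holds
    roughly n / n_parts elements, balancing element counts across parts.
    """
    n = len(adjacency)
    target = -(-n // n_parts)                 # ceiling division
    part = {}
    unassigned = set(adjacency)
    for p in range(n_parts):
        size = 0
        queue = deque()
        while unassigned and size < target:
            if not queue:                     # seed (also restarts on disconnected pieces)
                queue.append(next(iter(unassigned)))
            e = queue.popleft()
            if e not in unassigned:
                continue
            unassigned.discard(e)
            part[e] = p
            size += 1
            queue.extend(nb for nb in adjacency[e] if nb in unassigned)
    for e in unassigned:                      # defensive: sweep any remainder
        part[e] = n_parts - 1
    return part
```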

  6. Experiences (Mesh Partitioning)
  • The GR, RST, and METIS partitioning algorithms are used in this work.
  • A balanced distribution of the number of elements is achieved.
  • The condensational loads, however, are unbalanced.
  (Figure: an RST partitioning result.)
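Both observations can be quantified directly from a partition: element counts per part, and the number of nodes shared by elements in more than one part. The helper below is a hypothetical sketch with an assumed data layout, not code from the paper.

```python
import numpy as np

def partition_stats(part, elem_nodes, n_parts):
    """Element balance and interface-node count for a given partition.

    `part` maps element -> part id; `elem_nodes` maps element -> node ids.
    A node touched by elements in more than one part is an interface node.
    (Hypothetical helper for illustration.)
    """
    sizes = np.zeros(n_parts, dtype=int)
    touched = {}                              # node -> set of parts touching it
    for e, p in part.items():
        sizes[p] += 1
        for node in elem_nodes[e]:
            touched.setdefault(node, set()).add(p)
    n_interface = sum(1 for parts in touched.values() if len(parts) > 1)
    return sizes, n_interface
```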

  7. Substructural Nodal Renumbering
  • Purpose: to reduce the skyline of the substructure matrix.
  • Constraint: interface nodes must be numbered after internal nodes.
  • The Reverse Cuthill-McKee algorithm (RCM; Liu & Sherman, 1975) is modified and used (a sketch of the idea follows).
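One way to respect the constraint is to apply RCM only to the subgraph of internal nodes and append the interface nodes afterwards. The sketch below does this with SciPy's reverse_cuthill_mckee; it illustrates the idea only and is not the authors' modified RCM.

```python
import numpy as np
from scipy.sparse.csgraph import reverse_cuthill_mckee

def renumber_substructure(adj, internal, interface):
    """Renumber one substructure's nodes to shrink the matrix skyline.

    `adj` is a symmetric sparse CSR adjacency matrix over all nodes of
    the substructure. RCM reorders the internal nodes among themselves;
    the interface nodes are appended so they are numbered last, as
    static condensation requires. Returns old node ids in new order.
    (Illustrative sketch, not the paper's modified RCM.)
    """
    internal = np.asarray(internal)
    sub = adj[internal][:, internal].tocsr()  # adjacency among internal nodes only
    perm = reverse_cuthill_mckee(sub, symmetric_mode=True)
    return np.concatenate([internal[perm], np.asarray(interface)])
```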

  8. Experiences (Substructure Nodal Renumbering)
  • Helps to reduce the condensational loads.
  • Rarely balances the condensational loads among processors.
  (Figures: 30STORY model, RST partitioning, 4 processors; matrix skylines without substructure nodal renumbering vs. with the modified-RCM substructure nodal renumbering.)

  9. Solution of Interface DOFs
  • Achieving high parallel efficiency for a linear equation solver is not an easy task.
  • As NP (the number of processors) increases, NI (the number of interface DOFs) increases, and the parallel efficiency decreases.

  10. Experiences (Solution of Interface DOFs)
  • In this work, a sequential direct method (Cholesky decomposition) is used (a sketch follows below).
  • NI is affected by both NP and the performance of the partitioning algorithm.
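As an illustration of this step, the sketch below assembles the condensed substructure contributions into the global interface system and solves it with a dense Cholesky factorization via SciPy. The data layout and function name are assumptions; the paper's solver is sequential and direct, but its actual implementation is not shown here.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_interface(condensed):
    """Assemble and solve the global interface system sequentially.

    `condensed` is a list of (Kc, fc, dofs) triples, one per
    substructure, where `dofs` maps local condensed equations to global
    interface DOFs. A dense Cholesky factorization stands in for the
    sequential direct solver; the assembled interface stiffness is
    assumed symmetric positive definite. (Illustrative sketch.)
    """
    n = 1 + max(d for _, _, dofs in condensed for d in dofs)
    K = np.zeros((n, n))
    f = np.zeros(n)
    for Kc, fc, dofs in condensed:
        K[np.ix_(dofs, dofs)] += Kc
        f[dofs] += fc
    return cho_solve(cho_factor(K), f)        # interface displacements u_b
```

Because this solve is sequential, it becomes the bottleneck as NP grows and the interface system gets larger, which motivates the parallel interface solvers called for in the conclusions.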

  11. Conclusions
  • Mesh partitioning:
    • The computational load of each processor is not necessarily proportional to its number of elements.
    • Minimizing the number of interface nodes reduces the size of the interface equations and usually improves the parallel efficiency.
  • Substructural nodal renumbering:
    • Always reduces the condensational loads.
    • But rarely balances the condensational loads among processors.
  • Parallel solution of interface DOFs:
    • High-efficiency parallel solvers for the interface equations are needed to improve the efficiency of the parallel substructure method.

  12. Acknowledgement
  • This research is supported by the National Science Council of the R.O.C. under Project Nos. NSC 86-2211-E-002-029 and NSC 87-2211-E-002-034.
  • The parallel computations are performed on the IBM/SP2 computers of the National Center for High-performance Computing, Hsin-Chu, Taiwan, R.O.C.

  13. IBM/SP2 in NCHC
  • Model: IBM POWER2 SuperChip (P2SC)
  • Peak floating-point performance: 480 MFLOPS
  • Memory: 128 Mbytes per node
