David Camp (IDAV)

Lessons Learned From the MPI-Hybrid Parallelism for Streamlines on Large Multi-Core Clusters Project.

Presentation Transcript


  1. Lessons Learned From the MPI-Hybrid Parallelism for Streamlines on Large Multi-Core Clusters Project
     E. WES BETHEL (LBNL), CHRIS JOHNSON (UTAH), KEN JOY (UC DAVIS), SEAN AHERN (ORNL), VALERIO PASCUCCI (LLNL), JONATHAN COHEN (LLNL), MARK DUCHAINEAU (LLNL), BERND HAMANN (UC DAVIS), CHARLES HANSEN (UTAH), DAN LANEY (LLNL), PETER LINDSTROM (LLNL), JEREMY MEREDITH (ORNL), GEORGE OSTROUCHOV (ORNL), STEVEN PARKER (UTAH), CLAUDIO SILVA (UTAH), XAVIER TRICOCHE (UTAH), ALLEN SANDERSON (UTAH), HANK CHILDS (LLNL)
     David Camp (IDAV)
     www.vacet.org

  2. MPI-Hybrid
     • Other VACET projects have shown good performance gains with MPI-Hybrid
     • This project explored the MPI-Hybrid style with two standard streamline algorithms, LOD and Static Domains
     • This talk covers some of the problems encountered and the performance gains achieved

  3. Baseline test for MPI-Hybrid
     • Original MPI test
       • Ran in 100 seconds on 128 cores
     • First MPI-Hybrid test
       • Ran in ~20,000 seconds on 128 cores
     • Final MPI-Hybrid test, after many fixes
       • Ran in 15 seconds on 128 cores

  4. VTK
     • At the heart of the VTK pipeline is the data time stamp
       • It is used to drive VTK's data flow model
     • Every action in VTK changes the data time stamp
       • vtkTimeStamp::Modified()
     • A small test found
       • vtkTimeStamp::Modified() was called ~1,000,000 times
       • A pthread_mutex_lock protects the time stamp
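The cost pattern described on this slide can be sketched in a few lines. This is not VTK's source: `LockedClock`, `AtomicClock`, and `HammerClock` are hypothetical names, and the sketch assumes the time stamp is a single process-wide counter that every modification increments. The locked variant serializes all threads on one mutex; the atomic variant is one common alternative, shown only for comparison.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global modification clock guarded by a mutex.
// Every call serializes all threads on the same lock.
class LockedClock {
public:
    std::uint64_t Tick() {
        std::lock_guard<std::mutex> guard(mutex_);
        return ++counter_;
    }

private:
    std::mutex mutex_;
    std::uint64_t counter_ = 0;
};

// Same contract, implemented with a lock-free atomic increment.
class AtomicClock {
public:
    std::uint64_t Tick() {
        return counter_.fetch_add(1, std::memory_order_relaxed) + 1;
    }

private:
    std::atomic<std::uint64_t> counter_{0};
};

// Calls Tick() ticksPerThread times from each of numThreads threads,
// then returns the next clock value.
template <typename Clock>
std::uint64_t HammerClock(Clock& clock, int numThreads, int ticksPerThread) {
    assert(numThreads > 0 && ticksPerThread >= 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&clock, ticksPerThread] {
            for (int i = 0; i < ticksPerThread; ++i) {
                clock.Tick();
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return clock.Tick();
}
```

Both clocks hand out the same values; timing `HammerClock` for each with a thread count matching the cores per node would show how much of the run is spent waiting on the lock.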

  5. Crashing in VTK
     • VTK: thread safe?
       • The documentation said it was thread safe
       • The crashes looked like memory corruption
     • VTK: documentation says thread safe
       • But many functions were documented as "Not Thread Safe"
       • Some read "This Method is Thread Safe if first called from a Single Thread and the dataset is not Modified"
     • The real answer is that VTK is not thread safe
       • vtkObjectBase did not protect its reference count variable, so the count was not safe under concurrent updates
       • Memory was being deleted before its lifetime had truly ended
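The reference-count problem can be illustrated with a minimal sketch. `RefCounted`, `AddRef`, `Release`, and `StressRefCount` are hypothetical names, not VTK's API, and the atomic counter is one standard way to protect such a count rather than the fix the project necessarily applied. With a plain `int` in place of `std::atomic<int>`, concurrent increments and decrements can be lost, which is how an object can end up deleted while another thread still uses it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical reference-counted object; not VTK's vtkObjectBase.
class RefCounted {
public:
    void AddRef() { count_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call dropped the last reference.
    bool Release() {
        const int previous = count_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0);  // releasing more often than acquiring is a bug
        return previous == 1;
    }

    int Count() const { return count_.load(std::memory_order_acquire); }

private:
    std::atomic<int> count_{1};  // the creator holds the first reference
};

// Each thread takes and drops `rounds` references; returns the final count.
int StressRefCount(RefCounted& object, int numThreads, int rounds) {
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&object, rounds] {
            for (int i = 0; i < rounds; ++i) {
                object.AddRef();
                object.Release();
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return object.Count();
}
```

After the stress run the count is back at 1, so only the creator's final `Release()` reports that the object may be destroyed.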

  6. C++ Exceptions Across Shared Libraries
     • The streamline code used an exception to handle the data boundary condition
     • Linux took a pthread_mutex_lock while handling this exception
     • The code was changed to remove the exception
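One way to remove an exception from a hot path is to report the boundary condition through a return value. The sketch below assumes a simplified 1D domain and a fixed velocity; `Domain`, `StepStatus`, `AdvanceParticle`, and `Integrate` are hypothetical names, and the slides do not say which replacement mechanism the project actually used.

```cpp
#include <cassert>

// Hypothetical 1D domain, for illustration only.
struct Domain {
    double min;
    double max;
    bool Contains(double x) const { return x >= min && x <= max; }
};

enum class StepStatus { Ok, LeftDomain };

// Advances `position` by velocity * dt. Instead of throwing when the
// particle would exit the domain, the condition is reported as a status.
StepStatus AdvanceParticle(const Domain& domain, double velocity, double dt,
                           double& position) {
    assert(dt > 0.0);
    const double next = position + velocity * dt;
    if (!domain.Contains(next)) {
        return StepStatus::LeftDomain;  // position is left unchanged
    }
    position = next;
    return StepStatus::Ok;
}

// Integrates until the particle leaves the domain or maxSteps is reached.
// Returns the number of steps taken inside the domain.
int Integrate(const Domain& domain, double velocity, double dt,
              double& position, int maxSteps) {
    int steps = 0;
    while (steps < maxSteps &&
           AdvanceParticle(domain, velocity, dt, position) == StepStatus::Ok) {
        ++steps;
    }
    return steps;
}
```

Because leaving a data block is an expected event for a streamline rather than an error, a status code keeps the common case free of stack unwinding and of any locking the runtime performs while unwinding.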

  7. VTK – Object Creation
     • VTK forces you to use its New function
     • VTK uses a factory method pattern
       • vtkObjectFactory
       • Used to override VTK classes with custom versions
       • It uses strcmp to match objects
     • strcmp was the most-called function in the streamline test
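Why strcmp can dominate a profile is easier to see with a toy factory. The sketch assumes an override table that is scanned linearly on every creation request; `NameFactory` and `OverrideEntry` are hypothetical and do not reproduce vtkObjectFactory's internals. The comparison counter exists only to make the cost visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical override table entry; not VTK's vtkObjectFactory.
struct OverrideEntry {
    const char* className;
    int implementationId;
};

class NameFactory {
public:
    void RegisterOverride(const char* className, int implementationId) {
        overrides_.push_back({className, implementationId});
    }

    // Scans the table with strcmp on every creation request.
    // Returns the override id, or -1 when the default class should be used.
    int Resolve(const char* className) {
        assert(className != nullptr);
        for (const OverrideEntry& entry : overrides_) {
            ++comparisons_;
            if (std::strcmp(entry.className, className) == 0) {
                return entry.implementationId;
            }
        }
        return -1;
    }

    std::size_t Comparisons() const { return comparisons_; }

private:
    std::vector<OverrideEntry> overrides_;
    std::size_t comparisons_ = 0;
};
```

In this model, creating N objects of a class with no override costs N times the table size in string comparisons, so code that creates small objects inside an inner loop pays for the lookup on every iteration.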

  8. I/O
     • Found that I/O was better in the MPI-only version
       • The MPI-only runs performed multiple I/O operations concurrently by default, since four processes ran on each node
     • Changed the streamline code to thread its I/O
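The slides do not describe how the I/O was threaded, so the following is only one possible reading: overlap the read of the next block with work on the current one. `LoadBlock` is a hypothetical stand-in that fabricates data instead of touching disk, and `ProcessWithPrefetch` is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for reading one data block from disk.
// Every value in block `blockId` is set to `blockId`.
std::vector<double> LoadBlock(int blockId, int size) {
    assert(size >= 0);
    return std::vector<double>(static_cast<std::size_t>(size),
                               static_cast<double>(blockId));
}

// Processes blocks 0..numBlocks-1, loading block i+1 on another thread
// while block i is being processed. Returns the sum of all values.
double ProcessWithPrefetch(int numBlocks, int blockSize) {
    if (numBlocks <= 0) {
        return 0.0;
    }
    double total = 0.0;
    std::future<std::vector<double>> pending =
        std::async(std::launch::async, LoadBlock, 0, blockSize);
    for (int i = 0; i < numBlocks; ++i) {
        std::vector<double> current = pending.get();
        if (i + 1 < numBlocks) {
            // Start the next read before doing any work on this block.
            pending = std::async(std::launch::async, LoadBlock, i + 1, blockSize);
        }
        total += std::accumulate(current.begin(), current.end(), 0.0);
    }
    return total;
}
```

A single process that issues reads from several threads gets back some of the concurrency that four independent MPI processes per node provide without any extra code.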

  9. Conclusion – Hard Work Pays Off
     • Original MPI test
       • Run on Jaguar
       • 100 seconds (10,000 streamlines, 128 cores)
     • Original MPI test with code improvements
       • Run on Franklin
       • 45 seconds (20,000 streamlines, 128 cores)
     • MPI-Hybrid test
       • Run on Franklin
       • 15 seconds (20,000 streamlines, 128 cores)
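The numbers on this slide can be turned into a speedup and a rough wall-clock cost per streamline. The arithmetic below uses only the figures quoted above; the symbols are introduced here for illustration. Only the two Franklin runs share a machine and workload, so the Jaguar figure is shown for reference and is not a like-for-like comparison.

```latex
\[
S_{\text{Franklin}}
  = \frac{T_{\text{MPI, improved}}}{T_{\text{MPI-Hybrid}}}
  = \frac{45\ \text{s}}{15\ \text{s}} = 3
\]
\[
\frac{100\ \text{s}}{10{,}000} = 10\ \text{ms},\qquad
\frac{45\ \text{s}}{20{,}000} = 2.25\ \text{ms},\qquad
\frac{15\ \text{s}}{20{,}000} = 0.75\ \text{ms}
\]
```

The per-streamline values are total run time divided by streamline count on 128 cores, not the time a single core spends on one streamline.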
