A Performance Comparison of DSM, PVM, and MPI



  1. A Performance Comparison of DSM, PVM, and MPI. Paul Werstein, Mark Pethick, Zhiyi Huang.

  2. Introduction Relatively little is known about the performance of Distributed Shared Memory (DSM) systems compared to message passing systems. We compare the performance of the TreadMarks DSM system with two popular message passing systems: MPICH (MPI) and PVM.

  3. Introduction Three applications are compared: Mergesort, Mandelbrot set generation, and a backpropagation neural network. Each application represents a different class of problem.

  4. TreadMarks DSM • Provides locks and barriers as primitives. • Uses Lazy Release Consistency. • Granularity of sharing is a page. • Creates page differentials to avoid the false sharing effect. • Version 1.0.3.3
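To make the programming model concrete, here is a minimal sketch of a TreadMarks-style program in C using the lock and barrier primitives listed above. The names (Tmk_startup, Tmk_malloc, Tmk_distribute, Tmk_lock_acquire, Tmk_barrier, Tmk_proc_id) follow the published TreadMarks API, but the header name and exact signatures may differ across versions; treat this as an illustration, not the authors' benchmark code.

    #include <stdio.h>
    #include "Tmk.h"                 /* assumed header name from the TreadMarks distribution */

    int *shared_sum;                 /* lives in DSM space once distributed */

    int main(int argc, char **argv)
    {
        Tmk_startup(argc, argv);     /* join the DSM system */

        if (Tmk_proc_id == 0) {
            shared_sum = (int *) Tmk_malloc(sizeof(int));
            *shared_sum = 0;
            Tmk_distribute((char *) &shared_sum, sizeof(shared_sum));
        }
        Tmk_barrier(0);              /* every process now sees shared_sum */

        Tmk_lock_acquire(0);         /* lazy release consistency: updates reach */
        *shared_sum += Tmk_proc_id;  /* the next process at its acquire         */
        Tmk_lock_release(0);

        Tmk_barrier(1);
        if (Tmk_proc_id == 0)
            printf("sum = %d\n", *shared_sum);

        Tmk_exit(0);
        return 0;
    }

Note that the program contains no explicit sends or receives: the runtime detects writes to the shared page and propagates differentials at synchronisation points, which is exactly the machinery the conclusion identifies as a source of overhead.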

  5. Parallel Virtual Machine • Provides the concept of a virtual parallel machine. • Exists as a daemon on each node. • Inter-process communication is mediated by the daemons. • Designed for flexibility. • Version 3.4.3.
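For comparison, a minimal PVM exchange in C: the master spawns a worker, and every message is packed into a send buffer and routed through the pvmd daemons. The calls (pvm_spawn, pvm_initsend, pvm_pkint, pvm_send, pvm_recv) are the standard PVM 3 interface; the executable name "pvm_demo" is a hypothetical placeholder.

    #include <stdio.h>
    #include "pvm3.h"

    #define MSG_TAG 1

    int main(void)
    {
        int n, child;
        pvm_mytid();                        /* enrol with the local pvmd daemon */
        int parent = pvm_parent();

        if (parent == PvmNoParent) {        /* master task */
            n = 21;
            pvm_spawn("pvm_demo", NULL, PvmTaskDefault, "", 1, &child);
            pvm_initsend(PvmDataDefault);   /* pack into a typed send buffer */
            pvm_pkint(&n, 1, 1);
            pvm_send(child, MSG_TAG);       /* routed via the daemons */
            pvm_recv(child, MSG_TAG);
            pvm_upkint(&n, 1, 1);
            printf("master got %d back\n", n);
        } else {                            /* spawned worker task */
            pvm_recv(parent, MSG_TAG);
            pvm_upkint(&n, 1, 1);
            n *= 2;
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&n, 1, 1);
            pvm_send(parent, MSG_TAG);
        }
        pvm_exit();                         /* leave the virtual machine */
        return 0;
    }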

  6. MPICH - MPI • Standard interface for developing message passing applications. • Primary design goal is performance. • Primarily defines communication primitives. • MPICH is a reference implementation of the MPI standard. • Version 1.2.4.
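The equivalent exchange in MPI, sketched below, uses only the core primitives MPI_Send and MPI_Recv. Unlike PVM there are no daemons to enrol with: processes are started together and identified by rank within a communicator.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, n;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            n = 21;
            MPI_Send(&n, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* to rank 1 */
            MPI_Recv(&n, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
            printf("rank 0 got %d back\n", n);
        } else if (rank == 1) {
            MPI_Recv(&n, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            n *= 2;
            MPI_Send(&n, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }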

  7. System • 32-node Linux cluster • 800 MHz Pentium with 256 MB RAM • Red Hat 7.2 • 100 Mbit Ethernet • Results determined for 1, 2, 4, 8, 16, 24, and 32 processes.

  8. Mergesort • Parallelisation strategy used is divide and conquer. • Synchronisation between pairs of nodes. • Loosely synchronous class problem. • Coarse-grained synchronisation. • Irregular synchronisation points. • Alternate phases of computation and communication.
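The slides do not include the benchmark source, but divide and conquer with pairwise synchronisation can be sketched as a merge tree: each process sorts its chunk locally, then in log2(P) steps half the remaining processes send their sorted runs to a partner, which merges. The following self-contained MPI illustration is ours, not the authors' code; the merge helper and random test data are illustrative.

    #include <stdio.h>
    #include <stdlib.h>
    #include <mpi.h>

    /* Merge two sorted int arrays into a newly allocated one. */
    static int *merge(const int *a, int na, const int *b, int nb)
    {
        int *out = malloc((na + nb) * sizeof(int));
        int i = 0, j = 0, k = 0;
        while (i < na && j < nb)
            out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        while (i < na) out[k++] = a[i++];
        while (j < nb) out[k++] = b[j++];
        return out;
    }

    static int cmp_int(const void *x, const void *y)
    {
        return (*(const int *)x > *(const int *)y) - (*(const int *)x < *(const int *)y);
    }

    int main(int argc, char **argv)
    {
        int rank, nprocs, mylen = 4;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* Each process starts with a locally sorted chunk (random here). */
        int *data = malloc(mylen * sizeof(int));
        srand(rank + 1);
        for (int i = 0; i < mylen; i++) data[i] = rand() % 100;
        qsort(data, mylen, sizeof(int), cmp_int);

        /* Pairwise merge tree: at step s, ranks that are multiples of 2s
           merge the run held by rank+s; log2(P) communication phases. */
        for (int step = 1; step < nprocs; step *= 2) {
            if (rank % (2 * step) == 0 && rank + step < nprocs) {
                MPI_Status st;
                int incoming;
                MPI_Probe(rank + step, 0, MPI_COMM_WORLD, &st);
                MPI_Get_count(&st, MPI_INT, &incoming);
                int *buf = malloc(incoming * sizeof(int));
                MPI_Recv(buf, incoming, MPI_INT, rank + step, 0,
                         MPI_COMM_WORLD, &st);
                int *merged = merge(data, mylen, buf, incoming);
                free(data); free(buf);
                data = merged;
                mylen += incoming;
            } else if (rank % (2 * step) == step) {
                MPI_Send(data, mylen, MPI_INT, rank - step, 0, MPI_COMM_WORLD);
                break;  /* run handed off; this process is done merging */
            }
        }

        if (rank == 0) {
            for (int i = 0; i < mylen; i++) printf("%d ", data[i]);
            printf("\n");
        }
        free(data);
        MPI_Finalize();
        return 0;
    }

The irregular synchronisation the slide mentions is visible here: each phase synchronises a different pairing of nodes, and half the processes drop out of the computation at every step.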

  9. Mergesort Results (1)

  10. Mergesort Results (2)

  11. Mandelbrot Set • Strategy used is data partitioning. • A work pool is used because the computation time of sections differs. • Work pool size >= 2 * number of processes. • Embarrassingly parallel class problem. • May involve complex computation, but there is very little communication. • Gives an indication of performance under ideal conditions.
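A work pool of this kind can be sketched as a master/worker program: the master hands out blocks of rows and refills whichever worker reports back first, so faster workers automatically receive more work. The image dimensions, block size, and tags below are illustrative choices, not values from the paper.

    #include <stdio.h>
    #include <mpi.h>

    #define WIDTH    600
    #define HEIGHT   600
    #define BLOCK    10          /* rows per work unit */
    #define TAG_WORK 1
    #define TAG_DONE 2
    #define TAG_STOP 3

    /* Escape-time iteration count for one point. */
    static int mandel(double cr, double ci)
    {
        double zr = 0.0, zi = 0.0;
        int it = 0;
        while (zr * zr + zi * zi < 4.0 && it < 255) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            it++;
        }
        return it;
    }

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        if (rank == 0) {                    /* master: owns the work pool */
            int next = 0, active = 0;
            int result[BLOCK * WIDTH];
            MPI_Status st;
            for (int w = 1; w < nprocs; w++) {        /* prime each worker */
                if (next < HEIGHT) {
                    MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                    next += BLOCK; active++;
                } else {
                    MPI_Send(&next, 0, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
                }
            }
            while (active > 0) {
                MPI_Recv(result, BLOCK * WIDTH, MPI_INT, MPI_ANY_SOURCE,
                         TAG_DONE, MPI_COMM_WORLD, &st);
                /* a real program would copy result into the image here */
                if (next < HEIGHT) {        /* refill the fastest worker */
                    MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                             MPI_COMM_WORLD);
                    next += BLOCK;
                } else {
                    MPI_Send(&next, 0, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                             MPI_COMM_WORLD);
                    active--;
                }
            }
        } else {                            /* worker: compute row blocks */
            int row, result[BLOCK * WIDTH];
            MPI_Status st;
            for (;;) {
                MPI_Recv(&row, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
                if (st.MPI_TAG == TAG_STOP)
                    break;
                for (int y = 0; y < BLOCK; y++)
                    for (int x = 0; x < WIDTH; x++)
                        result[y * WIDTH + x] =
                            mandel(-2.0 + 3.0 * x / WIDTH,
                                   -1.5 + 3.0 * (row + y) / HEIGHT);
                MPI_Send(result, BLOCK * WIDTH, MPI_INT, 0, TAG_DONE,
                         MPI_COMM_WORLD);
            }
        }
        MPI_Finalize();
        return 0;
    }

With 60 blocks for up to 32 processes, the pool satisfies the "at least twice the number of processes" rule of thumb from the slide, which keeps all workers busy despite the uneven cost of different regions of the set.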

  12. Mandelbrot Set Results

  13. Neural Network (1) • Strategy is data partitioning. • Each processor trains the network on a subsection of the data set. • Changes are summed and applied at the end of each epoch. • Requires large data sets to be effective.
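The end-of-epoch summation maps naturally onto a global reduction. The sketch below uses MPI (the slides do not show which primitives the authors used): each process accumulates weight changes over its slice of the data, then MPI_Allreduce sums the changes across all processes so every rank applies identical updates. accumulate_deltas and the network size are hypothetical stand-ins for the real backpropagation step.

    #include <mpi.h>

    #define NWEIGHTS 1024            /* illustrative network size */

    /* Hypothetical stand-in for one backpropagation pass over one
       training example; adds that example's weight changes into delta. */
    void accumulate_deltas(const double *weights, int example, double *delta);

    void train_epoch(double *weights, int my_first, int my_count)
    {
        double local_delta[NWEIGHTS], global_delta[NWEIGHTS];

        for (int i = 0; i < NWEIGHTS; i++)
            local_delta[i] = 0.0;

        /* Each process trains on its own subsection of the data set. */
        for (int e = my_first; e < my_first + my_count; e++)
            accumulate_deltas(weights, e, local_delta);

        /* End-of-epoch synchronisation: sum all processes' changes. */
        MPI_Allreduce(local_delta, global_delta, NWEIGHTS, MPI_DOUBLE,
                      MPI_SUM, MPI_COMM_WORLD);

        for (int i = 0; i < NWEIGHTS; i++)
            weights[i] += global_delta[i];   /* identical on every rank */
    }

This also shows why large data sets are needed for efficiency: the reduction costs the same regardless of how few examples each process trained on, so small partitions leave the communication cost undiluted.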

  14. Neural Network (2) • Synchronous class problem. • Characterised by an algorithm that carries out the same operation on all points in the data set. • Synchronisation occurs at regular points. • Often applies to problems that use data partitioning. • A large number of problems appear to belong to the synchronous class.

  15. Neural Network Results (1)

  16. Neural Network Results (2)

  17. Neural Network Results (3)

  18. Conclusion • In general, the performance of DSM is poorer than that of MPICH or PVM. • Main reasons identified are: • The increased memory use associated with the creation of page differentials. • The false sharing effect due to the page granularity of sharing. • Differential accumulation in the gather operation.
