
Send Buffer Access


Presentation Transcript


  1. Send Buffer Access MPI Forum meeting, 1/2008

  2. Send Buffer Access • Background: the MPI 1.1 standard prohibits users from reading the send buffer until the send operation completes, whether the access comes from the same thread in the case of an asynchronous send or from another thread in the case of a blocking send. The rationale in the MPI 1.1 standard was to enable good performance with a DMA engine that is not cache-coherent with the main processor. • Proposal: remove the access restriction on send buffers.
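To make the restriction concrete, here is a minimal C sketch (not from the slides; the buffer, count, destination rank, and tag are illustrative) of the pattern MPI 1.1 forbids: read-only access to a send buffer while a nonblocking send on it is still in flight.

    #include <mpi.h>

    void read_while_sending(const double *buf, int count, int dest, MPI_Comm comm)
    {
        MPI_Request req;
        double sum = 0.0;

        MPI_Isend((void *)buf, count, MPI_DOUBLE, dest, 0 /* tag */, comm, &req);

        /* Read-only access before the send completes: prohibited by MPI 1.1,
         * even though the buffer is never written. */
        for (int i = 0; i < count; ++i)
            sum += buf[i];

        MPI_Wait(&req, MPI_STATUS_IGNORE);
        (void)sum;  /* silence unused-variable warnings */
    }

Under the proposal, the read loop above would become legal; writing to the buffer before MPI_Wait would still be an error.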

  3. Why not? • It is a change, and a change to the standard requires a good reason • It would require changing any MPI implementation that modifies the send buffer in place • Currently no such implementation is known • The restriction has performance implications • Currently no known implementation takes advantage of it, but future hardware might; for example, hardware that switches off the buffer's pages while sending.
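As a rough illustration of the "switch off the pages" idea, an implementation relying on the restriction could, in principle, revoke access to the buffer's pages for the duration of the send. The sketch below is purely hypothetical (it is not taken from any known MPI implementation) and uses POSIX mprotect() as a stand-in for the hardware mechanism.

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Hypothetical: make a send buffer inaccessible while the send is in
     * progress, so any user access faults.  mprotect() works on whole pages,
     * so the range is widened to page boundaries; a real implementation
     * would have to handle partially covered pages more carefully. */
    static int protect_send_buffer(void *buf, size_t len)
    {
        long page = sysconf(_SC_PAGESIZE);
        uintptr_t start = (uintptr_t)buf & ~((uintptr_t)page - 1);
        uintptr_t end   = ((uintptr_t)buf + len + page - 1) & ~((uintptr_t)page - 1);
        return mprotect((void *)start, end - start, PROT_NONE);
    }
    /* The implementation would restore PROT_READ | PROT_WRITE when the
     * send completes. */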

  4. Myth & old discussions • Myth: this only concerns computers with a non-cache-coherent design (using multiple threads) • We learned that it's not true; it can happen on cache-coherent machines too, and non-cache-coherent machines need to sync their caches explicitly anyway • Old discussion: an old mail thread on mpi-forum.org • Related to an MPI implementation on SGI machines that did not want to allow overlapping memory regions • Mail thread with William C. Saphir from NASA regarding SGI machine performance • Reading from the send buffer was okay, but two isends from the same buffer were not.

  5. Why yes? • Developers are surprised by this restriction • And/or they unknowingly write non-portable code • Overlapped send performance: MPI_Isend(buf, … rank=3) followed by MPI_Isend(buf, … rank=4) • This is illegal in MPI 2.0 • Performance implication: the buffer must be copied • Workaround: use a separate communicator and bcast: too expensive • Send to a list of ranks: that API does not exist (yet?) • Overlapped computation performance: • One thread using an MPI_Isend-and-compute pattern (on the same memory) • Multi-threaded computation performance: • One thread is in a collective operation • A second thread uses the same memory for computation
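The overlapped-send case from the slide, written out as a minimal C sketch (the buffer, count, tag, and destination ranks 3 and 4 are illustrative): the same buffer is handed to two concurrent nonblocking sends, which MPI 1.1/2.0 forbids and the proposal would allow.

    #include <mpi.h>

    void send_to_two_ranks(const double *buf, int count, MPI_Comm comm)
    {
        MPI_Request reqs[2];

        /* The same send buffer is used by two outstanding sends; under the
         * current standard it must be copied before the second MPI_Isend. */
        MPI_Isend((void *)buf, count, MPI_DOUBLE, 3 /* rank */, 0, comm, &reqs[0]);
        MPI_Isend((void *)buf, count, MPI_DOUBLE, 4 /* rank */, 0, comm, &reqs[1]);

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }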

  6. Work group list • Erez Haba erezh@microsoft.com • David Gingold david.gingold@sicortex.com • Gil Bloch gil@mellanox.com • George Bosilca bosilca@eecs.utk.edu • Darius Buntinas buntinas@mcs.anl.gov • Dries Kimpe Dries.Kimpe@cs.kuleuven.be • Patrick Geoffray patrick@myri.com • Doug Gregor dgregor@osl.iu.edu
