Smart MPI Intra-node Communication among Multicore and Manycore Machines
Teng Ma, George Bosilca, Aurelien Bouteiller and Jack J. Dongarra
Kernel assisted collective: KNEM coll
[Figure panels: (a) Tigerton inter-socket; (b) Nehalem inter-socket; (c) Tigerton intra-socket; (d) Nehalem intra-socket; (b) Nehalem EP; (c) Nehalem EX]
Fig 2. Performance comparison of Broadcast Operations between shared memory based modules (Basic, SM and Tuned) and KNEM coll, normalized to the Basic module runtime (lower is better).
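The normalization described in the Fig 2 caption can be sketched as follows. This is only an illustration: the function name `normalize_to_baseline` and the runtime values are hypothetical placeholders, not measurements from the slides.

```python
def normalize_to_baseline(runtimes, baseline="Basic"):
    """Divide each module's runtime by the baseline module's runtime.

    A value below 1.0 means the module ran faster than the baseline.
    """
    base = runtimes[baseline]
    if base <= 0:
        raise ValueError("baseline runtime must be positive")
    return {name: t / base for name, t in runtimes.items()}


# Placeholder runtimes in microseconds (assumed values, NOT data from the slides).
example = {"Basic": 200.0, "SM": 150.0, "Tuned": 120.0, "KNEM coll": 80.0}
normalized = normalize_to_baseline(example)

for name, ratio in normalized.items():
    print(f"{name}: {ratio:.2f}")
```

With this convention the Basic module is always 1.0, so each bar in such a plot reads directly as a fraction of the Basic runtime.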
Fig 1. Bandwidth of ping-pong test for vanilla MPICH2, vanilla OpenMPI and multi-tuning OpenMPI
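The slides do not say how the ping-pong bandwidth in Fig 1 was computed. A minimal sketch, assuming the common convention that one-way time is half the measured round-trip time, is shown below; the function name `pingpong_bandwidth` and the sample numbers are hypothetical.

```python
def pingpong_bandwidth(message_bytes, round_trip_seconds):
    """Return bandwidth in bytes per second for one ping-pong exchange.

    Assumes the one-way transfer time is half of the round-trip time,
    which is a common benchmarking convention, not a detail from the slides.
    """
    if round_trip_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    one_way_seconds = round_trip_seconds / 2.0
    return message_bytes / one_way_seconds


# Placeholder input: a 1 MB message with a 2 ms round trip (assumed values).
bandwidth = pingpong_bandwidth(1_000_000, 0.002)
print(f"{bandwidth / 1e6:.1f} MB/s")
```

In practice the round-trip time would be averaged over many iterations after a warm-up phase, so that a single slow exchange does not dominate the result.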