
High Performance Communication using MPJ Express






Presentation Transcript


  1. High Performance Communication using MPJ Express Presented by Jawad Manzoor National University of Sciences and Technology, Pakistan

  2. Presentation Outline • Introduction • Parallel computing • HPC Platforms • Software programming models • MPJ Express • Design • Communication devices • Performance Evaluation

  3. Serial vs Parallel Computing • Serial Computing • Parallel Computing

  4. HPC Platforms • There are three kinds of High Performance Computing (HPC) platforms: • Distributed Memory Architecture • Massively Parallel Processor (MPP) • Shared Memory Architecture • Symmetric Multiprocessor (SMP), multicore computers • Hybrid Architecture • SMP clusters • Most modern HPC hardware is based on hybrid models [Figures: Distributed Memory, Shared Memory, Hybrid]

  5. Software Programming Models • Shared Memory Models • Each process has direct access to all memory • Pthreads, OpenMP • Distributed Memory Models • No direct access to the memory of other processes • Message Passing Interface (MPI)

  6. Message Passing Interface (MPI) • Message Passing Interface is the de facto standard for writing applications on parallel hardware • Primarily designed for distributed memory machines, but it is also used on shared memory machines

  7. MPI Implementations • Open MPI • An open source, production-quality implementation of MPI-2 in C • Existing high performance drivers: TCP/IP, shared memory, Myrinet, Quadrics, Infiniband • MPICH2 • An implementation of MPI for SMPs, clusters, and massively parallel processors • POSIX shared memory, SysV shared memory, Windows shared memory, Myrinet, Quadrics, Infiniband, 10 Gigabit Ethernet • MPJ Express • Implements the high-level functionality of MPI in pure Java • Provides flexibility to update the layers or add new communication devices • TCP/IP, Myrinet, threads-based shared memory, SysV shared memory

  8. Presentation Outline • Introduction • Parallel computing • HPC Platforms • Software programming models • MPJ Express • Design • Communication devices • Performance Evaluation

  9. Java NIO Device • Uses non-blocking I/O functionality • Implements two communication protocols: • Eager-send • For small messages (< 128 Kbytes) • May incur additional copying • Rendezvous • Exchange of control messages before the actual transmission • For large messages (≥ 128 Kbytes)
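The size-based protocol switch above can be sketched in plain Java. The `chooseProtocol` helper and the `Protocol` enum are illustrative names, not part of the MPJ Express API; only the 128 Kbyte threshold comes from the slide.

```java
// Sketch of the eager-send vs rendezvous decision: small messages go out
// immediately (possibly buffered/copied on the receiver), large ones first
// negotiate with the receiver via control messages.
public class ProtocolChoice {
    static final int EAGER_LIMIT = 128 * 1024; // 128 Kbytes, per the slide

    enum Protocol { EAGER_SEND, RENDEZVOUS }

    static Protocol chooseProtocol(int messageBytes) {
        return messageBytes < EAGER_LIMIT ? Protocol.EAGER_SEND
                                          : Protocol.RENDEZVOUS;
    }

    public static void main(String[] args) {
        System.out.println(chooseProtocol(1024));       // EAGER_SEND
        System.out.println(chooseProtocol(256 * 1024)); // RENDEZVOUS
    }
}
```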

  10. Standard mode with eager send protocol (small messages)

  11. Standard mode with rendezvous protocol (large messages)

  12. Shared Memory Communication Device • Threads-based • Each MPJ process is represented by a Java thread, and data is communicated using shared data structures • sendQueue and recvQueue • SysV-based • Each MPJ process is represented by a Unix process, and data is communicated using shared data structures • Java module – the xdev API implementation for shared memory communication • C module – Unix SysV Inter-Process Communication methods • JNI module – bridge between C and Java
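The threads-based idea can be illustrated with plain Java: each "process" is a thread and a message is "sent" by placing it in a shared queue. This is a minimal sketch, assuming a single queue per pair; the class and method names are illustrative and do not reproduce the actual xdev sendQueue/recvQueue implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Two threads communicating through a shared queue instead of a socket:
// the sender enqueues, the receiver blocks on take() until data arrives.
public class ThreadsDeviceSketch {
    static int sendAndReceive(byte[] msg) throws InterruptedException {
        BlockingQueue<byte[]> sendQueue = new ArrayBlockingQueue<>(16);
        final int[] receivedLen = new int[1];

        Thread receiver = new Thread(() -> {
            try {
                receivedLen[0] = sendQueue.take().length; // blocks until a message arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        sendQueue.put(msg); // "send" = place the message in the shared structure
        receiver.join();
        return receivedLen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("received " + sendAndReceive(new byte[1024]) + " bytes");
    }
}
```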

  13. MPI communication using sockets MPI communication using shared memory

  14. Key Implementation Aspects • Critical operations include: • Initialization • Point-to-point communication • Send • Receive • Finalization

  15. Initialization [Figure: each of the four processes (0–3) allocates its own shared memory segment]

  16. Point-to-point communication • Communication between two processes. • The source process sends a message to the destination process. • Source and destination processes are identified by their rank.

  17. Send Modes • Blocking Send • Returns from the subroutine call only when the operation has completed • Non-Blocking Send • Returns straight away and allows the program to continue with other work • At some later time, check for the completion of the operation
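The two modes can be sketched with standard Java concurrency rather than the MPI API: a blocking send returns only when the transfer is finished, while a non-blocking send returns a handle that is checked for completion later (analogous to MPI's Wait/Test). The `deliver` stand-in and class names are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SendModes {
    // Blocking: only returns after the transfer has completed.
    static void blockingSend(byte[] msg) {
        deliver(msg);
    }

    // Non-blocking: returns immediately with a handle to the in-flight operation.
    static Future<?> nonBlockingSend(ExecutorService pool, byte[] msg) {
        return pool.submit(() -> deliver(msg));
    }

    private static void deliver(byte[] msg) {
        // stand-in for the actual network/shared-memory transfer
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> req = nonBlockingSend(pool, new byte[64]);
        // ... the caller performs other useful work here ...
        req.get(); // later: wait for completion, like MPI's Wait
        pool.shutdown();
    }
}
```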

  18. Sending a message • The memory space of each process is divided into as many sub-sections as there are processes. • Each sub-section is used for communication with one process.
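The layout above implies a simple offset rule: a segment of S bytes split among nprocs peers gives each peer a sub-section starting at `rank * (S / nprocs)`. A minimal sketch, with illustrative names (this is not the actual implementation's arithmetic, only the idea described on the slide):

```java
public class SegmentLayout {
    // Start offset of the sub-section used for communicating with peerRank.
    static int subsectionOffset(int segmentBytes, int nprocs, int peerRank) {
        int subsectionSize = segmentBytes / nprocs; // equal split per peer
        return peerRank * subsectionSize;
    }

    public static void main(String[] args) {
        // 4 processes sharing a 4 MB segment: peer 2's sub-section starts at 2 MB
        System.out.println(subsectionOffset(4 * 1024 * 1024, 4, 2));
    }
}
```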

  19. Receiving a message • The destination process attaches itself to the shared memory segment of the source process and reads messages from the sub-section allocated to it, using an offset

  20. Finalization • When communication is complete, a barrier method is called at the end, which synchronizes all processes. • Then the finalize method is called, which destroys the shared memory allocated to the processes.
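The barrier-then-cleanup sequence can be sketched with plain Java threads: every "process" waits at a barrier, and only once all have arrived does the cleanup action run. `CyclicBarrier`'s barrier action fires exactly once after all parties arrive, which models destroying the shared segment; the class and method names here are illustrative.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicBoolean;

public class FinalizeSketch {
    // Returns true once the cleanup (modelling shared-memory destruction)
    // has run after all nprocs workers reached the barrier.
    static boolean runAndFinalize(int nprocs) throws InterruptedException {
        AtomicBoolean cleaned = new AtomicBoolean(false);
        CyclicBarrier barrier = new CyclicBarrier(nprocs, () -> cleaned.set(true));

        Thread[] workers = new Thread[nprocs];
        for (int i = 0; i < nprocs; i++) {
            workers[i] = new Thread(() -> {
                try {
                    barrier.await(); // each "process" synchronizes here
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return cleaned.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("finalized: " + runAndFinalize(4));
    }
}
```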

  21. Presentation Outline • Introduction • Parallel computing • HPC Platforms • Software programming models • Design and Implementation • Design • Communication devices • Performance Evaluation

  22. Performance Evaluation • A ping-pong program was written in which two processes repeatedly pass a message back and forth. • Timing calls measure the time taken for one message. • We used a warm-up loop of 10K iterations, and the average time was calculated over 20K iterations after warm-up. • We present latency and throughput graphs: • Latency is the delay between the initiation of a network transmission by a sender and the receipt of that transmission by the receiver. • Throughput is the amount of data that passes through a network connection over time, measured in bits per second. • We plotted the latency graph for message sizes from 1 byte up to 2 KB and the throughput graph from 2 KB to 16 MB
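The measurement loop described above can be sketched between two Java threads instead of two network processes: warm-up iterations first, then a timed loop, with one-way latency taken as half the round-trip time. Iteration counts are scaled down from the 10K/20K used in the evaluation, and the class is an illustration of the methodology, not the benchmark actually used.

```java
import java.util.concurrent.SynchronousQueue;

public class PingPong {
    static double averageLatencyNanos(int warmup, int iters) throws InterruptedException {
        SynchronousQueue<byte[]> ping = new SynchronousQueue<>();
        SynchronousQueue<byte[]> pong = new SynchronousQueue<>();
        int total = warmup + iters;

        // Echo thread: bounce every message straight back.
        Thread echo = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) pong.put(ping.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        echo.start();

        byte[] msg = new byte[1];
        for (int i = 0; i < warmup; i++) { ping.put(msg); pong.take(); } // warm-up, untimed
        long start = System.nanoTime();
        for (int i = 0; i < iters; i++) { ping.put(msg); pong.take(); } // timed loop
        long elapsed = System.nanoTime() - start;
        echo.join();
        return elapsed / (2.0 * iters); // a round trip is two one-way hops
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.printf("avg one-way latency: %.0f ns%n", averageLatencyNanos(1000, 2000));
    }
}
```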

  23. Latency on Fast Ethernet

  24. Throughput on Fast Ethernet

  25. Latency on Gigabit Ethernet

  26. Throughput on GigE

  27. Latency on Myrinet

  28. Throughput on Myrinet

  29. Questions?

  30. Further Reading • Parallel Computing • https://computing.llnl.gov/tutorials/parallel_comp/ • MPI • www.mcs.anl.gov/mpi • MPJ Express • http://mpj-express.org/ • MPICH2 • http://www.mcs.anl.gov/research/projects/mpich2/ • OpenMPI • http://www.open-mpi.org/
