MPJ (Message Passing in Java): The past, present, and future


  1. MPJ (Message Passing in Java): The past, present, and future Aamir Shafi Distributed Systems Group University of Portsmouth

  2. Presentation Outline • Introduction • Java messaging system • Java NIO (New I/O package) • Comparison of Java with C • The trend towards SMP clusters • Background and review • MPJ design & implementation • Performance evaluation • Conclusions

  3. Introduction • A lot of interest in a Java messaging system: • There is no reference messaging system in pure Java, • A reference system should follow the API defined by the MPJ specification. • What does a Java messaging system have to offer? • Portability: • Write once, run anywhere. • Object-oriented programming concepts: • Higher level of abstraction for parallel programming. • An extensive set of API libraries: • Avoids reinventing the wheel. • Multi-threaded language: • Thread-safety mechanisms: • ‘synchronized’ blocks, • wait() and notify() in the Object class. • Automatic memory management.
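
The thread-safety mechanisms named on this slide can be illustrated with a one-slot mailbox. This is only a minimal sketch: the class `MailboxSketch` and its methods are hypothetical names, not part of MPJ, and it uses notifyAll() rather than notify() so that a waiting thread of either kind is always woken.

```java
final class MailboxSketch {
    private Object slot; // null means the mailbox is empty

    // Blocks while the mailbox is full, then stores the message.
    synchronized void put(Object message) throws InterruptedException {
        while (slot != null) {
            wait(); // releases the monitor until another thread calls notifyAll()
        }
        slot = message;
        notifyAll();
    }

    // Blocks while the mailbox is empty, then removes and returns the message.
    synchronized Object take() throws InterruptedException {
        while (slot == null) {
            wait();
        }
        Object message = slot;
        slot = null;
        notifyAll();
        return message;
    }

    // Hands one message from a producer thread to the calling thread.
    static Object demo(Object message) {
        MailboxSketch box = new MailboxSketch();
        Thread producer = new Thread(() -> {
            try {
                box.put(message);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            Object received = box.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `MailboxSketch.demo("ping")` returns the same object that the producer thread put into the mailbox.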

  4. Introduction • But isn't Java slower than C (in terms of I/O)? • The traditional I/O package of Java is a blocking API: • A separate thread is needed to handle each socket connection. • Java New I/O package: • Adds non-blocking I/O to the Java language: • C select()-like functionality. • Direct Buffers: • Conventional Java objects are allocated on the JVM heap, • Direct buffers, by contrast, are allocated in native OS memory, • Provides faster I/O, not subject to garbage collection. • JIT (Just In Time) compilers: • Convert the bytecode into native machine code. • Communication performance: • Comparison of Java NIO and C Netpipe drivers, • Java performs similarly to C on Fast Ethernet.
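
The two NIO features on this slide, select()-like readiness notification and direct buffers, can be shown together in a small sketch. It is an illustration under stated assumptions: it uses a `java.nio.channels.Pipe` instead of a socket so that it runs without a network, and the class name `NioSelectSketch` is hypothetical. The payload must stay small, because the single thread writes everything before it starts reading.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

final class NioSelectSketch {
    // Writes msg into a pipe and reads it back through a selector-driven, non-blocking channel.
    static String roundTrip(String msg) {
        byte[] payload = msg.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                // Only the reading end is non-blocking and registered with the selector.
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                // Direct buffers live outside the JVM heap.
                ByteBuffer out = ByteBuffer.allocateDirect(payload.length);
                out.put(payload);
                out.flip();
                while (out.hasRemaining()) {
                    sink.write(out);
                }

                ByteBuffer in = ByteBuffer.allocateDirect(payload.length);
                while (in.hasRemaining()) {
                    selector.select(); // blocks until the source is readable
                    for (SelectionKey key : selector.selectedKeys()) {
                        if (key.isReadable()
                                && ((Pipe.SourceChannel) key.channel()).read(in) < 0) {
                            throw new IOException("pipe closed before the message arrived");
                        }
                    }
                    selector.selectedKeys().clear();
                }
                in.flip();
                byte[] received = new byte[in.remaining()];
                in.get(received);
                return new String(received, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a socket the pattern is the same: one selector can watch many channels, so no thread per connection is needed.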

  5. Introduction – Some background • Parallel programming paradigms: • Shared memory: • Standard: SHMEM and more recently OpenMP. • Implementation: JOMP (Java OpenMP). • Distributed memory: • Standard: Message Passing Interface (MPI). • Implementation: MPJ (Message Passing in Java). • Hybrid paradigms. • Clusters have become a cost-effective alternative to traditional HPC hardware. • This trend towards clusters led to the emergence of SMP clusters: • StarBug, the DSG cluster, consists of eight dual-CPU nodes, • Shared memory for intra-node communications, • Distributed memory for inter-node communications, • Thus, a framework based on a hybrid programming paradigm.

  6. Aims of the project • Research and development of a reference messaging system based on Java NIO. • A secure runtime infrastructure to bootstrap and control MPJ processes. • MPJ framework for SMP clusters: • Integrate MPJ and JOMP: • Use MPJ for distributed memory, • Use JOMP for shared memory. • Map the parallel application to the underlying hardware for optimal execution. • Debugging, monitoring and profiling tools. • This talk discusses MPJ and the secure runtime, and motivates efficient execution on shared memory processors.

  7. Presentation Outline • Introduction • Background and review • MPJ design & implementation • Performance evaluation • Conclusions

  8. Background and review • This section of the talk discusses: • Messaging systems in Java, • Shared memory libraries in Java, • The runtime infrastructures. • A detailed literature review is available in the DSG first year technical report: • “A Status Report: Early Experiences with the implementation of a Message Passing System using Java NIO” • http://dsg.port.ac.uk/~shafia/res/papers/DSG_2.pdf

  9. Messaging systems in Java • Three approaches to building messaging systems in Java, using: • RMI (Remote Method Invocation): • A Java API that allows methods to be invoked on remote objects, • Meant for client-server interaction, • Transfers primitive datatypes as objects. • JNI (Java Native Interface): • An interface that allows Java to invoke code written in C (and other languages), • Not truly portable, • Additional copying between Java and C. • Sockets interface: • Java standard I/O package, • Java New I/O package.

  10. Using RMI • JMPI (University of Massachusetts): • Cons: • Not active, • Poor performance because of RMI, • KaRMI was used instead of RMI: • KaRMI runs on Myrinet. • CCJ (Vrije Universiteit Amsterdam): • Cons: • Not active. • Supports the transfer of objects as well as basic datatypes. • Poor performance because of RMI. • JMPP (National Chiao-Tung University).

  11. Using JNI • mpiJava (Indiana University + UoP): • Pros: • Moving towards the MPJ API specification, • Well-supported and widely used. • Cons: • Uses JNI and native MPI as the communication medium. • JavaMPI (University of Westminster): • Cons: • No longer active (uses the Native Method Interface, NMI), • Source code not available. • M-JavaMPI (The University of Hong Kong): • Supports process migration using JVMDI (JVM Debug Interface), which has been deprecated in Java 1.5 in favour of JVMTI (JVM Tool Interface).

  12. Using sockets interface … • MPJava (University of Maryland) • Pros: • Based on Java NIO, • Cons: • No runtime infrastructure, • Source code is not available, • MPP (University of Bergen) • Based on Java NIO • Subset of MPI functionality of a bug in the TCP/IP stack.

  13. Shared memory libraries in Java • OpenMP implementation using Java (EPCC): • JOMP (Java OpenMP), • Single JVM, starts multiple threads to match the number of processors on an SMP node. • Efficient shared memory communications can also be implemented by MappedByteBuffer class: • Memory mapped to a file, • Sender may lock and write to the file, • Reader may lock and read from the file. • Single JVM implementation of mpjdev • Threads are processes
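
The MappedByteBuffer idea on this slide (map a file into memory, lock it, then write or read) might look roughly like the following. This is a sketch under assumptions, not the mpjdev implementation: the class `MappedExchangeSketch`, its method names, and the single-int message layout are hypothetical. Java file locks are held on behalf of the whole JVM, so they coordinate separate processes rather than threads inside one JVM, and on some platforms they are only advisory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedExchangeSketch {
    // Sender: take an exclusive lock, then write one int through the mapping.
    static void send(Path file, int value) {
        try (FileChannel ch = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, Integer.BYTES);
            map.putInt(0, value);
            map.force(); // flush the mapped region to the file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader: take a shared lock, then read the int back through a read-only mapping.
    static int receive(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             FileLock lock = ch.lock(0L, Long.MAX_VALUE, true)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, Integer.BYTES);
            return map.getInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs sender and reader one after the other on a temporary file.
    static int demo(int value) {
        try {
            Path file = Files.createTempFile("mpj-shm", ".bin");
            file.toFile().deleteOnExit();
            send(file, value);
            return receive(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real intra-node transport the sender and reader would be different processes mapping the same file, and they would also need a way to signal that new data is available.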

  14. The runtime infrastructures • Shell/Perl scripts • Most messaging systems use SSH to start remote processes (on Linux). • SMPD (Argonne National Lab): • Part of MPICH-2, • SMPD stands for “Simple Multi-Purpose Daemon”, • Different implementations for Linux and Windows. • Java is ideal for implementing the runtime infrastructure • Portability - the same implementation will run on different operating systems.

  15. Presentation Outline • Introduction • Literature review • MPJ design & implementation • Performance evaluation • Conclusions

  16. Design Goals • Portability. • Standard Java: • Assuming no language extensions. • High Performance. • Modular and layered architecture: • Device drivers, and other layers. • Allows higher level of abstraction: • By enabling the transfer of objects.

  17. The Generic Design • High Level MPJ: • Collective operations, • Process topologies. • Base Level MPJ: • All point-to-point modes, • Groups, • Communicators, • Datatypes. • MPJ Device Level: • isend, irecv, waitany, . . . • Physical process ids (no groups), • Contexts and tags (no communicators), • Byte vector data, • Buffer packing/unpacking. • JNI Wrapper. • Communication medium: • Java NIO and Thread APIs, • Native MPI, • Specialised Hardware Library (e.g. VIA communication primitives). • Process Creation and Monitoring: • MPJ service daemon, • Java Reflection API to start processes, • Dynamic Class loading.

  18. Implementation of MPJ • Device drivers: • Java NIO device driver (mpjdev). • The native MPI device driver (native mpjdev). • Swapped in/out of MPJ. • Similar to device drivers in MPICH. • MPJ Point to point communications: • Blocking and non-blocking. • Communicators, virtual topologies. • MPJ Collective Communications: • Various collective communications methods. • Instantiation of MPJ design (on next slide).

  19. Java NIO device driver • Communication Protocols: • Eager-Send: • Assumes the receiver has infinite memory, • For small messages (< 128 Kbytes), • May incur additional copying. • Rendezvous: • Exchange of control messages before the actual transmission, • For long messages (≥ 128 Kbytes). • The buffering API: • Supports gathering/scattering of data, • Supports the transfer of Java objects.
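
The choice between the two protocols can be sketched as a simple size test. Everything beyond the 128 Kbyte threshold is an assumption made for illustration: the class `ProtocolChoiceSketch`, the reading of 1 Kbyte as 1024 bytes, and the control message names READY_TO_SEND and READY_TO_RECEIVE are hypothetical, since the slides do not name the control messages.

```java
import java.util.Arrays;
import java.util.List;

final class ProtocolChoiceSketch {
    enum Protocol { EAGER_SEND, RENDEZVOUS }

    // Threshold from the slide; assumes 1 Kbyte = 1024 bytes.
    static final int EAGER_LIMIT_BYTES = 128 * 1024;

    // Small messages are sent eagerly; long ones wait for a handshake.
    static Protocol choose(int messageBytes) {
        return messageBytes < EAGER_LIMIT_BYTES ? Protocol.EAGER_SEND : Protocol.RENDEZVOUS;
    }

    // Messages that cross the wire for one send (control message names are hypothetical).
    static List<String> wireMessages(int messageBytes) {
        return choose(messageBytes) == Protocol.EAGER_SEND
                ? Arrays.asList("DATA")
                : Arrays.asList("READY_TO_SEND", "READY_TO_RECEIVE", "DATA");
    }
}
```

The extra control messages explain the trade-off: rendezvous adds round trips that dominate for small messages, while eager-send may need an extra copy when the data arrives before a matching receive is posted.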

  20. Pt2Pt and collective methods • Point to Point communications: • Blocking/non-blocking methods, • Buffered/ready/synchronous modes of send: • Supported by eager-send and rendezvous protocols at the device level. • Communicators. • Virtual Topologies. • Collective Communications: • Provided as a utility to MPI programmers, • Gather/scatter/all-to-all/reduce/all-reduce/scan.
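
What reduce, all-reduce and scan compute can be modelled without any communication at all. The sketch below is not the MPJ API: the class `CollectiveSemanticsSketch` is hypothetical, each array index stands for a process rank, and addition is assumed as the reduction operator.

```java
import java.util.Arrays;

final class CollectiveSemanticsSketch {
    // reduce: the root ends up with the combination of every rank's contribution.
    static int reduceSum(int[] contributions) {
        int sum = 0;
        for (int value : contributions) {
            sum += value;
        }
        return sum;
    }

    // all-reduce: every rank ends up with the same reduced value.
    static int[] allReduceSum(int[] contributions) {
        int[] results = new int[contributions.length];
        Arrays.fill(results, reduceSum(contributions));
        return results;
    }

    // scan: rank i ends up with the combination of contributions from ranks 0..i (inclusive).
    static int[] scanSum(int[] contributions) {
        int[] results = new int[contributions.length];
        int running = 0;
        for (int rank = 0; rank < contributions.length; rank++) {
            running += contributions[rank];
            results[rank] = running;
        }
        return results;
    }
}
```

For contributions 1, 2, 3, 4 from ranks 0 to 3, reduce gives 10 at the root, all-reduce gives 10 on every rank, and scan gives 1, 3, 6, 10.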

  21. The runtime infrastructure

  22. Design of the runtime infrastructure

  23. Implementation of the runtime • The administrator installs MPJDaemons: • SSH allows us to install the daemons remotely on Linux, • Adds the admin certificate to each daemon's keystore (a repository of certificates). • Using the daemons: • The administrator adds the user certificate into the keystore, • The MPJRun module is used to run the parallel application. • Copying executables from MPJRun to MPJDaemon: • Via dynamic class loading. • Stdout/Stderr is redirected to MPJRun.
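
Starting an application through the Reflection API and redirecting its standard output can be sketched as follows. This is an illustration only, not the MPJDaemon code: `ReflectiveLaunchSketch`, `launch` and the stand-in application `HelloApp` are hypothetical names, the class is loaded from the local classpath rather than shipped from MPJRun, and the output is captured in memory rather than sent over the network.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.lang.reflect.Method;

final class ReflectiveLaunchSketch {
    // Stand-in for a user's parallel application.
    static final class HelloApp {
        public static void main(String[] args) {
            System.out.println("hello from rank " + args[0]);
        }
    }

    // Loads the class by name, invokes its main method, and returns what it printed.
    static String launch(String className, String... args) {
        PrintStream originalOut = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try {
            System.setOut(new PrintStream(captured, true)); // redirect stdout
            Class<?> app = Class.forName(className);
            Method main = app.getMethod("main", String[].class);
            main.invoke(null, (Object) args); // static method, so the receiver is null
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not start " + className, e);
        } finally {
            System.setOut(originalOut); // always restore the real stdout
        }
        return captured.toString();
    }
}
```

A call such as `ReflectiveLaunchSketch.launch(ReflectiveLaunchSketch.HelloApp.class.getName(), "0")` runs the application inside the calling JVM, which is the arrangement the next slide reports as problematic once direct buffers accumulate.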

  24. Implementation issues • Issues with Java NIO: • Taming the NIO circus thread: • http://forum.java.sun.com/thread.jsp?forum=4&thread=459338&start=0&range=15&hilite=false&q • Allocating direct buffers leads to OutOfMemoryError (a bug). • Selectors taking 100 percent CPU: • No need to register for write events, • Only register for read events. • J2SE (Java 2 Standard Edition) 1.5 has solved many problems. • MPJDaemons ran out of memory because of direct buffers: • These buffers are not subject to garbage collection, • Details shown in coming slides, • Solved by starting a new JVM at MPJDaemon.
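
Why registering for write events can drive a selector to 100 percent CPU is easy to reproduce: a channel with free buffer space is almost always writable, so the selector returns immediately every time. The sketch below demonstrates this on an idle pipe; the class `SelectorSpinSketch` and its method are hypothetical, and a pipe stands in for a socket so the example needs no network.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class SelectorSpinSketch {
    // Returns how many keys selectNow() reports ready on a pipe that nobody has written to.
    static int readyOnIdlePipe(int interest) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                if (interest == SelectionKey.OP_WRITE) {
                    // Write interest: ready whenever the pipe has room, which is right away.
                    sink.configureBlocking(false);
                    sink.register(selector, SelectionKey.OP_WRITE);
                } else {
                    // Read interest: ready only once data has actually arrived.
                    source.configureBlocking(false);
                    source.register(selector, SelectionKey.OP_READ);
                }
                return selector.selectNow();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A select loop with write interest on an idle channel therefore never blocks, whereas a loop with only read interest sleeps until there is work to do.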

  25. Machine names where MPJDaemon will be installed

  26. Installing the daemon from the initiator machine

  27. First execution …

  28. Memory stats of one of the machines where MPJDaemon is installed and is executing an MPJ app

  29. After second execution …

  30. After a few more executions …

  31. Finally, Out of memory ….

  32. Presentation Outline • Introduction • Literature review • MPJ design & implementation • Performance evaluation • Conclusion

  33. Sequence of perf evaluation graphs • Java NIO device driver evaluation: • Comparing to native mpjdev (native mpjdev uses MPICH by interfacing through JNI) • Remote nodes of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second), • Same node of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second), • Importance of OpenMP for shared memory communications. • Evaluation of eager-send & rendezvous protocols: • Transfer time (microseconds), • Throughput achieved (Mbits/second). • MPJ Pt2Pt evaluation: • Remote nodes of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second).

  34. Java NIO device driver (red line) performs similar to native mpjdev device driver • Latency (the time taken to transfer one byte) is ~ 260 microseconds

  36. Throughput for both devices is ~ 89 Mbits/s Change from eager send to rendezvous protocol
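
The two quantities plotted in these graphs are related by simple arithmetic, sketched below. The class `LinkModelSketch` is hypothetical, 1 Mbit is taken as 10^6 bits, and the linear cost model (latency plus size divided by bandwidth) is a common textbook approximation, not something the slides state.

```java
final class LinkModelSketch {
    // Throughput in Mbits/second; bits per microsecond is numerically the same unit.
    static double throughputMbps(long bytes, double transferMicros) {
        return bytes * 8.0 / transferMicros;
    }

    // Assumed linear model: fixed latency plus time to push the bits at the given bandwidth.
    static double modelledTransferMicros(long bytes, double latencyMicros, double bandwidthMbps) {
        return latencyMicros + bytes * 8.0 / bandwidthMbps;
    }
}
```

For example, 1,000,000 bytes moved in 80,000 microseconds is 8,000,000 bits / 80,000 µs = 100 Mbits/second. Under the assumed model, the reported figures of roughly 260 microseconds latency and 89 Mbits/second can be passed as `latencyMicros` and `bandwidthMbps` to estimate transfer times for other message sizes.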

  38. mpjdev (represented by red line) is communicating through sockets (the time is dictated by memory bus bandwidth) • ‘native mpjdev’ is communicating through shared memory • A problem for SMP clusters!

  41. Eager-send is for small messages < 128 Kbytes • Eager-send may incur additional copying • The time for exchanging control messages in rendezvous dictates the communication time of small messages • Rendezvous is suitable for large messages > 128 Kbytes

  43. Sequence of perf evaluation graphs • Java NIO device driver evaluation: • Remote nodes of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second), • Same node of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second), • Importance of OpenMP for shared memory communications. • Evaluation of eager-send & rendezvous protocols: • Transfer time (microseconds), • Throughput achieved (Mbits/second). • MPJ Pt2Pt evaluation: • Comparing to MPICH, mpiJava (mpiJava uses MPICH by interfacing through JNI) • Remote nodes of a cluster: • Transfer time (microseconds), • Throughput achieved (Mbits/second).
