Thoughts on a Java Reference Implementation for MPJ



  1. Thoughts on a Java Reference Implementation for MPJ • Mark Baker (University of Portsmouth), Bryan Carpenter (Florida State University) • IPDPS, Cancun, Mexico, 5th May 2000 • http://www.dcs.port.ac.uk/~mab/Talks/ • Mark.Baker@Computer.Org

  2. Contents • Introduction • Some design decisions • An overview of the architecture • Process creation and monitoring • The MPJ daemon • Handling aborts and failures • MPJ device • Conclusions and future work

  3. Introduction • The Message-Passing Working Group of the Java Grande Forum was formed in late 1998 as a response to the appearance of several prototype Java bindings for MPI-like libraries. • An initial draft for a common API specification was distributed at Supercomputing '98. • Since then the working group has met in San Francisco and Syracuse. • The present API is now called MPJ.

  4. Introduction • There is no complete implementation of the draft specification yet. • mpiJava is moving towards the “standard”. • The new version (1.2) of the software supports direct communication of objects via object serialization. • Version 1.3 of mpiJava will implement the new API. • The mpiJava wrappers rely on the availability of a platform-dependent native MPI implementation for the target computer.

  5. Introduction • While this is a reasonable basis in many cases, the approach has some disadvantages. • The 2-stage installation procedure (get and build native MPI, then install and match the Java wrappers) is tedious and off-putting to new users. • On several occasions we saw conflicts between the JVM environment and the native MPI runtime behaviour. The situation has improved, and mpiJava now runs on various combinations of JVM and MPI implementation. • This strategy simply conflicts with the ethos of Java, where write-once-run-anywhere software is the order of the day.

  6. MPJ – the Next Generation of Message Passing in Java • An MPJ reference implementation could be implemented as: • Java wrappers to a native MPI implementation, • Pure Java, • Principally in Java, with a few simple native methods to optimize operations (like marshalling arrays of primitive elements) that are difficult to do efficiently in Java. • We are aiming at pure Java to provide an implementation of MPJ that is maximally portable and that hopefully requires the minimum amount of support effort.

  7. Benefits of a pure Java implementation of MPJ • Highly portable. Assumes only a Java development environment. • Performance: moderate. May need JNI inserts for marshalling arrays. Network speed limited by Java sockets. • Good for education/evaluation. • Vendors provide wrappers to native MPI for ultimate performance?

  8. Design Criteria for the MPJ Environment • Need an infrastructure to support groups of distributed processes: • Resource discovery, • Communications, • Handle failure, • Spawn processes on hosts.

  9. Resource discovery • Technically, Jini discovery and lookup seem an obvious choice. • Daemons register with lookup services. • A “hosts file” may still guide the search for hosts, if preferred.

  10. Communication base • Maybe, some day, Java bindings to VIA? • For now sockets are the only portable option. • RMI is surely too slow.

  11. Handling “Partial Failures” • Failures that need to be overcome: • A network connection breaks, • The host system goes down, • The JVM running the remote MPJ task halts for some other reason (e.g., occurrence of a Java exception), • The program that initiated the MPJ job is killed. • On unexpected termination of any particular MPJ job: • Concurrent tasks associated with other MPJ jobs should be unaffected, even if they were initiated by the same daemon. • All processes associated with the particular job must shut down cleanly within some (preferably short) interval of time.

  12. Handling “Partial Failures” • A usable MPJ implementation must deal with unexpected process termination or network failure without leaving orphan processes or leaking other resources. • We could reinvent protocols to deal with these situations, but Jini provides a ready-made framework (or, at least, a set of concepts).

  13. Handling failures with Jini • If any slave dies, client generates a Jini distributed event, MPIAbort – all slaves are notified and all processes killed. • In case of other failures (network failure, death of client, death of controlling daemon, …) client leases on slaves expire in a fixed time, and processes are killed.
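The lease-expiry idea on this slide can be sketched in plain Java. This is only an illustration of the concept and deliberately does not use the Jini leasing API; `LeaseTable`, `renew`, and `reapExpired` are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical lease table kept on a slave host; not part of Jini or MPJ. */
class LeaseTable {
    // Job identifier -> absolute time (ms) at which the client's lease lapses.
    private final Map<String, Long> expiry = new HashMap<>();

    /** Grants or renews the lease for a job until nowMillis + durationMillis. */
    void renew(String jobId, long nowMillis, long durationMillis) {
        expiry.put(jobId, nowMillis + durationMillis);
    }

    /** Removes and returns the jobs whose leases have lapsed; the caller would kill their processes. */
    List<String> reapExpired(long nowMillis) {
        List<String> dead = new ArrayList<>();
        expiry.entrySet().removeIf(e -> {
            if (e.getValue() <= nowMillis) {
                dead.add(e.getKey());
                return true;
            }
            return false;
        });
        return dead;
    }
}
```

A client that is alive keeps calling `renew`; if the client, the network, or the controlling daemon fails, renewals stop and the job shows up in `reapExpired` after a bounded time, without affecting other jobs in the table.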

  14. Integration of Jini and MPI • Provides a natural Java framework for parallel computing with the powerful fault tolerance and dynamic characteristics of Jini combined with proven parallel computing functionality and performance of MPI.

  15. MPJ - Implementation • In the initial reference implementation we will use Jini technology to facilitate location of remote MPJ daemons and to provide a framework for the required fault-tolerance. • This choice rests on our guess that in the medium-to-long-term Jini will be a ubiquitous component in Java installations. • Hence using the Jini paradigms from the start should eventually help inter-working and compatibility between our software and other systems.

  16. Acquiring compute slaves through Jini

  17. MPJ • We envisage that a user will download a jar-file of MPJ library classes onto machines that may host parallel jobs, and install a daemon on those machines – technically by registering an activatable object with an rmid daemon. • Parallel Java codes are compiled on one host. • An mpjrun program invoked on that host transparently loads the user's class files into JVMs created on remote hosts by the MPJ daemons, and the parallel job starts.
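As a rough sketch of how a daemon might create a JVM for one slave, the helper below builds (but does not start) a `java` command line with `ProcessBuilder`. The loader name `MPJLoader` is taken from the "Exec java MPJLoader" entry in the layers slide; the class `SlaveLauncher` and the loader's arguments (user main class, rank, job size) are assumptions made for illustration, since the talk does not specify them.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the real arguments of MPJLoader are not given in the talk. */
class SlaveLauncher {
    /** Builds, without starting, the command for one slave JVM. */
    static ProcessBuilder commandFor(String classPath, String mainClass, int rank, int size) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classPath);               // e.g. the downloaded MPJ library jar
        cmd.add("MPJLoader");             // loader class named in the layers slide
        cmd.add(mainClass);               // assumed argument: the user's main class
        cmd.add(Integer.toString(rank));  // assumed argument: this slave's rank
        cmd.add(Integer.toString(size));  // assumed argument: number of processes
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.redirectErrorStream(true);     // fold stderr into stdout for forwarding
        return pb;
    }
}
```

Calling `start()` on the returned builder would launch the process; the daemon could then forward the child's output and monitor its exit status.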

  18. MPJ - Implementation • In the short-to-medium-term, before Jini software is widely installed, we might have to provide a “lite” version of MPJ that is unbundled from Jini. • Designing for Jini protocols should, nevertheless, have a beneficial influence on overall robustness and maintainability. • Use of Jini implies use of RMI for various management functions.

  19. [Diagram labels: Slave 1, Slave 2, Slave 3, Slave 4 • Host • MPJ Daemon • mpjrun myproggy -np 4 • rmid • http server]

  20. MPJ – Implementation • Some assumptions that have a bearing on the organization of the MPJ daemon: • stdout (and stderr) streams from all tasks in an MPJ job are merged non-deterministically and copied to the stdout of the process that initiates the job. • No guarantees are made about other IO operations - these are system dependent. • Rudimentary support for global checkpointing and restarting of interrupted jobs may be quite useful, although we do not assume that checkpointing would happen without explicit invocation in the user-level code, or that restarting would happen automatically.
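The stdout assumption above can be illustrated with a small sketch: one pump thread per task stream copies lines into a shared sink, so lines from different tasks interleave in whatever order the threads happen to run. `OutputMerger` and `merge` are hypothetical names; in a real daemon the input streams would presumably come from the slave processes rather than from memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of merging task output; not the MPJ daemon's actual code. */
class OutputMerger {
    /** Copies every line from every task stream to the sink, then returns. */
    static void merge(List<InputStream> taskOutputs, PrintStream sink) {
        List<Thread> pumps = new ArrayList<>();
        for (InputStream in : taskOutputs) {
            Thread pump = new Thread(() -> {
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Lock per line so lines interleave but never split mid-line.
                        synchronized (sink) {
                            sink.println(line);
                        }
                    }
                } catch (IOException e) {
                    // A broken stream only ends this one task's output.
                }
            });
            pump.start();
            pumps.add(pump);
        }
        for (Thread pump : pumps) {
            try {
                pump.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

The order of lines within one task is preserved, but the relative order across tasks is not, which matches the "merged non-deterministically" wording.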

  21. MPJ – Implementation • The role of the MPJ daemons and their associated infrastructure is to provide an environment consisting of a group of processes with the user-code loaded and running in a reliable way. • The process group is reliable in the sense that no partial failures should be visible to higher levels of the MPJ implementation or the user code. • We will use Jini leasing to provide fault tolerance. Clearly, no software technology can guarantee the absence of total failures, where the whole MPJ job dies at essentially the same time.

  22. MPJ - Implementation • Once a reliable cocoon of user processes has been created through negotiation with the daemons, we have to establish connectivity. • In the reference implementation this will be based on Java sockets. • Recently there has been interest in producing Java bindings to VIA - eventually this may provide a better platform on which to implement MPI, but for now sockets are the only realistic, portable option.
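Sockets carry an undelimited byte stream, so any socket-based layer needs some message framing. A minimal sketch follows, assuming a header of context, tag, and payload length (the layers slide lists "Contexts & Tags" and "Byte vector data" at the device level). The class `MessageFrame` and this exact wire layout are hypothetical and not taken from the talk.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical length-prefixed frame: context, tag, length, then the payload bytes. */
class MessageFrame {
    final int context;
    final int tag;
    final byte[] payload;

    MessageFrame(int context, int tag, byte[] payload) {
        this.context = context;
        this.tag = tag;
        this.payload = payload;
    }

    /** Writes a 12-byte header followed by the payload. */
    void writeTo(OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(context);
        data.writeInt(tag);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    /** Reads exactly one frame, blocking until the whole payload has arrived. */
    static MessageFrame readFrom(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int context = data.readInt();
        int tag = data.readInt();
        int length = data.readInt();
        byte[] payload = new byte[length];
        data.readFully(payload);
        return new MessageFrame(context, tag, payload);
    }
}
```

With a connected `java.net.Socket`, the same methods would be applied to `socket.getOutputStream()` and `socket.getInputStream()`.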

  23. MPJ – Implementation • Between the socket API and the MPJ API there will be an intermediate “MPJ device” level – modelled on the Abstract Device Interface (ADI) of MPICH. • Although the role is slightly different here - we do not really anticipate a need for multiple platform-specific implementations - this still seems like a good layer of abstraction to have in our design. • The API is actually not modelled in detail on the MPICH device, but the level of operations is similar (based on isend/irecv/waitany calls).
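To give a feel for the level of operations meant by isend/irecv/waitany, here is a toy loopback sketch in which a single process sends to itself and messages are matched by tag only. It is not the MPJ device API, which the talk does not spell out: `LoopbackDevice`, `DeviceRequest`, and all signatures are hypothetical, and a real device would also match on source process and context and move data over sockets.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical handle for a pending non-blocking operation. */
class DeviceRequest {
    byte[] data;   // payload; filled in on completion for a receive
    boolean done;  // set once the operation has completed
}

/** Toy loopback device: messages are matched to receives by tag only. */
class LoopbackDevice {
    private final Map<Integer, Queue<byte[]>> arrived = new HashMap<>();
    private final Map<Integer, Queue<DeviceRequest>> posted = new HashMap<>();

    /** Buffered send: completes at once, delivering to a posted receive if one is waiting. */
    synchronized DeviceRequest isend(byte[] buf, int tag) {
        DeviceRequest send = new DeviceRequest();
        send.data = buf.clone();
        Queue<DeviceRequest> waiting = posted.get(tag);
        if (waiting != null && !waiting.isEmpty()) {
            DeviceRequest recv = waiting.remove();
            recv.data = send.data;
            recv.done = true;
        } else {
            arrived.computeIfAbsent(tag, k -> new ArrayDeque<>()).add(send.data);
        }
        send.done = true;
        notifyAll();  // wake any thread blocked in waitany
        return send;
    }

    /** Posts a receive; it completes immediately if a matching message already arrived. */
    synchronized DeviceRequest irecv(int tag) {
        DeviceRequest recv = new DeviceRequest();
        Queue<byte[]> queue = arrived.get(tag);
        if (queue != null && !queue.isEmpty()) {
            recv.data = queue.remove();
            recv.done = true;
        } else {
            posted.computeIfAbsent(tag, k -> new ArrayDeque<>()).add(recv);
        }
        return recv;
    }

    /** Blocks until some request is done; returns its index, or -1 if interrupted. */
    synchronized int waitany(DeviceRequest[] requests) {
        while (true) {
            for (int i = 0; i < requests.length; i++) {
                if (requests[i].done) {
                    return i;
                }
            }
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
    }
}
```

Posting receives for tags 7 and 9 and then sending on tag 9 makes `waitany` return index 1, while the tag 7 receive stays pending. The use of synchronized methods with `wait` and `notifyAll` mirrors the thread primitives named in the layers slide.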

  24. Layers of an MPJ Reference Implementation • High Level MPI: Collective Operations, Process Topologies • Base Level MPI: All pt-to-pt modes, Groups, Communicators, Datatypes • MPJ Device Level: Isend, irecv, waitany, …; Physical PIDs; Contexts & Tags; Byte vector data • Java Socket and Thread API: All-to-all TCP Connect; Input Handler Threads; Synchronised methods, wait, notify… • Process Creation and Monitoring: MPJ Daemon; Lookup, Leasing (Jini); Exec java MPJLoader; Serializable objects

  25. MPJ - Conclusions • On-going effort (NSF proposal + volunteer help). • Collaboration with other Java MP system developers to define the exact MPJ interface. • Work at the moment is based around the development of the low-level MPJ device and exploring the functionality of Jini.
