
Lin Chen, Cho-Li Wang, Francis C. M. Lau and Ricky K. K. Ma

G-JavaMPI: A Grid Middleware for Distributed Java Computing with MPI Binding and Process Migration Supports. Lin Chen, Cho-Li Wang, Francis C. M. Lau and Ricky K. K. Ma Department of Computer Science and Information Systems The University of Hong Kong {lchen2+clwang+fcmlau+kk1ma}@csis.hku.hk.


Presentation Transcript


  1. G-JavaMPI: A Grid Middleware for Distributed Java Computing with MPI Binding and Process Migration Supports • Lin Chen, Cho-Li Wang, Francis C. M. Lau and Ricky K. K. Ma • Department of Computer Science and Information Systems, The University of Hong Kong • {lchen2+clwang+fcmlau+kk1ma}@csis.hku.hk

  2. Outline • Motivation • Overall system architecture • Detailed Issues • Related works • Conclusion & Future Work

  3. Motivation • Grid computing: large-scale resource sharing, high performance • Globus Project: basic services required for building and using a Grid (authentication, security, resource allocation, remote data access, information services, etc.) • However • long-running applications → continuous computation • better utilization of resources → scheduling and load balancing • Java process migration • architecture-independent bytecode makes migration easier

  4. Motivation • Let the programmer write a grid application easily • no need to care about inter-site versus intra-site communication (we must care about it when using the Globus communication libraries directly) • SPMD: one program can be executed in multiple places or sites • MPI paradigm • a group of distributed processes that can do peer-to-peer or collective communication • Communication source and destination addresses are independent of the real physical network addresses (adaptable)
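The SPMD idea on this slide can be sketched in plain Java. This is a minimal simulation under stated assumptions: `SpmdSketch`, `partialSum` and `reduceSum` are hypothetical names, ranks are simulated inside one JVM, and no G-JavaMPI or MPI library call is used. Every "process" runs the same method and only its logical rank differs, which is why the program never needs a physical network address.

```java
import java.util.stream.IntStream;

// Hypothetical illustration only; not the G-JavaMPI API.
class SpmdSketch {
    // Every simulated process runs this same code; only the rank differs.
    static long partialSum(int rank, int size, int n) {
        long sum = 0;
        // Cyclic distribution: rank r handles r+1, r+1+size, r+1+2*size, ...
        for (int i = rank + 1; i <= n; i += size) {
            sum += i;
        }
        return sum;
    }

    // Stand-in for a collective reduction over all ranks.
    static long reduceSum(int size, int n) {
        return IntStream.range(0, size)
                .mapToLong(rank -> partialSum(rank, size, n))
                .sum();
    }

    public static void main(String[] args) {
        // Four simulated ranks cooperate to sum 1..100.
        System.out.println(reduceSum(4, 100)); // prints 5050
    }
}
```

Because the work is keyed by rank alone, the same program gives the same result regardless of which site each rank happens to run on.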

  5. System Overview (diagram) • Sites connected over a WAN, each with a Gatekeeper and a local scheduler (LS) • Java-MPI communication between the sites • M: migration module, resides in each JVM • Route markers in the diagram: (1), (2), (3), (1*), (2*), (3*) • (*) Some legacy messages are redirected during migration • (2*) Migrating (restarting a new process through Globus remote job request with delegated user credentials and Java-MPI job credentials)

  6. System overview (diagram legend) • Components: Globus Toolkit libraries, Java-MPI communication daemons, local schedulers (LS), Java-MPI processes, migration modules (M) • A Java-MPI process is shown before migration and after migration • (1*) – (2*) – (3*): MPI communication route before migration • (1*) – (2*) – (3*): MPI communication route after migration • (*): Java-MPI communication daemons redirect legacy messages that should go to the migrated process

  7. Layered design (diagram labels) • Java-MPI Applications • Authentication • Java-MPI API & Java API (Java-MPI API Layer) • Restorable Communication Services • JVM, JVMDI • Execution State Probe & Migration Plug-in (Migration Layer) • Message Queues, Info. Update, Control Block (Restorable MPI Comm Layer) • DLB Policy (Load Balancing Module) • Migration Instructions • MPICH-G2 • Globus Services • OS • Hardware

  8. Java-MPI binding • Restorable communication layer • Daemon: a running MPICH-G2 process that provides MPI communication services • Communicates with the Java-MPI process through IPC • Post-migration message redirection • (Diagram labels: process space, restorable communication)
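Post-migration redirection can be illustrated with a small rank-to-site table. The sketch below is an assumption-laden model, not the actual daemon: `RedirectingDaemon`, `updateLocation`, `receive` and `inbox` are hypothetical names, the daemons live in one JVM, and a plain method call stands in for both IPC and MPICH-G2 transport. A message that reaches the old site after the process has moved is forwarded to the daemon at the new site.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-site communication daemon; not the G-JavaMPI API.
class RedirectingDaemon {
    private final String site;
    private final Map<String, RedirectingDaemon> peers;            // site name -> daemon
    private final Map<Integer, String> location = new HashMap<>(); // rank -> current site
    private final Map<Integer, List<String>> inboxes = new HashMap<>();

    RedirectingDaemon(String site, Map<String, RedirectingDaemon> peers) {
        this.site = site;
        this.peers = peers;
        peers.put(site, this);
    }

    // Records where a rank currently runs (called again after a migration).
    void updateLocation(int rank, String newSite) {
        location.put(rank, newSite);
    }

    // Delivers locally, or forwards a legacy message to the rank's current site.
    // Returns the name of the site where the message was finally delivered.
    String receive(int destRank, String payload) {
        String home = location.get(destRank);
        if (home == null) {
            throw new IllegalArgumentException("unknown rank " + destRank);
        }
        if (home.equals(site)) {
            inboxes.computeIfAbsent(destRank, k -> new ArrayList<>()).add(payload);
            return site;
        }
        // Assumes the target site is registered and its table is up to date.
        return peers.get(home).receive(destRank, payload);
    }

    List<String> inbox(int rank) {
        return inboxes.getOrDefault(rank, new ArrayList<>());
    }

    public static void main(String[] args) {
        Map<String, RedirectingDaemon> peers = new HashMap<>();
        RedirectingDaemon a = new RedirectingDaemon("siteA", peers);
        RedirectingDaemon b = new RedirectingDaemon("siteB", peers);
        a.updateLocation(1, "siteA");
        b.updateLocation(1, "siteA");
        System.out.println(a.receive(1, "m1")); // prints siteA
        a.updateLocation(1, "siteB");           // rank 1 has migrated
        b.updateLocation(1, "siteB");
        System.out.println(a.receive(1, "m2")); // prints siteB
    }
}
```

Senders keep addressing rank 1; only the daemons' tables change, so application code is unaffected by the move.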

  9. Java Process Migration • State capturing: • a probe attached to each JVM saves the process context through JVMDI (Java Virtual Machine Debug Interface) • All runtime data: PC register, stack frames, objects, method area (local variables), etc. • Event notification: method_entry, frame_pop, etc. • Use object serialization to package all reachable objects in the heap • New JDK 1.4.0 & 1.4.1 released in Aug. 2002 support “full-speed debugging” • (Diagram: the probe obtains from the JVM, through JVMDI, 1. execution state data and 2. event notification)
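The serialization step named on this slide can be sketched with the standard `java.io` object streams. This covers only the packaging of reachable objects; capturing the PC register and stack frames through JVMDI is not shown. `HeapCaptureSketch`, `Node`, `capture` and `restore` are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical illustration of packaging a reachable object graph.
class HeapCaptureSketch {
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
        Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    // Serializes root and everything reachable from it into a byte array.
    static byte[] capture(Object root) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the object graph from the dump, as a restarted JVM would.
    static Object restore(byte[] dump) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(dump))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node(1, null);
        Node b = new Node(2, a);
        a.next = b; // cycle: a -> b -> a
        Node copy = (Node) restore(capture(a));
        // Object streams preserve shared references, so the cycle survives.
        System.out.println(copy.next.next == copy); // prints true
    }
}
```

Java serialization writes each object once and uses back-references afterwards, so cyclic and shared structures are restored with the same shape.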

  10. Java process migration • State Restoration: • Exception handler inserted in bytecode (pre-processing before execution) to restore local variables and “jump” to the original execution point • Re-allocate objects when re-starting JVM • Dynamic class loading
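A source-level analogy may help picture the restoration step. The slide describes a handler inserted into the bytecode; the sketch below imitates that in ordinary Java and is an assumption, not the authors' generated code. `RestoreSketch`, `Frame`, `RestoreSignal`, `runUntil` and `run` are hypothetical names. An exception raised at method entry sends control to a handler that reloads the saved local variables, after which execution falls through to the original loop position.

```java
// Hypothetical source-level analogy of handler-based state restoration.
class RestoreSketch {
    // Captured local variables of the loop below (hypothetical frame layout).
    static final class Frame {
        final int i;
        final long acc;
        Frame(int i, long acc) {
            this.i = i;
            this.acc = acc;
        }
    }

    static final class RestoreSignal extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    // Stands in for state capture: runs until i reaches stopAt, then saves the locals.
    static Frame runUntil(int n, int stopAt) {
        long acc = 0;
        int i = 1;
        for (; i <= n && i < stopAt; i++) {
            acc += i;
        }
        return new Frame(i, acc);
    }

    // Sums 1..n; when a saved frame is supplied, resumes from it instead of restarting.
    static long run(int n, Frame saved) {
        int i = 1;
        long acc = 0;
        try {
            // The pre-processor would insert this check at method entry.
            if (saved != null) {
                throw new RestoreSignal();
            }
        } catch (RestoreSignal signal) {
            // Handler: reload the local variables from the captured frame.
            i = saved.i;
            acc = saved.acc;
        }
        // Falling through here plays the role of the "jump" to the original point.
        for (; i <= n; i++) {
            acc += i;
        }
        return acc;
    }

    public static void main(String[] args) {
        Frame frame = runUntil(10, 4);          // captured with i = 4, acc = 6
        System.out.println(run(10, frame));     // prints 55
    }
}
```

A resumed run and an uninterrupted run return the same result, which is the property the restoration mechanism has to guarantee.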

  11. Information update • Migration source site: migration begins; notify other sites (including the destination site); the process arrives at the safe migration point (all legacy messages consumed); begin process state capture • Other sites and migration destination site: update the local record of the process's new location
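The ordering on the source site can be sketched as a tiny state check. This is a simplified model under assumptions: `SafePointSketch` and its methods are hypothetical names, and a local queue stands in for messages already delivered to the source site. State capture is allowed only after peers have been notified and every pending message has been consumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the safe migration point on the source site.
class SafePointSketch {
    private final Queue<String> pending = new ArrayDeque<>(); // delivered, not yet consumed
    private final List<String> consumed = new ArrayList<>();
    private boolean peersNotified = false;

    // A message arrives for the process at the source site.
    void deliver(String message) {
        pending.add(message);
    }

    // Step 1: tell the other sites (including the destination) about the move.
    void notifyPeers() {
        peersNotified = true;
    }

    // Step 2: the process consumes every legacy message still queued locally.
    void consumeLegacy() {
        while (!pending.isEmpty()) {
            consumed.add(pending.remove());
        }
    }

    // Step 3 may start only when this returns true.
    boolean atSafePoint() {
        return peersNotified && pending.isEmpty();
    }

    List<String> consumed() {
        return consumed;
    }

    public static void main(String[] args) {
        SafePointSketch process = new SafePointSketch();
        process.deliver("m1");
        process.deliver("m2");
        process.notifyPeers();
        System.out.println(process.atSafePoint()); // prints false
        process.consumeLegacy();
        System.out.println(process.atSafePoint()); // prints true
    }
}
```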

  12. Process Restart • Original process: creates a new user certificate proxy (proxy_init_cred) delegated to the remote site; gets the resource allocation; the new process can then be started (similar to a normal Globus job submit) • Newly started process: JVM initialization, with the probe started at the same time; the process is suspended at the beginning while the probe reads the context from the dump file; the execution context is restored; the process resumes and continues from the last point

  13. Experiment Results • Hardware • 32-node cluster “ostrich” • configured as two grid points of 16 nodes • 733MHz Pentium III processor • 392MB of memory • connected by a 24-port Fast Ethernet switch • Software • Linux 2.2.14 • Globus 2.0 • Sun JDK 1.4.0_02 (supporting JVMDI with full-speed debugging mode) • MPICH 1.2.4 (MPICH-G2)

  14. Experiment results Bandwidth comparison between inter-site and intra-site communication with the installation of the MPI communication layer.

  15. Experiment results Latency comparison for small messages between intra-site and inter-site communication with the installation of the MPI communication layer.

  16. Experiment results Time spent in capturing and restoring objects

  17. Experiment results Time spent in capturing and restoring frames

  18. Related Works • Java bindings for MPI: “mpiJava”, “JavaMPI”, “MPIJ”, etc. • Java process or thread migration: • Add backup code to programs [Aglets[IBM96]] • Insert backup statements in the source or byte code; a backup object is used to store state [Wasp project [Funfrocken98]] • Extend the JVM, make state accessible from Java programs, support type recognition of the Java stack [Sara Bouchenak 2000] • Use JVMDI to capture state, insert bytecode instructions in the program body to help restoration [Torsten2001] • JESSICA (supports thread migration in JVM)

  19. Conclusion • A new middleware for the Grid with Java-MPI communication and transparent process migration features • Write MPI-style programs in the Java language • The Java process migration mechanism supports the development of any dynamic load balancing policy or fault tolerance mechanism

  20. Future Plan • Develop some scientific and engineering applications on top of this middleware • Support of the transfer of other I/O (including file stage-in/out) • Load balancing algorithm for the grid environment (both CPU and network load)

  21. The End Thanks !
