Java for High Performance Computing

Java RMI and RMI-Based Approaches for HPC
http://www.hpjava.org/courses/arl
Instructor: Bryan Carpenter, Pervasive Technology Labs, Indiana University
dbcarpen@indiana.edu

Remote Method Invocation
• Java RMI is a mechanism that allows a Java program running on one computer to apply a method to an object living on a different computer.
• RMI is an implementation of the Distributed Object programming model. It is similar to CORBA, but simpler and specialized to the Java language.
• The syntax of a remote method invocation looks like an ordinary Java method invocation.
• The remote method call can be passed arguments computed in the context of the local machine. It can return arbitrary values computed in the context of the remote machine. The RMI runtime system transparently communicates all data required.
• In some ways Java RMI is more general than CORBA: it can exploit Java features like object serialization and dynamic class loading to provide more complete object-oriented semantics.

Distributed Object Picture
• Code running in the local machine holds a remote reference to an object obj on a remote machine:
Local Machine:
    res = obj.meth(arg) ;
Remote Machine (object obj):
    ResType meth(ArgType arg) {
        . . . Any code …
        return new ResImpl(. . .) ;
    }

Java RMI References
• Java RMI, Troy Bryan Downing, IDG Books, 1998.
• “Getting Started Using RMI”, and other documents, at: http://java.sun.com/products/jdk/rmi/
• Java RMI Lectures in “Applications of IT” course, run at Florida State University, 2001: http://aspen.ucs.indiana.edu/it1spring01/

A Simple Use of RMI

The Remote Interface
• In RMI, a common remote interface is the minimum amount of information that must be shared in advance between “client” and “server” machines. It defines a high-level “protocol” through which the machines will communicate.
• A remote interface is a normal Java interface, which must extend the marker interface java.rmi.Remote.
• Corollary: because the visible parts of a remote object are defined through a Java interface, constructors, static methods and non-constant fields are not remotely accessible (constructors and non-constant fields cannot appear in a Java interface, and static methods are never invoked through a remote reference).
• All methods in a remote interface must be declared to throw the java.rmi.RemoteException exception.

A Simple Example
• A file MessageWriter.java contains the interface definition:
    import java.rmi.* ;

    public interface MessageWriter extends Remote {
        void writeMessage(String s) throws RemoteException ;
    }
• This interface defines a single remote method, writeMessage().

java.rmi.Remote
• The interface java.rmi.Remote is a marker interface.
• It declares no methods or fields; however, extending it tells the RMI system to treat the interface concerned as a remote interface.
• In particular we will see that the rmic compiler generates extra code for classes that implement remote interfaces. This code allows their methods to be called remotely.

java.rmi.RemoteException
• Requiring all remote methods to be declared to throw RemoteException was a philosophical choice by the designers of RMI.
• RMI makes remote invocations look syntactically like local invocations. In practice, though, it cannot defend against problems unique to distributed computing, such as unexpected failure of the network or remote machine.
• Forcing the programmer to handle remote exceptions helps to encourage thinking about how these partial failures should be dealt with.
• See the influential essay “A Note on Distributed Computing” by Waldo et al., republished in The Jini Specification: http://java.sun.com/docs/books/jini

The Remote Object
• A remote object is an instance of a class that implements a remote interface.
• Most often this class also extends the library class java.rmi.server.UnicastRemoteObject. This class includes a constructor that exports the object to the RMI system when it is created, thus making the object visible to the outside world.
• Usually you will not have to deal with this class explicitly: your remote object classes just have to extend it.
• One fairly common convention is to name the class of the remote object after the name of the remote interface it implements, but append “Impl” to the end.

A Remote Object Implementation Class
• The file MessageWriterImpl.java contains the class declaration:
    import java.rmi.* ;
    import java.rmi.server.* ;

    public class MessageWriterImpl extends UnicastRemoteObject
                                   implements MessageWriter {

        public MessageWriterImpl() throws RemoteException {
        }

        public void writeMessage(String s) throws RemoteException {
            System.out.println(s) ;
        }
    }

Compiling the Remote Object Class
• Classes that implement Remote are compiled with javac in the normal way, but you must then also run the rmic tool on them to generate their stub classes. The reasons will be discussed later. For example:
    sirah$ rmic MessageWriterImpl

Client and Server Programs
• We have completed the Java files for the remote object class itself, but we still need the actual client and server programs that use this class.
• In general there are some pieces of administrivia one has to deal with: publishing class files and installing security managers.
• We initially make the simplifying assumption that both client and server have copies of all class files for MessageWriter (e.g., they may share access through shared NFS directories).
• Then “publishing class files” is not an issue, and we also don’t need a security manager, because all code is “local”, and therefore trusted.

A Server Program
• We assume the file HelloServer.java contains the class declaration:
    import java.rmi.* ;

    public class HelloServer {
        public static void main(String [] args) throws Exception {
            // Create the remote object (exported by its constructor).
            MessageWriter server = new MessageWriterImpl() ;
            // Publish it in the local RMI registry.
            Naming.rebind("messageservice", server) ;
        }
    }

Remarks
• This program does two things:
  • It creates a remote object with local name server.
  • It publishes a remote reference to that object with external name “messageservice”.
• The call to Naming.rebind() places a reference to server in an RMI registry running on the local host (i.e., the host where the HelloServer program is run).
• Client programs can obtain a reference to the remote object by looking it up in this registry.

A Client Program
• We assume the file HelloClient.java contains the class declaration:
    import java.rmi.* ;

    public class HelloClient {
        public static void main(String [] args) throws Exception {
            // Obtain a remote reference from the registry on the server host.
            MessageWriter server = (MessageWriter) Naming.lookup(
                    "rmi://sirah.csit.fsu.edu/messageservice") ;
            // Invoke the remote method.
            server.writeMessage("Hello, other world") ;
        }
    }

Remarks
• Again the program does two things:
  • It looks up a reference to a remote object with external name “messageservice”, and stores the returned reference with local name server.
  • Finally (!), it invokes the remote method, writeMessage(), on server.
• The call to Naming.lookup() searches in a remote RMI registry. Its argument is a URL, with protocol tag “rmi”.
• This example assumes the remote object lives on the host “sirah”, and has been registered in the default RMI registry (which happens to listen on port 1099) on that machine.

Compiling and Running the Example
• Compile HelloServer and HelloClient on their respective hosts, e.g.:
    sirah$ javac HelloServer.java
    merlot$ javac HelloClient.java
• Either ensure client and server share the current directory, or copy all files with names of the form MessageWriter*.class to the client’s current directory.

Running HelloClient/HelloServer
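The exchange between HelloServer and HelloClient can be tried out on a single machine. The following is only a sketch under several assumptions that are not in the slides: both roles run in one JVM, the registry is created in-process with LocateRegistry.createRegistry() on a free port chosen at run time (instead of a separately started registry on port 1099), the java.rmi.server.hostname property is pinned to 127.0.0.1, and the class LoopbackHello and the field lastMessage are hypothetical additions so the result can be checked. Recent JDKs generate stubs dynamically, so no rmic step should be needed here.

```java
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface, as in the slides.
interface MessageWriter extends Remote {
    void writeMessage(String s) throws RemoteException;
}

// Implementation as in the slides, plus a hypothetical lastMessage field
// that records what the server side received.
class MessageWriterImpl extends UnicastRemoteObject implements MessageWriter {
    volatile String lastMessage;

    MessageWriterImpl() throws RemoteException { }

    public void writeMessage(String s) throws RemoteException {
        lastMessage = s;
        System.out.println(s);
    }
}

class LoopbackHello {
    // Plays the server and client roles in one JVM and returns the text
    // that arrived at the remote object.
    static String run(String text) {
        try {
            // Assumption: everything stays on the loopback interface.
            System.setProperty("java.rmi.server.hostname", "127.0.0.1");

            // Pick a currently free port for the registry (sketch convenience).
            int port;
            try (ServerSocket probe = new ServerSocket(0)) {
                port = probe.getLocalPort();
            }

            // "Server" role: create the registry and publish the remote object.
            Registry registry = LocateRegistry.createRegistry(port);
            MessageWriterImpl impl = new MessageWriterImpl();
            registry.rebind("messageservice", impl);
            try {
                // "Client" role: look the object up and invoke it via the stub.
                Registry remote = LocateRegistry.getRegistry("127.0.0.1", port);
                MessageWriter server = (MessageWriter) remote.lookup("messageservice");
                server.writeMessage(text);
                return impl.lastMessage;
            } finally {
                // Unexport so that RMI threads do not keep the JVM alive.
                UnicastRemoteObject.unexportObject(impl, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("server received: " + run("Hello, other world"));
    }
}
```

In a real two-host deployment the client would instead use the rmi:// URL shown earlier with Naming.lookup(), and the server would stay running rather than unexporting its objects.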

The Mechanics of Remote Method Invocation

Is RMI a Language Extension?
• Invocation of a method on a remote object reproduces the “look and feel” of local invocation very well.
• But the internal mechanics of remote invocation are much more complex than local invocation:
  • Arguments, which may be objects of arbitrary complexity, are somehow collected together into messages suitable for shipping across the Internet.
  • Results (or exceptions) are similarly shipped back.
• Perhaps surprisingly, RMI involves no modification to the Java language, compiler, or virtual machine.
• The illusion of remote invocation is achieved by clever libraries, plus one relatively simple “post-processor” tool (rmic).

Exchanging Remote References
• A powerful feature of distributed object models like RMI is that references to other remote objects can be passed as arguments to, and returned as results from, remote methods.
• Starting with one remote object reference (presumably obtained from an RMI registry) a client can, for example, obtain references to additional remote objects, returned by methods on the first one.

Example: a Printer Registry
• Suppose we are on a LAN and we need to get a Java driver for one of several available printers:
    public interface Printer extends Remote {
        void print(String document) throws RemoteException ;
    }

    public interface PrinterHub extends Remote {
        Printer getPrinter(int dpi, boolean isColor) throws RemoteException ;
    }
• A client might initially obtain a PrinterHub reference from the RMI registry. The remote object contains some table of printers on the network.
• A reference to an individual Printer is returned to the client, according to the specifications given in getPrinter().
• Jini takes an approach similar to this.

Remote References have Interface Type
• This is a powerful feature, but there is one interesting restriction:
  • If a particular argument or result of a remote method itself implements Remote, the type appearing in the method declaration must be a remote interface.
  • The declared type of an RMI argument or result cannot be a remote implementation class.
• At the “receiving end” this reference cannot be cast to the implementation class of the remote object. A ClassCastException will occur if you try.

Stubs
• What this tells us is that “remote references” are not literally Java references to objects in other virtual machines.
• In fact they are Java references to local objects that happen to implement the same remote interfaces as the remote objects concerned.
• The local Java object referenced is an instance of a stub class.
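The restriction on casting and the role of the stub can be observed directly. The sketch below reuses the Printer and PrinterHub interfaces from the printer example; the implementation classes PrinterImpl and PrinterHubImpl, the jobs counter, and the StubTypeDemo driver are hypothetical, and for brevity the hub's stub is obtained with RemoteObject.toStub() in the same JVM rather than through a registry on another host. It assumes a JDK that creates stubs dynamically and a working loopback interface.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.RemoteObject;
import java.rmi.server.UnicastRemoteObject;

interface Printer extends Remote {
    void print(String document) throws RemoteException;
}

interface PrinterHub extends Remote {
    Printer getPrinter(int dpi, boolean isColor) throws RemoteException;
}

// Hypothetical implementation: just counts the jobs it receives.
class PrinterImpl extends UnicastRemoteObject implements Printer {
    volatile int jobs;

    PrinterImpl() throws RemoteException { }

    public void print(String document) throws RemoteException {
        jobs++;
    }
}

// Hypothetical hub that knows about a single printer.
class PrinterHubImpl extends UnicastRemoteObject implements PrinterHub {
    private final Printer only;

    PrinterHubImpl(Printer only) throws RemoteException {
        this.only = only;
    }

    public Printer getPrinter(int dpi, boolean isColor) throws RemoteException {
        return only;   // an exported object is marshaled as its stub
    }
}

class StubTypeDemo {
    // Returns { isPrinter, isPrinterImpl, castFailed, callReachedImpl }.
    static boolean[] run() {
        try {
            // Assumption: everything stays on the loopback interface.
            System.setProperty("java.rmi.server.hostname", "127.0.0.1");
            PrinterImpl printer = new PrinterImpl();
            PrinterHubImpl hub = new PrinterHubImpl(printer);
            try {
                // The stub stands in for a reference obtained from a registry.
                PrinterHub hubRef = (PrinterHub) RemoteObject.toStub(hub);
                Printer p = hubRef.getPrinter(600, true);   // remote call

                boolean castFailed = false;
                try {
                    PrinterImpl impl = (PrinterImpl) p;      // not allowed
                } catch (ClassCastException e) {
                    castFailed = true;
                }

                p.print("test page");                        // goes via the stub
                return new boolean[] {
                    p instanceof Printer,
                    p instanceof PrinterImpl,
                    castFailed,
                    printer.jobs == 1
                };
            } finally {
                UnicastRemoteObject.unexportObject(hub, true);
                UnicastRemoteObject.unexportObject(printer, true);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The returned reference implements Printer but is not a PrinterImpl, even though getPrinter() returned the implementation object on the server side.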

Some Important Parts of RMI
• Stubs.
  • Each remote object class has an associated stub class, which implements the same remote interfaces. An instance of the stub class is needed on each client. Client-side remote invocations are “actually” local invocations on the stub class.
• Serialization.
  • Arguments and results have to be “marshaled”, that is, converted to a representation that can be sent over the Net. In general this is a highly non-trivial transformation for Java objects. Serialization is also used for distributing stubs.
• The Server-side “Run-time System”.
  • This is responsible for listening for invocation requests on suitable IP ports, and dispatching them to the proper, locally resident, remote object.

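Architecture
[Figure: a Client and a Server connected across the Internet. On the client, the Client Code calls a stub method locally, and the Stub sends the marshaled arguments. On the server, the RMI “Run-time” System calls the remote object method locally on the Remote Object, which returns a value or throws an exception. The run-time system sends the marshaled result or exception back, and the stub returns the value or throws the exception to the client code.]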

The Role of rmic
• The only “compiler” technology peculiar to RMI is the rmic stub generator.
• The input to rmic is a remote implementation class, compiled in the normal way with javac (for example).
• The stub generator outputs a new class that implements the same remote interfaces as the input class.
• The methods of the new class contain code to send arguments to, and receive results from, a remote object, whose Internet address is stored in the stub instance.

Example Operation of rmic
• An earlier example of a remote implementation class:
    public class MessageWriterImpl extends UnicastRemoteObject
                                   implements MessageWriter {
        . . .
        public void writeMessage(String s) throws RemoteException {
            . . .
        }
    }
• We issue the command:
    rmic -keep MessageWriterImpl
• The flag -keep causes the intermediate Java source to be retained.

The Generated Stub Class
    public final class MessageWriterImpl_Stub
            extends java.rmi.server.RemoteStub
            implements MessageWriter, java.rmi.Remote {
        . . .
        public MessageWriterImpl_Stub(java.rmi.server.RemoteRef ref) {
            super(ref);
        }

        public void writeMessage(java.lang.String $param_String_1)
                throws java.rmi.RemoteException {
            try {
                ref.invoke(this, $method_writeMessage_0,
                           new java.lang.Object[] {$param_String_1},
                           4572190098528430103L);
            } . . .
        }
    }

Remarks on the Stub Class
• The stub class includes an inherited field ref, of type RemoteRef.
• Essentially the stub class is just a wrapper for this remote reference.
• Remote methods are dispatched through the invoke() method on ref.
  • This is passed an array of Objects holding the original arguments (in general it also returns an Object).
  • It is also passed arguments to identify the particular method to be invoked on the server.
• Essentially the stub wrapper is providing compile-time type safety. The actual work is done in library classes that don’t know the compile-time type in advance.
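The division of labour between a typed stub and an untyped reference can be illustrated without any networking. The sketch below is an analogy only, not RMI's actual implementation: Greeter, LoggingRef and GreeterStub are hypothetical names, and LoggingRef.invoke() merely records the call and fabricates a reply where a real RemoteRef would marshal the arguments and contact the server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface playing the role of a remote interface.
interface Greeter {
    String greet(String name);
}

// Hypothetical stand-in for RemoteRef: it knows nothing about Greeter at
// compile time and only sees a method name and an Object[] of arguments.
class LoggingRef {
    final List<String> wire = new ArrayList<>();

    Object invoke(String methodName, Object[] params) {
        // Record what would be sent over the network.
        wire.add(methodName + Arrays.toString(params));
        // Pretend the remote side computed this result.
        return "Hello, " + params[0];
    }
}

// Typed wrapper in the style of a generated stub: it packs the arguments
// into an Object[] and casts the untyped result back to the declared type.
class GreeterStub implements Greeter {
    private final LoggingRef ref;

    GreeterStub(LoggingRef ref) {
        this.ref = ref;
    }

    public String greet(String name) {
        return (String) ref.invoke("greet", new Object[] { name });
    }
}
```

Client code written against Greeter gets compile-time checking of argument and result types, while all the generic work happens behind invoke().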

Marshalling of Arguments
• Objects passed as arguments to invoke() must be marshaled for transmission over the network.
• Java has a general framework for converting objects (and groups of objects) to an external representation that can later be read back into an arbitrary JVM.
• This framework is Object Serialization.

Serialization Preserves Object Graphs
• Consider this binary tree node class:
    class Node implements Serializable {
        Node() {}
        Node(Node left, Node right) {
            this.left = left ;
            this.right = right ;
        }
        private Node left, right ;
    }
• We create a small tree, d, by:
    Node a = new Node(), b = new Node() ;  // Leaves
    Node c = new Node(a, b) ;
    Node d = new Node(c, null) ;

Serializing and Deserializing a Tree
• Write out the root of the tree:
    out.writeObject(d) ;
• Read a node later by:
    Node e = (Node) in.readObject() ;
• The whole of the original tree is reproduced. Copies a’, b’, c’ of the original sub-nodes are recreated along with e. The pattern of references is preserved.
[Figure: the original tree with root d, child c and leaves a and b, next to its deserialized copy with root e, child c’ and leaves a’ and b’.]

Referential Integrity is Preserved
• This behavior is not limited to trees.
• In this example both b and c reference a single object a.
• Again the pattern of links is preserved. When the root object is reconstructed from its serialized form, a single a’, referenced twice, is also created.
• Generally referential integrity is preserved amongst all objects written to a single ObjectOutputStream.
[Figure: nodes b and c, under root d, both reference a; in the deserialized copy, b’ and c’, under root e, both reference a single a’.]
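The behavior described on the last three slides can be checked with an in-memory round trip. This is a minimal sketch: the Node class follows the slides except that its fields are left package-visible so they can be inspected, and the GraphRoundTrip helper, which serializes to a byte array instead of a file or socket, is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Node class from the slides; fields are package-visible here so that the
// shape of the copied graph can be inspected.
class Node implements Serializable {
    private static final long serialVersionUID = 1L;

    final Node left, right;

    Node() {
        this(null, null);
    }

    Node(Node left, Node right) {
        this.left = left;
        this.right = right;
    }
}

class GraphRoundTrip {
    // Serializes the graph reachable from root into a byte array and
    // reads a fresh copy back from it.
    static Node copyViaSerialization(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node(a, null), c = new Node(a, null);   // both reference a
        Node d = new Node(b, c);
        Node e = copyViaSerialization(d);
        System.out.println("copy is a new object:     " + (e != d));
        System.out.println("a' is shared by b' and c': " + (e.left.left == e.right.left));
    }
}
```

Both checks in main() are expected to hold: the copy is a distinct graph, but within it the shared node is still a single object.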

The Serializable Interface
• Serializable is another marker interface. An object’s class must implement Serializable if it is to be passed to writeObject() (the library method that actually does serialization). If it doesn’t, a NotSerializableException will be thrown.
• Arrays are serializable if their elements are.
• Many (most?) of the utility classes in the standard Java library are serializable.
• For example container classes as complex as HashMaps can readily be serialized (assuming the user data stored in them is serializable), and thus passed as arguments and returned as results of RMI methods.
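These rules are easy to probe. In the sketch below, Plain and SerializableCheck are hypothetical names; canSerialize() simply attempts writeObject() into a throwaway buffer and reports whether a NotSerializableException was raised.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// A class that deliberately does not implement Serializable.
class Plain { }

class SerializableCheck {
    // Returns true if the whole object graph rooted at o can be written
    // by writeObject(), false if some object in it is not serializable.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For instance, a HashMap of Strings or a String array passes the check, while a Plain instance, or an array or map containing one, does not.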

Argument Passing in RMI: Summary
• In general any object-valued argument or result of a remote method must either implement Remote or Serializable.
• If the argument or result implements Remote, it is effectively passed by (remote) reference.
• If it implements Serializable, it is passed by serialization and copying. Referential integrity is preserved within the limits of the arguments of a single invocation, as described above.
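The practical consequence of passing by serialization is that a remote method works on a copy of its argument. The following sketch makes this visible; Appender, AppenderImpl and CopySemanticsDemo are hypothetical names, the stub is obtained in the same JVM with RemoteObject.toStub() rather than from a registry, and a JDK with dynamically generated stubs and a working loopback interface is assumed.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.RemoteObject;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;

// Hypothetical remote interface taking a Serializable argument.
interface Appender extends Remote {
    int appendOne(ArrayList<String> list) throws RemoteException;
}

class AppenderImpl extends UnicastRemoteObject implements Appender {
    AppenderImpl() throws RemoteException { }

    // Mutates the list it is given and reports the resulting size.
    public int appendOne(ArrayList<String> list) throws RemoteException {
        list.add("added by callee");
        return list.size();
    }
}

class CopySemanticsDemo {
    // Returns { size seen by the remote callee,
    //           caller's list size after the remote call,
    //           caller's list size after a direct local call }.
    static int[] run() {
        try {
            // Assumption: everything stays on the loopback interface.
            System.setProperty("java.rmi.server.hostname", "127.0.0.1");
            AppenderImpl impl = new AppenderImpl();
            try {
                Appender stub = (Appender) RemoteObject.toStub(impl);

                ArrayList<String> list = new ArrayList<>();
                list.add("original");

                int seenRemotely = stub.appendOne(list);  // callee gets a copy
                int afterRemote = list.size();            // caller's list untouched

                impl.appendOne(list);                     // ordinary local call
                int afterLocal = list.size();             // now really modified

                return new int[] { seenRemotely, afterRemote, afterLocal };
            } finally {
                UnicastRemoteObject.unexportObject(impl, true);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Code that relies on a callee updating its arguments in place therefore behaves differently when the call becomes remote; results have to come back through the return value or through an argument that is itself a remote reference.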

Mechanics of RMI: Summary
• In principle most of the features that have been discussed in this section are hidden inside the implementation of RMI. In an ideal world you would not have to know about them.
• In practice, to successfully deploy an RMI-based application, you will probably need to at least be aware of some fundamental issues.
• You need to be aware of the existence of stub objects, and the basic working of object serialization.
• You should be aware that references to remote objects are normally produced by creating a stub object on the server, then passing this stub to the registry and clients in serialized form.

Dynamic Class Loading

Byte Code Instructions for Stubs?
• As we have seen: before any client can use an RMI remote object, it must receive a serialized stub object.
• The serialized stub contains a remote reference. Data fields of the reference may include information like:
  • The name of the host where the remote object lives.
  • Some port number on that host, where the RMI run-time system is listening for invocation requests.
  • Any other information needed to uniquely identify the remote object within its host.
• One thing serialized objects do not contain is the actual JVM instructions (the byte codes) that implement methods on the local object.

Serialization Only Saves the Data
• In general the Java serialization process stores all data fields from the original object.
• It does not store any representation of the code associated with the methods in the object’s class.
• When an object is deserialized (e.g. on some client), the client JVM must have some way of loading a class file that does contain this information.
• If it cannot find a suitable class file, the deserialization process will fail. You will see a java.rmi.UnmarshalException thrown, with a nested java.lang.ClassNotFoundException.
• When you are doing development using RMI, you will probably see this exception a lot!

Copying Stub Class Files
• In RMI, there are at least two ways to get the class files to the client.
• The straightforward approach is to manually copy class files for all stub classes to the client: either put them in the current directory on the client, or in some directory on the client’s CLASSPATH.
• This approach is reliable, easy to understand, and perhaps the best approach for initial experiments with RMI.
• But eventually you may find this too limiting. One of the benefits of the OO approach is supposed to be that the user code (here the client) doesn’t need to know the exact implementation class in advance, only the interface. But stubs are associated with the implementation class.

Dynamic Class Loading
• A more general approach is to publish implementation class files that may be needed by clients on a Web Server.
• Although the serialized representation of an object does not contain the actual information from the class file, the representation can be annotated with a URL. This specifies a Web Server directory from which the class file can be downloaded.
• When the object is deserialized, the client Java Virtual Machine transparently downloads the byte codes from the Web Server specified in the annotation. On the client side, this process happens automatically.

Dynamic Class Loading
[Figure: the Server hosts the Remote Object (a MyImpl instance) and passes the Client JVM a serialized stub, annotated with code-base http://myWWW/download/. The Client then requests the stub class file, MyImpl_Stub.class, from the Web Server (myWWW), whose directory tree shows html/ and download/.]

Remarks
• In simple examples, the serialized stub will probably be obtained through an RMI registry running on the server (the same server where the remote object is running).
• The two servers (the server where the remote object is running, and the Web Server publishing the class files) may, of course, be physically the same machine.

The java.rmi.server.codebase Property
• We need a way to cause serialized object representations to be annotated with suitably chosen URLs.
• In principle this is straightforward. We set a property called java.rmi.server.codebase in the JVM where the stub (or serialized object in general) originates.
• The value of this property is a code-base URL.
• The RMI serialization classes read the code-base property, and embed the URL they find there in the serialized representation of arguments or results.
  • The exception: if this JVM itself downloaded the class file for the object from a Web server, they embed the URL from which the class was originally loaded instead.

Setting the Code-base
• For example, our original HelloServer example might be run as follows:
    java -Djava.rmi.server.codebase=http://sirah.csit.fsu.edu/users/dbc/ HelloServer
• This sets the java.rmi.server.codebase property to:
    http://sirah.csit.fsu.edu/users/dbc/
  This URL gets embedded in serialization streams created by the HelloServer program.
• If an object is subsequently recreated by deserialization in a different JVM (and that JVM cannot find a local copy of the associated class file) it will automatically request it from the Web server sirah, looking in the document directory users/dbc/.

Security Managers
• There is one more thing we need to worry about.
• Before a Java application is allowed to download code dynamically, a suitable security manager must be set. This means a security policy must also be defined.
• In general this is a complicated topic. We won’t go into any detail: just give a recipe you can follow.

Setting the Security Manager
• In an RMI application, if no security manager is set, stubs and classes can only be loaded from the local CLASSPATH.
• To enable dynamic loading, execute the statement:
    System.setSecurityManager(new RMISecurityManager()) ;
  at the start of the program.
• You should do this in any application that may have to download code. In the simple examples considered so far this means RMI clients that need to download stubs.
• This isn’t the end of the story. You also have to define a new property: the java.security.policy property.
• In simple cases this property is needed for clients, whereas java.rmi.server.codebase is needed for servers.

Defining a Security Policy
• The simplest security policy you can define is a plain text file with contents:
    grant {
        permission java.security.AllPermission "", "" ;
    } ;
• This policy allows downloaded code to do essentially anything the current user has privileges to do:
  • Read, write and delete arbitrary files; open, read and write to arbitrary Internet sockets; execute arbitrary UNIX/Windows commands on the local machine, etc.
• It is a dangerous policy if there is any chance you may download code from untrustworthy sources (e.g. the Web).
• For now you can use this policy, but please avoid dynamically loading code you cannot trust!
