Java for High Performance Computing

java.nio: High Performance I/O for Java
http://www.hpjava.org/courses/arl
Instructor: Bryan Carpenter
Pervasive Technology Labs, Indiana University
dbcarpen@indiana.edu

NIO: New I/O
• Prior to the J2SE 1.4 release of Java, I/O had become a bottleneck.
• JIT performance was reaching the point where one could start to think of Java as a platform for high performance computation, but the old java.io stream classes had too many software layers to be fast:
  • the specification implied much copying of small chunks of data;
  • there was no way to multiplex data from multiple sources without incurring thread context switches;
  • there was no way to exploit modern OS tricks for high performance I/O, like memory-mapped files.
• New I/O changes that by providing:
  • A hierarchy of dedicated buffer classes that allow data to be moved from the JVM to the OS with minimal memory-to-memory copying, and without expensive overheads like switching byte order; effectively, buffer classes give Java a “window” on system memory.
  • A unified family of channel classes that allow data to be fed directly from buffers to files and sockets, without going through the intermediaries of the old stream classes.
  • A family of classes to directly implement selection (AKA readiness testing, AKA multiplexing) over a set of channels.
• NIO also provides file locking for the first time in Java.

References
• The Java NIO software is part of J2SE 1.4 and later, from http://java.sun.com/j2se/1.4
• Online documentation is at: http://java.sun.com/j2se/1.4/nio
• There is an authoritative book from O’Reilly: “Java NIO”, Ron Hitchens, 2002

Buffers

Buffers
• A Buffer object is a container for a fixed amount of data.
• It behaves something like a byte [] array, but is encapsulated in such a way that the internal storage can be a block of system memory.
• Thus adding data to, or extracting it from, a buffer can be a very direct way of getting information between a Java program and the underlying operating system.
• All modern operating systems provide virtual memory systems that allow memory space to be mapped to files, so this also enables a very direct and high-performance route to the file system.
• The data in a buffer can also be efficiently read from, or written to, a socket or pipe, enabling high performance communication.
• The buffer APIs allow you to read or write from a specific location in the buffer directly; they also allow relative reads and writes, similar to sequential file access.

The java.nio.Buffer Hierarchy
[Figure: class diagram of the buffer hierarchy]

The ByteBuffer Class
• The most important buffer class in practice is probably the ByteBuffer class. This represents a fixed-size vector of primitive bytes.
• Important methods on this class include:
    byte get()
    byte get(int index)
    ByteBuffer get(byte [] dst)
    ByteBuffer get(byte [] dst, int offset, int length)
    ByteBuffer put(byte b)
    ByteBuffer put(int index, byte b)
    ByteBuffer put(byte [] src)
    ByteBuffer put(byte [] src, int offset, int length)
    ByteBuffer put(ByteBuffer src)

File Position and Limit
• Apart from forms with an index parameter, these are all relative operations: they get data from, or insert data into, the buffer starting at the current position in the buffer; they also update the position to point just past the data read or written. The position property is like the file pointer in sequential file access.
• The superclass Buffer has methods for explicitly manipulating the position and related properties of buffers, e.g.:
    int position()
    Buffer position(int newPosition)
    int limit()
    Buffer limit(int newLimit)
• The ByteBuffer or Buffer references returned by these various methods are simply references to this buffer object, not new buffers. They are provided to support cryptic invocation chaining. Feel free to ignore them.
• The limit property marks either the end of the space available for writing or, when reading, the end of the data that has been written to the buffer.
• After you finish writing, the flip() method can be called to set limit to the current value of position, and reset position to zero, ready for reading.
• Various operations implicitly work on the data between position and limit.
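The write, flip, read cycle described above can be sketched as follows. This is a minimal illustration only; the class name FlipDemo and its two helper methods are invented for this example and are not part of the NIO API.

```java
import java.nio.ByteBuffer;

final class FlipDemo {
    /** Writes three bytes, flips, and returns {position, limit, capacity}. */
    static int[] writeThenFlip() {
        ByteBuffer buf = ByteBuffer.allocate(8);          // position=0, limit=8, capacity=8
        buf.put((byte) 10).put((byte) 20).put((byte) 30); // relative puts: position becomes 3
        buf.flip();                                       // limit=3, position=0
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    /** Drains a flipped buffer with relative gets and returns the sum of its bytes. */
    static int drain(ByteBuffer buf) {
        int sum = 0;
        while (buf.hasRemaining()) {   // true while position < limit
            sum += buf.get();          // reads at position, then advances it
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] state = writeThenFlip();
        System.out.println("position=" + state[0] + " limit=" + state[1]
                + " capacity=" + state[2]);
    }
}
```

Note that flip() does not erase anything: it only moves the two markers, so the same bytes can be read back by relative get() calls.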

Creating Buffers
• Four interesting factory methods can be used to create a new ByteBuffer:
    ByteBuffer allocate(int capacity)
    ByteBuffer allocateDirect(int capacity)
    ByteBuffer wrap(byte [] array)
    ByteBuffer wrap(byte [] array, int offset, int length)
  These are all static methods of the ByteBuffer class.
• allocate() creates a ByteBuffer with an ordinary Java backing array of size capacity.
• allocateDirect(), perhaps the most interesting case, creates a direct ByteBuffer, backed by capacity bytes of system memory.
• The wrap() methods create ByteBuffers backed by all or part of an array allocated by the user.
• The other typed buffer classes (CharBuffer, etc.) have similar factory methods, except they don’t support the important allocateDirect() method.
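As a rough illustration of the factory methods above, the sketch below prints the initial state of each kind of buffer and shows that a wrapped buffer shares storage with the caller's array. The class BufferFactories and its helpers are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;

final class BufferFactories {
    /** Summarises the properties discussed on the slide for one buffer. */
    static String describe(ByteBuffer b) {
        return "direct=" + b.isDirect() + " position=" + b.position()
                + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    /** Writes through a wrapper and returns what the underlying array now holds. */
    static byte writeThroughWrap() {
        byte[] array = new byte[4];
        ByteBuffer.wrap(array).put(0, (byte) 42);  // absolute put into the wrapper
        return array[0];                           // the array itself sees the write
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        System.out.println(describe(ByteBuffer.allocate(16)));
        System.out.println(describe(ByteBuffer.allocateDirect(16)));
        System.out.println(describe(ByteBuffer.wrap(data)));
        // wrap with offset/length: position=offset, limit=offset+length, capacity=array length
        System.out.println(describe(ByteBuffer.wrap(data, 4, 8)));
    }
}
```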

Other Primitive Types in ByteBuffers
• It is possible to write other primitive types (char, int, double, etc.) to a ByteBuffer by methods like:
    ByteBuffer putChar(char value)
    ByteBuffer putChar(int index, char value)
    ByteBuffer putInt(int value)
    ByteBuffer putInt(int index, int value)
    …
  The putChar() methods do relative or absolute writes of the two bytes in a Java char, the putInt() methods write 4 bytes, and so on.
• Of course there are corresponding getChar(), getInt(), … methods.
• These give you fun, unsafe ways of coercing bytes of one primitive type to another type, by writing data as one type and reading them as another.
• But actually this isn’t the interesting bit: this was always possible with the old java.io data streams (DataInputStream, DataOutputStream).
• The interesting bit is that the new ByteBuffer class has a method that allows you to set the byte order…

Endian-ness
• When identifying a numeric type like int or double with a sequence of bytes in memory, one can either put the most significant byte first (big-endian), or the least significant byte first (little-endian).
  • Big Endian: Sun Sparc, PowerPC CPU, numeric fields in IP headers, …
  • Little Endian: Intel processors
• In java.io, numeric types were always rendered to the stream in big-endian order.
  • This creates a serious bottleneck when writing or reading numeric types.
  • Implementations typically must apply byte manipulation code to each item, to ensure bytes are written in the correct order.
• In java.nio, the programmer specifies the byte order as a property of a ByteBuffer, by calling one of:
    myBuffer.order(ByteOrder.BIG_ENDIAN)
    myBuffer.order(ByteOrder.LITTLE_ENDIAN)
    myBuffer.order(ByteOrder.nativeOrder())
• Provided the programmer ensures the byte order set for the buffer agrees with the native representation for the local processor, numeric data can be copied between the JVM (which will use the native order) and the buffer by a straight block memory copy, which can be extremely fast: a big win for NIO.
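The effect of the order() property can be seen by encoding the same int both ways. The sketch below is illustrative only, and the class name ByteOrderDemo is made up for the example. One fact worth knowing here: a newly created ByteBuffer always starts in big-endian order, whatever the platform, so native order has to be requested explicitly.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

final class ByteOrderDemo {
    /** Encodes value as four bytes using the given byte order. */
    static byte[] encode(int value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(order);
        buf.putInt(value);
        return buf.array();   // allocate() gives an array-backed buffer
    }

    public static void main(String[] args) {
        int value = 0x01020304;
        // Most significant byte first: 1, 2, 3, 4
        System.out.println(Arrays.toString(encode(value, ByteOrder.BIG_ENDIAN)));
        // Least significant byte first: 4, 3, 2, 1
        System.out.println(Arrays.toString(encode(value, ByteOrder.LITTLE_ENDIAN)));
        // What this machine uses natively
        System.out.println("native order: " + ByteOrder.nativeOrder());
    }
}
```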

View Buffers
• ByteBuffer has no methods for bulk transfer of arrays other than type byte[].
• Instead, create a view of (a portion of) a ByteBuffer as any other kind of typed buffer, then use the bulk transfer methods on that view. The following methods of ByteBuffer create views:
    CharBuffer asCharBuffer()
    IntBuffer asIntBuffer()
    …
• To create a view of just a portion of a ByteBuffer, set position and limit appropriately beforehand; the created view only covers the region between these.
• You cannot create views of typed buffers other than ByteBuffer.
• You can create another buffer that represents a subsection of any buffer (without changing element type) by using the slice() method.
• For example, writing an array of floats to a byte buffer, starting at the current position:
    float [] array ;
    …
    FloatBuffer floatBuf = byteBuf.asFloatBuffer() ;
    floatBuf.put(array) ;
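One detail the float example leaves implicit is that a view has its own position and limit, so writing through the view does not move the position of the underlying ByteBuffer. A possible helper that advances the byte position by hand might look like this; FloatViewDemo and putFloats are hypothetical names, and Float.BYTES assumes Java 8 or later.

```java
import java.nio.ByteBuffer;
import java.nio.FloatBuffer;

final class FloatViewDemo {
    /** Appends floats at byteBuf's current position and advances that position past them. */
    static void putFloats(ByteBuffer byteBuf, float[] values) {
        FloatBuffer view = byteBuf.asFloatBuffer();  // covers position..limit of byteBuf
        view.put(values);                            // moves the view's position only
        byteBuf.position(byteBuf.position() + values.length * Float.BYTES);
    }

    public static void main(String[] args) {
        ByteBuffer byteBuf = ByteBuffer.allocateDirect(64);
        putFloats(byteBuf, new float[] { 1.5f, 2.5f, 3.5f });
        byteBuf.flip();                              // ready to read the 12 bytes back
        while (byteBuf.hasRemaining()) {
            System.out.println(byteBuf.getFloat());
        }
    }
}
```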

Channels

Channels
• A channel is a new abstraction in java.nio, defined in the package java.nio.channels.
• Channels are a high-level version of the file descriptors familiar from POSIX-compliant operating systems.
• So a channel is a handle for performing I/O operations and various control operations on an open file or socket.
• For those familiar with conventional Java I/O, java.nio associates a channel with any RandomAccessFile, FileInputStream, FileOutputStream, Socket, ServerSocket or DatagramSocket object.
• The channel becomes a peer to the conventional Java handle objects; the conventional objects still exist, and in general retain their role. The channel just provides extra NIO-specific functionality.
• NIO buffer objects can be written to or read from channels directly. Channels also play an essential role in readiness selection, discussed in the next section.

Simplified Channel Hierarchy
[Figure: simplified class diagram of the channel hierarchy]
Some of the “inheritance” arcs here are indirect: we missed out some interesting intervening classes and interfaces.

Opening Channels
• Socket channel classes have static factory methods called open(), e.g.:
    SocketChannel sc = SocketChannel.open() ;
    sc.connect(new InetSocketAddress(hostname, portnumber)) ;
• File channels cannot be created directly; first use conventional Java I/O mechanisms to create a FileInputStream, FileOutputStream, or RandomAccessFile, then apply the new getChannel() method to get an associated NIO channel, e.g.:
    RandomAccessFile raf = new RandomAccessFile(filename, "r") ;
    FileChannel fc = raf.getChannel() ;

Using Channels
• Any channel that implements the ByteChannel interface (i.e. all channels except ServerSocketChannel) provides a read() and a write() instance method:
    int read(ByteBuffer dst)
    int write(ByteBuffer src)
• These may look reminiscent of the read() and write() system calls in UNIX:
    ssize_t read(int fd, void *buf, size_t count)
    ssize_t write(int fd, const void *buf, size_t count)
• The Java read() attempts to read from the channel as many bytes as there is space remaining in the dst buffer. It returns the number of bytes actually read, or -1 at end-of-stream. It also updates the dst buffer position.
• Similarly write() attempts to write to the channel as many bytes as there are remaining in the src buffer. It returns the number of bytes actually written, and updates the src buffer position.

Example: Copying one Channel to Another
• This example assumes a source channel src and a destination channel dest:
    ByteBuffer buffer = ByteBuffer.allocateDirect(BUF_SIZE) ;
    while(src.read(buffer) != -1) {
        buffer.flip() ;   // Prepare read buffer for “draining”
        while(buffer.hasRemaining())
            dest.write(buffer) ;
        buffer.clear() ;  // Empty buffer, ready to read next chunk.
    }
• Note that a write() call (or a read() call) may or may not succeed in transferring the whole buffer in a single call. Hence the need for the inner while loop.
• The example introduces two new methods on Buffer: hasRemaining() returns true if position < limit; clear() sets position to 0 and limit to the buffer’s capacity.
• Because copying is a common operation on files, FileChannel provides a couple of special methods to do just this:
    long transferTo(long position, long count, WritableByteChannel target)
    long transferFrom(ReadableByteChannel src, long position, long count)
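For file-to-file copies, the transferTo() method mentioned above might be used along the following lines. This is only a sketch: the class TransferCopy and its copy() helper are invented for the example, and the loop is there because transferTo() may move fewer bytes than requested in one call.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

final class TransferCopy {
    /** Copies source to target with transferTo() and returns the number of bytes copied. */
    static long copy(File source, File target) throws IOException {
        // Closing a channel also closes the stream it was obtained from.
        try (FileChannel in = new FileInputStream(source).getChannel();
             FileChannel out = new FileOutputStream(target).getChannel()) {
            long size = in.size();
            long done = 0;
            while (done < size) {
                long n = in.transferTo(done, size - done, out);
                if (n == 0) {
                    break;   // nothing moved, e.g. the source shrank; avoid spinning
                }
                done += n;
            }
            return done;
        }
    }

    public static void main(String[] args) throws IOException {
        long copied = copy(new File(args[0]), new File(args[1]));
        System.out.println("copied " + copied + " bytes");
    }
}
```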

Memory-Mapped Files
• In modern operating systems one can exploit the virtual memory system to map a physical file into a region of program memory.
• Once the file is mapped, accesses to the file can be extremely fast: one doesn’t have to go through read() and write() system calls.
• One application might be a Web server, where you want to read a whole file quickly and send it to a socket.
• Problems arise if the file structure is changed while it is mapped, so use this technique only for fixed-size files.
• This low-level optimization is now available in Java. FileChannel has a method:
    MappedByteBuffer map(FileChannel.MapMode mode, long position, long size)
• mode should be one of MapMode.READ_ONLY, MapMode.READ_WRITE, MapMode.PRIVATE.
• The returned MappedByteBuffer can be used wherever an ordinary ByteBuffer can.
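A small sketch of map() in use: the hypothetical helper below maps a whole file read-only and sums its bytes with ordinary relative get() calls. It assumes a non-empty file smaller than 2 GB, since a single mapping cannot exceed Integer.MAX_VALUE bytes.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

final class MappedSum {
    /** Maps the whole file read-only and returns the sum of its bytes as unsigned values. */
    static long sumBytes(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel fc = raf.getChannel()) {
            MappedByteBuffer map = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            long sum = 0;
            while (map.hasRemaining()) {
                sum += map.get() & 0xFF;   // no read() system call per access
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sumBytes(new File(args[0])));
    }
}
```

The mapping stays valid after the channel is closed; it is released only when the buffer is garbage collected.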

Scatter/Gather
• Often called vectored I/O, this just means you can pass an array of buffers to a read or write operation; the overloaded channel instance methods have signatures:
    long read(ByteBuffer [] dsts)
    long read(ByteBuffer [] dsts, int offset, int length)
    long write(ByteBuffer [] srcs)
    long write(ByteBuffer [] srcs, int offset, int length)
• The first form of read() attempts to read enough data to fill all buffers in the array, and divides it between them, in order.
• The first form of write() attempts to concatenate the remaining data in all buffers and write it.
• The arguments offset and length select a subset of buffers from the arrays (not, say, an interval within buffers).
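A typical use of vectored I/O is a fixed-size header followed by a payload. The sketch below writes such a record with a gathering write and reads it back with a scattering read. The record layout (a 4-byte length, then the payload) and the names ScatterGatherDemo, writeRecord and readRecord are assumptions made for this illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class ScatterGatherDemo {
    /** Writes a 4-byte length header followed by the payload with a gathering write. */
    static void writeRecord(FileChannel fc, byte[] payload) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4);
        header.putInt(payload.length);
        header.flip();
        ByteBuffer body = ByteBuffer.wrap(payload);
        ByteBuffer[] parts = { header, body };
        while (header.hasRemaining() || body.hasRemaining()) {
            fc.write(parts);   // drains header first, then body
        }
    }

    /** Reads one record from the current position; maxPayload bounds the body buffer. */
    static byte[] readRecord(FileChannel fc, int maxPayload) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4);
        ByteBuffer body = ByteBuffer.allocate(maxPayload);
        ByteBuffer[] parts = { header, body };
        long n;
        do {
            n = fc.read(parts);   // fills header first, then body
        } while (n > 0);          // stops at end-of-file (-1) or when both are full (0)
        header.flip();
        body.flip();
        byte[] payload = new byte[header.getInt()];
        body.get(payload);
        return payload;
    }
}
```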

SocketChannels
• As mentioned at the beginning of this section, socket channels are created directly with their own factory methods.
• If you want to manage a socket connection as a NIO channel this is the only option. Creating a NIO socket channel implicitly creates a peer java.net socket object, but (contrary to the situation with file handles) the converse is not true.
• As with file channels, socket channels can be more complicated to work with than the traditional java.net socket classes, but provide much of the hard-boiled flexibility you get programming sockets in C.
• The most notable new facilities are that socket communications can now be non-blocking, they can be interrupted, and there is a selection mechanism that allows a single thread to multiplex the servicing of any number of channels.

Basic Socket Channel Operations
• Typical use of a server socket channel follows a pattern like:
    ServerSocketChannel ssc = ServerSocketChannel.open() ;
    ssc.socket().bind( new InetSocketAddress(port) ) ;
    while(true) {
        SocketChannel sc = ssc.accept() ;
        … process a transaction with client through sc …
    }
• The client does something like:
    SocketChannel sc = SocketChannel.open() ;
    sc.connect( new InetSocketAddress(serverName, port) ) ;
    … initiate a transaction with server through sc …
• The elided code above will typically be using read() and write() calls on the SocketChannel to exchange data between client and server.
• So there are four important operations: accept(), connect(), write(), read().

Nonblocking Operations
• By calling the channel method
    channel.configureBlocking(false) ;
  you put a socket channel into nonblocking mode (calling again with argument true restores blocking mode, and so on).
• In non-blocking mode:
  • A read() operation only transfers data that is immediately available. If no data is immediately available it returns 0.
  • Similarly, if data cannot be immediately written to a socket, a write() operation will immediately return 0.
  • For a server socket, if no client is currently trying to connect, the accept() method immediately returns null.
  • The connect() method is more complicated, because a blocking connect generally has to wait for some interval for the server to respond.
  • In non-blocking mode connect() generally returns false, but the negotiation with the server is nevertheless started. The finishConnect() method on the same channel should be called later. It also returns immediately. Repeat until it returns true.
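The non-blocking accept() and connect()/finishConnect() behaviour described above can be tried out over the loopback interface. The following is a sketch only: NonBlockingDemo and its methods are hypothetical names, port 0 asks the OS for any free port, and the polling loop stands in for whatever useful work a real program would do between calls.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

final class NonBlockingDemo {
    /** Returns true if a non-blocking accept() with no pending client returns null. */
    static boolean acceptReturnsNullWhenIdle() throws IOException {
        try (ServerSocketChannel ssc = ServerSocketChannel.open()) {
            ssc.socket().bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            ssc.configureBlocking(false);
            return ssc.accept() == null;   // nobody is connecting, so no blocking and no channel
        }
    }

    /** Connects in non-blocking mode, polling finishConnect() until it reports success. */
    static boolean connectNonBlocking() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocketChannel ssc = ServerSocketChannel.open();
             SocketChannel client = SocketChannel.open()) {
            ssc.socket().bind(new InetSocketAddress(loopback, 0));
            client.configureBlocking(false);
            boolean connected = client.connect(
                    new InetSocketAddress(loopback, ssc.socket().getLocalPort()));
            while (!connected) {
                Thread.yield();                       // placeholder for other useful work
                connected = client.finishConnect();   // returns immediately; throws on failure
            }
            return client.isConnected();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("idle accept returned null: " + acceptReturnsNullWhenIdle());
        System.out.println("connected: " + connectNonBlocking());
    }
}
```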

Interruptible Operations
• The standard channels in NIO are all interruptible.
• If a thread is blocked waiting on a channel, and the thread’s interrupt() method is called, the channel will be closed, and the thread will be woken and sent a ClosedByInterruptException.
• To avoid race conditions, the same will happen if an operation on a channel is attempted by a thread whose interrupt status is already true.
• See the lecture on threads for a discussion of interrupts.
• This represents progress over traditional Java I/O, where interruption of blocking operations was not guaranteed.

Other Features of Channels
• File channels provide a quite general file locking facility. This is presumably important to many applications (database applications), but less obviously so to HPC operations, so we don’t discuss it here.
• There is a DatagramChannel for sending UDP-style messages. This may well be important for high performance communications, but we don’t have time to discuss it.
• There is a special channel implementation representing a kind of pipe, which can be used for inter-thread communication.

Selectors

Readiness Selection
• Prior to New I/O, Java provided no standard way of selecting, from a set of possible socket operations, just the ones that are currently ready to proceed, so that the ready operations can be immediately serviced.
• One application would be in implementing an MPI-like message passing system: in general incoming messages from multiple peers must be consumed as they arrive and fed into a message queue, until the user program is ready to handle them.
• Previously one could achieve equivalent effects in Java by doing blocking I/O operations in separate threads, then merging the results through Java thread synchronization. But this can be inefficient because thread context switching and synchronization are quite slow.
• One way of achieving the desired effect in New I/O would be to set all the channels involved to non-blocking mode, and use a polling loop to wait until some are ready to proceed.
• A more structured, and potentially more efficient, approach is to use Selectors.
• In many flavors of UNIX this is achieved by using the select() system call.

Classes Involved in Selection
• Selection can be done on any channel extending SelectableChannel. Amongst the standard channels this means the three kinds of socket channel (the two ends of a Pipe are also selectable).
• The class that supports the select() operation itself is Selector. This is a sort of container class for the set of channels in which we are interested.
• The last class involved is SelectionKey, which is said to represent the binding between a channel and a selector.
• In some sense it is part of the internal representation of the Selector, but the NIO designers decided to make it an explicit part of the API.

Setting Up Selectors
• A selector is created by the open() factory method. This is naturally a static method of the Selector class.
• A channel is added to a selector by calling the method:
    SelectionKey register(Selector sel, int ops)
• This, slightly oddly, is an instance method of the SelectableChannel class; you might have expected the register() method to be a member of Selector.
• Here ops is a bit-set representing the interest set for this channel, composed by OR-ing together one or more of:
    SelectionKey.OP_READ
    SelectionKey.OP_WRITE
    SelectionKey.OP_CONNECT
    SelectionKey.OP_ACCEPT
• A channel added to a selector must be in nonblocking mode!
• The register() method returns the SelectionKey created.
• Since this automatically gets stored in the Selector, in most cases you probably don’t need to save the result yourself.

Example
• Here we create a selector, and register three pre-existing channels with the selector:
    Selector selector = Selector.open() ;
    channel1.register (selector, SelectionKey.OP_READ) ;
    channel2.register (selector, SelectionKey.OP_WRITE) ;
    channel3.register (selector, SelectionKey.OP_READ | SelectionKey.OP_WRITE) ;
• For channel1 the interest set is reads only, for channel2 it is writes only, for channel3 it is reads and writes.
• Note channel1, channel2, channel3 must all be in non-blocking mode at this time, and must remain in that mode as long as they are registered in any selector.
• You remove a channel from a selector by calling the cancel() method of the associated SelectionKey.

select() and the Selected Key Set
• To inspect the set of channels, to see what operations are newly ready to proceed, you call the select() method on the selector.
• The return value is an integer, which will be zero if no status changes occurred.
• More interesting than the return value is the side effect this method has on the set of selected keys embedded in the selector.
• To use selectors, you must understand that a selector maintains a Set object representing this selected key set.
• Because each key is associated with a channel, this is equivalent to a set of selected channels.
• The set of selected keys is different from (presumably a subset of) the registered key set.
• Each time the select() method is called it may add new keys to the selected key set, as operations become ready to proceed.
• You, as the programmer, are responsible for explicitly removing keys from the selected key set belonging to the selector, as you deal with operations that have become ready.

Ready Sets
• This is quite complicated already, but there is one more complication.
• We saw that each key in the registered key set has an associated interest set, which is a subset of the 4 possible operations on sockets.
• Similarly each key in the selected key set has an associated ready set, which is a subset of the interest set, representing the actual operations that have been found ready to proceed.
• Besides adding new keys to the selected key set, a select() operation may add new operations to the ready set of a key already in the selected key set.
  • This assumes the selected key set was not cleared after a preceding select().
• You can extract the ready set from a SelectionKey as a bit-set, by using the method readyOps(). Or you can use the convenience methods:
    isReadable()
    isWritable()
    isConnectable()
    isAcceptable()
  which effectively return the bits of the ready set individually.

A Pattern for Using select()
    … register some channels with selector …
    while(true) {
        selector.select() ;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator() ;
        while( it.hasNext() ) {
            SelectionKey key = it.next() ;
            if( key.isReadable() )
                … perform read() operation on key.channel() …
            if( key.isWritable() )
                … perform write() operation on key.channel() …
            if( key.isConnectable() )
                … complete the connection with finishConnect() on key.channel() …
            if( key.isAcceptable() )
                … perform accept() operation on key.channel() …
            it.remove() ;
        }
    }

Remarks
• This general pattern will probably serve for most uses of select():
  • Perform select() and extract the new selected key set.
  • For each selected key, handle the actions in its ready set.
  • Remove the processed key from the selected key set.
• Note the remove() operation on an Iterator removes the current item from the underlying container.
• More generally, the code that handles a ready operation may also alter the set of channels registered with the selector.
  • E.g. after doing an accept() you may want to register the returned SocketChannel with the selector, to wait for read() or write() operations.
• In many cases only a subset of the possible operations read, write, accept, connect are ever in interest sets of keys registered with the selector, so you won’t need all 4 tests.
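To make the pattern concrete, here is one way the loop might look for a tiny single-threaded exchange over the loopback interface, including the accept-then-register step mentioned in the remarks. It is a sketch under simplifying assumptions: the class and method names are invented, there is exactly one client, and the message is small enough to fit in the socket buffers (a large message would block the client's write before the loop starts reading).

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class SelectLoopDemo {
    /** Sends message from a client and returns what the selector loop received. */
    static String roundTrip(String message) throws IOException {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer received = ByteBuffer.allocate(bytes.length);
        InetAddress loopback = InetAddress.getLoopbackAddress();
        SocketChannel accepted = null;
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open();
             SocketChannel client = SocketChannel.open()) {
            server.socket().bind(new InetSocketAddress(loopback, 0)); // any free port
            server.configureBlocking(false);                          // required by register()
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Blocking client: the OS queues the connection and the data for us.
            client.connect(new InetSocketAddress(loopback, server.socket().getLocalPort()));
            ByteBuffer out = ByteBuffer.wrap(bytes);
            while (out.hasRemaining()) client.write(out);

            boolean eof = false;
            while (received.hasRemaining() && !eof) {
                selector.select();                                    // wait for ready keys
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();                                      // our job, not the selector's
                    if (key.isAcceptable()) {
                        accepted = server.accept();
                        if (accepted != null) {
                            accepted.configureBlocking(false);
                            accepted.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        eof = ((SocketChannel) key.channel()).read(received) < 0;
                    }
                }
            }
            if (accepted != null) accepted.close();
        }
        received.flip();
        return StandardCharsets.UTF_8.decode(received).toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("ping"));
    }
}
```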

Key Attachments
• One problem with the pattern above is that when it.next() returns a key, there is no convenient way of getting information about the context in which the associated channel was registered with the selector.
• For example channel1 and channel3 are both registered for OP_READ. But the action that should be taken when the read becomes ready may be quite different for the two channels.
• You need a convenient way to determine which channel the returned key is bound to.
• You can specify an arbitrary object as an attachment to the key when you create it; later when you get the key from the selected set, you can extract the attachment, and use its content to decide what to do.
• At its most basic the attachment might just be an index identifying the channel.

Simplistic Use of Key Attachments
    channel1.register (selector, SelectionKey.OP_READ,
                       Integer.valueOf(1) ) ;   // attachment
    …
    channel3.register (selector, SelectionKey.OP_READ | SelectionKey.OP_WRITE,
                       Integer.valueOf(3) ) ;   // attachment
    …
    while(true) {
        …
        Iterator<SelectionKey> it = selector.selectedKeys().iterator() ;
        …
        SelectionKey key = it.next() ;
        if( key.isReadable() )
            switch( ((Integer) key.attachment() ).intValue() ) {
                case 1 :
                    … action appropriate to channel1 …
                    break ;
                case 3 :
                    … action appropriate to channel3 …
                    break ;
            }
        …
    }
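An attachment does not have to be an integer index; any object will do. The self-contained sketch below attaches a label to each of two channels and recovers it with key.attachment() when the channel becomes readable. To avoid needing a network it uses Pipe channels rather than sockets; that substitution, and the names AttachmentDemo and readyLabels, are choices made for this example only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class AttachmentDemo {
    /** Registers two pipes with labels attached and returns the labels seen as readable. */
    static List<String> readyLabels() throws IOException {
        List<String> seen = new ArrayList<>();
        Pipe first = Pipe.open();
        Pipe second = Pipe.open();
        try (Selector selector = Selector.open()) {
            first.source().configureBlocking(false);
            second.source().configureBlocking(false);
            // The third argument of register() is the attachment.
            first.source().register(selector, SelectionKey.OP_READ, "first");
            second.source().register(selector, SelectionKey.OP_READ, "second");

            // Make both channels readable.
            first.sink().write(ByteBuffer.wrap(new byte[] { 1 }));
            second.sink().write(ByteBuffer.wrap(new byte[] { 2 }));

            while (seen.size() < 2) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isReadable()) {
                        seen.add((String) key.attachment()); // context recovered from the key
                        key.cancel();                        // finished with this channel
                    }
                }
            }
        } finally {
            first.sink().close();
            first.source().close();
            second.sink().close();
            second.source().close();
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readyLabels());
    }
}
```

In a real program the attachment would more usefully be a handler or per-connection state object, so the loop can dispatch on it without a switch statement.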

Conclusion
• We briefly visited several topics in New I/O that are likely to be interesting for HPC with Java.
• Some topics that are less obviously relevant we skipped, like file locking and regular expressions.
• Also we didn’t cover datagram channels, which may well be relevant.
• New I/O has been widely hailed as an important step forward in getting serious performance out of the Java platform.
• See the paper “MPJava: High-Performance Message Passing in Java using java.nio”, William Pugh and Jaime Spacco, for a good example of how New I/O may affect the “Java for HPC” landscape.
