Java based technologies to support parallel and distributed applications
Download
1 / 75

Java-based Technologies to Support Parallel and Distributed Applications - PowerPoint PPT Presentation


  • 322 Views
  • Uploaded on

Java-based Technologies to Support Parallel and Distributed Applications Prof Mark Baker ACET, University of Reading Tel: +44 118 378 8615 E-mail: [email protected] Web: http://acet.rdg.ac.uk/~mab Outline The pros and cons of Java. Problems developing Java applications.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Java-based Technologies to Support Parallel and Distributed Applications' - issac


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Java based technologies to support parallel and distributed applications l.jpg

Java-based Technologies to Support Parallel and Distributed Applications

Prof Mark Baker

ACET, University of Reading Tel: +44 118 378 8615 E-mail: [email protected]

Web: http://acet.rdg.ac.uk/~mab

[email protected]


Outline l.jpg
Outline Applications

  • The pros and cons of Java.

  • Problems developing Java applications.

  • Java and the Grid:

    • WS, REST and Web 2.0.

  • Some Java application development.

    • GridRM,

    • Tycho,

    • MPJ Express.

  • Some lessons learnt!

  • Conclusions.

[email protected]


The pros and cons of java l.jpg
The pros and cons of Java Applications

  • Object Oriented:

    • Java is a modern object-oriented language:

      • There are many popular object-oriented languages available today:

        • Some like Smalltalk and Java are designed from the beginning to be object-oriented,

        • Others, like C++, are partially object-oriented, and partially procedural.

      • In Java, almost every variable is an object of some type or another - even strings.

      • In C++, you can still overwrite the contents of data structures and objects, causing the application to crash,

      • Java prohibits direct access to memory contents, leading to a more robust system.

[email protected]


The pros and cons of java4 l.jpg
The pros and cons of Java Applications

  • Portable:

    • Most programming languages are compiled and converted to machine code that executed only on one type of machine - typically produces fast native code.

    • Interpreted code often does not need to be compiled - it is translated as it is run, which can be quite slow, but often portable across different OSs and processor architectures.

    • Java uses both techniques - it is compiled into a platform-neutral machine code (bytecode) then an interpreter:

      • Either a Java Virtual Machine (JVM), a JIT (Just-In-Time) compiler, or an adaptive optimising engine such as HotSpot.

    • A couple of advantages:

      • The source code is protected from view and modification - only the bytecode needs to be made available to users.

      • Security mechanisms can scan bytecode for signs of modification or harmful code,

      • Means that Java code can be compiled once, and run on any machine and OS combination that supports a JVM - wonderful delegation of responsibility!

[email protected]


The pros and cons of java5 l.jpg
The pros and cons of Java Applications

  • Multi-threaded

    • Some languages, like C/C++ have support for threads - Pthreads (POSIX 1003.1c standard), which is a separate library that you need to include and use - an after thought!

    • Many language do not have thread support, so you need to potentially create multiple-processes…

    • Java has support for multiple threads of execution built right into the language:

      • Thread support was designed into the language from the out start, simple, and commonly used in applications and applets.

[email protected]


The pros and cons of java6 l.jpg
The pros and cons of Java Applications

  • Concurrency Control:

    • Java 1.5 has comprehensive support for general-purpose concurrent programming,

    • The support is partitioned into three packages:

      • java.util.concurrent - this provides various classes to support common concurrent programming paradigms, e.g., support for various queuing policies such as bounded buffers, sets and maps, thread pools etc,

      • java.util.concurrent.atomic - this provides support for lock-free thread-safe programming on simple variables such as atomic integers, atomic booleans, etc,

      • java.util.concurrent.locks - this provides a framework for various locking algorithms that augment the Java language mechanisms, e.g., read -write locks and condition variables.

[email protected]


The pros and cons of java7 l.jpg
The pros and cons of Java Applications

  • Garbage collection:

    • Languages like C/C++ force programmers to allocate/deallocate memory for data and objects manually:

      • Adds complexity, but also causes another problem - memory leaks.

    • When Java applications create objects, the JVM allocates memory space for their storage:

      • When the object is no longer needed (no reference to the object exists), the memory space can be reclaimed for later use.

      • In Java, the programmer is free from such worries, as the JVM will perform automatic garbage collection of objects.

[email protected]


The pros and cons of java8 l.jpg
The pros and cons of Java Applications

  • Memory Management:

    • There are no destructors in Java,

    • There is no "scope" of a variable per se, to indicate when the object’s lifetime is ended – the lifetime of an object is determined instead by the garbage collector,

    • finalize() method is called by the garbage collector and is supposed to be responsible only for releasing "resources”:

      • Such as open files, sockets, ports, URLs.

    • All objects in C++ will be (or rather, should be) destroyed, but not all objects in Java are garbage collected,

    • Java garbage collector can be changed, but no explicit control over object collection!

[email protected]


The pros and cons of java9 l.jpg
The pros and cons of Java Applications

  • Security:

    • Security is not built into most traditional programming languages,

    • Security is a big issue with Java:

      • At the API level, there are strong security restrictions on file and network access for applets, as well as support for digital signatures to verify the integrity of downloaded code.

      • At the bytecode level, checks are made for obvious hacks, such as stack manipulation or invalid bytecode.

      • The weakest link in the chain is the Java Virtual Machine on which it is run - a JVM with known security weaknesses can be prone to attack:

        • Worth noting that while there have been a few identified weaknesses in JVMs, they are rare, and usually fixed quickly.

[email protected]


The pros and cons of java10 l.jpg
The pros and cons of Java Applications

  • Network and "Internet" aware”:

    • In languages like C/C++, the networking code must be re-written for different operating systems, and is usually more complex,

    • Java was designed to be "Internet" aware, and to support network programming:

      • The Java API provides network support, from sockets and IP addresses, to URLs and HTTP,

      • It is easy to write network applications in Java, and the code is portable between platforms,

      • The networking support of Java saves a lot of time, and effort,

      • Java support for exotic network programming, such as remote-method invocation (RMI), SOAP, CORBA and Jini.

[email protected]


The pros and cons of java11 l.jpg
The pros and cons of Java Applications

  • Arrays in Java:

    • Although they look similar, arrays have a very different structure and behaviour in Java than they do in C/C++,

    • There is a read-only length member (size of array) and run-time checking throws an exception if you go out of bounds,

    • In Java a two-dimensional array is an array of one-dimensional arrays,

    • Although we might expect that elements of rows are stored contiguously, one cannot depend upon the rows themselves being stored contiguously:

      • In fact, there is no way to check whether rows have been stored contiguously after they have been allocated.

      • The possible non-contiguity of rows implies that the effectiveness of block-oriented algorithms may be dependent on the particular implementation of the JVM as well as the current state of the memory manager.

[email protected]


The pros and cons of java12 l.jpg
The pros and cons of Java Applications

  • The Floating-point Issue:

    • The fact that Java programs produce bitwise identical floating-point results in every JVM seems to be not only unrealisable in practice, but also inhibits efficient floating-point processing on some platforms.

    • For example, it precludes the efficient use of floating-point hardware on Intel processors which utilise extended precision in registers.

[email protected]


The pros and cons of java13 l.jpg
The pros and cons of Java Applications

  • Exceptions:

    • Java exception handling is superior to those in C/C++:

      • No exception handling classes in C.

    • The C++ approach of calling a function at run-time when the wrong exception is thrown, where as in Java exception specifications are checked and enforced at compile-time.

    • Also, overridden methods must conform to the exception specification of the base-class version of that method: they can throw the specified exceptions or exceptions derived from those.

[email protected]


The pros and cons of java14 l.jpg
The pros and cons of Java Applications

  • Accessing OS Resources:

    • Java can be too restrictive in some cases, you are prevented from doing important tasks such as directly accessing hardware,

    • Java solves this with native methods (JNI) that allow you to call a function written in another language (currently only C and C++ are supported),

    • Thus, you can always solve a platform-specific problem (in a relatively non-portable fashion, but then that code is isolated),

    • Applets cannot call native methods, only applications.

[email protected]


The pros and cons of java15 l.jpg
The pros and cons of Java Applications

  • Documentation

    • Java has built-in support for comment-based documentation:

      • The source code file can also contain its own documentation, which is stripped out and reformatted into HTML via a separate program (javadoc):

      • Create API docs…

        • This is a boon for documentation maintenance and use.

[email protected]


Some advantages of java l.jpg
Some Advantages of Java Applications

  • Java development tools:

    • e.g. Eclipse platform, and a large number of open source, free software that has been made available by the community of programmers.

    • If nothing else, this software can be a starting place to develop new ideas and more sophisticated applications.

  • Programmer productivity:

    • At least two times greater with Java.

    • A lot can be done in a short amount of time with Java because it has such an extensive library of functions already built into the language, integrated development environments, and a wide selection of supporting tools.

[email protected]


Java summary l.jpg
Java Summary Applications

  • Generally, Java is more robust, via:

    • Object handles initialised to null (a keyword),

    • Handles are always checked and exceptions are thrown for failures,

    • All array accesses are checked for bounds violations,

    • Automatic garbage collection prevents memory leaks,

    • Clean, relatively fool-proof exception handling,

    • Simple language support for multi-threading,

    • Java Security, Tooling…

[email protected]


Problems developing java applications l.jpg
Problems Developing Java Applications Applications

  • Memory - both amount used by an applications and speed that memory is recovered:

    • Some evidence that Java applications are becoming increasingly bloated - never been particularly small though!

    • Default JVM has 64 Mbytes, increase the amount of memory allocated using -Xmx command-line switch..

    • Obviously, applications could have multiple threads, or you could be using multiple JVMs.

    • Issues with how quickly GC recovers memory - can cause java.lang.OutOfMemoryError: Java heap space:

      • Commonly seen problem in message systems (i.e. NaradaBroker) and when storing in-memory data-structures (MDS4).

[email protected]


Problems developing java applications19 l.jpg
Problems Developing Java Applications Applications

  • Accessing system resources and legacy applications:

    • Java Native Interface created for the purpose of interacting with resources that cannot accessed normally via the JVM:

      • Interacting with OS resources - such as /Proc!, sending signals…

      • User applications, or libraries (MPI).

    • Use of JNI breaks Java programming model:

      • Non object oriented,

      • No GC - memory leaks possible,

      • Relatively complicated interface - easy to make mistakes,

      • OK for C and C++ - need to have a good think about wrapping Fortran as you call it from C/C++!,

      • Lose portability.

[email protected]


Problems developing java applications20 l.jpg
Problems Developing Java Applications Applications

  • Relative performance of a Java-based applications depends on a number of factors:

    • Coding efficiency,

    • Version of the JVM,

    • Operating System,

    • Memory available,

    • Others…

The authors conclude, "On Intel Pentium hardware, especially with Linux, the performance gap is small enough to be of little or no concern to programmers."

[email protected]


Problems developing java applications21 l.jpg
Problems Developing Java Applications Applications

  • Java is now nearly equal to (or faster than) C++ on low-level and numeric benchmarks:

    • This should not be surprising as Java is a compiled language (albeit JIT compiled).

  • Nevertheless, the idea that "java is slow" is widely believed.

  • Several possible reasons:

    • Early Java (circa 1995 was slow):

      • Did not have Java JIT compiler, and hence bytecode was interpreted (like Python).

      • Situation changed with the emergence of JIT compilers from Microsoft, Symantec, and in Sun's java 1.2.

    • Java can be slow still:

      • For example, programs written with the thread-safe Vector class are necessarily slower, than those written with the equivalent thread-unsafe ArrayList class.

[email protected]


Problems developing java applications22 l.jpg
Problems Developing Java Applications Applications

  • Java program start-up is slow - as a it starts, it unzips the Java libraries and compiles parts of itself, so an interactive program can be sluggish for the first couple seconds of use.

  • There is a similar "garbage collection is slow" myth that persists despite evidence to the contrary.

  • Together these suggest that it is possible that no amount of data will alter peoples' beliefs, and that in actuality these "speed beliefs" probably have little to do with java, garbage collection, or the otherwise stated subject:

    • The answer probably lies somewhere in sociology or psychology.

    • Programmers, despite their professed appreciation of logical thought, are not immune to a kind of mythology, though these particular "myths" are arbitrary and relatively harmless.

  • [email protected]


    Java and the grid l.jpg
    Java and the Grid Applications

    • Almost all the core Grid software is now Java-based!

      • Especially now that it is based on Web Services’ technologies…

    • Could ask why this is the case…?

      • Can probably be explained by the fact that Java is a good language for rapid prototyping, it is free, runs anywhere, extensive tooling, libraries,…

    • Core of Globus/Unicore/gLite/OMII/Crown… etc all Java-based.

      • Well defined APIs, so clients can be written in more or less any language.

      • Interacting with legacy applications is more problematic.

    [email protected]


    The grid s architecture l.jpg
    The Grid’s Architecture Applications

    • Moved to Service Oriented Architecture (SOA) Web Services-based Grid infrastructure back in early 2003 - Open Grid Service Architecture (OGSA):

      • Huge effort to standardise everything Grid-related!

      • Out popped OGSI, quickly dropped,

      • Then WSRF (Jan 2004+),

      • More recently WS-Resource Transfer (Oct 2006)!

    • Also there is a great debate about services and state!

    • Funnily enough, Globus 2.4, is still very popular and a key part of many Grid projects.

    • Far too many changes, standards and specifications, and all very confusing and complicated…

    • Mutterings in the Grid community for several years now, due to fact that no one really knows what standards and specs to use:

      • And if the one chosen will actually still being used in the future.

    [email protected]


    Web services standards and specifications l.jpg

    Web Services Standards and Specifications Applications

    Some Standards - list from Geoffrey Fox circa 2004

    [email protected]


    A list of web services i l.jpg
    A List of Web Services I Applications

    • a) Core Service Architecture

      • XSD: XML Schema (W3C Recommendation) V1.0 February 1998, V1.1 February 2004

      • WSDL 1.1: Web Services Description Language Version 1.1, (W3C note) March 2001

      • WSDL 2.0: Web Services Description Language Version 2.0, (W3C under development) March 2004

      • SOAP 1.1: (W3C Note) V1.1 Note May 2000

      • SOAP 1.2: (W3C Recommendation) June 24 2003

    • b) Service Discovery

      • UDDI:(Broadly Supported OASIS Standard) V3 August 2003

      • WS-Discovery: Web services Dynamic Discovery (Microsoft, BEA, Intel …) February 2004

      • WS-IL:Web Services Inspection Language, (IBM, Microsoft) November 2001

    [email protected]


    A list of web services ii l.jpg
    A List of Web Services II Applications

    • c) Security

      • SAML:Security Assertion Markup Language (OASIS) V1.1 May 2004

      • XACML: eXtensible Access Control Markup Language (OASIS) V1.0 February 2003

      • WS-Security: Web Services Security: SOAP Message Security (OASIS) Standard March 2004

      • WS-SecurityPolicy: Web Services Security Policy (IBM, Microsoft, RSA, Verisign) Draft December 2002

      • WS-Trust:Web Services Trust Language (BEA, IBM, Microsoft, RSA, Verisign …) May 2004

      • WS-SecureConversation: Web Services Secure Conversation Language (BEA, IBM, Microsoft, RSA, Verisign …) May 2004

      • WS-Federation:Web Services Federation Language (BEA, IBM, Microsoft, RSA, Verisign) July 2003

    [email protected]


    A list of web services iii l.jpg
    A List of Web Services III Applications

    • d) Messaging

      • WS-Addressing: Web Services Addressing (BEA, IBM, Microsoft) March 2004

      • WS-MessageDelivery: Web Services Message Delivery (W3C Submission by Oracle, Sun ..) April 2004

      • WS-Routing: Web Services Routing Protocol (Microsoft) October 2001

      • WS-RM: Web Services Reliable Messaging (BEA, IBM, Microsoft, Tibco) v0.992 March 2004

      • WS-Reliability: Web Services Reliable Messaging (OASIS Web Services Reliable Messaging TC) March 2004

      • SOAP MOTM: SOAP Message Transmission Optimization Mechanism (W3C) June 2004

    • e) Notification

      • WS-Eventing: Web Services Eventing (BEA, Microsoft, TIBCO) January 2004

      • WS-Notification: Framework for Web Services Notification with WS-Topics, WS-BaseNotification, andWS-BrokeredNotification (OASIS) OASIS Web Services Notification TC Set up March 2004

      • JMS:Java Message Service V1.1 March 2002

    [email protected]


    A list of web services iv l.jpg
    A List of Web Services IV Applications

    • f) Coordination and Workflow, Transactions and Contextualization

      • WS-CAF:Web Services Composite Application Framework including WS-CTX, WS-CFandWS-TXM below (OASIS Web Services Composite Application Framework TC) July 2003

      • WS-CTX:Web Services Context (OASIS Web Services Composite Application Framework TC) V1.0 July 2003

      • WS-CF: Web Services Coordination Framework (OASIS Web Services Composite Application Framework TC) V1.0 July 2003

      • WS-TXM: Web Services Transaction Management (OASIS Web Services Composite Application Framework TC) V1.0 July 2003

      • WS-Coordination: Web Services Coordination (BEA, IBM, Microsoft) September 2003

      • WS-AtomicTransaction: Web Services Atomic Transaction (BEA, IBM, Microsoft) September 2003

      • WS-BusinessActivity: Web Services Business Activity Framework (BEA, IBM, Microsoft) January 2004

      • BTP: Business Transaction Protocol (OASIS) May 2002 with V1.0.9.1 May 2004

      • BPEL: Business Process Execution Language for Web Services (OASIS) V1.1 May 2003

      • WS-Choreography: (W3C) V1.0 Working Draft April 2004

      • WSCI: (W3C) Web Service Choreography Interface V1.0 (W3C Note from BEA, Intalio, SAP, Sun, Yahoo)

      • WSCL:Web Services Conversation Language (W3C Note) HP March 2002

    [email protected]


    A list of web services v l.jpg
    A List of Web Services V Applications

    • h) Metadata and State

      • RDF:Resource Description Framework (W3C) Set of recommendations expanded from original February 1999 standard

      • DAML+OIL: combining DAML (Darpa Agent Markup Language) and OIL (Ontology Inference Layer) (W3C) Note December 2001

      • OWLWeb Ontology Language (W3C) Recommendation February 2004

      • WS-DistributedManagement: Web Services Distributed Management Framework with MUWS and MOWS below (OASIS)

      • WSDM-MUWS: Web Services Distributed Management: Management Using Web Services (OASIS) V0.5 Committee Draft April 2004

      • WSDM-MOWS: Web Services Distributed Management: Management of Web Services (OASIS) V0.5 Committee Draft April 2004

      • WS-MetadataExchange: Web Services Metadata Exchange (BEA,IBM, Microsoft, SAP) March 2004

      • WS-RF:Web Services Resource Framework including WS-ResourceProperties, WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, and WS-BaseFaults(OASIS) Oasis TC set up April 2004 and V1.1 Framework March 2004

      • ASAP: Asynchronous Service Access Protocol (OASIS) with V1.0 working draft G June 2004

      • WS-GAF:Web Service Grid Application Framework (Arjuna, Newcastle University) August 2003

    [email protected]


    A list of web services vi l.jpg
    A List of Web Services VI Applications

    • g) General Service Characteristics

      • WS-Policy: Web Services Policy Framework (BEA, IBM, Microsoft, SAP) May 2003

      • WS-PolicyAssertions:Web Services Policy Assertions Language (BEA, IBM, Microsoft, SAP) May 2003

      • WS-Agreement: Web Services Agreement Specification (GGF under development) May2004

    • i) User Interfaces

      • WSRP: Web Services for Remote Portlets (OASIS) OASIS Standard August 2003

      • JSR168: JSR-000168 Portlet Specification for Java binding (Java Community Process) October 2003

    [email protected]


    Thoughts l.jpg
    Thoughts Applications

    • Should we be using standards IF they:

      • Are new and just emerging,

      • Are changing frequently, for example UDDI!

      • Enhance interoperability, but potentially cripple performance,

      • Are not widely adopted,

      • Are not easy to understand and complicated to implement.

    • What are the alternatives?

      • REST,

      • Web 2.0…

    [email protected]


    Representation state transfer rest l.jpg
    Representation State Transfer (REST) Applications

    • REST is a style of software architecture for distributed systems such as the Web.

    • The term originated in a 2000 doctoral dissertation written by Roy Fielding, one of the principal authors of the HTTP protocol specification.

    • The term is also often used to describe any simple interface that transmits domain-specific data over HTTP without an additional messaging (e.g. SOAP) or session tracking (e.g. cookies).

    • The idea behind REST is to take what we all take for granted in the Web (i.e. HTTP and URIs) and use it with other payloads (other than HTML, images, etc.) and other HTTP Methods more than the usual GET and POST.

    • REST defines identifiable resources (URIs), and methods for accessing and manipulating the state of those resources (HTTP Methods).

    [email protected]


    Slide34 l.jpg
    REST Applications

    • It uses well documented, established, and used technologies and methodologies.

    • It is already here today; been around for while!

    • Resource centric rather than method centric.

    • Given a URI, anyone already knows how to access it.

    • NOT another protocol on top of another protocol on top of another protocol on top of...

    • Response payload can be of any format.

    • Uses the inherent HTTP security model, certain methods to certain URIs can easily be restricted by firewall configuration, unlike other XML over HTTP messaging formats.

    • REST makes sense, use what we already have; it is the next logical extension of the Web… isn’t it!!

    [email protected]


    What s web2 0 l.jpg
    What’s Web2.0? Applications

    • “Web 2.0, a phrase coined by O'Reilly Media in 2004, refers to a supposed second-generation of Internet-based services such as social networking sites, Wikis, communication tools, and folksonomies that let people collaborate and share information online in previously unavailable ways.” - Wikipedia

    mark[email protected]


    Hard to define know it when you see it l.jpg

    Emerging Applications

    Tech

    Apps You

    Know…

    Some Apps You Don’t know

    Major Retailers

    Hard to Define, Know it When you See it…

    • Web API’s

    • “Folksonomies” / Content tagging

    • “AJAX”

    • RSS

    • Flickr

    • Google Maps

    • Blogging & Content Syndication

    • Linkedin, Friendster

    • Del.icio.us

    • Upcoming.org

    • 43Things.com

    • Amazon API’s

    • Google Adsense API

    • Yahoo API

    • Ebay API

    [email protected]



    Further thoughts l.jpg
    Further Thoughts Applications

    • What is special about the REST model? and Web 2.0?

      • Nothing special from what I can see, but both are gaining popularity!

    • Why?

      • REST architecture is simple and its technologies are mature and stable:

        • HTTP(S) and ASCII!

      • Web 2.0 is some mashup of web-based patterns and technologies:

        • Accessible to users and relatively simple to create Web-based applications.

    • In the end users will vote with their feet and use whatever technologies help them develop applications quickly and effectively.

    [email protected]




    Some java application development l.jpg

    Some Java Application Development Applications

    Incidental REST Development!

    [email protected]


    Java based application development l.jpg
    Java-based Application Development Applications

    • Over the last decade my group have been developing Java-based middleware for parallel and distributed applications.

    • We have gained a lot of experience and knowledge about the development of Java-based applications.

    • The next part of the talk will outline some the projects that my group have been developing and some interesting observations about the pitfalls that can be made developing Java applications.

    [email protected]


    Some history l.jpg

    Started off with GridRM (gridrm.org) a wide-area Java-based resource monitoring system:

    A scalable infrastructure for gathering native information from computer-based resources,

    Agents, gateways, linked in a hierarchical configuration,

    Started designing it pre-2000,

    No standards in monitoring arena (apart from SNMP), so issue was what to use for our infrastructure?

    Choose HTTP and XML as this seemed to provide us with a core, that we could adapt at the end points.

    Some History!

    [email protected]


    Slide44 l.jpg

    Local Layer resource monitoring system:

    Hierarchy of gateways:

    Each manages a site:

    Interacts with agents,

    Deals with policies and security.

    Embedded database:

    Stores resource and historical data.

    Agents:

    Use what is there!

    Global Layer

    A system to bind together gateways:

    Producer/consumer paradigm,

    Meta-registry,

    Messaging Layer,

    Default security.

    GridRM Architecture

    [email protected]


    Further history l.jpg
    Further History resource monitoring system:

    • Earlier versions of GridRM was relatively hardwired, we wanted to create a more flexible and maintainable solution.

    • Needed software to bind together GridRM gateways and provide a messaging infrastructure…

      • Looked at obvious solutions: MDS, G-RMA, NaradaBrokering…

    • In 2003 we came up with Tycho, a GMA-based infrastructure that would provide a meta-registry and a message system that could bind together our GridRM gateways:

      • GMA was a GGF informational recommendation,

      • Unfortunately in 2003 still no Grid standards - transition from GT2 to GT3 at the time!

      • Consequently, we continued to use HTTP and XML.

    [email protected]


    Tycho architecture l.jpg

    Tycho consists of the following components: resource monitoring system:

    Mediators that allow producers and consumers to discover each other and establish remote communications,

    Consumers that typically subscribe to receive information or events from producers,

    Producers that gather and publish information for consumers.

    There is an asynchronous messaging API.

    In Tycho, producers and/or consumers (clients) can publish their existence in a Virtual Registry (VR).

    Tycho Architecture

    [email protected]


    Tycho design l.jpg
    Tycho Design resource monitoring system:

    • Tycho is a based on a publish, subscribe and bind paradigm.

    • Design Philosophy:

      • We believed that the system should have an architecture similar to the Internet, where every node provides reliable core services, and the complexity is kept, as far as possible, to the edges:

        • The core services can be kept to the minimum, and endpoints can provide higher-level and more sophisticated services, that may fail, but will not cause the overall system to crash.

      • We have kept Tycho’s core small, simple and efficient, so that it has a minimal memory foot-print, is easy to install, and is capable of providing robust and reliable services.

      • More sophisticated services can then be built on this core and are provided via libraries and tools to applications.

    • Allows Tycho to be flexible and extensible so that it will be possible to incorporate additional features and functionality.

    [email protected]


    Tycho messaging tests l.jpg

    Tycho Messaging Tests resource monitoring system:

    Performance Tests of Tycho against NaradaBrokering

    [email protected]


    Performance tests messaging l.jpg

    NaradaBrokering: resource monitoring system:

    The NaradaBrokering framework is a distributed messaging infrastructure, developed by the Community Grids Lab at Indiana University.

    NaradaBrokering is an asynchronous messaging infrastructure with a publish and subscribe based architecture.

    NaradaBrokering is Sun JMS compliant. This messaging standard allows application components to exchange unified messages in an asynchronous system.

    Performance Tests (Messaging)

    [email protected]


    Summary tycho vs nb l.jpg
    Summary - Tycho vs NB resource monitoring system:

    • Latency and throughput:

      • When looking at end-to-end performance, on a LAN for messages less than 2 Kbytes, Tycho and NaradaBrokering have comparable performance,

      • Tycho achieves 95% bandwidth, whereas NB only 65.3%,

      • Tyco’s performance is inhibited by the fact that it creates a new socket for each message sent:

        • NB reuses sockets instances once they have been created.

    • Scalability Summary:

      • The scalability tests have shown Tycho and NaradaBrokering producers and consumers to be stable under heavy load:

        • Performance is weaker when there is a large ratio of consumers to producers.

      • Heap Size:

        • The heap size for NB becomes a limiting factor in when a broker is receiving messages faster than it can send them, as the internal message buffer fills until the heap is consumed and system fails,

        • Tycho tests were performed without modifying the heap size, throttling used as there is limited buffering.

    [email protected]


    Tycho registry tests l.jpg

    Tycho Registry Tests resource monitoring system:

    Performance Tests against Globus’s MDS4 and gLite’s R-GMA

    [email protected]


    Registries l.jpg

    MDS4: resource monitoring system:

    The Globus Toolkit’s Monitoring and Discovery Service (MDS version 4) is a Web Services Resource Framework (WSRF) based implementation of a wide-area information and registry service.

    MDS4 provides a framework that can be used to collect, index and expose data about the state of grid resources and services.

    Typical uses for MDS4 include making resource data available for decision making in job submission services or notifying an administrator when storage space is running low on a cluster.

    Registries

    [email protected]


    Registries53 l.jpg

    R-GMA: resource monitoring system:

    A Java-based implementation of the Grid Monitoring Architecture (GMA) for publishing network monitoring information over the wide-area and as an information service.

    R-GMA uses a relational model to search, using an SQL-like API, and describe the monitoring information it collects.

    based on a consumer/producer paradigm with client data being stored in a directory service, which presents it as a virtual database.

    R-GMA uses the term tuples for sets of data being published or consumed.

    Registries

    [email protected]


    Tycho compared to r gma and mds4 l.jpg
    Tycho compared to R-GMA and MDS4 resource monitoring system:

    • For the tests we created a set of randomly generated strings to act as attributes for records to be inserted into the registries.

    • A single record, with no mark up, had an average size of 114 bytes.

    • Two tests were used to assess the performance of the registries:

      • [S1] simulates a client searching the registry for records matching some known attributes:

        • Systematic queries are generated using a function to select a record name at random from the test data to guarantee the query will only match one record.

      • [S2] measures the worst-case scenario of the client requesting all of the records from within the registry.

    [email protected]


    Tycho compared to r gma and mds455 l.jpg
    Tycho compared to R-GMA and MDS4 resource monitoring system:

    Query Response Time Versus the Number of Records for a Query When Selecting a Single Random Record From the Registry (S1)

    Query Response Time Versus the Number of Records for a Query that Selects All the Records From the Registry (S2)

    [email protected]


    Tycho compared to r gma and mds456 l.jpg
    Tycho compared to R-GMA and MDS4 resource monitoring system:

    • When testing the effect of the number of records on response time, we see that when selecting a single record from 100,000:

      • Tycho 32 seconds faster than R-GMA,

      • MDS4 runs out of heap space for larger records sizes.

    • We also ran tests where there were multiple clients accessing the registries:

      • Again Tycho's VR had a lower response latency than R-GMA and MDS4:

      • With 100 clients Tycho was 94 seconds faster than R-GMA and 65 seconds faster than MDS4.

    • The results highlight that one of the strengths of our implementation is its performance under load:

      • Tycho's performance is linear with regard to both increasing numbers of clients and response sizes.

    [email protected]


    Some conclusions l.jpg
    Some Conclusions resource monitoring system:

    • We designed Tycho to have a relatively small, simple and efficient core, so that it has a minimal memory footprint, is easy to install, and is capable of providing robust and reliable services.

    • More sophisticated services can then be built on this core and be provided via libraries and tools to applications:

      • This provides us with a flexible and extensible framework where it is possible to incorporate additional feature and functionality, which are created as producers or consumers, which do not affect the core.

    • Performance:

      • Tycho performance is comparable to that of NaradaBrokering, a more mature system:

        • Certain features of NaradaBrokering are superior to those of Tycho, but its memory utilisation and indirect communications are limiting features.

      • Compared to MDS4 and R-GMA, Tycho shows superior performance and scalability to both these systems:

        • In addition, we would argue that both MDS4 and R-GMA have problems with memory utilisation and without significant extra effort limited scalability.

    [email protected]


    Some conclusions58 l.jpg
    Some Conclusions resource monitoring system:

    • Tycho’s functionality has all been incorporated within a single Java JAR and requires only Java 1.5 JDK for building and running applications:

      • Use embedded Jetty applications server,

      • Hypersonic DB.

    • Tycho is now used in:

      • Swarm system - Bit Torrent like system.

      • GridRM,

      • Slogger (Semantic Log Analyser)

      • VOTechBroker.

      • FP6 SORMA project - GridRM and messaging layer.

      • Others:

        • Events/Transactions, Computational Steering.

    [email protected]


    Java messaging l.jpg
    Java Messaging resource monitoring system:

    • The Paradox:

      • The fastest implementation is via JNI – using the likes of mpiJava:

        • This introduces various issues:

          • Compatibility with underlying MPI,

          • Breaks programming model,

          • Compromises portability!

      • Using 100% pure Java ensures portability:

        • But it might not provide the most efficient solution, especially in the presence of commodity high-performance hardware.

      • It is important to address these contradictory requirements of portability and high performance, in the design of Java messaging systems.

    [email protected]


    Mpj express l.jpg
    MPJ Express resource monitoring system:

    • To address this, MPJE (MPJ Express) has been developed – project started early 2004.

    • Originally wanted a system that could be used for programming clusters of SMP nodes - Pt2Pt and shared memory.

    • MPJE is an implementation of a MPI-like API for Java:

      • The higher-level concepts, such as communicators (inter and intra), virtual topologies, and derived datatypes have been implemented in pure Java.

      • MPJE uses a unique buffering API (mpjbuf) that allows explicit memory management at the application level.

      • Both Java and JNI communication devices can use this buffering layer, by making use of direct byte buffers for the actual storage.

      • Currently, two communication devices have been implemented:

        • niodev based on Java NIO (New I/O),

        • mxdev based on Myrinet eXpress (MX).

    • MPJE includes a runtime system that permits boot-strapping of processes across remote nodes.

    [email protected]


    Mpj express architecture l.jpg

    The high and base levels rely on the resource monitoring system:mpjdev and the xdev level for actual communications and interaction with the underlying networking hardware.

    Two implementations of the mpjdev are envisaged:

    JNI wrappers to native MPI implementations,

    Use of a lower level device, called xdev, to provide access to Java sockets or specialised communication libraries.

    MPJ Express - Architecture

    [email protected]


    Mpje performance l.jpg
    MPJE - Performance resource monitoring system:

    Latency

    [email protected]


    Mpje performance63 l.jpg
    MPJE - Performance resource monitoring system:

    Bandwidth

    [email protected]


    Mpje performance64 l.jpg
    MPJE - Performance resource monitoring system:

    • Bandwidth:

      • The one-byte latency of MPJE is 164 microseconds, approximately 20% slower than MPICH:

        • Difference due to:

          • Complexity introduced by the thread-safety algorithm, which requires synchronized.

          • MPJE uses an intermediate buffering layer:

            • Implies additional copying,

            • A faster alternative on proprietary networks like Myrinet.

            • Direct byte buffers reside outside the JVM’s heap, possible to get pointers to these buffers from C - avoid copying between JVM/JNI.

    • Throughput:

      • mpiJava achieves 84%.

        • Difference due to additional data copying in JNI.

      • MPICH and MPJE achieve 89% and 87% respectively.

      • mpjdev achieves 90% bandwidth which shows that the main overhead for MPJE is packing/unpacking data onto buffers

      • The drop at 128 Kbytes is due to the change of communication protocol from eager send to rendezvous.

    [email protected]


    Mpj express meets gadget l.jpg
    MPJ Express Meets Gadget resource monitoring system:

    • To help establish the practicality of real scientific computing using MPJE, the parallel cosmological simulation code, Gadget-2, was ported from C to Java.

    • Gadget-2 is a massively parallel structure formation code developed by Volker Springel at the Max Planck Institute of Astrophysics.

    • Versions of Gadget-2 has been used in various research papers in astrophysics literature, including the noteworthy “Millennium Simulation” – the largest ever model of the Universe.

    [email protected]


    Gadget l.jpg

    A 3-dimensional visualization of the Millennium Simulation. resource monitoring system:

    The movie shows a journey through the simulated universe.

    On the way, we visit a rich cluster of galaxies and fly around it.

    During the two minutes of the movie, we travel a distance for which light would need more than 2.4 billion years.

    Gadget

    [email protected]


    J gadget l.jpg
    J- Gadget resource monitoring system:

    • Gadget-2 was manually translated to Java.

    • The data structures were deliberately kept similar so that a cross-reference to the original source code could be made for debugging purposes.

    • Gadget-2 uses the GNU Scientific Library (GSL), a parallel version of Fastest Fourier Transforms in the West (FFTW), and a MPI library.

    • We used the Barnes-Hut tree algorithm for calculating gravitational forces and for communication, we used MPJE.

    • The main simulation loop involves calculating gravitational forces for each particle in the simulation and updating their accelerations;

      • This is the most compute intensive task in the simulation.

    [email protected]


    Performance l.jpg

    Java Gadget can achieve around 70% of performance of the C version.

    This is acceptable performance given that Java has many extra safety features including array bounds checking.

    Comparison is between a production quality C code against a Java code that could be optimized further.

    The performance of Java Gadget-2 reinforces our belief that Java is a viable option for HPC.

    Performance

    [email protected]


    Summary l.jpg
    Summary version.

    • MPJ Express is a MPI-like Java messaging system.

    • MPJE is a pure Java system, which has an architecture that can also use fast proprietary networks too.

    • The latency and bandwidth of MPJE is not far off that for native C.

    • The performance of Java Gadget-2 reinforces our belief that Java is a viable option for HPC. With careful programming, it is possible to achieve performance in the same general ballpark as C code.

    • MPJE is also being used in other projects, including:

      • MoSeS (Modelling and Simulation for e-Social Science) at Leeds.

      • CartaBlanca (Sandia) that is simulating non-linear physics on unstructured grids.

    [email protected]



    Some lessons learnt l.jpg
    Some Lessons Learnt! version.

    • In general:

      • Packages should be simple to install, configure and use:

        • Limit external dependences - attempt to embed “extras”,

        • Possible to install in “user space” - avoid need to be root!

      • Use a sensible extensible architecture:

        • Believe in the internet-type architecture:

          • Stable core, with complexity at the edges.

      • HTTP(S) and XML works!

    • Java Development:

      • Necessary to management ones own memory:

        • Cannot guarantee behaviour of GC!

      • Embedded application server, such as Jetty has far superior performance to Tomcat.

      • In-memory storage of data structure is not really always sensible - Java databases, such as Hypersonic overall are as fast to use for storing (persistently) this data.

      • Thread safety, especially with Java 1.5+.

      • Efficient thread use and throttling algorithms for messaging.

    [email protected]


    Overall conclusions l.jpg
    Overall Conclusions version.

    • Discussed Pros and Cons of Java - not all cream!

    • Made a few comments about WS and REST!

      • What’s the best development route?

      • We have taken the REST-like route, which has enabled us to concentrate on functionality, rather than refactoring!

    • Looked at some Java-based middleware:

      • Tycho is a combined wide area-messaging framework with a built-in distributed registry (VR):

      • MPJ Express, a thread-safe MPI-like bindings for Java:

    [email protected]


    Ending l.jpg
    Ending version.

    • Even though sceptics may still feel that Java is less than ideal, the community of middleware and application developers is increasingly using Java and its related technologies to produce quality software.

    • Java has an extensive number of features, libraries and tools that provide a rich development environment.

    • This uptake is no doubt assisted by the fact that Java is often the first language learnt by students in educational and other institutions.

    • Downloads:

      • Tycho - http://acet.rdg.ac.uk/projects/tycho/

      • MPJE - http://acet.rdg.ac.uk/projects/mpj/

      • GridRM - http://gridrm.org/

    [email protected]



    Rest toolkits l.jpg
    REST toolkits version.

    • Restlet - http://www.restlet.org,

    • REST-art - http://rest-art.sourceforge.net,

    • Cognitive Web - http://cweb.sourceforge.net)

    • Also Axis2 and XFire provide support for REST Web Services.

    [email protected]


    ad