Use of Java in Computational Science. presented by Tomasz Haupt Northeast Parallel Architectures Center at Syracuse University. Part I. Web Portals. Web Portals. A portal is a web entrance to a set of resources and consists of a mix of information, computer simulations and various services


Use of Java in Computational Science

presented by

Tomasz Haupt, Northeast Parallel Architectures Center at Syracuse University


Part I

Web Portals

Web Portals
  • A portal is a web entrance to a set of resources and consists of a mix of information, computer simulations and various services
  • For businesses, portals generalize the concept of a company Intranet and encompass the domain of IBM mainframes, Lotus Notes, etc.
  • For computing, portals are called Problem Solving Environments
Examples of Portals
  • Portal to NPAC is http://www.npac.syr.edu
  • Portal to the world is http://www.yahoo.com/ or http://my.netscape.com/
  • Portal to latest news is http://www.cnn.com
  • Portal to computational chemistry is http://www.osc.edu/~kenf/theGateway/
  • Portal to stock trading is http://quote.yahoo.com/
Example Portal: Netscape

Access to:

  • Search Engines
  • News
  • Weather
  • Stocks
  • Sport
  • Services
  • Bookmarks
  • Mail
  • Calendar
  • ...
Special Portals -- Computing
  • More interestingly, computing portals involve building a web-based problem solving environment that links together all the capabilities needed to compute:
  • run programs and access dynamically status of jobs and computers -- in particular allow a uniform interface to running a given job on one of many backend compute servers
  • compile and debug programs
  • link diverse data sources with computations run on multiple backend machines
  • visualize results
  • web-based help systems and collections of related scientific papers
  • computational steering i.e. interacting with a job (change parameters) based on dynamic results such as visualized results
  • See http://www.osc.edu/~kenf/theGateway/ and http://www-fp.mcs.anl.gov/~gregor/datorr/
Portals for scientific and engineering communities
  • Seamless access to HPC resources
  • Seamless access to instruments
  • Data storage
  • Application-specific database
  • Visualization Tools
  • Collaboratory
  • Scientific notepads
Front-End (Desktop/Laptop): high-level, user friendly

  • visual programming and authoring tools
  • application GUI

Seamless Access

Remote Resources: all hardware and software components needed to complete the user task, including, but not limited to, compute engines from workstations to supercomputers, storage, databases, instruments, codes, libraries, and licenses.

Seamless Access
  • Create an illusion that all resources needed to complete the user tasks are available locally.
  • In particular, an authorized user can allocate the resources she needs without explicit login to the host controlling the resources.
  • An analogy: an NFS-mounted disk or a network printer.
Example: Globus

Advantages:

- platform independent mini-language (RSL) for specification of resources
- can be layered on top of different schedulers
- enables interoperability between resources (can allocate many resources at a time, file transfer, monitoring, etc.)

Disadvantage:

- a bag of low level tools

[Diagram labels: GRAM Client; Gatekeeper (three); MDS Directory Service; Contact address; Resource Specification Language; GSS-API]

Towards a complete solution ...

PSE: problem description (physics, chemistry, ...)

Task description: I need 64 nodes of SP-2 at Argonne to run my MPI-based executable “a.out” you can find in “/tmp/users/haupt” on marylin.npac.syr.edu. In addition, I need any idle workstation with jdk1.1 installed. Make sure that the output of my a.out is transferred to that workstation.

Middle-Tier: map the user’s task description onto the resource specification; this may include resource discovery, and other services

Resource Specification

Resource Allocation: run, transfer data, run

We need a third tier!

[Diagram labels: Front-End; Abstract Task Specification; Middle-Tier; Resource Specification; Remote Resources]

Target Architecture

[Diagram labels: Problem Solving Environment (CTA-specific knowledge databases; Visual Authoring Tools; User and Group Profiles; Resource Identification and Access; Visualizations; Collaboration); Abstract Task Specification; Middle-Tier (WebFlow); Resource Specification; Back-End Resources]

Example of a portal

Navigate and choose an existing application to solve the problem at hand. Import all necessary data.

[Screen labels: Select host; Select model; Set parameters; Run; Retrieve data; Pre/post-processing; Run simulations]

PSE Example: CCM IPSE

1. Define your problem

2. Identify resources (software and hardware)

3. Create input file

4. Run your application

5. Analyze results

Ken Flurchick, http://www.osc.edu/~kenf/Gateway

QS Front End

Data-Flow Front-End: compose your application interactively from pre-existing modules.

Building a Portal
  • We can identify a set of tools that enable the construction of portals
  • These are roughly equivalent to the tools needed to build a general application based on “object web technologies”
  • There is also an architecture implying multi-tier systems with standard compliant interfaces
  • A common portal architecture means that portals can be conveniently linked together
    • e.g. the 3 portals to biology, chemistry, and physics naturally form a portal to science
Portal Building Blocks I
  • So we currently have languages (Java ...), distributed object support and standards (CORBA ...), interface standards (XML), transport protocols (HTTP, TCP/IP) at various levels, and a rendering standard (HTML).
  • We have web clients and servers
  • We need certain capabilities in common to many portals. These include
    • security
    • collaboration
    • visualization (for computing portals)
    • persistence, registration, look-up (part of most distributed object infrastructure)
Portal Building Blocks -- Security
  • So in this course, we will discuss security as it is a common service needed by many portals
    • It can be implemented simply as a user name / password but there are several special features
    • Encryption -- keeping information secret
    • Authentication -- identifying and authorizing individuals to access particular capabilities
    • Different technical approaches -- especially Kerberos and Public Key Infrastructure
  • We will also discuss the special difficulties shown by the spate of stories about viruses, hackers, and leaks of computer information from government facilities
Portal Building Blocks -- Distributed Object Support
  • Although CORBA for instance provides (by definition) most key distributed object services such as persistence, this is not sufficient as we will inevitably mix object models and further these services are still being developed
  • So as one part of this course we will discuss “distributed object” and “internet” (software) infrastructure
  • with special attention to issues of naming, registering, looking up (yellow pages) and addressing objects
    • Remember a web page is most common object and every Java program is an object
  • We need to contrast the classical hierarchy of naming, as in DNS and web URLs and as implemented in LDAP, with the much more intriguing dynamic model in Sun’s Jini and UCB’s Ninja, which are suitable for mobile, ephemeral objects
Basic 3 Tier Computing Model

[Diagram labels: Web Server; File System or Database holding Web Pages]

  • A server accepts input and produces output
    • A Web Server accepts an HTTP request and returns a web page
    • A Database Server accepts an SQL request and returns records selected from the database
    • An Object Broker accepts IIOP requests to invoke methods of an “object” (e.g. run a program)
  • IIOP and HTTP are two common protocols (formats of control data) for inter-program messages
  • A Web browser (Netscape or Microsoft) can access any server at “the click of a button”, with data from the user refining the action
Object View of running a program

  • Similar to invoking a web page
  • “CORBA” or “WIDL” (pure XML CGI specification) is just CGI done right ...
  • A Fortran program is an important type of object. It can be built up from smaller objects, e.g. a matrix library could be an object

[Diagram: pressing the Run button sends a generic run request to the Object Broker, which converts it into a specific request on the chosen computer; the target is a Fortran simulation code on a sequential or parallel machine]

Pragmatic Object Web Technology Model - I
  • Basic Vision: The current incoherent but highly creative Web will merge with distributed object technology in a multi-tier client-server-service architecture with Java based combined Web-ORB’s
  • Need to abstract entities (Web Pages, database entries, simulations) and services as objects with methods (interfaces)
    • CORBA .. XML is “just” CGI done right
  • COM (Microsoft) and CORBA (world) are competing cross-platform and cross-language object technologies
    • Every Netscape4 browser has a Visigenic ORB built in
  • Javabeans plus RMI and JINI is 100% pure Java distributed object technology
  • W3C says you should use XML which defines a better IDL and perhaps an object model -- certainly does for documents
  • How do we do this while the technology is still changing rapidly?
Multi-Tier Client Server Service

[Diagram labels: Client Tier; Middle Tier (Servers): Object Broker (IIOP), Web Server (HTTP), Specialized Java Server (RMI (IIOP) or Custom); Back-end Tier (Services): Relational Database, Object Store, Old and New Useful Backend Systems; Javabean; Enterprise Javabean]

Pragmatic Object Web Technology Model - II
  • Need to use mix of approaches -- choosing what is good and what will last
  • For example develop Web-based databases with Java objects using standard JDBC (Java Database Connectivity) interfaces
    • The confusion among Oracle, DB2, Informix, Sybase, Lotus Notes, and object databases becomes an issue of performance/robustness, NOT functionality
  • Use XML to record small databases in flat files
  • Use CORBA to wrap existing applications
  • Use COM to access components of major PC applications such as Microsoft Excel and Word
  • Use Jini and Java to implement dynamic registration of objects
  • Use HTML to render everything
Functionality of layers

Client:
1) Rendering of (multiple) objects
2) Proxy to some backend capability, used to render input and output to and from the service

Server:
1) Server acts as a broker and control layer
2) Same software as client but higher performance, multi-user
3) Again, the service is represented as a proxy used as a token for control logic

Back end: services with specialized software and capabilities

[Diagram labels: Database; MPP; Telescope; File System]

Proxy -- Proxy -- Backend Capability

[Diagram: two proxies and the real capability, linked by XML messages]

  • The proxies and the actual instantiation are linked by messages whose semantic content is defined (best) in XML
  • The lower system-level format can be HTTP, RMI, IIOP, or …
  • The client proxy is for rendering input and output, including specification of the object
  • The middle tier proxy allows choice of backend provider and functional integration (the user can specify integration at the client proxy level)
Basic Multi Tier architecture

[Diagram labels: Browser with Rendering Engine (HTML); XML Query (XML request for service followed by return of XML result); Broker or Server; Universal Interfaces (IDL or Templates); Objects]

  • Objects (at the “logical backend”) can of course be on the client
  • The front end can define a generic (proxy for an) object. The middle control tier brokers a particular instantiation
Emerging Object Web Multi-Server Model

[Diagram labels: Clients and their servers; Middle Tier Custom Servers; Back End Servers and their services]

Multi-Server Web Computing System or Portal to Computing

[Diagram labels: Gateway Control; Multidisciplinary Control (WebFlow); Parallel DB Proxy; Database; NEOS Control Optimization; Optimization Service; Origin 2000 Proxy; MPP; NetSolve Linear Alg. Server; Matrix Solver; Agent-based Choice of Compute Engine; IBM SP2 Proxy; MPP; Data Analysis Server]
Some caveats and comments
  • Need version 5 browsers with good XML support to properly implement this
  • We draw three tiers as the minimum architecture but, as the previous diagram suggests, one is likely to build a more structured system with the middle tier having more layers
  • A network computer breaks the client tier into two, with simple HTML at the user and Java servlets providing heavier-weight capability on a different machine
    • Here the user is at a WebTV or Palm Pilot or similar low-end device

[Diagram labels: Palm Top; User Specific Server; Multi-User Middleware]
What does it take to Implement This
  • Well you need some hardware -- that is either an Internet or Intranet
    • Internet is the world running object web software
    • Intranet is a dedicated network (for a company, university department, PC cluster) running object web software
  • You need some software
  • You need some standards and capabilities expressed in these standards
  • You need some capabilities common to many applications
  • You need a specific system to solve a particular problem
More details on the implementation
  • Note the hardware can be as little as 1 PC on your desk
  • More interestingly, it is your 64 PC Linux or Windows NT cluster, up to the cluster of 64 128-node SGI Origins at Los Alamos
    • i.e. a parallel computer is by definition a special case of an Intranet
  • Software divides into several types:
    • HTML rendering, with style sheets and page design
    • “Glue” with (multiple) tier servers and XML inter-tier communication
    • Java/CORBA/WIDL wrapper
    • Fortran program, PLSQL database, or …

Implementation Continued I
  • The backend software can be parallel or sequential and simulation or information based
    • It can be COBOL, Fortran, C++, Perl ...
  • We need to define in XML its interface needed to
    • run it
    • set its parameters -- i.e. its input mechanisms
    • get its output -- numbers or visualization
  • This backend program interface is defined as an XML file, e.g.

    <program name="physicssimulation1">
      <run domain="npac" machine="maryland" type="pc" os="nt" exec="c:\myprogs\a.out"></run>
      <input type="htmlform">
        <name>userinput</name>
        <field default="10">iterations</field>
        ……….
      </input>
      <output> … </output>
    </program>

The <input> element becomes an HTML form with name userinput and a text field iterations with default value 10 on the client.
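The conversion from the XML interface to an HTML form can be sketched in Java as follows. This is only an illustration under assumptions: the class name `FormRenderer`, the method `toHtmlForm`, and the exact HTML layout are hypothetical, only the first `<input>` element is handled, and no HTML escaping is done.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of WebFlow): renders the first <input>
// element of a program description as an HTML form.
final class FormRenderer {
    static String toHtmlForm(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element input = (Element) doc.getElementsByTagName("input").item(0);
            // <name> gives the form name; each <field> becomes one text field.
            String formName = input.getElementsByTagName("name").item(0).getTextContent().trim();
            StringBuilder html = new StringBuilder("<form name=\"" + formName + "\">\n");
            NodeList fields = input.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                html.append("  <input type=\"text\" name=\"")
                    .append(field.getTextContent().trim())
                    .append("\" value=\"")
                    .append(field.getAttribute("default"))
                    .append("\">\n");
            }
            return html.append("</form>").toString();
        } catch (Exception e) {
            // Parsing problems are reported as an unchecked exception.
            throw new IllegalArgumentException("cannot read interface description", e);
        }
    }
}
```

In a real portal this step could equally be done by an XSL style sheet; plain DOM code is used here only to keep the sketch self-contained.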

Implementation Continued II
  • For this example (running a physics program), we could use a specific machine as defined on the previous foil (the Windows NT PC maryland) or a generic machine: <run domain="any" machine="any" type="pc" os="nt">
  • In this case, middle tier broker, would look in a database (XML file) to find what machines were available and either automatically or with user input choose one.
  • Both Software and Hardware are defined in XML
  • Note that databases and XML files are logically equivalent
  • JDBC can access either an Oracle (or Microsoft) database or an XML ASCII file
  • More generally XML can be thought of as general object serialization
    • A database table is one type of object
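The broker's choice of a machine for a generic `<run>` request can be sketched as follows. This is a minimal illustration, assuming an in-memory list in place of the XML file or database; `MachineBroker`, `Machine`, and `choose` are hypothetical names, and the only rule taken from the slides is that the value "any" acts as a wildcard.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical middle-tier broker: matches a <run> request against a
// registry of machines, treating "any" as a wildcard.
final class MachineBroker {
    static final class Machine {
        final String domain, name, type, os;
        final boolean available;
        Machine(String domain, String name, String type, String os, boolean available) {
            this.domain = domain; this.name = name; this.type = type;
            this.os = os; this.available = available;
        }
    }

    private static boolean matches(String wanted, String actual) {
        return "any".equals(wanted) || wanted.equals(actual);
    }

    // Returns the first available machine that satisfies all four attributes.
    static Optional<Machine> choose(List<Machine> registry,
                                    String domain, String machine, String type, String os) {
        return registry.stream()
                .filter(m -> m.available)
                .filter(m -> matches(domain, m.domain) && matches(machine, m.name)
                          && matches(type, m.type) && matches(os, m.os))
                .findFirst();
    }
}
```

A request with `domain="any" machine="any" type="pc" os="nt"` would then return the first available NT PC, which the portal could accept automatically or offer to the user for confirmation.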
Implementation Continued III
  • The front end is some document consisting of a mix of HTML and XML
    • The HTML is whatever you want to make a nice page
    • The XML is converted into some variant of HTML by
      • Browser default or
      • XSL style sheet
      • User Program -- logically in middle tier
    • Note HTML can include Java applets either directly or invoked from XSL style sheets
  • We will NOT discuss either how to code backend in PLSQL or Fortran or how to compose final rendered document in HTML
Collaboration I
  • This is often termed groupware support and Lotus Notes is best known corporate product
  • Collaboration implies sharing of electronic objects and is needed in asynchronous and synchronous modes
  • AOL, Yahoo, etc. have Internet games which illustrate one sophisticated form of collaboration
  • Chat rooms are perhaps most popular and are simplest synchronous tool. White boards next most popular
  • Asynchronous mode is
    • shared web pages and documents (these can be shared synchronously or asynchronously)
    • electronic mail, event calendars, bulletin boards
  • http://www.npac.syr.edu/tango/ is a collaboration system supporting synchronous sharing of events where events signify changes in objects
Collaboration II
  • Notification and linkage service can be based on object registration mechanism and allows important collaborative capabilities
  • one associates with each group activity a “magic ID” (barcode)
  • every digital object associated with this activity registers itself when it comes on line with some registry (registry can be distributed). A given object may have multiple barcodes attached to it
    • This is the Jini model for registration, but it can be implemented for the pure Web (using JavaScript) or CORBA
  • Either users or Portals (PSE’s) register interest in certain barcodes
  • The (Portal) event service notifies registered observers when a digital object of interest becomes available
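The registration and notification scheme above can be sketched in a few lines of Java. This is an illustrative in-memory version only; `BarcodeRegistry` and its method names are hypothetical and do not come from Jini, Tango, or WebFlow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical registry: objects register under one or more barcodes, and
// observers interested in a barcode are notified when an object appears.
final class BarcodeRegistry {
    private final Map<String, List<Consumer<Object>>> observers = new HashMap<>();
    private final Map<String, List<Object>> objects = new HashMap<>();

    // A user or portal registers interest in a barcode.
    void registerInterest(String barcode, Consumer<Object> observer) {
        observers.computeIfAbsent(barcode, k -> new ArrayList<>()).add(observer);
    }

    // A digital object comes on line; it may carry several barcodes.
    void registerObject(Object obj, String... barcodes) {
        for (String barcode : barcodes) {
            objects.computeIfAbsent(barcode, k -> new ArrayList<>()).add(obj);
            for (Consumer<Object> o : observers.getOrDefault(barcode, Collections.emptyList())) {
                o.accept(obj);
            }
        }
    }

    // The current federation of objects for one group activity.
    List<Object> federation(String barcode) {
        return new ArrayList<>(objects.getOrDefault(barcode, Collections.emptyList()));
    }
}
```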
Collaboration III
  • The notification mechanism enhances collaboration because it enables a dynamic federation of relevant objects to be maintained automatically
    • Kodak would like this service to group together digital versions of photographs taken at particular events (e.g. a wedding)
  • Notification can be used for people, so that their presence on-line can be made known to particular collaborating users
  • Users decide if notification causes an active signal (send electronic mail, ring a buzzer) or passively alters a list on a web page.
  • Event Model unifies synchronous and asynchronous models of collaboration
    • An event triggers action immediately and/or asynchronously (sending e-mail immediately is a synchronous act generating an asynchronous record)
Collaboration and Portals I
  • Shared Objects need to accept data from Portal compliant applications
  • Portal Events need to be integrated into SPDL
  • Portal federates different “event domains”

[Diagram labels: Collaboration == Sharing Event in “Tango” Server; Portal Events; Local Event/Message Bus (three)]

Collaboration and Portal II
  • Whiteboard and Shared Browser can be loaded with files and data from Portal compliant systems
  • More generally consider any client side rendering of a gateway system -- either data input or (visualization/data) output
    • These can be shared collaboratively
  • Examples from Tango and Computing Portal
    • Shared Biology Workbench shares client side input forms
    • Shared visualization (NCSA, NPAC) shares output file of a computation

Part II

WebFlow

WebFlow design
  • Object Oriented, follows JavaBeans model
    • everything is an object
    • objects interact through events

[Diagram: Object A (event source) fires event E; Object B (event target) has method M()]

Firing event E by object A causes invocation of method M of object B. The association of event E and method M is achieved by an event registration mechanism. An event is also an object and it carries data.
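In plain Java the same pattern looks like the sketch below, which follows the general JavaBeans listener convention. The names `EventDemo`, `DoneEvent`, `DoneListener`, `Source`, and `Target` are illustrative assumptions, not WebFlow classes, and everything runs in one address space.

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

final class EventDemo {
    // An event is itself an object and carries data.
    static final class DoneEvent extends EventObject {
        final String payload;
        DoneEvent(Object source, String payload) {
            super(source);
            this.payload = payload;
        }
    }

    // Implemented by event targets; handle() plays the role of method M.
    interface DoneListener {
        void handle(DoneEvent e);
    }

    // Object A: the event source, which keeps the registered listeners.
    static final class Source {
        private final List<DoneListener> listeners = new ArrayList<>();
        void addDoneListener(DoneListener l) { listeners.add(l); }
        void removeDoneListener(DoneListener l) { listeners.remove(l); }
        void fire(String payload) {
            DoneEvent e = new DoneEvent(this, payload);
            // Iterate over a copy so listeners may unregister while handling.
            for (DoneListener l : new ArrayList<>(listeners)) {
                l.handle(e);
            }
        }
    }

    // Object B: the event target.
    static final class Target implements DoneListener {
        String lastPayload;
        @Override public void handle(DoneEvent e) { lastPayload = e.payload; }
    }
}
```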

A few words about CORBA

(a digression)

More information on Java, CORBA, and distributed objects:

http://www.npac.syr.edu/projects/cps616spring96/index.html

Distributed objects
  • Typically WebFlow objects live in different address spaces. We use CORBA to invoke methods of the remote objects.

[Diagram: Object A (event source) fires event E; the ORB delivers it to method M() of Object B (event target)]

How is this possible?

[Diagram labels: Object A (event source), Fire event E; ORB1; IIOP; ORB2; Object Adapter (serves also as a daemon); Object B (event target), Method M()]

- Objects A and B are CORBA objects (thus not Java objects)
- Objects are defined in IDL (Interface Definition Language)
- IDL definitions are compiled using a (Java) IDL compiler
- The IDL compiler generates new classes to be used by the Java compiler (javac) instead of the original ones, on both the client and server side
- The IDL compiler generates either classes to be extended, or interfaces to be implemented

Example of IDL definition

    #include "..\BC.idl"
    module WebFlow {
      module lms {
        interface runEdys:BeanContextChild {
          void run();
          void receiveData();
          void setParameter(in string p);
        };
        interface runCasc2d:BeanContextChild {
          void run();
          void runAgain();
        };
        interface DoneEvent {
          Object getSource();
        };
      };
    };

We will create 3 CORBA objects:

  • two modules: runEdys and runCasc2d
  • one event: DoneEvent

They will be added to package WebFlow.lms

We need more flexibility...
  • WebFlow objects are developed independently of each other (reusable modules): we cannot assume that the event source knows anything about the event target and vice versa
Event binding

[Diagram labels: Event Source, fireEvent(E,M); Adapter with binding table, addEventListener, rmEventListener; Event; DII; DSI; ORB; Event Target, method M]
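The idea of an adapter holding a binding table can be sketched locally in Java. This is an analogy under assumptions, not the WebFlow adapter: Java reflection stands in for CORBA's dynamic invocation, there is no ORB involved, and `EventAdapter`, `Counter`, and the method signatures are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adapter: neither source nor target knows the other; the
// binding table maps (source, event name) to (target, method name).
final class EventAdapter {
    private static final class Binding {
        final Object target;
        final String method;
        Binding(Object target, String method) { this.target = target; this.method = method; }
    }

    private final Map<Object, Map<String, List<Binding>>> table = new IdentityHashMap<>();

    void attachEvent(Object source, String event, Object target, String method) {
        table.computeIfAbsent(source, s -> new HashMap<>())
             .computeIfAbsent(event, e -> new ArrayList<>())
             .add(new Binding(target, method));
    }

    // Looks up the bindings and invokes each bound no-argument method by
    // name; returns the number of methods invoked.
    int fireEvent(Object source, String event) {
        List<Binding> bindings = table.getOrDefault(source, Collections.emptyMap())
                                      .getOrDefault(event, Collections.emptyList());
        for (Binding b : bindings) {
            try {
                Method m = b.target.getClass().getMethod(b.method);
                m.invoke(b.target);
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException("cannot invoke " + b.method, ex);
            }
        }
        return bindings.size();
    }

    // Example target used to exercise the adapter.
    static final class Counter {
        int runs;
        public void run() { runs++; }
    }
}
```

Because the target method is resolved by name at run time, modules developed independently can be wired together after the fact, which is the property the slide asks for.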

Controlling a module

Another complication: Java sandbox!

[Diagram labels: Applet with Module Controls (ActionButton1, ActionButton2, ….); IIOP; Proxy Module; Module]

Adding a remote module

[Diagram labels: FE request; Local Host: Add module, Module Factory, Proxy Module; Remote Host: Add module, Module Factory, Module]

CORBA Based Middle-Tier

Mesh of WebFlow Servers implemented as CORBA objects that manage and coordinate distributed computation.

[Diagram labels: Front End; Gatekeeper; Authentication; Authorization]

WebFlow Server
  • The WebFlow server is a container object, a.k.a. a context; in fact it implements the JavaBeans BeanContext interface (Java 1.2)
  • The BeanContext acts as a logical container for JavaBeans (“WebFlow modules and services”) and BeanContexts.
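The container behaviour can be tried out with the standard `java.beans.beancontext` classes. The sketch below is not WebFlow code: the nesting server, user, application, module is an assumption based on the slides, `ContextHierarchy` is a hypothetical name, and recent JDK releases mark this package as deprecated.

```java
import java.beans.beancontext.BeanContext;
import java.beans.beancontext.BeanContextChild;
import java.beans.beancontext.BeanContextChildSupport;
import java.beans.beancontext.BeanContextSupport;

final class ContextHierarchy {
    // Nests user and application contexts inside the given server context
    // and returns the innermost component (the "module").
    static BeanContextChild build(BeanContextSupport server) {
        BeanContextSupport user = new BeanContextSupport();
        BeanContextSupport application = new BeanContextSupport();
        BeanContextChildSupport module = new BeanContextChildSupport();
        server.add(user);            // a context can contain other contexts
        user.add(application);
        application.add(module);     // and plain components
        return module;
    }

    // Counts the containers between a component and the root context.
    static int depth(BeanContextChild child) {
        int depth = 0;
        for (BeanContext c = child.getBeanContext(); c != null; c = c.getBeanContext()) {
            depth++;
        }
        return depth;
    }
}
```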
WebFlow Server

The WebFlow server is given by a hierarchy of containers and components. It hosts users and services. Each user maintains a number of applications composed of custom modules and common services.

[Diagram labels: User 1 (Application 1, Application 2); User 2 (App 1, App 2); WebFlow Services]

WebFlow Context Hierarchy

[Diagram labels: Master Server (Gatekeeper); Slave Server Proxy; Slave Server (two); User Context; Application Context; Module]

Middle-Tier modules serve as proxies of Back-End Services

[Diagram labels: Browser-based Front-End (two); User Space Definition and Task Specification; Services; User Modules; Metacomputing Services; Back-End Resources]

Modules
  • Similar to JavaBeans
    • full power of Java (or C++) to implement functionality
    • can encapsulate legacy applications
  • May serve as Proxies
    • JDBC
    • metacomputing services (such as Globus)
    • schedulers (such as PBS, CONDOR, etc)
Services
  • Services are modules provided by the system that offer generic functionality
    • job services (submit,monitor,kill,... a job)
    • file services (edit,copy,move,… a file)
    • XML parser
    • database access
    • mass storage access
    • ...
Example of a proxy module

GRAM resource description:

    &(rsl_substitution = (MYDIR "/tmp/haupt")
                         (DATADIR $(MYDIR)/data)
                         (EXECDIR $(MYDIR)/bin))
     (executable = $(EXECDIR)/a.out)
     (arguments = $(DATADIR)/file1)
     (stdout = $(MYDIR)/result.dat)
     (count = 1)

The Run Job module is a proxy module. It generates the RSL on the fly and submits the job for execution using the globusrun function. The module knows only the executable name, its location, and its arguments/parameters.
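Generating the RSL text from the few facts the module knows could look like the sketch below. It is a simplified assumption, not the Run Job module itself: `RslBuilder` is a hypothetical name, the output uses only the four attributes shown in the example, and the submission step is left out.

```java
// Hypothetical helper: assembles an RSL request string for one executable.
// It copies the layout of the example above and does not cover the full
// RSL grammar (no substitutions, no quoting of special characters).
final class RslBuilder {
    static String build(String execPath, String argument, String stdoutPath, int count) {
        return "&(executable=" + execPath + ")"
             + "(arguments=" + argument + ")"
             + "(stdout=" + stdoutPath + ")"
             + "(count=" + count + ")";
    }
}
```

Calling `RslBuilder.build("/tmp/haupt/bin/a.out", "/tmp/haupt/data/file1", "/tmp/haupt/result.dat", 1)` yields the same request as the example with the substitutions already expanded.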

[Diagram labels: Generate Data; Run Job; Analyze]

WebFlow over Globus

In order to run WebFlow over Globus there must be at least one WebFlow node capable of executing Globus commands, such as globusrun.

Jobs that require the computational power of massively parallel computers are directed to the Globus domain, while other jobs can be launched on much more modest platforms, such as the user’s desktop or even a laptop running Windows NT.

[Diagram label: Bridge between WebFlow and Globus]
Secure Access: terminology
  • Access Control (or Authorization)
    • Assurance that the person or computer at the other end of the session is permitted to do what he asks for.
  • Authentication
    • Assurance that the resource (human or machine) at the other end of the session is what it claims to be
  • Integrity
    • Assurance that the information that arrives is the same as when it was sent
  • Accountability (or non-repudiation)
    • Assurance that any transaction that takes place can subsequently be proved to have taken place
  • Privacy
    • Assurance that sensitive information is not visible to an eavesdropper (usually achieved using encryption)
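As one concrete illustration of the integrity property, a sender and receiver sharing a secret key can attach and verify a message authentication code. The slides do not name a mechanism; HMAC-SHA256 from the standard Java crypto API is chosen here purely as an example, and `IntegrityCheck` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class IntegrityCheck {
    // Computes an HMAC-SHA256 tag over the message with a shared key.
    static byte[] tag(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The receiver recomputes the tag and compares it with the one received.
    static boolean verify(byte[] key, String message, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(key, message), receivedTag);
    }
}
```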
Secure Access
  • Mutual authentication of servers and users
    • Certificates, Kerberos/SecurID
  • Access control
    • Full autonomy of the resources owner(s)
    • Akenti
  • Privacy
  • Integrity
Security Model

[Diagram labels: Layer 1: secure Web (Front End Applet, https (SSL)); Layer 2: secure CORBA (Gatekeeper, CORBA security service, SECIOP, delegation, authentication & authorization, AKENTI, Stakeholders, Policies defined by resource owners); Layer 3: Secure access to resources (GSSAPI (Globus), HPCC resources)]

Distributed Objects are less secure
  • can play both client and server
    • in client/server you trust the server, but not the clients
  • evolve continually
    • objects delegate parts of their implementation to the other objects (also dynamically composed at runtime). Because of subclassing, the implementation of an object may change over time
  • interactions are not well defined
    • because of encapsulation, you cannot understand all the interactions between objects
  • are polymorphic (ideal for Trojan horses!)
  • can scale without limit
    • how do you manage the access right to millions of servers?
  • are very dynamic
CORBA security is built into ORB

[Diagram labels: User; Client; Server; Object Adapter; ORB; Credentials; Authentication; Encryption (client and server side); Audit; Authorization; Secure Communications]

Authentication
  • A principal is authenticated once by ORB and given a set of credentials, including one or more roles, privileges, and an authenticated ID.
  • An authenticated ID is automatically propagated by a secure ORB; it is part of the caller context

[Diagram labels: Client; Server; Principal; authenticate; Credentials; set_credentials; Current; get_attributes]

Privilege Delegation

[Diagram: chains of Client and Target objects, in which an intermediary target acts as a client of the next target]

  • No delegation
    • The intermediary uses its own credentials
  • Simple delegation
    • The intermediary impersonates the client
  • Composite delegation
    • The intermediary uses both
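The three delegation modes differ only in which credentials the intermediary presents to the next target. A minimal sketch, with the hypothetical names `Delegation`, `Mode`, and `effectiveCredentials` and with credentials reduced to plain strings:

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class Delegation {
    enum Mode { NONE, SIMPLE, COMPOSITE }

    // Returns the credentials the intermediary presents to the final target.
    static Set<String> effectiveCredentials(Mode mode, String client, String intermediary) {
        Set<String> credentials = new LinkedHashSet<>();
        switch (mode) {
            case NONE:       // the intermediary uses its own credentials
                credentials.add(intermediary);
                break;
            case SIMPLE:     // the intermediary impersonates the client
                credentials.add(client);
                break;
            case COMPOSITE:  // the intermediary uses both
                credentials.add(client);
                credentials.add(intermediary);
                break;
        }
        return credentials;
    }
}
```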

IIOP

CORBA access model
  • Based on a trusted ORB model: you must trust that your ORB will enforce the access policy on the server resource
  • The ORB determines whether this client, on behalf of this principal, can do this operation on this object
  • Server uses Access Control Lists (ACL) to control user access

[Diagram: Principal, Role, Rights, Operation]
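The chain from principal to role to rights to operation can be sketched as a simple lookup. This is an illustrative model only, not the CORBA Security Service API; `AccessPolicy` and its method names are hypothetical, and operations without a declared required right are denied by assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class AccessPolicy {
    private final Map<String, Set<String>> rolesOfPrincipal = new HashMap<>();
    private final Map<String, Set<String>> rightsOfRole = new HashMap<>();
    private final Map<String, String> requiredRight = new HashMap<>();

    void grantRole(String principal, String role) {
        rolesOfPrincipal.computeIfAbsent(principal, k -> new HashSet<>()).add(role);
    }

    void grantRight(String role, String right) {
        rightsOfRole.computeIfAbsent(role, k -> new HashSet<>()).add(right);
    }

    void require(String operation, String right) {
        requiredRight.put(operation, right);
    }

    // An operation is allowed if any role of the principal holds the right
    // that the operation requires.
    boolean isAllowed(String principal, String operation) {
        String needed = requiredRight.get(operation);
        if (needed == null) {
            return false; // no policy for this operation: deny
        }
        for (String role : rolesOfPrincipal.getOrDefault(principal, Collections.emptySet())) {
            if (rightsOfRole.getOrDefault(role, Collections.emptySet()).contains(needed)) {
                return true;
            }
        }
        return false;
    }
}
```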


Part III

WebFlow Applications

Applications vary by the functionality of their Front-Ends
    • Front-End Applications
      • must be pre-installed
      • run fast, no restrictions
    • Front-End Applets
      • no installation, but may take time to download
      • sandbox restrictions apply, unless signed
Applications vary by how they are composed from modules
    • statically
      • can be prepared in the Middle-Tier
    • dynamically
      • the user composes them from reusable components
The modules can interact with each other in different ways:
    • through events (object oriented approach)
    • through ports (data flow model)
    • through message passing
Applications vary on how the Front-End interacts with the Middle-Tier
    • A complete task description is sent to the middle-tier
      • composed of reusable modules
      • predefined
    • Objects are added to the user context one at a time, and Front-End keeps their references
LMS Objectives

To develop a web based system that implements a “navigate-and-choose” paradigm and allows the end user to:

  • Select (a set of) computational modules that provide answers to the problem at hand
  • Retrieve input data sets from remote sources
  • Use adequate (remote) computational resources
  • Visualize and analyze output data on the local host

Anytime, anywhere, using any platform

(e.g., a laptop PC connected to the Internet)

LMS: Changes in Vegetation
  • A decision maker (the end user of the system) wants to evaluate changes in vegetation in a geographical region over a long time period caused by short term disturbances such as a fire or human activity.
  • One of the critical parameters of the vegetation model (EDYS) is soil condition at the time of the disturbance.
  • This in turn is dominated by rainfall that possibly occurs at that time (CASC2D simulation)
  • Input data for the simulations are available from the Internet, such as Digital Elevation Models (DEM) from the USGS web site, or from custom databases (species characteristics)
LMS: Changes in Vegetation

Steps: data retrieval; data preprocessing; simulation (two interacting codes, EDYS and CASC2D); visualization.

[Diagram labels: DEM; Land Use; Soil Texture; Vegetation; WMS; EDYS; CASC2D]

WMS: Watershed Modeling System

EDYS: vegetation model

CASC2D: watershed model

LMS Front End

Data retrieval

Data pre- and post-processing

Simulations

Data Retrieval

The data wizard allows the user to interactively select the data and download them to the local machine. The raw data are then fed to the WMS system, launched from the browser, to generate input files for simulations.

slide84

Select host

Select model

Set parameters

Run

Launching coupled simulations on different Back-End computational resources

wms based visualizations
WMS based Visualizations

The results of the simulations are sent back to the Front-End, and can be visualized using tools included in the WMS package

implementation of lms
Implementation of LMS
  • Front-End (client) is a Java application
    • Data wizard, EDYS and WMS are run locally
  • “navigate and choose” - no interactive composition of applications
    • predefined choices: EDYS, CASC2D, or EDYS and CASC2D coupled
  • modules exchange data through message passing mediated by WebFlow
  • client keeps the module references
running lms

Running LMS

[Diagram: a Client (lms.class, Data wizard, WMS) and the WebFlow Servers: a master and two slaves, one on a UNIX host and one on a WinNT host, each host also running a Web Server. WebFlow modules shown: runCasc2d, runEdys, exeCasc2d.]


client code
Client code

try {
    // add modules
    p1 = slaveNT.addNewModule("runEdys");      // as defined in conf.file
    runEdys re = runEdysHelper.narrow(p1);
    p2 = slaveUNIX.addNewModule("runCasc2d");  // as defined in conf.file
    runCasc2d rc = runCasc2dHelper.narrow(p2);

    // bind events
    master.attachEvent(p2, "Casc2dDone", "Casc2dDone", p1, "run");
    master.attachEvent(p1, "EdysStarted", "EdysStarted", p2, "run");
    master.attachEvent(p1, "EdysDone", "EdysDone", p2, "runAgain");

    // invoke methods of runCasc2dImp
    rc.run();
} catch (COMM_FAILURE ex) {
    System.err.println(ex.getMessage());
    System.exit(1);
}
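
The event binding in the client code can be pictured with a small in-process stand-in. This is only a sketch under stated assumptions: the `Master` class, its three-argument `attachEvent`, and `fire` are hypothetical and are not the WebFlow API, the module logic is replaced by log entries, and only two of the three bindings are reproduced so that the chain terminates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EventBindingSketch {
    /** Hypothetical stand-in for the master server: maps "source:event" to target actions. */
    static final class Master {
        private final Map<String, List<Runnable>> bindings = new HashMap<>();

        // Bind an event fired by a source module to a method of a target module.
        void attachEvent(String source, String event, Runnable targetMethod) {
            bindings.computeIfAbsent(source + ":" + event, k -> new ArrayList<>())
                    .add(targetMethod);
        }

        // Called when a module fires an event; runs every action bound to it.
        void fire(String source, String event) {
            List<Runnable> targets =
                    bindings.getOrDefault(source + ":" + event, Collections.emptyList());
            for (Runnable r : targets) {
                r.run();
            }
        }
    }

    /** Runs a simplified coupled scenario and returns the order of method invocations. */
    public static List<String> runCoupled() {
        List<String> log = new ArrayList<>();
        Master master = new Master();

        // Casc2dDone (from runCasc2d) triggers runEdys.run, which ends by firing EdysDone.
        master.attachEvent("runCasc2d", "Casc2dDone", () -> {
            log.add("runEdys.run");
            master.fire("runEdys", "EdysDone");
        });
        // EdysDone (from runEdys) triggers runCasc2d.runAgain.
        master.attachEvent("runEdys", "EdysDone", () -> log.add("runCasc2d.runAgain"));

        // The client starts the chain, as rc.run() does in the client code.
        log.add("runCasc2d.run");
        master.fire("runCasc2d", "Casc2dDone");
        return log;
    }

    public static void main(String[] args) {
        for (String entry : runCoupled()) {
            System.out.println(entry);
        }
    }
}
```

The point of the design is that neither module calls the other directly: the client only declares which event drives which method, and the master mediates.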

interactions between components
Interactions between components

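[Diagram: same layout as "Running LMS": the client (lms.class, Data wizard, WMS), a master, and slaves on the UNIX and WinNT hosts, each host with a Web Server. Modules shown: runCasc2d, runEdys, exeCasc2d, casc2d. Links are labeled http, IIOP and Write.]
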

implementation of qs
Implementation of QS
  • Front-End (client) is a Java applet
  • applications are created dynamically from pre-existing modules
  • modules exchange data through ports (data flow model)
  • server keeps the module references; the references are published on a web site
building an application
Building an application

  • Front-End Applet: a visual representation is converted into an XML document
  • Middle-Tier: the Web Server saves the XML document; an XML service parses it, generates Java code to add modules to ApplContext, and publishes the IOR


document type definition
Document Type Definition

<!DOCTYPE taskspec [
<!ELEMENT taskspec (task)+>
<!ATTLIST taskspec
    UserContextRef CDATA #REQUIRED
    AppName CDATA #REQUIRED>
<!ELEMENT task ((task | module)*,connection*)>
<!ELEMENT module (#PCDATA)>
<!ATTLIST module
    modulename CDATA #REQUIRED
    host CDATA #REQUIRED>
<!ELEMENT connection (out,in)>
<!ELEMENT in EMPTY>
<!ELEMENT out EMPTY>
<!ATTLIST out
    modulename CDATA #REQUIRED
    eventname CDATA #REQUIRED>
<!ATTLIST in
    modulename CDATA #REQUIRED
    method CDATA #REQUIRED>
]>

example xml document
Example XML document

<taskspec UserContextRef="123as321" AppName="TestApplication">
  <task>
    <module modulename="FileBrowser" host="localhost">
    </module>
    <module modulename="FileEditor" host="localhost">
    </module>
    <module modulename="Gaussian" host="localhost">
    </module>
    <connection>
      <out modulename="FileBrowser" eventname="FileEvent" event="File"/>
      <in modulename="FileEditor" method="run"/>
    </connection>
    <connection>
      <out modulename="FileEditor" eventname="FileEvent" event="File"/>
      <in modulename="Gaussian" method="run"/>
    </connection>
  </task>
</taskspec>
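
To illustrate how a middle tier might read such a task specification, here is a minimal sketch that uses the standard Java DOM parser to list the event-to-method connections. The class name `TaskSpecReader` and its output format are assumptions made for this example; the slides do not show the actual WebFlow parsing code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class TaskSpecReader {
    /** Returns one "source.event -> target.method" string per connection element. */
    public static List<String> connections(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> result = new ArrayList<>();
        NodeList conns = doc.getElementsByTagName("connection");
        for (int i = 0; i < conns.getLength(); i++) {
            Element conn = (Element) conns.item(i);
            // Per the DTD, each connection holds exactly one out and one in element.
            Element out = (Element) conn.getElementsByTagName("out").item(0);
            Element in = (Element) conn.getElementsByTagName("in").item(0);
            result.add(out.getAttribute("modulename") + "." + out.getAttribute("eventname")
                    + " -> " + in.getAttribute("modulename") + "." + in.getAttribute("method"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml =
              "<taskspec UserContextRef=\"123as321\" AppName=\"TestApplication\"><task>"
            + "<module modulename=\"FileBrowser\" host=\"localhost\"></module>"
            + "<module modulename=\"FileEditor\" host=\"localhost\"></module>"
            + "<connection>"
            + "<out modulename=\"FileBrowser\" eventname=\"FileEvent\" event=\"File\"/>"
            + "<in modulename=\"FileEditor\" method=\"run\"/>"
            + "</connection>"
            + "</task></taskspec>";
        for (String c : connections(xml)) {
            System.out.println(c);
        }
    }
}
```

Each returned pair carries the same information that an event binding in the client code needs: the source module, the event name, the target module and the method to invoke.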

mobility system s applications

Mobility System’s Applications
  • Object oriented approach
  • Implementation:
    • CORBA based Middle-Tier
    • bean-box type API
    • JDBC proxy modules
  • Web interface to store data in DB in variable format
  • Data transfer from DB to a visualization engine
  • Coordinates transformations on a remote server
  • Launching simulations on remote hosts with interactive input

[Diagram labels: databases, Remote HPCC resources, Coordinates transformations]


part iv

Part IV

Gateway: Portal for Computing

target architecture1

Target Architecture
  • Problem Solving Environment: CTA specific knowledge databases, Visual Authoring Tools, User and Group Profiles, Resource Identification and Access, Visualizations, Collaboration
  • Abstract Task Specification
  • Middle-Tier: WebFlow
  • Resource Specification
  • Back-End Resources


design issues
Design Issues
  • Support for seamless access (security)
  • Support for distributed, heterogeneous Back-End services (HPCC, DBMS, Internet, ...) managed independently from Gateway
  • Variable pool of resources: support for discovery and dynamic incorporation into the system
  • Scalable, extensible, low-maintenance Middle Tier
  • Web-based, extensible, customizable Front End that self-adjusts to the varying capacities and capabilities of clients (humans, software and hardware)
gateway implementation
Gateway Implementation
  • Distributed, object-oriented middle tier
    • CORBA objects (Gateway Containers, Gateway Modules and Gateway Services) implemented in Java. [Scalable, extensible, low-maintenance middle tier]
    • Containers define the user environment.
    • Modules and Services serve as proxies: they accept the user requests (Front End) and delegate them to the Back End. [Support for distributed, heterogeneous back-end services managed independently from Gateway]

Note: modules can also be implemented in C++, or be DCOM components

gateway implementation 2
Gateway Implementation (2)
  • Gateway operates in a Kerberized environment [Support for seamless access]
    • tickets are generated on the client side
    • Kerberos-based CORBA security service is used to manage the user sessions
    • Globus GSSAPI implemented over Kerberos is used for resource allocation
gateway implementation 3
Gateway Implementation (3)
  • Task Specification is expressed in XML
    • CTA independent
    • Decouples implementation of the Front End and the Middle Tier
    • Allows for an abstract (platform independent) task specification, and thus the Middle Tier may act as a resource broker
  • Resource Specification is expressed in XML
    • Simplifies match-making and resource discovery
    • Simplifies generating Globus RSL on the fly

[Support for distributed, heterogeneous Back-End services; Variable pool of resources; Scalable, extensible, low-maintenance Middle Tier]
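
As a rough illustration of generating Globus RSL on the fly, the sketch below turns a set of attribute/value pairs into an RSL conjunction of the form `&(attribute="value")...`. The slides do not show the XML resource-specification schema, so the input here is a plain map; the class name `RslSketch`, the choice to quote every value, and the sample attribute values are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RslSketch {
    /** Builds an RSL conjunction from attribute/value pairs, in insertion order. */
    public static String toRsl(Map<String, String> attrs) {
        StringBuilder sb = new StringBuilder("&");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            // Assumption: values are emitted as quoted literals, with an embedded
            // double quote escaped by doubling it.
            String value = e.getValue().replace("\"", "\"\"");
            sb.append('(').append(e.getKey()).append("=\"").append(value).append("\")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values that a resource broker might have extracted
        // from an XML resource specification.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("executable", "/usr/local/bin/casc2d");
        attrs.put("count", "4");
        System.out.println(toRsl(attrs));
    }
}
```

Keeping the abstract specification separate from the generated RSL string is what would let the Middle Tier choose the target resource before committing to a concrete request.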

gateway implementation 4
Gateway Implementation (4)
  • Component-based Front-End [extensible]
  • Front-End Components (“toolbox interfaces”) are
    • applets (interfaces for common services)
    • XML pages or frames [Web-based, extensible, customizable, self-adjusting]
  • All components (Front End, Middle-Tier) are defined in XML and contain metadata (used for component mining)
cta specific knowledge database
CTA specific knowledge database
  • requires server side support (both the middle tier and the back-end) through well defined interfaces
  • should be constructed from reusable or cloneable components
  • allows for identification of software components best suited to solve the problem at hand
visual authoring tools
Visual Authoring Tools
  • Allows for composition of the computational task from components (reusable modules)
  • Different tools to support various programming models such as data parallel, task parallel, data flow, object oriented
  • No assumption on granularity
  • Metadata about components and support for archiving and mining the components
  • Support for instrumentation and steering
user and group profile
User and Group Profile
  • Controls the user/group environment
    • file access
    • job monitoring
    • ...
  • Allows for customization
    • preferences
    • users with disabilities
    • ...
  • History of actions
  • Scientific notebook
resource identification and access
Resource Identification and Access
  • Computational resources
    • hardware, software, licenses
    • desktop applications
  • Data
    • file systems, mass storage, distributed databases
    • Internet data repositories
  • Networks
front end support
Front-End Support
  • Portal Page
  • User Context
  • Control Applet
  • Navigator (extensible, customizable)
  • PSE specific toolboxes
    • A placeholder for the Problem Description toolboxes
    • A placeholder for the code toolbox
    • Resource request toolbox
    • Data postprocessing toolbox
  • Other (Collaboration, Visualizations, …)
user context
User Context
  • Represents a Gateway session.
  • The session is associated with a user (or group) profile.
  • WebFlow extends the notion of the UNIX profile via the 'User Data Base' (UDB). The UDB contains information about submitted jobs, the history of the user's actions, and other user state information. The user context may also contain application/front-end specific information.
control applet
Control Applet
  • The control applet is responsible for maintaining the session and for direct communication with the middle-tier.
  • Direct communication is the most efficient, but since it is buried in an applet, this mechanism is not readily customizable.
  • The generic services, such as file services (upload, download, edit, copy, move, delete) and job services (show current jobs/show queues/kill jobs), will be supported this way. [combination of the user context and a query]
  • The Gateway will also support indirect communication with the middle-tier through servlets.
navigator
Navigator
  • The navigator allows the user to select and customize toolboxes.
  • Embedded in a separate frame, it consists of menus, buttons, links, etc, derived from an XML document.
  • The navigator is hierarchical, extensible and customizable.
problem description toolboxes
Problem description toolboxes
  • The problem description is application specific, and the Gateway only provides a general framework for creating a PSE.
  • The most important part is the specification of what services (middle and back tier) are needed, what their API is, and how to add new services.
  • Example services: access to databases, XML parsing, generating HTML on the fly, file services.
code toolboxes
Code toolboxes
  • The end user sees it as a mapping between the problem description and the software to be used to solve the problem. Actually, it identifies WebFlow modules and their parameters to be used to construct the application (see resource request toolbox below).
  • The module parameters may include input files, and if necessary, the input files are generated at this stage (using this or a separate toolbox). In addition, some parameters will be constructed from information stored in databases, including the UDB, and other sources.
resource request toolbox
Resource Request Toolbox
  • The front-end activities result in an abstract task specification.
  • Abstract in the sense that the user may neither know nor care what actual resources are used.
  • The task is composed of independently developed modules and services following different programming models.
file formats i
File Formats I
  • We noted that Gateway should support input and output files with certain characteristics, including
  • native: internal format known to a particular application; no format-matching check is made when the output of one module is linked to the input of another
  • parameter input: special file constructed in XML defining basic parameters needed by a batch job
    • Gateway automatically sets defaults, allows user input and applies any appropriate checks at the front end
    • User provides a bridge to convert the Gateway file to a form understood by the batch job. This can use special Gateway utilities which provide Fortran, C and Java functions to get parameter values in back-end user code
    • We decided such parameters would need default, min, max, type and array dimensions to be specified in the XML specification of these parameters
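
The front-end handling of a parameter (set the default, accept user input, apply checks) can be sketched as follows. This is an illustration only: the `Param` class and its field names are hypothetical, only scalar numeric parameters are covered, the `type` field is carried as metadata without being enforced, and array dimensions are omitted.

```java
class ParameterSketch {
    /** Hypothetical descriptor for one scalar numeric parameter. */
    static final class Param {
        final String name;
        final String type;          // e.g. "double"; metadata only in this sketch
        final double defaultValue;
        final double min;
        final double max;

        Param(String name, String type, double defaultValue, double min, double max) {
            this.name = name;
            this.type = type;
            this.defaultValue = defaultValue;
            this.min = min;
            this.max = max;
        }

        /** Uses the default when the user gave no value, then applies the range check. */
        double resolve(Double userValue) {
            double v = (userValue == null) ? defaultValue : userValue;
            if (v < min || v > max) {
                throw new IllegalArgumentException(
                        name + "=" + v + " is outside [" + min + ", " + max + "]");
            }
            return v;
        }
    }

    public static void main(String[] args) {
        // "timeStep" and its bounds are made-up example values.
        Param timeStep = new Param("timeStep", "double", 1.0, 0.1, 10.0);
        System.out.println(timeStep.resolve(null)); // no user input: default is used
        System.out.println(timeStep.resolve(2.5));  // user input within range
    }
}
```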
file formats ii
File Formats II
  • The third file format is
    • prescribed: data is encoded in some defined fashion known to more than one Gateway application/service
      • HDF, XML and HTML are examples
    • other file characteristics include database generated, realtime …
  • Gateway modules need to specify the nature of their input and output files and what type of linkage of modules is allowed or even required.
    • E.g. the scientific notepad may allow data to be imported from the visualization service, or require that it be encoded in ScienceML
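
One way to picture the linkage rule is a check applied when the output of one module is connected to the input of another. The policy below is an assumption made for illustration and is not stated in the slides: native-to-native links are accepted unchecked, prescribed links require the same named format, and everything else is rejected. The names `Kind`, `FileSpec` and `canLink` are hypothetical.

```java
class FileLinkSketch {
    enum Kind { NATIVE, PARAMETER, PRESCRIBED }

    /** Hypothetical declaration of a module's input or output file. */
    static final class FileSpec {
        final Kind kind;
        final String format; // e.g. "HDF"; only meaningful for PRESCRIBED files

        FileSpec(Kind kind, String format) {
            this.kind = kind;
            this.format = format;
        }
    }

    /** Decides whether an output file may be linked to an input file. */
    public static boolean canLink(FileSpec out, FileSpec in) {
        if (out.kind == Kind.NATIVE && in.kind == Kind.NATIVE) {
            return true; // native formats: no format matching is performed
        }
        if (out.kind == Kind.PRESCRIBED && in.kind == Kind.PRESCRIBED) {
            return out.format.equals(in.format); // same agreed encoding required
        }
        return false; // assumed policy: mixed kinds are not linked automatically
    }

    public static void main(String[] args) {
        FileSpec hdfOut = new FileSpec(Kind.PRESCRIBED, "HDF");
        FileSpec hdfIn = new FileSpec(Kind.PRESCRIBED, "HDF");
        FileSpec htmlIn = new FileSpec(Kind.PRESCRIBED, "HTML");
        System.out.println(canLink(hdfOut, hdfIn));
        System.out.println(canLink(hdfOut, htmlIn));
    }
}
```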
other toolboxes
Other toolboxes
  • Visualizations
  • Collaboration
  • Scientific notebook
  • ...
scienceml
ScienceML
  • We define this as a group of defined formats that support scientific data, note taking and sketches
  • XSIL (Scientific data Interchange) defines the metadata needed to specify scientific data files, including high level parameters and the methods needed to read data
    • This would be a prescribed format in Gateway
  • VML is a vector graphics markup language
  • DrawML is designed to support simple technical drawings (easier than VML but VML should be able to do this?)
  • VRML (3D scenes) reimplemented in XML as X3D
  • MathML: mathematical expressions
  • ChemML: support for chemistry
scientific notepad
Scientific Notepad
  • Presumably this allows scientists to make notes and record thoughts in a way that supports important scientific constructs
  • At its simplest this is an authoring tool like Microsoft Word, PowerPoint or FrameMaker
    • These will improve and support standards such as MathML (OpenMath) with better WYSIWYG authoring
  • One useful utility would be a whiteboard that supports scientific notes using ScienceML
  • Such a collaborative whiteboard (implemented in Tango for instance) would be useful in research and teaching
    • Use a commercial authoring tool and WebEQ or equivalent to render
webflow server2
WebFlow Server

WebFlow server is given by a hierarchy of containers and components

WebFlow server hosts users and services

Each user maintains a number of applications composed of custom modules and common services

[Diagram labels: User 1, User 2, Application 1, Application 2, App 1, App 2, WebFlow Services]


corba based middle tier1
CORBA Based Middle-Tier

Mesh of WebFlow Servers implemented as CORBA objects that manage and coordinate distributed computation.

[Diagram labels: Front End, Gatekeeper, Authentication, Authorization]


back end services
Back End Services
  • Access to HPCC (via Globus)
  • Access to distributed databases (via JDBC)
  • Access to mass storage
  • Access to the Internet resources
  • Access to desktop applications and local data
  • Access to code repositories
security model keberos
Security Model (Kerberos)
  • Layer 1: secure Web (Front End Applet)
  • Layer 2: secure CORBA (SECIOP): delegation; Gatekeeper for authentication & authorization
  • Layer 3: secure access to resources (GSSAPI): HPCC resources; policies defined by resource owners


slide139
  • Gateway applications are composed of independent reusable modules
  • Modules are written by module developers who have only limited knowledge of the system on which the modules will run.
  • The WebFlow system hides module management and coordination functions
Middle-Tier is given by a mesh of WebFlow Servers that manage and coordinate distributed computation
how to develop a gateway component or a toolbox
How to develop a Gateway component (or a toolbox)
  • Back-end service
  • Middle-tier proxy
  • Front-end controls
how the back end interacts with the rest of the system
How does the Back-End interact with the rest of the system?
  • Often, your job does not need to interact.
    • Using GRAM and GASS you stage data and executable, submit the job and retrieve output.
    • Using DUROC you can coallocate resources and run MPI-based parallel/distributed codes. The messages between nodes are sent outside Gateway control or support.
    • HPF runtime will distribute your job and facilitate interprocess communication.
implementing back end services
Implementing Back-End Services
  • If you need to interact
    • Using a separate module, you may move files between nodes while your jobs are executing
    • Your job may be a server (e.g., database, GRAM) [if socket listener - be careful about security!]
    • Your job may be a CORBA client (Java, C++)
    • ...
what does it take to develop a gateway module a proxy
What does it take to develop a Gateway module (a proxy)?
  • Many come as standard Gateway modules
  • User’s modules
    • Are CORBA objects
      • Define IDL (as an XML document)
      • Compile IDL (in the tie mode)
      • Implement the functionality of the module
      • Implement events
      • Develop Front-End controls that invoke methods of the module
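
The "tie mode" mentioned in the steps above is a delegation pattern: the object exposed to the ORB forwards each call to a separate implementation class, so the implementation does not have to inherit from a CORBA skeleton. The sketch below shows only that delegation idea in plain Java, without an ORB. All names (`RunEdysOperations`, `RunEdysImpl`, `RunEdysTie`) are hypothetical; with a real IDL compiler the operations interface and the tie class would be generated, and the tie would extend the ORB's skeleton class.

```java
class TieSketch {
    /** Stand-in for the operations interface an IDL compiler would generate. */
    interface RunEdysOperations {
        String run();
    }

    /** Developer-written module logic; it is free to extend any other class. */
    static final class RunEdysImpl implements RunEdysOperations {
        @Override
        public String run() {
            // A real module would launch the back-end job here and then fire an event.
            return "EdysDone";
        }
    }

    /** Stand-in for the generated tie class: every call is forwarded to the delegate. */
    static final class RunEdysTie implements RunEdysOperations {
        private final RunEdysOperations delegate;

        RunEdysTie(RunEdysOperations delegate) {
            this.delegate = delegate;
        }

        @Override
        public String run() {
            return delegate.run();
        }
    }

    public static void main(String[] args) {
        RunEdysOperations module = new RunEdysTie(new RunEdysImpl());
        System.out.println(module.run());
    }
}
```

Because the implementation class is decoupled from the skeleton, the same module logic can be reused or wrapped without touching generated code.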
gateway webflow mission
Gateway/WebFlow Mission
  • seamless access to remote resources
    • through a Web based user interface
    • customized application GUI
  • high-level user friendly visual programming and runtime environment for HPDC
  • portable system based on industry standards and commodity software components
updates
Updates
  • Contact person: Tomasz Haupt
  • haupt@npac.syr.edu
  • voice (315) 443-2087
  • http://www.npac.syr.edu/users/haupt/WebFlow/