Portals and Portlets Marlon Pierce Community Grids Lab Indiana University September 27, 2004
What a Science Portal Is/Is Not • It is • A tool for aggregating and managing web content • A user customizable view of these Web content pieces. • You see what you want/can see. • But you must log in. • The portal recognizes you. • Implemented on top of standard services • Like login, authorization, customization. • May include collaboration, etc, that depend on login. • A way to accomplish tasks through browsers: • Launch, monitor jobs • Move files • Run science applications based on these services. • Compatible with emerging standards and best practices (such as portlets, JSR 168 and WSRP). • It is not (just) • A web page • A collection of links • An applet
Things I Hate About Portals • If you are involved in portal efforts, be aware of the following: • Browsers have limited interactivity. • Desktop GUIs provide much better interactivity but have other problems. • Applets are a solution, but they don’t interact with other parts of the browser very well. • Solution: Service Oriented portals let you use services through both portals and grid desktops. • Developing really useful user interfaces to your set of services is a time-consuming, non-scaling process. • Get users involved early in design. • If you can get users directly involved, even better. • Browsers notoriously have incompatible features. • Things don’t work the same on IE, Mozilla, Konqueror, etc. • The same browsers don’t work the same on Macs and Windows. • No substitute for lots of testing. • Portals depend upon backend resources and security • But these things are not under portal developers’ control. • When things change (and they do, constantly) things will break. • You will be blamed.
Interoperable Portal Services (ET-03-011) Marlon Pierce Community Grids Lab Indiana University
What 3rd Party Technologies Did We Use? • Tomcat 3.x and 4.x, Jetspeed 1.4, JSP 2.x, JavaBeans, Apache Axis 1.x • We also used kerberized CORBA in previous projects • Unfashionable these days • If I were starting today, I would use • Gridsphere (for JSR168), JavaServer Faces, Apache Axis, NaradaBrokering
Project: ET011 • Goals of the project are to demonstrate interoperability between Portal/PSE projects • Mary Thomas (PI), TACC: HotPage • Tomasz Haupt, MSU: DMEFS • Marlon Pierce, IU: Gateway • We investigated building interoperability at two levels: • Web services provide standard interfaces • Portlets provide component-based interfaces • I was responsible for deployment at ASC and ARL. • Project ended Oct 1, 2003.
Portal Security • We are building off Gateway’s approach for Web-based security for DOD portals. • Approved for ARL and ASC • Users kinit to a web server to get a ticket. • SSL, MD’d sessions, Certificates maintain secure connection. • Web server typically located in “DMZ” • Web server manages session IDs, invokes backend requests with Kerberos client utilities. • Probably would break today. [Diagram: Browser connects over HTTPS to the Web Server in the DMZ, which reaches the HPC hosts with krcp and krsh.]
Portlets and Containers • One of the problems of previous portal development is that there was no good way to share interface components. • How do developers share web interfaces? • Also, how can we avoid constantly reinventing things like login services, customization services, page organization, access controls? • Answer: use portlets and containers. • Becoming a recognized best practice for portal development because it enables distributed portal development.
What Is a Portlet? • A portlet is a piece of Java code that runs in a Web server inside a container servlet. • Portlets can do two things: • Perform non-visual operations, such as making connections to remote hosts and performing operations there. • Example: get a list of local files. • Create their display • The portlet passes its display to its parent, which is responsible for constructing the entire display. • Typically this is HTML, with tables used to organize component displays. • Other displays are possible (VoiceXML, WML).
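The division of labor described on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: `Fragment` and `PageAssembler` are hypothetical names, not Jetspeed or JSR 168 classes. Each fragment produces its own markup, and the parent places the fragments in the cells of one HTML table.

```java
import java.util.List;

// Hypothetical stand-in for a portlet: it only knows how to produce its own markup.
interface Fragment {
    String title();
    String render();
}

// Hypothetical stand-in for the parent container: it owns the page layout.
class PageAssembler {
    // Places every fragment in its own cell of a single-row HTML table.
    static String assemble(List<Fragment> fragments) {
        StringBuilder html = new StringBuilder("<table><tr>");
        for (Fragment f : fragments) {
            html.append("<td><h3>").append(f.title()).append("</h3>")
                .append(f.render()).append("</td>");
        }
        return html.append("</tr></table>").toString();
    }

    // Convenience factory for a fragment with fixed content.
    static Fragment fragment(String title, String body) {
        return new Fragment() {
            public String title() { return title; }
            public String render() { return body; }
        };
    }
}
```

A fragment never sees the whole page; swapping the table layout for another one only changes `assemble`.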
Portal Services • We had several services that we were portletizing as part of this project: • Job submission • File Transfer • Job Monitoring • We developed DOD versions of TACC’s GPIR services • We extended Jetspeed login to support web kiniting (with SecurID).
Job Submission • Primarily based at ARL • Support Fluent, ANSYS, ABAQUS • Services construct GRD scripts, allow users to run and archive jobs. • We are extending this to support ANSYS at ASC, DMEFS codes at ARL. • We need to extend script generators for other queuing systems. • PBS, LoadLeveler, LSF
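One way to extend script generation to other queuing systems is to put each system behind a common interface. The sketch below is an assumption about structure, not the project's actual code: `JobSpec`, `ScriptGenerator`, `PbsGenerator`, and `LsfGenerator` are hypothetical names, and only a few common PBS and LSF directives (job name, queue, wall time) are shown.

```java
// Hypothetical job description; the field names are illustrative.
class JobSpec {
    final String name;
    final String queue;
    final int wallMinutes;
    final String command;

    JobSpec(String name, String queue, int wallMinutes, String command) {
        this.name = name;
        this.queue = queue;
        this.wallMinutes = wallMinutes;
        this.command = command;
    }
}

// One implementation per queuing system.
interface ScriptGenerator {
    String generate(JobSpec job);
}

class PbsGenerator implements ScriptGenerator {
    public String generate(JobSpec j) {
        // PBS expects wall time as HH:MM:SS.
        String wall = String.format("%02d:%02d:00", j.wallMinutes / 60, j.wallMinutes % 60);
        return "#!/bin/sh\n"
            + "#PBS -N " + j.name + "\n"
            + "#PBS -q " + j.queue + "\n"
            + "#PBS -l walltime=" + wall + "\n"
            + j.command + "\n";
    }
}

class LsfGenerator implements ScriptGenerator {
    public String generate(JobSpec j) {
        // LSF accepts the run limit in minutes.
        return "#!/bin/sh\n"
            + "#BSUB -J " + j.name + "\n"
            + "#BSUB -q " + j.queue + "\n"
            + "#BSUB -W " + j.wallMinutes + "\n"
            + j.command + "\n";
    }
}
```

With this shape, supporting another scheduler means adding one class; the service that collects user input and archives results does not change.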
File Management • File management services allow you to • Upload, download files between desktop and remote HPC • Download entire directories as zipped files. • Delete remote files. • Navigate remote directories. • Unzip/untar remotely. • Targeting ASC and ARL initially • ARL is available in production
Job Monitoring • We have web interfaces that allow you to monitor your jobs on various hosts. • Constructs an HTML table of your running jobs in a unified format. • Allows you to stop jobs • We support GRD in production portal at ARL. • Have ported this to PBS, LSF, and LoadLeveler as part of this project.
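A unified format implies mapping each scheduler's own state codes onto one shared vocabulary. The following is a minimal sketch of that step, assuming a hypothetical `JobState` enum and `StateNormalizer` class; the codes listed are a small illustrative subset of what PBS and LSF report, not a complete table.

```java
import java.util.Map;

// Hypothetical shared vocabulary for the unified job table.
enum JobState { QUEUED, RUNNING, HELD, FINISHED, UNKNOWN }

class StateNormalizer {
    // Illustrative subset of PBS single-letter job states.
    private static final Map<String, JobState> PBS = Map.of(
        "Q", JobState.QUEUED,
        "R", JobState.RUNNING,
        "H", JobState.HELD);

    // Illustrative subset of LSF job states.
    private static final Map<String, JobState> LSF = Map.of(
        "PEND", JobState.QUEUED,
        "RUN", JobState.RUNNING,
        "PSUSP", JobState.HELD,
        "DONE", JobState.FINISHED);

    // Unrecognized schedulers or codes map to UNKNOWN rather than failing.
    static JobState normalize(String scheduler, String rawState) {
        Map<String, JobState> table;
        if ("PBS".equals(scheduler)) {
            table = PBS;
        } else if ("LSF".equals(scheduler)) {
            table = LSF;
        } else {
            return JobState.UNKNOWN;
        }
        return table.getOrDefault(rawState, JobState.UNKNOWN);
    }
}
```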
Access to Portlets • Obviously not all users have accounts at all centers. • An ASC file browser should be accessible only to users with an ASC account. • Jetspeed has role-based access control to portlets. • Each user can be assigned to one or more user roles (“ERDC”, “ASC”, etc). • This controls which portlets a user can add to his or her display.
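The role check amounts to a simple filter. The sketch below is plain Java, not Jetspeed's security API: `PortletAccess`, the portlet names, and the one-role-per-portlet registry are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PortletAccess {
    // requiredRole maps a portlet name to the single role needed to add it
    // (a hypothetical registry). Returns the portlets this user may add, sorted.
    static List<String> allowed(Map<String, String> requiredRole, Set<String> userRoles) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : requiredRole.entrySet()) {
            if (userRoles.contains(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

A user holding only the "ASC" role would see the ASC file browser in the list of addable portlets but not an ERDC-only one.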
GridPort Information Repository (GPIR) • Developed by TACC group for NPACI resources. • Porting this to DOD. • Aim is to aggregate and cache grid and portal related data from multiple sources in a uniform way. • MDS, NWS, custom data providers
GPIR Approach • GPIR is implemented as a set of Java Web Services, one to handle the input of GPIR data (Ingester WS) and another to facilitate the querying of that data (Query WS) • The Ingester WS accepts or "ingests" several types of XML documents and stores them in a relational database (currently MySQL, Postgres). • These documents are created by a variety of means, including Java clients that exist on the resources themselves, HTTP "web scraping" of machine-specific flat-file formats, and queries of additional information providers such as MDS, GMS (Grid Monitor Service), and NWS (Network Weather Service). • Persistently stored data can then be queried via the Query Web Service, which uses the same XML resources used by the Ingester, in addition to some Query-specific documents that can return XML such as Machine Summary data.
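The ingest/query split can be illustrated with an in-memory stand-in. This is not GPIR's interface: `InfoRepository` and its methods are hypothetical, a `HashMap` replaces the relational database, and XML documents are treated as opaque strings keyed by machine and document type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class InfoRepository {
    // Stand-in for the database: latest document per (machine, type).
    private final Map<String, String> store = new HashMap<>();

    private static String key(String machine, String type) {
        return machine + "/" + type;
    }

    // Ingest side: accept a document and cache it, replacing any older one.
    void ingest(String machine, String type, String xml) {
        store.put(key(machine, type), xml);
    }

    // Query side: return the cached document, if any was ingested.
    Optional<String> query(String machine, String type) {
        return Optional.ofNullable(store.get(key(machine, type)));
    }
}
```

Data providers only ever call `ingest` and portal code only ever calls `query`, so a slow or unavailable provider does not block page rendering; the portal shows the last cached document.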
GPIR Schema Types • Static: static data for a machine. • Load: load data for a machine. • Status: machine status (up, down, unavailable). • Downtime: downtime data for a machine. • Jobs: job data for a machine. • MOTD: Message of the Day data for a machine. • Nodes: Nodes data for a machine. • Services: represents the status of grid software running on a system. • NWS: This returns bandwidth and latency measurements of the type returned by NWS.
Where Are We Today? • JSR 168 is an important new standard for Java portals. • Standardizes portlet containers • Commercial products available from Sun, IBM, Oracle, BEA • Open source implementations include uPortal, GridSphere, eXo, Jetspeed2 • Web Services for Remote Portlets (WSRP) also available. • A standard compatible with JSR 168 that uses SOAP and WSDL to communicate between portlet container and portlet. • Potentially allows containers and portlets to be written in different languages (Java, C#, Python, PHP).
Why Is This Important? • HPCMP developers generate extremely sophisticated web interfaces and services to science and engineering applications. • These should be portlets • Things like login, user display layout managers, access controls to content, etc., should not be reinvented. • There are plenty of portal projects that have done this. • These are portlet containers. • You should not reinvent this. • By adopting the portlet/container approach, you can also create portals that combine HPCMP-specific content with third-party portlets. • Calendars, RSS news feeds, document managers, wiki-like features, etc.
What is JSR 168? • Defines the (Java) standard for portlet components that are independent of any vendor’s container. • Portlets can be developed independently of the container. • Many implementations: • Gridsphere, uPortal, WebSphere, Jetspeed2, …. • From the portlet development point of view, it is really very simple: • You write a Java class that extends GenericPortlet. • You override/implement several methods inherited from GenericPortlet. • You use some supporting classes/interfaces • Many are analogous to their servlet equivalents • Some (PortletSession) actually seem to be trivial wrappers around servlet equivalents in Pluto.
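The "extend a base class and override a few methods" pattern can be shown without the portlet API on the classpath. The sketch below is a simplified stand-in, not the real thing: `SketchPortlet` and its `Mode` enum are hypothetical, whereas the actual base class is `javax.portlet.GenericPortlet`, whose `doView`, `doEdit`, and `doHelp` methods take request and response objects rather than returning strings.

```java
// Hypothetical, simplified stand-in for javax.portlet.GenericPortlet.
abstract class SketchPortlet {
    enum Mode { VIEW, EDIT, HELP }

    // Template method: the container calls render, and the base class
    // dispatches to the handler for the current mode.
    final String render(Mode mode) {
        switch (mode) {
            case EDIT: return doEdit();
            case HELP: return doHelp();
            default:   return doView();
        }
    }

    // Subclasses must supply a view; the other modes have simplified defaults.
    protected abstract String doView();
    protected String doEdit() { return "<p>Edit mode not supported</p>"; }
    protected String doHelp() { return "<p>Help mode not supported</p>"; }
}

// What a portlet author writes: one subclass, one overridden method.
class HelloPortlet extends SketchPortlet {
    @Override
    protected String doView() {
        return "<p>Hello from a portlet</p>";
    }
}
```

Against the real API the subclass has the same shape: it extends `GenericPortlet`, overrides `doView`, and writes markup to the response's writer.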
The Infamous Big Picture • As a portlet developer, the previous set of classes is all you normally touch. • The portlet container (such as Pluto or Gridsphere) is responsible for running your portlets. • Init, invoke methods, destroy. • Portlets have a very limited way of interacting with the container. • It is a black box. • The API is basically one-way.
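The "init, invoke methods, destroy" sequence can be made concrete with a toy container. This is a sketch under assumptions: `MiniPortlet` and `MiniContainer` are hypothetical names, and the real container contract (the `javax.portlet.Portlet` interface with `init`, `processAction`, `render`, and `destroy`) passes configuration, request, and response objects that are omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, reduced lifecycle interface.
interface MiniPortlet {
    void init();
    String render();
    void destroy();
}

// Toy container: it decides when each lifecycle method runs.
class MiniContainer {
    // Drives one portlet through its whole lifecycle and records every call.
    static List<String> run(MiniPortlet portlet, int requests) {
        List<String> log = new ArrayList<>();
        portlet.init();
        log.add("init");
        for (int i = 0; i < requests; i++) {
            log.add("render:" + portlet.render());
        }
        portlet.destroy();
        log.add("destroy");
        return log;
    }
}
```

The portlet never calls the container back in this sketch, which mirrors the one-way relationship noted on the slide.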
Some Suggestions • Java-based systems should adopt JSR 168 based portlet containers. • All components should be at least Web Service “ready” • Use WSDL to define APIs for services (and bind later to specific programming implementations) • Make a clean separation between portals and services. • “Service oriented approach” will allow both desktop GUIs and portals to use the same services. • Monitor WSRP as a way to build multilingual portals.
More Suggestions • Portal/service teams have to be first-class members of the MSRC infrastructure. • Need to know about HPC turnover well in advance. • Command line login changes need to be accompanied by simultaneously approved Web logins. • Need more uniformity in MSRCs • TeraGrid’s CTSS “common software stack” as a possible model.
Telescoping Portal Architecture [Diagram labels: Job Submit, File Manage, Security]