
Tomasz Haupt Northeast Parallel Architectures Center, Syracuse University

Constructing Web Portals for Computational Communities


Presentation Transcript


  1. Constructing Web Portals for Computational Communities. Tomasz Haupt, Northeast Parallel Architectures Center, Syracuse University

  2. Overview
     • What is a Web Portal?
     • Web Portal Architecture
     • Distributed Components: WebFlow
     • Interfaces:
       • Task Descriptor
       • Grid Interface
     • Portal Security
     • Summary

  3. What is a Web Portal? Customizable access to information and services

  4. Portal is not a static web page
     • It relies on sophisticated browser technology
       • DHTML, JavaScript, cookies, applets, …
     • Server-side processing
       • cgi-bin, servlets, ASP, JSP, server-side includes, XML
       • search engines, mail servers, calendar, ...
     • Back End
       • databases
       • credit card processing
       • external services: news, weather, stock quotes, ...

  5. Computational Portals
     • Goal: provide a problem-oriented interface (a Web portal) that lets users exploit HPC resources more effectively from the desktop, through a Web browser.
     • This “point & click” view hides the underlying complexity and details of the HPC resources and creates a seamless interface between the user’s problem description on his/her desktop system and the heterogeneous computing resources.
     • These HPC resources include supercomputers, mass storage systems, databases, workstation clusters, instruments, and visualization servers.

  6. Example: Nanomaterials Research
     [Diagram labels: data repository; select; edit; Gaussian; convert; GAMESS; convert; QS modules]
     Features: data-flow computations, user-supplied modules, seamless access to a heterogeneous mixture of computational resources, seamless data transfer, visualizations, data management.
     Goals: automate the task, maximize throughput.

  7. Example: LMS (Landscape Management System)
     [Diagram labels: DEM; Land Use; Soil Texture; Vegetation; WMS; EDYS; CASC2D]
     Features: access to remote data (distributed databases, Internet repositories), data pre- and postprocessing, tightly coupled applications running on remote hosts, visualizations.
     Goal: a decision support system available anytime, anywhere.
     WMS: Watershed Modeling System. EDYS: vegetation model. CASC2D: watershed model.

  8. Example: Gateway System (Problem Solving Environment)
     [Diagram labels: Problem Description (new, select, arch); Resources (software, hardware); templates, visualization tools; Input files; Output files; Resource Allocation]
     Features: access to remote data, data pre- and postprocessing, applications running on remote hosts, visualizations, archiving.
     Goal: guide the user to select software, generate input files, submit jobs, and analyze data; hide the complexity and details of a heterogeneous back end.

  9. Design Issues
     • Support for seamless access (security)
     • Support for distributed, heterogeneous, independently managed Back-End services (HPCC, DBMS, Internet, ...)
     • Variable pool of resources: support for discovery and dynamic incorporation into the system
     • Scalable, extensible, low-maintenance Middle Tier
     • Web-based, extensible, customizable Front End that adjusts itself to the varying capacities and capabilities of clients (humans, software and hardware)
     • Access to desktop applications

  10. Towards the solution ...
      • Problem description (physics, chemistry, ...)
      • Task description: I need 64 nodes of SP-2 at Argonne to run my MPI-based executable “a.out”, which you can find in “/tmp/users/haupt” on marylin.npac.syr.edu. In addition, I need any idle workstation with jdk1.1 installed. Make sure that the output of my a.out is transferred to that workstation.
      • Middle Tier: map the user’s task description onto the resource specification; this may include resource discovery and other services
      • Resource Specification, Control Access, Events
      • Resource Allocation: run, transfer data, run

  11. Three Tier System
      • Front End: tools to select or specify the problem to solve
        (Task Descriptor)
      • Middle Tier: translates the user task into resource requests
        (Resource Descriptor)
      • Back End: resources and data to execute the task

  12. Abstract Application Descriptor (AAD)
      • “man pages written in XML”
      • specifies how to install and run the application on different hosts [current status of Gateway]
      • describes requirements, input and output data, options, arguments, etc.
      • to submit a job, it must be reduced to a job descriptor (select host, options, input data, …)
      More on AAD: http://www.npac.syr.edu/users/haupt/WebFlow/MODULES/AAD.html

  13. Reducing AAD to JD (Abstract Application Descriptor to Job Descriptor)
      [Diagram labels: AAD; select host, select options, select input, ...; JD; Generate batch script; Generate RSL; submit; GUI; Data Flow manager; Problem Solving Environment; Resource broker; Information service (MDS); Condor; JINI]

  14. Example Job Descriptor (with selected application, host and I/O files)
      <?xml version="1.0"?>
      <!DOCTYPE application SYSTEM "ApplDescV2.dtd">
      <application id="Casc2d" installable="No">                  <!-- selected application -->
        <target id="aga.npac.syr.edu">                            <!-- selected host -->
          <status installed="Yes"/>
          <installed>
            <CmdLine command="/npac/home/haupt/CASC2D/casc2d"/>   <!-- how to run it -->
            <input>
              <inFile Path="/npac/home/haupt/CASC2D/lms/" Name="sand.map"/>               <!-- it expects this input file -->
              <source Host="maine.npac.syr.edu" Path="C:\LMS\fromEdys\" Name="S.map"/>    <!-- actual location of the file -->
            </input>
            <output>
              <outFile Path="/npac/home/haupt/CASC2D/lms/" Name="sed.out"/>               <!-- it generates this output file -->
              <dest Host="maine.npac.syr.edu" Path="C:\LMS\toEdys\" Name="sed.out"/>      <!-- store it there -->
            </output>
            <!-- save stdout and stderr -->
            <stdout Host="aga.npac.syr.edu" Path="/npac/home/haupt/CASC2D/history/" Name="job2001.out"/>
            <stderr Host="aga.npac.syr.edu" Path="/tmp/" Name="haupt_job2001.err"/>
          </installed>
        </target>
      </application>
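A job descriptor like the one on this slide can be read with any XML parser. The following is a minimal sketch in Python, assuming the descriptor is well-formed (empty elements closed with `/>`, XML declaration and DOCTYPE omitted for brevity). The names `location` and `summarize_job` and the keys of the returned dictionary are illustrative and not part of Gateway.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's descriptor; backslashes are doubled for Python.
JOB_DESCRIPTOR = """\
<application id="Casc2d" installable="No">
  <target id="aga.npac.syr.edu">
    <status installed="Yes"/>
    <installed>
      <CmdLine command="/npac/home/haupt/CASC2D/casc2d"/>
      <input>
        <inFile Path="/npac/home/haupt/CASC2D/lms/" Name="sand.map"/>
        <source Host="maine.npac.syr.edu" Path="C:\\LMS\\fromEdys\\" Name="S.map"/>
      </input>
      <output>
        <outFile Path="/npac/home/haupt/CASC2D/lms/" Name="sed.out"/>
        <dest Host="maine.npac.syr.edu" Path="C:\\LMS\\toEdys\\" Name="sed.out"/>
      </output>
    </installed>
  </target>
</application>
"""


def location(element):
    """Join the Path and Name attributes of a file element."""
    return element.get("Path") + element.get("Name")


def summarize_job(xml_text):
    """Extract host, command and file staging steps from a job descriptor."""
    root = ET.fromstring(xml_text)
    target = root.find("target")
    installed = target.find("installed")
    # Each <input> pairs the file the application expects with its actual source.
    stage_in = [
        (inp.find("source").get("Host"),
         location(inp.find("source")),
         location(inp.find("inFile")))
        for inp in installed.findall("input")
    ]
    # Each <output> pairs the generated file with the place to store it.
    stage_out = [
        (location(out.find("outFile")),
         out.find("dest").get("Host"),
         location(out.find("dest")))
        for out in installed.findall("output")
    ]
    return {
        "application": root.get("id"),
        "host": target.get("id"),
        "command": installed.find("CmdLine").get("command"),
        "stage_in": stage_in,
        "stage_out": stage_out,
    }


summary = summarize_job(JOB_DESCRIPTOR)
```

A middle-tier component could use such a summary to copy the input file to the selected host, run the command, and move the output to its destination.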

  15. Simple job object (atomic task)
      [Diagram labels: AAD; run(); success; failure]
      • “input port”: method to be invoked
      • “output port”: event fired

  16. Complex Tasks
      [Diagram: six job objects, each with a run() input port and success and failure output events, wired together into a complex task]
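The job-object model of the last two slides (an input port is a method to invoke, an output port is an event that is fired) can be sketched in a few lines. This is an illustration only: the class `Job`, its `on` method, and the job names are assumptions, not WebFlow code.

```python
class Job:
    """Atomic task: run() is the input port, success/failure are output ports."""

    def __init__(self, name, action):
        self.name = name
        self.action = action
        self.listeners = {"success": [], "failure": []}

    def on(self, event, callback):
        """Connect an output port (event) to a callable, e.g. another job's run."""
        self.listeners[event].append(callback)

    def run(self):
        try:
            self.action()
            event = "success"
        except Exception:
            event = "failure"
        # Fire the event: invoke everything connected to this output port.
        for callback in list(self.listeners[event]):
            callback()


def broken():
    raise RuntimeError("solver crashed")


log = []
generate = Job("generate", lambda: log.append("generate"))
simulate = Job("simulate", lambda: log.append("simulate"))
failing = Job("failing", broken)

# A complex task is built by wiring output ports to input ports.
generate.on("success", simulate.run)
simulate.on("success", failing.run)
failing.on("failure", lambda: log.append("cleanup"))

generate.run()
```

After `generate.run()`, the log records the two successful steps followed by the clean-up triggered by the failing job.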

  17. Task Descriptor
      • A computational task requested by the user may involve many steps.
      • Some steps can be performed concurrently, but typically there are data dependencies that force execution of the steps in some particular order.
      • Tasks can be defined recursively.
      • A task may specify resources explicitly, or provide requirements and/or preferences, leaving the selection of resources to the discretion of a resource broker.

  18. ATD.dtd
      <!ELEMENT Task (TaskName, (Task|connection)*, InputPort+, OutputPort+)>
      <!ELEMENT TaskName EMPTY>
      <!ATTLIST TaskName name CDATA #REQUIRED
                         descriptor CDATA #IMPLIED>
      <!ELEMENT connection (output+,input+)>
      <!ELEMENT output EMPTY>
      <!ATTLIST output task CDATA #REQUIRED
                       event CDATA #IMPLIED>
      <!ELEMENT input EMPTY>
      <!ATTLIST input task CDATA #REQUIRED
                      method CDATA #IMPLIED>
      <!ELEMENT InputPort EMPTY>
      <!ATTLIST InputPort task CDATA #REQUIRED>
      <!ELEMENT OutputPort EMPTY>
      <!ATTLIST OutputPort task CDATA #REQUIRED>

      Example Task Descriptor
      <Task>
        <TaskName name="ComplexTask"/>
        <Task>
          <TaskName name="atomic_task1" descriptor="task1.xml"/>
          <InputPort method="run"/>
          <OutputPort event="done"/>
        </Task>
        <Task>
          <TaskName name="atomic_task2" descriptor="task2.xml"/>
          <InputPort method="run"/>
          <OutputPort event="done"/>
        </Task>
        <connection>
          <output task="atomic_task1"/>
          <input task="atomic_task2"/>
        </connection>
        <InputPort task="atomic_task1"/>
        <OutputPort task="atomic_task2"/>
      </Task>
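Because connections express data dependencies between steps, an execution order can be derived from a task descriptor by a topological sort. The sketch below assumes a well-formed descriptor in which connections refer to tasks by their `TaskName`; the function name `execution_order` is hypothetical and the sketch handles one level of nesting only.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+

TASK_DESCRIPTOR = """\
<Task>
  <TaskName name="ComplexTask"/>
  <Task>
    <TaskName name="atomic_task1" descriptor="task1.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <Task>
    <TaskName name="atomic_task2" descriptor="task2.xml"/>
    <InputPort method="run"/>
    <OutputPort event="done"/>
  </Task>
  <connection>
    <output task="atomic_task1"/>
    <input task="atomic_task2"/>
  </connection>
  <InputPort task="atomic_task1"/>
  <OutputPort task="atomic_task2"/>
</Task>
"""


def execution_order(xml_text):
    """Return the sub-task names in an order that respects all connections."""
    root = ET.fromstring(xml_text)
    # Map each direct sub-task to the set of tasks it depends on.
    graph = {
        task.find("TaskName").get("name"): set()
        for task in root.findall("Task")
    }
    for connection in root.findall("connection"):
        producers = [out.get("task") for out in connection.findall("output")]
        for consumer in connection.findall("input"):
            graph.setdefault(consumer.get("task"), set()).update(producers)
    return list(TopologicalSorter(graph).static_order())


order = execution_order(TASK_DESCRIPTOR)
```

Steps with no dependency between them may appear in either order in the result, which corresponds to the steps that can be performed concurrently.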

  19. How are task descriptors generated?
      • Predefined (“set of scenarios”)
      • Created interactively by the user using Front End tools
      • Generated by middle-tier components

  20. LMS Front End
      Navigate and choose an existing application to solve the problem at hand. Import all necessary data.
      [Screenshot labels: Select host; Select model; Set parameters; Run; Retrieve data; Pre/post-processing; Run simulations]

  21. QS Front End: Data-Flow Front End. Compose your application interactively from pre-existing modules.

  22. Building an application
      • Front End: Applet. A visual representation is converted into an XML document.
      • Middle Tier: Web Server; XML service (parse, save); ApplContext.
      • Publishes IOR. Generates Java code to add modules to ApplContext.

  23. Gateway Front End
      • Gateway Navigator: where do you want to go today?
      • Control applet: File Access, Job monitor
      • Define the system you are interested in

  24. Middle Tier: WebFlow Server (CORBA-based distributed components)
      [Diagram labels: User 1; User 2; Application 1; Application 2; App 1; App 2; WebFlow Services]
      • The WebFlow server is a hierarchy of containers (contexts) and components.
      • The server is the root context.
      • A context
        • knows its location in the hierarchy
        • has attributes
        • maintains a persistent state
        • controls the life cycle of its children
        • is responsible for intercomponent communication (events)
        • can be specialized by adding services (WebFlow modules)

  25. [Diagram labels: Clients; Download Applet; Web Server; Master WebFlow Server; “slave” WebFlow Server; Distributed Middleware; WebFlow Context Proxies; Task Descriptor; Grid Interface; JDBC; Information Services; Distributed Back-End Resources]

  26. Example: Gateway Components
      • Using the PSE (Front End), the user defines a problem
      • A session is an instance of the problem (an attempt to solve it)
      • A session comprises jobs
      • The session context reflects the structure of the ATD
      • The session context can submit itself
      Context hierarchy: User Context, Problem Context, Session Context, Job Context
      Job context: application descriptor, job id, date submitted, completed, input file(s), output file(s)

  27. WebFlow Events
      [Diagram labels: Client; Context 1; module A; Event e; Context 2; module B; Method m; Dynamic Interfaces]
      • The EventAdapter uses CORBA DSI and DII.
      • Module A does not care who is expecting the event; the method fireEvent invokes a method of its parent context.
      • Method m is a public method: anyone can invoke it, including the Event Adapter of Context 1. No protection against misuse!
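The event mechanism on this slide can be illustrated without CORBA. In the sketch below, Python's `getattr` stands in for dynamic invocation (DII/DSI), and a single context plays the role of both Context 1 and Context 2. The class names and the `connect`, `dispatch` and `fire_event` methods are assumptions made for the illustration, not the WebFlow API.

```python
class Context:
    """Parent container that routes events between its modules."""

    def __init__(self):
        # (id of source module, event name) -> list of (target, method name)
        self.routes = {}

    def connect(self, source, event, target, method):
        self.routes.setdefault((id(source), event), []).append((target, method))

    def dispatch(self, source, event, payload):
        for target, method in self.routes.get((id(source), event), []):
            # Look the method up by name at run time, as a dynamic interface would.
            getattr(target, method)(payload)


class Module:
    def __init__(self, context):
        self.context = context

    def fire_event(self, event, payload):
        # The module only notifies its parent context; it does not know
        # who, if anyone, is listening.
        self.context.dispatch(self, event, payload)


class Consumer(Module):
    def __init__(self, context):
        super().__init__(context)
        self.received = []

    def m(self, payload):
        # Public method: any caller can invoke it, as the slide warns.
        self.received.append(payload)


context = Context()
a = Module(context)
b = Consumer(context)
context.connect(a, "e", b, "m")
a.fire_event("e", "data ready")
```

Note that `b.m` can also be called directly by any other code, which is the lack of protection the slide points out.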

  28. Middle Tier Components
      [Diagram labels: Task Specification; User Context: credentials (proxy), profile; Session Context; Job objects; PSE support; context lifecycle; access control; component container; XML parser; data analysis; resource broker; multi-disciplinary task control; data flow manager; NetSolve Linear Algebra proxy; batch script generator; file access & transfer; information services; job control; archivization; database access; Resource Specification]

  29. Grid Interface
      • How to hide complexity and details of Back End resources?
      • Example: JDBC
      [Diagram labels: Front End Application; Middle Tier: Servlet, Back-End independent business logic, java.sql Driver Manager; Oracle, Sybase and mSQL drivers; Back End: Oracle, Sybase, mSQL]

  30. JDBC model to provide access to computational resources
      [Diagram labels: Front End Application; Servlet; Back-End independent business logic; Grid Interface; GRAM; PBS, NQS, CONDOR; O2K, SP2, NOW]
      • Grid Interface: access control, allocations, resource look-up, discovery, (co)allocation, monitoring, QoS, fault tolerance, services, events, ...
      • Addressed by Grid Forum. Approximated by Globus.
      • Portal: builds on top of it, implementing proxies
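The JDBC analogy suggests a driver manager that picks a scheduler-specific driver while the business logic stays back-end independent. The sketch below is a toy version of that pattern: the URL schemes (`pbs://`, `condor://`), the host name, and the classes `SchedulerDriver`, `EchoDriver` and `DriverManager` are all hypothetical, and the drivers only return a description string instead of contacting a real scheduler.

```python
from abc import ABC, abstractmethod


class SchedulerDriver(ABC):
    """Back-end specific part, analogous to a JDBC driver."""

    @abstractmethod
    def accepts(self, url):
        """Return True if this driver can handle the resource URL."""

    @abstractmethod
    def submit(self, url, command):
        """Submit the command to the resource and return a receipt."""


class EchoDriver(SchedulerDriver):
    """Stand-in driver that only describes what it would submit."""

    def __init__(self, scheme):
        self.scheme = scheme

    def accepts(self, url):
        return url.startswith(self.scheme + "://")

    def submit(self, url, command):
        host = url.split("://", 1)[1]
        return f"{self.scheme} job on {host}: {command}"


class DriverManager:
    """Back-end independent entry point, analogous to java.sql.DriverManager."""

    def __init__(self):
        self._drivers = []

    def register(self, driver):
        self._drivers.append(driver)

    def submit(self, url, command):
        for driver in self._drivers:
            if driver.accepts(url):
                return driver.submit(url, command)
        raise LookupError(f"no driver registered for {url}")


manager = DriverManager()
manager.register(EchoDriver("pbs"))
manager.register(EchoDriver("condor"))
receipt = manager.submit("condor://now.example.org", "a.out")
```

The calling code never names a scheduler; supporting another back end means registering one more driver.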

  31. Example of a proxy module
      [Diagram labels: Generate Data; Run Job; Analyze]
      GRAM resource description:
        &(rsl_substitution = (MYDIR "/tmp/haupt")
                             (DATADIR $(MYDIR)/data)
                             (EXECDIR $(MYDIR)/bin))
         (executable = $(EXECDIR)/a.out)
         (arguments = $(DATADIR)/file1)
         (stdout = $(MYDIR)/result.dat)
         (count = 1)
      The Run Job module is a proxy module. It generates the RSL on the fly and submits the job for execution using the globusrun function. The module has access to a job descriptor.
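Generating RSL on the fly amounts to assembling a string from the fields of the job descriptor. The helper below is a sketch under that assumption: `make_rsl` and its parameters are hypothetical names, substitution values are emitted verbatim (so the caller supplies any quoting), and the result is not validated against an RSL parser.

```python
def make_rsl(executable, arguments=(), stdout=None, count=1, substitutions=()):
    """Assemble a GRAM-style RSL string from job fields.

    substitutions is a sequence of (name, value) pairs written out as given.
    """
    parts = []
    if substitutions:
        pairs = "".join(f"({name} {value})" for name, value in substitutions)
        parts.append(f"(rsl_substitution = {pairs})")
    parts.append(f"(executable = {executable})")
    if arguments:
        parts.append("(arguments = " + " ".join(arguments) + ")")
    if stdout is not None:
        parts.append(f"(stdout = {stdout})")
    parts.append(f"(count = {count})")
    return "&" + "".join(parts)


rsl = make_rsl(
    executable="$(EXECDIR)/a.out",
    arguments=["$(DATADIR)/file1"],
    stdout="$(MYDIR)/result.dat",
    count=1,
    substitutions=[
        ("MYDIR", '"/tmp/haupt"'),
        ("DATADIR", "$(MYDIR)/data"),
        ("EXECDIR", "$(MYDIR)/bin"),
    ],
)
```

The resulting string carries the same relations as the slide's resource description and is what a proxy module would hand to the submission step.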

  32. Security: Issues
      • Front End (Applet or Application): connection through the open Internet
      • Layer 1: Secure Web; access control
      • Layer 2: Secure Middle Tier; Gatekeeper; access control and delegation
      • Layer 3: Secure access to resources; HPCC resources; policies defined by resource owners
      • The same or a different security domain

  33. Security (2)
      • Different model than most commercial solutions:
        • charge for service and not CPU used to render the service
        • identify by credit card number
      • Three-tier architecture:
        • delegation of credentials

  34. Security: CORBA security service

  35. Kerberos/SecurID
      [Diagram labels: Front End: applet, Download frameset, C:\>kinit, C:\>krsh; Web Server; Middle Tier: Master WebFlow Server, ORB, SECIOP; Back End: Slave WebFlow Server, ORB, krsh (with forwardable ticket)]

  36. [Diagram labels: Front End: applet, Download frameset, https; Web Server, Servlets, SSL; Middle Tier: Master WebFlow Server, ORB, IIOP; Back End: Slave WebFlow Server, ORB, Globus GSSAPI]

  37. Proxy Objects
      • The master creates and maintains proxies for each component, to forward requests from the Web client to remote objects
      • Proxies simplify the association of the distributed components
      • They enable communication between the client and the slave servers running on different hosts
      • They can log, track and filter all messages between components in the system, to implement fault tolerance, security and transaction monitors
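The forwarding, logging and filtering roles of a proxy can be sketched with ordinary attribute interception. The example is a local illustration only: `LoggingProxy`, `RemoteJobService` and the method names are hypothetical, and the "remote" object here lives in the same process rather than behind an ORB.

```python
class LoggingProxy:
    """Forward selected method calls to a target, logging each one."""

    def __init__(self, target, allowed, log):
        self._target = target
        self._allowed = set(allowed)
        self._log = log

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself.
        if name not in self._allowed:
            raise PermissionError(f"method {name!r} is not exposed by this proxy")
        method = getattr(self._target, name)

        def forward(*args, **kwargs):
            self._log.append((name, args))  # tracking hook
            return method(*args, **kwargs)

        return forward


class RemoteJobService:
    """Stand-in for a component running on a slave server."""

    def submit(self, job):
        return f"submitted {job}"

    def shutdown(self):
        return "server stopped"


log = []
proxy = LoggingProxy(RemoteJobService(), allowed={"submit"}, log=log)
result = proxy.submit("job2001")
```

Because every request passes through the proxy, the same hook that records calls could also reject, retry or redirect them.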

  38. [Diagram labels: Clients; Download Applet; Web Server; Master WebFlow Server; “slave” WebFlow Server; Distributed Middleware; WebFlow Context Proxies; Task Descriptor; Grid Interface; JDBC; Information Services; Distributed Back-End Resources]

  39. Emerging Object Web Multi-Server Model
      [Diagram labels: Clients and their servers; Middle Tier Custom Servers; Back End Servers and their services]

  40. Summary
      • We build Web Portals using new, emerging, often ephemeral technologies and standards
      • What will survive?
        • Multitier architectures
        • Distributed Components
        • XML to define interfaces
        • Metadata (UML, XMI, …)
      • What next?
        • Developer tools for enterprise servers
        • ASP: Application Service Providers

  41. Summary (2)
      • Academic example: WebFlow
        • Gateway, LMS, GEM, NCSA
      • We extended the notion of a Web Portal
        • support for HPCC
        • Grid Interface (a new CORBA facility?)
        • Abstract Task Descriptor in XML
        • Middle-Tier components as proxies
      • To be added: proxy communication channels
