The Gateway System

The Gateway System. This project is a collaborative effort between the Northeast Parallel Architectures Center (NPAC), the Ohio Supercomputer Center (OSC), and the Aeronautical Systems Center (ASC) MSRC.

stamos

Presentation Transcript


  1. The Gateway System. This project is a collaborative effort between the Northeast Parallel Architectures Center (NPAC), the Ohio Supercomputer Center (OSC), and the Aeronautical Systems Center (ASC) MSRC. T. Haupt

  2. Overview

  3. Goals • To provide a problem-oriented interface that lets users utilize HPC resources more effectively from the desktop via a Web browser. • This “point & click” view hides the underlying complexities and details of the HPC resources and creates a seamless interface between the user’s problem description on his/her desktop system and the heterogeneous computing resources. • These HPC resources include supercomputers, mass storage systems, databases, workstation clusters, collaborative tools, and visualization servers.

  4. Seamless Access • Create an illusion that all resources needed to complete the user's tasks are available locally. • In particular, an authorized user can allocate the resources she needs without an explicit login to the host controlling the resources. • An analogy: an NFS-mounted disk or a network printer.

  5. Three-Tier Architecture • Tier 1 is a high-level, browser-based Front End for visual programming (including selection of applications, generation of input data sets, specification of resources, post-processing and visualizations). • Distributed, object-based, scalable, and reusable Web server, Object broker and Resource Manager Middleware forms Tier 2. • Back-End services comprise Tier 3. [Diagram labels: Front-End, Services, User Modules, Back-End Resources]

  6. Towards a complete solution ... • Problem description: I need to model the surface damage due to the impact of a laser used to harden the material bulk. I need access to models including material bulk properties and interaction with intense electromagnetic fields. • Task description: I need 64 nodes of the SP-2 at Argonne to run my MPI-based executable “a.out”, which you can find in “/tmp/users/haupt” on marylin.npac.syr.edu. In addition, I need any idle workstation with jdk1.1 installed. Make sure that the output of my a.out is transferred to that workstation. • Middle-Tier: map the user’s task description onto the resource specification; this may include resource discovery and other services. • Resource Specification • Resource Allocation: run, transfer data, run
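
The mapping from a task description to concrete resources can be pictured as simple match-making. The following is a minimal Python sketch, not the Gateway implementation: the `Resource` and `Request` classes, the `match` function, and the `example.org` host names are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str                 # hypothetical host name
    kind: str                 # "mpp" or "workstation" (illustrative categories)
    free_nodes: int           # nodes currently idle
    software: set = field(default_factory=set)


@dataclass
class Request:
    kind: str
    nodes: int
    software: set = field(default_factory=set)


def match(request, pool):
    """Return the first resource that satisfies the request, or None."""
    for res in pool:
        if (res.kind == request.kind
                and res.free_nodes >= request.nodes
                and request.software <= res.software):
            return res
    return None


# A made-up resource pool standing in for the result of resource discovery.
pool = [
    Resource("sp2.example.org", "mpp", 128, {"mpi"}),
    Resource("ws1.example.org", "workstation", 0, {"jdk1.1"}),  # busy
    Resource("ws2.example.org", "workstation", 1, {"jdk1.1"}),  # idle
]

compute = match(Request("mpp", 64, {"mpi"}), pool)
viewer = match(Request("workstation", 1, {"jdk1.1"}), pool)

# Resource allocation as on the slide: run, transfer data, run.
plan = [
    ("run", compute.name),
    ("transfer", compute.name, viewer.name),
    ("run", viewer.name),
]
```

The user only states requirements (64 MPI nodes, an idle workstation with jdk1.1); which hosts end up in `plan` is decided by the middle tier.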

  7. PSE Example: CCM 1. Enter the Gateway system 2. Define your problem 3. Identify resources (software and hardware) 4. Create input file 5. Run your application 6. Analyze results Ken Flurchick, http://www.osc.edu/~kenf/theGateway

  8. Target Architecture • Problem Solving Environment: CTA-specific knowledge databases, Visual Authoring Tools, User and Group Profiles, Resource Identification and Access, Visualizations, Collaboration • Abstract Task Specification • Middle-Tier: WebFlow • Resource Specification • Back-End Resources

  9. Design Issues • Support for seamless access (security) • Support for distributed, heterogeneous Back-End services (HPCC, DBMS, Internet, ...) managed independently from Gateway • Variable pool of resources: support for discovery and dynamic incorporation into the system • Scalable, extensible, low-maintenance Middle Tier • Web-based, extensible, customizable front end that self-adjusts to the varying capacities and capabilities of clients (humans, software and hardware)

  10. Gateway Implementation • Distributed, object-oriented middle tier • CORBA objects (Gateway Containers, Gateway Modules and Gateway Services) implemented in Java. [Scalable, extensible, low-maintenance middle tier] • Containers define the user environment. • Modules and Services serve as proxies: they accept the user requests (Front End) and delegate them to the Back End. [Support for distributed, heterogeneous back-end services managed independently from Gateway] Note: modules can be implemented in C++ and can also be DCOM components.

  11. Gateway Implementation (2) • Gateway operates in a Kerberized environment [Support for seamless access] • Tickets are generated on the client side • Kerberos-based CORBA security service is used to manage the user sessions • Globus GSSAPI implemented over Kerberos is used for resource allocation

  12. Gateway Implementation (3) • Task Specification is expressed in XML • CTA-independent • Decouples implementation of the Front End and the Middle Tier • Allows for an abstract (platform-independent) task specification, and thus the Middle Tier may act as a resource broker • Resource Specification is expressed in XML • Simplifies match-making and resource discovery • Simplifies generating Globus RSL on the fly [Support for distributed, heterogeneous Back-End services; Variable pool of resources; Scalable, extensible, low-maintenance Middle Tier]
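
To illustrate why an XML specification makes RSL generation straightforward, here is a minimal Python sketch. The XML layout (`<task>` with one child element per attribute) is a hypothetical schema, not the Gateway's actual format; the output follows the basic `&(attribute=value)` conjunction form of Globus RSL, using the commonly documented attributes `executable`, `count` and `jobType`, and has not been checked against a real Globus installation.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource specification; element names are assumptions
# chosen to mirror RSL attribute names one-to-one.
TASK_XML = """
<task>
  <executable>/tmp/users/haupt/a.out</executable>
  <count>64</count>
  <jobType>mpi</jobType>
</task>
"""


def task_to_rsl(xml_text):
    """Turn each child element of <task> into one RSL relation."""
    task = ET.fromstring(xml_text.strip())
    relations = [f"({child.tag}={child.text.strip()})" for child in task]
    return "&" + "".join(relations)


rsl = task_to_rsl(TASK_XML)
print(rsl)
```

Because the translation is a flat walk over the XML tree, the same specification could in principle be rendered for a different back end by swapping only this function.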

  13. Gateway Implementation (4) • Component-based Front-End [extensible] • Front-End Components (“toolbox interfaces”) are applets (interfaces for common services) or XML pages or frames [Web-based, extensible, customizable, self-adjusting] • All components (Front End, Middle-Tier) are defined in XML and contain metadata (used for component mining)

  14. Front End

  15. CTA-specific knowledge database • Subject of Ken’s talk • Requires server-side support (both the middle tier and the back end) through well-defined interfaces • Should be constructed from reusable or cloneable components • Allows for identification of the software components best suited to solve the problem at hand

  16. Visual Authoring Tools • Allows for composition of the computational task from components (reusable modules) • Different tools to support various programming models such as data parallel, task parallel, data flow, object oriented • No assumption on granularity • Metadata about components and support for archiving and mining the components • Support for instrumentation and steering

  17. Example: Data Flow

  18. Example: DARP

  19. User and Group Profile • Controls the user/group environment (file access, job monitoring, ...) • Allows for customization (preferences, users with disabilities, ...) • History of actions • Scientific notebook

  20. Resource Identification and Access • Computational resources (hardware, software, licenses; desktop applications) • Data (file systems, mass storage, distributed databases; Internet data repositories) • Networks

  21. Visualizations, Collaboration, ... • Baker/Clarke’s talk on SciVis • Geoffrey’s talk on Collaborative Tools • Support for “streaming” applications as components • Support for heterogeneous hardware (capabilities/capacities)

  22. Front-End infrastructure

  23. Front-End Support • Portal Page • User Context • Control Applet • Navigator (extensible, customizable) • PSE-specific toolboxes: a placeholder for the Problem Description toolboxes; a placeholder for the code toolbox; Resource request toolbox; Data postprocessing toolbox; Other (Collaboration, Visualizations, …)

  24. Portal Page • Provides initial access to the Gateway. • After mutual authentication of the user and the Gateway server, creates a user context, and returns a (signed) control applet.

  25. User Context • Represents a Gateway session. • The session is associated with a user (or group) profile. • WebFlow extends the notion of the UNIX profile via the “User Data Base” (UDB). The UDB contains information about submitted jobs, the history of the user's actions, and other user state information. The user context may also contain application/front-end specific information.

  27. Control Applet • The control applet is responsible for maintaining the session and for direct communication with the middle tier. • Direct communication is the most efficient, but since it is buried in an applet, this mechanism is not readily customizable. • The generic services, such as file services (upload, download, edit, copy, move, delete) and job services (show current jobs/show queues/kill jobs), will be supported this way. [combination of the user context and a query] • The Gateway will also support indirect communication with the middle tier through servlets.

  28. Screen Dump of the Control Applet

  29. Navigator • The navigator allows the user to select and customize toolboxes. • Embedded in a separate frame, it consists of menus, buttons, links, etc., derived from an XML document. • The navigator is hierarchical, extensible, and customizable.
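
As an illustration of deriving a hierarchical navigator from an XML document, the following minimal Python sketch walks a nested menu description and produces an indented outline. The `<navigator>`, `<menu>` and `<item>` elements and the `label` attribute are a hypothetical layout chosen for this example, not the Gateway's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigator description; nesting expresses the hierarchy.
NAV_XML = """
<navigator>
  <menu label="Toolboxes">
    <item label="Problem description"/>
    <item label="Resource request"/>
    <menu label="Other">
      <item label="Visualizations"/>
    </menu>
  </menu>
</navigator>
"""


def render(node, depth=0):
    """Return one indented line per menu or item, depth-first."""
    lines = []
    for child in node:
        lines.append("  " * depth + child.get("label"))
        lines.extend(render(child, depth + 1))
    return lines


outline = render(ET.fromstring(NAV_XML.strip()))
print("\n".join(outline))
```

Extending or customizing the navigator then amounts to editing the XML document; the rendering code stays unchanged.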

  31. Problem description toolboxes • The problem description is application specific, and the Gateway only provides a general framework for creating a PSE. • The most important part is the specification of which services (middle and back tier) are needed, what their APIs are, and how to add new services. • Example services: access to databases, XML parsing, generating HTML on the fly, file services.

  32. Code toolboxes • The end user sees it as a mapping between the problem description and the software to be used to solve the problem. In practice, it identifies the WebFlow modules and their parameters to be used to construct the application (see the resource request toolbox below). • The module parameters may include input files, and if necessary, the input files are generated at this stage (using this or a separate toolbox). In addition, some parameters will be constructed from information stored in databases, including the UDB, and other sources.

  33. Resource Request Toolbox • The front-end activities result in an abstract task specification. • It is abstract in the sense that the user may neither know nor care what actual resources are used. • The task is composed of independently developed modules and services following different programming models.

  34. Other toolboxes • Visualizations • Collaboration • Scientific notebook • ...

  35. Middle-Tier

  36. WebFlow Server • The WebFlow server is given by a hierarchy of containers and components. • The WebFlow server hosts users and services. • Each user maintains a number of applications composed of custom modules and common services. [Diagram labels: User 1, User 2, Application 1, Application 2, App 1, App 2, WebFlow Services]

  37. Mesh of WebFlow Servers, implemented as CORBA objects, that manage and coordinate distributed computation. [Diagram labels: Front End, CORBA-Based Middle-Tier, Gatekeeper, Authentication, Authorization]

  38. WebFlow Context Hierarchy [Diagram labels, in extraction order: Master Server (Gatekeeper), Slave Server, Proxy, Slave Server, Application Context, Module, Slave Server, User Context]
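
A hierarchy of contexts can be sketched as a tree of named containers. The Python below is illustrative only: the `Context` class, its methods, and the particular nesting (master server, then slave server, user context, application context, module) are assumptions made for this example; the transcript preserves the diagram's labels but not its exact structure.

```python
class Context:
    """A named container that can hold child contexts (hypothetical API)."""

    def __init__(self, name):
        self.name = name
        self.children = {}

    def add(self, child):
        """Attach a child context and return it, so calls can be chained."""
        self.children[child.name] = child
        return child

    def resolve(self, path):
        """Follow a slash-separated path of child names down the tree."""
        node = self
        for part in path.split("/"):
            node = node.children[part]  # KeyError if the path does not exist
        return node


# Assumed nesting: master server -> slave server -> user -> application -> module
master = Context("master")
slave = master.add(Context("slave1"))
user = slave.add(Context("haupt"))
app = user.add(Context("app1"))
app.add(Context("moduleA"))

module = master.resolve("slave1/haupt/app1/moduleA")
print(module.name)
```

Resolving everything through the master server mirrors its gatekeeper role: a request enters at one point and is routed down to the module it names.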

  39. Browser-based Front-End • Middle-Tier modules serve as proxies of Back-End Services. [Diagram labels: Browser-based Front-End, User Space Definition and Task Specification, Services, User Modules, Metacomputing Services, Back-End Resources]

  40. Back End

  41. Back End Services • Access to HPCC (via Globus) • Access to distributed databases (via JDBC) • Access to mass storage • Access to Internet resources • Access to desktop applications and local data • Access to code repositories

  42. WebFlow over Globus • In order to run WebFlow over Globus there must be at least one WebFlow node capable of executing Globus commands, such as globusrun. • Jobs that require the computational power of massively parallel computers are directed to the Globus domain, while other jobs can be launched on much more modest platforms, such as the user’s desktop or even a laptop running Windows NT. [Diagram label: Bridge between WebFlow and Globus]
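
The routing rule on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the job dictionary, the `nodes` key, the "more than one node means massively parallel" criterion, and the node name `webflow-gw` are all hypothetical.

```python
def choose_domain(job, globus_nodes):
    """Pick where a job runs.

    job          -- dict with an optional "nodes" count (assumed format)
    globus_nodes -- names of WebFlow nodes able to execute Globus commands
    """
    needs_mpp = job.get("nodes", 1) > 1  # assumed criterion for "massively parallel"
    if needs_mpp:
        if not globus_nodes:
            raise RuntimeError("no WebFlow node can execute Globus commands")
        return ("globus", globus_nodes[0])
    return ("local", "desktop")


big = choose_domain({"nodes": 64}, ["webflow-gw"])
small = choose_domain({"nodes": 1}, ["webflow-gw"])
print(big, small)
```

The error branch reflects the slide's precondition: without at least one Globus-capable WebFlow node, parallel jobs have nowhere to go.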

  43. How to add new Back-End hardware resources • Computational engines • Install Globus • MDS (the Alliance LDAP server) will contain all relevant info, including the contact address • We need a “private” directory, with entries in XML (learn from MDS, Condor’s ClassAds, NPAC’s …) • Access control provided by Kerberos (cross-MSRC) • Databases • Create a new user: Gateway

  44. Gateway Security

  45. Security Model (Kerberos) • Layer 1: secure Web • Layer 2: secure CORBA • Layer 3: secure access to resources [Diagram labels: Front End Applet, SECIOP, delegation, Gatekeeper, authentication & authorization, GSSAPI, HPCC resources, Policies defined by resource owners]

  46. Building Gateway Components

  47. • Gateway applications are composed of independent reusable modules. • Modules are written by module developers who have only limited knowledge of the system on which the modules will run. • The WebFlow system hides module management and coordination functions. • The Middle-Tier is given by a mesh of WebFlow Servers that manage and coordinate distributed computation.

  48. How to develop a Gateway component (or a toolbox) • Back-end service • Middle-tier proxy • Front-end controls

  49. What does it take to convert a legacy (high performance) application into a Gateway Back-End service? • Nothing! • It is the only way we can support commercial codes such as Gaussian. • A middle-tier proxy will submit your job for you.
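
The idea that a legacy code needs no changes because a proxy submits it can be sketched as follows. This is a minimal local stand-in, assuming the proxy simply launches the unmodified executable as a subprocess; the `LegacyProxy` class is hypothetical, and the "legacy application" here is a one-line Python command used only so the example is self-contained. The real Gateway proxy would submit through its back-end services rather than run the program locally.

```python
import subprocess
import sys


class LegacyProxy:
    """Hypothetical middle-tier proxy around an unmodified executable."""

    def __init__(self, command):
        self.command = list(command)  # argv of the legacy program

    def submit(self, stdin_text=""):
        """Run the program, feed it its input, and return its standard output."""
        done = subprocess.run(
            self.command,
            input=stdin_text,
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
        return done.stdout


# Stand-in for a legacy code: reads stdin and writes an upper-cased copy.
legacy_cmd = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"]
proxy = LegacyProxy(legacy_cmd)
output = proxy.submit("h2o")
print(output.strip())
```

The legacy program knows nothing about the proxy: it only reads its input and writes its output, which is why closed commercial codes can be supported this way.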

  50. How does the Back-End interact with the rest of the system? • Often, your job does not need to interact. • Using GRAM and GASS you stage data and the executable, submit the job and retrieve the output. • Using DUROC you can co-allocate resources and run MPI-based parallel/distributed codes. The messages between nodes are sent outside Gateway control or support. • The HPF runtime will distribute your job and facilitate interprocess communication.
