
PROGRESS – Computing Portal and Data Management in the Cluster of SUNs

Michał Kosiedowski, Sun HPC Consortium, Heidelberg 2003


Presentation Transcript


  1. PROGRESS – Computing Portal and Data Management in the Cluster of SUNs Michał Kosiedowski Sun HPC Consortium Heidelberg 2003

  2. R & D Center • PSNC was established in 1993 and is an R&D center in: • New Generation Networks: POZMAN and PIONIER networks; 6-NET, SEQUIN, ATRIUM projects • HPC and Grids: GRIDLAB, CROSSGRID, PROGRESS projects • Portals and Content Delivery Tools: Polish Educational Portal "Interkl@sa", Multimedia City Guide, dLibra framework

  3. Center of Excellence • PSNC became the Sun Center of Excellence (CoE) in New Generation Networks, Grids and Portals in November 2002

  4. PROGRESS • Duration: December 2001 – May 2003 (R&D) • Budget: ~4.0 million euro • Project partners: SUN Microsystems Poland; PSNC IBCh Poznań; Cyfronet AMM, Kraków; Technical University Łódź • Co-funded by the State Committee for Scientific Research (KBN) and SUN Microsystems Poland

  5. PROGRESS (2) • Cluster of 80 processors • Networked storage of 1.3 TB • Software: ORACLE, HPC Cluster Tools, Sun ONE, Sun Grid Engine

  6. PROGRESS – architecture

  7. PROGRESS – the talk • HPC Window – the access environment to grid resources: user interfaces (Computing Portal and Migrating Desktop) + Grid Service Provider • Data Management System – a distributed system designed for storing scientific data used in experiments performed in the computing cluster (grid)

  8. HPC Window - motivation • grid access environments lacked flexibility: one grid -> one portal • users had to rely on the grid itself for any history of their work • user interfaces offered limited functionality

  9. Grid-Portal Environment • classical 3-tier grid-portal environment: Portal -> Grid Management System -> HPC Resources • new 4-tier grid-portal environment: Portal -> Grid Service Provider -> Grid Management System -> HPC Resources

  10. PROGRESS GPE • diagram labels: Content Provider, Webservice, Session Bean, Entity Beans

  11. PROGRESS HPC Portal • interaction with users • presentation of data obtained from services

  12. PROGRESS HPC Portal (2)

  13. PROGRESS HPC Portal (3)

  14. PROGRESS HPC Portal (4)

  15. PROGRESS HPC Portal (5)

  16. PROGRESS HPC Portal (6)

  17. PROGRESS HPC Portal (7)

  18. PROGRESS HPC Portal (8)

  19. Grid Service Provider • makes the use of grid resources more comfortable for end users • allows numerous portals and other user interfaces to be built easily; users can switch from one to another and use the same GSP services • various thematic scientific web portals can share the same grid resources • all clients (user interfaces) can be provided with computing resources belonging to two or more different grids

  20. Grid Service Provider (2) • Necessary services to provide: • job submission service: manages the creation of user jobs, their submission to the grid and the monitoring of their execution (typically through reverse reporting, in which the Grid Management System reports events connected with the execution of jobs) • application management service: stores information about applications available for running in the grid and assists application developers in adding new applications to the application repository • provider management service: keeps up-to-date information on the services available within the provider

  21. GSP: Job submission service • building computing jobs, submitting them to the grid for execution and viewing the results • the job description is prepared in the XRSL language and transferred to the grid resource broker, which executes the job • the grid resource broker reports back ("reverse reports") on grid events connected with the job • "workflowed" jobs: tasks arranged in sequences and in parallel
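The "sequences and parallels" idea on this slide can be illustrated with a small sketch. It is not PROGRESS code: the `Task`, `Sequence` and `Parallel` classes and the `stages` function are hypothetical names, and the sketch only works out which tasks could run at the same time, under the simplifying assumption that parallel branches advance in lockstep.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Task:
    name: str


@dataclass
class Sequence:
    steps: List["Node"]


@dataclass
class Parallel:
    branches: List["Node"]


Node = Union[Task, Sequence, Parallel]


def stages(node: Node) -> List[List[str]]:
    """Return a list of stages; tasks in the same stage may run concurrently."""
    if isinstance(node, Task):
        return [[node.name]]
    if isinstance(node, Sequence):
        # Steps run one after another, so their stages are concatenated.
        result: List[List[str]] = []
        for step in node.steps:
            result.extend(stages(step))
        return result
    # Parallel: branches start together, so merge their stages index by index
    # (a simplification: real branches need not advance in lockstep).
    merged: List[List[str]] = []
    for branch in node.branches:
        for i, stage in enumerate(stages(branch)):
            if i == len(merged):
                merged.append([])
            merged[i].extend(stage)
    return merged


# Hypothetical workflow: one preparation task, two parallel branches, a report.
workflow = Sequence([
    Task("prepare"),
    Parallel([Task("align_a"), Sequence([Task("align_b"), Task("filter_b")])]),
    Task("report"),
])
plan = stages(workflow)
print(plan)
```

A real resource broker would schedule on actual task completion events rather than on precomputed stages; the sketch only shows how sequence and parallel nodes can be nested.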

  22. GSP: Application management service • application repository management • the application descriptor contains a reference to the application executable: either a reference to a file stored in the DMS or a path to a binary on the grid computing servers' filesystems • the application descriptor also lists the available (required or optional) arguments, the required environment variables and the required input and output files • applications in PROGRESS may be unconfigured or configured: one executable -> multiple configured applications • virtual applications
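As a rough illustration of the descriptor fields listed on this slide, the sketch below models an application descriptor and the "one executable -> multiple configured applications" relation. All class, field and application names (`ApplicationDescriptor`, `preset`, `blast`, the executable path) are assumptions made for the example and are not taken from the PROGRESS repository.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional


@dataclass
class ApplicationDescriptor:
    name: str
    executable_ref: str  # DMS file reference or path on a grid server
    required_args: List[str] = field(default_factory=list)
    optional_args: List[str] = field(default_factory=list)
    required_env: List[str] = field(default_factory=list)
    input_files: List[str] = field(default_factory=list)
    output_files: List[str] = field(default_factory=list)
    # Preset argument values; None marks an unconfigured application.
    preset: Optional[Dict[str, str]] = None

    def configure(self, name: str, values: Dict[str, str]) -> "ApplicationDescriptor":
        """Derive a configured application that shares the same executable."""
        known = set(self.required_args) | set(self.optional_args)
        unknown = set(values) - known
        if unknown:
            raise ValueError(f"unknown arguments: {sorted(unknown)}")
        return replace(self, name=name, preset=dict(values))


# Hypothetical unconfigured application and two configured variants of it.
blast = ApplicationDescriptor(
    name="blast",
    executable_ref="/opt/apps/blast",
    required_args=["db"],
    optional_args=["evalue"],
)
blast_nr = blast.configure("blast-nr", {"db": "nr"})
blast_strict = blast.configure("blast-strict", {"db": "nr", "evalue": "1e-10"})
print(blast_nr.name, blast_nr.preset)
```

Deriving configured applications from one descriptor keeps the executable reference in a single place, which is one plausible reading of the slide's arrow notation.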

  23. GSP: Provider management service • keeps up-to-date information on the services available in the grid service provider • a service descriptor contains information on the Web Service interface: the URL at which the service is available, the service namespace reference (URN) and the service WSDL reference • services may have multiple instances: informational services
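The service descriptor described here (URL, URN, WSDL reference) and the notion of multiple instances per service could be sketched as a small registry. The names `ServiceDescriptor` and `ProviderRegistry`, the URN and the example.org addresses are hypothetical; the slide does not show how PROGRESS stores this information.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceDescriptor:
    url: str   # URL at which the service is available
    urn: str   # service namespace reference
    wsdl: str  # reference to the service WSDL document


class ProviderRegistry:
    """Keeps track of service instances, grouped by namespace (URN)."""

    def __init__(self) -> None:
        self._instances: Dict[str, List[ServiceDescriptor]] = {}

    def register(self, descriptor: ServiceDescriptor) -> None:
        self._instances.setdefault(descriptor.urn, []).append(descriptor)

    def unregister(self, descriptor: ServiceDescriptor) -> None:
        # Raises KeyError/ValueError if the instance was never registered.
        self._instances[descriptor.urn].remove(descriptor)

    def lookup(self, urn: str) -> List[ServiceDescriptor]:
        return list(self._instances.get(urn, []))


registry = ProviderRegistry()
# Two hypothetical instances of the same informational service.
for host in ("portal-a.example.org", "portal-b.example.org"):
    registry.register(ServiceDescriptor(
        url=f"http://{host}/services/news",
        urn="urn:example:news",
        wsdl=f"http://{host}/services/news?wsdl",
    ))
print(len(registry.lookup("urn:example:news")))
```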

  24. GSP: Informational services • examples of instance-enabled services • intended for use by web portals • PROGRESS example: short news service • others: document directory, discussion forum (under development)

  25. GSP: XRSL Language • Extended Resource Specification Language (XRSL) is an XML-based language designed for describing computing jobs • the XML documents describing grid computing jobs are passed to the grid resource broker, which analyzes them and executes the jobs in accordance with the requirements they contain
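The slide does not show the XRSL schema, so the following is only a sketch of what assembling an XML job description might look like. Every element and attribute name (`job`, `task`, `executable`, `arguments`, `requirements`, `cpuCount`) is invented for illustration and should not be read as actual XRSL.

```python
import xml.etree.ElementTree as ET
from typing import List


def build_job_document(job_name: str, executable: str,
                       arguments: List[str], cpu_count: int) -> str:
    """Build an XML job description using hypothetical element names."""
    job = ET.Element("job", {"name": job_name})
    task = ET.SubElement(job, "task")
    ET.SubElement(task, "executable").text = executable
    args = ET.SubElement(task, "arguments")
    for value in arguments:
        ET.SubElement(args, "argument").text = value
    # Requirements that a resource broker could match against resources.
    requirements = ET.SubElement(task, "requirements")
    ET.SubElement(requirements, "cpuCount").text = str(cpu_count)
    return ET.tostring(job, encoding="unicode")


document = build_job_document("demo-job", "/opt/apps/blast", ["-db", "nr"], 4)
print(document)
```

A broker receiving such a document would parse it, read the requirements and choose resources accordingly; how PROGRESS does this is not described in the slides.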

  26. Data Management System - motivation • data management systems were not oriented towards grid access environments • data management systems concentrated on distributing data between grid computers

  27. Data Management System • architecture diagram • clients: Portal, Grid broker, Migrating desktop • components: Data Broker (WS), Metadata Repository, Mirror & Proxy (SRS), Data Storage modules • transport protocols: GASS, FTP, GridFTP (...)

  28. Data Management System (2) • provides seamless access to data and information for grid computing • uses metadata for describing stored data • stores data on various media such as files, tapes and databases • serves as the source of input data and the destination for the results of computing experiments

  29. DMS: Data broker • architecture diagram repeated from slide 27

  30. DMS: Data broker (2) • serves as a Web Services-based interface for external clients, such as the HPC Portal and the grid resource broker • delivers DMS functions for directory, file and metadata management

  31. DMS: Data broker (3) • directory management: add, remove and rename directories; retrieve the root and current path; change path; list contents • file management: add, remove and rename files; add, remove and retrieve physical locations; add and remove archives; add and remove symbolic links • metadata management: manage metadata schemes; retrieve the list of schemes and attributes; assign schemes to files and edit values and attributes; search the metadata repository
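A few of the directory operations listed above can be sketched with an in-memory tree. This is an illustration only: the class `DirectoryTree` and its method names are hypothetical and do not reproduce the Data Broker's Web Service interface, which the slides do not spell out.

```python
import posixpath
from typing import Dict, List, Set


class DirectoryTree:
    """In-memory stand-in for a logical directory namespace."""

    def __init__(self) -> None:
        # Maps each directory path to the names of its child directories.
        self._children: Dict[str, Set[str]] = {"/": set()}
        self.current = "/"

    def _resolve(self, path: str) -> str:
        # Relative paths are interpreted against the current path.
        return posixpath.normpath(posixpath.join(self.current, path))

    def add_directory(self, path: str) -> None:
        full = self._resolve(path)
        parent, name = posixpath.split(full)
        if parent not in self._children:
            raise FileNotFoundError(parent)
        if full in self._children:
            raise FileExistsError(full)
        self._children[parent].add(name)
        self._children[full] = set()

    def remove_directory(self, path: str) -> None:
        full = self._resolve(path)
        if full == "/":
            raise ValueError("cannot remove the root directory")
        if full not in self._children:
            raise FileNotFoundError(full)
        if self._children[full]:
            raise OSError(f"directory not empty: {full}")
        parent, name = posixpath.split(full)
        self._children[parent].discard(name)
        del self._children[full]

    def change_path(self, path: str) -> None:
        full = self._resolve(path)
        if full not in self._children:
            raise FileNotFoundError(full)
        self.current = full

    def list_contents(self, path: str = ".") -> List[str]:
        return sorted(self._children[self._resolve(path)])


tree = DirectoryTree()
tree.add_directory("experiments")
tree.change_path("experiments")
tree.add_directory("run1")
tree.add_directory("run2")
tree.remove_directory("run2")
print(tree.current, tree.list_contents())
```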

  32. DMS: Metadata repository • architecture diagram repeated from slide 27

  33. DMS: Metadata repository (2) • responsible for storing and managing metadata • the format of the metadata scheme associated with a file can be defined by the user or chosen from predefined formats such as Dublin Core
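To make the idea of predefined and user-defined metadata schemes concrete, here is a minimal sketch. The attribute names in `DUBLIN_CORE` are the fifteen elements of the Dublin Core element set; everything else (the `MetadataRepository` class, its methods, the `sequence-run` scheme and the file paths) is hypothetical and not taken from the PROGRESS DMS.

```python
from typing import Dict, FrozenSet, Iterable, List

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE: FrozenSet[str] = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


class MetadataRepository:
    def __init__(self) -> None:
        self._schemes: Dict[str, FrozenSet[str]] = {"dublin-core": DUBLIN_CORE}
        self._records: Dict[str, Dict[str, str]] = {}

    def define_scheme(self, name: str, attributes: Iterable[str]) -> None:
        """Register a user-defined scheme next to the predefined ones."""
        self._schemes[name] = frozenset(attributes)

    def describe(self, file_id: str, scheme: str, values: Dict[str, str]) -> None:
        """Attach metadata to a file, rejecting attributes outside the scheme."""
        unknown = set(values) - self._schemes[scheme]
        if unknown:
            raise ValueError(f"attributes not in scheme {scheme!r}: {sorted(unknown)}")
        self._records[file_id] = dict(values)

    def search(self, attribute: str, value: str) -> List[str]:
        return sorted(file_id for file_id, record in self._records.items()
                      if record.get(attribute) == value)


repo = MetadataRepository()
repo.define_scheme("sequence-run", ["organism", "instrument"])
repo.describe("/experiments/run1/out.fasta", "dublin-core",
              {"title": "Run 1 output", "creator": "alice"})
repo.describe("/experiments/run2/out.fasta", "sequence-run",
              {"organism": "E. coli"})
matches = repo.search("creator", "alice")
print(matches)
```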

  34. DMS: Data storage modules • architecture diagram repeated from slide 27

  35. DMS: Data storage modules (2) • enable access to physical data • data are arranged in data containers, can be stored on all media types and are accessed through a uniform interface • data can be organized as files on generic filesystems, BLOBs in databases or files on data tapes • GASS, GridFTP and FTP serve as the data transport protocols
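The "uniform interface over different media" point can be sketched with an abstract container and two backends. The names `DataContainer`, `FilesystemContainer`, `BlobContainer` and `copy_between` are assumptions for this example; the dictionary-backed class merely stands in for database BLOBs so that the sketch stays self-contained.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class DataContainer(ABC):
    """Uniform interface, independent of the underlying medium."""

    @abstractmethod
    def store(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def fetch(self, key: str) -> bytes: ...


class FilesystemContainer(DataContainer):
    def __init__(self, root: Path) -> None:
        self._root = root

    def store(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def fetch(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


class BlobContainer(DataContainer):
    """Stand-in for BLOBs in a database; a dict replaces the real database."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def fetch(self, key: str) -> bytes:
        return self._blobs[key]


def copy_between(key: str, source: DataContainer, target: DataContainer) -> None:
    # Works for any pair of containers because both share one interface.
    target.store(key, source.fetch(key))


with tempfile.TemporaryDirectory() as tmp:
    files = FilesystemContainer(Path(tmp))
    blobs = BlobContainer()
    files.store("sequence.txt", b"ACGT")
    copy_between("sequence.txt", files, blobs)
    roundtrip = blobs.fetch("sequence.txt")
print(roundtrip)
```

Callers such as `copy_between` never need to know which medium holds the data, which is the property the slide attributes to data containers.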

  36. DMS: Mirror & proxy module • architecture diagram repeated from slide 27

  37. DMS: Mirror & proxy module (2) • enables access to external scientific databanks • mirrors predefined sets of data (like the SRS databank in PROGRESS) • serves as a proxy to Internet-based data resources (with caching functions)
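The slide mentions "caching functions" without detail. One common way to realise a caching proxy is a time-to-live cache in front of the remote fetch, sketched below. The `CachingProxy` class, the expiry policy, the fake fetch function and the example.org URL are all assumptions, not a description of the PROGRESS module.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingProxy:
    """Serves repeated requests from a local cache until entries expire."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh enough: no upstream request
        data = self._fetch(url)
        self._cache[url] = (now, data)
        return data


upstream_calls: List[str] = []


def fake_fetch(url: str) -> bytes:
    # Replaces a real network request so the sketch runs offline.
    upstream_calls.append(url)
    return b"entry for " + url.encode()


fake_time = [0.0]
proxy = CachingProxy(fake_fetch, ttl_seconds=60.0, clock=lambda: fake_time[0])
first = proxy.get("http://databank.example.org/entry/1")
second = proxy.get("http://databank.example.org/entry/1")  # served from cache
fake_time[0] = 120.0
third = proxy.get("http://databank.example.org/entry/1")   # expired: refetched
print(len(upstream_calls))
```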

  38. Web Services Communication • components: HPC Portal, Grid Service Provider, Data Management System, Grid Resource Broker • operations shown in the diagram: saveJob(), getApplications(), saveTaskOfJob(), saveStdOfTask(), submitJob() (shown twice), getUserJobs(), getJobStatus(), changeJobStatus(), listUserDirectory(), addUserFile(), getUserFileLocation()

  39. Authentication & authorization • utilize the services available within the Sun ONE Portal Server 6.0 package: authentication techniques, user database, portlet access control, identity server • design an authorization system for the grid service provider and the data management system, based on the RAD model • apply a Single Sign-On mechanism

  40. Authorization scheme • components: Portal, Identity server, Grid Service Provider, RAD-based authorization system • steps shown in the diagram: logon, authentication, request (method invocation), token validation, resource access authorization
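The steps named in this diagram (logon, token validation, resource access authorization) can be sketched as follows. This is a toy model under stated assumptions: the classes `IdentityServer`, `AccessDecision` and `GridServiceProvider`, the grant table and the user data are hypothetical, passwords are kept in plain text purely for brevity, and the decision object only loosely mirrors the idea of asking a separate component whether access to a named resource is allowed.

```python
import secrets
from typing import Dict, Set, Tuple


class IdentityServer:
    def __init__(self, passwords: Dict[str, str]) -> None:
        self._passwords = passwords  # plain text for illustration only
        self._tokens: Dict[str, str] = {}

    def logon(self, user: str, password: str) -> str:
        if self._passwords.get(user) != password:
            raise PermissionError("authentication failed")
        token = secrets.token_hex(16)
        self._tokens[token] = user
        return token

    def validate(self, token: str) -> str:
        """Return the user behind a token, or refuse unknown tokens."""
        if token not in self._tokens:
            raise PermissionError("invalid token")
        return self._tokens[token]


class AccessDecision:
    def __init__(self, grants: Set[Tuple[str, str, str]]) -> None:
        self._grants = grants  # (user, resource, operation) triples

    def access_allowed(self, user: str, resource: str, operation: str) -> bool:
        return (user, resource, operation) in self._grants


class GridServiceProvider:
    def __init__(self, identity: IdentityServer, decision: AccessDecision) -> None:
        self._identity = identity
        self._decision = decision

    def invoke(self, token: str, resource: str, operation: str) -> str:
        user = self._identity.validate(token)  # token validation
        if not self._decision.access_allowed(user, resource, operation):
            raise PermissionError(f"{user} may not {operation} on {resource}")
        return f"{operation} on {resource} performed for {user}"


identity = IdentityServer({"alice": "secret"})
decision = AccessDecision({("alice", "job-service", "submitJob")})
gsp = GridServiceProvider(identity, decision)
token = identity.logon("alice", "secret")  # one logon, token reused afterwards
result = gsp.invoke(token, "job-service", "submitJob")
print(result)
```

Reusing one token across services is what makes the arrangement resemble single sign-on: the user authenticates once and each service only validates the token.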

  41. Visualization of job results

  42. Where to go now? • Project: the research and development phase is finished; the test and deployment phase is under way • We will continue the R&D on the tools, including the grid service provider and the data management system

  43. Where to go now? (2) • PROGRESS HPC Portal is a bioinformatics thematic portal • other thematic scientific portals can be deployed using the same grid service provider, data management system and grid resources, and the same portal tools

  44. Where to go now? (3) • The grid service provider may be equipped with means of communication with multiple grids: • cooperation with GRIDLAB • perhaps some Sun Grid Engine based grids?

  45. Where to go now? (4) • encourage application developers to add their applications to the PROGRESS repository • encourage visualization module developers to add their software to the PROGRESS repository • encourage bioinformatics experts to take over as editors of the informational services

  46. PROGRESS • http://progress.psnc.pl/ • http://progress.psnc.pl/portal/ • kat@man.poznan.pl
