
Development & Implementation of an Inter-institutional Multi-purpose Grid
SURAgrid, 11/22/05
UNC-Charlotte: Grid Computing, ITSC 4010-001
Mary Fran Yafchak, SURA; Jim Jokl, University of Virginia; Art Vandenberg, Georgia State University


Presentation Transcript


  1. Development & Implementation of an Inter-institutional Multi-purpose Grid. SURAgrid, 11/22/05. UNC-Charlotte: Grid Computing, ITSC 4010-001. Mary Fran Yafchak, SURA; Jim Jokl, University of Virginia; Art Vandenberg, Georgia State University

  2. Presentation agenda • About SURAgrid - Mary Fran Yafchak • SURAgrid build/portal - MF Yafchak • SURAgrid authN/authZ - Jim Jokl • SURAgrid applications - Art Vandenberg • Q&A - All • This is a living, breathing project. Exchange of ideas encouraged!

  3. About SURAgrid • A “beyond regional” initiative in support of SURA regional strategy • “Mini-About” SURA: • SURA region: 16 states & DC; Delaware to Texas • SURA membership: 62 SE research universities • SURA mission: Foster excellence in scientific research, strengthen capabilities, provide training opportunities • Evolved from the NMI Testbed Grid project, part of the NMI Integration Testbed Program • http://www1.sura.org/3000/NMI-Testbed.html

  4. SURAgrid Goals • SURAgrid: Organizations collaborating to bring grids to the level of seamless, shared infrastructure • Goals: • To develop grid infrastructure that is scalable and that leverages local identity and authorization while managing access to shared resources • To promote use of this infrastructure for the broad research and education community • To provide a forum for participants to share experience with grid technology, and participate in collaborative project development

  5. SURAgrid Participants • University of Alabama at Birmingham* • University of Alabama in Huntsville* • University of Arkansas* • University of Florida* • George Mason University* • Georgia State University* • Great Plains Network • University of Kentucky* • University of Louisiana at Lafayette* • Louisiana State University* • University of Michigan • Mississippi Center for SuperComputing Research* • University of North Carolina, Charlotte • North Carolina State University* • Old Dominion University* • University of South Carolina* • University of Southern California • Southeastern Universities Research Association (SURA)** • Texas A&M University* • Texas Advanced Computing Center (TACC)* • Texas Tech University • Tulane University* • Vanderbilt University* • University of Virginia* • Key: *SURA member; **Project planning; highlighted sites have resources on grid

  6. Focus Areas • Authentication & Authorization • Themes: maintain local autonomy, leverage enterprise infrastructure • Grid-Building • Themes: heterogeneity, flexibility, interoperability, scalability • Application Development • Themes: immediate benefit to applications, applications drive development • Project Planning • Themes: cooperative, representative, sustainable

  7. In the Coming Months… • Continue evolving key areas • Grow and solidify grid infrastructure • Continue expanding and exploring authN/authZ • Identify & “grid-enable” new applications • “Formal” work on organizational definition • Charter, membership, policies, governance • Develop funding & collaboration opportunities • Some areas of interest: scalable mechanisms for shared, dynamic access; interoperability in grid products; grid-enabling applications; grids for education; broadening participation; support and management of large-scale grid operations

  8. Building SURAgrid & SURAgrid Portal (Ashok Adiga, Texas Advanced Computing Ctr.)

  9. SURAgrid Software Requirements • SURAgrid supports dedicated & non-dedicated compute nodes • Non-dedicated nodes are typically shared across multiple grids and could have constraints on the software that can be installed • Must allow resource owner to set usage policies • Dedicated nodes run only SURAgrid jobs • Common software stack being defined for dedicated nodes • Will consider using packaged Grid solutions • Virtual Data Toolkit (VDT) • NSF Middleware Initiative (NMI) Grids

  10. Configuring Non-dedicated nodes • Non-dedicated nodes support basic grid services • Document simple process to add resources to the grid • Job & data management • Install Globus (pre-web services GRAM & gridftp) • Authentication • Cross sign CA certificates with Bridge CA • Work with individual resource owners to get authorized • Resource monitoring • Install GPIR perl provider scripts on resource and add resource description to User Portal
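
A quick way to sanity-check slide 10's setup from a client machine is with the standard pre-web-services Globus tools; a minimal sketch, assuming a hypothetical gatekeeper host:

    # Verify the gatekeeper (pre-WS GRAM) accepts and runs a trivial job
    globus-job-run gatekeeper.campus.example.edu /bin/hostname

    # Verify the GridFTP server with a small round-trip copy
    globus-url-copy file:///tmp/hello.txt \
        gsiftp://gatekeeper.campus.example.edu/tmp/hello.txt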

  11. SURAgrid Resource Status • Number of Compute Clusters: 14 • Total number of CPUs: 611 • Peak GigaFlops: 1,367 • Memory (GigaBytes): 621 • Storage (GigaBytes): 5,645

  12. Motivation for User Portals • Make joining the SURAgrid easier for users • Single place for users to find user information and get user support • Certain information can be displayed better in a web page than in a command shell • Allow novice users to start using grid resources securely through a Web interface • Increase productivity of SURAgrid researchers – do more science!

  13. What is a Grid User Portal? • In general - a gateway to a set of distributed services accessible from a Web browser • Provides • Aggregation of different services as a set of Web pages • Single URL • Single Sign-On • Personalization • Customization

  14. Characteristics of a User Portal • A User Portal can include the following services: • Documentation Services • Notification Services • User Support Services • Allocations • Accounts • Training • Consulting

  15. User Portal Characteristics (cont’d) • Collaborative Services • Calendar • Chat • Resource sharing • Information Services • Resource • Grid-wide • Interactive Services • Manage Jobs & Data • Doesn’t replace the command shell but provides a simpler, alternative interface

  16. Service Aggregation [Diagram: a client browser connects to the User Portal over HTTP/SSL; the portal reaches backend services over HTTP/SSL/SOAP with GSI. Services aggregated: User Support (Consulting, Notification, User News), Collaborative (Calendar, Chat), Documentation (User Guides), Information (Resource, Grid), Interactive (Job Submission, File Transfer)]

  17. Portal Built Using GridPort 4 • Developed at TACC & San Diego State • Interface to grid technologies • GRAM, GridFTP, MyProxy, WSRF, science applications • Includes: • Portal framework-independent “portlets” • Expose backend services as customizable web interfaces • Small changes allow portlets to run in any JSR-168 compliant portal framework (e.g., uPortal, WebSphere, Jetspeed; installs into GridSphere by default) • Portal services • Run in the same web container as the portlets • Provide portlet cohesion and portal framework level support

  18. Single sign-on to access all grid resources • Documentation tab has details on: • Adding resources to the grid • Setting up user ids and uploading proxy certificates

  19. Information Services • Resource-level view • State information about individual resources • Queue, Status, Load, OS Version, Uptime, Software, etc. • Grid-level view • Grid-wide network performance • Aggregated capability • GPIR information Web Service • Collects and provides the information above

  20. Resource Monitoring http://gridportal.sura.org

  21. Interactive Services • Security • Hidden from the user as much as possible • File Management • Upload • Download • Transfer between resources • Job Submission to a single resource • Job Submission to a grid meta-scheduler (future) • Composite Job Sequencing (future)

  22. Proxy Management • Upload proxy certificates to MyProxy server • Portal provides support for selecting a proxy certificate to be used in a user session
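
Concretely, the upload step corresponds to the standard MyProxy client tool; a minimal sketch, with a hypothetical server name and login:

    # Delegate a credential derived from the user's campus PKI identity
    # to a MyProxy server; the portal can later fetch short-lived session
    # proxies on the user's behalf. Server and username are hypothetical.
    myproxy-init -s myproxy.suragrid.example.org -l uva_jdoe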

  23. File Management • List directories, Move files between grid resources, Upload/download files from local machine
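
The portal wraps GridFTP for these operations; the hand-run equivalents look roughly like this (hostnames and paths hypothetical):

    # Upload a file from the local machine to a grid resource
    globus-url-copy file:///home/jdoe/input.dat \
        gsiftp://clusterA.example.edu/scratch/jdoe/input.dat

    # Third-party transfer directly between two grid resources
    globus-url-copy gsiftp://clusterA.example.edu/scratch/jdoe/input.dat \
        gsiftp://clusterB.example.edu/scratch/jdoe/input.dat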

  24. Job Management • Submit Jobs for execution on remote grid resources • Check status of, cancel and delete submitted jobs
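
Submission to a single resource maps onto a pre-WS GRAM call; a minimal sketch with a hypothetical gatekeeper and its PBS jobmanager:

    # Submit an RSL-described job and stream its output back (-o)
    globusrun -o -r clusterA.example.edu/jobmanager-pbs \
        '&(executable=/bin/hostname)(count=1)'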

  25. Future Directions • User Portal currently offers basic user, informational and interactive services. • Build on other services such as user support • Need to expand services as grid grows • Resource broker to automatically select resource for job execution • Workflow support for automation and better utilization of grid resources • Reliable file transfer services • Build customized application portlets

  26. SURAgrid authN/authZ (Jim Jokl, University of Virginia)

  27. SURAgrid Authentication • Goal • Develop a scalable inter-campus solution • Preferred mechanisms • Leverage campus middleware activities • Researchers should not need to operate their own authentication systems • Use local campus credentials inter-institutionally • Rely on existing higher education inter-institutional authentication efforts

  28. Inter-campus Globus Authentication • Globus uses PKI credentials for authentication • Leverage native campus PKI credentials on SURAgrid • Users do all of their work using local campus PKI credentials • How do we create the inter-campus trust fabric? • Standard inter-campus PKI trust mechanisms include • Operating a single Grid CA or trusting other campus CAs • Cross-certification and Bridge PKIs • How well does Globus operate in a bridged PKI? • OpenSSL PKI in Globus is not bridge-aware • Known to work from NMI Testbed project • Decision: intercampus trust based on a PKI Bridge • Leverage EDUCAUSE Higher Education Bridge CA (HEBCA) when ready

  29. Background: Cross-certification • Top section • Traditional hierarchical validation example: self-signed roots (I: UAB, S: UAB) and (I: UVA, S: UVA) issue end-entity certs (I: UAB, S: User-2) and (I: UVA, S: User-1) • Bottom section • Validation using cross certification example • UVA signed a certificate request from the UAB CA • UAB signed a certificate request from the UVA CA • The resulting pair of cross certs (I: UAB, S: UVA) and (I: UVA, S: UAB) enables each school to trust certs from the other using only their own root as a trust anchor • An n² problem
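
A sketch of what one side of such an exchange can look like with plain OpenSSL, assuming each campus runs an OpenSSL-based CA; the key, config, and certificate file names and the DN are hypothetical:

    # At UVA: produce a certificate request for the existing UVA CA key
    openssl req -new -key uva-ca.key \
        -subj "/O=University of Virginia/CN=UVA Campus CA" -out uva-ca.csr

    # At UAB: sign UVA's request with the UAB CA, yielding the cross cert
    # (I: UAB, S: UVA); the mirror-image exchange yields (I: UVA, S: UAB)
    openssl ca -config uab-ca.cnf -extensions v3_ca \
        -in uva-ca.csr -out uva-signed-by-uab.pem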

  30. Background: Bridged PKI [Diagram: a Bridge CA joined by cross-certificate pairs to the hierarchies of Campus A, Campus B, … Campus n, with intermediates Mid-A and Mid-B and end entities User A1, User A2, User B1] • Used to enable trust between multiple hierarchical CAs • Generally more infrastructure than just the cross-certificate pairs • Typically involves strong policy & practices • Solves the n² problem • For SURAgrid we preload cross-certs
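
On a Globus resource, preloading a cross-cert amounts to placing it in the trusted-certificates directory under its OpenSSL subject-hash name; a sketch using the conventional GT2-era path and a hypothetical file name:

    # Install a cross certificate where Globus/OpenSSL can find it
    cd /etc/grid-security/certificates
    cp /tmp/uva-signed-by-uab.pem .
    HASH=$(openssl x509 -hash -noout -in uva-signed-by-uab.pem)
    ln -s uva-signed-by-uab.pem ${HASH}.0
    # each trust anchor also needs a matching ${HASH}.signing_policy file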

  31. SURAgrid Authentication Schematic [Diagram: campus grids A through F, each with its own campus PKI, connected by cross-cert pairs to the SURAgrid Bridge CA]

  32. SURAgrid Authentication Status • SURAgrid Bridge CA • Off-line system • Used Linux and OpenSSL to build bridge • Cross-certifications with the bridge complete or in progress for 8 SURAgrid sites • Several more planned in near future • SURAgrid Bridge Web Site • Interesting PKI issues discussed in paper

  33. Higher Education Bridge Certification Authority (HEBCA) • A project of EDUCAUSE • Implement a bridge for higher education based on the Federal PKI bridge model • Support both campus PKIs and sector hierarchical PKIs • Cross-certify with the Federal bridge (and others as appropriate) • Should form an excellent permanent trust fabric for a bridge-based Grid

  34. Model SURAgrid Authentication [Diagram: the same schematic as slide 31, with HEBCA in place of the SURAgrid Bridge CA at the center of the cross-cert pairs]

  35. Bridge to Bridge Context [Diagram: FBCA, HEBCA, SAFE, commercial, and other bridges cross-certified bridge-to-bridge] • A federal view on how the inter-bridge environment is likely to develop • FBCA - Federal Bridge • SAFE - Pharmaceutical • HEBCA - Higher Ed • Commercial - aerospace and defense • Grid extensible across PKI bridges?

  36. SURAgrid AuthN/AuthZ Status • Bridge CA and cross-certification process • Forms the basic AuthN infrastructure • Builds a trust fabric that enables each site to trust the certificates issued by the other sites • The grid-mapfile • Controls the basic (binary) AuthZ process • Sites add certificate Subject DNs from remote sites to their grid-mapfile based on email from SURAgrid sites
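
For reference, a grid-mapfile is a plain text file of quoted certificate subject DNs mapped to local account names, one mapping per line; the DNs and school-prefixed logins below are hypothetical, but the format is the real one:

    "/O=University of Virginia/OU=SURAgrid/CN=Jane Doe" uva_jdoe
    "/O=UAB/OU=SURAgrid/CN=John Smith" uab_jsmith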

  37. SURAgrid AuthZ Development • Grid-mapfile automation • Sites that use a recent version of Globus will use an LDAP callout that replaces the grid-mapfile • For other sites there will be some software that provides and updates a grid-mapfile for their gatekeeper
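
One plausible shape for that updater software, sketched here under an assumed directory host, base DN, and attribute names (none of which are confirmed by the slides):

    #!/bin/sh
    # Rebuild the grid-mapfile from the SURAgrid LDAP AuthZ directory.
    # Host, base DN, and attribute names are hypothetical; a real version
    # would also unfold long attribute values that LDAP line-wraps.
    ldapsearch -x -LLL -H ldap://directory.suragrid.example.org \
        -b "ou=people,dc=suragrid,dc=org" "(objectClass=suragridPerson)" \
        suragridCertDN uid |
    awk -F": " '
        /^suragridCertDN:/ { dn = $2 }
        /^uid:/            { u  = $2 }
        /^$/ && dn && u    { printf "\"%s\" %s\n", dn, u; dn = u = "" }
        END { if (dn && u) printf "\"%s\" %s\n", dn, u }' \
        > /etc/grid-security/grid-mapfile.new &&
    mv /etc/grid-security/grid-mapfile.new /etc/grid-security/grid-mapfile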

  38. SURAgrid AuthZ Development • LDAP AuthZ Directory • Web interface for site administrators to add and remove their SURAgrid users • Directory holds and coordinates • Certificate Subject DN • Unix login name (prefixed by school initials) • Allocated Unix UID (high numbers) • Some Unix GIDs? (high numbers) • Perhaps SSH public key, perhaps gsissh only • Other (tbd) • Reliability • Replication to sites that want local copies
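
An entry in such a directory might look like the LDIF below; every attribute name and value is invented here purely to mirror the fields the slide lists:

    # hypothetical schema, mirroring the slide's field list
    dn: uid=uva_jdoe,ou=people,dc=suragrid,dc=org
    objectClass: suragridPerson
    # Unix login name, prefixed by school initials
    uid: uva_jdoe
    # certificate Subject DN
    suragridCertDN: /O=University of Virginia/OU=SURAgrid/CN=Jane Doe
    # allocated Unix UID and GID from high number ranges
    uidNumber: 700123
    gidNumber: 700001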

  39. SURAgrid AuthZ Development • Sites contributing non-dedicated resources to SURAgrid greatly complicate the equation • We will provide a code template for editing grid-mapfiles to manage SURAgrid users • Publish our LDAP schema • Sites may query LDAP to implement their own SURAgrid AuthZ/AuthN interface

  40. Likely SURAgrid AuthZ Directions and Research • User directory or directory access • Group management • Person attributes • VO names • Store per-person, per-group allocations • Integrate with accounting • Local and remote stop-lists • Resource directory • Hold resource usage policies • Time of day, classifications, etc. • Mapping users to resources within resource policy constraints • We’ll learn a lot more about what is actually required as we work with the early user groups

  41. Applications on SURAgrid (Art Vandenberg, Georgia State University)

  42. SURAgrid Applications • Need applications to inform and drive development • Want to be of immediate service to real applications • Believe in grids as infrastructure • but not “if you build it they will come”… • Identifying & Fostering Applications

  43. Proposed Application Process • Continuing survey of applications • Catalog of Grid Applications; similar agency and partner databases; survey of SURA membership • Identify target applications • Regional significance, multi-institutional scope, intersection with other e-Science • Illustrating grid benefits • Test it • Globus, authN-Z/Bridge CA, compilers, portal… and more • Implementation options: 1) Immediate deployment 2) Demonstration deployment opportunities 3) Combined with proposal development

  44. Catalog of Grid Applications • http://art11.gsu.edu:8080/grid_cat/index5.jsp • Researchers using grids, and applications with grid potential • Initial intent just to see who's doing what • Potentially larger resource (collaboration, regional perspective, overall trends) • 20 sites, 475+ researchers • Current focus: • Automated maintenance • Improved search, browse

  45. Identify an Applications Base • Build from application activities already underway in SURAgrid • Integrate with regional strategy (SURA HPC-Grid Initiatives Planning Group) • Apply additional resources • Seeking additional collaboration, external funding • Achieve critical mass • Seek FUNDING

  46. SURAgrid Applications • SCOOP/ADCIRC (UNC, RENCI, MCNC, SCOOP partners, SURAgrid partners) • Multiple Genome Alignment (GSU, UAB, UVA) • ENDYNE (TTU) • Task Farming (LSU) • Data Mining on the Grid (UAH) • BLAST (UAB) • … and more …

  47. SCOOP/ADCIRC- UNC, RENCI, MCNC, SCOOP Partners, SURAgrid Participants • SURA program to create infrastructure for distributed Integrated Ocean Observing System (IOOS) in the southeast • Shared means for acquisition of observational data • Enables modeling, analysis and delivery of real-time data • SCOOP will serve as a model for national effort • http://www1.sura.org/3000/3300_Coastal.html • SCOOP/ADCIRC: forecast storm surge • resource selection (query MDS) • build package (application & data) • send package to resource (gridftp) • run adcirc in mpi mode (globus rsl & qsub) • retrieve results from resource (gridftp)
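
A hedged end-to-end sketch of those five steps as GT2-era shell commands; the gatekeeper host, paths, and CPU count are hypothetical, the wrapper script run_padcirc.sh is invented, and the fort.* and padcirc.x file names follow the run listing on slide 49:

    # 1. Resource selection: query the resource's MDS over LDAP
    grid-info-search -h gatekeeper.uky.example.edu -p 2135 \
        -b "mds-vo-name=local, o=grid" "(objectclass=*)"

    # 2. Build the package (application & data) ...
    tar czf adcirc_run.tar.gz adcirc.x padcirc.x fort.14 fort.15 fort.22

    # 3. ... and send it to the resource with GridFTP
    globus-url-copy file://$PWD/adcirc_run.tar.gz \
        gsiftp://gatekeeper.uky.example.edu/scratch/howard/adcirc_run.tar.gz

    # 4. Run ADCIRC in MPI mode via the PBS jobmanager (globus rsl & qsub)
    globusrun -r gatekeeper.uky.example.edu/jobmanager-pbs \
        '&(executable=/scratch/howard/run_padcirc.sh)(count=16)(jobtype=mpi)'

    # 5. Retrieve results (fort.61 - fort.64) with GridFTP
    globus-url-copy \
        gsiftp://gatekeeper.uky.example.edu/scratch/howard/fort.63 \
        file://$PWD/fort.63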

  48. SCOOP/ADCIRC… • Left: ADCIRC max water level for 72 hr forecast starting 29 Aug 2005, driven by the “usual, always-available” ETA winds. • Right: ADCIRC max water level over ALL of UFL ensemble wind fields for 72 hr forecast starting 29 Aug 2005, driven by “UFL always-available” ETA winds. • Images credit: Brian O. Blanton, Dept of Marine Sciences, UNC Chapel Hill

  49. SCOOP/ADCIRC Results • SURAgrid U. Kentucky (CCS-UKY, 48 CPU/230 Gflops/48G RAM, 500G Disk) • Run directory listing (results stored in fort.61 - fort.64; the PE0000 - PE0003 directories were created by the job):

    -rwx------ 1 howard howard   1458444 Sep 14 13:39 adcirc.x
    -rwx------ 1 howard howard        12 Sep 14 13:39 adcpost.inp
    -rwx------ 1 howard howard    843813 Sep 14 13:39 adcpost.x
    -rw------- 1 howard howard        29 Sep 14 13:39 adcprep.inp
    -rwx------ 1 howard howard   1150926 Sep 14 13:39 adcprep.x
    -rwx------ 1 howard howard       915 Sep 14 13:39 execute_parallel_bundle.sh
    -rwx------ 1 howard howard   3042520 Sep 14 13:39 fort.14
    -rw------- 1 howard howard     64545 Sep 14 13:39 fort.15
    -rw------- 1 howard howard  19804050 Sep 14 13:39 fort.22
    -rw-rw-r-- 1 howard howard   1444457 Sep 14 16:17 fort.61
    -rw-rw-r-- 1 howard howard    202457 Sep 14 16:17 fort.62
    -rw-rw-r-- 1 howard howard 105626297 Sep 14 16:18 fort.63
    -rw-rw-r-- 1 howard howard 169753697 Sep 14 16:19 fort.64
    -rw------- 1 howard howard   1257568 Sep 14 13:39 fort.68
    -rw-rw-r-- 1 howard howard   1326004 Sep 14 13:40 fort.80
    -rw------- 1 howard howard   3940266 Sep 14 13:40 metis_graph.txt
    -rwx------ 1 howard howard   1802370 Sep 14 13:39 padcirc.x
    -rw-rw-r-- 1 howard howard       403 Sep 14 13:39 pbs_sub-howard
    -rw-r--r-- 1 howard howard      1028 Sep 14 13:39 pbs_sub-howard.e125698
    -rw-r--r-- 1 howard howard        91 Sep 14 13:39 pbs_sub-howard.o125698
    drwxrwxr-x 2 howard howard      4096 Sep 14 13:41 PE0000
    drwxrwxr-x 2 howard howard      4096 Sep 14 13:41 PE0001
    drwxrwxr-x 2 howard howard      4096 Sep 14 13:41 PE0002
    drwxrwxr-x 2 howard howard      4096 Sep 14 13:41 PE0003

  50. SCOOP/ADCIRC - Challenges • resource selection (query MDS) • Expect MDS to be hosted on resource being queried. CCS-UKY actually pointed to NCSA for their MDS; needed to implement MDS on CCS-UKY as well (essentially CCS-UKY part of multiple MDS) • build package (application & data) • Must address incompatibility between GT3 and GT2 style proxies; must use “-old” option to GT3’s grid-proxy-init to get GT2 style proxy which ADCIRC currently expects • send package to resource (gridftp) • Staff availability… • run adcirc in mpi mode (globus rsl & qsub) • retrieve results from resource (gridftp)
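
The proxy workaround described under “build package” is a single flag on the GT3 client side, shown here in isolation:

    # On a GT3 client: create a legacy GT2-format proxy, which the
    # GT2-based ADCIRC tooling currently expects
    grid-proxy-init -old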
