UT Grid Project - PowerPoint PPT Presentation
UT Grid Project

Jay Boisseau, Texas Advanced Computing Center

SURA Grid Application Planning & Implementations Workshop

December 7, 2005


Outline

  • Overview

    • Vision

    • Strategy

    • Approach

  • Current Project Status, Near-Term Goals

    • UT Grid Production Compute Resources

      • Roundup

      • Rodeo

    • Interfaces to production resources:

      • Grid User Portal

      • Grid User Node

    • Tools to support resources:

      • GridPort

      • GridShell

      • Metascheduling Prediction Services

  • Future Work and Plans


UT Grid Vision: A Powerful, Flexible, and Simple Virtual Environment for Research & Education

The UT Grid project vision is to create a cyberinfrastructure for research and education in which people can develop and test ideas, collaborate, teach, and learn through applications that seamlessly harness the diverse campus compute, visualization, storage, data, and instruments as needed from their personal systems (PCs) and interfaces (web browsers, GUIs, etc.).


UT Grid: Develop and Provide a Unique, Comprehensive Cyberinfrastructure…

The strategy of the UT Grid project is to integrate…

  • common security/authentication

  • scheduling and provisioning

  • aggregation and coordination

    diverse campus resources…

  • computational (PCs, servers, clusters)

  • storage (Local HDs, NASes, SANs, archives)

  • visualization (PCs, workstations, displays, projection rooms)

  • data collections (sci/eng, social sciences, communications, etc.)

  • instruments & sensors (CT scanners, telescopes, etc.)

    from ‘personal scale’ to terascale…

  • personal laptops and desktops

  • department servers and labs

  • institutional (and national) high-end facilities


…That Provides Maximum Opportunity & Capability for Impact in Research, Education

…into a campus cyberinfrastructure…

  • evaluate existing grid computing technologies

  • develop new grid technologies

  • deploy and support appropriate technologies for production use

  • continue evaluation, R&D on new technologies

  • share expertise, experiences, software & techniques

    that provides simple access to all resources…

  • through web portals

  • from personal desktop/laptop PCs, via custom CLIs and GUIs

    to the entire community for maximum impact on

  • computational research in applications domains

  • educational programs

  • grid computing R&D



Texas Two-Step: Hub & Spoke Approach

  • Deploying a P2P campus grid requires overcoming two trust issues

    • grid software: reliability, security, and performance

    • each other: trusting that others will not abuse one’s own resources

  • An advanced computing center presents an opportunity to build a centrally managed grid as a step toward a P2P grid

    • already has trust relationships with users

    • so, when facing both issues, install grid software centrally first

      • create centrally managed services

      • create spokes from central hub

    • then, when grid software is trusted

      • show usage and capability data to demonstrate opportunity

      • show policies and procedures to ensure fairness

      • negotiate spokes among willing participants


UT Grid: Logical View

  • Integrate a set of resources (clusters, storage systems, etc.) within TACC first

TACC Compute,

Vis, Storage, Data

(actually spread across two campuses)


UT Grid: Logical View

  • Next add other UT resources using the same tools and procedures

TACC Compute,

Vis, Storage, Data

ACES Cluster

ACES Data

ACES PCs


UT Grid: Logical View

  • Next add other UT resources using the same tools and procedures

GEO Data

GEO Cluster

TACC Compute,

Vis, Storage, Data

GEO Cluster

ACES Cluster

ACES Data

ACES PCs


UT Grid: Logical View

BIO Data

BIO Instrument

  • Next add other UT resources using the same tools and procedures

PGE Cluster

GEO Data

PGE Data

GEO Cluster

TACC Compute,

Vis, Storage, Data

PGE Instrument

GEO Cluster

ACES Cluster

ACES Data

ACES PCs


UT Grid: Logical View

BIO Data

BIO Instrument

  • Finally negotiate connections between spokes for willing participants to develop a P2P grid.

PGE Cluster

GEO Data

PGE Data

GEO Cluster

TACC Compute,

Vis, Storage, Data

PGE Instrument

GEO Cluster

ACES Cluster

ACES Data

ACES PCs


Enhancing Grid Computing R&D and Deployment Expertise for UT and for IBM

  • Benefits for IBM

    • Increased knowledge of diverse grid user and application requirements in universities

    • Access to new software technologies developed for UT Grid

    • Early awareness of new distributed & grid computing R&D opportunities

    • Exposure & expertise in a variety of grid technologies, open source & commercial, which can be shared internally

    • Experience to be gained from maintaining a large distributed production grid

    • Collaboration with UT in conducting new distributed & grid computing R&D activities, including publications, proposals

    • Exposure among TACC’s collaborators and peers for expertise in grid deployment services, capabilities


Enhancing Grid Computing R&D and Deployment Expertise for UT and for IBM

  • Benefits for UT Austin

    • greater access to all resources by entire community

    • more effective utilization of existing and future resources

    • unique capabilities presented by access, aggregation, coordination for research, education

    • enhanced collaborative capabilities among researchers, and among teachers & students

  • Additional Benefits for TACC

    • increased expertise in grid deployment issues

    • early awareness of new distributed & grid computing R&D opportunities

    • platform for conducting new distributed & grid computing R&D activities


Enhancing Grid Computing R&D and Deployment Expertise for UT and for IBM

  • Benefits for TACC Partners

    • UT Grid-supported technologies being integrated into TeraGrid: GridPort/user portal, GridShell/user node, etc.

    • Expertise being developed in scheduling will be used in TeraGrid

    • UT Grid developments will be used

      • in TIGRE and SURA Grid

      • by TACC partners in UT System, HiPCAT, U.S., Latin America

      • by TACC industrial partners

  • Benefits for Community

    • UT Grid producing IBM DeveloperWorks articles

    • UT Grid R&D will produce professional papers in Year 2 (and proposals)


TACC Grid Technology & Deployment Activities Provide Synergy Through Tech Transfer

  • UT Grid

    • creating new tools for integrating compute, vis, storage and data across campus, from ‘personal scale’ to terascale

    • will exchange tools, experiences with TeraGrid & TIGRE to advance both and be interoperable with each

  • TeraGrid

    • will utilize & promote UT Grid user portal & user node technologies, and scheduling & workflow results

    • will provide grid visualization and data collection services to UT Grid, benefiting TACC and IBM

  • TIGRE

    • will utilize, promote UT Grid results and expertise to other state institutions, including industry

    • will provide additional experiences with UT Grid technologies from users from across state, helping to refine technologies


UT Grid Compute Resources

  • PCs and workstations

    • Roughly 1/2 are Windows on Intel/AMD and 1/3 are Macs

    • Most of the rest are Linux on Intel/AMD

  • Networks of PCs and Workstations

    • Roundup: United Devices-managed network of PCs

      • Non-dedicated, heterogeneous compute resources across campus

      • Some managed by TACC, ITS, or other departments; some individually managed

      • Windows, Linux & Mac desktop PCs

    • Rodeo: Condor-managed network of PCs

      • Dedicated & non-dedicated, heterogeneous compute resources

      • Some managed by TACC, ITS, or other departments; some individually managed

      • Linux, Windows & Mac PCs, plus some workstations

  • Clusters

    • Lonestar: 1024-processor Linux cluster at TACC

    • Wrangler: 656-processor Linux cluster at TACC

    • Longhorn: 128 processors in 4-way IBM p655 nodes at TACC

    • Other smaller clusters at TACC

    • Various department/lab clusters from 4 to 128+ processors will be included

    • Resources have different resource managers (LSF, PBS, SGE)

  • High-end Servers

    • Longhorn: IBM system with 32 Power4 processors and 128 GB memory

    • Maverick: Sun system with 64 dual-core UltraSPARC IV processors and 512 GB memory
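
The clusters above sit behind different resource managers (LSF, PBS, SGE), so any campus-grid layer has to translate a single abstract job request into each manager's submit syntax. A minimal sketch, assuming a hypothetical dispatch helper (the flags shown are the managers' common basics, not UT Grid code):

```python
def submit_command(manager: str, script: str, cpus: int) -> list[str]:
    """Map one abstract job request onto the basic submit syntax of the
    resource managers named above. The dispatcher is illustrative only."""
    if manager == "LSF":
        return ["bsub", "-n", str(cpus), script]          # LSF batch submit
    if manager == "PBS":
        return ["qsub", "-l", f"nodes=1:ppn={cpus}", script]  # PBS resource list
    if manager == "SGE":
        return ["qsub", "-pe", "smp", str(cpus), script]  # SGE parallel environment
    raise ValueError(f"unknown resource manager: {manager}")

print(submit_command("PBS", "job.sh", 8))
# ['qsub', '-l', 'nodes=1:ppn=8', 'job.sh']
```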


Interfaces and Tools for these Resources

  • For a broad, diverse campus community, access must be easy and from local resources

    • Users access Grid User Portal with standard web browser

      • Grid User Portal submits to Rodeo via SOAP

        • UT Grid Condor Web Services layer developed to facilitate this

        • Condor portlet part of GridPort 4 release

      • Grid User Portal submits to Roundup via Hosted Applications

    • Users access Grid User Node with SSH

      • Grid User Node submits to Rodeo via GridShell

        • GridShell provides command line interface through shell façade

        • Abstracts user from underlying grid technology and complexity

        • Submits to specific resource or determines most appropriate resource using catalog services

      • Grid User Node submits to Roundup

        • Batch job submission supported via GridShell

        • CLI for submitting hosted application jobs
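
What the portal and GridShell abstract away is, ultimately, a resource-manager job description. Since Rodeo is Condor-managed, a minimal Condor submit description for a serial job might look like the following (executable and file names are hypothetical):

```
universe     = vanilla
executable   = analyze.sh
arguments    = input.dat
output       = job.out
error        = job.err
log          = job.log
requirements = (OpSys == "LINUX") && (Memory >= 512)
rank         = KFlops
queue
```

The `requirements` and `rank` expressions are ClassAd matchmaking hooks: requirements filter eligible machines, rank orders them.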


Accessing UT Grid Compute Resources: Hosted User Nodes & Portals


Current Status & Near-Term Goals


Roundup


Roundup: Current Status

  • Roundup is a production UT Grid resource

    • Production system with over 1000 PCs distributed across campus

    • Automated account request and creation

    • Production level consulting

    • Comprehensive user guide

    • Training classes offered at TACC

  • Client downloads available for Windows, Mac, Linux from UT Grid web site

  • Hosted Applications Installed

    • HMMer, BLAST, POV-Ray, Coorset, etc.


Roundup: Next Steps

  • Near-term goals (few months):

    • Support additional production users

    • GSI Integration

      • United Devices Grid MP supports multiple authentication schemes

      • Need to add a support extension for GSI

    • Evaluate MP Insight data warehousing and report generation package

    • Test and evaluate screen saver feature and start development of UT specific screen saver

    • Investigate possible solutions to enable sharing jobs across grids

      • Multi-grid agents or job forwarding


Rodeo


Rodeo: Current Status

  • Rodeo is a production UT Grid resource

    • Production system of over 500 machines, comprising dedicated clusters and PCs distributed across campus

    • Automated account request and creation

    • Production level consulting

    • Comprehensive user guide

    • Training classes offered at TACC

  • Client downloads available for Windows, Mac, Linux from UT Grid web site


Rodeo: Current Status

  • Currently the largest production users are:

    • UTCS (Department of Computer Sciences)

    • Graeme Henkelman (Chemistry)

    • Wolfgang Bangerth (Geosciences)


Rodeo: Next Steps

  • Near term goals (few months):

    • Continue supporting production users

    • Expand the number of CPUs available to users

    • Explore ‘hosted’ applications possibilities


UT Grid Interfaces

  • UT Grid will provide two types of interfaces:

    • Web-based Grid User Portal (GUP) accessible via any web browser

    • Customized desktop environments for Linux, Windows and Macintosh PCs to act as Grid User Nodes (GUNs).

  • Users can access all UT Grid resources using either the GUP or GUNs managed by UT Grid.

  • They will also be able to download the necessary software to build and host their own customized grid user portals or convert their personal desktop systems into grid user nodes.


Motivation for a Grid User Portal

  • Lower the barrier of entry for novice users

  • Provide a centralized grid account management interface

    • Easy access to multiple resources through a single interface

  • Simple GUI interface to complex grid computing capabilities

    • Provide simple alternatives to CLI for advanced users

  • Present a “Virtual Organization” view of the Grid as a whole

  • Increase productivity of UT researchers – do more science!


Grid User Portal: Current Status

  • Added Roundup and Rodeo as production resources on TACC User Portal

  • Developed JSR-168 Compliant portlets that can:

    • View information on resources within UT Grid, including status, load, jobs, queues, etc.

    • View network bandwidth and latency between systems, aggregate capabilities for all systems.

    • Submit user jobs

    • Manage files across systems, and move/copy multiple files between resources with transfer time estimates

  • These portlets contribute to GridPort 4 release

  • TACC leading portal effort in TeraGrid

    • This will impact the TACC User Portal and therefore UT Grid
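
The transfer-time estimates the file-management portlets show can be approximated from link bandwidth and per-file setup latency. A back-of-the-envelope sketch (the function and the numbers are illustrative, not the portlets' actual model):

```python
def estimate_transfer_time(total_bytes: int, bandwidth_bps: float,
                           latency_s: float, n_files: int = 1) -> float:
    """Estimate transfer time as per-file setup latency plus payload
    time at the measured bandwidth (illustrative model only)."""
    return n_files * latency_s + (total_bytes * 8) / bandwidth_bps

# 2 GB over a 1 Gb/s campus link with a 2 ms per-file setup cost
print(round(estimate_transfer_time(2 * 1024**3, 1e9, 0.002), 1))  # ~17.2 s
```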


Grid User Portal: Next Steps

  • Near-term plans (next few months):

    • Complete new TACC User Portal (TUP) based on GridPort 4 including UT Grid resources

      • UT Grid capabilities fully integrated into TUP

      • Ability to customize environment to only expose UT Grid resources

    • Migrate portlets to WebSphere to ensure compatibility (?)

    • Grid Account Management Portlets


Grid User Node

  • The Linux GUN current capabilities:

    • Information queries about grid resources

    • Job submission

      • Parallel computing jobs (Dedicated Cluster Resources)

      • Serial computing jobs (Roundup, Rodeo)

    • Monitoring job status

    • Reviewing job results

    • Resource brokering based on ClassAd catalogs

    • GridFTP-enabled GSIFTP


Grid User Node: Current Status

  • Production Linux, development Windows and Mac GUNs

    • Need to decide whether to do GUI versions

  • Submission to Roundup and Rodeo

  • “On-Demand” glide-in of UD resources into Condor pool

  • Integrated “real-life” applications

    • NAMD

    • SNOOP3D

    • HMMer

    • POV-Ray


Grid User Node: Next Steps

  • Near term goals:

    • Investigate distribution of the GUN software stack using VDT

    • Prepare and present training class before the end of the year.


GridPort: Current Status

  • GridPort 4 developed and released this month

  • Available to UT Grid and national users as a grid portal toolkit to download and create user and application portals

  • Based on JSR-168 compliant portlets

  • Leveraged technology and knowledge in UT Grid to create Condor and comprehensive file-transfer portlets


GridPort: Next Steps

  • Near term goals

    • GridPort 4 will be part of the TeraGrid User Portal, to be in production in 1Q06

    • Preparing demonstration and lab at Grid Workshop in Venezuela in April 2006

    • Continue evolution of GridPort to include:

      • Advanced job submission functionality

      • Advanced user customization, and more

    • Investigating demo portal based on WebSphere


GridShell: Current Status

  • GridShell developed and deployed on UT Grid (and TeraGrid)

  • Available to UT Grid users in the GUN software stack.

  • Able to submit jobs first to a Spoke (departmental cluster) and then to the Hub (TACC) if not enough resources are available at the Spoke.

  • Collaborating with researchers at PSC and Caltech, we have extended GridShell to provide a single job submission interface (Condor) to the heterogeneous clusters on the TeraGrid.


GridShell: Next Steps

  • Near term goals

    • Create a public download site for GridShell 1.0 (current version available only to NSF TeraGrid and UT Grid users).

    • Continue evolution of GridShell to include:

      • Support submitting jobs to clusters with firewalls

    • Need to hire an additional developer and develop partnerships with external developers (e.g. GridPort)


MPS (Metascheduling Prediction Services): Current Status

  • Goal is to reduce turnaround times of jobs by optimizing resource selection for data movements, queue wait times, and performance

  • Components

    • Prediction Services

      • Execution times, queue wait times, file transfer times

    • Resource Brokering

      • Immediately select resources based on job requirements

        • Including predictions

    • Metascheduling

      • Schedule complex jobs such as workflows

      • Workload management
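
A toy version of the brokering component above might predict queue wait from recent history and pick the resource with the lowest predicted response time. The simple mean predictor and the sample data are illustrative stand-ins for MPS's learning- and simulation-based services:

```python
from statistics import mean

def predict_wait(history, window=10):
    """Predict queue wait as the mean of recent observed waits (seconds)."""
    recent = history[-window:]
    return mean(recent) if recent else 0.0

def best_resource(resources):
    """Choose the resource minimizing predicted response time:
    file transfer + predicted queue wait + execution time."""
    return min(resources, key=lambda r: r["transfer_s"]
               + predict_wait(r["wait_history"]) + r["exec_s"])

clusters = [
    {"name": "lonestar", "transfer_s": 30, "exec_s": 600,
     "wait_history": [1200, 900, 1500]},   # faster but heavily queued
    {"name": "wrangler", "transfer_s": 45, "exec_s": 700,
     "wait_history": [60, 120, 90]},       # slower but nearly idle
]
print(best_resource(clusters)["name"])  # wrangler: short queue wins
```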


MPS: Next Steps

  • Near term goals:

    • Create prediction web services

      • Based on existing R&D

      • Predictions based on

        • Historical information

        • Learning algorithms

        • Scheduling simulations

    • Integrate with Condor-G

      • Provide additional information about clusters

        • The clusters themselves (e.g. number of CPUs)

        • The jobs submitted to the clusters

      • Add callouts so the matchmaker can request predictions

      • Let users request minimization of predicted response time as part of ranking

    • Demonstration with Graham Carey (ICES / UT Austin)

      • Selecting which cluster at TACC to use

      • Matchmaking capability using MPS to rank systems based on user request


Future Plans and Work

  • Complete MPS work and integrate campus clusters with TACC clusters

    • First, just ‘upload’ larger jobs

    • Later, share jobs among spokes

  • Integrate Maverick as a remote visualization resource into UT Grid

    • Overlapping software stack with PCs

    • Remote vis software downloads (incl. file transfer)

    • Vis portal

  • Integrate campus data collections into UT Grid

    • Hosted collections in DBs

    • WebSphere Information Integrator?

  • Prepare NSF proposal?