UT Grid: Building a campus grid

Ashok Adiga, Ph.D.

Distributed & Grid Computing Group

Texas Advanced Computing Center

The University of Texas at Austin

[email protected]

(512) 471-8196

TACC Grid Program

  • TACC involved in several Grid projects

    • Campus Grid (UT Grid, partially funded by IBM)

    • State Grid (TIGRE)

    • National Grid (ETF)

  • Grid Hardware Resources

    • Wide range of hardware resources available to research community at UT and partners

  • Grid Software Resources

    • Significantly leverage NMI GRIDS components (Globus Toolkit, GPT, MyProxy, Gridport, GridFTP, …)

    • Other software where necessary

      • Resource managers (Condor, LSF, PBS, United Devices)

      • Schedulers (Condor, Community Scheduling Framework)

TeraGrid (National)

  • NSF Extensible Terascale Facility (ETF) project

    • build and deploy the world's largest, fastest, distributed computational infrastructure for general scientific research

    • 40 Gbps backbone with hubs in Los Angeles, Chicago & Atlanta

  • UT (led by TACC) going online on TeraGrid October 1, 2004

    • 10 Gbps network connection to ETF backbone

    • Provide access to high-end computers capable of 6.2 teraflops, a new terascale visualization system, and a 2.8-petabyte mass storage system

    • Provide access to geoscience data collections used in environmental, geological, climate, and biological research:

      • high-resolution digital terrain data

      • worldwide hydrological data

      • global gravity data

      • high-resolution X-ray computed tomography data

  • Current software stack includes: Globus (GSI, GRAM, GridFTP), MPICH-G2, Condor-G, GPT, MyProxy, SRB

TIGRE (State-wide Grid)

  • Texas Internet Grid for Research and Education

    • computational grid to integrate computing & storage systems, databases, visualization laboratories and displays, and instruments and sensors across Texas.

    • Funding announced by Gov. Rick Perry at Internet2

    • TIGRE members include several leading state institutes:

      • Rice, Texas A&M, Texas Tech, U of Houston, UT Austin, UT El Paso, others…

    • Initial software stack will use NMI GRIDS

UT Grid Vision: A Powerful, Flexible, and Simple Virtual Environment for Research & Education

The UT Grid vision is the creation of a cyberinfrastructure for research and education in which people can develop and test ideas, collaborate, teach, and learn through applications that seamlessly harness the diverse campus compute, visualization, storage, data, and instruments as needed from their personal systems (PCs) and interfaces (web browsers, GUIs, etc.).

UT Grid: Develop and Provide a Unique, Comprehensive Cyberinfrastructure…

The strategy of the UT Grid project is to integrate…

  • common security/authentication

  • scheduling and provisioning

  • aggregation and coordination

    diverse campus resources…

  • computational (PCs, servers, clusters)

  • storage (Local HDs, NASes, SANs, archives)

  • visualization (PCs, workstations, displays, projection rooms)

  • data collections (sci/eng, social sciences, communications, etc.)

  • instruments & sensors (CT scanners, telescopes, etc.)

    from ‘personal scale’ to terascale…

  • personal laptops and desktops

  • department servers and labs

  • institutional (and national) high-end facilities

…That Provides Maximum Opportunity & Capability for Impact in Research, Education

…into a campus cyberinfrastructure…

  • evaluate existing grid computing technologies

  • develop new grid technologies

  • deploy and support appropriate technologies for production use

  • continue evaluation, R&D on new technologies

  • share expertise, experiences, software & techniques

    that provides simple access to all resources…

  • through web portals

  • from personal desktop/laptop PCs, via custom CLIs and GUIs

    to the entire community for maximum impact on

  • computational research in applications domains

  • educational programs

  • grid computing R&D

UT Grid Approach: Leverage Strengths of Campus Environment

  • Like any grid, campus grid must provide services to simplify use of distributed resources

  • But

    • Focus must be to support research and/or education mission of the university

    • Campus grid can leverage vast numbers of PCs and large numbers of clusters

    • Campus grid can integrate novel scientific data collections and research instruments

UT Grid Approach: Leverage Strengths of Campus Environment

  • Important differences from multi-institution grids:

    • Staff in one location, can collaborate face-to-face

    • ‘Controlled’ network environment

    • High-end computing center can lead deployment

  • Important differences from enterprise grids

    • Researchers generally more independent than in company

    • No central IT group governs researchers’ systems

    • Usage models driven by different priorities

  • Important differences from domain-specific grids

    • Might require integration of wider variety of resources

    • Must support wider variety of usage models

UT Has Massive Scale and Unique Deployment Environment

  • ACES building is a model for a university grid

    • Massive bandwidth

    • Multidisciplinary users

    • Numerous PCs, clusters, visualization systems, storage resources

  • UT main campus + UT research campus can be model for multi-institution grid

    • Separated by true WAN, but UT controls paths

    • Massive bandwidth (10GigE) between campuses

    • TACC controls resources on both campuses

UT Grid Project Team Has Participation From Several Campus Departments…

  • Additional UT Partners

    • Information Technology Services (ITS):

      • deploying Roundup clients, will include client s/w in BevoWare

    • College of Engineering IT Group:

      • deploying Roundup clients

    • Center for Instructional Technology (CIT):

      • Helped with Web site, will create education content

    • Department of Computer Sciences

      • integrating Condor flock, partnering in R&D proposals

    • Institute for Computational Engineering & Sciences (ICES):

      • integrating clusters and Condor flock

…and Participation Will Grow Significantly as We Enter Production

  • Additional Partners Expected in next 6 months

    • Mary Wheeler, ICES

      • integrating cluster, leading-edge user

    • Kamy Sepehrnoori, Dept of Petroleum & Geophysical Eng.

      • integrating cluster, leading-edge user

    • College of Fine Arts

      • providing Roundup clients

    • College of Communications

      • interested in storage services

    • Additional outreach through UT ‘Tech Deans’ Committee

    • Additional users through TACC User Services

UT Grid Components

  • Grid User Interfaces

    • Typical grid interface is via user portals

    • Grid User Nodes provide users with command line (shell) interfaces to the grid

  • Grid Resources

    • Compute, storage, visualization, instruments

    • Grid software must provide security, monitoring, remote access

  • Grid Services

    • Authentication (GSI, MyProxy)

    • Scheduling (Condor, CSF)

    • Data management (SRB, Avaki)

UT Grid: Current Status

  • Providing compute services to users today

    • Heterogeneous set of cluster resources (LSF, PBS, LoadLeveler, Condor) and desktop resources (United Devices, Condor)

    • Single sign-on access via user portal

    • Allocation and support procedures

    • Resource monitoring

    • Serial and parallel job submission to clusters and desktop resources

    • Evaluation of scheduling technologies (Condor, CSF)

    • Evaluating workflow solutions (Pegasus)

  • Basic data services

    • Reliable File Transfer tool built using GridFTP, NWS, GPIR

    • Share data across resources using Avaki data grid

    • SRB

  • Visualization services coming soon

    • Remote interactive visualization, Batch rendering, Computational Steering

Challenges Include Scale, Heterogeneity, Purpose, and Policies

  • Usage models:

    • research vs education (vs. administrative)

    • ISV apps vs custom apps

    • Interactive vs batch

    • Serial codes vs parallel codes

    • Etc.

  • Most resources are locally managed

    • Local policies and procedures

    • Different priorities

    • Sense of ownership

    • Varying expertise levels of administrators

    • Varying levels of support

UT Grid: Approach to building the grid

  • Challenge: Getting scientists to use UT Grid

    • Gain confidence that they can meet their computing goals and benefit from using the grid

    • Share their resources by making them available to other grid users

  • Hub & Spoke approach rather than peer resources

    • Leverage existing trust relationships between TACC and campus research users

    • As users become comfortable with grid software, convince them to share their resources
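The hub-and-spoke rollout can be pictured as a small graph that starts as a star around TACC and only later gains direct links between spokes. The Python sketch below is purely illustrative: the `CampusGrid` class and its methods are hypothetical names, not part of any UT Grid software, and the rule that a resource must join through the hub before peering is an assumption drawn from the approach described above.

```python
from collections import defaultdict


class CampusGrid:
    """Toy model of a grid that grows hub-first (hypothetical, for illustration)."""

    def __init__(self, hub):
        self.hub = hub
        self.links = defaultdict(set)  # undirected access links between resources

    def _connect(self, a, b):
        self.links[a].add(b)
        self.links[b].add(a)

    def add_spoke(self, resource):
        """Early phase: attach a campus resource to the hub only."""
        self._connect(self.hub, resource)

    def add_peer_link(self, a, b):
        """Later phase: two willing spokes agree to connect directly."""
        for r in (a, b):
            if self.hub not in self.links[r]:
                raise ValueError(f"{r} must join via the hub first")
        self._connect(a, b)

    def reachable_directly(self, resource):
        """Resources this one is linked to, in sorted order."""
        return sorted(self.links[resource])


grid = CampusGrid("TACC")
for name in ("ACES Cluster", "GEO Cluster", "PGE Cluster"):
    grid.add_spoke(name)

# Only after both owners are comfortable do they peer with each other.
grid.add_peer_link("GEO Cluster", "PGE Cluster")
print(grid.reachable_directly("GEO Cluster"))  # ['PGE Cluster', 'TACC']
```

Keeping every new resource attached to the hub first mirrors the idea of leaning on the existing trust relationship with TACC before asking owners to share with one another.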

UT Grid: Logical View

  • Integrate each set of resources (compute, vis, storage, data) within TACC first

[Diagram label: TACC Compute, Vis, Storage, Data (actually spread across two campuses)]

UT Grid: Logical View

  • Next add other UT resources using same tools and procedures

[Diagram labels: TACC Compute, Vis, Storage, Data; ACES Cluster]

UT Grid: Logical View

  • Next add other UT resources using same tools and procedures

[Diagram labels: TACC Compute, Vis, Storage, Data; ACES Cluster; GEO Cluster (twice); GEO Data]

UT Grid: Logical View

  • Next add other UT resources using same tools and procedures

[Diagram labels: TACC Compute, Vis, Storage, Data; ACES Cluster; GEO Cluster (twice); GEO Data; PGE Cluster; PGE Data; PGE Instrument; BIO Data; BIO Instrument]

UT Grid: Logical View

  • Finally negotiate connections between spokes for willing participants to develop a P2P grid.

[Diagram labels: TACC Compute, Vis, Storage, Data; ACES Cluster; GEO Cluster (twice); GEO Data; PGE Cluster; PGE Data; PGE Instrument; BIO Data; BIO Instrument]

Accessing UT Grid: Portals vs CLIs

  • Choice of portals over command line interfaces is not universal

    • Some researchers prefer to use their current shell interface to access the grid

  • UT Grid supports Grid User Portals (GUPs) and Grid User Nodes (GUNs)

Why Are GUPs Important?

  • Lower the barrier of entry into grid computing

    • Easy access to multiple resources through a single interface

    • Simple GUI interface to complex grid computing capabilities

    • Present a “Virtual Organization” view of the Grid as a whole

UT GUP Infrastructure

  • Portal based on

    • Grid Portal Toolkit 3 (NMI component)

    • Jetspeed Portal infrastructure

  • Underlying Grid Middleware

    • Globus

    • Community Scheduling Framework

    • Network Weather Service

    • Soon: Avaki, SRB

UT GUP Capabilities

  • Initial GUP capabilities include:

    • View information on resources within UT Grid, including status, load, jobs, queues, etc.

    • View network bandwidth and latency between systems, aggregate capabilities for all systems.

    • Submit user jobs and run hosted applications

    • Manage files across systems, and move/copy multiple files between resources with transfer time estimates
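The slides do not say how the portal derives its transfer time estimates. A common first-order model, assumed here, is per-file latency plus payload size divided by measured bandwidth. In this Python sketch the function names, host names, and measurement values are all hypothetical.

```python
def estimate_transfer_seconds(size_bytes, bandwidth_mbps, latency_ms, n_files=1):
    """First-order estimate: per-file latency plus payload time at measured bandwidth.

    This is an assumed model, not the portal's actual algorithm.
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    payload_s = (size_bytes * 8) / (bandwidth_mbps * 1_000_000)
    return n_files * (latency_ms / 1000.0) + payload_s


def pick_fastest_source(size_bytes, measurements):
    """measurements maps host -> (bandwidth_mbps, latency_ms); returns (host, seconds)."""
    estimates = {
        host: estimate_transfer_seconds(size_bytes, bw, lat)
        for host, (bw, lat) in measurements.items()
    }
    best = min(estimates, key=estimates.get)
    return best, estimates[best]


# Made-up measurements for two hypothetical hosts holding the same 1 GB file.
sample = {"host-a": (100.0, 20.0), "host-b": (800.0, 5.0)}
host, seconds = pick_fastest_source(1_000_000_000, sample)
print(host, round(seconds, 3))
```

In practice the bandwidth and latency inputs would come from a monitoring service such as the Network Weather Service mentioned under the portal's middleware; here they are simply hard-coded.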

Community Scheduling Framework (CSF)

  • Open-source metascheduler written by Platform Computing

    • Distributed under Globus Public License

    • Developed using GT3.0.2 and OGSI

    • Will be part of future Globus Toolkit distribution

  • Schedules jobs across heterogeneous resources

    • Advance reservation support

    • Architecture allows pluggable scheduling policies

    • Resource Manager Adapters required to convert requests to local resource manager.

    • Dynamic performance information stored in Global Information Service
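To make the ideas of pluggable policies and resource manager adapters concrete, here is a minimal Python sketch. It is not the CSF API: every class and function name below is hypothetical, the free-CPU counts are made up, and the rendered command strings are simplified (a real submission would also name a job script, and PBS resource syntax varies by flavor).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int


class ResourceManagerAdapter(ABC):
    """Translates a generic job request into one local resource manager's terms."""

    def __init__(self, cluster, free_cpus):
        self.cluster = cluster
        self.free_cpus = free_cpus

    @abstractmethod
    def render_submit(self, job):
        """Return the local submit command for this job (illustrative string only)."""


class LsfAdapter(ResourceManagerAdapter):
    def render_submit(self, job):
        return f"bsub -J {job.name} -n {job.cpus}"


class PbsAdapter(ResourceManagerAdapter):
    def render_submit(self, job):
        return f"qsub -N {job.name} -l nodes=1:ppn={job.cpus}"


def first_fit(job, adapters):
    """Policy plug-in: first cluster with enough free CPUs."""
    return next((a for a in adapters if a.free_cpus >= job.cpus), None)


def most_free(job, adapters):
    """Policy plug-in: the fitting cluster with the most free CPUs."""
    fits = [a for a in adapters if a.free_cpus >= job.cpus]
    return max(fits, key=lambda a: a.free_cpus, default=None)


class MetaScheduler:
    def __init__(self, adapters, policy):
        self.adapters = adapters
        self.policy = policy  # swap this function to change scheduling behaviour

    def schedule(self, job):
        target = self.policy(job, self.adapters)
        if target is None:
            raise RuntimeError(f"no resource can run {job.name}")
        target.free_cpus -= job.cpus
        return target.cluster, target.render_submit(job)


adapters = [LsfAdapter("cluster-lsf", 8), PbsAdapter("cluster-pbs", 64)]
print(MetaScheduler(adapters, first_fit).schedule(Job("demo", 4)))
```

The scheduler never needs to know which resource manager sits behind a cluster. The policy picks a target from generic load information and the adapter handles the translation, which is the separation the bullets above describe.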

UT Grid CSF Configuration

[Diagram labels: User Portal; Web Server; CSF Server; GT3.0; two partially recovered labels ending "for LSF" and "for PBS"]

  • Queues implement customizable scheduling policies using plug-ins

Why Are GUNs Important?

  • Most campus users have PCs for their research & education projects

    • They are used to their local systems

  • They also often need additional resources

    • They may want more flexibility than a portal provides

    • They need to be able to keep doing what they know, issuing same commands, but reaching additional resources

    • They would like access to those resources easily, even transparently

  • The Grid User Node concept is designed to provide these features and capabilities

Current Linux GUN Software

Users have the option of installing the software stack on their desktops or using a “hosted” GUN.

  • Linux Red Hat 9.0

  • Globus 3.2.1 NMI Release 5

    • Ant v1.6.2

    • Java J2SE SDK v1.4

    • Grid Package Tools v3.2.1 NMI 5

  • GridShell (pre-release version)

  • Condor


  • United Devices SDK 4.1

    • Perl v 5.6

What is GridShell?

  • GridShell is an extension of TCSH and BASH shells

    • includes transparent distributed execution and data transfer features for intra- and inter-cluster execution of programs

    • Currently supports LSF, Condor and Globus environments

    • Goal is to extend services to match portal services

Grid User Node

UT Grid: Application driven design

  • UT Grid design based on user requirements

    • Initial user set has been identified

    • Monthly meetings, mailing lists

    • Interviews to understand use cases

  • Initial set of application areas has compute, storage, and visualization requirements

    • Computational Fluid Dynamics (Dr. Carey)

    • Reservoir modeling (Dr. Wheeler)

    • Flood prediction (Dr. Wells & Dr. Maidment)

UT Grid: Education

  • Training courses offered 3-4 times/year

    • Gridport (offered via Access Grid)

    • Running applications using United Devices

    • HPC training (MPI apps, tools)

  • Courses offered through CS department

    • High Performance Computing for scientists (this semester)

    • Grid Computing in science and engineering (summer ‘05)

  • CIT planning to provide educational content about UT Grid research applications

NMI Experiences

  • TACC has benefited from using NMI

    • Easier to install & configure components

    • Better documentation & support

    • Software is more robust since it has gone through a level of integration testing

    • Exposure to new components (Gridsolve)

    • Working with other NMI Testbed members

  • Although NMI components are fairly reliable

    • They are still evolving, and occasionally cause backward compatibility issues (e.g. between Globus versions 3.0.2, 3.2, 3.2.1, and 4.0)

  • NMI not a complete grid solution

    • Components do not address: scheduling, workflow, accounting, …

UT Grid Project Team

  • Jay Boisseau

  • Maytal Dahan

  • Edward Walker

  • Ashok Adiga

  • Ashesh Sahib

  • CJ Barker

  • Akhil Seth

  • David Walling

  • Eric Roberts

  • Jeff Mausolf (IBM)

  • Nina Wilner (IBM)

Texas Advanced Computing Center


(512) 475-9411