high throughput urgent computing
Download
Skip this Video
Download Presentation
High Throughput Urgent Computing

Loading in 2 Seconds...

play fullscreen
1 / 16

High Throughput Urgent Computing - PowerPoint PPT Presentation


  • 60 Views
  • Uploaded on

Condor Week 2008. High Throughput Urgent Computing. Jason Cope [email protected] Project Collaborators. Argonne National Laboratory / University of Chicago Pete Beckman Suman Nadella Nick Trebon University of Wisconsin-Madison Ian Alderman Miron Livny. Urgent Computing Use Cases.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' High Throughput Urgent Computing' - abra


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
project collaborators
Project Collaborators
  • Argonne National Laboratory / University of Chicago
    • Pete Beckman
    • Suman Nadella
    • Nick Trebon
  • University of Wisconsin-Madison
    • Ian Alderman
    • Miron Livny
high throughput urgent computing1
High Throughput Urgent Computing
  • Urgent computing provides immediate, cohesive access to computing resources for emergency computations
  • Support for urgent high throughput computing environments is necessary
    • Support for high throughput emergency computing applications
    • Urgent cycle scavenging
spruce
SPRUCE
  • Special PRiority Urgent Computing Environment (SPRUCE)
    • TeraGrid Science Gateway
    • http://spruce.teragrid.org
  • GOAL: Provide cohesive urgent computing infrastructure for emergency computations
    • Authorization
    • Resource Selection
    • Resource Allocation
spruce architecture overview 1 2

Event

Automated

Trigger

2

First Responder

SPRUCE Gateway / Web Services

1

Right-of-Way

Token

Human

Trigger

Right-of-Way Token

SPRUCE Architecture Overview ( 1 / 2 )

Source: Pete Beckman, ‘SPRUCE: An Infrastructure for Urgent Computing’

spruce architecture overview 2 2
SPRUCE Architecture Overview ( 2 / 2 )

User Team

Authentication

4

?

Urgent Computing

Job Submission

Conventional

Job Submission

Parameters

Priority Job

Queue

Choose a

Resource

SPRUCE Job

Manager

3

!

5

Local Site

Policies

Urgent Computing

Parameters

Supercomputer

Resource

Source: Pete Beckman, ‘SPRUCE: An Infrastructure for Urgent Computing’

spruce resources
SPRUCE Resources
  • Deployed on TeraGrid resources at IU, NCSA, NCAR, Purdue, TACC, SDSC, UC/ANL
  • Supported Resource Managers
    • PBS
    • PBS Pro
    • LSF
    • SGE
    • LoadLeveler
    • Cobalt
  • Local and Grid resource managers supported
spruce and condor
SPRUCE and Condor

User Team

Authentication

?

Urgent Computing

Job Submission

Conventional

Job Submission

Parameters

Choose a

Resource

SPRUCE Job

Manager

3

!

4

Local Site

Policies

Urgent Computing

Parameters

Condor Pool

Adapted from Pete Beckman, ‘SPRUCE: An Infrastructure for Urgent Computing’

spruce condor integration
SPRUCE / Condor Integration
  • Added support for urgent computing ClassAds
    • SPRUCE_URGENCY
    • SPRUCE_TOKEN_VALID
    • SPRUCE_TOKEN_VALID_CHECK_TIME
  • Modifications to the Condor schedd that support identifying SPRUCE jobs
  • SPRUCE Grid ASCII Helper Protocol (GAHP) Server
    • Asynchronously invoke SPRUCE Web service operations
    • GAHP calls integrated into the Condor schedd
spruce condor integration2
SPRUCE / Condor Integration
  • SPRUCE provides an authorization mechanism for access to Condor resources
    • “Right-of-Way” access to Condor resources
    • Same authorization infrastructure for supercomputer and Grid resource access
  • Leverage existing Condor features to enhance scheduling policies
    • Job ranking / suspension / preemption
    • Site administrators define local scheduling policies
spruce condor status
SPRUCE / Condor Status
  • Prototype complete August, 2007
    • Demonstrated urgent authorization and scheduling capabilities
    • Deployed and tested on equipment at the University of Colorado
  • Currently revising the prototype for a stable software release
    • Condor 7.0 support
    • Final software development iteration before official release
    • Evaluation of SPRUCE-related software integrated into larger Condor pools
future work
Future Work
  • High throughput support for urgent computing applications
    • SURA SCOOP CH3D Grid Appliance
  • Many additional evaluation tasks
    • Application requirements
    • Security
    • Deadline scheduling / response time
    • Reliability / fault tolerance analysis
    • Data management
high throughput urgent computing2
High Throughput Urgent Computing

Questions?

[email protected]

http://spruce.teragrid.org

ad