SciDAC


SciDAC Scalable Systems Software Center August 14-15 Atlanta GA Agenda - August 14 8:00 wireless set up 9:00 Introductions 9:30 Overview and Goals of the Center (Geist) 10:00 SciDAC ISIC Expectations (Johnson) 10:30 Discussion of meeting goals

Presentation Transcript
slide1

SciDAC

Scalable Systems Software Center

August 14-15

Atlanta GA

slide2

Agenda - August 14

8:00 wireless set up

9:00 Introductions

9:30 Overview and Goals of the Center (Geist)

10:00 SciDAC ISIC Expectations (Johnson)

10:30 Discussion of meeting goals

11:30 Strawman proposal for an interface framework

12:00 Lunch (as group to hotel restaurant)

1:00 Other proposals/ideas for system interfaces

doug - runtime architecture (scyld vs cplant vs ???)

karl – rm-api, cpu sets, need to schedule fat nodes

don – scyld boot method, multicast status info

paul – checkpoint/restart

sung – science appliance project

3:00 Enumerating key attributes

4:00 Discuss merits of attribute database

5:00 Break for dinner

slide3

Agenda - August 15

8:00 wireless set up

8:30 Decide on logistics of consensus

9:30 Overnight proposals?

10:00 Decide on working groups for key areas

Begin initial discussion of interfaces & integration

12:00 Next meeting dates, what happens till then

12:30 Meeting Ends

Lunch and further discussion for hangers-on

slide4

Scalable Systems Software for Terascale Computer Centers

www.scidac.org/ScalableSystems

Problem

  • Computer centers use incompatible, ad hoc set of systems tools
  • Present tools are not designed to scale to multi-Teraflop systems

Solution

  • Collectively (with industry) define standard interfaces between systems components for interoperability
  • Create scalable, standardized management tools for efficiently running our large computing centers

Impact

  • Revolutionize the way system software is designed and used.

[Figure labels: Resource Management; Accounting & user mgmt; System Monitoring; System Build & Configure; Job management]

slide5

Goal and Vision of the Center

Four Goals

Collectively (with industry) agree on and specify standardized interfaces between system components in order to promote interoperability, portability, and long-term usability.

Produce a fully integrated suite of systems software and tools for the effective management and utilization of terascale computational resources, particularly those at the DOE facilities.

Research and development of more advanced versions of the components as well as OS modifications required to support the scalability and performance requirements of SciDAC applications.

Carry out a software lifecycle plan for support and maintenance of systems software suite.

slide6

Scope—The Spaghetti and Meatballs Picture

[Figure: component diagram. Labels: Access control Security manager (interacts with all components); Meta Scheduler; Meta Monitor; Meta Manager; Node Configuration & Build Manager; System Monitor; Accounting; Scheduler; Resource Allocation management; Job Manager & Monitor; Queue Manager; User DB; Data Migration; High Performance Communication & I/O; Usage Reports; User utilities; Checkpoint/Restart; File System; Application Environment]

slide7

Working with Computer Centers

Our customers are the managers and system administrators at the terascale computer centers around the nation.

their guidance

their feedback

Working with other SciDAC Centers

Common Component Architecture

parallel startup

event services

runtime framework

Scalable Data Management

others?

slide8

Meeting Goals

Decide logistics of reaching consensus on standard interfaces

MPI-like process, CCA-like process, other?

How to deal with errata

Enumeration of key attributes common across system components

expect there are fewer than 30

Discuss whether an attribute database should be a part of the architecture

could be considered as just another component

Begin defining interfaces and working groups for key areas:

node configuration & building,

resource management,

parallel job startup,

system & job monitoring

slide9

Infrastructure

Project Web Page – www.scidac.org/ScalableSystems

proposal plan

overview slides

links to individual sites and software downloads

Project Notebook – www.csm.ornl.gov/~geist/enote/system.html

meeting notes (like this meeting)

progress reports

draft standards for group to comment on

CVS when we begin to produce software suite

slide10

Strawman

a common integrated interface framework

Easy to swap components

Vendor

optimized

highly scalable version

common pool of attributes

XML format for attributes

Standardized request protocol

Choose an existing transfer protocol (TCP)

Every component uses the same framework

Attribute database: User, Host, OS, Mem, Allocation, Etc…
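
The strawman above (a common pool of attributes, XML format for attributes, a standardized request protocol over TCP) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not part of the proposal: the `request` and `attribute` element names, the 4-byte length prefix used for framing, and the function names are all hypothetical.

```python
import socket
import struct
import threading
import xml.etree.ElementTree as ET


def encode_request(component, action, attributes):
    """Serialize a request as XML (element names are hypothetical)."""
    root = ET.Element("request", {"component": component, "action": action})
    for name, value in attributes.items():
        attr = ET.SubElement(root, "attribute", {"name": name})
        attr.text = str(value)
    return ET.tostring(root, encoding="utf-8")


def decode_request(payload):
    """Parse the XML back into (component, action, attribute dict)."""
    root = ET.fromstring(payload)
    attrs = {a.get("name"): a.text for a in root.findall("attribute")}
    return root.get("component"), root.get("action"), attrs


def send_message(sock, payload):
    # Assumed framing: 4-byte big-endian length, then the XML bytes.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf


def recv_message(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def round_trip(component, action, attributes):
    """Send one request over loopback TCP; return (decoded request, reply)."""
    received = {}
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)

    def serve():
        conn, _ = server.accept()
        with conn:
            received["request"] = decode_request(recv_message(conn))
            send_message(conn, b"<response status='ok'/>")

    worker = threading.Thread(target=serve)
    worker.start()
    with socket.create_connection(server.getsockname()) as client:
        send_message(client, encode_request(component, action, attributes))
        reply = recv_message(client)
    worker.join()
    server.close()
    return received["request"], reply
```

Because every component would share the same encoding and framing, swapping one component for another only requires that the replacement understand the same attribute names, which is the point of the common pool.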

slide11

Objects and Components

Components:

Job Manager

System Monitor

Accounting

Allocation management

Logging

Node Management

Process Management

Job monitor

Configuration management

Scheduler

Queue manager

Meta-services

Information service

System management

Components:

Checkpoint

File staging

Security manager

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide12

Node, system, and configuration Services

Services:

Start job

Signal job

Services:

Start job

Signal job

Services:

Start job

Signal job

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide13

Job and System Monitor Services

Services:

Start job

Signal job

Services:

Start job

Signal job

Services:

Start job

Signal job

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide14

Accounting and logging Services

Services:

Start job

Signal job

Services:

Start job

Signal job

Services:

Start job

Signal job

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide15

Job and Process Mgmt (+chkpt) Services

Services:

Start job

Signal job

Services:

Start job

Signal job

Services:

Start job

Signal job

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide16

Scheduler, Queue, and meta- Services

Services:

Start job

Signal job

Services:

Start job

Signal job

Services:

Start job

Signal job

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

slide17

Information Services

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

Static Services:

Start job

Signal job

Slow Services:

Start job

Signal job

Fast Services:

Start job

Signal job

slide18

Security Mgr Services

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

Services:

Start job

Signal job

slide19

Storage and I/O Services

Objects:

Job

Node

Task

User

Group

Account/project

Queue???

Data store/IO

Interconnect

partition

Services:

Start job

Signal job

slide20

Consensus and Voting Rules:

Written Documentation:

Written draft standards available to everyone in Project notebook

Drafts must be presented a week to 10 days before a vote

Errata or extensions—revisit interface standard every 6 months

Voting:

Pass with simple majority of people voting yes/no

Who can vote?

Organizations with physical attendance at 2 of last 3 meetings

One vote per organization

No email-in or phone votes accepted.

Straw votes are non-binding and many can be used for guidance

Two formal votes are required to accept a chapter for final vote

Both votes can’t occur at the same meeting.

Global vote of whole document as standard interface
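
The eligibility and majority rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and organization labels are hypothetical, a tie is assumed to fail, and abstentions are assumed not to count toward the "people voting yes/no" total.

```python
def eligible_organizations(attendance, window=3, required=2):
    """Return organizations physically present at `required` of the last `window` meetings.

    attendance: list of sets of organization names, one set per meeting,
    in chronological order.
    """
    recent = attendance[-window:]
    orgs = set().union(*recent)
    return {org for org in orgs if sum(org in meeting for meeting in recent) >= required}


def vote_passes(ballots, eligible):
    """Simple majority of eligible organizations voting yes or no.

    ballots: dict mapping organization -> "yes", "no", or "abstain".
    A dict has one entry per key, which enforces one vote per organization.
    Assumption: abstentions are ignored and a tie does not pass.
    """
    counted = [v for org, v in ballots.items() if org in eligible and v in ("yes", "no")]
    yes = counted.count("yes")
    no = len(counted) - yes
    return yes > no


# Hypothetical attendance records for four meetings.
meetings = [
    {"org_a", "org_b", "org_c"},
    {"org_a", "org_b"},
    {"org_a", "org_d"},
    {"org_b", "org_d"},
]
voters = eligible_organizations(meetings)
result = vote_passes({"org_a": "yes", "org_b": "no", "org_c": "no", "org_d": "yes"}, voters)
```

In this example `org_c` attended none of the last three meetings, so its ballot is not counted.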

slide21

Other organizational:

Weekly teleconference of Working Groups: Up to the groups

Video Conference Meetings: Explore AG in future