Open source cluster application resources oscar
Download
1 / 22

Open Source Cluster Application Resources OSCAR - PowerPoint PPT Presentation


  • 237 Views
  • Updated On :

Open Source Cluster Application Resources (OSCAR). Stephen L. Scott Thomas Naughton Geoffroy Vallée. Network and Cluster Computing Computer Science and Mathematics Division. OSCAR. OSCAR. O pen S ource C luster A pplication R esources.

Related searches for Open Source Cluster Application Resources OSCAR

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Open Source Cluster Application Resources OSCAR' - colette


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Open source cluster application resources oscar
Open Source Cluster Application Resources (OSCAR)

Stephen L. ScottThomas NaughtonGeoffroy Vallée

Network and Cluster Computing

Computer Science and Mathematics Division


O pen s ource c luster a pplication r esources

OSCAR

OSCAR

Open Source Cluster Application Resources

  • Consortium of academic, research and industry members.

  • Snapshot of best known methods for building, programming and using clusters.


Over 5 years of oscar
Over 5-years of OSCAR

Concept first discussed

January 2000

OSCARV5.0

April 2000

Organizational meeting

  • Cluster assembly is time consuming and repetitive

  • Nice to offer a toolkit to automate

  • Leverage wealth of open source components

April 2001

First public release

Nov 2006

Released at SC06


What does oscar do
What does OSCAR do?

  • Wizard based cluster software installation

    • Operating system

    • Cluster environment

  • Automatically configures cluster components

  • Increases consistency among cluster builds

  • Reduces time to build / install a cluster

  • Reduces need for expertise


Design goals
Design goals

Leverage“best practices” whenever possible

Reduce overhead for cluster management

Extensibility for new Software and Projects

  • Modular meta-package system / API – “OSCAR Packages”

  • Keep it simple for package authors

  • Open Source to foster reuse and community participation

  • Fosters “spin-offs” to reuse OSCAR framework

  • Native package systems

  • Existing distributions

  • Management, system and applications

  • Keep the interface simple

  • Provide basic operations of cluster software and node administration

  • Enable others to re-use and extend system – deployment tool


Oscar overview
OSCAR overview

  • Simplifies installation, configuration and operation

  • Reduces time/learning curve for cluster build

    • Requires: pre-installed headnode with supported Linux distribution

    • Thereafter: wizard guides user through setup/install of entire cluster

Framework for cluster management

  • Content: Software + Configuration, Tests, Docs

  • Types:

    • Core: SIS, C3, Switcher, ODA, OPD, APItest, Support Libs

    • Non-core: selected & third-party (PVM, LAM/MPI, Toque/Maui,...)

  • Access: repositories accessible via OPD/OPDer

Package-based framework


Oscar packages
OSCAR packages

  • Simple way to wrap software & configuration

    • “Do you offer package Foo version X?”

  • Basic Design goals

    • Keep simple for package authors

    • Modular packaging (each self contained)

    • Timely release/updates

  • Leverage RPM + meta file + scripts, tests, docs, …

    • Recently extended to better support RPM, Debs, etc.

  • Repositories for downloading via OPD/OPDer


Oscar cluster installation wizard
OSCAR – Cluster Installation Wizard

Start

Step 1

Step 2

Step 8

Step 3

Done!

Step 7

Cluster deployment

monitor

Step 4

Step 5

Step 6


Oscar components
OSCAR components

  • SIS, C3, OPIUM, Kernel-Picker & cluster services (dhcp, nfs, ntp, ...)

  • Security: Pfilter, OpenSSH

Administration/Configuration

  • Parallel Libs: MPICH, LAM/MPI, PVM, Open MPI

  • OpenPBS/MAUI, Torque, SGE

  • HDF5

  • Ganglia, Clumon

  • Other 3rd party OSCAR Packages

HPC Services/Tools

Core Infrastructure/Management

  • System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher

  • OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)


C3 power tools
C3 Power Tools

  • Command-line interface for cluster system administration and parallel user tools

  • Parallel execution cexec

    • Execute across a single cluster or multiple clusters at same time

  • Scatter/gather operations cpush / cget

    • Distribute or fetch files for all node(s)/cluster(s)

  • Used throughout OSCAR

    • mechanism for cluster wide operations


Highlights
Highlights

  • Improved distribution and architecture support

  • Node installation monitor

  • New network setup options and other GUI enhancements

  • “Use Your Own Kernel” (UYOK) for SystemImager

  • New OSCAR Packages:

    • Open MPI, SC3, SGE, YUME, NetbootMgr

  • Supported platforms:

    x86: fc4, fc5, mdv2006, rhel4, suse10.0 (tentative)

    x86_64: fc4, fc5, rhel4

OSCAR v5.0

  • Diskless OSCAR & LiveCD

  • OSCAR on Debian (OoD)

  • OSCAR Command Line Interface (CLI)

  • Native package management – network installation

In progress


Oscar proven scalability
OSCAR: proven scalability

Endeavor 232 nodes with 928 CPUs

ORNL OIC 440 nodes with 880 CPUs

McKenzie 264 nodes with 528 CPUs

SUN-CLUSTER 128 nodes with 512 CPUs

Cacau 205 nodes with 410 CPUs

Barossa 184 nodes with 368 CPUs

Smalley 66 nodes with 264 CPUs

OSCAR-SSC 130 nodes with 130 CPUs

Selected machines registered at OSCAR website

Based on data taken on 11/2/2006:

  • OSCAR Cluster Registration Pagehttp://oscar.openclustergroup.org/cluster-register?sort=node_count

  • ORNL OIC User Guidehttp://oic.ornl.gov/ornlindex.html#architecture


More oscar information

OSCAR

OSCAR

More OSCAR information…

oscar.OpenClusterGroup.org

Home Page

Development Page

svn.oscar.openclustergroup.org/trac/oscar

[email protected]

[email protected]

Mailing Lists

Open Cluster Group

www.OpenClusterGroup.org

OSCAR Symposium

www.csm.ornl.gov/oscar07

  • OSCAR Research supported by the

    • Mathematics, Information and Computational Sciences Office,

    • Office of Advanced Scientific Computing Research, Office of Science,

    • U. S. Department of Energy, under contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.


Oscar flavors
OSCAR “flavors”

SSS-OSCAR

SSI-OSCAR

NEC's OSCAR-Pro

HA-OSCAR


Ha oscar
HA-OSCAR:

RAS Management for HPC cluster: Self-Awareness

  • The first known field-grade open source HA Beowulf cluster release

  • Self-configuration Multi-head Beowulf system

  • HA and HPC clustering techniques to enable critical HPC infrastructure

  • Services:Active/Hot Standby

  • Self-healing with 3-5 sec automatic failover time


Nec s oscar pro
NEC's OSCAR-Pro

Presented at OSCAR'06

keynote by Erich Focht (NEC)

  • Leverage open source tool

  • Joined project / contributions to OSCAR core

  • Integrate additions when applicable

  • Feedback and direction based on user needs

Commercial Enhancements


Scalable system software
Scalable System Software

  • Computer centers use incompatible, ad hoc set of systems tools

  • Tools are not designed to scale to multi-Teraflop systems

  • Duplication of work to try and scale tools

  • System growth vs. Administrator growth

Problems

  • Define standard interfaces for system components

  • Create scalable, standardized management tools

  • (Subsequently) reduce costs & improve efficiency at centers

Goals

  • DOE Labs: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames

  • Academics: NCSA, PSC, SDSC

  • Industry: IBM, Cray, Intel, SGI

Participants


Sss project overview
SSS project overview

Standardize the system interfaces

Map out functional areas

  • Schedulers, job managers

  • System monitors

  • Accounting and user management

  • Checkpoint/restart

  • Build and configure systems

  • Open forum of universities, labs, industry representatives

  • Define component interfaces in XML

  • Develop communication infrastructure


Components writtenin any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite

Metamonitor

Meta manager

Meta scheduler

Metaservices

Node state manager

Accounting

Scheduler

System and job monitor

Service directory

Node configuration and build manager

Authentication Communication

Standard XML

interfaces

Event manager

Allocation management

Job queue manager

Hardware infrastructure manager

Usage reports

Beckerman_0611

Checkpoint restart

Process manager


Sss oscar components
SSS-OSCAR Components

Queue/Job Manager

Bamboo

Berkeley Checkpoint/Restart

BLCR

Accounting & Allocation Management System

Gold

LAM/MPI (w/ BLCR)

Checkpoint/Restart enabled MPI

Job Scheduler

MAUI-SSS

  • SSS Communication library

    • Includes: SD, EM, PM, BCM, NSM, NWI

SSSLib

Warehouse

Distributed System Monitor

MPD2

MPI Process Manager


Single system image open source application resources ssi oscar
Single System Image Open Source Application Resources (SSI-OSCAR)

  • Easy use thanks to SSI systems

    • SMP illusion

    • High performance

    • Fault Tolerance

  • Easy management thanks to OCSAR

    • Automatic cluster install/update


Contacts
Contacts (SSI-OSCAR)

Stephen L. Scott

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 574-3144

[email protected]

Thomas Naughton

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 576-4184

[email protected]

  • Geoffroy Vallée

    Network and Cluster Computing

    Computer Science and Mathematics Division

  • (865) 574-3152

  • [email protected]

Beckerman_0611

22 Scott_OSCAR_0611


ad