Open Source Cluster Application Resources (OSCAR)

Stephen L. Scott, Thomas Naughton, Geoffroy Vallée

Network and Cluster Computing

Computer Science and Mathematics Division



Open Source Cluster Application Resources

  • Consortium of academic, research and industry members.

  • Snapshot of best known methods for building, programming and using clusters.

Over 5 years of OSCAR

January 2000: Concept first discussed

April 2000: Organizational meeting

  • Cluster assembly is time consuming and repetitive

  • Nice to offer a toolkit to automate

  • Leverage wealth of open source components

April 2001: First public release

Nov 2006: Released at SC06

What does OSCAR do?

  • Wizard-based cluster software installation

    • Operating system

    • Cluster environment

  • Automatically configures cluster components

  • Increases consistency among cluster builds

  • Reduces time to build / install a cluster

  • Reduces need for expertise

Design goals

Reduce overhead for cluster management

  • Keep the interface simple

  • Provide basic operations of cluster software and node administration

  • Enable others to re-use and extend system – deployment tool

Leverage “best practices” whenever possible

  • Native package systems

  • Existing distributions

  • Management, system and applications

Extensibility for new software and projects

  • Modular meta-package system / API – “OSCAR Packages”

  • Keep it simple for package authors

  • Open Source to foster reuse and community participation

  • Fosters “spin-offs” to reuse OSCAR framework

OSCAR overview

Framework for cluster management

  • Simplifies installation, configuration and operation

  • Reduces time/learning curve for cluster build

    • Requires: pre-installed headnode with supported Linux distribution

    • Thereafter: wizard guides user through setup/install of entire cluster

Package-based framework

  • Content: Software + Configuration, Tests, Docs

  • Types:

    • Core: SIS, C3, Switcher, ODA, OPD, APItest, Support Libs

    • Non-core: selected & third-party (PVM, LAM/MPI, Torque/Maui, ...)

  • Access: repositories accessible via OPD/OPDer

OSCAR packages

  • Simple way to wrap software & configuration

    • “Do you offer package Foo version X?”

  • Basic Design goals

    • Keep simple for package authors

    • Modular packaging (each self contained)

    • Timely release/updates

  • Leverage RPM + meta file + scripts, tests, docs, …

    • Recently extended to better support RPM, Debs, etc.

  • Repositories for downloading via OPD/OPDer
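As an illustration of the "meta file + scripts, tests, docs" idea above, here is a minimal Python sketch of a package descriptor and a repository lookup that answers "Do you offer package Foo version X?". The class names, fields, and sample packages are hypothetical and do not reflect the actual OSCAR meta file schema or the OPD interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PackageMeta:
    """Hypothetical stand-in for a package's meta file (not OSCAR's real schema)."""
    name: str
    version: str
    native_formats: tuple = ("rpm",)  # native package types wrapped, e.g. "rpm", "deb"
    scripts: tuple = ()               # names of configuration scripts shipped alongside
    has_tests: bool = False
    has_docs: bool = False


@dataclass
class Repository:
    """In-memory stand-in for a downloadable package repository."""
    packages: list = field(default_factory=list)

    def offers(self, name: str, version: str) -> bool:
        """Answer the question: do you offer package `name` at `version`?"""
        return any(p.name == name and p.version == version for p in self.packages)


# Sample data, invented for this sketch.
repo = Repository([
    PackageMeta("foo", "1.2", native_formats=("rpm", "deb"), has_tests=True),
    PackageMeta("bar", "0.9"),
])

print(repo.offers("foo", "1.2"))
print(repo.offers("foo", "2.0"))
```

Keeping each descriptor self-contained mirrors the "modular packaging" goal: a repository only needs the metadata to decide what it can serve.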

OSCAR – Cluster Installation Wizard

[Figure: Cluster Installation Wizard screenshot with callouts for Step 1 through Step 8, leading to cluster deployment]

OSCAR components

  • SIS, C3, OPIUM, Kernel-Picker & cluster services (dhcp, nfs, ntp, ...)

  • Security: Pfilter, OpenSSH

HPC Services/Tools

  • Parallel Libs: MPICH, LAM/MPI, PVM, Open MPI

  • OpenPBS/MAUI, Torque, SGE

  • HDF5

  • Ganglia, Clumon

  • Other 3rd party OSCAR Packages

Core Infrastructure/Management

  • System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher

  • OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)

C3 Power Tools

  • Command-line interface for cluster system administration and parallel user tools

  • Parallel execution: cexec

    • Execute across a single cluster or multiple clusters at same time

  • Scatter/gather operations: cpush / cget

    • Distribute or fetch files for all node(s)/cluster(s)

  • Used throughout OSCAR

    • Mechanism for cluster-wide operations
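The parallel-execution idea behind cexec can be sketched in a few lines of Python. This is a conceptual illustration only, not how C3 is implemented: the function names are hypothetical, and the default runner assumes passwordless ssh to each node.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_via_ssh(node: str, command: str) -> str:
    """Run `command` on `node` over ssh and return its stdout.

    Assumption: key-based (passwordless) ssh login to every node.
    """
    result = subprocess.run(
        ["ssh", node, command], capture_output=True, text=True, check=True
    )
    return result.stdout


def parallel_exec(nodes, command, runner=run_via_ssh, max_workers=16):
    """Run `command` on every node concurrently and return {node: output}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all jobs first so they run concurrently, then collect results.
        futures = {node: pool.submit(runner, node, command) for node in nodes}
        return {node: fut.result() for node, fut in futures.items()}


# Dry run with a stand-in runner, so no real cluster is needed.
def fake_runner(node: str, command: str) -> str:
    return f"{node}: {command}"


print(parallel_exec(["node1", "node2"], "uptime", runner=fake_runner))
```

Passing the runner as a parameter keeps the fan-out logic separate from the transport, which also makes the sketch testable without a cluster.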


OSCAR v5.0

  • Improved distribution and architecture support

  • Node installation monitor

  • New network setup options and other GUI enhancements

  • “Use Your Own Kernel” (UYOK) for SystemImager

  • New OSCAR Packages:

    • Open MPI, SC3, SGE, YUME, NetbootMgr

  • Supported platforms:

    • x86: fc4, fc5, mdv2006, rhel4, suse10.0 (tentative)

    • x86_64: fc4, fc5, rhel4

In progress

  • Diskless OSCAR & LiveCD

  • OSCAR on Debian (OoD)

  • OSCAR Command Line Interface (CLI)

  • Native package management – network installation

OSCAR: proven scalability

Endeavor 232 nodes with 928 CPUs

ORNL OIC 440 nodes with 880 CPUs

McKenzie 264 nodes with 528 CPUs

SUN-CLUSTER 128 nodes with 512 CPUs

Cacau 205 nodes with 410 CPUs

Barossa 184 nodes with 368 CPUs

Smalley 66 nodes with 264 CPUs

OSCAR-SSC 130 nodes with 130 CPUs

Selected machines registered at OSCAR website

Based on data taken on 11/2/2006:

  • OSCAR Cluster Registration Page

  • ORNL OIC User Guide
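For a quick reading of the registration figures above, the following Python snippet computes CPUs per node for each listed machine and the totals across the list. The numbers are copied from this slide; the variable names are only illustrative.

```python
# (nodes, CPUs) for each machine, as listed on the slide.
clusters = {
    "Endeavor": (232, 928),
    "ORNL OIC": (440, 880),
    "McKenzie": (264, 528),
    "SUN-CLUSTER": (128, 512),
    "Cacau": (205, 410),
    "Barossa": (184, 368),
    "Smalley": (66, 264),
    "OSCAR-SSC": (130, 130),
}

# CPUs per node for each machine.
cpus_per_node = {name: cpus / nodes for name, (nodes, cpus) in clusters.items()}

total_nodes = sum(nodes for nodes, _ in clusters.values())
total_cpus = sum(cpus for _, cpus in clusters.values())

for name, ratio in cpus_per_node.items():
    print(f"{name}: {ratio:g} CPUs per node")
print(f"Total: {total_nodes} nodes, {total_cpus} CPUs")
```

The listed machines range from one to four CPUs per node.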

More OSCAR information…

Home Page

Development Page

[email protected]

[email protected]

Mailing Lists

Open Cluster Group

OSCAR Symposium

  • OSCAR Research supported by the

    • Mathematics, Information and Computational Sciences Office,

    • Office of Advanced Scientific Computing Research, Office of Science,

    • U. S. Department of Energy, under contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

OSCAR “flavors”





HA-OSCAR

RAS Management for HPC cluster: Self-Awareness

  • The first known field-grade open source HA Beowulf cluster release

  • Self-configuring multi-head Beowulf system

  • HA and HPC clustering techniques to enable critical HPC infrastructure

  • Services: Active/Hot Standby

  • Self-healing with 3-5 sec automatic failover time
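The active/hot-standby idea can be illustrated with a small heartbeat-timeout sketch in Python. This is a generic pattern under stated assumptions, not HA-OSCAR's actual mechanism: the class, the head-node names, and the 2-second timeout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HeadPair:
    """Hypothetical active/standby head-node pair driven by heartbeats."""
    active: str
    standby: str
    timeout: float = 2.0        # seconds of silence tolerated (illustrative value)
    last_heartbeat: float = 0.0

    def heartbeat(self, now: float) -> None:
        """Record that the active head reported in at time `now`."""
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        """Promote the standby if the active head has been silent too long.

        Returns True when a failover happened.
        """
        if now - self.last_heartbeat > self.timeout:
            self.active, self.standby = self.standby, self.active
            self.last_heartbeat = now  # restart the clock for the new active head
            return True
        return False


pair = HeadPair(active="head1", standby="head2")
pair.heartbeat(10.0)
print(pair.check(11.0), pair.active)  # still within the timeout
print(pair.check(13.5), pair.active)  # silence exceeded the timeout
```

Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock.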

NEC's OSCAR-Pro

Presented at OSCAR'06: keynote by Erich Focht (NEC)

Commercial Enhancements

  • Leverage open source tool

  • Joined project / contributions to OSCAR core

  • Integrate additions when applicable

  • Feedback and direction based on user needs

Scalable System Software

  • Computer centers use incompatible, ad hoc set of systems tools

  • Tools are not designed to scale to multi-Teraflop systems

  • Duplication of work to try and scale tools

  • System growth vs. Administrator growth


  • Define standard interfaces for system components

  • Create scalable, standardized management tools

  • (Subsequently) reduce costs & improve efficiency at centers



  • Academics: NCSA, PSC, SDSC

  • Industry: IBM, Cray, Intel, SGI


SSS project overview

Map out functional areas

  • Schedulers, job managers

  • System monitors

  • Accounting and user management

  • Checkpoint/restart

  • Build and configure systems

Standardize the system interfaces

  • Open forum of universities, labs, industry representatives

  • Define component interfaces in XML

  • Develop communication infrastructure

Components written in any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite.
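To show why XML-defined interfaces let components in different languages interoperate, here is a minimal Python sketch that builds and parses an XML request using only the standard library. The element and attribute names (`request`, `component`, `action`, `param`) are invented for this example and are not the actual SSS message schema.

```python
import xml.etree.ElementTree as ET


def build_request(component: str, action: str, **params) -> bytes:
    """Serialize a request as XML bytes (hypothetical message layout)."""
    root = ET.Element("request", {"component": component, "action": action})
    for key, value in params.items():
        ET.SubElement(root, "param", {"name": key}).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def parse_request(payload: bytes):
    """Decode XML bytes back into (component, action, params)."""
    root = ET.fromstring(payload)
    params = {p.get("name"): p.text for p in root.findall("param")}
    return root.get("component"), root.get("action"), params


# Round trip: any language with an XML parser could sit on either side.
payload = build_request("job-queue-manager", "submit", nodes=4)
print(parse_request(payload))
```

Because the wire format is plain XML, the receiving component only needs an XML parser, regardless of the language it is written in.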


[Figure: Scalable Systems Software Suite components: meta manager, meta scheduler, node state manager, system and job monitor, service directory, node configuration and build manager, event manager, allocation management, job queue manager, hardware infrastructure manager, usage reports, checkpoint/restart, and process manager, connected through authentication, communication, and standard XML interfaces]

SSS-OSCAR Components

  • Queue/Job Manager

  • Berkeley Checkpoint/Restart

  • Accounting & Allocation Management System

  • Checkpoint/Restart enabled MPI

  • Job Scheduler

  • SSS Communication library

    • Includes: SD, EM, PM, BCM, NSM, NWI

  • Distributed System Monitor

  • MPI Process Manager

Single System Image Open Source Application Resources (SSI-OSCAR)

  • Easy use thanks to SSI systems

    • SMP illusion

    • High performance

    • Fault Tolerance

  • Easy management thanks to OSCAR

    • Automatic cluster install/update

Contacts (SSI-OSCAR)

Stephen L. Scott

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 574-3144

[email protected]

Thomas Naughton

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 576-4184

[email protected]

Geoffroy Vallée

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 574-3152

[email protected]
