Open source cluster application resources oscar
This presentation is the property of its rightful owner.
Sponsored Links
1 / 22

Open Source Cluster Application Resources (OSCAR) PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Open Source Cluster Application Resources (OSCAR). Stephen L. Scott Thomas Naughton Geoffroy Vallée. Network and Cluster Computing Computer Science and Mathematics Division. OSCAR. OSCAR. O pen S ource C luster A pplication R esources.

Download Presentation

Open Source Cluster Application Resources (OSCAR)

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Open source cluster application resources oscar

Open Source Cluster Application Resources (OSCAR)

Stephen L. ScottThomas NaughtonGeoffroy Vallée

Network and Cluster Computing

Computer Science and Mathematics Division

O pen s ource c luster a pplication r esources



Open Source Cluster Application Resources

  • Consortium of academic, research and industry members.

  • Snapshot of best known methods for building, programming and using clusters.

Over 5 years of oscar

Over 5-years of OSCAR

Concept first discussed

January 2000


April 2000

Organizational meeting

  • Cluster assembly is time consuming and repetitive

  • Nice to offer a toolkit to automate

  • Leverage wealth of open source components

April 2001

First public release

Nov 2006

Released at SC06

What does oscar do

What does OSCAR do?

  • Wizard based cluster software installation

    • Operating system

    • Cluster environment

  • Automatically configures cluster components

  • Increases consistency among cluster builds

  • Reduces time to build / install a cluster

  • Reduces need for expertise

Design goals

Design goals

Leverage“best practices” whenever possible

Reduce overhead for cluster management

Extensibility for new Software and Projects

  • Modular meta-package system / API – “OSCAR Packages”

  • Keep it simple for package authors

  • Open Source to foster reuse and community participation

  • Fosters “spin-offs” to reuse OSCAR framework

  • Native package systems

  • Existing distributions

  • Management, system and applications

  • Keep the interface simple

  • Provide basic operations of cluster software and node administration

  • Enable others to re-use and extend system – deployment tool

Oscar overview

OSCAR overview

  • Simplifies installation, configuration and operation

  • Reduces time/learning curve for cluster build

    • Requires: pre-installed headnode with supported Linux distribution

    • Thereafter: wizard guides user through setup/install of entire cluster

Framework for cluster management

  • Content: Software + Configuration, Tests, Docs

  • Types:

    • Core: SIS, C3, Switcher, ODA, OPD, APItest, Support Libs

    • Non-core: selected & third-party (PVM, LAM/MPI, Toque/Maui,...)

  • Access: repositories accessible via OPD/OPDer

Package-based framework

Oscar packages

OSCAR packages

  • Simple way to wrap software & configuration

    • “Do you offer package Foo version X?”

  • Basic Design goals

    • Keep simple for package authors

    • Modular packaging (each self contained)

    • Timely release/updates

  • Leverage RPM + meta file + scripts, tests, docs, …

    • Recently extended to better support RPM, Debs, etc.

  • Repositories for downloading via OPD/OPDer

Oscar cluster installation wizard

OSCAR – Cluster Installation Wizard


Step 1

Step 2

Step 8

Step 3


Step 7

Cluster deployment


Step 4

Step 5

Step 6

Oscar components

OSCAR components

  • SIS, C3, OPIUM, Kernel-Picker & cluster services (dhcp, nfs, ntp, ...)

  • Security: Pfilter, OpenSSH


  • Parallel Libs: MPICH, LAM/MPI, PVM, Open MPI

  • OpenPBS/MAUI, Torque, SGE

  • HDF5

  • Ganglia, Clumon

  • Other 3rd party OSCAR Packages

HPC Services/Tools

Core Infrastructure/Management

  • System Installation Suite (SIS), Cluster Command & Control (C3), Env-Switcher

  • OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)

C3 power tools

C3 Power Tools

  • Command-line interface for cluster system administration and parallel user tools

  • Parallel execution cexec

    • Execute across a single cluster or multiple clusters at same time

  • Scatter/gather operations cpush / cget

    • Distribute or fetch files for all node(s)/cluster(s)

  • Used throughout OSCAR

    • mechanism for cluster wide operations



  • Improved distribution and architecture support

  • Node installation monitor

  • New network setup options and other GUI enhancements

  • “Use Your Own Kernel” (UYOK) for SystemImager

  • New OSCAR Packages:

    • Open MPI, SC3, SGE, YUME, NetbootMgr

  • Supported platforms:

    x86: fc4, fc5, mdv2006, rhel4, suse10.0 (tentative)

    x86_64: fc4, fc5, rhel4

OSCAR v5.0

  • Diskless OSCAR & LiveCD

  • OSCAR on Debian (OoD)

  • OSCAR Command Line Interface (CLI)

  • Native package management – network installation

In progress

Oscar proven scalability

OSCAR: proven scalability

Endeavor232 nodes with 928 CPUs

ORNL OIC440 nodes with 880 CPUs

McKenzie264 nodes with 528 CPUs

SUN-CLUSTER128 nodes with 512 CPUs

Cacau205 nodes with 410 CPUs

Barossa184 nodes with 368 CPUs

Smalley66 nodes with 264 CPUs

OSCAR-SSC130 nodes with 130 CPUs

Selected machines registered at OSCAR website

Based on data taken on 11/2/2006:

  • OSCAR Cluster Registration Page

  • ORNL OIC User Guide

More oscar information



More OSCAR information…

Home Page

Development Page

[email protected]

[email protected]

Mailing Lists

Open Cluster Group

OSCAR Symposium

  • OSCAR Research supported by the

    • Mathematics, Information and Computational Sciences Office,

    • Office of Advanced Scientific Computing Research, Office of Science,

    • U. S. Department of Energy, under contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

Oscar flavors

OSCAR “flavors”





Ha oscar


RAS Management for HPC cluster: Self-Awareness

  • The first known field-grade open source HA Beowulf cluster release

  • Self-configuration Multi-head Beowulf system

  • HA and HPC clustering techniques to enable critical HPC infrastructure

  • Services:Active/Hot Standby

  • Self-healing with 3-5 sec automatic failover time

Nec s oscar pro


Presented at OSCAR'06

keynote by Erich Focht (NEC)

  • Leverage open source tool

  • Joined project / contributions to OSCAR core

  • Integrate additions when applicable

  • Feedback and direction based on user needs

Commercial Enhancements

Scalable system software

Scalable System Software

  • Computer centers use incompatible, ad hoc set of systems tools

  • Tools are not designed to scale to multi-Teraflop systems

  • Duplication of work to try and scale tools

  • System growth vs. Administrator growth


  • Define standard interfaces for system components

  • Create scalable, standardized management tools

  • (Subsequently) reduce costs & improve efficiency at centers



  • Academics: NCSA, PSC, SDSC

  • Industry: IBM, Cray, Intel, SGI


Sss project overview

SSS project overview

Standardize the system interfaces

Map out functional areas

  • Schedulers, job managers

  • System monitors

  • Accounting and user management

  • Checkpoint/restart

  • Build and configure systems

  • Open forum of universities, labs, industry representatives

  • Define component interfaces in XML

  • Develop communication infrastructure

Open source cluster application resources oscar

Components writtenin any mixture of C, C++, Java, Perl, and Python can be integrated into the Scalable Systems Software Suite


Meta manager

Meta scheduler


Node state manager



System and job monitor

Service directory

Node configuration and build manager

Authentication Communication

Standard XML


Event manager

Allocation management

Job queue manager

Hardware infrastructure manager

Usage reports


Checkpoint restart

Process manager

Sss oscar components

SSS-OSCAR Components

Queue/Job Manager


Berkeley Checkpoint/Restart


Accounting & Allocation Management System



Checkpoint/Restart enabled MPI

Job Scheduler


  • SSS Communication library

    • Includes: SD, EM, PM, BCM, NSM, NWI



Distributed System Monitor


MPI Process Manager

Single system image open source application resources ssi oscar

Single System Image Open Source Application Resources (SSI-OSCAR)

  • Easy use thanks to SSI systems

    • SMP illusion

    • High performance

    • Fault Tolerance

  • Easy management thanks to OCSAR

    • Automatic cluster install/update



Stephen L. Scott

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 574-3144

[email protected]

Thomas Naughton

Network and Cluster Computing

Computer Science and Mathematics Division

(865) 576-4184

[email protected]

  • Geoffroy Vallée

    Network and Cluster Computing

    Computer Science and Mathematics Division

  • (865) 574-3152

  • [email protected]


22 Scott_OSCAR_0611

  • Login