Quick Overview of NPACI Rocks

Philip M. Papadopoulos

Associate Director, Distributed Computing

San Diego Supercomputer Center

Seed Questions
  • Do you buy in installation services? From the supplier or a third-party vendor?
    • We integrate ourselves. It is easier to have the vendor integrate larger clusters.
  • Do you buy pre-configured systems or build your own configuration?
    • Rocks is adaptable to many configurations
  • Do you upgrade the full cluster at one time or in rolling mode?
    • We suggest all at once (very quick with Rocks); reinstallation can be run as a batch job.
    • Can support rolling, if desired.
  • Do you perform formal acceptance or burn-in tests?
    • Unfortunately, no. Need more automated testing.
Installation Management
  • Need to have a strategy for managing cluster nodes
  • Pitfalls
    • Installing each node “by hand”
      • Difficult to keep software on nodes up to date
    • Disk imaging techniques (e.g., VA Disk Imager)
      • Difficult to handle heterogeneous nodes
      • Treats OS as a single monolithic system
    • Specialized installation programs (e.g., IBM’s LUI, or RWCP’s multicast installer)
      • Better to let Linux packaging vendors do their job
  • Penultimate
    • RedHat Kickstart
      • Define the packages needed for the OS on nodes; Kickstart gives a reasonable measure of control.
      • Need to fully automate to scale out (Rocks gets you there)
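
To make the Kickstart idea concrete, a minimal sketch of the kind of file involved (the install URL, partition sizes, and package names are illustrative, not what Rocks actually generates):

    install
    url --url http://frontend-0/install
    lang en_US
    keyboard us
    rootpw --iscrypted XXXXXXXXXXXX
    clearpart --all
    part / --size 4096
    part swap --size 512
    %packages
    @ Base
    openssh-server
    %post
    # site-specific configuration commands run after the install completes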
Scaling out
  • Evolve to management of “two” systems
    • The front end(s)
      • Log in host
      • Users’ home areas, passwords, groups
      • Cluster configuration information
    • The compute nodes
      • Disposable OS image
      • Let software manage node heterogeneity
      • Parallel (re)installation
      • Data partitions on cluster drives untouched during re-installs
  • Cluster-wide configuration files derived through reports from a MySQL database (DHCP, hosts, PBS nodes, …)
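
As a sketch of that derivation (the nodes table with name and ip columns is an assumption; the real Rocks schema and report scripts differ in detail):

    # Regenerate /etc/hosts from the cluster database.
    import MySQLdb

    conn = MySQLdb.connect(host="localhost", db="cluster", user="apache")
    cur = conn.cursor()
    cur.execute("SELECT name, ip FROM nodes ORDER BY name")

    lines = ["127.0.0.1\tlocalhost\n"]
    for name, ip in cur.fetchall():
        lines.append("%s\t%s\n" % (ip, name))

    # The same pattern generates dhcpd.conf, the PBS node list, etc.
    with open("/etc/hosts", "w") as f:
        f.writelines(lines)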
NPACI Rocks Toolkit – rocks.npaci.edu
  • Techniques and software for easy installation, management, monitoring and update of clusters
  • Installation
    • Bootable CD + floppy which contains all the packages and site configuration info to bring up an entire cluster
  • Management and update philosophies
    • Trivial to completely reinstall any (all) nodes.
    • Nodes are 100% automatically configured
      • Use of DHCP, NIS for configuration
    • Use RedHat’s Kickstart to define the set of software that defines a node.
    • All software is delivered in a RedHat Package (RPM)
      • Encapsulate configuration for a package (e.g., Myrinet) – see the spec fragment after this list
      • Manage dependencies
    • Never try to figure out if node software is consistent
      • If you ever ask yourself this question, reinstall the node
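
The “encapsulate configuration” point means an RPM can carry its own setup in a %post scriptlet that runs right after installation. A hypothetical spec-file fragment (the gm init script is an assumption, not the actual Rocks Myrinet package):

    %post
    # Register and enable the (assumed) gm init script so the
    # Myrinet driver is loaded at boot; configuration thus ships,
    # versions, and upgrades with the package itself.
    /sbin/chkconfig --add gm
    /sbin/chkconfig gm on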
Rocks Current State – Ver. 2.1
  • Now tracking RedHat 7.1
    • 2.4 Kernel
    • “Standard Tools” – PBS, MAUI, MPICH, GM, SSH, SSL, …
    • Could support other distros … don’t have staff for this.
  • Designed to take “bare hardware” to cluster in a short period of time
    • Linux upgrades are often “forklift-style”. Rocks supports this as the default mode of admin
  • Bootable CD
    • Kickstart file for Frontend created from Rocks webpage.
    • Use the same CD to boot nodes; integration is automated
    • “Legacy” Unix config files derived from the MySQL database
  • Re-installation (we have a single HTTP server, 100 Mbit)
    • One node: 10 Minutes
    • 32 nodes: 13 Minutes
    • Use multiple HTTP servers + IP-balancing switches for scale
More Rocksisms
  • Leverage widely-used (standard) software wherever possible
    • Everything is in RedHat Packages (RPM)
    • RedHat’s “kickstart” installation tool
    • SSH, telnet (only during installation), existing open-source tools
  • Write only the software that we need to write
  • Focus on simplicity
    • Commodity components
      • For example: x86 compute servers, Ethernet, Myrinet
    • Minimal
      • For example: no additional diagnostic or proprietary networks
  • Rocks is a collection point of software for people building clusters
    • It is evolving to include cluster software and packaging from more than just SDSC and UCB
    • <[your-software.i386.rpm] [your-software.src.rpm] here>
rocks-dist
  • Integrate RedHat Packages from
    • RedHat (mirror) – base distribution + updates
    • Contrib directory
    • Locally produced packages
    • Local contrib (e.g., commercially bought code)
    • Packages from rocks.npaci.edu
  • Produces a single updated distribution that resides on front-end
    • Is a RedHat Distribution with patches and updates applied
  • Kickstart (RedHat) file is a text description of what’s on a node. Rocks automatically produces frontend and node files.
  • Different Kickstart files and different distributions can co-exist on a front-end to add flexibility in configuring nodes.
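
In spirit the merge behaves like the sketch below; all paths are illustrative, and the real rocks-dist resolves RPM versions properly rather than trusting source order:

    # Fold RPMs from several sources into one distribution tree.
    # Sources are listed lowest-priority first, so updates and local
    # packages override the base distribution.
    import os, shutil

    sources = ["/mirror/redhat/RPMS", "/mirror/updates/RPMS",
               "/usr/src/redhat/RPMS/i386", "/local/contrib/RPMS"]
    dist = "/home/install/rocks-dist/RPMS"
    os.makedirs(dist, exist_ok=True)

    chosen = {}
    for src in sources:
        for rpm in os.listdir(src):
            base = rpm.rsplit("-", 2)[0]           # crude name/version split
            chosen[base] = os.path.join(src, rpm)  # later source wins

    for path in chosen.values():
        shutil.copy(path, dist)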
insert-ethers
  • Used to populate the “nodes” MySQL table
  • Parses a file (e.g., /var/log/messages) for DHCPDISCOVER messages
    • Extracts MAC addr and, if not in table, adds MAC addr and hostname to table
  • For every new entry:
    • Rebuilds /etc/hosts and /etc/dhcpd.conf
    • Reconfigures NIS
    • Restarts DHCP and PBS
  • Hostname is
    • <basename>-<cabinet>-<chassis>
  • Configurable to change hostname
    • E.g., when adding new cabinets
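
A sketch of the discovery loop; the regex and the in-memory table are simplified stand-ins for what insert-ethers actually does against MySQL:

    # Scan the syslog for DHCPDISCOVER lines and name any MAC address
    # not yet known. The real tool writes to the nodes table, then
    # rebuilds /etc/hosts and /etc/dhcpd.conf, reconfigures NIS, and
    # restarts DHCP and PBS.
    import re

    known = {}                     # MAC -> hostname, from the nodes table
    basename, cabinet, chassis = "compute", 0, 0

    pattern = re.compile(r"DHCPDISCOVER from ([0-9a-f:]{17})")
    for line in open("/var/log/messages"):
        m = pattern.search(line)
        if m and m.group(1) not in known:
            hostname = "%s-%d-%d" % (basename, cabinet, chassis)
            known[m.group(1)] = hostname
            chassis += 1
            print("new node: %s -> %s" % (m.group(1), hostname))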
Configuration Derived from Database

[Diagram: node configuration derived automatically from the MySQL database – configuration flows to Node 0, Node 1, …, Node N and to the PBS node list]

Remote re-installation: Shoot-node and eKV
  • Rocks provides a simple method to remotely reinstall a node
    • CD/Floppy used to install the first time
  • By default, hard power cycling will cause a node to reinstall itself.
    • Addressable PDUs can do this on generic hardware
  • With no serial (or KVM) console, we are able to watch a node as it installs (eKV), but …
    • Can’t see BIOS messages at boot up
  • Syslog for all nodes sent to a log host (and to local disk)
    • Can look at what a node was complaining about before it went offline
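
The forwarding itself is one stock syslog.conf entry per node (the log host name is illustrative):

    # /etc/syslog.conf on a compute node: log locally as usual, and
    # also forward everything to the front end acting as log host.
    *.info;mail.none                        /var/log/messages
    *.*                                     @frontend-0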
Remote re-installation: Shoot-node and eKV (continued)

[Screenshot: remotely starting reinstallation on two nodes]
Monitoring your cluster
  • PBS has a GUI called xpbsmon. It gives a nice graphical view of the up/down state of nodes
  • SNMP status
    • Use the extensive SNMP MIB defined by the Linux community to find out many things about a node
      • Installed software
      • Uptime
      • Load
    • Slow to query, though
  • Ganglia (UCB) – IP multicast-based monitoring system
    • 20+ different health measures; see the gmond sketch at the end of this section
  • I think we’re still weak here – learning about other activities in this area (e.g., ngop, CERN activities, City Toolkit)
    • cern.ch/hep-proj-grid-fabric
    • Installation tools: wwwinfo.cern.ch/pdp
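
The gmond sketch referenced above: Ganglia’s daemon answers any TCP connection on its XML port (8649 by default) with a dump of the cluster state, so reading it takes only a socket; the front-end host name is an assumption:

    # Pull the cluster state from a Ganglia gmond daemon.
    import socket

    sock = socket.create_connection(("frontend-0", 8649))
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)
    sock.close()

    xml = b"".join(chunks).decode()
    print(xml[:200])   # <GANGLIA_XML ...> with one <HOST> element per node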