
Tools for Cluster Administration and Applications (ancient technology – from 2001…)

Presentation Transcript


  1. Tools for Cluster Administration and Applications (ancient technology – from 2001…)

  2. Large Cluster Administration: what is the Problem... and the Solution?
  The Problem:
  • System Administrators DO NOT scale
    • Install / update operating system
    • Install applications
    • Add / Remove users
    • etc.
  • Users DO NOT scale
    • Install applications
    • Move data files
    • Launch applications
    • Interact with active jobs
    • etc.
  The Solution – tools that:
  • Treat the cluster as a single machine
  • Scale from 1 to N nodes (10,000’s of nodes)
  • Scale to federated clusters
  • Are easy to learn – use – adapt

  3. Tool Review
  • Systemimager
  • LUI – Linux Utility for cluster Install
  • VA Cluster Management (VACM)
  • Alert
  • Parallel UNIX Commands – (Ptools)
  • dsh
  • prsh
  • Webmin
  • ALINKA LCM – Linux Cluster Manager
  • ALINKA RAISIN
  • SCMS – Smile Cluster Management System
  • C3 – Cluster Command & Control
  • M3C – Managing Multiple Multi-User Clusters

  4. Systemimager
  • Disk image / system administration
    • maintains disk coherency across the cluster
    • administrator-level tool
    • image server stores images
    • can build an image-server database of site disk images
  • Pros:
    • supported by VA Linux as open source
    • architecture independent
  • Cons:
    • requires each node to request the image (“pull image” – see the sketch below)
    • only operates at the disk-image level (not individual files)
  • Dependencies: rsync, DHCP
  • http://download.sourceforge.net/systemimager
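  A rough sketch of the “pull image” workflow described above. The command names and flags follow typical SystemImager usage of that era and may differ between versions, so treat this as an illustration rather than a verified recipe:

      # On the image server: capture an image from a prepared "golden" node
      getimage -golden-client node01 -image compute-v1

      # On each cluster node: pull the named image from the image server
      # (this is the per-node "pull" step listed as a con above)
      updateclient -server imageserver -image compute-v1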

  5. Linux Utility for cluster Install – (LUI)
  • System install / restore
    • administrator-level tool
    • easy to duplicate an install by resource
      • Linux kernel, system map, partition table, RPMs, “user exits”, local & remote NFS file systems
    • no need to store disk images
  • Pros:
    • LUI available as an RPM
    • supported by IBM as open source
    • architecture independent
    • machine & resource groups
  • Cons:
    • only useful for system initialization
    • manually installed packages will have to be reinstalled
  • Dependencies: NFS, tftp-hpa, bootp or dhcp, perl
  • http://oss.software.ibm.com/developer/opensource/linux/projects/lui

  6. VA Cluster Management – (VACM)
  • GUI-based hardware-level monitor
    • device power control, hardware reset, remote BIOS control, chassis intrusion, CPU fan status
    • Intel Intelligent Platform Management Interface (IPMI) motherboards
  • Pros:
    • monitoring does not impact performance, as IPMI runs in hardware microcontrollers
  • Cons:
    • only available for Intel IPMI-compliant motherboards
    • does not monitor power supply fan or external fan
  • Dependencies:
    • IPMI motherboard:
      • NB440BX Server Platform (Nightshade)
      • T440BX Server Platform (Nightlight)
      • L440GX Server Platform (Lancewood)
    • GTK+ v1.02, Gnome-libs, GDK v1.2, imlib v1.0.6
  • http://www.valinux.com/software/vacm/

  7. Alert
  • Web-based UNIX cluster monitoring tool
    • local clients on each node report to monitor node(s)
    • clients are scripts running as cron jobs (sketched below)
    • monitors run a daemon to receive reports from clients
  • Monitors
    • alerts
    • print web pages
    • email notification of events
  • Pros:
    • supports cluster configuration files, allowing definitions of subclusters
    • errors can be categorized
    • notifications can be assigned for each category
    • uses a special Alert log as opposed to having to search syslog
    • clients can be written to handle new monitoring tasks
  • Cons:
    • no proactive event correction ability
  • http://www.cs.virginia.edu/~jdm2d/alert/
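  Since the client side is just cron-driven scripts, a node’s contribution to monitoring can be pictured as a crontab entry. The script name, install path, and the way the result is handed to the monitor daemon are hypothetical here – Alert supplies and documents its own client scripts:

      # Hypothetical crontab entry on a compute node:
      # every 5 minutes, run a local check script that reports its
      # result to the Alert monitor daemon on the monitor node
      */5 * * * * /usr/local/alert/clients/check_disk.sh monitor-node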

  8. Parallel UNIX Commands – (Ptools project)
  • Parallel versions of common UNIX commands
    • cp, cat, ls, rm, mv, find, ps, kill, exec, and test
  • Other parallel tools
    • parallel process find, command execution on a satisfied condition, command execution on a collection of files, display of command output
  • Target architecture
    • MPP with a full Unix environment on each node
      • SP-1
      • Meiko CS-2
      • Unix NOWs
  • Argonne National Laboratory
    • William Gropp
    • Ewing Lusk
  • Status: vaporware – latest reference is a ’94 SHPCC paper
  • http://www.ptools.org/
  • http://www.ptools.org/projects.html#PUC

  9. Distributed Shell – (dsh)
  • Command line based
    • sequential execution across a collection of hosts
    • uses rsh to access nodes
    • output prepended with the host name
  • Pros:
    • single or multiple remote commands
    • can create node groups
    • command can specify individual hosts or use node groups (see the sketch below)
  • Cons:
    • no concurrent execution
    • no interactive operation
  • Dependencies:
    • rsh, Perl
    • environment vars:
      • BEOWULF_ROOT – directory with beowulf-related files
      • WCOLL – location of file with the default working collective
  • http://www.ccr.buffalo.edu/dsh.htm
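  A minimal sketch of a dsh-style session, assuming the WCOLL working-collective file described above and rsh access to the nodes. The file path is illustrative and exact flags vary between dsh implementations, so read this as the general shape rather than the precise syntax:

      # Hypothetical working collective: one hostname per line
      cat > /etc/beowulf/wcoll <<EOF
      node01
      node02
      node03
      EOF

      # Point dsh at the default collective and run a command sequentially
      # on every listed host; output comes back prefixed with the host name
      WCOLL=/etc/beowulf/wcoll dsh uptime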

  10. Parallel Remote Shell – (prsh)
  • Command line based
    • concurrent execution across a collection of hosts
    • runs a UNIX command across nodes
    • stderr & stdout returned to the originating computer
  • Pros:
    • ability to use rsh or ssh
    • hosts and options can be specified in environment variables
    • output can be associated with the hostname using --prepend (see the sketch below)
  • Cons:
    • not able to perform interactive tasks (stdin set to /dev/null)
    • using --status with rsh is unreliable
  • Dependencies:
    • rsh, ssh, Perl
    • environment vars:
      • PRSH_OPTIONS – used before command line options
      • PRSH_HOSTS – default host list
  • http://www.cacr.caltech.edu/projects/beowulf/GrendelWeb/software/index.html
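  A sketch of prsh usage built only from the options and environment variables listed above; the space-separated host-list format is an assumption, so check the GrendelWeb page for the real invocation:

      # Default host list and default options via the environment variables noted above
      export PRSH_HOSTS="node01 node02 node03"
      export PRSH_OPTIONS="--prepend"

      # Run the same command concurrently on every host;
      # --prepend tags each line of output with the host it came from
      prsh uptime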

  11. Webmin
  • Web interface for system administration
    • designed for use on individual systems – not clusters
    • web server and CGI programs to perform administration tasks
  • Pros:
    • quick, graphical interface to the most common system administration tasks
    • telnet module for console access to hosts
    • ability to define custom commands
    • view and manage running processes
    • easy addition of user-written modules, and standards for writing them
  • Cons:
    • not intended for clusters
    • must have a web server on every host
    • modules must be written entirely in Perl
  • Dependencies:
    • Perl 5 or later
    • web server
  • http://www.webmin.com/webmin/

  12. ALINKA LCM – Linux Cluster Manager
  • Command line based management and configuration
  • Pros:
    • cluster-wide command execution, except superuser commands
    • ability to define and manage subclusters
    • load monitoring of nodes
    • MPI/PVM job execution support
  • Cons:
    • master node is the NFS server for /home, /etc, and /var, limiting scalability
    • no support for using SSH, and the cluster command doesn’t work as root
    • no support for NIS or shadow passwords
    • limited to homogeneous clusters
    • difficult to install and operate
  • Dependencies:
    • rsh, tar, nfs-server, sudo, php cgi-bin with pgsql support, bootpd, tcpdump, postgresql, gawk
  • http://www.alinka.com/download.htm#lcm

  13. ALINKA RAISIN
  • GUI based management and configuration
    • same functionality as ALINKA LCM
    • adds a GUI
  • Pros:
    • cluster-wide command execution, except superuser commands
    • ability to define and manage subclusters
    • load monitoring of nodes
    • MPI/PVM job execution support
  • Cons:
    • all cons of ALINKA LCM
    • commercial license
  • Dependencies:
    • same as ALINKA LCM
    • apache
    • php module for apache with postgresql support
    • gnuplot
  • http://www.alinka.com/araisin.htm

  14. Smile Cluster Management System – (SCMS)
  • Command line and GUI environment
    • designed for managing beowulf-type clusters as a single machine
    • latest version looks promising, with a ptools-like command line interface
  • Pros:
    • many system utilities (e.g. node status, node control panel, node file system, disk space, ftp, process status, reboot/shutdown, rpm package manager, telnet, parallel UNIX commands, alarm services, and motherboard monitoring)
    • performance monitoring/logging of CPU, memory, I/O, and network
    • user-definable alarm levels with e-mail or script notifications
  • Cons:
    • no support for job scheduling and cluster resource allocation
    • no MPI/PVM job submission tool
    • no support for using SSH
  • Dependencies:
    • rsh, Java, Perl
  • http://smile.cpe.ku.ac.th/

  15. Cluster Command & Control (C3) Tools
  • Command line based
    • single machine interface
    • cluster configuration file
    • serial & parallel versions
  • Pros:
    • serial version – deterministic execution, good for debugging
    • parallel version – efficient execution
    • ability to rapidly deploy software updates and update system images
    • command line list option allows subcluster management
    • distributed file scatter and gather operations
    • execution of any non-interactive command
  • Cons:
    • no support for interactive command execution
  • Dependencies:
    • DHCP, rsync 2.4.3 or later, OpenSSL, OpenSSH, DNS, SystemImager v0.23, Perl v5.6.0 or later
  • http://www.csm.ornl.gov/clusterpowertools
  • torc@msr.epm.ornl.gov

  16. Cluster Command & Control (C3) Tools
  • System administration
    • cpushimage – “push” an image across the cluster
    • cshutdown – remote shutdown to reboot or halt the cluster
  • User tools
    • cpush – push a single file or directory
    • crm – delete a single file or directory
    • cget – retrieve files from each node
    • cexec – execute an arbitrary command on each node
    • cps – run ps and retrieve the output from each node
    • ckill – kill a process on each node
  • Add “s” to the end for the serial version – cshutdowns, cpushs, etc. (see the usage sketch below)
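  A short usage sketch of the C3 user tools named above, assuming the cluster configuration file from the previous slide is already set up. Argument order and any subcluster list syntax are assumptions, so consult the C3 documentation for the exact form:

      # Run a command on every node in parallel and collect the output
      cexec date

      # Push a local file out to the same path on every node
      cpush /etc/hosts /etc/hosts

      # Gather a file back from each node, and check processes cluster-wide
      cget /var/log/messages /tmp/logs/
      cps aux

      # Same operation one node at a time (serial variant): append "s"
      cexecs date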
