XC 3.0 Systems Administration

XC Systems Administration

Fred Cassirer

January 25, 2006

Agenda
  • Management Overview
  • Command broadcast
  • System services
  • Configuration and management database
  • Monitoring the System
  • Remote console
  • Firewall
  • Customization
  • Network device naming
  • Troubleshooting
Management Philosophy
  • Simple is better
  • Minimize load on application nodes, concentrate management processing on management hubs where possible
  • Use industry standard tools and techniques
  • Provide the necessary “glue”, configuration, and policy for best in class cluster management
  • Database
    • MySQL server using schema designed for HP’s HPC environment
  • Nagios
    • System monitoring, reporting, management, eventing, and notification
  • SuperMon
    • Optimized metric aggregation across entire cluster.
  • Syslog-NG
    • Next Generation syslog replaces syslogd functionality and provides tiered aggregation of system logs
  • Pdsh
    • Parallel distributed command execution and copy services
  • Console management
    • Log and control console output
XC Management Stack

[Stack diagram: monitoring & logging and distributed-command services layered above the XC Database, all running on a Red Hat compatible distribution. Orange components are open tools configured / adapted for use by XC.]

System Operations: Monitoring and Logging

[Diagram: syslog-ng forwarding and Supermon aggregation paths through the cluster's aggregation points.]




pdsh (1)
  • Multithreaded remote shell
  • Used to send a shell command to multiple nodes
    • [root@n5 root]# pdsh -a hostname
    • n5: n5
    • n4: n4
    • n3: n3
    • n1: n1
    • n2: n2
    • [root@n5 root]#
pdsh (2)
  • Options:
    • -a all nodes
    • -f # sets the max number of simultaneous commands
    • -w nodelist sets a list of target nodes
    • -x nodelist excludes a list of nodes

Where « nodelist » is standard node list syntax, such as n[1-4] or n[10-16]
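The bracketed node list syntax can also be generated from scripts. The sketch below is purely illustrative (pdsh itself parses richer forms such as n[1-4,7] natively); it expands a simple range into the space-separated node names:

```shell
# Expand a simple range like n[1-4] into "n1 n2 n3 n4" (illustrative only;
# this is not an XC tool, just a helper for ad-hoc scripting).
expand_range() {
    prefix=$1; first=$2; last=$3
    out=""
    for i in $(seq "$first" "$last"); do out="$out $prefix$i"; done
    echo "${out# }"
}

expand_range n 1 4    # -> n1 n2 n3 n4
```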


cexec
  • A shell script that invokes pdsh to perform commands on a group of hosts
  • The host group must previously have been defined with the hostgroup command

# hostgroup -c group1

# hostgroup -a n1 group1


# hostgroup -a n2 group1

n2, n1

# cexec -r group1 hostname

n1: n1

n2: n2
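The core of such a wrapper can be approximated in a few lines of shell. This is a hypothetical sketch, not the shipped cexec/hostgroup implementation; it assumes group membership is available one node name per line in a file:

```shell
# Join a host-group member file (one node name per line) into the
# comma-separated list that pdsh -w expects. The file location is an
# assumption for this sketch; XC stores groups in its own database.
build_nodelist() {
    tr '\n' ',' < "$1" | sed 's/,$//'
}

# Hypothetical usage, assuming /tmp/group1 lists n1 and n2:
#   pdsh -w "$(build_nodelist /tmp/group1)" hostname
```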

  • Copy files to multiple nodes:

pdcp -a -n `nodename` /etc/passwd /etc/passwd

Displaying all services
  • shownode servers

cmf: n[15-16]

compute_engine: n[10-14,16]

dbserver: n16

dhcp: n16

gather_data: n[10-16]

gmmon: n16

hpasm: n[10-16]

hptc-lm: n16

hptc_cluster_fs: n16

hptc_cluster_fs_client: n[10-16]

httpd: n16

imageserver: n16

iptables: n[10-16]

lkcd: n[10-16]

lsf: n16

mpiic: n16

munge: n[10-16]

nagios: n16

nagios_monitor: n[15-16]

nat: n16

network: n[10-16]

nfs_server: n16

nrpe: n[10-16]

nsca: n16

ntp: n16

pdsh: n[10-16]

pwrmgmtserver: n16

slurm: n16

supermond: n[15-16]

swmlogger: n16

syslogng_forward: n[15-16]

Displaying nodes that provide a given service

shownode servers [service]

shownode servers imageserver


shownode servers ntp


shownode servers lvs


Displaying the services provided by a given node
  • shownode services n7

mond lsf network lkcd slurm_compute slurm_controller lvs iptables cmf_client hptc_cluster_fs_client syslogng gather_data pdsh nrpe hpasm slurm_launch

  • shownode services n8 clients
  • cmf: n[1-7]
  • hptc_cluster_fs: n[1-7]
  • nagios: n[1-7]
  • nat: n[1-7]
  • ntp: n[1-8]
  • supermond: n[1-7]
  • syslogng_forward: n[1-7]
Displaying the service providers for a given node

shownode services n3 servers

cmf: n8

hptc_cluster_fs: n8

nagios: n8

nat: n[8,10]

ntp: n8

supermond: n8

syslogng_forward: n8

Starting/stopping a service
  • on one node:

service slurm stop

service slurm restart

  • On a group of nodes:

pdsh -w n[1-4] service ntpd stop

pdsh -w n[1-4] service ntpd restart

  • On all nodes:

pdsh -a service ntpd stop

pdsh -a service ntpd restart

Adding a new service
  • /opt/hptc/config/roles_service.ini defines the services that make up each role. A role with a single service uses a one-line assignment; a role with several services uses a « stanza » delimited by <<EOT … EOT:

compute = <<EOT
…
EOT

external = nat

disk_io = <<EOT
…
EOT

login = lvs

management_hub = <<EOT
…
EOT

The left-hand side is the role name (compute, login, etc); each stanza body lists the service names assigned to that role.

Adding a service to an existing role
  • cd /opt/hptc/config
  • cp roles_services.ini roles_services.ini.ORIG
  • Add the new service inside the role (between <<EOT and EOT)
  • Create the appropriate gconfig and nconfig files
  • reset_db
  • /opt/hptc/config/sbin/cluster_prep
  • /opt/hptc/config/sbin/discover --system --verbose
  • /opt/hptc/config/sbin/cluster_config
Creating a new role and adding a new service to it
  • cd /opt/hptc/config
  • cp roles_services.ini roles_services.ini.ORIG
  • Insert the new role stanza
  • Insert the appropriate services list in the stanza
  • Create the appropriate gconfig and nconfig files
  • reset_db
  • /opt/hptc/config/sbin/cluster_prep
  • /opt/hptc/config/sbin/discover --system --verbose
  • /opt/hptc/config/sbin/cluster_config
Configuration and management database
  • Key to the XC configuration
  • Keeps track of:

which node provides which service

which node received which service

network interface configuration

enabled/disabled nodes

And other things

  • MySQL server version 4.0.20
  • MySQL text-based client
  • Perl DBI interface for MySQL
  • HPC value added management tools
    • shownodes: displays information on node configuration, statistics, services and status
    • managedb: assists in three areas
      • backs up the entire database
      • archives supermon log table data
      • dumps entire database in human readable form
    • reset_db: if you need to run cluster_config or discover again
Database: using mysql directly (1)

Location of sql database:


# mysql -p

Enter password:

Welcome to the MySQL monitor. Commands end with ; or \g.

Your MySQL connection id is 55969 to server version: 4.0.20-Max

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> show databases;

+--------------+
| Database     |
+--------------+
| cmdb         |  <- this is the XC Configuration and Management Database
| hptc_archive |
| mysql        |
| qsnet        |
| test         |
+--------------+

Database: using mysql directly (2)

mysql> use cmdb;

mysql> show tables;

+---------------------------------------+
| Tables_in_cmdb                        |
+---------------------------------------+
| CPUInfoLog                            |
| CPUTotalsLog                          |
| CPUTypeLog                            |
| aveNRunLog                            |
| bTimeLog                              |
| hptc_adminInfo                        |
| hptc_archive                          |
| hptc_cmf_port                         |
| hptc_interface                        |
| hptc_interfaceType                    |
| hptc_interfaceUsageType               |
| hptc_node                             |
…
Database HPC value added: shownode
  • shownode --help
  • USAGE: /opt/hptc/bin/shownode subcommand [options]
  • subcommand is one of:
  • all list all nodes
  • metrics show various statistics
  • roles show information about node roles
  • servers show which nodes provide services to whom
  • services show which services are provided by which nodes
  • clients show who are the clients of the services
  • status show which nodes are up and which are down
  • enabled show which nodes are enabled
  • disabled show which nodes are disabled
  • config show configuration details for nodes and other hardware
  • /opt/hptc/bin/shownode subcommand --help for more details
Database HPC value added: shownodes (cont)

[root@n9 root]# shownode roles

common: n[1-9]

compute: n[1-9]

disk_io: n9

external: n[8-9]

login: n[8-9]

management_hub: n9

nat_client: n[1-7]

node_management: n9

resource_management: n[8-9]

Database HPC value added: shownodes (cont)

shownode enabled








Database HPC value added: shownodes (cont)

shownode status

n1 ON

n2 ON

n3 ON

n4 ON

n5 ON

n6 ON

n7 ON

Database HPC value added: shownodes (cont)

# shownode config --help

USAGE: /opt/hptc/bin/shownode config [options] { nodes [name ...]

| cp_ports [name ...]

| switches [name ...]

| otherports [name ...]

| hostgroups [name ...]

| roles

| sysparams

| node_prefix

| golden_client

| all }


--help this text

--indent n indent each level by n spaces

--labelwidth n print up to n characters of each label

--perl output in format understood by perl

--admininfo print extra information useful for system admins

Database HPC value added: enablenode and disablenode
  • The operation manipulates the is_enabled bit in the database
  • Example:
    • setnode --disable n4
    • setnode --disable n[3,4]
  • Control for cluster related commands
    • Currently not used by most services
    • Affects startsys(8) and stopsys(8) node lists
      • Disabled nodes will not be affected by these commands
  • Use shownode(1) to list
    • “shownode enabled”
    • “shownode disabled”
Database HPC value added: backup/restore cmdb

managedb backup

  • Creates /opt/hptc/database/cmdbbackup-YYMMDDhhmmss.sql
  • Example:


Tue Jul 26 15:54:54 CEST 2005

managedb backup

ls /opt/hptc/database/*.sql


  • Restore sequence:

ls /opt/hptc/database/*.sql


managedb restore /opt/hptc/database/dbbackup-20050726155153.sql

mysql -u root -p cmdb < cmdbbackup-YYMMDDhhmmss.sql

  • Restoration will remove all old data and replace with data from the backup file.
Database HPC value added: archiving metrics data
  • Removes old data from metrics tables and stores them in the hptc_archive database.
  • The tables to archive are those with name ending with “log” in the cmdb database:
    • mysql> show tables;
    • ….
    • ….
    • | lastSensorLog
    • | memInfoLog
    • | netInfoLog
    • | pagingLog
    • | processLog
    • | sensorLog
    • | swapInfoLog
    • | switchLog
    • | timeLog
Database HPC value added: archiving metrics data (cont)
  • Example 1: archiving metrics data older than four days:

managedb archive --archive ./artfile_Tuesday 4d

  • Example 2: restoring metrics data:

managedb restore --archive ./artfile_Tuesday

Database HPC value added: purging metrics data
  • Metrics data in the cmdb can be purged without archiving
  • Example: purging data older than two weeks:

managedb purge 2w

Monitoring the system
  • Health and Metrics
    • Nagios
    • Supermon
    • Syslog-ng
  • Console
    • CMF
Management roles
  • XC management services are part of three XC roles
    • Management Server
      • Denotes the node where the nagios master resides
      • Only *ONE* node may be assigned the management_server role
      • For V3.0 the management_server role must be assigned to the headnode (nh)
    • Management Hub
      • Denotes those nodes that provide management aggregation services.
      • Assign this role to as many nodes as you like. cluster_config will by default assign one management_hub for every 150-200 compute nodes.
      • The Management Server node is also a management hub.
      • Supermon and CMF services are associated with a Management Hub
    • Common
      • Every node that is monitored has an XC “nrpe” service defined.
      • Every node that is monitored has an XC “mond” service defined
    • Console Services
      • By default, a console service role is assigned identically to Management Hubs. For systems where physical console connectivity is limited, assign this role to those nodes that have direct console access.

Mgmt Server – Nagios master and headnode, 1 per system.

Mgmt Hub – Distributed management services such as nagios monitor, supermon and syslog-ng aggregator

Console Services – Service nodes connected to the console network. Utilized by Mgmt Server/Hub’s to proxy requests that require console connectivity such as sensor collection, firmware log (SEL/IML) and console services (cmf)

Console Services Monitoring

[Diagram: supermon hierarchy]
  • Root Supermon – supermond connects to all supermonds and manages a set of subnodes
  • mond – Runs on every node (even those with supermond) and collects metrics via the kernel and the “monhole” (if configured)
  • Nagios/Nagios monitor – Runs the check_metrics plug-in periodically to cause supermon data collection and storage into the database
  • Supermon manages the requests for mond collection from its mond children
  • mond is the per-node data collector; it reports up to its parent supermon, and the “monhole” can be configured to pass any metric data for aggregation

Nagios
  • Version 2.0-b6
  • No changes to Nagios engine for Perseus
  • Value add is in the dynamic configuration and template model used by XC
  • Easily extendable and configurable by the end user
  • Dynamically adapts to the cluster configuration based on XC service assignments (roles)
  • Distributed monitoring supported in V3.0! Supports failover for active plugins from nagios monitors
  • Graphically monitors nodes, environmentals, processes, power state, slurm, jobs, LSF, syslog, load average etc
  • Monitors ProCurve switches via SNMP. Checks for slow ethernet links (linkspeed <1GB)
  • Initiates supermon metric gathering
  • Nagios gets all of its data from “plug-ins”:
    • Essentially shell/perl/C programs
    • Return 0/1/2 for OK, Warning, Critical
    • Produce a status string on stdout
  • Runs as the user “nagios”
  • nagiostats command new for V3.0
  • Nagios master tracks distributed nagios monitors and reports if they fail to provide status
  • User customizable threshold values for all nodes/classes via nagios_vars.ini
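As a sketch of the plug-in contract described above (a status string on stdout plus exit code 0/1/2), here is a minimal shell check. The thresholds and the usage-percentage check are hypothetical, invented for illustration; real XC plug-ins live under /opt/hptc/nagios:

```shell
# Minimal Nagios-style plug-in: classify a usage percentage against
# warning/critical thresholds, print a status string, return 0/1/2.
check_usage() {
    used=$1; warn=$2; crit=$3
    if [ "$used" -ge "$crit" ]; then
        echo "CRITICAL - ${used}% used"; return 2
    elif [ "$used" -ge "$warn" ]; then
        echo "WARNING - ${used}% used"; return 1
    else
        echo "OK - ${used}% used"; return 0
    fi
}

check_usage 42 80 95    # prints "OK - 42% used", exit status 0
```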

  • Runs as the user “nagios”
  • The Nagios daemon does almost all of its work on the management nodes
  • Serves as the scheduler for supermon data collection
  • LSF failover functionality is invoked by a nagios plug-in
  • Dynamically creates nagios configuration files based on cluster topology and role assignment
  • Offloads management from nagios clients

[Diagram: the Nagios master daemon runs on nh; nagios monitor processes offload the master and forward results to it; nagios clients run mond for metrics gathering; root keys are synchronized hourly]

Management Server and Hub services
  • Apache HTTPS Server – Apache webserver status
  • Configuration Monitor – Updates node configuration
  • LSF Failover Monitor - Watch LSF and failover if needed
  • Monitor Power – Watch power nodes, report overall status
  • Nagios Monitor – Watch nagios master and monitor
  • Resource Monitor – Report squeue info
  • Root key synchronization – Report on ssh keys
  • Slurm Monitor – Report sinfo for each node
  • Supermon Metrics Monitor – Gather predetermined metrics
  • Syslog Alert Monitor – Watch for patterns in syslogalertrules file and report for each node
  • Load average - Provide per-node load average information and alerts
  • Environment - Provide per-node sensor reporting and alerts
  • Node information - Per node process, user, and disk statistics
Plug-ins that report data for individual nodes or switches
  • System Event Log - Monitor hardware event log for iLO and IPMI based systems and alert based on patterns in selRules file.
  • SFS monitoring - Monitor an attached SFS appliance
  • Procurve switch monitor - Gather switch status and metrics via SNMP
Per-Node service alerts
  • Environment - Sensors, all platforms supported in V3.0
  • Load Average – Per node load ave, default every minute
  • NodeInfo – Total/User/Zombie procs, system uptime.
  • Ping Interconnect – Packet loss and round trip times
  • Power – Actual, expected, and console port ping
  • Resource Status – What jobs are running on this node
  • Slurm Status – What slurm thinks about this node
  • Syslog Alerts – Link to any consolidated log messages that match patterns in syslogalertrules file.
  • System Free Space - /root, /tmp, /var, and /hptc_cluster
  • Configuration – Static system info, memory, processors, etc
Nagios for Admins
  • Bring up the web browser interface
  • For small systems (<~200 nodes) browse service details
  • For larger systems, browse service problems
  • Look at host problems page
  • Tactical overview
  • Hostgroup summary provides views by roles
  • Run “nagiostats” to get nagios summary info
Supermon
  • Supermon is a high-speed cluster monitoring system.
  • Intended to monitor both OS and hardware health and status data of cluster nodes.
  • Provided by LANL
  • Hierarchical design for maximum scalability
  • Used by Nagios, no real end user interface
shownode metrics (1)

shownode metrics paging

Timestamp |Node |pgpgin |pgpgout |pswpin |pswpout


2005-07-26 17:14:13 |n8 |10940619 |14736134 |3 |0

2005-07-26 17:14:13 |n7 |106865 |102847 |3 |0

2005-07-26 17:14:13 |n6 |193606 |356784 |3 |0

2005-07-26 17:14:13 |n5 |195241 |388732 |3 |0

2005-07-26 17:14:13 |n4 |195830 |404832 |3 |0

2005-07-26 17:14:13 |n3 |193587 |412898 |3 |0

2005-07-26 17:14:13 |n2 |224350 |326775 |3 |0

2005-07-26 17:14:13 |n1 |196678 |399856 |3 |0

shownode metrics (2)

[root@n16 root]# shownode metrics cpus

Timestamp |Node |CPU# |User |Nice |System |Idle


2005-01-21 10:54:21 |n16 |0 |8389189 |5030 |2976162 |308323600

2005-01-21 10:54:21 |n15 |0 |204238 |2207 |76279 |321666392

2005-01-21 10:54:21 |n14 |0 |203601 |2285 |74152 |321738046

2005-01-21 10:54:21 |n13 |0 |203076 |2228 |73976 |321733211

2005-01-21 10:54:21 |n12 |0 |203430 |2285 |70311 |321520665

2005-01-21 10:54:21 |n11 |0 |204742 |2252 |71730 |321463992

2005-01-21 10:54:21 |n10 |0 |204457 |2280 |69756 |321643141

2005-01-21 10:54:21 |n9 |0 |203994 |2311 |69751 |321681017

2005-01-21 10:54:21 |n8 |0 |204096 |2245 |69352 |321684120

2005-01-21 10:54:21 |n7 |0 |205430 |2304 |69725 |321684395
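Treating the columns above as cumulative time counters, a rough busy fraction for a node is (User+Nice+System)/(User+Nice+System+Idle). Worked through for the n16 sample row in awk (the jiffy interpretation is an assumption; only the arithmetic is shown):

```shell
# Busy fraction for the n16 sample row:
# (8389189 + 5030 + 2976162) / (8389189 + 5030 + 2976162 + 308323600)
awk 'BEGIN {
    busy  = 8389189 + 5030 + 2976162
    total = busy + 308323600
    printf "%.1f%%\n", 100 * busy / total    # prints 3.6%
}'
```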

Monitoring the system: syslog-NG
  • Use standard syslog on all nodes
  • Subset of nodes configured as syslog-ng aggregators
  • All nodes configured hierarchically to efficiently forward syslog events of priority warning and higher to aggregators
  • Aggregated logs for sets of nodes
    • /hptc_cluster/adm/logs/aggregator_nodename.log
  • Single (global) clusterwide log of “interesting” events
    • /hptc_cluster/adm/logs/consolidated.log
  • All logs located in /hptc_cluster/adm/logs
    • /hptc_cluster is NFS mounted. Logging performed by aggregator nodes only
  • Logs managed by logrotate


[Diagram: tiered logging]
  • Global aggregation – filtered information forwarded from regional aggregators and stored in a single consolidated log
  • Regional aggregation – filtered information forwarded to the parent node and aggregated in a common log for this group
  • Local logging – standard syslog logging to the standard places

Syslog-NG log files
  • # cd /hptc_cluster/adm/logs/
  • aggregator_n1039.log aggregator_n1041.log aggregator_n1043.log aggregator_n1040.log aggregator_n1042.log aggregator_n1044.log
  • consolidated.log
Console Management Facility (CMF)
  • CMF daemon runs on the login or service nodes
  • Collects and stores console output for all nodes in the system
  • All command output stored in:


  • Execution on any cluster node is supported
  • Command format: console <nodename>, example:

console n16

  • NOT usable for MP commands other than “co” (Console)
    • telnet must be used for other MP modes
  • Security: uses the token mechanism per cmfd
The console Command (usage and operation)

As per the man page:

console <node_name>

A connection is made to the appropriate cmfd

The OS console context is accessed

Attempts to use telnet for OS console will fail

Commands can be performed as if on the console

With an appropriate terminal emulator (F10 pass-through) – access to BIOS RBSU is available

Node startup / imaging operations can be observed

CMF (operation)

The facility is based on the CMF daemon (cmfd)

A number of CMF servers may be employed

Each CMF server manages a subset of nodes

A CMF server opens a connection to a console port (cp)

Data that traverses the connection is logged:/hptc_cluster/adm/logs/cmf.dated/*

The current directory, for instance, contains a per-node log named <node_name>.log

Customization
  • Database
    • None
  • Nagios
    • Changing the nagios templates
    • Customizing nagios_vars.ini
    • Adding contact info and schedules
    • Adding your own plug-ins
    • Changing notification intervals
    • Using NAN to batch Nagios alerts
  • SyslogNG
    • Updating your own custom filters
Customizations (Database)
  • Standard mysql performance optimizations apply
Customizations (Nagios)
  • /opt/hptc/nagios/etc cfg files
    • nagios – master configuration file
    • contacts & contactgroups – Updates to setup site specific email/pager info
    • xc-config/xc-hostgroups, *_local.cfg – Autogenerated - DO NOT CHANGE THESE FILES
    • Preserved during package update, customizations will need to be merged with upgraded packages.
  • /opt/hptc/nagios/etc/templates -
    • Modify these files to change specific service behavior
      • polling interval
      • notifications
      • Additional custom nagios services
      • Named based on XC services, automatically applied to nodes with these services
Customizations (Nagios) (cont)
  • /opt/hptc/nagios/etc/nagios_vars.ini – This file contains values and expressions that are used either per node or class of nodes to set nagios threshold values. For example, for any given node what is the warning limit for the number of processes on the system? What is a critical load average? This can vary by system type and site.


nagios_vars.ini


# Setup for a single service node, if you have other nodes that

# are supporting additional services it might be wise to change the

# nodes variable to include those nodes.



nodes = %DOMAIN(%NH%:nagios)%

total_procs_warning = %EXPR(%TPW%*2+50)%

total_procs_critical = %EXPR(%TPC%*2+50)%

users_warning = 30

users_critical = 50


# This section is used for some global constants



nodes = %PREFIX%[%ALL%]

# The timeperiod in minutes used to collect loadave data

# from supermon.


# The timeperiod in minutes used to collect sensor info. On

# some systems this is more expensive process then

# just picking up simple metrics so we do it less

# frequently.




# Add other sections for nodesets and variables these can be defined

# anyplace in this file, only sections "default" and "constant" have

# special meanings to the parser. You may need to change the

# following files:


# /opt/hptc/nagios/etc/nrpe.cfg

# /opt/hptc/nagios/etc/templates/*.cfg


# which reside on the golden master and re-image

# or copy this out to all nodes if you update it.


# [your-section]

# nodes = ???

# yourvar = ??


# The following predefined constants are defined:


# %PREFIX% - The cluster prefix, default "n"

# %HOST% - The current host that is evaluating the value

# %FIRSTNODE% - The lowest node number defined for the system

# %LASTNODE% - The highest node number defined for the system

# %ALL% - All nodes in the cluster, i.e., %FIRSTNODE-%LASTNODE (handles holes)

# %ALLBUTNH% - All the nodes in the cluster except the headnode

# %NH% - alias for the headnode

# %NUMNODES% - Number of nodes on the system

# %NSOCKETS% - Number of Sockets on the system

# %MEMORY% - Memory in GBs

# %EXPR(expression)% - Evaluate 'expression'

# %DOMAIN([host]:service)% - Return client hosts of service for host

# %HUB% - Nagios node that manages %HOST%

# Special section to create constant values. Note, constants are not

# variables, they must be used on the RHS of a variable declaration

# in order to obtain their value, i.e., nsockets = %NSOCKETS% would

# allow %nsockets% in your templates, not %NSOCKETS%


[constant]

TPW = %EXPR(%NSOCKETS%*25+25)%

TPC = %EXPR(int(%TPW%*1.25))%



# Default values for all nodes (changing nodes here will

# work as expected, but then its not much of a default

# I warned you ... ;-)

# Defaults are assuming a typical compute node


[default]

nodes = %PREFIX%[%ALL%]

total_procs_warning = %TPW%

total_procs_critical = %TPC%

loadave_warning = %LDAVEWARN%,%LDAVEWARN%,%LDAVEWARN%,5


loadave_critical = %LDAVECRIT%,%LDAVECRIT%,%LDAVECRIT%,7


sysloadave_warning = 15,10,5

sysloadave_critical = 30,25,20

users_warning = 3

users_critical = 7

admin_ping_warning = 300.0,20%

admin_ping_critical = 1000.0,60%


zombie_warning = 1

zombie_critical = 5
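As a worked example of the constants above: on a hypothetical 2-socket node, %NSOCKETS% evaluates to 2, so the process thresholds come out as follows (sketched in shell arithmetic, mirroring the %EXPR()% expressions):

```shell
# Hypothetical 2-socket node: evaluate the TPW/TPC constants by hand.
NSOCKETS=2
TPW=$((NSOCKETS * 25 + 25))                       # %EXPR(%NSOCKETS%*25+25)% -> 75
TPC=$(awk -v t="$TPW" 'BEGIN { print int(t * 1.25) }')   # int(75*1.25) -> 93
echo "total_procs_warning=$TPW total_procs_critical=$TPC"
```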

Nagios Syslog Alerts
  • /opt/hptc/nagios/etc/syslogAlertRules
  • Contains rules for matching consolidated log messages
  • Rules are applied every 5 minutes to messages received since last run (default timeperiod in template)


rule lustre_errors {
    name (/LustreError/)
    relevance ($subsystem =~ /kernel/)
    format "$timestamp $message"
}

Nagios service groups
  • Service groups are sets of services put together for easier viewing
Nagios service groups (cont)
  • Service groups are added via the templates using the notation: #SERVICEGROUP {groupname}
    • Note, this is an XC value add, not part of Nagios
    • You can add additional service groups to any service definition simply by adding a directive.

define service{
    use                     nagios-monitor
    host_name               %HOST%
    name                    keysync
    service_description     Root key synchronization
    normal_check_interval   60
    retry_check_interval    10
    active_checks_enabled   1
    check_command           check_keys!--domain $HOSTNAME$:%NAGIOSMONITOR%
    register                1
}



Configuration directive (%)

Nagios Macro directive ($)

NAN
  • Front end to nagios paging.
  • Can group/batch email based on rules
  • NAND – Started when Nagios starts, batches requests
  • NANC – Command line, used to send requests to NAND. Is called by Nagios instead of “mail” in misccommands.cfg.
  • /opt/hptc/nagios/etc/nand.conf describes NAN rules for batching data.
Customizations (Supermon)

The collection interval for supermon is determined by the nagios “check_metrics” plug-in.

The individual metrics collected and the time period can be changed in the /opt/hptc/nagios/etc/templates/nagios_monitor_template.cfg

Customizations (SyslogNG)
  • /opt/hptc/syslog-ng/etc/syslog-ng
    • rules on how to handle the messages logged on the system
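For instance, a forwarding filter of the kind these rule files contain might look as follows. This is a hedged sketch in standard syslog-ng syntax; the f_warn_up, s_local, and d_aggregator names are invented for illustration and are not the shipped XC configuration:

```
# Forward only messages of priority warning and above, matching the
# tiered-forwarding behavior described earlier (names are illustrative).
filter f_warn_up { level(warning..emerg); };
log { source(s_local); filter(f_warn_up); destination(d_aggregator); };
```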
Tips ‘n Tricks
  • Configure the “nagios” user to forward mail to the system admin.
  • Setup an admin group in Nagios in the contactgroups.cfg file.
  • Setup contact definitions for various users and how they want to be notified.
  • Setup a regular cron job to backup the database
  • Use the nagios web view to watch the service detail page or, on large systems, the service problems page.
  • Add custom nagios plug-ins to watch and notify admins for site specific monitoring.
  • If you have a conflict with the nagios userid, change it (via standard usermod etc) after kickstart but before you do a cluster_config.
  • Change nagios_vars.ini for site specific threshold values
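The database-backup tip above can be automated with a root crontab entry along these lines. The 02:00 schedule and the /opt/hptc/bin path for managedb are assumptions for this sketch:

```
# Nightly cmdb backup at 02:00; managedb writes
# /opt/hptc/database/cmdbbackup-YYMMDDhhmmss.sql
0 2 * * * /opt/hptc/bin/managedb backup
```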
Other tools and utilities
  • See the serviceability presentation for more information on:
    • Interconnects
    • Diagnostics
    • Integrity checking
Firewall configuration


[Diagram: the head node sits between the external network and the admin network; compute nodes attach to the admin network]


-A RH-Firewall-1-INPUT -i external -p tcp -m tcp --dport 22 -j ACCEPT

-A RH-Firewall-1-INPUT -i admin -p tcp -m tcp --dport 22 -j ACCEPT

Iptables Firewall
  • XC uses iptables packet filtering to restrict incoming network communication on all nodes.
  • All outgoing communication from any network is allowed, but connection initiation to a node is limited to ports that have been previously opened
  • Rules are added to the RH-Firewall-1-INPUT chain to specify connection initiations that will be accepted. All other connection initiation attempts will be rejected.
  • Rules are saved to /etc/sysconfig/iptables for boot-time persistence
Iptables Firewall (Cont)

“Out of the box” opened ports:

  • External network
    • ssh
    • https
  • Admin, Interconnect and loopback networks
    • ssh
    • smtp
    • tftp
    • portmapper
    • ntp
    • https
    • rsync
    • Ports 1024 - 65535
Iptables Firewall (Cont)
  • A prototype file is used to open a pre-determined list of ports on all nodes
  • Some XC services add firewall rules as they are configured, e.g.:
    • NFS_server service – modifies NFS daemons to use a specific port, instead of random ports, and then adds the necessary iptables rules
    • NAT service – adds NAT related iptables rules
    • Syslog – Adds the syslog ports
  • Other services will require manually adding rules during their configuration step. Most common services are already supported by XC.
  • Future goal is to have the majority of configured services add rules when they are configured, and remove rules when unconfigured.
Firewall Implementation
  • Firewall rules exist for individual network adapters
    • external
    • interconnect
    • management / admin LAN
  • Adapter names for each network may be different depending upon the node; rules must be generated on each node.
  • During initial configuration, a pre-defined /etc/sysconfig/iptables.proto file is used as the basis for generating the appropriate rules.
  • The rules are subsequently saved in /etc/sysconfig/iptables for boot-time persistence
Firewall implementation (Cont)
  • Firewall rules in the iptables.proto file look as follows, e.g.:
    • -A RH-Firewall-1-INPUT -i External -p tcp -m tcp --dport 22 -j ACCEPT
    • -A RH-Firewall-1-INPUT -i External -p tcp -m tcp --dport 443 -j ACCEPT
    • -A RH-Firewall-1-INPUT -i Admin -p tcp -m tcp --dport 22 -j ACCEPT
    • -A RH-Firewall-1-INPUT -i Admin -p tcp -m tcp --dport 25 -j ACCEPT
    • -A RH-Firewall-1-INPUT -i Interconnect -p tcp -m tcp --dport 22 -j ACCEPT
    • -A RH-Firewall-1-INPUT -i Interconnect -p tcp -m tcp --dport 25 -j ACCEPT
  • When each node is configured, the text fields External, Admin, and Interconnect are replaced with the actual device names on the node and written to the node’s /etc/sysconfig/iptables file.
openipport(8)
  • HP value-add command to create/insert one or more firewall rules
    • #openipport --port port-number --protocol udp|tcp --interface External|Admin|Interconnect|lo
  • --port specifies the port number used by the service.
  • --protocol specifies the protocol used by the service for communication. This can be either tcp or udp.
  • --interface specifies the interfaces on which the port should be opened. If more than one interface is requested, then a comma separated list, with no blank spaces, containing one to four of the valid interfaces is accepted.
  • These iptables file changes remain after the system reboots or the iptables service is restarted.
openipport 8 cont
openipport(8) (Cont)
  • For example, the following command will add two iptables rules for a service that uses port 44 and protocol tcp on the Admin and Interconnect networks.
    • #openipport --port 44 --protocol tcp --interface Admin,Interconnect
  • In the future, a corresponding closeipport command will be provided.
temporarily disabling the iptables firewall
Temporarily Disabling the Iptables Firewall

If you are experiencing communication problems and want to determine whether the problem is caused by the firewall settings, perform the following steps:

1) Save a copy of the current iptables file

#cp /etc/sysconfig/iptables /etc/sysconfig/iptablesORIG

2) Flush the firewall rules

#iptables -F RH-Firewall-1-INPUT

#service iptables save

This flushes all of the current firewall rules in memory and makes a permanent change to the iptables file. If you look at the /etc/sysconfig/iptables file, the firewall rules are gone. This step, however, leaves the rest of the iptables file, such as the NAT table, intact.

temporarily disabling the iptables firewall cont
Temporarily Disabling the Iptables Firewall (Cont)

3) When ready to reinstate the firewall rules, copy back the original iptables file and cause the iptables service to re-read it.

#cp /etc/sysconfig/iptablesORIG /etc/sysconfig/iptables

#service iptables restart

iptables has now re-read the iptables file, and all firewall rules are back in use.

NOTE: Any iptables commands on the RH-Firewall-1-INPUT chain that were performed after disabling the firewall and before replacing the original iptables file will be lost!

logging rejected iptables firewall messages
Logging Rejected Iptables Firewall Messages

Use the following steps to log rejected firewall messages:

  • A LOG rule must be inserted just before the REJECT rule in the firewall chain so that all rejected messages are logged. Perform one of the following steps to determine the correct rule number for the LOG rule:
    • Obtain a count of the firewall chain

# iptables --list RH-Firewall-1-INPUT | wc -l

# 81

The REJECT rule number in this example is 2 less than the line count since the REJECT rule is always the last rule and the output contains a line for the chain label and the field labels. Therefore, the REJECT rule is currently rule #79.

    • Analyze the ‘iptables --list RH-Firewall-1-INPUT’ output and determine the rule number for the REJECT rule, starting with 1 for the first rule.

Once the REJECT rule number is determined, this number must be specified when adding the LOG rule to the firewall chain. Using the example above, the LOG rule will become the new rule #79, and the REJECT rule becomes rule #80.
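The arithmetic above can be sketched as follows (the count of 81 is taken from the example; the commented-out command is the real query, which requires root):

```shell
# count=$(iptables --list RH-Firewall-1-INPUT | wc -l)  # the real query (needs root)
count=81                      # line count from the example above
reject_rule=$((count - 2))    # minus the chain-label line and the field-header line
echo "$reject_rule"           # the REJECT rule's position; insert the LOG rule here
```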

logging rejected iptables firewall messages cont
Logging Rejected Iptables Firewall Messages (Cont)
  • Insert the LOG rule as follows:

#iptables -I RH-Firewall-1-INPUT {your_REJECT_rule_#} -j LOG

For example,

# iptables -I RH-Firewall-1-INPUT 79 -j LOG

  • Rejected messages are now logged to the /var/log/{nodename} file on the node just before they are rejected.

NOTE: This type of logging will generate a large number of lines in the log file.
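The entries written by the LOG target are kernel messages; the sample line below is hypothetical and the field layout is approximate, but a grep of this shape pulls the rejected-packet records out of the log:

```shell
# Hypothetical sample of a LOG-rule kernel message (format is approximate)
logline='Jan 25 10:00:00 n1 kernel: IN=eth0 OUT= SRC=10.0.0.5 DST=10.0.0.1 PROTO=TCP DPT=513'
count=$(echo "$logline" | grep -c 'kernel: IN=')
echo "$count"
```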

console port connections
Console Port Connections
  • The connector is the leftmost RJ-45 connection when facing the back of the node for all supported systems (Ethernet); it is wired to the console port switch (XC)
    • rx2600 (Management Card LAN)
    • DL585 (iLO)
    • DL145 (NIC MGMT)
  • Power control performed via
    • telnet session for DL145 and rx2600
    • XML control sequences for DL585 (iLO)
  • OS and BIOS console connection via LAN
    • Only supported for rx2600 for Pegasus
network device overview
Network device overview
  • The XC cluster automatically configures the following devices:
    • Admin network adapter (required)
    • GIGE Interconnect adapter (optional)
    • External network adapter (optional)
    • High speed interconnect adapter (optional)
    • Shared Interconnect (ic=admin)
  • RH EL4 re-orders device names more dynamically (and more often ;-) than previous releases
  • Device names are no longer used by XC to determine which ports to assign to specific roles
network device naming
Network device naming
  • Device names such as eth0, eth1, eth2 may change based on:
    • kernel patches
    • loaded modules
    • number of add-on PCI network cards
    • Rainfall in the past 22.35 hours on a small island in the South Pacific …
    • Think about devices in terms of PCI bus ids and MAC addresses
  • Devices are assigned for Admin, Interconnect, or External use based on PCI bus ids. Typically, when looking at the back of the node, the leftmost physical port is Admin and the port to its right is either the interconnect or the external port.
  • A rules file is used to determine which PCI bus id to use for which function.
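One way to see the PCI bus id behind each device name is via sysfs; a minimal sketch, assuming the modern sysfs layout where /sys/class/net/&lt;dev&gt;/device links to the PCI function directory (the function name list_nics is ours, not part of XC):

```shell
# Print each network device with its PCI bus id (sketch; assumes sysfs layout)
list_nics() {
    root=${1:-/sys/class/net}
    for dev in "$root"/*; do
        [ -e "$dev/device" ] || continue                  # skip lo and virtual devices
        busid=$(basename "$(readlink -f "$dev/device")")  # resolve the PCI function dir
        echo "$(basename "$dev") $busid"
    done
}
# On a real node, list_nics would print lines such as "eth0 0000:03:00.0"
```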
network naming and modelmap
Network naming and modelmap
  • The modelmap file, /opt/hptc/config/modelmap, contains the device assignment rules:

type     ic-type    #ofadap  admin    extern    interconnect
dl145g2  Ethernet   ==2      03:00.0  undef     02:00.0
dl145g2  Ethernet   >2       03:00.0  offboard  02:00.0
dl145g2  !Ethernet  >1       03:00.0  02:00.0   undef

  • For dl145g2 systems with a GIGE interconnect that have more than 2 NICs, use the NIC at bus id 03:00.0 for the admin network, 02:00.0 for the interconnect, and the highest-numbered remaining device for the external network.
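A hypothetical awk lookup over the modelmap excerpt above shows how such a rules table can be matched for a given NIC count (the real parser is part of XC's configuration tooling):

```shell
# Write the three dl145g2 rules from the excerpt to a temp file
mm=$(mktemp)
cat > "$mm" <<'EOF'
dl145g2 Ethernet ==2 03:00.0 undef    02:00.0
dl145g2 Ethernet >2  03:00.0 offboard 02:00.0
dl145g2 !Ethernet >1 03:00.0 02:00.0  undef
EOF
# Select the row for a dl145g2 with an Ethernet interconnect and 4 adapters
row=$(awk -v n=4 '$1=="dl145g2" && $2=="Ethernet" {
    if (($3=="==2" && n==2) || ($3==">2" && n>2))
        print "admin="$4, "extern="$5, "ic="$6
}' "$mm")
echo "$row"
```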
database troubleshooting
Database troubleshooting
  • Add logging directives to /etc/my.cnf
    • [mysqld]
    • log=/var/lib/mysql/mysql.log
    • service mysql restart
  • Look for slow queries
    • [mysqld]
    • log-slow-queries=/var/lib/mysql/slow_query.log
    • log-long-format
    • long-query-time=2
    • service mysql restart
troubleshooting nagios
Troubleshooting (Nagios)
  • Check the service problems page in Nagios
  • Check the /opt/hptc/nagios/var/nagios.log and the /opt/hptc/nagios/var/status.log files. Look at syslog entries.
  • If Nagios does not appear to update its status, remove the persistent state file:
    • service nagios stop
    • /bin/rm /opt/hptc/nagios/var/status.sav
    • service nagios start
    • note: this is rarely needed, but is good to know about
  • Visit the Nagios documentation page; it is linked from any Nagios home page.
  • Nagios runs as user “nagios”, not root. To reproduce errors seen in the Nagios display, run su - nagios, then run /opt/hptc/nagios/libexec/check_* to test the plug-in.
  • Turn on logging for the Nagios monitors: add --log /opt/hptc/nagios/var/monitor.log to the submit_service_check_via_ssh command definition in /opt/hptc/nagios/etc/checkcommands.cfg
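When running checks by hand as the nagios user, remember that Nagios derives service state from the plugin's exit code (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN). The stand-in plugin below is hypothetical, not a real XC check; the real plug-ins live in /opt/hptc/nagios/libexec:

```shell
# check_stub is a hypothetical stand-in illustrating the plugin contract:
# a one-line status message on stdout plus a state exit code (0 = OK).
check_stub() { echo "STUB OK - all clear"; return 0; }
out=$(check_stub)
rc=$?                          # exit code of check_stub via the substitution
echo "output=$out exit=$rc"
```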
troubleshooting supermon
Troubleshooting (Supermon)
  • Ensure the daemons are running
    • service supermon status
  • Ensure you can connect to the daemon in a simple way:
    • telnet localhost 2709 (and 2710)†


Lots of data/metrics should be printed out.

† 2710 is supermon and only runs on some nodes

troubleshooting syslog ng
Troubleshooting (syslog-ng)
  • Check the /var/log files
  • cexec -a service syslog-ng status
  • Do not run syslog and syslog-ng together, as they can deadlock and cause syslog() calls in applications to hang the process.
  • syslog-ng is not an exact replacement for syslog, as some logs may not be written identically. If you suspect a problem, temporarily turn off syslog-ng, run syslog, and report the discrepancy. Note that without syslog-ng you will lose log information on subsequent re-images.
troubleshooting cont access to console
Troubleshooting (Cont)Access to Console
  • Verify that the cmfd is running
    • “/sbin/service cmf status”
  • Verify that the console is an MP console
    • “cat /etc/powerd.conf”
  • Verify that the cmf service has connected successfully
    • Check the cmfd.log for ‘connect failures’
    • Note that cmf retries failed connections periodically