
A Mobile-Agent-Based Performance-Monitoring System at RHIC



1. A Mobile-Agent-Based Performance-Monitoring System at RHIC
Richard Ibbotson

2. Overview
• Motivation for a new monitoring system
• Design of the instrumentation system
• Use of mobile agents (mobile programs vs. remote procedures)
• How it works, what it does and doesn’t do
• Practical experiences with a test instrument
• What works well and what doesn’t
• Future enhancements

3. Monitoring System Purpose
The system should:
• Provide performance monitoring at the service level
• Run “end-to-end” tests yielding combined information on the functioning of several services
• Track performance changes across configuration changes
• Monitor the current health of the system
• Provide some error-tracking/reporting capabilities
• Be a tool for administrators and experimenters
It will not:
• Provide detailed system information for fault diagnosis (system-specific, vendor-supplied tools already exist)

4. Desired Features of the System
• View and compare past and current measurements
• Inspect correlations between metrics
• Allow the sampling rate to be varied
• Automatically execute scheduled measurements
• Perform measurements on demand at shorter intervals
• Perform OS-independent measurements
• Use a small fraction of available resources

5. Components of the System
• “Instruments” which perform measurements
• Centralized database of Instruments (code) and time-stamped results
  • Allows simple addition of new metrics
  • Allows previously run tests to be reproduced
• Mechanism for remote execution of Instruments: the IBM “Aglets” mobile-agent system (http://www.trl.ibm.co.jp/aglets)
[Diagram labels: parameters, code, monitor, sequence of measurements]

6. Mobile Agents vs. RPC
• Remote Procedure Call: the user’s system sends a search request to the remote system; a pre-defined procedure (a local search utility) on the remote host executes against the dataset to search and returns the result.
• Mobile Agent: a daemon on the remote host accepts the agent and allows it to execute; the agent carries the search utility to the dataset. Drawback: increased network load for large agents.

7. Advantages of Mobile Agents
• Metrics can be defined at any time and implemented on the central host
• Performance is measured on the relevant host
• The Aglets system is Java-based, providing platform-independent execution
• A sophisticated security model exists for restricting the actions of the agents

8. Use of Mobile Agents in Monitoring
• The simplest approach, “Single-Remote-Host”, was implemented for the initial configuration
• Waiting between tests is done on the central server for reliability
[Diagram: in the Single-Remote-Host approach the central server makes a separate round trip to each target host; in the itinerary approach one agent visits several target hosts in sequence before returning]
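The Single-Remote-Host pattern can be sketched in plain Java. This is a hypothetical stand-in, not the Aglets API: `SingleRemoteHostScheduler`, `AgentDispatcher`, and `dispatchAndAwait` are illustrative names, and the agent round trip is modeled as a blocking call. The point the slide makes is preserved: all waiting happens on the central server, one round trip per measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "Single-Remote-Host" pattern: the central
// server does all the waiting and makes one agent round trip per
// measurement. Names are illustrative stand-ins for the Aglets dispatch
// machinery, not the real API.
class SingleRemoteHostScheduler {
    interface AgentDispatcher {
        // Send an instrument agent to targetHost and block until it
        // returns with its measurement (stand-in for an Aglets dispatch).
        double dispatchAndAwait(String targetHost) throws Exception;
    }

    private final AgentDispatcher dispatcher;
    private final long intervalMillis;

    SingleRemoteHostScheduler(AgentDispatcher d, long intervalMillis) {
        this.dispatcher = d;
        this.intervalMillis = intervalMillis;
    }

    // Run `rounds` scheduled passes over the target hosts, waiting on the
    // central server (here) between passes, never on the targets.
    List<Double> run(List<String> targets, int rounds) throws Exception {
        List<Double> results = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            for (String host : targets) {
                results.add(dispatcher.dispatchAndAwait(host));
            }
            Thread.sleep(intervalMillis); // central wait between scheduled passes
        }
        return results;
    }
}
```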

9. Anatomy of an Instrument
Class diagram (reconstructed from the slide):
• MobilityPattern: startTrip(), nextTransfer(), ...
• SpecificInstrument: onMeasuring(), onInstrumentCreation(), ...  (inherits from Instrument)
• Instrument: storeInDB(), setInvalid(), loadParams(), storeResult(), ...  (inherits from Aglet)
• Aglet: onCreation(), run(), ...
• StatusUpdater: registerWithMonitor(), updateMonitor(), ...
• ParameterList: loadParams(), getValue(key), ...
• Result
The code defining a specific implementation of an Instrument is ≈ 30 lines.
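The inheritance the class diagram shows can be sketched as follows. These are simplified stand-in classes, not the real Aglets API (`AgletStub` and `SleepInstrument` are invented for illustration): the base classes handle parameters, lifecycle, and result storage, so a new metric only fills in an `onMeasuring()` hook, which is where the ≈ 30 lines per Instrument go.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins mirroring the slide's class diagram; none of this
// is the real Aglets API.
abstract class AgletStub {              // stand-in for the Aglet base class
    public void onCreation(Object init) {}
    public abstract void run();
}

abstract class Instrument extends AgletStub {
    protected Map<String, String> params = new HashMap<>();
    Double result;

    void loadParams(Map<String, String> p) { params.putAll(p); }
    void storeResult(double r) { result = r; } // would write to the central DB

    @Override public void run() {
        onInstrumentCreation();
        storeResult(onMeasuring());
    }

    protected void onInstrumentCreation() {}   // optional per-instrument setup
    protected abstract double onMeasuring();   // the ~30 lines a new metric needs
}

// A trivial new metric: only the measurement itself is written from scratch.
class SleepInstrument extends Instrument {
    @Override protected double onMeasuring() {
        long t0 = System.nanoTime();
        try {
            Thread.sleep(Long.parseLong(params.getOrDefault("millis", "10")));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - t0) / 1e6; // elapsed time in ms
    }
}
```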

10. Test Instrument: File Access
• NFS access time (write) used as a proof of concept
• File size and location (file system) are passed as parameters in the database (specified at run time)
• Measurements are started by an automated process as specified by the Schedule table in the database
• Tested access to one file system from several client computers:
  • Linux (PIII) system with NFSv2, 1 KB block size
  • Linux (PIII) system with NFSv2, 8 KB block size
  • Linux (PIII) system with NFSv3
  • Solaris system with NFSv3
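The core of such a file-access instrument might look like the sketch below: write a given number of kilobytes in fixed-size blocks and report the elapsed wall time. This is an assumption-laden illustration (`FileWriteTimer` is an invented name); on the real system the path would sit on the NFS mount under test and the size and block-size parameters would come from the database.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch of the NFS write test: write `sizeKB` kilobytes to
// `path` in `blockKB`-sized blocks and return the elapsed time in ms.
class FileWriteTimer {
    static double timeWrite(File path, int sizeKB, int blockKB) throws IOException {
        byte[] block = new byte[blockKB * 1024];
        long t0 = System.nanoTime();
        try (FileOutputStream out = new FileOutputStream(path)) {
            for (int written = 0; written < sizeKB; written += blockKB) {
                // Last block may be partial if sizeKB is not a multiple of blockKB.
                out.write(block, 0, Math.min(blockKB, sizeKB - written) * 1024);
            }
            out.getFD().sync(); // force the data out so the timing is honest
        }
        return (System.nanoTime() - t0) / 1e6;
    }
}
```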

11. Report Generation Tool
• Sample tests are carried out automatically by a “Scheduler” Aglet
• Reports are requested via an HTML form: users specify a test type, parameter set, and target host; a Perl CGI script queries the database and plots the results using Gnuplot

12. Sample Report for File Access
[Plot: results indicate server load and client configuration; annotations mark nightly backups and weekly de-fragmentation]

13. Problems with the Mobile Agents
• Transfer is interrupted when several agents move to/from the same host within ≈ 1–2 s
• The small size of the Aglets currently used (15 KB) cannot explain the effective dead time
• The failure is presented to the Aglet as a refusal (it can detect it, wait, and retry)
• Congestion at the central host can be relieved by following a “circuit” through multiple hosts before returning
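The detect-wait-retry handling can be sketched like this. `RefusedException`, `RetryingDispatcher`, and the backoff policy are illustrative assumptions, not the exception or API the Aglets runtime actually uses; the slide only says that a refused transfer is detectable and can be retried.

```java
import java.util.concurrent.Callable;

// Illustrative stand-in for a refused agent transfer; the real Aglets
// runtime signals refusal through its own exception type.
class RefusedException extends Exception {}

class RetryingDispatcher {
    // Retry a dispatch that may be refused by a busy host, backing off a
    // little longer after each refusal; rethrow once maxTries is exhausted.
    static <T> T dispatchWithRetry(Callable<T> dispatch, int maxTries,
                                   long backoffMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return dispatch.call();
            } catch (RefusedException e) {
                if (attempt >= maxTries) throw e;      // give up after maxTries
                Thread.sleep(backoffMillis * attempt); // linear backoff, then retry
            }
        }
    }
}
```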

14. Future System Development
• Solve the transfer-interruption problem
• Develop other mobility patterns:
  • NFS read access may be tested by writing on one host and timing a read on a different host (to avoid caching)
  • Use of an “itinerary” can ease network congestion at the central server
• A tracking/error-reporting system is being developed and will be connected to a paging system

15. Summary
• The initial implementation is proving useful
• The mobile-agent architecture adds design work but eases implementation and adds flexibility
• Transfer interruption is causing scalability problems, but they are not insurmountable
• We plan to have the expanded system running before data-taking begins

16. Questions...
Richard Ibbotson, BNL (ibbotson@bnl.gov)
Thanks to…
• David Stampf, BNL
• Tom Throwe, BNL
• Bruce Gibbard, BNL
