
A Mobile-Agent-Based Performance-Monitoring System at RHIC



1. A Mobile-Agent-Based Performance-Monitoring System at RHIC
Richard Ibbotson

2. Overview
• Motivation for a new monitoring system
• Design of the instrumentation system
• Use of mobile agents (mobile programs vs. remote procedures)
• How it works, what it does and doesn’t do
• Practical experiences with a test instrument
• What works well and what doesn’t
• Future enhancements

3. Monitoring System Purpose
The system should:
• Provide performance monitoring at the service level
• Run “end-to-end” tests yielding combined information on the functioning of several services
• Track performance changes across configuration changes
• Monitor the current health of the system
• Provide some error-tracking/reporting capabilities
• Be a tool for administrators and experimenters
It will not:
• Provide detailed system information for fault diagnosis (system-specific, vendor-supplied tools already exist)

4. Desired Features of the System
• View and compare past and current measurements
• Inspect correlations between metrics
• Allow the sampling rate to be varied
• Automatically execute scheduled measurements
• Perform measurements on demand at shorter intervals
• Perform OS-independent measurements
• Use a small fraction of available resources

5. Components of the System
• “Instruments” which perform measurements
• Centralized database of Instruments (code) and time-stamped results
  • Allows simple addition of new metrics
  • Allows previously run tests to be reproduced
• Mechanism for remote execution of Instruments: the IBM “Aglets” mobile-agent system (http://www.trl.ibm.co.jp/aglets)
[Diagram labels: parameters, code, monitor, sequence of measurements]

6. Mobile Agents vs. RPC
• Remote Procedure Call: the user’s system sends a search request to the remote system; a pre-defined procedure (a local search utility) on the remote host executes against the dataset to search and returns the result.
• Mobile Agent: a daemon on the remote host accepts the agent and allows it to execute; the agent carries the search utility to the dataset. Drawback: increased network load for large agents.

7. Advantages of Mobile Agents
• Metrics can be defined at any time and implemented on the central host
• Performance is measured on the relevant host
• The Aglets system is Java-based, providing platform-independent execution
• A sophisticated security model exists for restricting the actions of the agents

8. Use of Mobile Agents in Monitoring
• The simplest approach, “Single-Remote-Host”, was implemented for the initial configuration
• Waiting between tests is done on the central server for reliability
[Diagram: in the Single-Remote-Host approach the central server makes a separate round trip to each target host; in the itinerary approach one agent visits several target hosts in sequence before returning]
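The Single-Remote-Host pattern can be sketched in plain Java. This is a hypothetical stand-in, not the Aglets API: `SingleRemoteHostScheduler`, `AgentDispatcher`, and `dispatchAndAwait` are illustrative names, and the agent round trip is modeled as a blocking call. The point the slide makes is preserved: all waiting happens on the central server, one round trip per measurement.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "Single-Remote-Host" pattern: the central
// server does all the waiting and makes one agent round trip per
// measurement. Names are illustrative stand-ins for the Aglets dispatch
// machinery, not the real API.
class SingleRemoteHostScheduler {
    interface AgentDispatcher {
        // Send an instrument agent to targetHost and block until it
        // returns with its measurement (stand-in for an Aglets dispatch).
        double dispatchAndAwait(String targetHost) throws Exception;
    }

    private final AgentDispatcher dispatcher;
    private final long intervalMillis;

    SingleRemoteHostScheduler(AgentDispatcher d, long intervalMillis) {
        this.dispatcher = d;
        this.intervalMillis = intervalMillis;
    }

    // Run `rounds` scheduled passes over the target hosts, waiting on the
    // central server (here) between passes, never on the targets.
    List<Double> run(List<String> targets, int rounds) throws Exception {
        List<Double> results = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            for (String host : targets) {
                results.add(dispatcher.dispatchAndAwait(host));
            }
            Thread.sleep(intervalMillis); // central wait between scheduled passes
        }
        return results;
    }
}
```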

9. Anatomy of an Instrument
Class diagram (reconstructed from the slide):
• MobilityPattern: startTrip(), nextTransfer(), ...
• SpecificInstrument: onMeasuring(), onInstrumentCreation(), ...  (inherits from Instrument)
• Instrument: storeInDB(), setInvalid(), loadParams(), storeResult(), ...  (inherits from Aglet)
• Aglet: onCreation(), run(), ...
• StatusUpdater: registerWithMonitor(), updateMonitor(), ...
• ParameterList: loadParams(), getValue(key), ...
• Result
The code defining a specific implementation of an Instrument is ≈ 30 lines.
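The inheritance the class diagram shows can be sketched as follows. These are simplified stand-in classes, not the real Aglets API (`AgletStub` and `SleepInstrument` are invented for illustration): the base classes handle parameters, lifecycle, and result storage, so a new metric only fills in an `onMeasuring()` hook, which is where the ≈ 30 lines per Instrument go.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins mirroring the slide's class diagram; none of this
// is the real Aglets API.
abstract class AgletStub {              // stand-in for the Aglet base class
    public void onCreation(Object init) {}
    public abstract void run();
}

abstract class Instrument extends AgletStub {
    protected Map<String, String> params = new HashMap<>();
    Double result;

    void loadParams(Map<String, String> p) { params.putAll(p); }
    void storeResult(double r) { result = r; } // would write to the central DB

    @Override public void run() {
        onInstrumentCreation();
        storeResult(onMeasuring());
    }

    protected void onInstrumentCreation() {}   // optional per-instrument setup
    protected abstract double onMeasuring();   // the ~30 lines a new metric needs
}

// A trivial new metric: only the measurement itself is written from scratch.
class SleepInstrument extends Instrument {
    @Override protected double onMeasuring() {
        long t0 = System.nanoTime();
        try {
            Thread.sleep(Long.parseLong(params.getOrDefault("millis", "10")));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - t0) / 1e6; // elapsed time in ms
    }
}
```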

10. Test Instrument: File Access
• NFS access time (write) used as a proof of concept
• File size and location (file system) are passed as parameters in the database (specified at run time)
• Measurements are started by an automated process as specified by the Schedule table in the database
• Tested access to one file system from several client computers:
  • Linux (PIII) system with NFSv2, 1 KB block size
  • Linux (PIII) system with NFSv2, 8 KB block size
  • Linux (PIII) system with NFSv3
  • Solaris system with NFSv3
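The core of such a file-access instrument might look like the sketch below: write a given number of kilobytes in fixed-size blocks and report the elapsed wall time. This is an assumption-laden illustration (`FileWriteTimer` is an invented name); on the real system the path would sit on the NFS mount under test and the size and block-size parameters would come from the database.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical sketch of the NFS write test: write `sizeKB` kilobytes to
// `path` in `blockKB`-sized blocks and return the elapsed time in ms.
class FileWriteTimer {
    static double timeWrite(File path, int sizeKB, int blockKB) throws IOException {
        byte[] block = new byte[blockKB * 1024];
        long t0 = System.nanoTime();
        try (FileOutputStream out = new FileOutputStream(path)) {
            for (int written = 0; written < sizeKB; written += blockKB) {
                // Last block may be partial if sizeKB is not a multiple of blockKB.
                out.write(block, 0, Math.min(blockKB, sizeKB - written) * 1024);
            }
            out.getFD().sync(); // force the data out so the timing is honest
        }
        return (System.nanoTime() - t0) / 1e6;
    }
}
```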

11. Report Generation Tool
• Sample tests are carried out automatically by a “Scheduler” Aglet
• Reports are requested via an HTML form: users specify a test type, parameter set, and target host; a Perl CGI script queries the database and plots the results using Gnuplot

12. Sample Report for File Access
[Plot: results indicate server load and client configuration; annotations mark nightly backups and weekly de-fragmentation]

13. Problems with the Mobile Agents
• Transfer is interrupted when several agents move to/from the same host within ≈ 1–2 s
• The small size of the Aglets currently used (15 KB) cannot explain the effective dead time
• The failure is presented to the Aglet as a refusal (it can detect it, wait, and retry)
• Congestion at the central host can be relieved by following a “circuit” through multiple hosts before returning
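The detect-wait-retry handling can be sketched like this. `RefusedException`, `RetryingDispatcher`, and the backoff policy are illustrative assumptions, not the exception or API the Aglets runtime actually uses; the slide only says that a refused transfer is detectable and can be retried.

```java
import java.util.concurrent.Callable;

// Illustrative stand-in for a refused agent transfer; the real Aglets
// runtime signals refusal through its own exception type.
class RefusedException extends Exception {}

class RetryingDispatcher {
    // Retry a dispatch that may be refused by a busy host, backing off a
    // little longer after each refusal; rethrow once maxTries is exhausted.
    static <T> T dispatchWithRetry(Callable<T> dispatch, int maxTries,
                                   long backoffMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return dispatch.call();
            } catch (RefusedException e) {
                if (attempt >= maxTries) throw e;      // give up after maxTries
                Thread.sleep(backoffMillis * attempt); // linear backoff, then retry
            }
        }
    }
}
```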

14. Future System Development
• Solve the transfer-interruption problem
• Develop other mobility patterns:
  • NFS read access may be tested by writing on one host and timing a read on a different host (to avoid caching)
  • Use of an “itinerary” can ease network congestion at the central server
• A tracking/error-reporting system is being developed and will be connected to a paging system

15. Summary
• The initial implementation is proving useful
• The mobile-agent architecture adds design work but eases implementation and adds flexibility
• Transfer interruption is causing scalability problems, but they are not insurmountable
• We plan to have the expanded system running before data-taking begins

16. Questions...
Richard Ibbotson, BNL (ibbotson@bnl.gov)
Thanks to…
• David Stampf, BNL
• Tom Throwe, BNL
• Bruce Gibbard, BNL
