
Nagios Grid Monitor


Presentation Transcript


  1. Nagios Grid Monitor
     E. Imamagic, SRCE
     OAT Kickoff Meeting

  2. Overview
     • Nagios framework
     • Nagios Grid Monitor
     • Architecture
     • Components
     • Current status
     • Future work
     OAT Kickoff Meeting / Nagios Grid Monitor

  3. Nagios Framework
     • Host and service problem detection and recovery
     • Alarms in case of problems
     • Fine-grained configuration
         • users, hosts, services, probes, notifications
     • Host and service dependencies
     • Easy to develop and use custom sensors
     • Importing results from external monitoring systems
         • passive checks
     • Web interface
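The custom sensors and passive checks mentioned on this slide can be illustrated with a minimal Python sketch. The return codes (0 to 3) and the `PROCESS_SERVICE_CHECK_RESULT` external command are standard Nagios conventions. The sensor logic, thresholds, host name, and service name are hypothetical and not taken from the presentation.

```python
import time

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_response_time(seconds, warn=2.0, crit=5.0):
    """Hypothetical sensor logic: map a measured response time to a plugin status."""
    if seconds is None:
        return UNKNOWN, "no measurement available"
    if seconds >= crit:
        return CRITICAL, f"response time {seconds:.1f}s"
    if seconds >= warn:
        return WARNING, f"response time {seconds:.1f}s"
    return OK, f"response time {seconds:.1f}s"


def passive_result_line(host, service, code, output, timestamp=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command for a passive check."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}"


# An active plugin would print one status line and exit with `code`;
# a passive result would instead be written to the Nagios command file.
code, message = check_response_time(3.2)
print(f"{STATUS_NAMES[code]}: {message}")
print(passive_result_line("ce.example.org", "hypothetical-service", code, message))
```

The same status tuple serves both paths: an active check exits with the code, while an external system submits the formatted line as a passive result.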

  4. Nagios Grid Monitor

  5. Components
     • Probes for monitoring grid services
         • Grid Monitoring Probes Specification
     • Standard probe wrapper
         • integrates standard probes with Nagios
     • Remote gatherers
         • import results from external monitoring system
         • Grid Monitoring Data Exchange Standard
     • Publisher
         • export results from Nagios Grid Monitor
         • Grid Monitoring Data Exchange Standard
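As a rough illustration of what a standard probe wrapper might do, the following Python sketch translates a probe's textual result into a Nagios return code and summary line. The `key: value` output format, the `status` and `summary` field names, and the status vocabulary are assumptions made for this example. The slides do not describe the actual Grid Monitoring Probes Specification, which may differ.

```python
# Hypothetical status vocabulary; the real specification may use other values.
STATUS_TO_NAGIOS = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}


def parse_probe_output(text):
    """Parse 'key: value' lines (an assumed format) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def to_nagios(text):
    """Return (exit_code, one-line summary) for Nagios from probe output."""
    fields = parse_probe_output(text)
    status = fields.get("status", "unknown").lower()
    code = STATUS_TO_NAGIOS.get(status, 3)  # unrecognised status maps to UNKNOWN
    summary = fields.get("summary", "no summary provided")
    return code, summary


print(to_nagios("status: critical\nsummary: service not responding"))
```

Falling back to UNKNOWN for missing or unrecognised statuses keeps a malformed probe from being reported as healthy.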

  6. Components
     • Credential management
         • provides proxy certificate for standard probes
     • Nagios Config Generator (NCG)
         • uses various information sources
         • enables manual configuration tuning
         • modular implementation
         • information sources can be added
         • can be reused for other monitoring systems
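The modular design described for NCG (pluggable information sources plus manual tuning) could look roughly like the Python sketch below. It is not NCG's actual implementation. The source function, host names, service names, and interval values are hypothetical, and the rendered `define service` blocks are incomplete fragments: a real Nagios service definition also needs directives such as `check_command`.

```python
def static_source():
    """Hypothetical information source: a hard-coded site description."""
    return [
        {"host": "ce.example.org", "service": "CE"},
        {"host": "se.example.org", "service": "SE"},
    ]


def collect(sources):
    """Merge entries from all sources, dropping duplicates but keeping order."""
    seen, merged = set(), []
    for source in sources:
        for entry in source():
            key = (entry["host"], entry["service"])
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return merged


def render(entries, overrides=None):
    """Render service definition fragments; overrides allow manual tuning per service."""
    overrides = overrides or {}
    blocks = []
    for entry in entries:
        directives = {
            "host_name": entry["host"],
            "service_description": entry["service"],
            "check_interval": 30,  # illustrative default, in minutes
        }
        directives.update(overrides.get(entry["service"], {}))
        body = "\n".join(f"    {k} {v}" for k, v in directives.items())
        blocks.append("define service {\n" + body + "\n}")
    return "\n\n".join(blocks)


entries = collect([static_source])
print(render(entries, overrides={"CE": {"check_interval": 5}}))
```

Adding an information source means adding one more callable to the list passed to `collect`, which is the kind of extensibility the slide describes.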

  7. Components
     • Probe description database
         • frequencies, timeouts, arguments needed, dependencies
         • part of NCG
     • Service dependencies
         • alarm filtering
         • hierarchy of probes
         • simple probes more often (e.g. 5 min)
         • heavyweight probes less often (e.g. 30 min)
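A probe description database and dependency-based alarm filtering might be sketched in Python as follows. The probe names, timeouts, and dependency links are hypothetical. Only the 5-minute and 30-minute intervals echo the examples on the slide.

```python
# Hypothetical probe descriptions: names and values are illustrative only.
PROBES = {
    "ping":       {"interval_min": 5,  "timeout_s": 10,  "depends_on": []},
    "job-submit": {"interval_min": 30, "timeout_s": 600, "depends_on": ["ping"]},
}


def alarms_to_raise(failed, probes=PROBES):
    """Keep only failures whose dependencies did not fail (likely root causes)."""
    failed = set(failed)
    return sorted(
        name for name in failed
        if not any(dep in failed for dep in probes[name]["depends_on"])
    )


def due_probes(minute, probes=PROBES):
    """Probes scheduled to run at a given minute, based on their interval."""
    return sorted(n for n, p in probes.items() if minute % p["interval_min"] == 0)


print(alarms_to_raise(["ping", "job-submit"]))
print(due_probes(30))
```

When the lightweight probe fails, the alarm for the heavyweight probe that depends on it is suppressed, so operators see one alarm for the underlying problem rather than a cascade.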

  8. Remote gLite UI
     • Avoid installation of grid middleware on Nagios server

  9. Current Status
     • Three sets of standard probes integrated
         • SRCE, CERN, OSG
     • Two external monitoring systems
         • SAM, ENOC DownCollector
     • Several deployments
         • CERN-PPS, SRCE, NIKHEF, PIC, IN2P3, ScotGrid
     • RPMs in apt and yum repository
     • Installation and configuration manual
     • More info: https://twiki.cern.ch/twiki/bin/view/LCG/GridServiceMonitoringInfo

  10. Future Work
     • Nagios Grid Monitor node type
         • easier deployment
         • gLite box + Yaim configuration
     • GGUS integration
         • updating tickets related to failed hosts or services (J. Templon)
     • Credential management alternatives
         • certificate-based MyProxy certificates (-R)
         • custom credential management
     • NCG multisite support
         • generating config for multiple sites
         • crucial for regional instances

  11. Future Work
     • GOCDB integration
         • region, site and services information
         • site personnel information
         • scheduled downtimes
     • NRPE on service nodes
         • monitoring local components (logs, processes, etc.)
     • Probe description database
         • separating from NCG to standalone database

  12. Future Work
     • Regional monitoring instance
         • monitor services from regional point of view
         • utilize network topology
         • integration with existing site-level Nagios instances
         • integration with messaging systems (ActiveMQ)
         • both pushing and receiving data
         • complex regional-level probes
         • can be further split to NGI-level

  13. Thank You! Questions?
