
Nagios – Our Open Source Network Management Solution


Presentation Transcript


  1. Nagios – Our Open Source Network Management Solution
     Presenter: Ling Zhang
     LBLnet Services Group, Information Technologies and Services Division, LBNL

  2. Contributors
     • Nagios software design and development: Ethan Galstad (www.nagios.org)
     • System integration, configuration, testing: Ling Zhang, Greg Bell, Harper Mann, Cedric Hui, Clark Wood, Mike Bennett
     ITSD/LBNL

  3. Goals for this talk
     • To explain:
       • LBLnet’s point of view of a network management system (NMS)
       • the network monitoring problems we encountered
       • the design of our Nagios network monitoring system
     • To discuss:
       • the benefits of the Nagios system
       • our future development goals

  4. Our point of view of an NMS
     • Proactive network management
       • Alarm panel
       • Connectivity
       • Performance
       • Fault isolation
       • Trend analysis
       • Capacity planning
     • The notification
       • Precise
       • Fast

  5. Background Information
     • Network monitoring tools we have tested and/or used before: Sun Net Manager, Spectrum, WhatsUp Gold, Netmon, SNMPc, IPmonitor, HP OpenView, OpenNMS, InCharge, home-grown scripts, MRTG/RRDtool, etc.

  6. Background Information
     Our fair share of problems with NMS:
     • Notification storms: 65 notifications were received during a single router up/down event. The router has 20 active interfaces and 32 downstream monitored devices.
     • False alarms
     • Integration with existing systems (MRTG, trouble ticket system)
     • Tech support: our longest outstanding ticket has been open 2 years and counting
     • Budget

  7. In Search of a Better NMS
     • Accurate and efficient fault detection
     • Good performance
     • Extensible
     • Can be integrated with our existing systems
     • Low maintenance
     • Fits our budget

  8. Features of Nagios
     • Open source; runs on most Unix systems
     • Highly extensible
     • Reliable dependency monitoring
     • Excellent service monitoring capabilities
     • Ability to schedule maintenance periods
     • Flexible notification

  9. Our Nagios Topology
     [Figure: LBLnet NMS diagram]

  10. Nagios Extensibility
      • Plugins
      • Event handlers
      • External commands

  11. Nagios Extensibility – Plugins
      • Compiled executables or scripts (Perl, shell, etc.)
      • Run by the Nagios process
      • Check device or service status
      Example:
          define host {
              host_name      switch1
              address        1.2.3.4
              check_command  ping_switch
          }
          define service {
              host_name            switch1
              service_description  CPU Util
              check_command        get_cpu_util
          }
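To make the plugin contract concrete, here is a minimal Python sketch of what a check such as `get_cpu_util` might do once it has a reading. It relies on the standard Nagios plugin convention (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN, one line of status text, optional performance data after a `|`). The function name `grade_cpu` and the 80%/95% thresholds are assumptions for illustration, not taken from the talk.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def grade_cpu(util_percent, warn=80.0, crit=95.0):
    """Return (exit_code, status_line) for a CPU utilization percentage.

    The thresholds are hypothetical defaults; a real plugin would usually
    take them from command-line options.
    """
    if util_percent is None or not 0.0 <= util_percent <= 100.0:
        return UNKNOWN, "CPU UNKNOWN - no valid reading"
    if util_percent >= crit:
        code = CRITICAL
    elif util_percent >= warn:
        code = WARNING
    else:
        code = OK
    # Text after '|' is performance data: label=value;warn;crit
    line = (
        f"CPU {LABELS[code]} - utilization {util_percent:.1f}% "
        f"| cpu={util_percent:.1f}%;{warn:.0f};{crit:.0f}"
    )
    return code, line
```

A wrapper script would print the status line and exit with the returned code; Nagios reads both to set the service state.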

  12. Services Monitored by Nagios
      • Nagios uses plugins to check service status
      • Services checked: DHCP, DNS, FTP, HTTP, HTTPS, IMAP, NTP, Radius, SMTP, SQL, TFTP, WINS, etc.

  13. Nagios Extensibility – Event Handlers
      • Compiled executables or scripts
      • Run by the Nagios process
      • Triggered by a host or service status change
      Example:
          define service {
              host_name            somehost
              service_description  HTTP
              max_check_attempts   4
              check_command        check_http
              event_handler        restart-httpd
              ...other service variables...
          }
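The decision logic inside a handler like `restart-httpd` can be sketched as follows. Nagios passes the service state, the state type (SOFT or HARD) and the current check attempt to the handler as macro arguments. The policy here (restart on the last SOFT retry, and again when the state goes HARD) follows the pattern of the sample handler in the Nagios documentation, but the function names and the init-script path are hypothetical.

```python
def should_restart(state, state_type, attempt, max_check_attempts=4):
    """Decide whether the handler should try to restart the service.

    state:      "OK", "WARNING", "CRITICAL" or "UNKNOWN"
    state_type: "SOFT" while Nagios is still retrying, "HARD" once confirmed
    attempt:    current check attempt number (1-based)
    """
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        return True
    # Try one restart just before Nagios would declare a HARD failure.
    return state_type == "SOFT" and attempt == max_check_attempts - 1


def restart_command(state, state_type, attempt):
    """Return the command to run, or None when no action is needed."""
    if should_restart(state, state_type, int(attempt)):
        # Hypothetical path; adjust to however httpd is managed locally.
        return ["/etc/init.d/httpd", "restart"]
    return None
```

With `max_check_attempts 4` as in the slide, the restart is attempted on the third failed check, so a successful restart can clear the problem before any notification is sent.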

  14. Nagios Extensibility – External Commands
      • A predefined set of commands issued externally to control the behavior of Nagios
      • Control notification, monitor scheduling, and program start/stop
      • Issued by external applications (CGI, snmptrapd, etc.)
      • Read in by the Nagios core process at run time
      Example:
      • A user disables monitoring of switch1 from the web interface
      • The CGI writes the command “disable monitor switch1” to the command file
      • The Nagios process reads this command and stops scheduling checks for switch1
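As a rough illustration of the mechanism, an external command is a single line of the form `[timestamp] COMMAND;arg1;arg2` written to the Nagios command file. The slide's "disable monitor switch1" is a paraphrase; the sketch below assumes the real command name `DISABLE_HOST_CHECK`. The helper names and the command-file path passed to `submit` are hypothetical and depend on the installation.

```python
import time


def format_external_command(name, *args, now=None):
    """Build one external-command line: "[epoch] NAME;arg1;arg2\\n"."""
    timestamp = int(time.time() if now is None else now)
    fields = [name] + [str(a) for a in args]
    return f"[{timestamp}] {';'.join(fields)}\n"


def submit(command_line, command_file):
    """Write the command to the Nagios command file (normally a named pipe).

    command_file is site-specific; no default is assumed here.
    """
    with open(command_file, "w") as pipe:
        pipe.write(command_line)


# What the CGI in the slide's example would write for switch1:
line = format_external_command("DISABLE_HOST_CHECK", "switch1", now=1089925417)
```

Here `line` is `"[1089925417] DISABLE_HOST_CHECK;switch1\n"`; the matching `ENABLE_HOST_CHECK` command turns checks back on.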

  15. Monitoring Network Devices
      • Ping
        • Measures system responsiveness via average RTT
      • SNMP get
        • CPU
        • Temperature
        • Interface/port status
        • System up time
        • Power supply status
        • Throughput
        • Packet discard rate
        • etc.
      • SNMP trap
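Throughput and discard rate are not read directly over SNMP; they are derived from two successive readings of interface counters such as `ifInOctets` and `ifInDiscards`. The sketch below shows that arithmetic, including the wrap of a 32-bit counter. The function names are illustrative, and treating "discards plus delivered packets" as the denominator is an assumption about how the rate is defined.

```python
def counter_delta(previous, current, bits=32):
    """Difference between two readings of a wrapping SNMP counter."""
    return (current - previous) % (1 << bits)


def throughput_bps(prev_octets, curr_octets, interval_s, bits=32):
    """Average bits per second over the polling interval."""
    if interval_s <= 0:
        raise ValueError("polling interval must be positive")
    return counter_delta(prev_octets, curr_octets, bits) * 8 / interval_s


def discard_ratio(prev_discards, curr_discards, prev_packets, curr_packets, bits=32):
    """Fraction of packets discarded during the interval (0.0 if idle).

    Assumes the packet counter counts delivered packets only, so the
    total seen is discards + delivered.
    """
    discards = counter_delta(prev_discards, curr_discards, bits)
    delivered = counter_delta(prev_packets, curr_packets, bits)
    total = discards + delivered
    return discards / total if total else 0.0
```

For example, 3000 octets received over a 60-second poll is 400 bits per second. The modulo handles a single wrap only, so the polling interval must be short enough that the counter cannot wrap twice.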

  16. Nagios Trap Handling
      • Requires Net-SNMP or another trap receiver daemon
      • The trap receiver notifies Nagios about received traps via external commands
      • Nagios calls event handlers and/or notifies the user
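One common way to wire this up is for the trap receiver's handler script to translate each trap into a passive check result, submitted with the `PROCESS_SERVICE_CHECK_RESULT` external command. The sketch below assumes that approach; the service description "SNMP Traps", the trap-to-state table, and the function name are hypothetical, and the talk does not say which command LBLnet actually used.

```python
# Map standard IF-MIB trap names to Nagios return codes (assumed policy).
TRAP_STATES = {"linkDown": 2, "linkUp": 0}  # 2 = CRITICAL, 0 = OK
UNKNOWN = 3


def trap_to_passive_result(host, trap_name, detail, now,
                           service="SNMP Traps"):
    """Format a trap as a PROCESS_SERVICE_CHECK_RESULT command line."""
    code = TRAP_STATES.get(trap_name, UNKNOWN)
    # ';' separates command fields, so it must not appear in the output text.
    output = f"{trap_name}: {detail}".replace(";", ",")
    return (f"[{int(now)}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{code};{output}\n")
```

The resulting line is written to the command file like any other external command; Nagios then treats it as a check result, which can trigger event handlers and notifications.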

  17. Dependency Configuration
          define host {
              use        switch-tmpl
              host_name  switch1
              address    1.2.3.10
              parents    router1
          }
          define host {
              use        switch-tmpl
              host_name  switch2
              address    1.2.3.20
              parents    switch1
          }
          define host {
              use        switch-tmpl
              host_name  switch3
              address    1.2.3.30
              parents    switch1
          }
          define host {
              use        switch-tmpl
              host_name  switch4
              address    1.2.3.40
              parents    switch2
          }
      [Figure: diagram of the resulting parent/child tree]
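The `parents` directive is what lets Nagios tell a failed device apart from one that is merely cut off behind it. A simplified model of that rule, applied to the topology above: a non-responding host is DOWN if it has no parents or at least one parent is UP, and UNREACHABLE if every parent is itself DOWN or UNREACHABLE. This is a sketch of the idea, not the Nagios implementation; it assumes the parent graph has no cycles.

```python
# Parent relationships from the host definitions above.
PARENTS = {
    "router1": [],
    "switch1": ["router1"],
    "switch2": ["switch1"],
    "switch3": ["switch1"],
    "switch4": ["switch2"],
}


def classify(parents, not_responding):
    """Return {host: "UP" | "DOWN" | "UNREACHABLE"} for every host."""
    status = {}

    def resolve(host):
        if host in status:
            return status[host]
        if host not in not_responding:
            status[host] = "UP"
        else:
            parent_up = [resolve(p) == "UP" for p in parents.get(host, [])]
            # No parents, or some parent still up: the fault is local.
            if not parent_up or any(parent_up):
                status[host] = "DOWN"
            else:
                status[host] = "UNREACHABLE"
        return status[host]

    for host in parents:
        resolve(host)
    return status


# switch1 fails, so everything behind it stops answering pings too.
result = classify(PARENTS, {"switch1", "switch2", "switch3", "switch4"})
```

In this scenario only switch1 is classified DOWN; switch2, switch3 and switch4 are UNREACHABLE. If notifications for unreachable hosts are turned off, one alert goes out instead of four.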

  18. Nagios Notification
      • Similar to event handlers
      • Triggered by a host/service status change
      • Calls third-party notification tools (sendmail, qpage, etc.)
      • Supports email, paging, instant messaging, etc.

  19. Nagios Notification Format
      • Email
          Subject: switch3 (1.2.3.30) DOWN
          Host: switch3
          Address: 1.2.3.30
          Date/Time: Thu Jul 15 14:03:37 PDT 2004
          Additional Info: (No Information Returned From Host Check)
      • Page
          DOWNswitch3(1.2.3.30)

  20. Maintenance Scheduling
      • Schedule a maintenance window via the Nagios web interface
      • Uses external commands
      • Fixed window
      • Floating (flexible) window
      • Dependency aware
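Behind the web form, a maintenance window is one more external command. The sketch below formats `SCHEDULE_HOST_DOWNTIME` using the field order given in the current Nagios documentation (host, start, end, fixed flag, trigger id, duration, author, comment); the Nagios version in use at the time of the talk may have taken different fields. The function name and sample values are hypothetical.

```python
def schedule_host_downtime(host, start, end, author, comment,
                           fixed=True, duration=None, trigger_id=0, now=None):
    """Format a SCHEDULE_HOST_DOWNTIME external-command line.

    fixed=True : downtime runs exactly from start to end.
    fixed=False: floating window; downtime begins when the host first goes
                 down between start and end and lasts `duration` seconds.
    """
    if end <= start:
        raise ValueError("end must be after start")
    if duration is None:
        duration = end - start
    timestamp = start if now is None else now
    fields = [host, int(start), int(end), 1 if fixed else 0,
              trigger_id, int(duration), author, comment]
    return (f"[{int(timestamp)}] SCHEDULE_HOST_DOWNTIME;"
            + ";".join(str(f) for f in fields) + "\n")


# One-hour fixed window for switch1 (epoch seconds chosen for illustration).
fixed_line = schedule_host_downtime("switch1", 1000, 4600, "ops",
                                    "IOS upgrade", now=900)
```

A floating window would pass `fixed=False` and an explicit `duration`, for example 1800 seconds somewhere inside a longer start/end range.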

  21. Monitoring Subnets with Redundant Network Connections
      • Challenge:
        • Monitoring interface status
        • Monitoring standby status at the same time
      • Solution:
        • Monitor interface up/down status via ping
        • Monitor HSRP status via the HSRP MIB
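One way to combine the two signals into a single Nagios result is sketched below. It assumes each router in the HSRP group is polled for ping reachability and for its HSRP group state, using the state numbering from CISCO-HSRP-MIB (standby = 5, active = 6). The grading policy and all names are assumptions for illustration; the talk does not describe how LBLnet combined the checks.

```python
# HSRP states as numbered in CISCO-HSRP-MIB (HsrpState).
STANDBY, ACTIVE = 5, 6


def redundancy_status(routers):
    """Grade an HSRP pair.

    routers: list of (name, ping_ok, hsrp_state) tuples, one per router.
    Returns (nagios_exit_code, status_line).
    """
    active = [n for n, ok, state in routers if ok and state == ACTIVE]
    standby = [n for n, ok, state in routers if ok and state == STANDBY]
    if len(active) != 1:
        return 2, f"CRITICAL - expected one active router, found {len(active)}"
    if not standby:
        return 1, f"WARNING - {active[0]} active but no standby; redundancy lost"
    return 0, f"OK - {active[0]} active, {standby[0]} standby"
```

The point of the WARNING case is that the subnet still has connectivity, so a plain ping check would report nothing, while the loss of the standby router is exactly what the HSRP check is there to catch.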

  22. Performance of Nagios
      • False alarms
        • False positive
        • False negative
        • Unnecessary
      • Notification delay
        • Before: 303 sec
        • After: 221 sec

  23. Money and Time Saved
      • Software package cost
        • InCharge ($$$)
        • IPmonitor ($1500)
        • Nagios ($0)
      • Software maintenance contract cost
        • InCharge (>$15,000)
        • IPmonitor ($500)
        • Nagios ($0)
      • Time saved from fewer unnecessary alarms (compared to IPmonitor)
        • 20 man-hours/month

  24. Future Development of Nagios
      • Performance monitoring
        • Network element out of resources
        • Interface buffer drops
        • Duplex mismatch
          • Has to be done by inference
          • Assume heterogeneous network equipment
          • No use of host SNMP
          • Derive from a combination of interface error types and rates
      • Integrating with other NMS elements
        • Syslog
        • MRTG/RRDtool
        • Trouble ticket system
        • Database
      • Topology discovery

  25. Conclusion
      • Nagios fits our network management needs because of its:
        • Accurate and efficient fault detection
        • Extensibility
        • Easy integration with our existing systems
        • Low maintenance
        • Fit with our budget

  26. Thanks!
      • We are happy to share
      • Questions / comments: send to lblnet@lbl.gov
