NSClient++: What's new?

Presentation Transcript

  1. Michael Medin (@mickem) michael@medin.name http://blog.medin.name SOA/Middleware Architect NSClient++ What's new? http://nsclient.org

  2. Michael Medin (@mickem) michael@medin.name http://blog.medin.name SOA/Middleware Architect Monitoring Simplified http://nsclient.org

  3. How many use NSClient++? NS-what did he say? ?#@*&%! I’m in the wrong room!

  4. How many like NSClient++? ..pdh collection thread not running… ERROR: Missing argument exception PdhCollectQueryData? failed: : -2147481643: No data to return. Failed to query performance counters:

  5. How many think it’s simple? CheckEventLog file=application file=system MaxWarn=1 MaxCrit=1 "filter=generated gt -2d AND severity NOT IN ('success', 'informational') AND source != 'SideBySide'" truncate=800 unique descriptions "syntax=%severity%: %source%: %message% (%count%)"

  6. Michael Medin: dev, not ops. A long time ago worked in ops. Work with “SOA”, not C/C++, Nagios, …

  7. NSClient++

  8. Linux and Windows agent. Modular by design. Open source, not open core. Highly extensible.
      <0.4.0: Since 2003?
      0.4.1: 2012-10-xx
      0.4.2: 2013-10-xx?
      0.4.3: 2014-02-xx?

  9. 0.4.1 is stable

  10. One-man band, no commercial version, no company, no paid time

  11. Please don’t be angry! Sometimes I am busy

  12. Get your a** over here and play NOW! Please don’t be angry! Sometimes I am busy

  13. One-man band, no commercial version, no company, no paid time, but… sponsoring! donations! support!

  14. Thank you!

  15. What’s New!

  16. * Fixed two include file issues
      * Fixed Wix 3.7 and added Wix to dependencies
      * Added nsclient-full.ini with "all" (non-advanced) available options.
      * Fix for reloading settings from file from script: core:reload('settings') will not work. Notice it will still only reload the settings, not the modules, so modules have to be reloaded manually.
      * Fixed return code issue in nsclient-ini full generator.
      * Added support for delayed reloading
      * Fixed crash when collector thread is not started.
      * Fixed message dialog when loading PythonScript module without Python installed.
      * (Re)added check_filesize, which was accidentally removed.
      * Fix for http settings
      * Fix for --version command line option
      * Reverted default NRPE encoding to "system" (not UTF-8).
      * Added new option to configure NRPE encoding:
          [/settings/NRPE/server]
          encoding = utf8
        Valid values are currently system and utf8 (and strangely enough utf7). If you need something else let me know.
      * Added option scan-range to CheckEventLog. This new option reduces the entries scanned a *lot* and can help solve memory, time and CPU issues. The idea is that if it is negative we start scanning from the end, and once we hit something outside the range we stop scanning. There is a chance that entries reported are "outside" the range, so set the range bigger than the generated/written date/times (to reduce this risk).
          CheckEventLog file=application file=system MaxWarn=1 MaxCrit=1 "filter=generated gt -1h AND level eq 'info'" truncate=800 unique descriptions "syntax=%severity%: %source%: %message% (%count%)"
        Executes in 7 seconds; adding scan-range=-5h executes in 0 seconds (yields the same result).
      * Added error message when overriding a command (i.e. when alias check_cpu overrides the new command check_cpu). Won't work (for technical reasons) for duplicate aliases, i.e. alias x=foo and x=bar
      * Fixed path issues in the installer
      * Fixed shortcuts in the installer
      * Fixed so clients can understand non-prefixed arguments, i.e.
          ... -c nrpe_query -a command=check_ok host=129.168.0.1
      * Added option to disable the new alias check_cpu and only register the old ones (CheckCPU):
          [/settings/default]
          modern commands=false
      * Improved exception handling in server threads
      * Fixed crash in NRPE server when payload was too large (#585 #582)
      * Fixed issue with Lua unit test
      * Added payload length simulation in Lua unit tests (so it returns various payload sizes)
      * Added nscp.sleep to Lua scripts (but don't use it as I will implement coroutines in 0.4.2)
      * Fixed registry settings bug
      * Fixed issue with parsing performance data with leading spaces
      * Fixed issue with rendering filters
      * Created nscpnobp.exe, which is a version without Breakpad for older machines (Windows 2000 and NT4). This can only be found in the zip file (not the msi)
      * Added some missing files to zip
      * Removed counters.defs since it is not used anymore
      * CheckEventLog: Added debug message listing all loaded filters to make it simpler to detect missing ones
      * SimpleCache: Added keywords not-found-msg and not-found-code to configure the outcome of "item not found".
          check_cache index=foobar "not-found-msg=Doch! item was not found" not-found-code=critical
      * CheckProcess is no longer case sensitive
      * CheckServiceState: added support for pending states
      * CheckDriveSize now supports regular expression filtering:
          CheckDriveSize ShowAll MaxWarn=1M MaxCrit=2M CheckAll=volumes matching=.*[CD].*
      * CheckFiles filter is now optional (not specifying a filter will find all files matching)
      * CheckFiles no longer matches . and ..
      * Added perf-unit to allow for stable performance data units (if not specified it will guess, which is the current solution).
          checkmem type=paged MaxWarn=80% perf-unit=M => 'paged bytes %'=34%;80;0 'paged bytes'=8454.04M;19629.84;0;0;24537.3
          checkmem type=paged MaxWarn=80% perf-unit=K => 'paged bytes %'=34%;80;0 'paged bytes'=8655200K;20100963.19;0;0;25126204
          checkmem type=paged MaxWarn=80% perf-unit=B => 'paged bytes %'=34%;80;0 'paged bytes'=8872108032B;20583386316;0;0;25729232896
          checkmem type=paged MaxWarn=80% => 'paged bytes %'=34%;80;0 'paged bytes'=8.25G;19.1;0;0;23.96
      * Fixed threading issue related to servers (i.e. check_nt causing a crash). Don't know what I was thinking when I designed that, pretty stupid bug :(
      * Fixed issue with loading performance counters (check_cpu)
      * Fixed default service name (nscp)
      * CheckWMI: Added support for lists of integers
      * Fixed installer in preparation of 0.4.1
      * Added support for lists in targets/destination (passive checks/channels)
      * Added new --remove-defaults option when generating settings files.
          nscp settings --remove-defaults --generate
      * Added new module SimpleFileWriter for writing passive check results to files.
      * Much improved CheckLogFile with warn/crit/filter concepts (currently no real-time support)
          check_logfile "file=c:\\test.txt" "filter=column1 eq '123'" "warn=column3 like 'foo'" "crit=column1 eq '123'"
        Use "check_logfile help" for more details
      * Fixed performance data parsing for empty sections, so f=1;;;;1 will now work (and not become f=1;0;0;0;1)
      * Added NSCP uptime to check_nscp
      * Improved command line syntax for eventlog
      * Improved command line help texts and added global --version option. Running nscp without options now lists all contexts and their use.
      * settings: bool options are now case insensitive so TrUe will now evaluate to true...
      * Added support for lists (int and string) to wmi checks and commands.
      * Full SSL support for all servers (NRPE, NSCA, check_mk, NSCP); by full I mean certificate based authentication
      * Full SSL support for all clients (NRPE, NSCA, check_mk, NSCP); by full I mean certificate based authentication
      * New module SimpleCache which acts as a bridge between passive and active monitoring (and other use cases). Enables storing of results for later use
      * Added initial check_mk server CheckMKServer
      * Added initial check_mk client CheckMKClient
      * Added sample check_mk lua script
      * New experimental way to build things automagically as well as new cleaned-up build scripts.
      * Added initial retry (default 3) when sending data.
      * Hopefully fixed the "cant load counter" issue by reworking how counters are handled.
      * Major improvements to the CheckSystem command line syntax: Run "nscp sys" to get help. A good way to validate your CheckSystem issues is running the following:
          nscp sys --validate
      * Added so the command line parser will stop at .. and pass along all extra options to the module. So now you can do (notice the double --log, where the first is a log arg and the second is a module arg):
          nscp lua --log debug .. --script foo.lua --log this-is-a-lua-argument
        But perhaps more importantly you can do:
          nscp lua .. --help
        Which previously would always give you "command line help" and not lua help.
      * Added nscp settings --validate to validate a given settings file, listing all invalid (unregistered) keys.
      * Added SamplePluginSimple, which currently is not very rich but at least has some comments to explain some of the things.
      * Added support for using member functions as handlers in Lua scripts
      * Performance enhancements to build time
      * LUAScript: Improved Lua scripting module a lot
      * LUAScript: Added protocol buffer support to Lua scripts
      * Initial (rather crude) NRDP support.
      * Tweaked all servers to use the new internals and added first testcase for NSCP socket
      * Reworked real-time event log support to be a lot more flexible
      * Added support for ipv6 allowed hosts validation

      What’s new 0.4.1
      Sockets: ipv6, ssl (true)
      Modernized: NRPE, NSCA, check_nt
      New protocols: NRDP, check_mk, Graphite, syslog, smtp
      Real-time checks: eventlog, logfiles
      Simplified: Command line syntax
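The scan-range option in the changelog above describes scanning from the end of the log and stopping at the first entry outside the range. A minimal Python sketch of that idea follows; it is not the NSClient++ implementation, and `scan_from_end`, the tuple layout and the sample data are all hypothetical names chosen for illustration. It assumes entries are stored oldest first.

```python
from datetime import datetime, timedelta

def scan_from_end(entries, now, scan_range, keep):
    """Walk entries newest-first and stop at the first one outside scan_range.

    entries: list of (written, message) tuples, oldest first (an assumption).
    scan_range: negative timedelta, e.g. timedelta(hours=-5).
    keep: predicate standing in for the real filter expression.
    """
    cutoff = now + scan_range
    matches, scanned = [], 0
    for written, message in reversed(entries):
        scanned += 1
        if written < cutoff:
            break  # older entries are assumed to lie further back, so stop here
        if keep(written, message):
            matches.append(message)
    return matches, scanned

now = datetime(2013, 10, 1, 12, 0)
# 100 hourly entries, oldest first.
log = [(now - timedelta(hours=h), f"event {h}h ago") for h in range(100, 0, -1)]
# Stand-in for "generated gt -1h".
within_one_hour = lambda written, _message: written >= now - timedelta(hours=1)

matches, scanned = scan_from_end(log, now, timedelta(hours=-5), within_one_hour)
print(matches, scanned)
```

Only a handful of the 100 entries are touched before the loop stops, which is the effect the changelog attributes to scan-range.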

  17. 0.4.1
      • Build 90 (2013-02-xx)
        • nsclient-full.ini
        • Reload from script
        • (re)added check_filesize (i.e. check_nt -v FILESIZE)
        • Encoding support for NRPE
        • New option: scan-range for CheckEventLog
        • Various minor bug fixes
      • Build 96 (2013-04-xx)
        • Reverted external script quoting issues
        • (re)added check_fileage (i.e. check_nt -v FILEAGE)
        • Added support for binding to both ipv6 and ipv4
        • Various minor bug fixes
      • Build 102 (2013-08-xx)
        • PDH improvements
        • Performance data: pass through
        • Encoding support throughout
        • Various minor bug fixes and enhancements

  18. 0.4.2: The goals Modern Windows support Real-time monitoring Simplified monitoring Linux checks

  19. 0.4.2: The STATUS Modern Windows support Real-time monitoring Simplified monitoring Linux checks NSCP protocol Check_xxx clients

  20. 0.4.2: Some Examples Check_os_Version Check_process Check_pagefile Check_service NO MORE PDH Nrpe_client

  21. Filters

  22. filter=” level = ’error’ ”

  23. filter=” source = ’App1’ ”

  24. filter=” source = ’App1 ”

  25. filter=” source = ’App1’ or source = ’App3’ ”

  26. filter=” source = ’App1’ or source = ’App3’ or level = ’error’ ”

  27. filter=” source = ’App1’ or source = ’App3’ or level = ’error’ or level = ’warning’ ”

  28. filter=” (source = ’App1’ or source = ’App3’ or level = ’error’ or level = ’warning’) and source != ’Excel’ ”

  29. filter=” (source = ’App1’ or source = ’App3’ or level = ’error’ or level = ’warning’) and source != ’Excel’ ”
      filter=” (source in (’App1’, ’App3’) or level in (’error’, ’warning’)) and source != ’Excel’ ”
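The two filters on slide 29 read as the same condition written two ways. As a sanity check, here is a small Python sketch that models both as predicates and compares them over every combination of a few sample sources and levels. The event dictionaries and the names `long_form` and `short_form` are illustrative only; this says nothing about how NSClient++ parses filters internally.

```python
from itertools import product

def long_form(e):
    # (source = 'App1' or source = 'App3' or level = 'error' or level = 'warning') and source != 'Excel'
    return (e["source"] == "App1" or e["source"] == "App3"
            or e["level"] == "error" or e["level"] == "warning") and e["source"] != "Excel"

def short_form(e):
    # (source in ('App1', 'App3') or level in ('error', 'warning')) and source != 'Excel'
    return (e["source"] in ("App1", "App3")
            or e["level"] in ("error", "warning")) and e["source"] != "Excel"

sources = ["App1", "App2", "App3", "Excel"]
levels = ["info", "warning", "error"]
events = [{"source": s, "level": l} for s, l in product(sources, levels)]

# Both forms agree on every sample event.
assert all(long_form(e) == short_form(e) for e in events)

matched = [(e["source"], e["level"]) for e in events if short_form(e)]
print(matched)
```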

  30. filter = (id NOT IN ('3', '4', '6', '11', '16', '23', '24', '27', '29', '36', '46', '47', '50', '56', '134', '142', '219', '267', '270', '1006', '1009', '1014', '1030', '1035', '1036', '1055', '1058', '1071', '1073', '1085', '1102', '1110', '1111', '1112', '1131', '1291', '1500', '3095', '5719', '5722', '5783', '5788', '5789', '6008', '7000', '7001', '7003', '7005', '7009', '7011', '7022', '7023', '7024', '7026', '7030', '7031', '7034', '7038', '7041', '9015', '9018', '9026', '9028', '10009', '10010', '10016', '10149', '12294', '15300', '15301', '24679', '36887', '36888', '40960', '40961', '45056') AND level IN ('error', 'warning')) OR (id IN ('3') AND source NOT IN ('FilterManager') AND level IN ('error', 'warning')) OR (id IN ('4') AND source NOT IN ('q57','L2ND') AND level IN ('error', 'warning')) OR (id IN ('6') AND source NOT IN ('Security-Kerberos') AND level IN ('error', 'warning')) OR (id IN ('11') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('16') AND source NOT IN ('WindowsUpdateClient') AND level IN ('error', 'warning')) OR (id IN ('23') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('24') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('27') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('29') AND source NOT IN ('Kerberos-Key-Distribution-Center') AND level IN ('error', 'warning')) OR (id IN ('36') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('46') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('47') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('50') AND source NOT IN ('TermDD','Time-Service') AND level IN ('error', 'warning')) OR (id IN ('56') AND source NOT IN ('TermDD') AND level IN ('error', 'warning')) OR (id IN ('134') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning'))
OR (id IN ('142') AND source NOT IN ('Time-Service') AND level IN ('error', 'warning')) OR (id IN ('219') AND source NOT IN ('Kernel-pnp') AND level IN ('error', 'warning')) OR (id IN ('267') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('270') AND source NOT IN ('Storage-agents') AND level IN ('error', 'warning')) OR (id IN ('1006') AND source NOT IN ('DNS Client Events','GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1009') AND source NOT IN ('picadm') AND level IN ('error', 'warning')) OR (id IN ('1014') AND source NOT IN ('DNS Client Events') AND level IN ('error', 'warning')) OR (id IN ('1030') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1035') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1036') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1055') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1058') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1071') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1073') AND source NOT IN ('USER32') AND level IN ('error', 'warning')) OR (id IN ('1085') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1102') AND source NOT IN ('SNMP') AND level IN ('error', 'warning')) OR (id IN ('1110') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1111') AND source NOT IN ('Server Agents') AND level IN ('error', 'warning')) OR (id IN ('1112') AND source NOT IN ('GroupPolicy') AND level IN ('error', 'warning')) OR (id IN ('1131') AND source NOT IN ('TerminalServices-RemoteConnectionManager') AND level IN ('error', 'warning')) OR (id IN ('1291') AND source NOT IN ('NIC-agents') AND level IN ('error', 'warning')) OR (id IN ('1500') AND 
source NOT IN ('SNMP') AND level IN ('error', 'warning')) OR (id IN ('3095') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5719') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5722') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5783') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5788') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('5789') AND source NOT IN ('Netlogon') AND level IN ('error', 'warning')) OR (id IN ('6008') AND source NOT IN ('Eventlog') AND level IN ('error', 'warning')) OR (id IN ('7000') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7001') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7003') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7005') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7009') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7011') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7022') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7023') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7024') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7026') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7030') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7031') AND source NOT IN ('service control manager') AND strings not like 'citrix' AND level IN ('error', 'warning')) OR (id IN ('7034') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN 
('7038') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('7041') AND source NOT IN ('service control manager') AND level IN ('error', 'warning')) OR (id IN ('9015') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9018') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9026') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('9028') AND source NOT IN ('Metaframe') AND level IN ('error', 'warning')) OR (id IN ('10009') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10010') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10016') AND source NOT IN ('DistributedCOM') AND level IN ('error', 'warning')) OR (id IN ('10149') AND source NOT IN ('WindowsRemoteManagement') AND level IN ('error', 'warning')) OR (id IN ('12294') AND source NOT IN ('Directory-Services-SAM') AND level IN ('error', 'warning')) OR (id IN ('15300') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('15301') AND source NOT IN ('HTTPEVENT') AND level IN ('error', 'warning')) OR (id IN ('24679') AND source NOT IN ('Cissesrv') AND level IN ('error', 'warning')) OR (id IN ('36887') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('36888') AND source NOT IN ('Schannel') AND level IN ('error', 'warning')) OR (id IN ('40960') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning')) OR (id IN ('40961') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning')) OR (id IN ('45056') AND source NOT IN ('LSASRV') AND level IN ('error', 'warning'))
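A filter of this size follows one repeated pattern: a catch-all clause for ids not in the exception list, then one clause per id with the sources to exclude. One way to keep such a filter manageable is to generate it from a table. The Python sketch below does that under stated assumptions: `build_filter` and `quote_list` are hypothetical helpers, the table holds only the first three ids from the slide, and extra per-id conditions (such as the `strings not like 'citrix'` term on id 7031) are not covered.

```python
def quote_list(values):
    """Render values as a quoted IN-list, e.g. ('a', 'b')."""
    return "(" + ", ".join(f"'{v}'" for v in values) + ")"

def build_filter(exceptions, levels=("error", "warning")):
    """exceptions maps an event id to the sources excluded for that id."""
    level_part = f"level IN {quote_list(levels)}"
    # Catch-all clause: any id without an exception entry.
    clauses = [f"(id NOT IN {quote_list(exceptions)} AND {level_part})"]
    # One clause per id that has excluded sources.
    for event_id, sources in exceptions.items():
        clauses.append(
            f"(id IN ('{event_id}') AND source NOT IN {quote_list(sources)} AND {level_part})"
        )
    return " OR ".join(clauses)

# First three rows of the slide's table; the remaining ids would follow the same shape.
exceptions = {
    "3": ["FilterManager"],
    "4": ["q57", "L2ND"],
    "6": ["Security-Kerberos"],
}
filter_text = build_filter(exceptions)
print(filter_text)
```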

  31. Numbers, constants etc

  32. Strings

  33. All good things are three! Warning Ok Filter Critical

  34. filter=” source = ’App1’ “ warn=” level = ’Warning’ “
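Slides 33 and 34 split a check into three expressions: filter, warn and crit. A conceptual Python model of how the three could interact is sketched below. It assumes the filter selects the items of interest and the worst matching state wins, which is an interpretation of the slides rather than the actual NSClient++ evaluation logic; `check` and the sample events are hypothetical.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def check(items, filter_, warn=None, crit=None):
    """Select items with filter_, then let warn/crit decide the overall status."""
    selected = [item for item in items if filter_(item)]
    status = OK
    for item in selected:
        if crit and crit(item):
            status = max(status, CRITICAL)
        elif warn and warn(item):
            status = max(status, WARNING)
    return status, selected

events = [
    {"source": "App1", "level": "Warning"},
    {"source": "App1", "level": "Info"},
    {"source": "App3", "level": "Error"},
]

# Models: filter=" source = 'App1' " warn=" level = 'Warning' "
status, selected = check(
    events,
    filter_=lambda e: e["source"] == "App1",
    warn=lambda e: e["level"] == "Warning",
)
print(status, len(selected))
```

The App3 error never affects the result here because the filter removed it before warn and crit were evaluated.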

  35. Display Custom strings Supports substitutions ${…} top- and detail-syntax

  36. Display detail-syntax=”s: ${source} “ top-syntax=“Hello: ${list}” Hello: s: App1, s: App1, s: App3
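The slide's example output can be reproduced with a few lines of Python: render the detail syntax once per item, join the results, and substitute them into the top syntax as `${list}`. This is a sketch of the substitution idea only; `render`, `render_result` and the ", " separator are assumptions for illustration, not the NSClient++ formatter.

```python
import re

def render(template, values):
    """Replace each ${name} in template with values[name]."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(values[m.group(1)]), template)

def render_result(items, detail_syntax, top_syntax, separator=", "):
    # detail-syntax is applied per item, top-syntax once per result.
    details = separator.join(render(detail_syntax, item) for item in items)
    return render(top_syntax, {"list": details})

items = [{"source": "App1"}, {"source": "App1"}, {"source": "App3"}]
message = render_result(items, "s: ${source}", "Hello: ${list}")
print(message)  # Hello: s: App1, s: App1, s: App3
```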

  37. check_pagefile
        • "filter=name = 'total'"
      check_uptime
        • "warn=uptime < -2d"
        • "crit=uptime < -1d"
      check_process process=explorer.exe
        • "warn=working_set > 70m"
        • "detail-syntax=${exe} ws:${working_set}, handles: ${handles}, usertime:${user}s"

  38. Simple?

  39. Let me guess This all seems Like a lot of typing!

  40. Sensible defaults!

  41. check_cpu Just works!

  42. Real-time monitoring

  43. Active monitoring!

  44. Passive monitoring!

  45. Real-time monitoring!

  46. No CPU overhead Notified instantly Powerful filtering

  47. [/modules]
      CheckLogFile = enabled
      NSCAClient = enabled
      SimpleFileWriter = enabled

      [/settings/logfile/real-time/checks/my_check]
      destination = FILE,NSCA
      file = test.txt
      warning = column1 like 'warn'
      critical = column2 like 'crit'

      [/settings/NSCA/client/targets/default]
      address = 10.11.12.13
      encryption = aes
      password = secreter
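To make the flow in this configuration concrete, here is a rough Python model of a real-time log check: each new line is split into columns, tested against the warning and critical expressions, and forwarded to every destination only when it matches. It rests on several assumptions that the slide does not state: columns are tab separated, `like` is a case-insensitive substring match, and the two destinations are plain callbacks standing in for FILE and NSCA. All function names are hypothetical.

```python
def like(columns, index, needle):
    """True when the 1-based column exists and contains needle (assumed meaning of 'like')."""
    return index <= len(columns) and needle.lower() in columns[index - 1].lower()

def classify(line, separator="\t"):
    columns = line.rstrip("\n").split(separator)
    if like(columns, 2, "crit"):   # critical = column2 like 'crit'
        return "critical"
    if like(columns, 1, "warn"):   # warning = column1 like 'warn'
        return "warning"
    return None                    # no match: nothing is forwarded

def on_new_lines(lines, destinations):
    """Called when lines are appended to the file; pushes matches to each destination."""
    for line in lines:
        status = classify(line)
        if status:
            for send in destinations:
                send(status, line.rstrip("\n"))

sent = []
destinations = [
    lambda status, text: sent.append(("FILE", status, text)),
    lambda status, text: sent.append(("NSCA", status, text)),
]
on_new_lines(
    ["warning\tdisk\n", "info\tall good\n", "info\tcritical failure\n"],
    destinations,
)
print(sent)
```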

  48. But I use NRPE

  49. No CPU overhead Powerful filtering Stored in cache Check latest result Fetched instantly
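The cache approach for NRPE users can be pictured as follows: real-time checks store their latest result under an index, and an active check later fetches whatever is stored. The Python sketch below is a toy model loosely based on the SimpleCache description in the changelog (including its not-found-msg and not-found-code options); the `ResultCache` class, its methods and the default values are hypothetical.

```python
class ResultCache:
    """Keeps only the latest result per index."""

    def __init__(self):
        self._latest = {}

    def store(self, index, status, message):
        # Called by the real-time side whenever a check produces a result.
        self._latest[index] = (status, message)

    def fetch(self, index, not_found_msg="item was not found", not_found_code="unknown"):
        # Called by the active (NRPE) side; returns immediately from memory.
        return self._latest.get(index, (not_found_code, not_found_msg))

cache = ResultCache()
cache.store("my_check", "warning", "column1 matched 'warn'")
cache.store("my_check", "critical", "column2 matched 'crit'")  # newer result replaces the old one

print(cache.fetch("my_check"))
print(cache.fetch("foobar", not_found_msg="Doch! item was not found", not_found_code="critical"))
```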