
Monitoring and Diagnostic of middleware services

Felix Ehm

Presentation Transcript


  1. Felix Ehm Monitoring and Diagnostic of middleware services

  2. Introduction
     Problem: how can I detect a failure in my systems?
     What is the reason? Host, network?
     • Add machine monitoring
     Is my program running correctly?
     • ?

  3. Introduction
     Gina Gorgogianni, CMX Feedback mechanism
     Problem: how can I detect a failure in my systems?
     Gain control by exposing process-internal information to enable constant monitoring for pre-failure recognition.
     • JMX for Java processes
     • CMWAdmin for CMW servers
     • CMX for C/C++ general services
     • Tracing/Central Logging System

  4. The Java Management Extensions (JMX)
     • Java standard to expose process-internal information
     • Inspect data (remotely) via JConsole/jvisualvm
     • Many monitoring systems support this
     (Figure: example for a JMS broker)

  5. The CMWAdmin GUI
     • Java GUI to inspect CMW-enabled processes
     • Browse and watch information from one server
     • Uses CMW middleware to access data
     (Figure labels: CMW servers list from the Directory Server; CMWAdmin)

  6. The CMX Library
     A general solution to allow exposure of internal metrics for C/C++ programs.
     • The idea originates from JMX: why can't we have something like this for C/C++?
     • Requirements
       • Small memory footprint
       • Non-blocking calls
       • Metrics: floats & strings
     • Project started in 2012

  7. Architecture
     • High level
       • 2 lightweight APIs with non-blocking operations to
         • Update: registers, exposes & updates metrics
         • Read: retrieves information for metrics / process
       • No dependencies
     • Low level (shared memory)
       • Main segment: table containing the registered processes
       • Process segments: structures containing information on metrics
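The layout described on this slide can be illustrated with a small sketch. It is not the CMX API: the class name `ProcessSegment`, the slot sizes and the method names are all hypothetical, and a plain `bytearray` stands in for the shared-memory segment. It only shows the idea of fixed-size metric slots that an "update" call overwrites in place and a "read" call decodes, limited to floats and strings as on the slides.

```python
import struct

NAME_LEN = 32    # assumed maximum metric-name length in bytes
VALUE_LEN = 64   # assumed size of the value field in bytes

# One slot: metric name, one-byte type tag (b"f" float, b"s" string), value field.
SLOT = struct.Struct(f"<{NAME_LEN}sc{VALUE_LEN}s")


class ProcessSegment:
    """Hypothetical stand-in for one process segment holding metric slots."""

    def __init__(self, max_metrics=16):
        # A real implementation would place this buffer in shared memory.
        self._buf = bytearray(SLOT.size * max_metrics)
        self._index = {}  # metric name -> slot number
        self._max = max_metrics

    def register(self, name):
        """Reserve a slot for a metric name (idempotent) and return its number."""
        if name not in self._index:
            if len(self._index) >= self._max:
                raise RuntimeError("segment full")
            self._index[name] = len(self._index)
        return self._index[name]

    def update(self, name, value):
        """Overwrite the metric's slot in place; only float and str are accepted."""
        slot = self.register(name)
        if isinstance(value, float):
            tag, raw = b"f", struct.pack("<d", value)
        elif isinstance(value, str):
            tag, raw = b"s", value.encode("utf-8")[:VALUE_LEN]
        else:
            raise TypeError("only float and str metrics are supported")
        # struct pads short byte strings with NUL bytes up to the field size.
        SLOT.pack_into(self._buf, slot * SLOT.size, name.encode("utf-8"), tag, raw)

    def read(self, name):
        """Decode and return the current value of a registered metric."""
        _, tag, raw = SLOT.unpack_from(self._buf, self._index[name] * SLOT.size)
        if tag == b"f":
            return struct.unpack("<d", raw[:8])[0]
        return raw.rstrip(b"\0").decode("utf-8", errors="replace")
```

Fixed-size slots keep every update a bounded, allocation-free write, which is one plausible way to meet the small-footprint and non-blocking requirements; the actual CMX design may differ.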

  8. CMX Library Characteristics
     • Very small footprint: 140 KB of memory
     • Easy non-blocking API: 10 core functions in total
     • Supports float and string data types
     • Incorporated input from real-time experts
     • CMX Library is ready for preproduction
     • No dependencies on external libraries
     • Future: deployment for all CMW servers
     • But also applicable to other C/C++ projects

  9. Constant Monitoring
     • Host health
     • Process up/down
     • Process service endpoint OK?
       • E.g. HTTP server: is wget successful?
     • Process does what it is supposed to do
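The "is wget successful?" check for an HTTP endpoint could look roughly like the sketch below. The function name `endpoint_ok` and the two-second timeout are assumptions made for illustration; the slides do not say how the check is actually implemented.

```python
import urllib.error
import urllib.request

# Opener without proxy handling, so the probe talks to the host directly.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def endpoint_ok(url, timeout=2.0):
    """Return True if an HTTP GET on url answers with a 2xx status in time."""
    try:
        with _DIRECT.open(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and HTTP error statuses.
        return False
```

A check like this covers the "service endpoint OK?" level only; whether the process does what it is supposed to do still needs metrics from inside the process.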

  10. Constant Monitoring
      • DIAMON as CO in-house solution
      • Reads metrics and applies rules
      • Easy to extend through a pluggable architecture
      • Provides history of metrics
      • Provides replay functionality
      • In case of problem detection
        • Displays it to operators
        • Sends notification via SMS/mail
      (Diagram labels: Controls config, DIAMON, DAQs, JMX, CMW, CMX)
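"Reads metrics and applies rules" can be pictured with a minimal sketch. This is not DIAMON code: the `Rule` class, the `apply_rules` function and the example thresholds are hypothetical and only show the pattern of evaluating a set of rules against the latest metric values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    metric: str                          # name of the metric the rule looks at
    is_violated: Callable[[float], bool] # predicate deciding whether it is a problem
    message: str                         # text shown to the operator


def apply_rules(metrics: Dict[str, float], rules: List[Rule]) -> List[str]:
    """Return one problem line per violated rule or per metric with no data."""
    problems = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is None:
            # A metric that stopped arriving is itself worth reporting.
            problems.append(f"{rule.metric}: no data")
        elif rule.is_violated(value):
            problems.append(f"{rule.metric}={value}: {rule.message}")
    return problems
```

The returned list is what a console would display or what a notifier would send by SMS or mail; keeping rules as data is one way to get the pluggability the slide mentions.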

  11. The DIAMON Synoptic Viewer

  12. The DIAMON Console

  13. DIAMON
      • View history data on metrics

  14. The Central Tracing/Logging System
      I need more information than just numbers to diagnose a problem!
      • Log events are helpful
        • Find the point where the program crashes/fails
        • Access to (past) events is required
      • Problems
        • Frontends are diskless
        • Multi-layer systems imply watching many sources at the same time
        • You quickly drown in the amount of information
      • CMW project was initiated in June 2011
        • Target: collect log events from CMW servers for better diagnostics (n.b. log events = info, debug, error, warning, etc.)
        • Replace previous system

  15. The Central Tracing/Logging System
      • Collecting and unifying tracing messages in one central place
      • Finding/Debugging a problem becomes cumbersome!
      • Easy correlation of events among many services
      (Diagram labels: Tracing Server, DB, Equipment Specialist/Developer)
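Correlating events from many services becomes simple once they sit in one place and carry comparable timestamps. The sketch below is an illustration only: the `(timestamp, service, message)` tuple layout and the function name `merge_events` are assumptions, not the format used by the tracing server.

```python
import heapq
from typing import Iterable, Iterator, Tuple

Event = Tuple[float, str, str]  # (timestamp, service, message), assumed layout


def merge_events(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge event streams that are each sorted by time into one timeline."""
    return heapq.merge(*streams, key=lambda event: event[0])
```

Given one time-ordered stream per service, iterating over the result yields a single interleaved timeline, which is the view a developer needs when a failure spans several layers.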

  16. The Tracing/Log GUI
      (Screenshot labels: Record to File, Filter, Avail. Log Instances, Message Panel, Incoming log events; diagram labels: Tracing Server, DB)

  17. The CMW Tracing Package
      • C++ client library
        • Very lightweight
        • Supports TCP + UDP
        • File + syslog + STOMP appender
        • Integrated with CMW components
        • Log level can be changed during runtime
      • Java client library
        • Based on log4j
        • Very easy to integrate with existing Java services
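Two of the client features listed here, sending log events over UDP and changing the log level at runtime, can be sketched with Python's standard `logging` module. This is a stand-in and not the CMW tracing client: `UdpTextHandler`, `make_logger` and the plain-text wire format are hypothetical.

```python
import logging
import socket


class UdpTextHandler(logging.Handler):
    """Send each formatted log record as one UTF-8 UDP datagram."""

    def __init__(self, host, port):
        super().__init__()
        self._addr = (host, port)
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def emit(self, record):
        try:
            self._sock.sendto(self.format(record).encode("utf-8"), self._addr)
        except OSError:
            # Logging must never take the application down.
            self.handleError(record)

    def close(self):
        self._sock.close()
        super().close()


def make_logger(name, host, port):
    """Create a logger that ships WARNING and above to host:port by default."""
    logger = logging.getLogger(name)
    logger.propagate = False  # keep events out of any root handlers
    handler = UdpTextHandler(host, port)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.WARNING)
    return logger
```

Calling `logger.setLevel(logging.DEBUG)` on the returned logger later switches on debug events without a restart. UDP is fire-and-forget, so a slow or absent server never blocks the sending process, at the cost of possible message loss.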

  18. The CMW Tracing Package
      • The server
        • Modules
          • Converters to accept messages
          • Broker to distribute data
          • FileWriter and Database Writer
          • Registry keeping discovered sources
        • Can be deployed as an all-in-one process or as separate processes
        • Scales horizontally and vertically
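The module split on this slide (converter, broker, writers, registry) can be pictured as a small in-process pipeline. Everything below is hypothetical: the JSON wire format, the `source` field and the class and function names are assumptions chosen for the sketch, not the server's real interfaces.

```python
import json
from typing import Any, Callable, Dict, List, Set

LogEvent = Dict[str, Any]


def convert(raw: bytes) -> LogEvent:
    """Converter: decode one raw message (assumed to be JSON) into an event."""
    return json.loads(raw.decode("utf-8"))


class Broker:
    """Distribute every accepted event to all subscribed receivers."""

    def __init__(self) -> None:
        self._receivers: List[Callable[[LogEvent], None]] = []
        self.registry: Set[str] = set()  # sources seen so far

    def subscribe(self, receiver: Callable[[LogEvent], None]) -> None:
        self._receivers.append(receiver)

    def publish(self, raw: bytes) -> None:
        event = convert(raw)
        self.registry.add(event["source"])  # registry of discovered sources
        for receiver in self._receivers:
            receiver(event)
```

A file writer and a database writer would each be one subscriber. Because the modules only meet at `publish` and `subscribe`, they could run in one process or be separated behind a network transport, which matches the deployment options on the slide.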

  19. The CMW Tracing Package
      • C/C++ & Java libraries for log events
      • C/C++ library for config messages
      • Server
        • Accepts events coming via UDP or TCP
        • Stores events in database and files
        • Sends events to multiple receivers
      • User interface(s)
        • “online”: Java GUI, Linux Console (web console)
        • “offline”: Database viewer based UDP based

  20. The CMW Tracing Service
      • Nearly all CMW services send log events
        • Proxies, RDA servers, JMS, …
      • Great help for identifying problems
      • Easy to extend to other protocols
      • Performance
        • ~100M messages/day
        • 6% stored in the DB
        • 100% additionally stored in files
      • System does very well: low network and CPU load
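To put the performance figures in perspective, the daily totals can be converted into average rates. The short calculation below uses only the two numbers from the slide and assumes the load is spread evenly over the day, which real traffic will not be.

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 100_000_000  # "~100M messages/day" from the slide
db_percent = 6                  # share of messages stored in the database

avg_rate = messages_per_day / SECONDS_PER_DAY          # messages per second
db_per_day = messages_per_day * db_percent // 100      # messages stored in DB per day
db_rate = db_per_day / SECONDS_PER_DAY                 # DB inserts per second

print(f"average input rate: {avg_rate:.0f} msg/s")
print(f"stored in DB: {db_per_day} msg/day, {db_rate:.0f} msg/s")
```

That is roughly 1,157 incoming messages per second on average, of which about 69 per second reach the database; peak rates during incidents are likely higher than these averages.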

  21. The CMW Tracing Service
      • Also collects information other than log events
        • What is done, where, when and by whom?
        • Software upgrades / installations
        • Process restart events
        • Configuration changes

  22. Summary
      Gain control by exposing process-internal information to enable constant monitoring for pre-failure recognition
      • JMX for Java processes
      • CMWAdmin for CMW servers
      • CMX for C/C++ general services
      • Tracing/Central Logging System
      • But: also try to monitor the system as the user sees it
        • JMS: send a test message and measure the speed; > 100 ms = WARNING
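The user-level JMS check from the summary (send a test message, measure the time, warn above 100 ms) follows a simple pattern. In the sketch below the function names are hypothetical, and `send_and_wait` is a placeholder for whatever call sends the test message and blocks until it comes back; no real JMS client is involved.

```python
import time
from typing import Callable, Tuple

WARNING_THRESHOLD_MS = 100.0  # "> 100 ms = WARNING" from the slide


def classify_latency(elapsed_ms: float, threshold_ms: float = WARNING_THRESHOLD_MS) -> str:
    """Map a measured round-trip time to a status string."""
    return "WARNING" if elapsed_ms > threshold_ms else "OK"


def probe(send_and_wait: Callable[[], None]) -> Tuple[float, str]:
    """Time one round trip of the given callable and classify the result."""
    start = time.perf_counter()
    send_and_wait()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, classify_latency(elapsed_ms)
```

The measured value can be published as one more metric, so the same rule engine that watches internal metrics also watches the service as a user experiences it.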
