Performance Management with Free and Bundled Tools

Presentation Transcript

  1. Performance Management with Free and Bundled Tools • Adrian Cockcroft, Netflix Inc., acockcroft@netflix.com • Co-authored with Mario Jauvin, MFJ Associates, mario@mfjassociates.net • 9 June 2014

  2. Agenda • Overview of Capacity Planning Requirements and Data Sources • Performance Data Collection • Free Network Monitoring Tools • Free System Monitoring Tools • Free Load Generation and Modelling Tools • Licences and References Adrian Cockcroft and Mario Jauvin

  3. FREE!! What are we talking about? • QA: load generation with The Grinder or SLAMD, modelling with PDQ and R • Network monitoring with Wireshark, MRTG, BigSister, Cacti, Nagios, OpenNMS, Zenoss, Openxtra, ntop • Application tier monitoring with Orca, Cacti, BigSister, Ganglia, XEtoolkit • Database tier monitoring with SEtoolkit, Orca, XEtoolkit

  4. Capacity Planning Requirements and Data Sources

  5. Definitions • Capacity • Resource utilization and headroom • Planning • Predicting future needs by analyzing historical data and modeling future scenarios • Performance Monitoring • Collecting and reporting on performance data • Free Tools • Bundled with the OS or available for no $$$

  6. Capacity Planning Requirements • We care about CPU, Memory, Network and Disk resources, and Application response times • We need to know how much of each resource we are using now, and will use in the future • We need to know how much headroom we have to handle higher loads • We want to understand how headroom varies, and how it relates to application response times and throughput

  7. CPU Capacity Measurements • CPU capacity is defined by CPU type and clock rate, or by a benchmark rating like SPECint_rate2000 • CPU utilization is defined as busy time divided by elapsed time for each CPU • CPU load average measures the average number of jobs running and ready to run
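
The utilization definition on this slide can be worked through as simple arithmetic. The sketch below assumes each CPU exposes a cumulative busy-time counter in seconds that is sampled twice; the function name and the sample values are hypothetical and chosen only for illustration.

```python
def cpu_utilization(busy_start, busy_end, t_start, t_end):
    """Return the utilization (0.0 to 1.0) of one CPU over a sampling interval.

    busy_start and busy_end are cumulative busy seconds at each sample;
    t_start and t_end are the wall-clock times of the two samples.
    """
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("sampling interval must be positive")
    return (busy_end - busy_start) / elapsed


# Hypothetical samples: cumulative busy seconds per CPU at t=100 s and t=110 s.
samples_start = {"cpu0": 40.0, "cpu1": 12.5}
samples_end = {"cpu0": 47.5, "cpu1": 13.5}

# Utilization is computed per CPU, as the slide defines it.
per_cpu = {
    name: cpu_utilization(samples_start[name], samples_end[name], 100.0, 110.0)
    for name in samples_start
}

# A system-wide figure is just the mean of the per-CPU values.
average = sum(per_cpu.values()) / len(per_cpu)
```

Averaging across CPUs hides imbalance: here one CPU is 75% busy while the other is nearly idle, which is why the per-CPU figures are worth keeping.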

  8. Memory Capacity Measurements • Physical Memory Capacity Utilization and Limits • Kernel memory • Shared Memory segment • Executable code, stack and heap • File system cache usage • Unused free memory • Virtual Memory Capacity - Swap Space • Memory Throughput • Page in and page out rates

  9. Network Capacity Measurements • Network Interface Throughput • Byte and packet rates input and output • TCP Protocol Specific Throughput • TCP connection count and connection rates • TCP byte rates input and output • NFS/SMB Protocol Specific Throughput • Byte rates read and write • NFS/SMB service response times • HTTP Protocol Specific Throughput • HTTP operation rates • Get and put payload byte rates and size distribution
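
Byte and packet rates are usually derived from cumulative counters rather than read directly. The sketch below assumes the interface exposes a fixed-width counter that wraps to zero (32 bits by default, as with many SNMP interface counters) and that it wraps at most once between samples; `counter_rate` and the sample values are hypothetical.

```python
def counter_rate(prev, curr, interval_s, counter_bits=32):
    """Convert two samples of a wrapping cumulative counter into a per-second rate.

    Assumes the counter wrapped at most once between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction gives the right delta whether or not a wrap occurred.
    delta = (curr - prev) % modulus
    return delta / interval_s


# Hypothetical input byte counter sampled 10 seconds apart, wrapping in between.
bytes_per_s = counter_rate(prev=4294967000, curr=704, interval_s=10)
bits_per_s = bytes_per_s * 8
```

On fast links a 32-bit byte counter can wrap more than once per polling interval, in which case the computed rate is too low; shorter intervals or wider counters avoid that.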

  10. Disk Capacity Measurements • Detailed metrics vary by platform • Easy for the simple disk cases • Hard for cached RAID subsystems • Almost impossible for shared disk subsystems and SANs • Another system or volume can share a backend spindle; when it gets busy, your own volume can saturate even though you did not change your own workload

  11. Capacity Planning Challenges • Constantly changing infrastructure • Limited attention span from staff • Horizontally scaled commodity systems • Per node software licencing costs too much • Too many tools, too many agents per node • Too much data, not enough analysis • Non-linear and non-intuitive scalability • Lack of tools and metrics for virtualized resources

  12. Observability • Four different viewpoints • Management • Engineering • QA Testing • Operations • Each needs very different information • Ideal would be different views of the same performance database • Reality is a mess of disjoint tools

  13. Management Viewpoint • Daily summary of status and problems • Business oriented metrics • Future scenario planning • Marketing and management input • Concise report with dashboard style status indicators • Free tools: R, Spreadsheet and Web based displays, no good summarization tools

  14. Engineering Viewpoint • Large volumes of detailed data at several different time scales • Input to tuning, reconfiguring and future product development • Low level problem diagnosis • Detailed reports with drill down and correlation analysis • Free tools: XE/SE Toolkit, Orca, Ganglia, Cacti, R

  15. QA Test Viewpoint • Workload specification tools • Load generation frameworks • Testing for functionality and performance • Regression tools to compare releases • Modelling difference between test configuration and production configuration • Free Tools: The Grinder, SLAMD, R, PDQ

  16. Operations Viewpoint • Immediate timeframe • Real time display, updated in seconds • Alert based monitoring • High level problem diagnosis • Simple high level graphs and views • Free tools: BigSister, Nagios, OpenNMS, MRTG, Cacti, Ganglia, Wireshark, ntop

  17. Measurement Data Interfaces • Several generic raw access methods • Read the kernel directly (not a good idea) • Structured system data (Solaris kstat, Linux /proc) • Process data • Network data • Accounting data • Application data • Command based data interfaces • Scrape data from vmstat, iostat, netstat, sar, ps • Higher overhead, lower resolution, missing metrics • Data available is platform specific either way • Much more detail on this topic in the Solaris/Linux Performance Measurement and Tuning Class
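
As an illustration of the structured-data route, the sketch below parses per-CPU lines in the layout Linux uses in /proc/stat (cumulative ticks for user, nice, system, idle, iowait, irq, softirq, steal) and turns two snapshots into a busy fraction. The helper names and the sample lines are hypothetical; the sample text is embedded so the code runs on any platform, and treating iowait as idle time is a choice made here, not something the slides specify.

```python
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Parse one 'cpuN ...' line in /proc/stat layout into (name, {field: ticks}).

    Trailing guest fields, if present, are ignored because the kernel
    already includes guest time in the user and nice columns.
    """
    fields = line.split()
    values = [int(v) for v in fields[1:]]
    return fields[0], dict(zip(CPU_FIELDS, values))


def busy_fraction(before, after):
    """Return the fraction of ticks spent non-idle between two snapshots."""
    delta = {key: after[key] - before[key] for key in before}
    total = sum(delta.values())
    if total == 0:
        return 0.0
    idle = delta["idle"] + delta.get("iowait", 0)
    return (total - idle) / total


# Hypothetical snapshots; on Linux these lines would be read from /proc/stat.
_, first = parse_cpu_line("cpu0 1000 0 500 8000 100 0 0 0")
_, second = parse_cpu_line("cpu0 1300 0 600 8500 200 0 0 0")
utilization = busy_fraction(first, second)
```

Reading the counters directly like this avoids the overhead and lost resolution of scraping command output, at the cost of being tied to one platform's layout.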

  18. Free Network Monitoring Tools

  19. SNMP • Simple Network Management Protocol • UDP-based protocol using port 161 • Client/server-like model • The client is called a management application entity • The server is called an agent entity • The agent entity is designed to be implemented on network hardware: routers, switches, etc.

  20. SNMP – MIBs • Management Information Base • Defines the structure and semantics of the information that can be reported • The most commonly used is MIB-II, which defines a set of standard networking attributes • Interface tables • System level information • Routing tables • Specified using ASN.1 (Abstract Syntax Notation One)

  21. SNMP – commands • Called PDUs (protocol data units) • GET • GETNEXT • GETBULK • SET • Encoded using BER (Basic Encoding Rules)
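
To make the BER step concrete, the sketch below encodes two of the value types that appear inside SNMP PDUs, an INTEGER and an OBJECT IDENTIFIER, as tag-length-value bytes. It is a minimal illustration rather than a full SNMP encoder: the function names are hypothetical, and sequences, PDU tags and the other SNMP types are left out.

```python
def ber_length(n):
    """Encode a content length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_integer(value):
    """Encode an INTEGER (tag 0x02) as minimal two's-complement content."""
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    body = value.to_bytes(bits // 8 + 1, "big", signed=True)
    return b"\x02" + ber_length(len(body)) + body


def ber_oid(dotted):
    """Encode an OBJECT IDENTIFIER (tag 0x06) given in dotted notation."""
    arcs = [int(part) for part in dotted.split(".")]
    if len(arcs) < 2:
        raise ValueError("an OID needs at least two arcs")
    # The first two arcs share one sub-identifier: 40 * first + second.
    sub_ids = [40 * arcs[0] + arcs[1]] + arcs[2:]
    body = bytearray()
    for sub in sub_ids:
        # Base-128 digits, most significant first, high bit set on all but the last.
        chunk = [sub & 0x7F]
        sub >>= 7
        while sub:
            chunk.append(0x80 | (sub & 0x7F))
            sub >>= 7
        body.extend(reversed(chunk))
    return b"\x06" + ber_length(len(body)) + bytes(body)


# sysDescr.0 from MIB-II, the object a GET for the system description names.
encoded_oid = ber_oid("1.3.6.1.2.1.1.1.0")
encoded_int = ber_integer(300)
```

Calling `encoded_oid.hex()` shows the tag byte 06, the length byte 08, and then the content, with the leading arcs 1.3 packed into the single byte 2b.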

  22. Versions • Version 1, the original version, was published as RFC 1157 in May 1990 • Version 2 followed around 1993; it failed because the IETF credo of “rough consensus and running code” could not be met on securing SNMP • It turned into v2c, which uses community string security (like v1) • Version 3 added security and complexity in 1998

  23. SNMP tools • Too numerous to name them all, but… • OpenNMS • Nagios • Cacti • MRTG • Net-SNMP • See www.snmplink.org

  24. SNMP tools • snmpwalk – reports all data in a specified MIB • getIf – reports data about interfaces and includes a built-in MIB browser • snmptable – reports tabular data from MIB tables

  25. OpenNMS • Well… it’s not that portable • 95% Java is not 100% Java • Requires about 20-30 different platform-specific packages (PostgreSQL, Perl, RRDtool, Tomcat 4, etc.) • Difficult to install • Easy auto discovery • Web-based interface

  26. OpenNMS • Main screen shot

  27. OpenNMS • Node screen shot

  28. Nagios • Easy to build/compile (on Solaris 10) • Easy to install • Quick response from CGI • Configuration is manual and a pain • 13 configuration files with all kinds of interrelated entries • Tedious and error prone • Requires plugins to do anything

  29. Nagios • Main screen shot

  30. Nagios • Host detail screen shot

  32. ntop • Similar to the familiar UNIX top tool for processes, but for network traffic • Provides a huge selection of real-time data • Can be found at http://www.openxtra.co.uk/

  33. ntop – Active Sessions

  34. ntop – Hosts

  35. ntop – Network Load

  36. ntop – Network Throughput

  37. ntop – Port Distribution

  38. ntop – Protocol Distribution

  39. ntop – Protocols

  40. Zenoss • Open source monitoring and management of IT infrastructure • Zenoss Core is free • Other editions are for a fee • Get it from http://www.zenoss.com/download/

  41. Zenoss – Architecture

  42. Zenoss – Dashboard Configuration

  43. Zenoss – Google

  44. Zenoss – Google Alerts

  45. Zenoss – Graphs

  46. Zenoss – Topology

  47. MRTG • Really simple to install and configure • Requires manual config file creation • Only plots MIB-II interface data out of the box • Graphing is not flexible (axes, time ranges, etc.)

  48. MRTG • Interface screen shot

  49. MRTG • Other CPU screen shot

  50. RRDtool • Software to store, retrieve and graph numerical time series data • Uses a round robin algorithm • Data files are a fixed size • Don’t grow • Don’t require maintenance
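
The round robin idea can be sketched as a fixed-size ring buffer in which each new sample overwrites the oldest one, so storage never grows. This is an in-memory illustration only, not RRDtool's file format or API; the class name and methods are hypothetical, and RRDtool's consolidation of older samples into coarser averages is left out.

```python
class RoundRobinArchive:
    """Fixed-size store that keeps only the most recent `slots` samples."""

    def __init__(self, slots):
        if slots <= 0:
            raise ValueError("slots must be positive")
        self._values = [None] * slots  # allocated once, never resized
        self._next = 0                 # index the next sample is written to
        self._count = 0                # number of valid samples stored

    def update(self, value):
        """Store a sample, overwriting the oldest one once the archive is full."""
        self._values[self._next] = value
        self._next = (self._next + 1) % len(self._values)
        self._count = min(self._count + 1, len(self._values))

    def fetch(self):
        """Return the stored samples ordered from oldest to newest."""
        size = len(self._values)
        start = (self._next - self._count) % size
        return [self._values[(start + i) % size] for i in range(self._count)]


# Five samples written into a three-slot archive: the first two are overwritten.
archive = RoundRobinArchive(slots=3)
for sample in [1, 2, 3, 4, 5]:
    archive.update(sample)
latest = archive.fetch()
```

Because the slot count is fixed at creation, the retention period is decided up front: the number of slots times the sampling interval is how far back the archive can see.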