High Performance Logging System for Embedded UNIX and GNU/Linux Applications

IEEE RTCSA 2013 (8/21/13). Jaein Jeong, Cisco Systems.


Presentation Transcript


  1. High Performance Logging System for Embedded UNIX and GNU/Linux Applications • IEEE RTCSA 2013 (8/21/13) • Jaein Jeong, Cisco Systems

  2. Introduction - Embedded UNIX in many places [Figure: Traditional UNIX Logging System. Labels: app processes, log, kernel buffer, syslogd (user space), syslog on the file system.]

  3. Problem Statement - Apps slow down with a large amount of logging • Long latency to logging daemon • Inefficiency of unbuffered writes to flash FS • Long latency even with output buffering [Figure: animated build of the logging path. Labels: app processes, log, kernel buffer, syslogd (user space), named pipe, FlashLogger, syslog on the flash file system.]

  4. Our Approach • Faster Message Transfer • Compatibility with Existing Logging Apps • Destination-Aware Message Formatting

  5. Organization • Related Work for UNIX Logging Systems • Background • Cisco UCS and Virtual Interface Card (VIC) • Evolution of VIC Logging System • Design Requirements and Implementation • Evaluation and Optimization • Conclusion

  6. Related Work - Logging Methods for UNIX Apps • Syslog: introduced in the early '80s; still the most notable one • Syslog-ng: an extension based on nsyslogd; reliable transport, encryption, and a richer set of information and filtering • Rsyslog: an extension used in the latest distros; multi-threading • Not designed for embedded/flash logging: slow msg passing (msg copying over kernel); unbuffered message writes

  7. Background - Cisco UCS and Virtual Interface Card • Cisco UCS: datacenter server system • Cisco UCS Virtual Interface Card (VIC): 128 programmable virtual interfaces (Ethernet NICs, Fibre Channel HBAs); 10GBASE-KR unified network fabric, 1 to each fabric extender • VIC ASIC: Mgmt CPU, FCPU 0, FCPU 1 • Mgmt CPU: MIPS proc core (500MHz, MIPS 24Kc) • Embedded Linux (Linux kernel 2.6.23-rc5)

  8. Background - Evolution of VIC Logging System • Logd, a simple logging daemon: logging from multiple processes; different severity levels; formatting and flash writing • Unbuffered syslogd: forwards serious msgs to switches; functional, but with worse write performance • Buffered syslogd: improves flash write performance of unbuffered syslogd; still suffers long latency

  9. Organization • Related Work for UNIX Logging Systems • Background • Cisco UCS and Virtual Interface Card (VIC) • Evolution of VIC Logging System • Design Requirements & Implementation • Evaluation & Optimization • Conclusion

  10. Design Requirements - Faster Message Transfer • Avoid kernel-to-user space msg copying [Figure: syslogd logging vs. mqlogd logging]

  11. Design Requirements - Faster Message Transfer • Reduce message copying from 4 to 2 • Syslogd logging: (1) app local copy, (2) write to kernel buffer, (3) syslogd local copy, (4) write to named pipe • Mqlogd logging: (1') write directly to shared memory, (2') write from shared memory to named pipe

  12. Design Requirements - Compatibility with Existing Logging Apps • Through the logging API (log_info(), log_error(), …): replace syslog() with shared memory lib calls • Direct syslog() calls: server receives msgs through UDP Unix socket [Figure: logging clients (app1, app2, mcp, fls, klogd, xinetd, …) reach the logging server (syslogd or mqlogd) either through the logging API and the shared memory logging library, or through the syslog() library call and a UDP Unix socket.]

  13. Design Requirements - Destination-Aware Message Formatting • Syslogd: working but limited; redundant; coarse time granularity (in seconds) • Mqlogd: destination-aware formatting with space saving; uses system-supported timing (in microseconds)
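As an illustration of the microsecond-granularity timing mentioned on this slide, here is a minimal sketch in C. The slides do not show mqlogd's actual message format, so the compact `<seconds>.<microseconds> <message>` layout and the helper names `format_log_line` and `format_log_line_now` are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical helper (not from the slides): writes
 * "<seconds>.<microseconds> <msg>" into buf.
 * Returns what snprintf returns: the length the full line needs. */
int format_log_line(char *buf, size_t len,
                    const struct timeval *tv, const char *msg)
{
    assert(buf != NULL && tv != NULL && msg != NULL);
    return snprintf(buf, len, "%ld.%06ld %s",
                    (long)tv->tv_sec, (long)tv->tv_usec, msg);
}

/* Hypothetical wrapper: stamps the message with the current time,
 * taken from gettimeofday(), which reports microseconds. */
int format_log_line_now(char *buf, size_t len, const char *msg)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return format_log_line(buf, len, &tv, msg);
}
```

A numeric timestamp like this is shorter than the textual date that syslogd-style lines carry, which is one plausible way to read "space saving"; the slides do not say which fields mqlogd actually drops or keeps.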

  14. Implementation - Shared Memory and Circular Queue • Notification mechanism: write-and-select; signal • Locking mechanism: semaphore lock; pthread lock [Figure: logging clients enqueue logging events into a circular queue in shared memory, and the logging server dequeues them. Labels: circular queue header, notification, notification disable flag. Queue memory layout: header entry followed by non-header entries.]
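The queue described on this slide can be sketched as a fixed-size ring with a header and a notification disable flag. This is a simplified, single-process illustration, not the authors' implementation: the struct layout, the names (`log_queue`, `queue_enqueue`, `queue_dequeue`), and the sizes are all assumptions. In a real system the struct would sit in a shared memory segment and every call would be made while holding the semaphore or pthread lock.

```c
#include <assert.h>
#include <string.h>

#define QUEUE_SLOTS 4   /* illustrative only; far smaller than a real queue */
#define MSG_MAX     64  /* illustrative fixed entry size */

/* Assumed layout: header fields first, then the non-header entries. */
struct log_queue {
    unsigned head;         /* next slot to dequeue */
    unsigned tail;         /* next slot to enqueue */
    unsigned count;        /* entries currently queued */
    unsigned dropped;      /* requests dropped because the queue was full */
    int notify_disabled;   /* set by the server while it is draining */
    char entries[QUEUE_SLOTS][MSG_MAX];
};

void queue_init(struct log_queue *q)
{
    memset(q, 0, sizeof *q);
}

/* Client side. Returns 1 if the message was queued and the server should
 * be notified, 0 if it was queued with notification suppressed, and -1 if
 * the queue was full and the request was dropped. */
int queue_enqueue(struct log_queue *q, const char *msg)
{
    assert(q != NULL && msg != NULL);
    if (q->count == QUEUE_SLOTS) {
        q->dropped++;
        return -1;
    }
    strncpy(q->entries[q->tail], msg, MSG_MAX - 1);
    q->entries[q->tail][MSG_MAX - 1] = '\0';
    q->tail = (q->tail + 1) % QUEUE_SLOTS;
    q->count++;
    return q->notify_disabled ? 0 : 1;
}

/* Server side. Copies the oldest entry into out (at least MSG_MAX bytes)
 * and returns 1, or returns 0 if the queue is empty. */
int queue_dequeue(struct log_queue *q, char *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    memcpy(out, q->entries[q->head], MSG_MAX);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return 1;
}
```

In this sketch the server would set `notify_disabled` before it starts draining and clear it once the queue is empty, so clients skip the write-and-select or signal step while the server is already awake. Whether the production code uses the flag in exactly this way is not stated in the slides.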

  15. Organization • Related Work for UNIX Logging Systems • Background • Cisco UCS and Virtual Interface Card (VIC) • Evolution of VIC Logging System • Design Requirements & Implementation • Evaluation & Optimization • Conclusion

  16. Evaluation • Metrics • Request Latency • Request Drop Rate • Parameters • Number of clients • Number of iterations (Depth of queue size) • Locking mechanism • Notification mechanism

  17. Performance Results- Performance compared to syslogd • Avg Latency: >10x speed-up • Min Latency: >20x speed-up • Max Latency: >2x speed-up

  18. Performance Results- Effect of Queue Size • No drops within queue size (e.g. 10000) • Queue size should be larger than max expected burst size

  19. Performance Results - Effect of Multiple Clients • Avg request latency increases proportionally • With 2 clients, requests start to drop at a smaller number of iterations

  20. Performance Results - Effect of Notification Mechanisms • Makes little difference

  21. Performance Results - Effect of Lock Mechanisms • Pthread mutex is 40% faster than semaphore • Semaphore is used for our production code due to a limitation of the pthread mutex lock (Linux kernel 2.6.23-rc5)

  22. Performance Results - Effect of Client Interface Type • Logging using the UNIX socket interface (backward compatibility) is no faster • About the same level as syslogd • For compatibility, not for general use

  23. Optimization - Effects of deferred notification • Sends one notification for a batch of msgs • Measured time for host-to-adapter commands (capability & macaddr) with and without logging • 2x speed-up in latency
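The idea of sending one notification per batch can be sketched as a small counter on the client side. The slides do not say how a batch is delimited, so this example assumes a fixed message count plus an explicit flush at the end of a burst; the `notifier` struct and its functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for deferred notification. */
struct notifier {
    unsigned batch;    /* messages per notification (assumed fixed) */
    unsigned pending;  /* messages enqueued since the last notification */
    unsigned sent;     /* notifications sent so far */
};

void notifier_init(struct notifier *n, unsigned batch)
{
    assert(n != NULL && batch > 0);
    n->batch = batch;
    n->pending = 0;
    n->sent = 0;
}

/* Call after each enqueue. Returns 1 when a notification is due,
 * 0 when it is deferred until the batch fills up. */
int notifier_on_enqueue(struct notifier *n)
{
    if (++n->pending < n->batch)
        return 0;
    n->pending = 0;
    n->sent++;
    return 1;
}

/* Call at the end of a burst so a partial batch is still announced.
 * Returns 1 if a notification was needed, 0 otherwise. */
int notifier_flush(struct notifier *n)
{
    if (n->pending == 0)
        return 0;
    n->pending = 0;
    n->sent++;
    return 1;
}
```

With a batch size of 4, a burst of 10 messages costs 3 notifications (two full batches and one flush) instead of 10, which shows where the saving comes from. The 2x figure on the slide is a measurement from the authors' system and is not reproduced by this sketch.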

  24. Future Works • Reduce kernel msg copying even further • Improve performance with faster lock • Avoid loss of serious messages [Figure: proposed logging path. Labels: app processes, log, enqueue, mqlogd, dequeue, memory mapped file, named pipe, FlashLogger, file system, user space and kernel.]

  25. Conclusion • Logging system for embedded UNIX apps • Up to 100x speed-up in latency, 10x throughput • Backward Compatibility • Commercially used in Cisco UCS Virtual Interface Cards
