Xrd Monitoring

Jacek Becla, Stanford Linear Accelerator Center (SLAC). XrdMon allows monitoring I/O in real time with low overhead and is non-intrusive. Granularity, flush intervals, and output location, among other settings, are reconfigurable.


Presentation Transcript


  1. Xrd Monitoring
     Jacek Becla, Stanford Linear Accelerator Center (SLAC)

  2. XrdMon
     • Allows monitoring I/O in real time
     • Low overhead, non-intrusive
     • Reconfigurable
       • granularity, flush intervals, output location, among others

  3. Typical Architecture

  4. Monitored Data
     Red indicates “bulk” data. Everything else is available in “light” (default) mode.
     • Client information
       • user name, client host name, process ID, session duration, disconnect time
     • File information
       • corresponding client, full path, open time, close time, # bytes read, # bytes written, { offset accessed, length, read/write mode, timestamp }
     • Application information
       • depends on the client; can pass anything, e.g. job type, cache hit
       • Xrd forwards it to the collector
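The fields listed on slide 4 can be pictured as three nested record types. The following is a minimal sketch only: the class names, field names, and types are hypothetical and do not reflect XrdMon's actual wire format or decoder. It also assumes that the per-access tuples `{ offset, length, read/write mode, timestamp }` are the "bulk" part (the color coding is lost in this transcript), and that a connect time is recorded so that session duration can be derived.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AccessRecord:
    """One traced I/O request (assumed to be 'bulk' data)."""
    offset: int
    length: int
    mode: str          # "r" for read, "w" for write (hypothetical encoding)
    timestamp: float


@dataclass
class FileRecord:
    """Per-file information from slide 4 (hypothetical layout)."""
    client_id: int     # link to the corresponding client
    path: str
    open_time: float
    close_time: Optional[float] = None
    bytes_read: int = 0
    bytes_written: int = 0
    accesses: List[AccessRecord] = field(default_factory=list)

    def record(self, access: AccessRecord) -> None:
        """Store a traced access and update the byte counters."""
        self.accesses.append(access)
        if access.mode == "r":
            self.bytes_read += access.length
        else:
            self.bytes_written += access.length


@dataclass
class ClientRecord:
    """Per-client information from slide 4 (hypothetical layout)."""
    user: str
    host: str
    pid: int
    connect_time: float                      # assumption: not listed on the slide
    disconnect_time: Optional[float] = None

    def session_duration(self) -> Optional[float]:
        """Return the session length, or None while still connected."""
        if self.disconnect_time is None:
            return None
        return self.disconnect_time - self.connect_time
```

In light mode, under these assumptions, only the counters and timestamps on `FileRecord` and `ClientRecord` would be filled in, and `accesses` would stay empty.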

  5. Overheads
     • xrootd
       • unnoticeable
       • assuming reasonable configuration
     • Collector / decoder
       • a single collector + real-time decoder can easily keep up with the typical load at SLAC
       • <5% of one 750 MHz CPU
       • decoding bulk data: not in real time, 1 CPU is enough to keep up
     • Space
       • light: a few GB/year for all BaBar activities at SLAC
       • bulk: a few TB/year

  6. XrdMon Configuration @SLAC
     • Light mode
       • continuous, production (24x7x365)
       • starting in a few weeks
       • used to monitor the system, gather statistics, look for abnormal activities, understand server load (total and/or per application type)
     • Bulk tracing
       • will be turned on occasionally for chosen applications
       • used to understand access patterns

  7. Demo
     Data based on a test setup. The xrootd production version doesn’t contain many XrdMon metrics yet.
     Servers configured specifically for the demo: 4 sec flush frequency, 3 sec time window. In production, expect longer (~min) delays.

  8. First Analysis of Bulk Traces

  9. Jacek Becla

  10. First Analysis of Bulk Traces
      • Easy to simulate the effect of prefetching based on bulk data
        • played with different page sizes, # pages, page position relative to requested offset
      • Example (optimal config)
        • for SP files
          • prefetch 32K pages, cache 75 pages
          • don’t reread already fetched section of a page
          • result: 52.08% cache hit, 43.40% used/read bytes
        • for SP deep-copied skims
          • prefetch 128K pages, cache 75 pages
          • don’t reread already fetched section of a page
          • result: 95% cache hit, 95% used/read bytes
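To illustrate the kind of prefetch simulation slide 10 describes, here is a minimal sketch that replays a list of `(offset, length)` read requests, as a bulk trace might provide, against a page cache. It is not the analysis code behind the numbers on the slide. The function name, the LRU eviction policy, the page-aligned prefetch, and the definition of a "hit" (every page a request touches is already cached) are all assumptions. It always fetches whole pages, so it does not model the slide's "don’t reread already fetched section of a page" refinement.

```python
from collections import OrderedDict


def simulate_prefetch(requests, page_size=32 * 1024, max_pages=75):
    """Replay (offset, length) reads against an LRU page cache.

    Returns (hit_rate, used_ratio), where hit_rate is the fraction of
    requests served entirely from cache and used_ratio is requested
    bytes divided by bytes fetched in whole pages.
    """
    cache = OrderedDict()          # page index -> None, oldest first
    hits = total = 0
    bytes_used = bytes_fetched = 0

    for offset, length in requests:
        if length <= 0:
            continue               # ignore empty or malformed requests
        total += 1
        bytes_used += length
        first = offset // page_size
        last = (offset + length - 1) // page_size

        all_cached = True
        for page in range(first, last + 1):
            if page in cache:
                cache.move_to_end(page)        # mark as recently used
            else:
                all_cached = False
                bytes_fetched += page_size     # fetch the whole page
                cache[page] = None
                if len(cache) > max_pages:
                    cache.popitem(last=False)  # evict least recently used
        if all_cached:
            hits += 1

    hit_rate = hits / total if total else 0.0
    used_ratio = bytes_used / bytes_fetched if bytes_fetched else 0.0
    return hit_rate, used_ratio
```

Sweeping `page_size` and `max_pages` over a trace is one way to compare configurations like the 32K and 128K page settings mentioned on the slide. Note that `used_ratio` as defined here can exceed 1 when the same bytes are requested repeatedly from cache.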

  11. Current Status of XrdMon
      • Server side: all done
      • Collector + real-time decoder
        • ready to put in production @SLAC, should happen very soon
      • Offline bulk decoder available
        • needs work to decode recently added metrics
      • Have scripts to set up/load MySQL
        • alpha version
      • To do includes:
        • fully automating data flow, backup
        • web interface to MySQL data
        • generating & sending application info
        • controlling monitoring (off/on, bulk/light) from application
        • docs

  12. XrdMon Availability
      • Already available as part of the xrootd distribution
        • XrdMon
      • Not built by default (yet)
      • Will be announced once we have run it for a few weeks in production
        • and written documentation
      • Contact me (becla@slac.stanford.edu) if you want to try it out today
