Geiger: Monitoring the Buffer Cache in a Virtual Machine Environment

Geiger: Monitoring the Buffer Cache in a Virtual Machine Environment. Stephen T. Jones Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau Department of Computer Sciences. Buffer Cache. In modern OSes, file system buffer and virtual memory system are unified


Presentation Transcript


  1. Geiger: Monitoring the Buffer Cache in a Virtual Machine Environment • Stephen T. Jones, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau • Department of Computer Sciences

  2. Buffer Cache • In modern OSes, the file system buffer cache and the virtual memory system are unified • When a file is first accessed, its data is buffered in a memory page • Under memory pressure, a page is evicted • If the page is dirty, it is first written to swap space or the file system • Then the page can be reused • Later, if the data is needed again, a page fault occurs • The OS allocates a free page and reloads the data from disk into it

  3. Useful Information About the Buffer Cache • If the VMM knows about eviction/promotion events, it can: • Tell whether the guest OS is thrashing and how much more memory it needs to stop thrashing • Guide eviction-based cache placement • Exclusive cache: on a hit, the data item is removed from the cache • A transparent secondary cache may be desirable • E.g. a 32-bit OS running on a host with 16 GB of memory • Why does an exclusive cache work? • Normally, once a page is read from disk, the OS will not read it again without evicting it first • Exclusivity increases cache utilization

  4. Services in a VMM • The VMM layer is an attractive development target • Security (isolation from OS and apps) • Portability (transparent to OS) • Our target services • VMM-driven eviction-based cache placement • Increases hit ratio for remote storage caches • Transparent to guest OS • Working set size estimation for thrashing VMs • Complements the ESX Server technique

  5. VMM Services Need Information • Information about guest operating systems • For our target services • Information about OS buffer cache • Hidden from the VMM • Layered design approach • Narrow interface (virtual architecture)

  6. Geiger Monitors Buffer Cache • Virtual machine monitor extension • Implicitly observes buffer cache events • Uses only information intrinsically available to VMM • Explicit approach possible, but has drawbacks • No guest OS modifications required • Applicable to closed and legacy OSes • Accurate (usually less than 5% error) • Low cost (usually less than 3% overhead) • Enables service implementation in VMM

  7. Outline • Geiger approach • New Geiger techniques • Evaluation • Application

  8. Buffer Cache Events • Cache promotion • Disk block inserted into buffer cache • Cache demotion • Disk block removed from cache

  9. Detecting Promotion • Block read • Block write • Disk reads and writes visible to the VMM • Associated Disk Location (ADL) • [Figure: user process, buffer cache, and disk; each cache page is tagged with the ADL of the disk block it holds]

  10. Detecting Demotion • Detect when a page is removed from the cache • VMM cannot observe a page free directly • Instead, look for page reuse • If a cache page is reused for other data, the page was logically freed in the interim • Reuse inconsistent with the page's ADL implies an eviction • [Figure: buffer cache and disk; a cache page is reused for a block other than the one its ADL names]

  11. Read / Write Evictions • Read eviction • A non-free page is reused for reading from a different disk location • E.g. reading a large file or memory region • Write eviction • A non-free page is reused for writing; the reuse (eviction) is detected only when the page is written back • Detection therefore lags the actual eviction
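The ADL bookkeeping described on slides 9 to 11 can be sketched roughly as follows. This is a minimal illustration, not Geiger's implementation: the class name `AdlTracker`, the method `on_disk_io`, and the integer page and block numbers are all assumptions made for the example.

```python
class AdlTracker:
    """Maps each guest page to its associated disk location (ADL).

    Hypothetical sketch: names and data layout are illustrative only.
    """

    def __init__(self):
        self.adl = {}        # page number -> disk block currently cached there
        self.evictions = []  # (page, old_block) pairs inferred so far

    def on_disk_io(self, page, block):
        """Called for every disk read into, or write-back from, `page`."""
        old = self.adl.get(page)
        if old is not None and old != block:
            # The page is being used for a different disk location, so its
            # previous contents must have left the cache at some earlier point.
            self.evictions.append((page, old))
        self.adl[page] = block  # promotion: `block` is now cached in `page`


tracker = AdlTracker()
tracker.on_disk_io(page=7, block=100)   # block 100 promoted into page 7
tracker.on_disk_io(page=7, block=100)   # same location again: no eviction
tracker.on_disk_io(page=7, block=205)   # page reused for another block
print(tracker.evictions)                # [(7, 100)]
```

For a write, the hook only fires at write-back time, which is where the detection lag mentioned on the slide comes from.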

  12. Existing Techniques • Promotion via reads and writes • Demotion via reads and writes • Chen et al., USENIX 2003 • Within OS (pseudo device driver) • Initial basis for Geiger

  13. Outline • Geiger approach • New Geiger techniques • Evaluation • Application

  14. New Geiger Techniques • Other ways buffer cache pages are evicted • Unified buffer cache/virtual memory system • Non-I/O allocations cause eviction • Two new eviction detection heuristics • Copy-on-write • Anonymous allocation

  15. When Do Evictions Happen? • Explicit eviction • Read eviction • Write eviction • Implicit eviction • A non-free page is reused without any disk read or write • Page allocation or copy-on-write • E.g. when a process requests a new page, a non-dirty page is allocated to it

  16. Detecting Allocation Eviction • Page not-present fault • Page allocation (possible reuse) • New writable mapping • Detect eviction • Invalidate ADL • [Figure: user process, buffer cache, and disk; a zero-filled page newly mapped into the process replaces a cached block]
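A rough sketch of the allocation heuristic above, assuming the VMM calls a hook whenever it sees a new writable mapping installed after a not-present fault. The function name `on_new_writable_mapping` and the dictionary representation of ADLs are hypothetical.

```python
def on_new_writable_mapping(adl, page):
    """Handle a new writable mapping of `page` (hypothetical VMM hook).

    If the page still carried an ADL, it was holding cached disk data that
    is now being overwritten without any I/O, so report that block as
    evicted and invalidate the ADL. Returns the evicted block or None.
    """
    return adl.pop(page, None)


adl = {3: 512, 4: 640}                     # page -> disk block
evicted = on_new_writable_mapping(adl, 3)  # page 3 handed to a process
print(evicted, adl)                        # 512 {4: 640}
```

A page with no ADL (for example, one that was already free) yields `None`, so no eviction is reported for it.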

  17. Filesystem Issues • Filesystem features cause false positives • Filesystem blocks can be deleted • Leads to dangling ADL and spurious eviction • Journaling causes aliasing • Same cache page written to both the journal and filesystem locations • Interferes with write-eviction heuristic

  18. Geiger Is Filesystem Aware • Uses static filesystem info • Journal location and size • Block allocation bitmaps • Ignores writes to the journal • Tracks allocation bitmap updates and invalidates ADLs when blocks are deallocated • Significantly reduces Geiger false positives
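The two filters on this slide could look something like the sketch below. It assumes the journal occupies one contiguous block range and that the allocation bitmap is available as a list of 0/1 flags; both representations, and all function names, are assumptions for illustration.

```python
def is_journal_write(block, journal_start, journal_size):
    """True if `block` lies inside the (assumed contiguous) journal region."""
    return journal_start <= block < journal_start + journal_size


def freed_blocks(old_bitmap, new_bitmap):
    """Blocks marked allocated in the old bitmap and free in the new one."""
    return [b for b, (was, now) in enumerate(zip(old_bitmap, new_bitmap))
            if was and not now]


def invalidate_freed(adl, old_bitmap, new_bitmap):
    """Drop ADLs that point at blocks the filesystem has deallocated."""
    dead = set(freed_blocks(old_bitmap, new_bitmap))
    for page in [p for p, b in adl.items() if b in dead]:
        del adl[page]  # the page no longer caches live data
    return dead


adl = {1: 2, 5: 3}                                   # page -> disk block
dead = invalidate_freed(adl, [1, 1, 1, 1], [1, 1, 0, 1])
print(dead, adl)                                     # {2} {5: 3}
print(is_journal_write(1030, journal_start=1024, journal_size=64))  # True
```

Writes for which `is_journal_write` returns true would simply be skipped by the write-eviction heuristic, which avoids the aliasing problem described on the previous slide.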

  19. Block Liveness • Reusing a free page is not an eviction • Geiger infers the liveness of a page from the liveness of its block • A block dies when: • A file is deleted or truncated • A process with virtual memory usage terminates

  20. Block Liveness for Files • Observe writes to the superblock • These metadata blocks are at special disk locations • The OS caches them in memory and syncs them to disk every 30 seconds or more • Pages used to cache them are marked read-only • Write attempts cause page faults • Invalidate affected ADLs

  21. Block Liveness for Swap Space • No on-disk structure to track block usage • When a disk block is written from a different memory page, the original block contents are considered “dead” • Maintain a reverse mapping from blocks to ADLs • Invalidate ADLs when blocks are overwritten • Without an overwrite, dead blocks can’t be detected • Leads to as much as 37% false-positive evictions
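One way to picture the reverse mapping is the sketch below. It is an assumption-laden illustration, not the authors' data structure: the class `SwapLiveness` and its two dictionaries are hypothetical.

```python
class SwapLiveness:
    """Tracks which page last wrote each swap block (illustrative only)."""

    def __init__(self):
        self.page_to_block = {}  # forward map: page -> swap block (the ADL)
        self.block_to_page = {}  # reverse map: swap block -> page

    def on_swap_write(self, page, block):
        """Record a write of `page` to `block`.

        Returns the page whose ADL was invalidated because its block was
        overwritten from a different page, or None.
        """
        stale = self.block_to_page.get(block)
        if stale is not None and stale != page:
            # The old contents of `block` are dead; forget the stale ADL.
            self.page_to_block.pop(stale, None)
        else:
            stale = None
        old_block = self.page_to_block.get(page)
        if old_block is not None and old_block != block:
            self.block_to_page.pop(old_block, None)  # keep the maps consistent
        self.page_to_block[page] = block
        self.block_to_page[block] = page
        return stale


s = SwapLiveness()
s.on_swap_write(page=1, block=50)
print(s.on_swap_write(page=2, block=50))  # 1
print(s.page_to_block)                    # {2: 50}
```

If block 50 were never overwritten, page 1 would keep its stale ADL, which is the source of the false positives noted on the slide.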

  22. Outline • Geiger approach • New Geiger techniques • Evaluation • Application

  23. Evaluation Goals • Measure Geiger accuracy • Missed evictions (false negatives) • Spurious evictions (false positives) • Measure Geiger timeliness • Lag between actual event and detection

  24. Experimental Environment • Xen 2.0.7 VMM [Barham et al., SOSP03] • Extensions to observe page faults, page table updates, and I/O requests/completions • Linux 2.4 and 2.6 guests • Microbenchmarks • Isolate specific eviction types • Read, write, COW, allocation • Application benchmarks • Dbench, Mogrify, TPC-W, SPC disk trace

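  25. Eviction Detection Accuracy (microbenchmarks) • Read Evict: false negatives 0.96%, false positives 0.58% • Write Evict: false negatives 1.68%, false positives 0.03% • COW Evict: false negatives 2.47%, false positives 1.45% • Alloc Evict: false negatives 0.17%, false positives 0.17%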

  26. Eviction Detection Lag ~3s

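  27. Application Accuracy • Dbench w/o block liveness: false negatives 1.10%, false positives 30.23% • Dbench w/ block liveness: false negatives 2.30%, false positives 5.72% • Mogrify w/o block liveness: false negatives 0.05%, false positives 22.99% • Mogrify w/ block liveness: false negatives 0.65%, false positives 2.46% • TPC-W: false negatives 0.14%, false positives 3.12% • SPC Web2: false negatives 2.24%, false positives 0.32%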

  28. Outline • Geiger approach • New Geiger techniques • Evaluation • Application • Eviction-based cache placement

  29. Application: Eviction-based Cache Placement • Disk cache utilization is critical to performance • Storage servers have large caches • Demand-based placement => poor utilization • Increase cache utilization via exclusivity • Use client cache eviction as placement hint [Chen et al., USENIX ’03; Wong and Wilkes, USENIX ’02] • Use VMM-based, implicit eviction information to inform a remote storage cache • No changes to client or OS storage interfaces
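To make the utilization argument concrete, here is a toy simulation contrasting demand-based and eviction-based placement with two LRU caches. It is only a sketch under simplifying assumptions (perfect, lag-free eviction hints; the names `LRU` and `server_hits` are invented for the example) and says nothing about the measured results in the talk.

```python
from collections import OrderedDict


class LRU:
    """Tiny LRU set; `touch` returns the evicted key, if any."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def touch(self, key):
        if key in self.items:
            self.items.move_to_end(key)
            return None
        self.items[key] = True
        if len(self.items) > self.capacity:
            return self.items.popitem(last=False)[0]  # drop the LRU entry
        return None

    def remove(self, key):
        return self.items.pop(key, None) is not None


def server_hits(trace, client_size, server_size, eviction_based):
    """Count hits in the storage-server cache for a block reference trace."""
    client, server = LRU(client_size), LRU(server_size)
    hits = 0
    for block in trace:
        if block in client.items:      # client hit: server never sees it
            client.touch(block)
            continue
        if eviction_based:
            if server.remove(block):   # exclusive: a hit removes the block
                hits += 1
        else:
            if block in server.items:
                hits += 1
            server.touch(block)        # demand placement: cache on read
        victim = client.touch(block)
        if eviction_based and victim is not None:
            server.touch(victim)       # place what the client just evicted
    return hits


trace = [0, 1, 2, 3] * 3               # loop over 4 blocks, 2 + 2 cache slots
print(server_hits(trace, 2, 2, eviction_based=False))  # 0
print(server_hits(trace, 2, 2, eviction_based=True))   # 8
```

With demand placement the server mostly duplicates what the client already holds, so the looping workload never hits; with eviction-based placement the two caches together hold all four blocks.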

  30. Cache Placement Results • [Figure: cache placement results, annotated with 13% and 51%] • Geiger outperforms demand placement • Mogrify: buffer misses too many evictions • Mogrify: false positives are fortuitous • Dbench: lag causes OS to outperform Geiger

  31. Outline • Geiger approach • New Geiger techniques • Evaluation • Application • Eviction-based cache placement • Working set size estimator

  32. LRU Miss Ratio Curve • [Figure: an LRU queue (pages in LRU order), a hit histogram associated with each LRU position, and the resulting fault curve (faults vs. pages)]

  33. Application: Working Set Size Estimator • MemRx: • Observe evictions/reloads • Compute miss ratio curve • WSS = current memory allocation + LRU estimation • Only works when WSS > current memory size
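The LRU queue, hit histogram, and fault curve from slides 32 and 33 can be computed with a classic stack-distance pass, sketched below. This is a generic illustration rather than MemRx itself: the function names are hypothetical, the list-based stack is deliberately simple (quadratic time), and the `max_misses` threshold is an assumption about how a size might be picked from the curve.

```python
def lru_hit_histogram(trace):
    """hist[d] = number of references found at LRU stack depth d (1-based)."""
    stack, hist = [], {}
    for page in trace:
        if page in stack:
            depth = stack.index(page) + 1
            hist[depth] = hist.get(depth, 0) + 1
            stack.remove(page)
        stack.insert(0, page)  # most recently used page goes on top
    return hist


def misses_at(trace, hist, size):
    """Predicted misses if the LRU queue could hold `size` pages."""
    hits = sum(n for depth, n in hist.items() if depth <= size)
    return len(trace) - hits


def smallest_size_within(trace, hist, max_misses):
    """Smallest queue size whose predicted misses are <= max_misses."""
    for size in range(len(set(trace)) + 1):
        if misses_at(trace, hist, size) <= max_misses:
            return size
    return None  # even holding every page leaves too many (cold) misses


trace = list("abcabc")
hist = lru_hit_histogram(trace)
print(hist)                                   # {3: 3}
print(misses_at(trace, hist, 2))              # 6
print(misses_at(trace, hist, 3))              # 3
print(smallest_size_within(trace, hist, 3))   # 3
```

In the setting on the slide, the trace would consist of the pages Geiger sees evicted and reloaded, and the size found this way would be the "LRU estimation" term added to the current memory allocation.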

  34. Estimation Results: Microbenchmarks • Virtual machine is configured with 128 MB of memory • Each benchmark accesses a 256 MB file or memory region • FS: file access • VM: memory access

  35. Estimation Results: Applications

  36. Summary • System services in a VMM • Need information about the guest OS • Implicit information about the buffer cache • No guest OS modification • Accurate • Low overhead • Build services and optimizations in a VMM • Eviction-based cache placement • Working set size estimation

  37. Computer Sciences Department • Advanced Systems Laboratory • http://cs.wisc.edu/adsl
