Automatic Heap Sizing for Enhanced Memory Management in Multiprogrammed Environments
This paper discusses the need for dynamic heap size adjustment in garbage collection (GC) to make good use of memory in multiprogrammed systems. A heap that is too small leads to frequent garbage collections, while an excessively large heap incurs paging overhead. The proposed model establishes a cooperative mechanism between the garbage collector and the virtual memory manager (VMM) that adjusts the heap size during execution based on current memory demands, aiming to maximize throughput and minimize paging while adapting to changes in the amount of real memory available.
Presentation Transcript
Automatic Heap Sizing • Ting Yang, Matthew Hertz, Emery Berger, Eliot Moss (University of Massachusetts) • Scott Kaplan (Amherst College)
Problem & Motivation • Important to select right heap size • Too small: frequent GCs, less progress • Too large: excessive paging overhead • Previous work • Pick “optimal size”, given static real memory • BUT multiprogramming = dynamic real RAM • Cannot select heap size a priori • Must adjust during execution
Cooperation with VMM • GC needs support from virtual memory mgr: • VMM determines footprint • Memory needed to avoid % of misses that fault • GC can then adjust heap • Need to add info & communication: • GC requests footprint and real memory • VMM collects needed information • Informs GC on demand
Tracking Footprint • Maintain (decayed) histogram per page position • Provides value to application of n pages, for any n • [Figure: hits vs. LRU stack position (pages); labels: “hot”, “cold”, protected, unprotected, (dynamic)]
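A minimal sketch of how a decayed histogram over LRU stack positions could yield a footprint. The function names, the decay factor, and the `max_miss_fraction` threshold are assumptions made for illustration; the slides do not give the VMM's actual data structures.

```python
def decay(histogram, factor=0.5):
    """Age the histogram so that older references count for less.

    `factor` is a hypothetical decay rate, not a value from the slides.
    """
    return [hits * factor for hits in histogram]


def footprint_pages(histogram, cold_misses, max_miss_fraction):
    """Smallest number of pages n that keeps misses within the threshold.

    histogram[i] holds the hits observed at LRU stack position i + 1.
    A reference at position p hits only if at least p pages are resident,
    so with n pages every hit recorded beyond position n becomes a miss.
    cold_misses counts first-touch references, which miss at any size.
    """
    total = sum(histogram) + cold_misses
    if total == 0:
        return 0
    allowed = max_miss_fraction * total
    misses = total  # with zero resident pages, every reference misses
    if misses <= allowed:
        return 0
    for n, hits in enumerate(histogram, start=1):
        misses -= hits
        if misses <= allowed:
            return n
    # Threshold unreachable (cold misses alone exceed it): cap at the
    # deepest stack position that was tracked.
    return len(histogram)
```

Because the histogram covers every stack position, the same data answers "how many misses with n pages" for any n, which is the property the slide highlights.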
GC Paging Behavior • Need to understand relationship: • Heap size, footprint, GC algorithm • Analysis methodology: • Obtain reference trace • Simulate Jikes RVM under DSS • Process LRU stack to obtain # faults at all memory sizes • Experiments: • GC: Mark-Sweep, Semi-Space, and Appel • Benchmarks: SPECjvm98, ipsixql, and pseudojbb
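The step of processing an LRU stack to get fault counts at all memory sizes can be sketched as follows. This is a simplified illustration over a list of page identifiers, not the authors' tooling; the function names and the trace format are assumptions.

```python
def lru_stack_histogram(trace):
    """Run a page reference trace through an LRU stack in a single pass.

    Returns (hist, cold), where hist maps a 1-based stack position to the
    number of references found at that position, and cold counts
    first-touch references.
    """
    stack = []  # most recently used page first
    hist = {}
    cold = 0
    for page in trace:
        if page in stack:
            position = stack.index(page) + 1
            hist[position] = hist.get(position, 0) + 1
            stack.remove(page)
        else:
            cold += 1
        stack.insert(0, page)  # the referenced page becomes most recent
    return hist, cold


def faults_at(hist, cold, memory_pages):
    """Faults an LRU-managed memory of `memory_pages` pages would incur."""
    deep_hits = sum(n for pos, n in hist.items() if pos > memory_pages)
    return cold + deep_hits


# A loop over three pages faults on every reference with two pages of
# memory, but only on first touch once all three pages fit.
hist, cold = lru_stack_histogram(["a", "b", "c", "a", "b", "c"])
```

One pass over the trace is enough because LRU has the inclusion property: the pages resident with n frames are always a subset of those resident with n + 1 frames.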
Paging Behavior • Three regions • Extreme paging: • larger heaps better • Substantial paging: • “plateau” • GC “looping” behavior • Drop in paging: • heap fits in RAM
Paging Model • Propose linear heap footprint model, relating: • Footprint • Heap size • GC algorithm • Model: Footprint = a*HeapSize + b • a = intuitively, how much of heap we loop over • depends on GC algorithm • For SS and Appel: ½ (fill half then collect) • For MS: 1 • b depends on Jikes RVM and application live data
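The linear model can be read in both directions: forward, to predict the footprint of a given heap, and inverted, to find the largest heap whose footprint still fits in real memory. The sketch below assumes sizes in megabytes and a caller-supplied b; the function names and the `min_heap` floor are illustrative assumptions, while the values of a come from the slide.

```python
# Slope of the model per collector, as given on the slide.
SLOPE = {"SS": 0.5, "Appel": 0.5, "MS": 1.0}


def predicted_footprint(heap_size, a, b):
    """Footprint = a * HeapSize + b."""
    return a * heap_size + b


def heap_size_for_memory(real_memory, a, b, min_heap=0.0):
    """Invert the model: largest heap whose footprint fits in real_memory.

    `min_heap` is a hypothetical floor so the result is never negative
    when b alone exceeds the available memory.
    """
    return max(min_heap, (real_memory - b) / a)
```

For example, with 100 MB of real memory and a hypothetical b of 20 MB, a semi-space collector (a = ½) could use a 160 MB heap, while mark-sweep (a = 1) could use only 80 MB.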
Validating Paging Model • Different thresholds t of paging overhead • Good linear fit
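One way to check for a linear fit is ordinary least squares over (heap size, footprint) pairs. The sketch below is a generic fit, not the authors' validation code, and the sample points are synthetic values chosen to lie on a line, not measurements from the slides.

```python
def fit_line(heap_sizes, footprints):
    """Least-squares estimates of a and b in Footprint = a * HeapSize + b."""
    n = len(heap_sizes)
    mean_x = sum(heap_sizes) / n
    mean_y = sum(footprints) / n
    sxx = sum((x - mean_x) ** 2 for x in heap_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(heap_sizes, footprints))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic example: points generated from a = 0.5, b = 10.
a, b = fit_line([40, 60, 80, 100], [30, 40, 50, 60])
```

Repeating the fit for each paging-overhead threshold gives one (a, b) pair per threshold, which can then be compared against the slope expected for the collector.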
Modeling Cooperative GC • Extended DSS to: • Simulate OS VMM • Add footprint calculation • Add communication to GC (system calls) • Extended SS and Appel GCs to: • Request footprint, real memory allocation • Use them to adjust heap size • Careful about growing heap • Careful in using info from nursery GCs (Appel)
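A sketch of how a collector might use the footprint and real memory allocation it receives after each collection. Under the linear model, changing the heap by Δ changes the footprint by a·Δ, so the heap change that closes the gap is (real memory − footprint) / a. The growth cap and the rule for nursery collections are hypothetical policies standing in for the "careful" bullets; the slides do not specify them.

```python
def adjust_heap(heap_size, footprint, real_memory, a,
                max_growth=0.5, from_nursery_gc=False):
    """Return a new heap size after one collection.

    footprint and real_memory are the values reported by the VMM.
    max_growth (fraction of the current heap) and the nursery rule are
    assumptions for illustration only.
    """
    delta = (real_memory - footprint) / a
    if delta > 0:
        if from_nursery_gc:
            # Assumed policy: a nursery GC touches little of the heap,
            # so do not grow on its evidence alone.
            return heap_size
        # Assumed policy: grow conservatively, since overshooting
        # causes paging.
        delta = min(delta, max_growth * heap_size)
    return heap_size + delta
```

Shrinking is applied in full because a footprint larger than the real memory allocation means the program is already paging; growth is capped because the estimate is only as good as the last measurement.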
Experimental Results • Adjusting to fixed memory size: • Increases heap size to reduce # of GCs • Decreases heap size to reduce paging • Heap size about right: close to static GC
Experimental Results • Adjusting to changing memory size: • Increases heap when memory increases • Decreases heap when memory decreases • Dominates static GC’s performance • Note: adjustable memory = higher throughput
Conclusion • Automatic heap size adjustment • Maximizes memory utilization • Avoids paging • Adapts quickly to steady and changing real memory allocations • Currently implementing VMM in Linux • Useful for “scheduler-aware” virtual memory, and others