
Automatic Heap Sizing: Taking Real Memory into Account


Presentation Transcript


  1. Automatic Heap Sizing: Taking Real Memory into Account. Ting Yang, Emery Berger, Matthew Hertz, Scott Kaplan¶, Eliot Moss. Department of Computer Science, University of Massachusetts, Amherst MA 01003, {tingy,emery,hertz,moss}@cs.umass.edu; ¶Department of Computer Science, Amherst College, Amherst MA 01002, sfkaplan@cs.amherst.edu

  2. Problem & Motivation [Graph: heap size vs. running time for the Appel collector running _213_javac in 60MB of real memory. Too small a heap: the program GCs a lot; too large a heap: it pages a lot; the optimal size lies in between.]

  3. Problem & Motivation • Multiprogramming makes it harder: • Amount of available real memory changes • Impossible to select heap size a priori • Strategy: adjust adaptively during execution

  4. Outline • Problem & Motivation • Paging behavior of Garbage Collection • VMM: collecting the information we need • Collector: adjusting heap size adaptively • Experimental results • Conclusion & Future work

  5. GC Paging Behavior • For strategy to work, need to relate: • GC algorithm, heap size, and footprint • Analysis methodology: • Obtain reference trace: simulate Jikes RVM under DSS • Process with LRU stack → # of faults at all memory sizes (see the sketch below) • GCs and programs traced: • Mark-Sweep (MS), Semi-Space (SS), and Appel GCs • SPECjvm98, ipsixql, and pseudojbb benchmarks
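
The trace processing mentioned above can be illustrated with a single pass over a page reference trace: an LRU stack plus a hit histogram yields the number of faults at every memory size at once (Mattson-style stack simulation). The sketch below is only illustrative; the class and method names are invented here and are not part of the paper's tooling, which processes traces produced by Dynamic SimpleScalar.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: one pass over a page reference trace with an LRU stack.
// hitHistogram[d] counts hits whose page sat at 0-based LRU position d;
// faults at a memory of m pages = cold misses + hits at positions >= m.
class LruStackSimulator {
    private final List<Long> stack = new ArrayList<>();        // index 0 = most recently used page
    private final List<Long> hitHistogram = new ArrayList<>();
    private long coldMisses = 0;

    void reference(long page) {
        int depth = stack.indexOf(page);
        if (depth < 0) {
            coldMisses++;                        // first touch: a fault at every memory size
        } else {
            stack.remove(depth);
            while (hitHistogram.size() <= depth) hitHistogram.add(0L);
            hitHistogram.set(depth, hitHistogram.get(depth) + 1);
        }
        stack.add(0, page);                      // move (or insert) the page at the MRU position
    }

    // Number of page faults if the program ran in 'memoryPages' pages of real memory.
    long faultsAt(int memoryPages) {
        long faults = coldMisses;
        for (int d = memoryPages; d < hitHistogram.size(); d++) faults += hitHistogram.get(d);
        return faults;
    }
}
```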

  6. Fault curve: relationship of heap size, real memory, and page faults. [Plot annotated with: heap fits into memory → # of faults ≈ 1000, about 0.5 second; substantial paging ("looping" behavior) → about 50 seconds; still larger heaps → extreme paging. Example point: heap size = 240MB, memory = 145MB.]

  7. Relationship between heap size and footprint. Our definition of footprint: the amount of memory needed so that the time spent on page faults stays below a chosen page fault threshold (a percentage of total execution time).

  8. A Linear Model: Heap Size vs. Footprint • Heap footprint model: footprint ≈ HeapUtil × heap size + base • HeapUtil: SS 0.5; MS 1 • base: Jikes RVM plus live data size • How the GC can use this model: given the available real memory, choose heap size ≈ (available memory − base) / HeapUtil (see the sketch below)
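
A minimal sketch of this slide's linear model, under the assumption that footprint ≈ HeapUtil × heap size + base; the class and method names are illustrative, not taken from Jikes RVM or the paper's code.

```java
// Sketch of the linear heap-size/footprint model.
// footprint ≈ heapUtil * heapSize + base, so a GC that wants its footprint
// to just fit in the available real memory can invert the model.
class HeapSizeModel {
    double heapUtil;   // ~0.5 for semi-space (only half the heap holds live copies), ~1.0 for mark-sweep
    long baseBytes;    // Jikes RVM image plus other non-heap data

    HeapSizeModel(double heapUtil, long baseBytes) {
        this.heapUtil = heapUtil;
        this.baseBytes = baseBytes;
    }

    long predictedFootprint(long heapSizeBytes) {
        return (long) (heapUtil * heapSizeBytes) + baseBytes;
    }

    // Heap size whose predicted footprint matches the real memory the VMM reports as available.
    long heapSizeFor(long availableBytes) {
        return (long) ((availableBytes - baseBytes) / heapUtil);
    }
}
```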

  9. Strategy Overview • VMM (in OS): knows memory allocation / available memory; needs to track/calculate the application footprint • Garbage collector (in user space): has the ability to change heap size; needs info: available memory, footprint • The collector requests, and the VMM sends back, the current memory allocation and footprint (interface sketched below)
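
A minimal sketch of the exchange this slide describes, assuming a simple query interface between collector and VMM; all interface and method names here are invented for illustration (the actual prototype modifies the VMM and Jikes RVM directly rather than exposing such an API).

```java
// Sketch of the two-way exchange: the collector asks, the VMM answers.
interface VmmInfo {
    long availableRealMemory();   // bytes the VMM is currently willing to give this process
    long currentFootprint();      // footprint the VMM computed from its LRU hit histogram
}

interface ResizableHeap {
    long currentHeapSize();
    void resize(long newHeapSizeBytes);   // only the collector can change the heap size
}

// After each collection the collector pulls both numbers from the VMM and resizes the heap
// so that its predicted footprint fits the available real memory.
class HeapResizePolicy {
    void afterGc(VmmInfo vmm, ResizableHeap heap, double heapUtil, long baseBytes) {
        long target = (long) ((vmm.availableRealMemory() - baseBytes) / heapUtil);
        if (target > 0) heap.resize(target);
    }
}
```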

  10. Outline • Problem & Motivation • Paging behavior of Garbage Collection • VMM: collecting the information we need • Collector: adjusting heap size adaptively • Experimental results • Conclusion & Future work

  11. Approach to Measuring Footprint [Animation: a memory reference sequence is fed through an LRU queue that keeps pages in least-recently-used order; each hit increments the hit histogram entry for its LRU position, and the accumulated histogram yields the fault curve, e.g. the fault counts at 5 pages and at 12 pages of memory.]

  12. VMM design: SegQ [SKW'99, KMC'02] • The LRU queue is split at the hot/cold boundary into a hot set (managed by a CLOCK algorithm), a cold set (strict LRU, still in memory: hits are minor faults), and an evicted set (on disk: hits are major faults) • Hits recorded in the histogram give the footprint • Decay histogram periodically • Adaptive control of hot set size (see the sketch below)
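
A sketch of the VMM-side bookkeeping this slide implies: faults caught in the cold and evicted sets are recorded at their LRU positions, and the histogram is decayed periodically so it tracks the current phase. The names and the decay constant below are assumptions for illustration, not the paper's kernel code.

```java
// Sketch of the bookkeeping over a SegQ-style LRU queue.
// Pages in the hot set are managed by CLOCK and cost nothing per reference;
// hits in the in-memory cold set and the on-disk evicted set are caught as
// minor/major faults and recorded at their LRU position in the histogram.
class SegQHistogram {
    private final double[] hist;               // hist[d] ≈ hits at LRU position d beyond the hot set
    private static final double DECAY = 0.9;   // assumed decay factor so old behavior fades

    SegQHistogram(int maxQueuePositions) {
        hist = new double[maxQueuePositions];
    }

    // Called on a minor fault (cold set) or major fault (evicted set).
    void recordFaultAt(int lruPosition) {
        if (lruPosition < hist.length) hist[lruPosition] += 1.0;
    }

    // Called periodically so the histogram reflects recent behavior, not all history.
    void decay() {
        for (int d = 0; d < hist.length; d++) hist[d] *= DECAY;
    }
}
```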

  13. VMM design: SegQ [SKW'99, KMC'02] • Worked question on the same hot / cold / evicted set diagram: given the hit histogram, what is the footprint w.r.t. a 5% page fault threshold? (a sketch of the computation follows)
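
One way the question on this slide could be answered, assuming the footprint w.r.t. a threshold t is the smallest memory size whose projected page-fault time stays below t × total execution time; the method name and parameters are illustrative.

```java
// Sketch: the footprint w.r.t. a threshold t (e.g. 0.05) is the smallest number of
// pages m such that the page-fault time projected from the histogram is below
// t * (total execution time).
class FootprintEstimator {
    static int footprintPages(double[] hist, double coldMisses,
                              double majorFaultCostSec, double totalTimeSec, double threshold) {
        // Faults at memory size m = cold misses + all recorded hits deeper than m.
        double faultsBeyond = coldMisses;
        for (double h : hist) faultsBeyond += h;

        for (int m = 0; m <= hist.length; m++) {
            if (faultsBeyond * majorFaultCostSec <= threshold * totalTimeSec) return m;
            if (m < hist.length) faultsBeyond -= hist[m];   // pages at depth <= m now fit in memory
        }
        return hist.length;   // even the full queue does not meet the threshold
    }
}
```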

  14. Outline • Problem & Motivation • Paging behavior of Garbage Collection • VMM: collecting the information we need • Collector: adjusting heap size adaptively • Experimental results • Conclusion & Future work

  15. Collector Design • Communicate with VMM after each GC • First GC: • Appel, SS: HeapUtil = 0.5 • Following GCs: • Calculate HeapUtil from history (see the sketch below)
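
A sketch of what "calculate HeapUtil from history" might look like: each GC's observed footprint and heap size re-fit the slope of the linear model, and the next heap size is chosen by inverting it. The names and the exact update rule are assumptions, not the collector's actual code.

```java
// Sketch: the first collection uses a conservative HeapUtil (0.5 for Appel/semi-space);
// later collections re-estimate it from the footprint actually observed by the VMM.
class HeapUtilEstimator {
    private double heapUtil = 0.5;   // initial guess for a copying collector
    private final long baseBytes;    // Jikes RVM image plus other non-heap data

    HeapUtilEstimator(long baseBytes) {
        this.baseBytes = baseBytes;
    }

    // Called after each GC with the heap size just used and the footprint the VMM reported.
    void update(long heapSizeBytes, long observedFootprintBytes) {
        double estimate = (double) (observedFootprintBytes - baseBytes) / heapSizeBytes;
        if (estimate > 0) heapUtil = estimate;   // footprint ≈ heapUtil * heapSize + base
    }

    // Heap size to use next, so the predicted footprint fits in the available real memory.
    long nextHeapSize(long availableBytes) {
        return (long) ((availableBytes - baseBytes) / heapUtil);
    }
}
```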

  16. Outline • Problem & Motivation • Paging behavior of Garbage Collection • VMM: collecting the information we need • Collector: adjusting heap size adaptively • Experimental results • Conclusion & Future work

  17. Experiments • Experimental setting: Jikes RVM 2.0.3 • Dynamic SimpleScalar: extended with new VMM model • Major fault = 5 million instructions = 5 ms @ 1 GIPS • Minor fault = 2000 instructions = 2 µs • Page fault cost threshold = 5% - 10% • Histogram collecting cost threshold = 1% • Adapting to fixed memory pressure • Adapting to dynamic memory pressure • Add/remove 15MB of real memory after 2 billion instructions

  18. Appel, _213_javac, 60MB real memory. [Graph annotations: paging a lot; larger heap, fewer GCs, less paging; memory under-utilized; optimal heap.]

  19. Appel, _213_javac, 60MB real memory. Increase memory by 15MB at 2 billion instructions.

  20. Appel, _213_javac, 60MB real memory. Decrease memory by 15MB at 2 billion instructions.

  21. Conclusion: Automatic Heap Sizing • New collector usually picks heap size that: • Maximizes memory utilization (reducing GCs) • While avoiding paging • Linear model works well in practice • Improves performance by up to 8x under pressure • Cost of collecting information is low: around 1% • New collector adapts quickly to steady and to changing real memory allocations • Within 1 or 2 major GCs

  22. Ongoing Work • Implement in real kernel • Extend to more collectors • Adjust during allocation, not just after GC • Detailed graphs & tech report: http://www-ali.cs.umass.edu/~tingy/CRAMM
