Optimizing Compilers, CISC 673, Spring 2011: Inlining. John Cavazos, University of Delaware
Background • Inlining is important • Removes call overhead • Enables optimization opportunities • Can be detrimental • Increased compilation time • Increased register pressure • Cache effects
Interprocedural Optimization • Some optimizations are disrupted by calls • Constant propagation might stop at call site • Possible solution: interprocedural optimization • Optimization that involves more than one function • Gets complicated (e.g., when functions not in same file)
Inlining • Replace a function call with the body of the called function • Assumed to be beneficial up to a certain point • Enables optimizations • Constant folding, common subexpression elimination, better global register allocation • Gains from the enabled optimizations can outweigh the gain from removing call overhead
Inlining Advantages • Eliminates call disruption • No register save/restore required • Call overhead removed • Allows context-specific tailoring • Eliminates call barrier for analysis/optimizations
Inlining Disadvantages • Eliminates benefits of calls • Calls reset state for register allocation • Increases register pressure • Procedure calls (reuse) keep code size small • Compilation time increases • Larger functions • Code bloat
Inlining for Object Oriented • Plays a particularly important role in optimization of OO languages • High ratio of calls (and overhead) • Many methods are short (e.g., setter/getter) • Issues mapping virtual calls to concrete implementations • Requires inserting a run-time type test
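The run-time type test above is commonly called a guard. A minimal sketch of the idea, using hypothetical classes `Shape`, `Square`, and `Circle` that do not appear in the slides: the compiler guesses the likely receiver class, inlines that class's method body behind a type check, and falls back to the ordinary virtual call when the guess is wrong.

```python
class Shape:
    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        # Crude integer stand-in for pi so the example stays exact.
        return 3 * self.radius * self.radius


def total_area_virtual(shapes):
    """Original code: every s.area() is a dynamically dispatched call."""
    return sum(s.area() for s in shapes)


def total_area_guarded(shapes):
    """Roughly what a compiler might emit after guarded inlining of Square.area."""
    total = 0
    for s in shapes:
        if type(s) is Square:          # run-time type test (the guard)
            total += s.side * s.side   # inlined body of Square.area
        else:
            total += s.area()          # fallback: ordinary virtual call
    return total
```

Both functions compute the same result; the guarded version only pays for dispatch when the receiver is not a `Square`.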
Inlining Transformation Easy • Actual transformation is easy • Rewrite call site with callee’s body • Rewrite formal parameter names with actual parameter names
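The transformation can be sketched on a toy expression language. Everything here is an assumption made for illustration (the tuple encoding, the `FUNCTIONS` table, the `scale` function); it is not the representation used by any real compiler. Callees are single expressions with no locals, so no renaming is needed, and recursion is not handled. The sketch also shows the call acting as a barrier to constant folding until it is inlined.

```python
# Expressions: int constants, str variable names,
# ("add" | "mul", left, right), or ("call", function_name, arg, ...).
FUNCTIONS = {
    # name: (formal parameter names, body expression)
    "scale": (("x", "k"), ("mul", "x", "k")),
}


def substitute(expr, env):
    """Rewrite formal parameter names in expr with the actual arguments in env."""
    if isinstance(expr, str):
        return env.get(expr, expr)
    if isinstance(expr, tuple):
        op, *operands = expr
        if op == "call":
            name, *args = operands
            return ("call", name, *[substitute(a, env) for a in args])
        return (op, *[substitute(o, env) for o in operands])
    return expr  # integer constant


def inline_calls(expr):
    """Replace each call with the callee's body (non-recursive callees only)."""
    if not isinstance(expr, tuple):
        return expr
    op, *operands = expr
    if op == "call":
        name, *args = operands
        formals, body = FUNCTIONS[name]
        actuals = [inline_calls(a) for a in args]
        return inline_calls(substitute(body, dict(zip(formals, actuals))))
    return (op, *[inline_calls(o) for o in operands])


def fold(expr):
    """Constant folding; a call is opaque, so folding stops at it."""
    if not isinstance(expr, tuple):
        return expr
    op, *operands = expr
    if op == "call":
        name, *args = operands
        return ("call", name, *[fold(a) for a in args])
    a, b = (fold(o) for o in operands)
    if isinstance(a, int) and isinstance(b, int):
        return a + b if op == "add" else a * b
    return (op, a, b)


call_site = ("add", ("call", "scale", 3, 4), "y")
inlined = inline_calls(call_site)   # ("add", ("mul", 3, 4), "y")
folded = fold(inlined)              # ("add", 12, "y")
```

Folding `call_site` directly leaves it unchanged, while folding after inlining reduces `scale(3, 4)` to the constant 12.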
Inlining Decision Hard • Resource constraint decision • Code size • Must consider both whole program and procedure • Excessive code growth leads to excessive compilation time (important for JITs!) • Profitability depends on specific context • Can callee be tailored and optimized? • Each decision affects profitability and resources available later!
Inlining Decision Hard • Consider following call graph • Assign each edge a type {inline, no-inline} • Choice at each edge affects other decisions • Each decision has a profit and a cost (in terms of resources)
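One simple way to picture the edge-labeling problem is a greedy pass under a code-size budget. This is a sketch under stated assumptions, not the method from the slides: the `CallEdge` type, the profit and cost numbers, and the profit/cost ordering are all hypothetical, and each cost must be positive. It also treats profit and cost as fixed per edge, which ignores the point above that each choice changes the others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallEdge:
    caller: str
    callee: str
    profit: float  # hypothetical estimate of time saved if inlined
    cost: int      # hypothetical estimate of code growth, must be > 0


def choose_inlines(edges, size_budget):
    """Label each edge 'inline' or 'no-inline', best profit/cost ratio first."""
    decisions = {}
    remaining = size_budget
    for e in sorted(edges, key=lambda e: e.profit / e.cost, reverse=True):
        if e.cost <= remaining:
            decisions[(e.caller, e.callee)] = "inline"
            remaining -= e.cost          # resources left for later decisions
        else:
            decisions[(e.caller, e.callee)] = "no-inline"
    return decisions


edges = [
    CallEdge("main", "parse", profit=50, cost=40),
    CallEdge("main", "log", profit=10, cost=5),
    CallEdge("parse", "lex", profit=90, cost=30),
]
decisions = choose_inlines(edges, size_budget=40)
```

With a budget of 40, the two cheaper edges are inlined and `main -> parse` is rejected because the earlier choices already consumed most of the budget.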
Inlining Decision Procedures • Some decisions are obvious • Inline small procedures • Code smaller than linkage • Inline procedures called only once • Still lots of experimental work to do! • Cavazos 2005, Waterman 2006 • Cooper, Hall, & Torczon or Davidson & Holler
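The obvious cases above fit in a one-line predicate. A minimal sketch, assuming sizes are measured in the same unit (for example, instructions) and that `linkage_size` means the size of the call sequence itself; the function and parameter names are hypothetical.

```python
def obviously_inline(callee_size, linkage_size, call_site_count):
    """True when the callee is smaller than the call linkage or is called only once."""
    return callee_size < linkage_size or call_site_count == 1
```

Anything that fails this test falls into the harder territory where a tuned heuristic is needed.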
Adaptive Decision Making • How should we determine a good decision heuristic? • Cavazos proposed an adaptive solution • Train a heuristic • Specialized for a given hardware or benchmark • Prior Art • Ad hoc (manually-constructed) heuristic based on program properties • Combine ad hoc heuristics into a single test applied at each call site – applied in a fixed order
Proposed Solution • Use machine learning • Features predict which methods to inline • Heuristic function controls inlining • Tune heuristic to: • Different compilation scenarios • Different architectures
Applying Genetic Algorithms • Cross-validation • Evolve heuristic over set of benchmarks • Test on a different set of benchmarks • Average high performance • Self-validation • Evolve heuristic for one benchmark • Best performance for benchmark
High Performance Compiler • IBM Jikes RVM • Java JIT Compiler • Tuned for Server Applications • Commercial quality • Used by Several Hundred Researchers • Over 100 Publications • Several papers on Inlining
Default Inlining Heuristic • Small methods • Always inline • Medium-sized methods • Use static heuristic (IBM) • Large methods • Never inline
Default Inlining Heuristic
  if (calleeSize > CALLEE_MAX_SIZE) return NO
  if (calleeSize < ALWAYS_INLINE_SIZE) return YES
  if (inlineDepth > MAX_INLINE_DEPTH) return NO
  if (callerSize > CALLER_MAX_SIZE) return NO
  return YES
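The heuristic above can be written as a runnable function. This is a sketch that keeps the slide's order of tests; the threshold values in `InlineParams` are placeholders chosen for illustration, since the slides do not state the actual defaults, and the Python names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InlineParams:
    # Placeholder values, not taken from the slides.
    callee_max_size: int = 23
    always_inline_size: int = 11
    max_inline_depth: int = 5
    caller_max_size: int = 2048


def should_inline(callee_size, caller_size, inline_depth, p=InlineParams()):
    """Apply the four tests in the same fixed order as the slide."""
    if callee_size > p.callee_max_size:
        return False   # large methods: never inline
    if callee_size < p.always_inline_size:
        return True    # small methods: always inline
    if inline_depth > p.max_inline_depth:
        return False   # medium methods: stop at deep inline chains
    if caller_size > p.caller_max_size:
        return False   # medium methods: stop when the caller is already large
    return True
```

Because the small-method test comes second, a small callee is inlined even when the depth and caller-size limits are exceeded, which is what "always inline" implies.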
Genetic Algorithms • Tune parameters of IBM heuristic • Individual • Vector of Integers • Fitness is benchmark running time • Tuning time • Few hours per benchmark • Few days per suite
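A minimal genetic-algorithm sketch for tuning such a parameter vector follows. It is an illustration under stated assumptions, not the setup used in the experiments: the search ranges in `BOUNDS`, the selection and mutation scheme, and all names are hypothetical. Most importantly, `fake_running_time` is a synthetic stand-in; in the slides the fitness is the measured running time of a benchmark, which is why real tuning takes hours.

```python
import random

# Assumed search range for each of the four integer thresholds.
BOUNDS = [(1, 50), (1, 30), (1, 15), (256, 4096)]


def fake_running_time(ind):
    """Synthetic fitness (lower is better); replaces a real benchmark run."""
    target = (23, 11, 5, 2048)  # arbitrary optimum for the toy function
    return sum(abs(x - t) / (hi - lo)
               for x, t, (lo, hi) in zip(ind, target, BOUNDS))


def random_individual(rng):
    return [rng.randint(lo, hi) for lo, hi in BOUNDS]


def crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # one-point crossover
    return a[:cut] + b[cut:]


def mutate(ind, rng, rate=0.25):
    return [rng.randint(lo, hi) if rng.random() < rate else x
            for x, (lo, hi) in zip(ind, BOUNDS)]


def evolve(fitness, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=fitness), history


best, history = evolve(fake_running_time)
```

Because the better half of each generation survives unchanged, the best fitness never gets worse from one generation to the next. Swapping `fake_running_time` for a function that compiles and times a benchmark would give the structure described above.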
Parameters Tuned by GA
Metric to Evaluate an Individual
Scenarios and Metrics • Scenarios • Adaptive • Optimizing • Metrics • Running Time • Total Time
Experimental Setup • High-Performance Java compiler • Jikes RVM 2.3.3 • Intel Pentium 4, 2.6 GHz • PowerPC G4, 500 MHz (not shown) • Training Set • SPEC JVM benchmarks • Test Set • DaCapo benchmarks + SPEC JBB
Adaptive Scenario (SPEC JVM98)
Adaptive Scenario (DaCapo+JBB)
Optimizing Scenario (SPEC JVM98)
Optimizing Scenario (DaCapo+JBB)
Conclusions • Outperforms a well-tuned heuristic • 37% total time reduction on Intel • 7% total time reduction on PowerPC • Automatically tunes compiler heuristic • Compilation Scenario • Different Architectures