Online Performance Auditing: Using Hot Optimizations Without Getting Burned
Jeremy Lau (UCSD, IBM), Matthew Arnold (IBM), Michael Hind (IBM), Brad Calder (UCSD)
Problem
• Trend: Increasing complexity of computer systems
  • Hardware: more speculation and parallelism
  • Software: more abstraction layers and virtualization
• Increasing complexity makes it more difficult to reason about performance
  • Will optimization X improve performance?
Increasing Complexity
• Increasing distance between application and raw performance
• Modern stack (Application / Application Server / Java VM / OS / Hypervisor / Hardware) vs. the classic Application-OS-Hardware stack
• Hard to predict how all layers will react to an application-level optimization
Heuristics
• When should I use optimization X?
• Common solution: Use heuristics
  • Example: Apply optimization X if code size < N
  • "We believe X will improve performance when code size < N"
  • Determine N by running benchmarks and tuning to maximize average performance
• But heuristics will miss opportunities to improve performance
  • Because they are tuned for the average case
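A minimal sketch of such a size-based heuristic, assuming a hypothetical `shouldInline` hook and a tuned threshold N; the names are illustrative only, not J9's actual inliner interface.

```java
// Hypothetical size-based inlining heuristic: N is tuned offline to maximize
// average performance across a benchmark suite, so individual methods can
// still be hurt (or miss speedups) at the chosen threshold.
final class SizeBasedInliningHeuristic {
    private final int maxCalleeSize; // the tuned threshold "N"

    SizeBasedInliningHeuristic(int maxCalleeSize) {
        this.maxCalleeSize = maxCalleeSize;
    }

    boolean shouldInline(int calleeBytecodeSize) {
        // "We believe X will improve performance when code size < N"
        return calleeBytecodeSize < maxCalleeSize;
    }
}
```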
Experiment
• Aggressive inlining: 4x inlining thresholds
  • Allows much larger methods to be inlined
• Apply aggressive inlining to one hot method at a time
• Calculate per-method speedups vs. the default inlining policy
• Use the cycle counter to measure performance
Experiment Results
• Per-method speedups: aggressive inlining vs. default inlining
• Using J9, IBM's high-performance Java VM
• [Per-method speedup chart not reproduced here]
Experiment Analysis
• Aggressive inlining: mixed results
  • More slowdowns than speedups
  • But there are significant speedups!
Wishful Thinking
• Dream: A world without slowdowns
• Default inlining heuristics miss these opportunities to improve performance
• Goal: Be aggressive only when it produces a speedup
Approach
• Determine whether an optimization improves or degrades performance as the program executes
  • For general-purpose applications
  • Using VM support (dynamic compilation)
• Plan:
  • Compile two versions of the code: with and without the optimization
  • Measure the performance of both versions
  • Use the best-performing version
Benefits
• Defense: Avoid slowdowns due to poor optimization decisions
  • Sometimes O3 is slower than O2. Detect and correct
• Offense: Find speedups by searching the optimization space
  • Try high-risk optimizations without fear of long-term slowdowns
Challenge
• Which implementation is fastest?
  • Decide online, without stopping and restarting the program
• Can't just invoke each version once and compare times
  • Changing inputs, global state, etc.
  • Example: Sorting routine, where input size determines run time
    • SortVersionA(10 entries) vs. SortVersionB(1,000,000 entries)
    • Invocation timings don't reflect the performance of A and B
    • Unless we know that input size correlates with runtime
    • But that requires a high-level understanding of program behavior
• Solution: Collect multiple timing samples for each version
  • Use statistics to determine how many samples to collect
Timing Infrastructure
• On each invocation of Sort(): randomly choose Version A or B, start the timer, run the chosen version, stop the timer, and record the timing at method exit
• Can generalize:
  • Doesn't have to be method granularity
  • Can use more than two versions
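A minimal sketch of this timing harness in Java. It assumes two compiled versions of a method are reachable behind a common entry point; in the real system the VM patches the method entry and each sample comes from a normal invocation by the running application, so the class and method names here are purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative timing harness: randomly pick a version, time one invocation,
// and record the sample for later statistical analysis.
final class TimingHarness {
    private final Random rng = new Random();
    final List<Long> timingsA = new ArrayList<>();
    final List<Long> timingsB = new ArrayList<>();

    // Each Runnable stands in for one compiled version of the audited method.
    void timedInvoke(Runnable versionA, Runnable versionB) {
        boolean pickA = rng.nextBoolean();          // randomly choose Version A or B
        long start = System.nanoTime();             // start timer (the prototype uses the cycle counter)
        if (pickA) versionA.run(); else versionB.run();
        long elapsed = System.nanoTime() - start;   // stop timer at method exit
        (pickA ? timingsA : timingsB).add(elapsed); // record timing
    }
}
```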
Statistical Analysis
• Input: Two sets of method timings (Version A timings, Version B timings)
• Is A faster than B? How confident are we?
• Use standard statistical hypothesis testing (t-test)
• If low confidence, collect more timing data
• Output: A is faster (or slower) than B by X% with Y% confidence
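A sketch of the comparison step, assuming roughly normal timing distributions and a few dozen samples per version so that a fixed ~95% critical value is a reasonable stand-in for the full t-distribution. The real analysis reports an explicit confidence level and speedup percentage; this simplification only returns a three-way verdict.

```java
// Illustrative Welch-style t-test over two sets of timing samples.
final class TimingComparison {
    static double mean(double[] xs) {
        double s = 0;
        for (double x : xs) s += x;
        return s / xs.length;
    }

    static double variance(double[] xs, double mean) {
        double s = 0;
        for (double x : xs) s += (x - mean) * (x - mean);
        return s / (xs.length - 1);
    }

    /** Returns +1 if A is confidently faster, -1 if confidently slower, 0 if undecided. */
    static int compare(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b);
        double se = Math.sqrt(variance(a, meanA) / a.length + variance(b, meanB) / b.length);
        double t = (meanB - meanA) / se;   // positive when A's timings are lower (A is faster)
        double critical = 1.96;            // ~95% confidence, large-sample approximation
        if (t > critical)  return +1;
        if (t < -critical) return -1;
        return 0;                          // low confidence: collect more timing data
    }
}
```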
Time to Converge
• How long will it take to reach a confident conclusion?
  • Any speedup can be detected with enough timing data
• Time to converge depends on:
  • Variance in the timing data
    • Easy to detect a speedup if the method always does the same amount of work
  • Speedup due to the optimization
    • Easy to detect big speedups
• Fastest convergence for low-variance methods with high speedup
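As a rough rule of thumb (a standard two-sample sample-size formula, not a result from the paper), the number of samples needed per version to detect a mean timing difference δ when the per-sample standard deviation is σ grows with (σ/δ)²:

```latex
n \;\approx\; 2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}\left(\frac{\sigma}{\delta}\right)^{2}
```

Here the z terms are normal quantiles fixed by the desired false-positive rate α and detection power 1-β. Halving the speedup roughly quadruples the samples required, which is why low-variance, high-speedup methods converge fastest.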
Fixed Number of Samples
• Why not just collect 100 samples?
• Experiment: Try to detect an X% speedup with 100 samples
  • How often do the samples indicate a slowdown?
  • Each slowdown detected is a false positive
    • The samples do not accurately represent the population
Fixed Number of Samples
• The number of samples needed depends on the speedup
  • More speedup → fewer samples
• Fixed sampling is inefficient
  • Suppose we want to maintain a 5% false-positive rate
  • We could always collect 10k samples, but that wastes time
• The statistical approach collects only as many samples as needed to reach a confident conclusion
Prototype Implementation
• Prototype online performance auditing system implemented in IBM's J9 Java VM
• Currently audits a single optimization
  • Experiments use aggressive inlining
  • The infrastructure is not tied to aggressive inlining; it can evaluate any single optimization
• When a method reaches the highest optimization level:
  • Compile two versions of the method (with and without aggressive inlining), collect timing data, and run the statistical analysis
  • If aggressive inlining produces a quickly detectable speedup, use it; otherwise fall back to default inlining
  • A timeout occurs when a confident conclusion is not reached within 5 seconds
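A sketch of the per-method audit loop, reusing the illustrative TimingHarness and TimingComparison helpers above. Only the 5-second timeout and the fall-back-to-default policy come from the prototype description; everything else (the names, the 30-sample warm-up, driving invocations from a loop rather than from the running application) is a hypothetical simplification.

```java
// Illustrative audit loop. In the real system samples arrive as the running
// application happens to invoke the method; here a driver loop stands in.
final class PerformanceAuditor {
    enum Verdict { USE_AGGRESSIVE_INLINING, USE_DEFAULT_INLINING }

    static Verdict audit(Runnable aggressive, Runnable defaults) {
        TimingHarness harness = new TimingHarness();
        long deadline = System.nanoTime() + 5_000_000_000L;          // 5-second timeout
        while (System.nanoTime() < deadline) {
            harness.timedInvoke(aggressive, defaults);               // collect one more sample
            if (harness.timingsA.size() < 30 || harness.timingsB.size() < 30) {
                continue;                                            // too few samples to test yet
            }
            int verdict = TimingComparison.compare(
                    toArray(harness.timingsA), toArray(harness.timingsB));
            if (verdict > 0) return Verdict.USE_AGGRESSIVE_INLINING; // confident speedup
            if (verdict < 0) return Verdict.USE_DEFAULT_INLINING;    // confident slowdown
            // not confident yet: keep collecting timing data
        }
        return Verdict.USE_DEFAULT_INLINING;                         // timeout: stay conservative
    }

    private static double[] toArray(java.util.List<Long> samples) {
        double[] out = new double[samples.size()];
        for (int i = 0; i < samples.size(); i++) out[i] = samples.get(i);
        return out;
    }
}
```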
Timeouts
• Good news: few incorrect decisions
• Timeouts happen because only one timing sample is collected per method invocation
  • Most methods are not invoked frequently enough to converge before the timeout
• Future work: reduce timeouts by reducing convergence time
  • Collect multiple timings per invocation: use loop iteration times instead of invocation times
Future Work
• Audit multiple optimizations and settings
  • Search the optimization space online, as the program executes
  • The exponential search space is both a challenge and an opportunity
• Apply prior work in offline optimization space search
• Use the Performance Auditor to tune the optimization strategy for each method
Summary
• It is not easy to predict performance
  • Should I apply optimization X?
• Online Performance Auditing
  • Measure code performance as the program executes
• Detect slowdowns
  • Due to poor optimization decisions
• Find speedups
  • Use high-risk optimizations without long-term slowdown
  • Enable online optimization space search