The Garbage Collection Advantage: Improving Program Locality

Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT), J Eliot B Moss (UMass), Zhenlin Wang (MTU), Perry Cheng (IBM). Presented by Na Meng. Many thanks to the authors and to the anonymous speaker in the MM course last time.


Presentation Transcript


  1. The Garbage Collection Advantage: Improving Program Locality • Xianglong Huang (UT), Stephen M Blackburn (ANU), Kathryn S McKinley (UT), J Eliot B Moss (UMass), Zhenlin Wang (MTU), Perry Cheng (IBM) • Presented by Na Meng • Many thanks to the authors and to the anonymous speaker in the MM course last time

  2. Motivation • Memory gap problem • OO programs exacerbate the memory gap problem: automatic memory management, pointer data structures • Goal: improve OO program locality

  3. Opportunity • Copying garbage collector reorders objects at runtime

  4. Copying of Linked Objects [Figure: linked objects 1-7 and their order after copying; panel: Breadth First]

  5. Copying of Linked Objects [Figure: linked objects 1-7 and their order after copying; panels: Breadth First, Depth First]

  6. Copying of Linked Objects [Figure: linked objects 1-7 and their order after copying; panels: Breadth First, Depth First, Online Object Reordering]
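
The difference between the breadth-first and depth-first panels can be sketched in a few lines of Java. This is only an illustration: the object graph below is a hypothetical seven-node tree (1 points to 2 and 3, 2 to 4 and 5, 3 to 6 and 7), not the graph drawn on the slide, and the class and method names are made up for this sketch. Objects are modelled as integer ids, and the returned list stands for the order in which a copying collector would lay them out in to-space.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CopyOrderSketch {
    // Hypothetical heap: object id -> ids of the objects it references.
    static Map<Integer, List<Integer>> sampleHeap() {
        return Map.of(1, List.of(2, 3), 2, List.of(4, 5), 3, List.of(6, 7));
    }

    // Breadth-first order: a FIFO queue of objects waiting to be scanned.
    static List<Integer> breadthFirst(Map<Integer, List<Integer>> heap, int root) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> copied = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.addLast(root);
        copied.add(root);
        while (!queue.isEmpty()) {
            int obj = queue.removeFirst();
            order.add(obj);
            for (int child : heap.getOrDefault(obj, List.of())) {
                if (copied.add(child)) queue.addLast(child);
            }
        }
        return order;
    }

    // Depth-first (pre-order): a LIFO stack, children pushed in reverse
    // so that the first referent is copied right after its parent.
    static List<Integer> depthFirst(Map<Integer, List<Integer>> heap, int root) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> copied = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            int obj = stack.pop();
            if (!copied.add(obj)) continue; // already copied via another path
            order.add(obj);
            List<Integer> children = heap.getOrDefault(obj, List.of());
            for (int i = children.size() - 1; i >= 0; i--) stack.push(children.get(i));
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(sampleHeap(), 1)); // [1, 2, 3, 4, 5, 6, 7]
        System.out.println(depthFirst(sampleHeap(), 1));   // [1, 2, 4, 5, 3, 6, 7]
    }
}
```

In the breadth-first layout a parent and its grandchildren end up far apart, while depth-first keeps one chain of referents contiguous. Neither order looks at which fields the program actually uses, which is the gap the Online Object Reordering panel addresses.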

  7. Outline • Motivation • Online Object Reordering (OOR) • Methodology • Experimental Results • Conclusion

  8. Online Object Reordering • Where are the cache misses? • How to identify hot field accesses at runtime? • How to reorder the objects?

  9. Where Are The Cache Misses? • Heap structure [Figure, not to scale: VM Objects, Stack, Older Generation, Nursery]

  10. Where Are The Cache Misses?

  11. Where Are The Cache Misses? • Two opportunities to reorder objects in the older generation: promoting nursery objects, and full heap collection

  12. How to Find Hot Fields? • Runtime info (intercept every read)? • Compiler analysis? • Runtime information + compiler analysis • Key: low-overhead estimation

  13. Which Classes Need Reordering? • Step 1: Compiler analysis excludes cold basic blocks and identifies field accesses • Step 2: JIT adaptive sampling identifies hot methods and marks the field accesses in hot methods as hot

  14. Example: Compiler Analysis Method Foo { Class A a; try { …=a.b; … } catch(Exception e){ …a.c } } • Hot BB (the try body): compiler collects access info • Cold BB (the catch handler): compiler ignores it • Access List: 1. A.b 2. ….

  15. Example: Adaptive Sampling Method Foo { Class A a; try { …=a.b; … } catch(Exception e){ …a.c } } • Foo Accesses: 1. A.b 2. …. • Adaptive sampling: Foo is hot, so A.b is hot • [Figure: A's type information, listing fields b and c]
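
The two-step idea from the slides above (the compiler records field accesses outside cold basic blocks, and adaptive sampling later promotes the accesses of hot methods to hot fields) can be sketched as a small registry. This is a minimal sketch under stated assumptions, not Jikes RVM code: `HotFieldRegistrySketch`, `FieldAccess`, `recordCompiledMethod`, and `onMethodSampledHot` are hypothetical names invented here, and the "type information" is modelled as a plain map from class name to a set of hot field names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HotFieldRegistrySketch {
    // One field access as seen by a (hypothetical) compiler pass.
    static final class FieldAccess {
        final String className;
        final String fieldName;
        final boolean inColdBlock; // e.g. inside an exception handler
        FieldAccess(String className, String fieldName, boolean inColdBlock) {
            this.className = className;
            this.fieldName = fieldName;
            this.inColdBlock = inColdBlock;
        }
    }

    // Step 1 result: method name -> accesses found outside cold basic blocks.
    private final Map<String, List<FieldAccess>> accessList = new HashMap<>();
    // Step 2 result: class name -> names of fields currently marked hot.
    private final Map<String, Set<String>> hotFields = new HashMap<>();

    // Step 1: called at compile time; accesses in cold blocks are dropped.
    void recordCompiledMethod(String method, List<FieldAccess> accesses) {
        List<FieldAccess> kept = new ArrayList<>();
        for (FieldAccess a : accesses) {
            if (!a.inColdBlock) kept.add(a);
        }
        accessList.put(method, kept);
    }

    // Step 2: called when sampling decides the method is hot.
    void onMethodSampledHot(String method) {
        for (FieldAccess a : accessList.getOrDefault(method, List.of())) {
            hotFields.computeIfAbsent(a.className, k -> new HashSet<>()).add(a.fieldName);
        }
    }

    boolean isHot(String className, String fieldName) {
        return hotFields.getOrDefault(className, Set.of()).contains(fieldName);
    }

    public static void main(String[] args) {
        HotFieldRegistrySketch reg = new HotFieldRegistrySketch();
        // Mirrors the slide's Foo: a.b in the try body, a.c in the catch handler.
        reg.recordCompiledMethod("Foo", List.of(
                new FieldAccess("A", "b", false),
                new FieldAccess("A", "c", true)));
        reg.onMethodSampledHot("Foo");
        System.out.println("A.b hot: " + reg.isHot("A", "b"));
        System.out.println("A.c hot: " + reg.isHot("A", "c"));
    }
}
```

The point of splitting the work this way is cost: the access list is built once per compilation, and the sampling step only does a table lookup, so no individual read needs to be intercepted at run time.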

  16. Copying of Linked Objects [Figure: linked objects 1-7 copied under Online Object Reordering; labels: Type Information, Hot space, Cold space]
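
One way to read this slide is that the collector consults per-class hot-field information while it copies, so that referents of hot fields are placed next to their parent and the rest are deferred. The sketch below illustrates only that ordering idea and rests on several assumptions: hot fields are identified by field name alone, the hot and cold spaces of the figure are not modelled, and `HotFirstCopySketch`, `Obj`, and `copyOrder` are hypothetical names. It is not the authors' implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HotFirstCopySketch {
    // A toy heap object: an id plus named reference fields.
    static final class Obj {
        final int id;
        final Map<String, Obj> fields = new LinkedHashMap<>();
        Obj(int id) { this.id = id; }
    }

    // Returns object ids in the order they would be laid out in to-space.
    // Referents of hot fields jump to the front of the work list;
    // referents of cold fields wait at the back.
    static List<Integer> copyOrder(Obj root, Set<String> hotFieldNames) {
        List<Integer> toSpace = new ArrayList<>();
        Set<Obj> copied = new HashSet<>(); // Obj uses identity equals/hashCode
        Deque<Obj> work = new ArrayDeque<>();
        work.addLast(root);
        while (!work.isEmpty()) {
            Obj obj = work.removeFirst();
            if (!copied.add(obj)) continue;
            toSpace.add(obj.id);
            List<Obj> hot = new ArrayList<>();
            for (Map.Entry<String, Obj> f : obj.fields.entrySet()) {
                if (f.getValue() == null) continue;
                if (hotFieldNames.contains(f.getKey())) hot.add(f.getValue());
                else work.addLast(f.getValue());
            }
            for (int i = hot.size() - 1; i >= 0; i--) work.addFirst(hot.get(i));
        }
        return toSpace;
    }

    // Hypothetical linked list 1 -> 2 -> 4 via "next", with payloads 3 and 5.
    static Obj sampleList() {
        Obj n1 = new Obj(1), n2 = new Obj(2), p3 = new Obj(3), n4 = new Obj(4), p5 = new Obj(5);
        n1.fields.put("payload", p3);
        n1.fields.put("next", n2);
        n2.fields.put("payload", p5);
        n2.fields.put("next", n4);
        return n1;
    }

    public static void main(String[] args) {
        System.out.println(copyOrder(sampleList(), Set.of("next"))); // [1, 2, 4, 3, 5]
        System.out.println(copyOrder(sampleList(), Set.of()));       // [1, 3, 2, 5, 4]
    }
}
```

With `next` marked hot, the list nodes 1, 2, 4 become contiguous and the payloads follow; with no hot fields the same routine degrades to plain breadth-first order.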

  17. OOR System Overview [Figure: system diagram. Labels: Source Code; Baseline Compiler; Optimizing Compiler; Adds Entries; Access Info Database; Look Up; Adaptive Sampling; Hot Methods; Register Hot Field Accesses; Advice; GC: Copies Objects; Executing Code; Affects Locality; Improves Locality. Legend: OOR addition, Input/Output, JikesRVM component]

  18. Outline • Motivation • Online Object Reordering • Methodology • Experimental Results • Conclusion

  19. Virtual Machine • Jikes RVM: VM written in Java; high performance; timer-based adaptive sampling; dynamic optimization • Experiment setup: pseudo-adaptive; 2nd iteration [Eeckhout et al.]

  20. Memory Management • Memory Management Toolkit (MMTk): allocators and garbage collectors; multi-space heap (boot image, large object space (LOS), immortal space) • Experiment setup: generational copying GC with 4M bounded nursery

  21. Overhead: OOR Analysis Only

  22. Detailed Experiments • Separate application and GC time • Vary thresholds for method heat • Vary thresholds for cold basic blocks • Three architectures: x86, AMD, PowerPC • x86 performance counters: DL1, trace cache, L2, DTLB, ITLB

  23. Performance javac

  24. Performance db

  25. Performance jython • Is the improvement significant?

  26. Phase Changes

  27. Algorithm: Decay Field Heat • Example methods: m1(){ for(… …){ … … a.b = … } } m2(){ for(… …){ … … = a.c; } } • Example driver: for(… …){ m1(); //GC works m2(); //GC works } • Will the latest access pattern erase the earlier access pattern(s)?
      DECAY-HEAT(method)
        for each fieldAccess in method do
          if PotentiallyHot(fieldAccess) then
            hotField ← fieldAccess.field
            class ← hotField.instantiatingClass
            class.hasHotField ← true
            for each field in class do
              period ← Now() - class.lastUpdate
              decay ← HI/(HI + period)
              field.heat ← field.heat * decay
              if field.heat < LO then
                field.heat ← 0
            hotField.heat ← HI
            class.lastUpdate ← Now()
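
To make the decay rule concrete, here is a rough Java rendering of the inner part of DECAY-HEAT for a single potentially hot field access. Several things are assumptions of this sketch rather than facts from the slide: the values chosen for `HI` and `LO`, the time unit, passing the clock in as a `now` parameter instead of calling `Now()`, and the names `DecayHeatSketch` and `ClassInfo`.

```java
import java.util.HashMap;
import java.util.Map;

class DecayHeatSketch {
    static final double HI = 100.0; // assumed value; the slide only names the constant
    static final double LO = 1.0;   // assumed value; the slide only names the constant

    // Per-class bookkeeping, standing in for the slide's class type information.
    static final class ClassInfo {
        final Map<String, Double> heat = new HashMap<>();
        boolean hasHotField = false;
        double lastUpdate = 0.0;
    }

    // Applies one DECAY-HEAT step for a field access judged potentially hot.
    static void decayHeat(ClassInfo cls, String hotField, double now) {
        cls.hasHotField = true;
        double period = now - cls.lastUpdate;
        double decay = HI / (HI + period); // 1.0 when period is 0, smaller as time passes
        for (Map.Entry<String, Double> e : cls.heat.entrySet()) {
            double cooled = e.getValue() * decay;
            e.setValue(cooled < LO ? 0.0 : cooled); // fields below LO become cold
        }
        cls.heat.put(hotField, HI); // the field just seen is reset to full heat
        cls.lastUpdate = now;
    }

    public static void main(String[] args) {
        ClassInfo a = new ClassInfo();
        decayHeat(a, "b", 0.0);     // phase 1: m1 touches a.b
        decayHeat(a, "c", 100.0);   // phase 2: m2 touches a.c; b cools to half of HI
        System.out.println(a.heat);
        decayHeat(a, "c", 10000.0); // much later: b falls below LO and is zeroed
        System.out.println(a.heat);
    }
}
```

Under these assumed constants, a field that stops being accessed loses half its heat after a period equal to `HI` and is eventually reset to zero, so an old access pattern fades rather than being erased the moment a new one appears.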

  28. OOR without vs. with phase change • Almost all hot fields within an object are visited around the same time • The standard benchmarks have few, if any, traversal order phases.

  29. Copying Advantage (javac) • GenCopy vs. MS (mark-sweep) • Mutator time? GC time? Total time?

  30. A Possible Comparison GenCopy vs. GenOOR ?

  31. Discussion • Are there other ways to improve locality while doing copying collection?

  32. Questions? Thank you!
