Thread-specific Heaps for Multi-threaded Programs

Presentation Transcript (15 slides)
  1. Thread-specific Heaps for Multi-threaded Programs
  Bjarne Steensgaard, Microsoft Research

  2. GC and Threads
  Traditional approaches:
  • Pseudo-concurrency => no concurrency
  • Concurrent GC => synchronization overhead
  • Stop and GC => no concurrency during GC
  Observations leading to our approach:
  • Much data is used by only a single thread
  • When collecting data used by only a single thread, other threads can be ignored
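The first observation can be made concrete with a small hypothetical Java example (the `Worker` class and its fields are ours, not from the talk): the worker's scratch list is touched by exactly one thread, so a collector could reclaim it without synchronizing with other threads; only the scalar summary escapes to shared data.

```java
import java.util.ArrayList;
import java.util.List;

public class Worker implements Runnable {
    // Shared: read by the main thread after join().
    volatile long result;

    public void run() {
        // Thread-specific: this list is never seen by another thread,
        // so it could live in (and be collected from) a thread heap.
        List<Integer> scratch = new ArrayList<>();
        for (int i = 0; i < 1000; i++) scratch.add(i);
        long sum = 0;
        for (int v : scratch) sum += v;
        result = sum; // only the scalar summary is shared
    }

    public static void main(String[] args) throws InterruptedException {
        Worker w = new Worker();
        Thread t = new Thread(w);
        t.start();
        t.join();
        System.out.println(w.result); // 499500
    }
}
```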

  3. GC and Thread-specific Heaps
  Thread-specific heaps
  • Contain data accessed by only a single thread
  • Can be GC’ed independently of, and concurrently with, other thread-specific heaps (no pointers from the outside into these heaps)
  Shared heap
  • Contains data possibly shared among threads
  • GC’ed using one of the traditional approaches

  4. Advantages
  • Concurrent collection of thread heaps
  • Increased locality of GC
  • Reduced GC latency (shorter “stops”)
  • Reduced memory overhead for two-space copying components of the GC
    • “To”-space is needed only for heaps actively being copied; “from”-space can be released as copying of each heap is completed

  5. Enabling Thread-specific Heaps
  Memory requests must be specialized
  • Shared or thread-specific; choose conservatively
  • Must preserve the invariant that there are no pointers from shared data to thread-specific data
  Root-set division
  • May distinguish shared and thread-specific roots
  • Not necessary (and not implemented), but could reduce GC latency
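The conservative specialization decision can be sketched as follows; the `Heap` tag and `chooseHeap` helper are hypothetical illustrations, not Marmot's actual API. Any allocation site whose object might become reachable from shared data must allocate in the shared heap, which preserves the invariant that shared data never points into a thread-specific heap.

```java
public class Specialize {
    enum Heap { SHARED, THREAD_SPECIFIC }

    // The compiler would rewrite each `new` according to this choice.
    static Heap chooseHeap(boolean mayBeReachableFromSharedData) {
        // Conservative: any doubt forces the shared heap, so no shared
        // object can ever hold a pointer into a thread-specific heap.
        return mayBeReachableFromSharedData ? Heap.SHARED : Heap.THREAD_SPECIFIC;
    }

    public static void main(String[] args) {
        System.out.println(chooseHeap(true));   // SHARED
        System.out.println(chooseHeap(false));  // THREAD_SPECIFIC
    }
}
```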

  6. Compiler Support in Marmot
  Escape and access analysis
  • Interprocedural, flow-insensitive, context-sensitive
  • Polymorphic type inference (monomorphic recursion) for a non-standard type system
  • Tracks object flow and which threads access each object
  • Objects “escape” only when potentially accessed by multiple threads (as opposed to merely being visible to multiple threads)
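A minimal illustration of the escape distinction (the class and variable names are ours): `shared` is written by the spawned thread and read by `main`, so its allocation site must go to the shared heap; `local` is touched only by `main`, so under an access-based analysis it can live in `main`'s thread-specific heap even though a visibility-based analysis might have to be more conservative.

```java
public class EscapeDemo {
    // Accessed by two threads => escapes; must be allocated shared.
    static int[] shared = new int[1];

    public static void main(String[] args) throws InterruptedException {
        // Touched only by main => thread-specific under access-based analysis.
        int[] local = new int[]{41};
        Thread t = new Thread(() -> shared[0] = 1);
        t.start();
        t.join();
        local[0] += shared[0];
        System.out.println(local[0]); // 42
    }
}
```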

  7. Compiler Support in Marmot
  Method specialization
  • Duplicate methods as necessary to specialize memory requests according to analysis results (and to call other specialized methods)
  • Crucial for achieving a usable separation of objects into shared and thread-specific objects
  Very similar to Ruf’s PLDI ’00 work
  • The analysis and transformation stages are similar to Ruf’s work on removing synchronization operations

  8. Thread-specific GC in Marmot
  Prototype! Proof of concept
  • Modified two-generation copying GC
  • Each heap has two generations
  When a GC is triggered, all heaps are GC’ed
  • Reachable objects in the shared heap are copied first by a single thread
  • Threads then copy objects from their own heaps (helper threads stand in for blocked threads)
  • When a thread’s copying is complete, that thread is restarted
  • Minimal synchronization is needed for copying shared objects after the initial copy of shared objects
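The collection schedule above can be sketched as a toy model (our names, not Marmot's code) that captures only the ordering: one thread copies the shared heap first, then each thread heap is copied concurrently, and each worker could resume as soon as its own heap is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class CollectAll {
    interface Heap { void copyReachable(); }

    static void collect(Heap sharedHeap, List<Heap> threadHeaps)
            throws InterruptedException {
        // Phase 1: a single thread copies the shared heap first, so the
        // new locations of shared survivors are known before thread heaps run.
        sharedHeap.copyReachable();

        // Phase 2: thread heaps are collected concurrently; in Marmot each
        // mutator thread copies its own heap and restarts when done.
        List<Thread> workers = new ArrayList<>();
        for (Heap h : threadHeaps) {
            Thread w = new Thread(h::copyReachable);
            w.start();
            workers.add(w);
        }
        for (Thread w : workers) w.join();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger order = new AtomicInteger();
        int[] sharedAt = new int[1];
        Heap shared = () -> sharedAt[0] = order.incrementAndGet();
        List<Heap> heaps = new ArrayList<>();
        for (int i = 0; i < 3; i++) heaps.add(order::incrementAndGet);
        collect(shared, heaps);
        System.out.println(sharedAt[0]); // 1: shared heap was copied first
        System.out.println(order.get()); // 4: all four heaps were collected
    }
}
```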

  9. Example
  [Diagram: a shared root plus roots for threads 1–3; the legend distinguishes shared objects from thread-specific objects]

  10. Performance and Efficacy
  Performance
  • On par with the existing garbage collector for most programs, better for others
  Efficacy
  • Unknown! Most available programs do not use multi-threading for interesting purposes

  11. Efficacy Examples
  • VolanoMark (chat client/server) shares almost all long-lived data among threads
    • Client: allocates ½ MB thread-specific, 16 MB shared data; copies 4 KB thread-specific, 1.2 MB shared data
    • Server: allocates 5 MB thread-specific, 10 MB shared data; copies 5 KB thread-specific, 1.7 MB shared data
    • GC has improved locality, but otherwise little benefit
  • Mtrt benefits greatly, but is a poor benchmark
    • Allocates 27 MB thread-specific, ½ MB shared data; copies 6.5 MB thread-specific, 170 MB shared data

  12. Future Work
  • Variations on how to collect the heaps
  • Heaps for thread groups or groups of threads
  • Allowing non-followed pointers from shared objects to thread-specific objects
  • Allowing thread-specific objects in shared containers using programmer annotations

  13. Multi-layer Heap Division
  [Diagram: heaps A–F arranged in a partial order]
  Partially ordered heaps rather than per-thread heaps
  Completely ordered heaps
  • If very fine-grained, this approaches Tofte & Talpin’s “stack of regions” approach

  14. Other Heap Divisions
  User-defined divisions checked by the compiler
  • FX with regions
  Divisions according to major data structures
  • Example: a compiler could use different heaps for the program representation and for analysis results
  • Permits customizing the collector to the nature of the data structure
  • The IBM folks are experimenting with “memory contexts”

  15. Related Work
  • Andy King & Richard Jones, University of Kent
    • Static division into thread-specific heaps
  • Pat Caudill & Allen Wirfs-Brock, Instantiations, Inc. (makers of Jove)
    • Dynamic division into thread-specific heaps
    • Use a write barrier and copy-on-GC to deal with objects that really are shared among threads