Parallel Garbage Collection

Timmie Smith

CPSC 689

Spring 2002


Outline

  • Sequential Garbage Collection Methods

  • Multi-threaded Methods

  • Parallel Methods for Shared Memory

  • Parallel Methods for Distributed Memory


Motivation

  • Good software design requires it

    • Modular programming, and OO programming even more so, requires that components be independent

      • Explicit memory management requires modules to know what others are doing so they can deallocate objects safely.

      • This introduces bookkeeping that makes modules brittle, hard to reuse, and hard to extend

    • Garbage collection frees modules from worrying about memory management

      • Modules need no bookkeeping code

      • Reusability and extensibility are improved immediately

  • Memory leaks are avoided

Sequential Garbage Collection

  • Basic Collection Techniques

    • Reference Counting

    • Mark-Sweep

    • Mark-Compact

    • Copying

    • Non-Copying Implicit Collection

  • Incremental Tracing Techniques

  • Generational Techniques

Garbage Collection Abstraction

  • An object is live, and therefore not garbage, if it is in the root set or is reachable from a live object.

  • Collection is abstracted as two phases: garbage detection followed by reclamation.

    • Detection determines which objects are live.

      • Root Set – all global objects, local objects, and objects on the stack

      • Starting from the Root Set, iteratively add every object reachable from an object already found live, until nothing new is added

    • Collection frees any object that is not live.
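The two-phase abstraction above can be sketched as a reachability closure followed by a filter. This is a minimal illustration, assuming a toy heap model in which `refs` maps an object id to the ids it points to; the names `find_live`, `collect`, `heap`, and `refs` are hypothetical and do not come from the slides.

```python
from collections import deque

def find_live(roots, refs):
    """Detection: return the set of object ids reachable from `roots`.

    `refs` maps an object id to the ids it points to (hypothetical heap model).
    """
    live = set(roots)
    worklist = deque(roots)
    while worklist:                       # ends when no new object is added
        obj = worklist.popleft()
        for child in refs.get(obj, ()):
            if child not in live:
                live.add(child)
                worklist.append(child)
    return live

def collect(heap, roots, refs):
    """Collection: keep only the objects that detection found live."""
    live = find_live(roots, refs)
    return {obj for obj in heap if obj in live}

heap = {"a", "b", "c", "d", "e"}
refs = {"a": ["b"], "b": ["c"], "d": ["e"], "e": ["d"]}   # d and e form a cycle
survivors = collect(heap, roots=["a"], refs=refs)
```

Here `d` and `e` point to each other but are unreachable from the root, so both are treated as garbage.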

Reference Counting

  • Object headers store the number of references to the object

    • An object is collected as soon as there are no references to it

    • The operations needed to update counts on every pointer write make the technique expensive

  • Reference cycles limit effectiveness: objects in a cycle never reach a count of zero, so they are never reclaimed

  • Method can be incremental to limit program pauses

  • Overhead of the method is proportional to the work done by the program
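A minimal sketch of reference counting, assuming a toy object model; `Obj`, `inc`, `dec`, `write`, and `freed` are hypothetical names chosen for illustration. It shows the per-write cost of count updates and how a cycle escapes reclamation.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.count = 0        # reference count kept in the object header
        self.fields = {}      # outgoing pointers, by field name

freed = []                    # names of reclaimed objects

def inc(obj):
    if obj is not None:
        obj.count += 1

def dec(obj):
    if obj is None:
        return
    obj.count -= 1
    if obj.count == 0:        # reclaimed as soon as the last reference goes
        for child in obj.fields.values():
            dec(child)
        obj.fields.clear()
        freed.append(obj.name)

def write(obj, field, target):
    """Pointer store: every update pays for two count adjustments."""
    inc(target)               # increment first so self-assignment is safe
    dec(obj.fields.get(field))
    obj.fields[field] = target

root = Obj("root")
inc(root)                     # the program itself holds the root

# A chain root -> a -> b is reclaimed when the root pointer is cleared.
a, b = Obj("a"), Obj("b")
write(root, "x", a)
write(a, "next", b)
write(root, "x", None)        # reclaims both a and b

# A cycle c <-> d survives: each object keeps the other's count at 1.
c, d = Obj("c"), Obj("d")
write(root, "y", c)
write(c, "next", d)
write(d, "next", c)
write(root, "y", None)        # c and d are unreachable but never freed
```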

Mark-Sweep Collectors

  • Traces from the root set and marks all live objects, then sweeps heap to collect unmarked objects

    • Collected objects linked to free lists used by allocator

  • Disadvantages include fragmentation, cost of collection, and decrease of locality

    • Fragmentation is caused by live objects not being compacted

    • Cost of collection is proportional to the size of the heap, because the sweep visits every object

    • Spatial locality is lost as new objects are allocated in gaps among older objects
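The mark and sweep phases can be sketched as follows, assuming a heap modeled as a Python list of cells; `Cell`, `mark`, `sweep`, and `free_list` are hypothetical names. Note that the sweep loop touches every cell, live or not, and that survivors are left in place.

```python
class Cell:
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.free = False
        self.children = []

def mark(roots):
    """Trace from the root set and set the mark bit on every live cell."""
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if not cell.marked:
            cell.marked = True
            stack.extend(cell.children)

def sweep(heap, free_list):
    """Visit every cell in the heap; unmarked cells go onto the free list."""
    for cell in heap:
        if cell.marked:
            cell.marked = False          # reset for the next collection
        elif not cell.free:
            cell.children = []
            cell.free = True
            free_list.append(cell)       # handed back to the allocator

heap = [Cell(n) for n in "abcde"]
a, b, c, d, e = heap
a.children = [b]
c.children = [d]                         # c and d are unreachable
free_list = []
mark([a, e])
sweep(heap, free_list)
```

The freed cells `c` and `d` sit between the survivors `b` and `e`, which is the fragmentation the slide describes.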

Mark-Compact Collectors

  • Sweep phase of Mark-Sweep modified

    • Collected objects not linked to free list

    • Marked objects copied into contiguous memory

    • Pointer to end of contiguous space maintained for new allocation

  • Overhead of Sweep is not improved

    • Entire heap is still swept to find unreachable objects

    • Live objects must be visited several times

      • First pass relocates objects

      • Additional passes are required to update pointers

  • Mechanisms to handle pointers also add overhead

    • Lookup table is kept while objects are being relocated

    • Indirection through forwarding pointers is used if the program is not stopped
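A sketch of sliding compaction with a lookup table, assuming the heap is a Python list whose index stands in for an address. The pass order here (compute new addresses, update pointers, then move) is one common layout and is an assumption; it is not necessarily the order the slides have in mind. The name `compact` and the heap encoding are hypothetical.

```python
def compact(heap, marked):
    """Slide marked objects to the front of `heap`.

    `heap[addr]` is a list of the addresses that object points to, or None
    for an empty slot. `marked` is the set of live addresses from a mark
    phase. Returns the allocation pointer (first free address).
    """
    # Pass 1: build the lookup table of old address -> new address.
    forwarding = {}
    free = 0
    for addr in range(len(heap)):
        if addr in marked:
            forwarding[addr] = free
            free += 1
    # Pass 2: rewrite every pointer in a live object through the table.
    for addr in marked:
        heap[addr] = [forwarding[p] for p in heap[addr]]
    # Pass 3: move objects in address order so no live object is overwritten.
    for addr in sorted(marked):
        heap[forwarding[addr]] = heap[addr]
    for addr in range(free, len(heap)):
        heap[addr] = None
    return free

# Addresses 0, 2, 4 are live; 1 is empty; 3 is garbage.
heap = [[2], None, [4], [0], []]
alloc_ptr = compact(heap, marked={0, 2, 4})
```

After compaction the live objects occupy addresses 0 to 2 and new allocation can proceed from `alloc_ptr` by bumping a pointer.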

Copying Collectors

  • Heap is split into “from space” and “to space”

  • Collection is triggered when an object cannot be allocated in the current space

    • Program is stopped to avoid pointer inconsistencies

    • Live objects are copied from “from space” to “to space”, and the roles of the two spaces are then swapped

    • Forwarding pointers handle objects that are referenced more than once

    • Work is proportional to the number of live objects

  • Collection frequency is decreased by increasing the size of the memory spaces
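A sketch of a copying collection using a Cheney-style scan pointer. The slides do not name a specific algorithm, so this choice is an assumption, as are the names `Node` and `copy_collect`. Only reachable objects are visited, and the `forward` field ensures a shared object is copied once.

```python
class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.forward = None      # forwarding pointer, set once the node is copied

def copy_collect(roots):
    """Copy everything reachable from `roots` into a fresh to-space list."""
    to_space = []

    def forward(node):
        if node.forward is None:             # first visit: copy the object
            node.forward = Node(node.name, node.children)
            to_space.append(node.forward)
        return node.forward                  # later visits reuse the copy

    new_roots = [forward(r) for r in roots]
    scan = 0
    while scan < len(to_space):              # objects before `scan` are done
        copy = to_space[scan]
        copy.children = [forward(c) for c in copy.children]
        scan += 1
    return new_roots, to_space

shared = Node("s")
a = Node("a", [shared])
b = Node("b", [shared, a])
garbage = Node("g", [a])                     # nothing points to g
new_roots, to_space = copy_collect([b])
```

The garbage node is never touched, which is why the work is proportional to the number of live objects rather than to the heap size.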

Non-copying Collectors

  • The two spaces of a copying collector are treated as sets rather than memory regions

    • Tracing moves live objects to the second set

    • After tracing, objects left in the first set are garbage

  • Sets are implemented as linked lists, so objects change sets without being copied

  • Subject to the same locality and fragmentation issues as Mark-Sweep collectors

Incremental Tracing Collectors

  • Collection interleaved with program execution

    • No “Stop the World” pause in program execution.

    • Program can change reachability of objects while collector is running.

    • Program is referred to as the mutator.

  • Collector must be conservative to be correct

    • Restarting to collect all garbage created by mutator changes doesn’t help, since the mutator keeps making changes.

    • Some garbage is left “floating” until the next collection.

Tri-color marking system

  • Object traversal status kept by object coloring

    • Simple mark-sweep or copying collectors need only two colors because collection occurs while the mutator is paused.

    • Incremental approaches require a third color to handle changes in reachability.

      • Black – object is live and all children have been traversed

      • Grey – object is live, children have not yet been traversed

      • White – object not yet reached; objects still white when tracing ends are garbage

    • Mutator must coordinate with collector if a pointer to a white object is stored into a black object.
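The coloring scheme can be sketched as an incremental marker that scans one grey object per step. The class and method names (`Obj`, `IncrementalMarker`, `shade`, `step`) are hypothetical. The usage deliberately omits any mutator coordination, to show what goes wrong when a pointer to a white object is stored into a black object.

```python
WHITE, GREY, BLACK = "white", "grey", "black"

class Obj:
    def __init__(self, name):
        self.name = name
        self.color = WHITE
        self.children = []

class IncrementalMarker:
    def __init__(self, roots):
        self.grey = []                    # worklist of grey objects
        for r in roots:
            self.shade(r)

    def shade(self, obj):
        """White -> grey: known to be live, children not yet traversed."""
        if obj.color == WHITE:
            obj.color = GREY
            self.grey.append(obj)

    def step(self):
        """Scan one grey object, then hand control back to the mutator."""
        if not self.grey:
            return False                  # marking is finished
        obj = self.grey.pop()
        for child in obj.children:
            self.shade(child)
        obj.color = BLACK                 # all children are now grey or black
        return True

a, b, d = Obj("A"), Obj("B"), Obj("D")
a.children = [b]
b.children = [d]
marker = IncrementalMarker([a])
marker.step()                 # scans A: A is black, B is grey, D is white
a.children.append(d)          # mutator stores a pointer to white D in black A
b.children.remove(d)          # and removes the only other path to D
while marker.step():
    pass
# D is reachable from A but is still white, so a sweep would wrongly free it.
```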

Tri-color Marking Example



  • Mutator modifies A and B while garbage collector examines B’s descendants

  • Mutator must coordinate with garbage collector to prevent D being collected.







Mutator/Collector Coordination

  • Coordination must notify the collector when a pointer is overwritten.

    • Read Barrier – detects when the mutator accesses a pointer to a white object and immediately colors the object grey.

    • Write Barrier – traps mutator attempts to write a pointer into an object.

  • There are two different write barrier approaches.

Write Barrier Approaches

  • Snapshot-at-the-Beginning

    • Ensures a pointer to an object is not destroyed before the collector traverses it.

    • Pointers are saved before they are overwritten.

  • Incremental Update

    • When a pointer is written into a black object, the object is changed to grey and is rescanned before collection is completed.

    • No extra bookkeeping structure needed.
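The two write barrier approaches can be compared on one small scenario. This is a sketch under assumptions: the object model, the function names (`write_snapshot`, `write_incremental_update`, `write_unprotected`, `scenario`), and the three-object setup are all hypothetical, and the grey list stands in for the collector's own worklist.

```python
WHITE, GREY, BLACK = "white", "grey", "black"

class Obj:
    def __init__(self, name):
        self.name = name
        self.color = WHITE
        self.fields = {}

def shade(obj, grey):
    if obj is not None and obj.color == WHITE:
        obj.color = GREY
        grey.append(obj)

def write_snapshot(obj, field, target, grey):
    """Snapshot-at-the-beginning: save the old pointer before overwriting it."""
    shade(obj.fields.get(field), grey)
    obj.fields[field] = target

def write_incremental_update(obj, field, target, grey):
    """Incremental update: a black object that is written reverts to grey."""
    obj.fields[field] = target
    if obj.color == BLACK:
        obj.color = GREY
        grey.append(obj)                  # rescanned before marking finishes

def write_unprotected(obj, field, target, grey):
    obj.fields[field] = target            # no barrier, for comparison

def drain(grey):
    """Finish marking: scan grey objects until none remain."""
    while grey:
        obj = grey.pop()
        for child in obj.fields.values():
            shade(child, grey)
        obj.color = BLACK

def scenario(write):
    """A is black, B is grey and holds the only pointer to white D."""
    a, b, d = Obj("A"), Obj("B"), Obj("D")
    a.color, b.color = BLACK, GREY
    a.fields["x"] = b
    b.fields["x"] = d
    grey = [b]
    write(a, "y", d, grey)                # mutator copies the pointer into A
    write(b, "x", None, grey)             # and erases it from B
    drain(grey)
    return d.color
```

With either barrier `scenario` returns black for D; with `write_unprotected` D stays white even though A still points to it.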

Generational Collectors

  • Based on empirical evidence that most objects are short lived.

  • Heap space is split into generational spaces

    • Younger generation spaces are smaller and are collected more often

    • A space is collected when allocation in the space fails

  • Live objects found during collection of a generation are advanced to an older generation

    • Long-lived objects are copied fewer times than in a simple copying collector

    • Heuristics are used to determine when to advance objects to the next generation

Intergenerational References

  • Method must be able to collect one generation without collecting others

    • Pointers from older generations into younger generations

      • A table of pointers stored in older objects is used as part of the root set

      • The table is maintained with the write barrier technique used in incremental collectors

    • Pointers from younger generations into older generations

      • Either use a write barrier to trap all pointer assignments

      • Or use the live objects in all younger generations as part of the root set
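A sketch of a two-generation heap in which a write barrier records old-to-young pointers in a table that is then used as extra roots. The names `GenHeap`, `remembered`, and `minor_collect` are hypothetical, and promoting every survivor after one collection is a deliberately simplistic advancement heuristic, not a recommendation from the slides.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.gen = 0                      # 0 = young, 1 = old
        self.fields = {}

class GenHeap:
    def __init__(self):
        self.young = []
        self.old = []
        self.remembered = set()           # old objects that may point to young ones

    def alloc(self, name):
        obj = Obj(name)
        self.young.append(obj)
        return obj

    def write(self, obj, field, target):
        """Write barrier: record old-to-young pointers as they are created."""
        obj.fields[field] = target
        if obj.gen == 1 and target is not None and target.gen == 0:
            self.remembered.add(obj)

    def _young_targets(self, obj):
        return [t for t in obj.fields.values() if t is not None and t.gen == 0]

    def minor_collect(self, roots):
        """Collect only the young generation, without tracing the old one."""
        stack = [r for r in roots if r.gen == 0]
        for old_obj in self.remembered:   # the table supplies extra roots
            stack += self._young_targets(old_obj)
        live = set()
        while stack:
            obj = stack.pop()
            if obj not in live:
                live.add(obj)
                stack += self._young_targets(obj)
        for obj in self.young:
            if obj in live:
                obj.gen = 1               # promote every survivor (simplistic)
                self.old.append(obj)
        self.young = []
        self.remembered.clear()           # no young objects remain

h = GenHeap()
a = h.alloc("a")
h.minor_collect([a])                      # a survives and becomes old
b = h.alloc("b")
c = h.alloc("c")                          # never referenced
h.write(a, "x", b)                        # old -> young pointer is remembered
h.minor_collect([a])
```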

Multi-threaded Methods

  • Attempt to reduce pauses caused by “stopping the world” [2]

  • Garbage collector is a separate thread that runs concurrently with the application.

  • Coordination with the application is minimized

    • Tracing (marking) proceeds while the application is running

    • A page is marked dirty when the application modifies an object on it

    • Dirty pages are rescanned before collection completes
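The dirty-page scheme can be simulated sequentially. This is a sketch under assumptions: the `Heap` class, the fixed `PAGE_SIZE`, and the explicit `write` method that sets the dirty flag are all hypothetical stand-ins for whatever mechanism the real collector uses to learn which pages were modified.

```python
PAGE_SIZE = 4                                  # objects per page (arbitrary)

class Heap:
    def __init__(self, n):
        self.refs = [[] for _ in range(n)]     # refs[i] = addresses object i points to
        self.marked = [False] * n
        self.dirty = set()                     # pages written since marking began

    def write(self, addr, targets):
        """Mutator store: the page holding the object is flagged dirty."""
        self.refs[addr] = list(targets)
        self.dirty.add(addr // PAGE_SIZE)

    def mark_from(self, addrs):
        stack = list(addrs)
        while stack:
            a = stack.pop()
            if not self.marked[a]:
                self.marked[a] = True
                stack.extend(self.refs[a])

    def finish(self, roots):
        """Short pause: rescan roots and marked objects on dirty pages."""
        self.mark_from(roots)
        for a in range(len(self.marked)):
            if self.marked[a] and a // PAGE_SIZE in self.dirty:
                self.mark_from(self.refs[a])   # children may have changed
        self.dirty.clear()

heap = Heap(8)
heap.refs[0] = [1]
heap.mark_from([0])          # marking done while the application runs
heap.write(1, [5])           # application then stores a new pointer in object 1
heap.finish([0])             # rescan picks up object 5 via the dirty page
```

Only the final `finish` call needs the application stopped, and its cost depends on how many pages were dirtied rather than on the whole heap.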

Parallel Garbage Collection

  • Parallelization of sequential methods

    • Mark-and-Sweep

    • Reference Counting

  • Different issues in each environment

    • Shared variable access in shared memory systems

    • Disjoint address spaces in distributed memory systems

  • Scheduling in both environments involves stopping application threads during tracing.

    • Long pauses avoided by incremental collection

    • Improves performance in SPMD programs since application has frequent global synchronizations.

Shared Memory

  • Reference Counting

    • The reference count of an object may be updated by any processor

    • Locks on object headers limit scalability

  • Mark-Sweep

    • Each processor begins marking from a local root set and marks each object atomically

    • Poor scalability unless some mechanism for load balancing is implemented

      • Without it, a processor must mark all descendants of any object it marks

      • Work stealing allows load rebalancing and improved results

      • Splitting large objects also allows for better load balance.
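A sketch of parallel marking with work stealing, using Python threads. Everything here is an assumption made for illustration: the function `parallel_mark`, the single lock that stands in for an atomic test-and-set on the mark bit, and the `pending` counter used to detect termination. Under CPython's global interpreter lock this shows the structure only and gives no real speedup.

```python
import threading
from collections import deque

def parallel_mark(refs, local_roots):
    """Mark objects reachable from per-worker root sets.

    `refs[obj]` lists the objects `obj` points to; `local_roots[w]` is the
    root set of worker w. Returns the set of marked objects.
    """
    n_workers = len(local_roots)
    marked = set()
    lock = threading.Lock()                 # stands in for atomic marking
    deques = [deque() for _ in range(n_workers)]
    pending = [0]                           # objects pushed but not yet scanned

    def try_mark(obj, w):
        with lock:
            if obj not in marked:           # only one worker wins each object
                marked.add(obj)
                pending[0] += 1
                deques[w].append(obj)

    def take(w):
        with lock:
            if deques[w]:
                return deques[w].pop()      # own work: newest first
            for victim in deques:
                if victim:
                    return victim.popleft() # steal the oldest entry
            return None

    def worker(w):
        for r in local_roots[w]:
            try_mark(r, w)
        while True:
            obj = take(w)
            if obj is None:
                with lock:
                    if pending[0] == 0:
                        return              # nothing left to scan anywhere
                continue
            for child in refs[obj]:
                try_mark(child, w)
            with lock:
                pending[0] -= 1

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return marked

# One long chain rooted at worker 0, a short one at worker 1, none at worker 2.
refs = {i: [i + 1] for i in range(50)}
refs[50] = []
refs[100], refs[101] = [101], []
refs[200] = [200]                           # unreachable self-cycle
result = parallel_mark(refs, [[0], [100], []])
```

Without the stealing branch in `take`, worker 0 would mark the whole 51-object chain alone while the others sat idle, which is the imbalance the slide describes.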

Distributed Memory

  • Biggest challenge is representing cross-processor references.

    • Remote Processor – the pointer points to a stub entry that holds:

      • Processor id of the object owner

      • Complement of the remote object address

    • Local Processor – an entry table records all exported references

      • First export of an object reference enters the object in the table

      • An object in the table is never reclaimed without the cooperation of other processors

    • Fields of stub and entry table objects are the same

      • Flag – distinguishes type of object

      • Count – a count of the number of unreceived messages referencing the object.

Distributed Memory

  • Marking Phase

    • Each processor begins with its local root set and marks all reachable local objects

    • When local marking is complete, “mark messages” are sent to remote processors for each marked stub

      • Remote processor receives the message, adds the object to its mark stack, and continues local marking.

      • When local marking is complete and no more messages are received, the remote processor acknowledges the messages sent.

      • Marking is complete when the acknowledgement for the first message sent is received.
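The marking phase can be simulated in a single process. This sketch makes several simplifying assumptions: a stub is modeled as a `(owner pid, object name)` tuple, one shared queue stands in for the network, and the acknowledgement protocol is replaced by simply checking that the queue is empty. The names `Processor`, `mark_local`, `outgoing`, and `global_mark` are hypothetical.

```python
from collections import deque

class Processor:
    def __init__(self, pid, refs, roots):
        self.pid = pid
        self.refs = refs            # local name -> list of local names or stubs
        self.roots = roots
        self.marked = set()         # marked local objects
        self.marked_stubs = set()   # remote references reached by local marking
        self.sent = set()           # stubs already announced to their owners

    def mark_local(self, start):
        stack = list(start)
        while stack:
            ref = stack.pop()
            if isinstance(ref, tuple):          # a stub: (owner pid, remote name)
                self.marked_stubs.add(ref)
            elif ref not in self.marked:
                self.marked.add(ref)
                stack.extend(self.refs.get(ref, []))

    def outgoing(self):
        """Mark messages for stubs marked since the last call."""
        new = self.marked_stubs - self.sent
        self.sent |= new
        return sorted(new)

def global_mark(procs):
    """Run marking to completion; returns the number of mark messages."""
    network = deque()
    for p in procs.values():
        p.mark_local(p.roots)                   # local marking first
        network.extend(p.outgoing())
    messages = 0
    while network:                              # simplified termination test
        owner, name = network.popleft()
        messages += 1
        target = procs[owner]
        target.mark_local([name])               # continue local marking
        network.extend(target.outgoing())
    return messages

procs = {
    0: Processor(0, {"a": [(1, "x")], "b": [], "g": []}, roots=["a"]),
    1: Processor(1, {"x": ["y"], "y": [(0, "b")], "z": []}, roots=[]),
}
n_messages = global_mark(procs)
```

Object `b` on processor 0 is kept alive only through a reference held on processor 1, so it is marked by the second message rather than by local tracing.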

Distributed Memory

  • Collection Phase

    • Expand the heap

      • Processors are notified of the largest local heap at the end of each collection. A heap of size H is expanded while H < cM, where c < 1 and M is the max heap size.

    • Local collection occurs when the heap cannot be expanded.

    • Global collection occurs when local collection is insufficient.

      • Global collection allows entry tables to be cleared.

      • Infrequent global collections minimize impact of collector on application performance.
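The escalation from heap expansion to local and then global collection can be written as a small decision function. This is a sketch assuming that H is the current local heap size and that the two collections are supplied as callables reporting whether they freed enough space; the name `on_allocation_failure` and its parameters are hypothetical.

```python
def on_allocation_failure(h, m, c, local_gc, global_gc):
    """Decide how to satisfy an allocation that does not fit.

    h: current local heap size H; m: max heap size M; c: constant, c < 1.
    local_gc / global_gc: callables returning True if enough space was freed.
    """
    if h < c * m:
        return "expand"          # cheapest option: grow the local heap
    if local_gc():
        return "local"           # the heap cannot grow, collect locally
    global_gc()                  # local collection was insufficient
    return "global"
```

For example, with M = 100 and c = 0.5, a heap of size 10 is expanded, while a heap of size 60 falls through to local collection and reaches global collection only if the local one frees too little.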


Conclusions

  • Non-copying methods are the safest for languages where pointers are not identifiable

    • Fragmentation and loss of locality limit performance of these methods

    • Copying collectors are preferred in cases where memory is limited and pointers can be found

  • Parallel Garbage Collection can be based on parallelization of sequential methods.

    • Parallel collectors subject to same issues as their sequential counterparts

    • Parallel collectors also subject to synchronization and communication issues while maintaining references and performing collection.


References

[1] Hans Boehm and Mark Weiser. Garbage Collection in an Uncooperative Environment. Software Practice and Experience, September 1988.

[2] Hans-J. Boehm, Alan J. Demers, and Scott Shenker. Mostly Parallel Garbage Collection. Proceedings of the Conference on Programming Language Design and Implementation (PLDI), 1991.

[3] Hans-J. Boehm. Fast Multiprocessor Memory Allocation and Garbage Collection. External Technical Report HPL-2000-165, HP Labs, December 2000.

[4] David L. Detlefs, Al Dosser, and Benjamin Zorn. Memory Allocation Costs in Large C and C++ Programs. Technical Report CU-CS-665-93, University of Colorado - Boulder, 1993.

[5] John R. Ellis and David L. Detlefs. Safe, Efficient Garbage Collection for C++. Technical Report, Xerox Palo Alto Research Center, June 1993.

[6] Kenjiro Taura and Akinori Yonezawa. An Effective Garbage Collection Strategy for Parallel Programming Languages on Large Scale Distributed-Memory Machines. Proceedings of the Symposium on Principles and Practice of Parallel Programming (PPOPP), 1997.

[7] Paul R. Wilson. Uniprocessor Garbage Collection Techniques. Proceedings of the International Workshop on Memory Management (IWMM), 1992.

[8] Toshio Endo, Kenjiro Taura, and Akinori Yonezawa. A Scalable Mark-Sweep Garbage Collector on Large-Scale Shared-Memory Machines. Proceedings of High Performance Networking and Computing (SC97), November 1997.

[9] Hirotaka Yamamoto, Kenjiro Taura, and Akinori Yonezawa. Comparing Reference Counting and Global Mark-and-Sweep on Parallel Computers. Languages, Compilers, and Run-time Systems (LCR98), Lecture Notes in Computer Science (LNCS), volume 1511, pp. 205-218, May 1998.