1 / 50

Garbage collection

Garbage collection. David Walker CS 320. Where are we?. Last time: A survey of common garbage collection techniques Manual memory management Reference counting (Appel 13.2) Copying collection (Appel 13.3) Generational collection (Appel 13.4) Baker’s algorithm (Appel 13.6) Today:

Download Presentation

Garbage collection

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Garbage collection David Walker CS 320

  2. Where are we? • Last time: A survey of common garbage collection techniques • Manual memory management • Reference counting (Appel 13.2) • Copying collection (Appel 13.3) • Generational collection (Appel 13.4) • Baker’s algorithm (Appel 13.6) • Today: • Mark-sweep collection (Appel 13.1) • Conservative collection • Compiler interface (13.7)

  3. Recap: Copying GC • Organized with to-space & from-space • Cheny’s algorithm: • Traverse data breadth first, copying objects from from-space to to-space • next pointer: where to copy • scan pointer: what to copy next root next scan

  4. Recap: Copying GC • Characteristics: • Automatic compaction • Requires precise pointer information • supplied by compiler • Fast allocation • no searching for free block to allocate; just limit check and pointer bump • Only touch reachable data. Cost: • assume copying a block costs c instructions • assume R reachable blocks ==> R*c instructions per collection • assume heap has size H • number N of words allocated in between collections: H/2 – R • cost per word allocated: R*c / N • What does the cost look like as a function of R and H?

  5. Recap: Generational GC • Further split into new and old generations • minor collections (frequent) collect new generation • start from roots & remembered set and do copy collection • don’t copy objects in old generation space • major collections collect all generations • ordinary copy collection root new remembered set oldSize = 2*newSize old root

  6. Recap: Generational GC • Further split into new and old generations • minor collections (frequent) collect new generation • start from roots & remembered set and do copy collection • don’t copy objects in old generation space • major collections collect all generations • ordinary copy collection root new oldSize = 2*newSize old root

  7. Recap: Generational GC • Cost: • cost per word allocated: R*c / N where • copying a block costs c instructions (say c = 10) • R reachable blocks • heap has size H • N = H/2 – R • standard copying collection: • H = 4R implies 75% wasted space • cost = R*10 / (4*R/2 – R) = R*10/R = 10 instructions/allocation • generational GC: • assume young generation is 10% reachable; YH = 10 * R • cost = R*10/(10*R/2 – R) = 2.5 instructions/allocation • if 0.1 MB young generation data is reachable, 1 MB wasted space in young generation. • if multi-generation system has 50 MB total, 1 MB is not much • use larger R-H ratio in older generations • manipulating remembered set takes time too. • sometimes old generation is mark-sweep

  8. Mark-sweep • A two-phase algorithm • Mark phase: Depth first traversal of object graph from the roots to mark live data • Sweep phase: iterate over entire heap, adding the unmarked data back onto the free list

  9. Example r1 Free list In use On free list

  10. Example Mark Phase: mark nodes reachable from roots r1 Free list In use On free list Marked

  11. Example Mark Phase: mark nodes reachable from roots r1 Free list In use On free list Marked

  12. Example Mark Phase: mark nodes reachable from roots r1 Free list In use On free list Marked

  13. Example Sweep Phase: set up sweep pointer; begin sweep p Free list r1 In use On free list Marked

  14. Example Sweep Phase: add unmarked blocks to free list p Free list r1 In use On free list Marked

  15. Example Sweep Phase p Free list r1 In use On free list Marked

  16. Example Sweep Phase: retain & unmark marked blocks p Free list r1 In use On free list Marked

  17. Example Sweep Phase p Free list r1 In use On free list Marked

  18. Example Sweep Phase p Free list r1 In use On free list Marked

  19. Example Sweep Phase p Free list r1 In use On free list Marked

  20. Example Sweep Phase p Free list r1 In use On free list Marked

  21. Example Sweep Phase: GC complete when heap boundary encountered; resume program p Free list r1 In use On free list Marked

  22. Cost of Mark Sweep • Cost of mark phase: • O(R) where R is the # of reachable words • Assume cost is c1 * R (c1 may be 10 instr’s) • Cost of sweep phase: • O(H) where H is the # of words in entire heap • Assume cost is c2 * H (c2 may be 3 instr’s) • Analysis • Each collection returns H - R words • For every allocated word, we have GC cost: • ((c1 * R) + (c2 * H)) / (H - R) • R / H must be sufficiently small or GC cost is high • Eg: if R / H is larger than .5, increase heap size • Mark-sweep requires extra space like copying collection

  23. A Hidden Cost • Depth-first search is usually implemented as a recursive algorithm • Uses stack space proportional to the longest path in the graph of reachable objects • one activation record/node in the path • activation records are big • If the heap is one long linked list, the stack space used in the algorithm will be greater than the heap size!! • What do we do?

  24. A nifty trick • Deutsch-Schorr-Waite pointer reversal • Rather using a recursive algorithm, reuse the components of the graph you are traversing to build an explicit stack • This implementation trick only demands a few extra bits/block rather than an entire activation record/block • We already needed a few extra bits per block to hold the “mark” anyway

  25. DSW Algorithm back next …

  26. DSW Algorithm back back next next … …

  27. DSW Algorithm back back next next … … … back next

  28. DSW Algorithm back back next next … … … … back back next next

  29. DSW Algorithm back back next next … … … … back back next next • extra bits needed to keep track of which • record fields we have processed so far

  30. DSW Setup • Extra space required for sweep: • 1 bit/record to keep track of whether the record has been seen (the “mark bit”) • f log 2 bits/record where f is the number of fields in the record to keep track of how many fields have been processed • assume a field of each record x: x.done • Functions: • mark x = sets x’s mark bit • marked x = true if x’s mark bit is set • pointer x = true if x is a pointer • fields x = returns number of fields in the record x

  31. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) (* call dfs passing each root as next *) (* next is object being processed *) (* done[next] is field being processed *)

  32. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) back = nil; mark next; next.done = 0;

  33. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) y = next.i if (pointer y) & not (marked y) then next.i = back; back = next; next = y; mark next; next.done = 0; else next.done = i + 1 reuse field to store back ptr next … back

  34. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) y = next.i if (pointer y) & not (marked y) then next.i = back; back = next; next = y; mark next; next.done = 0; else next.done = i + 1 reuse field to store back ptr … back next

  35. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) y = next.i if (pointer y) & not (marked y) then next.i = back; back = next; next = y; mark next; next.done = 0; else next.done = i + 1 initialize for next iteration

  36. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) y = next.i if (pointer y) & not (marked y) then next.i = back; back = next; next = y; mark next; next.done = 0; else next.done = i + 1 field is done

  37. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) dfs complete temp = next; next = back; if next = nil then return; i = next.done; back = next.i; next.i = temp; next.done = i + 1;

  38. DSW Algorithm (* depth-first search in constant space *) fun dfs(next) = if (pointer next) & not (marked next) then (* initialization *) while true do i = next.done if i < (fields next) then (* process ith field *) else (* back-track to previous record *) y = next; next = back; if next = nil then return; i = next.done; back = next.i; next.i = y; next.done = i + 1; advance to next field

  39. More Mark-Sweep • Mark-sweep collectors can benefit from the tricks used to implement malloc/free efficiently • multiple free lists, one size of block/list • Mark-sweep can suffer from fragmentation • blocks not copied and compacted like in copying collection • Mark-sweep doesn’t require 2x live data size to operate • but if the ratio of live data to heap size is too large then performance suffers

  40. Conservative Collection • Even languages like C can benefit from GC • Boehm-Weiser-Demers conservative GC uses heuristics to determine which objects are pointers and which are integers without any language support • last 2 bits are non-zero => can’t be a pointer • integer is not in allocated heap range => can’t be a pointer • mark phase traverses all possible pointers • conservative because it may retain data that isn’t reachable • thinks an integer is actually a pointer • since it does not copy objects (thereby changing pointer values), mistaking integers for pointers does not hurt • all gc is conservative anyway so this is almost never an issue (despite what people say) • sound if your program doesn’t manufacture pointers from integers by, say, using xor (using normal pointer arithmetic is fine)

  41. Compiler Interface • The interface to the garbage collector involves two main parts • allocation code • languages can allocate up to approx 1 word/7 instructions • allocation code must be blazingly fast! • should be inlined and optimized to avoid call-return overhead • gc code • to call gc code, the program must identify the roots • to traverse data, heap layout must be specified somehow

  42. Allocation Code Assume size of record allocated is N: • Call alloc code • Test next + N < limit (call gc on failure) • Move next into function result • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc code • Move result into computationally useful place • Store useful values into M[next],....,M[next + N - 1]

  43. Allocation Code Assume size of record allocated is N: • Call alloc function • Test next + N < limit (call gc on failure) • Move next into function result • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc function • Move result into computationally useful place • Store useful values into M[next],....,M[next + N - 1] useful computation not alloc overhead

  44. Allocation Code Assume size of record allocated is N: • Call alloc function • Test next + N < limit (call gc on failure) • Move next into function result • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc function • Move result into computationally useful place • Store useful values into M[next],....,M[next + N - 1] inline alloc code

  45. Allocation Code Assume size of record allocated is N: • Call alloc function • Test next + N < limit (call gc on failure) • Move next into computationally useful place • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc function • Move next into computationally useful place • Store useful values into M[next],....,M[next + N - 1] combine moves

  46. Allocation Code Assume size of record allocated is N: • Call alloc function • Test next + N < limit (call gc on failure) • Move next into computationally useful place • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc function • Move next into computationally useful place • Store useful values into M[next],....,M[next + N - 1] eliminate useless store

  47. Allocation Code Assume size of record allocated is N: • Call alloc function • Test next + N < limit (call gc on failure) • Move next into computationally useful place • Clear M[next], ..., M[next + N – 1] • next = next + N • Return from alloc function • Move next into computationally useful place • Store useful values into M[next],....,M[next + N - 1] total overhead for allocation on the order of 3 instructions/alloc

  48. Calling GC code • To call the GC, program must: • identify the roots: • a GC-point, is an control-flow point where the garbage collector may be called • allocation point; function call • for any GC-point, compiler generates a pointer map that says which registers, stack locations in the current frame contain pointers • a global table maps GC-points (code addresses) to pointer maps • when program calls the GC, to find all roots: • GC scans down stack, one activation record at a time, looking up the current pointer map for that record

  49. Calling GC code • To call the GC, program must: • enable GC to determine data layout of all objects in the heap • for ML, Tiger, Pascal: • every record has a header with size and pointer info • for Java, Modula-3: • each object has an extra field that points to class definition • gc uses class definition to determine object layout including size and pointer info

  50. Summary • Garbage collectors are a complex and fascinating part of any modern language implementation • Different collection algs have pros/cons • explicit MM, reference counting, copying, generational, mark-sweep • all methods, including explicit MM have costs • optimizations make allocation fast, GC time, space and latency requirements acceptable • read Appel Chapter 13 and be able to analyze, compare and contrast different GC mechanisms

More Related