Mark and split
Mark and Split. Kostis Sagonas (Uppsala Univ., Sweden; NTUA, Greece) and Jesper Wilhelmsson (Uppsala Univ., Sweden).

Mark and Split

Kostis Sagonas

Uppsala Univ., Sweden

NTUA, Greece

Jesper Wilhelmsson

Uppsala Univ., Sweden


Copying vs. Mark-Sweep

Copying Collection

  + GC time proportional to the size of the live data set

  - Requires non-negligible additional space

  • Moves objects

  • Compacts the heap

Mark-Sweep Collection

  - GC time proportional to the size of the collected heap

  + Requires relatively little additional space

  • Non-moving collector

  • May require compaction


Mark-Sweep Collection Algorithm

proc mark_sweep_gc()
    foreach root ∈ rootset do mark(*root)
    sweep()

proc mark(object)
    if marked(object) = false
        marked(object) := true
        foreach pointer in object do
            mark(*pointer)
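The pseudocode above can be sketched in Python roughly as follows. This is only an illustration, not the authors' implementation: the `Obj` class, the list that stands in for the heap, and the explicit worklist (used instead of recursion) are all assumptions made for the example.

```python
class Obj:
    """Hypothetical heap object: a name, a mark bit, and outgoing pointers."""
    def __init__(self, name):
        self.name = name
        self.marked = False
        self.pointers = []

def mark(obj):
    # Iterative equivalent of the recursive mark() in the pseudocode.
    worklist = [obj]
    while worklist:
        current = worklist.pop()
        if not current.marked:
            current.marked = True
            worklist.extend(current.pointers)

def sweep(heap):
    # Visits every object in the heap, so its cost grows with heap size.
    survivors = [o for o in heap if o.marked]
    for o in survivors:
        o.marked = False  # reset mark bits for the next collection
    return survivors

def mark_sweep_gc(rootset, heap):
    for root in rootset:
        mark(root)
    return sweep(heap)

# Example: a -> b is reachable from the root set; c is garbage.
a, b, c = Obj("a"), Obj("b"), Obj("c")
a.pointers.append(b)
heap = mark_sweep_gc([a], [a, b, c])
print([o.name for o in heap])
```

Note that `sweep` touches all three objects even though only two are live, which is the O(H) term that the rest of the talk tries to avoid.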


Variants of Mark-Sweep

  • Lazy sweeping [Hughes 1982; Boehm 2000]

    • Defer the sweep phase until allocation time and then perform it in a demand-driven (“pay-as-you-go”) way

    • Improves paging and/or cache behavior

  • Selective sweeping [Chung, Moon, Ebcioğlu, Sahlin]

    • During marking, record the addresses of all marked objects in an array (outside the heap)

    • Once marking is finished, sort these addresses

    • Perform the sweep phase selectively guided by the sorted addresses
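The three selective-sweeping steps listed above might look like this in Python. It is a sketch under stated assumptions: the heap is modeled as the address range `[heap_start, heap_end)`, and each recorded entry is an `(address, size)` pair (the slide only mentions addresses; the size is assumed to be readable from the object). The function name `selective_sweep` is hypothetical.

```python
def selective_sweep(marked, heap_start, heap_end):
    """Return the free (start, end) gaps between marked objects."""
    free = []
    cursor = heap_start
    for address, size in sorted(marked):      # the sort step
        if address > cursor:
            free.append((cursor, address))    # gap before this live object
        cursor = address + size
    if cursor < heap_end:
        free.append((cursor, heap_end))       # trailing free space
    return free

# Entries appear in marking order, which is generally not address order.
marked = [(40, 8), (0, 16), (16, 8)]
print(selective_sweep(marked, 0, 64))
```

The sweep only walks the sorted array of live objects, so it never has to look at dead objects in the heap.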


Mark-Split Collection: Idea

Rather than (lazily/selectively) sweeping the heap after marking to locate free areas, maintain information about them during marking.

More specifically, optimistically assume that the entire heap will be free after collection and let the mark phase “repair” the free list by “rescuing” the memory of live objects.


Mark-Split Collection: Illustration

[Figure: animation over the heap to be collected. Interval counts, in slide order: one free interval; two free intervals; three free intervals; two free intervals; three free intervals. Annotations: marking splits a free interval; marking splits another free interval; marking does not always increase the number of free intervals; marking can actually decrease the number of free intervals.]


Mark-Split Collection: Algorithm (1)

proc mark_split_gc()
    insert_interval(heap_start, heap_end)
    foreach root ∈ rootset do mark(*root)

proc mark(object)
    if marked(object) = false
        marked(object) := true
        split(find_interval(&object), object)
        foreach pointer in object do
            mark(*pointer)


Mark-Split Collection: Algorithm (2)

proc split(interval, object)
    objectEnd := &object + size(object)
    keepLeft := keep_interval(&object - interval.start)
    keepRight := keep_interval(interval.end - objectEnd)
    if keepLeft ∧ keepRight
        insert_interval(objectEnd, interval.end)   // Case 1
        interval.end := &object
    else if keepLeft
        interval.end := &object                    // Case 2
    else if keepRight
        interval.start := objectEnd                // Case 3
    else
        remove_interval(interval.end)              // Case 4

funct keep_interval(size)
    return size ≥ T                                // T is a threshold
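To make the four cases concrete, here is a minimal Python sketch of mark-split over an address range. Several things are assumptions rather than part of the slides: a sorted list with `bisect` stands in for the search tree, `Obj` and the threshold value `T = 4` are made up, `keep_interval` is taken to be `size >= T`, and an object whose address falls in an already discarded fragment is simply skipped.

```python
import bisect

T = 4  # hypothetical threshold: free fragments smaller than T are dropped

class Obj:
    def __init__(self, address, size):
        self.address, self.size, self.pointers = address, size, []

class FreeIntervals:
    """Free intervals [start, end), kept sorted by start address."""
    def __init__(self):
        self.starts, self.ends = [], []

    def insert_interval(self, start, end):
        i = bisect.bisect_left(self.starts, start)
        self.starts.insert(i, start)
        self.ends.insert(i, end)

    def find_interval(self, address):
        i = bisect.bisect_right(self.starts, address) - 1
        if i >= 0 and address < self.ends[i]:
            return i
        return None  # address lies in a fragment that was discarded earlier

    def split(self, i, address, size):
        start, end = self.starts[i], self.ends[i]
        object_end = address + size
        keep_left = address - start >= T
        keep_right = end - object_end >= T
        if keep_left and keep_right:      # Case 1: one interval becomes two
            self.ends[i] = address
            self.insert_interval(object_end, end)
        elif keep_left:                   # Case 2: trim the right end
            self.ends[i] = address
        elif keep_right:                  # Case 3: trim the left end
            self.starts[i] = object_end
        else:                             # Case 4: nothing worth keeping
            del self.starts[i], self.ends[i]

def mark_split_gc(rootset, heap_start, heap_end):
    free = FreeIntervals()
    free.insert_interval(heap_start, heap_end)  # optimistic: all free
    marked = set()
    worklist = list(rootset)
    while worklist:
        obj = worklist.pop()
        if obj.address in marked:
            continue
        marked.add(obj.address)
        i = free.find_interval(obj.address)
        if i is not None:
            free.split(i, obj.address, obj.size)
        worklist.extend(obj.pointers)
    return list(zip(free.starts, free.ends))

# Four live objects in a 64-unit heap; a points to b and c, c points to d.
a, b, c, d = Obj(0, 8), Obj(8, 8), Obj(32, 8), Obj(42, 6)
a.pointers = [b, c]
c.pointers = [d]
print(mark_split_gc([a], 0, 64))
```

In this run the 2-unit gap between `c` and `d` is smaller than `T`, so it is dropped from the free list rather than tracked.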


Mark-Split Collection: Data Structure

For storing the free intervals we need a data structure that allows for:

  • Fast location of an interval (find_interval)

  • Fast insertion of new intervals (insert_interval)

Data structures with these properties are:

  • Balanced search trees

  • Splay trees

  • Skip lists

In our implementation we used the AA tree [Andersson 1993].


Mark-Split Collection: Best Cases

  • When nothing is live

  • When the live data set is a small percentage of the heap

  • When marking is consecutive


Mark-Split Collection: Worst Case

Note:

- the number of free intervals is at most #L + 1

- this number will start decreasing once L ≥ H/2
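A small worked example may help with the bound. The sketch below counts the free gaps left by a set of live objects, reading #L as the number of live objects (an interpretation, since the slide does not define it). The function name `count_free_intervals`, the `(address, size)` pairs, and the two layouts are made up for illustration; the threshold T is ignored here.

```python
def count_free_intervals(heap_size, live):
    """Count maximal free gaps; live is a list of (address, size) pairs."""
    count, cursor = 0, 0
    for address, size in sorted(live):
        if address > cursor:
            count += 1          # a gap lies before this live object
        cursor = address + size
    if cursor < heap_size:
        count += 1              # trailing gap at the end of the heap
    return count

# Scattered: 4 live objects of size 1, each with free space around it.
scattered = [(1, 1), (3, 1), (5, 1), (7, 1)]
# Clustered: the same 4 objects packed next to each other.
clustered = [(0, 1), (1, 1), (2, 1), (3, 1)]
print(count_free_intervals(16, scattered))
print(count_free_intervals(16, clustered))
```

The scattered layout reaches #L + 1 = 5 free intervals, while the clustered one leaves a single interval. In a heap of only 8 units the scattered layout already occupies half the heap, and marking any further object can only fill a gap.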


Time Complexity

Copying               O(L)
Mark-sweep            O(L) + O(H)
Selective sweeping    O(L) + O(L log L) + O(L)
Mark-split            O(L log I)

where:

L = size of live data set
H = size of heap
I = number of free intervals

Note:

I ≤ L ≤ H

I is bounded by

#L + 1    if L < H/2
H/(2o)    if L ≥ H/2, where o = size of smallest object


Space Requirements

                      Best       Worst
Copying               L          H
Mark-sweep            M          M
Selective sweeping    M + #L     M + #H
Mark-split            M + k      M + k(H/2o)

where:

L = size of live data set
H = size of heap
M = size of mark bit area
o = size of smallest object
k = size of interval node


Mark-Split vs. Selective Sweeping

Assume marking is consecutive

  • Mark-coalesce (the dual of mark-split)

    • Maintains information about occupied intervals

    • Can be seen as a variant of selective sweeping that eagerly merges neighboring marked intervals

    • Requires an extra pass at the end of collection to construct the free intervals list

  • Mark-split requires significantly less auxiliary space than selective sweeping


Mark-Split vs. Lazy Sweeping

  • Lazy sweeping does not affect the complexity of collection

  • But often improves the cache performance of applications run with GC because

    • It avoids (some) negative caching effects

      • Sweep phase disturbs the cache

    • Compared with “plain” mark-sweep, it has positive caching effects

      • The memory being allocated is typically already in the cache when the object is initialized
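As a rough illustration of the pay-as-you-go behavior, the sketch below defers sweeping to allocation time. The model is an assumption made for brevity: a heap of equal-sized blocks with one mark bit each, a class named `LazySweepHeap`, and a `collect` method that receives the live block indices directly instead of tracing them.

```python
class LazySweepHeap:
    """Hypothetical heap of equal-sized blocks with one mark bit per block."""
    def __init__(self, num_blocks):
        self.marked = [False] * num_blocks
        self.cursor = 0  # next block the lazy sweep will examine

    def collect(self, live_blocks):
        # Mark phase only; the sweep is deferred to allocation time.
        for block in live_blocks:
            self.marked[block] = True
        self.cursor = 0

    def allocate(self):
        # Sweep just far enough to find one free block.
        while self.cursor < len(self.marked):
            block = self.cursor
            self.cursor += 1
            if self.marked[block]:
                self.marked[block] = False  # clear the bit for the next cycle
            else:
                return block
        return None  # whole heap examined: a new collection is needed

heap = LazySweepHeap(6)
heap.collect({0, 1, 4})
print([heap.allocate() for _ in range(4)])
```

Each call examines only the blocks up to the next free one, so the block handed out was touched just before the caller initializes it. The total sweeping work over a full cycle is unchanged, which matches the point that lazy sweeping does not affect the complexity of collection.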


Adaptive Schemes

  • Basic idea is simple:

    • Optimistically start with mark-split

    • If it is detected that the cost will be too high, revert to mark-sweep

  • Criteria for switching:

    • Auxiliary space is exhausted

    • Number of tree nodes visited is too big

    • Keep a record of prior history (last N collections)

  • Note that no single mark-split collection that reverts to mark-sweep can be faster than a mark-sweep only collection, but a sequence of adaptive collections can!
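One way to picture the switching criteria is as a small policy object consulted by the collector. Everything in this sketch is hypothetical: the class name `AdaptivePolicy`, the node-visit budget, the history length, and the rule "give up on mark-split once every remembered collection had to revert" are illustrative choices, not the scheme used in the implementation.

```python
from collections import deque

class AdaptivePolicy:
    """Hypothetical policy; the budget and history length are made-up knobs."""
    def __init__(self, node_visit_budget, history_len=3):
        self.node_visit_budget = node_visit_budget
        # One entry per past collection: True if mark-split ran to completion.
        self.history = deque(maxlen=history_len)

    def start_with_mark_split(self):
        # Be optimistic unless every remembered collection had to revert.
        return not self.history or any(self.history)

    def should_revert(self, nodes_visited, aux_space_left):
        # Two of the criteria: auxiliary space exhausted, too many tree visits.
        return aux_space_left <= 0 or nodes_visited > self.node_visit_budget

    def record(self, completed_with_mark_split):
        self.history.append(completed_with_mark_split)

policy = AdaptivePolicy(node_visit_budget=1000)
print(policy.start_with_mark_split())
print(policy.should_revert(nodes_visited=1500, aux_space_left=10))
for _ in range(3):
    policy.record(False)
print(policy.start_with_mark_split())
```

A usable policy would also need some way to retry mark-split after a run of mark-sweep collections; that is left out here.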


Implementation

  • Done in BEA’s JRockit

    • Mark-sweep collector has existed for quite a long time

    • Sweeps the heap by examining whole words of the bitmap array

  • Mark-split’s code is about 600 lines of C

    • The threshold T is set at 2KB (because of TLA)

  • Benchmarking environment:

    • 4-processor Intel Xeon 2GHz with hyper-threading

    • 512KB of cache, 8GB of RAM, running Linux

    • SPECjvm98 benchmarks run for 50 iterations


Performance Evaluation on SPECjvm98

[Figures: two sets of plots, each covering the benchmarks compress, jess, db, mtrt, javac, and jack.]

SPECjvm98 – GC times on a 128MB heap

SPECjvm98 – GC times on a 512MB heap

SPECjvm98 – GC times on a 2GB heap

Other Measurements (on SPECjvm98)

Performance Evaluation on SPECjbb

Concluding Remarks on Mark-Split

New non-moving garbage collection algorithm:

  • Based on a simple idea:

    • maintaining free intervals during marking, rather than sweeping the heap to find them

  • Makes GC cost proportional to the size of the live data set, not the size of the heap that is collected

  • Requires very little additional space

  • Exploits the fact that in most programs live data tends to form (large) neighborhoods
