
An Implementation of Mostly-Copying GC on Ruby VM

Tomoharu Ugawa

The University of Electro-Communications, Japan


Background (1/2)

  • Scripting languages are now used in a variety of settings

    • Before: only for tiny applications

      • Short lifetime

      • Runs with little memory

        ⇒ GC (Garbage Collection) was not important

    • Now: for servers as well, such as Rails applications

      • May have a long lifetime

      • May create a lot of objects

        ⇒ GC has a great impact on total performance


Background (2/2)

  • Ruby’s GC

    • Conservative Mark-Sweep GC ⇒ does not move objects

    • Once the heap has been expanded, it can hardly be shrunk

      • A heap area cannot be released unless it contains NO objects

      • Such lucky cases rarely happen

        Ex) Once a server uses a lot of memory for a heavy request, it keeps running with a large heap even after responding to the request.

[Figure: the initial heap, additional heap 1, and additional heap 2, with live objects marked]


Goal

  • Compact the heap so that Ruby can return unused memory to the OS.

  • Use Mostly-Copying GC

    • Modify the algorithm for Ruby

  • Minimize the changes to C-libraries


Agenda

  • Shrinking the heap using Mostly-Copying GC

  • Modified Mostly-Copying algorithm

  • Evaluation

  • Related work

  • Conclusion


Why does Ruby not move objects?

  • Moving an object requires updating every pointer to it

  • Ruby’s GC does not recognize all pointers to Ruby objects

    • In the C runtime stack

    • In regions allocated using “malloc” by C-libraries

      ⇒ Such pointers cannot be updated

[Figure: moving an object; legend: ambiguous pointer (blue arrow), ambiguous root (cloud mark), exact pointer]


Even so, we CAN move most objects

  • We can update pointers contained in Ruby objects

    • Objects referred to only from Ruby objects can be moved

  • Most objects are referred to only from Ruby objects

Most objects can be moved

This is the basic idea of the Mostly-Copying GC


Mostly-Copying GC [Bartlett ’88]

  • Objects referred to only by exact pointers ⇒ move them and update the referencing pointers

  • Objects referred to by ambiguous pointers (possibly in addition to exact ones) ⇒ do not move them
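
The rule above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the collector's code: the `Ref` struct, the `classify` method, and the object names are hypothetical, and symbols stand in for real pointers.

```ruby
# Hypothetical model: a reference is a target object plus the kind of
# pointer that refers to it (:exact or :ambiguous).
Ref = Struct.new(:target, :kind)

# Split objects into those that may be moved and those that must stay put.
def classify(object_ids, refs)
  pinned = refs.select { |r| r.kind == :ambiguous }.map(&:target).uniq
  { movable: object_ids - pinned, pinned: pinned }
end

refs = [
  Ref.new(:a, :exact),
  Ref.new(:b, :exact),
  Ref.new(:b, :ambiguous),  # e.g. a word on the C stack that may point to b
  Ref.new(:c, :exact)
]
result = classify([:a, :b, :c], refs)
```

A single ambiguous reference is enough to pin `:b`, even though an exact pointer refers to it as well.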


The heap of Mostly-Copying GC

  • Break the heap into equal-sized blocks

    • From-space of copying GC is a set of blocks

[Figure: heap blocks labeled From and To, reachable from the root]


Shrinking the heap

Free blocks are not contiguous in a mostly-copying collector

  • Release memory by the block

    • Block = hardware page

    • To release a block, do not access the block

      • Because such a block has no live objects, all we have to do is not allocate new objects in it

      • The virtual memory system automatically reuses the page frame assigned to the block

    • (optional) We can tell the OS that the page has no valid data

      • madvise system call (Linux)
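
As a rough illustration of releasing memory by the block, the sketch below finds the blocks that hold no live objects after a collection. It is a simulation only: `BLOCK_SLOTS` and `releasable_blocks` are hypothetical names, slots are plain integers, and the actual release (simply not touching the page, or an madvise call in the C implementation) is not modeled.

```ruby
BLOCK_SLOTS = 4   # hypothetical number of object slots per block

# Given the slot indexes that still hold live objects, list the blocks
# that hold none and could therefore be released.
def releasable_blocks(live_slots, block_count)
  used = live_slots.map { |slot| slot / BLOCK_SLOTS }.uniq
  (0...block_count).to_a - used
end

live = [0, 1, 9]                    # slots 0 and 1 are in block 0, slot 9 is in block 2
free = releasable_blocks(live, 4)   # blocks 1 and 3 hold no live objects
```

The free blocks in this example are not adjacent, which is why release works page by page rather than by truncating the end of the heap.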


C-libraries

  • C-libraries wrap “malloc”-ed data so that it can be handled as a Ruby object. A wrapper object has:

    • A pointer to “malloc”-ed area

    • A function that “marks” objects referred from the data

    • NO pointer updating interface

Treat all pointers from “malloc”-ed data as ambiguous pointers

traverse(data) {
    mark(data->p1);      /* mark the object that p1 refers to */
    mark_location(…);
}

[Figure: “malloc”-ed data holding the pointer p1]
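
The consequence of having a marking function but no pointer-updating interface can be modeled as follows. This is a hedged sketch, not Ruby's C API: `Wrapper`, its `traverse` method, and `pinned_by` are hypothetical stand-ins, and symbols stand in for object addresses.

```ruby
# Hypothetical model of a wrapper object around "malloc"-ed data.
Wrapper = Struct.new(:pointers) do
  # Counterpart of traverse(data): report every object the data refers to.
  def traverse
    pointers.each { |target| yield target }
  end
end

# The collector can learn which objects a wrapper refers to, but it has no
# way to rewrite the pointers, so every reported target is pinned.
def pinned_by(wrappers)
  pinned = []
  wrappers.each do |w|
    w.traverse { |target| pinned << target }
  end
  pinned.uniq
end

wrappers = [Wrapper.new([:p1, :p2]), Wrapper.new([:p2])]
pinned = pinned_by(wrappers)
```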


Mostly-Copying GC of Bartlett

  • Objects referred to only from exact pointers ⇒ copy them to to-space

  • Objects referred to from ambiguous pointers ⇒ move the containing block to to-space logically (this is called promotion)

  • The algorithm may encounter new ambiguous pointers during collection, and the object they point to may already have been copied.

    • Bartlett’s algorithm copies all objects, even those pointed to by ambiguous pointers.

    • Objects in promoted blocks are eventually written back from their copies.


Problem

  • Memory efficiency

    • Objects are copied even when they are referred to by ambiguous pointers

    • Garbage in promoted blocks is not collected


Modify the algorithm

  • Mark-Sweep GC before Copying

    • Mark: find the ambiguous roots

      • Objects referred to by ambiguous pointers are no longer copied

    • Sweep (promoted blocks only)

      • Each block has a free-list

        • All Ruby objects are 5 words ⇒ no (external) fragmentation occurs
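
A per-block free-list with fixed-size slots might look like the sketch below. It is a simplified model under stated assumptions: the `Block` class and its methods are hypothetical, slots are array indexes rather than 5-word cells, and marks are left in place after the sweep because the modified algorithm erases them in a later step.

```ruby
# Hypothetical model of one heap block with fixed-size slots.
class Block
  attr_reader :free_list

  def initialize(slot_count)
    @used = Array.new(slot_count, false)
    @marked = Array.new(slot_count, false)
    @free_list = (0...slot_count).to_a   # every slot starts out free
  end

  # Take any free slot; all objects have the same size, so any slot fits.
  def allocate
    slot = @free_list.shift
    raise "block is full" if slot.nil?
    @used[slot] = true
    slot
  end

  def mark(slot)
    @marked[slot] = true
  end

  # Return every used but unmarked slot to the free-list.
  def sweep
    @used.each_index do |slot|
      next unless @used[slot] && !@marked[slot]
      @used[slot] = false
      @free_list.push(slot)
    end
  end
end

block = Block.new(4)
a = block.allocate
b = block.allocate
c = block.allocate
block.mark(a)
block.mark(c)
block.sweep                          # b was not marked, so its slot is reclaimed
freed = block.free_list.include?(b)
```

Because every slot has the same size, a reclaimed slot can hold any later object, which is why sweeping a promoted block does not fragment it.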


Modified Algorithm (1/4)

  • Trace pointers from the root set

    • Mark all visited objects

    • Promote blocks containing objects referred to by ambiguous pointers

[Figure: heap reachable from the root; legend: promoted block (thick border), live mark]


Modified Algorithm (2/4)

  • Sweep promoted blocks

    • Collect objects that are not marked


Modified Algorithm (3/4)

  • Copying GC (using the promoted blocks as the root set)

    • Do not copy objects in promoted blocks


Modified Algorithm (4/4)

  • Scan the promoted blocks to erase the mark of each object

[Figure: heap after collection, reachable from the root, with three blocks labeled “free”]
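
The four steps can be put together in a small simulation. This is a sketch under several assumptions, not the implementation from the talk: `Obj` and `collect` are hypothetical names, object ids stand in for addresses (so pointer updating is implicit), ambiguous pointers come only from the root set, and `to_block` is assumed to be a fresh block number.

```ruby
Obj = Struct.new(:block, :refs, :marked)

# heap maps an id to an Obj. The input heap is modified by steps 1 and 2.
def collect(heap, exact_roots, ambiguous_roots, to_block)
  # Step 1: mark all reachable objects and promote the blocks that hold
  # objects referred to by ambiguous pointers.
  promoted = ambiguous_roots.map { |id| heap.fetch(id).block }.uniq
  work = exact_roots + ambiguous_roots
  until work.empty?
    obj = heap.fetch(work.pop)
    next if obj.marked
    obj.marked = true
    work.concat(obj.refs)
  end

  # Step 2: sweep promoted blocks only, dropping unmarked objects.
  heap.delete_if { |_, o| promoted.include?(o.block) && !o.marked }

  # Step 3: copying pass. Objects in promoted blocks stay in place and
  # serve as roots; everything else reachable is copied to to_block.
  new_heap = {}
  work = exact_roots + heap.select { |_, o| promoted.include?(o.block) }.keys
  until work.empty?
    id = work.pop
    next if new_heap.key?(id)
    obj = heap.fetch(id)
    new_heap[id] = promoted.include?(obj.block) ? obj : Obj.new(to_block, obj.refs, false)
    work.concat(obj.refs)
  end

  # Step 4: erase the marks left on objects in promoted blocks.
  new_heap.each_value { |o| o.marked = false if promoted.include?(o.block) }
  [new_heap, promoted]
end

heap = {
  a: Obj.new(0, [:b], false),  # referred to by an ambiguous root
  g: Obj.new(0, [], false),    # garbage in the block that gets promoted
  b: Obj.new(1, [:c], false),
  c: Obj.new(2, [], false),
  h: Obj.new(1, [], false),    # garbage in an ordinary block
  d: Obj.new(2, [], false)
}
new_heap, promoted = collect(heap, [:d], [:a], 3)
```

In this example block 0 is promoted and keeps `:a`, the garbage `:g` is swept from it, and `:b`, `:c`, and `:d` are copied to block 3, which leaves blocks 1 and 2 without live objects.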


The only change to C-libraries

  • Mark-array

    • An array that holds the same pointers as the “malloc”-ed data

    • The C-library marks only the mark-array

    • The collector can traverse further

    • But it cannot recognize that they are ambiguous pointers

      • Remember: all pointers from “malloc”-ed data are treated as ambiguous ones

  • Impact

    • 2 modules

    • 3 parts

Change the C-libraries so that THEY scan the mark-array as ambiguous roots


Evaluation

  • Ruby VM

    • YARV r590 (this is old, but it has essentially the same GC as Ruby 1.9)

  • Items

    • Heap size

    • Elapsed time

  • Environment

    • CPU: Pentium 3GHz

    • OS: Linux 2.6.22

    • Compiler: gcc 4.1.3 (-O2)


Benchmark Program

  2.times {
    ary = Array.new
    # Increase live objects (processing a heavy request)
    10000.times { |i|
      ary[i] = Array.new
      (1..100).each { |j|
        ary[i][j-1] = 1.to_f / j.to_f
      }
      # CP() is a checkpoint that profiles the heap every 100 iterations
      # (its definition is not shown on the slide)
      if (i % 100 == 0) then CP() end
    }
    # Decrease live objects (end of the heavy request)
    10000.times { |i|
      ary[i] = nil
      if (i % 100 == 0) then CP() end
    }
    # Make short-lived objects (a series of ordinary requests)
    30000.times { |i|
      100.times { "" }
      if (i % 100 == 0) then CP() end
    }
  }
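
The slides do not show how CP() is implemented. One possible stand-in, assuming CRuby, records slot counts from `ObjectSpace.count_objects` at each checkpoint. The `SAMPLES` constant and the recorded fields are hypothetical, and the sketch counts object slots rather than the heap size in MB reported in the talk.

```ruby
# Hypothetical stand-in for the CP() checkpoint used by the benchmark.
SAMPLES = []

def CP
  counts = ObjectSpace.count_objects   # CRuby-specific: includes :TOTAL and :FREE
  SAMPLES << {
    total_slots: counts[:TOTAL],
    live_slots: counts[:TOTAL] - counts[:FREE]
  }
end

CP()   # the parentheses are required because the method name is capitalized
```

Plotting `total_slots` against `live_slots` over the checkpoints would give a rough analogue of the heap-size and live-object curves.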


Heap size

[Figure: heap size (MB) at each checkpoint for the traditional VM and our VM; the black line shows the amount of live objects]


Relative elapsed time of our VM (relative to traditional VM)

[Figure: relative elapsed time (%)]

Average (except for thread): 102%


Related work

  • Customizable Memory Management Framework [Attardi et al. ’94]

    • Collects garbage by sweeping promoted blocks

    • Ambiguous pointers are discovered during copying

      • If an object has already been copied when the collector recognizes that it should not be copied, its copy becomes garbage

      • Our algorithm detects such objects before copying


Related work

  • MCC [Smith et al. ’98]

    • Pins objects referred to from ambiguous roots

    • Always manages the locations of ambiguous roots in a list

      • C-libraries have to register/unregister ambiguous roots each time they “malloc”/“free”

      • Our algorithm finds ambiguous roots by tracing at the beginning of GC


Related work

  • Ruby 1.9

    • Reduces the size of each additional heap to 16KB (i.e., the heap is expanded in 16KB blocks)

    • Increases the opportunity for releasing memory

      • Objects become distributed all over the heap as execution advances

    • We compact the heap


Conclusion

  • Implemented mostly-copying GC on Ruby VM

    • Modified the algorithm for memory efficiency

  • Evaluated the implementation

    • Shrank the heap after those phases of a program in which it temporarily uses a lot of memory

    • Elapsed time to execute benchmarks is comparable to that of the traditional VM


Heap size (with Ruby 1.9)

[Figure: heap size (MB) at each checkpoint for Ruby 1.9, YARV, and our VM; the black line shows the amount of live objects. Annotation: the heap size increases as time passes (even with Ruby 1.9)]


Benchmark Program 2

  2.times {
    ary = Array.new
    10000.times { |i|
      ary[i] = Array.new
      (1..100).each { |j|
        ary[i][j-1] = 1.to_f / j.to_f
      }
      if (i % 100 == 0) then CP() end
    }
    10000.times { |i|
      ary[i] = nil
      if (i % 100 == 0) then CP() end
    }
    30000.times { |i|
      100.times { "" }
      if (i % 100 == 0) then CP() end
    }
  }

Make some long-lifetime objects during the decreasing phase:

  sum = 0
  ary[i].each { |x| sum += x }
  ary[i] = sum


Heap size (benchmark 2)

[Figure: heap size (MB) at each checkpoint for Ruby 1.9, YARV, and our VM]


Relative elapsed time of the VM with Bartlett’s algorithm (relative to traditional VM)

[Figure: relative elapsed time (%)]


Related work

  • Generational GC for Ruby [Kiyama ’01]

    • Generational Mark-Sweep GC

      • Reduced GC time

      • Uses a lot of memory

        • Every object has two extra words (a doubly linked list) to represent generations

    • Mostly-Copying GC can divide the space into generations [Bartlett et al. ’89]

