Garbage Collection Introduction and Overview

Christian Schulte

Programming Systems Lab

Universität des Saarlandes, Germany

[email protected]


    Purpose of Talk

    • Explaining basic

      • concepts

      • terminology

        (never to be explained again)

    • Garbage collection…

      • …is simple

      • …can be explained at a high level

    • Organization


    Overview

    • What is garbage collection

      • objects of interest

      • principal notions

      • classic examples with assumptions and properties

    • Discussion

      • software engineering issues

      • typical cost

      • areas of usage

      • why knowledge is profitable

    • Organizational

      • Material

      • Requirements

    Garbage Collection…

    …is concerned with the automatic reclamation of dynamically allocated memory after its last use by a program


    Garbage collection…

    • Dynamically allocated memory

    • Last use by a program

    • Examples for automatic reclamation


    Kinds of Memory Allocation

    static int i;                   /* static allocation */

    void foo(void) {
      int j;                        /* automatic allocation (on stack) */
      int* p = (int*) malloc(…);    /* dynamic allocation (on heap) */
    }


    Static Allocation

    • By compiler (in the static data area)

    • Available through entire runtime

    • Fixed size

    static int i;


    Automatic Allocation

    • Upon procedure call (on stack)

    • Available during execution of call

    • Fixed size

    int j;    /* local variable of foo */


    Dynamic Allocation

    • Allocated at runtime (on heap)

    • Available until explicitly deallocated

    • Dynamically varying size

    int* p = (int*) malloc(…);    /* the block p points to; p itself is automatic */
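The three lifetimes can be observed in a small program. The following is a minimal sketch, not part of the original slides; the names `calls`, `make_squares`, and `call_count` are made up for illustration.

```c
#include <stdlib.h>

static int calls;               /* static: one instance for the entire run */

/* Returns a heap-allocated array holding 0*0 .. (n-1)*(n-1), or NULL if
   allocation fails. The caller must free() the result. */
int *make_squares(int n) {
    int local = 0;                            /* automatic: fresh on every call */
    int *p = malloc((size_t)n * sizeof *p);   /* dynamic: lives until freed */
    if (p == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        p[i] = i * i;
    local++;                    /* always 1 here: the value does not persist */
    calls += local;             /* persists across calls */
    return p;
}

int call_count(void) { return calls; }
```

The array returned by `make_squares` stays valid after the call returns, whereas `local` disappears with the stack frame and `calls` keeps accumulating.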


    Dynamically Allocated Memory

    • Also: heap-allocated memory

    • Allocation: malloc, new, …

      • before first usage

    • Deallocation: free, delete, dispose, …

      • after last usage

    • Needed for

      • C++, Java: objects

      • SML: datatypes, procedures

      • anything that outlives procedure call
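As an illustration of "allocate before first usage, deallocate after last usage", here is a small sketch of a list cell in C. The type `cell` and the functions `cons`, `sum`, and `free_list` are hypothetical names chosen for this example.

```c
#include <stdlib.h>

typedef struct cell {
    int head;
    struct cell *tail;
} cell;

/* Allocation before first usage: the new cell outlives this call. */
cell *cons(int head, cell *tail) {
    cell *c = malloc(sizeof *c);
    if (c == NULL)
        abort();                /* keep the sketch simple: give up when out of memory */
    c->head = head;
    c->tail = tail;
    return c;
}

/* Usage: add up all elements of the list. */
int sum(const cell *c) {
    int s = 0;
    for (; c != NULL; c = c->tail)
        s += c->head;
    return s;
}

/* Deallocation after last usage; returns the number of cells freed. */
int free_list(cell *c) {
    int n = 0;
    while (c != NULL) {
        cell *next = c->tail;   /* read the link before freeing the cell */
        free(c);
        c = next;
        n++;
    }
    return n;
}
```

Calling `free_list` is only correct if no other pointer to any of the cells is still in use, which is exactly the decision the programmer has to get right by hand.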


    Getting it Wrong

    • Forget to free (memory leak)

      • program eventually runs out of memory

      • long-running programs: OSs, servers, …

    • Free too early (dangling pointer)

      • lucky: illegal access detected by OS

      • horror: memory reused, in simultaneous use

        • programs can behave arbitrarily

        • crashes might happen much later

    • Estimates of effort

      • Up to 40% of development time! [Rovner, 1985]


    Nodes and Pointers

    [Figure: a pointer p linking to a node n]

    • Node n

      • Memory block, cell

    • Pointer p

      • Link to node

      • Node access: *p

    • Children children(n)

      • set of pointers to nodes referred to by n
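One possible C rendering of these notions, assuming a fixed maximum number of child fields per node. The names `node`, `MAX_CHILDREN`, and `child_count` are illustrative and not taken from the slides.

```c
#include <stddef.h>

#define MAX_CHILDREN 2

/* A node is a memory block; its child fields hold pointers to other nodes. */
typedef struct node {
    int value;
    struct node *child[MAX_CHILDREN];   /* NULL means "no pointer" */
} node;

/* Number of pointers in children(n). */
int child_count(const node *n) {
    int k = 0;
    for (int i = 0; i < MAX_CHILDREN; i++)
        if (n->child[i] != NULL)
            k++;
    return k;
}
```

With `node *p` pointing at a node, `*p` (or `p->value`, `p->child[i]`) is the node access written as `*p` above.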



    Mutator

    • Abstraction of program

      • introduces new nodes, each with a pointer to it

      • redirects pointers, creating garbage


    Shared Nodes

    • Nodes referred to by several pointers

    • Makes manual deallocation hard

      • local decision impossible

      • respect other pointers to node

    • Cycles are an instance of sharing


    Last Use by a Program

    • Question: When is node M no longer used by the program?

      • Let P be any program not using M

      • New program sketch:

        Execute P; Use M;

      • Hence:

        M used ⇔ P terminates

      • We are doomed: halting problem!

    • So “last use” is undecidable!


    Safe Approximation

    • Decidable and also simple

    • What does safe mean?

      • only unused nodes are freed

    • What does approximation mean?

      • some unused nodes might not be freed

    • Idea

      • treat nodes that can be accessed by the mutator as used


    Reachable Nodes

    • Reachable from root set

      • processor registers

      • static variables

      • automatic variables (stack)

    • Reachable from reachable nodes


    Summary: Reachable Nodes

    • A node n is reachable iff

      • n is an element of the root set, or

      • n is an element of children(m) and m is reachable

    • A reachable node is also called “live”
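The inductive definition translates almost directly into code. The sketch below assumes nodes with a fixed number of child fields and a `reachable` flag per node; `mark_reachable` and `reachable_from` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2

typedef struct node {
    struct node *child[MAX_CHILDREN];
    bool reachable;
} node;

/* Flags n and every node reachable from n.
   Returns the number of nodes newly flagged. */
int mark_reachable(node *n) {
    if (n == NULL || n->reachable)
        return 0;               /* the flag check also stops at cycles */
    n->reachable = true;
    int count = 1;
    for (int i = 0; i < MAX_CHILDREN; i++)
        count += mark_reachable(n->child[i]);   /* children of a reachable node */
    return count;
}

/* Flags everything reachable from the given root set. */
int reachable_from(node *roots[], size_t nroots) {
    int count = 0;
    for (size_t i = 0; i < nroots; i++)
        count += mark_reachable(roots[i]);
    return count;
}
```

Note that a node pointing to a live node is not live itself: only pointers leading from the roots count.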


    MyGarbageCollector

    • Compute set of reachable nodes

    • Free nodes known to be unreachable

    • Known as mark-sweep

      • in a second…


    Reachability: Safe Approximation

    • Safe

      • access to an unreachable node is impossible

      • depends on language semantics

      • but C/C++? later…

    • Approximation

      • a reachable node might never be accessed

      • programmer must know about this!

      • were you aware of this?


    Example Garbage Collectors

    • Mark-Sweep

    • Others (skipped here; read Chapters 1 and 2 of [Jones & Lins, 1996])

      • Mark-Compact

      • Reference Counting

      • Copying


    The Mark-Sweep Collector

    • Compute reachable nodes: Mark

      • tracing garbage collector

    • Free unreachable nodes: Sweep

    • Run when out of memory: Allocation

    • First used with LISP [McCarthy, 1960]


    Allocation

    node* new() {
      if (free_pool is empty)
        mark_sweep();       /* collect only when the free pool is exhausted */
      return allocate();    /* take a node from the free pool */
    }


    The Garbage Collector

    void mark_sweep() {
      for (r in roots)
        mark(r);            /* afterwards: all live nodes marked */
      …
    }
    Recursive Marking

    void mark(node* n) {
      if (!is_marked(n)) {
        set_mark(n);
        for (m in children(n))
          mark(m);          /* i-th recursion: nodes on path with length i marked */
      }
    }
    /* afterwards: nodes reachable from n marked */


    The Garbage Collector

    void mark_sweep() {
      for (r in roots)
        mark(r);
      sweep();              /* afterwards: all nodes on heap live and not marked */
      …
    }

    Eager Sweep

    void sweep() {
      node* n = heap_bottom;
      while (n < heap_top) {
        if (is_marked(n)) clear_mark(n);    /* live: reset mark for the next collection */
        else free(n);                       /* garbage: return node to the free pool */
        /* step by the node size in bytes; n += sizeof(*n) would scale the offset twice */
        n = (node*) ((char*) n + sizeof(*n));
      }
    }


    The Garbage Collector

    void mark_sweep() {
      for (r in roots)
        mark(r);
      sweep();
      if (free_pool is empty)
        abort("Memory exhausted");    /* everything is live: nothing was reclaimed */
    }
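The pieces above can be put together into a compilable sketch. It assumes fixed-size nodes in a static array as the heap and a linked free pool. The names `gc_init`, `add_root`, `new_node`, `free_count`, and the `in_use` flag are inventions of this example, and `new_node` returns NULL instead of aborting when memory is exhausted.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_CHILDREN 2
#define MAX_ROOTS 4

typedef struct node {
    bool marked;
    bool in_use;                        /* false: node sits in the free pool */
    struct node *child[MAX_CHILDREN];
    struct node *next_free;             /* link inside the free pool */
} node;

static node heap[HEAP_SIZE];            /* contiguous heap of fixed-size nodes */
static node *free_pool;
static node **roots[MAX_ROOTS];         /* addresses of the mutator's root pointers */
static size_t nroots;

void gc_init(void) {
    free_pool = NULL;
    nroots = 0;
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        heap[i].marked = false;
        heap[i].in_use = false;
        heap[i].next_free = free_pool;
        free_pool = &heap[i];
    }
}

void add_root(node **r) {
    if (nroots < MAX_ROOTS)
        roots[nroots++] = r;
}

static void mark(node *n) {
    if (n == NULL || n->marked)
        return;
    n->marked = true;
    for (int i = 0; i < MAX_CHILDREN; i++)
        mark(n->child[i]);
}

static void sweep(void) {
    for (node *n = heap; n < heap + HEAP_SIZE; n++) {
        if (n->marked) {
            n->marked = false;          /* live: clear mark for the next collection */
        } else if (n->in_use) {
            n->in_use = false;          /* garbage: return to the free pool */
            n->next_free = free_pool;
            free_pool = n;
        }
    }
}

void mark_sweep(void) {
    for (size_t i = 0; i < nroots; i++)
        mark(*roots[i]);
    sweep();
}

/* Allocates a node, collecting first if the free pool is empty.
   Returns NULL if every node is live. */
node *new_node(void) {
    if (free_pool == NULL)
        mark_sweep();
    if (free_pool == NULL)
        return NULL;
    node *n = free_pool;
    free_pool = n->next_free;
    n->in_use = true;
    n->marked = false;
    for (int i = 0; i < MAX_CHILDREN; i++)
        n->child[i] = NULL;
    return n;
}

size_t free_count(void) {
    size_t k = 0;
    for (node *n = free_pool; n != NULL; n = n->next_free)
        k++;
    return k;
}
```

A mutator registers its root pointers with `add_root` and must link each new node into a rooted structure before the next allocation, since a node that no root can reach is reclaimed by the next collection.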


    Assumptions

    • Nodes can be marked

    • Size of nodes known

    • Heap contiguous

    • Memory for recursion available

    • Child fields known!
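The "memory for recursion" assumption can be made explicit by replacing the call stack with a bounded mark stack. This is a sketch of one common variation, not the method from the slides; `mark_iterative` and `STACK_SIZE` are assumed names, and a real collector would need a recovery strategy when the stack overflows.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define STACK_SIZE 64

typedef struct node {
    bool marked;
    struct node *child[MAX_CHILDREN];
} node;

/* Marks all nodes reachable from n using an explicit stack.
   Returns false if the stack overflows; marking is then incomplete. */
bool mark_iterative(node *n) {
    node *stack[STACK_SIZE];
    size_t top = 0;
    if (n == NULL || n->marked)
        return true;
    n->marked = true;
    stack[top++] = n;
    while (top > 0) {
        node *cur = stack[--top];
        for (int i = 0; i < MAX_CHILDREN; i++) {
            node *c = cur->child[i];
            if (c != NULL && !c->marked) {
                if (top == STACK_SIZE)
                    return false;
                c->marked = true;       /* mark on push: each node is pushed at most once */
                stack[top++] = c;
            }
        }
    }
    return true;
}
```

For a long singly linked chain the explicit stack never holds more than one entry, whereas the recursive version needs one call frame per node.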


    Assumptions: Realistic

    • Nodes can be marked

    • Size of nodes known

    • Heap contiguous

    • Memory for recursion available

    • Child fields known


    Assumptions: Conservative

    • Nodes can be marked

    • Size of nodes known

    • Heap contiguous

    • Memory for recursion available

    • Child fields known


    Mark-Sweep Properties

    • Covers cycles and sharing

    • Time depends on

      • live nodes (mark)

      • live and garbage nodes (sweep)

    • Computation must be stopped

      • non-interruptible stop/start collector

      • long pause

    • Nodes remain unchanged (as not moved)

    • Heap remains fragmented


    Variations of Mark-Sweep

    • In your talk…


    Implementation

    • In your talk…


    Efficiency Analysis

    • In your talk…


    Comparison

    • In your talk…


    Application

    • In your talk…


    Software Engineering Issues

    • Design goal in SE:

      • decompose systems

      • into orthogonal components

    • Clashes with letting each component do its own memory management

      • liveness is a global property

      • leads to “local leaks”

      • lacks the power of modern GC methods


    Typical Cost

    • Early systems (LISP)

      • up to 40% of runtime [Steele, 75] [Gabriel, 85]

      • “garbage collection is expensive” myth

    • Well-engineered systems of today

      • 10% of entire runtime [Wilson, 94]


    Areas of Usage

    • Programming languages and systems

      • Java, C#, Smalltalk, …

      • SML, Lisp, Scheme, Prolog, …

      • Modula 3, Microsoft .NET

    • Extensions

      • C, C++ (Conservative)

    • Other systems

      • Adobe Photoshop

      • Unix filesystem

      • Many others in [Wilson, 1996]


    Understanding Garbage Collection: Benefits

    • Programming garbage collection

      • programming systems

      • operating systems

    • Understand systems with garbage collection (e.g. Java)

      • memory requirements of programs

      • performance aspects of programs

      • interfacing with garbage collection (finalization)


    Material

    • Richard Jones and Rafael Lins. Garbage Collection. John Wiley & Sons, 1996.

    • Paul R. Wilson. Uniprocessor garbage collection techniques. ACM Computing Surveys. To appear.

      • Extended version of IWMM 92, St. Malo.


    Organization

    • Requirements

      • Talk

        • duration 45 min (excluding discussion)

      • Attendance

        • including discussion

      • Written summary

        • 10 pages

        • to be submitted as PDF by Mar 31st, 2002

    • Schedule

      • weekly

      • starting Nov 14th, 2001

      • next on Dec 5th, 2001


    Topics For You!

    • The classical methods

      • Copying 1. [Brunklaus, Guido Tack]

      • Mark-Sweep 2. [Schulte, Hagen Böhm]

      • Mark-Compact 3. [Schulte, Jens Regenberg]

      • Reference Counting 6. [Brunklaus, Regis Newo]

    • Advanced

      • Generational 4. [Brunklaus, Mirko Jerrentrup]

      • Conservative (C/C++) 5. [Schulte, Stephan Lesch]

      • Incremental & Concurrent 7. [Brunklaus, Uwe Kern]

    Invariants

    • Only nodes with rc zero are freed

    • RC always positive

