


Assessing the Scalability of Garbage Collectors on Many Cores
(Funded by ANR projects: Prose and ConcoRDanT)

Lokesh Gidra, Gaël Thomas, Julien Sopena, Marc Shapiro

Regal-LIP6/INRIA


Introduction

Why?

  • Managed runtime environments (MREs) are ubiquitous!

  • The GC is a vital component of an MRE → is its performance critical?

  • Hardware is increasingly multi-resourced (many cores, many memory nodes).

  • Do GCs scale with such hardware?

  • Current solutions have not been evaluated on true many-core machines!

What?

  • An empirical assessment of GC scalability.

  • Possible factors affecting GC scalability.


Multi-Node Architecture

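[Figure: two nodes, each with cores C0, C1, ... C5, L2 caches, an L3 cache, a memory controller (MC) and local DRAM, plus links to other nodes. Numeric labels in the figure: 15, 40, 125, 315.]

Remote access >> Local access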

Our machine has 8 nodes with 6 cores each


Parallel Copying Garbage Collection

[Figure: timeline with mutator threads and GC threads. Application time alternates with pause time, and the two add up to total time. The legend shows live and dead objects in From Space and To Space.]
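The copying step named on this slide can be illustrated with a minimal, single-threaded sketch. It is not the collector measured in the talk: the `Obj` class and the `copy_collect` function are hypothetical names, and a parallel collector would run this kind of loop on several GC threads at once. Live objects (those reachable from the roots) are copied from From Space to To Space, and dead objects are simply never copied.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """A heap object: a name plus references to other objects."""
    name: str
    refs: list = field(default_factory=list)


def copy_collect(roots):
    """Copy every object reachable from `roots` into a fresh to-space.

    Returns (new_roots, to_space). Unreachable (dead) objects are never
    visited, so they are left behind in from-space.
    """
    forwarding = {}   # id(old object) -> its copy in to-space
    to_space = []     # also serves as the queue of objects still to scan

    def forward(obj):
        # Copy an object the first time it is seen; reuse the copy afterwards.
        if id(obj) not in forwarding:
            clone = Obj(obj.name, list(obj.refs))  # refs still point to from-space
            forwarding[id(obj)] = clone
            to_space.append(clone)
        return forwarding[id(obj)]

    new_roots = [forward(r) for r in roots]

    scan = 0
    while scan < len(to_space):   # the scan index chases the end of to-space
        obj = to_space[scan]
        obj.refs = [forward(child) for child in obj.refs]
        scan += 1

    return new_roots, to_space


# Example heap: a -> b, a -> c, c -> a (a cycle); d is unreachable.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs = [b, c]
c.refs = [a]
new_roots, to_space = copy_collect([a])
```

In this example `to_space` ends up holding copies of `a`, `b` and `c` only. Every reference inside the copies points at another copy, so From Space can be reused as a whole.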


GC's Effect on Application Scalability (Lusearch)

Mutator threads = GC threads = number of cores (varied)

  • Up to 6 cores:

    • 3× performance improvement.

  • More than 6 cores:

    • No improvement in total time.

    • The proportion of pause time increases, up to 50%.


GC Scalability (Lusearch)

Mutator threads = cores = 48, with a varying number of GC threads

Pause time increases with the number of GC threads → negative scalability!


1. Remote Scanning

[Figure: GC threads GC0 to GC3 running on Node 0 to Node 3 and scanning live and dead objects in From Space and To Space. Object allocation is random (the default).]

87.7% of scans were remote!
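As a rough sanity check on the remote-scan figure, the following sketch estimates how often a scan would be remote under a deliberately simple model. The model is an assumption, not something the slides state: each object lives on a uniformly random node and is scanned by a GC thread on an independently chosen, uniformly random node. The function names are hypothetical.

```python
import random


def expected_remote_fraction(num_nodes):
    """Closed form under the uniform-placement assumption: (N - 1) / N."""
    return (num_nodes - 1) / num_nodes


def simulated_remote_fraction(num_nodes, num_objects, seed=0):
    """Monte Carlo estimate of the share of scans that hit another node."""
    rng = random.Random(seed)
    remote = 0
    for _ in range(num_objects):
        object_node = rng.randrange(num_nodes)    # where the object lives
        scanner_node = rng.randrange(num_nodes)   # where the GC thread runs
        if object_node != scanner_node:
            remote += 1
    return remote / num_objects


# For the 8-node machine described earlier in the talk:
print(expected_remote_fraction(8))            # 0.875
print(simulated_remote_fraction(8, 100_000))
```

Under this assumed model an 8-node machine would see 7/8 = 87.5% remote scans, which is close to the measured value. The closeness is consistent with placement being effectively random with respect to the scanning thread, but the model does not prove that this is the cause.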


2. Remote Copying

[Figure: GC threads GC0 to GC3 running on Node 0 to Node 3 and copying live objects from From Space to To Space. Dead objects are also shown.]

82.7% of copies were remote!


3. Load Balancing

  • Based on the work-stealing technique.

  • One task queue per GC thread.

[Figure: a task queue. The owner pushes and pops; other GC threads steal (pop). Shared variable: size (the task queue size).]

  • Highly unbalanced load:

    • Requires a lot of stealing.

    • Threads keep stealing until all tasks are done.

  • Performance impact: at least 2 to 4 cache misses per steal!

  • Disabling it improves pause time by 33.3%!
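The queue discipline described above can be sketched as a small single-threaded simulation. The `GCWorker` class and the `run` function are hypothetical names, the round-robin scheduling is an assumption made so the sketch stays deterministic, and the sketch models neither synchronization nor the cache misses mentioned on the slide.

```python
from collections import deque


class GCWorker:
    """One GC thread with its own task queue (illustrative model only)."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.tasks = deque()
        self.done = 0      # tasks this worker has processed
        self.steals = 0    # successful steals from other workers

    def push(self, task):
        # Owner adds work at the tail of its own queue.
        self.tasks.append(task)

    def pop(self):
        # Owner takes work from the tail of its own queue.
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        # A thief takes work from the head of the victim's queue.
        return victim.tasks.popleft() if victim.tasks else None


def run(workers):
    """Round-robin simulation: each worker handles at most one task per turn."""
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.pop()
            if task is None:
                # Own queue is empty: look for a victim that still has work.
                for victim in workers:
                    if victim is not w:
                        task = w.steal_from(victim)
                        if task is not None:
                            w.steals += 1
                            break
            if task is not None:
                w.done += 1


# A highly unbalanced start: all 12 tasks sit in worker 0's queue.
workers = [GCWorker(i) for i in range(4)]
for task in range(12):
    workers[0].push(task)
run(workers)
```

With every task starting on one queue, three of the four workers obtain all of their work by stealing. That is the situation in which a per-steal cost, such as the cache misses noted above, adds up.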


Conclusion

  • GC does affect the application’s scalability → it matters!

  • GC doesn’t scale with the hardware!

  • Bottlenecks:

    • Remote Scanning

    • Remote Copying

    • Load Balancing

  • Future Work:

    • Fix the bottlenecks → does that help the GC scale?


DaCapo Benchmarks’ Scalability


Revisiting App. (Lusearch) Scalability…

