Buffered dynamic run-time profiling of arbitrary data for Virtual Machines which employ interpreter and Just-In-Time (JIT) compiler

Compiler workshop ’08. Nikola Grcevski, IBM Canada Lab.




Agenda

  • The motivation and the importance of profiling

  • Design and implementation of J9 VM interpreter profiler

  • Performance results and start-up overhead


The static vs. dynamic compiler

  • Static compilers can take their time to analyze the code and perform intraprocedural analysis

  • Dynamic Just-In-Time compilers don’t have this luxury because compilation happens during application run time

  • Can dynamic compilers ever produce quality optimized code comparable to static compilers?


Why profile?

  • The whole category of speculative optimizations relies on some type of profiling information

  • Opens up opportunities for new code and memory optimizations

  • Critical for high-performance dynamic compiler systems


What could we profile?

  • Pretty much anything that we expect will provide repeatable information that we can use to optimize

  • The profiling can be at the Java level or CPU level if the OS supports it.


What kinds of profilers does J9 have?

  • JIT profiler

    • Instruments methods with various profiling hooks

    • Targeted only to methods that are very hot

    • Temporary, and slows down execution while it is active

  • Interpreter profiler

    • The topic of this presentation


What kinds of data do we collect with the interpreter profiler?

  • Branch direction

  • Virtual/Interface call targets

  • Switch statement index

  • Instanceof and checkcast runtime types


Interpreter profiler design

  • Buffered approach to data collection on the application threads

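[Figure: Application Thread 1 through Application Thread N, each filling a profiling buffer as it interprets bytecodes; labels shown: div, vcall, if, icall, mul, add, vcall, if, if, if, switch, …]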


Interpreter profiler design

  • Buffer full event triggers processing of the data by the JIT

[Figure: a buffer full event on Application Thread 1 passes its buffer (if, vcall, if, switch, if, …) to the JIT runtime]
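
The buffered collection described above can be sketched roughly as follows. This is only an illustration, not the J9 implementation: the class name `ThreadProfilingBuffer`, the `capacity` parameter and the `on_full` callback are all hypothetical names chosen for the example.

```python
class ThreadProfilingBuffer:
    """Per-thread buffer sketch; all names here are hypothetical."""

    def __init__(self, capacity, on_full):
        self.capacity = capacity  # number of records that fit in the buffer
        self.on_full = on_full    # stands in for the JIT runtime's processing hook
        self.records = []

    def add(self, pc, data):
        # The interpreter appends a record for each profiled bytecode.
        self.records.append((pc, data))
        if len(self.records) >= self.capacity:
            # Buffer full event: hand the batch over and start a fresh buffer.
            full, self.records = self.records, []
            self.on_full(full)


processed = []
buf = ThreadProfilingBuffer(capacity=3, on_full=processed.append)
for i in range(7):
    buf.add(0x1000 + i, 1)
```

Because each thread appends only to its own buffer, the interpreter's fast path does not touch shared state until the buffer fills.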


Interpreter profiler design

  • JIT parses the application thread profiling buffer and builds internal profiling data structure

[Figure: the JIT runtime reads each record (bytecode program counter plus data) from the profiling buffer and stores it in the JIT profiling hashtable, using a hash function based on the bytecode PC]


What’s in the data we collect?

  • Bytecode program counter

  • Variable size data packet

    • 1 byte for branch direction

    • Word size for call targets and runtime types

    • 4 bytes for switch index
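
As a rough illustration of variable-size packets, the sketch below writes and parses records with the payload sizes listed above. Several details are assumptions made only for the example: an 8-byte PC and 8-byte word, little-endian layout, and an explicit 1-byte kind tag (the slides do not say how the parser determines the packet type).

```python
import struct

# Hypothetical record kinds; the tag byte is an assumption of this sketch.
BRANCH, CALL, SWITCH = 0, 1, 2

HEADER = "<QB"  # 8-byte bytecode PC followed by a 1-byte kind tag
PAYLOAD = {
    BRANCH: "<B",  # 1 byte for branch direction
    CALL: "<Q",    # one word (assumed 8 bytes) for a call target or runtime type
    SWITCH: "<I",  # 4 bytes for the switch index
}


def append_record(buf, pc, kind, value):
    """Append one variable-size record to a bytearray."""
    buf += struct.pack(HEADER, pc, kind)
    buf += struct.pack(PAYLOAD[kind], value)


def parse_buffer(buf):
    """Walk the buffer and return a list of (pc, kind, value) tuples."""
    records = []
    offset = 0
    header_size = struct.calcsize(HEADER)
    while offset < len(buf):
        pc, kind = struct.unpack_from(HEADER, buf, offset)
        offset += header_size
        fmt = PAYLOAD[kind]
        (value,) = struct.unpack_from(fmt, buf, offset)
        offset += struct.calcsize(fmt)
        records.append((pc, kind, value))
    return records


buf = bytearray()
append_record(buf, 0x1000, BRANCH, 1)
append_record(buf, 0x1008, CALL, 0xDEADBEEF)
append_record(buf, 0x1010, SWITCH, 7)
records = parse_buffer(bytes(buf))
```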


Processing the buffered branch information

  • We create an object to hold the bytecode PC and branch counts. We are using 4 bytes to store the branch information.

Record layout: pc; taken | not taken
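
One way to fit both counts into 4 bytes is sketched below. The even 16/16-bit split, the placement of the taken count in the high half, and the halving of both counters on overflow are assumptions of this example; the slides only state that 4 bytes hold the branch information.

```python
COUNT_BITS = 16                     # assumed: 16 bits per counter
COUNT_MAX = (1 << COUNT_BITS) - 1


def unpack_counts(word):
    """Return (taken, not_taken) from a packed 32-bit word."""
    return word >> COUNT_BITS, word & COUNT_MAX


def record_branch(word, taken):
    """Return the packed word after counting one more branch outcome."""
    t, n = unpack_counts(word)
    if taken:
        t += 1
    else:
        n += 1
    if t > COUNT_MAX or n > COUNT_MAX:
        # Assumed overflow policy: halve both so the ratio is roughly kept.
        t >>= 1
        n >>= 1
    return (t << COUNT_BITS) | n


word = 0
for outcome in (True, True, False, True):
    word = record_branch(word, outcome)
```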


What does the JIT do with the call information?

  • We keep up to 3 call targets with their counts, as well as a residue count

Record layout: pc; residue; Class A, count; Class B, count; Class C, count

We use the same approach for checkcast and instanceof
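
A minimal sketch of such a record follows. The names `CallProfile`, `record` and `dominant` are hypothetical, and so is the policy of counting any fourth or later class in the residue without ever replacing a tracked target.

```python
from dataclasses import dataclass, field

MAX_TARGETS = 3  # the slides keep up to 3 targets per call site


@dataclass
class CallProfile:
    pc: int
    targets: dict = field(default_factory=dict)  # class name -> count
    residue: int = 0                             # calls to untracked classes

    def record(self, target):
        if target in self.targets:
            self.targets[target] += 1
        elif len(self.targets) < MAX_TARGETS:
            self.targets[target] = 1
        else:
            # Assumed policy: untracked targets only bump the residue.
            self.residue += 1

    def dominant(self):
        """Return (class name, fraction of all calls) for the hottest target."""
        if not self.targets:
            return None
        total = sum(self.targets.values()) + self.residue
        name, count = max(self.targets.items(), key=lambda kv: kv[1])
        return name, count / total


site = CallProfile(pc=0x2000)
for cls in ["A"] * 6 + ["B", "C", "D", "E"]:
    site.record(cls)
```

A compiler could use the dominant fraction to decide whether a guarded devirtualization of the call is worthwhile; the residue keeps that fraction honest when many other classes show up.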


What does the JIT do with the switch information?

  • We create a data structure to hold the bytecode PC and counts for switch index. The index data is 8 bytes wide, split into 4 records: the top 3 and the rest.

Record layout: pc; record 1; record 2; record 3; the rest

Each record is split into 2 portions: a 1-byte count and a 1-byte switch index (count | index).
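
The 8-byte layout (4 records of 2 bytes each) can be illustrated with the sketch below. The byte order within a record and the unused index byte of "the rest" record are assumptions; the function names are hypothetical.

```python
import struct


def pack_switch(top, rest_count):
    """Pack up to 3 (count, index) pairs plus a rest count into 8 bytes."""
    pairs = list(top) + [(0, 0)] * (3 - len(top))
    flat = [byte for pair in pairs for byte in pair]
    flat += [rest_count, 0]  # assumed: the rest record carries no index
    return struct.pack("8B", *flat)


def unpack_switch(data):
    """Return ([(count, index), ...] for the top 3, rest count)."""
    b = struct.unpack("8B", data)
    top = [(b[i], b[i + 1]) for i in range(0, 6, 2)]
    return top, b[6]


packed = pack_switch([(10, 2), (5, 0)], rest_count=3)
top, rest = unpack_switch(packed)
```

With 1-byte fields, both counts and switch indices are limited to the range 0 to 255 in this sketch.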


Storing the profiling data

  • Each data record is stored in a global hashtable, using the PC as input to the hash function

  • On subsequent encounters of the same PC with profiling data the records are updated.

    • Branch and switch counts are incremented

    • Call targets and runtime types are added and counts incremented.
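
The store-then-update behaviour can be sketched with a small chained hashtable for branch records. The bucket count, the hash function `(pc >> 2) % NUM_BUCKETS` and the dictionary-based entries are hypothetical choices for illustration only.

```python
NUM_BUCKETS = 64  # hypothetical table size


def bucket_of(pc):
    # Hypothetical hash function based on the bytecode PC.
    return (pc >> 2) % NUM_BUCKETS


class ProfileTable:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def find(self, pc):
        for entry in self.buckets[bucket_of(pc)]:
            if entry["pc"] == pc:
                return entry
        return None

    def update_branch(self, pc, taken):
        entry = self.find(pc)
        if entry is None:
            # First encounter of this PC: create the record.
            entry = {"pc": pc, "taken": 0, "not_taken": 0}
            self.buckets[bucket_of(pc)].append(entry)
        # Subsequent encounters only increment the counts.
        entry["taken" if taken else "not_taken"] += 1
        return entry


table = ProfileTable()
for pc, taken in [(0x100, True), (0x200, True), (0x100, False), (0x100, True)]:
    table.update_branch(pc, taken)
```

The two PCs used here deliberately land in the same bucket, so the example also exercises the collision chain.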


Using the profiling information

  • The profiler database only knows of bytecode PC

  • At every point where the compiler is interested in profiling information, it computes the bytecode PC from the method information and the bytecode index

  • The compiler has to make sense out of the information in the hashtable
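
A lookup on the compiler side might look like the sketch below. It assumes that the bytecode PC is simply the start address of the method's bytecodes plus the bytecode index, and it uses a plain dictionary in place of the profiling hashtable; both the function names and that addressing scheme are assumptions for the example.

```python
def bytecode_pc(method_bytecode_start, bytecode_index):
    # Assumed addressing: PC = start of the method's bytecodes + index.
    return method_bytecode_start + bytecode_index


def branch_taken_ratio(table, method_bytecode_start, bytecode_index):
    """Return the fraction of times the branch was taken, or None if unknown."""
    entry = table.get(bytecode_pc(method_bytecode_start, bytecode_index))
    if entry is None:
        return None  # no profiling data was collected for this bytecode
    taken, not_taken = entry
    total = taken + not_taken
    return taken / total if total else None


# Stand-in for the profiling hashtable: pc -> (taken, not_taken).
table = {0x5000 + 12: (90, 10)}
ratio = branch_taken_ratio(table, 0x5000, 12)
```

The compiler has to handle the missing-data case, since a bytecode may never have been executed while the profiler was on.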


Interpreter profiler design

  • JIT compiler consults the profiling hashtable in various stages of method compilation

[Figure: the compilation thread consults the JIT profiling hashtable from several compilation stages: inliner, order code, …, codegen]


Performance results

  • Up to 30% improvement on various applications

    • EJB and other middleware applications benefit mainly from code ordering and from devirtualization for the purpose of inlining

    • Benchmarks typically benefit from other optimizations enabled by the ability to devirtualize virtual and interface calls

  • With various tweaks we managed to drive the start-up overhead below 10%


How do we manage the profiling overhead?

  • We turn the profiler off in -Xquickstart mode

  • No locking on the hashtable

  • We detect startup phase of the application and skip records to ease off the data collection overhead


Turning the profiler ON and OFF

  • The profiler is ON by default

  • The sampler thread turns the profiler OFF or back ON

    • A run of consecutive sampler ticks in JIT-generated code turns the profiler OFF

    • A run of consecutive sampler ticks in the interpreter turns the profiler back ON
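
The ON/OFF logic can be sketched as a small state machine driven by sampler ticks. The class name `ProfilerSwitch` and the threshold values are hypothetical; the slides do not give the actual tick counts.

```python
class ProfilerSwitch:
    """Sketch of the sampler-driven ON/OFF decision; thresholds are made up."""

    def __init__(self, off_threshold=3, on_threshold=2):
        self.enabled = True  # the profiler is ON by default
        self.off_threshold = off_threshold
        self.on_threshold = on_threshold
        self.jit_streak = 0
        self.interp_streak = 0

    def tick(self, in_jit_code):
        """Record one sampler tick and return whether the profiler is ON."""
        if in_jit_code:
            self.jit_streak += 1
            self.interp_streak = 0
            if self.enabled and self.jit_streak >= self.off_threshold:
                self.enabled = False
        else:
            self.interp_streak += 1
            self.jit_streak = 0
            if not self.enabled and self.interp_streak >= self.on_threshold:
                self.enabled = True
        return self.enabled


switch = ProfilerSwitch(off_threshold=3, on_threshold=2)
ticks = [True, True, False, True, True, True, False, False]
states = [switch.tick(t) for t in ticks]
```

Requiring consecutive ticks means a single stray sample in the other execution mode resets the streak rather than flipping the profiler.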


Some of the problems we encountered

  • Tuning for optimal balance between startup overhead and throughput performance wasn’t easy

  • Application phase change detection wasn’t easy

  • Class unloading created lots of problems


Summary

  • Profiling is critical for performance of run-time systems

  • Using a buffered approach to data collection can help build efficient profilers

  • Tuning for optimal balance of startup overhead and throughput performance is challenging

