R Unbox Variable Lookup Optimization

Apr. 2012

PowerPoint Slideshow about 'R Unbox Variable Lookup Optimization' - kolton


Outline
  • Unbox Variable Cache Structure
  • Unbox Variable Lookup Instructions and Compiler Transformation
  • Performance Evaluation
    • Obvious Performance Gain in ForLoopAdd example
Unbox Variable Cache Structure
  • A software cache for unboxed scalar variables
    • Data:
      • 64 bits per cell: stores an int or a real
      • Length: the size of the current frame's constant table
        • The symbol's index accesses the cache directly
        • May waste some space, but simple
    • State:
      • Currently uses 4 bits per cell
      • Bits 0~1: state (INVALID, VALID, MODIFIED)
      • Bits 2~3: cache type (Logical, Int, Real)
    • Counter for each frame
      • modified_count: the total number of modified cache cells

[Figure: the Current Frame with its Cache (64-bit cells) and its Cache State array]
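The layout on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the class name `UnboxCache`, its method names, and the numeric codes chosen for the states and types are all assumptions. Only the 64-bit cells, the 4-bit state layout, the direct indexing by symbol index, and `modified_count` come from the slide.

```python
import struct

# Bits 0-1 hold the state and bits 2-3 the cache type (layout from the slide).
# The numeric codes below are assumptions made for this sketch.
INVALID, VALID, MODIFIED = 0, 1, 2
LOGICAL, INT, REAL = 0, 1, 2


class UnboxCache:
    """Hypothetical per-frame cache: one 64-bit cell per constant-table slot."""

    def __init__(self, const_table_size):
        self.data = bytearray(8 * const_table_size)  # raw 64-bit cells
        self.state = bytearray(const_table_size)     # low 4 bits of each byte used
        self.modified_count = 0

    def get_state(self, sym_index):
        return self.state[sym_index] & 0b11

    def get_type(self, sym_index):
        return (self.state[sym_index] >> 2) & 0b11

    def store(self, sym_index, ctype, value, state):
        fmt = "<d" if ctype == REAL else "<q"        # real: double, otherwise int64
        struct.pack_into(fmt, self.data, 8 * sym_index, value)
        old = self.get_state(sym_index)
        # Keep the per-frame counter in step with the number of MODIFIED cells.
        self.modified_count += (state == MODIFIED) - (old == MODIFIED)
        self.state[sym_index] = (ctype << 2) | state

    def load(self, sym_index):
        fmt = "<d" if self.get_type(sym_index) == REAL else "<q"
        return struct.unpack_from(fmt, self.data, 8 * sym_index)[0]
```

Sizing the cache to the constant table means a symbol index needs no hashing or search, at the cost of cells that are never used.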

Cache State
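  • Simple state change policies (from the state diagram)
    • INVALID → VALID: first time get var and unbox
    • VALID → VALID: get var
    • INVALID → MODIFIED: set unbox var
    • VALID → MODIFIED: set unbox var
    • MODIFIED → MODIFIED: get var / set unbox var
    • MODIFIED → VALID: write back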
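One reading of the state diagram can be written as a transition table. The sketch below assumes that a write back returns a MODIFIED cell to VALID and that an event with no arrow in the diagram leaves the state unchanged. The event names `get`, `set` and `write_back` are illustrative labels, not VM opcodes.

```python
INVALID, VALID, MODIFIED = "INVALID", "VALID", "MODIFIED"

# (state, event) -> next state
TRANSITIONS = {
    (INVALID, "get"): VALID,          # first get: fetch from the frame and unbox
    (VALID, "get"): VALID,
    (INVALID, "set"): MODIFIED,
    (VALID, "set"): MODIFIED,
    (MODIFIED, "get"): MODIFIED,
    (MODIFIED, "set"): MODIFIED,
    (MODIFIED, "write_back"): VALID,  # assumption: the cell stays cached after boxing
}


def step(state, event):
    """Return the next state; events without an arrow leave the state as is."""
    return TRANSITIONS.get((state, event), state)


def run_events(events, state=INVALID):
    for event in events:
        state = step(state, event)
    return state
```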

Unbox Variable Lookup Instructions
  • Get Variable
    • GETLOGICALUNBOX, GETINTUNBOX, GETREALUNBOX
    • Merges the semantics of
      • GETVAR, GUARD and UNBOX
    • Three operands
      • 1) symbol index; 2) expected type; 3) guard failure PC
  • Set Variable
    • SETUNBOXVAR
    • Writes a scalar value into the cache
      • If the variable is new, it is not yet defined in the current frame
      • It is defined when writing back
  • Pop a value from the stack
    • POPUNBOX
    • Slightly different from POP: it must also pop the unbox type stack
  • Write back modified scalar variables
    • UNBOXWRITEBACK
    • Boxes all modified variables and sets them in the current frame
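The behaviour of the four instructions can be sketched on a toy frame. Everything in this block is illustrative: the `Frame` class, the function names, modelling a boxed value as a one-element list, and reporting a guard failure as `False` (where a real VM would jump to the guard failure PC) are assumptions, not the presenter's code.

```python
class Frame:
    """Toy frame. Boxed values are modelled as one-element lists (an assumption)."""

    def __init__(self):
        self.vars = {}   # symbol index -> boxed value
        self.cache = {}  # symbol index -> [state, raw scalar]; absent means INVALID
        self.stack = []  # value stack
        self.types = []  # unbox type stack, kept parallel to the value stack


def get_real_unbox(f, sym):
    """GETREALUNBOX: GETVAR + GUARD + UNBOX in one step. False means guard failure."""
    cell = f.cache.get(sym)
    if cell is None:  # INVALID: read the frame and unbox
        boxed = f.vars.get(sym)
        if boxed is None or len(boxed) != 1 or not isinstance(boxed[0], float):
            return False
        cell = f.cache[sym] = ["VALID", boxed[0]]
    elif not isinstance(cell[1], float):  # cached, but not the expected type
        return False
    f.stack.append(cell[1])
    f.types.append("real")
    return True


def set_unbox_var(f, sym):
    """SETUNBOXVAR: copy the unboxed stack top into the cache; the frame is untouched."""
    f.cache[sym] = ["MODIFIED", f.stack[-1]]


def pop_unbox(f):
    """POPUNBOX: pop the value stack and the unbox type stack together."""
    f.stack.pop()
    f.types.pop()


def unbox_write_back(f):
    """UNBOXWRITEBACK: box every MODIFIED cell and define it in the frame."""
    for sym, cell in f.cache.items():
        if cell[0] == "MODIFIED":
            f.vars[sym] = [cell[1]]
            cell[0] = "VALID"
```

In this sketch a variable written with `set_unbox_var` does not appear in `f.vars` until `unbox_write_back` runs, which mirrors the "define it when writing back" rule.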
Unbox Variable Lookup Compiler Transformation
  • Current Compilation Passes
    • Decoding Pass
    • Build Jump Target
    • Type Annotation Pass
    • Unbox Opportunity Identification Pass
    • Code Unbox Opt Transformation Pass
      • Will use the new instructions
    • Code Clean Pass
    • PC Fix Pass
    • Jump Target Fix
    • Encoding Pass

[Slide annotation: Add new policies]

Unbox Opportunity Identification Pass
  • SETVAR
    • Original: the top stack element should be a boxed value
    • New: the top stack element could be unboxed
      • Could expose more unbox opportunities

PC  STMT
1   LDCONST, 1
3   SETVAR, 2
5   POP

[Slide annotation: Can unbox this one]

Code Unbox Opt Transformation Pass
  • SETVAR
    • If the top stack element is a scalar value
    • SETVAR → UNBOX; SETUNBOXVAR; BOX
      • Insert a GUARD if needed (when the top stack element's type comes from the profile)
    • The final BOX: this pass always maintains the stack's shape
  • CALL, RETURN
    • Add UNBOXWRITEBACK in front of it

Original:

PC  STMT
1   LDCONST, 1    //Real
3   SETVAR, 2
5   POP

After transformation:

PC  STMT
1   LDCONST, 1
3   UNBOXREAL
4   SETUNBOXVAR, 2
6   BOXREAL
7   POP
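The rewrite rules of this pass can be sketched as a single loop over the instruction list. This is an illustration under assumptions: instructions are modelled as tuples without PCs (PC fixing is a later pass), `top_types` stands in for the result of the type annotation and opportunity identification passes, and the `("GUARD", type)` operand layout is hypothetical.

```python
def unbox_transform(code, top_types):
    """Rewrite SETVAR on scalar stack tops and guard CALL/RETURN with a write back.

    code:      list of (opcode, *operands) tuples
    top_types: {instruction index: (type name, type_came_from_profile)} for each
               SETVAR whose stack top is known to be a scalar
    """
    out = []
    for index, ins in enumerate(code):
        op = ins[0]
        if op == "SETVAR" and index in top_types:
            type_name, from_profile = top_types[index]
            suffix = type_name.upper()
            if from_profile:
                out.append(("GUARD", suffix))       # profiled types must be checked
            out.append(("UNBOX" + suffix,))
            out.append(("SETUNBOXVAR",) + ins[1:])
            out.append(("BOX" + suffix,))           # keep the stack's shape
        elif op in ("CALL", "RETURN"):
            out.append(("UNBOXWRITEBACK",))         # flush the cache before leaving
            out.append(ins)
        else:
            out.append(ins)
    return out
```

Applied to the three-instruction listing on this slide, the sketch produces the same opcode sequence as the transformed listing.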

Code Clean Pass (Instruction Combine)
  • GETVAR + GUARD + UNBOX
    • According to the type, transform to
    • → GETLOGICALUNBOX, GETINTUNBOX, GETREALUNBOX
  • BOX + POP
    • → POPUNBOX
    • No need to box again, but the unbox type stack must still be popped
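Both combinations are peephole rewrites over adjacent instructions, which the sketch below illustrates. It assumes instructions are tuples, that GUARD carries the expected type and the guard failure PC as its operands, and that no jump target lands inside a combined sequence. None of these details are confirmed by the slides.

```python
SCALAR_TYPES = ("LOGICAL", "INT", "REAL")


def clean_pass(code):
    """Combine GETVAR+GUARD+UNBOX<type> and BOX<type>+POP into single instructions."""
    out, i = [], 0
    while i < len(code):
        window = code[i:i + 3]
        ops = [ins[0] for ins in window]
        # GETVAR + GUARD + UNBOX<type>  ->  GET<type>UNBOX sym, <guard operands>
        if (len(ops) == 3 and ops[:2] == ["GETVAR", "GUARD"]
                and ops[2][:5] == "UNBOX" and ops[2][5:] in SCALAR_TYPES):
            sym = window[0][1]
            out.append(("GET" + ops[2][5:] + "UNBOX", sym) + window[1][1:])
            i += 3
        # BOX<type> + POP  ->  POPUNBOX (the value is never boxed at all)
        elif (len(ops) >= 2 and ops[0][:3] == "BOX"
              and ops[0][3:] in SCALAR_TYPES and ops[1] == "POP"):
            out.append(("POPUNBOX",))
            i += 2
        else:
            out.append(window[0])
            i += 1
    return out
```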
Transformation Example
  • RealAdd

run <- function() {
  a <- 101
  b <- a + 202
  print(b)
}

Original:

PC  STMT
1   LDCONST, 1
3   SETVAR, 2
5   POP
6   GETVAR, 2
8   LDCONST, 3
10  ADD, 4
12  SETVAR, 5
14  POP
15  GETFUN, 6
17  MAKEPROM, 7
19  CALL, 8
21  RETURN

Optimized:

PC  STMT
1   LDCONSTREAL, 1
3   SETUNBOXVAR, 2
5   POP_UNBOX
6   GETREALUNBOX, 2, 2, 8
10  LDCONSTREAL, 3
12  REALADD
13  SETUNBOXVAR, 5
15  POP_UNBOX
16  GETFUN, 6
18  MAKEPROM, 7
20  UNBOXWRITEBACK
21  CALL, 8
23  UNBOXWRITEBACK
24  RETURN

Special Handling in ForLoop
  • The ForLoop updates the loop variable in STEPFOR
    • Its semantics are similar to GETVAR
  • If the loop variable is Logical/Integer/Real
    • Just get the scalar value and load it into the cache
      • Save the value
      • Change the cache state to VALID
Performance Evaluation on ForLoop Examples
  • Examples
  • Experiment Methodology
    • Running Method
      • Run “run()” 10 times
        • 1st time: profile
        • 2nd time: trigger compilation/optimization and start using the new code
        • 3rd-10th: just use the optimized code
    • All the following results are normalized
      • Average of 5 runs in each case
      • Only the 3rd-10th runs are used
      • Normalized to 1 iteration

ForIntAdd

run <- function() {
  r <- 11
  for (i in 1:1000000) {
    r <- r + i
  }
  print(r)
}

ForIntAdd3

run <- function() {
  r <- 11
  for (i in 1:1000000) {
    r <- r + 1 + i + 2
  }
  print(r)
}
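The normalization described in the methodology can be written out as arithmetic. The sketch below is one reading of those bullets: it assumes each of the 5 repetitions records the 10 per-call timings of run(), discards the first two calls (profiling and compilation), and divides by the loop's iteration count. The function name and the data layout are hypothetical.

```python
def normalized_time(trials, iterations):
    """Average steady-state time per loop iteration.

    trials:     one list per repetition, each holding the 10 timings of run()
    iterations: loop iterations executed by a single call of run()
    """
    per_trial = []
    for calls in trials:
        steady = calls[2:10]  # 3rd-10th calls: optimized code only
        per_trial.append(sum(steady) / len(steady) / iterations)
    return sum(per_trial) / len(per_trial)


# Made-up timings: slow profiling and compiling calls, then 8 steady calls.
example = [[100.0, 50.0] + [8.0] * 8 for _ in range(5)]
print(normalized_time(example, 4))
```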

Performance Result
  • ForIntAdd
  • ForIntAdd3
Analysis - Performance Gain Source
  • Much more efficient variable lookup
  • Far fewer object creations during SETVAR
    • ForLoop example: the same effect as hoisting all box/unbox operations out of the loop
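The object-creation argument can be illustrated by counting allocations in a toy model of the ForIntAdd loop. The `Box` class below is a hypothetical stand-in for a heap-allocated R value. The sketch counts boxes only and says nothing about the actual timings.

```python
class Box:
    """Stand-in for a boxed R scalar; counts how many are created."""
    count = 0

    def __init__(self, value):
        Box.count += 1
        self.value = value


def boxed_loop(n):
    Box.count = 0
    r = Box(11.0)
    for i in range(1, n + 1):
        r = Box(r.value + i)  # every SETVAR stores a freshly boxed result
    return r.value, Box.count


def unboxed_loop(n):
    Box.count = 0
    r = 11.0                  # lives in the cache as a raw scalar
    for i in range(1, n + 1):
        r = r + i             # SETUNBOXVAR: no allocation
    boxed = Box(r)            # one write back before the call to print
    return boxed.value, Box.count
```

In this model the boxed loop allocates once per iteration plus once for the initial value, while the unboxed loop allocates once in total, as if the boxing had been hoisted out of the loop.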
Next Step
  • Work on larger test cases
    • Code pieces extracted from real benchmarks
      • E.g. Shootout
  • Add new instructions/compiler transformations
    • To support the code used in these new test cases