Calculating stack distances efficiently
Calculating Stack Distances Efficiently

George Almasi, Calin Cascaval, David Padua

{galmasi,cascaval,padua}@cs.uiuc.edu


What this talk is, and is not, about

This talk is about:

  • Algorithms to calculate stack distance histograms

  • Speed/memory optimization of trace analysis to create stack distance histograms

This talk is not about:

  • why stack distance histograms are/are not useful

  • relative merits of inter-reference distance vs. stack distance

  • speed/memory optimization of applications


Two measures of locality

  • Inter-reference distance:

    • the number of other references between two references to the same address in the trace

  • Stack distance:

    • the number of distinct addresses referenced between two references to the same address

Example trace: a b c d b c d e a

Between the two references to a: inter-reference distance = 7, stack distance = 4.
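The two measures can be checked with a brute-force sketch (plain Python, not the talk's optimized algorithms; the function names are mine):

```python
def inter_reference_distance(trace, t):
    """Number of other references between trace[t] and its previous occurrence."""
    x = trace[t]
    t0 = max(i for i in range(t) if trace[i] == x)  # last previous reference to x
    return t - t0 - 1

def stack_distance(trace, t):
    """Number of distinct addresses referenced between the two references to trace[t]."""
    x = trace[t]
    t0 = max(i for i in range(t) if trace[i] == x)
    return len(set(trace[t0 + 1:t]))

trace = list("abcdbcdea")
print(inter_reference_distance(trace, 8))  # -> 7
print(stack_distance(trace, 8))            # -> 4
```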


Stack Distances As Cache Misses

  • Let s(δ) be the stack distance histogram: the number of references at stack distance δ. For an LRU cache of size C, compute the number of cache hits and misses as follows:

    hits(C) = Σ_{δ=1}^{C} s(δ)

    misses(C) = Σ_{δ=C+1}^{∞} s(δ)
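These two sums can be evaluated directly from a histogram. A small sketch, assuming the histogram is a dict mapping finite distances to counts, with cold (infinite-distance) references tallied separately:

```python
def hits_and_misses(s, cold_misses, C):
    """Hit/miss counts for an LRU cache of size C, given histogram s."""
    hits = sum(s.get(d, 0) for d in range(1, C + 1))              # sum_{d=1..C} s(d)
    misses = sum(v for d, v in s.items() if d > C) + cold_misses  # sum_{d=C+1..inf} s(d)
    return hits, misses

s = {1: 10, 2: 5, 3: 2, 8: 1}
print(hits_and_misses(s, cold_misses=4, C=4))   # -> (17, 5)
print(hits_and_misses(s, cold_misses=4, C=10))  # -> (18, 4)
```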


Inter-reference distance

  • Given that at time t, ref(t) = x

  • find t0, the time of the last previous reference to x

  • inter-reference distance: refdist(t) = t - t0 - 1

  • Efficient implementation: a (hash)table

    H(x) = t0, the trace index of the last reference to x

    Memory usage ~ 2x original program

    Cost O(1) per reference
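The hash-table scheme, sketched in Python (one dictionary lookup and update per reference, so O(1) amortized per trace element):

```python
def inter_reference_distances(trace):
    """Inter-reference distance per reference; None for first (cold) references."""
    H = {}        # H[x] = t0, trace index of the last reference to x
    dists = []
    for t, x in enumerate(trace):
        if x in H:
            dists.append(t - H[x] - 1)   # references strictly between t0 and t
        else:
            dists.append(None)           # cold reference
        H[x] = t
    return dists

print(inter_reference_distances(list("abcdbcdea")))
# -> [None, None, None, None, 2, 2, 2, None, 7]
```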


Stack distance

[Figure: successive snapshots of an LRU stack as the trace is processed. A reference to address x is found at depth Depth(x) in the stack and moved to the top; the entries above it shift down by one.]


Stack distance

  • Simulates an infinite cache with LRU replacement policy

  • nice properties (inclusion!)

  • naïve implementation: stack as a linked list/array

    • m ≈ 250,000 average maximum stack depth

    • list traversal/array updates: O(m) per trace element
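The naïve stack simulation, sketched in Python (the linear search and list update make it O(m) per reference; distances follow the slide's distinct-addresses-between convention):

```python
def stack_distances_naive(trace):
    """Stack distance per reference via LRU stack simulation; None = cold."""
    stack = []          # most recently used address first
    dists = []
    for x in trace:
        if x in stack:
            d = stack.index(x)   # distinct addresses referenced since x
            stack.remove(x)      # O(m) list traversal/update
        else:
            d = None             # cold reference: infinite distance
        stack.insert(0, x)       # move x to the stack top
        dists.append(d)
    return dists

print(stack_distances_naive(list("abcdbcdea")))
# -> [None, None, None, None, 2, 2, 2, None, 4]
```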


Insight: stack is contained in trace

[Figure: a trace alongside the LRU stack at time t. Reading the trace backwards from time t and skipping repeated addresses yields exactly the contents of the stack.]


Holes

  • Index tx in the trace is a hole (at time t) if the address ref(tx) has already been referenced again at a later time ty, with tx < ty ≤ t.

  • Using holes, we can say

    • stackdist(t) = refdist(t) - #holes(t0 to t)

  • How many holes are there between t0 and t?
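The identity can be checked brute-force: every hole between t0 and t is a repeated address, so subtracting the holes from the inter-reference distance leaves exactly the distinct addresses (a sketch; names are mine):

```python
def stack_distance_via_holes(trace, t):
    """stackdist(t) = refdist(t) - #holes between t0 and t."""
    x = trace[t]
    t0 = max(i for i in range(t) if trace[i] == x)   # previous reference to x
    refdist = t - t0 - 1
    # tx is a hole at time t if trace[tx] is referenced again before t
    holes = sum(1 for tx in range(t0 + 1, t)
                if any(trace[ty] == trace[tx] for ty in range(tx + 1, t)))
    return refdist - holes

trace = list("abcdbcdea")
print(stack_distance_via_holes(trace, 8))  # -> 7 - 3 holes = 4
```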


An interval tree of holes

[Figure: the trace segment between t0, the previous reference to a, and t, the current reference to a. The holes in between cluster into maximal intervals, e.g. k:k, k+2:k+3, k+4:k+5.]

  • Single tree operation: count_and_add(t0)

  • Determines # of holes between t0 and t; adds a new hole at t0

  • Adding a hole can create a new interval - or fuse two existing ones


Operations on the interval tree

count_and_add(p) modifies the interval set in one of three ways:

  • Add to interval edge: count_and_add(p) with p = n+1 extends k:n to k:n+1

  • Create new interval: count_and_add(p) with p > n+1 leaves k:n alone and creates p:p

  • Join two intervals: count_and_add(n+1) fuses k:n and n+2:p into k:p
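These cases can be sketched with the interval set kept as a sorted Python list rather than a balanced tree (so each operation here is O(#intervals), not the talk's O(log m); class and function names are mine):

```python
import bisect

class HoleIntervals:
    """Disjoint, maximal hole intervals, kept sorted."""

    def __init__(self):
        self.ivals = []   # [lo, hi] intervals: sorted, disjoint, non-adjacent

    def count_and_add(self, p):
        """Return the number of holes at positions > p, then add a hole at p."""
        count = sum(hi - max(lo, p + 1) + 1
                    for lo, hi in self.ivals if hi > p)
        i = bisect.bisect_left(self.ivals, [p, p])
        left = i > 0 and self.ivals[i - 1][1] == p - 1             # interval ends at p-1
        right = i < len(self.ivals) and self.ivals[i][0] == p + 1  # interval starts at p+1
        if left and right:          # join: k:n and n+2:q fuse into k:q
            self.ivals[i - 1][1] = self.ivals[i][1]
            del self.ivals[i]
        elif left:                  # add to interval edge: k:n -> k:n+1
            self.ivals[i - 1][1] = p
        elif right:                 # extend from the left: p+1:q -> p:q
            self.ivals[i][0] = p
        else:                       # create new interval p:p
            self.ivals.insert(i, [p, p])
        return count

# combined with the inter-reference hash table, this yields stack distances
def stack_distances_intervals(trace):
    H, holes, out = {}, HoleIntervals(), []
    for t, x in enumerate(trace):
        if x in H:
            t0 = H[x]
            out.append(t - t0 - 1 - holes.count_and_add(t0))
        else:
            out.append(None)   # cold reference
        H[x] = t
    return out

print(stack_distances_intervals(list("abcdbcdea")))
# -> [None, None, None, None, 2, 2, 2, None, 4]
```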


Pre-allocated hole trees

Basics:

  • tree is pre-allocated

  • binary, balanced

  • each node contains a number: the number of holes in its right subtree

  • memory used by a node depends on the node's depth

A modified version of the B&K algorithm:

  • holes instead of references

  • binary instead of n-ary

  • better memory usage


Pre-allocated hole trees

[Figure: a pre-allocated binary tree built over the trace positions, each node annotated with the number of holes in its subtree. A count_and_add walk updates the counts along its path (n = n+1) and accumulates the hole counts lying to the right (count += n).]
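One way to sketch a pre-allocated tree of hole counts is a Fenwick (binary indexed) tree over the trace positions. This is a stand-in for the slide's depth-dependent node layout, not the exact B&K variant, but it shares the key properties: allocated up front, binary, no pointer chasing, O(log n) per reference:

```python
class HoleBIT:
    """Pre-allocated binary tree of hole counts over n trace positions."""

    def __init__(self, n):
        self.tree = [0] * (n + 1)   # allocated up front, no pointers
        self.total = 0

    def add_hole(self, i):          # mark 0-based position i as a hole
        self.total += 1
        i += 1
        while i < len(self.tree):
            self.tree[i] += 1
            i += i & -i

    def holes_after(self, i):       # number of holes at positions > i
        i += 1
        s = 0
        while i > 0:                # prefix sum: holes at positions <= i
            s += self.tree[i]
            i -= i & -i
        return self.total - s

def stack_distances_bit(trace):
    H, bit, out = {}, HoleBIT(len(trace)), []
    for t, x in enumerate(trace):
        if x in H:
            t0 = H[x]
            out.append(t - t0 - 1 - bit.holes_after(t0))  # refdist - holes
            bit.add_hole(t0)       # the previous reference becomes a hole
        else:
            out.append(None)       # cold reference
        H[x] = t
    return out

print(stack_distances_bit(list("abcdbcdea")))
# -> [None, None, None, None, 2, 2, 2, None, 4]
```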


Many Questions

Q: Why holes and not stack elements?

A: Holes need 1/2 the maintenance of stack elements.

Q: Will the interval tree grow without bound?

A: No. Intervals fuse together spontaneously.

Q: How big will the tree be?

A: # of intervals = O(stack depth); a tree of stack elements would be the same size.

Q: Will the tree be unbalanced?

A: Yes, because it tends to grow on one side.


More questions

Q: What kind of interval tree?

A: We tried red-black (RB) and AVL trees.

Q: Which is better?

A: AVL is better.

Q: Why?

A:

  • shorter average tree height: h+1 vs. 2h

  • not all operations change the tree structure


Comparisons

Interval trees:

  • exec time O(log(m)), memory usage O(m)

  • AVL better than RB

  • pointer chasing, bad locality

Pre-allocated trees:

  • exec time O(log(n)), memory usage O(n)

  • memory usage hits a practical limit

  • no pointer chasing, good locality

Holes are better than stack elements: reduced maintenance.




Conclusions

  • Stack distances with holes:

    • using RB/AVL interval trees

    • using pre-allocated trees

  • Using holes reduces linear overhead by 20-40% for both kinds of algorithms.

