
### Calculating Stack Distances Efficiently

George Almasi, Calin Cascaval, David Padua

{galmasi,cascaval,padua}@cs.uiuc.edu

### What this talk is, and is not, about

This talk is about:

- algorithms to calculate stack distance histograms
- speed/memory optimization of trace analysis to create stack distance histograms

This talk is not about:

- why stack distance histograms are/are not useful
- relative merits of inter-reference distance vs. stack distance
- speed/memory optimization of applications

### Two measures of locality

- Inter-reference distance:
  - the number of other references between two references to the same address in the trace
- Stack distance:
  - the number of distinct addresses referenced between two references to the same address

Example trace: a b c d b c d e a

For the two references to a: inter-reference distance = 7, stack distance = 4.
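The two definitions can be checked directly against the example trace. The following is a minimal brute-force sketch; the function name `distances_at` is hypothetical and the backward scan is only meant to mirror the definitions, not to be efficient.

```python
def distances_at(trace, t):
    """Return (inter_reference_distance, stack_distance) for trace[t].

    Both values are None when trace[t] has not been referenced before.
    """
    x = trace[t]
    # Scan backwards for the previous reference to the same address.
    for t0 in range(t - 1, -1, -1):
        if trace[t0] == x:
            between = trace[t0 + 1:t]
            # All references in between vs. distinct addresses in between.
            return len(between), len(set(between))
    return None, None


trace = "a b c d b c d e a".split()
print(distances_at(trace, 8))  # (7, 4), matching the example above
```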

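### Stack Distances As Cache Misses

- compute the number of cache hits and misses as follows:

$$\mathrm{hits}(C) = \sum_{d=1}^{C} s(d), \qquad \mathrm{misses}(C) = \sum_{d=C+1}^{\infty} s(d)$$

where $s(d)$ is the number of references with stack distance $d$ and $C$ is the cache size.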
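As an illustration of the two sums, here is a small sketch that splits a stack distance histogram at a given cache size. It assumes distances are numbered from 1 as in the sums on the slide, and that first-time references are stored under the key `math.inf`; the function name and the histogram values are made up for the example.

```python
import math


def hits_and_misses(s, cache_size):
    """s maps a stack distance d >= 1 (or math.inf for first references)
    to the number of references with that distance."""
    hits = sum(count for d, count in s.items() if d <= cache_size)
    misses = sum(count for d, count in s.items() if d > cache_size)
    return hits, misses


# Made-up histogram, purely for illustration.
s = {1: 50, 2: 30, 4: 15, 9: 3, math.inf: 2}
print(hits_and_misses(s, 4))  # (95, 5)
```

Because the histogram is independent of the cache size, the same `s` can be reused to evaluate every cache size of interest.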

### Inter-reference distance

- Given that at time t, ref(t) = x
- find t0, the time of the last previous reference to x
- inter-reference distance: the number of references strictly between t0 and t, i.e. t - t0 - 1 by the definition above
- Efficient implementation: a (hash) table with H(x) = t0, the trace index of the last reference to x
- Memory usage: ~2x the original program
- Cost: O(1) per reference
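The hash table approach can be sketched in a few lines. This is an illustrative version, assuming the trace is an in-memory sequence of hashable addresses; `last_seen` plays the role of H, and the function name is hypothetical.

```python
def inter_reference_distances(trace):
    """Return the inter-reference distance of every trace element
    (None for the first reference to an address)."""
    last_seen = {}  # H(x): trace index of the last reference to x
    result = []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        # Number of other references strictly between t0 and t.
        result.append(None if t0 is None else t - t0 - 1)
        last_seen[x] = t
    return result


print(inter_reference_distances("a b c d b c d e a".split()))
# [None, None, None, None, 2, 2, 2, None, 7]
```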

### Stack distance

[Figure: successive states of an LRU stack holding the addresses a, h, x, b, c, d, e, f, ..., y, z, u, v; Depth(x) marks the position of x in the stack when x is referenced again.]

- Simulates an infinite cache with LRU replacement policy
  - nice properties (inclusion!)
- naïve implementation: stack as linked list/array
  - m = 250,000 average maximum stack depth
  - list traversal/array updates; O(m) per trace element
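The naïve approach can be sketched with a plain Python list standing in for the LRU stack. This is only an illustration of the O(m)-per-reference behaviour described above; the function name is hypothetical, and the distance reported is the number of distinct addresses above x in the stack.

```python
def stack_distances_naive(trace):
    """LRU stack kept as a list; stack[0] is the most recently used address."""
    stack = []
    result = []
    for x in trace:
        if x in stack:
            depth = stack.index(x)  # linear search: O(m)
            result.append(depth)    # distinct addresses referenced since x
            stack.pop(depth)        # linear update: O(m)
        else:
            result.append(None)     # first reference: infinite distance
        stack.insert(0, x)          # x becomes the new stack top
    return result


print(stack_distances_naive("a b c d b c d e a".split()))
# [None, None, None, None, 2, 2, 2, None, 4]
```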

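### Insight: stack is contained in trace

[Figure: the trace a b b g g e d f z f c e b c d a laid out along a time axis, followed by a reference to g at time t. The last occurrence of each address in the trace, read in time order (g z f e b c d a), forms the stack, with the stack top at the most recent reference.]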

### Holes

- Index tx in the trace is a hole at time t if the address ref(tx) has been referenced again at a later time ty, with tx < ty < t.
- Using holes, we can say:
  - stackdist(t) = refdist(t) - #holes(t0 to t)
- How many holes are there between t0 and t?
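The identity can be tried out with a deliberately simple hole store. The sketch below keeps one boolean per trace index and counts holes by scanning, which is exactly the step a smarter data structure is meant to speed up; names such as `is_hole` are hypothetical.

```python
def stack_distances_with_holes(trace):
    """stackdist(t) = refdist(t) - #holes(t0 to t), with holes in a flat list."""
    last_seen = {}                   # address -> index of its last reference
    is_hole = [False] * len(trace)   # is_hole[i]: trace[i] was referenced again later
    result = []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        if t0 is None:
            result.append(None)      # first reference: infinite distance
        else:
            refdist = t - t0 - 1
            holes = sum(is_hole[t0 + 1:t])  # linear scan, for clarity only
            result.append(refdist - holes)
            is_hole[t0] = True       # the old reference to x is now a hole
        last_seen[x] = t
    return result


print(stack_distances_with_holes("a b c d b c d e a".split()))
# [None, None, None, None, 2, 2, 2, None, 4]
```

For the final reference to a, the inter-reference distance is 7 and the earlier references to b, c and d have become holes, giving 7 - 3 = 4.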

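### An interval tree of holes

[Figure: a trace segment between the previous reference to a (at t0) and the current reference to a (at t); runs of consecutive holes are stored as tree nodes labelled with intervals such as k:k, k+2:k+3 and k+4:k+5.]

- Single tree operation: count_and_add(t0)
  - Determines # of holes between t0 and t; adds a new hole at t0
  - Adding a hole can create a new interval, or fuse two existing ones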
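The behaviour of count_and_add can be sketched without a tree. In the version below, a pair of sorted Python lists stands in for the balanced interval tree, so counting is linear in the number of intervals rather than logarithmic; the class name `HoleIntervals` and the driver function are hypothetical, and only the interval bookkeeping (extend, create, fuse) is meant to be illustrative.

```python
import bisect


class HoleIntervals:
    """Disjoint, non-adjacent runs of holes, kept as sorted start/end lists."""

    def __init__(self):
        self.starts = []
        self.ends = []  # inclusive end of each interval

    def count_and_add(self, p):
        """Return the number of holes at positions > p, then make p a hole.

        Assumes p is not already a hole (it is the latest reference to its address).
        """
        i = bisect.bisect_right(self.starts, p)  # intervals[i:] lie after p
        count = sum(e - s + 1 for s, e in zip(self.starts[i:], self.ends[i:]))
        touches_left = i > 0 and self.ends[i - 1] == p - 1
        touches_right = i < len(self.starts) and self.starts[i] == p + 1
        if touches_left and touches_right:   # fuse two existing intervals
            self.ends[i - 1] = self.ends[i]
            del self.starts[i], self.ends[i]
        elif touches_left:                   # extend an interval at its right edge
            self.ends[i - 1] = p
        elif touches_right:                  # extend an interval at its left edge
            self.starts[i] = p
        else:                                # create a new interval p:p
            self.starts.insert(i, p)
            self.ends.insert(i, p)
        return count


def stack_distances(trace):
    holes, last_seen, result = HoleIntervals(), {}, []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        if t0 is None:
            result.append(None)
        else:
            result.append((t - t0 - 1) - holes.count_and_add(t0))
        last_seen[x] = t
    return result


print(stack_distances("a b c d b c d e a".split()))
# [None, None, None, None, 2, 2, 2, None, 4]
```

Since every hole lies before the current time t, counting the holes after t0 is the same as counting the holes between t0 and t.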

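### Operations on the interval tree

- Add to interval edge: count_and_add(p) with p = n+1, where only the interval k:n exists next to p. The interval k:n grows to k:n+1.
- Create new interval: count_and_add(p) with p > n+1. The interval k:n is unchanged and a new interval p:p is created.
- Join two intervals: count_and_add(p) with p = n+1, where n+1 is the only gap between two existing intervals. In the figure's labels, k:n and n+2:p fuse into a single interval k:p.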

### Pre-allocated hole trees

- tree is pre-allocated
- binary, balanced
- each node contains a number: the number of holes in its right subtree
- memory used by node depends on node's depth
- a modified version of the B&K algorithm:
  - holes instead of references
  - binary instead of n-ary
  - better memory usage

[Figure: a pre-allocated binary tree laid over the trace a b b g e d f z f c e b c d a; each node stores a hole count n. On the path taken by count_and_add, a node's count is either incremented (n = n+1) or added to the running total (count += n).]
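One way to read the two annotations in the figure is as the two cases of a root-to-leaf descent: going right adds the new hole to the node's right-subtree count, and going left collects the holes already recorded to the right. The sketch below follows that reading with an implicit array-based complete binary tree; the class name, the array layout, and the use of full-width integers for every counter (rather than depth-dependent sizes) are assumptions made for the example.

```python
class PreallocatedHoleTree:
    """Complete binary tree over trace positions; each internal node stores
    the number of holes in its right subtree."""

    def __init__(self, trace_length):
        # Enough levels that 2**levels >= trace_length.
        self.levels = max(1, (trace_length - 1).bit_length())
        # Internal nodes use indices 1 .. 2**levels - 1 (index 0 is unused).
        self.right_count = [0] * (1 << self.levels)

    def count_and_add(self, p):
        """Return the number of holes at positions > p, then make p a hole."""
        node, count = 1, 0
        for level in range(self.levels - 1, -1, -1):
            if (p >> level) & 1:                 # p is in the right subtree
                self.right_count[node] += 1      # n = n + 1
                node = 2 * node + 1
            else:                                # p is in the left subtree
                count += self.right_count[node]  # count += n
                node = 2 * node
        return count


def stack_distances_preallocated(trace):
    tree, last_seen, result = PreallocatedHoleTree(len(trace)), {}, []
    for t, x in enumerate(trace):
        t0 = last_seen.get(x)
        if t0 is None:
            result.append(None)
        else:
            result.append((t - t0 - 1) - tree.count_and_add(t0))
        last_seen[x] = t
    return result


print(stack_distances_preallocated("a b c d b c d e a".split()))
# [None, None, None, None, 2, 2, 2, None, 4]
```

The tree never changes shape, so there is no rebalancing and no pointer chasing; the cost is that its size is tied to the trace length rather than to the stack depth.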

### Many Questions

Q: Why holes and not stack elements?

A: Holes need 1/2 the maintenance of stack elements.

Q: Will the interval tree grow to ?

A: No. Intervals fuse together spontaneously.

Q: How big will the tree be?

A: # of intervals = O(stack depth). A tree of stack elements would be the same size.

Q: Will the tree be unbalanced?

A: Yes, because it tends to grow on one side.

### More questions

Q: What kind of interval tree?

A: RB and AVL.

Q: Which is better?

A: AVL is better.

Q: Why?

A:

- shorter average tree height: h+1 vs. 2h
- not all operations change the tree structure

### Comparisons

Interval trees (RB/AVL):

- exec time O(log(m))
- memory usage O(m)
- AVL better than RB
- pointer chasing, bad locality

Pre-allocated trees:

- exec time O(log(n))
- memory usage O(n)
- hits practical limit
- holes are better
  - reduced maintenance
- no pointer chasing, good locality

### Conclusions

- Stack distances with holes:
  - using RB/AVL interval trees
  - using pre-allocated trees
- Using holes reduces linear overhead by 20-40% for both kinds of algorithms.
