Lesson 5

- Register allocation for expressions.
- Sethi-Ullman register allocation.
- Spilling in Sethi-Ullman.
- Register allocation from ir-trees.
- Register allocation and maximal munch.
- Some implementation points.

- Goal: Determine the evaluation order for subexpressions of an expression in order to use as few registers as possible.
- This is the lowest (most local) level of register allocation. For better performance, register allocation can be done over basic blocks, functions, or whole programs.

- Algorithm: Given an expression represented as a syntax tree:
- Pass 1: For each node in the tree calculate the number of registers needed for each sub-tree.
- Pass 2: Generate code by a bottom-up traversal. Traverse the sub-tree with the greater need first.

- Algorithm for Pass 1 (labeling): Calculate need.
calc_need(node) ->
    if is_leaf(node) then
        need(node) = 1
    else
        let n1 = calc_need(left(node))
            n2 = calc_need(right(node))
        in
            need(node) =
                if n1 = n2 then n1 + 1
                else max(n1, n2)
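
The labeling pass can be tried out directly. The following is a minimal Python sketch, assuming a tree encoding that is not part of the slides: leaves are variable names (strings) or constants (integers), and inner nodes are tuples `(op, left, right)`.

```python
def is_leaf(node):
    # Assumed encoding: leaves are str (variables) or int (constants);
    # inner nodes are tuples (op, left, right).
    return not isinstance(node, tuple)


def calc_need(node):
    """Return the number of registers needed to evaluate node."""
    if is_leaf(node):
        return 1  # a leaf is loaded into one register
    _, left, right = node
    n1 = calc_need(left)
    n2 = calc_need(right)
    # Equal needs: one extra register holds the first result while the
    # second sub-tree is evaluated. Otherwise the larger need suffices.
    return n1 + 1 if n1 == n2 else max(n1, n2)


expr = ("+", ("+", "x", 1), ("+", "y", ("+", "a", 2)))
print(calc_need(expr))  # 3
```

For (x + 1) + (y + (a + 2)) both children of the root need two registers, so the root needs three.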

- Algorithm for Pass 2: Generate code.
gen_code(node, env) ->
    if is_leaf(node) then
        locate free reg r;
        gen code to load into r
    else
        (e1, n1) = left_n(node)
        (e2, n2) = right_n(node)
        if n1 <> n2 then
            gen code for larger subexpr
            free all but one reg
            gen code for the other subexpr
            gen code for node.
        else
            gen code for e1 using n1 regs
            free n1 - 1 regs
            gen code for e2 using n2 regs
            gen code for node
            (needs n1 + 1 regs)

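- Example: (x + 1) + (y + (a + 2))

[Figure: syntax tree of the expression with each node labelled by its need. Every leaf (x, 1, y, a, 2) has need 1; the sub-trees x + 1, a + 2 and y + (a + 2) each have need 2.]

load t0, x
move t1, 1
add t0, t0, t1
load t1, a
move t2, 2
add t1, t1, t2
load t2, y
add t1, t2, t1
add t0, t0, t1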
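
The code generation pass can be sketched in Python as follows. Several details are assumptions made for illustration: the tuple encoding of trees, register numbers printed as `t0`, `t1`, ..., a sorted list as the set of free registers, and the rule that the lower-numbered operand register becomes the destination (chosen because it reproduces the listing for the example). The need is recomputed on every visit to keep the sketch short, and no spilling is attempted.

```python
def need(node):
    # Pass 1 (labeling); leaves are str/int, inner nodes are ("+", left, right).
    if not isinstance(node, tuple):
        return 1
    n1, n2 = need(node[1]), need(node[2])
    return n1 + 1 if n1 == n2 else max(n1, n2)


def gen_code(node, free, out):
    """Pass 2: append instructions to out, return the result register number.

    free is a sorted list of free register numbers. No spilling is done,
    so it must hold at least need(node) registers.
    """
    if not isinstance(node, tuple):
        r = free.pop(0)  # locate a free register
        op = "move" if isinstance(node, int) else "load"
        out.append(f"{op} t{r}, {node}")
        return r
    _, e1, e2 = node
    if need(e1) >= need(e2):  # greater need first; ties go to the left
        r1 = gen_code(e1, free, out)
        r2 = gen_code(e2, free, out)
    else:
        r2 = gen_code(e2, free, out)
        r1 = gen_code(e1, free, out)
    dst, other = min(r1, r2), max(r1, r2)
    out.append(f"add t{dst}, t{r1}, t{r2}")
    free.append(other)  # keep one register for the result, free the other
    free.sort()
    return dst


expr = ("+", ("+", "x", 1), ("+", "y", ("+", "a", 2)))
out = []
result = gen_code(expr, [0, 1, 2], out)
print("\n".join(out))
```

With three free registers this prints the same nine instructions as the example for (x + 1) + (y + (a + 2)), leaving the result in `t0`.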

- When both children need at least as many registers as are available, one register has to be spilled (written to the stack): after the first child has been evaluated, its result occupies a register that the second child also needs.

- This algorithm is well suited for Bar since the evaluation order of expressions is free in Bar.
- This algorithm is well suited for ir-trees.
- No values are live in registers between instructions.
- One expression can be considered at a time.

- The only small complication is function calls, which are nodes with more than two children, but the algorithm can easily be extended to this case.
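
One possible way to extend the labeling to nodes with more than two children is sketched below. It rests on an assumption the slides do not state: every argument value stays in its own register until the call is emitted. Under that assumption the children are evaluated in order of decreasing need, and the i-th child (counting from 0) is evaluated while i earlier results are being held. The tuple encoding `(op, child1, ..., childN)` is hypothetical.

```python
def need(node):
    """Need of a node with any number of children.

    Assumption: each child's result occupies one register until the node
    itself is evaluated (for example, until the call is emitted).
    """
    if not isinstance(node, tuple):
        return 1  # leaf: str (variable) or int (constant)
    # Evaluate children by decreasing need; child i runs while i results
    # from earlier children are already held in registers.
    kids = sorted((need(c) for c in node[1:]), reverse=True)
    return max((n + i for i, n in enumerate(kids)), default=1)


print(need(("+", "x", 1)))             # 2, same as the binary rule
print(need(("call", "a", "b", "c")))   # 3
```

For two children this reduces to the binary rule: equal needs give n1 + 1, unequal needs give the maximum.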

- These two algorithms fit well together:
- Maximal munch decides “what a node is”.
- Sethi-Ullman decides where to put results of expressions.
- Both work by recursion over the expression tree.

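[Figure: the tree for (x + 1) + (y + (a + 2)) before and after maximal munch. After munching, the additions of constants become immediate nodes: +1 applied to x and +2 applied to a.]

Maximal munch finds different nodes than a straightforward traversal of the tree does (for example, there is an immediate add instruction).

[Figure: the munched tree labelled with needs and result registers, together with the generated code:]

load t0, y
load t1, a
addi t1, t1, 2
add t0, t0, t1
load t1, x
addi t1, t1, 1
add t0, t1, t0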
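
The combination can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the only tile besides plain `add` is an `addi` tile that matches an addition with a constant operand, trees are tuples `("+", left, right)` with string or integer leaves, and the destination of `add` is assumed to be the lower-numbered operand register.

```python
def is_const(node):
    return isinstance(node, int)


def need(node):
    # Need is computed on tiles: an addi tile has only one real child.
    if not isinstance(node, tuple):
        return 1
    _, e1, e2 = node
    if is_const(e2):
        return need(e1)
    if is_const(e1):
        return need(e2)
    n1, n2 = need(e1), need(e2)
    return n1 + 1 if n1 == n2 else max(n1, n2)


def munch(node, free, out):
    """Emit code for node, return the result register number."""
    if not isinstance(node, tuple):
        r = free.pop(0)
        op = "move" if is_const(node) else "load"
        out.append(f"{op} t{r}, {node}")
        return r
    _, e1, e2 = node
    if is_const(e1) or is_const(e2):
        # Largest tile: +(e, const) becomes one addi (+ is commutative).
        sub, imm = (e1, e2) if is_const(e2) else (e2, e1)
        r = munch(sub, free, out)
        out.append(f"addi t{r}, t{r}, {imm}")
        return r
    if need(e1) >= need(e2):  # Sethi-Ullman order: greater need first
        r1 = munch(e1, free, out)
        r2 = munch(e2, free, out)
    else:
        r2 = munch(e2, free, out)
        r1 = munch(e1, free, out)
    dst, other = min(r1, r2), max(r1, r2)
    out.append(f"add t{dst}, t{r1}, t{r2}")
    free.append(other)
    free.sort()
    return dst


expr = ("+", ("+", "x", 1), ("+", "y", ("+", "a", 2)))
out = []
result = munch(expr, [0, 1], out)
print("\n".join(out))
```

Because the constants no longer occupy registers of their own, the munched tree needs only two registers, and the sketch prints the seven instructions listed above.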

- Who chooses the result register?
- How to do the labeling?
- How to handle registers?
- When to free registers?
- Where to spill?
- How to handle floats?

- Either the code for handling a node allocates a register for each sub-tree or the code for handling each sub-tree allocates and returns a register.

- munch(+(e1,e2), res_reg) ->
- munch(e1, res_reg),
- r2 = alloc_reg
- munch(e2, r2)
- emit (res_reg = res_reg + r2)

- munch(+(e1,e2)) ->
- r1 = munch(e1),
- r2 = munch(e2)
- emit (r1 = r1 + r2)
- r1
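
The two conventions can be compared side by side. The sketch below is illustrative only: `munch_into` and `munch_ret` are hypothetical names, leaves are restricted to variables, the free registers are a sorted list of numbers, and both versions free `r2` after the addition (the pseudocode above leaves that step out).

```python
def munch_into(node, res, free, out):
    """Caller chooses: put the value of node into register res."""
    if not isinstance(node, tuple):
        out.append(f"load t{res}, {node}")
        return
    _, e1, e2 = node
    munch_into(e1, res, free, out)
    r2 = free.pop(0)  # the parent allocates a register for the sub-tree
    munch_into(e2, r2, free, out)
    out.append(f"add t{res}, t{res}, t{r2}")
    free.append(r2)
    free.sort()


def munch_ret(node, free, out):
    """Callee chooses: allocate a register and return it."""
    if not isinstance(node, tuple):
        r = free.pop(0)
        out.append(f"load t{r}, {node}")
        return r
    _, e1, e2 = node
    r1 = munch_ret(e1, free, out)
    r2 = munch_ret(e2, free, out)
    out.append(f"add t{r1}, t{r1}, t{r2}")
    free.append(r2)
    free.sort()
    return r1


expr = ("+", "x", ("+", "y", "z"))

out_into = []
munch_into(expr, 0, [1, 2], out_into)  # the caller reserved t0 beforehand

out_ret = []
res = munch_ret(expr, [0, 1, 2], out_ret)

print(out_into == out_ret)  # True
```

For this expression both styles emit the same code; the difference is only in which function is responsible for picking the result register.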

- The straightforward way (calculating the need of the children whenever it is needed) would be very time-consuming, since the need of a sub-tree might be calculated several times.
- Do it in two passes (as stated in the algorithm). First calculate the need, then do code generation.
- Do it bottom up so that each sub-tree is visited only once.
- Use a need tree structure for this, e.g.: datatype need = Need of int * need list
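
A need tree of this shape can be built in one bottom-up pass, so that code generation only reads the stored labels. The Python sketch below mirrors the `int * need list` structure with a named tuple; the expression encoding (tuples `(op, left, right)` with string or integer leaves) is an assumption made for illustration.

```python
from typing import NamedTuple, Tuple


class Need(NamedTuple):
    n: int                        # registers needed by this sub-tree
    children: Tuple["Need", ...]  # need trees of the children


def label(node):
    """Build the need tree; every sub-tree is visited exactly once."""
    if not isinstance(node, tuple):
        return Need(1, ())
    kids = tuple(label(child) for child in node[1:])
    n1, n2 = kids[0].n, kids[1].n
    return Need(n1 + 1 if n1 == n2 else max(n1, n2), kids)


expr = ("+", ("+", "x", 1), ("+", "y", ("+", "a", 2)))
tree = label(expr)
print(tree.n)                           # 3
print([kid.n for kid in tree.children])  # [2, 2]
```

The code generator can then walk the expression and the need tree in parallel instead of recomputing needs.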

- Create an abstract data type (registers) with interface functions such as:
allocate : registers -> register * registers

free: register * registers -> registers

num_free : registers -> int

- Try to implement this so that all operations are O(1).
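
One way to get O(1) operations while keeping the functional interface is to store the free registers as an immutable linked stack together with a counter. The Python sketch below is an assumption about a possible representation, not the course's implementation; `make_registers` is a hypothetical helper for building the initial value.

```python
def free(reg, regs):
    """free : register * registers -> registers"""
    count, cell = regs
    return (count + 1, (reg, cell))  # push onto the linked stack


def allocate(regs):
    """allocate : registers -> register * registers"""
    count, cell = regs
    if cell is None:
        raise RuntimeError("no free register")
    reg, rest = cell
    return reg, (count - 1, rest)    # pop from the linked stack


def num_free(regs):
    """num_free : registers -> int"""
    return regs[0]


def make_registers(names):
    # Hypothetical helper: the first name ends up on top of the stack.
    regs = (0, None)
    for name in reversed(names):
        regs = free(name, regs)
    return regs


regs = make_registers(["t0", "t1", "t2"])
r, regs1 = allocate(regs)
print(r, num_free(regs1), num_free(regs))  # t0 2 3
```

Each operation touches only the top cell and the counter, and old values such as `regs` stay valid after `allocate`, which matches the value-passing style of the interface.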

- A register should be freed as soon as the value of the subexpression it holds is no longer needed.

munch(+(e1,e2),regs) ->

(r1, regs1) = munch(e1, regs),

(r2, regs2) = munch(e2, regs1)

emit (r1 = r1 + r2)

regs3 = free(r2, regs2)

(r1, regs3)

- If a frame pointer is used, it is easy to push spilled values onto the stack.

- r1 = munch(…
- emit push(r1)
- free(r1)
- r2 = munch(…
- r3 = alloc..
- emit r3 = pop
- emit r3 = r2 op r3

[Figure: stack frame layout, listed in the direction of stack growth, with the FP and SP markers:]

Argument n
Argument 1
FP
Ret.address
Old FP
Variable 1
Variable n
Spill 1
SP
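
The push/pop scheme can be added to a Sethi-Ullman code generator as sketched below. The details are assumptions for illustration: trees are tuples `("+", left, right)` with string or integer leaves, the free registers are a sorted list of numbers with at least two entries at the top-level call, and a node spills when both children need at least as many registers as are currently free.

```python
def need(node):
    if not isinstance(node, tuple):
        return 1
    n1, n2 = need(node[1]), need(node[2])
    return n1 + 1 if n1 == n2 else max(n1, n2)


def release(free, r):
    free.append(r)
    free.sort()


def gen(node, free, out):
    """Emit code for node, spilling to the stack when registers run out."""
    if not isinstance(node, tuple):
        r = free.pop(0)
        op = "move" if isinstance(node, int) else "load"
        out.append(f"{op} t{r}, {node}")
        return r
    _, e1, e2 = node
    n1, n2, k = need(e1), need(e2), len(free)
    if n1 >= k and n2 >= k:
        # Not enough registers to hold one result while computing the other.
        r1 = gen(e1, free, out)
        out.append(f"push t{r1}")   # spill the first result
        release(free, r1)
        r2 = gen(e2, free, out)
        r3 = free.pop(0)
        out.append(f"pop t{r3}")    # reload the spilled value
        out.append(f"add t{r3}, t{r3}, t{r2}")
        release(free, r2)
        return r3
    if n1 >= n2:                     # no spill: greater need first
        r1 = gen(e1, free, out)
        r2 = gen(e2, free, out)
    else:
        r2 = gen(e2, free, out)
        r1 = gen(e1, free, out)
    dst, other = min(r1, r2), max(r1, r2)
    out.append(f"add t{dst}, t{r1}, t{r2}")
    release(free, other)
    return dst


expr = ("+", ("+", "x", 1), ("+", "y", ("+", "a", 2)))
out = []
result = gen(expr, [0, 1], out)  # only two registers: the root must spill
print("\n".join(out))
```

With two registers the value of x + 1 is pushed before y + (a + 2) is evaluated and popped again for the final addition. Because locals are addressed relative to FP, moving SP for the spill does not disturb them.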

- Extend the need and register datatypes with another set of registers:
datatype need = Need of int * int * need list

allocate_ireg : registers -> register * registers

allocate_freg : registers -> register * registers

free_ireg: register * registers -> registers

free_freg: register * registers -> registers

num_free_i : registers -> int

num_free_f : registers -> int
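
The extended interface can be sketched by keeping two independent pools, one for integer registers and one for float registers. The representation below (a counter plus an immutable linked stack per pool) and the helpers `make_pool`, `make_registers`, `_take` and `_give` are assumptions for illustration; only the six interface names come from the list above.

```python
def make_pool(names):
    pool = (0, None)
    for name in reversed(names):
        pool = _give(name, pool)
    return pool


def _give(reg, pool):
    return (pool[0] + 1, (reg, pool[1]))


def _take(pool):
    count, cell = pool
    if cell is None:
        raise RuntimeError("no free register")
    return cell[0], (count - 1, cell[1])


def make_registers(int_names, float_names):
    return (make_pool(int_names), make_pool(float_names))


def allocate_ireg(regs):
    r, ipool = _take(regs[0])
    return r, (ipool, regs[1])


def allocate_freg(regs):
    r, fpool = _take(regs[1])
    return r, (regs[0], fpool)


def free_ireg(reg, regs):
    return (_give(reg, regs[0]), regs[1])


def free_freg(reg, regs):
    return (regs[0], _give(reg, regs[1]))


def num_free_i(regs):
    return regs[0][0]


def num_free_f(regs):
    return regs[1][0]


regs = make_registers(["t0", "t1"], ["f0", "f1", "f2"])
f, regs1 = allocate_freg(regs)
print(f, num_free_i(regs1), num_free_f(regs1))  # f0 2 2
```

Allocating a float register leaves the integer pool untouched, which is why the need label also carries two separate counts.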