Theory of Compilation 236360 Erez Petrank


# Theory of Compilation 236360 Erez Petrank


### Theory of Compilation 236360 Erez Petrank

Lecture 11: Optimizations

Running Time Optimization
• Need to understand the run-time characteristics of the program (which are often unknown).
• Usually the program spends most of its time in a small part of the code. If we optimize that part, we gain a lot.
• Thus, we invest more in inner loops.
• Example: place together functions with high coupling.
• Need to know the operating system and the architecture.
• We will survey a few simple methods first, starting with building a DAG.

    (1) t1 := 4 * i
        t2 := a [ t1 ]
        t3 := 4 * i
        t4 := b [ t3 ]
        t5 := t2 * t4
        t6 := prod + t5
        prod := t6
        t7 := i + 1
        i := t7
        if i <= 20 goto (1)

Representing a basic block computation with a DAG
• Leaves are variables or constants, marked by their names or values.
• Inner vertices are marked by their operators.
• We also associate variable names with the inner vertices as the computation advances.

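[Figure: the DAG for the block above. Each inner node is listed with its attached names and its children:
• * (t1, t3): children 4 and i0
• [ ] (t2): children a and the node of t1
• [ ] (t4): children b and the node of t1
• * (t5): children the nodes of t2 and t4
• + (t6, prod): children prod0 and the node of t5
• + (t7, i): children i0 and 1
• <= (branch to (1)): children the node of t7 and 20]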

Building the DAG

For each instruction x := y + z

• Find the current nodes of y and z.
• Build a new node marked “+” and connect it as a parent of both nodes (if such a parent already exists, reuse it instead); associate this node with x.
• If x was previously associated with a different node, cancel the previous association (so that it is not used again).
• Do not create a new node for a copy assignment such as x := y. Instead, associate x with the node that y is associated with.
• Such assignments are typically eliminated during the optimization.
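The construction steps above can be sketched roughly as follows. This is a minimal illustration, not the lecture's implementation: the tuple encoding `(dest, op, left, right)`, the function name `build_dag`, and the choice to treat constants as leaf names are all assumptions made for the example. The final conditional jump is left out, and array stores (which would require the aliasing care discussed later) are not handled.

```python
def build_dag(block):
    """block: list of (dest, op, left, right); op None means a copy dest := left."""
    nodes = []      # node id -> ("leaf", name, None) or (op, left_id, right_id)
    index = {}      # structural key -> node id, so an existing parent is reused
    current = {}    # variable name -> node id that currently holds its value

    def node_for(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        # A name with no node yet is a leaf: a constant or an initial value.
        if name not in current:
            current[name] = node_for(("leaf", name, None))
        return current[name]

    for dest, op, left, right in block:
        if op is None:
            current[dest] = operand(left)           # copy: no new node
        else:
            l, r = operand(left), operand(right)
            current[dest] = node_for((op, l, r))    # overwrites any old association
    return nodes, current


# The basic block from the slide (hypothetical encoding).
block = [
    ("t1", "*", "4", "i"), ("t2", "[]", "a", "t1"),
    ("t3", "*", "4", "i"), ("t4", "[]", "b", "t3"),
    ("t5", "*", "t2", "t4"), ("t6", "+", "prod", "t5"),
    ("prod", None, "t6", None), ("t7", "+", "i", "1"),
    ("i", None, "t7", None),
]
nodes, current = build_dag(block)
```

In this sketch `t1` and `t3` end up on the same node, which is how the common expression `4 * i` is detected.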


Using the DAG

[Figure: the same DAG as before. Since t1 and t3 share a node, and t6 and t7 share nodes with prod and i, the code read back from the DAG uses the simplified instructions:]

    prod := prod + t5
    i := i + 1

Uses of DAGs
• Automatic identification of common expressions
• Identification of variables that are used in the block
• Identification of values that are computed but not used.
• Identifying computation dependence (allowing code movements)
• Avoiding redundant copying instructions.
Aliasing Problems
• What’s wrong with the following optimization?
• The problem is with the side effect due to aliasing.
• Typically, we conservatively assume aliasing: upon an assignment to an array element, we assume that nothing is known about the array’s entries.
• The problem arises when we do not know whether aliasing exists.
• Relevant to pointers as well.
• Relevant to routine calls when we cannot determine the routine side-effects.
• Aliasing is a major obstacle for program optimizations.
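The slide's own example did not survive in this transcript, so the following is a hypothetical illustration of the kind of problem meant here. The function names and the list standing in for the array are assumptions. Reusing an earlier load of `a[i]` is only safe when the intervening store `a[j] = y` cannot touch the same element.

```python
def original(a, i, j, y):
    x = a[i]
    a[j] = y
    z = a[i]        # re-reads a[i]: the store to a[j] may have changed it
    return x, z


def wrongly_optimized(a, i, j, y):
    x = a[i]
    a[j] = y
    z = x           # treats a[i] as a common expression; valid only if i != j
    return x, z


# When i == j the two versions disagree, so the "optimization" changed the program.
print(original([1, 2, 3], 0, 0, 9), wrongly_optimized([1, 2, 3], 0, 0, 9))
```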
Optimization Methods
• In the following slides we review various optimization methods, stressing performance optimizations.
• Main goal: eliminate redundant computations.
• Some methods are platform dependent.
• In most platforms addition is faster than multiplication.
• Some methods do not look useful on their own, but their combination is effective.
Basic Optimizations
• Common expression elimination:
• DAG identifies common expressions in a basic block; we can eliminate repeated computation.
• Next lecture: data flow analysis will determine common expressions across basic blocks.
• Copy propagation:
• Given an assignment x:=y, we attempt to use y instead of x.
• Possible outcome: x becomes dead and we can eliminate the assignment.
Code motion
• Code motion is useful in various scenarios.
• Identify inner-loop code,
• Identify an expression whose sources do not change in the loop, and
• Move this code outside the loop!
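As a small sketch of the three steps above (the function names and the expression `a * b` are made up for illustration), the loop-invariant computation is moved in front of the loop:

```python
def before(xs, a, b):
    out = []
    for x in xs:
        k = a * b           # loop-invariant: neither a nor b changes in the loop
        out.append(x + k)
    return out


def after(xs, a, b):
    k = a * b               # computed once, outside the loop
    out = []
    for x in xs:
        out.append(x + k)
    return out
```

Both versions compute the same list; the second one evaluates `a * b` once instead of once per iteration.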
Induction variables & Strength Reduction
• Identify loop variables, and their relation to other variables.
• Eliminate dependence on induction variables as much as possible

    (1) i = 0;
    (2) t1 = i * 4;
    (3) t2 = a[t1]
    (4) if (t2 > 100) goto (19)
    (5) …
    (17) i = i + 1
    (18) goto (2)
    (19) …

• Why is such code (including multiplication by 4) so widespread?
• On many platforms addition is faster than multiplication, so we apply strength reduction: instruction (2) becomes t1 = t1 + 4.
• t1 must then be initialized outside the loop.
• This is not just strength reduction! We have also removed the dependence of t1 on i.
• Thus, instructions (1) and (17) become irrelevant.
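The transformation can be checked on a small sketch. The function names and the `limit` parameter are assumptions, and a Python list stands in for byte-addressed memory, so `a[t1]` here simply indexes element `t1`.

```python
def loop_with_multiply(a, limit=100):
    i = 0
    while True:
        t1 = i * 4          # multiplication in every iteration
        t2 = a[t1]
        if t2 > limit:
            return t1
        i = i + 1


def loop_strength_reduced(a, limit=100):
    t1 = 0                  # initialized outside the loop
    while True:
        t2 = a[t1]
        if t2 > limit:
            return t1
        t1 = t1 + 4         # addition replaces i * 4, and i is no longer needed


memory = [0] * 40
memory[12] = 500
print(loop_with_multiply(memory), loop_strength_reduced(memory))
```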

Peephole Optimization
• Optimizing long code sequences is hard
• A simple and effective alternative (though not optimal) is peephole optimization:
• Check a “small window” of code and improve only this code section.
• Identify local optimization opportunities
• Rewrite code “in the window”
• For example:
• x := x * 1;
• a := a + 0;
Peephole optimizations
• Some optimizations that do not require a global view:
• Simplifying algebraic computations:
• x := x ^ 2 → x := x * x
• x := x * 8 → x := x << 3
• Code rearrangement:

    (1) if x == 1 goto (3)
    (2) goto (19)
    (3) …

becomes

    (1) if x != 1 goto (19)
    (2) …

Peephole optimizations
• Eliminate redundant instructions:

    (1) a := x
    (2) x := a
    (3) a := someFunction(a);
    (4) x := someOtherFunction(a, x);
    (5) if a > x goto (2)

• Execute peephole optimizations within a basic block only, and do not elide the first instruction of the block.
• Caution! If some instruction jumps to an instruction that we eliminated, a problem arises.
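A very small peephole pass over the algebraic examples above might look like the sketch below. The tuple encoding `(dest, op, left, right)` and the function name `peephole` are assumptions, the window is a single instruction, and the shift rewrite assumes integer operands. Jump-related rewrites are deliberately left out, since they need the basic-block care described above.

```python
def peephole(code):
    """code: list of (dest, op, left, right), meaning dest := left op right."""
    out = []
    for dest, op, left, right in code:
        # x := x * 1 and a := a + 0 have no effect and can be dropped.
        if dest == left and (op, right) in (("*", 1), ("+", 0)):
            continue
        if op == "^" and right == 2:
            op, right = "*", left                       # x ^ 2  ->  x * x
        elif (op == "*" and isinstance(right, int)
              and right > 0 and (right & (right - 1)) == 0):
            op, right = "<<", right.bit_length() - 1    # x * 8  ->  x << 3
        out.append((dest, op, left, right))
    return out


program = [
    ("x", "*", "x", 1),
    ("a", "+", "a", 0),
    ("x", "^", "x", 2),
    ("x", "*", "x", 8),
    ("y", "+", "x", "z"),
]
optimized = peephole(program)
```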

Stage A: global common expression elimination

    B1: i := m - 1
        j := n
        t1 := 4 * n
        v := a [ t1 ]
    B2: i := i + 1
        t2 := 4 * i
        t3 := a [ t2 ]
        if t3 < v goto B2
    B3: j := j - 1
        t4 := 4 * j
        t5 := a [ t4 ]
        if t5 > v goto B3
    B4: if i >= j goto B6
    B5: t6 := 4 * i
        x := a [ t6 ]
        t7 := 4 * i
        t8 := 4 * j
        t9 := a [ t8 ]
        a [ t7 ] := t9
        t10 := 4 * j
        a [ t10 ] := x
        goto B2
    B6: t11 := 4 * i
        x := a [ t11 ]
        t12 := 4 * i
        t13 := 4 * n
        t14 := a [ t13 ]
        a [ t12 ] := t14
        t15 := 4 * n
        a [ t15 ] := x

Stage A: global common expression elimination (its result is the code shown here)

Stage B: copy propagation. Given f := g, we try to use g and get rid of f.

    B1: i := m - 1
        j := n
        t1 := 4 * n
        v := a [ t1 ]
    B2: i := i + 1
        t2 := 4 * i
        t3 := a [ t2 ]
        if t3 < v goto B2
    B3: j := j - 1
        t4 := 4 * j
        t5 := a [ t4 ]
        if t5 > v goto B3
    B4: if i >= j goto B6
    B5: x := t3
        a [ t2 ] := t5
        a [ t4 ] := x
        goto B2
    B6: x := t3
        t14 := a [ t1 ]
        a [ t2 ] := t14
        a [ t1 ] := x

Global common expression elimination

Copy propagation

Dead code elimination – eliminate redundant code.


Global common expression elimination

Copy propagation

Dead code elimination

Code motion: move expressions outside the loop

Global common expression elimination

Copy propagation

Dead code elimination

Code motion

Induction variables and strength reduction:
• j decreases by 1 in every iteration of its loop, so t4 := 4 * j can be replaced by t4 := t4 - 4.
• i increases by 1 in every iteration of its loop, so t2 := 4 * i can be replaced by t2 := t2 + 4.
• t2 and t4 are then initialized once, before the loops.
• The test i >= j can then be replaced by t2 >= t4.

Dead code elimination (again)

Original code:

    B1: i := m - 1
        j := n
        t1 := 4 * n
        v := a [ t1 ]
    B2: i := i + 1
        t2 := 4 * i
        t3 := a [ t2 ]
        if t3 < v goto B2
    B3: j := j - 1
        t4 := 4 * j
        t5 := a [ t4 ]
        if t5 > v goto B3
    B4: if i >= j goto B6
    B5: t6 := 4 * i
        x := a [ t6 ]
        t7 := 4 * i
        t8 := 4 * j
        t9 := a [ t8 ]
        a [ t7 ] := t9
        t10 := 4 * j
        a [ t10 ] := x
        goto B2
    B6: t11 := 4 * i
        x := a [ t11 ]
        t12 := 4 * i
        t13 := 4 * n
        t14 := a [ t13 ]
        a [ t12 ] := t14
        t15 := 4 * n
        a [ t15 ] := x

Optimized code:

    B1: i := m - 1
        j := n
        t1 := 4 * n
        v := a [ t1 ]
        t4 := 4 * j
        t2 := 4 * i
    B2: t2 := t2 + 4
        t3 := a [ t2 ]
        if t3 < v goto B2
    B3: t4 := t4 - 4
        t5 := a [ t4 ]
        if t5 > v goto B3
    B4: if t2 >= t4 goto B6
    B5: a [ t2 ] := t5
        a [ t4 ] := t3
        goto B2
    B6: t14 := a [ t1 ]
        a [ t2 ] := t14
        a [ t1 ] := t3

Global common expression elimination

copy propagation

dead code elimination

code motion

induction variables and reduction in strength

### Data Flow Analysis

Data Flow Analysis
• Global optimizations.
• We need to understand the flow of data to be able to change code wisely and correctly.
• This understanding: an analysis called data flow analysis or DFA
• It’s a set of algorithms, all having the same generic frame, and their specifics are determined by the information we are after.
• Used for optimizations and verification.
• Ambiguous TLAs…
The Idea
• Given a graph of “program constructs”
• Single instructions, Basic blocks, etc.
• Algorithm works in iterations
• In each iteration: update information for each node in the graph according to information in its neighbors.
• A global view is never necessary.
• The algorithm terminates when no node gets updated.
• Typical termination: knowledge “size” increases in each iteration and there is a limit on knowledge size.
DFA – The Generic Algorithm

General structure of any DFA algorithm:

N1…Nn: information about the n program nodes (variables, basic blocks, etc.)

    for i in {1…n}: initialize Ni
    bool change = true;
    while (change) {
        change = false;
        for i in {1…n}:
            M = new value for Ni (a function of the neighbors of Ni).
            if (M ≠ Ni) then
                change = true;
                Ni = M
    }

Specific instantiations of the DFA generic structure differ in the initialization of N, and the computation of the new values.
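The generic frame can be written down as a small runnable sketch. The names `run_dfa`, `init` and `update` are assumptions; each concrete analysis would supply its own initialization and update function, as the slide says.

```python
def run_dfa(nodes, init, update):
    """Iterate until no node's information changes.

    nodes:  iterable of node ids
    init:   init(n) returns the starting information for node n
    update: update(n, info) returns the new information for n, computed
            from the current information of its neighbors
    """
    info = {n: init(n) for n in nodes}
    change = True
    while change:
        change = False
        for n in info:
            new = update(n, info)
            if new != info[n]:
                info[n] = new
                change = True
    return info


# Hypothetical use: which "facts" flow forward along edges of a tiny graph.
preds = {1: [], 2: [1, 3], 3: [2]}
init = lambda n: {n} if n == 1 else set()
update = lambda n, info: init(n).union(*(info[p] for p in preds[n]))
result = run_dfa(preds, init, update)
```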

Example: Reaching Definitions
• A definition is an assignment of variable v.
• A definition d reaches point p in the program, if there is (at least one possible) execution path from the definition d to the point p such that there is no new definition of the same variable along the path.
• The influence of the assignment a = b+c:
• It uses the variables b and c,
• It kills any previous definition of a.
• It generates a new definition for a.
• Similarly, the instruction “if (a<3)” uses a but neither kills nor generates any definition.
Reformulating “Reaching Definitions”
• A definition: an assignment that gives a value to a variable v.
• Reaching: a definition d reaches a point p in the program if there is a path from the definition d to p such that d is not killed on the path.
• Finding reaching definitions:
• Find all definitions that reach any point in the program.
• This information can be used for optimizations.
• This seems to require going over all paths and all definitions in the entire program.
• (We usually think of the program as a single method (or routine). Inter-procedural analysis examines full modules, classes, or even whole programs. )
Computing Reaching Definitions with DFA
• Let Ni:v be the set of all reaching definitions of variable v in line i.
• There is a DFA variable for each program variable and each code line.
• We start with a subset and gradually enlarge it until it contains all reaching definitions.
• When there are no more possible enlargements available, we know we’re done.
• Usually, we don’t consider copying (v=u) as an assignment because v and u are usually united during the optimization.
Computing Reaching Definitions with DFA
• Recall that Ni:v is the set of all reaching definitions of variable v in line i.
• Initialization: if in line i there is an assignment of a constant, an expression, or a function to variable v, i.e., a non-copying assignment, then Ni:v = {i}. (In this case this is the final value.) Otherwise, Ni:v = ∅ (we need to compute this value in iterations).
• Iteration step:
• If in line i variable v is not updated, then Ni:v = Nx:v ∪ Ny:v ∪ Nz:v ∪ …, where x, y, z, ... are all lines from which we can directly go to line i.
• If line i contains a copy v=u, we set Ni:v = Ni:u.
Reaching Definitions: an Example

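    (1) if (b == 4) goto (4)
    (2) a = 5
    (3) goto (5)
    (4) a = 3
    (5) if (a > 4) goto (4)
    (6) c = a

| Line | Ni:a (initial) | Ni:c (initial) | Ni:a (iteration 1) | Ni:c (iteration 1) | Ni:a (iteration 2) | Ni:c (iteration 2) |
|---|---|---|---|---|---|---|
| 1 | ∅ | ∅ | ∅ | ∅ | ∅ | ∅ |
| 2 | {2} | ∅ | {2} | ∅ | {2} | ∅ |
| 3 | ∅ | ∅ | {2} | ∅ | {2} | ∅ |
| 4 | {4} | ∅ | {4} | ∅ | {4} | ∅ |
| 5 | ∅ | ∅ | {2,4} | ∅ | {2,4} | ∅ |
| 6 | ∅ | ∅ | {2,4} | {2,4} | {2,4} | {2,4} |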
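The example can be replayed with a short sketch of the per-line rules. The dictionaries `preds`, `defines` and `copies` are a hypothetical encoding of the six-line program, with the predecessor lists read off its jumps and fall-throughs.

```python
# Hypothetical encoding of the six-line example.
preds = {1: [], 2: [1], 3: [2], 4: [1, 5], 5: [3, 4], 6: [5]}
defines = {2: "a", 4: "a"}      # non-copy assignments: line -> variable
copies = {6: ("c", "a")}        # copy assignments: line -> (target, source)
variables = ["a", "c"]


def reaching_definitions():
    # Initialization: {i} where line i assigns v (final), otherwise the empty set.
    N = {(i, v): ({i} if defines.get(i) == v else set())
         for i in preds for v in variables}
    change = True
    while change:
        change = False
        for i in sorted(preds):
            for v in variables:
                if defines.get(i) == v:
                    continue                        # already final
                if i in copies and copies[i][0] == v:
                    new = set(N[i, copies[i][1]])   # copy v = u: take Ni:u
                else:
                    new = set().union(*(N[p, v] for p in preds[i]))
                if new != N[i, v]:
                    N[i, v] = new
                    change = True
    return N


N = reaching_definitions()
```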

Correctness Idea
• If x gets updated in some line i, then after k iterations, a line that can be executed k steps after i is updated and “knows” that i is a definition for x (if there is no closer definition of x on the path from i to it).
• Proof by induction on the “distance” of the definition from the set being updated.
• As the program is finite, the longest (non-cyclic) path in it is finite as well.
• Note that if there is an iteration with no updates, then there will not be any updates in subsequent iterations.
• We do not provide a full proof.
A Standard Saving
• Running iterations with all instructions is costly for large programs.
• A standard solution: run the iterations for basic blocks.
• Instead of working with the program instructions graph, we work with the control flow graph of the basic blocks.
• Obtain a smaller graph and (much) faster algorithm.
• Sometimes the operations inside the basic block cancel each other and then the computation becomes easier.
• The output is the reaching definition for each block and not each code line.
• Good enough for optimizations
• Can be easily extended for each line inside any given block.
How Does it Look Inside a Basic Block?
• We have seen earlier the impact of a single instruction like “a = b+c”.
• The impact of a basic block is the sum of all influences in the block.
How Does it Look Inside a Basic Block?
• A block uses a variable v if there exists an instruction p1 in the block that uses v, and there is no instruction p0 prior to p1 in the block that defines v.
• Simply put: p1 uses v’s value that was set before the block started.
• A block kills a definition d of variable v if there is an instruction in the block that defines v.
• A definition d of variable v is generated in a block if d is at location p1 in the block, and there is no instruction p2 subsequent to p1 in the block that defines v as well.
• Simply put: the generated definitions are the definitions of B that do not get killed inside B.
Basic Block Reaching Definitions
• Use DFA to find reaching definitions to all basic blocks.
• Data structure:
• IN[B]: all definitions reaching the beginning of B
• OUT[B]: all definitions reaching the end of B
• Each assignment gets a name di, and we compute ahead of time the two sets GEN[B] and KILL[B] for each block B.
• GEN[B]: set of all definitions generated in B, e.g. GEN[B]={d3,d7,d8}.
• KILL[B]: set of all definitions killed in B. In fact, set of all program definitions that set a value to a variable v that is also assigned in B.
Computing Reaching Definitions with DFA
• DFA Initialization: for each block B,
• IN[B] = ∅
• OUT[B] = ∅
• DFA step:
• For each block B, re-compute OUT[B] given IN[B], based only on the instructions of B.
• OUT[B] = ƒB(IN[B])
The DFA Step (cont’d)
• We need to compute reaching definitions in the end of the block given reaching definitions in the beginning.
• End-of-block reaching definitions = B’s generated definitions + (definitions that reach the beginning of B – B’s killed definitions)
• In other words: OUT[B] = ƒB(IN[B]) = GEN[B] ∪ (IN[B] \ KILL[B])
• To obtain IN[B] we do: IN[B] = OUT[b1] ∪ OUT[b2] ∪ … ∪ OUT[bk], where b1, b2, … , bk are the blocks that reach B directly.
• IN[B] is computed before OUT[B] (which depends on IN[B]).
An Example

    B1: d1: i = 1
        d2: m = a[0]
    B2: d3: t = a[i]
        if (t > m)
    B3: d4: m = t
    B4: d5: i = i + 1
        if (i < 10)
    B5:

| Block | IN (first iteration) | OUT (first iteration) | IN (final) | OUT (final) |
|---|---|---|---|---|
| B1 | ∅ | {d1,d2} | ∅ | {d1,d2} |
| B2 | {d1,d2} | {d1,d2,d3} | {d1,d2,d3,d4,d5} | {d1,d2,d3,d4,d5} |
| B3 | {d1,d2,d3} | {d1,d3,d4} | {d1,d2,d3,d4,d5} | {d1,d3,d4,d5} |
| B4 | {d1,d2,d3,d4} | {d2,d3,d4,d5} | {d1,d2,d3,d4,d5} | {d2,d3,d4,d5} |
| B5 | {d2,d3,d4,d5} | {d2,d3,d4,d5} | {d2,d3,d4,d5} | {d2,d3,d4,d5} |
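The IN/OUT values of this example can be reproduced with a sketch of the block-level algorithm. The control flow edges in `preds` are an assumption: the figure's arrows are lost, and these edges were inferred from the IN/OUT values on the slide. `KILL` lists only the other definitions of the same variables, which gives the same result because of the union with `GEN`.

```python
# Assumed CFG edges (inferred from the slide's IN/OUT values).
preds = {"B1": [], "B2": ["B1", "B4"], "B3": ["B2"],
         "B4": ["B2", "B3"], "B5": ["B4"]}
GEN = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}, "B4": {"d5"}, "B5": set()}
KILL = {"B1": {"d4", "d5"}, "B2": set(), "B3": {"d2"}, "B4": {"d1"}, "B5": set()}


def reaching_definitions(preds, GEN, KILL):
    IN = {b: set() for b in preds}
    OUT = {b: set() for b in preds}
    change = True
    while change:
        change = False
        for b in preds:
            # IN[B] is the union of OUT over the direct predecessors of B.
            IN[b] = set().union(*(OUT[p] for p in preds[b]))
            new_out = GEN[b] | (IN[b] - KILL[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                change = True
    return IN, OUT


IN, OUT = reaching_definitions(preds, GEN, KILL)
```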

Reaching Definitions with Basic Blocks
• As always, execution terminates when there are no modifications in one iteration.
• At the end, the reaching definitions of block B are IN[B].
• We will not prove the algorithm. Some properties:
• The values of IN and OUT are always a subset of their real values.
• Each iteration can only increase the sizes of the subsets.
• The final size of the sets is bounded (by the number of definitions in the program) and hence termination is guaranteed.
• Correctness:
• if definition d reaches block B in a path of k blocks, then after k iterations, IN[B] includes d.
• A definition that does not reach B will never enter IN[B].
Uses of Reachable Definitions
• Determine that a variable has a constant value at a given point.
• Identify a variable that is not initialized

    int i;
    if (…) i = 3;
    x = i; ← error: i might not have been initialized

• In OOP: identify an impossible downcast.
• And more…
Using DFA for Liveness Analysis
• Definition: A variable v is live in program point p if there is an execution path starting at p, in which there is a use of v before it is defined again.
• We’ve seen previously how to determine v’s liveness inside a basic block.
• By going backwards line by line in the block.
• Now let’s do the same computation for a full procedure (or program)
• We decide which variables are alive on entry to each basic block.
• Go “backwards” on the CFG.
DFA Initialization and Computation.
• IN[B] and OUT[B] are now sets of variables (and not sets of definitions).
• Initializing the DFA: for all blocks B, IN[B] = OUT[B] = ∅.
• Compute in advance:
• USE[B]: set of variables that B uses (without redefining them before the use).
• DEF[B]: set of variables that B generates a definition for.
• Computation step:
• OUT[B] = IN[b1] ∪ IN[b2] ∪ ... ∪ IN[bn], where b1,…,bn are all blocks directly reachable from B.
• IN[B] = ƒB(OUT[B]) = USE[B] ∪ (OUT[B] \ DEF[B])
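The backward computation can be sketched as below. The example CFG, the `USE`/`DEF` sets, and the assumption that the exit block B5 reads `m` are all hypothetical; they reuse the small loop from the reaching-definitions example only to have something concrete to run.

```python
def liveness(succs, USE, DEF):
    IN = {b: set() for b in succs}
    OUT = {b: set() for b in succs}
    change = True
    while change:
        change = False
        for b in succs:
            # OUT[B] is the union of IN over the direct successors of B.
            OUT[b] = set().union(*(IN[s] for s in succs[b]))
            new_in = USE[b] | (OUT[b] - DEF[b])
            if new_in != IN[b]:
                IN[b] = new_in
                change = True
    return IN, OUT


# Hypothetical program: B1: i = 1; m = a[0]   B2: t = a[i]; if (t > m)
#                       B3: m = t             B4: i = i + 1; if (i < 10)
#                       B5: assumed to read m
succs = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B4"],
         "B4": ["B2", "B5"], "B5": []}
USE = {"B1": {"a"}, "B2": {"a", "i", "m"}, "B3": {"t"}, "B4": {"i"}, "B5": {"m"}}
DEF = {"B1": {"i", "m"}, "B2": {"t"}, "B3": {"m"}, "B4": {"i"}, "B5": set()}
IN, OUT = liveness(succs, USE, DEF)
```

In this sketch only `a` is live on entry to B1, while `t` is live only between its definition in B2 and its use in B3.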
Another Example: Available Expression
• Typically, we assume an entry node B0 in the control flow graph, from which the computation starts.
• Definition: an expression x OP y is available at Point p if each path from the entry point to p has a computation of x OP y with no subsequent update of x or y before reaching p.
• Use for optimization: do not re-compute available expressions.
• Talking basic blocks:
• We say that a block kills the expression x OP y if the block assigns a value to x or y and does not re-compute x OP y after the assignment.
• A block generates the expression x OP y if it computes x OP y and does not update x or y after the computation.
Data, Initialization, Computation
• IN[B] and OUT[B] are sets of expressions.
• Initializing the DFA: IN[B] = OUT[B] = ∅ for all blocks B.
• Compute ahead of time:
• eKILL[B]: set of expressions that B kills by changing one of the variables in the expression.
• eGEN[B]: set of expressions that B generates.
• Computation step:
• IN[B] = OUT[b1] ∩ OUT[b2] ∩ … ∩ OUT[bn], where b1…bn are all blocks from which B is (directly) reachable.
• OUT[B] = ƒB(IN[B]) = eGEN[B] ∪ (IN[B] \ eKILL[B])
Is Everything Fine?
• Not Really…
• Consider the graph on the left.
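• IN[B2] = OUT[B1] ∩ OUT[B2].
• Suppose “x+y” is computed in B1 but not in B2 (and B2 does not kill it).
• Then it is available in B2, but IN[B2] will never see that: OUT[B2] starts empty, so the intersection stays empty.
• The problem: the outputs of B1 are available to B2 and should not be eliminated just because OUT[B2] was initialized to ∅.

[Figure: control flow graph in which B1 flows into B2, and B2 loops back to itself.]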

Solution: Proper Initialization
• Create an empty entry block B0 and set OUT[B0] = ∅.
• But for all other blocks set OUT[Bi] = U, where U is the set of all expressions computed in any basic block.
• The computation step remains IN[B] = OUT[b1] ∩ OUT[b2] ∩ … ∩ OUT[bn], where b1…bn are all blocks from which B is (directly) reachable, and OUT[B] = ƒB(IN[B]) = eGEN[B] ∪ (IN[B] \ eKILL[B]).
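The effect of the initialization can be seen in a short sketch. The function name, the `optimistic` flag (added only to contrast the two initializations), and the three-block graph with a self-loop on B2 are assumptions for illustration. The sketch also assumes that every block other than the entry has at least one predecessor.

```python
def available_expressions(preds, eGEN, eKILL, entry="B0", optimistic=True):
    U = set().union(*eGEN.values())     # all expressions computed in any block
    # OUT[B0] is empty; other blocks start at U (or empty, to show the problem).
    OUT = {b: (set(U) if optimistic and b != entry else set()) for b in preds}
    IN = {b: set() for b in preds}
    change = True
    while change:
        change = False
        for b in preds:
            if b == entry:
                continue
            # IN[B] is the intersection of OUT over the direct predecessors.
            IN[b] = set.intersection(*(OUT[p] for p in preds[b]))
            new_out = eGEN[b] | (IN[b] - eKILL[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                change = True
    return IN, OUT


# B0 -> B1 -> B2, and B2 loops back to itself; only B1 computes x+y.
preds = {"B0": [], "B1": ["B0"], "B2": ["B1", "B2"]}
eGEN = {"B0": set(), "B1": {"x+y"}, "B2": set()}
eKILL = {"B0": set(), "B1": set(), "B2": set()}
IN_good, _ = available_expressions(preds, eGEN, eKILL)
IN_bad, _ = available_expressions(preds, eGEN, eKILL, optimistic=False)
```

With the initialization to U, `x+y` shows up in IN[B2]; with the empty initialization it never does, which is exactly the problem described above.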


Example

B0

OUT[B0] = ∅

Z=x+y

OUT[B1] = U

W=x+y

OUT[B2] = U

Z=x+y

X=7

OUT[B3] = U

W=x+y

OUT[B4] = U

Example

B0

OUT[B0] = ∅

IN[B1] = ∅

Z=x+y

{x+y}

OUT[B1] = U

{x+y}

IN[B2] = U

W=x+v

{x+y , x+v}

OUT[B2] = U

{x+y , x+v}

IN[B3] = U

X=7

OUT[B3] = U

IN[B4] = U

W=x+y

{x+y}

OUT[B4] = U

Correctness
• Information flows from blocks to their neighbors during DFA steps.
• The values in IN[B0] and OUT[B0] are always empty and correct.
• The values of OUT[B] are always a superset of the available expressions on exit from B.
• Induction: after n iterations, OUT[B] is correct for all blocks whose distance from B0 is less than n.
• Proof idea: if there is a path of length n from B0 to block B in which “x+y” is not computed, then after n iterations OUT[B] will not include “x+y”.
• We do not provide a full proof.
• But note that the initialization is crucial.
Optimizations Summary
• Improve performance, while preserving semantics.
• A good register allocation is crucial.
• DAG representation helps.
• Often, aliasing makes things tougher.
• Basic optimizations: common subexpression elimination, copy propagation, code motion, strength reduction, dead-code elimination.
• Local optimization framework: peephole optimization
• A generic algorithm for Data Flow Analysis.
• DFA examples: reaching definitions (at the instruction and at the basic block level), liveness analysis, available expressions.