The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011


# The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011



The alias calculus

Bertrand Meyer

ITMO Software Engineering Seminar, June 2011

Claims
• Theory:
• Theory of aliasing
• Loss of precision is small
• New concepts, in particular inverse variables
• Abstract, does not mention stack & heap
• Simple, implementable
• Insights into the essence of object-oriented programming
• Practice:
• Alias calculus
• Almost entirely automatic
• Implemented
Reference

Steps Towards a Theory and Calculus of Aliasing

International Journal of Software and Informatics

July 2011

http://se.ethz.ch/~meyer/publications/aliasing/alias-revised.pdf

The question under study

Given expressions e and f (of reference types) and a program location p:

At p, can e and f ever be attached to the same object?

• (If so, we say that e and f are aliased to each other, meaning potentially aliased.)

“Given” expressions only

Consider

from y := x loop

y := y.a

end

• y may become aliased to:
• x, x.a, x.a.a, x.a.a.a, etc.
• (infinite set of expressions!)

[Figure: chain of objects reachable from x through successive a links]

An example of alias analysis

Consider two linked list structures known through x and y:

[Figure: two linked list structures known through x and y, with links labeled right and item]

• Computing the alias relation shows that:
• If x ≠ y, then no cell reachable from x ( or ) can be reached from y ( or ), and conversely
• Without this assumption, such aliasing is possible
Why alias analysis is important

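x.set_a (c)

Understand as

x.a := c

[Figure: objects x and y with field a, and objects b and c. Annotations: -- y.a = b before the call; -- x.a = c after it (from -- c = c, i.e. True); whether -- y.a = b still holds after the call is marked "?"]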

1. Without it, cannot apply standard proof techniques to programs involving pointers

2. Concurrent program analysis, in particular deadlock

3. Program optimization

Basic notion

Definition:

A binary relation is an alias relation if it is symmetric and irreflexive

Not necessarily transitive:

if c then

x := y

else

y := z

end

Can alias x to y

and y to z

but not x to z

Formulae of interest

The calculus defines, for any instruction p and any alias relation a, the value of

a » p

denoting:

The aliasing relation resulting from executing p from an initial state in which the aliasing relation is a

For an entire program: compute ∅ » p

The programming language

p, q, …: instructions

x, y, …: variables

E0: basic constructs

• skip
• create x
• x := y
• forget x (Eiffel: x := Void; Java etc.: x = null;)
• (p ; q)
• then p else q end

E1: cut

• cut x, y

E2: loops

• p^n -- for integer n
• loop p end

E3: procedures

• r do p end
• call r

E4: O-O

• z := x.y
• x.r …
• Current
Describing an alias relation
• Set of binary relations on E; formally: P(E × E)
• If r is a relation in E ↔ E, the following is an alias relation:

$\overline{r} \triangleq (r \cup r^{-1}) - Id[E]$ -- Id [E]: identity on E; −: set difference

• Example: $\overline{\{[x, x], [x, y], [y, z]\}}$ = {[x, y], [y, x], [y, z], [z, y]}
• Generalized to sets:
• $\overline{\{x, y, z\}}$ = {[x, y], [y, x], [x, z], [z, x], [y, z], [z, y]} (“complete” alias relation)
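The two constructions above can be sketched in a few lines of Python. This is an illustrative sketch only: representing an alias relation as a set of ordered pairs of strings, and the function names `alias_closure` and `complete`, are assumptions and do not come from the slides.

```python
from itertools import permutations

def alias_closure(r):
    """Symmetrize r, then remove reflexive pairs: (r ∪ r⁻¹) − Id."""
    symmetric = set(r) | {(b, a) for (a, b) in r}
    return {(a, b) for (a, b) in symmetric if a != b}

def complete(exprs):
    """Complete alias relation on a set of expressions:
    every ordered pair of distinct elements."""
    return set(permutations(exprs, 2))

# The example relation from the slide: {[x, x], [x, y], [y, z]}
example = alias_closure({("x", "x"), ("x", "y"), ("y", "z")})
full = complete({"x", "y", "z"})
```

Since a complete alias relation is already symmetric and irreflexive, applying `alias_closure` to it leaves it unchanged.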
Canonical form & alias diagrams

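Canonical form of an alias relation: union of complete alias relations, e.g.

$\overline{x, y}, \overline{y, z}, \overline{x, u, v}$, meaning $\overline{\{x, y\}} \cup \overline{\{y, z\}} \cup \overline{\{x, u, v\}}$

None of the sets of expressions is a subset of another

• An alias diagram:
• (not canonical)

[Figure: alias diagram, not in canonical form]

• Make it canonical:

[Figure: the same alias diagram in canonical form]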
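One possible way to obtain such a form from a relation given as pairs is to list the maximal sets of mutually aliased expressions (the maximal cliques of the alias graph). The sketch below uses the Bron-Kerbosch enumeration for that; the choice of algorithm and the name `canonical_form` are assumptions, not something the slides prescribe.

```python
from itertools import permutations

def canonical_form(a):
    """Return the maximal sets of mutually aliased expressions.
    Assumes a is symmetric and irreflexive (an alias relation)."""
    nodes = {e for pair in a for e in pair}
    neighbours = {n: {m for (k, m) in a if k == n} for n in nodes}
    cliques = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch without pivoting: report a clique once it
        # cannot be extended by any remaining or excluded vertex.
        if not candidates and not excluded:
            cliques.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v},
                   candidates & neighbours[v],
                   excluded & neighbours[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(nodes), set())
    return {c for c in cliques if len(c) >= 2}

# The relation used on the slide, built from three complete relations.
a = (set(permutations("xy", 2))
     | set(permutations("yz", 2))
     | set(permutations("xuv", 2)))
result = canonical_form(a)
```

Here `result` contains the three sets {x, y}, {y, z} and {x, u, v}, none of which is a subset of another.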

Alias calculus for basic operations (E0)

a » skip = a

a » (then p else q end) = (a » p) ∪ (a » q)

a » (p ; q) = (a » p) » q

a » (forget x) = a \- {x}

a » (create x) = a \- {x}

a » (x := y) = a [x: y] -- Override, see next

a \- {x}: a deprived of all pairs involving x

e.g. x, y, y, z, x, u, v \- {x, u} = y, z, u, v

The forgetrule

a » (forget x) = a \- {x} -- a deprived of all pairs involving x

[Figure: alias diagrams illustrating forget x]

The assignment rule (E0)

a » (x := y) = a [x: y]

with:

a [x: y] = given

b = a \- {x} -- a deprived of all pairs involving x

then

b ∪ $\overline{\{x\} \times (b / y)}$ -- {x} × (b / y): all pairs [x, u] where u is either aliased to y in b or y itself; overline: symmetrize and de-reflect

end

Operations on alias relations

For an alias relation a in E ↔ E, an expression x, and a set of expressions A ⊆ E, the following are alias relations:

r \– A = r — E × A -- “Minus”; E: set of all expressions

a / y = {z: E | (z = y) ∨ [y, z] ∈ a} -- “Quotient”, similar to equivalence class in equivalence relation

The assignment rule (E0)

Value of a » (x := y)

a [x: y] = given

b = a \- {x} -- a deprived of all pairs involving x

then

b ∪ $\overline{\{x\} \times (b / y)}$ -- b / y: all u aliased to y in b, plus y itself; {x} × (b / y): all pairs [x, u] where u is either aliased to y in b or y itself; overline: symmetrize and de-reflect

end
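The assignment rule can be tried out with a small Python sketch. It assumes an alias relation is a set of ordered pairs of variable names and covers only plain variables (E0); the function names `minus`, `quotient`, `assign` and `complete` are illustrative, and the starting relation is a hypothetical one chosen for the demonstration.

```python
from itertools import permutations

def complete(exprs):
    """Complete alias relation: all ordered pairs of distinct elements."""
    return set(permutations(exprs, 2))

def minus(a, names):
    """'Minus': a deprived of all pairs involving a name in names."""
    return {(u, v) for (u, v) in a if u not in names and v not in names}

def quotient(a, y):
    """'Quotient': y itself plus everything aliased to y in a."""
    return {y} | {v for (u, v) in a if u == y}

def assign(a, x, y):
    """Value of a [x: y] for plain variables x and y."""
    b = minus(a, {x})
    new_pairs = {(x, u) for u in quotient(b, y)}
    # Symmetrize and de-reflect the new pairs before adding them to b.
    symmetric = new_pairs | {(u, w) for (w, u) in new_pairs}
    return b | {(u, w) for (u, w) in symmetric if u != w}

# Hypothetical starting relation: x aliased to y, and x, u, v mutually aliased.
before = complete("xy") | complete("xuv")
after = assign(before, "z", "x")
```

With this starting relation, `z := x` makes z an alias of x and of everything x could be aliased to, so `after` is the union of the complete relations on {x, y, z} and {x, u, v, z}.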

Assignment example 1

[Figure: alias diagrams before and after z := x]

Assignment example 2

[Figure: alias diagrams before and after x := u]

Assignment example 3

[Figure: alias diagrams before and after x := z]

The assignment rule (E0)

Value of a » (x := y)

a [x: y] = given

b = a \- {x}

then

b ∪ $\overline{\{x\} \times (b / y)}$

end

The cut instruction

E0: basic constructs

E1: cut

• E1 is E0 plus the instruction

cut x, y

• Semantics: remove aliasing, if any, between x and y
Cut example 1

[Figure: alias diagrams before and after cut x, y]

Cut example 2

[Figure: alias diagrams before and after cut x, u]

Cut rule

a » cut x, y = a − $\overline{x, y}$ -- −: set difference

The role of cut

cut x, y informs the alias calculus with non-alias properties coming from other sources

Example:

if m < n then x := u else x := y end

m := m + 1

if m < n then z := x end

Alias relation: $\overline{x, u}, \overline{x, y}$ after the first conditional; $\overline{x, u, z}, \overline{x, y, z}$ after the last one

But here x cannot be aliased to y (only to u). The alias theory does not know this property!

To take advantage of it, add the instruction

cut x, y;

This expression represents

check x /= y end (Eiffel)

assert x != y ; (JML, Spec#)
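The effect of cut on this example can be checked with a small evaluator for E0 plus cut. This is a sketch under stated assumptions: programs are encoded as nested tuples (an encoding invented here), `m := m + 1` is modeled as skip because it involves no references, and the cut is placed at the start of the then branch of the second conditional, which is one plausible reading of the slide.

```python
def assign(a, x, y):
    """a [x: y] for plain variables: drop x, then alias x to y and y's aliases."""
    b = {(u, v) for (u, v) in a if x not in (u, v)}
    targets = {y} | {v for (u, v) in b if u == y}
    targets.discard(x)  # keep the relation irreflexive
    return b | {(x, t) for t in targets} | {(t, x) for t in targets}

def cut(a, x, y):
    """a » cut x, y: remove the aliasing between x and y, if any."""
    return a - {(x, y), (y, x)}

def run(a, p):
    """Compute a » p for the tuple-encoded instruction p."""
    kind = p[0]
    if kind == "skip":
        return a
    if kind == "assign":
        return assign(a, p[1], p[2])
    if kind == "cut":
        return cut(a, p[1], p[2])
    if kind == "seq":
        for q in p[1:]:
            a = run(a, q)
        return a
    if kind == "cond":  # then p else q end: union of both branches
        return run(a, p[1]) | run(a, p[2])
    raise ValueError(kind)

first = ("cond", ("assign", "x", "u"), ("assign", "x", "y"))
plain = ("seq", first, ("skip",), ("cond", ("assign", "z", "x"), ("skip",)))
with_cut = ("seq", first, ("skip",),
            ("cond", ("seq", ("cut", "x", "y"), ("assign", "z", "x")), ("skip",)))

without = run(set(), plain)
guarded = run(set(), with_cut)
```

Without the cut, z ends up potentially aliased to y; with it, z can only be aliased to x and u.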

Introducing repetitions

E0: basic constructs

E1: cut

E2: loops

• E2 is E1 plus:
• p^n (for integer n): n executions of p (auxiliary notion)
• loop p end : any sequence (incl. empty) of executions of p
E2 alias calculus

a » p^0 = a

a » p^(n+1) = (a » p^n) » p -- For n ≥ 0

-- Also equal to (a » p) » p^n

a » (loop p end) = $\bigcup_{n \in \mathbb{N}}$ (a » p^n)

Loop aliasing theorem (1)
• a » (loop p end) = $\bigcup_{n \in \mathbb{N}}$ (a » p^n)

For any a and p, there exists a constant N in ℕ such that

a » (loop p end) = $\bigcup_{n: 0..N}$ (a » p^n)

Proof: the sequence

s_n = $\bigcup_{k: 0..n}$ (a » p^k)

is non-decreasing (with respect to inclusion) on a finite set

More generally, for every construct p of E2, the function

λ a | (a » p)

is non-decreasing

Loop aliasing theorem (2)

• a » (loop p end) is also the fixpoint of the sequence

t_0 = a

t_(n+1) = t_n ∪ (t_n » p)

• Gives a practical way to compute a » (loop p end)

Proof: by induction. If s_n is the original sequence $\bigcup_{k: 0..n}$ (a » p^k), prove separately s_n ⊆ t_n and t_n ⊆ s_n
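The fixpoint sequence translates directly into an iteration. In the sketch below the loop body is passed as a Python function from alias relations to alias relations; the helper `assign` and the sample body `x := y ; y := z` are assumptions made for the demonstration.

```python
def assign(a, x, y):
    """a [x: y] for plain variables: drop x, then alias x to y and y's aliases."""
    b = {(u, v) for (u, v) in a if x not in (u, v)}
    targets = {y} | {v for (u, v) in b if u == y}
    targets.discard(x)  # keep the relation irreflexive
    return b | {(x, t) for t in targets} | {(t, x) for t in targets}

def loop(a, body):
    """Compute a » (loop p end) by iterating t_(n+1) = t_n ∪ (t_n » p),
    where body(t) stands for t » p, until the relation stops growing."""
    t = set(a)
    while True:
        nxt = t | body(t)
        if nxt == t:
            return t
        t = nxt

# Hypothetical loop body: x := y ; y := z
body = lambda t: assign(assign(t, "x", "y"), "y", "z")
result = loop(set(), body)
```

The iteration stops because each step can only add pairs over a finite set of variables. For this body, starting from the empty relation, x, y and z all end up potentially aliased to each other.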

Introducing procedures: E3

E0: basic constructs

E1: cut

E2: loops

E3: procedures

• A program is now a sequence of procedure definitions (one designated as main):

r_i (f) do p_i end

• Instructions: as before, plus

call r_i (a) -- Procedure call

• Alias calculus notations:
• r denotes body of r (i.e. r_i = p_i)
• r denotes formals of r (here f)
Handling arguments

Generalize notation a [x: y] to lists: use

• a [a: b]

as abbreviation for

• (…((a [a1: b1]) [a2: b2]) … [an: bn])
• For example: a [r : a]
• The calculus will treat
• call r (a) as
• r := a ; call r -- i.e. formal1 := actual1; … ; formaln := actualn
• (With recursion, possible loss of precision)
Call rule

With arguments:

• a » call r (a) = a [r: a] » r -- in a [r: a], r is the formal arguments of r; the final r is the body of r

Without arguments:

a » call r = a » r -- right-hand r: body of r

Using the call rule
• a » call r (a) = a [r: a] » r
• Because of recursion, no longer just definition but equation
• For entire set of procedures P, this gives a vector equation

a » P = AL (a » P)

• Interpret as fixpoint equation and solve iteratively
• (Fixpoint exists: increasing sequence on finite set)
Object-oriented mechanisms: E4

E0: basic constructs

E1: cut

E2: loops

E3: procedures

E4: O-O

• “General relativity”:
• 1. Qualified expressions: x.y. Can be used as source (not target!) of assignments

x := y.z

• 2. Qualified calls: call x.r (v)
• 3. Current
Assignment (original rule)

Value of a » (x := y)

Example:

x := y.z

a [x: y] = given

b = a \- {x} -- a deprived of all pairs involving x

then

b ∪ $\overline{\{x\} \times (b / y)}$ -- b / y: all u aliased to y in b, plus y itself; {x} × (b / y): all pairs [x, u] where u is either aliased to y in b or y itself. This includes [x, y]!

end

Assigning a qualified expression
• x := x y

x

y

x

x

z

x does not get aliased tox y!

(only to any z that was aliased tox y)

:= x y

Assignment rule revisited

Value of a » (x := y)

Example:

x := y.z

a [x: y] = given

b = a \– {x} -- a deprived of all pairs involving x or an expression starting with x

then

b ∪ $\overline{\{x\} \times (b / y)}$

end

Alias diagrams (E0 to E3)

Single source node (represents stack)

Each value node represents a set of possible run-time values

Links: only from source to value nodes (will become more interesting with E4!)

Edge label: set of variables; indicates they can all be aliased to each other

[Figure: alias diagram with one source node and value nodes; edge labels "x, y", "y, z", and "x, u, v"]

Alias diagrams (E4)

Links may now exist between value nodes (now called object nodes)

Cycles possible (see next)

[Figure: alias diagram for x := x.y, with a source node, object nodes, and links labeled x, y, and z]

New laws, inverse variables

x Current= x

Current x = x

x’ x = Current

x x’ = Current

Current’ = Current

New form of call: qualified

In E4:

call x.r (a, b, …)

Distribution operator: “.”

For a list a = <u, v, w, …>:

x.a = <x.u, x.v, x.w, …>

For a relation r in E ↔ E:

x.r = {[x.u, x.v] | [u, v] ∈ r}

Example:

x.($\overline{u, v, w}, \overline{u, y}$) = $\overline{x.u, x.v, x.w}, \overline{x.u, x.y}$
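Distribution combined with the laws for Current and inverse variables can be sketched as follows. The encoding is an assumption made for illustration: an expression is a tuple of names, Current is written `("Current",)`, and the inverse of x is written with an ASCII apostrophe as `"x'"`.

```python
from itertools import permutations

def inverse(tag):
    """Inverse of a variable name: x <-> x'."""
    return tag[:-1] if tag.endswith("'") else tag + "'"

def normalize(path):
    """Simplify a dotted expression using x.Current = x, Current.x = x,
    x'.x = Current, x.x' = Current and Current' = Current."""
    out = []
    for tag in path:
        if tag in ("Current", "Current'"):
            continue
        if out and out[-1] == inverse(tag):
            out.pop()  # adjacent x.x' or x'.x collapses to Current
        else:
            out.append(tag)
    return tuple(out) or ("Current",)

def dot(x, relation):
    """x.r = {[x.u, x.v] | [u, v] in r}, with each expression simplified."""
    return {(normalize((x,) + u), normalize((x,) + v)) for (u, v) in relation}

# A relation inside a routine: v, u, x'.c and x'.d are mutually aliased.
inner = set(permutations([("v",), ("u",), ("x'", "c"), ("x'", "d")], 2))
outer = dot("x", inner)
```

Prefixing with x turns `x'.c` and `x'.d` back into `c` and `d`, so `outer` is the complete relation on x.v, x.u, c and d.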

Handling arguments (unqualified call)

a » call r (a) = a [r: a] » r

Handling arguments: unqualified call

Was written r

a » call r (a) = (a [r: a]) » call r

Handling arguments: qualified call

a » call x.r (a) = (a [x.r: a]) » call x.r

• Treat
• call x.r (a) as
• x.formals := a ; call x.r

[Figure: Current and target objects linked by x and x’]
Handling arguments: an example

Eiffel: x.r (a, b)

With, in a class C:

r (t: T ; u: U)

Handled as:

x.t := a

x.u := b

call x.r

[Figure: Current and target objects linked by x and x’]

Without arguments: unqualified call rule

Body of r

a » call r = a » r

Qualified call rule
• Example:
• d := c
• x.r (d)
• with
• r (u: U)
• do
• v := u
• end
• Handled as:
• d := c
• call x.r
• with
• r
• do
• u := x’.d
• v := u
• end

[Figure: Current and target objects linked by x and x’ (inverse variable); labels c, d and u, v]

a » call x.r = x.((x’.a) » r)

The rule in action
• d := c
• call x.r
• with
• r
• do
• u := x’.c
• v := u
• end
• Alias relation: $\overline{c, d}$
• Prefix with x’: $\overline{x’.c, x’.d}$
• After u := x’.c: $\overline{u, x’.c, x’.d}$
• After v := u: $\overline{v, u, x’.c, x’.d}$
• Prefix with x: $\overline{x.v, x.u, c, d}$

[Figure: alias diagrams for each step, with Current and target objects linked by x and x’]

a » call x.r = x.((x’.a) » r)

About the qualified call rule

a » call x r= x  ((x’ a) » r)

• Thus we are permitted to prove that the unqualified call creates certain aliasings, on the assumption that it starts in its own alias environment but has access to the caller’s environment through the inverted variable, and then to assert categorically that the qualified call has the same aliasings transposed back to the original environment. This change of environment to prove the unqualified property, followed by a change back to the original environment to prove the qualified property, explains well the aura of magic which attends a programmer\'s first introduction to object-oriented programming.
The full qualified call rule

• As two separate rules:
• a » call x.r = x.((x’.a) » r)
• a » call x.r (a) = (a [x.r: a]) » call x.r
• As a single rule:
• a » call x.r (a) = x.((x’.a [r: x’.a]) » r) \– x.r -- \– x.r: hide internals of r
Termination?

The original termination argument does not hold any more

Consider

from y := x loop

y := y.a

end

y may become aliased to:

x, x.a, x.a.a, x.a.a.a, etc.

(infinite set of expressions!)

[Figure: chain of objects reachable from x through successive a links]

Termination: the question under study

Given expressions e and f (of reference types) and a program location p:

At p, can e and f ever be attached to the same object?

The alias calculus

a [x: y] = given b = a \- {x} then b ∪ $\overline{\{x\} \times (b / y)}$ end

• Plus:
• x.Current = x
• Current.x = x
• x’.x = Current
• x.x’ = Current
• Current’ = Current

• a » skip = a
• a » (then p else q end) = (a » p) ∪ (a » q)
• a » (p ; q) = (a » p) » q
• a » (forget x) = a \- {x}
• a » (create x) = a \- {x}
• a » (x := y) = a [x: y]
• a » cut x, y = a − $\overline{x, y}$
• a » p^0 = a
• a » p^(n+1) = (a » p^n) » p
• a » (loop p end) = $\bigcup_{n \in \mathbb{N}}$ (a » p^n)
• a » call r (a) = (a [r: a]) » r
• a » call x.r (a) = x.((x’.(a [x.r: a])) » r) \– x.r

There is no backward alias calculus

x := z

x, z ?

?

x, y

Consider

create y

create z

x := y

Notation

Targets of an instruction p:

p

This is the set of variables that p may modify

e.g.

(x := y ; y := z) = {x, y}

Semantics of the alias calculus

Let Var be the set of variables and a an alias relation, the following assertion expresses that there is no aliasing except as implied by a:

$a^{-} \triangleq \bigwedge_{[x, y] \in (Var \times Var) - Id - a} x \neq y$

Weak soundness of definition of » for an instruction p:

{a⁻} p {(a » p)⁻}

Soundness has less demanding precondition but assumes some white-box knowledge:

{(a \- p)⁻} p {(a » p)⁻} -- We may ignore variables modified by p

There is no backward rule!

Consider the definition of weak soundness:

{a⁻} p {(a » p)⁻}

It is possible to reconstruct a from a⁻, but not from (a \- p)⁻

(Same for strong soundness)

Why alias analysis is important

1. Without it, cannot apply standard proof techniques to programs involving pointers

2. Concurrent program analysis, in particular deadlock

3. Program optimization

Coffman deadlock
• Not the same thing as reverse of liveness
• Consider a set of processors and a set of resources. At every execution time t, for every processor p, two disjoint sets of resources are defined:
• H_t (p) -- Has set: resources that p has acquired
• W_t (p) -- Wait set: resources that p has requested
• A deadlock exists if for some set D of processors:
• ∀ p: D | ∃ p’: D | W_t (p) ∩ H_t (p’) ≠ ∅
• (In such a case p ≠ p’)
Absolute deadlock freedom
• A system, made of a set of program elements E, is absolutely deadlock-free if for all times t
• ∀ r: E | ∀ r’: E | W_t (r) ∩ H_t (r’) = ∅
• (Works for r = r’ since W_t (r) ∩ H_t (r) = ∅)
• Can also be written:
• ∀ r: E | ∀ r’: E | ∀ a: W_t (r) | a ∉ H_t (r’)
Strategy for detecting deadlock
• ∀ r, r’: E | r // r’ ⇒ (W (r) ∩ H (r’) = ∅)
• For every program element r:
• Compute W (r) and H (r)
• Determine with which other elements r’ it can run parallel
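Once the W and H sets and the "can run in parallel" pairs are known, the check itself is a set intersection. The sketch below assumes they are supplied as plain Python dictionaries and a list of pairs; the function name and the sample data are hypothetical and are not taken from the slides.

```python
def deadlock_candidates(wait, has, parallel):
    """Return the pairs (r, r2) that may run in parallel and for which
    W(r) ∩ H(r2) is not empty, i.e. r may wait for something r2 holds."""
    return [(r, r2) for (r, r2) in parallel if wait[r] & has[r2]]

# Hypothetical program elements: each holds one fork and waits for the other.
has = {"eat_a": {"f1"}, "eat_b": {"f2"}}
wait = {"eat_a": {"f2"}, "eat_b": {"f1"}}
parallel = [("eat_a", "eat_b"), ("eat_b", "eat_a")]
risky = deadlock_candidates(wait, has, parallel)

# Same elements with empty wait sets: nothing is flagged.
no_wait = {"eat_a": set(), "eat_b": set()}
safe = deadlock_candidates(no_wait, has, parallel)
```

An empty result means the condition W (r) ∩ H (r’) = ∅ holds for every listed pair; a non-empty result only signals a potential deadlock, since the sets are computed conservatively.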
The SCOOP model
• A primary characteristic of SCOOP is that the model removes the distinction between resources and processors
• Properties:
• Any processor p is such that p ∈ H (p)
• Any variable or expression e has an associated processor, its handler <e>
• Any execution of a qualified call x.f (a, b, …) (separate or not) satisfies x ∈ H (<Current>)
• For any call r of actual arguments args including uncontrolled arguments U, W (r) is {<a> | a ∈ U}
• Without lock passing, H (r) for any program element r in a routine or formal separate arguments S is <Current> ∪ {<a> | a ∈ S}
Processor abstraction
• In SCOOP:
• Any variable or expression e has an associated processor, its handler <e>
• The computation of H and W sets only involves the handler
• Strategy:
• Processor abstraction: identify every variable or expression e with its processor <e>
• Perform alias analysis
• Use it to compute H and W sets as unions for all possible aliases
• Check W (r) ∩ H (r’) = ∅ for any two calls r, r’, including when they are the same call
Dining philosophers

H = {x, y}, H = {f1, f2, …}, W = ∅

H = {z}, W = {right}, H = {f1, f2, …}, W = {f1, f2, …}

class MEAL create make feature

p1, p2: separate PHILOSOPHER ; f1, f2: separate FORK

make do create f1; create f2; create p1.make (f1, f2); create p2.make (f2, f1) end

go_right (a, b: separate PHILOSOPHER)

do p1.eat_right; p2.eat_right end

go_wrong (a, b: separate PHILOSOPHER)

do p1.eat_wrong; p2.eat_wrong end

end

class PHILOSOPHER create make feature

left, right: separate FORK

make (u, v: separate FORK) do left := u ; right := v end

eat_right do pick_two (left, right) end

pick_two (x, y: separate FORK) do x.use; y.use end

eat_wrong do pick_in_turn (left) end

pick_in_turn (z: separate FORK) do pick_two (z, right) end

end

Claims
• Theory:
• Theory of aliasing
• Loss of precision is small
• New concepts, in particular inverse variables
• Abstract, does not mention stack & heap
• Simple, implementable
• Insights into the essence of object-oriented programming
• Practice:
• Alias calculus
• Almost entirely automatic
• Implemented
Approaches for comparison

Separation logic

Shape analysis (with abstract interpretation)

Ownership

Dynamic frames

Hoare-style reasoning

Assignment rule:

{P (e)} x := e {P (x)}

Hoare-style reasoning

require

y + 1 < 3

do

-- y + 1 < 3

x := whatever + 10000

-- y + 1 < 3

y := y + 1

ensure

y < 3

end

Assignment rule:

{P (e)} x := e {P (x)}

The effect of pointers (references)

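x.set_a (c)

Understand as

x.a := c

[Figure: objects x and y with field a, and objects b and c. Annotations: -- y.a = b before the call; -- x.a = c after it (from -- True); whether -- y.a = b still holds after the call is marked "?"]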

The question under study

Given expressions e and f (of reference types) and a program location p:

At p, can e and f ever be attached to the same object?

The effect of pointers : with alias analysis

x.set_a (c)

Understand as

x.a := c

x may be aliased to y

[Figure: objects x and y with field a, and objects b and c. Annotations: -- $\overline{x, y}$ (alias relation) before and after the call; -- y.a = b; -- x.a = b]

Alias relations

Relation of interest:

“In the computation, e might become aliased to f”

Definition:

A binary relation is an alias relation if it is symmetric and irreflexive

Not necessarily transitive:

if c then

x := y

else

y := z

end

Can alias x to y

and y to z

but not x to z

Alias diagrams (E0 to E3)

Single source node (represents stack)

Each value node represents a set of possible run-time values

Links: only from source to value nodes (will become more interesting with E4!)

Edge label: set of expressions; indicates they can all be aliased to each other

In canonical form: no label is subset of another; each label has at least 2 expressions

[Figure: alias diagram with one source node and value nodes; edge labels "x, y", "y, z", and "x, u, v"]

Approaches for comparison

Separation logic

Shape analysis

Ownership

Dynamic frames

Achievements
• a » call x f= x  ((x’ a) » call f)

Theory of aliasing

Simple (about a dozen rules)

New concepts: inverse variables, modeling Current

Graphical formalism (alias diagrams), canonical form

Implemented

Almost entirely automatic (except for occasional cut)

Small loss of precision, i.e. not too conservative

Abstract: does not mention stack and heap

Covers object-oriented programming

Faithful to O-O spirit; see qualified call rule

Can cover full modern O-O language

Potential solution to “frame problem”