
# The alias calculus, Bertrand Meyer, ITMO Software Engineering Seminar, June 2011

The alias calculus Bertrand Meyer ITMO Software Engineering Seminar June 2011. Claims. Theory: Theory of aliasing Loss of precision is small New concepts, in particular inverse variables Abstract, does not mention stack & heap Simple, implementable


The alias calculus

Bertrand Meyer

ITMO Software Engineering Seminar, June 2011

• Theory:

• Theory of aliasing

• Loss of precision is small

• New concepts, in particular inverse variables

• Abstract, does not mention stack & heap

• Simple, implementable

• Insights into the essence of object-oriented programming

• Practice:

• Alias calculus

• Almost entirely automatic

• Implemented

Steps Towards a Theory and Calculus of Aliasing

International Journal of Software and Informatics

July 2011

http://se.ethz.ch/~meyer/publications/aliasing/alias-revised.pdf

Given expressions e and f (of reference types) and a program location p:

At p, can e and f ever be attached to the same object?

• (If so, we say that e and f are aliased to each other, meaning potentially aliased.)

Consider

from y := x loop

y := y.a

end

• y may become aliased to:

• x, x.a, x.a.a, x.a.a.a, etc.

• (infinite set of expressions!)

Consider two linked list structures known through x and y (cells carry right and item links):

• Computing the alias relation shows that:

• If x ≠ y, then no cell reachable from x (through right or item) can be reached from y (through right or item), and conversely

• Without this assumption, such aliasing is possible

-- c = c, i.e. True

-- y.a = b

x.set_a (c) -- Understand as x.a := c

-- x.a = c

-- y.a = b ?

[Figure: x and y, the field a, and the objects b and c; set_a (c) is called on x]

1. Without it, cannot apply standard proof techniques to programs involving pointers

2. Concurrent program analysis, in particular deadlock

3. Program optimization

Definition:

A binary relation is an alias relation if it is symmetric and irreflexive

Not necessarily transitive:

if c then

x := y

else

y := z

end

Can alias x to y and y to z but not x to z
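A minimal Python sketch of the non-transitivity remark, assuming an alias relation is stored as a set of ordered pairs and that each branch starts from the empty relation (the helper names `closure` and `is_transitive` are illustrative, not from the talk):

```python
def closure(pairs):
    """Symmetrize a set of pairs and drop reflexive ones."""
    return {(e, f) for (u, v) in pairs for (e, f) in ((u, v), (v, u)) if e != f}

def is_transitive(rel):
    """True if [e, f] and [f, g] in rel always imply [e, g] (for e != g)."""
    return all((e, g) in rel
               for (e, f) in rel
               for (f2, g) in rel
               if f == f2 and e != g)

then_branch = closure({("x", "y")})   # effect of x := y on the empty relation
else_branch = closure({("y", "z")})   # effect of y := z on the empty relation
after_if = then_branch | else_branch  # conditional: union of both branches

print(sorted(after_if))
print(is_transitive(after_if))
```

The union contains [x, y] and [y, z] but not [x, z], which is why the definition only asks for symmetry and irreflexivity.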

The calculus defines, for any instruction p and any alias relation a, the value of

a » p

denoting:

The aliasing relation resulting from executing p from an initial state in which the aliasing relation is a

For an entire program: compute ∅ » p

p, q, …: instructions

x, y, …: variables

E0: basic constructs

• skip

• create x

• x := y

• forget x (Eiffel: x := Void; Java etc.: x = null;)

• (p ; q)

• then p else q end

E1: cut

• cut x, y

E2: loops

• p^n -- for integer n

• loop p end

E3: procedures

• r do p end

• call r

E4: O-O

• z := x.y

• x.r …

• Current

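• Set of binary relations on E; formally: $\mathcal{P}(E \times E)$

• If r is a relation in $E \leftrightarrow E$, the following is an alias relation ($\triangleq$ reads "is defined as", $\mathrm{Id}[E]$ is the identity on E, and $-$ is set difference):

$\overline{r} \triangleq (r \cup r^{-1}) - \mathrm{Id}[E]$

• Example: $\overline{\{[x, x], [x, y], [y, z]\}} = \{[x, y], [y, x], [y, z], [z, y]\}$

• Generalized to sets ("complete" alias relation):

• $\overline{\{x, y, z\}} = \{[x, y], [y, x], [x, z], [z, x], [y, z], [z, y]\}$

• Notation: $\overline{x, y},\ \overline{y, z},\ \overline{x, u, v}$ stands for $\overline{\{x, y\}} \cup \overline{\{y, z\}} \cup \overline{\{x, u, v\}}$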
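The overline operation and the "complete" relation can be sketched in a few lines of Python. This is only an illustration under the assumption that relations are sets of ordered pairs; the function names `closure`, `complete` and `from_sets` are made up for the example:

```python
from itertools import permutations

def closure(r):
    """(r ∪ r⁻¹) − Id: symmetrize, then remove reflexive pairs."""
    symmetric = set(r) | {(v, u) for (u, v) in r}
    return {(u, v) for (u, v) in symmetric if u != v}

def complete(exprs):
    """Complete alias relation: all ordered pairs of distinct members."""
    return set(permutations(exprs, 2))

def from_sets(*groups):
    """Union of complete relations, one per group of expressions."""
    rel = set()
    for group in groups:
        rel |= complete(group)
    return rel

example = closure({("x", "x"), ("x", "y"), ("y", "z")})
print(sorted(example))
print(len(complete({"x", "y", "z"})))
```

`from_sets({"x", "y"}, {"y", "z"}, {"x", "u", "v"})` builds the relation written with three overlined groups above.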

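• An alias diagram (not canonical):

[Figure: alias diagram over x, y, z, u, v in which some edge labels are subsets of others]

• Make it canonical:

[Figure: the same diagram with edge labels $\overline{x, y}$, $\overline{y, z}$, $\overline{x, u, v}$]

Canonical form of an alias relation: union of complete alias relations, e.g. $\overline{x, y},\ \overline{y, z},\ \overline{x, u, v}$, meaning $\overline{\{x, y\}} \cup \overline{\{y, z\}} \cup \overline{\{x, u, v\}}$

None of the sets of expressions is a subset of another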

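$a \mathbin{»} \mathbf{skip} = a$

$a \mathbin{»} (\mathbf{then}\ p\ \mathbf{else}\ q\ \mathbf{end}) = (a \mathbin{»} p) \cup (a \mathbin{»} q)$

$a \mathbin{»} (p\ ;\ q) = (a \mathbin{»} p) \mathbin{»} q$

$a \mathbin{»} (\mathbf{forget}\ x) = a \mathbin{\backslash{-}} \{x\}$ -- a deprived of all pairs involving x

$a \mathbin{»} (\mathbf{create}\ x) = a \mathbin{\backslash{-}} \{x\}$

$a \mathbin{»} (x := y) = a\,[x: y]$ -- Override, see next

e.g. $(\overline{x, y},\ \overline{y, z},\ \overline{x, u, v}) \mathbin{\backslash{-}} \{x\} = \overline{y, z},\ \overline{u, v}$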

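The forget rule

$a \mathbin{»} (\mathbf{forget}\ x) = a \mathbin{\backslash{-}} \{x\}$

a deprived of all pairs involving x

[Alias diagrams, before and after forget x: x disappears from every label, leaving $\overline{y, z}$ and $\overline{u, v}$]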

Value of $a \mathbin{»} (x := y)$

$a \mathbin{»} (x := y) = a\,[x: y]$

with:

$a\,[x: y] = \mathbf{given}\ b = a \mathbin{\backslash{-}} \{x\}\ \mathbf{then}\ \overline{b \cup (\{x\} \times (b / y))}\ \mathbf{end}$

where:

• $b = a \mathbin{\backslash{-}} \{x\}$: a deprived of all pairs involving x

• $b / y$: all u aliased to y in b, plus y itself

• $\{x\} \times (b / y)$: all pairs [x, u] where u is either aliased to y in b or y itself

• The overline: symmetrize and de-reflect

For an alias relation a in $E \leftrightarrow E$ (E being the set of all expressions), an expression y, and a set of expressions $A \subseteq E$, "minus" and "quotient" are defined as:

$a \mathbin{\backslash{-}} A = a - \overline{E \times A}$ -- "Minus"

$a / y = \{z: E \mid (z = y) \lor [y, z] \in a\}$ -- "Quotient", similar to equivalence class in equivalence relation

[Alias diagrams, before and after, for three assignments:]

• z := x

• x := u

• x := z

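The assignment rule can be tried out with a small Python sketch. The starting relation (groups {x, y}, {y, z}, {x, u, v}) is an assumption made for illustration, since the "before" diagrams of the slides are not recoverable from the transcript; `minus`, `quotient` and `assign` are hypothetical names for the "minus", "quotient" and override operations:

```python
from itertools import permutations

def from_sets(*groups):
    """Union of complete alias relations, one per group."""
    return {pair for g in groups for pair in permutations(g, 2)}

def minus(a, exprs):
    """a deprived of all pairs involving a member of exprs."""
    return {(u, v) for (u, v) in a if u not in exprs and v not in exprs}

def quotient(a, y):
    """All expressions aliased to y in a, plus y itself."""
    return {y} | {v for (u, v) in a if u == y}

def assign(a, x, y):
    """a [x: y], the value of a » (x := y) for plain variables (E0)."""
    b = minus(a, {x})
    new_pairs = {(x, u) for u in quotient(b, y)}
    both = b | new_pairs | {(v, u) for (u, v) in new_pairs}  # symmetrize
    return {(u, v) for (u, v) in both if u != v}             # de-reflect

a = from_sets({"x", "y"}, {"y", "z"}, {"x", "u", "v"})
after_z_x = assign(a, "z", "x")   # z := x
after_x_u = assign(a, "x", "u")   # x := u
after_x_z = assign(a, "x", "z")   # x := z
```

Under that assumed starting relation, z := x yields the groups {x, y, z} and {x, u, v, z}; x := u yields {y, z} and {x, u, v}; x := z yields {x, y, z} and {u, v}.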

E1: cut

• E1 is E0 plus the instruction

cut x, y

• Semantics: remove aliasing, if any, between x and y

$a \mathbin{»} \mathbf{cut}\ x, y = a - \overline{x, y}$ -- Set difference

[Alias diagrams, before and after, for cut x, y and for cut x, u]
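As a sketch of the cut rule in Python (the starting relation with groups {x, y} and {x, u, v} is assumed for the example; `cut` and `from_sets` are illustrative names):

```python
from itertools import permutations

def from_sets(*groups):
    """Union of complete alias relations, one per group."""
    return {pair for g in groups for pair in permutations(g, 2)}

def cut(a, x, y):
    """a » cut x, y: remove the pair between x and y, in both directions."""
    return a - {(x, y), (y, x)}

a = from_sets({"x", "y"}, {"x", "u", "v"})
after = cut(a, "x", "u")
print(sorted(after))
```

Cutting x from u splits the group {x, u, v} into {x, v} and {u, v}: v stays aliased to both, which is another instance of non-transitivity.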

The role of cut

cut x, y informs the alias calculus with non-alias properties coming from other sources

Example:

if m < n then x := u else x := y end

m := m + 1

if m < n then z := x end

Alias relation: $\overline{x, u},\ \overline{x, y}$ after the first conditional, then $\overline{x, u, z},\ \overline{x, y, z}$ after the second

But here (inside the second conditional) x cannot be aliased to y (only to u). The alias theory does not know this property!

To take advantage of it, add the instruction

cut x, y ;

This instruction represents

check x /= y end (Eiffel)

assert x != y ; (JML, Spec#)

E2: loops

• E2 is E1 plus:

• $p^n$ (for integer n): n executions of p (auxiliary notion)

• loop p end: any sequence (incl. empty) of executions of p

$a \mathbin{»} p^0 = a$

$a \mathbin{»} p^{n+1} = (a \mathbin{»} p^n) \mathbin{»} p$ -- For $n \geq 0$; also equal to $(a \mathbin{»} p) \mathbin{»} p^n$

$a \mathbin{»} (\mathbf{loop}\ p\ \mathbf{end}) = \bigcup_{n \in \mathbb{N}} (a \mathbin{»} p^n)$

For any a and p, there exists a constant $N \in \mathbb{N}$ such that

$a \mathbin{»} (\mathbf{loop}\ p\ \mathbf{end}) = \bigcup_{n: 0..N} (a \mathbin{»} p^n)$

Proof: the sequence

$s_n = \bigcup_{k: 0..n} (a \mathbin{»} p^k)$

is non-decreasing (with respect to inclusion) on a finite set

More generally, for every construct p of E2, the function

$\lambda a \mid (a \mathbin{»} p)$

is non-decreasing

• $a \mathbin{»} (\mathbf{loop}\ p\ \mathbf{end})$ is also the fixpoint of the sequence

$t_0 = a$

$t_{n+1} = t_n \cup (t_n \mathbin{»} p)$

• Gives a practical way to compute $a \mathbin{»} (\mathbf{loop}\ p\ \mathbf{end})$

Proof: by induction. If $s_n$ is the original sequence $\bigcup_{k: 0..n} (a \mathbin{»} p^k)$, prove separately $s_n \subseteq t_n$ and $t_n \subseteq s_n$
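The fixpoint sequence t can be sketched as follows. This is a toy Python model restricted to plain variables (E0 to E2), with instructions represented as functions from alias relation to alias relation; the combinator names `assign`, `seq` and `loop` are assumptions of the example, not the implementation mentioned in the talk:

```python
from itertools import permutations

def from_sets(*groups):
    """Union of complete alias relations, one per group."""
    return frozenset(pair for g in groups for pair in permutations(g, 2))

def assign(x, y):
    """Transformer for x := y (E0 rule, variables only)."""
    def run(a):
        b = {(u, v) for (u, v) in a if x not in (u, v)}          # a \- {x}
        aliases = {y} | {v for (u, v) in b if u == y}            # b / y
        new = {(x, u) for u in aliases if u != x}
        return frozenset(b | new | {(u, x) for (_, u) in new})   # symmetrize
    return run

def seq(*instructions):
    """Transformer for (p ; q ; ...)."""
    def run(a):
        for p in instructions:
            a = p(a)
        return a
    return run

def loop(p):
    """Transformer for loop p end: iterate t := t ∪ (t » p) to a fixpoint."""
    def run(a):
        t = frozenset(a)
        while True:
            nxt = t | p(t)
            if nxt == t:
                return t
            t = nxt
    return run

body = seq(assign("x", "y"), assign("y", "z"))
result = loop(body)(frozenset())
print(sorted(result))
```

Starting from the empty relation, the iteration stabilizes after two rounds on the complete relation over {x, y, z}. Termination relies on the set of expressions being finite, as in the proof above.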

E3: procedures

• A program is now a sequence of procedure definitions (one designated as main):

$r_i\ (f)\ \mathbf{do}\ p_i\ \mathbf{end}$

• Instructions: as before, plus

$\mathbf{call}\ r_i\ (v)$ -- Procedure call

• Alias calculus notations:

• $|r|$ denotes body of r (i.e. $|r_i| = p_i$)

• $r^\bullet$ denotes formals of r (here f)

Generalize notation $a\,[x: y]$ to lists: use

• $a\,[\mathbf{a}: \mathbf{b}]$

as abbreviation for

• $(\ldots((a\,[\mathbf{a}_1: \mathbf{b}_1])\,[\mathbf{a}_2: \mathbf{b}_2]) \ldots)\,[\mathbf{a}_n: \mathbf{b}_n]$

i.e. formal_1 := actual_1 ; … ; formal_n := actual_n

• For example: $a\,[r^\bullet: v]$

• The calculus will treat

• call r (v) as

• $r^\bullet := v$ ; call r

• (With recursion, possible loss of precision)

Without arguments:

$a \mathbin{»} \mathbf{call}\ r = a \mathbin{»} |r|$ -- $|r|$: body of r

With arguments:

$a \mathbin{»} \mathbf{call}\ r\ (v) = a\,[r^\bullet: v] \mathbin{»} |r|$ -- $r^\bullet$: formal arguments of r

• Because of recursion, no longer just definition but equation

• For entire set of procedures P, this gives a vector equation

$a \mathbin{»} P = AL\ (a \mathbin{»} P)$

• Interpret as fixpoint equation and solve iteratively

• (Fixpoint exists: increasing sequence on finite set)

E4: O-O

• "General relativity":

• 1. Qualified expressions: x.y. Can be used as source (not target!) of assignments

x := y.z

• 2. Qualified calls: call x.r (v)

• 3. Current

Value of $a \mathbin{»} (x := y)$

$a\,[x: y] = \mathbf{given}\ b = a \mathbin{\backslash{-}} \{x\}\ \mathbf{then}\ \overline{b \cup (\{x\} \times (b / y))}\ \mathbf{end}$

• $b = a \mathbin{\backslash{-}} \{x\}$: a deprived of all pairs involving x or an expression starting with x

• $b / y$: all u aliased to y in b, plus y itself

• $\{x\} \times (b / y)$: all pairs [x, u] where u is either aliased to y in b or y itself. This includes [x, y]!

Example:

x := y.z

• x := x.y

x does not get aliased to x.y! (only to any z that was aliased to x.y)

[Figure: object diagram for x := x.y, with links x, y and z]

Alias diagrams (E0 to E3):

• Single source node (represents stack)

• Each value node represents a set of possible run-time values

• Links: only from source to value nodes (will become more interesting with E4!)

• Edge label: set of variables; indicates they can all be aliased to each other

[Alias diagram: one source node, value nodes reached by edges labeled $\overline{x, y}$, $\overline{y, z}$, $\overline{x, u, v}$]

Alias diagrams in E4:

• Links may now exist between value nodes (now called object nodes)

• Cycles possible (see next)

[Figure: source node and object nodes for x := x.y, with links x, y and z]

x.Current = x

Current.x = x

x'.x = Current

x.x' = Current

Current' = Current

In E4:

call x.r (a, b, …)

For a list a = <u, v, w, …>:

x.a = <x.u, x.v, x.w, …>

For a relation r in $E \leftrightarrow E$:

$x.r = \{[x.u, x.v] \mid [u, v] \in r\}$

Example:

$x.(\overline{u, v, w},\ \overline{u, y}) = \overline{x.u, x.v, x.w},\ \overline{x.u, x.y}$
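A small Python sketch of prefixing ("dot distribution") together with the five simplification rules for Current and inverse variables. Expressions are modeled as tuples of names and x' is spelled by appending an apostrophe; this encoding and the names `inverse`, `normalize`, `dot` and `dot_relation` are assumptions made for the illustration:

```python
CURRENT = "Current"

def inverse(name):
    """Map x to x' and x' back to x."""
    return name[:-1] if name.endswith("'") else name + "'"

def normalize(path):
    """Apply x.Current = x, Current.x = x, x'.x = Current, x.x' = Current,
    Current' = Current to a dotted path given as a tuple of names."""
    out = []
    for name in path:
        if name in (CURRENT, CURRENT + "'"):
            continue                      # Current (or Current') is neutral
        if out and out[-1] == inverse(name):
            out.pop()                     # x.x' and x'.x collapse to Current
        else:
            out.append(name)
    return tuple(out) or (CURRENT,)

def dot(x, expr):
    """x.expr for one expression."""
    return normalize((x,) + tuple(expr))

def dot_relation(x, rel):
    """x.r = {[x.u, x.v] | [u, v] in r}."""
    return {(dot(x, u), dot(x, v)) for (u, v) in rel}

print(dot("x", ("x'", "c")))      # x.x'.c simplifies to c
print(dot("x", (CURRENT,)))       # x.Current simplifies to x
```

Prefixing with x an expression that starts with x' cancels the pair, which is the mechanism the qualified call rule uses to translate results back into the caller's environment.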

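In E3, the rule with arguments was:

$a \mathbin{»} \mathbf{call}\ r\ (v) = a\,[r^\bullet: v] \mathbin{»} |r|$

Equivalently, with call r in place of what was written $|r|$:

$a \mathbin{»} \mathbf{call}\ r\ (v) = (a\,[r^\bullet: v]) \mathbin{»} \mathbf{call}\ r$

In E4:

$a \mathbin{»} \mathbf{call}\ x.r\ (v) = (a\,[x.r^\bullet: v]) \mathbin{»} \mathbf{call}\ x.r$

• Treat

• call x.r (v) as

• x.formals := v ; call x.r

[Figure: Current, x, the inverse variable x', and the target object]

Eiffel: x.r (a, b)

With, in a class C:

r (t: T ; u: U)

Handled as:

x.t := a

x.u := b

call x.r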

Unqualified call ($|r|$: body of r):

$a \mathbin{»} \mathbf{call}\ r = a \mathbin{»} |r|$

• Example:

d := c

x.r (d)

with

r (u: U)

do

v := u

end

• Handled as:

d := c

call x.r

where, seen from the target, the argument passing reads u := x'.d

Inverse variable:

$a \mathbin{»} \mathbf{call}\ x.r = x.((x'.a) \mathbin{»} \mathbf{call}\ r)$

• Alias relation before the call: $\overline{c, d}$

• Prefix with x': $\overline{x'.c, x'.d}$

• After u := x'.d: $\overline{u, x'.c, x'.d}$

• After v := u: $\overline{v, u, x'.c, x'.d}$

• Prefix with x: $\overline{x.v, x.u, c, d}$

[Figures: object diagrams for each step, showing Current, x, x' and the target object]

• Thus we are permitted to prove that the unqualified call creates certain aliasings, on the assumption that it starts in its own alias environment but has access to the caller’s environment through the inverted variable, and then to assert categorically that the qualified call has the same aliasings transposed back to the original environment. This change of environment to prove the unqualified property, followed by a change back to the original environment to prove the qualified property, explains well the aura of magic which attends a programmer's first introduction to object-oriented programming.

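• As two separate rules:

• $a \mathbin{»} \mathbf{call}\ x.r = x.((x'.a) \mathbin{»} \mathbf{call}\ r)$

• $a \mathbin{»} \mathbf{call}\ x.r\ (v) = (a\,[x.r^\bullet: v]) \mathbin{»} \mathbf{call}\ x.r$

• As a single rule:

• $a \mathbin{»} \mathbf{call}\ x.r\ (v) = x.(((x'.a)\,[r^\bullet: x'.v]) \mathbin{»} |r|) \mathbin{\backslash{-}} x.r^\bullet$ -- Hide internals of r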

The original termination argument does not hold any more

Consider

from y := x loop

y := y.a

end

y may become aliased to:

x, x.a, x.a.a, x.a.a.a, etc.

(infinite set of expressions!)


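$a\,[x: y] = \mathbf{given}\ b = a \mathbin{\backslash{-}} \{x\}\ \mathbf{then}\ \overline{b \cup (\{x\} \times (b / y))}\ \mathbf{end}$

• Plus:

• x.Current = x

• Current.x = x

• x'.x = Current

• x.x' = Current

• Current' = Current

• $a \mathbin{»} \mathbf{skip} = a$

• $a \mathbin{»} (\mathbf{then}\ p\ \mathbf{else}\ q\ \mathbf{end}) = (a \mathbin{»} p) \cup (a \mathbin{»} q)$

• $a \mathbin{»} (p\ ;\ q) = (a \mathbin{»} p) \mathbin{»} q$

• $a \mathbin{»} (\mathbf{forget}\ x) = a \mathbin{\backslash{-}} \{x\}$

• $a \mathbin{»} (\mathbf{create}\ x) = a \mathbin{\backslash{-}} \{x\}$

• $a \mathbin{»} (x := y) = a\,[x: y]$

• $a \mathbin{»} \mathbf{cut}\ x, y = a - \overline{x, y}$

• $a \mathbin{»} p^0 = a$

• $a \mathbin{»} p^{n+1} = (a \mathbin{»} p^n) \mathbin{»} p$

• $a \mathbin{»} (\mathbf{loop}\ p\ \mathbf{end}) = \bigcup_{n \in \mathbb{N}} (a \mathbin{»} p^n)$

• $a \mathbin{»} \mathbf{call}\ r\ (v) = (a\,[r^\bullet: v]) \mathbin{»} |r|$

• $a \mathbin{»} \mathbf{call}\ x.r\ (v) = x.((x'.(a\,[x.r^\bullet: v])) \mathbin{»} |r|) \mathbin{\backslash{-}} x.r^\bullet$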

[Alias diagram: $\overline{x, y}$; after x := z, is it $\overline{x, z}$?]

Consider

create y

create z

x := y

Targets of an instruction p: the set of variables that p may modify

e.g. the targets of (x := y ; y := z) are {x, y}

Semantics of the alias calculus

Let Var be the set of variables and a an alias relation. The following assertion expresses that there is no aliasing except as implied by a:

$a^- \triangleq \bigwedge_{[x, y] \in (Var \times Var) - \mathrm{Id} - a} x \neq y$

Weak soundness of definition of » for an instruction p:

$\{(a)^-\}\ p\ \{(a \mathbin{»} p)^-\}$

Soundness has less demanding precondition but assumes some white-box knowledge:

$\{(a \mathbin{\backslash{-}} p)^-\}\ p\ \{(a \mathbin{»} p)^-\}$ -- $a \mathbin{\backslash{-}} p$: a deprived of the targets of p; we may ignore variables modified by p

Consider the definition of weak soundness:

$\{(a)^-\}\ p\ \{(a \mathbin{»} p)^-\}$

It is possible to reconstruct a from $a^-$, but not from $(a \mathbin{\backslash{-}} p)^-$

(Same for strong soundness)

1. Without it, cannot apply standard proof techniques to programs involving pointers

2. Concurrent program analysis, in particular deadlock

3. Program optimization

• Not the same thing as reverse of liveness

• Consider a set of processors and a set of resources. At every execution time t, for every processor p, two disjoint sets of resources are defined:

• $H_t(p)$ -- Has set: resources that p has acquired

• $W_t(p)$ -- Wait set: resources that p has requested

• A deadlock exists if for some set D of processors:

• $\forall p: D \mid \exists p': D \mid W_t(p) \cap H_t(p') \neq \emptyset$

• (In such a case p ≠ p')

• A system, made of a set of program elements E, is absolutely deadlock-free if for all times t

• $\forall r: E \mid \forall r': E \mid W_t(r) \cap H_t(r') = \emptyset$

• (Works for r = r' since $W_t(r) \cap H_t(r) = \emptyset$)

• Can also be written:

• $\forall r: E \mid \forall r': E \mid \forall a: W_t(r) \mid a \notin H_t(r')$

• $\forall r, r': E \mid r \mathbin{/\!/} r' \Rightarrow (W(r) \cap H(r') = \emptyset)$

• For every program element r:

• Compute W (r) and H (r)

• Determine with which other elements r' it can run parallel

• A primary characteristic of SCOOP is that the model removes the distinction between resources and processors

• Properties:

• Any processor p is such that $p \in H(p)$

• Any variable or expression e has an associated processor, its handler <e>

• Any execution of a qualified call x.f (a, b, …) (separate or not) satisfies $x \in H(\text{<Current>})$

• For any call r of actual arguments args including uncontrolled arguments U, W (r) is $\{\text{<a>} \mid a \in U\}$

• Without lock passing, H (r) for any program element r in a routine of formal separate arguments S is $\text{<Current>} \cup \{\text{<a>} \mid a \in S\}$

• In SCOOP:

• Any variable or expression e has an associated processor, its handler <e>

• The computation of H and W sets only involves the handler

• Strategy:

• Processor abstraction: identify every variable or expression e with its processor <e>

• Perform alias analysis

• Use it to compute H and W sets as unions for all possible aliases

• Check $W(r) \cap H(r') = \emptyset$ for any two calls r, r', including when they are the same call
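The final check of the strategy can be sketched in Python. The H and W sets below are assumed inputs, loosely modeled on the dining-philosophers annotations of the talk; which call each set belongs to is a guess made for the example, and `conflicts` is a hypothetical helper name:

```python
def conflicts(calls):
    """Return the pairs (r, r2) with W(r) ∩ H(r2) ≠ ∅.

    calls maps a call name to a pair (H, W) of handler sets,
    already expanded over all possible aliases.
    """
    return [(r, r2)
            for r, (_, wait_set) in calls.items()
            for r2, (has_set, _) in calls.items()
            if wait_set & has_set]

# Assumed sets: both forks are acquired together, nothing is waited for.
right = {"pick_two": ({"f1", "f2"}, set())}

# Assumed sets: forks are acquired in turn, so a fork may be awaited while held.
wrong = {"pick_in_turn": ({"f1", "f2"}, {"f1", "f2"})}

print(conflicts(right))
print(conflicts(wrong))
```

An empty result corresponds to absolute deadlock-freedom in the sense defined above; a non-empty one only signals a possible deadlock, since the analysis is conservative. This sketch compares every pair of calls and ignores the "can run in parallel" refinement.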

H = {x, y}, H = {f1, f2, …}, W = ∅

H = {z}, W = {right}, H = {f1, f2, …}, W = {f1, f2, …}

class MEAL create make feature

    p1, p2: separate PHILOSOPHER ; f1, f2: separate FORK

    make do create f1 ; create f2 ; create p1.make (f1, f2) ; create p2.make (f2, f1) end

    go_right (a, b: separate PHILOSOPHER)

        do p1.eat_right ; p2.eat_right end

    go_wrong (a, b: separate PHILOSOPHER)

        do p1.eat_wrong ; p2.eat_wrong end

end

class PHILOSOPHER create make feature

    left, right: separate FORK

    make (u, v: separate FORK) do left := u ; right := v end

    eat_right do pick_two (left, right) end

    pick_two (x, y: separate FORK) do x.use ; y.use end

    eat_wrong do pick_in_turn (left) end

    pick_in_turn (z: separate FORK) do pick_two (z, right) end

end


Separation logic

Shape analysis (with abstract interpretation)

Ownership

Dynamic frames

Assignment rule:

{P (e)} x := e {P (x)}

Example:

require

y + 1 < 3

do

-- y + 1 < 3

x := whatever + 10000

-- y + 1 < 3

y := y + 1

-- y < 3

ensure

y < 3

end


[Figure: the x.set_a (c) example (understand as x.a := c) annotated with the alias information $\overline{x, y}$, i.e. x may be aliased to y, together with the assertions y.a = b and x.a = b]

Relation of interest:

"In the computation, e might become aliased to f"

Definition:

A binary relation is an alias relation if it is symmetric and irreflexive

Not necessarily transitive:

if c then

x := y

else

y := z

end

Can alias x to y and y to z but not x to z

Alias diagrams:

• Single source node (represents stack)

• Each value node represents a set of possible run-time values

• Links: only from source to value nodes (will become more interesting with E4!)

• Edge label: set of expressions; indicates they can all be aliased to each other

• In canonical form: no label is subset of another; each label has at least 2 expressions

[Alias diagram with edge labels $\overline{x, y}$, $\overline{y, z}$, $\overline{x, u, v}$]


• $a \mathbin{»} \mathbf{call}\ x.f = x.((x'.a) \mathbin{»} \mathbf{call}\ f)$

Theory of aliasing

Simple (about a dozen rules)

New concepts: inverse variables, modeling Current

Graphical formalism (alias diagrams), canonical form

Implemented

Almost entirely automatic (except for occasional cut)

Small loss of precision, i.e. not too conservative

Abstract: does not mention stack and heap

Covers object-oriented programming

Faithful to O-O spirit; see qualified call rule

Can cover full modern O-O language

Potential solution to “frame problem”