
# Pointer Analysis - PowerPoint PPT Presentation

Pointer Analysis. G. Ramalingam Microsoft Research, India. A Constant Propagation Example. x = 3; y = 4; z = x + 5;. x is always 3 here can replace x by 3 and replace x+5 by 8 and so on. A Constant Propagation Example With Pointers. x = 3; *p = 4; z = x + 5;.

Presentation Transcript

### Pointer Analysis

G. Ramalingam

Microsoft Research, India

A Constant Propagation Example

x = 3;

y = 4;

z = x + 5;

• x is always 3 here

• can replace x by 3

• and replace x+5 by 8

• and so on

A Constant Propagation Example With Pointers

x = 3;

*p = 4;

z = x + 5;

• Is x always 3 here?

A Constant Propagation Example With Pointers

Three versions of the program, differing only in what p points to:

• p = &y; x = 3; *p = 4; z = x + 5;

x is always 3

• if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5;

x may be 3 or 4 (i.e., x is unknown in our lattice)

• p = &x; x = 3; *p = 4; z = x + 5;

x is always 4

Pointers affect most program analyses.

A Constant Propagation Example With Pointers

The same three programs, annotated with what p points to:

• p = &y; x = 3; *p = 4; z = x + 5;

p always points-to y

• if (?) p = &x; else p = &y; x = 3; *p = 4; z = x + 5;

p may point-to x or y

• p = &x; x = 3; *p = 4; z = x + 5;

p always points-to x

• Determine the set of targets a pointer variable could point-to (at different points in the program)

• “p points-to x” == “*p has l-value x”

• targets could be variables or locations in the heap (dynamic memory allocation)

• p = &x;

• p = new Foo(); or p = malloc (…);

• must-point-to vs. may-point-to

A Constant Propagation Example With Pointers

*q = 3;

*p = 4;

z = *q + 5; (what values can this take?)

• Can *p denote the same location as *q?

• *p and *q are said to be aliases (in a given concrete state) if they represent the same location

• Alias analysis

• determine if a given pair of references could be aliases at a given program point

• *p may-alias *q

• *p must-alias *q

• Points-To Analysis

• may-point-to

• must-point-to

• Alias Analysis

• may-alias

• must-alias
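The two analyses are closely related: alias facts can be derived from points-to facts. The sketch below is only an illustration under stated assumptions. The dictionaries `may_pt` and `must_pt` and both function names are hypothetical, `may_pt` is assumed to over-approximate the real targets, and null is represented by `None`.

```python
def may_alias(may_pt, p, q):
    """*p may alias *q only if the may-point-to sets share a non-null target."""
    common = (may_pt[p] & may_pt[q]) - {None}
    return bool(common)


def must_alias(must_pt, p, q):
    """*p must alias *q if both pointers are known to point to the same target."""
    target_p, target_q = must_pt.get(p), must_pt.get(q)
    return target_p is not None and target_p == target_q


# Hypothetical analysis results at one program point.
may_pt = {"p": {"x", "y"}, "q": {"y"}, "r": {"z"}}
must_pt = {"q": "y", "r": "z"}  # p has no must-point-to fact

print(may_alias(may_pt, "p", "q"))    # p and q share the target y
print(may_alias(may_pt, "p", "r"))    # disjoint sets: cannot alias
print(must_alias(must_pt, "p", "q"))  # nothing is known for certain about p
```

Disjoint may-point-to sets rule aliasing out, which is what lets constant propagation keep a known value across a store through a pointer.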

Points-To Analysis: A Simple Example

p = &x;

q = &y;

if (?) {

q = p;

}

x = &a;

y = &b;

z = *q;

Points-To Analysis: A Simple Example

x = &a; y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c;

Points-to sets around the last two statements:

• Before *x = &c: x: a, y: b, p: {x,y}, a: null

• After *x = &c (strong update): x: a, y: b, p: {x,y}, a: c

• After *p = &c (weak update of x and y): x: {a,c}, y: {b,c}, p: {x,y}, a: c

How should we handle a statement such as *x = &c or *p = &c?

• When is it correct to use a strong update? A weak update?

• Is this points-to analysis precise?

• What does it mean to say

• p must-point-to x at pgm point u

• p may-point-to x at pgm point u

• p must-not-point-to x at u

• p may-not-point-to x at u

• We must formally define what we want to compute before we can answer many such questions

• A static program analysis computes approximate information about the runtime behavior of a given program

• The set of valid programs is defined by the programming language syntax

• The runtime behavior of a given program is defined by the programming language semantics

• The analysis problem defines what information is desired

• The analysis algorithm determines what approximation to make

• A program consists of

• a set of variables Var

• a directed graph (V,E,entry) with a distinguished entry vertex, with every edge labelled by a primitive statement

• A primitive statement is of the form

• x = null

• x = y

• x = *y

• x = &y;

• *x = y

• skip

(where x and y are variables in Var)

• Omitted (for now)

• Dynamic memory allocation

• Pointer arithmetic

• Structures and fields

• Procedures

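Vars = {x,y,p,a,b,c}

x = &a; y = &b; if (?) { p = &x; } else { p = &y; } *x = &c; *p = &c;

Control-flow graph of this program (vertex 1 is the entry; each edge is labelled with a statement):

• 1 -> 2: x = &a

• 2 -> 3: y = &b

• 3 -> 4: p = &x

• 3 -> 5: p = &y

• 4 -> 6: skip

• 5 -> 6: skip

• 6 -> 7: *x = &c

• 7 -> 8: *p = &c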

Programming Language: Operational Semantics

• Operational semantics == an interpreter (defined mathematically)

• State

• Data-State ::= Var -> (Var U {null})

• PC ::= V (the vertex set of the CFG)

• Program-State ::= PC x Data-State

• Initial-state:

• (entry, \x. null)

Vars = {x,y,p,a,b,c}

• Initial data-state: x: N, y: N, p: N, a: N, b: N, c: N (N denotes null)

• Initial program-state: <1, [x: N, y: N, p: N, a: N, b: N, c: N]>

[Figure: the control-flow graph of the example program, with vertices 1 to 8 and vertex 1 as the entry.]

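Programming Language: Operational Semantics

• Meaning of primitive statements

• $\mathit{CS}[stmt] : \text{Data-State} \to \text{Data-State}$

• $\mathit{CS}[\,x = y\,]\; s = s[x \mapsto s(y)]$

• $\mathit{CS}[\,x = {*y}\,]\; s = s[x \mapsto s(s(y))]$

• $\mathit{CS}[\,{*x} = y\,]\; s = s[s(x) \mapsto s(y)]$

• $\mathit{CS}[\,x = \text{null}\,]\; s = s[x \mapsto \text{null}]$

• $\mathit{CS}[\,x = \&y\,]\; s = s[x \mapsto y]$

• We must also say what happens if null is dereferenced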
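Since the operational semantics is described as an interpreter, the statement meanings can be sketched directly as one. The tuple encoding of statements, the function name `cs`, and the choice to raise an exception on a null dereference are assumptions of this sketch; the slides leave the null case open.

```python
class NullDereference(Exception):
    """Raised when a statement dereferences null (one possible policy)."""


def cs(stmt, s):
    """Return the data-state obtained by executing stmt in data-state s.

    A data-state is a dict from variable names to a variable name or None (null).
    A statement is a tuple (op, x, y); unused positions hold None.
    """
    op, x, y = stmt
    t = dict(s)  # leave the input state untouched
    if op == "copy":      # x = y
        t[x] = s[y]
    elif op == "load":    # x = *y
        if s[y] is None:
            raise NullDereference(stmt)
        t[x] = s[s[y]]
    elif op == "store":   # *x = y
        if s[x] is None:
            raise NullDereference(stmt)
        t[s[x]] = s[y]
    elif op == "null":    # x = null
        t[x] = None
    elif op == "addr":    # x = &y
        t[x] = y
    elif op != "skip":
        raise ValueError(f"unknown statement {stmt!r}")
    return t


initial = {v: None for v in ["x", "y", "p", "a", "b", "c"]}
s1 = cs(("addr", "x", "a"), initial)  # x = &a
s2 = cs(("addr", "p", "x"), s1)       # p = &x
s3 = cs(("load", "y", "p"), s2)       # y = *p, so y receives the value of x
print(s3)
```

Raising an exception is only one option; an equally reasonable design would map a null dereference to a distinguished error state.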

Programming Language: Operational Semantics

• Meaning of program

• a transition relation $\rightarrow$ on program-states

• $\rightarrow\ \subseteq\ \text{Program-State} \times \text{Program-State}$

• $state_1 \rightarrow state_2$ means that the execution of some edge in the program can transform $state_1$ into $state_2$

• Defining $\rightarrow$

• $(u,s) \rightarrow (v,s')$ iff the program contains a control-flow edge $u \to v$ labelled with a statement $stmt$ such that $\mathit{CS}[stmt]\,s = s'$

Programming Language: Operational Semantics

• A sequence of states $s_1 s_2 \ldots s_n$ is said to be an execution (of the program) iff

• $s_1$ is the Initial-State

• $s_i \rightarrow s_{i+1}$ for $1 \le i < n$

• A state $s$ is said to be a reachable state iff there exists some execution $s_1 s_2 \ldots s_n$ such that $s_n = s$.

• Define $RS(u) = \{\, s \mid (u,s) \text{ is reachable} \,\}$


All of this formalism for this one definition

Ideal Points-To Analysis: Formal Definition

• Let u denote a vertex in the CFG

• Define IdealMustPT (u) to be

{ (p,x) | forall s in RS(u). s(p) == x }

• Define IdealMayPT (u) to be

{ (p,x) | exists s in RS(u). s(p) == x }
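For a program this small, the ideal sets can be computed by brute force: enumerate every reachable program-state, then read off the pairs. The sketch below does this for the example program with variables x, y, p, a, b, c. The edge encoding, the `store_addr` statement form (for `*x = &c`, which is not one of the primitive statements), and all function names are assumptions of the sketch; null is represented by `None`.

```python
from collections import deque

VARS = ("x", "y", "p", "a", "b", "c")

# CFG edges as (source, statement, target); vertex 1 is the entry.
EDGES = [
    (1, ("addr", "x", "a"), 2),        # x = &a
    (2, ("addr", "y", "b"), 3),        # y = &b
    (3, ("addr", "p", "x"), 4),        # p = &x
    (3, ("addr", "p", "y"), 5),        # p = &y
    (4, ("skip",), 6),
    (5, ("skip",), 6),
    (6, ("store_addr", "x", "c"), 7),  # *x = &c
    (7, ("store_addr", "p", "c"), 8),  # *p = &c
]


def step(stmt, s):
    """Concrete semantics on a data-state s, a tuple of values ordered as VARS."""
    d = dict(zip(VARS, s))
    if stmt[0] == "addr":
        d[stmt[1]] = stmt[2]
    elif stmt[0] == "store_addr":
        target = d[stmt[1]]
        if target is None:
            return None  # null dereference: this sketch gives it no successor
        d[target] = stmt[2]
    return tuple(d[v] for v in VARS)  # "skip" falls through unchanged


def reachable(entry=1):
    """All reachable program-states (vertex, data-state), by breadth-first search."""
    init = (entry, tuple(None for _ in VARS))
    seen, work = {init}, deque([init])
    while work:
        u, s = work.popleft()
        for src, stmt, dst in EDGES:
            if src != u:
                continue
            t = step(stmt, s)
            if t is not None and (dst, t) not in seen:
                seen.add((dst, t))
                work.append((dst, t))
    return seen


def rs(u, states):
    """RS(u): the reachable data-states at vertex u, as dicts."""
    return [dict(zip(VARS, s)) for (v, s) in states if v == u]


def ideal_may_pt(u, states):
    return {(p, s[p]) for s in rs(u, states) for p in VARS}


def ideal_must_pt(u, states):
    # "forall s in RS(u)" is an intersection, starting from all possible pairs.
    result = {(p, x) for p in VARS for x in VARS + (None,)}
    for s in rs(u, states):
        result &= {(p, s[p]) for p in VARS}
    return result


STATES = reachable()
print(sorted(ideal_may_pt(8, STATES), key=str))
print(sorted(ideal_must_pt(8, STATES), key=str))
```

This only works because the example has finitely many states and no loops that generate new ones; in general RS(u) is not computable, which is why an approximating analysis is needed.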

May-Point-To Analysis: Formal Requirement Specification

For every vertex u in the CFG, compute a set R(u) such that

$R(u) \supseteq \{\,(p,x) \mid \exists s \in RS(u).\ s(p) = x\,\}$

Equivalently, compute $R : V \to 2^{Var \times Var'}$ such that

$R(u) \supseteq \mathit{IdealMayPT}(u)$

(where $Var' = Var \cup \{\text{null}\}$)

May-Point-To Analysis: Formal Requirement Specification

• An algorithm is said to be correct if the solution R it computes satisfies

$\forall u \in V.\ R(u) \supseteq \mathit{IdealMayPT}(u)$

• An algorithm is said to be precise if the solution R it computes satisfies

$\forall u \in V.\ R(u) = \mathit{IdealMayPT}(u)$

• An algorithm that computes a solution $R_1$ is said to be more precise than one that computes a solution $R_2$ if

$\forall u \in V.\ R_1(u) \subseteq R_2(u)$

Back To Our May-Point-To Algorithm

p = &x;

q = &y;

if (?) {

q = p;

}

x = &a;

y = &b;

z = *q;

(May-Point-To Analysis) Algorithm A

• Is this algorithm correct?

• Is this algorithm precise?

• Let’s first completely and formally define the algorithm.

Algorithm A: A Formal Definition (The “Data Flow Analysis” Recipe)

• Define semi-lattice of abstract-values

• $\mathit{AbsDataState} ::= Var \to 2^{Var'}$

• $f_1 \sqcup f_2 = \lambda x.\,(f_1(x) \cup f_2(x))$

• $\bot = \lambda x.\,\{\}$

• Define initial abstract-value

• $\mathit{InitialAbsState} = \lambda x.\,\{\text{null}\}$

• Define transformers for primitive statements

• $\mathit{AS}[stmt] : \mathit{AbsDataState} \to \mathit{AbsDataState}$

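Algorithm A: A Formal Definition (The “Data Flow Analysis” Recipe)

• Let st(v,u) denote stmt on edge v->u

• Compute the least-fixed-point of the following “dataflow equations”

• $x(entry) = \mathit{InitialAbsState}$

• $x(u) = \bigsqcup_{v \to u} \mathit{AS}[st(v,u)]\,(x(v))$

[Figure: vertex u with two predecessors v and w. The edges v->u and w->u are labelled st(v,u) and st(w,u); x(v), x(w) and x(u) are the abstract values at the three vertices.]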

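Algorithm A: The Transformers

• Abstract transformers for primitive statements

• $\mathit{AS}[stmt] : \mathit{AbsDataState} \to \mathit{AbsDataState}$

• $\mathit{AS}[\,x = y\,]\; s = s[x \mapsto s(y)]$

• $\mathit{AS}[\,x = \text{null}\,]\; s = s[x \mapsto \{\text{null}\}]$

• $\mathit{AS}[\,x = \&y\,]\; s = s[x \mapsto \{y\}]$

• $\mathit{AS}[\,x = {*y}\,]\; s = s[x \mapsto s^*(s(y))]$

where $s^*(\{v_1,\ldots,v_n\}) = s(v_1) \cup \ldots \cup s(v_n)$

• $\mathit{AS}[\,{*x} = y\,]\; s$ = ???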
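The recipe so far is enough to run Algorithm A on the earlier example `p = &x; q = &y; if (?) { q = p; } x = &a; y = &b; z = *q;`, which uses no store statement. The following is a minimal worklist sketch, not the slides' own formulation: the edge encoding, the vertex numbering, the function names, and the decision to skip null when evaluating s* are all assumptions. The transformer for `*x = y` is deliberately left unimplemented, as in the list above.

```python
NULL = "null"
VARS = ("p", "q", "x", "y", "z", "a", "b")

# CFG edges as (source, statement, target); vertex 1 is the entry.
EDGES = [
    (1, ("addr", "p", "x"), 2),  # p = &x
    (2, ("addr", "q", "y"), 3),  # q = &y
    (3, ("copy", "q", "p"), 4),  # q = p   (branch taken)
    (3, ("skip",), 4),           #         (branch not taken)
    (4, ("addr", "x", "a"), 5),  # x = &a
    (5, ("addr", "y", "b"), 6),  # y = &b
    (6, ("load", "z", "q"), 7),  # z = *q
]


def transfer(stmt, s):
    """AS[stmt] on an abstract state: a dict from variable to frozenset of targets."""
    op = stmt[0]
    t = dict(s)
    if op == "copy":
        t[stmt[1]] = s[stmt[2]]
    elif op == "null":
        t[stmt[1]] = frozenset({NULL})
    elif op == "addr":
        t[stmt[1]] = frozenset({stmt[2]})
    elif op == "load":
        # s*(s(y)): union over the non-null targets of y (assumption: null is skipped)
        targets = [v for v in s[stmt[2]] if v != NULL]
        t[stmt[1]] = frozenset().union(*(s[v] for v in targets))
    elif op != "skip":
        raise NotImplementedError(f"no transformer defined for {op!r}")
    return t


def join(f1, f2):
    """Pointwise union of two abstract states."""
    return {v: f1[v] | f2[v] for v in VARS}


def algorithm_a(entry=1):
    """Iterate the dataflow equations with a worklist until nothing changes."""
    vertices = {entry} | {u for u, _, _ in EDGES} | {v for _, _, v in EDGES}
    bottom = {v: frozenset() for v in VARS}
    x = {u: bottom for u in vertices}
    x[entry] = {v: frozenset({NULL}) for v in VARS}  # InitialAbsState
    work = [entry]
    while work:
        u = work.pop()
        for src, stmt, dst in EDGES:
            if src != u:
                continue
            new = join(x[dst], transfer(stmt, x[u]))
            if new != x[dst]:
                x[dst] = new
                work.append(dst)
    return x


RESULT = algorithm_a()
for var in VARS:
    print(var, sorted(RESULT[7][var]))
```

At the join vertex 4 the two branches are merged, so q may point to x or y, and the final load collects the targets of both.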

• We have a complete & formal definition of the problem.

• We have a complete & formal definition of a proposed solution.

• How do we reason about the correctness & precision of the proposed solution?

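Enter: The French Recipe (Abstract Interpretation)

• Concrete Domain

• Concrete states: C

• Semantics: For every statement st,

• $\mathit{CS}[st] : C \to C$

• Abstract Domain

• A semi-lattice $(A, \sqsubseteq)$

• Transfer Functions

• For every statement st,

• $\mathit{AS}[st] : A \to A$

• Concrete (Collecting) Domain

• A semi-lattice $(2^C, \sqsubseteq)$

• Transfer Functions

• For every statement st,

• $\mathit{CS}^*[st] : 2^C \to 2^C$

[Figure: the abstraction function $\alpha$ maps the collecting domain $2^{\text{Data-State}}$ to the abstract domain $2^{Var \times Var'}$; the concretization function $\gamma$ maps back.]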

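Points-To Analysis (Abstract Interpretation)

$\alpha(Y) = \{\,(p,x) \mid \exists s \in Y.\ s(p) = x\,\}$

$\mathit{IdealMayPT}(u) = \alpha(RS(u))$

[Figure: $\alpha$ maps $RS(u)$ in $2^{\text{Data-State}}$ to $\mathit{IdealMayPT}(u)$ in $2^{Var \times Var'}$, and $\mathit{IdealMayPT}(u) \subseteq \mathit{MayPT}(u)$.]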

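Approximating Transformers: Correctness Criterion

• c is said to be correctly approximated by a iff $\alpha(c) \subseteq a$

[Figure: in the concrete domain C, f maps c1 to c2; in the abstract domain A, f# maps a1 to a2. c1 is correctly approximated by a1, and c2 is correctly approximated by a2.]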

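Approximating Transformers: Correctness Criterion

[Figure: concretization $\gamma$ maps a1 to c1; f maps c1 to c2; abstraction $\alpha$ maps c2 back into A; f# maps a1 to a2. C is the concrete domain and A the abstract domain.]

• Requirement: $f^\#(a_1) \supseteq \alpha(f(\gamma(a_1)))$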

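• $\mathit{CS}[stmt] : \text{Data-State} \to \text{Data-State}$

• $\mathit{CS}[\,x = y\,]\; s = s[x \mapsto s(y)]$

• $\mathit{CS}[\,x = {*y}\,]\; s = s[x \mapsto s(s(y))]$

• $\mathit{CS}[\,{*x} = y\,]\; s = s[s(x) \mapsto s(y)]$

• $\mathit{CS}[\,x = \text{null}\,]\; s = s[x \mapsto \text{null}]$

• $\mathit{CS}^*[stmt] : 2^{\text{Data-State}} \to 2^{\text{Data-State}}$

• $\mathit{CS}^*[st]\; X = \{\, \mathit{CS}[st]\,s \mid s \in X \,\}$

• $\mathit{AS}[stmt] : \mathit{AbsDataState} \to \mathit{AbsDataState}$

• $\mathit{AS}[\,x = y\,]\; s = s[x \mapsto s(y)]$

• $\mathit{AS}[\,x = \text{null}\,]\; s = s[x \mapsto \{\text{null}\}]$

• $\mathit{AS}[\,x = {*y}\,]\; s = s[x \mapsto s^*(s(y))]$

where $s^*(\{v_1,\ldots,v_n\}) = s(v_1) \cup \ldots \cup s(v_n)$

• $\mathit{AS}[\,{*x} = y\,]\; s$ = ???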

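Algorithm A: Transformers (Weak/Strong Update)

Statement: *y = &b;

• Abstract state a1: x: {&y}, y: {&x,&z}, z: {&a}

• Concrete states obtained from a1 by γ: (x: &y, y: &x, z: &a) and (x: &y, y: &z, z: &a)

• Applying f (*y = &b) to each concrete state: (x: &b, y: &x, z: &a) and (x: &y, y: &z, z: &b)

• Applying α to these results, which is also what f# (*y = &b) yields on a1: x: {&y,&b}, y: {&x,&z}, z: {&a,&b}

Because y may point to x or to z, the old targets of x and z are kept (a weak update).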

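Algorithm A: Transformers (Weak/Strong Update)

Statement: *x = &b;

• Abstract state a1: x: {&y}, y: {&x,&z}, z: {&a}

• Concrete states obtained from a1 by γ: (x: &y, y: &x, z: &a) and (x: &y, y: &z, z: &a)

• Applying f (*x = &b) to each concrete state: (x: &y, y: &b, z: &a) in both cases

• Applying α to these results, which is also what f# (*x = &b) yields on a1: x: {&y}, y: {&b}, z: {&a}

Because x points to y in every concrete state, the old targets of y are replaced (a strong update).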
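The two diagrams can be reproduced mechanically by composing concretization, the concrete statement, and abstraction. This is a sketch under stated assumptions: only x, y and z are tracked as in the diagrams, a and b are treated purely as targets, the names `gamma`, `alpha`, `store_addr` and `best_store_addr` are hypothetical, and the enumeration is feasible only because the example is tiny.

```python
from itertools import product

TRACKED = ("x", "y", "z")


def gamma(a):
    """Concretization: every state that picks one target per variable from a."""
    choices = [sorted(a[v]) for v in TRACKED]
    return [dict(zip(TRACKED, vals)) for vals in product(*choices)]


def alpha(states):
    """Abstraction: collect, per variable, the targets seen in any state."""
    return {v: frozenset(s[v] for s in states) for v in TRACKED}


def store_addr(s, u, t):
    """Concrete semantics of *u = &t: overwrite whatever u points to."""
    out = dict(s)
    out[s[u]] = t
    return out


def best_store_addr(a, u, t):
    """alpha(f(gamma(a))) for f = (*u = &t), computed by enumeration."""
    return alpha([store_addr(s, u, t) for s in gamma(a)])


a1 = {
    "x": frozenset({"y"}),
    "y": frozenset({"x", "z"}),
    "z": frozenset({"a"}),
}

weak = best_store_addr(a1, "y", "b")    # *y = &b
strong = best_store_addr(a1, "x", "b")  # *x = &b
for name, result in (("*y = &b", weak), ("*x = &b", strong)):
    print(name, {v: sorted(result[v]) for v in TRACKED})
```

Any transformer chosen for a store must return at least these sets to meet the correctness requirement; the enumeration shows that the weak and strong results in the diagrams are exactly this lower bound for the given a1.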

Algorithm A: Transformers (Weak/Strong Update)

• Transformer for “*p = q”