
# Efficient Field-Sensitive Pointer Analysis for C

Efficient Field-Sensitive Pointer Analysis for C. David J. Pearce, Paul H.J. Kelly and Chris Hankin, Imperial College, London, UK. d.pearce@doc.ic.ac.uk, www.doc.ic.ac.uk/~djp1/. What is Pointer Analysis? Determine pointer targets without running the program.


### Efficient Field-Sensitive Pointer Analysis for C

David J. Pearce, Paul H.J. Kelly and Chris Hankin

Imperial College, London, UK

d.pearce@doc.ic.ac.uk

www.doc.ic.ac.uk/~djp1/

### What is Pointer Analysis?

• Determine pointer targets without running the program

• What is flow-insensitive pointer analysis?

• One solution for all statements, so precision is lost

• This trades precision for efficiency

• This work considers flow-insensitive pointer analysis only

int a,b,*p,*q = NULL;

p = &a;

if(…) q = p; // p → {a,b}, q → {a,NULL}

p = &b;

• Generate set-constraints from the program and solve them

• Use a constraint graph for efficient solving

int a,b,c,*p,*q,*r;

p = &a; // p ⊇ { a }

r = &b; // r ⊇ { b }

q = &c; // q ⊇ { c }

if(...)

q = p; // q ⊇ p

else

q = r; // q ⊇ r

(constraint graph: nodes p, q, r, each annotated with its solution set; q starts as {c}, and the solved sets are p = {a}, r = {b}, q = {a,b,c})
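The propagation step on this slide can be sketched in a few lines of C. This is only a minimal illustration, not the authors' solver: the variable numbering, the bitset representation, and the names `Graph`, `add_base`, `add_edge`, `solve` and `solve_slide_example` are all assumptions made for the example. Each constraint `q ⊇ p` becomes an edge from `p` to `q`, and solution sets are pushed along edges until nothing changes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numbering of the variables in the slide's example. */
enum { A, B, C, P, Q, R, NVARS };

typedef struct {
    uint32_t pts[NVARS];  /* pts[v]: solution set of v, one bit per target */
    uint32_t succ[NVARS]; /* succ[v]: bit w set means edge v -> w (w ⊇ v) */
} Graph;

/* Record the base constraint var ⊇ { target }. */
static void add_base(Graph *g, int var, int target) {
    assert(var >= 0 && var < NVARS && target >= 0 && target < NVARS);
    g->pts[var] |= UINT32_C(1) << target;
}

/* Record the simple constraint dst ⊇ src as an edge src -> dst. */
static void add_edge(Graph *g, int src, int dst) {
    assert(src >= 0 && src < NVARS && dst >= 0 && dst < NVARS);
    g->succ[src] |= UINT32_C(1) << dst;
}

/* Propagate solution sets along edges until a fixpoint is reached. */
static void solve(Graph *g) {
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int v = 0; v < NVARS; v++) {
            for (int w = 0; w < NVARS; w++) {
                int has_edge = (int)((g->succ[v] >> w) & 1u);
                if (has_edge && (g->pts[v] & ~g->pts[w]) != 0) {
                    g->pts[w] |= g->pts[v];
                    changed = 1;
                }
            }
        }
    }
}

/* Build and solve the constraints of the slide's program. */
Graph solve_slide_example(void) {
    Graph g = {{0}, {0}};
    add_base(&g, P, A); /* p = &a;  p ⊇ { a } */
    add_base(&g, R, B); /* r = &b;  r ⊇ { b } */
    add_base(&g, Q, C); /* q = &c;  q ⊇ { c } */
    add_edge(&g, P, Q); /* q = p;   q ⊇ p */
    add_edge(&g, R, Q); /* q = r;   q ⊇ r */
    solve(&g);
    return g;
}
```

Calling `solve_slide_example()` leaves `pts[P]` and `pts[R]` unchanged and grows `pts[Q]` to the bits for `a`, `b` and `c`, matching the solved graph on the slide. A production solver would use a worklist rather than rescanning every pair of nodes.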

• How to deal with aggregate types?

• The standard approach treats each aggregate as a single variable

typedef struct { int *f1; int *f2; } t1;

int a,b,*p,*q,*r;

t1 x;

p = &a; // p ⊇ { a }

q = &b; // q ⊇ { b }

x.f1 = p; // x ⊇ p

x.f2 = q; // x ⊇ q

r = x.f1; // r ⊇ x

(constraint graph: nodes p, q, x, r; solved sets p = {a}, q = {b}, x = {a,b}, r = {a,b})

• Use a separate node per field for each aggregate

• Node “x” is split in two: xf1 and xf2

typedef struct { int *f1; int *f2; } t1;

int a,b,*p,*q,*r;

t1 x;

p = &a; // p ⊇ { a }

q = &b; // q ⊇ { b }

x.f1 = p; // xf1 ⊇ p

x.f2 = q; // xf2 ⊇ q

r = x.f1; // r ⊇ xf1

(constraint graph: nodes p, q, xf1, xf2, r; solved sets p = {a}, q = {b}, xf1 = {a}, xf2 = {b}, r = {a})
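To make the precision difference concrete, the following sketch solves the same five statements twice: once with both fields of `x` collapsed onto one node, and once with a node per field. It is an illustrative assumption, not the authors' implementation; the enum numbering, the helper `node_for` and the function `points_to_r` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numbering; X_F1 and X_F2 are the two nodes for struct x. */
enum { A, B, P, Q, R, X_F1, X_F2, NVARS };

typedef struct {
    uint32_t pts[NVARS];  /* solution set per node, one bit per target */
    uint32_t succ[NVARS]; /* succ[v]: bit w set means edge v -> w (w ⊇ v) */
} Graph;

static uint32_t bit(int v) {
    assert(v >= 0 && v < NVARS);
    return UINT32_C(1) << v;
}

/* Propagate solution sets along edges until a fixpoint is reached. */
static void solve(Graph *g) {
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int v = 0; v < NVARS; v++)
            for (int w = 0; w < NVARS; w++)
                if ((g->succ[v] & bit(w)) && (g->pts[v] & ~g->pts[w])) {
                    g->pts[w] |= g->pts[v];
                    changed = 1;
                }
    }
}

/* Node that represents a field of x. When the analysis is
   field-insensitive, every field collapses onto the single node X_F1. */
static int node_for(int field, int field_sensitive) {
    assert(field == X_F1 || field == X_F2);
    return field_sensitive ? field : X_F1;
}

/* Solve the slide's program and return the solution set of r. */
uint32_t points_to_r(int field_sensitive) {
    Graph g = {{0}, {0}};
    g.pts[P] |= bit(A);                                  /* p = &a;   */
    g.pts[Q] |= bit(B);                                  /* q = &b;   */
    g.succ[P] |= bit(node_for(X_F1, field_sensitive));   /* x.f1 = p; */
    g.succ[Q] |= bit(node_for(X_F2, field_sensitive));   /* x.f2 = q; */
    g.succ[node_for(X_F1, field_sensitive)] |= bit(R);   /* r = x.f1; */
    solve(&g);
    return g.pts[R];
}
```

With `field_sensitive` set to 0 the result for `r` contains both `a` and `b`; with it set to 1 the result contains only `a`, as in the two graphs shown on the slides.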

typedef struct { int *f1; int *f2; } t1;

int **p;

t1 x,*s;

s = &x; // s ⊇ { x }

p = &(s->f2); // p ⊇ (*s) || f2 ⇒ p ⊇ { x } || f2 ⇒ p ⊇ { xf2 }

• The system thus far has no mechanism for taking the address of a field through a pointer, as in &(s->f2)

• First idea: use a string concatenation operator ||

• Works well for this example

(constraint graph: nodes xf1 and xf2, with solution sets elided as {..})

typedef struct { int *f1; int *f2; } t1;

typedef struct { int *f3; int *f4; } t2;

int **p;

t1 *s; t2 x;

s = (t1*) &x; // s ⊇ { x }

p = &(s->f2); // p ⊇ (*s) || f2 ⇒ p ⊇ { x } || f2 ⇒ p ⊇ { xf2 }

• First idea: use a string concatenation operator ||

• Problem: casting between types that are identical except for their field names

• The derivation is the same as before, but node xf2 no longer exists!

(constraint graph: nodes xf3 and xf4, with solution sets elided as {..})
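The failure can be reproduced with a toy name lookup. The sketch below is an assumption made for illustration only: the node table, `find_node` and `resolve_by_concat` are hypothetical, and the node names simply follow the slide's convention of appending the field name to the variable name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node table for the slide's program: x has type t2,
   so only xf3 and xf4 exist as field nodes. */
static const char *const nodes[] = { "p", "s", "xf3", "xf4" };

/* Return the index of the node called `name`, or -1 if there is none. */
int find_node(const char *name) {
    for (size_t i = 0; i < sizeof nodes / sizeof nodes[0]; i++)
        if (strcmp(nodes[i], name) == 0)
            return (int)i;
    return -1;
}

/* Resolve object || field by concatenating the two names. */
int resolve_by_concat(const char *object, const char *field) {
    char name[64];
    int n = snprintf(name, sizeof name, "%s%s", object, field);
    assert(n >= 0 && (size_t)n < sizeof name);
    return find_node(name);
}
```

Here `resolve_by_concat("x", "f4")` finds node `xf4`, but `resolve_by_concat("x", "f2")`, which is what the cast through `t1*` asks for, returns -1 because no node named `xf2` was ever created.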

typedef struct { int *f1; int *f2; } t1;

typedef struct { int *f3; int *f4; } t2;

int **p;

t1 *s; t2 x;

s = (t1*) &x; // s ⊇ { xf3 } ⇒ s ⊇ { 2 }

p = &(s->f2); // p ⊇ s + 1 ⇒ p ⊇ { 2 } + 1 ⇒ p ⊇ { 3 }

• Our solution: map variables to integers

• Solution sets become integer sets

(figure: variables p, s, xf3, xf4 numbered 0 to 3, with xf3 = 2 and xf4 = 3)
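The offset constraint `p ⊇ s + 1` is easy to sketch once solution sets are integer sets. The code below is a minimal illustration under stated assumptions: the slide fixes xf3 = 2 and xf4 = 3, while the indices chosen for `p` and `s`, the bitset representation, and the names `offset_set` and `solve_cast_example` are hypothetical. A real analysis would also need to check that an offset stays inside the same aggregate; this sketch only drops indices that fall outside the variable range.

```c
#include <assert.h>
#include <stdint.h>

/* Integer mapping of the variables. xf3 = 2 and xf4 = 3 follow the slide;
   the indices of p and s are assumptions made for this sketch. */
enum { P = 0, S = 1, X_F3 = 2, X_F4 = 3, NVARS = 4 };

/* Return { v + k | v in set }, discarding indices outside 0..NVARS-1.
   With a bitset representation, adding k is a left shift by k. */
uint32_t offset_set(uint32_t set, unsigned k) {
    assert(k < NVARS);
    uint32_t all_vars = (UINT32_C(1) << NVARS) - 1;
    return (set << k) & all_vars;
}

/* Solve the two constraints of the slide and return the set for p. */
uint32_t solve_cast_example(void) {
    uint32_t pts[NVARS] = {0};
    pts[S] |= UINT32_C(1) << X_F3;   /* s = (t1*) &x;  s ⊇ { 2 }  */
    pts[P] |= offset_set(pts[S], 1); /* p = &(s->f2);  p ⊇ s + 1  */
    return pts[P];
}
```

Because field names never enter the computation, the cast from `t2*` to `t1*` makes no difference: `solve_cast_example()` yields the set containing only index 3, which is `xf4`.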

• Field-sensitive pointer analysis

• Presented a new technique for the C language

• Copes elegantly with language features such as compatible types and casting

• The technique also handles function pointers without modification

• Experimental evaluation over 7 common C programs

• Considerable improvements in precision obtained

• But solving times are much higher

• And the relative gains appear to diminish with larger benchmarks

• What about statements involving a pointer dereference?

• They cannot be represented directly in the constraint graph

• Thus, the computation is similar to dynamic transitive closure

int a,*r,*s,**p,**q;

p = &r; // p ⊇ { r }

s = &a; // s ⊇ { a }

q = p; // q ⊇ p

*q = s; // *q ⊇ s ⇒ r ⊇ s

(constraint graph: nodes p, s, q, r; solved sets p = {r}, s = {a}, q = {r}, r = {a})
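The "dynamic" part can be sketched as follows: a constraint `*q ⊇ s` is kept outside the graph, and whenever a new target `t` appears in the solution of `q`, the constraint `t ⊇ s` is added as a fresh edge. This is a minimal illustration under assumptions, not the authors' algorithm: the variable numbering, the `store` array and the function names are hypothetical, and only store constraints (`*q = s`) are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numbering of the variables in the slide's example. */
enum { A, R, S, P, Q, NVARS };

typedef struct {
    uint32_t pts[NVARS];   /* solution set per variable */
    uint32_t succ[NVARS];  /* succ[v]: bit w set means edge v -> w (w ⊇ v) */
    uint32_t store[NVARS]; /* store[q]: bit s set means constraint *q ⊇ s */
} Graph;

static uint32_t bit(int v) {
    assert(v >= 0 && v < NVARS);
    return UINT32_C(1) << v;
}

/* Alternate between adding edges implied by *q ⊇ s and propagating
   solution sets, until neither step changes anything. */
void solve(Graph *g) {
    int changed = 1;
    while (changed) {
        changed = 0;
        /* *q ⊇ s: every t in pts(q) needs the edge s -> t (t ⊇ s). */
        for (int q = 0; q < NVARS; q++)
            for (int s = 0; s < NVARS; s++)
                if ((g->store[q] & bit(s)) && (g->pts[q] & ~g->succ[s])) {
                    g->succ[s] |= g->pts[q];
                    changed = 1;
                }
        /* Ordinary propagation along the edges known so far. */
        for (int v = 0; v < NVARS; v++)
            for (int w = 0; w < NVARS; w++)
                if ((g->succ[v] & bit(w)) && (g->pts[v] & ~g->pts[w])) {
                    g->pts[w] |= g->pts[v];
                    changed = 1;
                }
    }
}

/* Build and solve the constraints of the slide's program. */
Graph solve_deref_example(void) {
    Graph g = {{0}, {0}, {0}};
    g.pts[P] |= bit(R);   /* p = &r;  p ⊇ { r } */
    g.pts[S] |= bit(A);   /* s = &a;  s ⊇ { a } */
    g.succ[P] |= bit(Q);  /* q = p;   q ⊇ p     */
    g.store[Q] |= bit(S); /* *q = s;  *q ⊇ s    */
    solve(&g);
    return g;
}
```

In the example, `q` first receives `r` from `p`; only then can the edge from `s` to `r` be added, after which `r` receives `a`. The edge set therefore depends on the solution being computed, which is why the slide compares the problem to dynamic transitive closure.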