Efficient Field-Sensitive Pointer Analysis for C

### Efficient Field-Sensitive Pointer Analysis for C

David J. Pearce, Paul H.J. Kelly and Chris Hankin

Imperial College, London, UK

[email protected]

www.doc.ic.ac.uk/~djp1/

What is Pointer Analysis?
• Determine pointer targets without running program
• What is flow-insensitive pointer analysis?
• One solution for all statements – so precision lost
• This is a trade-off for efficiency over precision
• This work considers flow-insensitive pointer analysis only

int a,b,*p,*q = NULL;

p = &a;

if(…) q = p; // p → {a,b}, q → {a,NULL}

p = &b;

Pointer analysis via set-constraints
• Generate set-constraints from program and solve them
• Use constraint graph for efficient solving

int a,b,c,*p,*q,*r;

p = &a; // p ⊇ { a }

r = &b; // r ⊇ { b }

q = &c; // q ⊇ { c }

if(...)

q = p; // q ⊇ p

else

q = r; // q ⊇ r

(program, with the constraint generated by each statement shown in the comments)

(constraint graph: nodes p, q, r with edges p → q and r → q; initially p = {a}, r = {b}, q = {c}; after solving, p = {a}, r = {b}, q = {a,b,c})
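
The solving step on this slide can be sketched in a few lines of C. This is a minimal illustration, not the authors' solver: it assumes a fixed set of six nodes, stores each solution set as a bitset, and propagates along edges by naive iteration until nothing changes. The names `Graph`, `add_base`, `add_copy`, `solve` and `slide_example` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Node ids: a, b, c are plain ints; p, q, r are the pointers. */
enum { A, B, C, P, Q, R, NNODES };

typedef struct {
    unsigned pts[NNODES];        /* pts[v]: bitset of nodes v may point to */
    bool edge[NNODES][NNODES];   /* edge[u][v]: constraint "v superset of u" */
} Graph;

/* Constraint: v superset of { target }, generated by v = &target. */
static void add_base(Graph *g, int v, int target) {
    assert(v >= 0 && v < NNODES && target >= 0 && target < NNODES);
    g->pts[v] |= 1u << target;
}

/* Constraint: dst superset of src, generated by dst = src. */
static void add_copy(Graph *g, int dst, int src) {
    assert(dst >= 0 && dst < NNODES && src >= 0 && src < NNODES);
    g->edge[src][dst] = true;
}

/* Propagate solution sets along edges until a fixed point is reached. */
static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int u = 0; u < NNODES; u++) {
            for (int v = 0; v < NNODES; v++) {
                if (g->edge[u][v] && (g->pts[u] & ~g->pts[v]) != 0) {
                    g->pts[v] |= g->pts[u];
                    changed = true;
                }
            }
        }
    }
}

/* Build and solve the constraints from the slide's program. */
static Graph slide_example(void) {
    Graph g = {0};
    add_base(&g, P, A);   /* p = &a; */
    add_base(&g, R, B);   /* r = &b; */
    add_base(&g, Q, C);   /* q = &c; */
    add_copy(&g, Q, P);   /* q = p;  */
    add_copy(&g, Q, R);   /* q = r;  */
    solve(&g);
    return g;
}
```

Calling `slide_example()` leaves p and r with their single targets and gives q all three of a, b and c, matching the solved graph. Because the analysis is flow-insensitive, both branches of the `if` contribute an edge into q.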

Field-Sensitivity
• How to deal with aggregate types?
• Standard approach treats them as single variables

typedef struct { int *f1; int *f2; } t1;

int a,b,*p,*q,*r;

t1 x;

p = &a; // p ⊇ { a }

q = &b; // q ⊇ { b }

x.f1 = p; // x ⊇ p

x.f2 = q; // x ⊇ q

r = x.f1; // r ⊇ x

(constraint graph: nodes p, q, x, r with edges p → x, q → x and x → r; initially p = {a}, q = {b}, x = {}, r = {}; after solving, x = {a,b} and r = {a,b}, so r picks up b even though only x.f1 is read)
Field-Sensitivity – A simple solution
• Use a separate node per field for each aggregate
• Node “x” split in two

typedef struct { int *f1; int *f2; } t1;

int a,b,*p,*q,*r;

t1 x;

p = &a; // p ⊇ { a }

q = &b; // q ⊇ { b }

x.f1 = p; // xf1 ⊇ p

x.f2 = q; // xf2 ⊇ q

r = x.f1; // r ⊇ xf1

(constraint graph: nodes p, q, xf1, xf2, r with edges p → xf1, q → xf2 and xf1 → r; initially p = {a}, q = {b}, all others empty; after solving, xf1 = {a}, xf2 = {b}, r = {a})
Problem – can take address of field in C

typedef struct { int *f1; int *f2; } t1;

int **p;

t1 x,*s;

s = &x; // s ⊇ { x }

p = &(s->f2); // p ⊇ (*s) || f2 ⇒ p ⊇ { x } || f2 ⇒ p ⊇ { xf2 }

• System thus far has no mechanism for this
• First idea – use string concatenation operator ||
• Works well for this example

(constraint graph: nodes xf1 and xf2; solution sets elided on the slide)
Problem – compatible types

typedef struct { int *f1; int *f2; } t1;

typedef struct { int *f3; int *f4; } t2;

int **p;

t1 *s; t2 x;

s = (t1*) &x; // s ⊇ { x }

p = &(s->f2); // p ⊇ (*s) || f2 ⇒ p ⊇ { x } || f2 ⇒ p ⊇ { xf2 }

• First idea – use string concatenation operator ||
• Casting identical types except for field names
• Derivation same as before, but node xf2 no longer exists!

(constraint graph: nodes xf3 and xf4; solution sets elided on the slide)
Field-Sensitivity – Our Solution

typedef struct { int *f1; int *f2; } t1;

typedef struct { int *f3; int *f4; } t2;

int **p;

t1 *s; t2 x;

s = (t1*) &x; // s ⊇ { xf3 } ⇒ s ⊇ { 2 }

p = &(s->f2); // p ⊇ s + 1 ⇒ p ⊇ { 2 } + 1 ⇒ p ⊇ { 3 }

• Our solution – map variables to integers
• Solution sets become integer sets

(constraint graph: nodes p, s, xf3, xf4 numbered 0 to 3, with xf3 = 2 and xf4 = 3)
Conclusion
• Field-sensitive Pointer Analysis
• Presented new technique for C language
• Elegantly copes with language features
• Compatible types and casting
• Technique also handles function pointers without modification
• Experimental evaluation over 7 common C programs
• Considerable improvements in precision obtained
• But, much higher solving times
• And, relative gains appear to diminish with larger benchmarks

Constraint Graphs (continued)
• What about statements involving a pointer dereference?
• Cannot be represented in the constraint graph
• Thus, computation similar to dynamic transitive closure

int a,*r,*s,**p,**q;

p = &r; // p ⊇ { r }

s = &a; // s ⊇ { a }

q = p; // q ⊇ p

*q = s; // *q ⊇ s ⇒ r ⊇ s

(program, with the constraint generated by each statement shown in the comments)

(constraint graph: nodes p, q, r, s with edge p → q; initially p = {r}, s = {a}, q = {}, r = {}; once q = {r} is known, *q ⊇ s adds the edge s → r; after solving, p = {r}, q = {r}, s = {a}, r = {a})
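
The interplay between dereference constraints and the graph can be sketched as follows. This is a naive illustration under stated assumptions, not the solver evaluated in the talk: it handles only stores of the form *q ⊇ s, keeps them in a side table, and alternates between adding the edges they imply and propagating sets until neither step changes anything. The names `Graph`, `add_store`, `solve` and `slide_example` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum { A, R, S, P, Q, NNODES };

typedef struct {
    unsigned pts[NNODES];         /* pts[v]: bitset of nodes v may point to */
    bool edge[NNODES][NNODES];    /* edge[u][v]: "v superset of u" */
    bool store[NNODES][NNODES];   /* store[q][s]: "*q superset of s" */
} Graph;

static void add_base(Graph *g, int v, int target) {
    assert(v >= 0 && v < NNODES && target >= 0 && target < NNODES);
    g->pts[v] |= 1u << target;
}

static void add_copy(Graph *g, int dst, int src) {
    assert(dst >= 0 && dst < NNODES && src >= 0 && src < NNODES);
    g->edge[src][dst] = true;
}

static void add_store(Graph *g, int q, int s) {
    assert(q >= 0 && q < NNODES && s >= 0 && s < NNODES);
    g->store[q][s] = true;
}

static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        /* Step 1: each known target t of q turns "*q superset of s"
         * into a new edge s -> t. */
        for (int q = 0; q < NNODES; q++)
            for (int s = 0; s < NNODES; s++)
                for (int t = 0; t < NNODES; t++)
                    if (g->store[q][s] && (g->pts[q] & (1u << t)) != 0
                            && !g->edge[s][t]) {
                        g->edge[s][t] = true;
                        changed = true;
                    }
        /* Step 2: propagate solution sets along all current edges. */
        for (int u = 0; u < NNODES; u++)
            for (int v = 0; v < NNODES; v++)
                if (g->edge[u][v] && (g->pts[u] & ~g->pts[v]) != 0) {
                    g->pts[v] |= g->pts[u];
                    changed = true;
                }
    }
}

/* Build and solve the constraints from the slide's program. */
static Graph slide_example(void) {
    Graph g = {0};
    add_base(&g, P, R);    /* p = &r; */
    add_base(&g, S, A);    /* s = &a; */
    add_copy(&g, Q, P);    /* q = p;  */
    add_store(&g, Q, S);   /* *q = s; */
    solve(&g);
    return g;
}
```

In `slide_example()` the edge s → r does not exist until q's solution set is known, so edges and solution sets have to be computed together. That mutual dependence is what the slide means by dynamic transitive closure.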