On Abstraction Refinement for Program Analyses in Datalog

Xin Zhang, Ravi Mangal, Mayur Naik

Georgia Tech

Radu Grigore, Hongseok Yang

Oxford University


Datalog for program analysis

Datalog

Programming Language Design and Implementation, 2014


What is Datalog?

Input relations: edge(i, j).

Output relations: path(i, j).

Rules:

(1) path(i, i).

(2) path(i, k) :- path(i, j), edge(j, k).

Least fixpoint computation:

Input: edge(0, 1), edge(1, 2).

path(0, 0).

path(1, 1).

path(2, 2).

path(0, 1) :- path(0, 0), edge(0, 1).

path(0, 2) :- path(0, 1), edge(1, 2).
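The least fixpoint computation above can be sketched in a few lines of Python. This is a naive evaluation loop written for illustration, not how a Datalog engine evaluates rules; the function name `least_fixpoint` and the explicit `nodes` set are assumptions made for the sketch.

```python
def least_fixpoint(nodes, edges):
    """Naively apply rules (1) and (2) until no new path tuple appears."""
    path = {(i, i) for i in nodes}            # rule (1): path(i, i).
    changed = True
    while changed:
        changed = False
        for (i, j) in list(path):
            for (j2, k) in edges:
                # rule (2): path(i, k) :- path(i, j), edge(j, k).
                if j == j2 and (i, k) not in path:
                    path.add((i, k))
                    changed = True
    return path


nodes = {0, 1, 2}
edges = {(0, 1), (1, 2)}                      # input: edge(0, 1), edge(1, 2).
paths = least_fixpoint(nodes, edges)
print(sorted(paths))
```

Besides the tuples listed on the slide, the fixpoint also contains path(1, 2), derived from path(1, 1) and edge(1, 2).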


Why Datalog?

If there exists a path from a to b, and there is an edge from b to c, then there exists a path from a to c:

path(a, c) :- path(a, b), edge(b, c).


k-object-sensitivity, k = 2, ~100KLOC


Limitation

k-object-sensitivity, k = 10, ~500KLOC

k-object-sensitivity, k = 2, ~100KLOC


Program abstraction

Abstraction

Precision

Scalability


Parametric program abstraction

Abstraction

Precision

Scalability


Parametric program abstraction: Example 1

Cloning depth K for each call site and allocation site

Pointer Analysis


Parametric program abstraction: Example 2

Predicates to use as abstraction predicates

Shape Analysis


Program abstraction


Counterexample guided refinement (CEGAR) via MAXSAT

Datalog Program

Datalog Program

alias(p, q)?

alias(m, n)?


Pointer analysis example

f() {
  v1 = new ...;
  v2 = id1(v1);
  v3 = id2(v2);
  q2: assert(v3 != v1);
}

g() {
  v4 = new ...;
  v5 = id1(v4);
  v6 = id2(v5);
  q1: assert(v6 != v1);
}

id1(v) { return v; }

id2(v) { return v; }


Pointer analysis as graph reachability

[Figure: graph over nodes 0, 1, 2, 3, 4, 5, 6, 6’, 6’’, 7, 7’, 7’’ with edges labeled a0, a1, b0, b1, c0, c1, d0, d1.]


Graph reachability in Datalog

Input relations: edge(i, j, n), abs(n)

Output relations: path(i, j)

Rules:

(1) path(i, i).

(2) path(i, j) :- path(i, k), edge(k, j, n), abs(n).

Input tuples:

edge(0, 6, a0), edge(0, 6’, a1), edge(3, 6, b0), ...

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).

16 possible abstractions in total
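To make the count of 16 concrete, the sketch below enumerates every abstraction by brute force and runs the reachability rules under each one. Only nine edges are spelled out on the slides; the remaining a1, b1, c1, and d1 edges through 6', 6'', 7', and 7'' are assumed here by symmetry with the figure labels. Reading q1 as "no path(0, 5)" and q2 as "no path(0, 2)" is likewise an assumption. Exhaustive enumeration is what a refinement approach tries to avoid, so this is only a way to inspect the search space.

```python
from itertools import product

# Edges as (source, target, label). These nine appear on the slides.
SLIDE_EDGES = [
    ("0", "6", "a0"), ("0", "6'", "a1"), ("3", "6", "b0"),
    ("6", "1", "a0"), ("6", "4", "b0"),
    ("1", "7", "c0"), ("4", "7", "d0"),
    ("7", "2", "c0"), ("7", "5", "d0"),
]
# ASSUMPTION: the remaining edges mirror the ones above through the primed nodes.
ASSUMED_EDGES = [
    ("6'", "1", "a1"), ("3", "6''", "b1"), ("6''", "4", "b1"),
    ("1", "7'", "c1"), ("7'", "2", "c1"),
    ("4", "7''", "d1"), ("7''", "5", "d1"),
]
EDGES = SLIDE_EDGES + ASSUMED_EDGES


def reachable(src, abstraction):
    """Nodes j with path(src, j), using only edges whose label n has abs(n)."""
    seen = {src}
    frontier = [src]
    while frontier:
        k = frontier.pop()
        for (u, v, n) in EDGES:
            if u == k and n in abstraction and v not in seen:
                seen.add(v)
                frontier.append(v)
    return seen


# Exactly one of x0 / x1 is chosen for each of a, b, c, d: 2**4 = 16 choices.
abstractions = [
    frozenset(choice)
    for choice in product(("a0", "a1"), ("b0", "b1"), ("c0", "c1"), ("d0", "d1"))
]

proves_q1 = [a for a in abstractions if "5" not in reachable("0", a)]
proves_q2 = [a for a in abstractions if "2" not in reachable("0", a)]

print(len(abstractions), len(proves_q1), len(proves_q2))
```

Under these assumptions, some abstractions rule out path(0, 5), among them the one choosing a1, b0, c1, d0, while none rules out path(0, 2).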


Desired result

[Figure: the graph, Datalog program, and input tuples from the previous slide, repeated.]


Iteration 1


path(0, 0).

path(0, 6) :- path(0, 0), edge(0, 6, a0), abs(a0).

path(0, 1) :- path(0, 6), edge(6, 1, a0), abs(a0).

path(0, 7) :- path(0, 1), edge(1, 7, c0), abs(c0).

path(0, 2) :- path(0, 7), edge(7, 2, c0), abs(c0).

path(0, 4) :- path(0, 6), edge(6, 4, b0), abs(b0).

path(0, 7) :- path(0, 4), edge(4, 7, d0), abs(d0).

path(0, 5) :- path(0, 7), edge(7, 5, d0), abs(d0).

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
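The derivations listed for Iteration 1 can be reproduced with a small loop that records every firing of rule (2). The sketch uses the abstraction {a0, b0, c0, d0}, which is what the listed derivations rely on, and only the edges that appear in them; the helper name `rule_firings` is hypothetical.

```python
# Edges used by the Iteration 1 derivations, as (source, target, label).
EDGES = [
    (0, 6, "a0"), (6, 1, "a0"), (6, 4, "b0"),
    (1, 7, "c0"), (4, 7, "d0"),
    (7, 2, "c0"), (7, 5, "d0"),
]


def rule_firings(src, edges, abstraction):
    """Apply rule (2) starting from path(src, src) and record each firing.

    Returns (reached, firings); a firing (k, j, n) stands for
    path(src, j) :- path(src, k), edge(k, j, n), abs(n).
    """
    reached = {src}                  # rule (1): path(src, src).
    firings = []
    changed = True
    while changed:
        changed = False
        for (k, j, n) in edges:
            if k in reached and n in abstraction and (k, j, n) not in firings:
                firings.append((k, j, n))
                reached.add(j)
                changed = True
    return reached, firings


reached, firings = rule_firings(0, EDGES, {"a0", "b0", "c0", "d0"})
for (k, j, n) in firings:
    print(f"path(0, {j}) :- path(0, {k}), edge({k}, {j}, {n}), abs({n}).")
```

path(0, 7) is recorded twice, once through node 1 and once through node 4, so the recorded firings form a graph rather than a tree.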


Iteration 1 - derivation graph

[Figure: derivation graph over the tuples path(0,0), path(0,6), path(0,1), path(0,4), path(0,7), path(0,2), path(0,5); the edge tuples edge(0,6,a0), edge(6,1,a0), edge(6,4,b0), edge(1,7,c0), edge(4,7,d0), edge(7,2,c0), edge(7,5,d0); and the abstraction tuples abs(a0), abs(b0), abs(c0), abs(d0).]

Labels shown on successive builds of the graph: a0c0, a0b0d0


Encoded as MAXSAT

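MAXSAT takes hard constraints and weighted soft constraints.

Find: an assignment to the variables

Maximize: the total weight of the satisfied soft constraints

Subject to: every hard constraint being satisfied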


Hard constraints: avoid all the counterexamples

Soft constraints: minimize the abstraction cost

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).


Solution: a1c0d0
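The optimization step can be mimicked on this tiny example by exhaustive search. The sketch below is not a MAXSAT solver and does not reproduce the talk's actual encoding. It assumes the hard constraints amount to avoiding the label sets a0c0 and a0b0d0 shown for Iteration 1, and that the cost of an abstraction is the number of x1 choices it makes (an assumption suggested by the naming).

```python
from itertools import product

PARAMS = ["a", "b", "c", "d"]

# ASSUMPTION: counterexamples are the label sets shown for Iteration 1.
counterexamples = [{"a0", "c0"}, {"a0", "b0", "d0"}]


def cost(abstraction):
    """ASSUMPTION: each x1 choice costs 1, each x0 choice costs 0."""
    return sum(1 for label in abstraction if label.endswith("1"))


def cheapest_avoiding(counterexamples):
    """Return a minimum-cost abstraction containing no counterexample set."""
    best = None
    for bits in product("01", repeat=len(PARAMS)):
        abstraction = {p + b for p, b in zip(PARAMS, bits)}
        # Hard constraint: no counterexample may be fully enabled.
        if any(cex <= abstraction for cex in counterexamples):
            continue
        # Soft constraint: prefer the cheaper abstraction.
        if best is None or cost(abstraction) < cost(best):
            best = abstraction
    return best


solution = cheapest_avoiding(counterexamples)
print(sorted(solution))
```

With these inputs the search picks a1 together with b0, c0, and d0, which agrees with the a1c0d0 solution on the slide.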


Iteration 2 and beyond

Iteration 1: the derivation leads to the abstraction a1c0d0.

Iteration 2: the derivation under a1c0d0 leads to the abstraction a1c1d0.

Iteration 3: under a1c1d0, q1 is proven. q2 is impossible to prove.


Mixing counterexamples

Eliminated abstractions:

Iteration 1: a0c0

Iteration 3: a1c1

Mixed: a0c1


Experimental setup

  • Implemented in JChord using off-the-shelf solvers:

    • Datalog: bddbddb

    • MAXSAT: MiFuMaX

  • Applied to two analyses that are challenging to scale:

    • k-object-sensitivity pointer analysis:

      • flow-insensitive, weak updates, cloning-based

    • typestate analysis:

      • flow-sensitive, strong updates, summary-based

  • Evaluated on 8 Java programs from DaCapo and Ashes.


Benchmark characteristics


Results: pointer analysis

4-object-sensitivity

< 50%

< 3% of max


Performance of Datalog: pointer analysis

[Figure: running time on lusearch. Baseline: k = 1, 153s; k = 2, 214s; k = 3, 590s; k = 4, 3h28m.]


Performance of MAXSAT: pointer analysis

lusearch


Statistics of MAXSAT formulae


Conclusion

Datalog

MAXSAT

Abstraction


Datalog

MAXSAT

A(x, y) :- B(x, z), C(z, y)

Soundness

Hard Constraints

Tradeoffs

Scalability vs. Precision

Soft Constraints

Sound vs. Complete
