On Abstraction Refinement for Program Analyses in Datalog

Xin Zhang, Ravi Mangal, Mayur Naik

Georgia Tech

Radu Grigore, Hongseok Yang

Oxford University


Datalog for program analysis

Datalog

Programming Language Design and Implementation, 2014


What is Datalog?

Input relations:

Output relations:

Rules:

edge(i, j).

path(i, j).

(1) path(i, i).

(2) path(i, k) :- path(i, j), edge(j, k).

Least fixpoint computation:

Input: edge(0, 1), edge(1, 2).

path(0, 0).

path(1, 1).

path(2, 2).

path(0, 1) :- path(0, 0), edge(0, 1).

path(0, 2) :- path(0, 1), edge(1, 2).
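The least fixpoint computation above can be sketched as a naive bottom-up loop. This is a minimal illustration in Python, not how a Datalog engine such as bddbddb evaluates rules; the function name `least_fixpoint` is made up for the example. Note that the full fixpoint also contains path(1, 2), which the slide does not list.

```python
def least_fixpoint(nodes, edges):
    """Naive bottom-up evaluation of the two path rules."""
    # Rule (1): path(i, i) holds for every node i.
    path = {(i, i) for i in nodes}
    changed = True
    while changed:
        changed = False
        # Rule (2): path(i, k) :- path(i, j), edge(j, k).
        for (i, j) in list(path):
            for (src, k) in edges:
                if src == j and (i, k) not in path:
                    path.add((i, k))
                    changed = True
    return path


# Input from the slide: edge(0, 1), edge(1, 2).
edges = {(0, 1), (1, 2)}
paths = least_fixpoint({0, 1, 2}, edges)
print(sorted(paths))
```

The loop stops when a full pass adds no new tuple, which is the fixpoint condition.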

Datalog

Why Datalog?

If there exists a path from a to b, and there is an edge from b to c, then there exists a path from a to c:

path(a, c) :- path(a, b), edge(b, c).

Datalog


k-object-sensitivity, k = 2, ~100 KLOC

Limitation

k-object-sensitivity, k = 10, ~500 KLOC

k-object-sensitivity, k = 2, ~100 KLOC

Program abstraction

Abstraction

Precision

Scalability

Parametric program abstraction

Abstraction

Precision

Scalability

Parametric program abstraction: Example 1

Cloning depth K for each call site and allocation site

Pointer Analysis

Parametric program abstraction: Example 2

Predicates to use as abstraction predicates

Shape Analysis

Program abstraction


Counterexample guided refinement (CEGAR) via MAXSAT

Datalog Program

Datalog Program

alias(p, q)?

alias(m, n)?

Pointer analysis example

f() {
  v1 = new ...;
  v2 = id1(v1);
  v3 = id2(v2);
  q2: assert(v3 != v1);
}

id1(v) { return v; }

g() {
  v4 = new ...;
  v5 = id1(v4);
  v6 = id2(v5);
  q1: assert(v6 != v1);
}

id2(v) { return v; }

Pointer analysis as graph reachability

[Figure: graph over nodes 0, 1, 2, 3, 4, 5, 6, 6’, 6’’, 7, 7’, 7’’ with edges labeled a0, a1, b0, b1, c0, c1, d0, d1]

Graph reachability in Datalog

Input relations:

edge(i, j, n), abs(n)

Output relations:

path(i, j)

Rules:

(1) path(i, i).

(2) path(i, j) :- path(i, k), edge(k, j, n), abs(n).

Input tuples:

edge(0, 6, a0), edge(0, 6’, a1), edge(3, 6, b0),

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).

16 possible abstractions in total
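The count of 16 follows if each of the four pairs abs(x0)/abs(x1) is an exclusive choice, giving 2^4 combinations. That exclusive reading is an assumption suggested by the count itself. A short Python sketch enumerating them (the names `PARAMS` and `all_abstractions` are illustrative only):

```python
from itertools import product

# One binary choice per parameter: x0 (cheaper) or x1 (more precise), by assumption.
PARAMS = ["a", "b", "c", "d"]


def all_abstractions():
    """Every way of enabling exactly one of abs(x0) / abs(x1) for each x."""
    return [
        frozenset(f"{p}{bit}" for p, bit in zip(PARAMS, bits))
        for bits in product((0, 1), repeat=len(PARAMS))
    ]


abstractions = all_abstractions()
print(len(abstractions))  # 2 ** 4 == 16
```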

Desired result

[Figure: the graph, Datalog program, and input tuples from the "Graph reachability in Datalog" slide]

Iteration 1

path(0, 0).

path(0, 6) :- path(0, 0), edge(0, 6, a0), abs(a0).

path(0, 1) :- path(0, 6), edge(6, 1, a0), abs(a0).

path(0, 7) :- path(0, 1), edge(1, 7, c0), abs(c0).

path(0, 2) :- path(0, 7), edge(7, 2, c0), abs(c0).

path(0, 4) :- path(0, 6), edge(6, 4, b0), abs(b0).

path(0, 7) :- path(0, 4), edge(4, 7, d0), abs(d0).

path(0, 5) :- path(0, 7), edge(7, 5, d0), abs(d0).
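The derivation above can be reproduced with a small reachability routine that only follows edges whose label is enabled by the abstraction. This is a sketch under stated assumptions: the transcript gives only some of the edges, so the ones marked "assumed" below are filled in by symmetry and may not match the original figure. Node names are strings so that primed copies such as 6' can be written directly.

```python
# Labelled edges (source, target, parameter).
EDGES = [
    ("0", "6", "a0"), ("0", "6'", "a1"), ("3", "6", "b0"),   # from the slides
    ("6", "1", "a0"), ("6", "4", "b0"),                      # from the slides
    ("1", "7", "c0"), ("4", "7", "d0"),                      # from the slides
    ("7", "2", "c0"), ("7", "5", "d0"),                      # from the slides
    ("3", "6''", "b1"), ("6'", "1", "a1"), ("6''", "4", "b1"),  # assumed
    ("1", "7'", "c1"), ("4", "7''", "d1"),                      # assumed
    ("7'", "2", "c1"), ("7''", "5", "d1"),                      # assumed
]


def reachable_from(start, abstraction):
    """All j such that path(start, j) is derivable under the enabled abs(n) facts."""
    reached = {start}          # rule (1): path(i, i)
    frontier = [start]
    while frontier:
        k = frontier.pop()
        for src, dst, n in EDGES:
            # rule (2): path(i, j) :- path(i, k), edge(k, j, n), abs(n)
            if src == k and n in abstraction and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return reached


cheapest = {"a0", "b0", "c0", "d0"}
print(sorted(reachable_from("0", cheapest)))
```

Under the cheapest abstraction, node 0 reaches both 2 and 5, so neither query is proven in this iteration.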

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).

Iteration 1 - derivation graph


[Figure: derivation graph for iteration 1]

path tuples: path(0,0), path(0,6), path(0,1), path(0,4), path(0,7), path(0,2), path(0,5)

edge tuples: edge(0,6,a0), edge(6,1,a0), edge(6,4,b0), edge(1,7,c0), edge(4,7,d0), edge(7,2,c0), edge(7,5,d0)

abs tuples: abs(a0), abs(b0), abs(c0), abs(d0)

a0c0

a0b0d0

Encoded as MAXSAT

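MAXSAT(hard constraints, soft constraints):

Find an assignment that

maximizes the number (or total weight) of satisfied soft constraints,

subject to satisfying every hard constraint.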


Hard constraints: avoid all the counterexamples

Soft constraints: minimize the abstraction cost

abs(a0) ⊕ abs(a1), abs(b0) ⊕ abs(b1),

abs(c0) ⊕ abs(c1), abs(d0) ⊕ abs(d1).
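On an example this small, the encoding can be illustrated by brute force instead of a real MAXSAT solver. The sketch below makes several assumptions: the conjunctions a0c0 and a0b0d0 from the derivation-graph slides are read as counterexamples to block, each x1 costs 1 and each x0 costs 0, and exactly one of x0/x1 is chosen per parameter. None of these details is spelled out in the transcript.

```python
from itertools import product

PARAMS = ["a", "b", "c", "d"]

# Assumed reading of the derivation-graph slides: parameter sets that
# together derive a failing query, so no solution may contain them.
COUNTEREXAMPLES = [{"a0", "c0"}, {"a0", "b0", "d0"}]


def solve():
    """Return (cost, abstraction) of a cheapest assignment meeting the hard constraints."""
    best = None
    for bits in product((0, 1), repeat=len(PARAMS)):
        # Hard constraint: exactly one of x0 / x1 per parameter (by construction).
        abstraction = {f"{p}{b}" for p, b in zip(PARAMS, bits)}
        # Hard constraint: avoid every counterexample.
        if any(cex <= abstraction for cex in COUNTEREXAMPLES):
            continue
        # Soft constraints: prefer x0 over x1, so the cost is the number of x1 choices.
        cost = sum(bits)
        if best is None or cost < best[0]:
            best = (cost, abstraction)
    return best


print(solve())
```

With these inputs the cheapest assignment switches only a to a1, which agrees with the a1, c0, d0 shown on the solution slide. The b0 component is this sketch's output, not something the transcript states.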


Solution:

Hard constraints:

Soft constraints:

a1c0d0

Iteration 2 and beyond

Iteration 1

Derivation

a1c0d0


Iteration 2

Derivation

a1c0d0


Iteration 2

Derivation

a1c1d0


Iteration 3

Derivation

q2 is impossible to prove.

q1 is proven.

a1c1d0

a1c1d0

Impossibility
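The overall refinement loop can be sketched end to end on the running example. This is a simplified illustration, not the authors' algorithm: it blocks one derivation per iteration and picks the next abstraction by brute force, where the talk encodes the derivation graph into MAXSAT. The edges marked "assumed" are filled in by symmetry, and q1 and q2 are taken to correspond to path(0, 5) and path(0, 2). Because of these simplifications, the iteration count and the abstraction found can differ from the slides.

```python
from itertools import product

PARAMS = ["a", "b", "c", "d"]
EDGES = [
    ("0", "6", "a0"), ("0", "6'", "a1"), ("3", "6", "b0"),      # from the slides
    ("6", "1", "a0"), ("6", "4", "b0"), ("1", "7", "c0"),       # from the slides
    ("4", "7", "d0"), ("7", "2", "c0"), ("7", "5", "d0"),       # from the slides
    ("3", "6''", "b1"), ("6'", "1", "a1"), ("6''", "4", "b1"),  # assumed
    ("1", "7'", "c1"), ("4", "7''", "d1"),                      # assumed
    ("7'", "2", "c1"), ("7''", "5", "d1"),                      # assumed
]


def derivation(start, goal, abstraction):
    """Parameters used on one derivation of path(start, goal), or None if underivable."""
    used = {start: frozenset()}
    frontier = [start]
    while frontier:
        k = frontier.pop(0)
        for src, dst, n in EDGES:
            if src == k and n in abstraction and dst not in used:
                used[dst] = used[k] | {n}
                frontier.append(dst)
    return used.get(goal)


def cheapest_unblocked(blocked):
    """Cheapest abstraction containing no blocked counterexample, or None."""
    candidates = []
    for bits in product((0, 1), repeat=len(PARAMS)):
        abstraction = frozenset(f"{p}{b}" for p, b in zip(PARAMS, bits))
        if not any(cex <= abstraction for cex in blocked):
            candidates.append((sum(bits), bits, abstraction))
    return min(candidates)[2] if candidates else None


def refine(start, goal):
    """Return (status, abstraction, iterations) for the query 'goal is unreachable'."""
    blocked, iterations = [], 0
    while True:
        abstraction = cheapest_unblocked(blocked)
        if abstraction is None:
            return "impossible", None, iterations
        iterations += 1
        cex = derivation(start, goal, abstraction)
        if cex is None:
            return "proven", abstraction, iterations
        blocked.append(cex)  # the counterexample rules out this abstraction


q1 = refine("0", "5")
q2 = refine("0", "2")
print(q1[0], sorted(q1[1]))
print(q2[0])
```

The loop terminates because every counterexample is a subset of the abstraction that produced it, so each iteration eliminates at least one of the 16 candidates.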

Mixing counterexamples


Iteration 1

Mixed!

Iteration 3

Eliminated

Abstractions:

a0c0

a0c1

a1c1

Experimental setup

  • Implemented in JChord using off-the-shelf solvers:

    • Datalog: bddbddb

    • MAXSAT: MiFuMaX

  • Applied to two analyses that are challenging to scale:

    • k-object-sensitivity pointer analysis:

      • flow-insensitive, weak updates, cloning-based

    • typestate analysis:

      • flow-sensitive, strong updates, summary-based

  • Evaluated on 8 Java programs from DaCapo and Ashes.

Benchmark characteristics

Results: pointer analysis

4-object-sensitivity

< 50%

< 3% of max

Performance of Datalog: pointer analysis

[Figure: chart for lusearch with annotations: Baseline; k = 1, 153s; k = 2, 214s; k = 3, 590s; k = 4, 3h28m]

Performance of MAXSAT: pointer analysis

lusearch

Statistics of MAXSAT formulae

Conclusion

Datalog

MAXSAT

Abstraction

A(x, y) :- B(x, z), C(z, y)

Soundness

Hard Constraints

Tradeoffs

Scalability vs. Precision

Soft Constraints

Sound vs. Complete
