Invariant Inference for Many-Object Systems

1 / 76

# Invariant Inference for Many-Object Systems - PowerPoint PPT Presentation

Swiss Federal Institute of Technology, Lausanne. Invariant Inference for Many-Object Systems. http://lara.epfl.ch/~kuncak. joint work with: Thomas Wies, IST Austria. iIWIGP , ETAPS 2011. Viktor Kuncak. Broader Agenda: Implicit Programming. requirements. def f(x : Int ) = {

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## Invariant Inference for Many-Object Systems

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

Swiss Federal Institute of Technology, Lausanne

### Invariant Inference for Many-Object Systems

http://lara.epfl.ch/~kuncak

• joint work with:Thomas Wies, IST Austria

iIWIGP, ETAPS 2011

Viktor Kuncak

requirements

def f(x : Int) = {

y = 2 * x + 1

}

working system

Implicit Programming

A high-level declarative programming model

In addition to traditional recursive functions and loop constructs, use (as a language construct)implicit specificationsgive property of result, not how to compute it

1) more expressive 2) easier to argue it’s correct

Challenge:

• make it executable and efficient so it is useful

Claim: automated reasoning is a key technique

Automated Theorem Proving

Impossibility could similarly apply to

verification and synthesis

How to transfer successful

theorem proving and verification techniques

to synthesis?

Difficulty is in Unbounded Quantities
• interactive executionsof unbounded length
• unbounded values(integers, terms)
• systems w/ unbounded number of processes(topology - also graphs)
Programming Activity

requirements

Consider three related activities:

• Development within an IDE (emacs, Visual Studio, Eclipse)
• Compilation and static checking(optimizing compiler for the language, static analyzer, contract checker)
• Execution on a (virtual) machine

More compute power available for each of these

 use it to improve programmer productivity

def f(x : Int) = {

y = 2 * x + 1

}

iconst_1

42

Implicit Programming at All Levels

requirements

Opportunities for implicit programming in

• Development within an IDE
• isynth tool (CAV’11)
• Compilation
• Execution

def f(x : Int) = {

choose y st...

}

iconst_1

call Z3

42

joint work with: TihomirGvero, Ruzica Piskac

Interactive Synthesis of Code Snippets

def map[A,B](f:A => B, l:List[A]): List[B] = { ... }

defstringConcat(lst : List[String]): String = { ... }

...

defprintInts(intList:List[Int], prn: Int => String): String = 

Suggested valuesfor 

stringConcat(map(prn, intList))

...

• Is there a term of the given type in given environment?
• Monorphic types: decidable. Polymorphic types: undecidable
Our Solution: Use First-Order Resolution

isynth tool:

• idea is related to the Curry-Howard correspondence(but uses classical logic)
• implementation based on a first-order resolution prover we implemented in Scala
• supports method combinations, type polymorphism, user preferences
• ranking of multiple returned solutions
• using a system of weights
• preserving completeness

CAV 2011 tool paper with TihomirGvero and Ruzica Piskac

As the platform for our tools we use the Scala programming language, developed at EPFL

http://www.scala-lang.org/

“Scala is a general purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. It smoothly integrates features of object-oriented and functional languages, enabling Java and other programmers to be more productive.”

 one can write both ML-like and Java-like code

Implicit Programming at All Levels

requirements

Opportunities for implicit programming in

• Development within an IDE
• isynth tool
• Compilation
• Comfusy and RegSytools
• Execution
• Scala^Z3and UDITA tools

def f(x : Int) = {

choose y st...

}

iconst_1

call Z3

42

The choose Implicit Construct

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

choose((h: Int, m: Int, s: Int) ⇒ (

h * 3600 + m * 60 + s == totalSeconds

&& 0 <= h

&& 0 <= m && m < 60

&& 0 <= s && s < 60 ))

3787 seconds

1 hour, 3 mins. and 7 secs.

Implicit Programming at All Levels

requirements

Opportunities for implicit programming in

• Development within an IDE
• isynth tool
• Compilation
• Comfusy and RegSytools
• Execution
• Scala^Z3and UDITA tools

def f(x : Int) = {

choose y st...

}

iconst_1

call Z3

42

Scala^Z3Invoking Constraint Solver at Run-Time

Java VirtualMachine

- functional and imperative code

- custom ‘decision procedure’ plugins

Z3 SMT Solver

(L. de Moura

N. Bjoerner

MSR)

Q: implicit constraint

A: model

Q: queries containing

extension symbols

A: custom theoryconsequences

with: Philippe SuterandAli SinanKöksal

Executing choose using Z3

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

choose((h: Var[Int], m: Var[Int], s: Var[Int]) ⇒ (

h * 3600 + m * 60 + s == totalSeconds

&& 0 <= h

&& 0 <= m && m < 60

&& 0 <= s && s < 60 ))

will be constant at run-time

syntax tree constructor

3787 seconds

1 hour, 3 mins. and 7 secs.

It works, certainly for constraints within Z3’s supported theories

Implemented as a library (jar + z3.so / dll) – no compiler extensions

Programming in Scala^Z3

find triples of integers x, y, z such that x > 0, y > x,

2x + 3y <= 40, x · z = 3y2, and y is prime

model enumeration (currently: negate previous)

val results = for(

(x,y)  findAll((x: Var[Int], y: Var[Int]) ) => x > 0 && y > x && x * 2 + y * 3 <= 40);

ifisPrime(y);

z findAll((z: Var[Int]) ) => x * z=== 3 *y *y))

yield(x, y, z)

user’s Scala function

λ

Scala’s existing mechanism for composing iterations(reduces to standard higher order functions such as flatMap-s)

• Use Scala syntax to construct Z3 syntax trees
• a type system prevents certain ill-typed Z3 trees
• Obtain models as Scala values
• Can also write own plugin decision procedures in Scala
UDITA: system for Test Generation

void generateDAG(IG ig) {

for (int i = 0; i < ig.nodes.length; i++) {

intnum = chooseInt(0, i);

ig.nodes[i].supertypes = new Node[num];

for (int j = 0, k = −1; j < num; j++) {

k = chooseInt(k + 1, i − (num − j));

ig.nodes[i].supertypes[j] = ig.nodes[k];

} } }

We used to it to find real bugs injavac, JPF itself, Eclipse, NetBeans refactoring

On top of Java Pathfinder’s backtracking mechanism

Can enumerate all executions

Key: suspended execution of non-determinism

Java + choose

- integers

- (fresh) objects

with: M. Gligoric, T. Gvero, V. Jagannath, D. Marinov, S. Khurshid

Implicit Programming at All Levels

requirements

Opportunities for implicit programming in

• Development within an IDE
• isynth tool
• Compilation
• Comfusy and RegSytools
• Execution
• Scala^Z3and UDITA tools

I next examine these tools, from last to first, focusing on Compilation

def f(x : Int) = {

choose y st...

}

iconst_1

call Z3

42

An example

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

choose((h: Int, m: Int, s: Int) ⇒ (

h * 3600 + m * 60 + s == totalSeconds

&& h ≥ 0

&& m ≥ 0 && m < 60

&& s ≥ 0 && s < 60 ))

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

val t1 = totalSecondsdiv 3600

valt2 = totalSeconds + ((-3600) * t1)

valt3 = min(t2 div 60, 59)

valt4 = totalSeconds + ((-3600) * t1) + (-60 * t3)

(t1, t3, t4)

3787 seconds

1 hour, 3 mins. and 7 secs.

Comparing with runtime invocation

Pros of synthesis

Pros of runtime invocation

Conceptually simpler

Can use off-the-shelf solver

for now can be more expressive and even faster

but:

• Change in complexity: time is spent at compile time
• Solving most of the problem only once
• Partial evaluation: we get a specialized decision procedure
• No need to ship a decision procedure with the program

valtimes =

for (secs ← timeStats)

yieldsecondsToTime(secs)

Possible starting point: quantifier elimination
• A specification statement of the form

r = choose(x ⇒ F( a, x ))

“let r be x such that F(a, x) holds”

• Corresponds to constructively solving thequantifier eliminationproblemwherea is a parameter
• Witness terms from QE are the generated program!

∃ x . F( a, x )

choose((x, y) ⇒ 5 * x + 7 * y == a && x ≤ y)

Corresponding quantifier elimination problem:

∃ x ∃ y . 5x + 7y = a ∧ x ≤ y

Use extended Euclid’s algorithm to find particular solution to 5x + 7y = a:

(5,7 are mutually prime, else we get divisibility pre.)

x = 3a

y = -2a

Express general solution of equationsfor x, y using a new variable z:

x = -7z + 3a

y = 5z - 2a

Rewrite inequations x ≤ y in terms of z:

5a ≤ 12z

z ≥ ceil(5a/12)

Obtain synthesized program:

valz = ceil(5*a/12)

valx = -7*z + 3*a

valy = 5*z + -2*a

z = ceil(5*31/12) = 13

x = -7*13 + 3*31 = 2

y = 5*13 – 2*31 = 3

For a = 31:

Comparison to Interactive Systems
• In this context, we have only one quantifier, no alternation
• It suffices to solve parameterized satisfiability problem to obtain synthesized code
Synthesis for sets

defsplitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =

choose((a: Set[T], b: Set[T]) ⇒ (

a union b == s && a intersect b == empty

&& a.size – b.size ≤ 1

&& b.size – a.size ≤ 1

))

defsplitBalanced[T](s: Set[T]) : (Set[T], Set[T]) =

val k = ((s.size + 1)/2).floor

valt1 = k

val t2 = s.size – k

val s1 = take(t1, s)

vals2 = take(t2, s minus s1)

(s1, s2)

s

b

a

Compile-time warnings

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

choose((h: Int, m: Int, s: Int) ⇒ (

h * 3600 + m * 60 + s == totalSeconds

&& h ≥ 0 && h < 24

&& m ≥ 0 && m < 60

&& s ≥ 0 && s < 60

))

Warning: Synthesis predicate is not satisfiable for variable assignment:

totalSeconds = 86400

### RegSy

Synthesis for regular specifications over unbounded domainsJ. Hamza, B. Jobstmann, V. KuncakFMCAD 2010

Synthesize Functions over Integers
• Given weight w, balance beam using weights 1kg, 3kg, and 9kg
• Where to put weights if w=7kg?

w

9

1

3

Synthesize Functions over Integers

1

3

7

9

• Given weight w, balance beam using weights 1kg, 3kg, and 9kg
• Where to put weights if w=7kg?
• Synthesize program that computes correct positions of 1kg, 3kg, and 9kg for any w?
Synthesize Functions over Integers

1

3

7

9

Assumption: Integers are non-negative

Expressiveness of Spec Language
• Non-negative integer constants and variables
• Boolean operators
• Linear arithmetic operator
• Bitwise operators
• Quantifiers over numbers and bit positions

PAbit = Presburger arithmetic with bitwise operators

Problem Formulation

Given

• relation R over bit-stream (integer) variables in WS1S (PAbit)
• partition of variables into inputs and outputs

Constructs program that, given inputs, computescorrect output values, whenever they exist.

Unlike Church synthesis problem, we do not require causality (but if spec has it, it helps us)

010001101010101010101010101101111111110101010101100000000010101010100011010101010001101010101010101010101101111111110101010101100000000010101010100011010101

Basic Idea
• Viewintegers as finite (unbounded) bit-streams(binaryrepresentationstartingwith LSB)
• Specification in WS1S (PAbit)
• Synthesisapproach:
• Step 1: Compile specification to automaton over combined input/output alphabet (automaton specifying relation)
• Step 2: Use automaton to generate efficient function from inputs to outputs realizing relation
Our Approach: Precompute

without losing backward information

Synthesis:

Det. automaton for spec over joint alphabet

Project, determinize, extract lookup table

WS1S

spec

Mona

Synthesized program:

Automaton + lookup table

Execution:

Run A on input w and record trace

Use table to run backwards and output

Input word

Output word

Experiments
• Linear scaling observed in length of input

In 3 seconds solve constraint, minimizing the output; inputs and outputs are of order 24000

Implicit Programming at All Levels

requirements

Opportunities for implicit programming in

• Development within an IDE
• isynth tool
• Compilation
• Comfusy and RegSytools
• Execution
• Scala^Z3and UDITA tools

def f(x : Int) = {

choose y st...

}

iconst_1

call Z3

42

Difficulty is in Unbounded Quantities
• interactive executionsof unbounded length
• unbounded values(integers, terms)
• systems w/ unbounded number of processes(topology - also graphs)
Analysis of Multi-Object Systems

Objects can represent

• nodes of linked data structures
• nodes in a networks of processes

Invariants describing topology are quantified

Further techniques needed for

• proving verification conditions with reachability

 TREX logic is in NP, has reachability, quantifier

• exploring state spaces approach to symbolic shape analysis generalizes predicate abstraction techniques to discover quantified invariants

class List{

private List next;

private Object data;

privatestatic List root;

List n1 = new List();

n1.next = root;

n1.data = x;

root = n1;

}

}

root

next

next

next

data

data

data

data

x

State/Transition Graphs

nodes are states

x:5,y:0

state space

edges are transitions

x := x+y

Domain Predicate Abstraction [Podelski, Wies SAS’05]

Partition heap according to a finite set of predicates.

0

7

3

0

7

3

5

Domain Predicate Abstraction [Podelski, Wies SAS’05]

Partition heap according to a finite set of predicates.

Abstract state

Abstract domain

disjunctions of abstract states

0

7

3

5

How to prove verification conditions with next* ?

Partition heap according to a finite set of predicates.

Abstract state

Abstract domain

disjunctions of abstract states

Field Constraint Analysis[Wies, Kuncak, Lam, Podelski, Rinard, VMCAI’06]
• Goal: precise reasoning about reachability
• Reachability properties in trees are decidable
• Monadic Second-Order Logic over Trees
• decision procedure (MONA)
• construct a tree automaton for each formula
• check emptiness of the language of automaton

Using this approach: We can analyze implementations of trees

left

right

But only trees.Even parent links would introduce cycles!

Field Constraint Analysis
• Enables reasoning about non-tree fields
• Can handle broader class of data structures
• doubly-linked lists, trees with parent pointers

treebackbone

next

next

next

next

next

constrainedfields

nextSub

nextSub

Constrained fields satisfy constraint invariant:

forallx y. x.nextSub= y  next+(x,y)

Elimination of constrained fields

valid

valid

soundness

VC1(next,nextSub)

VC2(next)

field

constraint

analysis

MONA

completeness

invalid

invalid

(for useful class including

preservation of field constraints)

treebackbone

next

next

next

next

next

constrainedfields

nextSub

nextSub

Constrained fields satisfy constraint invariant:

forall x y. x.nextSub= y  next+(x,y)

root

6

9

3

5

1

first

4

Implementation in Bohne

Verified data structure implementations:

• lists with iterators
• sorted lists
• skip lists
• search trees
• trees w/ parent pointers

No manual adaptation of abstract domain / abstract transformer required.

root

6

9

3

5

1

first

4

Implementation in Bohne

Verified properties:

• absence of runtime errors
• shape invariants
• acyclic
• sharing-free
• sorted …
• partial correctness

No manual adaptation of abstract domain / abstract transformer required.

TREX (Trees with Reachability Expressions) Logic
• Field constraint analysis reduces to WS2S
• Works for simple tree operations
• Becomes infeasible for formulas from more complex operations on trees
• Solution: a new fragment logic of trees (TREX), with NP complexity
• supports reachability in trees
• supports a form of universal quantification
• Small model property not enough for NP
• Use local theory extensions
• Can implement it on top of SMT solvers (ongoing)
From verification
• This will give us verification technique for trees
• We plan to lift it to synthesis techniquesfor
• distributed systems
• Our previous experience: CrystalBall(with M. Yabandeh, N. Knezevic, D. Kostic)
• bounded model checking to find errors during actual execution
• avoiding errors by changing behavior

requirements

def f(x : Int) = {

y = 2 * x + 1

}

working system

Notions Related to Implicit Programing
• Code completion
• help programmer to interactively develop the program
• Synthesis – core part of our vision
• key to compilation strategies for specification constructs
• Manual refinement from specs (Morgan, Back)
• Logic Programming
• shares same vision, in particular CLP(X)
• operational semantics design choices limit what systems can do (e.g. Prolog)
• CLP solvers theories limited compared to SMT solvers
• not on mainstream platforms, no curly braces , SAT
Relationship to Verification
• Some functionality is best synthesized from specs
• Other is perhaps best implemented, then verified
• Currently, no choice - always must implement
• so specifications viewed as overhead
• Goal: make specifications intrinsic part of program, with clear benefits to programmers: they execute!
• We expect that this will help both
• verifiability and
• productivity
• example: state an assertion, not how to establish it
Conclusion: Implicit Programming

Developmentwithin an IDE

• isynthtool – FOL resolution as code completion

Compilation

• Comfusy: decision procedure  synthesis procedureScala implementation for integer arithmetic, BAPA
• RegSy: solving WS1S constraints

Execution

• Scala^Z3 : constraint programming
• UDITA: Java + choice as test generation language

Future: linked structures and distributed systems

http://lara.epfl.ch/w/impro

choose((x, y) ⇒ 5 * x + 7 * y == a && x ≤ y && x ≥ 0)

Express general solution of equationsfor x, y using a new variable z:

x = -7z + 3a

y = 5z - 2a

Rewrite inequations x ≤ y in terms of z:

z ≥ ceil(5a/12)

Rewrite x ≥ 0:

z ≤ floor(3a/7)

Precondition on a:

ceil(5a/12) ≤ floor(3a/7)

(exact precondition)

Obtain synthesized program:

assert(ceil(5*a/12) ≤ floor(3*a/7))

valz = ceil(5*a/12)

valx = -7*z + 3*a

valy = 5*z + -2*a

With more inequalities

we may generate a for loop

Handling of NP-Hard Constructs
• Disjunctions
• Synthesis of a formula computes program and exact precondition of when output exists
• Given disjunctive normal form, use preconditionsto generate if-then-else expressions (try one by one)
• Divisibility combined with inequalities:
• corresponding to big disjunction in q.e. ,we will generate a for loop with constant bounds(could be expanded if we wish)

choose x such that F(x,a)  x = t(a)

Result t(a) is expressed in terms of +, -, C*, /C, if

Need arithmetic for solving equations

Need conditionals for

• disjunctions in input formula
• divisibility and inequalities (find a witness meeting bounds and divisibility by constants)

t(a) = if P1(a) t1(a) elseif … elseifPn(a) tn(a) else error(“No solution exists for input”,a)

When do we have witness generating quantifier elimination?
• Suppose we have
• class of specification formulas S
• decision procedure for formulas in class Dthat produces satisfying assignments
• function (e.g. substitution) that, given concrete values of parameters a and formula F in S, computes F(x,a) that belongs to D
• Then we have synthesis procedure for S(proof: invoke decision procedure at run-time)

If have decidability  also have computable witness-generating QE in the language extended w/ computable functions

Synthesis for non-linear arithmetic

defdecomposeOffset(offset: Int, dimension: Int) : (Int, Int) =

choose((x: Int, y: Int) ⇒ (

offset == x + dimension * y && 0 ≤ x && x < dimension

))

• The predicate becomes linear at run-time
• Synthesized program must do case analysis on the sign of the input variables
• Some coefficients are computed at run-time
Parameterized Optimization

6CHF

d1

potato

cabbage

9CHF

d2

Note: integers have unbounded number of bits

Synthesize Functions over bit-Streams

Smoothing a function:

• Given sequence X of 4-bit numbers
• Compute its average sequence Y

X

Y

Example: Parity of Input Bits
• Input x and output y are bit-streams
• Spec: output variable y indicates parity of non-zero bits in input variable x
• y=00* if number of 1-bits in x is even, otherwise y=10*
• Examples:
Example: Parity of Input Bits
• Step 1: construct automaton for spec over joint alphabet
• Spec: y=00* if number of 1-bits in x is even, otherwise y=10*
• Accepting states are green

B

A

C

Note: must read entire input to know first output bit (non-causal)

Idea
• Run automaton on input, collect states for all possible outputs (subset construction)
• From accepting state compute backwards to initial state and output corresponding value

B

A

C

B

B

A

C

B

C

C

Automaton has all needed information

but subset construction for every input

Time and Space
• Automata may be large, but if table lookup is constant time, then forth-back run is linear in input (in number of bits)
• Also linear space
• can trade space for extra time by doing logarithmic check-pointing of state gives O(log(n)) space, O(n log(n)) time
Prototype and Experiments
• RegSy is implemented in Scala
• Uses MONA to construct automaton on joint alphabet from specification
• Build input-deterministic automaton and lookup table using a set of automata constructions
• Run on several bit-stream and parametric linear programming examples
Summary of RegSy
• Synthesize function over bit-stream (integer) variables
• Specification: WS1S or PA with bit‐wise operators (including quantifiers)
• Linear complexity of running synthesized code(linear in number of input bits)
• Synthesize specialized solvers to e.g. disjunctive parametric linear programming problems
• Recent work: replace MONA with different construction

Decision procedure

Synthesis procedure

Takes: a formula, with input (params) and output variables

(model-generating)

• Takes: a formula

a theorem prover that always succeeds

a synthesizer that always succeeds

• Makes: a modelof the formula
• Makes: a program to compute outputvalues from input values

5*x + 7*y = 31

Inputs: { x } outputs: { y }

5*x + 7*y = 31

x := 2

y := 3

y := (31 – 5*x) / 7

Complete Functional Synthesis

• Synthesis: our procedures start from an implicit specification.
• Functional: computes a function that satisfies a given input/output relation.
• Complete: guaranteed to work for all specification expressions from a well-defined class.

Tool: Comfusy

Mikaël Mayer, Ruzica Piskac, Philippe Suter, in PLDI 2010, CAV 2010

Experience with Comfusy
• Works well for examples we encountered
• Needed: synthesis for more expressive logics, to handle more examples
• seems ideal for domain-specific languages
• Efficient for conjunctions of equations (could be made polynomial)
• Extends to synthesis with parametric coefficients
• Extends to logics that reduce to Presburger arithmetic (implemented for BAPA)
Comfusy for Arithmetic 
• Limitations of Comfusy for arithmetic:
• Naïve handling of disjunctions
• Blowup in elimination, divisibility constraints
• Complexity of running synthesized code (from QE): doubly exponential in formula size
• Not tested on modular arithmetic, or onsynthesis with optimization objectives
• Arbitrary-precision arithmetic with multiple operations generates time-inefficient code
• Cannot do bitwise operations(not in PA)
Compile-time warnings

defsecondsToTime(totalSeconds: Int) : (Int, Int, Int) =

choose((h: Int, m: Int, s: Int) ⇒ (

h * 3600 + m * 60 + s == totalSeconds

&& h ≥ 0

&& m ≥ 0 && m ≤ 60

&& s ≥ 0 && s < 60

))

Warning: Synthesis predicate has multiple solutions for variable assignment:

totalSeconds = 60

Solution 1: h = 0, m = 0, s = 60

Solution 2: h = 0, m = 1, s = 0

Arithmetic pattern matching

deffastExponentiation(base: Int, power: Int) : Int = {

deffp(m: Int, b: Int, i: Int): Int= imatch {

case 0 ⇒ m

case 2 * j ⇒ fp(m, b*b, j)

case 2 * j + 1 ⇒ fp(m*b, b*b, j)

}

fp(1, base, p)

}

• Goes beyond Haskell’s (n+k) patterns
• Compiler checks that all patterns are reachable and whether the matching is exhaustive