Program Verification and Synthesis using Templates over Predicate Abstraction
Saurabh Srivastava # Sumit Gulwani * Jeffrey S. Foster #
# University of Maryland, College Park * Microsoft Research, Redmond
On leave advisor: Mike Hicks
Verification: Proving Programs Correct
• Selection Sort:
for i = 0..n {
  for j = i..n { find min index }
  if (min != i) swap A[i] with A[min]
}
[Figure: animation of selection sort, with successive find-min passes over positions i..j leaving a growing sorted prefix.]
Verification: Proving Programs Correct
Prove the following for the selection sort code above:
• Sortedness
• Output array is non-decreasing: ∀k1,k2 : 0≤k1<k2<n => A[k1] ≤ A[k2]
• Permutation
• Output array contains all elements of the input: ∀k∃j : 0≤k<n => Aold[k]=A[j] ∧ 0≤j<n
• Output array contains only elements from the input: ∀k∃j : 0≤k<n => A[k]=Aold[j] ∧ 0≤j<n
• Number of copies of each element is the same: ∀n : n∈A => count(A,n) = count(Aold,n)
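To make these obligations concrete, here is a small executable sketch in plain Python (written for this transcript, not taken from the talk) of selection sort with runtime checks of the sortedness and permutation properties on one sample input.

```python
from collections import Counter

def selection_sort(a):
    """Selection sort as in the slides: for each i, find the index of the
    minimum of a[i..n-1] and swap it into position i."""
    a = list(a)
    n = len(a)
    for i in range(n):
        m = i
        for j in range(i, n):
            if a[j] < a[m]:
                m = j
        if m != i:
            a[i], a[m] = a[m], a[i]
    return a

a_old = [5, 3, 8, 1, 3]
a = selection_sort(a_old)

# Sortedness: forall k1 < k2 . A[k1] <= A[k2]
n = len(a)
assert all(a[k1] <= a[k2] for k1 in range(n) for k2 in range(k1 + 1, n))

# Permutation: only input elements, all input elements, same multiplicities
assert all(x in a_old for x in a) and all(x in a for x in a_old)
assert Counter(a) == Counter(a_old)
print(a)  # [1, 3, 3, 5, 8]
```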
Verification: Proving Programs Correct
[Figure: selection sort as a control-flow graph: i:=0; outer loop test i<n; j:=i; inner loop test j<n; loop body "find min index"; after the inner loop, if (min != i) swap A[i], A[min]; on outer-loop exit, the output array.]
Verification: Proving Programs Correct
We know from work in the ’70s (Hoare, Floyd, Dijkstra) that it is easy to reason logically about simple paths. The difficult task is program state (invariant) inference: Iinner, Iouter, the input state, and the output state.
[Figure: the same control-flow graph annotated with unknown program states: the input state (true), Iouter at the outer loop header, Iinner at the inner loop header, and the output state (sortedness ∧ permutation).]
Verification: Proving Programs Correct
[Figure: the same control-flow graph with the states filled in.]
• Input state: true
• Iouter: ∀k1,k2 : 0≤k1<k2<n ∧ k1<i => A[k1] ≤ A[k2]
• Iinner: ∀k1,k2 : 0≤k1<k2<n ∧ k1<i => A[k1] ≤ A[k2]; i<j ∧ i≤min<n; ∀k : 0≤k ∧ i≤k<j => A[min] ≤ A[k]
• Output state: ∀k1,k2 : 0≤k1<k2<n => A[k1] ≤ A[k2]
Inferring program state (invariants)
[Figure: plot of the program properties we can verify/prove (y-axis, up to full functional correctness) against the user input required (x-axis, from no input to fully specified invariants). Regions are first labeled Good, Tolerable, and Bad, then relabeled Mind-blowing, Live with current technology, and Undesirable.]
Inferring program state (invariants)
[Figure: the same plot with the y-axis graduated into quantifier-free facts, quantified facts (∀), alternating quantification (∀∃), and arbitrary quantification with arbitrary abstraction. Inferring full functional correctness from no input is undecidable.]
Inferring program state (invariants)
[Figure: the same plot annotated with existing approaches, split into Inference and Validation.]
Validation (fully specified invariants as input):
• Arbitrary quantification and arbitrary abstraction: validation using interactive theorem proving (Jahob, SLAyer, etc.)
• Quantified facts (∀): validating program paths using non-interactive theorem provers/SMT solvers (HAVOC, Spec#, etc.)
Inference (little or no input):
• Quantifier-free facts: constraint-based and abstract-interpretation techniques over quantifier-free domains (Cousot, Gulwani, Beyer, etc.)
• Quantified facts (∀): quantifier-free domains lifted to ∀, e.g. shape analysis (Gulwani, etc.)
• Alternating quantification (∀∃): best-effort inference (Kovacs, etc.)
Inferring program state (invariants)
[Figure: the same plot with a hypothetical "magical technique" that would prove full functional correctness with no user input, marked ✗.]
Inferring program state (invariants)
[Figure: the same plot with the region targeted here: template-based invariant inference, reaching properties up to full functional correctness with only a small amount of user input.]
Inferring program state (invariants)
Template-based invariant inference asks for small, intuitive input:
• Invariant templates. Example: to prove sortedness, i.e. each element is smaller than the ones after it, use a ∀ template; to prove permutation, i.e. for each element there exists a matching one, use a ∀∃ template.
• Predicates. Example: take all variables and array elements in the program and enumerate all pairs related by less than, greater than, equality, etc.
My task in this talk: convince you that these small hints go a long way in what we can achieve with verification.
Key facilitators: (1) Templates
• Intuitive
• The form follows straightforwardly from the property being proved
• Simple
• Only the quantification and disjunctive structure is specified
• No complicated error reporting or iterative refining required
• Very different from the full invariants used by some techniques
• Helps: we don't have to define insane joins etc.
• Previous approaches: domain D for the holes, construct D∀
• This approach: work with only D
• Catch: the holes have a different status, namely unknown holes
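One way to picture a template and a candidate value for its holes as plain data (the names below are illustrative only, not the tool's representation): the quantified structure is fixed up front, and each hole ranges over conjunctions of quantifier-free predicates from D, never over quantified facts.

```python
from dataclasses import dataclass, field

@dataclass
class Template:
    """A quantified template with unknown holes, e.g. 'forall k1,k2 : ??1 => ??2'."""
    quantifier: str          # e.g. "forall k1,k2" or "forall k . exists j"
    holes: tuple             # names of the unknown holes, left to right

@dataclass
class Candidate:
    """A candidate value: each hole filled with a conjunction (set) of predicates."""
    fill: dict = field(default_factory=dict)   # hole name -> set of predicate strings

sortedness = Template("forall k1,k2", ("??1", "??2"))
cand = Candidate({"??1": {"0<=k1", "k1<k2", "k2<n", "k1<i"},
                  "??2": {"A[k1]<=A[k2]"}})
```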
Key facilitators: (2) Predicates
• Predicate Abstraction
• Simple
• Small-fact hypothesis: nested deep under the quantification, the facts for the holes are simple
• E.g., in ∀k1,k2 : 0≤k1<k2<n => A[k1] ≤ A[k2], the predicates 0≤k1, k1<k2, k2<n, and A[k1] ≤ A[k2] are individually very simple
• Intuitive
• Only program variables, array elements, relational operators, and limited constants
• Example: enumerate predicates built from program variables and array elements, relational operators (<, ≤, >, ≥, …), and arithmetic operators (+, -, …)
• E.g., {i<j, i>j, i≤j, i≥j, i<j-1, i>j+1, …}
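A minimal sketch of that enumeration (the term list, operators, and constant offsets below are illustrative choices, not the tool's exact predicate set):

```python
from itertools import product

def enumerate_predicates(terms, rel_ops=("<", "<=", ">", ">=", "="),
                         offsets=(0, 1, -1)):
    """Enumerate simple relational predicates 't1 op t2 (+/- c)' over program
    variables and array elements, with a few limited constants."""
    preds = set()
    for t1, t2 in product(terms, repeat=2):
        if t1 == t2:
            continue
        for op in rel_ops:
            for c in offsets:
                rhs = t2 if c == 0 else f"{t2}{c:+d}"
                preds.add(f"{t1}{op}{rhs}")
    return preds

# Over the selection-sort variables and array elements:
preds = enumerate_predicates(["i", "j", "n", "min", "A[i]", "A[j]", "A[min]"])
print(len(preds), sorted(preds)[:6])
```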
VS3 • VS3: Verification and Synthesis using SMT Solvers • User input • Templates • Predicates • Set of techniques and tool implementing them • Infers • Loop invariants • Weakest preconditions • Strongest postconditions
What VS3 will let you do!
A. Infer invariants with arbitrary quantification and boolean structure
• ∀: e.g. sortedness
• ∀∃: e.g. permutation
[Figure: the selection-sort control-flow graph next to array diagrams illustrating the ∀k1,k2 sortedness invariant over Anew and the ∀k∃j permutation invariant relating Anew to Aold.]
What VS3 will let you do!
B. Infer weakest preconditions: the weakest conditions on the input
[Figure: the control-flow graph with the swap branch highlighted; the worst-case behavior is an input that makes the swap fire on every outer-loop iteration.]
Improves the state of the art
• Can infer very expressive invariants
• Quantification, e.g. ∀k∃j : (k<n) => (A[k]=B[j] ∧ j<n)
• Disjunction, e.g. ∀k : (k<0 ∨ k>n ∨ A[k]=0)
• Can infer weakest preconditions
• Good for debugging: can discover worst-case inputs
• Good for analysis: can infer missing precondition facts
Verification part of the talk
• Three fixed-point inference algorithms
• Iterative fixed-point
• Least Fixed-Point (LFP)
• Greatest Fixed-Point (GFP)
• Constraint-based (CFP)
• Weakest Precondition Generation
• Experimental evaluation
Analysis Setup
E.g. Selection Sort: loop headers (with invariants I1 for the outer loop and I2 for the inner loop) split the program into simple paths, and the simple paths induce program constraints (verification conditions):
• vc(pre, I1): true ∧ i=0 => I1
• vc(I1, I2): I1 ∧ i<n ∧ j=i => I2
• vc(I2, I2) and vc(I2, I1): around and out of the inner loop
• vc(I1, post): I1 ∧ i≥n => sorted output array
[Figure: the control-flow graph labeled with I1, I2 and the verification conditions along its simple paths.]
Analysis Setup
The verification conditions are simple FOL formulae over I1, I2; we will exploit this. For example, vc(I1, post) is
I1 ∧ i≥n => ∀k1,k2 : 0≤k1<k2<n => A[k1] ≤ A[k2]
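As a sanity check that these path formulae really are easy for a solver, here is vc(I1, post) discharged through the Z3 Python bindings (a standalone illustration; the tool drives Z3 through its own pipeline). Validity is checked by asserting the negation and expecting unsat.

```python
from z3 import Array, Int, IntSort, ForAll, Implies, And, Not, Solver

A = Array('A', IntSort(), IntSort())
i, n, k1, k2 = Int('i'), Int('n'), Int('k1'), Int('k2')

# Candidate outer-loop invariant I1: the first i positions are already in order.
I1 = ForAll([k1, k2],
            Implies(And(0 <= k1, k1 < k2, k2 < n, k1 < i), A[k1] <= A[k2]))

# Postcondition: the whole array is sorted.
post = ForAll([k1, k2],
              Implies(And(0 <= k1, k1 < k2, k2 < n), A[k1] <= A[k2]))

# vc(I1, post):  I1 /\ i >= n  =>  post.  Valid iff its negation is unsatisfiable.
s = Solver()
s.add(I1, i >= n, Not(post))
print(s.check())  # expected: unsat
```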
Iterative Fixed-point: Overview
A candidate solution is a tuple of values for the invariants, e.g. ⟨x, y⟩ for ⟨I1, I2⟩, paired with the set of VCs it does not yet satisfy, e.g. {vc(pre,I1), vc(I1,I2)}.
[Figure: the control-flow graph with each VC marked ✓ (satisfied by the candidate) or ✗ (not satisfied).]
Iterative Fixed-point: Overview
Maintain a set of candidates ⟨x, y⟩ with their unsatisfied VCs. To improve a candidate: pick an unsatisfied VC and use the theorem prover to compute an optimal change; this results in new candidates. Stop when a candidate satisfies all VCs.
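A schematic of this search loop, with hypothetical helpers standing in for the SMT-backed machinery: `vc.holds(cand)` would ask the theorem prover whether a VC is satisfied by a candidate, and `optimal_changes(cand, vc)` would return the locally optimal repairs of one failing VC (the procedure described next).

```python
def iterative_fixed_point(vcs, initial_candidate, optimal_changes):
    """Sketch of the iterative search: keep a set of candidate invariant tuples,
    repeatedly repair one failing VC of some candidate, and stop as soon as a
    candidate satisfies every VC."""
    worklist = [initial_candidate]        # e.g. all-bottom (LFP) or all-top (GFP)
    while worklist:
        cand = worklist.pop()
        failing = [vc for vc in vcs if not vc.holds(cand)]
        if not failing:
            return cand                   # candidate satisfies all VCs
        # Each optimal change to the picked VC yields a new candidate to explore.
        worklist.extend(optimal_changes(cand, failing[0]))
    return None                           # no invariant found over this template/predicate set
```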
Forward Iterative (LFP)
Forward iteration:
• Start with ⟨⊥, …, ⊥⟩
• Pick the destination invariant of an unsatisfied constraint and improve it
[Figure: animation showing the destination invariant of a failing VC improved step by step (⊥ to a, then ⊥ to b) until all VCs are marked ✓.]
How do we compute these improvements? The "optimal change" procedure.
Positive and Negative unknowns
• Given formula Φ, e.g. Φ = ∀x : ∃y : (¬u1 ∨ u2) ∧ u3
• Unknowns in Φ have polarities: positive (here u2, u3) or negative (here u1)
• Value of a positive unknown stronger => Φ stronger
• Value of a negative unknown stronger => Φ weaker
• Optimal solution to Φ: maximally strong positives, maximally weak negatives
Optimal Change
• One approach (the old way): lift a quantifier-free domain D (quantifier-free facts) to D∀. To push ∀k1,k2 : 0≤k1<k2<n ∧ k1<i => A[k1] ≤ A[k2] across a statement x1:=e1, the two holes of ∀k1,k2 : ?? => ?? must be approximated in opposite directions (one under-approximated, the other over-approximated). Very difficult transfer, join, and widen functions are required for the definition of the domain D∀.
Optimal Change
• Our approach (the new way): keep the quantified facts as they are, with no lifted domain. For a simple path x1:=e1; x2:=e2; x3:=e3, where each ei does not refer to xi, the assignments enter the verification condition simply as the equalities x1=e1 ∧ x2=e2 ∧ x3=e3, relating known facts such as ∀k1,k2 : 0≤k1<k2<n ∧ k1<i => A[k1] ≤ A[k2] to templates with unknown holes:
(∀k1,k2 : ??1 => ??2) ∧ x1=e1 ∧ x2=e2 ∧ x3=e3 => (∀k1,k2 : ??3 => ??4)
Find optimal solutions to the unknowns ??1, ??2, ??3, ??4.
• A trivial search involves k × 2^|Predicates| queries to the SMT solver.
• Key idea: unknowns have polarities and therefore obey a monotonicity property, so we can instantiate some unknowns and query the SMT solver a polynomial number of times.
• Algorithmic novelty: this polynomial-sized information can be aggregated efficiently to compute the optimal solutions.
[Chart: SMT solver query time, comparing 413 queries against 30 queries.]
The optimal-change procedure is efficient in practice.
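To see how polarity keeps the number of solver queries small, here is a deliberately naive sketch for a single negative unknown held as a conjunction of predicates. By the monotonicity above, starting from the strongest value (all predicates) and greedily dropping conjuncts while the VC stays valid reaches a locally maximally weak value with one solver query per predicate. The actual procedure is more sophisticated (it instantiates unknowns and aggregates the per-predicate information to recover the optimal solutions); `vc_is_valid` below is just a stand-in for an SMT validity query.

```python
def locally_weakest(predicates, vc_is_valid):
    """Greedy sketch for ONE negative unknown kept as a set of conjuncts.
    vc_is_valid(conjuncts) asks (via the SMT solver) whether the VC holds when
    the unknown is the conjunction of `conjuncts`."""
    chosen = set(predicates)
    if not vc_is_valid(chosen):
        return None                 # even the strongest value fails: no solution here
    for p in list(predicates):
        trial = chosen - {p}
        if vc_is_valid(trial):      # dropping p keeps the VC valid, so keep it dropped
            chosen = trial
    return chosen                   # locally maximally weak over this predicate set
```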
Backward Iterative Greatest Fixed-Point (GFP) Backward: Same as LFP except • Pick the source invariant of unsat constraint to improve • Start with <⊤, …,⊤>
Constraint-based over Predicate Abstraction
[Figure: the control-flow graph again, with its verification conditions vc(pre,I1), vc(I1,I2), vc(I2,I2), vc(I2,I1), vc(I1,post) and the unknown invariants I1, I2.]
Constraint-based over Predicate Abstraction
Each invariant, e.g. I1, is a template with unknown holes (unknown 1, unknown 2). For each unknown, every predicate p1, p2, …, pr gets a boolean indicator: b11, b21, …, br1 for unknown 1 and b12, b22, …, br2 for unknown 2, where bji records whether predicate pj is included in unknown i. If we find assignments bji = T/F, then we are done.
Remember: the VCs are FOL formulae over I1, I2.
Constraint-based over Predicate Abstraction
Write pred(A) for the predicate indicator variables of the unknowns in A. Each verification condition is translated into a SAT formula over these indicators (program constraint to boolean constraint):
• vc(pre,I1) → boolc(pred(I1))
• vc(I1,post) → boolc(pred(I1))
• vc(I1,I2) → boolc(pred(I1), pred(I2))
• vc(I2,I1) → boolc(pred(I2), pred(I1))
• vc(I2,I2) → boolc(pred(I2))
The conjunction of these boolean constraints goes to a SAT solver; a satisfying solution yields the invariant solution. Translating each VC is local reasoning; the SAT solver performs the fixed-point computation.
Remember: the VCs are FOL formulae over I1, I2.
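A deliberately naive realization of boolc for one VC, to show the shape of the translation (the real boolc avoids this exponential enumeration by reusing the optimal-solutions machinery): try each assignment of the VC's indicator variables, keep those under which the SMT solver reports the VC valid, and return their disjunction as a constraint over z3 booleans. The callback name `vc_holds_under` is hypothetical.

```python
from itertools import product
from z3 import And, Or, Not, BoolVal

def boolc_naive(indicators, vc_holds_under):
    """Translate one program constraint (VC) into a boolean constraint over its
    predicate-indicator variables by brute-force enumeration."""
    clauses = []
    for bits in product([False, True], repeat=len(indicators)):
        assignment = dict(zip(indicators, bits))
        if vc_holds_under(assignment):            # one SMT validity query
            clauses.append(And([b if v else Not(b)
                                for b, v in assignment.items()]))
    return Or(clauses) if clauses else BoolVal(False)

# Fixed point as SAT solving: conjoin boolc(...) for every VC, hand the
# conjunction to a SAT/SMT solver, and read the invariants off the model.
```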
Verification part of the talk
• Three fixed-point inference algorithms
• Iterative fixed-point
• Least Fixed-Point (LFP)
• Greatest Fixed-Point (GFP)
• Constraint-based (CFP)
• Weakest Precondition Generation
• Experimental evaluation
Computing maximally weak preconditions
• Iterative (GFP):
• The backwards algorithm computes maximally weak invariants in each step
• Precondition generation: output the disjunction of the entry fixed-points over all candidates
• Constraint-based:
• Generate one solution for the precondition
• Assert a constraint that ensures a strictly weaker precondition
• Iterate till unsat; the last precondition generated is maximally weak
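A sketch of that constraint-based loop with hypothetical helpers: `solve(extra)` would run the constraint-based engine and return some precondition consistent with the VCs and the extra constraints (or None when unsatisfiable), and `strictly_weaker(pre)` would build the constraint forcing the next solution to be strictly weaker than `pre`.

```python
def maximally_weak_precondition(solve, strictly_weaker):
    """Generate a precondition, then repeatedly demand a strictly weaker one;
    when the constraints become unsatisfiable, the last precondition generated
    is maximally weak."""
    extra = []
    pre = solve(extra)
    if pre is None:
        return None                        # nothing expressible over these predicates
    while pre is not None:
        weakest_so_far = pre
        extra.append(strictly_weaker(pre)) # insist on a strictly weaker precondition
        pre = solve(extra)
    return weakest_so_far
```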
Verification part of the talk
• Three fixed-point inference algorithms
• Iterative fixed-point
• Least Fixed-Point (LFP)
• Greatest Fixed-Point (GFP)
• Constraint-based (CFP)
• Weakest Precondition Generation
• Experimental evaluation
• Why experiment?
• The problem space is undecidable: even if the worst case is out of reach, is the average, 'difficult' case efficiently solvable?
• We use SAT/SMT solvers, hoping that our instances will not incur their worst-case exponential solving time
Implementation
• A C program goes through Microsoft's Phoenix Compiler, which produces the CFG and the verification conditions.
• Inputs: templates and a predicate set.
• Two engines over the Z3 SMT solver: the iterative fixed-point engine (GFP/LFP), which maintains candidate solutions, and the constraint-based fixed-point engine, which produces a boolean constraint.
• Outputs: invariants and preconditions.
Verifying Sorting Algorithms
• Benchmarks
• Considered difficult to verify; require invariants with quantifiers
• Sorting: ideal benchmark programs
• 5 major sorting algorithms: Insertion Sort, Bubble Sort (n² version and a termination-checking version), Selection Sort, Quick Sort, and Merge Sort
• Properties:
• Sort elements: ∀k : 0≤k<n-1 => A[k] ≤ A[k+1] (output array is non-decreasing)
• Maintain permutation, when elements are distinct:
• ∀k∃j : 0≤k<n => A[k]=Aold[j] ∧ 0≤j<n (no elements gained)
• ∀k∃j : 0≤k<n => Aold[k]=A[j] ∧ 0≤j<n (no elements lost)
Runtimes: Sortedness
[Chart: running times in seconds.]
The tool can prove sortedness for all sorting algorithms!
Runtimes: Permutation
[Chart: running times in seconds; values include 94.42 s, with two entries marked ∞.]
…and permutations too!