Haskell to logic through denotational semantics
Dimitrios Vytiniotis, Koen Claessen, Simon Peyton Jones, Dan Rosén
POPL 2013, January 2013
Real programs contain assertions
    dimitris@artemis:~/GHC/ghc-head/ghc/compiler/typecheck$ grep -i ASSERT ./*hs
    ./FamInst.lhs: = ASSERT( isAlgTyCon tycon )
    ./Inst.lhs: ; wrap <- ASSERT( null rest && isSingleton theta )
    ./TcCanonical.lhs: = ASSERT( tyConArity tc <= length tys ) -- Type functions are saturated
    ./TcCanonical.lhs: = ASSERT( not (isKind t1) && not (isKind t2) )
    ./TcClassDcl.lhs: = ASSERT( ok_first_pred ) local_meth_ty
    ./TcClassDcl.lhs: rho_ty = ASSERT( length sel_tyvars == length inst_tys )
    ./TcDeriv.lhs: ASSERT( null sigs )
    ./TcDeriv.lhs: = ASSERT2( equalLength rep_tc_tvs all_rep_tc_args, ppr cls <+> ppr rep_tc )
    ./TcDeriv.lhs: arg_ty <- ASSERT( isVanillaDataCon data_con )
    ./TcEnv.lhs: -> ASSERT( lvl == lvl1 ) id
    ./TcEnv.lhs: TopLevel -> ASSERT2( isEmptyVarSet id_tvs, ppr id $$ ppr (idType id) )
    ./TcErrors.lhs: = ASSERT( isEmptyBag insols )
    ./TcErrors.lhs: = ASSERT( not (null matches) )
    ./TcErrors.lhs: = ASSERT( length matches > 1 )
    ./TcEvidence.lhs: | otherwise = ASSERT( arity < n_tys )
    ./TcEvidence.lhs:mkTcForAllCo tv (TcRefl ty) = ASSERT( isTyVar tv ) TcRefl (mkForAllTy tv ty)
    ./TcEvidence.lhs:mkTcForAllCo tv co = ASSERT( isTyVar tv ) TcForAllCo tv co
    ./TcEvidence.lhs:mkTcForAllCos tvs (TcRefl ty) = ASSERT( all isTyVar tvs ) TcRefl (mkForAllTys tvs ty)
    ./TcEvidence.lhs:mkTcForAllCos tvs co = ASSERT( all isTyVar tvs ) foldr TcForAllCo co tvs
    ./TcEvidence.lhs: = ASSERT (tc `hasKey` eqTyConKey)
    ./TcEvidence.lhs: = ASSERT( equalLength tvs cos )
    ./TcExpr.lhs: = ASSERT( not (isSigmaTy res_ty) )
    ./TcExpr.lhs: = ASSERT( notNull upd_fld_names)
(from the GHC type checker)
Automated static verification of higher-order functional programs
This work: www.github.com/danr/contracts
The tool works on a subset of Haskell and uses GHC as its frontend.
Our setting
Verify Haskell code: higher-order, lazy but pure.
Don’t aim for high expressiveness; go for simple, easy-to-prove (e.g. structural) properties:
• Automatically discharge all tedious but simple goals that a programmer has to manually and repeatedly check
Re-use existing technology:
• Automated theorem provers (e.g. SMT solvers), model finders
• ACL2? Boogie?
• Prolog? Property-directed reachability? [Bjorner et al]
No “best” solution yet; automated theorem provers and model finders are our choice for this work.
Programs and properties
    risers [] = []
    risers [x] = [[x]]
    risers (x:y:ys) = case risers (y:ys) of
        [] -> error "urk"
        (s:ss) -> if x <= y then (x:s):ss else [x]:s:ss
Can risers crash? Does non-empty input ⟶ non-empty result?
    risers ∈ CF && {xs | not (null xs)} -> CF && {ys | not (null ys)}
Syntax of “contracts” (“refinements” more appropriate):
    C ::= {x | p} | (x:C1)->C2 | C1 && C2 | CF
Here p is just an ordinary Haskell expression of type Bool, and CF means “crash-free”.
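The refinement part of the risers contract can also be read as an executable property, which is handy for a quick sanity check before involving a prover. The following is a minimal sketch under that reading: `nonEmptyContract` is a hypothetical helper, not part of the tool, and it only tests one concrete finite input; it says nothing about crash-freedom (CF) in general.

```haskell
-- risers from the slide: split a list into maximal non-decreasing runs.
risers :: Ord a => [a] -> [[a]]
risers []  = []
risers [x] = [[x]]
risers (x:y:ys) =
  case risers (y:ys) of
    []     -> error "urk"   -- unreachable if the contract below holds
    (s:ss) -> if x <= y then (x:s):ss else [x]:s:ss

-- Hypothetical helper (an assumption, not the tool's API): a run-time
-- reading of  {xs | not (null xs)} -> {ys | not (null ys)}
-- for a single input. It does not capture the CF part of the contract.
nonEmptyContract :: Ord a => [a] -> Bool
nonEmptyContract xs = null xs || not (null (risers xs))
```

For example, `risers [1,2,0,3]` evaluates to `[[1,2],[0,3]]`, and `nonEmptyContract` holds for it.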
Design
Haskell source:
    module Foo
    f x y = …
    g x = …
    g ∈ C
    -- Prelude
    data [a] = [] | a : as
    data Bool = True | False
    … Functions over these…
Pipeline: Haskell source → HALO translation to first-order logic → first-order logic formulae → theorem prover (Z3 / Equinox / E / Vampire / Paradox)
Possible outcomes:
• Unsatisfiable: contract holds!
• Satisfiable: probably contract doesn’t hold, but who knows
• <loop>: can’t tell anything
Key idea: let denotational semantics guide us
A λ/case-lifted language, interpreted in a domain obtained by the standard construction from:
• Lifting
• Continuous function space
• One product of cpos for each constructor (sized by its arity)
• Distinguished one-element cpo
Logical language, and a translation 𝓔 of expressions to logical terms: …
… and use the domain itself as FOL structure:
• Application is interpreted as the ‘apply’ combinator in the domain:
    apply(…, _) = …
    apply(…, _) = …
    apply(fun(d), d’) = d(d’)
    apply(_, _) = …
• Constructors are interpreted as injection into the appropriate product
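To make the shape of the domain and of the ‘apply’ combinator concrete, here is a small sketch in Haskell. Only the `fun(d)` case of apply is legible on the slide; the other three right-hand sides, the names `Bottom` and `Bad`, the `Expr` type and the `eval` and `describe` functions are all assumptions made for illustration.

```haskell
-- A sketch of the semantic domain as a Haskell type.
data D
  = Bottom            -- explicit stand-in for the least element (assumption)
  | Bad               -- distinguished "crash" element (name assumed)
  | Fun (D -> D)      -- function space
  | Con String [D]    -- a constructor applied to its arguments

-- The 'apply' combinator. Only the Fun case is taken from the slide;
-- the remaining cases are assumptions.
apply :: D -> D -> D
apply Bottom  _ = Bottom
apply Bad     _ = Bad
apply (Fun f) d = f d
apply _       _ = Bottom

-- A tiny lambda/case-lifted expression language (hypothetical).
data Expr
  = Var String        -- variable or top-level function name
  | App Expr Expr     -- application
  | K String [Expr]   -- saturated constructor application
  | Crash             -- 'error'

-- Translation of expressions into the domain.
eval :: [(String, D)] -> Expr -> D
eval env (Var x)   = maybe Bottom id (lookup x env)
eval env (App f a) = apply (eval env f) (eval env a)
eval env (K c es)  = Con c (map (eval env) es)
eval _   Crash     = Bad

-- Helper for inspecting results, since D has no Eq instance.
describe :: D -> String
describe Bottom     = "bottom"
describe Bad        = "bad"
describe (Fun _)    = "<fun>"
describe (Con c ds) = unwords (c : map describe ds)
```

Every application in the source goes through `apply`, which is what lets a first-order logic talk about higher-order programs.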
Function definitions become FOL axioms
    head (Cons x xs) = x
    head _ = error
Theory: …
NB: A Good Thing!
Theorem: …
Axiomatize (some) true facts about data constructors
    data List a = Cons a (List a) | Nil
Theory: …
Theorem: …
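The slide's own axioms are not preserved. As an illustration only, these are the kind of facts about `List` that hold in a lifted sum-of-products domain with lazy constructors. The selector names sel_1 and sel_2 and the constants bad (crash) and unr (the least element) are assumptions of this sketch.

```latex
\begin{align*}
&\forall x\, xs.\ \mathit{sel}_1(\mathit{Cons}(x, xs)) = x \;\land\; \mathit{sel}_2(\mathit{Cons}(x, xs)) = xs
  && \text{(constructors are injective)} \\
&\forall x\, xs.\ \mathit{Cons}(x, xs) \neq \mathit{Nil}
  && \text{(distinct constructors are disjoint)} \\
&\forall x\, xs.\ \mathit{Cons}(x, xs) \neq \mathit{bad} \;\land\; \mathit{Cons}(x, xs) \neq \mathit{unr}
  && \text{(constructor values neither crash nor diverge)} \\
&\mathit{Nil} \neq \mathit{bad} \;\land\; \mathit{Nil} \neq \mathit{unr} \;\land\; \mathit{bad} \neq \mathit{unr}
\end{align*}
```

Note that the third line holds even when x or xs is itself undefined, because constructors are lazy.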
Higher-order functions
    head (Cons x xs) = x
    head _ = error
    double f x = f (f x)
Application of an unknown function is interpreted as the apply(.,.) combinator in the domain.
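A sketch of how `head` and `double` might look as values of a first-order universe, with every call to the unknown function `f` routed through `apply`. The type `D`, the names `Bottom` and `Bad`, the non-function cases of `apply`, and the helper `tag` are assumptions for illustration, not the tool's actual encoding.

```haskell
-- First-order universe of values (names are assumptions).
data D = Bottom | Bad | Fun (D -> D) | Con String [D]

apply :: D -> D -> D
apply Bottom  _ = Bottom
apply Bad     _ = Bad
apply (Fun f) d = f d
apply _       _ = Bottom

-- head (Cons x xs) = x ; head _ = error
headD :: D
headD = Fun go
  where
    go (Con "Cons" [x, _]) = x
    go Bottom              = Bottom  -- forcing an undefined argument
    go _                   = Bad     -- the 'error' branch

-- double f x = f (f x): f is an ordinary value, so both calls use 'apply'.
doubleD :: D
doubleD = Fun (\f -> Fun (\x -> apply f (apply f x)))

-- Helper for inspecting the outermost shape of a result.
tag :: D -> String
tag Bottom    = "bottom"
tag Bad       = "bad"
tag (Fun _)   = "fun"
tag (Con c _) = c
```

Applying `doubleD` to `headD` and a list whose first element is itself a non-empty list returns the inner head; on `Nil` the first `head` crashes and the crash propagates.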
Refinements denotationally and logically
[Slide formulas not preserved: the defining equations (≙) of contracts, given once denotationally and once logically.]
Soundness via denotational semantics
Proof outline (formulas not preserved): assume that …; then …; by previous theorems …; hence …; which is equivalent to …
Automating induction
Currently support fixpoint induction.
    add Z y = y
    add (S x) y = S (add x y)
    add ∈ CF -> CF -> CF
Assume the contract holds for an uninterpreted function add_rec:
    add Z y = y
    add (S x) y = S (add_rec x y)

    add_rec ∈ CF -> CF -> CF
    ---------------------------
    add ∈ CF -> CF -> CF
NB: A sound thing to do by admissibility of contracts.
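The add_rec trick corresponds to writing `add` with its recursive call abstracted out. The sketch below shows that reading: `addStep` and `approx` are hypothetical names, and `approx n` is the n-th element of the chain whose limit is `add`. Fixpoint induction proves the contract for `addStep rec` assuming it for `rec`; admissibility is what carries the property from every `approx n` to the limit.

```haskell
data Nat = Z | S Nat deriving (Eq, Show)

-- 'add' with the recursive call replaced by a parameter, mirroring
-- the uninterpreted add_rec on the slide.
addStep :: (Nat -> Nat -> Nat) -> Nat -> Nat -> Nat
addStep _   Z     y = y
addStep rec (S x) y = S (rec x y)

-- Tying the knot recovers the ordinary definition.
add :: Nat -> Nat -> Nat
add = addStep add

-- The n-th approximation of the fixpoint: it behaves like 'add' for
-- n unfoldings and is undefined afterwards.
approx :: Int -> Nat -> Nat -> Nat
approx n
  | n <= 0    = \_ _ -> undefined
  | otherwise = addStep (approx (n - 1))
```

For instance, `approx 3` already agrees with `add` on the arguments `S (S Z)` and `S Z`, since that call needs only three unfoldings.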
Admissibility and why it matters
In Haskell, data types are not inductive. Hence your familiar induction principle is simply unsound!
    ones = 1 : ones
    f Z = []
    f (S x) = 1 : f x
Lemma: forall x. f x ≠ ones   (logical inequality, not admissible!)
Proof:
• Holds for UNR
• Holds for Z
• Assume holds for x; then holds for (S x)
Right? WRONG! Let u = S u. Then f u = ones.
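The counterexample can be observed directly in Haskell. Two infinite lists cannot be compared for equality in finite time, so this sketch uses a hypothetical helper `agreeUpTo` that compares finite prefixes; the element type `Integer` is also an assumption.

```haskell
data Nat = Z | S Nat

ones :: [Integer]
ones = 1 : ones

f :: Nat -> [Integer]
f Z     = []
f (S x) = 1 : f x

-- The "infinite" natural number: legal because Haskell data is lazy.
u :: Nat
u = S u

-- Hypothetical helper: do two lists agree on their first n elements?
agreeUpTo :: Int -> [Integer] -> [Integer] -> Bool
agreeUpTo n xs ys = take n xs == take n ys
```

`agreeUpTo n (f u) ones` is `True` for every n, whereas for any finite argument such as `S (S Z)` the prefixes eventually differ.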
Admissibility and induction
Admissibility = if P is true for all elements of a chain, then it is true for the limit.
Not all predicates are admissible.
Theorem: all predicates of the form e ∈ C (contracts) are admissible. Comes for free! Base contracts are Haskell functions, and those are continuous!
Happily implemented on top of the GHC API
• Z3 rocks for provable properties!
• Disclaimer:
  • 40-80 FOL axioms/problem
  • Use of fixpoint induction
More features (and non-features)
More features:
• Incremental verification
  • Prove spec for “g”, use either the spec or definition of “g”, or both to prove other specifications …
• Some support for lemmas
• Mutual (automatic) fixpoint induction
• Primitive arithmetic constraints via SMT2 (in Z3)
• Experimental features: logical equality, finite unfoldings
Not there:
• Pre/post inference, strengthening of IH
• Support for counterexamples (see next slide)
What’s next: counterexamples
Unprovable contracts (because they’re false or we’re incomplete):
    Paradox  Equinox  Z3    Vampire  E-prover  Problem
    ----     ----     ----  ----     ----      AnyMorphism.big_sat_app_any_morphism_fail_step
    0.00     0.01     0.01  ----     0.01      Loop.sat_id_loop_pred
    ----     ----     ----  ----     0.01      Loop.sat_id_recursive_true
    ----     ----     ----  ----     ----      PredLog.sat_concatMap_cf_missing_step
    ----     ----     ----  ----     ----      PredLog.sat_concatMap_retains_missing_step
    ----     ----     ----  ----     ----      PredLog.sat_flattenAnd_cf_missing_step
    ----     ----     ----  ----     ----      PredLog.sat_flattenAnd_retains_missing_step
    ...
    ----     ----     ----  ----     ----      Recursion.sat_qfac_cf_broken_step
    ----     ----     ----  ----     ----      Recursion.sat_rev_cf_broken_step
    ----     ----     ----  ----     ----      Risers.big_sat_risersBy_nonEmpty_broken2_step
    ----     ----     ----  ----     ----      Risers.big_sat_risersBy_nonEmpty_broken_step
    ----     ----     ----  ----     ----      Risers.sat_risers_broken2_step
    ----     ----     ----  ----     ----      Risers.sat_risers_broken3_step
    ----     ----     ----  ----     ----      Risers.sat_risers_broken_step
    ----     ----     ----  ----     ----      Risers.sat_risers_missing_le_step
    ----     ----     ----  ----     ----      Shrink.big_sat_shrink_lazy_step
Timeouts …
We now know why, and how to address this: stay tuned
What’s next: usability
Proving is reasonably fast, now explore:
• Automatic strengthening of induction hypotheses
• Pretty printing models as counterexamples
• More induction principles
• Testing in larger scale
• Interfacing with theorem provers for manual proofs?
Lots of man-hours needed, come help please!
Related work
• ESC/Haskell [Xu et al]
  • Contracts are programs
  • Symbolic execution/inlining
• Catch [Mitchell]
  • Pattern match errors
  • Via dataflow analysis
• Liquid Types [Jhala et al]
  • Predicate abstraction
  • Inference
  • Quantifiers driven by type system
• Zeno [Sonnex et al]
  • Automated equality proofs
  • Clever heuristics
  • Strict semantics
• Dafny & Boogie [Leino et al], ACL2
• Leon [Suter et al]
  • Specialized decision procedure for FP
  • Good for first-order
• F7/F* [Swamy et al]; Hoare logic for FP [Regis-Gianas & Pottier]
  • HO logics
  • CBV *really* helps
• HO model checking, MoChi [Kobayashi et al]
  • Specialized decision procedures
  • Lots of techniques stacked
  • Good for inference, good for counterexamples
• HOLCF-based verification [Huffman]
  • Reasoning in a very rich logic that contains a formalization of domain theory
  • More sophisticated axiomatization, ability to reason about parametricity and monad laws
• Symbolic execution-based [Tobin-Hochstadt and Van Horn][Xu]
  • Abstraction, lots of “smaller” queries to theorem prover
What we did and what I learnt
We’ve given a semantic basis for the verification of Haskell programs.
We demonstrated that it is implementable.
• We can verify FP in a simple and robust way:
  • For this particular case a simple solution seems to do the job.
  • It appears affordable to use a very precise abstraction of your program and trust your 2013 theorem proving technology
Thank you for your attention!