
Usable Formal Methods


Presentation Transcript


  1. Usable Formal Methods Thomas Ball Microsoft Research – Redmond Research in Software Engineering Thanks to Nikolaj Bjorner, Sumit Gulwani, Shuvendu Lahiri, and Ken McMillan for their slides

  2. UV10 (Usable Verification), November 15-16, 2010, Redmond, Washington • Formal verification methods have made rapid progress in recent years. The tools and ideas are increasingly being used in industry in a variety of ways. • Many of the challenging problems of building secure software systems, programming multicore processors, cloud-based systems, and cyber-physical systems will require formal support for modeling and analysis. • A key challenge is to broaden the audience for this technology by making the tools more usable and by applying them to a wider range of problems.

  3. Lesson #1: (Re)Usable Engines • Build on each other’s work • Don’t reinvent the wheel • Higher levels of abstraction, when possible

  4. Usable Automatic Answers [diagram: an Application sends a Logical Encoding to the Analysis Engine (Z3); Z3 returns answers such as Sat/Model, Unsat/Proof, Unsat. Core, Literal assignment, Equalities, Simplify, Max assignment, Interpolants, Quant Elim]

  5. Some Microsoft Tools using Z3 • SDV: The Static Driver Verifier • PREfix: The Static Analysis Engine for C/C++. • Pex: Program EXploration for .NET. • SAGE: Scalable Automated Guided Execution • Spec#: C# + contracts • VCC: Verifying C Compiler for the Viridian Hyper-Visor • HAVOC: Heap-Aware Verification of C-code. • SpecExplorer: Model-based testing of protocol specs. • Yogi: Dynamic symbolic execution + abstraction. • FORMULA: Model-based Design • F7: Refinement types for security protocols • Rex: Regular Expressions and formal languages • VS3: Abstract interpretation and Synthesis • VERVE: Verified operating system • FINE: Proof carrying certified code

  6. Boogie – a verification platform [Barnett, Jacobs, Leino, Moskal, Rümmer, et al.] [diagram: front ends — Spec#, C with HAVOC specifications, C with VCC specifications, Dafny, Chalice — translate to Boogie; back ends include Isabelle/HOL, Simplify, Z3, SMT Lib]

  7. RiSE tool chain

  8. Usable Automatic Answers [diagram: Regular Expressions are given a Logical Encoding and passed to Rex/Automata, which answers Unsat or Sat/?]

  9. Usable Automatic Answers [diagram: Regular Expressions are given a Logical Encoding and passed to Rex/Automata, which answers with a Proof or a Model/Labels]

  10. Usable Automatic Answers [diagram, repeated from slide 4: an Application sends a Logical Encoding to the Analysis Engine (Z3); Z3 returns answers such as Sat/Model, Unsat/Proof, Unsat. Core, Literal assignment, Equalities, Simplify, Max assignment, Interpolants, Quant Elim]

  11. Usable Automatic Answers Sat/unsat answers alone have limited use. Model/Proof answers help: • Models: debugging during verification • Proofs: the solver can be used as an untrusted oracle Much more is possible and needed: • Many existing applications wrap several calls to the solver, re-using partial information. • Many potential applications use objective functions.
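
To make the model/core distinction concrete, here is a minimal sketch using Z3's Python bindings (z3py); the variables and constraints are invented for illustration. A satisfiable query yields a model that can serve as a debugging counterexample; an over-constrained query, checked under named assumptions, yields an unsatisfiable core pointing at the clashing assumptions.

# Illustrative z3py sketch (constraints invented): answers beyond a bare sat/unsat verdict.
from z3 import Int, Bool, Implies, Solver, sat, unsat

x, y = Int('x'), Int('y')

# 1. Sat + model: the satisfying assignment doubles as a concrete counterexample.
s = Solver()
s.add(x + y == 10, x > 5)
if s.check() == sat:
    print(s.model())              # e.g. [x = 6, y = 4]

# 2. Unsat + core: check under named assumptions and ask which ones clash.
s = Solver()
p1, p2, p3 = Bool('p1'), Bool('p2'), Bool('p3')
s.add(Implies(p1, x > 0), Implies(p2, x < 0), Implies(p3, y == 3))
if s.check(p1, p2, p3) == unsat:
    print(s.unsat_core())         # e.g. [p1, p2] -- p3 is not part of the conflict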

  12. Efficiency and Expressivity Z3 uses DPLL(T) as its basic architecture: • Based on efficient DPLL SAT solvers • Extensible by theory solvers DPLL(T) alone is not enough: • DPLL(Γ) – add superposition • DPLL(T) can be exponentially worse than unrestricted resolution • DPLL(…) – solving diamonds • CDTR: Conflict Directed Theory Resolution. Claim: DPLL(T) + CDTR + Restart [simulates] Unrestricted T-Resolution
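
The slide does not spell out the diamond benchmark; a plausible rendering of the kind of "diamond" equality family on which naive DPLL(T) case-splits exponentially looks like this in z3py (an illustrative guess, not the exact family from the talk):

# A guess at the shape of the "diamond" equality family: each layer offers two
# equality paths from x[i] to x[i+1]; asserting x[0] != x[n] makes it unsat, but
# naive case splitting explores up to 2^n branch combinations.
from z3 import Int, Or, And, Solver

n = 12
x = [Int('x_%d' % i) for i in range(n + 1)]
y = [Int('y_%d' % i) for i in range(n)]
z = [Int('z_%d' % i) for i in range(n)]

s = Solver()
for i in range(n):
    s.add(Or(And(x[i] == y[i], y[i] == x[i + 1]),
             And(x[i] == z[i], z[i] == x[i + 1])))
s.add(x[0] != x[n])
print(s.check())   # unsat

Resolution-style reasoning can collapse the layers, while pure case splitting enumerates the exponentially many path combinations.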

  13. SMT@Microsoft F# Quotations to Z3 Create Quoted Expression

open Microsoft.Z3
open Microsoft.Z3.Quotations

do Solver.prove
    <@ Logic.declare (fun t11 t12 t21 t22 t31 t32 ->
         not ((t11 >= 0I) && (t12 >= t11 + 2I) && (t12 + 1I <= 8I) &&
              (t21 >= 0I) && (t22 >= t21 + 3I) && (t22 + 1I <= 8I) &&
              (t31 >= 0I) && (t32 >= t31 + 2I) && (t32 + 3I <= 8I) &&
              (t11 >= t21 + 3I || t21 >= t11 + 2I) &&
              (t11 >= t31 + 2I || t31 >= t11 + 2I) &&
              (t21 >= t31 + 2I || t31 >= t21 + 3I) &&
              (t12 >= t22 + 1I || t22 >= t12 + 1I) &&
              (t12 >= t32 + 3I || t32 >= t12 + 1I) &&
              (t22 >= t32 + 3I || t32 >= t22 + 1I))) @>

  14. SMT@Microsoft Delivering the goods • No installation • Support for SMT-LIB2 notation • Only usable for bare-bones logic encoding
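
For a sense of what a "bare-bones logic encoding" in SMT-LIB2 notation looks like, here is a small sketch (problem invented for illustration) that hands SMT-LIB2 text to Z3 through its Python bindings:

# Invented example: a bare-bones SMT-LIB2 encoding handed to Z3 via its Python bindings.
from z3 import Solver, parse_smt2_string, sat

smt2 = """
(declare-const x Int)
(declare-const y Int)
(assert (= (+ x y) 10))
(assert (> x 5))
"""

s = Solver()
s.add(parse_smt2_string(smt2))   # commands such as (check-sat) are issued through the API instead
if s.check() == sat:
    print(s.model())             # e.g. [x = 6, y = 4]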

  15. User Theories [diagram, as on slide 4: an Application sends a Logical Encoding to the Analysis Engine (Z3); Z3 returns answers such as Sat/Model, Unsat/Proof, Unsat. Core, Literal assignment, Equalities, Simplify, Max assignment, Interpolants, Quant Elim]

  16. User Theories [diagram: the Application's Logical Encoding goes to Z3 extended with a Custom Theory — Strings, Queues, Floating points, BAPA, Separation Logic, HOL, MSOL, Orders, Lattices, Local Theories — and Z3 answers Sat/Model or Unsat/Proof]
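
The custom theories listed above plug into the solver itself; purely as an encoding-level sketch of the idea (not the plugin interface the slide refers to), an application can approximate a small user theory, say a strict order, by adding its axioms as quantified formulas:

# Encoding-level sketch only: a tiny "custom theory" (a strict order) expressed as
# quantified axioms, rather than through a real theory-plugin interface.
from z3 import DeclareSort, Function, BoolSort, Const, ForAll, Implies, And, Not, Solver

Node = DeclareSort('Node')
lt = Function('lt', Node, Node, BoolSort())          # the custom relation

x, y, z = Const('x', Node), Const('y', Node), Const('z', Node)
axioms = [
    ForAll([x], Not(lt(x, x))),                                     # irreflexivity
    ForAll([x, y, z], Implies(And(lt(x, y), lt(y, z)), lt(x, z))),  # transitivity
]

a, b, c = Const('a', Node), Const('b', Node), Const('c', Node)
s = Solver()
s.add(*axioms)
s.add(lt(a, b), lt(b, c), lt(c, a))   # a cycle contradicts the axioms
print(s.check())                      # unsat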

  17. Lesson #2: Imagine Doing Without • Specifications: inference, equivalence checking (HW -> SW), change impact analysis • Verification: automated test generation (HW -> SW), integration with existing tool chains (e.g., debuggers) • Code: synthesize correct-by-construction

  18. SymDiff: Programs as Specifications • Addresses AppCompat/Versioning problem • Performs static semantic diff of closely related programs • Uses Boogie/Z3 to check where programs are different
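
SymDiff itself operates on Boogie programs; as a toy illustration of "programs as specifications", two versions of a tiny function (made up here) can be compared by asking Z3 for an input on which they disagree:

# Toy "programs as specifications" check: do two (made-up) versions of a function
# agree on every input? Ask Z3 for a distinguishing input.
from z3 import Int, If, Solver, sat

def abs_v1(x):
    return If(x >= 0, x, -x)

def abs_v2(x):                      # "refactored" version with an injected bug
    return If(x > 0, x, -x + 1)

x = Int('x')
s = Solver()
s.add(abs_v1(x) != abs_v2(x))       # any model is a behavioral difference
if s.check() == sat:
    print('versions differ on x =', s.model()[x])
else:
    print('semantically equivalent')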

  19. Synthesis Application 1: Bitvector Algorithms • Significance dimension: Algorithm Designers [chart: consumers of program synthesis technology — algorithm designers]
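
The synthesis procedure itself is not reproduced here, but the flavor of the domain is bit-twiddling identities; for instance, Z3 can confirm over all 32-bit inputs that x & (x - 1) clears the lowest set bit (a standard identity, used only as an illustration):

# Illustration only: verifying a classic bit-twiddling identity over all 32-bit values,
# namely that x & (x - 1) clears the lowest set bit of x.
from z3 import BitVec, Not, Solver, unsat

x = BitVec('x', 32)
lowest = x & -x                                 # isolates the lowest set bit (0 when x == 0)
s = Solver()
s.add(Not((x & (x - 1)) == (x ^ lowest)))       # search for a counterexample
print('holds for all 32-bit x' if s.check() == unsat else s.model())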

  20. Application 2: String Manipulation Macros • Significance dimension: End-Users [chart: consumers of program synthesis technology — algorithm designers, software developers, and end-users, with end-users marked as the most useful target]

  21. Application 3: Geometry Constructions • Significance dimension: Students and Teachers [chart: consumers of program synthesis technology — algorithm designers, software developers, end-users (most useful target), and students and teachers (most transformational target)]

  22. Lesson #3: Keep it Simple, Predictable, Actionable • Our tools are still too hard to use • We need • Simple counterexamples and simple proofs • Predictable behavior (avoid wild swings in performance for small changes) • Tools that explain their failures so that users know what to do next • Like type systems?

  23. Automated Formal Approaches Don't Give Actionable Feedback to Users • VC checking • Bounded checking • Abstract interpretation • Abstraction refinement • …

  24. VC checking • User provides: loop invariant candidate • Tool failure: loop invariant is not inductive • Feedback: program state in which VC is false User’s task: • Decide if state is reachable • If no, candidate must be strengthened • If yes, candidate must be weakened • Problems • Programmers think in terms of scenarios, not program state • There may be 100 reasons why a given program state is unrealistic. Which reason is actually relevant? Users cannot determine this. • This task requires the user to think abstractly.
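
A concrete (invented) instance of this scenario: for the loop i = 0; while (i < n) i = i + 1 with the over-strong candidate invariant i < n, the inductiveness VC fails, and all the tool can hand back is a falsifying program state:

# Made-up instance of the scenario.
# Loop:                 i = 0; while (i < n) i = i + 1
# Candidate invariant:  i < n   (not inductive)
from z3 import Int, And, Implies, Not, Solver, sat

i, n = Int('i'), Int('n')

def inv(i, n):
    return i < n

# Inductive-step VC: the invariant and the loop guard must imply the invariant
# over the next state.
vc = Implies(And(inv(i, n), i < n), inv(i + 1, n))

s = Solver()
s.add(Not(vc))
if s.check() == sat:
    print(s.model())   # a raw program state, e.g. i = 0, n = 1; judging whether it
                       # is reachable, and how to fix the candidate, is left to the user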

  25. Bounded checking • User provides: environment, property • Tool failure: no bug up to some depth • Feedback: depth searched User’s task: • Decide if depth is satisfactory • If no, modify environment? • Problems • Depth feedback is almost meaningless • No information about how the environment could be modified to improve the chances of finding a bug. • Tool can’t explain the “gaps” in terms the user can understand.
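
A toy version of the setup (transition system invented for illustration): unroll a counter to depth k and search for a "bug" state; when nothing is found, the only feedback is the depth reached:

# Toy bounded check over an invented transition system:
#   x starts at 0 and each step does x := x + 2; the "bug" is reaching x == 7.
from z3 import Int, Or, Solver, sat

k = 10
xs = [Int('x_%d' % t) for t in range(k + 1)]

s = Solver()
s.add(xs[0] == 0)                                  # initial state
for t in range(k):
    s.add(xs[t + 1] == xs[t] + 2)                  # unrolled transition relation
s.add(Or([xs[t] == 7 for t in range(k + 1)]))      # bug reached somewhere in the unrolling

if s.check() == sat:
    print('bug trace:', s.model())
else:
    print('no bug up to depth', k)                 # the user must decide whether k suffices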

  26. AI (disjunctive case) • User provides: abstract domain (say, predicates) • Tool failure: least fixed point not strong enough, or timeout • Feedback: abstract trace (or nothing) User’s task: • Decide whether abstract trace is realistic • If no, modify abstraction • Problems • There may be 100 reasons why a given abstract trace is unrealistic. Which reason is actually relevant? Users cannot determine this. • This task requires user to think abstractly

  27. AI (non-disjunctive case) • User provides: abstract domain + merge, widening operators • Tool failure: least fixed point not strong enough, or timeout • Feedback: sequence of fixed-point approximations User’s task: • Assign blame to abstract domain, merge or widening • Modify and try again • Problems • What’s a widening operator? • This task requires user to understand the analysis process in detail
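
Since the slide's own question is "What's a widening operator?", here is a minimal, made-up interval-analysis sketch in plain Python: joins alone would climb the upper bound 0, 1, 2, ... forever, so widening throws the unstable bound to infinity to force termination (at the cost of precision that a narrowing pass would normally recover):

# Made-up interval-analysis sketch for:  x = 0; while (x < 1000) x = x + 1
INF = float('inf')

def join(a, b):                       # merge: least upper bound of two intervals
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):                  # widening: give up on any bound that keeps moving
    lo = old[0] if old[0] <= new[0] else -INF
    hi = old[1] if old[1] >= new[1] else INF
    return (lo, hi)

def body(inv):                        # one loop iteration: assume x < 1000, then x = x + 1
    lo, hi = inv
    hi = min(hi, 999)                 # guard x < 1000
    return (lo + 1, hi + 1)

inv = (0, 0)                          # x = 0 at loop entry
while True:
    nxt = widen(inv, join(inv, body(inv)))
    if nxt == inv:
        break
    inv = nxt

print('interval for x at the loop head:', inv)   # (0, inf): sound, but imprecise
                                                 # unless a narrowing pass follows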

  28. Abstraction refinement • User provides: property • Tool failure: refinement diverges, or AI times out • Feedback: abstract trace + predicates User’s task: • Decide why refinement is not converging, or abstraction explodes • Suggest different predicates • Problems • In addition to other problems of AI, user must understand why the chosen predicates are irrelevant, or insufficiently generalized. In the worst case this has all the problems of VC checking and AI.

  29. Template-based invariant generation • User provides: property + invariant template • Tool failure: no invariant matching template or search times out • Feedback: ??? User’s task: • Decide why template is insufficient or search explodes • Suggest different template • Problems • The method provides essentially no feedback.
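
A toy rendering of the approach (program and template invented): fix the template x <= c with unknown constant c, encode initiation, consecution, and the property as a formula universally quantified over the program state, and ask Z3 for a value of c; when no instance exists, the tool indeed has little more to say than "unsat":

# Toy template-based search: find a constant c so that  x <= c  is an inductive
# invariant of  x = 0; while (x < 10) x = x + 1  proving  x <= 10  on exit.
from z3 import Int, And, Implies, Not, ForAll, Solver, sat

x, c = Int('x'), Int('c')

def inv(e):
    return e <= c                     # the template, with unknown constant c

conditions = ForAll([x], And(
    Implies(x == 0, inv(x)),                        # initiation
    Implies(And(inv(x), x < 10), inv(x + 1)),       # consecution
    Implies(And(inv(x), Not(x < 10)), x <= 10)))    # exit implies the property

s = Solver()
s.add(conditions)
if s.check() == sat:
    print('template instance: x <=', s.model()[c])  # here: x <= 10
else:
    print('no invariant matches the template')      # ...and that is all the feedback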

  30. First-order theorem provers • User provides: proof goal • Tool failure: saturates or diverges • Feedback: possibly a saturated set of worked-off clauses User’s task: • Decide why worked-off clauses do not contain the goal • Suicide perhaps preferable • Problems • User’s only recourse here is to understand the machine’s proof attempt in detail. • The machine’s proof will not contain the intermediate lemmas we expect.

  31. Some Questions • What could we learn from studying engineers’ (not FV experts’) solutions to design problems? • Can we capture explanations of these solutions in a way that can be formalized? • Can we use these concepts to provide explanations of proofs and diagnoses of erroneous or incomplete user arguments? • How can powerful automated reasoning methods be used to aid in producing user-understandable arguments? • Can we explain “coverage” in a user-understandable way? • Can we point to gaps in user reasoning that may indicate bugs, insufficiently general fixes, etc.? • Do we need a cognitive model of the user to provide usable verification tools? • In our focus on efficient logical tools, have we completely ignored a crucial aspect of the problem?

  32. Lesson #4: Education Really Matters! • The FM community has great things to offer, especially now that we have (more) usable tools • We need to feature FM in education more • Examples: • ACL2 Sedan (Pete Manolios, Northeastern) • CMU: Principles of Imperative Programming • MSR • Boogie, Code Contracts • http://www.pex4fun.com/ • http://ppcp.codeplex.com/

  33. Summer school on useful tools for verification
