
HAVOC: A precise and scalable verifier for systems software

HAVOC: A precise and scalable verifier for systems software. Shaz Qadeer, Microsoft Research. Collaborators: researchers Jeremy Condit and Shuvendu Lahiri; interns Shaunak Chatterjee, Brian Hackett, Zvonimir Rakamaric, Ian Wehrman, and Thomas Wies. HAVOC is a modular verifier for C programs.



Presentation Transcript


  1. HAVOC: A precise and scalable verifier for systems software Shaz Qadeer Microsoft Research

  2. Collaborators • Researchers • Jeremy Condit, Shuvendu Lahiri • Interns • Shaunak Chatterjee, Brian Hackett, Zvonimir Rakamaric, Ian Wehrman, Thomas Wies

  3. HAVOC • Modular verifier for C programs • Verifies each procedure separately • Requires contracts: preconditions, postconditions, modifies clauses, loop invariants • Features • Accurate heap model • Expressive annotation language • Efficient checking using SMT solvers • Precise and efficient reasoning for loop-free and call-free code
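To make the four kinds of contract concrete, here is a minimal sketch in plain C. The slides do not show HAVOC's annotation syntax, so the contracts appear only as comments and as run-time `assert` checks; the procedure `fill` is a hypothetical example, not taken from the talk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical procedure: set every element of buf[0..len) to value.
 * The comments name the contracts a modular verifier would need;
 * the assert() calls only approximate them with run-time checks. */
void fill(int *buf, size_t len, int value)
{
    /* requires: buf points to at least len writable ints */
    assert(buf != NULL || len == 0);
    /* modifies: buf[0..len) and nothing else (not checkable with assert) */

    size_t i = 0;
    while (i < len) {
        /* loop invariant: i <= len, and buf[j] == value for all j < i */
        assert(i <= len);
        buf[i] = value;
        i++;
    }

    /* ensures: buf[j] == value for all j < len */
    for (size_t j = 0; j < len; j++)
        assert(buf[j] == value);
}
```

A modular verifier would check the body of `fill` once against these contracts, and would check each caller using only the contracts, never the body.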

  4. [Diagram: HAVOC tool pipeline] Annotated C program → Visual C front end → control flow graph → CtoBoogiePL (with the memory model) → Boogie program → Boogie VC generator → verification condition → Z3 SMT solver → Verified / Warning

  5. Challenges for HAVOC • Concise and precise expression of non-aliasing and disjointness of heap values • Properties of unbounded collections • Lists, Arrays, … • Enable such reasoning for low-level software • pointer arithmetic • interior pointers • nested structures and unions • …

  6. But will programmers ever write contracts? • In some cases, they might • security properties: thousands of buffer annotations in Windows code • maintenance of critical legacy code: the Windows NT file system • Automatic annotation inference • precise and efficient checking of annotated programs is a crucial first step

  7. Roadmap • Novel features of the specification language • Dealing with low-level features of C • Concluding remarks

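  8. [Figure: a doubly linked list from muh, an Internet Relay Chat (IRC) bouncer. log_list.head and log_list.tail point to the ends of a chain of LinkNode objects linked by next and prev. Each node's data field points to a struct _logentry with fields channel_name, file_name (char *), and logtype.]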

  9. Ensure absence of double free

      LinkNode *iter = log_list.head;
      while (iter != NULL) {
          struct _logentry *entry = iter->data;
          free(entry->channel_name);
          free(entry->file_name);
          free(entry);
          entry = NULL;
          iter = iter->next;
      }

     Data structure invariant (needs a reachability predicate and universal quantification): for every node x in the list between log_list.head and null, x->data is a unique pointer, x->data->channel_name is a unique pointer, and x->data->file_name is a unique pointer.

  10. Limitations of SMT solvers • No support for precise reasoning with reachability predicate • Incompleteness in Floyd-Hoare proofs for straight line code • Brittle support for quantifiers • Complexity: NP-complete (ground) → undecidable • Leads to unpredictable behavior of verifiers • Proof times, proof success rate • Requires user ingenuity to craft axioms/invariants with quantifiers

  11. Contribution • Expressive and efficient logic for precise reasoning about reachability, unique pointers, and restricted quantification • A decision procedure for the logic built over an SMT solver

  12. Simple Java-like memory model • Heap consists of a set of objects (obj) • Each field “f” is a mutable map • f: obj → obj • g: obj → int • h: obj → bool • The sort obj may be refined into a collection of sorts

  13. Reachability predicate: Btwn_f [Figure: a chain of list nodes linked by next and prev, each with a data field; nodes x and y are marked] Btwn_next(x, y) • Btwn_prev(y, x)
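One operational reading of the reachability set can be sketched in C. The sketch assumes that Btwn_next(x, y) contains exactly the nodes on the next-path from x up to and including the first occurrence of y, and is empty when y is never reached; the slide only names the predicate, so this reading is an assumption. `Node` and `in_btwn_next` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;
} Node;

/* Returns true iff z is in Btwn_next(x, y) under the reading assumed above.
 * y may be NULL, which models the "null" endpoint used on the slides. */
bool in_btwn_next(const Node *x, const Node *y, const Node *z)
{
    const Node *cur = x;
    const Node *slow = x;       /* trails cur at half speed to detect cycles */
    bool seen_z = false;
    bool advance_slow = false;

    for (;;) {
        if (cur == z)
            seen_z = true;
        if (cur == y)
            return seen_z;      /* reached y: z is in the set iff already seen */
        if (cur == NULL)
            return false;       /* path ended without reaching y: empty set */
        cur = cur->next;
        if (advance_slow)
            slow = slow->next;
        advance_slow = !advance_slow;
        if (cur != NULL && cur == slow)
            return false;       /* looped without reaching y: empty set */
    }
}
```

A verifier reasons about this set symbolically rather than by walking the heap; the function is only meant to pin down which nodes the set denotes.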

  14. Inverse of a function: f⁻¹ [Figure: a chain of list nodes linked by next and prev; the data fields of nodes x and y both point to the same object w] data⁻¹(w) = {x, y}

  15. LinkNode *iter = log_list.head;
      while (iter != NULL) {
          struct _logentry *entry = iter->data;
          free(entry->channel_name);
          free(entry->file_name);
          free(entry);
          entry = NULL;
          iter = iter->next;
      }

     Data structure invariant: for every node x in the list between log_list.head and null, x->data is a unique pointer, and ….

      ∀x ∈ Btwn_f(log_list.head, null) \ {null}. data⁻¹(data(x)) = {x} ….
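The uniqueness part of the invariant can be spelled out as an executable check. This is a sketch under two assumptions: the list is NULL-terminated and acyclic, and data⁻¹ is restricted to nodes of this one list (on the slide it ranges over the whole heap). The `LinkNode` layout and the name `data_pointers_unique` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct LinkNode {
    struct LinkNode *next;
    struct LinkNode *prev;
    void *data;
} LinkNode;

/* Returns true iff no two distinct nodes of the list share a data pointer,
 * i.e. data^-1(data(x)) = {x} for every node x, with the inverse taken
 * over the nodes of this list only. */
bool data_pointers_unique(const LinkNode *head)
{
    for (const LinkNode *x = head; x != NULL; x = x->next)
        for (const LinkNode *y = x->next; y != NULL; y = y->next)
            if (x->data == y->data)
                return false;   /* x and y are both in data^-1(data(x)) */
    return true;
}
```

If this property holds before the loop on the slide, each `free(entry)` releases a block that no later iteration can reach through another node, which is the reason the invariant rules out a double free.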

  16. Expressive logic • Express properties of collections: ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED //cyclic • Arithmetic reasoning on data (e.g. sortedness): ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y)
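The sortedness formula has two nested bounded quantifiers, and it can be mirrored directly by two nested loops. The sketch below assumes a NULL-terminated, acyclic list; `Cell` and `is_sorted` are hypothetical names, and the field `d` stands for the data map d on the slide.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Cell {
    struct Cell *next;
    int d;
} Cell;

/* Checks: for all x in Btwn(hd, null) \ {null},
 *         for all y in Btwn(x, null) \ {null}, d(x) <= d(y). */
bool is_sorted(const Cell *hd)
{
    for (const Cell *x = hd; x != NULL; x = x->next)        /* outer quantifier */
        for (const Cell *y = x; y != NULL; y = y->next)     /* inner quantifier */
            if (x->d > y->d)
                return false;
    return true;
}
```

The inner loop starts at `x` itself because the set Btwn(x, null) includes its left endpoint; that case is trivially satisfied.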

  17. Precise • Need annotations/abstractions only at procedure/loop boundaries • Given the Floyd-Hoare triple X = {P} S {Q} • P and Q are expressed in our logic • S is a loop-free call-free program • We can construct a formula Y in our logic • Y is linear in the size of X • X is valid iff Y is valid

  18. Efficient • Decision problem is NP-complete • Can’t expect any better with propositional logic! • Retains the complexity of current SMT logics • Provide a decision procedure for the logic on top of state-of-the-art Z3 SMT solver • Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)

  19. Logic (the first two productions form the ground logic)

      t ∈ Term     ::= c | x | t1 + t2 | t1 - t2 | f(t)
      G ∈ GFormula ::= t = t' | t < t' | t ∈ Btwn_f(t1, t2) | ¬G
      S ∈ Set      ::= f⁻¹(t) | Btwn_f(t1, t2)
      F ∈ Formula  ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

  20. Ground decision procedure • Provide a set of 10 rewrite rules for Btwn_f • Sound, complete and terminating • E.g. Transitivity3: from t1 ∈ Btwn_f(t0, t2) and t ∈ Btwn_f(t0, t1), derive t ∈ Btwn_f(t0, t2) and t1 ∈ Btwn_f(t, t2)

  21. Logic: bounded quantification over interpreted sets

      t ∈ Term     ::= c | x | t1 + t2 | t1 - t2 | f(t)
      G ∈ GFormula ::= t = t' | t < t' | t ∈ Btwn_f(t1, t2) | ¬G
      S ∈ Set      ::= f⁻¹(t) | Btwn_f(t1, t2)
      F ∈ Formula  ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

  22. Lazy quantifier instantiation • Instantiation rule: from t ∈ S and ∀x ∈ S. F, derive F[t/x] • Lazy instantiation • Instantiate only when a term t belongs to the set S • Substantially reduces the number of terms used to instantiate a quantified fact • Terminates if ∀x ∈ S. F is sort-restricted • sort(x) is less than sort(t[x]) for any term t[x] in F
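As a worked instance, the rule can be written as an inference and applied to the cyclic-list property from the expressive-logic slide. The ground term t is an arbitrary term assumed to be already known to lie in the set; the formatting is only illustrative and needs the amsmath package for \text.

```latex
\[
\frac{t \in S \qquad \forall x \in S.\; F}{F[t/x]}
\qquad\text{for example}\qquad
\frac{t \in \mathit{Btwn}_f(f(hd), hd) \qquad
      \forall x \in \mathit{Btwn}_f(f(hd), hd).\; \mathit{state}(x) = \mathit{LOCKED}}
     {\mathit{state}(t) = \mathit{LOCKED}}
\]
```

The quantified fact produces no instance for terms that are not known to be in the set, which is where the reduction in instantiations comes from.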

  23. Experience • Compared with an earlier implementation • Unrestricted quantifiers, incomplete axiomatization of reachability, no f⁻¹ • Small to medium sized benchmarks • Greatly improved the predictability of HAVOC • Reduced runtimes (2X – 100X) • Eliminated the need for carefully crafted axioms and invariants • Can handle newer examples

  24. Roadmap • Novel features of the specification language • Dealing with low-level features of C • Concluding remarks

  25. struct list { list *next; list *prev; };
      struct record { int data1; list node; int data2; };

     [Figure: two record objects, each laid out as data1, next, prev, data2; p points at the embedded node of a record and q points at the start of that record]

      q = CONTAINER(p, record, node)
        = (record *) ((int *) p - (int) (&(((record *)0)->node)))
        = (record *) ((int *) p - 1)
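In portable C the same computation is usually written with `offsetof` and byte arithmetic. The sketch below is one such formulation and not the macro from the talk: the slide counts in int-sized words (hence the offset of 1), while `offsetof` counts bytes. The helper `record_of` is a hypothetical name.

```c
#include <stddef.h>

typedef struct list {
    struct list *next;
    struct list *prev;
} list;

typedef struct record {
    int data1;
    list node;      /* list node embedded inside the record */
    int data2;
} record;

/* Step back from a pointer to the embedded field to the enclosing object. */
#define CONTAINER(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Recover the record that contains the list node p.
 * Only valid if p really points at the node field of a record. */
record *record_of(list *p)
{
    return CONTAINER(p, record, node);
}
```

Nothing in the C type system enforces the validity condition in the comment, which is why type safety of such code depends on a program-specific invariant.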

  26. void init_all_records(list *p) {
          while (p != NULL) {
              init_record(p);
              p = p->next;
          }
      }

      void init_record(list *p) {
          record *r = CONTAINER(p, record, node);
          r->data2 = 42;
      }

     • Type safety requires nontrivial reasoning • the container of every element in list has type record* • Use of memory model with field abstraction is unsound • Field abstraction is crucial to all property checkers • &a->data1 is not aliased to &b->data2 • init_all_records(p) preserves the assertion a->data1 == 0

  27. Unify type checking and property checking • Harness the power of constraint solvers to enhance type checking • type safety often depends on program-specific invariants • Harness the strong guarantees provided by the type invariant to enhance property checking • non-aliasing, field abstraction

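  28. Mem: int → int (mutable) • Type: int → type (immutable) [Figure: example memory cells at addresses 99 to 102, each with an integer value in Mem and a type in Type; the types shown are Int, List, Record, Ptr(Int), Ptr(List), Ptr(Record)] Type invariant: ∀a:int. HasType(Mem(a), Type(a))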

  29. struct list { list *next; list *prev; };
      struct record { int data1; list node; int data2; };

      void init_record(list *p) {
          record *r = CONTAINER(p, record, node);
          r->data2 = 42;
      }

     Translated program:

      requires ∀a:int. HasType(Mem(a), Type(a))
      requires HasType(p, Ptr(List))
      ensures ∀a:int. HasType(Mem(a), Type(a))
      void init_record(int p) {
          var r:int;
          r := p-1;
          assert HasType(r, Ptr(Record));
          Mem(r+3) := 42;
          assert ∀a:int. HasType(Mem(a), Type(a));
      }

  30. struct list { list *next; list *prev; };
      struct record { int data1; list node; int data2; };

      HasType(v, Int) ⇔ true
      HasType(v, Ptr(t)) ⇔ v = 0 ∨ (v > 0 ∧ Match(v, t))

      Match(a, Int) ⇔ Type(a) = Int
      Match(a, Ptr(t)) ⇔ Type(a) = Ptr(t)
      Match(a, List) ⇔ Match(a, Ptr(List)) ∧ Match(a+1, Ptr(List))
      Match(a, Record) ⇔ Match(a, Int) ∧ Match(a+1, List) ∧ Match(a+3, Int)

  31. struct list { list *next; list *prev; };
      struct record { int data1; list node; int data2; };

      void init_record(list *p) {
          record *r = CONTAINER(p, record, node);
          r->data2 = 42;
      }

     Translated program, with the extra precondition on p-1:

      requires HasType(p-1, Ptr(Record)) ∧ p - 1 ≠ 0
      requires ∀a:int. HasType(Mem(a), Type(a))
      requires HasType(p, Ptr(List))
      ensures ∀a:int. HasType(Mem(a), Type(a))
      void init_record(int p) {
          var r:int;
          r := p-1;
          assert HasType(r, Ptr(Record));
          Mem(r+3) := 42;
          assert ∀a:int. HasType(Mem(a), Type(a));
      }
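The word-addressed model can be prototyped in C to see how Match, HasType, and the type invariant fit together. This is a sketch under several assumptions that are not on the slides: the heap is a finite array of HEAP_WORDS cells, the quantified invariant is checked by enumeration, HasType on struct types is taken to be false, and all identifiers (`Ty`, `match`, `has_type`, `type_invariant`, `setup_example_heap`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define HEAP_WORDS 16  /* assumed finite heap; the slides quantify over all ints */

typedef enum { T_INT = 0, T_PTR_INT, T_PTR_LIST, T_PTR_RECORD, T_LIST, T_RECORD } Ty;

int Mem[HEAP_WORDS];   /* Mem: address -> value (mutable) */
Ty  Type[HEAP_WORDS];  /* Type: address -> type (fixed after setup) */

bool in_heap(int a) { return a > 0 && a < HEAP_WORDS; }

/* Match(a, t): the words starting at address a are laid out as a t. */
bool match(int a, Ty t)
{
    switch (t) {
    case T_LIST:
        return match(a, T_PTR_LIST) && match(a + 1, T_PTR_LIST);
    case T_RECORD:
        return match(a, T_INT) && match(a + 1, T_LIST) && match(a + 3, T_INT);
    default: /* Int and Ptr(t): compare against the Type map */
        return in_heap(a) && Type[a] == t;
    }
}

/* HasType(v, t): the value v is a legal value of type t. */
bool has_type(int v, Ty t)
{
    switch (t) {
    case T_INT:        return true;
    case T_PTR_INT:    return v == 0 || (v > 0 && match(v, T_INT));
    case T_PTR_LIST:   return v == 0 || (v > 0 && match(v, T_LIST));
    case T_PTR_RECORD: return v == 0 || (v > 0 && match(v, T_RECORD));
    default:           return false; /* struct types: assumption, not on the slides */
    }
}

/* Type invariant, checked by enumeration over the finite heap. */
bool type_invariant(void)
{
    for (int a = 1; a < HEAP_WORDS; a++)
        if (!has_type(Mem[a], Type[a]))
            return false;
    return true;
}

/* One record at address 4: data1 @4, node.next @5, node.prev @6, data2 @7. */
void setup_example_heap(void)
{
    for (int a = 0; a < HEAP_WORDS; a++) { Mem[a] = 0; Type[a] = T_INT; }
    Type[5] = T_PTR_LIST;
    Type[6] = T_PTR_LIST;
}

void init_record(int p)
{
    assert(has_type(p - 1, T_PTR_RECORD) && p - 1 != 0); /* requires */
    assert(type_invariant());                            /* requires */
    int r = p - 1;
    Mem[r + 3] = 42;
    assert(type_invariant());                            /* ensures */
}
```

With the example heap, `init_record(5)` writes 42 to address 7, which has type Int, so the invariant survives. Storing 8 into address 5 would break it, because address 8 is not laid out as a List.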

  32. struct list { list *next; list *prev; };
      struct record { int data1; list node; int data2; };

      HasType(v, Int) ⇔ true
      HasType(v, Data1) ⇔ true
      HasType(v, Data2) ⇔ true
      HasType(v, Ptr(t)) ⇔ v = 0 ∨ (v > 0 ∧ Match(v, t))

      Match(a, Int) ⇔ Type(a) = Int
      Match(a, Data1) ⇔ Type(a) = Data1
      Match(a, Data2) ⇔ Type(a) = Data2
      Match(a, Ptr(t)) ⇔ Type(a) = Ptr(t)
      Match(a, List) ⇔ Match(a, Ptr(List)) ∧ Match(a+1, Ptr(List))
      Match(a, Record) ⇔ Match(a, Data1) ∧ Match(a+1, List) ∧ Match(a+3, Data2)

  33. Other highlights • Decision procedure for type safety • suffices to instantiate the type invariant and the definitions of Match and HasType on a few terms • Extensions • unions • function pointers • parametric polymorphism • user-defined types • sub-word accesses (char, short)

  34. Experience • Property checking on small benchmarks • list-manipulation: insertion, removal, multiple lists each with a different container type • sorting: bubble sort, merge sort, quick sort • intuitive and concise annotations • Type checking of four WDK drivers • cancel, event, kbfiltr, vserial • ~1 min to check each driver • ~5KLOC, ~225 annotations

  35. Roadmap • Novel features of the specification language • Dealing with low-level features of C • Concluding remarks

  36. Other case studies with HAVOC • Synchronization protocols protecting critical data structures in the NT file system (Brian Hackett) • ~300KLOC, 1500 procedures • reference count usage, lock usage, data races, teardown races • 45 confirmed bugs (out of 125 warnings) • most bugs fixed • Spin lock usage in Windows device drivers (Juan Pablo Galeotti, Thomas Wies) • flpydisk, kbdclass, daytona, serial (~50KLOC)

  37. HAVOC is available • Download: • http://research.microsoft.com/projects/HAVOC

  38. Future directions • Unified decision procedure for reachability, inverse, arrays, and types for the low-level memory model • Exploiting type invariant for property checking on device drivers • Annotation inference

  39. Questions
