Scalable Contract Checking for Systems Software using SMT solvers - PowerPoint PPT Presentation

Presentation Transcript

  1. Scalable Contract Checking for Systems Software using SMT solvers
     Shaz Qadeer, RiSE, Microsoft Research
     Joint work with Jeremy Condit and Shuvendu Lahiri
     http://research.microsoft.com/en-us/projects/havoc/

  2. Context: Scalable module verification
     Target: OS components (kernel, drivers, file systems)
     • ~100 KLOC with more than 1000 procedures
     Module
     • A set of public/entry procedures
     • A set of private/internal procedures
     Specs
     • Interface specification
       • Specs for public methods
       • Specs for external modules
     • Property assertion
     Harness
         Initialize(..);
         while (*) {
           choice = nondet();
           if (choice == 1) {
             [assume pre_1]
             call Public_1(…);
           } else if (choice == 2) {
             [assume pre_2]
             call Public_2(…);
           } …
         }
         Cleanup(…);
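
The harness on this slide can be sketched as ordinary C. This is only an illustration under stated assumptions: the module state (`live_devices`), the bodies of `Public_1` and `Public_2`, and the preconditions `pre_1` and `pre_2` are hypothetical, and the nondeterministic `nondet()` is replaced by a caller-supplied array of choices. A choice whose precondition fails is skipped, which approximates `assume pre_i`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical module state: number of live device objects. */
static int live_devices;

static void Initialize(void) { live_devices = 0; }
static void Public_1(void)   { live_devices++; }   /* e.g. create a device */
static void Public_2(void)   { live_devices--; }   /* e.g. delete a device */
static void Cleanup(void)    { live_devices = 0; }

/* Preconditions of the public procedures (assumed for illustration). */
static int pre_1(void) { return 1; }
static int pre_2(void) { return live_devices > 0; }

/* Runs the harness on one resolved sequence of nondeterministic choices.
   Returns the number of public calls actually made. */
int run_harness(const int *choices, size_t n) {
    int calls = 0;
    Initialize();
    for (size_t i = 0; i < n; i++) {        /* while (*) */
        int choice = choices[i];            /* choice = nondet() */
        if (choice == 1 && pre_1()) {       /* [assume pre_1] */
            Public_1();
            calls++;
        } else if (choice == 2 && pre_2()) { /* [assume pre_2] */
            Public_2();
            calls++;
        }
        assert(live_devices >= 0);          /* property assertion */
    }
    Cleanup();
    return calls;
}
```

A verifier considers every choice sequence at once; this sketch runs a single sequence, so it behaves like one test of the harness rather than a proof.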

  3. Desirable goals • Find bugs • Violations of property assertions • Low false alarms • Use contracts • Modular checking for scalability • Readable contracts are formal documentation • Reduce testing cost by providing high assurance in the verifier • Formal documentation of assumptions • Simple meta-theory for proofs

  4. Existing methods on these examples
     Large difference between theory and practice
     Imprecise
     • Modeling of lists/arrays
     Unsound
     • Modeling of lists/arrays
     • Aliasing, pointer arithmetic
     • Restricted harness
     Complex “proof” calculus
     • Combination of analyses
     Harness
         Initialize(..);
         while (*) {
           choice = nondet();
           if (choice == 1) {
             [assume pre_1]
             call Public_1(…);
           } else if (choice == 2) {
             [assume pre_2]
             call Public_2(…);
           } …
         }
         Cleanup(…);

  5. Full functional correctness is not a goal. Neither is minimizing the trusted computing base.

  6. Proof method: Floyd-Hoare Triple
     • Floyd-Hoare triple {P} S {Q}
       P, Q : predicates/property
       S : a program
     • From a state satisfying P, if S executes,
       • No assertion in S fails, and
       • Terminating executions end in a state satisfying Q

  7. Program verification → Formula
     { b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid iff
     Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid
     • theory of equality: =
     • theory of arithmetic: 5, 10, +
     • theory of arrays: Select, Store
     • [Nelson & Oppen ’79]
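
To make the formula on this slide concrete, here is a minimal C sketch that models the field `f` as a map from objects to values and checks the verification condition by brute force over a tiny universe. The bounds `NOBJ` and `NVAL`, the `FieldMap` type and the `use_pre` flag are assumptions made for illustration; an SMT solver proves the formula for all maps and objects rather than enumerating them.

```c
#include <assert.h>

#define NOBJ 3   /* tiny object universe, chosen only to keep the search finite */
#define NVAL 8   /* field values range over 0..NVAL-1 */

typedef struct { int v[NOBJ]; } FieldMap;   /* a field as a map: object -> value */

static int Select(FieldMap f, int obj) { return f.v[obj]; }

static FieldMap Store(FieldMap f, int obj, int val) {
    f.v[obj] = val;      /* f is a by-value copy, so the caller's map is untouched */
    return f;
}

/* Returns 1 if, for every f1, a and b in the bounded universe,
     Select(f1,b) = 5  and  f2 = Store(f1,a,5)
       imply  Select(f2,a) + Select(f2,b) = 10.
   With use_pre == 0 the hypothesis Select(f1,b) = 5 is dropped. */
int vc_holds(int use_pre) {
    FieldMap f1;
    for (int x = 0; x < NVAL; x++)
    for (int y = 0; y < NVAL; y++)
    for (int z = 0; z < NVAL; z++) {
        f1.v[0] = x; f1.v[1] = y; f1.v[2] = z;
        for (int a = 0; a < NOBJ; a++)
        for (int b = 0; b < NOBJ; b++) {
            if (use_pre && Select(f1, b) != 5) continue;  /* precondition fails */
            FieldMap f2 = Store(f1, a, 5);
            if (Select(f2, a) + Select(f2, b) != 10) return 0;
        }
    }
    return 1;
}
```

Calling `vc_holds(1)` checks the triple as stated, including the aliased case `a == b`; `vc_holds(0)` shows that the precondition `b.f = 5` is needed.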

  8. Satisfiability-Modulo-Theory (SMT) • Boolean satisfiability solving + theory reasoning • Ground theories • Equality, arithmetic, Select/Store • NP-complete logics • Powerful methods to combine decision procedures for theories • [Nelson & Oppen ’79] • Phenomenal progress in the past few years • Yices, Z3, MathSAT, …

  9. Simple type-state property • Allocation type-state of DEV_OBJ • Device Objects (DEV_OBJ) allocated and freed • Property to check for a module • IoDeleteDevice() only called on elements in MyDevObj
     [Diagram: type-state transitions between ~MyDevObj and MyDevObj, labeled IoCreateDevice() and IoDeleteDevice()]

  10. Simple property ⇏ simple invariants
      typedef struct _DEV_OBJ {
        DEV_EXT *DevExt;
        …
      } DEV_OBJ;
      typedef struct _DEV_EXT {
        DEV_OBJ *Self;
        …
      } DEV_EXT;

      requires (do ∈ MyDevObj)
      NT_STATUS PnP(DEV_OBJ *do, IRP *pirp) {
        PDEV_EXT data = do->DevExt;
        ….
        switch (pirp->MajorFn) {
          case IRP_MN_REMOVE_DEVICE:
            IoDeleteDevice(data->Self);
            …
        }
      }
      [Diagram: a DEV_OBJ (do) points to a DEV_EXT through DevExt, and the DEV_EXT points back through Self]
      • ∀x ∈ MyDevObj. x->DevExt->Self = x

  11. Simple property ⇏ simple invariants
      NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
          RemoveEntryList(iter);
          iter = iter->Next;
          IoDeleteDevice(iter->Self);
        }
        ….
      }
      [Diagram: hd points through First to a list of DEV_EXT nodes linked by Next; each DEV_EXT and its DEV_OBJ point to each other through Self and DevExt]
      • ∀x ∈ DevExt. x->Self ∈ MyDevObj
      • ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
      • ∀x ∈ DevExt. x->Self->DevExt = x

  12. Limitations of SMT solvers • No support for precise reasoning with reachability predicate • Incompleteness in Floyd-Hoare proofs for straight line code • Brittle support for quantifiers • Complexity: NP-complete (ground) → undecidable • Leads to unpredictable behavior of verifiers • Proof times, proof success rate

  13. Limitations of SMT solvers • Answer the query {P} S {Q} for loop-free and call-free programs • To handle loops and procedures, contracts are needed • Loop invariants • Pre/post-conditions • Infeasible to manually supply internal contracts for large modules

  14. Contributions • Efficient decision procedure for verifying list-based programs • Verifying and exploiting C type annotations • Annotation inference for large modules

  15. Reachability predicate: Btwn_f
      [Diagram: heap nodes connected by edges labeled next, f and g, with nodes x and y marked; caption: Btwn(next,x,y)]
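
A small C sketch may help fix the reading of `Btwn(next, x, y)`. It assumes the usual interpretation: the set of cells on the `next` path from `x` up to and including `y`, and the empty set when `y` is not reached from `x`. The `Node` type, the function name `in_btwn` and the `max_steps` bound (which keeps the walk finite on cyclic lists) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node { struct Node *next; } Node;

/* Returns 1 if z is a member of Btwn(next, x, y), else 0.
   The walk visits at most max_steps + 1 cells. */
int in_btwn(const Node *x, const Node *y, const Node *z, size_t max_steps) {
    int seen_z = 0;
    const Node *cur = x;
    for (size_t i = 0; i <= max_steps; i++) {
        if (cur == z) seen_z = 1;
        if (cur == y) return seen_z;  /* reached y: z is in the set iff it was on the way */
        if (cur == NULL) return 0;    /* fell off the list before reaching y */
        cur = cur->next;
    }
    return 0;                         /* y not reached within the bound */
}
```

With `y == NULL` the set covers the whole null-terminated list including `NULL` itself, which is why list properties are typically stated over `Btwn(next, hd, null) \ {null}`.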

  16. Expressive logic
      • Express properties of collections
        ∀x ∈ Btwn(next, next(hd), hd). state(x) = LOCKED   // cyclic
      • Arithmetic reasoning on data (e.g. sortedness)
        ∀x ∈ Btwn(next, hd, null) \ {null}. ∀y ∈ Btwn(next, x, null) \ {null}. d(x) ≤ d(y)
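
The two quantified properties on this slide can be read as executable checks over a concrete list. The sketch below is an illustration, not the decision procedure: the `Cell` type and the function names are hypothetical, `is_sorted` assumes an acyclic null-terminated list, and `all_locked` assumes a non-null `hd` on a list that cycles back to `hd`.

```c
#include <assert.h>
#include <stddef.h>

enum { UNLOCKED = 0, LOCKED = 1 };

typedef struct Cell { int d; int state; struct Cell *next; } Cell;

/* forall x in Btwn(next, hd, null) \ {null}.
     forall y in Btwn(next, x, null) \ {null}. d(x) <= d(y) */
int is_sorted(const Cell *hd) {
    for (const Cell *x = hd; x != NULL; x = x->next)
        for (const Cell *y = x; y != NULL; y = y->next)
            if (x->d > y->d) return 0;
    return 1;
}

/* forall x in Btwn(next, next(hd), hd). state(x) = LOCKED
   Walks from hd->next around the cycle, stopping after hd itself. */
int all_locked(const Cell *hd) {
    const Cell *x = hd->next;
    for (;;) {
        if (x->state != LOCKED) return 0;
        if (x == hd) return 1;
        x = x->next;
    }
}
```

A verifier establishes these properties for every reachable heap; the functions above only evaluate them on one heap, which is useful for testing an annotation before handing it to the prover.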

  17. Efficient decision procedure • Decides the validity of {P} S {Q} • Worst-case exponential time but works well in practice • Decision problem is NP-complete • Cannot expect any better with propositional logic • Retains the complexity of current SMT logics • Implemented in the Z3 SMT solver • Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions…)

  18. Contributions • Efficient decision procedure for verifying list-based programs • Verifying and exploiting C type annotations • Annotation inference for large modules

  19. C language
      C types
      • Scalars (int, long, char, short)
      • Pointers (int*, struct T*, ..)
      • Nested structs and unions
      • Array (struct T a[10];)
      • Function pointers
      • void *
      Difficult to establish type safety in presence of pointer arithmetic, casts
      • Type safety ⇒ (spatial) memory safety
      • Important default property to check
      Lack of types hurts property checking
      • Difficult to disambiguate heap pointers
      • Difficult to write concise type invariants

  20. Example: Type Checking
      [Diagram: two IRP structures, each containing a ListEntry with Flink and Blink fields; p points to a ListEntry and q to the enclosing IRP]
      q = CONTAINING_RECORD(p, IRP, ListEntry)
        = (IRP*)((char*)p - (size_t)&((IRP*)0)->ListEntry)
      Type checker: Does variable q have type IRP*?
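
The pointer arithmetic behind CONTAINING_RECORD can be tried out in portable C using the standard `offsetof` macro. This is a sketch with hypothetical names: `ListNode`, `Irp` and `ENCLOSING_RECORD` are simplified stand-ins for the Windows `LIST_ENTRY`, `IRP` and `CONTAINING_RECORD`, and the `Irp` layout is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the Windows types (layout is illustrative only). */
typedef struct ListNode { struct ListNode *Flink, *Blink; } ListNode;
typedef struct Irp { int Data1; ListNode ListEntry; int Data2; } Irp;

/* Step back from the address of an embedded field to the start of the
   record that contains it. */
#define ENCLOSING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Recovers the Irp whose ListEntry field is *p. The result is only
   meaningful if p really points into an Irp, which is exactly what the
   type checker has to establish. */
Irp *irp_of(ListNode *p) {
    return ENCLOSING_RECORD(p, Irp, ListEntry);
}
```

The C compiler accepts `irp_of` for any `ListNode *`, so nothing at the language level guarantees that the result is a valid `Irp *`; that gap is the question the slide poses.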

  21. Example: Property Checking
      [Diagram: two IRP structures, pointed to by q and r, each with fields Data1, ListEntry (Flink, Blink) and Data2]
      ...
      q->Data2 = 42;
      Property checker: Is r->Data1 unchanged?

  22. Example: Property Checking
      [Diagram: the structures pointed to by q and r drawn without type information, with one cell labeled both Data2 and Data1]
      For all we know, Data1 and Data2 could be aliased!

  23. Our Approach • Implement a type checker in HAVOC • Provide formal semantics for C and its types • Use types to improve the property checker • Provide Java-style field disambiguation • Fully automated using Z3 SMT solver

  24. Formalizing Type Safety
      A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.
      Mem : addr -> value
      Type : addr -> type
      HasType : value x type -> bool
      for all a in addr, HasType(Mem(a), Type(a))
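
The definition on this slide can be exercised on a toy memory model. The sketch below is a deliberate simplification and not the HAVOC encoding: memory has `NADDR` cells, there are only two types (`T_INT` and `T_PTR_INT`), address 0 plays the role of NULL, and a value has pointer type when it is 0 or the address of a cell typed `T_INT`. All of these choices are assumptions made for illustration.

```c
#include <assert.h>

#define NADDR 8                       /* size of the toy address space */

typedef enum { T_INT, T_PTR_INT } Ty;

static int Mem[NADDR];                /* Mem  : addr -> value */
static Ty  Type[NADDR];               /* Type : addr -> type  */

/* HasType : value x type -> bool (simplified, see the assumptions above) */
int HasType(int v, Ty t) {
    if (t == T_INT) return 1;         /* every value is a valid int */
    return v == 0 || (v > 0 && v < NADDR && Type[v] == T_INT);
}

/* for all a in addr, HasType(Mem(a), Type(a)) */
int type_safe(void) {
    for (int a = 0; a < NADDR; a++)
        if (!HasType(Mem[a], Type[a])) return 0;
    return 1;
}
```

Writing a value into `Mem` and calling `type_safe()` again mimics the check that each store preserves the invariant; a verifier discharges that obligation symbolically for every store in the program.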

  25. Example
      #define ENCL(x) CONTAINING_RECORD(x, record, node)

      requires( HasType(ENCL(p), record*) && ENCL(p) != NULL )
      void init_record(list *p) {
        record *r = CONTAINING_RECORD(p, record, node);
        r->data2 = 42;
      }

      requires( forall(q, Btwn(next, p, NULL), q != NULL ==> HasType(ENCL(q), record*) && ENCL(q) != NULL) )
      void init_all_records(list *p) {
        while (p != NULL) {
          init_record(p);
          p = p->next;
        }
      }

  26. Decision Procedure • Translation results in verification conditions that refer to Mem, Type, and HasType • Can be encoded into an NP-complete logic • No worse than SAT solving • Provide decision procedure using an SMT solver

  27. Experiments • Implementation supports full C language • Supports polymorphism • Supports user-defined, dependent types • Fancier type invariants => slower checking • Pay only for what you use! • Annotated and checked four Windows drivers • Sample drivers provided with Windows DDK • About 2.3 KLOC total, with 225 annotations • Checking time: ~1 minute each

  28. Contributions • Efficient decision procedure for verifying list-based programs • Verifying and exploiting C type annotations • Annotation inference for large modules

  29. Simple property ⇏ simple invariants
      NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
          RemoveEntryList(iter);
          iter = iter->Next;
          IoDeleteDevice(iter->Self);
        }
        ….
      }
      [Diagram: hd points through First to a list of DEV_EXT nodes linked by Next; each DEV_EXT and its DEV_OBJ point to each other through Self and DevExt]
      • ∀x ∈ DevExt. x->Self ∈ MyDevObj
      • ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
      • ∀x ∈ DevExt. x->Self->DevExt = x

  30. Need to simplify the problem
      Module
      • A set of public/entry procedures
      • A set of private/internal procedures
      Specs
      • Interface specification
      • Property assertion
      Require the user to provide a module invariant
      Harness
          Initialize(..);
          [loop_inv moduleInv]
          while (*) {
            choice = nondet();
            if (choice == 1) {
              [assume pre_1]
              call Public_1(…);
            } else if (choice == 2) {
              [assume pre_2]
              call Public_2(…);
            } …
          }
          Cleanup(…);

  31. Module invariants • Module invariants • Invariant about all objects of a given type • Invariants on global variables • Preserved by the public functions • Low overhead • On “steady state” and therefore succinct • Only needed to be written at module level

  32. Intra-module inference • Given module M, interface specs, property and module invariants • Infer annotations on internal procedures and loops • Use annotations to verify property and module invariant • Challenges • Module invariants are temporarily broken • Inference has to be scalable

  33. Module invariant broken
      requires (TypeInvDO)
      ensures (TypeInvDO)
      void publicFoo () {
        PDEV_OBJ do = NewDEV_OBJ();
        privateBar(do);
      }

      requires (TypeInvDOExcept(do))   // requires (TypeInvDO) does not hold here: do->DevExt->Self is not yet set
      ensures (TypeInvDO)
      void privateBar (PDEV_OBJ do) {
        do->DevExt->Self = do;
      }

      [Diagram: a DEV_OBJ points to a DEV_EXT through DevExt; the Self pointer back to the DEV_OBJ is the link being established]
      #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x
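
The temporarily broken invariant can be observed at run time with a small C sketch. Everything here is a simplified model under stated assumptions: the set MyDevObj is a fixed array, the parameter is named `d` because `do` is a C keyword, `publicFoo` receives caller-provided storage instead of calling an allocator, and the contracts are checked with `assert` rather than proved.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DEV_EXT DEV_EXT;
typedef struct DEV_OBJ { DEV_EXT *DevExt; } DEV_OBJ;
struct DEV_EXT { DEV_OBJ *Self; };

#define MAX_DEV 4
static DEV_OBJ *MyDevObj[MAX_DEV];   /* the set MyDevObj, as a fixed array */
static size_t   n_dev;

/* TypeInvDOExcept(skip): forall x in MyDevObj. x == skip || x->DevExt->Self == x.
   Passing skip == NULL gives the plain module invariant TypeInvDO. */
int TypeInvDOExcept(const DEV_OBJ *skip) {
    for (size_t i = 0; i < n_dev; i++) {
        const DEV_OBJ *x = MyDevObj[i];
        if (x != skip && x->DevExt->Self != x) return 0;
    }
    return 1;
}

/* requires TypeInvDOExcept(d), ensures TypeInvDO */
void privateBar(DEV_OBJ *d) {
    assert(TypeInvDOExcept(d));
    d->DevExt->Self = d;
    assert(TypeInvDOExcept(NULL));
}

/* requires TypeInvDO, ensures TypeInvDO */
void publicFoo(DEV_OBJ *fresh, DEV_EXT *ext) {
    assert(TypeInvDOExcept(NULL));
    assert(n_dev < MAX_DEV);
    ext->Self = NULL;
    fresh->DevExt = ext;
    MyDevObj[n_dev++] = fresh;   /* invariant is broken for fresh from here... */
    privateBar(fresh);           /* ...until privateBar restores it */
    assert(TypeInvDOExcept(NULL));
}
```

Between the insertion into `MyDevObj` and the call to `privateBar`, only the weaker `TypeInvDOExcept(fresh)` holds, which is why the internal procedure needs a different precondition from the public one.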

  34. Houdini algorithm (Flanagan-Leino ’01)
      • Problem statement
        • Given a set of procedures P1, …, Pn
        • A set C of candidate annotations for each procedure
        • Returns a subset of the candidate annotations such that each procedure satisfies its annotations
        • Also known as “monomial predicate abstraction”
      • Algorithm
        • Computes a greatest fixed point starting from all annotations
        • Removes annotations that are violated
        • Requires a quadratic (n * |C|) number of theorem prover calls
        • Uses a modular checker
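
The fixed-point loop described on this slide can be sketched in C on a toy problem. The modular checker is replaced by a hypothetical table-driven oracle: candidate `i` belongs to procedure `owner[i]` and is provable when `holds_locally[i]` is set and every candidate in the bitmask `needs[i]` is still assumed. The tables, the bitmask encoding and the simple round-robin schedule are assumptions for illustration, not details from Flanagan and Leino's paper.

```c
#include <assert.h>

#define NPROC 3
#define NCAND 5

/* Hypothetical stand-in for the modular checker (see the text above). */
static const int      owner[NCAND]         = {0, 0, 1, 1, 2};
static const int      holds_locally[NCAND] = {1, 0, 1, 1, 1};
static const unsigned needs[NCAND]         = {0u, 0u, 1u << 1, 1u << 0, 1u << 2};

static int prover_calls;   /* counts simulated theorem prover calls */

/* Returns the candidates of procedure p that are refuted when the
   annotations in `assumed` are taken as given. */
static unsigned check_procedure(int p, unsigned assumed) {
    unsigned refuted = 0;
    prover_calls++;
    for (int i = 0; i < NCAND; i++) {
        if (owner[i] != p || !(assumed & (1u << i))) continue;
        if (!holds_locally[i] || (needs[i] & ~assumed))
            refuted |= 1u << i;
    }
    return refuted;
}

/* Starts from all candidates and removes refuted ones until nothing
   changes; the result is the largest mutually consistent subset. */
unsigned houdini(void) {
    unsigned assumed = (1u << NCAND) - 1;
    int changed = 1;
    prover_calls = 0;
    while (changed) {
        changed = 0;
        for (int p = 0; p < NPROC; p++) {
            unsigned refuted = check_procedure(p, assumed);
            if (refuted) { assumed &= ~refuted; changed = 1; }
        }
    }
    return assumed;
}
```

In this toy instance candidate 1 fails on its own, which removes candidate 2 (it depends on 1) and then candidate 4 (it depends on 2), so candidates 0 and 3 survive. Every round except the last removes at least one candidate, so the round-robin loop makes at most `NPROC * (NCAND + 1)` checker calls.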

  35. Candidate assertions • Candidate assertions • Type-states in module invariants • Over parameters, globals, locals and their fields • Module invariant exceptions (next slide) • Conditional annotations • Disjunction of above annotations

  36. Module invariant exceptions
      TypeInvDOExcept({do, de->Self}, requires)
      TypeInvDOExcept({do, de->Self}, ensures)
      void privateBar (PDEV_OBJ do, PDEV_EXT de) {
        …
        do->DevExt->Self = do;
      }
      Exceptions come from parameters, return, globals, fields
      #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x
      #define TypeInvDOExcept({a,b}, ANNOT) \
        ANNOT(∀x ∈ MyDevObj. x = a ∨ x = b ∨ x->DevExt->Self = x) \
        ANNOT(a->DevExt->Self = a) \
        ANNOT(b->DevExt->Self = b)

  37. Observations • Able to synthesize most intermediate invariants • “close” to the module invariant (simple) • readable • Invariants contain quantifiers, Boolean structure • Checking all Boolean combinations expensive (from NP-complete → PSPACE-complete [CADE’09]) • Retains scalability of the Houdini inference

  38. Experiments • Benchmarks • 4 device drivers (~7KLOC each), contains lists, arrays • #Internal methods: ~30 • #loops: ~20 • Properties • double-free, lock-usage • User provides module invariant • Tool infers intermediate invariants and modifies clauses

  39. Results • Verified the properties with 0 false alarms • Module invariant overhead • Number of module invariants ~5-10 • Reused across multiple drivers • Most internal annotations inferred • Approx. 1500 inferred annotations per driver • Fewer than 5 manual annotations per driver • Mostly conditional annotations (e.g. predicated on return value) • Inference time < 5X of the checking time

  40. Contributions • Efficient decision procedure for verifying list-based programs • Verifying and exploiting C type annotations • Annotation inference for large modules

  41. Questions?