Scalable Contract Checking for Systems Software using SMT solvers

Shaz Qadeer
RiSE, Microsoft Research
Joint work with Jeremy Condit and Shuvendu Lahiri
http://research.microsoft.com/en-us/projects/havoc/

Context: Scalable module verification
Target: OS components (kernel, drivers, file systems)
• ~100 KLOC with >1000 procedures
Module
• A set of public/entry procedures
• A set of private/internal procedures
Specs
• Interface specification
  • Specs for public methods
  • Specs for external modules
• Property assertion
Harness
    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);
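
The harness above can also be read as an ordinary driver loop once the nondeterministic choices are fixed. The following is a minimal sketch under stated assumptions: the toy module (a single lock), the `pre_1`/`pre_2` predicates, the `violations` counter, and `run_harness` are all hypothetical and not from the talk. A verifier considers every choice sequence symbolically; this sketch only replays one sequence, and it approximates `assume` by skipping a call whose precondition does not hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy module: a single lock. Not from the talk. */
static bool locked;
static int  violations;   /* counts failed property assertions */

static void Initialize(void) { locked = false; violations = 0; }

/* pre_1: the lock is free. Property: never acquire a held lock. */
static bool pre_1(void) { return !locked; }
static void Public_1(void) { if (locked) violations++; locked = true; }

/* pre_2: the lock is held. Property: never release a free lock. */
static bool pre_2(void) { return locked; }
static void Public_2(void) { if (!locked) violations++; locked = false; }

static void Cleanup(void) { locked = false; }

/* Replays the harness on one resolved sequence of choices.
 * A call whose precondition fails is skipped, which approximates "assume":
 * executions that violate the assumption are not considered.
 * Returns the number of property violations observed. */
static int run_harness(const int *choices, size_t n) {
    Initialize();
    for (size_t i = 0; i < n; i++) {
        if (choices[i] == 1) {
            if (!pre_1()) continue;   /* [assume pre_1] */
            Public_1();
        } else if (choices[i] == 2) {
            if (!pre_2()) continue;   /* [assume pre_2] */
            Public_2();
        }
    }
    int result = violations;
    Cleanup();
    return result;
}
```

Calling `run_harness` with the sequence 1, 1, 2, 2, 1 reports no violations for this toy module, because the assumed preconditions filter out the double acquire and the double release.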

Desirable goals
• Find bugs
  • Violations of property assertions
  • Low false alarms
• Use contracts
  • Modular checking for scalability
  • Readable contracts are formal documentation
• Reduce testing cost by providing high assurance in the verifier
  • Formal documentation of assumptions
  • Simple meta-theory for proofs

Existing methods on these examples
Large difference between theory and practice
• Imprecise
  • Modeling of lists/arrays
• Unsound
  • Modeling of lists/arrays
  • Aliasing, pointer arithmetic
  • Restricted harness
• Complex “proof” calculus
  • Combination of analyses
Harness
    Initialize(..);
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);

Full functional correctness is not a goal
Neither is minimizing the trusted computing base

Proof method: Floyd-Hoare triple
• Floyd-Hoare triple {P} S {Q}
  • P, Q: predicates/properties
  • S: a program
• From a state satisfying P, if S executes,
  • no assertion in S fails, and
  • terminating executions end in a state satisfying Q

Program verification → Formula
{ b.f = 5 } a.f = 5 { a.f + b.f = 10 }
is valid iff
Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10
is valid
• theory of equality: =
• theory of arithmetic: 5, 10, +
• theory of arrays: Select, Store
• [Nelson & Oppen ’79]
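
The formula on this slide can be sanity-checked by brute force over a small finite universe. The sketch below is illustrative only: the universe sizes `ADDRS` and `VALS` and the function `vc_holds_on_small_universe` are assumptions made for the example, and an exhaustive check over three objects and eight values is not a proof. An SMT solver establishes the same implication for all field maps and all objects.

```c
#include <assert.h>
#include <stdbool.h>

enum { ADDRS = 3, VALS = 8 };   /* small universe, chosen for illustration */

/* Select(f, a): read field map f at object a. */
static int Select(const int f[ADDRS], int a) { return f[a]; }

/* Store(f, a, v): writes into out a copy of f updated at a. */
static void Store(const int f[ADDRS], int a, int v, int out[ADDRS]) {
    for (int i = 0; i < ADDRS; i++) out[i] = f[i];
    out[a] = v;
}

/* Checks
 *   Select(f1,b) = 5 /\ f2 = Store(f1,a,5) ==> Select(f2,a) + Select(f2,b) = 10
 * for every f1, a, b in the small universe.
 * Returns true if no counterexample exists there. */
static bool vc_holds_on_small_universe(void) {
    int f1[ADDRS], f2[ADDRS];
    int total = 1;
    for (int i = 0; i < ADDRS; i++) total *= VALS;   /* number of field maps */

    for (int code = 0; code < total; code++) {
        int c = code;   /* decode code into one field map, base VALS */
        for (int i = 0; i < ADDRS; i++) { f1[i] = c % VALS; c /= VALS; }

        for (int a = 0; a < ADDRS; a++) {
            for (int b = 0; b < ADDRS; b++) {
                if (Select(f1, b) != 5) continue;   /* precondition fails */
                Store(f1, a, 5, f2);
                if (Select(f2, a) + Select(f2, b) != 10) return false;
            }
        }
    }
    return true;
}
```

The check covers the aliased case a = b as well as the distinct case, which is exactly the case split the array theory performs on Select over Store.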

Satisfiability Modulo Theories (SMT)
• Boolean satisfiability solving + theory reasoning
• Ground theories
  • Equality, arithmetic, Select/Store
• NP-complete logics
• Powerful methods to combine decision procedures for theories
  • [Nelson & Oppen ’79]
• Phenomenal progress in the past few years
  • Yices, Z3, MathSAT, …

Simple type-state property
• Allocation type-state of DEV_OBJ
  • Device Objects (DEV_OBJ) allocated and freed
• Property to check for a module
  • IoDeleteDevice() only called on elements in MyDevObj
[Diagram: type-state transitions between ~MyDevObj and MyDevObj, labeled IoCreateDevice() and IoDeleteDevice()]

Simple property → simple invariants
    typedef struct _DEV_OBJ { DEV_EXT *DevExt; … } DEV_OBJ;
    typedef struct _DEV_EXT { DEV_OBJ *Self; … } DEV_EXT;
    requires (do ∈ MyDevObj)
    NT_STATUS PnP(DEV_OBJ do, IRP *pirp) {
        PDEV_EXT data = do->DevExt;
        ….
        switch (pirp->MajorFn) {
        case IRP_MN_REMOVE_DEVICE:
            IoDeleteDevice(data->Self);
            …
        }
    }
[Diagram: do points to a DEV_OBJ; its DevExt field points to a DEV_EXT, whose Self field points back to the DEV_OBJ]
• ∀x ∈ MyDevObj. x->DevExt->Self = x

Simple property → simple invariants
    NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
            RemoveEntryList(iter);
            iter = iter->Next;
            IoDeleteDevice(iter->Self);
        }
        ….
    }
[Diagram: hd->First points to a list of DEV_EXT cells linked by Next; each DEV_EXT has a Self edge to a DEV_OBJ, and each DEV_OBJ has a DevExt edge back]
• ∀x ∈ DevExt. x->Self ∈ MyDevObj
• ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
• ∀x ∈ DevExt. x->Self->DevExt = x

Limitations of SMT solvers
• No support for precise reasoning with reachability predicate
  • Incompleteness in Floyd-Hoare proofs for straight-line code
• Brittle support for quantifiers
  • Complexity: NP-complete (ground) → undecidable
  • Leads to unpredictable behavior of verifiers
    • Proof times, proof success rate

Limitations of SMT solvers
• Answer the query {P} S {Q} for loop-free and call-free programs
• To handle loops and procedures, contracts are needed
  • Loop invariants
  • Pre/post-conditions
• Infeasible to manually supply internal contracts for large modules

Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules

Reachability predicate: Btwn_f
[Diagram: a chain of cells from x to y linked by next, with further edges labeled f and g; the cells on the chain form Btwn(next, x, y)]
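
On a concrete heap, membership in Btwn(next, x, y) can be computed by a bounded walk. The sketch below is an illustration under assumptions, not the decision procedure from the talk: it models the heap as an array of successor indices, uses a hypothetical `NIL` index for null, and reads Btwn as inclusive of both endpoints, which is the reading suggested by the `\ {null}` that appears in the quantified properties elsewhere in these slides. If y is not reachable from x, the set is taken to be empty.

```c
#include <assert.h>
#include <stdbool.h>

#define NIL (-1)   /* stands in for NULL in this array-based heap */

/* Returns true iff z is in Btwn(next, x, y), read here as: y is reachable
 * from x by following next, and z lies on that path (x and y included).
 * next[i] is the successor of cell i, or NIL; n is the number of cells. */
static bool in_btwn(const int *next, int n, int x, int y, int z) {
    bool seen_z = false;
    int cur = x;
    /* At most n cells plus NIL can be visited before repeating,
     * so this bound makes the walk terminate on cyclic lists too. */
    for (int steps = 0; steps <= n; steps++) {
        if (cur == z) seen_z = true;
        if (cur == y) return seen_z;
        if (cur == NIL) return false;   /* fell off the list before y */
        cur = next[cur];
    }
    return false;   /* y is not reachable from x */
}
```

For the acyclic list 0 → 1 → 2 → NIL, cell 1 is in Btwn(next, 0, NIL) but cell 0 is not in Btwn(next, 1, NIL). For the cycle 0 → 1 → 2 → 0, Btwn(next, next(0), 0) contains every cell, matching the cyclic-list idiom.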

Expressive logic
• Express properties of collections
    ∀x ∈ Btwn(next, next(hd), hd). state(x) = LOCKED   // cyclic
• Arithmetic reasoning on data (e.g. sortedness)
    ∀x ∈ Btwn(next, hd, null) \ {null}. ∀y ∈ Btwn(next, x, null) \ {null}. d(x) ≤ d(y)

Efficient decision procedure
• Decides the validity of {P} S {Q}
  • Worst-case exponential time but works well in practice
• Decision problem is NP-complete
  • Cannot expect any better with propositional logic
  • Retains the complexity of current SMT logics
• Implemented in the Z3 SMT solver
  • Leverages powerful ground-theory reasoning (arithmetic, arrays, uninterpreted functions, …)

Contributions
• Efficient decision procedure for verifying list-based programs
• Verifying and exploiting C type annotations
• Annotation inference for large modules

C language
C types
• Scalars (int, long, char, short)
• Pointers (int*, struct T*, ..)
• Nested structs and unions
• Arrays (struct T a[10];)
• Function pointers
• void *
Difficult to establish type safety in the presence of pointer arithmetic and casts
• Type safety ⇒ (spatial) memory safety
• Important default property to check
Lack of types hurts property checking
• Difficult to disambiguate heap pointers
• Difficult to write concise type invariants

Example: Type Checking
[Diagram: two IRP records linked through their ListEntry fields (Flink, Blink); pointers p and q]
    q = CONTAINING_RECORD(p, IRP, ListEntry)
      = (IRP*)((char*)p - (char*)&((IRP*)0)->ListEntry)
Type checker: does variable q have type IRP*?
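
The pointer arithmetic behind CONTAINING_RECORD can be tried out in portable C. This is a minimal sketch assuming simplified stand-ins for the Windows `LIST_ENTRY` and `IRP` types (the real definitions have many more fields) and a version of the macro written with the standard `offsetof`; the helper `irp_of_entry` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the Windows types named on the slide. */
typedef struct LIST_ENTRY {
    struct LIST_ENTRY *Flink;
    struct LIST_ENTRY *Blink;
} LIST_ENTRY;

typedef struct IRP {
    int        Data1;
    LIST_ENTRY ListEntry;
    int        Data2;
} IRP;

/* Portable spelling of the macro: step back from the field address
 * by the field's byte offset within the enclosing record. */
#define CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Recovers the enclosing IRP from a pointer to its embedded ListEntry. */
static IRP *irp_of_entry(LIST_ENTRY *p) {
    return CONTAINING_RECORD(p, IRP, ListEntry);
}
```

The C type system accepts this cast for any `p`, which is why a separate type checker is needed: the result is a valid `IRP*` only if `p` really points at the `ListEntry` field of an `IRP`.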

Example: Property Checking
[Diagram: two IRP records, each with fields Data1, ListEntry (Flink, Blink), and Data2; pointers q and r]
    ...
    q->Data2 = 42;
Property checker: is r->Data1 unchanged?

Example: Property Checking
[Diagram: the records pointed to by q and r drawn overlapping, with one field labeled Data2 / Data1]
For all we know, Data1 and Data2 could be aliased!

Our Approach
• Implement a type checker in HAVOC
  • Provide formal semantics for C and its types
• Use types to improve the property checker
  • Provide Java-style field disambiguation
• Fully automated using the Z3 SMT solver

Formalizing Type Safety
A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type.
    Mem : addr -> value
    Type : addr -> type
    HasType : value x type -> bool
    for all a in addr, HasType(Mem(a), Type(a))
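
The invariant on this slide can be made concrete with a tiny executable model. The sketch below is an assumption-laden illustration, not HAVOC's encoding: it uses a heap of eight word-sized addresses with address 0 reserved as NULL, a two-element type language (`T_INT`, `T_PTR_INT`), and a hypothetical definition of `HasType` in which a value is an `int*` if it is NULL or the address of a cell declared as `int`.

```c
#include <assert.h>
#include <stdbool.h>

enum { HEAP = 8 };   /* addresses 0..HEAP-1; address 0 plays the role of NULL */

typedef enum { T_INT, T_PTR_INT } type_t;   /* deliberately tiny type language */

/* HasType(v, t): every value is an int; a value is an int* if it is NULL
 * or an in-range address whose declared type is int. */
static bool HasType(int v, type_t t, const type_t Type[HEAP]) {
    if (t == T_INT) return true;
    return v == 0 || (v > 0 && v < HEAP && Type[v] == T_INT);
}

/* The invariant: for all a in addr, HasType(Mem(a), Type(a)).
 * Address 0 is skipped because it is reserved for NULL. */
static bool type_safe(const int Mem[HEAP], const type_t Type[HEAP]) {
    for (int a = 1; a < HEAP; a++) {
        if (!HasType(Mem[a], Type[a], Type)) return false;
    }
    return true;
}
```

With cell 2 declared `int*` and holding the address 1 of an `int` cell, the heap is type safe; storing the address 2 into that same cell breaks the invariant, because the pointee is then declared `int*` rather than `int`.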

Example
    #define ENCL(x) CONTAINING_RECORD(x, record, node)
    requires( HasType(ENCL(p), record*) && ENCL(p) != NULL )
    void init_record(list *p) {
        record *r = CONTAINING_RECORD(p, record, node);
        r->data2 = 42;
    }
    requires( forall(q, Btwn(next, p, NULL),
                     q != NULL ==> HasType(ENCL(q), record*) && ENCL(q) != NULL) )
    void init_all_records(list *p) {
        while (p != NULL) {
            init_record(p);
            p = p->next;
        }
    }

Decision Procedure
• Translation results in verification conditions that refer to Mem, Type, and HasType
• Can be encoded into an NP-complete logic
  • No worse than SAT solving
• Provide decision procedure using an SMT solver

Experiments
• Implementation supports full C language
• Supports polymorphism
• Supports user-defined, dependent types
• Fancier type invariants => slower checking
  • Pay only for what you use!
• Annotated and checked four Windows drivers
  • Sample drivers provided with Windows DDK
  • About 2.3 KLOC total, with 225 annotations
  • Checking time: ~1 minute each

Simple property → simple invariants
    NT_STATUS Unload(…) {
        ….
        iter = hd->First;
        while (iter != null) {
            RemoveEntryList(iter);
            iter = iter->Next;
            IoDeleteDevice(iter->Self);
        }
        ….
    }
[Diagram: hd->First points to a list of DEV_EXT cells linked by Next; each DEV_EXT has a Self edge to a DEV_OBJ, and each DEV_OBJ has a DevExt edge back]
• ∀x ∈ DevExt. x->Self ∈ MyDevObj
• ∀x ∈ Btwn(Next, hd->First, NULL). x ∈ DevExt
• ∀x ∈ DevExt. x->Self->DevExt = x

Need to simplify the problem
Module
• A set of public/entry procedures
• A set of private/internal procedures
Specs
• Interface specification
• Property assertion
Require the user to provide a module invariant
Harness
    Initialize(..);
    [loop_inv moduleInv]
    while (*) {
        choice = nondet();
        if (choice == 1) {
            [assume pre_1]
            call Public_1(…);
        } else if (choice == 2) {
            [assume pre_2]
            call Public_2(…);
        } …
    }
    Cleanup(…);

Module invariants
• Module invariants
  • Invariants about all objects of a given type
  • Invariants on global variables
  • Preserved by the public functions
• Low overhead
  • Hold in the “steady state” and are therefore succinct
  • Only need to be written at module level

Intra-module inference
• Given module M, interface specs, property, and module invariants
  • Infer annotations on internal procedures and loops
  • Use annotations to verify property and module invariant
• Challenges
  • Module invariants are temporarily broken
  • Inference has to be scalable

Module invariant broken
    requires (TypeInvDO)
    ensures (TypeInvDO)
    void publicFoo () {
        PDEV_OBJ do = NewDEV_OBJ();
        privateBar(do);
    }
    requires (TypeInvDOExcept(do))
    requires (TypeInvDO)
    ensures (TypeInvDO)
    void privateBar (PDEV_OBJ do) {
        do->DevExt->Self = do;
    }
    #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x
[Diagram: a DEV_OBJ with a DevExt edge to a DEV_EXT; the Self edge back is marked with an x]

Houdini algorithm (Flanagan-Leino 01)
• Problem statement
  • Given a set of procedures P1, …, Pn
  • A set C of candidate annotations for each procedure
  • Returns a subset of the candidate annotations such that each procedure satisfies its annotations
  • Also known as “monomial predicate abstraction”
• Algorithm
  • Computes a greatest fixed point, starting from all annotations
  • Removes annotations that are violated
  • Requires a quadratic (n * |C|) number of theorem prover calls
  • Uses a modular checker
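
The greatest-fixed-point loop described above can be sketched in a few lines. This is a toy under stated assumptions, not the HAVOC implementation: the `candidate` struct, its `intrinsically_false` flag and `needs` bitmask, and the `check` function are hypothetical stand-ins for a real modular checker, which would discharge one verification condition per procedure with a theorem prover. Only the outer loop (start from all candidates, drop refuted ones, repeat until stable) reflects the algorithm on the slide.

```c
#include <assert.h>
#include <stdbool.h>

enum { NCAND = 4 };   /* number of candidate annotations in this toy */

/* Toy model of a candidate annotation. */
typedef struct {
    bool     intrinsically_false;  /* refuted no matter what is assumed */
    unsigned needs;                /* bitmask of candidates its proof relies on */
} candidate;

/* Stand-in for one modular-checker query: does this candidate hold
 * when the candidates in 'assumed' may be taken for granted? */
static bool check(const candidate *c, unsigned assumed) {
    return !c->intrinsically_false && (c->needs & ~assumed) == 0;
}

/* Houdini: start from all candidates and drop refuted ones until stable.
 * Returns the surviving set as a bitmask. */
static unsigned houdini(const candidate cands[NCAND]) {
    unsigned assumed = (1u << NCAND) - 1;   /* all candidates */
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < NCAND; i++) {
            bool present = (assumed & (1u << i)) != 0;
            if (present && !check(&cands[i], assumed)) {
                assumed &= ~(1u << i);   /* refuted: remove and iterate again */
                changed = true;
            }
        }
    }
    return assumed;
}
```

Because candidates are only ever removed, the loop terminates after at most NCAND removals, and the result is the largest self-consistent subset: a candidate whose proof depended on a refuted one is removed in turn.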

Candidate assertions
• Type-states in module invariants
  • Over parameters, globals, locals and their fields
• Module invariant exceptions (next slide)
• Conditional annotations
  • Disjunction of above annotations

Module invariant exceptions
    TypeInvDOExcept({do,de->Self},requires)
    TypeInvDOExcept({do,de->Self},ensures)
    void privateBar (PDEV_OBJ do, PDEV_EXT de) {
        …
        do->DevExt->Self = do;
    }
Exceptions come from parameters, return, globals, fields
    #define TypeInvDO \
        ∀x ∈ MyDevObj. x->DevExt->Self = x
    #define TypeInvDOExcept({a,b},ANNOT) \
        ANNOT(∀x ∈ MyDevObj. x = a ∨ x = b ∨ x->DevExt->Self = x) \
        ANNOT(a->DevExt->Self = a) \
        ANNOT(b->DevExt->Self = b)

Observations
• Able to synthesize most intermediate invariants
  • “close” to the module invariant (simple)
  • readable
• Invariants contain quantifiers, Boolean structure
  • Checking all Boolean combinations is expensive (from NP-complete → PSPACE-complete [CADE’09])
• Retains scalability of the Houdini inference

Experiments
• Benchmarks
  • 4 device drivers (~7 KLOC each), containing lists and arrays
  • #Internal methods: ~30
  • #Loops: ~20
• Properties
  • double-free, lock-usage
• User provides module invariant
• Tool infers intermediate invariants and modifies clauses

Results
• Verified the properties with 0 false alarms
• Module invariant overhead
  • Number of module invariants: ~5-10
  • Reused across multiple drivers
• Most internal annotations inferred
  • Approx. 1500 inferred annotations per driver
  • Fewer than 5 manual annotations per driver
    • Mostly conditional annotations (e.g. predicated on return value)
• Inference time < 5X the checking time

Questions?
