
SMT based predictable analysis of systems code

SMT based predictable analysis of systems code. Shuvendu Lahiri, Microsoft Research, Redmond. Joint work with: S. Qadeer (MSR), J. Condit, B. Hackett, Z. Rakamaric, T. Wies, J. Voung, J. Galeotti. Problem: modular property checking of C modules.


Presentation Transcript


  1. SMT based predictable analysis of systems code. Shuvendu Lahiri, Microsoft Research, Redmond. Joint work with: S. Qadeer (MSR), J. Condit, B. Hackett, Z. Rakamaric, T. Wies, J. Voung, J. Galeotti

  2. Problem Modular property checking of C modules • Device drivers, file systems, kernel components,… • Double-free, lock usage, memory safety, user-provided assertions • Goal: Predictable analysis using SMT solvers • Efficiently decidable logics

  3. HAVOC • Property checker for C programs • Active [’06-’09] • Found 100+ errors in various kernel components

  4. HAVOC modular checker [Figure: tool pipeline] C program + Annotations → C → Boogie translation (Memory model) → Boogie program → Boogie VC gen → SMT formula → SMT Solver (Z3; Decision Procedures for types, lists, arrays) → Verified / Warning

  5. Challenges imposed for analyzing C Additional challenges (over Java/C#) • Lack of type safety • Presence of low-level data structures • Explicit memory management (free) • Bit-wise operations • ……

  6. Types

  7. Example: Type Checking [Figure: two IRP structures, each embedding a ListEntry with Flink and Blink fields; p points to a ListEntry] typedef struct _LIST_ENTRY { struct _LIST_ENTRY *Flink, *Blink; } LIST_ENTRY, *PLIST_ENTRY; typedef struct _IRP { … LIST_ENTRY ListEntry; … } IRP, *PIRP;

  8. Example: Type Checking [Figure: the same two IRP structures; p points to a ListEntry, q to an IRP] q = CONTAINING_RECORD(p, IRP, ListEntry) = (IRP*)((char*)p - (ULONG_PTR)&((IRP*)0)->ListEntry) Type Checker: Does variable q have type IRP*?
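
The pointer arithmetic behind CONTAINING_RECORD can be sketched in portable C. This is a minimal illustration and not the Windows macro itself: `CONTAINING_RECORD_SKETCH` and `irp_from_entry` are hypothetical names, the IRP layout is reduced to the Data1, ListEntry, and Data2 fields drawn on the slides, and the standard `offsetof` stands in for the null-pointer offset trick.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the Windows types named on the slide. */
typedef struct _LIST_ENTRY {
    struct _LIST_ENTRY *Flink, *Blink;
} LIST_ENTRY;

typedef struct _IRP {
    int Data1;             /* reduced layout, for illustration only */
    LIST_ENTRY ListEntry;
    int Data2;
} IRP;

/* Hypothetical portable variant of CONTAINING_RECORD: step back from the
 * address of an embedded field to the start of the enclosing record. */
#define CONTAINING_RECORD_SKETCH(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* Recover the enclosing IRP from a pointer to its ListEntry field.
 * The result is meaningful only if p really points at the ListEntry of
 * an IRP; that is the fact the type checker has to establish. */
static IRP *irp_from_entry(LIST_ENTRY *p)
{
    assert(p != NULL);
    return CONTAINING_RECORD_SKETCH(p, IRP, ListEntry);
}
```

Nothing in the C type system stops a caller from passing a LIST_ENTRY that is not embedded in an IRP, in which case the returned pointer has type IRP* but does not point to one.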

  9. Example: Property Checking [Figure: two IRP structures with fields Data1, ListEntry (Flink, Blink), and Data2; q and r each point to one] ... q->Data2 = 42; Property Checker: Is r->Data1 unchanged?

  10. Example: Property Checking [Figure: q and r point to overlapping structures; one location is labeled both Data2 and Data1] For all we know, Data1 and Data2 could be aliased!

  11. Types in C programs • Types in C programs cannot be trusted • Unsafe type casts, pointer arithmetic • Typical type checking in C compilers cannot ensure memory safety • Lack of types hurts property checking • Disambiguation

  12. Lists

  13. Simple type-state property • Allocation type-state of DEV_OBJ • Device Objects (DEV_OBJ) allocated and freed • Property to check for a module • IoDeleteDevice() only called on MyDevObj [Figure: type-state diagram with states ~MyDevObj and MyDevObj and transitions IoCreateDevice() and IoDeleteDevice()]

  14. Simple property ⇏ simple invariants NT_STATUS Unload(…){ … iter = hd->First; while(iter != null) { RemoveEntryList(iter); iter = iter->Next; IoDeleteDevice(iter->Self); } … } [Figure: hd->First heads a list of DEV_EXT nodes linked by Next; DEV_EXT nodes have Self pointers to DEV_OBJ objects, which have DevExt pointers] Pointers from the list point to distinct objects

  15. Lists • Prevalent in most systems code • Manipulated by explicit pointer operations • Updates to next fields

  16. This talk • Focus on two of these challenges • Lack of type-safety • Presence of low-level data structures • Solution • New efficient SMT theories for the above problems

  17. Overview • Motivation • Background • Exploiting types [POPL’09] • Logic for lists [POPL’08] • Application [CAV’09]

  18. Program Correctness: Floyd-Hoare Triple • Floyd-Hoare triple {P} S {Q} P, Q : predicates/property S : a program • From a state satisfying P, if S executes, • No assertion in S fails, and • Terminating executions end up in a state satisfying Q

  19. Program verification → Formula { b.f = 5 } a.f = 5 { a.f + b.f = 10 } is valid iff Select(f1,b) = 5 ∧ f2 = Store(f1,a,5) ⇒ Select(f2,a) + Select(f2,b) = 10 is valid • theory of equality: f, = • theory of arithmetic: 5, 10, + • theory of arrays: Select, Store • [Nelson & Oppen ’79]
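
The translation from the triple to a Select/Store formula can be made concrete with a small brute-force check. The sketch below is only an illustration under stated assumptions: it models the field f as a map over a three-object universe with values drawn from {4, 5, 6}, and the names `FieldMap`, `map_select`, `map_store`, `vc_holds`, and `count_counterexamples` are hypothetical. An SMT solver establishes validity symbolically for all maps; enumerating a finite universe does not prove that.

```c
#include <assert.h>
#include <stdbool.h>

#define N_OBJS 3   /* tiny object universe, chosen for illustration */

typedef struct { int at[N_OBJS]; } FieldMap;   /* models the field f */

/* Select(f, x): the value of field f at object x. */
static int map_select(FieldMap f, int x)
{
    assert(x >= 0 && x < N_OBJS);
    return f.at[x];
}

/* Store(f, x, v): a copy of f that maps x to v and is otherwise unchanged. */
static FieldMap map_store(FieldMap f, int x, int v)
{
    assert(x >= 0 && x < N_OBJS);
    f.at[x] = v;
    return f;
}

/* The verification condition for one concrete choice of f1, a, b. */
static bool vc_holds(FieldMap f1, int a, int b)
{
    if (map_select(f1, b) != 5)
        return true;                      /* precondition false: holds vacuously */
    FieldMap f2 = map_store(f1, a, 5);    /* effect of a.f = 5 */
    return map_select(f2, a) + map_select(f2, b) == 10;
}

/* Try every map with values in {4,5,6} and every pair (a, b);
 * return how many choices falsify the verification condition. */
static int count_counterexamples(void)
{
    int bad = 0;
    for (int code = 0; code < 27; code++) {        /* 3^3 maps */
        FieldMap f1;
        int c = code;
        for (int i = 0; i < N_OBJS; i++) { f1.at[i] = 4 + c % 3; c /= 3; }
        for (int a = 0; a < N_OBJS; a++)
            for (int b = 0; b < N_OBJS; b++)
                if (!vc_holds(f1, a, b))
                    bad++;
    }
    return bad;
}
```

The case a == b is the interesting one: the store overwrites b.f as well, and the formula still holds because the stored value is 5.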

  20. Satisfiability-Modulo-Theory (SMT) • Boolean satisfiability solving + theory reasoning • Ground theories • Equality, arithmetic, arrays, bit-vectors, … • Powerful methods to combine decision procedures for theories • [Nelson & Oppen ’79] • Phenomenal progress in the past few years • Z3, Mathsat, Yices, … • Works best for NP-complete theories

  21. Overview • Motivation • Background • Exploiting types • Logic for lists • Case study

  22. Memory model for C • Each pointer is an integer • Heap as a map // Mutable Mem: int → int Alloc: int → {UNALLOCATED, ALLOCATED, FREED} // Immutable Base: int → int // base address of each pointer

  23. C → Boogie C: typedef struct { int g[10]; int f;} DATA; DATA *create() { int a; DATA *d = (DATA*) malloc(sizeof(DATA)); init(d->g, 10, &a); d->f = a; d->g[1] = 2; return d; } Boogie: function f_DATA: int -> int; forall u: int:: f_DATA(u) = u + 40; procedure create() returns d:int{ var @a: int; @a := call malloc(4); d := call malloc(44); call init(g_DATA(d),10, @a); Mem[f_DATA(d)] := Mem[@a]; Mem[g_DATA(d) + 1*4]:=2; free(@a); return; }

  24. Missing part: Types? • Types in C programs can’t be trusted • Lack of types hurts property checking

  25. Our Approach • [POPL’09] • Type checking → assertion checking • Provide formal semantics for C and its types • Use types to improve the property checker • Provide Java-style field disambiguation • Provide decision procedures for the assertion checking

  26. Formalizing Type Safety A C program is type safe if the run-time value of every variable and heap location corresponds to its compile-time type. Mem : addr -> value Type : addr -> type HasType : value x type -> bool for all a in addr, HasType(Mem(a), Type(a))

  27. Modeling the Heap Mem : addr -> value • Gives value stored at each heap location • Values are integers Type : addr -> type • Gives declared type for each heap location • Types include Int, Ptr(Int), …

  28. “Match” Predicate Match: addr x type -> bool • Lifts the Type map to multi-word types • Match(a, t) holds iff Type[a … n] matches t C Type → HAVOC Axiom • int: Match(a, Int) <==> Type[a] == Int • int*: Match(a, Ptr(Int)) <==> Type[a] == Ptr(Int) • struct foo { int n; int m; int *p; }: Match(a, Foo) <==> Match(a, Int) && Match(a+1, Int) && Match(a+2, Ptr(Int)) Example: with Type = Int, Int, Ptr(Int), Int, Ptr(Foo) at addresses 99, 100, 101, 102, 103: Match(99, Int), Match(99, Foo), Match(101, Ptr(Int)), ¬Match(101, Foo)

  29. “HasType” Predicate HasType: value x type -> bool • Defines which values belong to each type • HasType(v, t) holds iff v is a value of type t C Type → HAVOC Axiom • int: HasType(v, Int) <==> true • t*: HasType(v, Ptr(t)) <==> v == 0 || (v > 0 && Match(v, t)) Example: with Type = Int, Int, Ptr(Int), Int, Ptr(Foo) at addresses 99, 100, 101, 102, 103: HasType(99, Ptr(Foo)), ¬HasType(101, Ptr(Foo))
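
The Match and HasType axioms can be executed directly on the five-word Type map used in the examples. The following is a minimal sketch, assuming word-indexed addresses as on the slides and a Type map that is only defined for addresses 99 to 103; the identifiers `Ty`, `type_is`, `match`, `has_type`, and `type_safe` are hypothetical and are not HAVOC's encoding, which states these facts as quantified axioms for the SMT solver.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { T_INT, T_PTR_INT, T_PTR_FOO, T_FOO } Ty;

#define BASE_ADDR 99
#define N_WORDS   5

/* Type map from the slides, for addresses 99..103. */
static const Ty type_map[N_WORDS] = { T_INT, T_INT, T_PTR_INT, T_INT, T_PTR_FOO };

/* Type[a] == t for a single word; false outside the modeled range. */
static bool type_is(int a, Ty t)
{
    return a >= BASE_ADDR && a < BASE_ADDR + N_WORDS && type_map[a - BASE_ADDR] == t;
}

/* Match(a, t): the Type map starting at address a is laid out like t. */
static bool match(int a, Ty t)
{
    if (t == T_FOO)   /* struct foo { int n; int m; int *p; } */
        return match(a, T_INT) && match(a + 1, T_INT) && match(a + 2, T_PTR_INT);
    return type_is(a, t);
}

/* HasType(v, t): v is a legal value of the word-sized type t. */
static bool has_type(int v, Ty t)
{
    switch (t) {
    case T_INT:     return true;
    case T_PTR_INT: return v == 0 || (v > 0 && match(v, T_INT));
    case T_PTR_FOO: return v == 0 || (v > 0 && match(v, T_FOO));
    default:        return false;   /* multi-word types are not values here */
    }
}

/* Type safety invariant: for all a, HasType(Mem(a), Type(a)). */
static bool type_safe(const int mem[N_WORDS])
{
    for (int i = 0; i < N_WORDS; i++)
        if (!has_type(mem[i], type_map[i]))
            return false;
    return true;
}
```

For example, a heap holding 102 at address 101 and 99 at address 103 satisfies the invariant, while storing 101 at address 101 violates it because address 101 is typed Ptr(Int), not Int.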

  30. Type Safety Invariant • Part of preconditions, postconditions, loop invariants • Assert at every program point • Add similar assertions for locals (if desired) for all a in addr, HasType(Mem(a), Type(a))

  31. Decision Procedure • Verification conditions refer to Mem, Type, Match, HasType, Type-safety invariant • Decision problem: NP-complete • Provide decision procedure using an SMT solver • Suffices to instantiate the quantifiers in these axioms on a fixed set of terms

  32. Example: Type Checking [Figure: two IRP structures, each embedding a ListEntry with Flink and Blink fields; p points to a ListEntry, q to an IRP] q = CONTAINING_RECORD(p, IRP, ListEntry) = (IRP*)((char*)p - (ULONG_PTR)&((IRP*)0)->ListEntry) Type Checker: Does variable q have type IRP*?

  33. Solution: Add Preconditions #define ENCL(x) CONTAINING_RECORD(x, record, node) requires( HasType(ENCL(p), record*) && ENCL(p)!= NULL ) void init_record(list *p) { record *r = CONTAINING_RECORD(p, record, node); r->data2 = 42; }

  34. Field Safety Invariant • Field safety • Refinement of type safety • Disambiguate two fields of same type • Change • HasType/Match are refined to distinguish different field names of same type

  35. Adding Field Names struct list { list *prev; list *next; } struct record { int data1; list node; int data2; } Match(a, List) <==> Match(a, Ptr(List)) && Match(a+1, Ptr(List)) Match(a, Record) <==> Match(a, int) && Match(a+1, List) && Match(a+3, int) Match(a, Ptr(List)) <==> Type[a] == Ptr(List) HasType(v, Ptr(List))<==> v == 0 || (v > 0 && Match(v, List)) Match(a, int) <==> Type[a] == int HasType(v, int) <==> true same definition as Int … same for Next and Data2 …

  36. Adding Field Names struct list { list *prev; list *next; } struct record { int data1; list node; int data2; } Match(a, List) <==> Match(a, Prev) && Match(a+1, Next) Match(a, Record) <==> Match(a, Data1) && Match(a+1, List) && Match(a+3, Data2) Match(a, Prev) <==> Type[a] == Prev HasType(v, Prev) <==> v == 0 || (v > 0 && Match(v, List)) Match(a, Data1) <==> Type[a] == Data1 HasType(v, Data1) <==> true same definition as Int … same for Next and Data2 …

  37. Experiments • Implementation supports full C language • Supports polymorphism • Supports user-defined, dependent types • Annotated and checked four Windows drivers • Sample drivers provided with Windows DDK

  38. Enables field splitting • Disambiguates writes to fields + faster checking • Can split the heap for “field-safe” programs • One heap map per word-type field and pointer type (almost!) Mem_f : addr → val Mem_g : addr → val Mem_T* : addr → val • Simple example • C code x->f = 1; • Boogie code Mem_f[x + Offset(f)] := 1;
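
The benefit of field splitting can be seen by contrasting a single heap map with one map per field. The sketch below is an assumption-laden illustration: `UnifiedHeap`, `SplitHeap`, the accessor functions, and the word offsets `OFF_F` and `OFF_G` are all hypothetical. With one map, a write to x->f can change y->g whenever the two addresses coincide; with split maps it cannot, which is only a sound model for programs that satisfy the field safety invariant.

```c
#include <assert.h>

#define HEAP_WORDS 16
enum { OFF_F = 0, OFF_G = 1 };   /* hypothetical word offsets of fields f and g */

/* One map for the whole heap: Mem. */
typedef struct { int mem[HEAP_WORDS]; } UnifiedHeap;

/* One map per field: Mem_f and Mem_g. */
typedef struct { int mem_f[HEAP_WORDS]; int mem_g[HEAP_WORDS]; } SplitHeap;

/* x->f = v in the unified model: Mem[x + Offset(f)] := v */
static void unified_write_f(UnifiedHeap *h, int x, int v)
{
    assert(x >= 0 && x + OFF_F < HEAP_WORDS);
    h->mem[x + OFF_F] = v;
}

/* y->g in the unified model: Mem[y + Offset(g)] */
static int unified_read_g(const UnifiedHeap *h, int y)
{
    assert(y >= 0 && y + OFF_G < HEAP_WORDS);
    return h->mem[y + OFF_G];
}

/* x->f = v in the split model: Mem_f[x + Offset(f)] := v */
static void split_write_f(SplitHeap *h, int x, int v)
{
    assert(x >= 0 && x + OFF_F < HEAP_WORDS);
    h->mem_f[x + OFF_F] = v;
}

/* y->g in the split model: Mem_g[y + Offset(g)], untouched by writes to f */
static int split_read_g(const SplitHeap *h, int y)
{
    assert(y >= 0 && y + OFF_G < HEAP_WORDS);
    return h->mem_g[y + OFF_G];
}
```

With x = 4 and y = 3 the addresses x + OFF_F and y + OFF_G are equal, so the unified model lets the write to f overwrite g, while the split model keeps g intact without any reasoning about address arithmetic.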

  39. Why almost? struct A {int a; int b; }; struct B {int c; int d; int e;}; void P(struct B *x){ struct A *y = (struct A*) x; y->a = 1; assert (x->c == 1); } • Field safety assertion will fail • Have to merge {a, c} and {b, d}

  40. Summary • Types as an additional part of the state • Type safety checking → assertion checking • Efficiently decidable (NP) logic • Separation of concerns for property checking • Can exploit field disambiguation for “field-safe” programs

  41. Overview • Motivation • Background • Exploiting types • Logic for lists • Case study

  42. Logic for lists • SMT theory with new predicate symbols

  43. Reachability predicate: Btwn_f [Figure: doubly linked list whose nodes have next, prev, and data fields; x precedes y along next] Btwn_next(x, y) Btwn_prev(y, x)
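
One way to read the reachability predicate is as a set-membership test computed by walking the list. The sketch below rests on an assumed semantics that the slides do not spell out: Btwn_f(x, y) is taken to be the set of nodes on the f-path from x up to and including the first occurrence of y, and to be empty when y is not reachable from x. The names `in_btwn`, `N_NODES`, and `NIL` are hypothetical, nodes are array indices, and node 0 plays the role of null with f(null) = null.

```c
#include <assert.h>
#include <stdbool.h>

#define N_NODES 6
#define NIL 0   /* node 0 models null; the field maps NIL to itself */

/* Returns true iff t is in Btwn_f(x, y) under the assumed semantics.
 * f is one pointer field (next, prev, ...) given as an index array. */
static bool in_btwn(const int f[N_NODES], int x, int y, int t)
{
    assert(x >= 0 && x < N_NODES && y >= 0 && y < N_NODES);
    bool seen_t = false;
    int cur = x;
    /* Any reachable node first appears within N_NODES steps. */
    for (int step = 0; step < N_NODES; step++) {
        if (cur == t) seen_t = true;
        if (cur == y) return seen_t;   /* reached y: t is in the set iff seen */
        cur = f[cur];
    }
    return false;                      /* y is not reachable from x */
}
```

Because the field is a parameter, the same function answers queries over next and over prev, mirroring Btwn_next(x, y) and Btwn_prev(y, x).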

  44. Inverse of a function: f⁻¹ [Figure: the same list; the data fields of x and y both point to w] data⁻¹(w) = {x, y}
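
The inverse image of a field can be sketched as a scan over a finite set of nodes. This is only an illustration: `inverse_image` is a hypothetical helper, nodes are array indices, and the field is an index array, whereas in the logic f⁻¹(t) is an interpreted set that the decision procedure reasons about without enumerating the heap.

```c
#include <assert.h>

/* Collect f^{-1}(w) = { x | f[x] == w } over nodes 0..n-1.
 * Members are written to out (capacity at least n) in increasing order;
 * the return value is the size of the set. */
static int inverse_image(const int f[], int n, int w, int out[])
{
    assert(n >= 0);
    int count = 0;
    for (int x = 0; x < n; x++)
        if (f[x] == w)
            out[count++] = x;
    return count;
}
```

If nodes 1 and 2 both have data equal to 9, then the inverse image of 9 under data is {1, 2}, matching data⁻¹(w) = {x, y} on the slide.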

  45. Expressive logic • Express properties of collections ∀x ∈ Btwn_f(f(hd), hd). state(x) = LOCKED // cyclic • Arithmetic reasoning on data (e.g. sortedness) ∀x ∈ Btwn_f(hd, null) \ {null}. ∀y ∈ Btwn_f(x, null) \ {null}. d(x) ≤ d(y) • Type/object invariants ∀x ∈ Type⁻¹(“__logentry”). logtype(x) > 0  file_name(x) != null

  46. Can express desired invariants NT_STATUS Unload(…){ … iter = hd->First; while(iter != null) { RemoveEntryList(iter); iter = iter->Next; IoDeleteDevice(iter->Self); } … } [Figure: hd->First heads a list of DEV_EXT nodes linked by Next; DEV_EXT nodes have Self pointers to DEV_OBJ objects, which have DevExt pointers] • ∀x ∈ Btwn_Next(hd->First, NULL). x->Self->DevExt = x OR • ∀x ∈ Btwn_Next(hd->First, NULL). Self⁻¹(x->Self) = {&x->Self}

  47. Precise and efficient • [POPL ‘08] • Precision • Given a Floyd-Hoare triple {P} S {Q}, • P/Q are in the assertion logic, and S is a loop-free, call-free code fragment • There is a formula in the assertion logic • Linear in the size of the triple • Valid iff the triple holds • Efficiency • The decision problem is NP-complete

  48. Ground Logic: t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t) G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G Logic: S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2) F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F

  49. Ground decision procedure • Provide a set of 10 rewrite rules for Btwn_f • Sound, complete and terminating • E.g. Transitivity3: from t1 ∈ Btwn_f(t0, t2) and t ∈ Btwn_f(t0, t1), infer t ∈ Btwn_f(t0, t2) and t1 ∈ Btwn_f(t, t2)
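
The soundness of a rule such as Transitivity3 can be sanity-checked by enumerating small heaps. The sketch below is not the decision procedure and proves nothing in general: it assumes that Btwn_f(x, y) denotes the nodes on the f-path from x up to and including the first occurrence of y (empty if y is unreachable), and it tries every function f over four nodes. The names `in_btwn` and `transitivity3_violations` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define N 4   /* number of heap nodes in the enumerated models */

/* t in Btwn_f(x, y) under the assumed path semantics. */
static bool in_btwn(const int f[N], int x, int y, int t)
{
    assert(x >= 0 && x < N && y >= 0 && y < N);
    bool seen_t = false;
    int cur = x;
    for (int step = 0; step < N; step++) {
        if (cur == t) seen_t = true;
        if (cur == y) return seen_t;
        cur = f[cur];
    }
    return false;   /* y is not reachable from x */
}

/* Count instances (f, t0, t1, t2, t) where both premises of Transitivity3
 * hold but at least one of its two conclusions fails. */
static int transitivity3_violations(void)
{
    int bad = 0;
    for (int code = 0; code < 256; code++) {        /* all 4^4 functions f */
        int f[N], c = code;
        for (int i = 0; i < N; i++) { f[i] = c % N; c /= N; }
        for (int t0 = 0; t0 < N; t0++)
        for (int t1 = 0; t1 < N; t1++)
        for (int t2 = 0; t2 < N; t2++)
        for (int t = 0; t < N; t++) {
            bool premises = in_btwn(f, t0, t2, t1) && in_btwn(f, t0, t1, t);
            bool conclusions = in_btwn(f, t0, t2, t) && in_btwn(f, t, t2, t1);
            if (premises && !conclusions)
                bad++;
        }
    }
    return bad;
}
```

Under the assumed semantics the rule holds on every enumerated model, since t lies no later than t1 on the path from t0, and t1 lies no later than t2.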

  50. Logic t ∈ Term ::= c | x | t1 + t2 | t1 - t2 | f(t) G ∈ GFormula ::= t = t’ | t < t’ | t ∈ Btwn_f(t1, t2) | ¬G Bounded quantification over interpreted sets: S ∈ Set ::= f⁻¹(t) | Btwn_f(t1, t2) F ∈ Formula ::= G | F1 ∧ F2 | F1 ∨ F2 | ∀x ∈ S. F
