
Constraint-Based Analysis

CS 8803 FPL, Oct 24, 2012 (Slides courtesy of Alex Aiken)



  1. Constraint-Based Analysis • CS 8803 FPL, Oct 24, 2012 • (Slides courtesy of Alex Aiken)

  2. Code Example • void f(state *x, state *y) { result = spin_trylock(&x->lock); spin_lock(&y->lock); … if (!result) spin_unlock(&x->lock); spin_unlock(&y->lock); } • Lock FSM: states Unlocked, Locked, Error; transitions lock, unlock • Flow Sensitivity • Path Sensitivity: result, (!result) • Pointers & Heap: (&x->lock), (&y->lock) • Inter-procedural: spin_trylock, spin_lock, spin_unlock

  3. Saturn • What? • SAT-based approach to static bug detection • How? • SAT-based approach • Program constructs → Boolean constraints • Inference → SAT solving • Why SAT? • Lots of reasons, but for now: • Program states naturally expressed as bits • The theory for bits is SAT • Efficient solvers widely available

  4. Intuition • Analyzing in one direction is problematic • Forwards or backwards • Consider null dereference analysis • No null ptr assignments: forwards is best • No dereferences: backwards is best • Constraints • Give a global picture of the program • Allow more efficient order of solution

  5. Straight-line Code • void f(int x, int y) { int z = x & y; assert(z == x); } • x = [x31 … x0], y = [y31 … y0] • Bitwise-AND: z = [x31∧y31 … x0∧y0] • ==: compares z with x bit by bit to produce R

  6. Straight-line Code • void f(int x, int y) { int z = x & y; assert(z == x); } • Query: Is-Satisfiable(¬R) • Answer: Yes, with x = [00…1], y = [00…0] • Negated assertion is satisfiable. Therefore, the assertion may fail.
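
The query on this slide can be mimicked without a SAT solver when the bit width is small. The sketch below is a brute-force stand-in, not Saturn's encoding: it enumerates every `width`-bit value of x and y and reports the first pair that falsifies `z == x`. The function name `find_counterexample` and the `width` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Brute-force stand-in for Is-Satisfiable(not R): search all width-bit
 * inputs for a pair (x, y) that makes the assertion z == x fail.
 * Hypothetical helper; a SAT solver answers the same question
 * symbolically for the full 32 bits. */
bool find_counterexample(unsigned width, uint32_t *x_out, uint32_t *y_out)
{
    assert(width >= 1 && width <= 8);    /* keep the search tiny */
    uint32_t limit = 1u << width;

    for (uint32_t x = 0; x < limit; x++) {
        for (uint32_t y = 0; y < limit; y++) {
            uint32_t z = x & y;          /* int z = x & y;          */
            if (z != x) {                /* the negated assertion   */
                *x_out = x;
                *y_out = y;
                return true;             /* satisfiable             */
            }
        }
    }
    return false;                        /* assertion always holds  */
}
```

With `width = 4` the first witness found is x = 1, y = 0, the same assignment as the one shown on the slide.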

  7. Control Flow – Preparation • Approach • Assumes loop free program • Unroll loops, drop backedges • May miss errors that are deeply buried • Bug finding, not verification • Many errors surface in a few iterations • Advantages • Simplicity, reduces false positives
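
To make the loop-free assumption concrete, here is a small sketch of a loop unrolled twice with its back edge dropped. The functions `sum_below` and `sum_below_unrolled2` are hypothetical examples, not taken from Saturn, and the unroll depth of two is an arbitrary choice.

```c
#include <assert.h>

/* Original loop: sums 0 + 1 + ... + (n - 1). */
int sum_below(int n)
{
    int sum = 0;
    int i = 0;
    while (i < n) {
        sum += i;
        i++;
    }
    return sum;
}

/* The same loop unrolled two times with the back edge dropped: each
 * iteration becomes a nested `if`, and later iterations are simply
 * not modeled. */
int sum_below_unrolled2(int n)
{
    int sum = 0;
    int i = 0;
    if (i < n) {            /* iteration 1 */
        sum += i;
        i++;
        if (i < n) {        /* iteration 2 */
            sum += i;
            i++;
            /* back edge dropped here */
        }
    }
    return sum;
}
```

The two versions agree whenever the loop runs at most twice and diverge beyond that, which is the sense in which deeply buried errors may be missed.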

  8. Control Flow – Example • if (c) x = a; else x = b; res = x; • Merges • preserve path sensitivity • select bits based on the values of incoming guards • After x = a: G = c, x: [a31…a0] • After x = b: G = ¬c, x: [b31…b0] • At the merge: G = c ∨ ¬c = true, x: [v31…v0] where vi = (c ∧ ai) ∨ (¬c ∧ bi)
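
The merge rule is a per-bit multiplexer: each result bit is (c AND a_i) OR (NOT c AND b_i). A minimal sketch on concrete 32-bit values follows; `merge_on_guard` is a hypothetical name, and Saturn builds the same selection over symbolic bits rather than concrete ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bitwise multiplexer for the merge point: every bit of the result is
 * v_i = (c AND a_i) OR (NOT c AND b_i).  The guard c is widened to a
 * 32-bit mask so all bits are selected at once. */
uint32_t merge_on_guard(bool c, uint32_t a, uint32_t b)
{
    uint32_t mask = c ? UINT32_MAX : 0u;
    return (mask & a) | (~mask & b);
}
```

Calling `merge_on_guard(c, a, b)` gives the value that `res` holds after `if (c) x = a; else x = b; res = x;`.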

  9. Pointers – Overview • May point to different locations… • Thus, use points-to sets p: { l1,…,ln } • … but path sensitive • Use guards on points-to relationships p: { (g1, l1), …, (gn, ln) }

  10. Pointers – Example • p = &x; if (c) p = &y; res = *p; • After p = &x: G = true, p: { (true, x) } • After p = &y: G = c, p: { (true, y) } • At the merge: G = true, p: { (c, y), (¬c, x) } • res = *p becomes: if (c) res = y; else if (¬c) res = x;

  11. Pointers – Recap • Guarded Location Sets { (g1, l1), …, (gn, ln) } • Guards • Condition under which points-to relationship holds • Collected from statement guards • Pointer Dereference • Conditional Assignments
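
A dereference through a guarded location set can be sketched as a chain of conditional assignments. In the sketch below the guards are concrete booleans, which is a simplification: Saturn keeps them symbolic. The names `guarded_loc`, `load_through`, and `deref_example` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One entry of a guarded location set: p points to *loc when guard holds.
 * Guards are concrete booleans here; Saturn keeps them symbolic. */
typedef struct {
    bool guard;
    const int *loc;
} guarded_loc;

/* Models `res = *p` as a chain of conditional assignments:
 *   if (g1) res = *l1; else if (g2) res = *l2; ...
 * Returns false when no guard holds. */
bool load_through(const guarded_loc *set, size_t n, int *res)
{
    for (size_t i = 0; i < n; i++) {
        if (set[i].guard) {
            *res = *set[i].loc;
            return true;
        }
    }
    return false;
}

/* The example from the slides: p = &x; if (c) p = &y; res = *p; */
int deref_example(bool c, int x, int y)
{
    guarded_loc p[] = {
        { c,  &y },   /* (c, y)     */
        { !c, &x },   /* (not c, x) */
    };
    int res = 0;
    bool ok = load_through(p, 2, &res);
    assert(ok);       /* c or not c always holds */
    return res;
}
```

For example, `deref_example(true, 1, 2)` reads y and `deref_example(false, 1, 2)` reads x.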

  12. Not Covered • Other Constructs • Structs, … • Modeling of the environment • Optimizations • several to reduce size of formulas • some form of program slicing important

  13. What can we do with Saturn? • int f(lock_t *l) { lock(l); … unlock(l); } • lock: if (l->state == Unlocked) l->state = Locked; else l->state = Error; • unlock: if (l->state == Locked) l->state = Unlocked; else l->state = Error; • FSM transitions • lock: Unlocked → Locked, Locked → Error • unlock: Locked → Unlocked, Unlocked → Error

  14. General FSM Checking • Encode FSM in the program • State → Integer • Transition → Conditional Assignments • Check code behavior • SAT queries
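
The encoding described here can be sketched directly in C: the state becomes an integer-valued enum and each transition becomes a conditional assignment. The identifiers (`lock_state`, `do_lock`, `do_unlock`, `run_f`) are hypothetical; the sketch executes the FSM on concrete states, whereas Saturn would pose SAT queries over the same encoding.

```c
#include <assert.h>

/* FSM states encoded as integers. */
typedef enum { ST_UNLOCKED = 0, ST_LOCKED = 1, ST_ERROR = 2 } lock_state;

/* Each transition is a conditional assignment on the state. */
lock_state do_lock(lock_state s)
{
    return (s == ST_UNLOCKED) ? ST_LOCKED : ST_ERROR;
}

lock_state do_unlock(lock_state s)
{
    return (s == ST_LOCKED) ? ST_UNLOCKED : ST_ERROR;
}

/* Body of a function that locks and then unlocks, such as
 * int f(lock_t *l) { lock(l); ... unlock(l); }. */
lock_state run_f(lock_state initial)
{
    lock_state s = initial;
    s = do_lock(s);
    s = do_unlock(s);
    return s;
}
```

Starting from Unlocked, `run_f` ends in Unlocked; starting from Locked, the first `do_lock` moves to Error and the state stays there.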

  15. How are we doing so far? • Precision:  • Scalability:  • SAT limit is 1M clauses • About 10 functions • Solution: • Divide and conquer • Function summaries

  16. Function Summaries (1st try) • Function behavior can be summarized with a set of state transitions • int f(lock_t *l) { lock(l); … … unlock(l); return 0; } • Summary: *l: Unlocked → Unlocked, Locked → Error

  17. A Difficulty • int f(lock_t *l) { lock(l); … if (err) return -1; … unlock(l); return 0; } • Problem: two possible output states, distinguished by return value (retval == 0) … • Summary • 1. (retval == 0) *l: Unlocked → Unlocked, Locked → Error • 2. ¬(retval == 0) *l: Unlocked → Locked, Locked → Error

  18. FSM Function Summaries • Summary representation (simplified): { Pin, Pout, R } • User gives: • Pin: predicates on initial state • Pout: predicates on final state • Express interprocedural path sensitivity • Saturn computes: • R: guarded state transitions • Used to simulate function behavior at call site

  19. Lock Summary (2nd try) • int f(lock_t *l) { lock(l); … if (err) return -1; … unlock(l); return 0; } • Output predicate: Pout = { (retval == 0) } • Summary (R): • 1. (retval == 0) *l: Unlocked → Unlocked, Locked → Error • 2. ¬(retval == 0) *l: Unlocked → Locked, Locked → Error
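
A guarded summary of this kind can be sketched as a small table indexed by the output predicate (retval == 0) and the input state. The sketch below fills the table by running f on every input, which is only a stand-in: Saturn derives the transitions with SAT queries. The `err` parameter, the `summary` type, and the helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { ST_UNLOCKED, ST_LOCKED, ST_ERROR } lock_state;

static lock_state do_lock(lock_state s)
{
    return s == ST_UNLOCKED ? ST_LOCKED : ST_ERROR;
}

static lock_state do_unlock(lock_state s)
{
    return s == ST_LOCKED ? ST_UNLOCKED : ST_ERROR;
}

/* The function from the slide, with the lock state threaded through
 * explicitly and `err` made a parameter so both paths can be explored. */
static int f(lock_state *l, bool err)
{
    *l = do_lock(*l);
    if (err)
        return -1;
    *l = do_unlock(*l);
    return 0;
}

/* out[retval == 0][input state] = output state. */
typedef struct { lock_state out[2][3]; } summary;

/* Fill the table by executing f on every input state and both values
 * of err (six runs, one per table entry). */
summary summarize_f(void)
{
    summary r;
    for (int in = ST_UNLOCKED; in <= ST_ERROR; in++) {
        for (int err = 0; err <= 1; err++) {
            lock_state s = (lock_state)in;
            int retval = f(&s, err != 0);
            r.out[retval == 0][in] = s;
        }
    }
    return r;
}

/* Simulate a call to f at a call site using only the summary. */
lock_state apply_summary(const summary *r, bool retval_is_zero, lock_state in)
{
    return r->out[retval_is_zero][in];
}
```

The table is deterministic only because the return value separates the two paths through f; without the predicate (retval == 0), the Unlocked input would map to two different output states.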

  20. Lock checker for Linux • Parameters: • States: { Locked, Unlocked, Error } • Pin = {} • Pout = { (retval == 0) } • Experiment: • Linux Kernel 2.6.5: 4.8MLOC • ~40 lock/unlock/trylock primitives • 20 hours to analyze • 3.0GHz Pentium IV, 1GB memory

  21. Double Locking/Unlocking static void sscape_coproc_close(…) { spin_lock_irqsave(&devc->lock, flags); if (…) sscape_write(devc, DMAA_REG, 0x20); … } static void sscape_write(struct … *devc, …) { spin_lock_irqsave(&devc->lock, flags); … }

  22. Ambiguous Return State int i2o_claim_device(…) { down(&i2o_configuration_lock); if (d->owner) { up(&i2o_configuration_lock); return -EBUSY; } if (…) { return -EBUSY; } … }

  23. Bugs Previous Work: MC (31), CQual (18), <20% Bugs

  24. Function Summary Database • 63,000 functions in Linux • More than 23,000 are lock related • 17,000 with locking constraints on entry • Around 9,000 affect more than one lock • 193 lock wrappers • 375 unlock wrappers • 36 with return value/lock state correlation • Available on the web . . .

  25. Another Checker • Memory leaks • Common, esp. in error handling code • Hard to find • Problematic in long running applications • Current techniques • Escape analysis • Ownership types • Region based analysis…

  26. Simple Leak char *f() { char *p; p = (char*)malloc(…); … if (err) return NULL; … return p; }

  27. Scenario 1 – Malloc Wrappers char *f() { char *p; p = (char*)strdup(…); … if (err) return NULL; … return p; }

  28. Scenario 2 – External References char *f(struct state *s) { char *p; p = (char*)malloc(…); s->name = p; if (err) return NULL; … return p; }

  29. Scenario 3 – Function Calls char *f(struct state *s) { char *p; p = (char*)malloc(…); g(s, p); if (err) return NULL; … return p; } void g(struct state *s, char *p) { s->name = p; }

  30. Scenario 4 – Data dependency void f(int len) { char fastbuf[10], *p; if (len < 10) p = fastbuf; else p = (char *)malloc(len); … if (p != fastbuf) free(p); }

  31. Requirements • Track points-to relationships precisely • Infer escaping functions • ones that create external references to objects passed in via parameters • Infer allocation functions

  32. Analysis Part I – Points-to Rule • PointsTo(p, l) • condition under which p points to l • Guarded location set of p: { (g0, l0), …, (gn-1, ln-1) } • PointsTo(p, l) = gi (if li = l) • PointsTo(p, l) = false (otherwise)
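
The points-to rule can be sketched over a toy guard representation. Here a guard over a single boolean c is stored as a two-bit truth table, so OR on guards is bitwise OR. This encoding, the location ids, and the names `guard`, `guarded_loc`, and `points_to` are all assumptions of the sketch; Saturn represents guards as Boolean formulas for a SAT solver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A guard over one boolean variable c, stored as a truth table:
 * bit 0 = value when c is false, bit 1 = value when c is true. */
typedef uint8_t guard;

#define G_FALSE  ((guard)0x0)   /* never holds   */
#define G_TRUE   ((guard)0x3)   /* always holds  */
#define G_C      ((guard)0x2)   /* holds iff c   */
#define G_NOT_C  ((guard)0x1)   /* holds iff !c  */

typedef struct {
    guard g;        /* condition under which the pointer ...  */
    int location;   /* ... targets this location id           */
} guarded_loc;

/* PointsTo(p, l): the guard g_i of the entry with l_i == l, or false
 * when l does not occur in the set.  If l occurred more than once the
 * guards would be OR-ed, which goes beyond what the rule states. */
guard points_to(const guarded_loc *set, size_t n, int l)
{
    guard result = G_FALSE;
    for (size_t i = 0; i < n; i++) {
        if (set[i].location == l)
            result = (guard)(result | set[i].g);
    }
    return result;
}
```

For the set { (c, y), (¬c, x) }, with x and y given ids 0 and 1, `points_to` returns `G_C` for y, `G_NOT_C` for x, and `G_FALSE` for any other location.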

  33. Analysis Part II – EscapeVia • EscapeVia(l, p, X) • the condition under which location l escapes via pointer p, excluding references in set X • Access Roots • Every object in the function body is accessed through one of the following “roots” • Parameters (p1…pn) • The Return Value (ret_val) • Global Variables • Local Variables • Heap Allocated Objects

  34. Analysis Part II – EscapeVia • Never escape through local variables: if RootOf(p) ∈ Locals ∪ X, then EscapeVia(l, p, X) = false • Always escape through global variables: if RootOf(p) ∈ Globals, then EscapeVia(l, p, X) = PointsTo(p, l)

  35. Analysis Part II – EscapeVia • Escaping through parameters/return: if RootOf(p) ∈ (Params ∪ { ret_val }) – X, then EscapeVia(l, p, X) = PointsTo(p, l) • Escaping via another allocated location: if RootOf(p) ∈ NewLocs – X, then EscapeVia(l, p, X) = PointsTo(p, l) ∧ Escaped(p, X ∪ {RootOf(l)})

  36. Analysis Part III – Escape/Leak • Escape Condition: Escaped(l, X) = ⋁p EscapeVia(l, p, X) • Leak Condition: Leaked(l, X) = ¬Escaped(l, X) • Leak Checker: For all new locations l, there is a leak if Satisfiable(Leaked(l, {}))
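
The escape and leak conditions can be tried out on the Simple Leak example (p = malloc(…); if (err) return NULL; return p;). The sketch below is a simplification under stated assumptions: guards range over the single boolean `err` and are stored as two-bit truth tables, X is always empty, and only the non-recursive EscapeVia rules are modeled (the rule for other allocated locations is left out). All identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Guard over one boolean `err` as a truth table:
 * bit 0 = value when err is false, bit 1 = value when err is true. */
typedef uint8_t guard;
#define G_FALSE   ((guard)0x0)
#define G_TRUE    ((guard)0x3)
#define G_ERR     ((guard)0x2)
#define G_NOT_ERR ((guard)0x1)

typedef enum { ROOT_LOCAL, ROOT_GLOBAL, ROOT_PARAM, ROOT_RETVAL } root_kind;

/* A pointer, its access root, and PointsTo(p, l) for the one
 * allocated location l under test. */
typedef struct {
    root_kind root;
    guard points_to_l;
} pointer_info;

/* EscapeVia(l, p, {}) for the non-recursive rules only: locals never
 * let l escape; globals, parameters and the return value do whenever
 * they point to l. */
static guard escape_via(const pointer_info *p)
{
    return p->root == ROOT_LOCAL ? G_FALSE : p->points_to_l;
}

/* Escaped(l, {}) = OR over all pointers p of EscapeVia(l, p, {}). */
guard escaped(const pointer_info *ptrs, size_t n)
{
    guard result = G_FALSE;
    for (size_t i = 0; i < n; i++)
        result = (guard)(result | escape_via(&ptrs[i]));
    return result;
}

/* Leaked(l, {}) = NOT Escaped(l, {}), masked to the two table bits. */
guard leaked(const pointer_info *ptrs, size_t n)
{
    return (guard)(~escaped(ptrs, n) & G_TRUE);
}

/* Satisfiable: some assignment of err makes the guard true. */
bool satisfiable(guard g)
{
    return g != G_FALSE;
}
```

For Simple Leak, the local p points to l unconditionally and ret_val points to l only when err is false, so `leaked` evaluates to `G_ERR`: the location leaks exactly on the error path.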

  37. Results

  38. Why SAT? (Revisited …) • Moore’s Law • Uniform modeling of constructs as bits • Constraints • Local specification • Global solution • Incremental SAT solving • makes multiple queries efficient

  39. Why SAT? (Cont.) • Path sensitivity is important • To find bugs • To reduce false positives • Much easier to model precisely with SAT • Compositionality is important • Function summaries critical for scalability • Easy to construct with SAT queries
