Generalized Symbolic Execution for Model Checking and Testing
Sarfraz Khurshid, Corina S. Păsăreanu, and Willem Visser
TACAS 2003
Contents
• Introduction
• General Methodology
• Background
  • Symbolic Execution
• Two-fold Generalization of Symbolic Execution
  • Lazy initialization
  • Instrumentation
• Implementations
• Applications
• Conclusion
Introduction (1/2)
• Modern software systems are concurrent and manipulate complex dynamically allocated data structures (e.g., linked lists or binary trees)
• Two common techniques for checking correctness of software
  • Testing
    • widely used, but gives no assurance of correctness
    • not good at finding errors related to concurrent behavior
  • Model checking
    • automatic and good at analyzing concurrent systems
    • suffers from the state-space explosion problem and typically requires a closed system, i.e., a bound on input sizes
• Approach: symbolic execution + model checking
Introduction (2/2)
• Symbolic execution
  • a well-known program analysis technique
  • traditionally arose in the context of checking sequential programs with a fixed number of integer variables
  • requires dedicated tools to perform the analyses and does not handle concurrent systems with complex inputs
• Generalization of traditional symbolic execution
  • Source-to-source translation that instruments a program so that it can be model checked
  • Novel symbolic execution algorithm for handling dynamically allocated structures (e.g., lists and trees)
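General Methodology
[Figure: tool flow. The source program passes through code instrumentation to give an instrumented program, which is model checked against a correctness specification (precondition/postcondition). The model checker consults a decision procedure (continue/backtrack) and reports a counterexample.]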
Background (1/2)
• Symbolic Execution
  • The main idea is to use symbolic values, instead of actual data, as input values, and to represent the values of program variables as expressions
  • The state of a symbolically executed program includes the symbolic values of program variables and a path condition (PC)
  • A symbolic execution tree characterizes the execution paths followed during the symbolic execution of a program
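Background (2/2)
• Symbolic Execution (Example)

       int x, y;
    1: if (x > y) {
    2:     x = x + y;
    3:     y = x - y;
    4:     x = x - y;
    5:     if (x - y > 0)
    6:         assert(false);
       }

Symbolic execution tree (numbers are the statement labels above):
• start: x := A, y := B, PC := true
• 1 (else branch): x := A, y := B, PC := A <= B
• 1 (then branch): x := A, y := B, PC := A > B
  • 2: x := A+B, y := B, PC := A > B
  • 3: x := A+B, y := A, PC := A > B
  • 4: x := B, y := A, PC := A > B
  • 5 (then branch): x := B, y := A, PC := A > B & B-A > 0, which is FALSE, so statement 6 is never reached
  • 5 (else branch): x := B, y := A, PC := A > B & B-A <= 0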
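The leaf path conditions of the tree above can be checked mechanically. The following is a minimal Java sketch, not the authors' tooling: the class name SwapPathCheck and the bounded enumeration are assumptions that stand in for a real decision procedure, and the small bound keeps the arithmetic away from int overflow.

```java
import java.util.function.BiPredicate;

final class SwapPathCheck {
    // Path conditions of the three leaves, written over the symbolic inputs A and B.
    static final BiPredicate<Integer, Integer> ELSE_BRANCH = (a, b) -> a <= b;
    static final BiPredicate<Integer, Integer> ASSERT_PATH = (a, b) -> a > b && b - a > 0;
    static final BiPredicate<Integer, Integer> NORMAL_PATH = (a, b) -> a > b && b - a <= 0;

    // Bounded search over [-bound, bound]^2. This is only an illustrative
    // stand-in (assumption) for a decision procedure; it is not complete.
    static boolean satisfiable(BiPredicate<Integer, Integer> pc, int bound) {
        for (int a = -bound; a <= bound; a++) {
            for (int b = -bound; b <= bound; b++) {
                if (pc.test(a, b)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Under this sketch, ELSE_BRANCH and NORMAL_PATH have satisfying inputs within the bound, while ASSERT_PATH has none, which matches the FALSE leaf in the tree.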
Two-fold Generalization of symbolic execution (1/9)
• Lazy initialization
  • is an algorithm for generalizing traditional symbolic execution to support advanced constructs of modern programming languages, such as Java and C++
  • A key feature of the lazy initialization algorithm is that it starts execution of the method on inputs with uninitialized fields and uses lazy initialization to assign values to these fields, i.e., it initializes fields when they are first accessed during the method’s symbolic execution
Two-fold Generalization of symbolic execution (2/9)
• Lazy initialization

    if ( F is uninitialized ) {
        if ( F is a reference field of user-defined type T ) {
            nondeterministically initialize F to
                1. null
                2. a new object of class T (with uninitialized field values)
                3. an object created during a prior initialization of a field of type T
            if ( method precondition is violated )
                backtrack();
        }
        if ( F is a primitive field )
            initialize F to a new symbolic value of appropriate type
    }

• Decision procedure: checks whether the path condition is satisfiable; if not, the search backtracks
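The reference-field case of the algorithm above can be sketched in plain Java. This is an illustration under stated assumptions, not the authors' implementation: LazyNode, LazyInit, the explicit pool of previously created objects, and the acyclicity check used as the precondition are all hypothetical names, and the loop over candidates replaces the model checker's nondeterministic choice and backtracking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class LazyNode {
    LazyNode next;            // Java null also stands for "not yet initialized"
    boolean nextInitialized;
}

final class LazyInit {
    // Candidate values for an uninitialized reference field:
    // 1. null, 2. a fresh object, 3. every object created earlier.
    static List<LazyNode> candidates(List<LazyNode> pool) {
        List<LazyNode> options = new ArrayList<>();
        options.add(null);
        options.add(new LazyNode());
        options.addAll(pool);
        return options;
    }

    // Example precondition (assumption): the list reachable from head is acyclic.
    static boolean isAcyclic(LazyNode head) {
        Set<LazyNode> seen =
            Collections.newSetFromMap(new IdentityHashMap<LazyNode, Boolean>());
        for (LazyNode n = head; n != null; n = n.next) {
            if (!seen.add(n)) {
                return false;
            }
        }
        return true;
    }

    // Tries each candidate for root.next and counts those that keep the
    // precondition; a model checker would backtrack on the others.
    static int feasibleChoices(LazyNode root, List<LazyNode> pool) {
        int count = 0;
        for (LazyNode choice : candidates(pool)) {
            root.next = choice;
            root.nextInitialized = true;
            if (isAcyclic(root)) {
                count++;
            }
        }
        root.next = null;              // restore the uninitialized state
        root.nextInitialized = false;
        return count;
    }
}
```

For a single input node whose pool contains only itself, the sketch yields three candidates (null, a fresh node, the node itself), and the self-reference is rejected by the acyclicity precondition.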
Two-fold Generalization of symbolic execution (3/9)
• Lazy initialization (example)

    class Node {
        int elem;
        Node next;

        Node swapNode() {
    1:      if (next != null)
    2:          if (elem > next.elem) {
    3:              Node t = next;
    4:              next = t.next;
    5:              t.next = this;
    6:              return t;
                }
            return this;
        }
    }

[Figure: the input Node drawn as "?" (a Node instance with uninitialized element), with its "next" field not yet accessed.]
Two-fold Generalization of symbolic execution (4/9)
• Consider executing 1: if (next != null)
• Precondition: acyclic list
[Figure: alternative structures produced by lazily initializing the "next" field at statement 1.]
Two-fold Generalization of symbolic execution (5/9)
• Consider executing 2: if (elem - next.elem > 0)
  • initialize "elem" (symbolic value E0)
  • initialize "next.elem" (symbolic value E1)
  • the branch at statement 2 splits the path condition into PC: E0 <= E1 and PC: E0 > E1
[Figure: structures reached after statements 1 and 2, with elements labeled E0 and E1.]
Two-fold Generalization of symbolic execution (6/9)
• Consider executing 4: next = t.next;
• Precondition: acyclic list
[Figure: alternative structures produced by lazily initializing "t.next" at statement 4, with nodes E0 and E1, the local variable t, and "next" pointing to null, to an uninitialized node "?", or to an existing node.]
Two-fold Generalization of symbolic execution (7/9)
• Instrumentation
  • Two steps
    • The integer fields and operations are instrumented
      • The declared type of integer fields of input objects is changed to Expression, which is a library class for manipulating symbolic integer expressions
    • The field accesses are instrumented
      • Field reads are replaced by get methods; the get methods implement the lazy initialization
      • Field updates are replaced by set methods
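The get and set methods described above might look roughly as follows. This is a hedged sketch, not the code the instrumentation actually generates: the member names _get_next, _set_next, and _next_is_initialized follow the deck's example, while SketchNode, the static pool, and the pluggable chooser (a stand-in for the model checker's nondeterministic choice) are assumptions. The precondition check and the symbolic elem field are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

final class SketchNode {
    // Hypothetical bookkeeping: every SketchNode created so far.
    static final List<SketchNode> pool = new ArrayList<>();
    // Hypothetical stand-in for nondeterminism: given the number of
    // options n, returns an index in [0, n).
    static IntUnaryOperator chooser = n -> 0;

    private SketchNode next;
    private boolean _next_is_initialized;

    SketchNode() {
        pool.add(this);
    }

    // Field read: initializes "next" on first access, then returns it.
    SketchNode _get_next() {
        if (!_next_is_initialized) {
            _next_is_initialized = true;
            int options = pool.size() + 2;     // null, fresh, or a pooled object
            int pick = chooser.applyAsInt(options);
            if (pick == 0) {
                next = null;
            } else if (pick == 1) {
                next = new SketchNode();       // fresh object, fields uninitialized
            } else {
                next = pool.get(pick - 2);     // object from a prior initialization
            }
        }
        return next;
    }

    // Field update: an explicit write also marks the field as initialized.
    void _set_next(SketchNode n) {
        next = n;
        _next_is_initialized = true;
    }
}
```

The design point the sketch tries to show is that the choice is made once, on first read, and that a write before any read suppresses lazy initialization entirely.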
Two-fold Generalization of symbolic execution (8/9)
• Instrumentation (example)

Original:

    class Node {
        int elem;
        Node next;

        Node swapNode() {
            if (next != null)
                if ((elem - next.elem) > 0) {
                    Node t = next;
                    next = t.next;
                    t.next = this;
                    return t;
                }
            return this;
        }
    }

Instrumented:

    class Node {
        Expression elem;
        Node next;
        boolean _next_is_initialized;
        boolean _elem_is_initialized;

        Node swapNode() {
            if (_get_next() != null)
                if (Expression._pc._update_GT(
                        _get_elem()._minus(_get_next()._get_elem()),
                        new IntegerConstant(0))) {
                    Node t = _get_next();
                    _set_next(t._get_next());
                    t._set_next(this);
                    return t;
                }
            return this;
        }
    }
Two-fold Generalization of symbolic execution (9/9)
• Instrumentation (example)

    class Expression {
        static PathCondition _pc;
        Expression _minus(Expression e) { … }
    }

    class PathCondition {
        Constraints c;

        boolean _update_GT(Expression e1, Expression e2) {
            boolean result = choose_boolean();   // nondeterministic choice
            if (result)
                c.add_constraint_GT(e1, e2);     // add (e1 > e2) to the path condition
            else
                c.add_constraint_LE(e1, e2);     // add (e1 <= e2) to the path condition
            if (!c.is_satisfiable())             // decision procedure
                backtrack();
            return result;
        }
    }
Implementations
• Code instrumentation
  • builds on the Korat tool
• Model checking
  • Java PathFinder (JPF)
• Decision procedure
  • Java implementation of the Omega library
Applications (1/2)
• Checking multithreaded programs with inputs
• Example: a distributed sorting method
Applications (2/2)
• Symbolically executing the distributed sort took 11 seconds to analyze the method’s correctness and produced a counterexample
  • input list: [X] [Y] [Z] such that X > Y > Z
  • Thread-1: swaps X and Y
  • Thread-2: swaps X and Z
  • resulting list: [Y] [Z] [X]; Y and Z are out of order
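The counterexample above can be replayed on concrete values. The sketch below is only an illustration: it does not model threads or the distributed sorting method itself, it simply applies the two swaps in the reported order, and the class name SortCounterexample is hypothetical.

```java
final class SortCounterexample {
    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Replays the reported interleaving on concrete values with x > y > z.
    static int[] replay(int x, int y, int z) {
        int[] list = {x, y, z};
        swap(list, 0, 1);   // Thread-1 swaps X and Y: [Y, X, Z]
        swap(list, 1, 2);   // Thread-2 swaps X and Z: [Y, Z, X]
        return list;
    }

    // True if the array is in non-decreasing order.
    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With X = 3, Y = 2, Z = 1, the replay ends in the order [Y] [Z] [X], and since Y > Z the result is not sorted.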
Conclusions
• Novel framework, based on symbolic execution, for automated checking of concurrent software systems that manipulate complex data structures
• Two-fold generalization
• Future work
  • Integrate different constraint solvers to handle floats and non-linear constraints
  • Symbolic execution during model checking is powerful, but it is not yet known how well it scales to real applications