600 likes | 702 Views
TVLA: A System for Generating Abstract Interpreters. T. Lev-Ami, R. Manevich, M. Sagiv http://www.cs.tau.ac.il/~tvla. A. Loginov, G. Ramalingam, E. Yahav. T. Reps & R. Wilhelm. Object Oriented Programs. Extensive use of the heap Dynamic allocation Recursive data structures
E N D
TVLA: A System for Generating Abstract Interpreters T. Lev-Ami, R. Manevich, M. Sagiv http://www.cs.tau.ac.il/~tvla A. Loginov, G. Ramalingam, E. Yahav T. Reps & R. Wilhelm
Object Oriented Programs • Extensive use of the heap • Dynamic allocation • Recursive data structures • Destructive reference updates • x.f := y
Verification of OO Programs • Verify properties about the heap • No runtime errors • Null dereferences • Freed memory • Dangling pointers • No memory leaks • Partial correctness
A Common Approach to Pointer Analysis in Software Verification Pointer Analysis Verification Analysis Program • Break verification problem into two phases • Preprocessing phase – separate points-to analysis • typically no distinction between objects allocated at same site • Verification phase – verification using points-to results • May lose precision • May lose ability to perform strong updates • May produce false alarms Property
Loss of Precision in Two-Phase Approach Verify f is not read after it is closed f = new InputStream(); f.read(); f.close(); Straightforward ...
Loss of Precision in Two-Phase Approach while (?) { f = new InputStream(); f.read(); f.close(); } f1 f closed=1/2 False alarm! “read may be erroneous”
The TVLA Approach • Express concrete semantics using first order logic • Abstract Interpretation using 3-valued Kleene’s logic • verification integrated with heap analysis • heap analysis (merging criterion) can adapt to verification Frontend AbstractInterpretation First-OrderTransition Sys Program Property
Concrete States Abstract States f f f closed f closed f closed f closed closed closed closed closed f … The TVLA Approach: An Example while (?) { f = new InputStream(); f.read(); f.close(); }
Outline • Expressing concrete semantics using logic (TVLA input) • Canonincal abstraction • Abstract interpretation using Canonincal abstraction (TVLA engine) • Applications & ongoing Work
90 40 79 50 30 x 79 50 40 0 90 4 3 5 2 1 p Traditional Heap Interpretation • States = Two level stores • env: Var Values • fields: Loc Values • Values=Loc Atoms • Example • env = [x 30, p 79] • next = [30 40, 40 50, 50 79, 79 90] • val = [30 1, 40 2, 50 3, 79 4, 90 5]
Predicate Logic • Vocabulary • A finite set of predicate symbols Peach with a fixed arity • Logical Structures S provide meaning for predicates • A set of individuals (nodes) U • pS: (US)k {0, 1} • FOTC over TC, express logical structure properties
n n n n u1 u2 u3 u4 u5 x p Representing Stores as Logical Structures • Locations Individuals • Program variables Unary predicates • Fields Binary predicates • Example • U = {u1, u2, u3, u4, u5} • x = {u1}, p = {u3} • next = {<u1, u2>, <u2, u3>, <u3, u4>, <u4, u5>}
Invariants • No memory leaksv: active(v) {x PVar} v1: x(v1) next*(v1, v) • Acyclic list(x)v1,v2: x(v1) next*(v1, v2) next+(v2, v1) • Reverse (x)v1,v2, v3: x(v1) next*(v1, v2) next(v2, v3) next’(v3, v2)
1/2 Information order Kleene 3-Valued Logic • 1: True • 0: False • 1/2: Unknown • A join semi-lattice: 0 1 = 1/2 Logical order
3-Valued Logical Structures • A set of individuals (nodes) U • Predicate meaning • pS: (US)k {0, 1, 1/2}
Canonical Abstraction • Partition the individuals into equivalence classes based on the values of their unary predicates • Every individual is mapped into its equivalence class • Collapse predicates via • pS(u’1, ..., u’k) = {pB(u1, ..., uk) | f(u1)=u’1, ..., f(u’k)=u’k) } • At most 2A abstract individuals
n n n n x x t t n n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } u2 u2 u1 u1 u3 u3 u2,3 u1 x t
n n u2,3 u1 x t n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } n n u2 u1 u3 x t
Canonical Abstraction and Equality • Summary nodes may represent more than one element • (In)equality need not be preserved under abstraction • Explicitly record equality • Summary nodes are nodes with eq(u, u)=1/2
eq Canonical Abstraction and Equality eq eq eq x = NULL; while (…) do { t = malloc(); t next=x; x = t } eq n n u2 u1 u3 eq x t eq eq eq eq n u2,3 u2,3 u1 x t n
n u2,3 u1 x t n Canonical Abstraction x = NULL; while (…) do { t = malloc(); t next=x; x = t } n n u2 u1 u3 x t
Heap Sharing predicate v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2) is(v)=0 is(v)=0 is(v)=0 … u2 u1 un x n n n t n u2..n u1 x t n is(v)=0 is(v)=0
n n n u2 u3..n u1 x t n is(v)=0 is(v)=1 is(v)=0 Heap Sharing predicate v:is(v) v1,v2: next(v1,v) next(v2,v) eq(v1, v2) is(v)=0 is(v)=1 is(v)=0 … u2 u1 un x n n n t n
Concrete Representation Abstract Representation Concretization Abstract Interpretation Collecting Interpretation st# stc Concrete Representation Abstract Representation Abstraction Best Conservative Interpretation (CC79)
x y y . . . x Evaluate update formulas x y x y Canonincal x x y y . . . “Focus”- Based Transformer (x = x n) x y inverse embedding Canonincal abstraction
x y Focus(x n) x “Partial ” y Evaluate update Formulas (Kleene) x x Canonincal y y x x y y “Focus”-Based Transformer (x = x n) x y
x = NULL n n t t t empty x x n x F T n n t t t t =malloc(..); t n x x x n n n n tnext=x; t t t t n x x x n n n x = t t t t t x n x x n x return x Example: Abstract Interpretation
Cleanness Analysis of Linked Lists (Nurit Dor, SAS 2000) • Static analysis of C programs manipulating linked lists • Defects checked • References to freed memory • Null dereferences • Memory leaks • Existing algorithms are inadequate • Miss errors • Report too many false alarms
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; c != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements potential null dereference
Dereference of NULL pointers bool search(int value, Elements *c) {Elements *elem;for (elem = c; elem != NULL; elem = elem->next;) if (elem->val == value) return TRUE; return FALSE typedef struct element { int value; struct element *next; } Elements potential null dereference
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; }return h;
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next;h = c; c->next = h; c = g; }return h; leakage of address pointed-by h
Memory leakage typedef struct element { int value; struct element *next; } Elements Elements* reverse(Elements *c){ Elements *h,*g;h = NULL;while (c!= NULL) { g = c->next; h = c; c->next = h; c = g; }return h; ✔No memory leaks
Proving Correctness of Sorting Implementations (Lev-Ami, Reps, S, Wilhelm ISSTA 2000) • Partial correctness • The elements are sorted • The list is a permutation of the original list • Termination • At every loop iterations the set of elements reachable from the head is decreased
Example: InsertSort List InsertSort(List x) { List r, pr, rn, l, pl; r = x; pr = NULL; while (r != NULL) { l = x; rn = r n; pl = NULL; while (l != r) { if (l data > r data) { pr n = rn; r n = l; if (pl == NULL) x = r; else pl n = r; r = pr; break; } pl = l; l = l n; } pr = r; r = rn; } return x; } typedef struct list_cell { int data; struct list_cell *n; } *List; • v:inOrder[dle, n](v) v1: next(v, v1) dle(v,v1) assert sorted[x]; assert permutation[x,x’]
Example: InsertSort List InsertSort(List x) { if (x == NULL) return NULL pr = x; r = x->n; while (r != NULL) { pl = x; rn = r->n; l = x->n; while (l != r) { pr->n = rn ; r->n = l; pl->n = r; r = pr; break; } pl = l; l = l->n; } pr = r; r = rn; } typedef struct list_cell { int data; struct list_cell *n; } *List; Run Demo 14
Example: Mark and Sweep void Mark(Node root) { if (root != NULL) { pending = pending = pending {root} marked = while (pending ) { x = SelectAndRemove(pending) marked = marked {x} t = x left if (t NULL) if (t marked) pending = pending {t} t = x right if (t NULL) if (t marked) pending = pending {t} } } assert(marked == Reachset(root)) } void Sweep() { unexplored = Universe collected = while (unexplored ) { x = SelectAndRemove(unexplored) if (x marked) collected = collected {x} } assert(collected == Universe – Reachset(root) ) }
Challenges: Heap & Concurrency[Yahav POPL’01] • Concurrency with the heap is evil… • Java threads are just heap allocated objects • Data and control are strongly related • Thread-scheduling info may require understanding of heap structure (e.g., scheduling queue) • Heap analysis requires information about thread scheduling Thread t1 = new Thread(); Thread t2 = new Thread(); … t = t1; … t.start();
Configurations – Example held_by at[l_1] at[l_C] rval[myLock] rval[myLock] blocked at[l_1] at[l_0] at[l_0] rval[myLock] l_0: while (true) { l_1: synchronized(myLock) { l_C: // critical section l_2: } l_3: }
ConcreteConfiguration held_by at[l_1] at[l_C] rval[myLock] blocked rval[myLock] at[l_1] at[l_0] at[l_0] rval[myLock]
AbstractConfiguration held_by blocked at[l_1] at[l_C] rval[myLock] rval[myLock] at[l_0]
Compile-Time GC for Java(Ran Shaham, SAS’03) • The compiler can issue free when objects are no longer needed • Analysis of Java/JavaCard programs • Generic Java Bytecode Frontend • Precise analysis but expensive (cost of procedures)
More Scalable Solution[Gilad Arnold] • Combine backward and forward analysis • Forward analysis determine the shape • Backward analysis locates heap liveness • Use
Verifying Safety Propertiesusing Separationand Heterogeneous AbstractionsPLDI’04 E. Yahav School of Computer Science Tel-Aviv University G. Ramalingam IBM T.J. Watson Research Center
Quick Overview:Fine Grained Heap Abstraction Heap H Precise ... ... but often too expensive