Towards Automated Verification Through Type Discovery

Scott D. Stoller, State University of New York at Stony Brook. Joint work with Rahul Agarwal.

Automated Verification of Infinite-State Systems
• To make it feasible, we must restrict:
  • the system
    • Examples: push-down systems, infinite chains of finite-state automata
  • or the properties (and the system slightly)
    • Example: variables are initialized before they are used
• We restrict the properties.

Verification With Type Systems
Many properties can be checked with types.
• Properties of sequential programs:
  • Operations applied to appropriate arguments
  • Correct calling sequence of procedures in API
    • Example: open file before reading or writing it
  • Encapsulation of objects
    • Example: links are encapsulated in linked list
  • Information flow

Verification With Type Systems (continued)
• Properties of concurrent programs
  • Race-freedom (= absence of race conditions)
  • Deadlock-freedom
  • Atomicity
    • Called isolation or serializability in databases
• Properties of distributed programs
  • Correctness of authentication protocols (Cryptyc)

Why Are Type Systems Attractive?
• The concept of types is familiar to programmers
  • Extended types can be embedded in comments.
• Types support compositional verification
  • Type-checking a method depends on types (not code) of other methods
• Types provide clean separation of “guessing” and checking
  • Inference algorithms, heuristics, hints from user can be used freely to “guess” types
  • Only need to show soundness of type checker

Disadvantages of Type Systems
• Not all properties can be checked with types
  • But several useful properties can be checked
• Complete static type inference is infeasible (NP-complete or worse) for many interesting type systems
• Annotating new code with types takes time
• Annotating legacy code with types takes a long time
  • Developer first needs to understand the code

Type Discovery
• Type discovery: guess (and then check) types for a program based on information from run-time monitoring
• Is type discovery guaranteed to be effective?
  • Of course not, if type inference is provably hard.
  • Type discovery must rely on heuristics to generalize concrete relationships to static (syntactic) relationships.
• Why is type discovery likely to be effective?
  • Assuming static intra-procedural type inference is feasible, monitored executions do not need to achieve high statement coverage.

Related Work: Invariant Discovery
• Invariant discovery in Daikon [Ernst et al.]
  • Daikon considers a set of candidate predicates defined by a grammar, with a limit on the size of the predicates.
  • Daikon inserts instrumentation at a designated program point to discover which of these predicates hold at that program point.
• Type discovery is “harder”: a single type annotation may depend on what happens at many program points
  • Example: With race-free types, an annotation on a field declaration depends on which locks are held at every point at which the field is accessed.

Outline
• Type Discovery for Verification of Race-Freedom
  • Background on race conditions
  • Related work on analysis of race conditions
  • Overview of type system for race-freedom
  • Type discovery algorithm
  • Experimental results
• Sketch of Type Discovery for Verification of:
  • Atomicity
  • Deadlock-freedom
  • Safe region-based memory management

Race Conditions
• A race condition occurs when two threads access a shared variable and:
  • At least one access is a write
  • No synchronization is used to prevent the accesses from being simultaneous.
• Race conditions indicate that the program may produce different results if the schedule (order of operations) changes.
  • In many systems, the thread scheduler is loosely specified, so the program is effectively non-deterministic.
• Race conditions often reflect synchronization errors.

Race Condition: Example
A deposit is lost if both threads read this.balance and then both threads update this.balance. This reflects a race condition on this.balance. Making deposit(int) synchronized eliminates the race condition and the error.
class Account {
    int balance = 0;
    void deposit(int x) { this.balance = this.balance + x; }
}
Account a = new Account();
fork { a.deposit(10); }
fork { a.deposit(10); }
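The fix named on this slide can be sketched in plain Java. This is only an illustration: the slide's `fork` is modeled with `Thread`, and the class name `SyncAccount` and the helper `depositConcurrently` are hypothetical names introduced here, not part of the talk.

```java
// Plain-Java rendering of the slide's Account example; fork is modeled with Thread.
final class SyncAccount {
    private int balance = 0;

    // synchronized makes the read-modify-write of balance mutually exclusive,
    // so two concurrent deposits cannot overwrite each other.
    synchronized void deposit(int x) {
        this.balance = this.balance + x;
    }

    synchronized int balance() {
        return this.balance;
    }

    // Hypothetical helper: starts `threads` threads that each perform
    // `depositsPerThread` deposits of `amount`, waits for them, returns the balance.
    static int depositConcurrently(int threads, int depositsPerThread, int amount) {
        SyncAccount a = new SyncAccount();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < depositsPerThread; j++) {
                    a.deposit(amount);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for deposits", e);
            }
        }
        return a.balance();
    }
}
```

With `deposit` synchronized, the final balance equals the sum of all deposits regardless of the schedule. Removing the `synchronized` keyword from `deposit` reintroduces the race, although any single run may still happen to produce the correct total.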

Approaches to Detecting Race Conditions
• Run-time monitoring
  • Pioneering work: Eraser [Savage+, 1997]
  + automatic
  - no guarantees about other executions
• Static analysis
  • RacerX [Engler+, 2003] effectively finds some race conditions but relies on unsound heuristics.
• Type systems
  • Race Free Java, PRFJ, Multithreaded Cyclone
  + well-typed programs are guaranteed race-free.
  - requires manual annotations (greatly reduced by type discovery)

Parameterized Race Free Java (PRFJ) [Boyapati & Rinard, OOPSLA 2001]
• Each object is associated with an owner and a root owner
  • Owner is normally an object indicated by a final expression or self.
• Lock on root owner must be held when object is accessed.
• Example: owner(x)=self, owner(y)=x, where x is a LinkedList and y is a Link

Parameterized Race Free Java (continued)
• In some special cases, race conditions are avoided without locks. Special owner values indicate these special cases:
  • thisThread: object is unshared
  • unique: unique reference to the object
  • readonly: object cannot be updated
• Owner may change in ways that do not cause races, specifically, from unique to any other owner
• Unique references are transferred with this syntax:
    y = x--; // equivalent to: y = x; x = null;
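Plain Java has no `x--` operator on references, but the stated equivalence (`y = x; x = null;`) can be mimicked with a small holder. The class `UniqueRef` and its method `take` are hypothetical names used only for this sketch. Unlike PRFJ, nothing here is checked statically, so the holder only illustrates the transfer semantics.

```java
// Hypothetical holder that mimics PRFJ's  y = x--  in plain Java.
final class UniqueRef<T> {
    private T ref;

    UniqueRef(T ref) {
        this.ref = ref;
    }

    // Returns the held reference and clears the holder,
    // i.e. the effect of  y = x; x = null;
    T take() {
        T result = ref;
        ref = null;
        return result;
    }

    // True once the reference has been transferred away.
    boolean isEmpty() {
        return ref == null;
    }
}
```

After `take()` the holder no longer gives access to the object, so the caller's copy is the only reference obtained through this holder.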

Annotations in PRFJ
• Classes are annotated with one or more owner parameters.
  • First parameter specifies the owner of this object.
  • Remaining parameters (if any) specify owners of fields, method parameters, return values, etc.

Example PRFJ program
class Account<thisOwner> {
    int balance;
    public Account(int balance) { this.balance = balance; }
    void deposit(int x) requires this {
        this.balance = this.balance + x;
    }
}
Account<thisThread> a1 = new Account<unique>(0);
a1.deposit(10);
Account<self> a2 = new Account<self>(0);
fork { synchronized (a2) { a2.deposit(10); } }
fork { synchronized (a2) { a2.deposit(10); } }

Annotations in PRFJ (continued)
• Owner parameters are instantiated at uses of class names.


Annotations in PRFJ (continued)
• Methods are annotated with a requires l1, l2, ... clause.
  • Locks of the root owners of l1, l2, … should be held at all call sites.


The cost of PRFJ
• About 25 annotations/KLOC in Boyapati & Rinard’s experiments with PRFJ

Towards Type Discovery for PRFJ
• Type systems like PRFJ seem to be a promising practical approach to verification of race freedom, if more annotations can be obtained automatically.
• Static type inference for (P)RFJ is NP-complete [Flanagan & Freund, 2004]
• Type discovery for PRFJ builds on work on run-time race detection.

Run-time Race Detection: The Lockset Algorithm [Savage et al., 1997]
• The lockset algorithm detects violations of a simple locking discipline in monitored executions.
• Following the locking discipline implies race-freedom.
• Fully automatic
• No guarantee about other executions

Core Lockset Algorithm
• C(v) = set of locks that have protected variable v so far
• Initialization: C(v) := set of all locks
• On an access to v by thread t, C(v) := C(v) ∩ locks_held(t)
• If C(v) is empty, issue warning: locking discipline violated (potential for race conditions)
Lockset algorithm = core lockset algorithm plus special treatment for initialization of variables and read-only variables.
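The refinement step above can be sketched as follows. This is a minimal illustration of the core algorithm only, without the special treatment of initialization and read-only variables. Variables and locks are identified by strings, and the class name `CoreLockset` and method `access` are assumptions made for this sketch, not Eraser's API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the core lockset refinement: C(v) := C(v) ∩ locks_held(t).
final class CoreLockset {
    // C(v) for every variable accessed so far.
    // A missing entry stands for the initial value "set of all locks".
    private final Map<String, Set<String>> candidates = new HashMap<>();

    // Records one access to variable v by a thread currently holding locksHeld.
    // Returns true if C(v) is empty afterwards, i.e. a warning should be issued.
    boolean access(String v, Set<String> locksHeld) {
        Set<String> c = candidates.get(v);
        if (c == null) {
            // First access: (all locks) ∩ locksHeld = locksHeld.
            c = new HashSet<>(locksHeld);
            candidates.put(v, c);
        } else {
            c.retainAll(locksHeld);
        }
        return c.isEmpty();
    }
}
```

Representing the initial "set of all locks" by a missing map entry avoids enumerating every lock in the program, because the first intersection always yields exactly the locks held at that access.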

Overview of Type Discovery for PRFJ
• Identify unique references using static analysis.
• Instrument the program using an automatic source-to-source transformation.
• Execute the instrumented program, which writes information to a log file.
• Analyze the log to discover:
  • owners of fields, parameters, return values
  • owners in class declarations
  • values of non-first owner parameters
  • requires clause for each method
• Run intra-procedural type inference to get types for local variables.
• Run the type checker.

Step 1: Static Analysis of Unique References
• We use a variant of a uniqueness analysis in [Aldrich, Kostadinov, & Chambers 2002].
• Determine which parameters are lent, i.e., when the method returns, no new references to the argument exist.
• Determine which expressions are unique references, based on the lent annotations and known sources of unique references, namely, object allocation expressions.
• The analysis is flow-insensitive and context-insensitive.

Step 2: Instrumentation
• To help infer the owner of a field, method parameter, or return value x, we monitor a set S(x) of objects that are “values” of x.
  • If x is a field of class C, S(x) contains objects stored in that field of instances of C.
  • If x is a method parameter, S(x) contains arguments passed through that parameter.
• FE(x): set of final expressions that are syntactically legal at the declaration of x. These are candidate owners of x.
  • Final expressions are built from final variables (including this), final fields, and static final fields.

Step 2: Instrumentation (continued)
After an object o is added to S(x), every access to o is intercepted and the following information is updated.
• lkSet(x,o): set of locks held at every access to o, excluding accesses through a unique reference.
• rdOnly(x,o): bool: whether o is read-only, i.e., no field of o was written
• shar(x,o): bool: whether o is shared
• val(x,o,e), where e in FE(x): value of e at an appropriate point for x and o:
  • If x is a field: immediately after the constructor invocation that initialized o.
  • If x is a parameter to method m: immediately before calls to m where o is passed through parameter x.
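One way to picture the per-(x, o) bookkeeping is a small record that is updated on every intercepted access. The sketch below is an assumption-laden illustration, not the tool's instrumentation: the class `MonitorRecord`, the method `onAccess`, and the use of string thread names are hypothetical, and "shared" is taken here to mean "accessed by more than one thread", which the slides do not spell out. The val(x,o,e) snapshot is omitted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical record for one pair (x, o), updated on each intercepted access to o.
final class MonitorRecord {
    Set<Object> lkSet = null;   // null until the first access that is not through a unique reference
    boolean rdOnly = true;      // stays true while no field of o has been written
    boolean shar = false;       // assumption: o counts as shared once a second thread accesses it
    private String firstThread = null;

    void onAccess(String thread, Set<Object> locksHeld, boolean isWrite, boolean viaUniqueRef) {
        if (firstThread == null) {
            firstThread = thread;
        } else if (!firstThread.equals(thread)) {
            shar = true;
        }
        if (isWrite) {
            rdOnly = false;
        }
        if (viaUniqueRef) {
            return; // accesses through a unique reference are excluded from lkSet
        }
        if (lkSet == null) {
            lkSet = new HashSet<>(locksHeld);
        } else {
            lkSet.retainAll(locksHeld); // keep only locks held at every access
        }
    }
}
```

The lkSet update is the same intersection used by the lockset algorithm, applied per monitored object instead of per variable.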

Step 3: Execute the instrumented program • The instrumented program writes information to a log file.

Step 4.a: Discover owners for fields, method parameters, and return values
Note: The first matching rule wins.
• If the Java type of x is an immutable class (e.g., String), then owner(x) = readonly
• If (∀ o in S(x): !shar(x,o)), then owner(x) = thisThread
• If (∀ o in S(x): rdOnly(x,o)), then owner(x) = readonly
• If (∀ o in S(x): o in lkSet(x,o)), then owner(x) = self

Step 4.a: Discover owners for fields, method parameters, and return values (continued)
• If for some e in FE(x), (∀ o in S(x): val(x,o,e) in lkSet(x,o)), then owner(x) = e
• Otherwise, owner(x) = thisOwner, where thisOwner is the first owner parameter of the class.
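The Step 4.a rules form a first-match cascade, which can be sketched as below. The types `Observation` and `OwnerDiscovery`, the string encoding of owners, and keying val(x,o,e) by the text of e are all assumptions made for illustration; each per-object condition is read as holding for every o in S(x).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Logged data for one monitored object o in S(x).
final class Observation {
    final Object obj;               // the object o itself
    final boolean shar;             // shar(x,o)
    final boolean rdOnly;           // rdOnly(x,o)
    final Set<Object> lkSet;        // lkSet(x,o)
    final Map<String, Object> val;  // val(x,o,e), keyed by the text of e

    Observation(Object obj, boolean shar, boolean rdOnly,
                Set<Object> lkSet, Map<String, Object> val) {
        this.obj = obj;
        this.shar = shar;
        this.rdOnly = rdOnly;
        this.lkSet = lkSet;
        this.val = val;
    }
}

final class OwnerDiscovery {
    // Applies the Step 4.a rules in order; the first matching rule wins.
    static String ownerOf(boolean immutableType, List<Observation> s, List<String> finalExprs) {
        if (immutableType) return "readonly";
        if (s.stream().noneMatch(o -> o.shar)) return "thisThread";
        if (s.stream().allMatch(o -> o.rdOnly)) return "readonly";
        if (s.stream().allMatch(o -> o.lkSet.contains(o.obj))) return "self";
        for (String e : finalExprs) {
            // e is a candidate owner if its value was held as a lock at every access.
            if (s.stream().allMatch(o -> o.val.containsKey(e) && o.lkSet.contains(o.val.get(e)))) {
                return e;
            }
        }
        return "thisOwner";
    }
}
```

Note that an empty S(x) satisfies every "for all" condition vacuously, so an unmonitored x falls into the thisThread rule in this sketch; how the actual tool treats that case is not stated in the slides.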

Example: owner of this param of MyThread(..)
public class MyThread<thisThread> extends Thread<thisThread> {
    public ArrayList<self,readonly> l;
    public MyThread(ArrayList<self,readonly> l) { this.l = l; }
    public void run() {
        synchronized (this.l) { l.add(new Integer<readonly>(10)); }
    }
    public static void main(String args[]) {
        ArrayList<self,readonly> l = new ArrayList<self,readonly>();
        MyThread<unique> m1 = new MyThread<unique>(l);
        MyThread<unique> m2 = new MyThread<unique>(l);
        m1--.start();
        m2--.start();
    }
}

Example: owner of l field and parameter
Same MyThread program as in the previous example; the field l and the constructor parameter l are both declared ArrayList<self,readonly>.
Lockset Table
Obj   lkset
l     l



Step 4.b: Discover owners in class declarations
Monitor a set S(C) of instances of class C.
• If (∀ o in S(C): !shar(C,o)), then owner(C) = thisThread
• If (∀ o in S(C): o in lkSet(C,o)), then owner(C) = self
• Otherwise, owner(C) = thisOwner
Use owner(C) as the first owner parameter in the declaration of C.

Example: owner of class MyThread
Same MyThread program as in the earlier example. The declaration public class MyThread<thisThread> extends Thread<thisThread> carries the discovered class owner, thisThread.



Step 4.c: Discover non-first owner parameters
• Assume uses of these parameters in the class declaration are given. Example:
    class ArrayList<thisOwner,eltOwner> {
        public boolean add(Object<eltOwner> o) {…}
        …
    }
• If the owner parameter is used as the owner of a method parameter (like eltOwner), instantiate it based on the discovered owner of the method parameter.
• A similar technique is used if the owner parameter is used as the owner of a field.

Example
Same MyThread program as in the earlier examples. In ArrayList<self,readonly>, the second owner parameter (eltOwner) is instantiated with readonly, the owner of the Integer<readonly> argument passed to add.


Step 4.d: Discover requires clause
• run methods are given an empty requires clause.
• Each method declared in a class with owner thisThread (from Step 4.b) is given an empty requires clause.
• For other classes, the requires clause contains all method parameters p (including the implicit this parameter) such that the method contains a field access p.f outside the scope of a synchronized(p) statement.
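The three rules above can be sketched over a simplified view of a method body. The sketch assumes each field access p.f has already been extracted together with the set of expressions of the enclosing synchronized statements. The classes `FieldAccess` and `RequiresClause` are hypothetical names for illustration only.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// One field access p.f in a method body (simplified, hypothetical model).
final class FieldAccess {
    final String target;          // p in the access p.f
    final Set<String> syncScope;  // expressions e of all enclosing synchronized(e) statements

    FieldAccess(String target, Set<String> syncScope) {
        this.target = target;
        this.syncScope = syncScope;
    }
}

final class RequiresClause {
    // Returns the parameters that belong in the method's requires clause.
    static Set<String> discover(boolean isRunMethod, boolean classOwnerIsThisThread,
                                Set<String> params, List<FieldAccess> accesses) {
        Set<String> requires = new LinkedHashSet<>();
        if (isRunMethod || classOwnerIsThisThread) {
            return requires; // first two rules: empty requires clause
        }
        for (FieldAccess a : accesses) {
            // p is required if p.f is accessed outside every synchronized(p) block.
            if (params.contains(a.target) && !a.syncScope.contains(a.target)) {
                requires.add(a.target);
            }
        }
        return requires;
    }
}
```

For the earlier Account example, deposit accesses this.balance with no enclosing synchronized(this), so the sketch yields the clause requires this, matching the annotation shown on that slide.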

Step 5: Intra-procedural type inference
• Introduce fresh distinct formal owner parameters for unknown owners in variable declarations and object allocation expressions.
• Derive equality constraints between owners from assignment statements and method invocations.
• Solve the constraints in almost linear time using the standard union-find algorithm.
The test suite does not need full (or even high) statement coverage because intra-procedural type inference propagates owner information into unexecuted parts of the code.
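Solving owner equality constraints with union-find can be sketched as follows. The class `OwnerUnionFind` and the convention that fresh owner parameters are written with a leading `?` are assumptions of this sketch. For brevity it uses path compression without union by rank, so it does not by itself reach the almost-linear bound quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Union-find over owner names; names starting with '?' are fresh (unknown) owners.
final class OwnerUnionFind {
    private final Map<String, String> parent = new HashMap<>();

    // Returns the representative of a's equivalence class, compressing the path.
    String find(String a) {
        parent.putIfAbsent(a, a);
        String p = parent.get(a);
        if (!p.equals(a)) {
            p = find(p);
            parent.put(a, p);
        }
        return p;
    }

    // Adds the constraint a = b. Returns false if this would equate two
    // different known owners, which corresponds to a type error.
    boolean union(String a, String b) {
        String ra = find(a);
        String rb = find(b);
        if (ra.equals(rb)) return true;
        boolean freshA = ra.startsWith("?");
        boolean freshB = rb.startsWith("?");
        if (!freshA && !freshB) return false;
        // Keep a known owner as representative so find() resolves fresh owners to it.
        if (freshA) parent.put(ra, rb); else parent.put(rb, ra);
        return true;
    }
}
```

After all constraints are added, calling `find` on a fresh owner returns the known owner it was unified with, or another fresh name if the code never constrains it.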

Example: owner of local variable l
Same MyThread program as in the earlier examples. In main, the local variable l is passed to the MyThread constructor, whose parameter is declared ArrayList<self,readonly>, so the resulting equality constraint gives l the type ArrayList<self,readonly>.
