Recent Work, in Two Acts
Carlos Pacheco
8/15/2008

Agenda
• Deconstructing Randoop
• Mutation-based test generation

Deconstructing Randoop

deconstruct verb [trans.] analyze (a text or a linguistic or conceptual system), typically in order to expose its hidden internal assumptions and contradictions and subvert its apparent significance or unity.

(alt.) deconstruct verb [trans.] analyze (a tool, algorithm or software system), typically in order to expose its hidden internal assumptions and components and evaluate its apparent significance or unity.

Goals
• Identify Randoop's key, separable ideas
• Determine their individual effectiveness
• Determine their combination's effectiveness

Randoop
[Diagram: classes under test and properties to check go into a feedback-directed random test generator, which outputs failing test cases.]

Classes under test (example):
• java.util.Collections
• java.util.ArrayList
• java.util.TreeSet
• java.util.LinkedList
• ...

Property to check (example):
Reflexivity of equality: ∀ o != null : o.equals(o) == true

Failing test case (example):
    public void test() {
      Object o = new Object();
      ArrayList a = new ArrayList();
      a.add(o);
      TreeSet ts = new TreeSet(a);
      Set us = Collections.unmodifiableSet(ts);
      // Fails at runtime.
      assertTrue(us.equals(us));
    }

Feedback-directed random test generation
• Seed component set: components = { int i = 0; boolean b = false; ... }
• Do until time limit expires:
  • Create a new sequence
    i. Randomly pick a method call m(T1...Tk)/Tret
    ii. For each input parameter of type Ti, randomly pick a sequence Si from the components that constructs an object vi of type Ti
    iii. Create new sequence Snew = S1; ... ; Sk; Tret vnew = m(v1...vk);
    iv. If Snew was previously created (lexically), go to i
  • Classify the new sequence Snew
    • May discard, output as test case, or add to components

Classifying a sequence
[Flowchart]
• start: execute and check properties
• property violated?
  • yes: minimize sequence, output as contract-violating test case
  • no: exception thrown?
    • yes: discard sequence
    • no: add to component set
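The generation loop and the classification step can be sketched together in a few lines of Java. This is a minimal sketch under loud assumptions, not Randoop's implementation: the names `FeedbackDirectedSketch`, `Seq`, `Op`, `generate` and `demo` are hypothetical, methods under test are modelled as plain functions rather than reflective calls, a sequence is flattened to a nested expression string, a fixed step count stands in for the time limit, only reflexivity of equals is checked, and minimization is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;

class FeedbackDirectedSketch {
    /** A sequence, flattened to an expression string, plus a thunk that re-executes it. */
    static final class Seq {
        final String code; final Class<?> type; final Supplier<Object> run;
        Seq(String code, Class<?> type, Supplier<Object> run) {
            this.code = code; this.type = type; this.run = run;
        }
    }

    /** A method under test: name, parameter types, body (hypothetical stand-in for reflection). */
    static final class Op {
        final String name; final Class<?>[] params; final Function<Object[], Object> body;
        Op(String name, Class<?>[] params, Function<Object[], Object> body) {
            this.name = name; this.params = params; this.body = body;
        }
    }

    /** Returns the sequences whose result violated reflexivity of equals. */
    static List<String> generate(List<Op> ops, List<Seq> seeds, int steps, long seed) {
        Random rnd = new Random(seed);
        List<Seq> components = new ArrayList<>(seeds);
        Set<String> seen = new HashSet<>();
        List<String> failing = new ArrayList<>();
        next:
        for (int step = 0; step < steps; step++) {          // stands in for the time limit
            Op m = ops.get(rnd.nextInt(ops.size()));        // randomly pick a method
            Seq[] in = new Seq[m.params.length];
            List<String> argCode = new ArrayList<>();
            for (int i = 0; i < in.length; i++) {           // pick a component per parameter
                List<Seq> fits = new ArrayList<>();
                for (Seq s : components) {
                    if (m.params[i].isAssignableFrom(s.type)) fits.add(s);
                }
                if (fits.isEmpty()) continue next;          // no input of that type yet
                in[i] = fits.get(rnd.nextInt(fits.size()));
                argCode.add(in[i].code);
            }
            String code = m.name + "(" + String.join(", ", argCode) + ")";
            if (!seen.add(code)) continue;                  // lexical duplicate: pick again
            Supplier<Object> run = () -> {
                Object[] args = new Object[in.length];
                for (int i = 0; i < args.length; i++) args[i] = in[i].run.get();
                return m.body.apply(args);
            };
            try {
                Object v = run.get();                       // execute: this is the feedback
                if (v == null) continue;                    // nothing to build on
                if (!v.equals(v)) failing.add(code);        // property violated: output as test
                else components.add(new Seq(code, v.getClass(), run)); // reusable component
            } catch (RuntimeException e) {
                // exception thrown: discard the sequence
            }
        }
        return failing;
    }

    /** Demo: inc and div behave; broken() returns an object whose equals is not reflexive. */
    static List<String> demo() {
        Class<?>[] none = {};
        Class<?>[] oneInt = {Integer.class};
        Class<?>[] twoInts = {Integer.class, Integer.class};
        List<Op> ops = Arrays.asList(
            new Op("inc", oneInt, a -> (Integer) a[0] + 1),
            new Op("div", twoInts, a -> (Integer) a[0] / (Integer) a[1]),
            new Op("broken", none, a -> new Object() {
                @Override public boolean equals(Object o) { return false; }
            }));
        List<Seq> seeds = Arrays.asList(
            new Seq("0", Integer.class, () -> 0),
            new Seq("1", Integer.class, () -> 1));
        return generate(ops, seeds, 300, 42L);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, sequences that divide by zero throw and are discarded, sequences that return integers become new components, and the one sequence that violates the property is reported instead of being reused.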

Prior evaluation
• Compared with other techniques
  • Model checking, symbolic execution, traditional random testing
• On collection classes (lists, sets, maps, etc.)
  • Randoop achieved equal or higher code coverage in less time
• On a large benchmark of programs (750KLOC)
  • Randoop revealed more errors

Randoop's two key ideas
• Create method sequences incrementally (component set)
• Use runtime information to guide generation

What makes it work?
• Component set?
• Runtime feedback?
• Both... Or neither?

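Four techniques
                       use feedback: yes                use feedback: no
use components: yes    Randoop                          Randoop without feedback
use components: no     Naive generation with feedback   Naive sequence generation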

Naive sequence generation
• To generate one sequence, start from the empty sequence S, then:
  1. Select an enabled method at random
  2. Select input to the method from S
  3. Extend S with the new method call, go back to 1
• A method is enabled if S declares objects that can serve as its receiver and arguments
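The naive procedure can be sketched as follows. This is an illustration under stated assumptions rather than the implementation evaluated in the talk: `NaiveGenerationSketch`, `Method` and `generate` are hypothetical names, types are matched by name with no subtyping, the receiver is treated as the first parameter, every method is assumed to return a value, and nothing is executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class NaiveGenerationSketch {
    /** A method signature; types are compared by name, with no subtyping (a simplification). */
    static final class Method {
        final String ret; final String name; final String[] params;
        Method(String ret, String name, String... params) {
            this.ret = ret; this.name = name; this.params = params;
        }
    }

    static List<String> generate(List<Method> methods, int maxCalls, long seed) {
        Random rnd = new Random(seed);
        List<String> seq = new ArrayList<>();       // statements of S, in order
        List<String> varTypes = new ArrayList<>();  // varTypes.get(i) is the type of v<i>
        while (seq.size() < maxCalls) {
            // A method is enabled if S already declares a variable for every parameter.
            List<Method> enabled = new ArrayList<>();
            for (Method m : methods) {
                if (varTypes.containsAll(Arrays.asList(m.params))) enabled.add(m);
            }
            if (enabled.isEmpty()) break;
            Method m = enabled.get(rnd.nextInt(enabled.size()));
            // Select each input at random among the variables of S with the right type.
            List<String> args = new ArrayList<>();
            for (String p : m.params) {
                List<Integer> fits = new ArrayList<>();
                for (int i = 0; i < varTypes.size(); i++) {
                    if (varTypes.get(i).equals(p)) fits.add(i);
                }
                args.add("v" + fits.get(rnd.nextInt(fits.size())));
            }
            // Extend S with the new call; its result becomes a new variable.
            seq.add(m.ret + " v" + varTypes.size() + " = " + m.name
                    + "(" + String.join(", ", args) + ");");
            varTypes.add(m.ret);
        }
        return seq;
    }

    public static void main(String[] args) {
        List<Method> methods = Arrays.asList(
            new Method("List", "newList"),
            new Method("int", "zero"),
            new Method("boolean", "add", "List", "int"));
        generate(methods, 6, 1L).forEach(System.out::println);
    }
}
```

Because S starts empty, the first statement is always a call that needs no inputs; `add` only becomes enabled once S declares both a `List` and an `int`.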

Naive generation with feedback
• Extend new sequence with method call
• Execute method call, check properties
• If exception/failure, go back one step
  • Remove last method call
  • Attempt different extension
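A minimal sketch of the back-one-step behaviour, assuming a single `ArrayList<Integer>` receiver and treating only thrown exceptions as failures (property checks are left out). The names `NaiveWithFeedbackSketch`, `Call`, `runsCleanly`, `generate` and `demoCalls` are hypothetical, and the sequence is re-executed from scratch after each extension for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Consumer;

class NaiveWithFeedbackSketch {
    /** One method call on the receiver, with its source text. */
    static final class Call {
        final String text; final Consumer<List<Integer>> action;
        Call(String text, Consumer<List<Integer>> action) {
            this.text = text; this.action = action;
        }
    }

    /** Executes the whole sequence on a fresh receiver; false if any call throws. */
    static boolean runsCleanly(List<Call> seq) {
        List<Integer> receiver = new ArrayList<>();
        try {
            for (Call c : seq) c.action.accept(receiver);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    static List<Call> generate(List<Call> calls, int length, int maxAttempts, long seed) {
        Random rnd = new Random(seed);
        List<Call> seq = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && seq.size() < length; attempt++) {
            seq.add(calls.get(rnd.nextInt(calls.size())));  // extend with a method call
            if (!runsCleanly(seq)) {
                seq.remove(seq.size() - 1);                 // go back one step
            }                                               // next iteration tries another extension
        }
        return seq;
    }

    static List<Call> demoCalls() {
        return Arrays.asList(
            new Call("list.add(1)", l -> l.add(1)),
            new Call("list.remove(0)", l -> l.remove(0)),   // throws on an empty list
            new Call("list.get(5)", l -> l.get(5)));        // throws unless size > 5
    }

    public static void main(String[] args) {
        for (Call c : generate(demoCalls(), 8, 1000, 7L)) System.out.println(c.text);
    }
}
```

With these three calls, the first statement of any generated sequence has to be `list.add(1)`, since the other two throw on an empty list and are removed again.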

Randoop without feedback
• Add every new sequence to the component set, regardless of its execution result.

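Review: four techniques
                       use feedback: yes                use feedback: no
use components: yes    Randoop                          Randoop without feedback
use components: no     Naive generation with feedback   Naive sequence generation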

Evaluation
• Apply the four techniques to a set of libraries
• Compare
  • coverage
  • errors revealed

Libraries

Input space size distinct input sequences of length...

Input
For each library:
• All public members in library
• Sequence limit: 50 calls
• Small set of primitives (0, -1, 100, 'a', etc.)

Other details
• Stopping criterion
  • Coverage does not increase after 100 seconds
• Five properties
  • Equals symmetric, equals reflexive, equals to null returns false, equals-hashcode, no NPEs
• Engineering fairness
  • Optimized all four techniques to make sequence construction efficient
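The five properties can be written as direct checks on a pair of objects. This is one possible reading, not the checker used in the evaluation: `ContractChecksSketch` and `violations` are hypothetical names, both arguments are assumed non-null, and "no NPEs" is interpreted here as "checking the object must not throw a NullPointerException".

```java
import java.util.ArrayList;
import java.util.List;

class ContractChecksSketch {
    /** Returns the names of the properties that the non-null pair (a, b) violates. */
    static List<String> violations(Object a, Object b) {
        List<String> out = new ArrayList<>();
        try {
            if (!a.equals(a)) out.add("equals reflexive");
            if (a.equals(b) != b.equals(a)) out.add("equals symmetric");
            if (a.equals(null)) out.add("equals to null returns false");
            if (a.equals(b) && a.hashCode() != b.hashCode()) out.add("equals-hashcode");
        } catch (NullPointerException e) {
            out.add("no NPEs");   // assumed reading: the checks themselves must not throw an NPE
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(violations("a", "b"));   // prints []
    }
}
```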

Output
• Failing test cases
  • One test per (violating method, property) pair
• Ongoing: manually inspecting all failures

Failures

Failure kinds

Coverage achieved

Coverage vs. time
[Plot: coverage (y-axis) against time (x-axis), one curve for Randoop and one for the other technique, with the times tRandoop and tother marked on the time axis.]

tRandoop / tother

Conclusion
• Randoop:
  • High coverage very quickly
  • More "serious" failures
• Naive:
  • Good coverage, slower/less than Randoop
  • More NPE failures
• Other techniques
  • Not as effective

Mutation-based generation
Carlos Pacheco, Jeff Perkins

Motivation
• Randoop
  • Achieves reasonable coverage
  • Hits a coverage plateau
• Can we push the coverage plateau up?
[Plot: coverage against time, with the Randoop curve and the goal marked.]

Idea
• Follow random generation with systematic mutation of method sequences
  • null
  • unrelated types
  • related types (super, subclasses)
  • aliasing
  • structurally-equivalent objects

Mutation via dataflow tracking
• When coverage plateaus, stop random generation
• Identify frontier branches
• For each frontier branch:
  • Select candidate sequences (that reach frontier branches)
  • Track the variables whose data flows into branch condition
  • Systematically mutate the variables

Example
Frontier branch:
    class BinTree {
      public boolean remove(int x) {
        . . .
        if (current.value == x)
        . . .
      }
    }
Candidate sequence:
    int var1 = 5;
    BinTree var2 = new BinTree(var1);
    int var3 = 2;
    var2.add(var3);
    int var4 = 6;
    var2.remove(var4);
Runtime analysis:
• relevant variables: var3 and var4
• var3 was compared to 6
• var4 was compared to 2
Strategy: Modify every relevant variable to take on each compared value

Runtime analysis
• Determine data flow at frontier branch
  • Tag each variable's runtime value on creation
  • On each operation, create a tree with the operation as the root and operands as branches
  • From branch tree, determine
    • relevant variables
    • values that each variable was compared to
• Could also track control flow
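The tagging idea can be illustrated with a hand-written wrapper type. This is a sketch under assumptions, not the analysis from the talk: `TagTrackingSketch`, `Tagged`, `plus` and `branchEquals` are hypothetical names, only ints are tracked, and a flat set of tags replaces the operation tree.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TagTrackingSketch {
    /** An int value carrying the names of the sequence variables that flowed into it. */
    static final class Tagged {
        final int value; final Set<String> tags;
        Tagged(int value, Set<String> tags) { this.value = value; this.tags = tags; }

        /** Tag a value on creation with the variable that holds it. */
        static Tagged of(String variable, int value) {
            Set<String> tags = new TreeSet<>();
            tags.add(variable);
            return new Tagged(value, tags);
        }

        /** An operation on two values: the result carries the tags of both operands. */
        Tagged plus(Tagged other) {
            Set<String> merged = new TreeSet<>(tags);
            merged.addAll(other.tags);
            return new Tagged(value + other.value, merged);
        }
    }

    /** For each variable, the values it was compared to at a branch. */
    final Map<String, Set<Integer>> comparedTo = new TreeMap<>();

    /** A branch condition a == b that records the comparison before evaluating it. */
    boolean branchEquals(Tagged a, Tagged b) {
        for (String v : a.tags) comparedTo.computeIfAbsent(v, k -> new TreeSet<>()).add(b.value);
        for (String v : b.tags) comparedTo.computeIfAbsent(v, k -> new TreeSet<>()).add(a.value);
        return a.value == b.value;
    }

    public static void main(String[] args) {
        TagTrackingSketch t = new TagTrackingSketch();
        Tagged x = Tagged.of("x", 2), y = Tagged.of("y", 3), z = Tagged.of("z", 6);
        t.branchEquals(x.plus(y), z);          // compares 5 with 6
        System.out.println(t.comparedTo);      // prints {x=[6], y=[6], z=[5]}
    }
}
```

The keys of `comparedTo` are the relevant variables for the branch, and the value sets are the compared values that a mutation step could then try.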

Sequence mutation strategies
• Primitive variables
  • For each primitive variable x: set x to compared values +/- {0, 1, 10, 100}
• Reference variables
  • Given two variables x and y (of the same type):
    • Replace uses of x by y (alias)
    • Make x and y structurally equivalent (copy)
    • Make one null, the other non-null
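For the primitive strategy, the candidate values for a variable follow directly from its compared values. A small sketch, assuming int variables; `PrimitiveMutationSketch` and `candidates` are hypothetical names, and integer overflow is ignored.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.SortedSet;
import java.util.TreeSet;

class PrimitiveMutationSketch {
    static final int[] OFFSETS = {0, 1, 10, 100};

    /** Every compared value plus and minus each offset, without duplicates, in ascending order. */
    static SortedSet<Integer> candidates(Collection<Integer> comparedValues) {
        SortedSet<Integer> out = new TreeSet<>();
        for (int v : comparedValues) {
            for (int d : OFFSETS) {
                out.add(v + d);
                out.add(v - d);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A variable that was compared to 6 gets seven candidate values.
        System.out.println(candidates(Arrays.asList(6)));   // prints [-94, -4, 5, 6, 7, 16, 106]
    }
}
```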

Example 2
Frontier branch:
    public int next(int n) {
      . . .
      if ((n & -n) == n) // i.e. n is a power of 2
      . . .
    }
Candidate sequence:
    int var0 = 100;
    int var1 = -1;
    List var2 = nCopies(var0, var1);
    shuffle(var2);
Runtime analysis:
• Relevant variables: var0, var1
• var0 was compared to 4, 100
Winning strategy: set var0 to 4
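The branch condition in this example can be checked in isolation to see why 4 takes the branch and 100 does not. The class and method names below are hypothetical; only the expression `(n & -n) == n` comes from the slide.

```java
class PowerOfTwoCheck {
    /** n & -n isolates the lowest set bit of n, so the test holds when n has at most one bit set. */
    static boolean takesBranch(int n) {
        return (n & -n) == n;
    }

    public static void main(String[] args) {
        System.out.println("100 -> " + takesBranch(100));   // prints 100 -> false
        System.out.println("4 -> " + takesBranch(4));       // prints 4 -> true
    }
}
```

For 100 the lowest set bit is 4, which differs from 100, so the original sequence misses the branch; setting var0 to 4 makes both sides equal.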

Example 3
Frontier branch:
    public int lastIndexOf(Object elem) {
      . . .
      for (int i = size-1 ; i >= 0 ; i--) {
        if (elem.equals(elementData[i]))
        . . .
    }
Candidate sequence:
    ArrayList var0 = new ArrayList();
    int var1 = 0;
    String var2 = "a";
    var0.add(var1, var2);
    int var4 = 1;
    String var5 = "a";
    var0.add(var4, var5);
    long var7 = 100;
    boolean var8 = var0.add(var7);
    int var9 = 0;
    short var10 = 0;
    Object var11 = var0.set(var9, var10);
    String var12 = "b";
    boolean var13 = var0.remove(var12);
    double var14 = 0.0;
    int var15 = var0.lastIndexOf(var14);
Runtime analysis:
• Relevant variables: var1, var9, var10, var14
Winning strategy: Replace uses of var14 with var7

Coverage-directed sequence mutation
• Randoop covered 933 of 2064 branches
• 163 frontier branches
• Dataflow information was found for 29 frontier branches
• Mutation strategies were able to cover 19 of those branches

Dataflow implementation
• Instrument Java class files as they are loaded
• Maintain tags for each runtime value
• When two values interact, merge their tags
• Create summaries for JDK methods
