Comparison of Unit-Level Automated Test Generation Tools
Shuang Wang (co-authored with Jeff Offutt)
April 4, 2009
Motivation
• We have more software, but insufficient resources
• We need to be more efficient
• Frameworks like JUnit provide empty boxes
  • Hard question: what do we put in there?
• Automated test data generation tools
  • Reduce time and effort
  • Easier to maintain
  • Encapsulate knowledge of how to design and implement high-quality tests
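The "empty box" point can be made concrete with a plain-Java sketch. The `Account` class and its values are hypothetical illustrations, not from the talk: the framework supplies the test shell, but someone (or some tool) still has to choose the inputs and the expected outputs.

```java
// Hypothetical class under test, used only to illustrate the "empty box"
// problem: the hard part of a unit test is picking inputs and expected outputs.
class Account {
    private int balance;
    Account(int opening) { balance = opening; }
    void deposit(int amount) {
        if (amount <= 0) throw new IllegalArgumentException("amount must be positive");
        balance += amount;
    }
    int balance() { return balance; }
}

public class AccountTest {
    // In JUnit this would be an @Test method; a plain main keeps the sketch
    // self-contained and runnable without the framework on the classpath.
    public static void main(String[] args) {
        Account a = new Account(100);
        a.deposit(50);                          // a chosen input
        if (a.balance() != 150)                 // a chosen expected output
            throw new AssertionError("deposit failed");
        System.out.println("balance=" + a.balance());  // prints balance=150
    }
}
```

Filling in those two chosen values is exactly what the automated generators in this talk try to do for us.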
What's available out there?
• Our criteria:
  • Free
  • Unit-level
  • Automated test generation
  • Java
• Two commercial tools (excluded as not free):
  • AgitarOne
  • JTest
Experiment Goals and Design
• Compare three unit-level automatic test data generators, evaluated by their mutation scores
• Subjects: three free automated testing tools
  • JCrasher, TestGen4J, and JUB
• Control groups
  • Edge Coverage and Random Test
• Metric: mutation score
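The metric can be stated concretely: a test set's mutation score is the percentage of non-equivalent mutants it kills. A minimal sketch, with illustrative numbers:

```java
public class MutationScore {
    // mutation score = killed / (generated - equivalent), as a percentage.
    // Equivalent mutants behave identically to the original and can never
    // be killed, so they are excluded from the denominator.
    static double score(int killed, int generated, int equivalent) {
        return 100.0 * killed / (generated - equivalent);
    }

    public static void main(String[] args) {
        // e.g. 42 of 100 non-equivalent mutants killed
        System.out.println(score(42, 120, 20));  // prints 42.0
    }
}
```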
Experiment Design
[Diagram: muJava generates mutants from the program under test P. Five test sets — JCrasher (JC), TestGen4J (TG), JUB, a manually built Random test set, and a manually built Edge Coverage test set (EC) — are each run against the mutants, yielding one mutation score per test set.]
Subjects (Automatic Test Data Generators) and Control Groups
• Control groups:
  • Edge Coverage: one of the weakest and most basic test criteria
  • Random Test: the "weakest effort" testing strategy
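The Random Test control group can be sketched in a few lines, assuming a hypothetical `abs` method under test: pick arbitrary inputs and check only cheap, generic properties, because random testing has no oracle for exact expected values.

```java
import java.util.Random;

public class RandomTestSketch {
    // Hypothetical method under test (not from the experiment).
    static int abs(int x) { return x < 0 ? -x : x; }

    public static void main(String[] args) {
        Random rng = new Random(42);            // fixed seed for repeatability
        for (int i = 0; i < 5; i++) {
            int x = rng.nextInt(201) - 100;     // random input in [-100, 100]
            int y = abs(x);
            // Without an oracle, we can only assert generic properties,
            // e.g. that an absolute value is never negative.
            if (y < 0) throw new AssertionError("abs returned negative for " + x);
        }
        System.out.println("5 random inputs, no failure");
    }
}
```

This weakness of the oracle is one reason random test sets serve here as a lower-bound control rather than a serious testing strategy.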
muJava
• Create mutants
• Run tests
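The two muJava steps can be illustrated with a hand-written relational-operator mutant (the methods here are hypothetical, not muJava output): a mutant is "killed" when some test input makes it behave differently from the original.

```java
public class MutantDemo {
    // Original predicate.
    static boolean originalGE(int a, int b) { return a >= b; }
    // Relational-operator mutant: >= replaced by >.
    static boolean mutantGT(int a, int b)  { return a > b; }

    public static void main(String[] args) {
        // Non-boundary input (5, 3): both versions agree, mutant survives.
        System.out.println(originalGE(5, 3) == mutantGT(5, 3));  // prints true
        // Boundary input (3, 3) distinguishes them: mutant is killed.
        System.out.println(originalGE(3, 3) == mutantGT(3, 3));  // prints false
    }
}
```

A test set's mutation score is then simply how many such mutants its inputs manage to kill.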
Results & Findings
[Chart: total % of mutants killed by each test set]
Results & Findings
[Chart: efficiency of each test set]
Example: vendingMachine
• For vendingMachine, all mutation scores except edge coverage's are below 10%
• muJava creates dozens of mutants on its predicates, and the mostly random values created by the three generators have little chance of killing them
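Why random values rarely kill those mutants can be shown with a hypothetical vending-machine-style predicate over specific coin constants (illustrative only, not the actual vendingMachine code): only a handful of input values ever reach the guarded code, so mutants inside it are almost never exercised, let alone killed.

```java
import java.util.Random;

public class NarrowPredicate {
    // Hypothetical predicate: only three exact coin values are accepted.
    static boolean validCoin(int cents) {
        return cents == 5 || cents == 10 || cents == 25;
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        int hits = 0, trials = 10_000;
        for (int i = 0; i < trials; i++) {
            if (validCoin(rng.nextInt(1000))) hits++;  // random cents in [0, 1000)
        }
        // Only 3 of 1000 values satisfy the predicate, so roughly 0.3% of
        // random inputs ever reach the code it guards.
        System.out.println("hits: " + hits + " / " + trials);
    }
}
```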
Example: BoundedStack
• BoundedStack's scores were the second lowest for every test set except edge coverage
• Only two of its eleven methods have parameters; the three test generators depend largely on the method signature, so fewer parameters may mean weaker tests
Example: JCrasher
• JCrasher earned the highest mutation score of the three generators
• JCrasher uses invalid values to attempt to "crash" the class
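JCrasher's "crash the class" idea can be sketched by hand; the `firstChar` method is a hypothetical example, not JCrasher output. The generator feeds invalid values (such as `null`) to public methods and reports unexpected runtime exceptions as potential bugs:

```java
public class CrashStyleTest {
    // Hypothetical method with a robustness hole: no null check.
    static int firstChar(String s) { return s.charAt(0); }

    public static void main(String[] args) {
        // A JCrasher-style input: an invalid value chosen to provoke a crash.
        try {
            firstChar(null);
            System.out.println("no crash");
        } catch (RuntimeException e) {
            // prints: crashed with NullPointerException
            System.out.println("crashed with " + e.getClass().getSimpleName());
        }
    }
}
```

Such crash-provoking inputs tend to take unusual paths through the code, which plausibly explains the higher mutation score.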
Conclusion
• By themselves, these three tools generate tests that are very poor at detecting faults
• Among publicly accessible tools, criteria-based testing is hardly used
• We need better automated test generation tools
Contact
Shuang Wang
Computer Science Department, George Mason University
SWANGB@gmu.edu