140 likes | 239 Views
This analysis and survey delves into mutation testing, a fault-based technique used to evaluate the effectiveness of test cases by generating mutants and measuring their killing rate. Explore the underpinning hypotheses, execution process, weaknesses, and techniques for reducing mutants. Discover strategies for cutting execution time and efforts and addressing the "Equivalent Mutant" problem. Learn how mutation testing can be applied to various programming languages and artifacts. Consider the significance of mutants reduction in optimizing test cases.
 
                
                E N D
An Analysis and Survey of the Developmentof Mutation TestingbyYueJia and Mark Harmon A Quick Summary For SWE6673
Introductory Comments • Mutation testing is a fault-based technique (testing to show existence or absence of specific faults)of developing “mutants” to be tested by a set of test cases. • The type of faults are mostly ---- syntax based faults • Test case is ran against several mutant programs. The result is kept and a “Mutation Adequacy Score” is kept: Mutation Adequacy Score = (# of defect found)/( total # of seeded defects) or = (# of mutants killed)/(# of seeded non-equivalent mutants) ≤ 1 • In a sense, Mutation Test is evaluating how good is the set of test cases • An area that was started in 1970’s by Lipton, DeMillo & Hamlet
Mutation Testing (Theory & Process) • 2 Underpinning Hypotheses (assumptions): • Programmers are competent and make simple errors • Errors and faults (defects) are coupled – detection of simple faults lead to detection of many complex faults • Mutation Analysis & Execution Process: • For a program P, generate (develop) mutation P’ of P • Run the original program P with test cases T, fixing all the bugs in P. • Run the mutation program P’ with test cases T • Consider P’ “killed” (mutation detected) if results of running T against P and P’ are different • Continue running all the P’s and score the “killed” P’s versus all the developed P’s----- this “score” gives us the Mutation Adequacy Score. • But --- Mutation Testing has Problems/Weaknesses: • There may be a “high number” of mutants and the cost of running them all. • High amount of “human effort” required in the analysis of mutants is costly.
Mutation Test Case Problem: Mutants Reduction • Given a set of Mutants, M and a set of test cases T • let MST(M) stand for mutation score of M with T • Then the problem is to find a subset of mutants M’ from M where: MST(M) ≈ MST(M’) or (# of killed M/non-equivalent M)≈ (# of killed M’/non-equivalent M’) What do you think ---? Is this really the same?
Mutation Test Case Problem: Mutants Reduction(cont.) • Many different approaches have been tried to reduce the number of mutants for testing: • Random Sampling : found that randomly selecting 10% of mutants only reduces the effectiveness 16% • Mutation clustering: grouping mutants into clusters by traditional clustering techniques such as K-means and selecting representative mutants from each cluster • Selective Mutation: reducing the number of mutation operators to reduce the number of mutants generated without significant loss of effectiveness. • High Order Mutation: use higher order mutants (via multiple applications of mutation operators) to replace number of first order mutants
Techniques for Reducing Execution (Running) time and effort • Consider Strong, Weak, and Firm ways to analyze the killing of mutants during the execution process: • Strong: execute the whole program and if the mutant results differently, then it is “killed” • Weak: execute up to and include the part that is mutant and check the result • Firm: execute somewhere between Strong and Weak
Techniques for Reducing Execution (Running) time and effort (cont.) • Use different tools : • interpreter (expensive) • compiler (fast) • Compiler based (compile the whole original program P, but only compile the mutation part for each program from M • byte code translation to different platforms (for platform testing) • aspect oriented mutation • generate & compile different mutants • Use multiple platform such as distributed processors to simultaneously execute the mutants
Detection of “Equivalent Mutant” Problem • Detecting that a program P and one of its mutant is equivalent is theoretically undecidable: Program P “Equivalent” Mutant P’ for (inti=0; i<5; i++) for (inti=0; i != 5; i++) { no change to { no change to value of i within loop value of i within loop } } • P’ is a mutant of program P if P’ is syntactically different from P but is functionally same as P (e.g. produces same results). • Mutation Score based on non-equivalent mutants without complete detection of “equivalent” mutants implies that we can never get to 100% on mutation score. How much does this bother you ---?
Mutation Testing May be Applied: to Various Artifacts • To Program Source Code in following languages : • FROTRAN (22 mutant operators) • ADA • C (77 mutant operators) • JAVA (include OO mutant operators) • C++; SQL; PHP; AOP; COBOL; Spreadsheet Language • To Specifications in : • Finite State Machine • Star Chart • Petri Net • Web services in XML • Description of runtime environments
Mutation Testing May also be Applied to ---- • Constraint-based test data generation (found that 75% of the mutants can be killed with automatically generated test data.) [e.g. conditions in which mutants will die are written as algebraic constraints on test cases and then generate the test cases] • Regression testing where we reuse the test data: • Reschedule the test sequence based on the mutant killing score • Minimize the test cases that need to be re-run for regression based on the mutant killing score of the test cases
Some Empirical Results • Most of the earlier work dealt with code size of < 50 loc and increased in size as non-academic code was used • Mutation testing developed test cases, when compared against “all-use” data flow test cases, actually subsumes the all use-test cases. (16% more defects were found with mutation generated test cases than all-use data flow) • “Real world” software errors were compared against mutants, and found 85% of mutants were also “real world” faults. • Human errors and mutants are different enough that both automated and human generated faults are needed for testing
Some Concluding Remarks • Tools for Mutation Testing have increased in number since 2000. (30 years after its inception ---- technology maturation takes that long) • Continued Barrier: • Perception of complexity and high cost • Decreasing (can not be solved)Equivalent Mutants problem is gaining some momentum Does thinking about Mutants provide us insight into defects and help us generate better test cases? ----- Your thoughts?
Now that you have read about Mutation Testing --- consider the following pseudo – code Input pairs as test cases Integer m, n, max ; {5,2 } max = 5 Input m, n; {10, 305} max = 305 If (m ≥ n) {20, 20} max = 20 max = m; {-23, -5} max = -5 else { 0, -4} max = 0 max = n; print max; • Is this test set pretty good --- what mutant will not be killed? ----- what type of defect was this?