
Mutatis Mutandis Evaluating DBMS Test Adequacy with Mutation Testing








  1. Mutatis Mutandis: Evaluating DBMS Test Adequacy with Mutation Testing. Ivan T. Bowman, HANA Product Engineering. June 24, 2013. Public

  2. Agenda Test adequacy and why we want to evaluate it Mutation testing and how to apply it to database systems Conclusions and future work

  3. Test adequacy

  4. Challenges of testing database servers • Self-management features • Adapting to current conditions leads to increased internal state that needs to be tested • Relational equivalence • DML statements can be transformed into different equivalent forms • The correct answer may be returned even if a desirable optimization was not considered • Concurrent execution • Database systems need to work well with concurrent connections and intra-query parallelism • Concurrency leads to more primitives that need testing and more interactions to test • Performance requirements are not clearly specified • Tests of performance need to control many factors, limiting their sensitivity

  5. Goals of measuring test adequacy • We want to make sure that software is “thoroughly tested” • We want to compare advanced test generation techniques: • Random query generation (RAGS) • Genetic algorithms • We want to minimize test suites by removing expensive tests that contribute little • We want to prioritize test suites • For example, require all developers to run a minimum-acceptance test before submission

  6. Measuring test adequacy • Coverage techniques are commonly used • Measure how many statements / basic blocks / code combinations are executed under tests • The metric is simple to compute and understand • Coverage metrics are necessary but not sufficient for evaluating test adequacy • A direct approach to evaluating adequacy considers how many faults the tests find • We could try to use “natural” faults that result from developer errors found before or after shipping • We could manually “seed” faults in the code to see how many are detected • For example, disable the effect of specific types of mutexes • We could automatically introduce faults using patterns: mutation testing
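To make “necessary but not sufficient” concrete, here is a small hypothetical C++ sketch (the function and tests are invented, not taken from the talk or from SQL Anywhere): both tests execute every statement of row_limit, so statement coverage is identical, but only the second asserts on results and could detect a seeded fault such as a flipped comparison.

    #include <cassert>

    // Hypothetical helper, not SQL Anywhere code: clamp a requested row count.
    static int row_limit(int requested, int max_rows) {
        if (requested > max_rows)   // the first call in each test takes this branch
            return max_rows;
        return requested;
    }

    // Reaches full statement coverage of row_limit but asserts nothing, so it
    // cannot fail even if the comparison above is mutated (e.g. '>' to '<').
    static void weak_test() {
        row_limit(10, 5);
        row_limit(3, 5);
    }

    // Same coverage, but these assertions fail on the '<' mutant.
    static void stronger_test() {
        assert(row_limit(10, 5) == 5);
        assert(row_limit(3, 5) == 3);
    }

    int main() { weak_test(); stronger_test(); return 0; }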

  7. Mutation testing • Mutation testing evaluates the effectiveness of a test suite for detecting incorrect programs • Evaluation focuses on programs that are “close” to the correct version • Mutation operators are defined to alter source code • For example, change logical and (&&) to or (||) • Each operator creates a “mutant” program • A test suite “kills” the mutant if it passes with the original and fails with the mutant • Mutation adequacy score is the ratio of killed mutants to total mutants • A particular problem arises with mutants that don’t affect program semantics (equivalent mutants)
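As a hypothetical illustration of the && to || operator and of the kill criterion (the function and test below are invented for this sketch):

    #include <cassert>

    // Original predicate (invented example): an index is usable only if the
    // predicate is sargable and a suitable index exists.
    static bool can_use_index(bool sargable, bool index_exists) {
        return sargable && index_exists;
        // Mutant: return sargable || index_exists;   (&& changed to ||)
    }

    int main() {
        // This check passes on the original and fails on the mutant, so a
        // test suite containing it kills the mutant.
        assert(can_use_index(true, false) == false);
        return 0;
    }
    // Mutation adequacy score = killed mutants / total mutants.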

  8. Does statement coverage predict ability to kill mutants?

  9. General approach of mutation testing (flow diagram): from the Input Program P, generate the set of mutants P′; run each mutant P′ against the Input Test Set T; if P′ fails T, the mutant is killed; otherwise the mutant lives

  10. Mutation operators we evaluated • Function • Applies to all methods and C-style functions. Skip the contents of the function. • Condition • Applies to the condition in for, while, and if statements. Evaluate a mutated condition. • Switch • Applies to all switch statements. Add one to the expression. • Case • Applies to all case statements within a switch. Skip the contents of the case. • Default • Applies to default statements within a switch statement. Skip the default statements.
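The following sketch shows roughly what each operator could look like when applied to a made-up C++ fragment; the exact source rewrites performed by the paper's tooling may differ.

    enum OpCode { OP_SCAN, OP_SORT, OP_JOIN };

    // Function operator: skip the contents of the function or method.
    static void log_plan(int /*plan_id*/) { /* mutant skips this body */ }

    static int classify(OpCode op, int rows, int threshold) {
        // Condition operator: evaluate a mutated condition,
        // e.g. (rows >= threshold) instead of (rows > threshold).
        if (rows > threshold)
            return 1;

        // Switch operator: add one to the expression, i.e. switch (op + 1).
        switch (op) {
        case OP_SCAN:
            // Case operator: skip the contents of this case.
            log_plan(0);
            return 2;
        default:
            // Default operator: skip the default statements.
            return 3;
        }
    }

    int main() { return classify(OP_SCAN, 3, 5) == 2 ? 0 : 1; }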

  11. Mutation types and their outcomes (400K lines of query processing code)

  12. Practical issues

  13. Large systems generate many mutants • In a 400K line subset of SQL Anywhere, about 59K mutants were identified • Compiling these separately would take too much time and space • Running the general mutant algorithm would take 432 days • Generate a single meta-mutant where mutants can be enabled at run-time • A (original): if( len > 0 ) • B (meta-mutant): if( mutation_on(123) ? (len >= 0) : (len > 0) ) • Use the same code edits to track mutation coverage • Identify which tests cover mutated code lines • No need to evaluate whether a test kills a mutant it does not execute
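A minimal sketch of the meta-mutant idea around the slide's len > 0 example; the run-time switch shown here, a hypothetical mutation_on() that reads an environment variable, is an assumption rather than the mechanism actually used in SQL Anywhere.

    #include <cstddef>
    #include <cstdlib>
    #include <cstring>

    // Hypothetical switch: mutant 'id' is enabled when the environment
    // variable ENABLED_MUTANT is set to that id.
    static bool mutation_on(int id) {
        const char *v = std::getenv("ENABLED_MUTANT");
        return v != nullptr && std::atoi(v) == id;
    }

    static void copy_prefix(char *dst, const char *src, int len) {
        // Original:    if ( len > 0 )
        // Meta-mutant: original and mutated conditions coexist; mutant 123
        //              only takes effect when it is enabled at run time.
        if (mutation_on(123) ? (len >= 0) : (len > 0))
            std::memcpy(dst, src, static_cast<std::size_t>(len));
    }

    int main() {
        char buf[8];
        copy_prefix(buf, "abc", 3);
        return 0;
    }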

  14. Simplifying assumptions • Independence of mutations • If M = {m1, m2} is a set of mutants, then a test kills M iff it kills m1 or m2 (or both) • Test failures do not corrupt state • A test may terminate on an input with one of Crash, Timeout, Fail, or Pass • Tests deterministically find faults

  15. Proposed Improvements • 1. Test independent mutations simultaneously • Individual tests do not kill many mutants • On average, a test kills 12% of the mutants it covers • Guess that a test will not kill any of the currently living mutants it covers, and run it with all of them enabled • Use binary search if the test does fail with them enabled (see the sketch below) • If a test fails with a single mutant enabled, it kills that mutant • From 432 days to 17.3 days with this improvement • 2. Identify “lethal” mutations in a first pass • Execute each mutant with the cheapest test that kills it • 3. Order tests so that cheaper tests run first • Mutants that are easily killed are removed quickly
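The guess-then-bisect step can be sketched as follows, relying on the independence assumption; MutantSet, RunTest, and find_killed are hypothetical names for this illustration rather than the paper's actual interfaces.

    #include <functional>
    #include <vector>

    using MutantSet = std::vector<int>;                        // mutant ids
    using RunTest   = std::function<bool(const MutantSet &)>;  // true = test passes

    // Guess that the test passes with every covered living mutant enabled at
    // once; if it fails, bisect the set (valid under the independence
    // assumption) until the killed mutants are isolated.
    static MutantSet find_killed(const RunTest &run, const MutantSet &covered) {
        if (covered.empty() || run(covered))
            return {};                  // guess held: no covered mutant is killed
        if (covered.size() == 1)
            return covered;             // the single enabled mutant caused the failure
        MutantSet left(covered.begin(), covered.begin() + covered.size() / 2);
        MutantSet right(covered.begin() + covered.size() / 2, covered.end());
        MutantSet killed = find_killed(run, left);
        MutantSet more   = find_killed(run, right);
        killed.insert(killed.end(), more.begin(), more.end());
        return killed;
    }

    int main() {
        // Toy run function: the test fails whenever mutant 7 is enabled.
        RunTest run = [](const MutantSet &enabled) {
            for (int m : enabled) if (m == 7) return false;
            return true;
        };
        MutantSet killed = find_killed(run, {3, 5, 7, 9});
        return killed.size() == 1 && killed[0] == 7 ? 0 : 1;
    }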

  16. Reminder: general approach of mutation testing (flow diagram): from the Input Program P, generate the set of mutants P′; run each mutant P′ against the Input Test Set T; if P′ fails T, the mutant is killed; otherwise the mutant lives

  17. Adjusted approach of mutation testing (flow diagram): from the Input Program P, prepare a single Meta-Mutant P′; using the Input Test Set T, measure test time and mutant coverage; find lethal mutants with the cheapest covering test and delete them from the set of living mutants; then, for each test ordered by duration, guess that it passes on the living mutants it covers, recursively divide and conquer when it fails, and delete each killed mutant k

  18. Coverage and mutation adequacy as tests execute

  19. Conclusions and Future Work

  20. Conclusions • Mutation testing is practical for large systems such as DBMSs • Mutation testing gives a “harder” adequacy metric than code coverage • Our approach required simplifying assumptions that are not generally true • Small-scale testing shows these hold sufficiently for some purposes • Future work is needed to address simplifying assumptions • Tests do not deterministically find faults; in particular, some mutations are non-deterministic • Mutants may interact; can we characterize them as independent by analysis? • Future work includes comparing test generation frameworks on mutation adequacy

  21. Acknowledgements • Feedback from the SQL Anywhere Query Processing group was invaluable • In particular, detailed suggestions from Daniel J. Farrar helped shape this paper • Intern J. Devin Papineau provided significant motivation and early discussion for this work • The anonymous reviewers provided helpful and insightful suggestions

  22. Thank you Contact information: Ivan T. Bowman Research Manager of SQL Anywhere Query Processing Ivan.Bowman (at) sap.com
