Testing Games: Randomizing Regression Tests Using Game Theory. Nupul Kukreja, William G.J. Halfond, Milind Tambe & Manish Jain. Annual Research Review, April 30, 2014
Outline • Motivation • Problem(s) with traditional test scheduling • Game Theory and Randomization • Modeling software testing as a 2-player game • Evaluation • Conclusion & Future Work
Motivation [Figure: cartoon with the labels "Regression Suite", "Software Size", and "DEVS", and the captions "Dude! Suite XX is not gonna run!", "Let's CODE NOW, FIX LATER", and "The deadline is close too!"]
Motivating Problem(s) • Existing test case scheduling activities are deterministic • Developers know which test cases will be executed, and when • Developers can check in insufficiently tested code close to the delivery deadline • High turnaround time for fixing bugs in low-priority features • Random test scheduling helps, but treats every test case as equally important
Software Testing as a 2-Player Game • The tension between software testers and developers can be modeled as a two-player game • We solve the game to answer the following question: • Given an adaptive adversary (the developers) and resource constraints (on the testers), what is the optimal test-scheduling strategy, i.e., the one that maximizes the tester's expected payoff?
Game Theory • Study of strategic decision making among multiple players: corporations, software agents, testers and developers, regular humans, etc.
Two-player "Security" Game [Figure: example with a 60% / 40% split] Security game assumptions: • What is good for one player (+ve payoff) is bad for the other (-ve payoff) • The adversary can conduct perfect surveillance and act accordingly, i.e., the game can be posed as a simultaneous-move game or as a Stackelberg (leader-follower) game
Testing Game • ITC: insufficiently tested code • PC: perfect code, i.e., 100% tested
Testing Game – Payoffs • Payoffs are either positive or negative • Proportional to the value of the requirement, for both the tester and the developer • Payoffs can be derived in many ways: • Directly from requirement priorities • Expert judgment and/or planning poker • Delphi methods • Directly from test-case priorities
Defining Test Requirements • Could be black-box or white-box based • If black-box, a test requirement (TR) may correspond to a: • Module/component • Method • OR… the requirement as a whole • "We" group test cases by requirements • Each requirement is 'covered' by one or more test cases (or suites)
Not All Developers Are The Same • Commonly encountered personality traits • Lazy/sloppy • New Grad • Moderate/Average • Seasoned Developer • Each persona has a probability of “screwing up” i.e., checking in insufficiently tested code • We can compute these probabilities by looking at the team composition
The Testing Game • P(sloppy) = 3/10 • P(avg) = 5/10 • P(seasoned) = 2/10
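The persona probabilities above can be read straight off a team roster. The following is a minimal sketch, assuming a hypothetical 10-person team whose composition matches the slide. The per-persona chances of checking in insufficiently tested code (`screw_up`) are invented for illustration and do not come from the talk.

```python
from collections import Counter
from fractions import Fraction

def persona_distribution(team):
    """Return P(persona) as the share of team members with that persona."""
    counts = Counter(team)
    return {persona: Fraction(n, len(team)) for persona, n in counts.items()}

def prob_itc(dist, screw_up):
    """Overall chance that a randomly chosen developer checks in ITC."""
    return sum(dist[p] * screw_up[p] for p in dist)

# Hypothetical roster matching the slide: 3 sloppy, 5 average, 2 seasoned.
team = ["sloppy"] * 3 + ["avg"] * 5 + ["seasoned"] * 2
dist = persona_distribution(team)

# Assumed (made-up) per-persona probabilities of checking in ITC.
screw_up = {
    "sloppy": Fraction(1, 2),
    "avg": Fraction(1, 4),
    "seasoned": Fraction(1, 10),
}
overall = prob_itc(dist, screw_up)
```

Using `Fraction` keeps the values exact, so they can be compared directly with the 3/10, 5/10 and 2/10 on the slide.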
Solving the Testing Game • Output: the probability of scheduling test case 'i'
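The formulation on this slide did not survive extraction, so the following is only a sketch of one way such scheduling probabilities could be computed. It is not necessarily the authors' formulation. It assumes a zero-sum game against a single worst-case developer who targets one requirement. `u_cov[i]` and `u_unc[i]` are hypothetical tester payoffs when requirement `i` is hit and is, respectively is not, tested. `budget` is the number of requirements that can be tested. The tester's guaranteed payoff is found by bisection.

```python
def maximin_coverage(u_cov, u_unc, budget, iters=60):
    """Coverage probabilities that maximize the tester's worst-case payoff.

    Assumes u_cov[i] > u_unc[i] for every requirement i.
    Returns (coverage, value), where coverage[i] is the probability of
    testing requirement i and value is the guaranteed expected payoff.
    """
    def needed(v):
        # Smallest coverage per requirement so that its expected payoff
        # c * u_cov + (1 - c) * u_unc is at least v; None if infeasible.
        cov = []
        for c, u in zip(u_cov, u_unc):
            if v > c:
                return None  # even full coverage cannot reach v
            cov.append(max(0.0, (v - u) / (c - u)))
        return cov if sum(cov) <= budget + 1e-12 else None

    lo, hi = min(u_unc), min(u_cov)  # lo is always feasible (zero coverage)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if needed(mid) is not None:
            lo = mid
        else:
            hi = mid
    return needed(lo), lo

# Two requirements, one test slot: the costlier requirement gets more coverage.
coverage, value = maximin_coverage([1.0, 1.0], [-10.0, -5.0], budget=1)
```

In this toy instance the two expected payoffs are equalized, so the developer gains nothing by picking one requirement over the other.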
Example Testing Game • Create a schedule of 'm' test cases by sampling from the above distribution
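One way to turn per-test-case probabilities into a concrete schedule is systematic (comb) sampling. The talk does not name this technique, so treat it as a swapped-in illustration. The sketch assumes each `coverage[i]` lies in [0, 1] and that the values sum to the integer `m`. Under those assumptions every draw selects `m` distinct test cases and test case `i` is selected with probability `coverage[i]`.

```python
import math
import random

def sample_schedule(coverage, rng=random):
    """Draw one schedule; test case i is included with probability coverage[i].

    Lay the coverage values end to end on a line, drop a comb of points
    spaced 1 apart at a random offset, and pick every test case whose
    segment contains a comb point.
    """
    offset = rng.random()
    chosen, cum = [], 0.0
    for i, c in enumerate(coverage):
        # Test case i owns the segment from cum to cum + c.
        if math.floor(cum + c - offset) > math.floor(cum - offset):
            chosen.append(i)
        cum += c
    return chosen

# Four test cases, each scheduled half of the time, two slots per run.
schedule = sample_schedule([0.5, 0.5, 0.5, 0.5], random.Random(0))
```

The returned indices are in list order. Shuffling them before execution would hide the ordering from developers as well.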
Evaluation • Large simulation: • 1000 test requirements = 1 game • 1000 games randomly generated • Each game played/solved 1000 times over • Payoffs range over [-10, 10] • Constraint: can only schedule/execute 500 test cases • Compared with: • Deterministic test scheduling • Uniform random test scheduling • Weighted random test scheduling (tester-only weights; tester + developer weights)
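To make the comparison concrete, here is a rough sketch of how two of the baselines could be scored against a worst-case developer. It is not the authors' simulation harness. The payoff model (`u_cov`, `u_unc`), the rule used for the deterministic baseline, and the random game generator are all assumptions for illustration, and the sketch reports no results.

```python
import random

def worst_case_payoff(coverage, u_cov, u_unc):
    """Tester's expected payoff when the developer targets the weakest spot."""
    return min(c * uc + (1 - c) * uu
               for c, uc, uu in zip(coverage, u_cov, u_unc))

def deterministic(u_unc, m):
    """Always test the m requirements whose untested failure costs the most."""
    picked = set(sorted(range(len(u_unc)), key=lambda i: u_unc[i])[:m])
    return [1.0 if i in picked else 0.0 for i in range(len(u_unc))]

def uniform_random(n, m):
    """Every requirement is equally likely to be tested."""
    return [m / n] * n

def random_game(n, rng):
    """Hypothetical generator: payoffs drawn from within [-10, 10]."""
    u_cov = [rng.uniform(0.1, 10.0) for _ in range(n)]
    u_unc = [rng.uniform(-10.0, -0.1) for _ in range(n)]
    return u_cov, u_unc

# One game at the scale used in the talk: 1000 requirements, 500 test slots.
u_cov, u_unc = random_game(1000, random.Random(0))
det_score = worst_case_payoff(deterministic(u_unc, 500), u_cov, u_unc)
uni_score = worst_case_payoff(uniform_random(1000, 500), u_cov, u_unc)
```

Scoring by the worst case mirrors the adversary model of the security game: the developer is assumed to know the scheduling strategy and to exploit its weakest requirement.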
Limitations and Threats to Validity • Developers may not be adversarial • Developers may choose to be sloppy at times, with a particular probability • Lack of perfect historical observations of developers • Expected payoff is mostly a mathematical notion
Conclusion & Future Work • New approach to test case scheduling using Game Theory • Accounts for both the tester's and the developer's payoffs • Randomizing test cases deters developers from checking in insufficiently tested code • The test case distribution is optimal under resource constraints and maximizes payoff for worst-case developer behavior: robust! • Simulation shows positive results and is a first step toward analyzing the tester/developer relationship
Thank you! Questions?