Property-Based Testing: A Silver Bullet?

Property-Based Testing: A Silver Bullet? John Hughes, December 2009. Software testing: most famous quote. "Program testing can be used to show the presence of bugs, but never to show their absence!" E. W. Dijkstra. $60 billion. $240 billion. 50%. Money spent on testing ≈.




  1. Property-Based Testing: A Silver Bullet? John Hughes, December 2009

  2. Software testing: most famous quote • "Program testing can be used to show the presence of bugs, but never to show their absence!" • E. W. Dijkstra

  3. $60 billion

  4. $240 billion

  5. 50%

  6. Money spent on testing ≈ Cost of remaining errors

  7. Testing in Practice? • Human effort? • Test automation

  8. Large-Scale Test Automation • Software under test: 1.5 MLOC Erlang, 2 MLOC C++ • Automated test cases: 700 KLOC Erlang, run on a test server that reports test case failures • Nightly runs provide rapid feedback • New test cases added for each error found

  9. Typical Large Projects • Test team • Design team

  10. Bug Detection Rate

  11. Developer Testing • Why wait until system testing to use test automation? • Why not automate developers' own testing? • Unit testing: one module in isolation • A key element of agile development methods such as XP

  12. Claims for Unit Testing • Immediate discovery of errors • bug fixing is cheap! • Confidence in refactoring • cleaner code! • TDD: write tests first, then just enough code to make them pass • KISS! No wasted effort! • Tests serve as a specification • So keep test code clean and elegant! • Not too many… one test for each thing!

  13. TDD with HUnit in Haskell • Problem: implement a key-value store
      -- Type signatures
      empty  :: Store k v
      store  :: Ord k => k -> v -> Store k v -> Store k v
      find   :: Ord k => k -> Store k v -> Maybe v
      remove :: Ord k => k -> Store k v -> Store k v

  14. Step 1: Tests for find
      -- A test case is a definition
      testFindEmpty = "find empty" ~: find 1 empty @?= (Nothing :: Maybe Int)
      -- ~: attaches a name to a test case
      testFind1 = "find with one element" ~: find 1 (store 1 2 empty) @?= Just 2
      -- @?= is an assertion: equality where the left side is unknown
      -- and the right side is the "expected" value
      testFind2 = "find with two elements" ~: do
        let s = store 1 2 (store 3 4 empty)
        find 1 s @?= Just 2
        find 3 s @?= Just 4
        find 5 s @?= Nothing
      -- Several assertions and IO actions can be combined in one test case

  15. HUnit Glue
      import Test.HUnit
      main = runTestTT findTests
      findTests = "find tests" ~: [testFindEmpty, testFind1, testFind2]

  16. Step 2: Run the tests • A message from each failing test • A summary of the test results
      import Test.HUnit
      main = runTestTT findTests
      findTests = "find tests" ~: [testFindEmpty, testFind1, testFind2]
      data Store k v = Store
      find = undefined
      store = undefined
      remove = undefined
      empty = undefined
      *Main> main
      ### Error in: find tests:0:find empty
      Prelude.undefined
      ### Error in: find tests:1:find with one element
      Prelude.undefined
      ### Error in: find tests:2:find with two elements
      Prelude.undefined
      Cases: 3 Tried: 3 Errors: 3 Failures: 0
      Counts {cases = 3, tried = 3, errors = 3, failures = 0}
      *Main>

  17. Step 3: Write just enough code • Ordered binary trees • Don't write remove yet
      data Store k v = Nil | Node k v (Store k v) (Store k v)
        deriving (Eq, Show)
      find k Nil = Nothing
      find k (Node k' v l r)
        | k == k' = Just v
        | k < k'  = find k l
        | k > k'  = find k r
      store k v Nil = Node k v Nil Nil
      store k v (Node k' v' l r)
        | k' <= k = Node k' v' (store k v l) r
        | k' > k  = Node k' v' l (store k v r)
      empty = Nil
      remove = undefined

  18. Step 4: Repeat the tests
      *Main> main
      ### Failure in: find tests:2:find with two elements
      expected: Just 2
      but got: Nothing
      Cases: 3 Tried: 3 Errors: 0 Failures: 1
      Counts {cases = 3, tried = 3, errors = 0, failures = 1}
      testFind2 = "find with two elements" ~: do
        let s = store 1 2 (store 3 4 empty)
        find 1 s @?= Just 2
        find 3 s @?= Just 4
        find 5 s @?= Nothing

  19. Step 5: Debug the code
      store k v Nil = Node k v Nil Nil
      store k v (Node k' v' l r)
        | k' <= k = Node k' v' (store k v l) r    -- corrected to: k <= k'
        | k' > k  = Node k' v' l (store k v r)    -- corrected to: k > k'
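A minimal, self-contained sketch of the store once the two guards are corrected, so that keys less than or equal to the node's key go left. It uses only the Prelude rather than HUnit; the explicit type signatures and the `main` that prints the three lookups from testFind2 are additions for illustration, not part of the slides.

```haskell
module Main where

-- Ordered binary tree, as on the slides.
data Store k v = Nil | Node k v (Store k v) (Store k v)
  deriving (Eq, Show)

empty :: Store k v
empty = Nil

find :: Ord k => k -> Store k v -> Maybe v
find _ Nil = Nothing
find k (Node k' v l r)
  | k == k'   = Just v
  | k < k'    = find k l
  | otherwise = find k r

-- Guards corrected: compare the new key k against the node's key k'.
store :: Ord k => k -> v -> Store k v -> Store k v
store k v Nil = Node k v Nil Nil
store k v (Node k' v' l r)
  | k <= k'   = Node k' v' (store k v l) r
  | otherwise = Node k' v' l (store k v r)

main :: IO ()
main = do
  let s = store 1 2 (store 3 4 empty) :: Store Int Int
  print (find 1 s)  -- Just 2
  print (find 3 s)  -- Just 4
  print (find 5 s)  -- Nothing
```

With the original guards, key 1 ends up in the right subtree of key 3, which is why `find 1 s` returned `Nothing` in Step 4.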

  20. Step 6: Rerun the tests • All the tests pass, so now we write more tests!
      *Main> main
      Cases: 3 Tried: 3 Errors: 0 Failures: 0
      Counts {cases = 3, tried = 3, errors = 0, failures = 0}

  21. Next Iteration: tests for remove
      removeTests = "remove tests" ~: [testRemoveEmpty, testRemove1, testRemove2]
      testRemoveEmpty = "remove empty" ~: remove 1 empty @?= (empty :: Store Int Int)
      testRemove1 = "remove with one element" ~: remove 1 (store 1 2 empty) @?= empty
      testRemove2 = "remove with two elements" ~: do
        let s = store 1 2 (store 3 4 empty)
        remove 1 s @?= store 3 4 empty
        remove 3 s @?= store 1 2 empty
        remove 5 s @?= s

  22. Run the tests
      main = runTestTT allTests
      allTests = "all tests" ~: [findTests, removeTests]
      *Main> main
      ### Error in: all tests:1:remove tests:0:remove empty
      Prelude.undefined
      ### Error in: all tests:1:remove tests:1:remove with one element
      Prelude.undefined
      ### Error in: all tests:1:remove tests:2:remove with two elements
      Prelude.undefined
      Cases: 6 Tried: 6 Errors: 3 Failures: 0
      Counts {cases = 6, tried = 6, errors = 3, failures = 0}

  23. Implementation of remove [diagram with nodes labelled k,v and nk,nv]

  24. Code for remove
      remove k Nil = Nil
      remove k (Node k' v l r)
        | k == k' = case r of
                      Nil -> l
                      _   -> let (nk,nv) = leftmost r
                             in Node nk nv l (remove nk r)
        | k < k'  = Node k' v (remove k l) r
        | k > k'  = Node k' v l (remove k r)
      leftmost (Node k v Nil _) = (k,v)
      leftmost (Node _ _ l _)   = leftmost l

  25. Last step: rerun the tests • No failures, so we're done! • …or are we???
      *Main> main
      Cases: 6 Tried: 6 Errors: 0 Failures: 0
      Counts {cases = 6, tried = 6, errors = 0, failures = 0}

  26. Test Coverage • All tests pass, but how good are our tests? • Source code coverage tools tell us how much code we tested • When tests pass, check coverage!

  27. Using Haskell Program Coverage
      C:\Users\John Hughes\Desktop> ghc -fhpc Store.hs --make
      C:\Users\John Hughes\Desktop> Store.exe
      Cases: 6 Tried: 6 Errors: 0 Failures: 0
      C:\Users\John Hughes\Desktop> hpc markup Store.exe
      Writing: Main.hs.html…

  28. Marked-up source code • Conditions which were always true • Code which was never executed!

  29. Just… one… more… test… testRemoveNonEmptyRightBranch = "remove with non-empty right branch" ~: remove 1 (store 3 4 (store 1 2 empty)) @?= store 3 4 empty

  30. But… • This last test has nothing to do with a specification • It cannot be written "first" • Test cases written just to get coverage are often bad test cases • Many, many tests are needed: boring! • Does TDD really cut the mustard?

  31. Which Unit Tests to Write? • "You should test things that might break" (Kent Beck) • Not too few, not too many • Partition the cases into classes with similar behaviour • Write one test per partition

  32. Example: insertion into an ordered list • Partitions: • Empty list/non-empty list • Insert at beginning/middle/end • Test boundary values and middle values • Element already present/not present

  33. Partition tests
      insertEmpty   = "insert empty"   ~: insert 1 []    @?= [1]
      insertStart   = "insert start"   ~: insert 1 [2,4] @?= [1,2,4]
      insertMid     = "insert mid"     ~: insert 3 [2,4] @?= [2,3,4]
      insertEnd     = "insert end"     ~: insert 5 [2,4] @?= [2,4,5]
      insertPresent = "insert present" ~: insert 1 [1]   @?= [1,1]
      • insertNonEmpty covered by other cases • insertAbsent covered by other cases • Note: expected values play a major rôle!

  34. Sum or Product of Partitions? • Given several ways to partition inputs, should we • Write one test for each partition? • Write one test for each combination of partitions? • E.g. Non-empty/Beginning/Present, Non-empty/Beginning/Absent, … • (Can be smart and cover all pairs of partitions, or all triples…)
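To make the difference concrete, here is a small sketch that counts both options for the ordered-list example. The three data types and their constructor names are illustrative assumptions based on the partitions listed for insertion into an ordered list; they are not taken from the slides.

```haskell
module Main where

-- One type per way of partitioning the inputs (names are hypothetical).
data Emptiness = EmptyList | NonEmptyList deriving (Show, Eq, Enum, Bounded)
data Position  = Beginning | Middle | End  deriving (Show, Eq, Enum, Bounded)
data Presence  = Present | Absent          deriving (Show, Eq, Enum, Bounded)

-- Every value of an enumerable, bounded type.
allValues :: (Enum a, Bounded a) => [a]
allValues = [minBound .. maxBound]

-- "Sum": one test per class in each partition.
sumOfPartitions :: Int
sumOfPartitions =
  length (allValues :: [Emptiness])
    + length (allValues :: [Position])
    + length (allValues :: [Presence])

-- "Product": one test per combination of classes.
productOfPartitions :: [(Emptiness, Position, Presence)]
productOfPartitions =
  [(e, p, q) | e <- allValues, p <- allValues, q <- allValues]

main :: IO ()
main = do
  print sumOfPartitions               -- 7
  print (length productOfPartitions)  -- 12
  mapM_ print productOfPartitions
```

Some combinations in the product cannot occur (an empty list never already contains the element), so the number of meaningful combined tests is smaller than the raw count, but it still grows multiplicatively as partitions are added.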

  35. Property Based Testing • Generate test cases instead of inventing them • Automate the boring bit! • Reduce size of test code • Focus on properties true in all cases, not single tests • A true specification • Minimize failing test cases to speed debugging
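As a sketch of what this looks like for the earlier ordered-list example, the five hand-written partition tests can be replaced by two QuickCheck properties. The property names and the helper `isOrdered` are assumptions made for this illustration; `insert` and `sort` come from Data.List and `quickCheck` from Test.QuickCheck.

```haskell
module Main where

import Data.List (insert, sort)
import Test.QuickCheck (quickCheck)

-- True when every element is <= its successor.
isOrdered :: [Int] -> Bool
isOrdered xs = and (zipWith (<=) xs (drop 1 xs))

-- Inserting into an ordered list yields an ordered list.
-- The random input is sorted first so the precondition holds.
prop_insertOrdered :: Int -> [Int] -> Bool
prop_insertOrdered x xs = isOrdered (insert x (sort xs))

-- The result contains exactly the old elements plus the new one.
prop_insertElements :: Int -> [Int] -> Bool
prop_insertElements x xs = sort (insert x ys) == sort (x : ys)
  where ys = sort xs

main :: IO ()
main = do
  quickCheck prop_insertOrdered
  quickCheck prop_insertElements
```

The generated inputs cover empty and non-empty lists, insertion at either end or in the middle, and elements that are already present, without any of those cases being written out by hand.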

  36. Generating Stores • How to generate, how to shrink
      instance (Ord k, Arbitrary k, Arbitrary v) => Arbitrary (Store k v) where
        arbitrary = do
          (k,v,s) <- arbitrary
          elements [empty, store k v s, remove k s]
        shrink Nil = []
        shrink (Node k v l r) =
          [l,r] ++
          [Node k v l' r | l' <- shrink l] ++
          [Node k v l r' | r' <- shrink r] ++
          [Node k v' l r | v' <- shrink v]

  37. Model-based testing • What does a store represent? • A set of key-value pairs! Sorted so we can compare them with ==
      model s = List.sort (contents s)
      contents Nil = []
      contents (Node k v l r) = (k,v) : contents l ++ contents r

  38. Properties: Agreement with the model
      prop_find k s =
        find k s == lookup k (model s)
        where types = s :: Store Int Int
      prop_store k v s =
        model (store k v s) == List.insert (k,v) (model s)
        where types = s :: Store Int Int
      prop_remove k s =
        case find k s of
          Just v  -> model (remove k s) == model s List.\\ [(k,v)]
          Nothing -> remove k s == s
        where types = s :: Store Int Int

  39. Testing the Properties • We forgot to consider duplicate keys!
      *Main> quickCheckWith stdArgs{maxSuccess=10000} prop_find
      *** Failed! Falsifiable (after 95 tests and 1 shrink):
      1
      Node 1 1 (Node 1 (-1) Nil Nil) Nil
      prop_find k s =
        case [v | (k',v) <- model s, k==k'] of
          [] -> find k s == Nothing
          vs -> find k s `elem` map Just vs
        where types = s :: Store Int Int
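The counterexample can be replayed by hand to see why the first version of prop_find fails and the revised one passes. This is a sketch using only the Prelude and Data.List; the names `counterexample`, `originalProp`, and `revisedProp` are introduced here for illustration and stand in for the two versions of prop_find.

```haskell
module Main where

import Data.List (sort)

data Store k v = Nil | Node k v (Store k v) (Store k v)
  deriving (Eq, Show)

find :: Ord k => k -> Store k v -> Maybe v
find _ Nil = Nothing
find k (Node k' v l r)
  | k == k'   = Just v
  | k < k'    = find k l
  | otherwise = find k r

contents :: Store k v -> [(k, v)]
contents Nil = []
contents (Node k v l r) = (k, v) : contents l ++ contents r

model :: (Ord k, Ord v) => Store k v -> [(k, v)]
model = sort . contents

-- The shrunk counterexample reported by QuickCheck: key 1 occurs twice.
counterexample :: Store Int Int
counterexample = Node 1 1 (Node 1 (-1) Nil Nil) Nil

-- First version: assumes the model holds one value per key.
originalProp :: Int -> Store Int Int -> Bool
originalProp k s = find k s == lookup k (model s)

-- Revised version: any value stored under the key is acceptable.
revisedProp :: Int -> Store Int Int -> Bool
revisedProp k s =
  case [v | (k', v) <- model s, k == k'] of
    [] -> find k s == Nothing
    vs -> find k s `elem` map Just vs

main :: IO ()
main = do
  print (find 1 counterexample)            -- Just 1
  print (lookup 1 (model counterexample))  -- Just (-1)
  print (originalProp 1 counterexample)    -- False
  print (revisedProp 1 counterexample)     -- True
```

`find` returns the value at the node nearest the root, while `lookup` on the sorted model returns the pair with the smallest value, so the two disagree as soon as a key is stored twice.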

  40. Testing remove • We're not removing the duplicate key…???
      *Main> quickCheckWith stdArgs{maxSuccess=10000} prop_remove
      +++ OK, passed 10000 tests.
      *Main> quickCheckWith stdArgs{maxSuccess=10000} prop_remove
      *** Failed! Falsifiable (after 2 tests):
      0
      Node 0 1 Nil (Node 1 1 (Node 1 0 Nil Nil) Nil)
      *Main> let s = Node 0 1 Nil (Node 1 1 (Node 1 0 Nil Nil) Nil)
      *Main> find 0 s
      Just 1
      *Main> remove 0 s
      Node 1 0 Nil (Node 1 0 Nil Nil)

  41. The Bug • Removes nk with the wrong value
      remove k Nil = Nil
      remove k (Node k' v l r)
        | k == k' = case r of
                      Nil -> l
                      _   -> let (nk,nv) = leftmost r
                             in Node nk nv l (remove nk r)
        | k < k'  = Node k' v (remove k l) r
        | k > k'  = Node k' v l (remove k r)
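One possible repair, sketched under the assumption that the leftmost node should be detached directly instead of being looked up again by key. The helper `removeLeftmost` is hypothetical and does not appear in the slides, and the talk may have fixed the bug differently. The example replays the counterexample from the failing prop_remove run.

```haskell
module Main where

data Store k v = Nil | Node k v (Store k v) (Store k v)
  deriving (Eq, Show)

-- Hypothetical helper: detach the leftmost node, returning its pair
-- together with the tree that remains. Nothing for an empty tree.
removeLeftmost :: Store k v -> Maybe ((k, v), Store k v)
removeLeftmost Nil = Nothing
removeLeftmost (Node k v l r) =
  case removeLeftmost l of
    Nothing       -> Just ((k, v), r)             -- this node is the leftmost
    Just (kv, l') -> Just (kv, Node k v l' r)

-- Same structure as the slide's remove, but the replacement node is
-- taken out by position, so a duplicate key cannot be confused with it.
remove :: Ord k => k -> Store k v -> Store k v
remove _ Nil = Nil
remove k (Node k' v l r)
  | k == k' = case removeLeftmost r of
      Nothing             -> l
      Just ((nk, nv), r') -> Node nk nv l r'
  | k < k'    = Node k' v (remove k l) r
  | otherwise = Node k' v l (remove k r)

main :: IO ()
main = do
  let s = Node 0 1 Nil (Node 1 1 (Node 1 0 Nil Nil) Nil) :: Store Int Int
  print (remove 0 s)  -- Node 1 0 Nil (Node 1 1 Nil Nil)
```

On this input the original code produced `Node 1 0 Nil (Node 1 0 Nil Nil)`: it promoted the pair (1,0) and then removed the first node with key 1 it met, which was (1,1). The sketch keeps both pairs for key 1.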

  42. How are we doing for coverage?

  43. HUnit vs QuickCheck • QuickCheck • Finds more bugs • With better coverage • In less time • With less test code • And a clearer specification

  44. 3G Radio Base Station [diagram labels: Setup, OK, Setup, OK, Reject]

  45. Media Proxy • Multimedia IP-telephony (IMS) • Connects calls across a firewall • Test adding and removing callers from a call [diagram labels: Add, Add, Sub, Add, Sub, Add, Sub, Call Full]

  46. Property Based Testing is Great! • Improves quality! • Finds more bugs, achieves better coverage • Reduces cost! • Less test code, shrinking speeds diagnosis • And it's actually fun! • "Please can I write some tests today?"

  47. How do we know? • Case studies in industry + Real software development + Professional software developers - Unrepeatable - Difficult to control • Experiments in universities + Focus on a single question + Carefully controlled • Student volunteers • Unrealistically small

  48. Test Driven Development • Case Studies • YES, quality is improved • NO, cost is not reduced (costs rise about 20%) • Experiments • YES, code is developed faster • NO, quality is not improved (it drops)

  49. Property Based Testing • Case studies + Property-based testing does increase quality + Property-based testing does reduce cost • PBT during system testing actually reduces the quality of conventional unit testing done earlier

  50. Our Experiment • Hypothesis: • Property-based testing is more effective than conventional unit testing • Effective? • Quality (better quality in the same time) • number of bugs, number of tests failed, subjective judgement • Test quality • code coverage, subjective judgement • Design quality • Size of code, size of test code, subjective judgement
