
Tao Xie University of Illinois at Urbana-Champaign

Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes, Code Digger, and Pex4Fun. Tao Xie University of Illinois at Urbana-Champaign.


Presentation Transcript


  1. Transferring an Automated Test Generation Tool to Practice: From Pex to Fakes, Code Digger, and Pex4Fun • Tao Xie • University of Illinois at Urbana-Champaign • Part of the research work described in this talk was done in collaboration with the Pex team (Nikolai Tillmann, Peli de Halleux, et al.) at Microsoft Research, students at Illinois ASE, and other collaborators; part of the research work was done by the Pex team only

  2. (Automated) Test Generation • Human • Expensive, incomplete, … • Brute Force • Pairwise, predefined data, etc… • Tool Automation!!

  3. State-of-the-Art/Practice Test Generation Tools Running Symbolic PathFinder ... … ====================================================== results no errors detected ====================================================== statistics elapsed time: 0:00:02 states: new=4, visited=0, backtracked=4, end=2 search: maxDepth=3, constraints=0 choice generators: thread=1, data=2 heap: gc=3, new=271, free=22 instructions: 2875 max memory: 81MB loaded code: classes=71, methods=884 …

  4. Successful Case of MSR Testing Tool: Pex & Relatives • Pex (released in May 2008) • 30,388 downloads (20 months, Feb 08-Oct 09) • Active user community: 1,436 forum posts during ~3 years (Oct 08-Nov 11) • Moles (released in Sept 2009) • Shipped with VS 12 as Fakes • “Provide Microsoft Fakes w/ all Visual Studio editions” got 1,457 community votes • Code Digger (released in Oct 2008 for VS 08/10, in Apr 2013 in VS Gallery for VS 12) • 22,466 downloads (10 months, Apr 13-Jan 14) http://research.microsoft.com/en-us/projects/pex/

  5. Example Comments for Code Digger in VS Gallery • “Great tool to generate unit tests for parameter boundary tests. I like to see it integrated into Visual Studio and the testing features as far as in ReSharper! :)” • “What an awesome tool.. Help us to explore our logic by providing accurate input parameter for each logic branch.. You should try this as one of your ultimate tool :) It really saves a lot of our time to explore every logic branch in our apps..”

  6. Example Comments for Code Digger in VS Gallery cont. • “What a fantastic tool. Whilst it’s not bullet proof, it shows amazing promise. I ran the Code Digger over a number of real-world methods and it immediately identified dozens of edge cases we hadn’t thought of. This is getting rolled-out to my team TODAY! Well done. Brilliant. Really brilliant.” • “Top stuff here. Very anxious for more of the Pex features that were available in VS 2010 Pex & Moles (like auto-gen unit tests). This tool is poised to become indispensable for anyone writing solid suites of unit tests.”

  7. Pex4Fun http://pex4fun.com/ 1,462,489 clicked 'Ask Pex!'

  8. Behind the Scenes of Pex4Fun • Goal: behavior of Secret Impl == Player Impl • Player Implementation: class Player { public static int Puzzle(int x) { return x; } } • Secret Implementation: class Secret { public static int Puzzle(int x) { if (x <= 0) return 1; return x * Puzzle(x-1); } } • Driver: class Test { public static void Driver(int x) { if (Secret.Puzzle(x) != Player.Puzzle(x)) throw new Exception("Mismatch"); } }
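The comparison on this slide can be tried without Pex. The sketch below is only an illustration: it replaces Pex's dynamic symbolic execution with a plain scan over a small input range, and `MismatchFinder` and `FindMismatch` are hypothetical names that do not appear in the talk. `Player` and `Secret` are copied from the slide.

```csharp
using System;

// Player and Secret implementations, as on the slide.
class Player
{
    public static int Puzzle(int x) { return x; }
}

class Secret
{
    public static int Puzzle(int x)
    {
        if (x <= 0) return 1;
        return x * Puzzle(x - 1);
    }
}

static class MismatchFinder
{
    // Hypothetical helper (not part of Pex4Fun): scans [from, to] and returns
    // the first input on which the two implementations disagree, or null if
    // they agree on the whole range. Pex finds such inputs by constraint
    // solving rather than by enumeration.
    public static int? FindMismatch(int from, int to)
    {
        for (int x = from; x <= to; x++)
            if (Secret.Puzzle(x) != Player.Puzzle(x))
                return x;
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(FindMismatch(0, 10));
        Console.WriteLine(FindMismatch(1, 10));
    }
}
```

On the range 0..10 the first disagreement is at x = 0 (Secret returns 1, Player returns 0); starting from 1 it is at x = 3 (6 versus 3). The range is kept small so that the factorial in `Secret.Puzzle` stays within `int`.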

  9. Example User Feedback on Pex4Fun • “I used to love the first person shooters and the satisfaction of blowing away a whole team of Noobies playing Rainbow Six, but this is far more fun.” • “I’m afraid I’ll have to constrain myself to spend just an hour or so a day on this really exciting stuff, as I’m really stuffed with work.” • “It really got me *excited*. The part that got me most is about spreading interest in teaching CS: I do think that it’s REALLY great for teaching | learning!”

  10. Code Hunt: Redesigned as Game https://www.codehunt.com/

  11. ICFP Programming Contest 2013 • August 8 – 11, 2013 • 300+ teams wrote tools to synthesize bit-vector programs • These tools were evaluated on a set of 1,800 benchmark problems • Main goal: • How would the top-teams fare against the best SMT solutions? http://research.microsoft.com/~nswamy/icfpc.pptx http://research.microsoft.com/en-us/events/icfpcontest2013/

  12. ICFP Programming Contest 2013 as Program Synthesis Game • GAME: “I have a secret program A. Can you guess what it is? You have 5 minutes.” • PLAYER: “Hmm. Ok, so what is A(11) and A(12) then?” • GAME: “Since you ask so nicely: A(11)=12 and A(12)=13” • PLAYER: “Ah. I bet A = λx. x+1” • GAME: “Let me check …” (query.smt2: A ≈ λx. x+1 ? No! Counterexample: A(9) <> (λx.x+1) 9) “Nope. A(9)=9.” • PLAYER: “Can you tell me what A(16), A(42), A(128) are?” • GAME: “A(16)=17, A(42)=43, A(128)=129.” • PLAYER: “Ah ha! I guess A = λx. if x & 1 = 0 then x else x + 1” • GAME: “Let me check …” (query.smt2: A ≈ λx. if x&1=0…? Yes!) “Yep! That's right! You score one point.” http://research.microsoft.com/~nswamy/icfpc.pptx http://research.microsoft.com/en-us/events/icfpcontest2013/

  13. What Lies Behind Pex • NOT Random: • Cheap, Fast • “It passed a thousand tests” feeling • … • But Dynamic Symbolic Execution: e.g., Pex, CUTE, EXE • White box • Constraint Solving

  14. Dynamic Symbolic Execution • Loop: Choose next path → Solve → Execute & Monitor • Code to generate inputs for: void CoverMe(int[] a) { if (a == null) return; if (a.Length > 0) if (a[0] == 1234567890) throw new Exception("bug"); } • Execution tree (each condition negated in turn): a==null (T/F), a.Length>0 (T/F), a[0]==123… (T/F) • Iterations (constraints to solve → data → observed constraints): • (none) → null → a==null • a!=null → {} → a!=null && !(a.Length>0) • a!=null && a.Length>0 → {0} → a!=null && a.Length>0 && a[0]!=1234567890 • a!=null && a.Length>0 && a[0]==1234567890 → {123…} → a!=null && a.Length>0 && a[0]==1234567890 • Done: There is no path left.
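The four iterations on this slide can be replayed by hand. The sketch below is an assumption-laden illustration, not Pex itself: the inputs are hard-coded in the order the slide lists them, whereas a DSE tool would obtain each one from a constraint solver. `CoverMeDemo` and `Run` are hypothetical names, and the last input uses the constant 1234567890 taken from the `CoverMe` code.

```csharp
using System;

static class CoverMeDemo
{
    // Method under test, as on the slide.
    public static void CoverMe(int[] a)
    {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

    // Hypothetical helper: returns "ok" if CoverMe returns normally,
    // otherwise the message of the exception it threw.
    public static string Run(int[] a)
    {
        try { CoverMe(a); return "ok"; }
        catch (Exception e) { return e.Message; }
    }

    public static void Main()
    {
        // One input per path, hard-coded here instead of being solved for.
        int[][] inputs = { null, new int[0], new[] { 0 }, new[] { 1234567890 } };
        foreach (int[] input in inputs)
            Console.WriteLine(Run(input));
    }
}
```

Only the fourth input reaches the `throw`; the first three each exercise a different early exit, which is why four runs are enough to cover every path of this method.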

  15. Explosion of Search Space There are decision procedures for individual path conditions, but… • Number of potential paths grows exponentially with number of branches • Reachable code not known initially • Without guidance, same loop might be unfolded forever Fitnex search strategy [Xie et al. DSN 09] http://research.microsoft.com/apps/pubs/default.aspx?id=81089

  16. DSE Example TestLoop(0, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: !(x == 90) ↓ New path condition: (x == 90) ↓ New test input: TestLoop(90, {0})

  17. DSE Example TestLoop(90, {0}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && !(y[0] == 15) ↓ New path condition: (x == 90) && (y[0] == 15) ↓ New test input: TestLoop(90, {15})

  18. Challenge in DSE TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (x+1 == 110) ↓ New test input: No solution!?

  19. A Closer Look TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && !(1 < y.Length) && !(x+1 == 110) ↓ New path condition: (x == 90) && (y[0] == 15) && (0 < y.Length) && (1 < y.Length) → Expand array size

  20. A Closer Look TestLoop(90, {15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } We can have infinitely many paths! Manual analysis → need at least 20 loop iterations to cover the target branch. Exploring all paths up to 20 loop iterations is infeasible: 2^20 paths
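The numbers on this slide can be checked directly. The sketch below assumes the simplest way of counting: each loop iteration either takes the `y[i] == 15` branch or does not, so an array of length n yields 2^n combinations. `LoopPaths`, `PathCount` and `Fifteens` are hypothetical names; `TestLoop` is the method from the slide.

```csharp
using System;
using System.Linq;

static class LoopPaths
{
    // Method under test, as on the slide.
    public static bool TestLoop(int x, int[] y)
    {
        if (x == 90)
        {
            for (int i = 0; i < y.Length; i++)
                if (y[i] == 15) x++;
            if (x == 110) return true;
        }
        return false;
    }

    // Hypothetical helper: number of branch combinations inside the loop for
    // an array of the given length (two outcomes per element).
    public static long PathCount(int length) => 1L << length;

    // Hypothetical helper: an array holding `count` copies of 15.
    public static int[] Fifteens(int count) => Enumerable.Repeat(15, count).ToArray();

    public static void Main()
    {
        Console.WriteLine(PathCount(20));
        Console.WriteLine(TestLoop(90, Fifteens(20)));
        Console.WriteLine(TestLoop(90, Fifteens(19)));
    }
}
```

`PathCount(20)` is 1,048,576. Twenty elements equal to 15 raise x from 90 to 110 and reach the target branch, while nineteen leave x at 109, which matches the manual analysis on the slide.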

  21. Fitnex: Fitness-Guided Exploration [Xie et al. DSN 2009] TestLoop(90, {15, 0}) TestLoop(90, {15, 15}) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Key observations: with respect to the coverage target • not all paths are equally promising for branch-node flipping • not all branch nodes are equally promising to flip • Our solution: • Prefer to flip branch nodes on the most promising paths • Prefer to flip the most promising branch nodes on paths • Fitness function to measure “promising” extents

  22. Fitness Function • FF computes fitness value (distance between the current state and the goal state) • Search tries to minimize fitness value [Tracey et al. 98, Liu et al. 05, …]

  23. Fitness Function for (x == 110) public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x|

  24. Compute Fitness Values for Paths public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x| • (x, y) → fitness value: • (90, {0}) → 20 • (90, {15}) → 19 • (90, {15, 0}) → 19 • (90, {15, 15}) → 18 • (90, {15, 15, 0}) → 18 • (90, {15, 15, 15}) → 17 • (90, {15, 15, 15, 0}) → 17 • (90, {15, 15, 15, 15}) → 16 • (90, {15, 15, 15, 15, 0}) → 16 • (90, {15, 15, 15, 15, 15}) → 15 • … • Give preference to flip paths with better fitness values • We still need to address which branch node to flip on paths …
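The table can be reproduced with a few lines of code. This is a minimal sketch that assumes the fitness value is measured at the `x == 110` check, after the loop has run; `FitnessDemo`, `XAtTarget` and `Fitness` are hypothetical names, not Pex or Fitnex APIs.

```csharp
using System;

static class FitnessDemo
{
    // Value of x when execution reaches the target check `x == 110`,
    // mirroring the loop in TestLoop. Assumes the input already satisfies
    // x == 90, so the outer branch is entered.
    public static int XAtTarget(int x, int[] y)
    {
        foreach (int v in y)
            if (v == 15) x++;
        return x;
    }

    // Fitness function from the slide: |110 - x|, where smaller is better.
    public static int Fitness(int x, int[] y) => Math.Abs(110 - XAtTarget(x, y));

    public static void Main()
    {
        int[][] ys = { new[] { 0 }, new[] { 15 }, new[] { 15, 0 }, new[] { 15, 15 } };
        foreach (int[] y in ys)
            Console.WriteLine("(90, {" + string.Join(", ", y) + "}) -> " + Fitness(90, y));
    }
}
```

The four inputs in `Main` give 20, 19, 19 and 18, the first four rows of the table. Appending a 0 never changes the value, while appending a 15 lowers it by one.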

  25. Compute Fitness Gains for Branches public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitness function: |110 – x| • (x, y) [flip] → fitness value: • (90, {0}) → 20 • (90, {15}) [flip b4] → 19 • (90, {15, 0}) [flip b2] → 19 • (90, {15, 15}) [flip b4] → 18 • (90, {15, 15, 0}) [flip b2] → 18 • (90, {15, 15, 15}) [flip b4] → 17 • (90, {15, 15, 15, 0}) [flip b2] → 17 • (90, {15, 15, 15, 15}) [flip b4] → 16 • (90, {15, 15, 15, 15, 0}) [flip b2] → 16 • (90, {15, 15, 15, 15, 15}) [flip b4] → 15 • … • Branch b1: i < y.Length • Branch b2: i >= y.Length • Branch b3: y[i] == 15 • Branch b4: y[i] != 15 • Flipping branch b4 (b3) gives us average 1 (-1) fitness gain (loss) • Flipping branch b2 (b1) gives us average 0 fitness gain (loss)

  26. Compute Fitness Gain for Branches cont. • For a flipped node leading to F_new, find the old fitness value F_old before flipping • Assign fitness gain (F_old – F_new) to the branch of the flipped node • Assign fitness gain (F_new – F_old) to the other branch of the flipped node • Compute the average fitness gain for each branch over time
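The bookkeeping described on this slide might look like the sketch below. It is an illustration under stated assumptions, not the Fitnex implementation: `FitnessGainTracker`, `RecordFlip` and `AverageGain` are hypothetical names, branches are identified by the labels b1 to b4 from the previous slide, and a branch with no recorded flips is given an average gain of 0.

```csharp
using System;
using System.Collections.Generic;

class FitnessGainTracker
{
    // Running sum and count of observed gains per branch label.
    private readonly Dictionary<string, (double Sum, int Count)> stats =
        new Dictionary<string, (double Sum, int Count)>();

    private void Add(string branch, double gain)
    {
        stats.TryGetValue(branch, out var s); // s is (0, 0) if the branch is new
        stats[branch] = (s.Sum + gain, s.Count + 1);
    }

    // A node on `flippedBranch` was flipped to `otherBranch`, and the fitness
    // value changed from fOld to fNew.
    public void RecordFlip(string flippedBranch, string otherBranch, int fOld, int fNew)
    {
        Add(flippedBranch, fOld - fNew); // gain credited to the flipped branch
        Add(otherBranch, fNew - fOld);   // opposite gain for the other branch
    }

    // Average gain so far; 0 for branches never seen (an assumption).
    public double AverageGain(string branch) =>
        stats.TryGetValue(branch, out var s) && s.Count > 0 ? s.Sum / s.Count : 0.0;

    public static void Main()
    {
        var tracker = new FitnessGainTracker();
        tracker.RecordFlip("b4", "b3", 20, 19); // (90, {0})     -> (90, {15})
        tracker.RecordFlip("b2", "b1", 19, 19); // (90, {15})    -> (90, {15, 0})
        tracker.RecordFlip("b4", "b3", 19, 18); // (90, {15, 0}) -> (90, {15, 15})
        foreach (string b in new[] { "b1", "b2", "b3", "b4" })
            Console.WriteLine(b + ": " + tracker.AverageGain(b));
    }
}
```

With the three flips in `Main`, b4 averages a gain of 1, b3 averages -1, and b1 and b2 average 0, which agrees with the averages quoted for the TestLoop example.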

  27. Search Frontier • Each branch node candidate for being flipped is prioritized based on its composite fitness value: • (Fitness value of node – Fitness gain of its branch) • Select first the one with the best composite fitness value http://research.microsoft.com/apps/pubs/default.aspx?id=81089
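One way to read this prioritization rule is sketched below. The slide does not say what "best" means numerically; since the search minimizes the fitness value, the sketch assumes that the lowest composite value wins. `Candidate`, `SearchFrontier` and `SelectNext` are hypothetical names, and the sample numbers are taken from the TestLoop tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A branch node that could be flipped next.
sealed class Candidate
{
    public string Label { get; }
    public double FitnessValue { get; } // fitness value of the node's path
    public double BranchGain { get; }   // average fitness gain of its branch

    public Candidate(string label, double fitnessValue, double branchGain)
    {
        Label = label;
        FitnessValue = fitnessValue;
        BranchGain = branchGain;
    }

    // Composite fitness value from the slide: fitness value minus fitness gain.
    public double Composite => FitnessValue - BranchGain;
}

static class SearchFrontier
{
    // Assumption: the best composite value is the smallest one.
    public static Candidate SelectNext(IEnumerable<Candidate> frontier) =>
        frontier.OrderBy(c => c.Composite).First();

    public static void Main()
    {
        var frontier = new List<Candidate>
        {
            new Candidate("(90, {15}) at b2", 19, 0),
            new Candidate("(90, {15, 15}) at b2", 18, 0),
            new Candidate("(90, {15, 15, 0}) at b4", 18, 1),
        };
        Console.WriteLine(SelectNext(frontier).Label);
    }
}
```

The third candidate is selected: its path is already among the closest to the target (18) and its branch has a positive average gain, giving a composite value of 17.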

  28. Successful Case of MSR Testing Tool: Pex & Relatives • Pex (released in May 2008): • 30,388 downloads (20 months, Feb 08-Oct 09) • Active user community: 1,436 forum posts during ~3 years (Oct 08-Nov 11) • Moles (released in Sept 2009) • Shipped with VS 12 as Fakes • “Provide Microsoft Fakes w/ all Visual Studio editions” got 1,457 community votes • Code Digger (released in Oct 2008 for VS 08/10, in Apr 2013 in VS Gallery for VS 12) • 22,466 downloads (10 months, Apr 13-Jan 14) • How to make such a successful case?

  29. Lesson 1. Started as (Evolved) Dream • Parameterized Unit Tests Supported by Pex: void TestAdd(ArrayList a, object o) { Assume.IsTrue(a != null); int i = a.Count; a.Add(o); Assert.IsTrue(a[i] == o); } • Surrounding (Moles/Fakes) • Simplifying (Code Digger) • Retargeting (Pex4Fun/Code Hunt)

  30. Lesson 2. Chicken and Egg Macro Perspective • Developer/manager: “Who is using your tool?” • Pex team: “Do you want to be the first?” • Developer/manager: “I love your tool but no.” Tool Adoption by (Mass) Target Users Tool Shipping with Visual Studio Micro Perspective

  31. Lesson 3. Human Factors – Generated Data Consumed by Human • Developer: “Code digger generates a lot of “\0” strings as input. I can’t find a way to create such a string via my own C# code. Could any one show me a C# snippet? I meant zero terminated string.” • Pex team: “In C#, a \0 in a string does not mean zero-termination. It’s just yet another character in the string (a very simple character where all bits are zero), and you can create as Pex shows the value: “\0”.” • Developer: “Your tool generated “\0”” • Pex team: “What did you expect?” • Developer: “Marc.”
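The Pex team's answer can be confirmed with a short program. This is a small sketch using only standard C# string behavior; `NullCharDemo` and `NullCharString` are hypothetical names.

```csharp
using System;

static class NullCharDemo
{
    // The literal "\0" is a one-character string whose only character is U+0000.
    public static string NullCharString() => "\0";

    public static void Main()
    {
        string s = NullCharString();
        Console.WriteLine(s.Length);      // 1: the string is not empty
        Console.WriteLine((int)s[0]);     // 0: the character has all bits zero
        Console.WriteLine("a\0b".Length); // 3: \0 does not terminate the string
    }
}
```

C# strings carry their length explicitly, so a `\0` in the middle of a string does not cut it short, unlike in a C-style zero-terminated string.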

  32. Lesson 3. Human Factors – Generated Name Consumed by Human • Developer: “Your tool generated a test called Foo001. I don’t like it.” • Pex team: “What did you expect?” • Developer: “Foo_Should_Fail_When_Bar_Is_Negative.”

  33. Lesson 3. Human Factors – Generated Results Consumed by Human Object Creation messages suppressed (related to Covana by Xiao et al. [ICSE’11]) Exception Tree View Exploration Tree View Exploration Results View

  34. Lesson 4. Best vs. Worst Cases public bool TestLoop(int x, int[] y) { if (x == 90) { for (int i = 0; i < y.Length; i++) if (y[i] == 15) x++; if (x == 110) return true; } return false; } Fitnex by Xie et al. [DSN’09] http://research.microsoft.com/apps/pubs/default.aspx?id=81089 Key observations: with respect to the coverage target • not all paths are equally promising for branch-node flipping • not all branch nodes are equally promising to flip • Our solution: • Prefer to flip branch nodes on the most promising paths • Prefer to flip the most promising branch nodes on paths • Fitness function to measure “promising” extents • To avoid local optima or biases, the fitness-guided strategy is integrated with Pex’s fairness search strategies

  35. Lesson 5. Tool Users’ Stereotypical Mindset or Habits • “Simply one mouse click and then everything would work just perfectly” • Often need environment isolation w/ Moles/Fakes or factory methods, … • “One mouse click, a test generation tool would detect all or most kinds of faults in the code under test” • Developer: “Your tool only finds null references.” • Pex team: “Did you write any assertions?” • Developer: “Assertion???” • “I do not need test generation; I already practice unit testing (and/or TDD). Test generation does not fit into the TDD process”

  36. Lesson 6. Practitioners’ Voice Gathered feedback from target tool users • Directly, e.g., via • MSDN Pex forum, tech support, outreach to MS engineers and .NET user groups • Indirectly, e.g., via • interactions with MS Visual Studio team (a tool vendor to its huge user base) • Motivations of Moles • Refactoring to address testability issues faced resistance in practice • Observation at Agile 2008: high attention on mock objects and tool support

  37. Lesson 7. Collaboration w/ Academia • Win-win collaboration model • Win (Ind Lab): longer-term research innovation, man power, research impacts, … • Win (Univ): powerful infrastructure, relevant/important problems in practice, both research and industry impacts, … • Industry-located Collaborations • Faculty visits, e.g., Fitnex, Pex4Fun • Student internships, e.g., FloPSy, DyGen, state cov • Academia-located Collaborations http://research.microsoft.com/en-us/projects/pex/community.aspx#publications

  38. Lesson 7. Collaboration w/ Academia Academia-located Collaborations • Immediate indirect impacts, e.g., • Reggae [ASE’09s] → Rex • MSeqGen [FSE’09] → DyGen • Guided Cov [ICSM’10] → state coverage • Long-term indirect impacts, e.g., • DySy by Csallner et al. [ICSE’08] • Seeker [OOPSLA’11] • Covana [ICSE’11] http://research.microsoft.com/en-us/projects/pex/community.aspx#publications

  39. Summary • Pex practice impacts • Moles/Fakes, Code Digger, Pex4Fun/Code Hunt • Lessons in transferring tools • Started as (Evolved) Dream • Chicken and Egg • Human Factors • Best vs. Worst Cases • Tool Users’ Stereotypical Mindset or Habits • Practitioners’ Voice • Collaboration w/ Academia

  40. Thank you http://research.microsoft.com/pex https://sites.google.com/site/asergrp/

