Mining Decision Trees as Test Oracles for Java Bytecode

Mining Decision Trees as Test Oracles for Java Bytecode. Frank Xu, Ph.D. Gannon University.


Presentation Transcript


  1. Mining Decision Trees as Test Oracles for Java Bytecode • Frank Xu, Ph.D., Gannon University • Xu, W., Ding, T., Wang, H., Xu, D., Mining Test Oracles for Test Inputs Generated from Java Bytecode, Proc. of the 37th Annual International Computer Software & Applications Conference, pp. 27-32, Kyoto, Japan, July 2013 • Mining Decision Trees as Test Oracles for Java Bytecode (extended version of the conference paper), accepted by Journal of Systems and Software

  2. Bio – Frank Xu • Education • Ph.D. in Software Engineering, North Dakota State University • M.S. in Computer Science, Towson University • B.S. in Computer Science, Southeast Missouri State University • Working Experience • GE Transportation, 2008 - present, Consultant of Locomotive Remote Diagnostics Service Center • Gannon University, 2008 - present, Assistant Professor of Software Engineering, Director of Keystone Software Development Institute • University VA – Wise, 2007 - 2008, Assistant Professor of Software Engineering • Swanson Health Products, 2005 - 2007, Sr. Programmer Analyst • Volt Information Science Inc., 2004 - 2005, Software Engineer

  3. Teaching • Source: Student Evaluation Report

  4. Research • Source: Google scholar: http://scholar.google.com/citations?user=9_I4ZUgAAAAJ&hl=en

  5. Mining Decision Trees as Test Oracles • Introduction • Running Example • Test Input Generation • Model Miner • Empirical Study • Related Work • Conclusions

  6. Introduction

  7. Exercise • Implementing a method to solve Triangle problem

  8. What is Triangle Problem?

  9. How to test Triangle? • String getTriangleType(int a, int b, int c) { • if ((a < b + c) && (b < a + c) && (c < a + b)) { • if (a == b && b == c) • return "Equilateral"; • else if (a != b && a != c && b != c) • return "Scalene"; • else • return "Isosceles"; • } • else • return "NotATriangle"; • }

  10. Control Flow Diagram

  11. Summary: Test Triangle Steps • Step 1: Source Code → Control Flow Diagram • Step 2: Control Flow Diagram → Paths (based on coverage) • Step 3: Paths → JUnit test cases • assertEquals("Equilateral", triangle.getTriangleType(7,7,7)) • assertEquals("Isosceles", triangle.getTriangleType(6,6,8)) • …..

  12. Auto-Generating Test Cases Is Challenging • How to generate test inputs automatically? • E.g., (7,7,7), (6,6,8), … • How to find the expected result automatically for each input? • Known as the test oracle problem • E.g., Equilateral, Isosceles, … • assertEquals("Equilateral", triangle.getTriType(7,7,7)) ? • assertEquals("Isosceles", triangle.getTriType(6,6,8)) • …..

  13. Our Approach to Solve the Challenges • Rule-based search method to generate inputs • Start from a seed value and adjust it based on rules • E.g., seed (5,7,8), target Isosceles • Adjust input values so that a==b • (7,7,8) or (5,5,8) • Heuristic model as the test oracle (expected results) • A new data mining approach to building a heuristic behavioral model (in the form of a decision tree) • The heuristic behavioral model represents the estimated expected results
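The seed adjustment on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the class `SeedAdjuster` and the method `makeEqual` are hypothetical names, and the only rule modelled is "make two inputs equal", which is enough to reproduce the (5,7,8) → (7,7,8) / (5,5,8) example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: applies the rule "force seed[i] == seed[j]" to a seed input.
final class SeedAdjuster {
    /**
     * Returns the two candidate inputs obtained by forcing seed[i] == seed[j]:
     * the first copies the value at j into position i, the second copies i into j.
     * The seed array itself is left unchanged.
     */
    static List<int[]> makeEqual(int[] seed, int i, int j) {
        List<int[]> candidates = new ArrayList<>();

        int[] first = seed.clone();
        first[i] = seed[j];
        candidates.add(first);

        int[] second = seed.clone();
        second[j] = seed[i];
        candidates.add(second);

        return candidates;
    }
}
```

Calling `SeedAdjuster.makeEqual(new int[]{5, 7, 8}, 0, 1)` yields (7,7,8) and (5,5,8), both of which satisfy a==b.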

  14. Test Oracle Overview

  15. Revisit: Triangle Problem

  16. Java is Complex • Statement • contains both a comparison and an expression • a < b+c (Java) • Condition • (a<b+c) && (b<a+c) && (c<a+b)

  17. A Simpler Version of Java: Jimple • Simplify statement • a < b+c (Java) • [1] $i3 = i1 + i2 and [2] i0 >= $i3 (Jimple) • Simplify condition • (a<b+c) && (b<a+c) && (c<a+b) (Java) • if (a<b+c) { if (b<a+c) { if (c<a+b) { … }}} (Jimple) • www.sable.mcgill.ca/soot/
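To make the simplification concrete, here is a hand-written Java sketch that mimics the lowered form. It is not actual Soot output: the temporaries `$i4` and `$i5` and the class name `JimpleStyleSketch` are assumptions, and only `$i3 = i1 + i2` and `i0 >= $i3` come from the slide. Each expression gets its own temporary, and each branch tests a single simple comparison (in negated form, as in `[2]`).

```java
// Hypothetical illustration of a Jimple-style lowering, written in plain Java.
final class JimpleStyleSketch {
    /** Source-level form: one compound condition. */
    static boolean isTriangle(int a, int b, int c) {
        return (a < b + c) && (b < a + c) && (c < a + b);
    }

    /** Lowered form: one temporary per expression, one simple comparison per branch. */
    static boolean isTriangleLowered(int i0, int i1, int i2) {
        int $i3 = i1 + i2;            // [1] $i3 = i1 + i2
        if (i0 >= $i3) return false;  // [2] i0 >= $i3, the negation of a < b + c
        int $i4 = i0 + i2;            // assumed temporary for a + c
        if (i1 >= $i4) return false;  // negation of b < a + c
        int $i5 = i0 + i1;            // assumed temporary for a + b
        if (i2 >= $i5) return false;  // negation of c < a + b
        return true;
    }
}
```

Both methods return the same result for every input; the lowered form simply exposes each predicate as a separate branch, which is what path generation works on.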

  18. Path generation • Generate CFG → Generate inputs, e.g., (7,7,7) → Mine test oracle, e.g., Equilateral

  19. How to Generate Test Inputs

  20. Path: [1]→[2]→[3]→[4]→[5]→[18] • Search for an input that makes predicate [5]: i0>=$i3 true • a>=b+c (NotATriangle) • Challenge: backtracking $i3 to the input variables • Recall $i3=i1+i2 • Solution: Predicate Tree • Recall Property 1 • a>=b+c
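The backtracking step can be sketched as substitution over a small expression tree. This is an illustration under stated assumptions, not the paper's predicate tree: the types `Expr`, `Var`, `Add` and the method `smallestI0Satisfying` are hypothetical, only addition is supported, and the definitions of temporaries are assumed to be acyclic.

```java
import java.util.Map;

// Hypothetical sketch: rewrite a temporary such as $i3 in terms of the input variables.
final class PredicateTreeSketch {
    interface Expr {
        /** Replaces every temporary by its definition until only inputs remain. */
        Expr expand(Map<String, Expr> defs);
        int eval(Map<String, Integer> inputs);
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public Expr expand(Map<String, Expr> defs) {
            Expr def = defs.get(name);
            return def == null ? this : def.expand(defs); // inputs have no definition
        }
        public int eval(Map<String, Integer> inputs) { return inputs.get(name); }
        @Override public String toString() { return name; }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public Expr expand(Map<String, Expr> defs) {
            return new Add(left.expand(defs), right.expand(defs));
        }
        public int eval(Map<String, Integer> inputs) {
            return left.eval(inputs) + right.eval(inputs);
        }
        @Override public String toString() { return "(" + left + " + " + right + ")"; }
    }

    /** Smallest i0 that makes "i0 >= rhs" true when the other inputs stay fixed. */
    static int smallestI0Satisfying(Expr rhs, Map<String, Integer> inputs) {
        return rhs.eval(inputs);
    }
}
```

With the definition `$i3 = i1 + i2`, expanding `$i3` gives `(i1 + i2)`, so predicate [5] reads i0 >= i1 + i2. For the seed (5,7,8) the sketch proposes i0 = 15, and the adjusted input (15,7,8) satisfies a>=b+c.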

  21. Apply Rules to a Predicate Tree for Generating Test Inputs • For a given seed value, we adjust the value to guide the execution path based on rules

  22. Model Miner

  23. Jimple Predicates and Attributes of Triangle Program • For a given test input generated by the rule-based method, the predicates produce a set of T or F values

  24. Convert Test Inputs Using Attributes
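The conversion can be sketched as evaluating a fixed list of predicates on each input and recording T or F. The six predicates below are an assumption chosen from the triangle source code; the slides do not list the exact Jimple predicates used as attributes, and `AttributeVector` is a hypothetical name.

```java
// Hypothetical sketch: turn a test input (a, b, c) into a row of T/F attribute values.
final class AttributeVector {
    /** Names of the illustrative predicates, in the same order as the values. */
    static final String[] NAMES = {"a<b+c", "b<a+c", "c<a+b", "a==b", "b==c", "a==c"};

    /** Evaluates every predicate on the input. */
    static boolean[] toAttributes(int a, int b, int c) {
        return new boolean[] {a < b + c, b < a + c, c < a + b, a == b, b == c, a == c};
    }

    /** Renders a row as comma-separated T/F values, e.g. "T,T,T,T,F,F". */
    static String format(boolean[] attrs) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < attrs.length; i++) {
            if (i > 0) row.append(',');
            row.append(attrs[i] ? 'T' : 'F');
        }
        return row.toString();
    }
}
```

Each formatted row, paired with the output observed for that input, would form one training instance for the miner.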

  25. C4.5 mining algorithm • The key idea of the algorithm is to • find the attribute with the highest normalized information gain (gain ratio) and then build a decision node that splits on that attribute • Tool • Weka 3: http://www.cs.waikato.ac.nz/ml/weka/
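The split criterion can be sketched for a single T/F attribute as follows. This is a minimal illustration of normalized information gain (gain ratio), not Weka's code and not a full C4.5 implementation; `GainRatioSketch` and its methods are hypothetical names, and pruning, numeric attributes, and missing values are ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the gain-ratio criterion for one boolean attribute.
final class GainRatioSketch {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    /** Shannon entropy (in bits) of the class labels; 0 for an empty list. */
    static double entropy(List<String> labels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.size();
            h -= p * log2(p);
        }
        return h;
    }

    /** Information gain of splitting on attr, divided by the split information. */
    static double gainRatio(boolean[] attr, List<String> labels) {
        List<String> whenTrue = new ArrayList<>();
        List<String> whenFalse = new ArrayList<>();
        for (int i = 0; i < attr.length; i++) {
            (attr[i] ? whenTrue : whenFalse).add(labels.get(i));
        }
        int n = labels.size();
        double pTrue = (double) whenTrue.size() / n;
        double pFalse = (double) whenFalse.size() / n;

        double gain = entropy(labels) - pTrue * entropy(whenTrue) - pFalse * entropy(whenFalse);

        double splitInfo = 0.0;
        if (pTrue > 0) splitInfo -= pTrue * log2(pTrue);
        if (pFalse > 0) splitInfo -= pFalse * log2(pFalse);

        // An attribute with a single value cannot split the data.
        return splitInfo == 0.0 ? 0.0 : gain / splitInfo;
    }
}
```

A tree builder would compute this ratio for every attribute, split on the best one, and recurse on each branch. In Weka, the C4.5 learner is available as the J48 classifier.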

  26. Empirical Study

  27. Three Study Subjects

  28. Goal of Empirical Studies • Measure fault detection capability • (# mutants killed / # mutants) × 100%

  29. Measure fault detection capability: Process • Step 1: Implant mutants (insert a bug to create a faulty version) • Step 2: Build a decision tree model • Step 3: Find mismatches and their possible causes • Cause 1: a bug was found • assertEquals("Equilateral", new Triangle(7,7,7).getTriType()) • Cause 2: the model is not correct • assertEquals("Isosceles", new Triangle(7,7,7).getTriType()) • Step 4: Calculate fault detectability
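The process can be sketched end to end on the triangle example. This is an illustration only: `MutationScoreSketch`, `Classifier`, `isKilled`, and `mutationScore` are hypothetical names, and the `oracle` method is a stand-in that simply reuses the correct triangle logic, whereas in the approach the expected results come from the mined decision tree, which may itself be wrong (the second cause of a mismatch).

```java
import java.util.List;

// Hypothetical sketch: count mutants whose output mismatches the oracle on some input.
final class MutationScoreSketch {
    /** A program version under test: the original or a mutant. */
    interface Classifier {
        String classify(int a, int b, int c);
    }

    /** Stand-in for the mined model (assumption: exact for this example). */
    static String oracle(int a, int b, int c) {
        if (a < b + c && b < a + c && c < a + b) {
            if (a == b && b == c) return "Equilateral";
            if (a != b && a != c && b != c) return "Scalene";
            return "Isosceles";
        }
        return "NotATriangle";
    }

    /** A mutant is killed if at least one input produces a mismatch with the oracle. */
    static boolean isKilled(Classifier mutant, List<int[]> inputs) {
        for (int[] in : inputs) {
            String expected = oracle(in[0], in[1], in[2]);
            if (!expected.equals(mutant.classify(in[0], in[1], in[2]))) {
                return true;
            }
        }
        return false;
    }

    /** (# mutants killed / # mutants) * 100. */
    static double mutationScore(List<Classifier> mutants, List<int[]> inputs) {
        int killed = 0;
        for (Classifier mutant : mutants) {
            if (isKilled(mutant, inputs)) killed++;
        }
        return 100.0 * killed / mutants.size();
    }
}
```

For example, a mutant that reports "Isosceles" for equilateral triangles is killed by the input (7,7,7), while a mutant that behaves exactly like the original survives every input and lowers the score.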

  30. Related Work • Lo et al. (Lo, Cheng, Han, Khoo, & Sun, 2009) and Milea et al. (Milea, Khoo, Lo, & Pop, 2012) mine a set of discriminative features capturing repetitive series of events from program execution traces. These features are then used to train a classifier to detect failures. • Bowring et al. (Bowring, Rehg, & Harrold, 2004) model program executions as Markov models and use a clustering method for Markov models that aggregates multiple program executions into effective behavior classifiers. • Pacheco and Ernst (Pacheco & Ernst, 2005) build an operational model from observations of the software running properly. The operational model includes object invariants and properties. The object invariants are the conditions that hold on entry to and exit from all public methods. • Our approach generates and classifies inputs based on the internal structure of the UUT. • Briand (Briand, 2008) has proposed the use of machine learning techniques, including decision trees, for the test oracle problem. • The decision tree model he has proposed is manually built from software requirements.

  31. Conclusions • The first attempt to mine decision tree models • from auto-generated test inputs based on static analysis of Java bytecode • Our empirical study indicates that • using the mined test oracles, on average 94.67% of the mutants are killed by the generated test inputs.

  32. Thanks

  33. Future Research Direction • Requirements Engineering & Natural Language Processing • Generating UML diagrams, e.g., use case and class diagrams • Validating SRS • Deriving test cases from SRS • Software Design & Social Networks Analysis • Utilizing SSA for analyzing communication, class, and sequence diagrams to improve the quality of the software • Software Implementation & Big Data • Mining repositories for software quality assurance using Hadoop • Software Testing & Mobile/Cloud Application • Testing mobile applications and distributed applications

  34. Build Variable Dependency Tree (VDT)
