
Using Set Operations on Code Coverage Data to Discover Program Properties

by Nick Rutar

Presentation Transcript


  1. Using Set Operations on Code Coverage Data to Discover Program Properties by Nick Rutar

  2. Motivation • Many programs already have code coverage data • Various code coverage tools available • Widely explored area of research • Regression tests with coverage data becoming more common • Code coverage data contains a wealth of information about the program • Use of the data is usually limited to how the coverage tool reports it • Want to milk the data for all it is worth • Possibly useful for finding errors in the program

  3. Code Coverage • Three Main Types • Statement • Every line of code • Conditional • Every decision in program (if/else) • Path • Every path in the program • Program usually Instrumented • Dynamic or Static • Usually presented as a composite of separate tests

  4. Using Set Operations • Why use set operations? • Most developers familiar with sets • Data for statement coverage maps nicely onto sets • Possible to manipulate data easily and give glimpses of properties of the code • Most code coverage tools implicitly use sets anyway

  5. Set Operations • Union • Traditional coverage • Intersection • Lines run in all tests • Difference • Potential for locating errors • Probably the biggest stretch from how the data is currently used
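The three operations on slide 5 can be sketched in a few lines of Java. This is a minimal illustration, not the tool's actual code: the class name `CoverageSets`, its method names, and the choice to represent each test run as a set of executed line numbers are all assumptions made for the example.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: each test run is modeled as a set of executed line numbers. */
final class CoverageSets {
    /** Lines executed by at least one run (the traditional composite coverage). */
    static Set<Integer> union(Collection<Set<Integer>> runs) {
        Set<Integer> result = new TreeSet<>();
        for (Set<Integer> run : runs) {
            result.addAll(run);
        }
        return result;
    }

    /** Lines executed by every run; empty when there are no runs. */
    static Set<Integer> intersection(Collection<Set<Integer>> runs) {
        Set<Integer> result = null;
        for (Set<Integer> run : runs) {
            if (result == null) {
                result = new TreeSet<>(run);   // start from the first run
            } else {
                result.retainAll(run);         // keep only lines seen again
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    /** Lines executed in run a but not in run b. */
    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> testA = Set.of(1, 2, 3);
        Set<Integer> testB = Set.of(2, 3, 4);
        System.out.println("union        = " + union(List.of(testA, testB)));
        System.out.println("intersection = " + intersection(List.of(testA, testB)));
        System.out.println("A - B        = " + difference(testA, testB));
    }
}
```

Each method copies into a fresh `TreeSet`, so the input sets are never modified and results print in line order.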

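  6. Set Operations At Work • Inputs: no input, 1, 2 • Sets: Union, Intersection, Difference • [Figure: the program below annotated for each input and for the Union, Intersection, and Difference sets]

         #include <stdlib.h>  /* atoi */

         int main(int argc, char *argv[]) {
             int x, y, z;
             x = y = z = 0;
             if (argc == 2)            /* one command-line input given */
                 x = atoi(argv[1]);
             if (x == 1)
                 y = 3;
             else if (x == 2)
                 y = 4;
             if (y > 0)                /* true only for inputs 1 and 2 */
                 z = 5;
             else
                 z = -2;
             return z;
         }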
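The slide's example can be worked through by hand. The sketch below is not tool output: the coverage of each run was traced manually from the C program on slide 6, and statements are identified by their source text rather than by line number because the original line layout is not known. The class name `SlideExample` and its helpers are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hand-traced statement coverage of the slide's C program (labels are illustrative). */
final class SlideExample {
    // Run with no command-line argument: x stays 0, so y stays 0 and z = -2.
    static final Set<String> NO_INPUT = run(
        "x = y = z = 0", "if (argc == 2)", "if (x == 1)", "else if (x == 2)",
        "if (y > 0)", "z = -2", "return z");
    // Run with argument 1: y = 3, the else-if is never evaluated, z = 5.
    static final Set<String> INPUT_1 = run(
        "x = y = z = 0", "if (argc == 2)", "x = atoi(argv[1])", "if (x == 1)",
        "y = 3", "if (y > 0)", "z = 5", "return z");
    // Run with argument 2: y = 4, z = 5.
    static final Set<String> INPUT_2 = run(
        "x = y = z = 0", "if (argc == 2)", "x = atoi(argv[1])", "if (x == 1)",
        "else if (x == 2)", "y = 4", "if (y > 0)", "z = 5", "return z");

    static Set<String> run(String... statements) {
        return new LinkedHashSet<>(Arrays.asList(statements));
    }

    /** Statements executed by at least one of the three runs. */
    static Set<String> union() {
        Set<String> result = new LinkedHashSet<>(NO_INPUT);
        result.addAll(INPUT_1);
        result.addAll(INPUT_2);
        return result;
    }

    /** Statements executed by all three runs. */
    static Set<String> intersection() {
        Set<String> result = new LinkedHashSet<>(NO_INPUT);
        result.retainAll(INPUT_1);
        result.retainAll(INPUT_2);
        return result;
    }

    /** Statements executed by target but by neither of the other two runs. */
    static Set<String> onlyIn(Set<String> target, Set<String> other1, Set<String> other2) {
        Set<String> result = new LinkedHashSet<>(target);
        result.removeAll(other1);
        result.removeAll(other2);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Union:         " + union());
        System.out.println("Intersection:  " + intersection());
        System.out.println("Only no-input: " + onlyIn(NO_INPUT, INPUT_1, INPUT_2));
    }
}
```

Under this trace the union covers all eleven statements, the intersection is the five statements every run passes through, and the difference isolates `z = -2` as the only statement unique to the no-input run.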

  7. Off the Beaten Path Sets • Notation: Diff (-), Union (U), Intersection (I) • (U or I of Bad Sets) - (U of Good Sets) • Sometimes gives a better basis for finding bad code • Closest example of prior work only dealt with one bad run at a time • Any given test - itself • Gives you the empty set • U (I of Sets & (U/I Bad Sets - U Good Sets)) • Gives you a very rough slice of the program that went bad • Manipulate the data as you see fit for what you are looking for …
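The "bad sets minus good sets" combination from slide 7 might look like the following. It is a rough sketch under assumptions: runs are sets of executed line numbers, the caller already knows which runs failed, and the names `SuspectLines`, `Combine`, and `suspects` are invented for the example.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: (U or I of failing runs) minus (U of passing runs). */
final class SuspectLines {
    enum Combine { UNION, INTERSECTION }

    static Set<Integer> suspects(List<Set<Integer>> badRuns,
                                 List<Set<Integer>> goodRuns,
                                 Combine mode) {
        // Combine the failing runs with either union or intersection.
        Set<Integer> bad = new TreeSet<>();
        boolean first = true;
        for (Set<Integer> run : badRuns) {
            if (first || mode == Combine.UNION) {
                bad.addAll(run);
            } else {
                bad.retainAll(run);
            }
            first = false;
        }
        // Remove every line that some passing run also executed.
        for (Set<Integer> run : goodRuns) {
            bad.removeAll(run);
        }
        return bad;
    }

    public static void main(String[] args) {
        List<Set<Integer>> bad = List.of(Set.of(1, 2, 7, 8), Set.of(1, 2, 7, 9));
        List<Set<Integer>> good = List.of(Set.of(1, 2, 3));
        System.out.println(suspects(bad, good, Combine.UNION));        // lines seen in any failing run only
        System.out.println(suspects(bad, good, Combine.INTERSECTION)); // lines seen in every failing run only
    }
}
```

Intersection gives the tighter candidate list when all failures share one cause; union is the safer choice when the failing runs may be hitting different defects.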

  8. Other Code Coverage Info • Pareto principle • Better known as the 80-20 rule • Pareto noticed that 80% of the land in Italy was owned by 20% of the people • Shows up in all kinds of domains • Nick’s high school - 80% of girls dated 20% of the boys • Software 80-20 rule • 20% of the lines of code account for 80% of the runtime of the software • Code coverage often has frequency information • Use that information to find performance bottlenecks
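One way to use frequency information in the spirit of the 80-20 rule is to find the smallest group of lines that accounts for a given share of all line executions. The sketch below assumes per-line hit counts are available as a map and treats execution count as a stand-in for runtime, which is only an approximation. The names `HotLines` and `hottestLines` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: rank lines by hit count and keep the top share. */
final class HotLines {
    /**
     * Returns the most frequently executed lines, hottest first, stopping as
     * soon as they cover at least `percent` percent of all recorded hits.
     */
    static List<Integer> hottestLines(Map<Integer, Long> hitCounts, int percent) {
        long total = 0;
        for (long count : hitCounts.values()) {
            total += count;
        }
        List<Integer> result = new ArrayList<>();
        if (total == 0) {
            return result;  // nothing was executed
        }
        List<Map.Entry<Integer, Long>> entries = new ArrayList<>(hitCounts.entrySet());
        // Sort by hit count descending; break ties by line number ascending.
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : Integer.compare(a.getKey(), b.getKey());
        });
        long cumulative = 0;
        for (Map.Entry<Integer, Long> entry : entries) {
            result.add(entry.getKey());
            cumulative += entry.getValue();
            if (cumulative * 100 >= total * percent) {
                break;  // integer arithmetic avoids floating-point rounding
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, Long> counts = Map.of(10, 700L, 11, 200L, 12, 60L, 13, 40L);
        System.out.println(hottestLines(counts, 80));
    }
}
```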

  9. Implementation • Create a tool that can use the set information • Implementation details • Created in Java • Based on the output format of the LCOV coverage tool • Takes pre-generated coverage information as input • Supports Union, Difference, and Intersection • Supports frequency information
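The slides do not say which LCOV output the tool reads. As an illustration only, the sketch below assumes LCOV's tracefile (`.info`) format, in which `SF:` names a source file, each `DA:<line>,<count>` record gives a line's execution count, and `end_of_record` closes the file's section. The class `LcovReader`, its methods, and the sample path are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: read per-line hit counts from LCOV tracefile records. */
final class LcovReader {
    /** Maps each source file to its (line number -> execution count) table. */
    static Map<String, Map<Integer, Long>> parse(List<String> lines) {
        Map<String, Map<Integer, Long>> byFile = new TreeMap<>();
        String currentFile = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("SF:")) {
                currentFile = line.substring(3);
                byFile.putIfAbsent(currentFile, new TreeMap<>());
            } else if (line.startsWith("DA:") && currentFile != null) {
                // DA:<line>,<count>; any further fields are ignored here.
                String[] parts = line.substring(3).split(",");
                int lineNumber = Integer.parseInt(parts[0]);
                long count = Long.parseLong(parts[1]);
                byFile.get(currentFile).merge(lineNumber, count, Long::sum);
            } else if (line.equals("end_of_record")) {
                currentFile = null;
            }
            // Other record types (functions, branches, summaries) are skipped.
        }
        return byFile;
    }

    /** The coverage set for one file: lines with a non-zero execution count. */
    static Set<Integer> coveredLines(Map<Integer, Long> counts) {
        Set<Integer> covered = new TreeSet<>();
        for (Map.Entry<Integer, Long> entry : counts.entrySet()) {
            if (entry.getValue() > 0) {
                covered.add(entry.getKey());
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "TN:", "SF:/src/main.c", "DA:3,1", "DA:4,0", "DA:5,12", "end_of_record");
        Map<String, Map<Integer, Long>> parsed = parse(sample);
        System.out.println(coveredLines(parsed.get("/src/main.c")));
    }
}
```

Keeping the counts rather than only the covered lines lets the same parsed data feed both the set operations and the frequency analysis.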

  10. Demo

  11. Evaluation • Test Large Program against its regression test • Use Dyninst for evaluation • C++ program that does binary instrumentation • 100+ Source Files • ~30,000 LOC instrumented to create coverage data • Nightly build already has coverage capability with regression tests • Verify Union matches coverage data given by tool • Use Difference to try to find errors • Series of tests with various inputs • See which inputs cause failure and locate lines to discover error

  12. Future Work • For the Tool • Create a template for insertion into the program • This program doesn’t care what language you are using • Just needs the input format to generate the initial sets • Specify the format in a text file; the program uses it to input data • Better visualization to specify points of interest • Highlight source code that still has active lines • Usability • Right now more of a proof of concept than a battle-hardened tool • In General • More evaluation of using Diff for finding errors in the program • Evaluation of software bottlenecks • IDE integration

  13. Questions???
