
Quality Assessment based on Attribute Series of Software Evolution

This presentation reviews a paper on defect prediction that mines CVS data for characteristics such as lines added for bug fixes, co-changed files, and modifications without a commit message. The paper then derives "value series" from these evolution attributes, which are relative measures. Predictions are validated with the correlation coefficient, mean absolute error, and mean squared error. The presenter liked the series measures and the high correlation found for authors and commit messages, but found the reading bland and wanted more support for how the series attributes were chosen.

fregoso

Presentation Transcript


  1. Quality Assessment based on Attribute Series of Software Evolution • Paper Presentation for CISC 864 • Lionel Marks

  2. What is this paper about? • Defect Prediction • Uses CVS data to analyze characteristics such as: • Number of lines added for bug fixes • Number of co-changed files • Number of modifications without a commit message • Then they took the analysis further

  3. The “further” analysis • Took “value series” of evolution attributes • These are relative measures • For example, for the number of lines of code deleted for bug fixes • The “value series” version would be: Number of lines deleted for bug fixes/Number of lines deleted (any type)

  4. Examples of Evolution Attributes • Lines Added/Deleted • Number of Changes • Number of Authors • Co-Changed Files • Co-Changed New Files: number of files that were created together with a change to the investigated file

  5. Examples of Corresponding Value Series • Lines Added: lines added within a day / total lines of code until this day • Number of Changes: number of changes within a day / total number of changes in the history file until this day • Number of Authors: number of authors within a day / number of changes within this day (Interesting!) • Co-Changed New Files: number of newly introduced files that are co-changed files / number of co-changed files
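
As a rough illustration of the Number of Changes series on this slide, the sketch below divides each day's change count by the running total of changes up to and including that day. The function name `value_series`, the sample counts, the inclusive reading of "until this day", and the zero-total convention are all assumptions made for illustration; the paper's exact definition may differ.

```python
from itertools import accumulate


def value_series(daily_counts):
    """Divide each day's count by the cumulative count up to and including that day.

    Assumption: a day with a cumulative total of zero yields 0.0.
    """
    totals = accumulate(daily_counts)
    return [day / total if total else 0.0
            for day, total in zip(daily_counts, totals)]


# Hypothetical number of changes to one file on four consecutive days.
changes_per_day = [2, 0, 2, 4]

# Cumulative totals are 2, 2, 4, 8.
print(value_series(changes_per_day))  # [1.0, 0.0, 0.5, 0.5]
```

The same helper would apply to Lines Added if the cumulative total stands in for "total lines of code until this day". The Number of Authors series uses a same-day denominator instead, so it would be a plain element-wise ratio.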

  6. Validation • Distance equation: the sum of squares of the actual value minus the estimated value • y is the actual value • w is the weight • a is the attribute • Summed over k instances
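
The slide's equation did not survive in this transcript. A minimal sketch of the quantity it describes, assuming the estimate for each instance is a weighted sum of its attribute values (as in linear regression), might look like this. The function name and the sample numbers are hypothetical.

```python
def squared_distance(actual, attributes, weights):
    """Sum over all instances of (actual value - estimated value) squared.

    Assumption: the estimate is the weighted sum of the instance's attributes.
    """
    total = 0.0
    for y, a in zip(actual, attributes):
        estimate = sum(w * x for w, x in zip(weights, a))
        total += (y - estimate) ** 2
    return total


# Two hypothetical instances with two attributes each.
actual = [3.0, 5.0]
attributes = [[1.0, 1.0], [1.0, 2.0]]

print(squared_distance(actual, attributes, [1.0, 2.0]))  # 0.0 (perfect fit)
print(squared_distance(actual, attributes, [1.0, 1.0]))  # 5.0
```

Under this reading, fitting the model means choosing the weights that make this distance as small as possible.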

  7. Correlation Coefficient • p bar is the average of the predicted values • a bar is the average of the actual values • Value of 1 = perfect correlation • Value of 0 = no correlation • Value of -1 = inverse correlation
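
The formula itself is not preserved in the transcript. The sketch below assumes the standard sample (Pearson) correlation coefficient, which fits the slide's description of p bar, a bar, and the range from -1 to 1. The function name and the sample data are hypothetical.

```python
import math


def correlation(predicted, actual):
    """Pearson correlation between predicted and actual values.

    Raises ZeroDivisionError if either series is constant (zero variance).
    """
    n = len(predicted)
    p_bar = sum(predicted) / n  # average of the predicted values
    a_bar = sum(actual) / n     # average of the actual values
    cov = sum((p - p_bar) * (a - a_bar) for p, a in zip(predicted, actual))
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    var_a = sum((a - a_bar) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


print(correlation([1, 2, 3], [2, 4, 6]))  # 1.0  (perfect correlation)
print(correlation([1, 2, 3], [6, 4, 2]))  # -1.0 (inverse correlation)
```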

  8. Mean Absolute Error • Value of 0 means the predictions match the actual values exactly • A value greater than zero is the absolute error averaged over the number of points in the set

  9. Mean Squared Error • Instead of using absolute value bars, squares are used to emphasize error more when there are large deviations • Still averaged out over the number of points in the data set
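
To make the contrast between the two error measures concrete, here is a small sketch using their usual textbook definitions. The function names and the sample predictions are made up for illustration. Both prediction sets have the same mean absolute error, but the one with a single large deviation has a much larger mean squared error.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all points."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mean_squared_error(predicted, actual):
    """Average of (predicted - actual)^2 over all points."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


actual = [10, 10, 10, 10]
evenly_off = [12, 12, 12, 12]  # every prediction is off by 2
one_outlier = [10, 10, 10, 18]  # three exact predictions, one off by 8

print(mean_absolute_error(evenly_off, actual))   # 2.0
print(mean_absolute_error(one_outlier, actual))  # 2.0
print(mean_squared_error(evenly_off, actual))    # 4.0
print(mean_squared_error(one_outlier, actual))   # 16.0
```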

  10. Results • Comm. had a lot of errors in their system • Authors best indicator overall • TLinesAdd a solid metric as well: # of lines added in all co-changed files / # of couplings

  11. Likes and Dislikes of This Paper • Likes: • The different series measures were interesting • Very impressed with the high correlation found in Authors and Commit Messages • Nicely related to course work • Dislikes: • Found the reading bland • Would have preferred more support for the choice of series attributes, especially more discussion of how the denominators were chosen • Did not find many unique points in the paper
