Aiding Comprehension of Cloning Through Categorization

Cory Kapser and Michael W. Godfrey, Software Architecture Group, School of Computer Science, University of Waterloo

Presentation Transcript


  1. Aiding Comprehension of Cloning Through Categorization Cory Kapser and Michael W. Godfrey Software Architecture Group School of Computer Science, University of Waterloo

  2. Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary

  3. Motivation • Code duplication (“cloning”) is common in large, long-lived industrial software systems. • Negatively affects successful system evolution! • Thus, clone management or removal is desirable.

  4. Problems with clone detection technologies • Comprehension • Result sets often provide little information beyond “it’s a clone” • Scalability • VERY large result sets typical • Accuracy • Esp. false positives

  5. Proposed solution • Classification of clones • Improve comprehension through informative grouping and statistical analysis • Improve scalability through easier navigation • Improve accuracy through region-specific filtering

  6. Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary

  7. Code cloning • A serious problem in industrial software. • Typically, 15% of a system is duplicated code. • As high as 50% in some cases [Ducasse]

  8. Reasons for code cloning • Perceived cost • Time constraints • Insufficient understanding of the underlying problem • Architectural clarity

  9. Problems with clones • Maintenance • Size • Comprehension • Bugs (copied and new) • Indication of poor design

  10. Managing clones • Removal • Documentation

  11. Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary

  12. Our approach • Perform clone detection • Extract/define “regions” from source code • Map clone pairs to regions • Classify clones • Filter clones • Display results
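
The "map clone pairs to regions" step on slide 12 can be illustrated with a minimal sketch. The slides do not describe how the mapping is implemented, so everything here is an assumption: the `Region` and `Segment` types, the region kind labels, and the rule of picking the smallest enclosing region are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    """A syntactic region extracted from source code (hypothetical model)."""
    file: str
    kind: str   # assumed labels, e.g. "function", "loop", "conditional"
    start: int  # first line, inclusive
    end: int    # last line, inclusive

@dataclass(frozen=True)
class Segment:
    """One side of a clone pair as reported by a clone detector."""
    file: str
    start: int
    end: int

def enclosing_region(seg, regions):
    """Return the smallest region that fully contains seg, or None."""
    candidates = [
        r for r in regions
        if r.file == seg.file and r.start <= seg.start and seg.end <= r.end
    ]
    return min(candidates, key=lambda r: r.end - r.start, default=None)

def map_pair(pair, regions):
    """Map both segments of a clone pair to their enclosing regions."""
    a, b = pair
    return enclosing_region(a, regions), enclosing_region(b, regions)
```

Choosing the smallest enclosing region means a clone inside a loop is attributed to the loop rather than to the surrounding function; a different tie-breaking rule would be equally plausible.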

  13. The taxonomy • Classifies clones according to attributes such as location and region type of a clone • Hierarchical
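
A hierarchical classification by location and region type might look like the sketch below. Only the category names "Same Directory Clones" and "Function Clones" appear in the slides; the other labels ("Same File Clones", "Different Directory Clones", "Unclassified") and the rule that both sides must share a region kind are assumptions made for illustration.

```python
import posixpath

def location_category(file_a, file_b):
    """Top level of the hierarchy: where the two clone segments live."""
    if file_a == file_b:
        return "Same File Clones"            # assumed label
    if posixpath.dirname(file_a) == posixpath.dirname(file_b):
        return "Same Directory Clones"       # label used in the slides
    return "Different Directory Clones"      # assumed label

def classify(file_a, kind_a, file_b, kind_b):
    """Return a (location, region type) path through the taxonomy.

    kind_a and kind_b are region kinds such as "function" or "loop",
    or None when a segment could not be mapped to a region.
    """
    location = location_category(file_a, file_b)
    if kind_a is not None and kind_a == kind_b:
        region = f"{kind_a.capitalize()} Clones"
    else:
        region = "Unclassified"              # assumed fallback
    return (location, region)
```

For example, `classify("fs/ext2/inode.c", "function", "fs/ext2/super.c", "function")` yields `("Same Directory Clones", "Function Clones")`.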

  14. ADD A SLIDE HERE • To discuss what you hoped your taxonomy would help you with • Why did you pick that design? • Give an example of how using this taxonomy could be helpful in a (simple, made-up) example case

  15. Overview • Motivation • Background • Methods • Case Studies • Results • Discussion • Summary

  16. Case studies • PostgreSQL • 543,387 LOC • 1097 source files • Linux kernel file-system subsystem • 280,177 LOC • 537 source files

  17. Filtering and classification results • 85–87% of clones could be classified using the taxonomy • Fewer unclassified clones in the Same Directory Clones category • A large percentage of false positives was removed by filtering structural and prototype regions.
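
Region-specific filtering of the kind mentioned on slide 17 could be sketched as follows. The region labels `"structure"` and `"prototype"` and the rule of dropping a pair when either side falls in a filtered region are assumptions; the slides do not state the actual criterion.

```python
# Region kinds treated as likely false positives (assumed labels).
FILTERED_KINDS = frozenset({"structure", "prototype"})

def keep_pair(kind_a, kind_b, filtered=FILTERED_KINDS):
    """Keep a clone pair only if neither side lies in a filtered region."""
    return kind_a not in filtered and kind_b not in filtered

def filter_pairs(pairs, filtered=FILTERED_KINDS):
    """pairs is a list of (kind_a, kind_b) region-kind tuples."""
    return [(a, b) for a, b in pairs if keep_pair(a, b, filtered)]
```

Because the filter operates on region kinds rather than on raw text, it can be tuned per region type without rerunning clone detection.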

  18. Overall cloning in the systems • Function Clones dominate the Same Directory Clones. • Most cloning occurs within the same directory.

  19. Frequency of clone types • Very few loop clones • Relatively many conditional clones • 38% of the clone pairs in the Linux fs and 53% of the clone pairs in PostgreSQL were function clones
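
Percentages like those on slide 19 are simple frequency counts over classified clone pairs. A minimal sketch, assuming each pair has already been given a single type label (the function name and label strings are hypothetical):

```python
from collections import Counter

def type_shares(labels):
    """Return the percentage of clone pairs per type label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

With the made-up input `["function", "function", "conditional", "loop"]`, function clones account for 50% of pairs; this toy data is illustrative only and is not taken from the case studies.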

  20. Is it possible to insert a table here with the results, even if it is partial (to show that the work is there and that there are numbers)? • Or maybe a graph? Nice to have this to imply: here’s all the hard work we did, boy did we sweat, and there are so many results that the observations are probably meaningful

  21. Overview • Motivation • Background • Methods • Case Studies • Discussion • Summary

  22. Cloning comprehension • Classification of clones can improve comprehension • Users will have a working understanding of what a clone of a certain type means • We believe navigation of the “clone space” will be greatly improved • We now know more about cloning as it occurs in a software system • Simple metrics are now available

  23. Tool support • Clone Interpretation and Classification System (CICS) • Provides GUI to navigate classified clones • Will provide benchmarking support for clone detection tools • Many features can be added to complement the sorting of clones in the taxonomy

  24. CICS

  25. Overview • Motivation • Background • Methods • Case Studies • Discussion • Summary

  26. Summary • Management of clones is important for the healthy evolution of a software system • We can make the process of managing clones more comprehensible, scalable, and accurate

  27. Future work • Deeper classification • Benchmark suite • IDE plugins • Evolution of clones
