
Investigating JAVA Classes with Formal Concept Analysis

17-791 Software Research Seminar (SSSG). Investigating JAVA Classes with Formal Concept Analysis. Uri Dekel (udekel@cs.cmu.edu). Based on M.Sc. work at the Israel Institute of Technology. To appear: 10th Working Conference on Reverse Engineering (WCRE’03), and as a poster in OOPSLA’03.

Presentation Transcript


  1. 17-791 Software Research Seminar (SSSG) • Investigating JAVA Classes with Formal Concept Analysis • Uri Dekel (udekel@cs.cmu.edu) • Based on M.Sc. work at the Israel Institute of Technology. To appear: 10th Working Conference on Reverse Engineering (WCRE’03), and as a poster in OOPSLA’03

  2. Outline • Research goals and hypotheses • A crash-course in formal concept analysis • Interface visualization • Reasoning about class implementation • Applications to code inspection • Additional research [Investigating Classes with FCA, Uri Dekel, 17-791 Software Research Seminar]

  3. Goals • Research question: "Can we exploit the data-member based cohesion between function-methods in a class to reason about the class and discover errors?" • Specifically: • Provide a faster learning curve for new class users by improving interface presentation • Assist reverse engineering by visualizing structure • Assist code inspection by suggesting a reading order • Important principle: keep it simple to use and learn.

  4. Hypothesis #1 • Data-member use is fundamental to understanding a class. • All possible implementations of an operation will use the same fields • Representation changes are rare • Basis for cohesion-based metrics (e.g., LCOM) • Analogous to global-variable-based modularization of procedural code.

  5. Hypothesis #2 • Methods that use the same combination of fields are likely to be related. • e.g., get/set, add/remove, etc. • Even more so due to the "shopping list approach" • Promotes complete interfaces using composite methods

  6. Means • Formal Concept Analysis • Mathematical classification technique • Uses a binary relation (context) between objects and attributes • Not to be confused with the OO terms • Produces a concept lattice (next slide) • Much literature on applications in various fields [Figure: example context of the Pnt3D class]

  7. Formal Concept Analysis • Input: A context <O,A,R> • O is a set of objects • A is a set of attributes • R is a binary relation between O and A • Mapping: Galois Connection • Common attributes of a set of objects $O' \subseteq O$: $\sigma(O') = \{a \in A \mid \forall o \in O' : (o,a) \in R\}$ • Common objects of a set of attributes $A' \subseteq A$: $\tau(A') = \{o \in O \mid \forall a \in A' : (o,a) \in R\}$ • Output: Concepts <O’,A’> s.t. $\sigma(O') = A'$ and $\tau(A') = O'$
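
The two mappings and the concept condition on this slide can be tried out on a toy context. The sketch below is only an illustration: the method and field names are hypothetical (the Pnt3D context shown on the slides did not survive extraction), and the brute-force enumeration is exponential, so it is suitable for tiny contexts only, not a statement of how the author's tool computes lattices.

```python
from itertools import combinations

# Hypothetical context: which fields each method of an imagined 3D-point class uses.
# These method and field names are illustrative, not taken from the slides.
CONTEXT = {
    "getX": {"x"},
    "setX": {"x"},
    "getY": {"y"},
    "norm": {"x", "y", "z"},
    "reset": {"x", "y", "z"},
}
ALL_ATTRIBUTES = frozenset().union(*CONTEXT.values())


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(ALL_ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return frozenset(result)


def common_objects(attributes):
    """Objects that have every attribute in `attributes`."""
    return frozenset(o for o, attrs in CONTEXT.items() if set(attributes) <= attrs)


def all_concepts():
    """Enumerate concepts by closing every subset of objects (brute force)."""
    concepts = set()
    objects = list(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)
            concepts.add((extent, intent))
    return concepts


for extent, intent in sorted(all_concepts(), key=lambda c: len(c[1])):
    print(sorted(extent), sorted(intent))
```

Each printed pair is closed in both directions: the attributes are exactly those common to the listed objects, and the objects are exactly those having all listed attributes.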

  8. Formal Concept Analysis • [Figure: example concepts of the Pnt3D class] • A concept lattice is based upon a partial order between concepts: $\langle O_1, A_1 \rangle \le \langle O_2, A_2 \rangle \iff O_1 \subseteq O_2 \iff A_1 \supseteq A_2$

  9. Concept Lattices • A sparse concept lattice provides an alternate view of the tabular context and the full concept lattice • Each concept is a group of objects which have the same attributes • The attributes are the union of the attributes in that concept and in all the concepts that it dominates • In our case, methods that use the same fields are clustered together • Reveals structure and asymmetries
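
The clustering described on this slide (methods with exactly the same field set share a node of the sparse lattice) can be sketched in a few lines. The `USES` table and all method and field names below are hypothetical; the grouping is an illustration of the idea, not the author's implementation.

```python
from collections import defaultdict

# Hypothetical method -> fields table for an imagined class.
USES = {
    "getX": {"x"},
    "setX": {"x"},
    "getY": {"y"},
    "norm": {"x", "y", "z"},
    "reset": {"x", "y", "z"},
}


def cluster_by_fields(uses):
    """Group methods whose field sets are identical (sparse lattice labels)."""
    clusters = defaultdict(list)
    for method, fields in uses.items():
        clusters[frozenset(fields)].append(method)
    return {fields: sorted(methods) for fields, methods in clusters.items()}


for fields, methods in cluster_by_fields(USES).items():
    print(sorted(fields), "->", methods)
```

A getter and setter for the same field land in one cluster, which is the kind of pairing Hypothesis #2 expects.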

  10. Interface Visualization • The lattice partitions the methods in the interface into equivalence classes • Similar methods are heuristically clustered together. • An automatic "feature categorization" • Lattice provides multidimensional connections • Compare with simple lexical lists of methods (Note: class is "flattened" to remove inheritance details)

  11. Interface Visualization • To be effective, multiple methods should appear in each concept, on average • A lattice can have up to $n = 2^{\min(|M|,|F|)}$ concepts (M: methods, F: fields) • In a data set of circa 6000 classes: • In 99.5%, $n < |M| + |F|$ • In 77.4%, $n < |M|$ [Figure: concepts vs. methods in Eclipse]
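
To see why the exponential bound is a worst case rather than the norm, the sketch below counts concepts for a deliberately adversarial context in which method i uses every field except field i. The construction and all names are assumptions made for illustration; the counting routine closes the methods' field sets under intersection, which is one standard way to enumerate concept intents.

```python
def count_concepts(uses, all_fields):
    """Count concepts by closing the methods' field sets under intersection."""
    intents = {frozenset(all_fields)}  # intent of the empty set of methods
    for fields in uses.values():
        # Intersect the new field set with every intent found so far.
        intents |= {frozenset(fields) & intent for intent in intents}
    return len(intents)


def worst_case(n):
    """Hypothetical class with n methods and n fields; method i skips field i."""
    fields = [f"f{i}" for i in range(n)]
    uses = {f"m{i}": {f for f in fields if f != f"f{i}"} for i in range(n)}
    return uses, fields


uses, fields = worst_case(4)
print(count_concepts(uses, fields))  # 16, i.e. 2**4

# A plain getter/setter style class stays far below the bound.
simple = {"getX": {"x"}, "setX": {"x"}, "getY": {"y"}, "setY": {"y"}}
print(count_concepts(simple, {"x", "y"}))  # 4
```

Real classes rarely resemble the adversarial case, which is consistent with the slide's observation that the concept count usually stays below the number of methods plus fields.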

  12. Case Study • The Molecule class from CDK • CDK: Chemistry Development Kit • Open source library of chemistry-related classes • Developed at the Max Planck Institute in Germany • Used in chemistry visualization applications • Why the Molecule class? • Has a large interface (nearly 75 public members) • The represented entity is familiar to most people • Our technique revealed new errors in this class.

  13. Case Study • Lattice structure hints at class structure • A lot of independent operations on the left. • Similar to a C struct. • Cohesive component on the right.

  14. Interface Visualization • Multiple methods with similar signatures indicate possible repetition. • Inconsistencies in naming. • Inconsistencies in return types. • Because related methods are grouped in concepts, we can notice inconsistencies or repetitions

  15. Investigate Implementation • We examine fields and dependencies between concepts to understand the cohesive component • Collections of atoms and bonds • Micro-management of arrays (count field tracks available items) • Inconsistencies and broken invariants.

  16. Investigate Implementation • Asymmetries are revealed by examining pairs of related concepts.

  17. Embedded Call Graph • A concept lattice clusters methods but does not portray interactions • Call graphs show interactions between methods, but their layout does not depend on semantics • The embedded call graph combines the two

  18. Code Inspection • The lattice can help us select a reading order • Minimize focus shifts. • Similar methods are read consecutively. • We define a global order between concepts. • e.g., each component separately, topological ordering, read by order of layers. • We define a local order between methods in each concept. • e.g., topological ordering, read by order of simplicity, etc.
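
One way to combine a global and a local order is sketched below. It is an assumption-laden illustration, not the ordering used in the author's tool: concepts are read layer by layer (fewer fields first, which is a valid topological order of the subset relation between field sets), and methods inside a concept are read by a hypothetical `complexity` score, simplest first. All names and numbers are made up.

```python
def reading_order(concepts, complexity):
    """Return methods in a suggested reading order.

    concepts:   dict mapping a frozenset of fields to the methods labeled there
    complexity: dict mapping a method to a hypothetical simplicity score
    """
    ordered = []
    # Global order: by layer (number of fields); ties broken by field names.
    for fields in sorted(concepts, key=lambda f: (len(f), sorted(f))):
        # Local order: simplest method first; ties broken by name.
        ordered.extend(sorted(concepts[fields], key=lambda m: (complexity[m], m)))
    return ordered


# Hypothetical sparse lattice labels and complexity scores.
CONCEPTS = {
    frozenset({"x", "y", "z"}): ["norm", "reset"],
    frozenset({"x"}): ["setX", "getX"],
    frozenset({"y"}): ["getY"],
}
COMPLEXITY = {"getX": 1, "setX": 2, "getY": 1, "norm": 5, "reset": 3}

print(reading_order(CONCEPTS, COMPLEXITY))
# ['getX', 'setX', 'getY', 'reset', 'norm']
```

Methods that share fields end up adjacent, so the reader changes focus only when moving from one concept to the next.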

  19. Tooling Support • Batch-mode prototype • Produces lattices and metrics • Database support for metrics and statistics research • Interactive Eclipse plug-in prototype • Adds an additional view for .java files • Uses a simplistic external static analyzer. • Limited by the current 2D capabilities of Eclipse.

  20. Research Directions • Conduct user studies to validate methodology • Preliminary user studies provided good feedback • Lattice-based metrics suite • Application to class design in CASE tools • Interactive class diagram editor based on concept lattice • Semantics assigned by connecting methods to fields. Compare with simply adding methods to a list as in current tools.

  21. Research Directions • Class-wide "diffing" • Provide a bird's-eye view of changed areas. [Figure: differences between the original version of the "Graph" class of VGJ (Visualizing Graphs with Java) and the Technion adaptation of that class. Original elements appear in bold font; modifications appear in plain font.]

  22. Backup Material

  23. Graph Class
