The Conceptual Coupling Metrics for Object-Oriented Systems, Denys Poshyvanyk and Andrian Marcus, SEVERE group @ 22nd IEEE International Conference on Software Maintenance, Philadelphia, Pennsylvania, September 27, 2006
Motivation Concepts and classes Implementation and representation of concepts Semantic information
Example Methods from MySecMan class in Mozilla
Approach Latent Semantic Indexing Advantages: captures essential semantic info via dimensionality reduction; overcomes problems with polysemy and synonymy; easy to apply to source code
Related Work Coupling measures Previously solved problems: Traceability link recovery; Managing software artifacts; Conceptual cohesion; Software clustering; Concept/feature location; Requirements traceability; Isolating concerns in requirements
Extracting Semantic Info Source code -> Corpus (doc = method) Preprocessing: split_identifiers & SplitIdentifiers Vector space = term-by-document matrix Singular Value Decomposition -> LSI subspace
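The preprocessing and vector-space steps above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function names `split_identifier` and `term_by_document_matrix` and the sample method names are hypothetical, raw term counts stand in for whatever weighting the real pipeline uses, and the SVD step is left out (a numerical library routine such as `numpy.linalg.svd` would be applied to the resulting matrix).

```python
import re
from collections import Counter


def split_identifier(identifier):
    """Split an identifier on underscores and camelCase boundaries; lowercase the parts."""
    words = []
    for part in identifier.split("_"):
        # Acronym runs (XML), capitalized or lowercase words (File, parse), digit runs (8).
        words += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [w.lower() for w in words]


def term_by_document_matrix(documents):
    """Build a term-by-document count matrix.

    documents maps a method name (one document per method) to the list of
    identifiers found in that method. Returns (terms, names, matrix), where
    matrix[i][j] is how often terms[i] occurs in the document names[j].
    """
    counts = {
        name: Counter(t for ident in idents for t in split_identifier(ident))
        for name, idents in documents.items()
    }
    terms = sorted({t for c in counts.values() for t in c})
    names = sorted(counts)
    matrix = [[counts[n][t] for n in names] for t in terms]
    return terms, names, matrix


# Hypothetical corpus with two methods.
terms, names, matrix = term_by_document_matrix(
    {"m1": ["getUser", "user_name"], "m2": ["setName"]}
)
print(terms)   # ['get', 'name', 'set', 'user']
print(matrix)  # [[1, 0], [1, 1], [0, 1], [2, 0]]
```

Both naming styles on the slide reduce to the same terms: `split_identifier("split_identifiers")` and `split_identifier("SplitIdentifiers")` each give `['split', 'identifiers']`.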
Computing Conceptual Similarity Cosine between vectors
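A minimal sketch of the cosine measure, assuming each method is already represented as a plain list of floats in the LSI subspace. The function name and the zero-vector convention (return 0.0) are choices made for this illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumed convention: an empty document is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal vectors)
```

Vectors pointing in the same direction score close to 1 regardless of their length, which is why the measure suits documents (methods) of very different sizes.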
Conceptual Coupling between Classes Method - Class conceptual similarity Class - Class conceptual similarity [Figure: Class A and Class B, each with method1, method2 and method3; method-method links labeled 0.5, 0.6, 0.2, 0.7, 0.3, 0.4, 0.3, 0.2, 0.4; method-class similarities 0.5, 0.4, 0.3] Conceptual coupling between A and B = 0.4
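The two aggregation levels can be sketched as below, assuming both are arithmetic means, which is consistent with the numbers on the slide. The grouping of the nine link values into rows is an assumption: the diagram layout did not survive, so the rows were arranged to reproduce the method-class values 0.5, 0.4 and 0.3. Function names are hypothetical.

```python
def method_class_similarity(link_values):
    """Average similarity between one method and every method of the other class."""
    return sum(link_values) / len(link_values)


def class_class_similarity(sim_matrix):
    """Average of the method-class similarities over all methods of the first class."""
    per_method = [method_class_similarity(row) for row in sim_matrix]
    return sum(per_method) / len(per_method)


# Assumed arrangement of the slide's nine link values:
# one row per method of one class, one column per method of the other.
links = [
    [0.7, 0.5, 0.3],  # method1
    [0.6, 0.4, 0.2],  # method2
    [0.4, 0.3, 0.2],  # method3
]

print([round(method_class_similarity(row), 2) for row in links])  # [0.5, 0.4, 0.3]
print(round(class_class_similarity(links), 2))                    # 0.4
```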
Maximal Conceptual Coupling Conceptual coupling based on the strongest conceptual coupling link [Figure: Class A and Class B, each with method1, method2 and method3; method-method links labeled 0.5, 0.6, 0.2, 0.7, 0.3, 0.4, 0.3, 0.2, 0.4; strongest link per method 0.7, 0.6, 0.4] Conceptual coupling between A and B = 0.56
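A sketch of the maximal variant, assuming each method keeps only its strongest link and the class-level value is the mean of those maxima. As in any reconstruction of the diagram, the row grouping of the link values is an assumption, chosen so that the per-method maxima match the slide's 0.7, 0.6 and 0.4. Function names are hypothetical.

```python
def max_method_class_similarity(link_values):
    """Strongest similarity between one method and any method of the other class."""
    return max(link_values)


def max_class_class_similarity(sim_matrix):
    """Average of the per-method strongest links."""
    strongest = [max_method_class_similarity(row) for row in sim_matrix]
    return sum(strongest) / len(strongest)


# Assumed arrangement of the slide's nine link values (one row per method).
links = [
    [0.7, 0.5, 0.3],  # method1
    [0.6, 0.4, 0.2],  # method2
    [0.4, 0.3, 0.2],  # method3
]

print([max_method_class_similarity(row) for row in links])  # [0.7, 0.6, 0.4]
print(round(max_class_class_similarity(links), 2))          # 0.57
```

The mean of 0.7, 0.6 and 0.4 is 0.5667, which the slide reports truncated as 0.56. Because a maximum is never smaller than an average, this variant always yields a value at least as large as the averaged one for the same links.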
Are We Measuring Anything New? Compare with other coupling measures: Coupling between object classes (CBO) [Chidamber’94]; Response for class (RFC) [Chidamber’94]; Message passing coupling (MPC) [Li’93]; Data abstraction coupling (DAC) [Li’93]; Information-flow based coupling (ICP) [Lee’95]; A suite of coupling measures by Briand et al.: ACAIC, OCAIC, ACMIC and OCMIC Tools: Columbus [Ferenc’04]; IRC2M
Software Systems Ten open-source systems from different domains
Principal Component Analysis Identifying groups of metrics (variables) which measure the same underlying mechanism that defines coupling (dimension) PCA procedure: collect data; identify outliers; perform PCA
PCA Results: Rotated Components CoCC and CoCCm define new dimensions (PC2 and PC6)
Discussion of the Results Conceptual similarities between all pairs of classes Selected classes with highest values of conceptual coupling No direct structural dependencies
Discussion of the Results Cont. Concepts TortoiseCVS: merge and update CVS operations WinMerge: checking out a revision of the file Related concepts and history of common changes
Current & Future Work Connection to change/fault proneness Impact analysis Hidden dependencies/indirect coupling Aspect mining Refining canonical feature sets Concept location and clustering