Information Theory in Software Metrics (Assessment and Issues) Steve Counsell, (Brunel University and CREST)


1 Information Theory in Software Metrics (Assessment and Issues) Steve Counsell, (Brunel University and CREST)

2 Introduction
- Coupling:
  - Well understood
  - Excessive coupling should be avoided
  - Excessive coupling has been associated empirically with fault-proneness, in C++ at least
- The Coupling Between Objects (CBO) metric of Chidamber and Kemerer has dominated the area
  - A simple count of the number of unique classes to which any single class is coupled (in whatever way)
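As a rough illustration of the "simple count" idea, the sketch below computes a CBO-style value per class from a map of which classes use which. The class names and the `uses` map are hypothetical, and treating coupling as bidirectional (using or being used both count) is an assumption made here to match "in whatever way", not something the slides spell out.

```python
from collections import defaultdict


def cbo(uses: dict[str, set[str]]) -> dict[str, int]:
    """Count the distinct other classes each class is coupled to.

    Assumption: coupling counts in both directions, and a class is
    never counted as coupled to itself.
    """
    coupled = defaultdict(set)
    for cls, targets in uses.items():
        coupled[cls]  # make sure classes with no couplings still appear
        for other in targets:
            if other != cls:
                coupled[cls].add(other)
                coupled[other].add(cls)
    return {cls: len(others) for cls, others in coupled.items()}


# Hypothetical example system, for illustration only.
uses = {
    "Order": {"Customer", "Item"},
    "Invoice": {"Order", "Customer"},
    "Item": set(),
}
result = cbo(uses)
print(result["Order"], result["Invoice"], result["Customer"], result["Item"])  # 3 2 2 1
```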

3 Introduction (cont.)
- Theoretical properties are also well understood
  - Coupling of a modular system is non-negative
  - Merging two modules cannot increase system coupling
- Based on a modular system comprising nodes and edges connecting those nodes

4 Information Theoretic metrics (for coupling)
- Pioneered by Allen and Khoshgoftaar (A&K)
- First appeared based on Allen's PhD work, c. 1996
- METRICS paper in 1999
- At the time created a bit of a stir
  - Metrics community re-think
  - Could be applied to both OO and procedural
  - Appealed to the cross-disciplinary ethos

5 Roadmap
- Explain A&K's metric for system coupling
  - Based on a modular system graph
- Demonstrate its usefulness and drawbacks
- Identify open issues
  - Research paths in evaluating/modifying the metric
  - Other applications

6 Explaining A&K's coupling

7 A modular system Source: Allen and Khoshgoftaar, 1999

8 Inter-module coupling Source: Allen and Khoshgoftaar, 1999

9 Part I

10 Representation Source: Allen and Khoshgoftaar, 1999

11 Entropy
- The average information per node
- Always non-negative
- Defined as: [equation not preserved in this transcript]

12 Entropy (cont.)
- All logs are base 2
- Unit of measure is a bit
- The graph selected has entropy H(S) of 2.46
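The entropy described on these two slides can be sketched as the Shannon entropy, in bits, of the distribution of node labels. The sketch assumes that a node's label is its row in the node-by-edge incidence table (1 where the edge touches the node, 0 otherwise), as in the node table on the later "For node 2" slide. The helper names and the four-node toy graph are hypothetical and are not the graph from the slides.

```python
from collections import Counter
from math import log2


def incidence_rows(n_nodes, edges):
    """One label per node: a tuple with 1 for each edge that touches the node."""
    return [tuple(int(node in edge) for edge in edges) for node in range(n_nodes)]


def label_entropy(labels):
    """Shannon entropy (base 2) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical toy graph: nodes 0..3, with edges 1-2 and 2-3.
toy_rows = incidence_rows(4, [(1, 2), (2, 3)])
h_toy = label_entropy(toy_rows)
print(h_toy)  # 2.0, because all four labels are distinct
```

Four equally likely labels give log2(4) = 2 bits, which is the largest value a four-node graph can reach under this definition.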

13 Part II

14 Sub-graph analysis
- Consider the subgraph S_i consisting of all the nodes in S and the edges of S that have the i-th node as an end point
- Disconnected nodes are included in the subgraph
- Calculate the same probability distribution as we did previously

15 For node 2

Node  Edge 1  Edge 4
0     0       0
1     1       0
2     1       1
3     0       0
4     0       0
5     0       0
6     0       0
7     0       0
8     0       0
9     0       0
10    0       0
11    0       1
12    0       0
13    0       0
14    0       0

Source: Allen and Khoshgoftaar, 1999
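As a worked check on the node 2 table, the sketch below computes the entropy of its label distribution. It assumes that each node's label is its (Edge 1, Edge 4) pair. The resulting value is computed here and is not stated on the slides.

```python
from collections import Counter
from math import log2

# (Edge 1, Edge 4) entries for nodes 0..14, copied from the node 2 table.
rows = [(0, 0)] * 15
rows[1] = (1, 0)
rows[2] = (1, 1)
rows[11] = (0, 1)


def label_entropy(labels):
    """Shannon entropy (base 2) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


h_node2 = label_entropy(rows)
print(round(h_node2, 2))  # 1.04
```

Twelve of the fifteen nodes share the label (0, 0) and the other three labels occur once each, so the distribution is (12/15, 1/15, 1/15, 1/15), which yields roughly 1.04 bits.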

16 Entropy (for distribution of node labels)
- Defined as: [equation not preserved in this transcript]

17 Entropy (cont.)
- Gives a total entropy, the sum of H(S_i) for i = 0..14, of 6.28

18 Part III

19 Ethos of the coupling metric
- The entropy of the modular system taken as a whole is less than or equal to the sum of the entropies of the individual components
  - H(S) <= sum of H(S_i)
- The difference between these values represents the true coupling relationships, or excess entropy

20 Excess entropy C(S)
- C(S) = (sum of H(S_i)) - H(S) = 6.28 - 2.46 = 3.82

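21 Coupling in a modular system (MS)
- Coupling(MS) = (n+1) C(S), where n+1 = 15 is the number of nodes (0..14)
- Coupling(MS) = 15 * 3.82 ≈ 57.28 (the rounded figures multiply to 57.30; 57.28 is consistent with using unrounded entropies)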
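Putting the three parts together, a minimal sketch of the whole calculation might look like the following. It assumes that every edge passed in is an inter-module edge, that node labels are incidence-table rows, and that S_i keeps all nodes but only the edges touching node i. The function names and the toy graph are hypothetical, and this is not A&K's own implementation.

```python
from collections import Counter
from math import log2


def label_entropy(labels):
    """Shannon entropy (base 2) of the empirical distribution of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def incidence_rows(n_nodes, edges):
    """One label per node: a tuple with 1 for each edge that touches the node."""
    return [tuple(int(node in edge) for edge in edges) for node in range(n_nodes)]


def coupling_ms(n_nodes, edges):
    """(number of nodes) * (sum of subgraph entropies - system entropy)."""
    h_system = label_entropy(incidence_rows(n_nodes, edges))
    h_total = 0.0
    for i in range(n_nodes):
        # Subgraph S_i: all nodes, but only the edges with node i as an end point.
        sub_edges = [e for e in edges if i in e]
        h_total += label_entropy(incidence_rows(n_nodes, sub_edges))
    excess = h_total - h_system
    return n_nodes * excess


# Hypothetical toy graph: nodes 0..3, with edges 1-2 and 2-3.
toy_coupling = coupling_ms(4, [(1, 2), (2, 3)])
print(toy_coupling)  # 8.0
```

For the toy graph, H(S) = 2, the subgraph entropies are 0, 1, 2 and 1, the excess entropy is 2, and the coupling is 4 * 2 = 8.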

22 Assessment of the metric

23 A metric sensitive to patterns of connections. This is attractive, because software engineers recognize patterns as well (Allen and Khoshgoftaar, 1999)

24 Coupling(MS):

Graph  Coupling(MS)
a      2.76
b      8.00
c      16.00
d      17.32
e      24.07
f      26.83
g      30.83
h      34.83
i      22.04
j      27.78

Source: Allen and Khoshgoftaar, 1999

25 Coupling (CBO) alongside Coupling(MS):

Graph  CBO  Coupling(MS)
a      2    2.76
b      4    8.00
c      6    16.00
d      6    17.32
e      8    24.07
f      8    26.83
g      10   30.83
h      12   34.83
i      8    22.04
j      8    27.78

Source: Allen and Khoshgoftaar, 1999
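One way to see where the two metrics disagree is to group the ten graphs by their CBO value and look at the spread of Coupling(MS) inside each group. The sketch below uses only the figures from slides 24 and 25; the variable names are illustrative.

```python
from collections import defaultdict

# Values for graphs a..j as listed on slides 24 and 25.
cbo = {"a": 2, "b": 4, "c": 6, "d": 6, "e": 8,
       "f": 8, "g": 10, "h": 12, "i": 8, "j": 8}
coupling_ms = {"a": 2.76, "b": 8.00, "c": 16.00, "d": 17.32, "e": 24.07,
               "f": 26.83, "g": 30.83, "h": 34.83, "i": 22.04, "j": 27.78}

# Group graphs that CBO cannot tell apart.
by_cbo = defaultdict(list)
for graph, value in cbo.items():
    by_cbo[value].append((graph, coupling_ms[graph]))

for value in sorted(by_cbo):
    members = ", ".join(f"{g}={ms:.2f}" for g, ms in by_cbo[value])
    print(f"CBO {value}: {members}")
```

Graphs e, f, i and j all have a CBO of 8, yet their Coupling(MS) values range from 22.04 to 27.78, which is the sensitivity to connection patterns that slide 23 refers to.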

26 Comparison with CBO

27 Issues
- Computes system coupling
  - Most coupling studies use a class coupling basis
  - Need a class-based entropy measure (NHD)
- Comparison between i. and j.
  - Suggests that i. is better than j.
  - OO people might disagree with an inheritance structure being better
  - Maintaining the root node would be highly problematic
- Do developers really look for patterns?
- Does not take into account the type of coupling
- Cannot be gleaned from a UML class diagram

28 Potential studies
- Fault analysis
  - Which of the two correlates more with faults?
- Larger-scale study
- The effect of refactoring on the values of both Coupling(MS) and CBO
- Hamming distance for coupling?
- A final word on cohesion...

29 Cohesion
- A key advantage of CBO, and the reason for its popularity, is that there is no argument about its interpretation; it is an objective measure. To some extent the same is true of Coupling(MS)
- The same cannot be said about cohesion, because it is subjective

30 Thanks for listening

