© S. Demeyer, S. Ducasse, O. Nierstrasz

7. Problem Detection
  Metrics
    Software quality
    Analyzing trends
  Duplicated Code
    Detection techniques
    Visualizing duplicated code

Why Metrics in OO Reengineering (ii)?
  Assessing Software Quality
    Which components have poor quality? (Hence could be reengineered)
    Which components have good quality? (Hence should be reverse engineered)
    Metrics as a reengineering tool!
  Controlling the Reengineering Process
    Trend analysis: which components did change?
    Which refactorings have been applied?
    Metrics as a reverse engineering tool!

ISO 9126 Quantitative Quality Model
  Leaves are simple metrics, measuring basic attributes

Product & Process Attributes

External & Internal Attributes

External vs. Internal Product Attributes
  External
    Advantage:
      close relationship with quality factors
    Disadvantages:
      measure only after the product is used or the process took place
      data collection is difficult; often involves human intervention/interpretation
      relating external effect to internal cause is difficult
  Internal
    Disadvantage:
      relationship with quality factors is not empirically validated
    Advantages:
      can be measured at any time
      data collection is quite easy and can be automated
      direct relationship between measured attribute and cause

Metrics and Measurements
  [Wey88] defined nine properties that a software metric should satisfy. Read [Fenton] for critiques.
  For OO, only 6 properties are really interesting [Chid94, Fenton]:
  1. Noncoarseness: given a class P and a metric m, another class Q can always be found such that $m(P) \neq m(Q)$.
     Not every class has the same value for a metric.
  2. Nonuniqueness: there can exist distinct classes P and Q such that $m(P) = m(Q)$.
     Two classes can have the same metric value.
  3. Monotonicity: $m(P) \leq m(P+Q)$ and $m(Q) \leq m(P+Q)$, where P+Q is the “combination” of the classes P and Q.

Metrics and Measurements (ii)
  4. Design Details are Important: the specifics of a class must influence the metric value.
     Even if two classes perform the same actions, their design details should have an impact on the metric value.
  5. Nonequivalence of Interaction: $m(P) = m(Q) \not\Rightarrow m(P+R) = m(Q+R)$, where R is an interaction with the class.
  6. Interaction Increases Complexity: $m(P) + m(Q) < m(P+Q)$.
     When two classes are combined, the interaction between the two can increase the metric value.
  Conclusion: Not every measurement is a metric.

Selecting Metrics
  Fast
    Scalable: you can’t afford $O(n^2)$ when $n \geq 1$ million LOC
  Precise
    (e.g. #methods: do you count all methods, only public ones, also inherited ones?)
    Reliable: you want to compare apples with apples
  Code-based
    Scalable: you want to collect metrics several times
    Reliable: you want to avoid human interpretation
  Simple
    Complex metrics are hard to interpret

Assessing Maintainability
  Size of the system, system entities
    Class size, method size, inheritance
    The intuition: large entities impede maintainability
  Cohesion of the entities
    Class internals
    The intuition: changes should be local
  Coupling between entities
    Within inheritance: coupling between class and subclass
    Outside of inheritance
    The intuition: strong coupling impedes locality of changes

Sample Size and Inheritance Metrics
  [Figure: metamodel with entities Class, Attribute, Method and relations Access, Invoke, BelongTo, Inherit]
  Inheritance Metrics
    hierarchy nesting level (HNL)
    # immediate children (NOC)
    # inherited methods, unmodified (NMI)
    # overridden methods (NMO)
  Class Size Metrics
    # methods (NOM)
    # instance attributes (NIA, NCA)
    sum of method sizes (WMC)
  Method Size Metrics
    # invocations (NOI)
    # statements (NOS)
    # lines of code (LOC)

Sample Class Size
  (NIV) [Lore94] Number of Instance Variables
  (NCV) [Lore94] Number of Class Variables (static)
  (NOM) [Lore94] Number of Methods (public, private, protected) (E++, S++)
  (LOC) Lines of Code
  (NSC) [Li93] Number of Semicolons, i.e. number of statements
  (WMC) [Chid94] Weighted Method Count
    $\mathrm{WMC} = \sum_i c_i$
    where $c_i$ is the complexity of method i (number of exits or McCabe Cyclomatic Complexity Metric)
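
The WMC sum above can be illustrated with a minimal Python sketch. It assumes the per-method complexities $c_i$ have already been measured by some other tool; the function name `weighted_method_count`, the class, and its complexity values are hypothetical and not taken from the slides.

```python
from typing import Dict


def weighted_method_count(method_complexity: Dict[str, int]) -> int:
    """Return WMC for one class: the sum of its per-method complexities c_i."""
    return sum(method_complexity.values())


# Hypothetical class with three methods and made-up complexity values
# (for example McCabe cyclomatic complexity per method).
account = {"deposit": 1, "withdraw": 3, "statement": 2}
wmc = weighted_method_count(account)
```

If every $c_i$ is set to 1, WMC reduces to the number of methods (NOM).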

Hierarchy Layout
  (HNL) [Chid94] Hierarchy Nesting Level, (DIT) [Li93] Depth of Inheritance Tree
    HNL, DIT = max hierarchy level
  (NOC) [Chid94] Number of Children
  (WNOC) Total number of Children
  (NMO, NMA, NMI, NME) [Lore94] Number of Methods Overridden, Added, Inherited, Extended (super call)
  (SIX) [Lore94] Specialization Index
    $\mathrm{SIX}(C) = \mathrm{NMO} \cdot \mathrm{HNL} / \mathrm{NOM}$
    Weighted percentage of overridden methods
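
As a rough illustration of HNL and SIX, the sketch below computes both for a toy hierarchy. It assumes single inheritance, a root class at level 0, and a SIX of 0 for a class without methods; these conventions, the function names, and the class names are assumptions made for the example only.

```python
from typing import Dict, Optional


def hierarchy_nesting_level(cls: str, parent: Dict[str, Optional[str]]) -> int:
    """Count the superclass links between cls and the root (assumed level 0)."""
    level = 0
    while parent.get(cls) is not None:
        cls = parent[cls]
        level += 1
    return level


def specialization_index(nmo: int, hnl: int, nom: int) -> float:
    """SIX(C) = NMO * HNL / NOM; returns 0.0 when NOM is 0 (an assumption)."""
    if nom == 0:
        return 0.0
    return nmo * hnl / nom


# Hypothetical single-inheritance hierarchy: Circle -> Shape -> Object.
parent = {"Object": None, "Shape": "Object", "Circle": "Shape"}
hnl_circle = hierarchy_nesting_level("Circle", parent)

# Suppose Circle has 12 methods, 3 of which override inherited ones.
six_circle = specialization_index(nmo=3, hnl=hnl_circle, nom=12)
```

With these made-up numbers the overriding ratio 3/12 is weighted by the nesting level 2, so a deep class that overrides much of what it inherits gets a high SIX.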

Method Size
  (MSG) Number of Message Sends
  (LOC) Lines of Code
  (MCX) Method Complexity
    Total complexity / total number of methods
    Weights: API call = 5, assignment = 0.5, arithmetic operation = 2, message with parameters = 3, ....
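
A small sketch of the MCX computation, using only the four weights listed on the slide (the slide elides the remaining ones, so they are left out here). The construct names used as dictionary keys, the function names, and the sample methods are hypothetical.

```python
from typing import Dict, List

# The four weights given on the slide; further weights are elided there.
WEIGHTS = {
    "api_call": 5.0,
    "assignment": 0.5,
    "arithmetic_op": 2.0,
    "message_with_params": 3.0,
}


def method_complexity(counts: Dict[str, int]) -> float:
    """Weighted sum of the constructs counted in one method."""
    return sum(WEIGHTS[kind] * n for kind, n in counts.items())


def mcx(methods: List[Dict[str, int]]) -> float:
    """Total complexity divided by the total number of methods."""
    if not methods:
        return 0.0  # assumption: no methods means no complexity
    return sum(method_complexity(c) for c in methods) / len(methods)


# Two hypothetical methods described by their construct counts.
methods = [
    {"api_call": 1, "assignment": 2},              # 5 + 2 * 0.5 = 6
    {"arithmetic_op": 1, "message_with_params": 1},  # 2 + 3 = 5
]
average_complexity = mcx(methods)
```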

Sample Metrics: Class Cohesion
  (LCOM) Lack of Cohesion in Methods
    [Chid94] for definition
    [Hitz95a] for critique
    $I_i$ = set of instance variables used by method $M_i$
    let $P = \{ (I_i, I_j) \mid I_i \cap I_j = \emptyset \}$
        $Q = \{ (I_i, I_j) \mid I_i \cap I_j \neq \emptyset \}$
    if all the sets $I_i$ are empty, P is empty
    $\mathrm{LCOM} = |P| - |Q|$ if $|P| > |Q|$, 0 otherwise
  Tight Class Cohesion (TCC), Loose Class Cohesion (LCC)
    [Biem95a] for definition
    Measure method cohesion across invocations
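
The LCOM definition above translates fairly directly into code. The following Python sketch assumes the sets $I_i$ are supplied as a mapping from method name to the instance variables that method uses; the function name `lcom` and the sample class are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def lcom(uses: Dict[str, Set[str]]) -> int:
    """LCOM = |P| - |Q| if |P| > |Q|, else 0, over all unordered method pairs."""
    sets = list(uses.values())
    if not any(sets):
        return 0  # all I_i are empty, so P is taken to be empty
    p = q = 0
    for a, b in combinations(sets, 2):
        if a & b:
            q += 1  # pair shares at least one instance variable
        else:
            p += 1  # pair shares nothing
    return p - q if p > q else 0


# Hypothetical class: only m1 and m2 share a variable ("b").
uses = {"m1": {"a", "b"}, "m2": {"b"}, "m3": {"c"}, "m4": {"d"}}
result = lcom(uses)
```

Here five of the six method pairs share no variable and one pair does, so LCOM is 5 - 1 = 4.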

Sample Metrics: Class Coupling (i)
  Coupling Between Objects (CBO)
    [Chid94a] for definition, [Hitz95a] for a discussion
    Number of other classes to which a class is coupled
  Data Abstraction Coupling (DAC)
    [Li93a] for definition
    Number of ADTs defined in a class
  Change Dependency Between Classes (CDBC)
    [Hitz96a] for definition
    Impact of changes from a server class (SC) to a client class (CC)

Sample Metrics: Class Coupling (ii)
  Locality of Data (LD)
    [Hitz96a] for definition
    $\mathrm{LD} = \sum_i |L_i| / \sum_i |T_i|$
    $L_i$ = non-public instance variables + inherited protected variables of the superclass + static variables of the class
    $T_i$ = all variables used in $M_i$, except non-static local variables
    $M_i$ = methods, excluding accessors
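
The LD ratio can be sketched as follows. This is one possible reading of the slide, not the definition from [Hitz96a]: it assumes that $L_i$ is the part of $T_i$ that belongs to the class's local data, and that accessor methods have already been filtered out. The function name, variable names, and sample data are hypothetical.

```python
from typing import List, Set


def locality_of_data(local_vars: Set[str], used_by_method: List[Set[str]]) -> float:
    """LD = sum |L_i| / sum |T_i|, with L_i taken as T_i restricted to local data."""
    local_total = sum(len(used & local_vars) for used in used_by_method)
    used_total = sum(len(used) for used in used_by_method)
    if used_total == 0:
        raise ValueError("LD is undefined when no method uses any variable")
    return local_total / used_total


# Hypothetical class whose local data are "balance" and "owner".
local_vars = {"balance", "owner"}
used_by_method = [
    {"balance", "rate"},          # T_1: one of two variables is local
    {"owner", "balance", "log"},  # T_2: two of three variables are local
]
ld = locality_of_data(local_vars, used_by_method)
```

With these made-up sets, 3 of the 5 variable uses are local, so LD is 0.6; a value close to 1 would mean the methods mostly work on the class's own data.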

The Trouble with Coupling and Cohesion
  Coupling and Cohesion are intuitive notions
    Cf. “computability”
    E.g., is a library of mathematical functions “cohesive”?
    E.g., is a package of classes that subclass framework classes cohesive? Is it strongly coupled to the framework package?

Conclusion: Metrics for Quality Assessment
  Can internal product metrics reveal which components have good/poor quality?
  Yes, but...
    Not reliable
      false positives: “bad” measurements, yet good quality
      false negatives: “good” measurements, yet poor quality
    Heavyweight approach
      Requires team to develop (customize?) a quantitative quality model
      Requires definition of thresholds (trial and error)
    Difficult to interpret
      Requires complex combinations of simple metrics
  However...
    Cheap once you have the quality model and the thresholds
    Good focus (about 20% of components are selected for further inspection)
  Note: focus on the most complex components first!
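
Once thresholds exist, selecting components for inspection is mechanical, as the following minimal sketch suggests. The threshold values, component names, and metric values are invented for illustration, and ranking by the number of exceeded thresholds is just one simple way to put the most suspicious components first.

```python
from typing import Dict, List, Tuple


def select_for_inspection(
    metrics: Dict[str, Dict[str, float]],
    thresholds: Dict[str, float],
) -> List[Tuple[str, List[str]]]:
    """Return (component, exceeded metrics), worst offenders first."""
    flagged = []
    for name, values in metrics.items():
        exceeded = [m for m, limit in thresholds.items() if values.get(m, 0) > limit]
        if exceeded:
            flagged.append((name, exceeded))
    # Stable sort: components exceeding more thresholds come first.
    flagged.sort(key=lambda item: len(item[1]), reverse=True)
    return flagged


# Made-up measurements and thresholds (found by trial and error in practice).
metrics = {
    "Parser": {"WMC": 120, "NOM": 60},
    "Token": {"WMC": 8, "NOM": 5},
    "Lexer": {"WMC": 70, "NOM": 20},
}
thresholds = {"WMC": 50, "NOM": 40}
candidates = select_for_inspection(metrics, thresholds)
```

The result is only a list of candidates: because of false positives and false negatives, each flagged component still needs manual inspection.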