Alexander Serebrenik and Mark van den Brand Theil index for aggregation of software metrics values.

1 Alexander Serebrenik and Mark van den Brand: Theil index for aggregation of software metrics values

2 Dn – Aggregate – Evolution / SET / W&I PAGE 1
Metrics and evolution
Measure: micro, need: macro

3 How can we aggregate values?
Industry: sum, average. Not always meaningful.
Theory: distribution. Controversial even for LOC. Requires separate effort for each metric.
Econometrics: measures of inequality (for wealth distribution).
Vasa et al. 2009: Gini coefficient
−Not decomposable!
We: Theil coefficient

4 Econometrics and software metrics
Vasa et al. 2009: Gini coefficient
Pro:
−range 0..1
−for metrics, usually 0.45..0.75
−quite stable in time, meaningful deviations
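
As a rough illustration of the Gini coefficient mentioned on this slide, here is a minimal Python sketch. It assumes the standard sorted-rank formula for a finite sample of non-negative values; the function name `gini` and the sample LOC values are hypothetical and are not taken from Vasa et al. or from the presentation.

```python
def gini(values):
    """Gini coefficient of a sample of non-negative metric values.

    Assumption: the usual sample formula over values sorted ascending,
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical lines-of-code values for five files (illustrative only).
loc_per_file = [3, 40, 120, 450, 18800]
print(gini(loc_per_file))
```

With this formula the result is 0 when all values are equal and (n − 1)/n when a single element holds everything, so it approaches 1 for large samples.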

5 Decomposition?
Groups of individuals.
How can we explain inequality? Programming language, development team, application domain.
How does the inequality evolve? How do the explanations evolve?
To measure inequality I we use the Theil index! Why? There are just two decomposable indices…

6 Evolution of the Theil index
[Figure panels: JBoss, Debian, Adempiere]
Slight increase in the index values.
A huge file added and then removed.
“Quite stable in time, meaningful deviations”

7 Explanation of inequality in LOC
[Figure labels: Adempiere, Package, Programming language, Debian]
Programming language is quite poor as an explanation.
Categories/packages are better as an explanation.
The most significant part of the inequality is due to the inequality within the groups.

8 OK… within a group. But which one?
[Figure label: Adempiere]
Only two languages contribute significantly to the inequality: Java and SQL.
Different packages contribute most at different versions.
More and more migration scripts, some very small (3 LOC), some rather big (18800 LOC).
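
One way to ask "which group?" is to rank groups by their share-weighted Theil index. The Python sketch below assumes the standard Theil T index and weights each group by its share of the total metric value; the function names and all LOC numbers except the 3 and 18800 quoted on the slide are made up for illustration, and the presentation's actual computation may differ.

```python
import math


def theil(values):
    """Standard Theil T index; zero values contribute 0 by convention."""
    n = len(values)
    mean = sum(values) / n  # assumes a positive mean
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


def within_contributions(groups):
    """Map each group name to (group share of total) * (Theil index of group)."""
    total = sum(sum(vs) for vs in groups.values())
    return {name: (sum(vs) / total) * theil(vs) for name, vs in groups.items()}


# Hypothetical LOC per file, grouped by language (illustrative data only).
loc_by_language = {
    "Java": [120, 450, 80, 2300],
    "SQL": [3, 18800, 40],
    "XML": [50, 55, 60],
}
contrib = within_contributions(loc_by_language)
ranked = sorted(contrib, key=contrib.get, reverse=True)
for name in ranked:
    print(name, round(contrib[name], 4))
```

In this toy data the SQL group dominates because one very large script sits next to very small ones, which mirrors the migration-script observation on the slide.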

9 Debian (preliminary results)
The largest contributor to inequality is ANSI C – presence of header files?
          etch     lenny
ANSI C    47.5%    45.9%
Shell     17.2%    12.2%
C++       15.7%    19.0%

10 Conclusions
Decomposability for software metrics
−provides insights into reasons for inequality
−allows comparing different groups
Theil index is a decomposable inequality measure.
Useful for assessing evolution
−of a system as a whole
−of the system's subcomponents

11 Theil index and its decomposition
Decomposition: groups G_1, …, G_n
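
A minimal sketch of a decomposable Theil index, assuming the standard Theil T form, T = (1/n) Σ (x_i/μ) ln(x_i/μ), and its usual split into a within-group and a between-group part over groups G_1, …, G_n. Whether the authors use exactly this variant is an assumption; the function names and sample data are illustrative.

```python
import math


def theil(values):
    """Theil T index: (1/n) * sum((x/mean) * ln(x/mean)); 0 * ln(0) is taken as 0."""
    n = len(values)
    mean = sum(values) / n
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


def theil_decomposition(groups):
    """Return (within, between) such that within + between == theil(all values).

    groups maps a group name (e.g. language or package) to its metric values.
    Each group is weighted by its share of the total metric value.
    """
    all_values = [x for vs in groups.values() for x in vs]
    total = sum(all_values)
    mean = total / len(all_values)
    within = 0.0
    between = 0.0
    for vs in groups.values():
        group_total = sum(vs)
        if not vs or group_total == 0:
            continue  # an all-zero group carries no share of the total
        share = group_total / total
        group_mean = group_total / len(vs)
        within += share * theil(vs)
        between += share * math.log(group_mean / mean)
    return within, between


# Hypothetical LOC values grouped by language (illustrative only).
groups = {"Java": [100, 200, 300], "SQL": [3, 18800], "Shell": [50, 50]}
within, between = theil_decomposition(groups)
overall = theil([x for vs in groups.values() for x in vs])
print(within, between, overall)
```

The between-group part is the inequality "explained" by the grouping; a small between-group share relative to the overall index corresponds to a grouping that is poor as an explanation.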
