Do Developers Focus on Severe Code Smells?
Tsubasa Saika1, Eunjong Choi1, Norihiro Yoshida2, Shusuke Haruna1, Katsuro Inoue1
1Osaka University, Japan 2Nagoya University, Japan

(Thank you for the introduction.) (Hello, I’m Tsubasa Saika. I’m a master’s student at Osaka University.) Now, I would like to talk about our research entitled “Do Developers Focus on Severe Code Smells?”.
Code Smell
A symptom of poor design that may hinder code comprehension[1]
Used to find structures in source code that suggest the possibility of refactoring

Examples of Code Smell
Name | Description | Recommended Refactoring
Blob Class | A large and complex class | Extract Class
Blob Operation | A large and complex method | Extract Method

First, I will explain the background of our research. A code smell is a symptom of poor design that may hinder code comprehension. It is used to find structures in source code that suggest the possibility of refactoring. This table shows two types of code smell. Blob Class is a type of code smell that indicates a large and complex class. Blob Operation is another type of code smell that indicates a large and complex method.

[1] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison Wesley, 1999.
Code Smell Detection Tool
Many code quality analysis tools can automatically detect code smells in source code.
Most of them use software quality metrics to identify code smells.

E.g.) Detection rule for Blob Class[2]
A class is detected as Blob Class when all of the following hold (AND):
LOC > VERY_HIGH
WMC > VERY_HIGH
TCC < LOW
Containing more than 2 Blob Operations

0:54 Nowadays, many code quality analysis tools can automatically detect code smells in source code. Most of them use software quality metrics to identify code smells. This figure shows an example of a detection rule for Blob Class. A class that has very high LOC and very high WMC, and that satisfies the other conditions, is detected as a Blob Class.

[2] M. Lanza and R. Marinescu. Object-Oriented Metrics in Practice. Springer, 2006.
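The detection rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the rule as implemented by any tool: it reads the figure as a conjunction of the four conditions, and the numeric thresholds, the `ClassMetrics` type, and the `is_blob_class` function are hypothetical names and values, since the slide gives only the symbolic levels VERY_HIGH and LOW.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration; the slide names
# the symbolic levels VERY_HIGH and LOW but does not give their values.
LOC_VERY_HIGH = 500
WMC_VERY_HIGH = 100
TCC_LOW = 0.33
MIN_BLOB_OPERATIONS = 2  # the rule requires more than this many


@dataclass
class ClassMetrics:
    name: str
    loc: int              # lines of code
    wmc: int              # weighted methods per class
    tcc: float            # tight class cohesion, in [0, 1]
    blob_operations: int  # number of methods flagged as Blob Operation


def is_blob_class(m: ClassMetrics) -> bool:
    """Return True only when every condition of the rule holds."""
    return (
        m.loc > LOC_VERY_HIGH
        and m.wmc > WMC_VERY_HIGH
        and m.tcc < TCC_LOW
        and m.blob_operations > MIN_BLOB_OPERATIONS
    )


big = ClassMetrics("ClassA", loc=1200, wmc=180, tcc=0.10, blob_operations=3)
small = ClassMetrics("ClassB", loc=80, wmc=12, tcc=0.60, blob_operations=0)
print(is_blob_class(big), is_blob_class(small))
```

Keeping the thresholds as named constants makes it easy to see that the rule itself is independent of the particular cut-off values a tool chooses.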
Prioritization of Code Smell
In large-scale source code, tools detect a large number of code smells.
Developers must determine which code smells should be preferentially refactored.

1:22 In large-scale source code, tools detect a large number of code smells. Therefore, developers must determine which code smells should be preferentially refactored.
Severity of Code Smell
To prioritize code smells, several tools calculate the severity value of each detected instance.
The severity value of each code smell is calculated based on software metrics.
Several software metrics are used to identify refactoring opportunities[3].

severity: 1 (light) 2 3 4 5 (heavy)

1:35 To prioritize code smells, several tools calculate the severity value of each detected instance. The severity value of a code smell is calculated based on software metrics. Several software metrics are used to identify refactoring opportunities.

[3] F. Simon et al. Metrics based refactoring. In Proc. of CSMR, 2001.
Research Motivation
Several code smell detection tools prioritize code smells based on the severity values.
It is still unclear whether the severity indicators are in line with developers’ perception.
If they are not in line with developers’ perception, severity-based prioritization is inappropriate for suggesting refactoring.
It is necessary to investigate whether developers focus on code with severe code smells.

2:02 Several code smell detection tools prioritize code smells based on the severity calculated from software metrics. However, it is still unclear whether the severity indicators are in line with developers’ perception. If they are not in line with developers’ perception, severity-based prioritization is inappropriate for suggesting refactoring. Therefore, it is necessary to investigate whether developers focus on code with severe code smells.
Overview of Research
We investigated the relationship between the severity of code smells and refactoring.

Research Questions (RQ)
RQ1: Do developers perform refactoring more frequently on code with more severe code smell?
RQ2: Does refactoring decrease the severity of code smell?

2:31 Next, I will give an overview of this research. We investigated the relationship between the severity of code smells and refactoring performed in software development. In this study, we set up two research questions. The first one is “Do developers perform refactoring more frequently on code with more severe code smell?” and the second one is “Does refactoring decrease the severity of code smell?”.
Analyzed Dataset
This study investigated three Java OSS systems.
We also used the dataset of refactoring that was collected by Bavota et al.[6]

Statistics data of analyzed systems
System | Period | # Release | # Class | # Refactoring
Xerces-J | Oct. 1999 - Nov. 2010 | 34 | 19,567 | 6,052
ArgoUML | Oct. 2002 - Dec. 2011 | 12 | 43,686 | 3,423
Apache Ant | Jan. 2000 - Dec. 2010 | 18 | 22,768 | 1,493

2:59 In our investigation, we analyzed sixty-four release versions of three Java open source software systems. We also used the dataset of refactoring that was collected by Bavota et al. This table shows the summary of the analyzed systems.

[6] G. Bavota et al. “An Experimental Investigation on the Innate Relationship between Quality and Refactoring.” Journal of Systems and Software, 2015.
Overview of Investigation
1. Detect Code Smell (inFusion): source code of each release version -> collection of severity values
2. Trace severity values: collection of severity values -> changes of severity values
3. Categorize whether Code Smell was refactored (using the dataset of refactoring): refactored classes/methods with Code Smell, non-refactored classes/methods with Code Smell
4. Conduct significance tests

3:19 Our investigation consists of four steps. In the first step, we detected code smells in the source code of each release version.
Detecting Code Smell
We used inFusion*1, a tool that detects code smells based on software metrics.
It calculates a severity value on a scale from 1 to 10.
1 is the most trivial and 10 is the worst.

E.g.) Detection of Blob Class with inFusion
1. Calculates metrics: LOC, WMC, and TCC of Class A
2. Identifies code smell: identifies Class A as Blob Class
3. Calculates severity: calculates the severity based on the metrics values

Output
source | type | severity
Class A | Blob Class | 4

3:36 For detecting code smells, we used inFusion, which is a tool that detects code smells based on software metrics. It calculates a severity value on a scale from 1 to 10. 1 is the most trivial and 10 is the worst. This figure illustrates an example of the detection of Blob Class. First, inFusion calculates software metrics values from source code. Next, it identifies the type of code smell based on the metrics values. Finally, it calculates the severity value. As a result, inFusion outputs the name of the class or the method, the type, and the severity of each detected code smell.

*1 http://www.intooitus.com/products/infusion
Overview of Investigation
2. Trace severity values

4:29 In the second step, we traced the severity value of each code smell through consecutive release versions.
Tracing Severity Values
Trace the severity value of each code smell between consecutive release versions.

Class A, Release Version k vs. Release Version k+1
Level | Entity | Type | Severity (k) | Severity (k+1) | Change
Class | Class A | Blob Class | 4 | 4 | no change
Method | m1() | Blob Operation | 5 | 6 | increase (+1)
Method | m2() | Sibling Duplication | 1 | - | decrease (-1)

This study considers a class/method without smell as severity 0.

4:41 This figure shows an example of tracing severity values of code smells between release versions k and k+1. In release version k, Class A has three code smells: Blob Class, Blob Operation, and Sibling Duplication. In release version k+1, the severity of Blob Class is unchanged. The severity of Blob Operation is increased. Meanwhile, the severity of Sibling Duplication is decreased. In this study, the severity value of a class or a method without a smell is considered to be 0.
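The tracing step can be illustrated with a short Python sketch that reproduces the example on this slide. It assumes, purely for illustration, that smells are matched across releases by entity name and smell type; the `Severities` alias and the `severity_changes` function are hypothetical names, not part of inFusion or of the study's tooling.

```python
from typing import Dict, Tuple

# Key: (entity name, smell type); value: severity reported for that release.
# Matching smells across releases by this key is an assumption of the sketch.
Severities = Dict[Tuple[str, str], int]


def severity_changes(prev: Severities, curr: Severities) -> Dict[Tuple[str, str], int]:
    """Severity delta per smell between two consecutive releases.

    A smell missing from one release is treated as severity 0,
    following the convention stated on the slide.
    """
    keys = set(prev) | set(curr)
    return {k: curr.get(k, 0) - prev.get(k, 0) for k in keys}


release_k = {
    ("Class A", "Blob Class"): 4,
    ("m1()", "Blob Operation"): 5,
    ("m2()", "Sibling Duplication"): 1,
}
release_k1 = {
    ("Class A", "Blob Class"): 4,
    ("m1()", "Blob Operation"): 6,
    # m2() no longer has a smell, so its severity counts as 0
}

changes = severity_changes(release_k, release_k1)
for key in sorted(changes):
    print(key, changes[key])
```

Taking the union of the keys from both releases means that smells which disappear, and smells which newly appear, both receive a change value.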
Overview of Investigation
3. Categorize whether Code Smell was refactored

5:27 In the third step, we categorized code smells depending on whether they were refactored or not.
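As a rough illustration of the third step, the following Python sketch splits detected smells into a refactored group and a non-refactored group. The record layout, the set of `(release, entity)` pairs standing in for the dataset of refactoring, and the function name `split_by_refactoring` are all assumptions made for this sketch; the actual format of the dataset by Bavota et al. is not described in this talk.

```python
from typing import List, Set, Tuple

# (release, entity name, smell type, severity); this layout is hypothetical.
Smell = Tuple[int, str, str, int]


def split_by_refactoring(
    smells: List[Smell], refactored: Set[Tuple[int, str]]
) -> Tuple[List[Smell], List[Smell]]:
    """Partition smells into (refactored, non_refactored) groups."""
    hit: List[Smell] = []
    miss: List[Smell] = []
    for smell in smells:
        release, entity, _smell_type, _severity = smell
        if (release, entity) in refactored:
            hit.append(smell)
        else:
            miss.append(smell)
    return hit, miss


# Illustrative data only, not taken from the analyzed systems.
smells = [
    (1, "Class A", "Blob Class", 4),
    (1, "Class B", "Blob Class", 4),
    (1, "Class D", "Blob Class", 10),
]
refactored_entities = {(1, "Class A"), (1, "Class D")}

done, not_done = split_by_refactoring(smells, refactored_entities)
print([s[1] for s in done], [s[1] for s in not_done])
```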
Overview of Investigation
4. Conduct significance tests

5:37 In the final step, to answer the research questions, we conducted two significance tests between the refactored group and the non-refactored group.
Significance Test for RQ1
Do developers perform refactoring more frequently on code with more severe code smell?
RQ1: whether refactored classes/methods have more severe code smells than non-refactored classes/methods
We conducted the Mann-Whitney U test, a nonparametric significance test.

[Figure: example tables for the group of refactored Blob Class and the group of non-refactored Blob Class, each with the columns release, class, and severity. Caption: Compare the severity among the same type of Code Smell.]

5:48 To answer research question 1, “Do developers perform refactoring more frequently on code with more severe code smell?”, we investigated whether refactored classes or methods have more severe code smells than non-refactored classes or methods. We conducted the Mann-Whitney U test, which is a nonparametric significance test, for each type of code smell separately. For example, we compared the severity of code smells between the group of refactored Blob Class and the group of non-refactored Blob Class.
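To make the comparison concrete, here is a minimal Python sketch of a Mann-Whitney U test using only the standard library. It computes the U statistic with average ranks for ties and a two-sided p-value from the normal approximation with tie correction and no continuity correction, which is only a rough guide for small samples. The function names and the severity values are illustrative assumptions, not the study's data or tooling.

```python
import math
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts: Dict[float, int] = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


# Illustrative severities only, not taken from the analyzed systems.
refactored = [4, 10, 7, 8, 9, 6]
non_refactored = [4, 2, 3, 1, 5, 2]

u, p = mann_whitney_u(refactored, non_refactored)
print(f"U = {u}, significant at 0.05: {p < 0.05}")
```

In practice a statistics library would normally be used instead; the hand-written version is shown only so that each step of the test is visible.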
Results of U-test for RQ1
✓: significant difference (p<0.05), n/a: not applicable

6:29 This table shows the results of the Mann-Whitney U test obtained from all refactoring instances, as well as the results obtained from the refactoring instances corresponding to code smells. Fowler[1] and Lanza[2] suggested corresponding refactoring types for each type of code smell. In this presentation, we focus on the corresponding refactoring instances because only the corresponding refactorings are performed because of code smells. In this table, the check marks represent the existence of significant differences. ‘n/a’ indicates that the Mann-Whitney U test was not applied because they are non-refactored classes. (As shown in the table,) in the results obtained from the corresponding refactoring instances, there are significant differences only for Blob Class, Blob Operation, and Sibling Duplication.

[1] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison Wesley, 1999.
[2] M. Lanza and R. Marinescu. Object-Oriented Metrics in Practice. Springer, 2006.
Significance Test for RQ2
Does refactoring decrease the severity of code smell?
RQ2: whether there are significant differences in the change of the severity value between refactored classes/methods and non-refactored classes/methods
We also conducted the Mann-Whitney U test.

[Figure: example tables for the group of refactored Blob Class and the group of non-refactored Blob Class, each with the columns release (e.g. 1→2, 2→3), class, and change of severity. Caption: Compare the change of the severity among the same type of Code Smell.]

7:32 To answer research question 2, “Does refactoring decrease the severity of code smell?”, we investigated whether there are significant differences in the change of the severity value between refactored classes/methods and non-refactored classes/methods. We also conducted the Mann-Whitney U test for each type of code smell separately. We compared the change of the severity of code smells among the same type of code smell.
Results of U-test for RQ2
✓: significant difference (p<0.05), n/a: not applicable

8:10 This table shows the results of the Mann-Whitney U test for RQ2. As in the previous table, the check marks represent the existence of significant differences. ‘n/a’ indicates that the Mann-Whitney U test was not applied because they are classes without code smells or non-refactored classes. As shown in the table, there are significant differences for Blob Class, Blob Operation, and Internal Duplication.
Summary of Results
Only Blob Class and Blob Operation showed significant differences for both RQ1 and RQ2.
Refactoring was performed more frequently on code with more severe code smell.
Refactoring significantly decreased the severity.
The severity was not a useful indicator for the other types of Code Smell.
For Blob Class and Blob Operation, it is useful to preferentially show more severe Code Smell to developers.

8:55 Only Blob Class and Blob Operation showed significant differences for both RQ1 and RQ2. For these types of code smells, refactorings were performed more frequently on code with more severe code smell, and refactorings significantly decreased the severity. In contrast, the severity was not a useful indicator for the other types of code smell. In conclusion, for Blob Class and Blob Operation, it is useful to preferentially show more severe code smells to developers.
Summary & Future Work
Summary
This study investigated the effect of the severity of code smells on refactorings.
For Blob Class and Blob Operation, developers perform refactoring more frequently on code with more severe code smell.
Future Work
Analyze additional software systems

9:33 In summary, this study investigated the effect of the severity of code smells on refactorings performed through 64 releases of three Java OSS systems. Our results show that, for Blob Class and Blob Operation, developers perform refactoring more frequently on code with more severe code smell. As future work, we plan to analyze additional software systems to improve the generality of our findings. Thank you very much for your attention.