Why do we refactor? Technical Debt Items Versus Size: A Forensic Investigation of What Matters
Ehsan Zabardast & Javier Gonzalez Huerta

Hello everyone, I’m Ehsan Zabardast. I am a PhD candidate in Software Engineering here at BTH. So, without further ado, let’s start.
Motivation:
The decisions behind refactoring the code:
- Preparing the system for future changes?
- Dealing with forms of technical debt?
- Maintaining the code?
- Code understandability?
- …

We started by investigating technical debt and file size. As you know, there are different reasons behind a decision to refactor the code. For example, code is refactored when preparing the system for future changes: developers tend to clean and refactor files before adding a new feature to the product. Developers also refactor the code when they are dealing with technical debt, either to maintain the code by removing technical debt or to increase code understandability by, for example, reducing complexity. There are other motivations behind refactoring, but they are out of the scope of this study.

We are in the process of studying the effectiveness of refactoring operations, and we started by investigating TD items and file size as a step towards that goal. We use technical debt items as a proxy for capturing technical debt and file size as a proxy for capturing the complexity of code entities.

----
Decisions behind refactoring:
- Dealing with technical debt by improving code quality while maintaining the functionality
- Maintaining the code by reducing the size and/or complexity of the code entities being refactored
- Increasing understandability by refining the code to be more readable, for example by renaming methods and variables
- Preparing the system for future changes

We are using TD items as representations of TD and file size as a proxy for maintainability. “A technical debt item represents a maintenance task that has been postponed and may cause a problem in future if not completed at present” [1].

[1] Guo, Y., Seaman, C., Gomes, R., Cavalcanti, A., Tonin, G., Da Silva, F. Q. B., … Siebra, C. (2011). Tracking technical debt - An exploratory case study. IEEE International Conference on Software Maintenance, ICSM, 528–531. https://doi.org/10.1109/ICSM.2011.6080824
Study Design:
Setup:
- Empirical study
- 3 open source, Java-based projects
- Quantitative analysis on the repositories of the selected projects
- Commit-level analysis

To achieve this goal, we designed an empirical study in which we analyzed 3 open source, Java-based projects. These projects were selected because they are of a certain size and they have been investigated in similar studies. We decided to start with open source projects for three reasons: to prepare the analysis procedure to be applied to industrial cases, because there are no restrictions on publishing the results of experiments on open source projects, and, lastly, to compare the industrial cases with open source projects later. We analyzed each project at the smallest granularity, which is a commit-level analysis.

-----
Smallest granularity.
Why we chose these projects:
- They are of a certain size
- They have been investigated in similar studies
Why open source?
- Not a problem for publishing; however, I will show you some preliminary results from the industrial data as well
- To prepare the analysis procedure to be applied to industrial projects
- Later, to compare them together
Studied Cases:

System              Analyzed commits   Total commits
Apache Ant          1 – 7,000          14,194
Apache Camel        1 – 7,000          43,832
Apache Commons IO   1 – 2,140          2,140

The studied projects are Apache Ant, Apache Camel, and Apache Commons IO. In the case of Ant and Camel, we analyzed the first 7,000 commits, for three reasons. We expect more problems to be located in the initial stages of development. The systems are more unstable in this phase, so we have a better chance of finding more refactoring operations to investigate. And the systems are smaller in the initial development phase, which gives us better traceability when tracking technical debt items. In the case of Commons IO, we analyzed the entire history, since it only has 2,140 commits.

---
Start with Ant and Camel. Starting with the first 7,000 commits: we expect more problems to be located in the initial stages; the systems are more unstable in the initial stages, so we expect to find more refactorings to investigate; the systems are smaller, which gives us better control over tracking the TD items and refactorings; refactoring will happen early in development to stabilize the system.
Research Question:
Do refactorings happen on larger files or on the files which contain Technical Debt Items?

So in this study we are investigating the file size and the number of technical debt items per file to see whether they matter when the developer decides to refactor.

------
We want to see what matters when it comes to refactoring. Refactoring is a good way to mitigate TD items and complexity, so we want to see whether refactorings happen on larger files or on the files with more TD items.
Methodology:
We are utilizing three tools:
- RefactoringMiner to detect the refactoring operations
- SonarQube to detect the technical debt items
- cloc to calculate file sizes

Here, in an example, I will explain the procedure. For a given commit i:
- We detect the refactoring operations that have happened in that commit (give the example)
- We detect the technical debt items (give the example)
- We calculate the file size in that commit as well
Later, we investigate commit i-1 to observe the state of the system before the refactoring happened.

--------
We only investigate the commits in which we detected at least one refactoring operation. The analysis is confined within each commit. We know that this is a threat to validity, but since we cannot be sure that all team members have the access or the ability to change everything everywhere, it is not fair to compare the refactored files with the whole system.
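The procedure above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Commit` record and its fields are hypothetical stand-ins for the outputs of the three tools, not their actual interfaces, and treating the other files changed in the same commit as the non-refactored group is one possible reading of "confined within each commit".

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Hypothetical per-commit record assembled from the tool outputs."""
    changed: set = field(default_factory=set)     # paths touched by the commit
    refactored: set = field(default_factory=set)  # paths with a detected refactoring
    metrics: dict = field(default_factory=dict)   # path -> (lines_of_code, td_items)


def collect_groups(history):
    """Split file measurements into refactored and non-refactored groups.

    For every commit i with at least one detected refactoring, each changed
    file is measured in commit i-1, i.e. before the refactoring happened.
    """
    refactored, non_refactored = [], []
    for prev, cur in zip(history, history[1:]):
        if not cur.refactored:
            continue  # only commits with at least one refactoring are analyzed
        for path in sorted(cur.changed):
            if path not in prev.metrics:
                continue  # the file did not exist before this commit
            group = refactored if path in cur.refactored else non_refactored
            group.append(prev.metrics[path])
    return refactored, non_refactored


# Made-up history: one refactoring commit followed by a plain commit.
history = [
    Commit(metrics={"A.java": (100, 1), "B.java": (20, 1)}),
    Commit(changed={"A.java", "B.java", "C.java"}, refactored={"A.java"},
           metrics={"A.java": (80, 1), "B.java": (25, 1), "C.java": (10, 0)}),
    Commit(changed={"B.java"},
           metrics={"A.java": (80, 1), "B.java": (30, 1), "C.java": (10, 0)}),
]
refactored_group, non_refactored_group = collect_groups(history)
```

The two resulting lists hold (size, TD item count) pairs for the state before each refactoring commit, which is the input the statistical comparison needs.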
Methodology:
To answer the research question of the study, we conduct a Mann-Whitney U test to investigate whether the number of technical debt items in refactored files is significantly different from that in non-refactored files. The same goes for file size: we conduct a Mann-Whitney U test to investigate whether the size of refactored files is significantly different from the size of non-refactored files.
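As a minimal sketch of the test itself, here is a plain-Python, two-sided Mann-Whitney U test using the normal approximation with tie and continuity corrections. It is not the implementation used in the study, and the sample values are made up for illustration. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used instead.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic of sample x, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    combined = sorted(list(x) + list(y))

    # Give every distinct value the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        ranks[combined[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(ranks[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    # The variance of U is reduced when values are tied.
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference

    # Continuity correction, clamped so that z is never negative.
    z = max(abs(u_x - mean_u) - 0.5, 0) / math.sqrt(var_u)
    p_value = math.erfc(z / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


# Made-up file sizes (lines of code), for illustration only.
refactored_sizes = [320, 410, 515, 630, 780]
non_refactored_sizes = [40, 75, 90, 120, 150]
u_stat, p_value = mann_whitney_u(refactored_sizes, non_refactored_sizes)
```

The same function applies unchanged to the number of TD items per file. Because most files contain one TD item, that comparison has many ties, which is why the tie correction is included.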
Results:
The density distributions of refactored (blue) and non-refactored (red) files are relatively similar. As you can see, the peak of the distribution for non-refactored files is smaller than that for refactored files, which is more obvious in the case of Commons IO. The vertical lines represent the median of each distribution, and as we can see, the median for refactored files is larger than for non-refactored files, again most visibly in Commons IO. If you rescale the plots for Apache Ant and Camel, the difference becomes more visible.
Results:
The plots are not that informative. They show that, whether refactored or not, the files mostly contain one technical debt item.
Results:
File Size Matters (p-value < .05 for all systems)
- To reduce the file size
- Increasing maintainability
- The more code you have, the more problems appear
Number of TD Items does NOT Matter (p-value > .05 for all systems)
- Refactored or not refactored, the files have a similar number of TD items
- Refactoring does not happen just to remove the TD items
- The developers do not touch the files with more TD items

The results suggest that, within the scope of the analyzed commits of the studied systems, file size matters. This might be owing to the fact that the number of technical debt items does not matter.
Conclusions:
There are multiple reasons behind a decision to refactor. We have investigated two of these:
- File size: bigger files tend to get refactored
- Technical Debt Items: the number of TD items in refactored files is not significantly different from that in non-refactored files
What Comes Next:
- Investigating the effectiveness of refactorings
  - Refactorings are effective for removing TD items
  - Identifying the TD reducing/inducing refactoring operations
- TD items survivability, i.e., how they die
- The impact of refactoring on maintainability
  - By investigating the relationship between change entropy and refactoring operations
- Investigating the developers’ experience and expertise on the project
- Investigating asset management through developers’ perspective and experience, i.e., what matters

So, what comes next? We are following up the research in 5 branches. The first step is to study the effectiveness of refactorings, that is, how effective they are in removing technical debt items. This also includes investigating how refactorings affect other entities, meaning that they might introduce new technical debt items somewhere else in the system. This study is already being conducted, and I will share some preliminary results with you today.
Systems’ Evolution:
Here we have three systems where we analyzed the evolution of technical debt items together with the system. The systems are Apache Ant, Apache Camel, and an industrial case. The red line represents the system size in KLOC, the green line represents the total number of refactorings, and the blue line represents the accumulated number of refactoring operations. As we can see, the lines follow the same pattern in the case of Apache Ant ... in the case of Apache Camel ... and in the industrial case.
Refactoring Operations and TD Items:
What Comes Next:
- Investigating the effectiveness of refactorings
  - Refactorings are effective for removing TD items
  - Identifying the TD reducing/inducing refactoring operations
- TD items survivability, i.e., how they die
- The impact of refactoring on maintainability
  - By investigating the relationship between change entropy and refactoring operations
- Investigating the developers’ experience and expertise on the project
- Investigating asset management through developers’ perspective and experience, i.e., what matters

The second branch is where we study the survivability of technical debt items. In the third branch, we study the impact of refactorings on the maintainability of the code, where we investigate the relationship between change entropy and refactoring operations. In the fourth branch, we consider the developers' expertise and experience in the project and how it affects the technical debt in the system. And lastly, in the fifth branch, we investigate asset management through developers' perspective and experience. This is where we ask them to explain what matters.
Questions:
That sums it up. Questions?