Presented by Lu Xiao Drexel University Quantifying Architectural Debt
Agenda Problem statement Related work Background Approach Expected contributions Results achieved so far Evaluation plan Conclusion
Problem Statement (1) Architectural flaws: Flawed relations propagate defects among files. Flawed relations incur increasing maintenance costs over time. Technical Debt: Debts accumulate interest. Debts incur increasing penalty over time. Architectural flaws and Technical Debt together motivate Architectural Debt.
Problem Statement (2) How to define architectural debt? How to identify architectural debt? How to quantify the penalty (maintenance costs)? How to model the growing trend of penalty for architectural debt over time?
Related Work (1) Technical Debt [1]: A metaphor for the consequences of “short-cuts” taken for immediate goals. Technical debt has attracted increasing attention [2]: Alves et al. [3] built an ontology of technical debt by organizing and defining different types of debt. Maldonado and Shihab [4] proposed a way to identify “self-admitted” debts by reviewing the comments left by developers. Our work aims to advance the understanding and management of architectural debt, a type of technical debt, by automatically identifying, quantifying, and modeling such debts.
Related Work (2) Bug prediction: It aims at predicting the location of bugs to prioritize testing and debugging. History metrics [5], e.g. number of bugs, bug density, churn… Complexity metrics [6], e.g. LOC, fan-in, fan-out… Our work aims to: Discover architecture issues that cause bugs to propagate. Find refactoring opportunities to reduce the error rate through architectural improvement.
Related Work (3) Code Smell: According to Fowler [7], "a code smell is a surface indication that usually corresponds to a deeper problem in the system", e.g. code clones, god class, lazy class, and feature envy. Code smell has been used as a heuristic for approximating technical debt. Not all files with code smell are involved in technical debt [8]. Not all files in technical debt contain code smell. Our work aims to: Formally define architectural debt. Precisely identify architectural debt.
Background(1) A novel architectural model, the DRSpace model: The architecture of a software system can be represented using multiple overlapping DRSpaces. Each DRSpace represents a unique aspect of the architecture. We designed an architecture root detection algorithm based on the DRSpace model. Error-prone files are architecturally connected in DRSpaces. These DRSpaces contain architectural issues.
Background(2) We defined and identified recurring architectural issues in software systems as hotspot patterns: Unstable interface. Unhealthy inheritance. Cyclic dependencies. Modularity violations. We observed a strong positive correlation between the number of issues a file is involved in and its maintenance costs.
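One of the hotspot patterns above, cyclic dependencies, can be illustrated with a generic graph search. The sketch below is not the detection tool used in this work; it assumes a hypothetical `dependencies` map from each file to the files it depends on, and reports groups of two or more files that depend on each other cyclically (strongly connected components, found with Tarjan's algorithm). The file names are made up.

```python
def find_dependency_cycles(dependencies):
    """Return groups of mutually dependent files (size >= 2).

    `dependencies` maps a file name to the files it depends on.
    Each group is a strongly connected component of the dependency
    graph, computed with Tarjan's algorithm. A file that only depends
    on itself is not reported.
    """
    index = {}        # discovery order of each visited file
    lowlink = {}      # smallest index reachable from each file
    stack = []        # files of the components currently being built
    on_stack = set()
    groups = []
    counter = [0]

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for target in dependencies.get(node, ()):
            if target not in index:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index[target])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:
                groups.append(sorted(component))

    for node in list(dependencies):
        if node not in index:
            visit(node)
    return sorted(groups)


# Hypothetical example: A -> B -> C -> A forms a cycle, D is outside it.
deps = {
    "A.java": ["B.java"],
    "B.java": ["C.java"],
    "C.java": ["A.java", "D.java"],
    "D.java": [],
}
cycles = find_dependency_cycles(deps)
```

The recursive search is adequate for a small illustration; a very deep dependency chain would need an iterative version to stay within Python's recursion limit.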
Approach(1) We define an architectural debt (ArchDebt) as a group of architecturally connected files that incur high maintenance costs over time due to their flawed connections. Three key features: Flawed relations among files. Files coupled in revision history. Flawed relations between files persistently incur high maintenance costs over time.
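The second key feature, files coupled in revision history, can be approximated by counting how often two files change in the same commit. The sketch below is an illustration only: the input format (a list of commits, each a list of changed files), the function names, and the `min_cochanges` threshold are assumptions, not part of the approach as presented.

```python
from collections import Counter
from itertools import combinations


def cochange_counts(commits):
    """Count how often each pair of files was changed in the same commit."""
    counts = Counter()
    for files in commits:
        # Sort and deduplicate so each pair has one canonical key.
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts


def coupled_pairs(commits, min_cochanges=2):
    """Keep only the pairs that changed together at least `min_cochanges` times."""
    return {
        pair: n
        for pair, n in cochange_counts(commits).items()
        if n >= min_cochanges
    }


# Hypothetical revision history: each inner list is one commit.
commits = [
    ["Parser.java", "Lexer.java"],
    ["Parser.java", "Lexer.java", "Ast.java"],
    ["Ast.java"],
    ["Lexer.java", "Parser.java"],
]
pairs = coupled_pairs(commits, min_cochanges=2)
```

Here only `Lexer.java` and `Parser.java` pass the threshold, since they change together in three of the four commits.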
Approach (2) Automatically identify architectural debts. Quantify the maintenance consequences of architectural debts. Model the growing trend of maintenance costs accumulated on architectural debts over time.
Expected Contributions Moves the concept of technical debt from a metaphor closer to practice: Formal definition of architectural debt. Approach to identify, quantify, and model architectural debt. Contribution in practice: Enable an analyst to precisely locate the debts, and to quantify and model the maintenance costs of each debt. Help project managers make informed decisions about whether, where, and how to refactor.
Results Achieved (1) In the case study of an industry project: Identified architectural debts that were verified by developers. Quantified the impact scope and maintenance costs. Built a simple economic model: Cost required to refactor. Expected benefit of refactoring. We proposed a refactoring plan to the development team, which was accepted and planned for implementation. We are now gathering and analyzing data.
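The slide names only the two inputs of the economic model: the cost required to refactor and the expected benefit of refactoring. How they are combined is not stated, so the comparison below is an assumption for illustration: it treats the benefit as a constant saving per release and asks after how many releases the refactoring pays for itself. The function names, the per-release framing, and the numbers are all hypothetical.

```python
import math


def net_benefit(refactoring_cost, saving_per_release, releases):
    """Expected saving over `releases` releases minus the one-time refactoring cost."""
    return saving_per_release * releases - refactoring_cost


def break_even_releases(refactoring_cost, saving_per_release):
    """Smallest number of releases after which the refactoring has paid off.

    Returns None when there is no expected saving, so the cost is never recovered.
    """
    if saving_per_release <= 0:
        return None
    return math.ceil(refactoring_cost / saving_per_release)


# Hypothetical figures in arbitrary effort units.
releases_needed = break_even_releases(refactoring_cost=120, saving_per_release=50)
gain_after_three = net_benefit(refactoring_cost=120, saving_per_release=50, releases=3)
```

A constant saving is the simplest possible assumption; if the penalty of a debt grows over time, the saving would grow as well and the break-even point would arrive earlier.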
Results Achieved (2) Preliminary study in 15 open source projects: We selected multiple stable versions for analysis in each project. We located groups of architecturally connected files that are persistently change- and error-prone in these projects. We built simple linear regression models showing that the number of error-prone files in these groups increases over time.
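A simple linear regression of this kind fits a straight line to the number of error-prone files per version. The sketch below uses ordinary least squares written out by hand; the version numbers and file counts are made-up illustration data, not results from the 15 projects.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Value of the fitted line at x."""
    return slope * x + intercept


# Hypothetical data: number of error-prone files in one file group
# across five consecutive stable versions.
versions = [1, 2, 3, 4, 5]
error_prone_files = [4, 6, 9, 10, 13]

slope, intercept = fit_line(versions, error_prone_files)
next_version_estimate = predict(slope, intercept, 6)
```

A positive slope means the group gains error-prone files from version to version; for this made-up data the fitted slope is about 2.2 files per version.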
Evaluation Plan(1) Evaluation Questions: RQ1: Are the architectural debts identified by our approach real problems? That is, are they true and significant debts? RQ2: Are the proxies for quantifying architectural debt penalties reliable? What is the most reliable proxy? RQ3: How effective is the evolution model of architectural debts? Can it correctly estimate the amount of costs that have been and will be spent on a debt? Is it useful to stakeholders in making decisions?
Evaluation Plan(2) Open source project: Divide data into “past” and “future”, using “future” as an evaluation source. Reach out to the developers for feedback. Industry project: Ask for real effort data, if available. Get feedback from developers. If a refactoring is implemented based on our approach, we will track the costs and benefits of refactoring.
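The “past” and “future” split for open source projects could look like the sketch below: debts are identified from commits before a cutoff date, and the files they contain are then compared with the files touched by bug fixes after the cutoff. The commit tuple format, the cutoff, the example data, and the use of precision and recall as measures are assumptions for illustration; the plan above does not fix a specific measure.

```python
def split_history(commits, cutoff):
    """Split commits into those before `cutoff` and those on or after it.

    Each commit is a tuple (date, changed_files, is_bug_fix), with the
    date as an ISO 'YYYY-MM-DD' string so plain string comparison works.
    """
    past = [c for c in commits if c[0] < cutoff]
    future = [c for c in commits if c[0] >= cutoff]
    return past, future


def bug_fix_files(commits):
    """Files touched by at least one bug-fixing commit."""
    files = set()
    for _, changed, is_bug_fix in commits:
        if is_bug_fix:
            files.update(changed)
    return files


def precision_recall(predicted, actual):
    """Share of predicted files that were right, and share of actual files found."""
    hits = predicted & actual
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical history and a hypothetical set of files flagged from the "past".
commits = [
    ("2014-01-10", ["A.java", "B.java"], True),
    ("2014-03-02", ["C.java"], False),
    ("2014-07-15", ["A.java"], True),
    ("2014-09-30", ["D.java", "B.java"], True),
    ("2014-11-05", ["C.java"], False),
]
past, future = split_history(commits, cutoff="2014-06-01")
flagged_from_past = {"A.java", "B.java", "C.java"}
precision, recall = precision_recall(flagged_from_past, bug_fix_files(future))
```

In this made-up example two of the three flagged files are fixed again after the cutoff, and two of the three files fixed after the cutoff had been flagged.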
Conclusion We formally defined a specific form of technical debt: architectural debt. We proposed an approach to identify architectural debts, quantify the penalties, and model their evolution trend. We have evaluated the usefulness of our approach in one industry project and conducted a preliminary study in open source projects. We plan to evaluate the usefulness of our work using more projects, both industry and open source.
References
[1] W. Cunningham. The WyCash portfolio management system. In Addendum to Proc. 7th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 29-30, Oct
[2] Technical debt workshop series.
[3] N. S. Alves, L. F. Ribeiro, V. Caires, T. S. Mendes, and R. O. Spinola. Towards an ontology of terms on technical debt. In Managing Technical Debt (MTD), 2014 Sixth International workshop on
[4] E. da S. Maldonado and E. Shihab. Detecting and quantifying different types of self-admitted technical debt. SIGSOFT Softw. Eng. Notes, Apr
[5] T. J. Ostrand, E. J. Weyuker, and R. M. Bell. Predicting the location and number of faults in large software systems. IEEE Transactions on Software Engineering, 31(4):340-355,
[6] N. Nagappan, T. Ball, and A. Zeller. Mining metrics to predict component failures. pages 452-461,
[7] M. Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley, July
[8] N. Zazworka, A. Vetro, C. Izurieta, S. Wong, Y. Cai, C. Seaman, and F. Shull. Comparing four approaches for technical debt identification. Software Quality Journal, pages 1-24, 2013.