Quality analysis of industrial systems. LaQuSo experience
Serguei Roubtsov
/ LaQuSo / Mathematics & Computer Science
LaQuSo: Laboratory for Quality Software
− 9 employees, plus master's students and student assistants (HG 5.91)
− industrial projects
− research projects
− own research in software quality assessment and tooling
Analysis approach
− Focus on maintainability: static code analysis, architecture assessment, code reviews, tooling
− Based on quantifiable measures: software metrics
− Provides an answer to an analysis question; for this purpose the metrics should reflect quality criteria and come with thresholds
− Visualisation
What code is hard to maintain?
− Hard to understand
  − not documented
  − cluttered or inconsistently used/developed
  − too large
− Difficult to modify
  − duplicated
  − intertwined
  − non-extendable
  − non-portable
− Difficult to test / analyse
  − too complex
What to analyze?
− Architecture:
  − dependencies (layering)
  − dependency cycles
  − code external duplication
  − dead code
  − system documentation
− Code base:
  − code size and complexity
  − duplication metrics
  − potential bugs
  − documentation
  − adherence to standards
Our Tooling
Software Quality Analysis and Visualisation Toolset
[Diagram: toolset workflow, steps (1) to (4) around the AV Repository]

Example metrics output (one package per row):

Name, Class Count, Abstract Class Count, Ca, Ce, A, I, D, V
bsh,0,0,1,0,0,0,1,1
com.caucho.burlap.client,0,0,1,0,0,0,1,1
com.caucho.burlap.io,0,0,1,0,0,0,1,1
com.caucho.burlap.server,0,0,1,0,0,0,1,1
com.caucho.hessian.client,0,0,1,0,0,0,1,1
com.caucho.hessian.io,0,0,1,0,0,0,1,1
com.caucho.hessian.server,0,0,1,0,0,0,1,1
com.ibatis.common.util,0,0,1,0,0,0,1,1
oracle.toplink.essentials.sessions,0,0,1,0,0,0,1,1
oracle.toplink.exceptions,0,0,2,0,0,0,1,1
oracle.toplink.expressions,0,0,1,0,0,0,1,1
oracle.toplink.internal.databaseaccess,0,0,1,0,0,0,1,1
oracle.toplink.jndi,0,0,1,0,0,0,1,1
oracle.toplink.logging,0,0,1,0,0,0,1,1
oracle.toplink.publicinterface,0,0,2,0,0,0,1,1
oracle.toplink.queryframework,0,0,1,0,0,0,1,1
oracle.toplink.sessionbroker,0,0,1,0,0,0,1,1
oracle.toplink.sessions,0,0,2,0,0,0,1,1
oracle.toplink.threetier,0,0,1,0,0,0,1,1
oracle.toplink.tools.sessionconfiguration,0,0,1,0,0,0,1,1
oracle.toplink.tools.sessionmanagement,0,0,1,0,0,0,1,1
org.aopalliance.aop,0,0,9,0,0,0,1,1
org.aopalliance.intercept,0,0,24,0,0,0,1,1
org.apache.axis.encoding.ser,0,0,1,0,0,0,1,1
org.apache.catalina.loader,0,0,1,0,0,0,1,1
org.aspectj.weaver,0,0,2,0,0,0,1,1
org.aspectj.weaver.ast,0,0,1,0,0,0,1,1
org.aspectj.weaver.bcel,0,0,1,0,0,0,1,1
org.aspectj.weaver.internal.tools,0,0,1,0,0,0,1,1
org.aspectj.weaver.loadtime,0,0,1,0,0,0,1,1
org.quartz.spi,0,0,1,0,0,0,1,1
org.quartz.utils,0,0,1,0,0,0,1,1
org.quartz.xml,0,0,1,0,0,0,1,1
org.springframework.aop,24,20,17,6,0.83,0.26,0.09,1
org.springframework.aop.aspectj,39,7,3,24,0.18,0.89,0.07,1
org.springframework.aop.aspectj.annotation,27,3,0,19,0.11,1,0.11,1
org.springframework.aop.aspectj.autoproxy,3,0,1,8,0,0.89,0.11,1
org.springframework.aop.config,17,3,1,15,0.18,0.94,0.11,1
org.springframework.aop.framework,37,9,22,18,0.24,0.45,0.31,1
org.springframework.jdbc.core,53,20,6,20,0.38,0.77,0.15,1
org.springframework.jdbc.core.metadata,22,2,1,10,0.09,0.91,0,1
org.springframework.jdbc.core.namedparam,10,4,3,12,0.4,0.8,0.2,1
org.springframework.jdbc.core.simple,17,6,0,12,0.35,1,0.35,1
org.springframework.jdbc.core.support,8,5,2,14,0.62,0.88,0.5,1
org.springframework.jdbc.datasource,27,7,13,14,0.26,0.52,0.22,1
org.springframework.jdbc.datasource.lookup,8,2,2,13,0.25,0.87,0.12,1
org.springframework.jdbc.object,14,8,0,12,0.57,1,0.57,1
org.springframework.jdbc.support,15,5,12,16,0.33,0.57,0.1,1
org.springframework.jdbc.support.incrementer,15,4,0,8,0.27,1,0.27,1
org.springframework.jdbc.support.lob,18,5,5,12,0.28,0.71,0.02,1
org.springframework.jdbc.support.nativejdbc,10,2,2,7,0.2,0.78,0.02,1
org.springframework.jdbc.support.rowset,4,2,2,6,0.5,0.75,0.25,1
org.springframework.jdbc.support.xml,7,6,0,7,0.86,1,0.86,1
org.springframework.web.servlet.view.xslt,4,2,0,17,0.5,1,0.5,1
org.springframework.web.struts,16,5,0,22,0.31,1,0.31,1
org.springframework.web.util,24,6,26,15,0.25,0.37,0.38,1
org.w3c.dom,0,0,12,0,0,0,1,1
org.xml.sax,0,0,3,0,0,0,1,1
SQuAVisiT
− Flexible plug-in architecture
− Languages: C, Cobol, Java, JavaScript, PL/SQL, Delphi, C#
− Analysis tools (third-party and our own): dependency extractors, duplication detectors, error detectors, metrics calculators, parsers, code style checkers
− Visualization tools: MetricsView, GraphViz, ExTraVis, MatrixZoom, SolidSX, visualization modules of third-party tools
Real-life industrial systems
They are often:
− Heterogeneous (C/Assembler, Cobol/PL/SQL, Java/object mapping to SQL, …)
− Incomplete (some code is in libraries and third-party components)
− Not compilable and executable within the analysis environment ('weird' OS, proprietary development environment, …)
Industrial cases
− Range from 150 KLOC to 1.7 MLOC
− Homogeneous and heterogeneous
− Customers usually report the problems they experience:
  − Need to migrate due to discontinuation of support
  − Lack of knowledge about the system due to a high degree of staff rotation
  − Danger of architecture deterioration due to extensive changes
  − Maintenance (dis)continuation decision
As an illustration, we discuss only some of the analyses carried out in each case.
Overview of industrial cases

                | Expert system                              | Embedded system | Pension fund
Size (KLOC)     | 300 (JavaScript, C++, Java, PL/SQL, Cobol) | 150 (C)         | 1700 (Cobol, PL/SQL)
Age (years)     | 15                                         |                 | 17
Analyses shown  | Dependencies, duplication                  | Dependencies    | Dead code, effort
Visualization   |                                            |                 |
Expert system
Industrial case: Insurance company's expert system
What kind of system do we have?
− Heterogeneous: JavaScript, PL/SQL, C++, Java, Cobol
− Medium size: 300 KLOC
− 15 years old
− Scarce documentation
− Oracle DB
Problem reported
− Maintenance (dis)continuation decision
Dependencies
Model: Matrix View
− (Almost) layered: good design
− BUT the data layer is accessed from several layers
− Layers affected by calls from the top layer are visible (red squares)
[Figure label: Data layer]
Dependencies
Model: ExTraVis
− Green 'bubbles': controversial coding approach
  − Parameters as names: f(1,3) -> f_1_3
− Absence of a dedicated data access layer is confirmed
Code duplication
− Code is polluted with duplication: restructuring would improve maintainability but may change the architecture
− Tool: CCFinder/Gemini (Toshihiro Kamiya)
Summary
− Layered architecture
  − System is well structured
  but
  − JavaScript + two-tier architecture could cause serious maintenance problems in the future
− Code polluted by duplication
  − Low impact if no major changes are expected
Analysis advice:
− Short term
  − Refactor and maintain for a limited amount of time (3-5 years)
  − Develop overview documentation
− Long term
  − Migrate to a three-tier architecture
Industrial case: Embedded system
What kind of code do we have?
− Component system with compile-time binding via make files
− C with embedded Assembler
− Complete
− Medium size: 150 KLOC
Developer's assumption
− Layered architecture
Problems reported
− Extensive change. Is architectural purity still preserved?
Dependencies
Structure
− system is poorly layered
− unexpected cyclic dependencies exist between components
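The slides do not say how the cyclic dependencies were detected. As a minimal sketch, assuming the component dependencies have already been extracted into a mapping from each component to the components it uses, cycles can be found as strongly connected components (Tarjan's algorithm). The function name and the component names in the usage note are hypothetical.

```python
from collections import defaultdict


def find_cycles(dependencies):
    """Return groups of components that depend on each other cyclically.

    `dependencies` maps a component name to the components it uses.
    Each group is a strongly connected component with more than one
    member, or a single component that depends on itself.
    """
    graph = defaultdict(set)
    for source, targets in dependencies.items():
        graph[source].update(targets)
        for target in targets:
            graph[target]  # make sure every component has an entry

    index, lowlink = {}, {}
    stack, on_stack = [], set()
    groups = []

    def visit(node):
        # Recursive depth-first search; very deep graphs may need an
        # iterative variant to stay within Python's recursion limit.
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                lowlink[node] = min(lowlink[node], lowlink[nxt])
            elif nxt in on_stack:
                lowlink[node] = min(lowlink[node], index[nxt])
        if lowlink[node] == index[node]:
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            if len(group) > 1 or node in graph[node]:
                groups.append(sorted(group))

    for node in list(graph):
        if node not in index:
            visit(node)
    return sorted(groups)
```

For example, with `{"ui": ["logic"], "logic": ["data", "drivers"], "drivers": ["logic"], "data": []}` the function reports the group `["drivers", "logic"]`; a cleanly layered system yields an empty list.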
Summary
− Layered architecture
  − System is poorly structured (indications of decay)
  but
  − Code is of good quality, well documented
  − The system is NOT large or complex
Analysis advice:
− Reengineer affected parts according to the presumed layered architecture
Industrial case: Pension fund
What kind of system do we have?
− Homogeneous: Cobol
− Large: 1.7 MLOC
− 17 years old
− Oracle DB
Problem reported
− Need to migrate due to discontinuation of support
Dead code?
− "Empty spaces" in the visualization
− 1216 modules not called by other modules
  − Dead code? Other (sub)systems?
− 651 are dead
  − Confirmed by the developers
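The first step above, listing modules that no other module calls, can be sketched as follows. This is only an illustration, assuming a call graph is already available as a mapping from each module to the modules it calls; the function and module names are hypothetical. The result is a list of candidates: as the slide notes, some uncalled modules may be entry points or be used by other (sub)systems, so developers still need to confirm which ones are dead.

```python
def uncalled_modules(calls, entry_points=()):
    """Return modules that no other module calls (dead-code candidates).

    `calls` maps a module name to the modules it calls.
    `entry_points` lists modules known to be started from outside.
    """
    modules = set(calls)
    called = set()
    for caller, callees in calls.items():
        for callee in callees:
            modules.add(callee)
            if callee != caller:  # a self-call does not count as a use
                called.add(callee)
    return sorted(modules - called - set(entry_points))
```

With `{"MAIN": ["CALC", "REPORT"], "CALC": ["UTIL"], "REPORT": [], "OLDCALC": ["UTIL"], "UTIL": []}` the candidates are `MAIN` and `OLDCALC`; declaring `MAIN` as an entry point leaves only `OLDCALC`.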
Results of analysis: effort
− Halstead metrics
− Time to understand (T) is proportional to Halstead Effort (E): T = E / 18 / 3600
  − E / 18 gives seconds; dividing by 3600 gives hours
[Figure: time to understand, hours]
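As a worked illustration of the formula, the sketch below computes Halstead Effort from the standard operator and operand counts and converts it to hours using T = E / 18 / 3600. The function names are illustrative, and how operators and operands are counted for Cobol is not specified in the slides.

```python
import math

STROUD_NUMBER = 18  # divisor used in the slide's formula (seconds scale)


def halstead_effort(n1, n2, N1, N2):
    """Halstead Effort E = Difficulty * Volume.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total number of operators and operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    return difficulty * volume


def time_to_understand_hours(effort):
    """T = E / 18 / 3600: seconds first, then hours."""
    return effort / STROUD_NUMBER / 3600
```

For a toy module with 4 distinct operators, 4 distinct operands and 8 occurrences of each kind, the volume is 16 * log2(8) = 48, the difficulty is 4, and the effort is 192, which corresponds to roughly 11 seconds.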
Summary
− Architecture
  − System structure is preserved but intertwined in some places
− Dead code is widely spread
− Code is polluted by duplication
but
− Percentage of weak (large, complex) parts is low
Analysis advice:
− Short term:
  − Refactor weak parts, eliminate dead code and maintain for a limited amount of time
− Long term:
  − Redevelop on a modern platform
Industrial case: Insurance company's front-end
What kind of system do we have?
Technical data
− Homogeneous: Java
− Large: 750 KLOC
− Oracle DB
− J2EE application (Spring Framework, Hibernate)
Recently developed by a third party
No documentation
Problem reported
− Purchase decision
Code not available? What can we do?
1) Install locally.
2) Perform measurements.
3) Tune.
4) Analyse, interpret.
[Diagram label: Customer]
Understandability: Documentation
− Comments percentage (LOCs counter tool)
  − Average: 43%
  − Ranging from 0% to 1500%. Why?
    − large repeated header blocks of comments
    − commented-out code
− Javadoc (CheckStyle tool) violations
  − Missing or malformed declarations
  − Documentation generation is impossible, or
  − Documentation quality is compromised
Documentation quality should be reassessed!
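A value such as 1500% only makes sense if the percentage is taken relative to code lines rather than to all lines. The slides do not define the measure, so the sketch below assumes comment lines divided by code lines, and uses a deliberately simplified line classifier for Java-style comments (a trailing comment on a code line counts as code). It is not the LOC counter tool used in the analysis.

```python
def comment_percentage(source):
    """Comment lines as a percentage of code lines (assumed definition)."""
    comment_lines = 0
    code_lines = 0
    in_block = False
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines count as neither code nor comment
        if in_block:
            comment_lines += 1
            if "*/" in line:
                in_block = False
        elif line.startswith("//"):
            comment_lines += 1
        elif line.startswith("/*"):
            comment_lines += 1
            in_block = "*/" not in line
        else:
            code_lines += 1
    if code_lines == 0:
        return float("inf") if comment_lines else 0.0
    return 100.0 * comment_lines / code_lines
```

A file with a three-line header comment and one line of code scores 300% under this definition, which shows how large repeated header blocks or commented-out code push the figure far above 100%.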
/ LaQuSo SET / W&I
D_n: distance from the main sequence
− Abstractness = #AbstrClasses / #Classes
− Instability = #Out / (#Out + #In)
− D_n = | Abstractness + Instability - 1 |
[Figure: main sequence, zone of pain, zone of uselessness] [R. Martin 1994]
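The three formulas translate directly into code. In the sketch below, #Out and #In are taken to be the efferent (Ce) and afferent (Ca) coupling counts from the tool output shown earlier; treating a package without classes or without couplings as having a ratio of 0 is an assumption, chosen because it reproduces the rows of that output. Function names are illustrative.

```python
def abstractness(abstract_classes, classes):
    """A = #AbstrClasses / #Classes (0 for a package with no classes)."""
    return abstract_classes / classes if classes else 0.0


def instability(outgoing, incoming):
    """I = #Out / (#Out + #In) (0 for a package with no couplings)."""
    total = outgoing + incoming
    return outgoing / total if total else 0.0


def distance_from_main_sequence(abstract_classes, classes, outgoing, incoming):
    """D_n = |A + I - 1|."""
    a = abstractness(abstract_classes, classes)
    i = instability(outgoing, incoming)
    return abs(a + i - 1)
```

For org.springframework.aop (24 classes, 20 abstract, Ce = 6, Ca = 17) this gives A of about 0.83, I of about 0.26 and D_n of about 0.09, matching the row in the tool output.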
Average D_n
[Figure: benchmarks of average D_n; series: open source, our System]
What about distributions?
[Figure: % of packages beyond threshold against D_n threshold value; series: our System, an average open source system]
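The curve plotted on this slide can be sketched as a simple computation over per-package D_n values: for each threshold, the share of packages whose D_n exceeds it. Reading "beyond" as strictly greater than the threshold is an assumption, and the function names are illustrative.

```python
def percent_beyond(dn_values, threshold):
    """Percentage of packages whose D_n is strictly above the threshold."""
    if not dn_values:
        return 0.0
    beyond = sum(1 for d in dn_values if d > threshold)
    return 100.0 * beyond / len(dn_values)


def distribution(dn_values, thresholds):
    """Pairs (threshold, % of packages beyond it), one per threshold."""
    return [(t, percent_beyond(dn_values, t)) for t in thresholds]
```

For the D_n values 0.0, 0.1, 0.2, 0.5 and 1.0, 40% of the packages lie beyond a threshold of 0.2. Comparing such curves for two systems shows differences that a single average D_n can hide.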
What have we seen?
− Understandable (not documented)
  − Poor Javadoc. Commented-out code suspected.
− Modification (abstractness/instability balance)
  − Good
− Effort
  − Time to understand: 1.87 man-years
  − Annual maintenance effort (first year): 7.7 man-years
Architecture is good
Documentation should be reassessed
Conclusions
− Approach comprising analysis and visualization
− Supported by SQuAVisiT, a flexible tool that allows us
  − to address different maintainability aspects
  − to combine different analysis and visualization techniques
− Confirmed by analysis of several medium-size to large systems (150 KLOC to 1.7 MLOC)
(Your?) Future Work
− Flexible SQuAVisiT data structure: how to store data (dependencies, metrics) about historically and hierarchically different software artifacts?
− (Fully-featured) parsers/fact extractors for C++, C#, Delphi, …
− Improved dependency analysis: analysis of dynamic bindings, injected dependencies, dependencies via global and static variables, …
− More metrics to retrieve: e.g. D_n-based analysis for different languages, analysis/benchmarking of distributions of different metrics