1 Instability Visualization and Analysis
  Jim Whitehead
  Jennifer Bevan
  University of California, Santa Cruz
2 Problem
  - In long-lived software systems:
    - Software architecture degrades over time
    - Increasing structural complexity makes changes more difficult
  - From the perspective of software engineers performing maintenance:
    - Cannot dependably modify the system
      - Poor modularization makes it difficult to identify all code sections that must be modified for a logical change
      - Poor structure reduces confidence that changes won't affect existing functionality
    - Modifications take longer and are more costly
      - Reduced capacity to adapt to a changing environment
3 Software Instability
  - Instability: a set of related source code statements that have been repeatedly modified
  - Intuitively, if the same set of source code changes is frequently made for a given kind of logical change, it may indicate a modularity problem
    - Interfaces not yet well defined
    - Didn't adequately design for change
  - Might be possible to refactor code to achieve a more change-resistant modularization
  - Our approach: develop a tool that can identify and visualize software instabilities
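As a rough illustration of the definition above, the sketch below counts how often pairs of code elements are modified in the same revision and keeps the pairs that recur. It is a minimal sketch, not the tool's actual analysis: the function name `recurring_cochanges`, the `min_count` threshold, and the element names in the sample data are all hypothetical, and each revision is assumed to be available as a set of changed element names.

```python
from collections import Counter
from itertools import combinations


def recurring_cochanges(revisions, min_count=3):
    """Return pairs of elements changed together in at least min_count revisions.

    revisions: iterable of sets of changed element names (assumed input shape).
    """
    pair_counts = Counter()
    for changed in revisions:
        # Sort so each unordered pair always maps to the same tuple key.
        for pair in combinations(sorted(set(changed)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical revision history: each set lists elements touched by one revision.
revisions = [
    {"Repo.open", "Repo.close", "Main.run"},
    {"Repo.open", "Repo.close"},
    {"Repo.open", "Repo.close", "Report.write"},
    {"Main.run"},
]
candidates = recurring_cochanges(revisions, min_count=3)
print(candidates)
```

Pairs are the simplest possible unit here; a real analysis would presumably work on larger related sets and on dependence information rather than names alone.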
4 General Approach
  - Collect dependence graphs across the entire revision history of a system
  - Analyze the graphs for recurring sets of related changes
  - Calculate an additional metric (such as Eick's FILES metric for code decay) to assign a severity to each instability
  - Issues:
    - Handling the wide range of possible code changes
    - Desired granularity of instability regions
    - Normalizing data across developer styles
      - If developer A checks in 5 times a day and developer B only once a day, developer A's changes shouldn't be rated as more unstable
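One way to picture the normalization issue above is to collapse each developer's check-ins within a day into a single change set before counting. This is only a sketch under that assumption; the slides do not say how normalization is actually done, and the function name `normalized_change_counts`, the input tuple shape, and the sample data are hypothetical.

```python
from collections import defaultdict


def normalized_change_counts(checkins):
    """Count changes per element, with one vote per (developer, day) bucket.

    checkins: iterable of (developer, day, changed_elements) tuples (assumed shape).
    """
    buckets = defaultdict(set)
    for developer, day, changed in checkins:
        # Merge all check-ins by the same developer on the same day.
        buckets[(developer, day)].update(changed)

    counts = defaultdict(int)
    for elements in buckets.values():
        for element in elements:
            counts[element] += 1
    return dict(counts)


# Developer A checks in five times on day 1; developer B checks in once.
checkins = [("A", 1, {"Repo.open"})] * 5 + [("B", 1, {"Report.write"})]
counts = normalized_change_counts(checkins)
print(counts)
```

With this bucketing, both elements receive the same count even though developer A made five check-ins, which is the behavior the last bullet asks for.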
5 Architecture
  - SCM Repository
  - IVA Repository
  - Preprocessor Daemon
    - Data Extraction
    - Instability Identification
  - Instability Analyzer
    - Normalization
    - Filtering
    - Severity Classification
  - Visualization Engine
  - Report Generator
6 Visualization (1)
  - First, create a graph from the dependence information
    - Force-directed layout a la Walshaw [GD2000]
  - Next, extrude a synthetic terrain
    - Altitude related to node density
  - Hierarchy (Java): classes contain methods, which in turn contain a series of nested scoped code blocks
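The "altitude related to node density" step can be sketched as a simple grid count over node positions that a layout has already produced. This is an illustrative assumption, not the tool's terrain algorithm: the function name `density_heightmap`, the `grid_size` parameter, and the requirement that coordinates lie in the unit square are all hypothetical.

```python
def density_heightmap(positions, grid_size=4):
    """Return a grid_size x grid_size grid of node counts.

    positions: list of (x, y) node coordinates, assumed scaled to [0, 1].
    Each cell's count can then be used as the terrain altitude for that cell.
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y in positions:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x * grid_size), grid_size - 1)
        row = min(int(y * grid_size), grid_size - 1)
        grid[row][col] += 1
    return grid


# Hypothetical layout output: two nodes close together, one far away.
positions = [(0.1, 0.1), (0.15, 0.2), (0.9, 0.9)]
heights = density_heightmap(positions, grid_size=2)
print(heights)
```

A smoother terrain would likely need some form of interpolation or kernel smoothing over these counts; the raw grid only shows the idea.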
7 Visualization (2)
  - Overlay instability information
    - Software fault lines
    - Line weights indicate instability severity
  - Permit zoom in/out (focus+context) on portions of the visualization
8 Visualization (3)
  - Permit drill-down into the details of each instability
    - Can see precisely which versions and code make up the instability
      - Ball & Eick diagram
    - Access per-version metadata, focused on the timespan of interest to the instability
9 Current Status
  - Preliminary implementation: "IVA"
    - Simple dependence graphs for Java code
    - Works with Subversion; CVS support is close
    - Simple graph layout
      - Fractal radial layout
  - Have run the tool on three Java projects
    - All small
  - Have a flyer and a technical report on current status and goals
10 IVA on IVA
  - Analysis of 70 revisions of the IVA code (19 classes, 5300 lines)
  - Showed (correctly) that the definition of the repository interface was under significant flux during development
11 Validation
  - Use the tool with a range of software, especially HDCP testbed software
  - Ensure that software instabilities correlate with observable areas of poor modularization
    - If bug/issue tracking is available, would like to see correlations between instabilities and defect densities
    - But may not: correct code may still exhibit instability due to poor structure
  - Ensure the tool is usable and the visualizations convey correct meanings
  - Ensure analysis and visualizations scale to handle non-trivial system sizes
    - Testbeds can help a lot here
12 Areas of Collaboration
  - Ideally, we want to produce an open source tool
  - Would like to collaborate on:
    - SCM API code
    - Language parsing technology
      - Are using ANTLR
        - Handles Java, C
      - Are exploring PCCTS for C++
        - Requires Linux port
      - Would prefer not to use EDG (costly, and not open source)
    - Program dependence graph representation/technology
      - Using a third-party graph library