Chaos Theory for Software Evolution
Rui Gustavo Crespo
Technical University of Lisbon
2/22 Software Evolution Behaviours
Laws of Software Evolution (1)
M.M. Lehman's informal laws of software evolution [1]:
- Continuing change: a program used in a real-world environment must change.
- Increasing entropy: the program structure becomes more complex unless efforts are made to avoid the complexity.
- Statistically smooth growth: the global system metrics appear locally stochastic in time and space but are self-regulating and statistically smooth.
[1] Lehman, M.M.: Programs, Life Cycles and Laws of Software Evolution. IEEE Special Issue on Software Engineering, 68(9),
Laws of Software Evolution (2)
Valid laws must be:
1. Unambiguous: the underlying system model must be clear (preferably formal).
2. Falsifiable: model predictions are checked against collected data, and the law remains valid until a test fails.
Thesis: Program organization plays a role in maintenance and may be measured with LRC ("long-range correlation") metrics. Program evolution follows the Verhulst population model.
LRC metrics (1)
1. Encode symbols (data types int, struct, ... and instruction keywords if, while, ...) with a balanced numeric code.
2. Identify the Brownian walk graph BW:
   $BW(0) = 0$
   $BW(n) = BW(n-1) + \mathrm{Code}(S_n)$
3. Evaluate the root mean square fluctuation about the average displacement:
   $F^2(l) = \overline{[\Delta BW(l, l_0)]^2} - \left[\overline{\Delta BW(l, l_0)}\right]^2$
   $\Delta BW(l, l_0) = BW(l_0 + l) - BW(l_0)$
   where the bars denote averages over all starting positions $l_0$.
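The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `CODE` table is a hypothetical balanced code (its values sum to zero) over a tiny alphabet, not the encoding used in the study, and the function names are made up for this example.

```python
# Hypothetical balanced code: an assumption for illustration only.
CODE = {"int": 1, "struct": 2, "if": -1, "while": -2}

def brownian_walk(symbols, code=CODE):
    """Build the walk with BW(0) = 0 and BW(n) = BW(n-1) + Code(S_n)."""
    walk = [0]
    for s in symbols:
        walk.append(walk[-1] + code[s])
    return walk

def fluctuation(walk, l):
    """Return F(l): the standard deviation of BW(l0 + l) - BW(l0) over all l0."""
    deltas = [walk[l0 + l] - walk[l0] for l0 in range(len(walk) - l)]
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    # Guard against tiny negative values caused by rounding.
    return max(mean_sq - mean * mean, 0.0) ** 0.5

symbols = ["int", "if", "while", "struct", "if", "int"]
walk = brownian_walk(symbols)
print(walk)  # [0, 1, 0, -2, 0, -1, 0]
print(fluctuation(walk, 1))
```

For a real program the symbol sequence would come from a tokenizer, and F(l) would be evaluated over a range of window lengths l.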
LRC metrics (2)
$F(l) \propto l^{\alpha}$
- $\alpha = 0.5$: random programs
- $0.5 < \alpha < 1$: meaningful programs
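If F(l) follows a power law in l, the exponent can be estimated as the slope of log F(l) against log l. The sketch below assumes an ordinary least-squares fit, which the slides do not specify; the name `fit_exponent` and the synthetic data are illustrative only.

```python
import math

def fit_exponent(lengths, fluctuations):
    """Least-squares slope of log F(l) versus log l (assumed fitting method)."""
    xs = [math.log(l) for l in lengths]
    ys = [math.log(f) for f in fluctuations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

lengths = [2, 4, 8, 16, 32]
# Synthetic data: an exact power law with exponent 0.5, as for a random walk.
synthetic = [l ** 0.5 for l in lengths]
exponent = fit_exponent(lengths, synthetic)
print(round(exponent, 6))  # 0.5
```

On measured data the points will scatter around the line, so the fitted slope is only an estimate of the exponent.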
LRC metrics (3)
$\alpha$ values for 36 compilers, coded in C. Average: 0.82
LRC metrics (4)
$\alpha$ values for 36 random programs with the same keyword distribution (similar results for the same number of lines). Average: 0.48
LRC metrics (5)
$\alpha$ values for 36 random C programs with the same number of lines and the same keyword distribution. Average: 0.62
LRC metrics (6)
$\alpha$ values for source and object files are strongly correlated.
Process dynamics (1)
Pierre Verhulst, a Belgian mathematician, studied models of human population growth in the 19th century.
Growth with unlimited resources: $du/dt = \lambda u$, $\lambda > 0$. The solution is an exponential function, $u(t) = u(t_0)\,e^{\lambda (t - t_0)}$.
Growth with limited resources: $du/dt = (\lambda - u)\,u$, where $\lambda$ is the upper limit.
Verhulst model, or logistic map: $X_{t+1} = \lambda X_t (1 - X_t)$, with $X_t \in [0,1]$ and $\lambda \in [0,4]$.
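A minimal Python sketch of the logistic map can make the iteration concrete. The function name `logistic_map`, the parameter name `lam` for the growth parameter, and the starting value 0.2 are assumptions chosen for illustration.

```python
def logistic_map(lam, x0, steps):
    """Iterate X_{t+1} = lam * X_t * (1 - X_t) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(lam * xs[-1] * (1.0 - xs[-1]))
    return xs

# With lam = 2 the trajectory settles on the fixed point 1 - 1/lam = 0.5.
trajectory = logistic_map(2.0, 0.2, 50)
print(trajectory[:4])
print(trajectory[-1])
```

For a growth parameter in [0, 4] and a start in [0, 1], every iterate stays in [0, 1], which is why the map is usually studied on that interval.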
Process dynamics (2)
Process dynamics (3)
Process dynamics (4)
The oscillation period is $2^n$.
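The period doubling of the logistic map can be observed numerically. The sketch below is one possible way to do it, not the procedure behind the slides: it discards a transient, then looks for the smallest period at which the orbit repeats within a tolerance. The function name, burn-in length, and tolerance are assumptions.

```python
def attractor_period(lam, x0=0.2, burn_in=10000, max_period=64, tol=1e-9):
    """Return the smallest period of the settled orbit, or None if none is found."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = lam * x * (1.0 - x)
    orbit = [x]
    for _ in range(max_period):       # record one stretch of the settled orbit
        orbit.append(lam * orbit[-1] * (1.0 - orbit[-1]))
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None

periods = {lam: attractor_period(lam) for lam in (2.8, 3.2, 3.5)}
print(periods)  # {2.8: 1, 3.2: 2, 3.5: 4}
```

The three sample parameters give periods 1, 2, and 4, that is, successive powers of two.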
Process dynamics (5)
Incidentally, predator and prey populations (e.g., wolves $w$ and rabbits $r$) are ruled by the same kind of equations, with positive constants $a$, $b$, $c$, $d$:
$dr/dt = (a - b\,w)\,r$
$dw/dt = (c\,r - d)\,w$
Near equilibrium, the solutions of these differential equations are approximately sinusoidal, with different phases.
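The predator and prey equations can be integrated numerically to see the oscillation. The sketch below is an illustration under assumptions: the constants `a`, `b`, `c`, `d`, the starting populations, and the choice of a classical fourth-order Runge-Kutta step are all made up for this example. It also tracks the quantity c*r - d*ln(r) + b*w - a*ln(w), which stays constant along exact solutions of this system.

```python
import math

def rk4_step(r, w, dt, a, b, c, d):
    """One Runge-Kutta (RK4) step for dr/dt = (a - b*w)*r, dw/dt = (c*r - d)*w."""
    def f(r, w):
        return (a - b * w) * r, (c * r - d) * w
    k1r, k1w = f(r, w)
    k2r, k2w = f(r + 0.5 * dt * k1r, w + 0.5 * dt * k1w)
    k3r, k3w = f(r + 0.5 * dt * k2r, w + 0.5 * dt * k2w)
    k4r, k4w = f(r + dt * k3r, w + dt * k3w)
    r_next = r + dt * (k1r + 2 * k2r + 2 * k3r + k4r) / 6.0
    w_next = w + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return r_next, w_next

def invariant(r, w, a, b, c, d):
    """Quantity conserved by the exact solution; used here as a sanity check."""
    return c * r - d * math.log(r) + b * w - a * math.log(w)

a, b, c, d = 1.0, 0.5, 0.2, 0.6      # hypothetical constants
r, w = 4.0, 1.0                      # hypothetical starting populations
v_start = invariant(r, w, a, b, c, d)
for _ in range(20000):               # integrate up to t = 20 with dt = 0.001
    r, w = rk4_step(r, w, 0.001, a, b, c, d)
v_end = invariant(r, w, a, b, c, d)
print(r, w, abs(v_end - v_start))
```

Both populations remain positive and cycle around the equilibrium r = d/c, w = a/b, with the predator peaks lagging the prey peaks.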
Process dynamics (6)
The behaviour is chaotic.
Process dynamics (7)
Software processes also “compete” for resources (time, manpower, …).
Interaction between components is non-linear: small changes in one module may stop other modules from working properly.
Proposal: link Verhulst values to program organization and to the formation of ideas.
Process dynamics (8)
$\alpha_{\mathrm{norm}} = 2\,|\alpha - 0.5|$; $\lambda = \alpha_{t+1} / [\alpha_t (1 - \alpha_t)]$
[Figure labels: Ideas formation, Ideas convergence, Single idea; Implementation; Time; Product attributes; Information; Creativity; Process; Form; Chaos, Bifurcation, Normal; axis marks 0, 3.0, 3.4]
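One way to read this slide is as a small pipeline: normalise the LRC exponent of each version, estimate the Verhulst parameter from consecutive versions, and label the regime. The Python sketch below follows that reading and rests on several assumptions: the exponent values are invented, and the mapping of the axis marks to regimes (Normal below 3.0, Bifurcation from 3.0 up to 3.4, Chaos from 3.4 on) is a guess, since the figure itself is not available.

```python
def normalise(exponent):
    """Map an LRC exponent to [0, 1] as 2 * |exponent - 0.5|."""
    return 2.0 * abs(exponent - 0.5)

def verhulst_parameter(x_t, x_next):
    """Solve x_next = lam * x_t * (1 - x_t) for lam (undefined for x_t of 0 or 1)."""
    return x_next / (x_t * (1.0 - x_t))

def regime(lam, bifurcation_at=3.0, chaos_at=3.4):
    """Label a parameter value; both thresholds are assumptions read off the axis marks."""
    if lam < bifurcation_at:
        return "normal"
    if lam < chaos_at:
        return "bifurcation"
    return "chaotic"

# Invented exponents for four consecutive versions of one file.
exponents = [0.70, 0.74, 0.90, 0.86]
xs = [normalise(e) for e in exponents]
lams = [verhulst_parameter(x, x_next) for x, x_next in zip(xs, xs[1:])]
labels = [regime(lam) for lam in lams]
print(labels)  # ['normal', 'bifurcation', 'chaotic']
```

The last transition in this made-up series gives a parameter of about 4.5, outside the interval [0, 4] on which the logistic map is normally defined, so a real implementation would need a policy for such values.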
Process dynamics (9)
BeanMetaData.java versions in JBOSS (Jul 2000-Oct 2002)
[Figure: versions labelled Chaotic, Normal, and Bifurcation]
Process dynamics (10)
Process dynamics (11)
EJBVerifier20.java versions in JBOSS (May 2000-Sep 2002)
[Figure: versions labelled Chaotic, Normal, and Bifurcation]
Process dynamics (12)
Conclusions
It is possible to measure program organization and automatically highlight version behaviours.
Next steps:
- Check LRC validity and the Verhulst model in other applications / languages / process phases
- Improve the Verhulst model (sometimes, $\lambda > 4$)
- Identify faster algorithms
Cardoso, A.I.; Kokol, P.; Lenic, M.; Crespo, R.G.: Complexity-based Evaluation of Systems Evolution. In: Advances in UML/XML Based Software Evolution. IRM Press, 2004 (in print)