University of Waterloo How does your software grow? Evolution and architectural change in open source software Michael Godfrey Software Architecture Group.


University of Waterloo How does your software grow? Evolution and architectural change in open source software Michael Godfrey Software Architecture Group (SWAG) University of Waterloo

Michael W. Godfrey. Open source evolution: How does your software grow?

What is software evolution?
“Evolution is what happens while you’re busy making other plans.”
We distinguish between maintenance and evolution:
– Maintenance is the planned set of tasks to effect changes.
– Evolution is what actually happens to the software.
All I want to know is: How and why does software evolve?

Why should we care?
Much of the commercial software world operates in perpetual crisis mode.
– “Fix it, don’t try to understand it.”
– Just-in-time program comprehension [Lethbridge]
… but … large software systems are major assets of many businesses
– Getting it right is more important than getting it done fast.
– Budget and time for preventive maintenance, navel gazing.
Relatively little research on trying to understand how and why programs evolve.

Lehman’s Laws of Software Evolution
Based on measurement of a few (commercially developed) systems, most notably IBM’s OS/360
– Originally three laws, now there are eight.
Controversial as “laws”
– They have been criticized for making strong claims based on limited data.
– However, this is pioneering work on software evolution and software engineering.

Lehman’s Laws of Software Evolution
1. Continuing change: An E-type program that is used must be continually adapted else it becomes progressively less satisfactory.
2. Increasing complexity: As a program is evolved, its complexity increases unless work is done to maintain or reduce it.
3. Self regulation: The program evolution process is self-regulating with close to normal distribution of measures of product and process attributes.
4. Invariant work rate: The average effective global activity rate on an evolving system is invariant over the product lifetime.
5. Conservation of familiarity: During the active life of an evolving program, the content of successive releases is statistically invariant.
6. Continuing growth: Functional content of a program must be continually increased to maintain user satisfaction over its lifetime.
7. Declining quality: E-type programs will be perceived as of declining quality unless rigorously maintained and adapted to a changing operational environment.
8. Feedback system: E-type programming processes constitute multi-loop, multi-level feedback systems and must be treated as such to be successfully modified or improved.

Lehman’s Laws in a nutshell
Observations:
– (Most) useful software must evolve or die.
– As a software system gets bigger, its resulting complexity tends to limit its ability to grow.
– Development progress/effort is (more or less) constant; growth is at best constant.
Lehman/Turski’s model:
y' = y + E/y^2 ~ (3Ex)^(1/3)
where y = # of modules, x = release number
Advice:
– Need to manage complexity.
– Do periodic redesigns.
– Treat software and its development process as a feedback system (and not as a passive theorem).
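The Lehman/Turski recurrence can be iterated directly to see the slowing growth it predicts. The following is a minimal sketch, not code from the talk: the effort constant `E`, the starting size `y0`, and the function names are illustrative assumptions. It steps y' = y + E/y^2 release by release and compares the result with the cube-root approximation (3Ex)^(1/3).

```python
def turski_growth(E, y0, releases):
    """Iterate y' = y + E / y**2 and return the size after each release.

    E and y0 are hypothetical inputs chosen for illustration only.
    """
    sizes = [y0]
    for _ in range(releases):
        y = sizes[-1]
        sizes.append(y + E / (y * y))
    return sizes


def cube_root_approx(E, x):
    """Closed-form approximation y ~ (3*E*x)**(1/3) at release number x."""
    return (3.0 * E * x) ** (1.0 / 3.0)


if __name__ == "__main__":
    E, y0 = 1000.0, 10.0  # made-up values, not measured from any system
    sizes = turski_growth(E, y0, 200)
    for x in (1, 10, 50, 100, 200):
        print(x, round(sizes[x], 1), round(cube_root_approx(E, x), 1))
```

Under this model each release adds less than the previous one, which is the "growth is at best constant" observation in numerical form.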

Figure: Lehman’s examples

Figure: The S curve (size against time)

A case study in evolution: The Linux OS kernel [ICSM-00]

A case study in evolution: The Linux OS kernel [ICSM-00]
“Evolution in Open Source Software: A Case Study” [Godfrey and Tu, ICSM 2000]
It’s Linux!
– Large system, very stable, many releases over several years, many developers
– Growing mainstream adoption (e.g., IBM S390 port)
– Commonly used within networked systems
Open source development model
– Interesting phenomenon in itself
– Easy to track, can publish results, many experts
– Not much previous study

Evolution of Linux: Questions
How has Linux evolved over time?
– Does it obey Lehman’s laws?
– What is the best way to characterize growth?
How has its (open source) process model affected its development?
How has the (high-level) architecture
– changed over time?
– affected the system’s evolution?

Open source development
Open source development vs. open source software
– GNU, Linux, Apache, vim, gcc, FreeBSD vs. Mozilla, JDK, Jikes, NetBeans
“The Cathedral and the Bazaar” [Raymond]
– Usual goal: scratching an interesting itch, not filling a commercial void.
– Anyone may contribute, though owner(s) have final say.
Usually, developers work part-time and for free.
– Motivation is peer recognition and personal satisfaction, not money.
– However, industrial participation is also increasing (e.g., Cygnus, IBM).

Open source development
Largely immune from time-to-market pressures
– Can release when it’s really ready
Can be hard to control/direct developers
– Big egos, can’t be “fired”
– What’s cool vs. what’s needed
– Less “sexy” development tasks often suffer, e.g., planned testing, preventive maintenance
Code quality varies widely
– Some projects have coding standards
– Unstable/experimental code common (and even encouraged)
– Quality maintained via “massively parallel debugging”, not rigorous testing.

Linux background
Linux kernel v1.0 released March 1994
– 487 source files, 165 KLOC, i386 only
Linux kernel v2.3.39 released January 2000
– 4854 source files, 2.2 MLOC, 10 hardware architectures supported, over 300 developers credited
Maintained along two parallel paths:
– development and stable

Methodology
Examined 96 versions of Linux kernel
– 34 of the 67 stable releases
– 62 of the 369 development releases
All measures considered only .c/.h files contained in tarball
– Counted LOC using “wc -l” and an awk script that ignored comments and blank lines
– Counted # of fcns/vars/macros using ctags
– Architectural model (SSs hierarchy) based on default directory structure
We plotted growth against calendar time
– Lehman suggests plotting growth against release number
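The LOC measure described above (count lines in .c/.h files, ignoring comments and blank lines) can be approximated in a few lines. This is a minimal sketch in Python rather than the wc/awk pipeline used in the study, whose script is not shown in the slides; the function names are hypothetical, and preprocessor tricks or trigraphs are not handled.

```python
import os


def strip_c_comments(source):
    """Remove /* ... */ and // comments, leaving string and char literals intact."""
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if ch == "/" and nxt == "*":
            end = source.find("*/", i + 2)
            end = n if end == -1 else end + 2
            # Keep the newlines so the line structure is preserved.
            out.append("\n" * source.count("\n", i, end))
            i = end
        elif ch == "/" and nxt == "/":
            end = source.find("\n", i)
            i = n if end == -1 else end
        elif ch in "\"'":
            # Copy a literal verbatim so comment markers inside it survive.
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def count_loc(source):
    """Count lines that still contain something after comments are removed."""
    return sum(1 for line in strip_c_comments(source).splitlines() if line.strip())


def count_tree(root):
    """Sum count_loc over every .c/.h file under root (e.g. an unpacked tarball)."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith((".c", ".h")):
                path = os.path.join(dirpath, name)
                with open(path, encoding="latin-1") as fh:
                    total += count_loc(fh.read())
    return total
```

Running `count_tree` on each unpacked release and pairing the result with the release date would give a size-against-calendar-time series of the kind plotted in the study.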

Figure: Software architecture of Linux [IWPC-00]

Figure: Growth of # of source files

Figure: Growth of # of global fcns, variables, and macros

Figure: Growth of compressed tar file

Figure: Growth of Lines of Code (LOC), with fitted curve y = .21*x^2 + …*x + 90,055, r^2 = .997

Figure: Average/median .c file size

Figure: Average/median .h file size

Figure: Growth of major SSs (dev. releases)

Figure: SS LOC as percentage of total system

Figure: SS LOC as percentage of total system (ignoring drivers)

Figure: Growth of arch SSs

Figure: Growth of drivers SSs

Observations and hypotheses
Growth along devel. path is super-linear!
y = .21*x^2 + …*x + 90,055, r^2 = .997
where y = size in LOC, x = days since v1.0
r^2 is the “coefficient of determination” using least squares
Lehman/Turski’s model:
y' = y + E/y^2 ~ (3Ex)^(1/3)
where y = # of modules, x = release number
– Linux’s strong growth is continuing.
– This is stronger growth at MLOC level than observed by others (Lehman, Gall), even for other OSs.
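A least-squares quadratic fit with its coefficient of determination, as quoted above, can be reproduced in outline. This is a minimal sketch on synthetic data; it does not use the Linux measurements, and the function names are hypothetical. It solves the normal equations for y = a*x^2 + b*x + c in pure Python and then computes r^2.

```python
def solve_linear(m, v):
    """Solve m * sol = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [val] for row, val in zip(m, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = s / aug[r][r]
    return sol


def fit_quadratic(xs, ys):
    """Return (a, b, c) minimising the squared error of a*x^2 + b*x + c."""
    rows = [[x * x, x, 1.0] for x in xs]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return tuple(solve_linear(ata, aty))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    a, b, c = coeffs
    mean = sum(ys) / len(ys)
    ss_res = sum((y - (a * x * x + b * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic, made-up sizes with a small alternating disturbance.
    xs = [float(i) for i in range(21)]
    ys = [2.0 * x * x + 3.0 * x + 5.0 + (4.0 if i % 2 else -4.0)
          for i, x in enumerate(xs)]
    coeffs = fit_quadratic(xs, ys)
    print(coeffs, r_squared(xs, ys, coeffs))
```

For x measured in days over several years, the normal equations become poorly conditioned; rescaling x (for example to years) before fitting is a common precaution.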

Figure: Growth of fetchmail [Raymond]

Figure: Growth of pine

Figure: Growth of X Windows (data points labelled by release: X10R3, X10R4, X11R1, X11R2, X11R3, X11R5, X11R6, X11R6.1, X11R6.3, X11R6.4)

Figure: Growth of gcc/g++/egcs

Figure: Growth of vim (text editor)

Figure: vim avg % comments and blank lines per file

Figure: vim avg/median file size

Figure: vim’s architecture

Some open questions
Philosophical:
– Does software evolve in the same way as frogs and social structures?
  The Selfish Gene, by Richard Dawkins
  The Nature of Economies, by Jane Jacobs
– What are the recurring patterns and compelling metaphors of software evolution?
Methodological:
– How to measure size? How to correlate size and quality?
– How to measure change? How to model architectural change?
– What is the predictive power of such models? Do the “other phenomena” dominate?

Some open questions
Practical:
– What information do developers need to know about how a software system has evolved?
– What kinds of tools would be useful:
  to the front-line developer?
  to the manager?
– How best to deal with:
  Large data sets (large_system × many_versions)
  Visualization and navigation

Change patterns and evolutionary narratives
“Band-aid evolution” (just add a layer)
– quick way to add new functionality, esp. if system is not well understood
  e.g., Y2K fixing, adding portability, new features
“Vestigial features”
– design artifact persists after rationale dies
  e.g., whale fin bone structure resembles hand
“Adaptive radiation” [Lehman]
– when conditions permit, encourage wild variation for a while.
– later, evaluate and let “best” ideas live on.
  e.g., Linux kernel evolution
“Convergent evolution”
– compare similar systems to reference arch. (or to each other)
  e.g., everyone grows an XML generator in response to market pressure

Change patterns and evolutionary narratives
Cathedral style [Raymond]
– careful control and management
– debugging done before committing code
– evolution is slow, planned, rarely undone
Bazaar style (OSD)
– lots of low-level changes, frequent fixes
– lots of “building around” rather than wholesale changing, occasional redesigns
– creeping feature-itis, “complete” dependency graph

Change patterns and evolutionary narratives
Radical redesigns (localized and global)
– aka “refactoring”
– little new functionality added, but structure changes significantly, legacy cruft dissipates
– likely “goodness” (design metrics) improves
Migration patterns
– look out for known translation idioms, especially if migration is not one big bang
  e.g., procedural-to-OO idioms

Change patterns and evolutionary narratives
OO evolutionary patterns
– one recognizable design pattern transformed into another (or a variation of the original)
– requires good OO extraction tools (dynamic binding, polymorphism, reflection, etc.)
Reuse patterns
– components are (re)used in different systems
  e.g., build COTS interface, throw out homebrew DB

Change patterns and evolutionary narratives
Phenomena observed in Linux evolution
– Careful control of core code; more flexibility on contributed drivers, experimental features
  Linus has many lieutenants
– “Aunt Tillie” effect
  Simplicity and scrutability of code, development processes, approval process, etc.
– “Mostly parallel” enables sustained growth
  “Hard interfaces” make good neighbours.
  Loadable modules make feature development easier
– “Clone and hack” makes sense!

Change patterns and evolutionary narratives
Phenomena observed in Linux evolution
– Amazing social phenomenon of OSD
  “You can try this at home” (and they did!)
  Anti-MS sentiments: “We can build it ourselves!”
  Enlightened self-interest for many large computer industry companies: “If we can’t own the standard, no one should.”
– Bandwagon effect (both OS developers and industry)
  Support for Linux as deployed OS by IBM, Dell, Sun, …
  Lots of contributed production-quality third party code from industry (IBM S/390, drivers)

An observed evolutionary phenomenon
Code cloning!
– Usually regarded as a bad sign
– Usual solution: Abstract commonality into a single place, remove duplication
  In an OO setting, can use inheritance
– But as observed in Linux, it seems less problematic than one might think!

Case study: Cloning in Linux SCSI drivers
Nice, controlled experiment:
– Large body of code, multiple versions, well used system, open source
– SCSI drivers all do similar tasks
– Source comments show cloning has occurred!
Approx. 500 releases of Linux since Kernel v2.3.39: (released Jan 2000)
– 5000 source files, 2.2 MLOC, 10 hardware architectures
– drivers/scsi has 212 source files, 166 KLOC,

Goals of case study
Examine “real world” cloning:
– How common is it?
– Why is it done?
– What do the “cloning patterns” look like?
Examine parallel evolution:
– What kinds of changes are common?
– Do developers (need to) change clone relatives too?
Is there a better design structure lurking?
Compare against existing clone detection tools
– Are detection tools looking for the right indications of cloning?

SCSI Subsystem - Size (rel )
Number of source files: 211
Number of functions: 2512
Number of lines of code: 254,953
% of comments: 38
Number of low-level drivers: 80
File size:
– on average ~3000 lines
– large multi-card drivers ~15,000 lines

SCSI Subsystem - Architecture
Upper Layer
– Uniform way of handling devices
– Hard Disk, CD-ROM Disk, Tape, Generic
Middle Layer
– “bridge” between Upper Layer and Low-Level Devices
Low-Level Device Drivers
– low-level driver functionality and management

Clones Expected?
Why did we expect to find clones?
– Every driver must implement uniform interface
– Design of subsystem does not support other forms of reuse
– Driver logic is relatively simple (!)
– Devices from same family → more cloning
– Completely different hardware → less or no cloning
– Open source → anyone can reuse code
– Easier and more efficient to reuse existing code
– Reused code already tested, so probably better quality than if we build it from scratch

Clones - Manual Inspection
From source code comments, we have found:
esp.[ch], jazz_esp.[ch], dec_esp.[ch], cyberstorm.[ch], cyberstormII.[ch], mca_53c9x.[ch], blz2060.[ch], fastlane.[ch], qlogicisp.[ch], qlogicpti.[ch], fdomain.[ch], fd_mcs.[ch], sd.[ch], sr.[ch], t128.[ch], pas16.[ch]

Types of Changes Detected
– Names of variables
– Initialization parameters and constants
– Driver specific initialization logic removed/added
– Small change in supporting functions
– Small changes in driver management code
– Comments are updated
– Code changed is highly embedded into other code, which makes extraction of that code hard
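Several of the changes listed above (renamed variables, altered constants) are exactly what a purely textual comparison misses. The following is a minimal sketch of one simple way to see through them. It is not one of the clone detection tools evaluated in the study: identifiers and numeric literals are replaced by placeholders and the normalized lines are compared with Python's `difflib`. The keyword list is deliberately partial, comments are not stripped, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Partial list of C keywords that should survive normalization (an assumption,
# not a complete keyword table).
C_KEYWORDS = {
    "int", "char", "void", "long", "short", "unsigned", "static", "struct",
    "if", "else", "for", "while", "do", "return", "switch", "case", "break",
}

# Identifiers/keywords, numeric literals (including hex such as 0x2f), or any
# other single non-space character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+\w*|\S")


def normalize(line):
    """Replace identifiers with ID and numbers with NUM, keeping C keywords."""
    tokens = []
    for tok in TOKEN.findall(line):
        if tok[0].isalpha() or tok[0] == "_":
            tokens.append(tok if tok in C_KEYWORDS else "ID")
        elif tok[0].isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return " ".join(tokens)


def clone_similarity(src_a, src_b):
    """Return a 0..1 similarity of two snippets after normalization."""
    a = [normalize(line) for line in src_a.splitlines() if line.strip()]
    b = [normalize(line) for line in src_b.splitlines() if line.strip()]
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Invented snippets that differ only in names and constants.
    first = "int esp_init(int base) {\n    outb(0x10, base + 4);\n    return 0;\n}\n"
    second = "int jazz_setup(int port) {\n    outb(0x2f, port + 8);\n    return 0;\n}\n"
    print(clone_similarity(first, second))
```

A line-based measure like this would still be thrown off by added or removed driver-specific logic, which is part of why the slides question whether existing tools look for the right indications.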

Conclusion (Cloning)
Unclear that current clone detection tools “do the right thing”
– Combination of different approaches should give the best detection results
Theory developed on clone management, detection, and removal is not universally applicable to all types of applications, languages, and designs
– Need more qualitative analysis of “cloning in the real world”
As practised, code cloning in the Linux SCSI subsystem seems like a reasonable approach!

The past, present, and future of open source software
● Past
➢ An outgrowth of the Unix sysadmin tradition
➢ Bug fixes and evolution!
● Present
➢ Trendy... but large companies see more
  Apache is used by 63% of web servers [Aug 02 Netcraft survey]
  BIND used by vast majority of DNS servers
  Sendmail is most widely used transport...

The past, present, and future of open source software
● Future
➢ Corporate prisoner's dilemma: Enforced open-ness allows companies to breathe easier, can concentrate on core strengths and real innovations
➢ Governments are beginning to require open source and open standards when available
  The German government and IBM recently announced a "far-reaching co-operation agreement" [NYTimes]
➢ Concern over Microsoft, .NET, and monopolistic practices
  Mono, an open source implementation of the .NET framework, being developed.

Open source infrastructure
Pros
● Reliability, trust, confidence by users
➢ Many users/readers aids in debugging
● Can usually “fork” projects for specialized needs
● Interoperability when component vendors conform to standards
Cons
● Some open source licences can cause headaches, e.g., the GPL “virus”
● Not in everyone's strategic best interests
● Non-compliance is a recurring problem

Summary: Evolution and open source software
Open source software seems to break some of the “rules” of how successful software is built and evolved:
– Motivation for developers is fun, pride, professionalism, politics, … rather than money
– It violates some of Lehman’s laws
… yet this software is often of high quality and in wide use.
– esp. common infrastructure-type systems
– e.g., Linux/MacOS kernel, apache, ldap, samba, imap, …