Evolution in Open Source Software (OSS). SEVO seminar at Simula, 16 March 2006. Software Engineering (SU) group: Reidar Conradi, Andreas Røsdal, Jingyue Li.

Evolution in Open Source Software (OSS). SEVO seminar at Simula, 16 March 2006. Software Engineering (SU) group: Reidar Conradi, Andreas Røsdal, Jingyue Li. Reidar Conradi, 30.jan.06

Motivation: Open Source Software is fast becoming the major way of making software: 38% of European IT companies used OSS in 2003, and 56% in 2005 (Evans Data Corp.). We need to understand how OSS is developed and evolved, e.g. are revised processes needed? OSS is partly used “as-is” and partly made in cooperative projects.

Context and intention of our two OSS studies. Study 1: Survey of OSS- and COTS-based (Commercial-Off-The-Shelf) development in Norway, Germany and Italy, covering 145 projects in companies. Studies development and risk management processes. Study 2: Data mining of evolution in two OSS projects, Mozilla and Portage. Analyzes change logs, source code etc.

Main findings from Study 1 (survey). Source available? 100% in OSS projects, 30% in COTS. Source being read: 68% in OSS, 77% in COTS with source. Source being modified: 36% in OSS, 15% in COTS with source + glueware/addware.

Empirical Study 2 of Software Evolution in OSS Projects. Goal of study: identify factors which can explain and possibly predict software evolution. How: by performing an empirical study on two open source projects and analyzing the results. Focus: observe changes in software architecture, quality and change rates. Motivation: reduce software maintenance costs by identifying evolution-prone software. Slide 5, 16 March, SEVO seminar on Software Evolution

Research questions
RQ1: How much do the architectural properties of the software change over time?
RQ2: Are modules with high complexity more evolution-prone than modules with low complexity?
RQ3: Are modules with high coupling more evolution-prone than modules with low coupling?
RQ4: Does the amount of software development decrease over time?
RQ5: What is the relationship between the defect density and the architectural changes of a system?

Overview of Methods Used. Analyzing changes over time: measuring software metrics for all releases of the software. Data mining defect reports from the defect-tracking system. Collecting change rates from change logs and source code. Applying software metric tools to measure evolution: C and C++ Code Counter and Pythonmetric.
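As a rough illustration of collecting change rates from a change log, the sketch below sums added and deleted lines per file. It is not the tooling used in the study. The log layout (a "Working file:" header followed by per-revision "lines: +A -D" fields) is an assumption modelled on typical `cvs log` output, and the function name and sample log are hypothetical.

```python
import re
from collections import defaultdict

# Assumed log layout: a "Working file:" header per file, then one
# "date: ...; lines: +A -D" line per revision (first revisions lack it).
FILE_RE = re.compile(r"^Working file:\s*(?P<path>.+)$")
LINES_RE = re.compile(r"lines:\s*\+(?P<added>\d+)\s+-(?P<deleted>\d+)")


def changed_lines_per_file(log_text):
    """Return {path: total lines added + deleted} from CVS-style log text."""
    totals = defaultdict(int)
    current = None
    for line in log_text.splitlines():
        header = FILE_RE.match(line)
        if header:
            current = header.group("path").strip()
            continue
        change = LINES_RE.search(line)
        if change and current is not None:
            totals[current] += int(change.group("added")) + int(change.group("deleted"))
    return dict(totals)


# Hypothetical log excerpt, not data from Mozilla or Portage.
SAMPLE_LOG = """\
Working file: src/parser.c
revision 1.3
date: 2004/05/02 10:00:00;  author: alice;  state: Exp;  lines: +12 -4
revision 1.2
date: 2004/04/01 09:30:00;  author: bob;  state: Exp;  lines: +3 -1
Working file: src/util.c
revision 1.2
date: 2004/04/15 14:20:00;  author: alice;  state: Exp;  lines: +5 -0
"""

print(changed_lines_per_file(SAMPLE_LOG))
```

A line-based count like this treats a moved line as one deletion plus one addition, so it can overstate real change.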

Data Sources: Mozilla (large) and Portage (small). Mozilla: Open source web browser developed in C/C++. Access to source code of 86 releases from 1999 to [...]. The latest version consists of 1.4 million lines of code. Access to change logs from CVS and reported defects from Bugzilla. Portage: Open source system utility for Gentoo Linux developed in Python. Access to source code of 4 major releases including 278 minor releases from 2003 to [...]. The latest version consists of 13 thousand lines of code. Access to change logs from CVS and reported defects from Bugzilla.

Software Metrics (1). The following metrics were measured for each release: Lines of Code: simple metric for the size and complexity of source code. McCabe's Cyclomatic Complexity: metric for the number of independent paths through a program; a measure of program complexity. Henry-Kafura / Shepperd: metric for the information flow to and from a module (FAN-IN, FAN-OUT); a measure of structural complexity.
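The two complexity metrics can be sketched in a few lines of Python. This is an approximation for illustration, not the C and C++ Code Counter or Pythonmetric tools named in the talk. The set of node types counted as decision points is a simplifying assumption, and the function names are hypothetical. The information-flow formula is the commonly cited Henry-Kafura form, length × (fan-in × fan-out)²; Shepperd's variant is usually described as dropping the length factor.

```python
import ast

# Node types counted as decision points (an assumed, simplified rule set).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity of Python source: decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" contributes two extra branches
            decisions += len(node.values) - 1
    return decisions + 1


def information_flow(length, fan_in, fan_out):
    """Henry-Kafura information flow: length * (fan_in * fan_out) ** 2."""
    return length * (fan_in * fan_out) ** 2


# Hypothetical function used only to exercise the counter.
EXAMPLE = """
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "other"
"""

print(cyclomatic_complexity(EXAMPLE))   # two ifs, one "and", one for, plus 1
print(information_flow(100, 3, 2))      # 100 * (3 * 2) ** 2
```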

Software Metrics (2). Module Coupling: a measure for how many relations a module has to other modules. It is a way to measure semantic coherence, or how the responsibilities of a module are related. Defect density: the number of known defects divided by the size of the software. Number of changes (line-based) to a module.
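Defect density as defined above is a simple ratio. The sketch below normalizes per thousand lines of code; that unit is an assumption, since the slide only says "size of the software". The release figures are hypothetical.

```python
def defect_density(known_defects, lines_of_code):
    """Known defects per thousand lines of code (KLOC is an assumed unit)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return known_defects / (lines_of_code / 1000)


# Hypothetical per-release figures, not data from the study.
releases = [("r1", 40, 10_000), ("r2", 66, 12_000), ("r3", 91, 13_000)]
for name, defects, loc in releases:
    print(name, defect_density(defects, loc))
```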

Results RQ1: Architectural changes over time? McCabe's cyclomatic complexity, measured in Mozilla and Portage, increases over time. The measurements in Portage show a roughly linear increase in cyclomatic complexity, while Mozilla shows a roughly logarithmic increase. This result is in accordance with Lehman’s 2nd law of software evolution, which states “As an E-type system is evolved its complexity increases unless work is done to maintain or reduce it.”
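One way to distinguish a linear from a logarithmic trend is to fit both models by least squares and compare their R² values. The slides do not say how the trends were identified, so this is only a sketch of one possible approach. The function names and the synthetic series are assumptions, and release indices must start at 1 so the logarithm is defined.

```python
import math


def least_squares(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot


def better_trend(release_index, complexity):
    """Return 'linear' or 'logarithmic', whichever model has the higher R^2."""
    _, _, r2_linear = least_squares(release_index, complexity)
    log_index = [math.log(x) for x in release_index]
    _, _, r2_log = least_squares(log_index, complexity)
    return "linear" if r2_linear >= r2_log else "logarithmic"


# Synthetic series for illustration only, not measurements from the study.
releases = list(range(1, 11))
linear_like = [100 + 20 * r for r in releases]
log_like = [100 + 50 * math.log(r) for r in releases]

print(better_trend(releases, linear_like))
print(better_trend(releases, log_like))
```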

Results RQ1: Architectural changes over time? Measurements of Information Flow for three of the largest modules in Mozilla over a period of 5 years: all with increasing measures. This metric will be compared to defect density in RQ5.

Results RQ2: Evolution-proneness of complex modules. Question: Does module complexity have an impact on evolution-proneness? Approach: A sample of 30 modules with high cyclomatic complexity and a sample of 30 modules with low cyclomatic complexity were taken. Evolution-proneness was measured as the number of changes to these modules, collected from the change log. Result: A t-test performed on the samples supports the hypothesis that modules with high cyclomatic complexity have a higher number of changes than modules with low cyclomatic complexity.
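The slide does not state which t-test variant was used. As a sketch, the code below computes Welch's two-sample t statistic and its degrees of freedom using only the standard library. Choosing Welch's variant is an assumption, the change counts are hypothetical, and deriving a p-value would additionally need a t-distribution table or a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    va = variance(sample_a) / len(sample_a)   # squared standard error of mean A
    vb = variance(sample_b) / len(sample_b)   # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (
        va ** 2 / (len(sample_a) - 1) + vb ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical change counts per module, not data from the study.
high_complexity_changes = [48, 61, 39, 75, 52, 66]
low_complexity_changes = [12, 20, 9, 17, 25, 14]

t, df = welch_t(high_complexity_changes, low_complexity_changes)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large positive t means the high-complexity sample has the higher mean number of changes.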

Results RQ3: Evolution-proneness of modules with high coupling. Question: Does module coupling have an impact on evolution-proneness? Approach: A sample of 30 modules with high coupling and a sample of 30 modules with low coupling were taken, and the number of changes to these modules was measured. Result: A t-test performed on the samples supports the hypothesis that modules with high coupling have a higher number of changes than modules with low coupling.

Results RQ4: Decreasing software change over time? Observation: the cumulative number of changes increases linearly over time for both OSS systems. This means the amount of software development, measured by the number of changed lines, stayed at the same rate over time. Also: LOC > 15X code changes!! This result supports Lehman’s 4th law of software evolution, in that “the average activity rate in an E-type system tends to remain constant over system lifetime or segments of that lifetime”.
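A linear cumulative curve is equivalent to a roughly constant number of changes per period. The sketch below builds the cumulative series and summarizes how much the per-period counts vary. Using the coefficient of variation as the summary is an assumption for illustration, not the analysis from the study, and the figures are hypothetical.

```python
from itertools import accumulate
from statistics import mean, pstdev


def cumulative_changes(changes_per_period):
    """Running total of changed lines across periods."""
    return list(accumulate(changes_per_period))


def rate_variation(changes_per_period):
    """Coefficient of variation of the per-period counts (0 = constant rate)."""
    return pstdev(changes_per_period) / mean(changes_per_period)


# Hypothetical changed-line counts per period, not data from the study.
changes = [1000, 1100, 950, 1050, 900]

print(cumulative_changes(changes))
print(round(rate_variation(changes), 3))
```

A value near zero indicates a steady change rate, which corresponds to a straight-line cumulative plot.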

Results RQ5: Defect density versus architectural changes. A study by Allen Nikora and John Munson indicates that measures of an evolving system's structure are strongly related to its number of faults. To answer RQ5, defect density and information flow were measured in three of the largest modules in Mozilla and compared in plots. No clear pattern!

Results RQ5: Defect density versus architectural changes. The defect density of both OSS systems was collected by data mining the shared Bugzilla defect-tracking system. Both systems show an increasing defect density over time, least for the largest system.

Conclusions
Little difference in OSS/COTS read/writes, so similar in practice.
RQ1: Architectural change: more in large systems.
RQ2: High complexity is correlated with relatively more changes.
RQ3: High coupling is correlated with relatively more changes.
RQ4: Linear change rates over time.
RQ5: Defect density and architectural changes: unrelated. Rather use change rates in RQ2 and RQ3?
Do RQ2/RQ3 indicate that modules should be split/merged?
Problem with “over-counting” line moves in change logs?
Are Mozilla and Portage representative OSS systems, or too volatile?
Much meat for future studies!