Mining Large Software Compilations over Time Another Perspective on Software Evolution: Gregorio Robles, Jesus M. Gonzalez-Barahona, Martin Michlmayr,

Presentation transcript:

Mining Large Software Compilations over Time: Another Perspective on Software Evolution. Gregorio Robles, Jesus M. Gonzalez-Barahona, Martin Michlmayr, Juan Jose Amor. Presented by Brian Chan, Cisc th October 2007

Overview: Background Information; Motivations for Paper; Problems Addressed; Solutions and Data Analysis; Conclusions; Thoughts about the Paper; Questions/Comments

Background: Libre software (open source software). Compilations of software by vendors: group different software sources together as a product. Must be easy to install, configure, and administer.

Background Information: Example of libre software: Debian, a distribution built around the Linux kernel. Versions , 2.2, 3.0 and 3.1. Lots of volunteers; all project information (mail, etc.) is publicly available.

Motivation for Paper: The study of how products built from software compilations evolve is new. Companies have trouble categorizing all the programs built by different vendors. This differs from normal software evolution: integration vs. development. Maintenance means adding new software, not removing faults or adding new functionality.

Problems Addressed: Dealing with the addition and removal of packages across Debian releases, and with libre software "in the large". Addressing versioning in packages. The paper treats Debian as indicative of libre software in general because of its size.

Solutions/Data Gathered: Information about the product (Sources.tar.gz) contains: name, version, list of binary packages built from it, and name and address of the maintainer. The experiment focuses on source lines of code (SLOC), counted using SLOCCount.
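
A minimal sketch of how the per-package metadata listed above might be read, assuming the index is a plain-text file of blank-line-separated stanzas made of `Field: value` lines. The field names `Package`, `Binary`, `Version` and `Maintainer` follow the Debian source index convention; the function name `parse_stanzas` and the sample entries are hypothetical and not taken from the paper.

```python
def parse_stanzas(text):
    """Split an index into stanzas and return one dict of fields per source package."""
    packages = []
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: append it to the previous field's value.
                fields[key] += " " + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                key = key.strip()
                fields[key] = value.strip()
        if fields:
            packages.append(fields)
    return packages


# Hypothetical sample entries, for illustration only.
SAMPLE = """\
Package: examplepkg
Binary: examplepkg, examplepkg-dev
Version: 1.0-1
Maintainer: Jane Doe <jane@example.org>

Package: otherpkg
Binary: otherpkg
Version: 2.3-4
Maintainer: John Roe <john@example.org>
"""

for entry in parse_stanzas(SAMPLE):
    binaries = [name.strip() for name in entry["Binary"].split(",")]
    print(entry["Package"], entry["Version"], binaries, entry["Maintainer"])
```

Keeping each package as a plain dictionary makes it straightforward to load the same records into a relational table later on.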

Solutions/Data Gathered: SLOCCount transforms data into relational and XML data formats for viewing purposes.

Solutions/Data Gathered: Size is measured in MSLOC (million source lines of code) and in number of packages. Both roughly double every two years; growth was faster in the earlier years.
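
The doubling claim can be made concrete with a small calculation. Assuming exponential growth, size(t) = size(0) * 2^(t/T), where T is the doubling time in years. The sketch below estimates T from two measurements and projects a size forward. The function names and the input numbers are placeholders chosen for illustration, not figures from the paper.

```python
import math


def doubling_time(size_start, size_end, years_elapsed):
    """Doubling time in years, assuming exponential growth between two measurements."""
    if size_start <= 0 or size_end <= size_start or years_elapsed <= 0:
        raise ValueError("need positive sizes, actual growth, and a positive time span")
    return years_elapsed * math.log(2) / math.log(size_end / size_start)


def projected_size(size_start, years_elapsed, doubling_years=2.0):
    """Size after years_elapsed if the size doubles every doubling_years."""
    return size_start * 2 ** (years_elapsed / doubling_years)


# Placeholder values: a size that quadruples in four years doubles about every two years.
print(doubling_time(10.0, 40.0, 4.0))
# Placeholder values: three doublings over six years.
print(projected_size(10.0, 6.0))
```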

Solutions/Data Gathered: Rule 1: Large packages grow over time. Rule 2: Many small packages are introduced. Result: The mean package size stays the same.
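
A toy illustration of how the two rules can cancel out in the mean. The package sizes below are invented for the example and are not data from the paper: one large package grows, several small ones appear, and the mean stays put while the median drops.

```python
from statistics import mean, median

# Invented package sizes (in KSLOC) for two hypothetical releases.
older_release = [100, 20]
newer_release = [140, 40, 30, 30]  # the large package grew; small packages were added

for label, sizes in (("older", older_release), ("newer", newer_release)):
    print(label, "packages:", len(sizes), "mean:", mean(sizes), "median:", median(sizes))
```

The unchanged mean hides both trends, which is why the mean alone says little about how individual packages behave.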

Solutions/Data Gathered: Common packages: present in both releases, but possibly updated in the later one. Common versions: present in both releases with no change.
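
The two definitions above can be sketched with set operations, assuming each release is represented as a mapping from package name to version string. The function name `compare_releases` and the sample releases are hypothetical.

```python
def compare_releases(older, newer):
    """Classify packages across two releases given name -> version mappings."""
    common_packages = set(older) & set(newer)
    # A common version is a common package whose version string did not change.
    common_versions = {name for name in common_packages if older[name] == newer[name]}
    return {
        "common_packages": common_packages,
        "common_versions": common_versions,
        "removed": set(older) - set(newer),
        "added": set(newer) - set(older),
    }


# Hypothetical releases, for illustration only.
release_a = {"alpha": "1.0-1", "beta": "2.1-3", "gamma": "0.9-2"}
release_b = {"alpha": "1.0-1", "beta": "2.4-1", "delta": "3.0-1"}

result = compare_releases(release_a, release_b)
for category in ("common_packages", "common_versions", "removed", "added"):
    print(category, sorted(result[category]))
```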

Solutions/Data Gathered: 25% of packages have been completely removed. 15% of packages have remained unchanged. The number of packages with versions in common increases.

Solutions/Data Gathered: C dominates in all versions, with a share between 85% and 55%.
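
A language's share is its SLOC count divided by the total across all languages. The sketch below shows that calculation. The function name `language_shares` and the counts are placeholders, not measurements from the paper.

```python
def language_shares(sloc_by_language):
    """Return each language's percentage of the total SLOC."""
    total = sum(sloc_by_language.values())
    if total == 0:
        return {}
    return {lang: 100.0 * count / total for lang, count in sloc_by_language.items()}


# Placeholder SLOC counts, for illustration only.
counts = {"C": 550, "C++": 200, "Shell": 100, "Perl": 100, "Python": 50}

shares = language_shares(counts)
for lang, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{lang}: {share:.1f}%")
```

A language can grow in absolute lines while its share falls, as long as the other languages grow faster.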

Solutions/Data Gathered: 300% increase in lines of C code, but the overall trend is toward Python and Perl. Reasoning: many more shell scripts for installation purposes.

Conclusions: Evidence shows that releases roughly double in size (in terms of packages or lines of code) every two years. The mean package size stays the same, but this is not indicative of individual package behavior: there are more files with more lines, but many small packages as well.

Conclusions: One developer can only handle a limited number of files, but software is getting larger, so more developers are needed. C is becoming less important, even though it still leads in percentage of lines.

Conclusions: More research is needed to determine whether there is a link between skills, number of developers, complexity, and activities performed. Debian provides a good example for understanding the evolution of compilations.

Thoughts about the paper: Strong points: The data shows an interesting progression of versioning for this product; another face of software evolution. Good choice of a Linux product with established versioning; Ubuntu, for example, may have been too new. Good explanation of the reasons behind the trends, e.g. the constant mean and the increase in shell code.

Thoughts about the paper: Points that need improvement: Borrows terms like "maintenance" but departs from their usual definition; "versioning" would probably have sufficed. Does not really explain the significance of common packages and files between versions; it just lists them. It is a bold claim that Debian is indicative of software compilation evolution as a whole: other distributions may show different patterns, so the paper should show background research on that.

Questions/Comments