Two Case Studies of Open Source Software Development: Apache and Mozilla By Helen Gower, Drew Spencer, Mila Reid, Nigel Macarthur & Mohamed Hossain
Introduction Development Process - Traditional What is Open Source Software Apache and Mozilla Apache Process Hypotheses Mozilla Process Hypotheses revisited Conclusion Research Any questions
Development Process - Traditional Basically the waterfall cycle Predominantly used in the commercial industry Advantages: well established, structured procedures Disadvantages: management related constraints, cannot go back a phase
What is Open Source Software (OSS)? A new way to develop software Differences from traditional development – Source code is freely available – Communicate exclusively by email/bulletin boards – Geographically distributed development Advantages: developer freedom, tacit knowledge Disadvantages: lacks traditional methods to coordinate development
Open Source Software - Results OSS development has proven to be equivalent/superior to traditional methods – Defects found and fixed quicker – Code written with more care/creativity An example of successful OSS software is the Linux operating system
What is Apache? Apache is a free, open source HTTP web server software system Works well on open source operating systems such as UNIX and Linux Also available for Windows and other operating systems
What is Apache? cont… Supports the Perl and PHP languages Provides services such as server-side scripting Industry leaders such as DEC, UUNet and Yahoo use Apache 70% of the world's web servers run on Apache
Why call it Apache? In early 1995, developers of some high visibility web sites decided to pool their patches and enhancements to the NCSA httpd 1.3 server to create… A patchy server The Apache Group (AG) started 1995
What is Mozilla? The Mozilla Project is an open source software project Dedicated to development of the Mozilla web browser and application framework Available for many operating systems – Firefox, a cross-platform browser; and – Camino, a web browser for MacOS X
What is Mozilla? cont… Includes mail and news reader (Mozilla Thunderbird), HTML editor and an IRC client Supports many technologies including development tools – CVS, Bugzilla, Bonsai, Tinderbox It also builds toolkit type applications such as Komodo from ActiveState
What is Mozilla? cont… Mozilla uses a development process with commercial roots Mozilla.org exists as a group within Netscape – Central point of contact responsible for coordinating development
The Development Process Problems posed by OSS-style development – Decentralised Workspaces – Lack of communication & leadership – Inconsistent dedication of time Solutions – Concurrent Version Control Archive (CVS) – Mailing List – Quorum Voting System – Meritocracy
Identifying work to be done Modification Requests (MRs) – mailing list BUGDB USENET groups Showstoppers always addressed Others discussed on mailing list
Assigning & Performing Work Core developers have own areas New developers take on disowned areas or new features Great respect for core developers' expertise and experience No specific rights to code – meritocracy gives implicit ownership
The Development Community 400 individual contributors of code – 182 people contributed 695 fixes – 249 people contributed 6,092 new code submissions 3,060 people submitted 3,975 bug reports – 458 people submitted 591 reports that caused a change in the code
Distribution of Work Top 15 developers contributed: – 83% of MRs for new features – 66% of MRs for defects/bugs The wider development community is significant in defect repair Few outside the core group submit with any regularity [Table: developers contributing > 1 MR (Before / After / Both) for New Features and Fixes]
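A minimal sketch of how a "top N developers" share like the 83% and 66% figures could be computed from per-developer MR counts. The helper name `top_n_share` and the counts are hypothetical and are not taken from the study.

```python
def top_n_share(mr_counts, n=15):
    """Return the fraction of all MRs contributed by the n most active developers."""
    total = sum(mr_counts)
    if total == 0:
        return 0.0
    top = sorted(mr_counts, reverse=True)[:n]
    return sum(top) / total


# Hypothetical per-developer MR counts (illustration only, not study data).
example_counts = [120, 90, 60, 30, 10, 5, 5, 3, 2, 1]
share = top_n_share(example_counts, n=3)
print(f"Top 3 developers: {share:.0%} of MRs")
```

Running the same function separately over new-feature MRs and defect MRs would give the two percentages reported on the slide.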
Commercial Project Comparison [Table: MR, KLOCA, Dev, MR/top dev/yr and LOCA/top dev/yr for commercial projects A to E, the A-E average, and Apache]
Commercial Project Comparison Apache's top developers handle around twice as many MRs as top developers on the commercial projects In terms of LOCA, Apache's rate of development is within 2/3 of that of C & D B & E are about twice as productive A is 10 times more productive
Reporting Problems The top 15 problem reporters contributed only 5% of PRs Of these 15, only 3 are also core developers Problem reporting belongs almost exclusively to the wider development community
Ownership of Code Was thought likely that strong code ownership would evolve due to modular design and decentralisation This was not supported by analysis of files (.c files) – 75% had > 2 developers contributing 10% of lines – 50% had > 4 developers contributing 10% of lines – High level of trust and recognition of expertise
Defect Density Measured in defects/KLOCA Apache has same defect density for pre-release and post release tests Pre-release – fewer defects than commercial products Post release – more defects than commercial products
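The defects/KLOCA measure is a simple ratio: defects found divided by thousands of lines of code added. A small sketch, with a hypothetical function name and made-up inputs chosen only to show the arithmetic:

```python
def defect_density(defects, lines_added):
    """Defects per thousand lines of code added (defects/KLOCA)."""
    if lines_added <= 0:
        raise ValueError("lines_added must be positive")
    return defects / (lines_added / 1000)


# Hypothetical figures (not from the study): 50 defects in 20,000 added lines.
print(defect_density(defects=50, lines_added=20_000))  # 2.5
```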
Resolving Problems How long does it take to resolve problems? – 50% resolved within 24hrs – 75% resolved within 42 days – 90% resolved within 140 days Slightly lower for documentation, OS related and optional features Over two periods the average resolution interval decreased significantly while the number of users increased
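Figures such as "75% resolved within 42 days" are cumulative fractions over the resolution intervals of all problem reports. A minimal sketch of that calculation follows; the interval data and the helper name are hypothetical, not taken from the paper.

```python
def fraction_resolved_within(intervals_days, limit_days):
    """Fraction of problem reports whose resolution interval is <= limit_days."""
    if not intervals_days:
        raise ValueError("no intervals given")
    resolved = sum(1 for d in intervals_days if d <= limit_days)
    return resolved / len(intervals_days)


# Hypothetical resolution intervals in days (illustration only).
intervals = [0.2, 0.5, 1, 1, 3, 20, 40, 100, 130, 300]
for limit in (1, 42, 140):
    share = fraction_resolved_within(intervals, limit)
    print(f"within {limit} days: {share:.0%}")
```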
What has been analysed? The structure of the development process The number of participants The distribution of work among different roles Rules of ownership of code Density of defects Time taken to resolve problems
Hypothesis 1 Implicit coordination mechanism – Detailed knowledge of who has expertise in what area – Customs & habits regarding how things are done – What core members are doing Core of developers who control the code base - No larger than 10-15 people - Create approx 80% of the new functionality (not fixes or problem reporting)
Hypothesis 2 Satellite projects created Divide & conquer – work split over core developers and satellite groups Strict code ownership policy needs to be adopted
Hypothesis 3 A group around 10x larger than the core (10-15 people) will repair defects - E.g. Apache, 182 people repaired defects A group 10x larger or more will report problems - E.g. Apache, 3,060 people submitted bug reports
Hypothesis 4 Lack of resources = overburdened Most people have only ever submitted 1 bug – Apache: 3,060 people reported 3,975 bugs Wider community needed to free up core developers' time so they can develop new functionality Projects without a wider community finding and repairing defects will fail
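The Apache figures can be checked against the "around 10x" claims with a few divisions. This sketch assumes a core of 15 (the "top 15 developers" from the Distribution of Work slide) and uses a hypothetical helper, `size_ratios`.

```python
def size_ratios(core, fixers, reporters):
    """Return group-size ratios relevant to Hypothesis 3."""
    return {
        "fixers_per_core": fixers / core,
        "reporters_per_core": reporters / core,
        "reporters_per_fixer": reporters / fixers,
    }


# Apache figures from the slides; core size of 15 is an assumption.
ratios = size_ratios(core=15, fixers=182, reporters=3060)
for name, value in ratios.items():
    print(f"{name}: {value:.1f}")
```

Under that assumption the fixer group is roughly 12 times the core, and the reporter group is roughly 17 times the fixer group.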
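The claim that most people submitted only one bug follows from the two totals alone. If each of the 3,060 people filed at least one report, only 915 reports remain to be shared out, so at most 915 people can have filed more than one. A short sketch of that bound (the function name is hypothetical):

```python
def min_single_reporters(people, reports):
    """Lower bound on how many people filed exactly one report,
    assuming every person counted filed at least one."""
    extra = reports - people  # reports beyond one per person
    if extra < 0:
        raise ValueError("fewer reports than people")
    # Each person with more than one report uses up at least one extra report.
    return max(people - extra, 0)


people, reports = 3060, 3975  # Apache figures from the slide
lower = min_single_reporters(people, reports)
print(lower, f"{lower / people:.0%}")  # 2145 70%
```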
Hypothesis 5 Defect density (per 1,000 lines of code) will be lower than commercial software
Hypothesis 6 Familiar with the features needed Familiar with desirable user behaviour Developers are also experienced users of the software they write
Hypothesis 7 Many eyeballs implies shallow bugs Free-world of OSS – Patches available to all customers nearly as soon as they are made Commercial developments – Patches bundled into new releases and scheduled for release at specific times (long term projects) OSS developments exhibit rapid responses to customer problems
Mozilla – How Things Happen At the time the paper was written, development was done by 12 staff at mozilla.org Non-development staff concentrate on issues like testing, or community milestone releases The content of future releases is specified in a road map Work within this is allocated according to developer preferences and expertise
How Things Happen cont… Developers can browse Bugzilla to choose areas on which they would like to work Mozilla web pages can be used to note areas where help is needed Mozilla operates on a daily build Each build is smoke tested by one of 6 pre-release test teams This is followed by inspections and managed release
Mozilla – Research Findings The points below summarise the research questions originally considered 486 people contributed code 412 contributed code to fixes 6,873 communicated problems (external community very large, small core) Code ownership is enforced
Mozilla – Findings cont… The authors' hypotheses 1 and 2 are supported by the Mozilla data However, these hypotheses were modified as summarised below: The core size is limited to 10-15 only if informal coordination alone is used – Original hypothesis did not discuss impact of coordination Project cores larger than 10-15 might require other mechanisms in addition to code ownership to improve coordination
Mozilla – Findings cont… Hypothesis 3 (relative sizes of core/fixers/reporters) is weakly supported Core = 22 to 35 (larger than expected) Fixers = 47 to 129 Reporters = 119 to 623 Mozilla defect density lower than commercial equivalent projects – although: caution – more may be found later
Mozilla - Conclusions Commercial/OSS has many possible hybrids These hybrids will require a large open source community to fix bugs They will also require an even larger community to find bugs
Research A strong paper Good approach to measuring the metrics required to test the required hypotheses Citation frequency in the following years suggests that it is regarded as authoritative However, the final conclusion, as mentioned previously, must be considered unproven as yet
References How to read and critique a technical paper, Colorado State University (paper.ppt#1) Greenhalgh, Trisha, How to Read a Paper, London, BMJ, 1997 The above summarised at: – Note: the BMJ reference above concerns evidence-based medicine, but has some useful sections!
Any questions? …preferably easy ones!