1
Open Source Software Development
Jim Herbsleb ISRI Wean 1321
2
Geographically Distributed Development
Extremely slow
Hampered by communication and coordination problems
Needs to make extensive use of collaboration technology, e.g., application sharing, shared calendars, teleconferences, chat and IM
Requires extensive use of “coordination mechanisms” such as interface specifications, plans, processes
Must carefully design division of labor across sites
Very difficult to respond to unanticipated events
3
Open Source Challenge
Fundamentally different model of software development
How does it really work?
What sort of process results from open source principles?
What are the properties of the software developed this way?
Case study of Apache and Mozilla (with Audris Mockus and Roy Fielding)
Research issues in open source

Notes:
The open source challenge -- a fundamental challenge to the economics and organizational forms of traditional software development.
Clearly has had some major successes: Linux, Apache, bind, etc.
Properties very different from commercial development:
  Built by large numbers of volunteers
  Work not assigned; rather, individuals choose
  No explicit design
  No plan, schedule, processes, deliverables
What sort of process results from this radical departure? What is the result? How good is the software, how efficient is the process?
4
Empirical Research Questions
How many people wrote code for new functionality?
How many people reported problems?
How many people repaired defects?
Did large numbers of people participate somewhat equally in these activities, or did a small number of people do most of the work?
Where did the code contributors work in the code?
Was strict code ownership enforced on a file or module level?
What is the productivity of OSS developers?
What is the defect density of OSS code?
How long did it take to resolve problems?
5
Empirical Methods - 1
Sources:
  Mail archives for ~3 years
  CVS/BUGDB/developer discussions
  Core group (about 12 people at any time) have CVS commit privileges
Output: CVS updates, BUGDB numbers
CVS update record (MR):
  date, files touched, lines changed
  author of the change
  BUGDB tracking number (if it’s a problem fix)
BUGDB tracking number record:
  raiser, dates opened, closed
  resolution
  module
  related CVS updates
6
Empirical Methods - 2
Research questions required change measures
Identified several “comparable” commercial projects:
  number of deltas within order of magnitude
  developed over comparable period
  all had high reliability requirements
Differences must be interpreted cautiously
7
Roles in Apache Development
Size of the development community
How many people wrote code for new Apache functionality? (no reference to problem report)
  249 people, 6092 submissions
How many people reported problems?
  458 people, 591 reports that resulted in code change
How many people repaired defects?
  182 people, 695 fixes
How was work distributed within the development community?
8
The cumulative distribution of contributions to the code base.
Figure panels: All code contributions; Fixes only
A, B: two commercial projects (telecommunications)
Notes: One figure, with delta or MR, new + fixes + A + B. Mention that features and fixes are distinct sets of people, with little overlap among high contributors.
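A cumulative distribution of this kind can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes one author name per change (delta or MR), and the function name `cumulative_share` and the sample data are hypothetical.

```python
from collections import Counter


def cumulative_share(authors):
    """Cumulative fraction of changes covered by the top 1, 2, 3, ... contributors.

    `authors` holds one entry per change (delta or MR), naming its author.
    """
    # Rank contributors from most to fewest changes.
    counts = sorted(Counter(authors).values(), reverse=True)
    total = sum(counts)
    shares = []
    running = 0
    for count in counts:
        running += count
        shares.append(running / total)
    return shares


# Hypothetical data: 10 changes made by 4 developers.
changes = ["ann"] * 6 + ["bob"] * 2 + ["cat", "dan"]
print(cumulative_share(changes))  # [0.6, 0.8, 0.9, 1.0]
```

Plotting the returned list against contributor rank gives one curve; running it separately on all code contributions and on fixes only gives the two panels.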
9
Code Ownership
Was strict code ownership enforced on a file or module level? No.
Out of 42 “.c” files with more than 30 changes:
  40 had at least two developers making more than 10% of the changes
  20 had at least four developers making more than 10% of the changes
Use other means of coordinating changes:
  core group members have mutual trust, contribute code to various modules as needed
  use discussion board
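The ownership check on this slide can be sketched roughly as below. It assumes the change history is available as (file, author) pairs; the function name `shared_ownership`, its parameters, and the sample files are hypothetical, chosen to mirror the thresholds on the slide (more than 30 changes, more than 10% of the changes).

```python
from collections import Counter, defaultdict


def shared_ownership(changes, min_changes=30, share=0.10):
    """For each file with more than `min_changes` changes, count the
    developers who each made more than `share` of that file's changes.

    `changes` is an iterable of (file, author) pairs, one per change.
    """
    per_file = defaultdict(Counter)
    for path, author in changes:
        per_file[path][author] += 1

    result = {}
    for path, counts in per_file.items():
        total = sum(counts.values())
        if total > min_changes:
            result[path] = sum(1 for n in counts.values() if n / total > share)
    return result


# Hypothetical history: a.c has 37 changes, b.c only 5.
history = (
    [("a.c", "ann")] * 20
    + [("a.c", "bob")] * 15
    + [("a.c", "cat")] * 2
    + [("b.c", "ann")] * 5
)
print(shared_ownership(history))  # {'a.c': 2}
```

A file counts against strict ownership when its value is 2 or more; counting the files with a value of at least 2, or at least 4, gives figures of the kind quoted on the slide.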
10
Productivity
Compare sets of developers that produced 80% of the code in each application
A-E: similar-sized commercial projects

                      Apache   A      B      C      D      E
KMR/developer/year    .11      .03    .03    .09    .02    .06
KLOC/developer/year   4.3      38.6   11.7   6.1    5.4    10

Notes: List projects.
  A: code for wireless base station -- may be copying
  B: optical network element
  C, D, E: various OAM things done by internal contract house
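One way to form the "developers that produced 80% of the code" set and a per-developer rate is sketched below. This is an assumption-laden illustration rather than the method used for the slide: it takes a mapping of developer to lines added, and the names `top_contributors` and `kloc_per_developer_year` and the sample numbers are hypothetical.

```python
def top_contributors(lines_by_dev, percent=80):
    """Smallest set of top developers whose added lines reach `percent` of the total."""
    ranked = sorted(lines_by_dev.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(lines_by_dev.values())
    chosen, running = [], 0
    for dev, added in ranked:
        # Integer comparison avoids floating-point edge cases at the threshold.
        if running * 100 >= percent * total:
            break
        chosen.append(dev)
        running += added
    return chosen


def kloc_per_developer_year(lines_by_dev, devs, years):
    """Thousands of lines added per developer per year for the given developers."""
    added = sum(lines_by_dev[d] for d in devs)
    return added / 1000 / len(devs) / years


# Hypothetical project: 100,000 lines added over 2 years.
lines = {"ann": 50_000, "bob": 30_000, "cat": 15_000, "dan": 5_000}
core = top_contributors(lines)
print(core)                                       # ['ann', 'bob']
print(kloc_per_developer_year(lines, core, 2))    # 20.0
```

Replacing line counts with MR counts gives the KMR/developer/year figure in the same way.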
11
Defect Density
Measures post release and post-feature test, per KLOC added and per thousand Delta

Columns: Apache, A, C, D, E
Post-release Defects/KLOCA: 2.64, .11, 0.1, 0.7
Post-release Defects/KDelta: 40.8, 4.3, 14, 28, 10
Post-feature test Defects/KLOCA: *, 5.7, 6.0, 6.9
Post-feature test Defects/KDelta: 164, 196, 256

Ratio overlays from the slide (Apache = 1), one group per measure:
  1, 24, 26, 2.6, 3.8
  1, 9.5, 2.9, 1.5, 5
  1, .5, .4, .4
  1, .25, .2, .16

High density for Apache in post-release (between 1.5:1 and 26:1 in favor of commercial, high reliability software)
Low density for Apache in post-feature test (between . 6:1 and 2:1 in favor of Apache)
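The two density measures and the comparison against Apache reduce to simple arithmetic, sketched below. The function names `density` and `ratio_to_apache` are hypothetical; the dictionary uses the post-release Defects/KDelta row from the slide, the one row that lists a value for every column.

```python
def density(defects, size_units):
    """Defects per thousand units, where units are lines added or deltas."""
    return defects / (size_units / 1000)


def ratio_to_apache(row):
    """Apache's density divided by each project's density, rounded to one decimal."""
    base = row["Apache"]
    return {name: round(base / value, 1) for name, value in row.items()}


# Post-release defects per thousand deltas, as listed on the slide.
post_release_per_kdelta = {"Apache": 40.8, "A": 4.3, "C": 14, "D": 28, "E": 10}

ratios = ratio_to_apache(post_release_per_kdelta)
print(ratios["A"], ratios["C"], ratios["D"])  # 9.5 2.9 1.5

# Hypothetical example: 50 defects against 20,000 lines added.
print(density(50, 20_000))  # 2.5
```

A ratio above 1 means Apache's density is higher than the commercial project's on that measure.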
12
Defect Resolution Time
One half of problems are resolved within:
  core: 1 day
  most sites: 3 days
  documentation: 4 days
  major optional: 15 days
  os: 10 days
Note: larger fonts for legend.
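"One half of problems are resolved within N days" is the median resolution time per category. A minimal sketch, assuming each problem report carries a category plus opened and closed dates; the function name `median_days_to_resolve` and the sample reports are hypothetical.

```python
from datetime import date
from statistics import median


def median_days_to_resolve(reports):
    """Median number of days from opened to closed, per category.

    `reports` is an iterable of (category, opened, closed) with date objects.
    """
    by_category = {}
    for category, opened, closed in reports:
        by_category.setdefault(category, []).append((closed - opened).days)
    return {category: median(days) for category, days in by_category.items()}


# Hypothetical problem reports.
reports = [
    ("core", date(1999, 1, 1), date(1999, 1, 1)),
    ("core", date(1999, 1, 1), date(1999, 1, 2)),
    ("core", date(1999, 1, 1), date(1999, 1, 4)),
    ("documentation", date(1999, 2, 1), date(1999, 2, 3)),
    ("documentation", date(1999, 2, 1), date(1999, 2, 7)),
]
print(median_days_to_resolve(reports))  # {'core': 1, 'documentation': 4.0}
```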
13
Hypotheses

Hypothesis 1: Open source developments will have a core of developers who control the code base. This core will be no larger than people, and will create approximately 80% or more of the new functionality.

Hypothesis 2: For projects that are so large that developers cannot write 80% of the code in a reasonable time frame, informal coordination will not suffice.

Hypothesis 3: In successful open source developments, a group larger by an order of magnitude than the core will repair defects, and a yet larger group (by another order of magnitude) will report problems.

Hypothesis 4: Open source developments that have a strong core of developers but never achieve large numbers of contributors beyond that core will be able to create new functionality but will fail because of a lack of resources devoted to finding and repairing defects.

Hypothesis 5: Defect density in open source releases will generally be lower than commercial code that has only been feature-tested, i.e., received a comparable level of testing.

Hypothesis 6: In successful open source developments, the developers will also be users of the software.

Hypothesis 7: OSS developments exhibit very rapid responses to customer problems.

Notes:

1. We base this hypothesis both on our empirical findings in this case, and also on observations and common wisdom about maximum team size. The core developers must work closely together, each with fairly detailed knowledge of what other core members are doing. Without such knowledge they would frequently make incompatible changes to the code. Since they form essentially a single team, they can be overwhelmed by communication and coordination overhead issues that typically limit the size of effective teams to people.

2. The fixed maximum core team size obviously limits the output of features per unit time. To cope with this problem, a number of satellite projects, such as Apache-SSL, were started by interested parties. Some of these projects produced as much or more functionality than Apache itself. It seems likely that this pattern of core group and satellite groups that add unique functionality targeted to a particular group of users, will frequently be adopted in such cases. In other OSS projects like Linux, the kernel functionality is also small compared to application and user interface functionalities. The nature of relationships between the core and satellite projects remains to be investigated; yet it might serve as an example how to break large monolithic commercial projects into smaller, more manageable pieces. We can see the examples where the integration of these related OSS products is performed by a commercial organization, e.g., RedHat for Linux, ActivePerl for Perl, and CYGWIN for GNU tools.

3-4. Many defect repairs can be performed with only a limited risk of interacting with other changes. Problem reporting can be done with no risk of harmful interaction at all. Since these types of work typically have fewer dependencies among participants than does the development of new functionality, potentially much larger groups can work on them. In a successful development, these activities will be performed by larger communities, freeing up time for the core developers to develop new functionality. Where an OSS development fails to stimulate wide participation, either the core will become overburdened with finding and repairing defects, or the code simply will never reach an acceptable level of quality.

6. In general, open source developers are experienced users of the software they write. They are intimately familiar with the features they need, and what the correct and desirable behavior is. Since the lack of domain knowledge is one of the chief problems in large software projects [7], one of the main sources of error is eliminated when domain experts write the software. It remains to be seen if this advantage can completely compensate for the absence of system testing. In any event, where the developers are not also experienced users of the software, they are highly unlikely to have the necessary level of domain expertise or the necessary motivation to succeed as an OSS project.
14
Research Questions: Resource Allocation, Decision-Making
How do key developers decide where to allocate their resources?
  User innovation model
  Personal reputation model
  Product needs model
How do individual motivations sum to give the development its trajectory?
  Not quite a market, not quite a hierarchy, perhaps a network
15
Research Questions: Understanding Current Limitations of OSS
Product structure, architecture – comprehension and collaboration
What does not get built?
  Developers only meeting own needs?
  Differences between developer/users and general users?
  Effective ways of incorporating requirements of non-developer users?
Effects of scale
  With larger scale, will coordination needs force adoption of “commercial” development techniques?
  How to collaborate on “big” features?
  Possible to increase participation by non-core developers?
16
Research Questions: Adoption and Patronage
Commercial organizations need ways to assess risk of adopting open source
Patronage creates new forms of virtual organization
  What effects on OSS culture, individual motivation, economic network?
How will competitive pressures, business motivations affect development?
  Cause branching, fragmentation?
  Evolve toward joint ventures, away from community ownership?