Download presentation
Presentation is loading. Please wait.
1
Conceptual Framework for Agent- Based Modeling and Simulation: The Computer Experiment Yongqin GaoVincent Freeh Greg Madey CSE DepartmentCS Department CSE Department University of Notre DameNCSU University of Notre Dame NAACSOS Conference Pittsburgh, PA June 25, 2003 Supported in part by the National Science Foundation - Digital Society & Technology Program
2
The Computer Experiment
3
Agent-Based Simulation as a Component of the Scientific Method Modeling (Hypothesis) Agent -Based Simulation (Experiment) Observation Social Network Model of F/OSS Grow Artificial SourceForge Analysis of SourceForge Data
4
Outline Investigation: Free/Open Source Software (F/OSS)Investigation: Free/Open Source Software (F/OSS) Conceptual framework(s)Conceptual framework(s) Model descriptionModel description ER modelER model BA modelBA model BA model with constant fitnessBA model with constant fitness BA model with dynamic fitnessBA model with dynamic fitness SummarySummary
5
Open Source Software (OSS) Free …Free … –to view source –to modify –to share –of cost ExamplesExamples –Apache –Perl –GNU –Linux –Sendmail –Python –KDE –GNOME –Mozilla –Thousands more Linux GNU Savannah
6
Free Open Source Software (F/OSS) DevelopmentDevelopment –Mostly volunteer –Global teams –Virtual teams –Self-organized - often peer-based meritocracy –Self-managed - but often a “ charismatic ” leader –Often large numbers of developers, testers, support help, end user participation –Rapid, frequent releases –Mostly unpaid
7
Typical Charismatic Leaders? Linus Tolvalds Linux Larry Wall Perl Richard Stallman GNU Manifesto Eric Raymond Cathedral and Bazaar
8
F/OSS: Significance Contradicts traditional wisdom:Contradicts traditional wisdom: –Software engineering –Coordination, large numbers –Motivation of developers –Quality –Security –Business strategy Almost everything is done electronically and available in digital formAlmost everything is done electronically and available in digital form Opportunity for Social Science Research -- large amounts of online data availableOpportunity for Social Science Research -- large amounts of online data available Research issues: –Understanding motives –Understanding processes –Intellectual property –Digital divide –Self-organization –Government policy –Impact on innovation –Ethics –Economic models –Cultural issues –International factors
9
SourceForge VA Software Part of OSDN Started 12/1999 Collaboration tools 58,685 Projects 80,000 Developers 590,00 Registered Users
10
Savannah Uses SourceForge Software Free Software Foundation 1,508 Projects 15,265 Registered Users
11
F/OSS: Importance Major Component of e-Technology Infrastructure with major presence in e-Commerce e-Science e-Government e-Learning Apache has over 65% market share of Internet Web servers Linux on over 7 million computers Most Internet e-mail runs on Sendmail Tens of thousands of quality products Part of product offerings of companies like IBM, Apple Apache in WebSphere, Linux on mainframe, FreeBSD in OSX Corporate employees participating on OSS projects
12
Free/Open Source Software Seems to challenge traditional economic assumptionsSeems to challenge traditional economic assumptions Model for software engineeringModel for software engineering New business strategiesNew business strategies –Cooperation with competitors –Beyond trade associations, shared industry research, and standards processes — shared product development! Virtual, self-organizing and self-managing teamsVirtual, self-organizing and self-managing teams Social issues, e.g., digital divide, international participationSocial issues, e.g., digital divide, international participation Government policy issues, e.g., US software industry, impact on innovation, security, intellectual propertyGovernment policy issues, e.g., US software industry, impact on innovation, security, intellectual property
13
Research Model
14
Data Collection — Monthly Web crawler (scripts)Web crawler (scripts) –Python –Perl –AWK –Sed MonthlyMonthly Since Jan 2001Since Jan 2001 ProjectIDProjectID DeveloperIDDeveloperID Almost 2 million recordsAlmost 2 million records Relational databaseRelational database PROJ|DEVELOPER 8001|dev378 8001|dev8975 8001|dev9972 8002|dev27650 8005|dev31351 8006|dev12509 8007|dev19395 8007|dev4622 8007|dev35611 8008|dev8975
15
dev[59] dev[54] dev[49] dev[64] dev[61] Project 6882 Project 9859 Project 7597 Project 7028 Project 15850 F/OSS Developers - Social Network Component Developers are nodes / Projects are links 24 Developers 5 Projects 2 Linchpin Developers 1 Cluster
16
Models of the F/OSS Social Network (Alternative Hypotheses) General model featuresGeneral model features –Agents are nodes on a graph (developers or projects) –Behaviors: Create, join, abandon and idle –Edges are relationships (joint project participation) –Growth of network: random or types of preferential attachment, formation of clusters –Fitness –Network attributes: diameter, average degree, power law, clustering coefficient Four specific modelsFour specific models –ER (random graph) –BA (scale free) –BA ( + constant fitness) –BA ( + dynamic fitness)
17
ER model – degree distribution Degree distribution is binomial distribution while it is power law in empirical data R 2 = 0.9712 for developer network R 2 = 0.9815 for project network
18
ER model - diameter Average degree is decreasing while it is increasing in empirical data Diameter is increasing while it is decreasing in empirical data
19
ER model – clustering coefficient Clustering coefficient is relatively low around 0.4 while it is around 0.7 in empirical data. Clustering coefficient is decreasing while it is increasing in empirical data
20
ER model – cluster distribution Cluster distribution in ER model also have power law distribution with R 2 as 0.6667 (0.9953 without the major cluster) while R 2 in empirical data is 0.7457 (0.9797 without the major cluster) The actual distribution is different from empirical data The later models (BA and further models) have similar behaviors
21
BA model – degree distribution Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). For developer distribution: simulated data has R 2 as 0.9798 and empirical data has R 2 as 0.9712. For project distribution: simulated data has R 2 as 0.6650 and empirical data has R 2 as 0.9815.
22
BA model – diameter and CC Small diameter and high clustering coefficient like empirical data Diameter and clustering coefficient are both decreasing like empirical data
23
BA model with constant fitness Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). For developer distribution: simulated data has R2 as 0.9742 and empirical data has R2 as 0.9712. For project distribution: simulated data has R2 as 0.7253 and empirical data has R2 as 0.9815. Diameter and CC are similar to simple BA model.
24
BA model with dynamic fitness Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). For developer distribution: simulated data has R2 as 0.9695 and empirical data has R2 as 0.9712. For project distribution: simulated data has R2 as 0.8051 and empirical data has R2 as 0.9815.
25
Advantage of BA with dynamic fitness Intuition: Fitness should decreasing with time.Intuition: Fitness should decreasing with time. Statistics: project has life cycle behavior which can not be replicated by BA model with constant fitness but can be replicated by BA model with dynamic fitnessStatistics: project has life cycle behavior which can not be replicated by BA model with constant fitness but can be replicated by BA model with dynamic fitness
26
Conceptual Framework Agent-Based Modeling and Simulation as Components of the Scientific Method Observation Hypothesis Experiment
27
Summary We use ABM to model and simulate the SourceForge collaboration network.We use ABM to model and simulate the SourceForge collaboration network. Conceptual framework is proposed for agent- based modeling and simulation.Conceptual framework is proposed for agent- based modeling and simulation. Case study of this framework: SourceForge study through ER, BA, BA with constant fitness and BA with dynamic fitness.Case study of this framework: SourceForge study through ER, BA, BA with constant fitness and BA with dynamic fitness.
28
Thank you
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.