1 / 28

Conceptual Framework for Agent-Based Modeling and Simulation: The Computer Experiment

Conceptual Framework for Agent-Based Modeling and Simulation: The Computer Experiment. Yongqin Gao Vincent Freeh Greg Madey CSE Department CS Department CSE Department University of Notre Dame NCSU University of Notre Dame NAACSOS Conference Pittsburgh, PA

lindsay
Download Presentation

Conceptual Framework for Agent-Based Modeling and Simulation: The Computer Experiment

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Conceptual Framework for Agent-Based Modeling and Simulation: The Computer Experiment Yongqin Gao Vincent Freeh Greg Madey CSE Department CS Department CSE Department University of Notre Dame NCSU University of Notre Dame NAACSOS Conference Pittsburgh, PA June 25, 2003 Supported in part by the National Science Foundation - Digital Society & Technology Program

  2. The Computer Experiment

  3. Agent-Based Simulation as a Component of the Scientific Method Modeling (Hypothesis) Social Network Model of F/OSS Agent -Based Simulation (Experiment) Observation Analysis of Grow Artificial SourceForge SourceForge Data

  4. Outline • Investigation: Free/Open Source Software (F/OSS) • Conceptual framework(s) • Model description • ER model • BA model • BA model with constant fitness • BA model with dynamic fitness • Summary

  5. Open Source Software (OSS) GNU Linux • Free … • to view source • to modify • to share • of cost • Examples • Apache • Perl • GNU • Linux • Sendmail • Python • KDE • GNOME • Mozilla • Thousands more Savannah

  6. Free Open Source Software (F/OSS) • Development • Mostly volunteer • Global teams • Virtual teams • Self-organized - often peer-based meritocracy • Self-managed - but often a “charismatic” leader • Often large numbers of developers, testers, support help, end user participation • Rapid, frequent releases • Mostly unpaid

  7. TypicalCharismatic Leaders? Larry Wall Perl Linus Tolvalds Linux Richard Stallman GNU Manifesto Eric Raymond Cathedral and Bazaar

  8. Contradicts traditional wisdom: Software engineering Coordination, large numbers Motivation of developers Quality Security Business strategy Almost everything is done electronically and available in digital form Opportunity for Social Science Research -- large amounts of online data available Research issues: Understanding motives Understanding processes Intellectual property Digital divide Self-organization Government policy Impact on innovation Ethics Economic models Cultural issues International factors F/OSS: Significance

  9. SourceForge • VA Software • Part of OSDN • Started 12/1999 • Collaboration tools • 58,685 Projects • 80,000 Developers • 590,00 Registered Users

  10. Savannah • Uses SourceForge Software • Free Software Foundation • 1,508 Projects • 15,265 Registered Users

  11. F/OSS: Importance • Major Component of e-Technology Infrastructure with major presence in • e-Commerce • e-Science • e-Government • e-Learning • Apache has over 65% market share of Internet Web servers • Linux on over 7 million computers • Most Internet e-mail runs on Sendmail • Tens of thousands of quality products • Part of product offerings of companies like IBM, Apple • Apache in WebSphere, Linux on mainframe, FreeBSD in OSX • Corporate employees participating on OSS projects

  12. Free/Open Source Software • Seems to challenge traditional economic assumptions • Model for software engineering • New business strategies • Cooperation with competitors • Beyond trade associations, shared industry research, and standards processes — shared product development! • Virtual, self-organizing and self-managing teams • Social issues, e.g., digital divide, international participation • Government policy issues, e.g., US software industry, impact on innovation, security, intellectual property

  13. Research Model

  14. Data Collection — Monthly • Web crawler (scripts) • Python • Perl • AWK • Sed • Monthly • Since Jan 2001 • ProjectID • DeveloperID • Almost 2 million records • Relational database PROJ|DEVELOPER 8001|dev378 8001|dev8975 8001|dev9972 8002|dev27650 8005|dev31351 8006|dev12509 8007|dev19395 8007|dev4622 8007|dev35611 8008|dev8975

  15. dev[72] dev[67] dev[52] dev[65] dev[70] dev[57] 7597 dev[46] 6882 dev[47] dev[64] dev[45] dev[99] 7597 dev[46] 7597 dev[46] dev[52] dev[72] dev[67] 7597 dev[46] 6882 dev[47] dev[47] dev[55] dev[55] dev[55] 7597 dev[46] 7028 dev[46] dev[70] 7597 dev[46] 7028 dev[46] dev[57] dev[45] dev[99] dev[51] 7597 dev[46] 7028 dev[46] 6882 dev[47] 6882 dev[58] dev[61] dev[51] dev[79] dev[47] 7597 dev[46] dev[58] dev[58] dev[46] 9859 dev[46] dev[54] 15850 dev[46] dev[58] 9859 dev[46] dev[79] dev[58] 9859 dev[46] dev[49] dev[53] 9859 dev[46] 15850 dev[46] dev[59] dev[56] 15850 dev[46] dev[83] 15850 dev[46] dev[48] dev[53] dev[56] dev[83] dev[48] F/OSS Developers - Social Network Component Developers are nodes / Projects are links 24 Developers 5 Projects Project 7597 2 Linchpin Developers 1 Cluster dev[64] Project 6882 Project 7028 dev[61] dev[54] dev[49] dev[59] Project 9859 Project 15850

  16. Models of the F/OSS Social Network(Alternative Hypotheses) • General model features • Agents are nodes on a graph (developers or projects) • Behaviors: Create, join, abandon and idle • Edges are relationships (joint project participation) • Growth of network: random or types of preferential attachment, formation of clusters • Fitness • Network attributes: diameter, average degree, power law, clustering coefficient • Four specific models • ER (random graph) • BA (scale free) • BA ( + constant fitness) • BA ( + dynamic fitness)

  17. ER model – degree distribution • Degree distribution is binomial distribution while it is power law in empirical data • R2 = 0.9712 for developer network • R2 = 0.9815 for project network

  18. ER model - diameter • Average degree is decreasing while it is increasing in empirical data • Diameter is increasing while it is decreasing in empirical data

  19. ER model – clustering coefficient • Clustering coefficient is relatively low around 0.4 while it is around 0.7 in empirical data. • Clustering coefficient is decreasing while it is increasing in empirical data

  20. ER model – cluster distribution • Cluster distribution in ER model also have power law distribution with R2 as 0.6667 (0.9953 without the major cluster) while R2 in empirical data is 0.7457 (0.9797 without the major cluster) • The actual distribution is different from empirical data • The later models (BA and further models) have similar behaviors

  21. BA model – degree distribution • Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). • For developer distribution: simulated data has R2 as 0.9798 and empirical data has R2 as 0.9712. • For project distribution: simulated data has R2 as 0.6650 and empirical data has R2 as 0.9815.

  22. BA model – diameter and CC • Small diameter and high clustering coefficient like empirical data • Diameter and clustering coefficient are both decreasing like empirical data

  23. BA model with constant fitness • Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). • For developer distribution: simulated data has R2 as 0.9742 and empirical data has R2 as 0.9712. • For project distribution: simulated data has R2 as 0.7253 and empirical data has R2 as 0.9815. • Diameter and CC are similar to simple BA model.

  24. BA model with dynamic fitness • Power laws in degree distribution, similar to empirical data (+ for simulated data and x for empirical data). • For developer distribution: simulated data has R2 as 0.9695 and empirical data has R2 as 0.9712. • For project distribution: simulated data has R2 as 0.8051 and empirical data has R2 as 0.9815.

  25. Advantage of BA with dynamic fitness • Intuition: Fitness should decreasing with time. • Statistics: project has life cycle behavior which can not be replicated by BA model with constant fitness but can be replicated by BA model with dynamic fitness

  26. Conceptual FrameworkAgent-Based Modeling and Simulation as Components of the Scientific Method Hypothesis Observation Experiment

  27. Summary • We use ABM to model and simulate the SourceForge collaboration network. • Conceptual framework is proposed for agent-based modeling and simulation. • Case study of this framework: SourceForge study through ER, BA, BA with constant fitness and BA with dynamic fitness.

  28. Thank you

More Related