2019 | OriginalPaper | Book chapter

Massive Multi-agent Data-Driven Simulations of the GitHub Ecosystem

Authors: Jim Blythe, John Bollenbacher, Di Huang, Pik-Mai Hui, Rachel Krohn, Diogo Pacheco, Goran Muric, Anna Sapienza, Alexey Tregubov, Yong-Yeol Ahn, Alessandro Flammini, Kristina Lerman, Filippo Menczer, Tim Weninger, Emilio Ferrara

Published in: Advances in Practical Applications of Survivable Agents and Multi-Agent Systems: The PAAMS Collection

Publisher: Springer International Publishing


Abstract

Simulating and predicting planetary-scale techno-social systems poses heavy computational and modeling challenges. The DARPA SocialSim program set the challenge to model the evolution of GitHub, a large collaborative software-development ecosystem, using massive multi-agent simulations. We describe our best-performing models and our agent-based simulation framework, which we are currently extending to simulate other planetary-scale techno-social systems. The challenge problem measured participants’ ability, given 30 months of metadata on user activity on GitHub, to predict the next months’ activity using agent-based simulation, as measured by a broad range of metrics applied to ground truth. The challenge required scaling to a simulation of roughly 3 million agents producing a combined 30 million actions on 6 million repositories, using commodity hardware. It was also important to use the data optimally to predict the agents’ next moves. We describe the agent framework and the data analysis employed by one of the winning teams in the challenge. Six different agent models were tested, based on a variety of machine learning and statistical methods. While no single method proved the most accurate on every metric, the most broadly successful one sampled from a stationary probability distribution of actions and repositories for each agent. Two reasons for the success of these agents were their use of a distinct characterization of each agent and the fact that GitHub users change their behavior relatively slowly.
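
The idea of an agent that samples from a stationary per-agent distribution of actions and repositories can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `(agent, action, repo)` event format, the example event names, and the assumption that each agent's number of future actions is supplied from outside are all hypothetical choices made for illustration. The sketch only shows the general technique of estimating a fixed empirical distribution per agent from history and drawing future events from it.

```python
import random
from collections import Counter, defaultdict


def fit_stationary_agents(events):
    """Count how often each agent performed each (action, repo) pair.

    `events` is an iterable of (agent, action, repo) tuples; this format
    is an assumption made for the sketch, not taken from the chapter.
    """
    counts = defaultdict(Counter)
    for agent, action, repo in events:
        counts[agent][(action, repo)] += 1
    return counts


def simulate(counts, actions_per_agent, seed=0):
    """Draw future events for each agent from its own fixed distribution.

    The distribution never changes during the simulation (it is
    stationary): every draw is weighted by the historical counts.
    `actions_per_agent` (hypothetical) says how many events to draw.
    """
    rng = random.Random(seed)
    simulated = []
    for agent, counter in counts.items():
        pairs = list(counter.keys())
        weights = list(counter.values())
        n = actions_per_agent.get(agent, 0)
        for action, repo in rng.choices(pairs, weights=weights, k=n):
            simulated.append((agent, action, repo))
    return simulated


# Illustrative usage with made-up agents and repositories.
history = [
    ("alice", "PushEvent", "alice/repo1"),
    ("alice", "PushEvent", "alice/repo1"),
    ("bob", "WatchEvent", "alice/repo1"),
    ("bob", "ForkEvent", "carol/repo2"),
]
model = fit_stationary_agents(history)
future = simulate(model, {"alice": 3, "bob": 5})
```

Because each agent keeps its own count table, the sketch reflects the two reasons the abstract gives for the approach's success: every agent is characterized individually, and a slowly changing user is well approximated by its past frequencies.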


Metadata
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-24209-1_1