Published in: Empirical Software Engineering, Issue 2/2018

8 December 2017

Data sets describing the circle of life in Ruby hosting, 2003–2016

Author: Megan Squire


Abstract

Studying software repositories and hosting services can provide valuable insights into the behaviors of large groups of software developers and their projects. Traditionally, most analyses of metadata collected from software project hosting services have been restricted to a short window of time, typically just a few years. To date, few studies, if any, have been built from data covering the entirety of a hosting facility’s lifespan: from its birth to its death, and its rebirth in another form. Thus, the first contribution of this paper is to present two data sets that support the historical analysis of over ten years of metadata collected from the now-defunct RubyForge project hosting site and from its successor, the RubyGems package (“gem”) hosting facility. The data sets and usage samples demonstrated in this paper include: analyses of overall forge growth over time; data and analyses of project-level characteristics on both forges and their changes over time (for example, licenses and languages); and a demonstration of how to use developer-level metadata (for example, counts of new developers and developer-project density) to assess changes in person-level activity on both sites over time. Finally, because RubyForge was phased out and its gem-hosting function was replaced by RubyGems, all the gems within RubyForge projects were transferred into the RubyGems hosting facility by project owners and by the site owners themselves. Thus, the data sets in this paper offer a unique opportunity to study projects as they moved from one ecosystem to another. To that end, we show several methods for locating related projects between the two forges and for building a cross-forge, longitudinal project history using information from both forges.
The data sets and sample analyses in this paper will be relevant to researchers studying long-term software evolution and distributed, hosted, or collaborative software development environments.


Metadata
Title
Data sets describing the circle of life in Ruby hosting, 2003–2016
Author
Megan Squire
Publication date
8 December 2017
Publisher
Springer US
Published in
Empirical Software Engineering, Issue 2/2018
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-017-9581-6
