Published in: Empirical Software Engineering 2/2018

18-08-2017

How the R community creates and curates knowledge: an extended study of stack overflow and mailing lists

Authors: Alexey Zagalsky, Daniel M. German, Margaret-Anne Storey, Carlos Gómez Teshima, Germán Poo-Caamaño


Abstract

One of the effects of social media’s prevalence in software development is the many flourishing communities of practice where users share a common interest. These large communities use many different communication channels, but little is known about how they create, share, and curate knowledge using such channels. In this paper, we report a mixed methods study of how one community of practice, the R software development community, creates and curates knowledge associated with questions and answers (Q&A) in two of its main communication channels: the R tag in Stack Overflow and the R-Help mailing list. The results reveal that knowledge is created and curated in two main forms: participatory, where multiple users explicitly collaborate to build knowledge, and crowdsourced, where individuals primarily work independently of each other. Moreover, we take a unique approach to slicing the data based on question score and participation activities over time. Our study reveals participation patterns, showing the existence of prolific contributors: users who are active across both channels and are responsible for a large proportion of the answers, serving as a bridge of knowledge. The key contributions of this paper are: a characterization of knowledge artifacts that are exchanged by this community of practice; the reasons why users choose one channel over the other; and insights into the community participation patterns, which indicate an evolution of the community and a shift from knowledge creation to knowledge curation.


Footnotes
6
In the third phase of our study, we extended the mined datasets up to September 2016.
 
9
Our scripts, sample data, and coded data are openly available at https://zenodo.org/record/831805
 
16
There is a threat to validity for this result in the R-help data: Stack Overflow separates responses into comments and answers, whereas R-help makes no such distinction. For R-help, we consider any direct reply to an email to be an answer, and a reply to an answer to be a comment.
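The R-help rule in this footnote can be illustrated with a minimal sketch. This is not the authors' script (their scripts are in the Zenodo archive cited in footnote 9); the `Message` type, its field names, and `classify_thread` are hypothetical. The footnote does not say how deeper replies or messages with a missing parent are treated, so the sketch assumes replies nested below an answer are also comments and that a message whose parent is not in the thread is labeled as a question.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Message:
    """One email in a thread (hypothetical structure, for illustration only)."""
    msg_id: str
    in_reply_to: Optional[str]  # None for the email that starts the thread


def classify_thread(messages: List[Message]) -> Dict[str, str]:
    """Label each email as 'question', 'answer', or 'comment' by reply depth."""
    by_id = {m.msg_id: m for m in messages}

    def depth(message: Message) -> int:
        # Walk up the reply chain; stop at the thread start, at a parent
        # that is missing from the thread, or if a cycle is detected.
        steps = 0
        seen = set()
        current = message
        while (current.in_reply_to is not None
               and current.in_reply_to in by_id
               and current.msg_id not in seen):
            seen.add(current.msg_id)
            current = by_id[current.in_reply_to]
            steps += 1
        return steps

    labels = {}
    for m in messages:
        d = depth(m)
        if d == 0:
            labels[m.msg_id] = "question"  # thread start (or missing parent: assumption)
        elif d == 1:
            labels[m.msg_id] = "answer"    # direct reply to the original email
        else:
            labels[m.msg_id] = "comment"   # reply to an answer; deeper replies too (assumption)
    return labels


thread = [
    Message("q1", None),
    Message("a1", "q1"),
    Message("c1", "a1"),
    Message("c2", "c1"),
]
labels = classify_thread(thread)
```

Because the labels depend only on reply depth, a reply that quotes the original question but was sent in response to an answer still counts as a comment, which is one way the threat to validity described above can arise.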
 
Metadata
Title
How the R community creates and curates knowledge: an extended study of stack overflow and mailing lists
Authors
Alexey Zagalsky
Daniel M. German
Margaret-Anne Storey
Carlos Gómez Teshima
Germán Poo-Caamaño
Publication date
18-08-2017
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 2/2018
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-017-9536-y
