An empirically-based characterization and quantification of information seeking through mailing lists during Open Source developers’ software evolution
Section snippets
Introduction and motivation
Software maintenance and evolution are large components of a successful software system’s lifecycle. The amount of software lifecycle effort consumed during this phase has been estimated to range between 60% and 80% of the entire lifecycle effort [1], [2], [3], [4]. While the empirical basis for such statements is dated and suggestions have been made that it should be revisited [4], the increasing scale and complexity of newer software systems [3], [5] implies that the effort invested in
OS governance
Information seeking in OS projects is contextualized by the governance model and working practices employed in those projects. The work presented in this paper focuses specifically on one aspect of governance: the use of information and tools [36]. This occurs in the context of two other aspects: the software development processes, and the community management that implicitly guides these processes.
The two main types of governance models are where a single individual governs a
Research objective
This research has two objectives. The first is to empirically derive a schema of information types sought by Open Source programmers through mailing lists, during post-deployment activities like maintenance and evolution. The second is to quantify the prevalent types of information sought through this medium and the response rates for those queries. This section discusses the empirical process used to derive the information type schema. For a fuller description of the derivation process and the
The information types schema
The information schema derived from the mailing lists is presented in Figs. 2 and 3. These Venn diagrams present a hierarchical representation of the schema, each diagram representing one of the two top-level categories revealed by axial coding: Information Focus and Information Aspect. Focus refers to the target entity about which information is sought, and Aspect refers to the type of information sought about that target entity, so each question has both a focus and an aspect. For example, in the case
The schema
The schema derived from the mailing lists can be partially aligned with existing schemas, most notably that of Erdem et al. [52]. Erdem et al. proposed that all information-seeking events could be classified by Topic, Question type and Relation type. The Topic was the entity referenced in the question, and the Question type consisted of Why, What, Where, When, How and Verification questions. Erdem et al. identified 9 different Relation types: Topic, Behavior, Structure, Function, Use, Goal,
Conclusion
This paper reports on the derivation of an Information Seeking schema for OS developers through a grounded analysis of 6 OS developer mailing lists, spanning 17 years of mail activity. The resultant schema is largely congruent with the findings of [11], [19], [74], [75] and closely echoes the schema proposed by Erdem et al. [52]. However, several of the categories differ, particularly with respect to Contextual Technology and Documentation.
The resultant schema was then applied to the dataset to
Acknowledgment
This work was supported, in part, by Science Foundation Ireland Grant 10/CE/I1855 to Lero – the Irish Software Engineering Research Centre (www.lero.ie).
References (92)
- et al. Governance practices and software maintenance: a study of open source projects. Decis. Support Syst. (2012)
- et al. Sustainability of open source software communities beyond a fork: how and why has the LibreOffice project evolved? J. Syst. Softw. (2014)
- Qualitative research methods: a review of major stages, data analysis techniques, and quality controls. Libr. Inform. Sci. Res. (1994)
- Cognitive processes in program comprehension. J. Syst. Softw. (1987)
- et al. Why Open Source software can succeed. Res. Policy (2003)
- et al. Program plan matching: experiments with a constraint-based approach. Sci. Comput. Program. (2000)
- et al. A framework and methodology for studying the causes of software errors in programming systems. J. Vis. Lang. Comput. (2005)
- et al. A systematic review of software fault prediction studies. Expert Syst. Appl. (2009)
- et al. Characteristics of application software maintenance. Commun. ACM (1978)
- A.V. Mayrhauser, A.M. Vans, From code understanding needs to reverse engineering tool capabilities, in: Sixth...
- Software Engineering
- Re-evaluating inheritance depth on the maintainability of object-oriented software. Int. J. Empirical Softw. Eng.
- A field study of the software design process for large systems. Commun. ACM
- Concepts of information seeking and their presence in the practical library literature. Libr. Philos. Pract.
- Prototyping a process monitoring experiment. IEEE Trans. Softw. Eng.
- Two case studies of Open Source software development: Apache and Mozilla. ACM Trans. Softw. Eng. Methodol.
- Just for Fun: The Story of an Accidental Revolutionary
- Understanding Open Source Software Development
- A critical look at open source. Computer
- The governance of free/open source software projects: monolithic, multidimensional, or configurational? J. Manage. Governance
- Forks impacts and motivations in free and open source projects. Int. J. Adv. Comput. Sci. Appl. (IJACSA)