Abstract
Code review is an important quality assurance activity in software development. Code review discussions among developers and maintainers can become heated and sometimes involve personal attacks and unnecessarily disrespectful comments, thereby demonstrating incivility. Although incivility in public discussions has received increasing attention from researchers in different domains, knowledge about the characteristics, causes, and consequences of uncivil communication is still very limited in the context of software development, and more specifically, code review. To address this gap in the literature, we leverage the mature social construct of incivility as a lens to understand confrontational conflicts in open source code review discussions. To that end, we conducted a qualitative analysis of 1,545 emails from the Linux Kernel Mailing List (LKML) that were associated with rejected changes. We found that more than half (66.66%) of the non-technical emails included uncivil features. In particular, frustration, name calling, and impatience are the most frequent features in uncivil emails. We also found that civil alternatives exist for addressing arguments, while uncivil comments can potentially be made by anyone when discussing any topic. Finally, we identified various causes and consequences of such uncivil communication. Our work serves as the first study of the phenomenon of (in)civility in open source software development, paving the road for a new field of research about collaboration and communication in the context of software engineering activities.
The "Shut the f**k up" Phenomenon: Characterizing Incivility in Open Source Code Review Discussions