Abstract
The software industry lacks gender diversity. Recent research has suggested that a toxic working culture is to blame. Studies have found that communications in software repositories directed towards women are more negative in general. In this study, we use a destructive criticism lens to examine gender differences in software code review feedback. Software code review is a practice in which code is peer reviewed and negative feedback is often delivered. We explore differences in perceptions, frequency, and impact of destructive criticism across genders. We surveyed 93 software practitioners, eliciting perceived reactions to hypothetical scenarios (or vignettes) in which participants were asked to imagine receiving either constructive or destructive criticism. In addition, the survey collected general opinions on feedback obtained during software code review, as well as how often participants give and receive destructive criticism.
We found that opinions on destructive criticism vary. Women perceive destructive criticism as less appropriate and are less motivated to continue working with the developer after receiving destructive criticism. Destructive criticism is fairly common, with more than half of respondents having received nonspecific negative feedback and nearly a quarter having received inconsiderate negative feedback in the past year. Our results suggest that destructive criticism in code review could be a contributing factor to the lack of gender diversity observed in the software industry.
Supplemental Material
This file provides instructions on how to read the questionnaire, which is provided separately in the replication package.
- Red text was not visible to participants.
- For each vignette-style question, the participant saw only one version of the feedback: either the constructive condition or the destructive condition.
The remaining elements of the questionnaire appear in this replication package as they were presented to the participants.
Index Terms
- Destructive Criticism in Software Code Review Impacts Inclusion