research-article · Open Access
DOI: 10.1145/3379597.3387453

Investigating Severity Thresholds for Test Smells

Published: 18 September 2020

ABSTRACT

Test smells are poor design decisions implemented in test code, which can impact the effectiveness and maintainability of unit tests. Even though test smell detection tools exist, how to rank the severity of the detected smells is an open research topic. In this work, we investigate severity ratings for four test smells and their perceived impact on test suite maintainability as judged by developers. To accomplish this, we first analyzed around 1,500 open-source projects to elicit severity thresholds for commonly found test smells. Then, we conducted a study with developers to evaluate our thresholds. We found that (1) current detection rules for certain test smells are considered too strict by developers and (2) our newly defined severity thresholds are in line with the participants' perception of how test smells impact the maintainability of a test suite. Preprint [https://doi.org/10.5281/zenodo.3744281], data and material [https://doi.org/10.5281/zenodo.3611111].
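The abstract does not spell out how the severity thresholds were derived from the ~1,500-project benchmark. As a rough illustration only, benchmark-based threshold derivation in the style of Alves et al. typically ranks the observed metric values (e.g., the number of assertions per test method, a proxy for the Assertion Roulette smell) and takes high quantiles as severity cut-offs; the quantile levels and metric below are illustrative assumptions, not the paper's actual parameters.

```python
def severity_thresholds(metric_values, quantiles=(0.70, 0.80, 0.90)):
    """Derive low/medium/high severity thresholds from a benchmark of
    per-test metric values. The quantile levels are illustrative."""
    ordered = sorted(metric_values)
    n = len(ordered)
    # Use the value at each quantile's rank position as the threshold.
    return tuple(ordered[min(int(q * n), n - 1)] for q in quantiles)

# Hypothetical benchmark: assertion counts observed across many tests.
benchmark = [1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 12, 20]
low, medium, high = severity_thresholds(benchmark)
```

A test whose metric falls below `low` would then be flagged as unsmelly, and values above each successive cut-off as increasingly severe; heavily skewed metric distributions are why quantiles, rather than means, are the usual choice for this kind of derivation.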

Published in

MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories, June 2020, 675 pages. ISBN: 9781450375177. DOI: 10.1145/3379597.

Publisher: Association for Computing Machinery, New York, NY, United States.

Copyright © 2020 ACM.
