ABSTRACT
Test smells are poor design decisions in test code that can reduce the effectiveness and maintainability of unit tests. Although test smell detection tools exist, how to rank the severity of the detected smells remains an open research topic. In this work, we investigate severity ratings for four test smells and how developers perceive their impact on test suite maintainability. To accomplish this, we first analyzed approximately 1,500 open-source projects to elicit severity thresholds for commonly found test smells. Then, we conducted a study with developers to evaluate our thresholds. We found that (1) developers consider the current detection rules for certain test smells too strict and (2) our newly defined severity thresholds are in line with the participants' perception of how test smells affect the maintainability of a test suite. Preprint: https://doi.org/10.5281/zenodo.3744281. Data and material: https://doi.org/10.5281/zenodo.3611111.
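The abstract does not say how the severity thresholds were elicited from the analyzed projects. As an illustration only, the sketch below shows one common benchmark-based approach: take the distribution of a smell metric across many test methods and use selected percentiles as cut-offs. The 70/80/90 percentile choice, the severity labels, the function names, and the example metric (assertions per test method) are all assumptions made for this sketch, not values or methods taken from the paper.

```python
from bisect import bisect_left
from statistics import quantiles


def derive_thresholds(metric_values, percentiles=(70, 80, 90)):
    """Return the metric value at each requested percentile.

    metric_values: one smell-metric measurement per test method,
    collected across a benchmark of projects (hypothetical input).
    percentiles: assumed cut points; the paper's actual choice is
    not stated in the abstract.
    """
    # quantiles(..., n=100) returns 99 cut points; index p - 1 is the
    # p-th percentile.
    cut_points = quantiles(metric_values, n=100, method="inclusive")
    return [cut_points[p - 1] for p in percentiles]


def severity(value, thresholds,
             labels=("low", "medium", "high", "very high")):
    """Map a metric value to a severity label (labels are hypothetical).

    A value moves to the next category only when it strictly exceeds
    a threshold.
    """
    return labels[bisect_left(thresholds, value)]


if __name__ == "__main__":
    # Made-up benchmark: assertions per test method in 101 test methods.
    benchmark = list(range(101))
    thresholds = derive_thresholds(benchmark)
    for assertions in (5, 75, 85, 95):
        print(assertions, severity(assertions, thresholds))
```

Percentile cut-offs keep the thresholds relative to what is typical in the benchmark, so a rule fires only for test methods that are unusual compared with the projects analyzed, rather than for any occurrence of the pattern.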
- Investigating Severity Thresholds for Test Smells