research-article · Open Access
DOI: 10.1145/3379597.3387453

Investigating Severity Thresholds for Test Smells

Published: 18 September 2020

ABSTRACT

Test smells are poor design decisions implemented in test code, which can impact the effectiveness and maintainability of unit tests. Even though test smell detection tools exist, how to rank the severity of the detected smells is an open research topic. In this work, we investigate severity ratings for four test smells and their perceived impact on test suite maintainability as judged by developers. To accomplish this, we first analyzed around 1,500 open-source projects to elicit severity thresholds for commonly found test smells. Then, we conducted a study with developers to evaluate our thresholds. We found that (1) current detection rules for certain test smells are considered too strict by developers and (2) our newly defined severity thresholds are in line with the participants' perception of how test smells impact the maintainability of a test suite. Preprint [https://doi.org/10.5281/zenodo.3744281], data and material [https://doi.org/10.5281/zenodo.3611111].
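The abstract does not spell out how the severity thresholds were derived from the ~1,500-project benchmark. As a rough illustration only, benchmark-based threshold derivation in the style of Alves et al. typically ranks the observed metric values (e.g., the number of assertions per test method, a proxy for the Assertion Roulette smell) and takes high quantiles as severity cut-offs; the quantile levels and metric below are illustrative assumptions, not the paper's actual parameters.

```python
def severity_thresholds(metric_values, quantiles=(0.70, 0.80, 0.90)):
    """Derive low/medium/high severity thresholds from a benchmark of
    per-test metric values. The quantile levels are illustrative."""
    ordered = sorted(metric_values)
    n = len(ordered)
    # Use the value at each quantile's rank position as the threshold.
    return tuple(ordered[min(int(q * n), n - 1)] for q in quantiles)

# Hypothetical benchmark: assertion counts observed across many tests.
benchmark = [1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 12, 20]
low, medium, high = severity_thresholds(benchmark)
```

A test whose metric falls below `low` would then be flagged as unsmelly, and values above each successive cut-off as increasingly severe; heavily skewed metric distributions are why quantiles, rather than means, are the usual choice for this kind of derivation.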

Published in

MSR '20: Proceedings of the 17th International Conference on Mining Software Repositories, June 2020, 675 pages. ISBN: 9781450375177. DOI: 10.1145/3379597.

Publisher: Association for Computing Machinery, New York, NY, United States.

Copyright © 2020 ACM.
