Published in: Empirical Software Engineering 6/2021

01-11-2021

Rotten green tests in Java, Pharo and Python

An empirical study

Authors: Vincent Aranega, Julien Delplanque, Matias Martinez, Andrew P. Black, Stéphane Ducasse, Anne Etien, Christopher Fuhrman, Guillermo Polito

Abstract

Rotten green tests are tests that pass, but not because the assertions they contain are true: a rotten test passes because some or all of its assertions are not actually executed. The presence of a rotten green test is a test smell, and a bad one, because the existence of a test gives us false confidence that the code under test is valid, when in fact that code may not have been tested at all. This article reports on an empirical evaluation of the tests in a corpus of projects found in the wild. We selected approximately one hundred mature projects written in each of Java, Pharo, and Python. We looked for rotten green tests in each project, taking into account test helper methods, inherited helpers, and trait composition. Previous work has shown the presence of rotten green tests in Pharo projects; the results reported here show that they are also present in Java and Python projects, and that they fall into similar categories. Furthermore, we found code bugs that were hidden by rotten tests in Pharo and Python. We also discuss two test smells, missed fail and missed skip, that arise from the misuse of testing frameworks and that we observed in tests written in all three languages.


Footnotes
6
A false negative would be generated by a call site that the analysis labelled “executed” but that was not actually executed.
 
10
A list of the assertion methods in unittest can be found at https://docs.python.org/3/library/unittest.html
 
11
PYPL (PopularitY of Programming Language Index, https://pypl.github.io/PYPL.html) was created by analyzing how often language tutorials are sought on Google. By this metric, Python and Java were the two top programming languages in March 2021.
 
14
Repository accessed September 2019, commit 91a404073acac40a7945bf7d584e8b30bc7a08cb
 
15
with commit 9ca80538d9e9418ae658772516f9b7dfb1e02ccd
 
Metadata
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 6/2021
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-021-10016-2
