Automatic generation of oracles for exceptional behaviors

Authors:
Alberto Goffi (University of Lugano, Switzerland)
Alessandra Gorla (IMDEA Software Institute, Spain)
Michael D. Ernst (University of Washington, USA)
Mauro Pezzè (University of Lugano, Switzerland)

ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis, July 2016, Pages 213–224, https://doi.org/10.1145/2931037.2931061

Published: 18 July 2016

ABSTRACT

Test suites should test exceptional behavior to detect faults in error-handling code. However, manually written test suites tend to neglect exceptional behavior. Automatically generated test suites, on the other hand, lack test oracles that verify whether runtime exceptions are the expected behavior of the code under test.

This paper proposes a technique that automatically creates test oracles for exceptional behaviors from Javadoc comments. The technique combines natural language processing with run-time instrumentation. Our implementation, Toradocu, can be combined with a test input generation tool. Our experimental evaluation shows that Toradocu improves the fault-finding effectiveness of EvoSuite and Randoop test suites by 8% and 16%, respectively, and reduces EvoSuite’s false positives by 33%.
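
To illustrate the kind of oracle the abstract describes, the following is a minimal Java sketch, not Toradocu's actual implementation. It assumes the natural language processing step has already turned a Javadoc `@throws` comment into a Boolean condition (translated by hand here), and it assumes one plausible verdict rule: the documented exception should be thrown exactly when that condition holds. The class `ExceptionOracleSketch`, the method `allocate`, and the helper `check` are hypothetical names introduced only for this example.

```java
import java.util.function.Consumer;
import java.util.function.Predicate;

public class ExceptionOracleSketch {

    /** Verdict of the oracle for a single test execution. */
    public enum Verdict { PASS, FAIL }

    /**
     * Hypothetical method under test.
     *
     * @param capacity the requested capacity
     * @throws IllegalArgumentException if capacity is negative
     */
    public static int[] allocate(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("capacity is negative");
        }
        return new int[capacity];
    }

    /**
     * Runs one call and compares the observed behavior with the documented one.
     * Assumed rule: the expected exception type is thrown if and only if the
     * condition derived from the comment holds for the input.
     */
    public static <T> Verdict check(Predicate<T> condition,
                                    Class<? extends Throwable> expected,
                                    Consumer<T> call,
                                    T input) {
        Throwable thrown = null;
        try {
            call.accept(input);
        } catch (Throwable t) {
            // Record whatever the call threw; a sketch, so nothing is rethrown.
            thrown = t;
        }
        // isInstance(null) is false, so "no exception" counts as "not thrown".
        boolean expectedThrown = expected.isInstance(thrown);
        return condition.test(input) == expectedThrown ? Verdict.PASS : Verdict.FAIL;
    }

    public static void main(String[] args) {
        // Hand-translated from the comment text "if capacity is negative".
        Predicate<Integer> negative = c -> c < 0;

        // Documented exception is thrown for a negative input.
        System.out.println(check(negative, IllegalArgumentException.class,
                ExceptionOracleSketch::allocate, -1)); // prints PASS

        // No exception for a valid input.
        System.out.println(check(negative, IllegalArgumentException.class,
                ExceptionOracleSketch::allocate, 3)); // prints PASS

        // A faulty implementation that silently accepts negative input.
        System.out.println(check(negative, IllegalArgumentException.class,
                c -> { }, -1)); // prints FAIL
    }
}
```

In this sketch, an exception that matches the documentation is treated as expected behavior rather than as a failure, which is the distinction that automatically generated tests cannot draw without an oracle.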

Index Terms

Software and its engineering → Software creation and management → Software verification and validation → Software defect analysis → Software testing and debugging

Published in
ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis
July 2016, 452 pages
ISBN: 9781450343909
DOI: 10.1145/2931037
General Chair: Andreas Zeller (Saarland University, Germany)
Program Chair: Abhik Roychoudhury (National University of Singapore, Singapore)
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Testing
automatic test oracle
oracle generation
oracle problem
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 58 of 213 submissions, 27%