Research article · DOI: 10.1145/2931037.2931061

Automatic generation of oracles for exceptional behaviors

Published: 18 July 2016

ABSTRACT

Test suites should exercise exceptional behavior to detect faults in error-handling code. However, manually written test suites tend to neglect exceptional behavior. Automatically generated test suites, on the other hand, lack test oracles that verify whether runtime exceptions are the expected behavior of the code under test.

This paper proposes a technique that automatically creates test oracles for exceptional behaviors from Javadoc comments. The technique uses a combination of natural language processing and run-time instrumentation. Our implementation, Toradocu, can be combined with a test input generation tool. Our experimental evaluation shows that Toradocu improves the fault-finding effectiveness of EvoSuite and Randoop test suites by 8% and 16%, respectively, and reduces EvoSuite's false positives by 33%.
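To illustrate the idea, the sketch below shows a hypothetical class whose Javadoc `@throws` clause states an exceptional behavior, and the shape of an oracle that treats the documented exception as a pass condition rather than a crash. The `Accounts.withdraw` method and the oracle code are illustrative assumptions, not Toradocu's actual output.

```java
// Hypothetical class under test; the Javadoc @throws clause is the kind of
// specification a tool like Toradocu would parse.
class Accounts {
    /**
     * Withdraws the given amount from the balance.
     *
     * @throws IllegalArgumentException if amount is negative
     */
    static int withdraw(int balance, int amount) {
        if (amount < 0) throw new IllegalArgumentException("negative amount");
        return balance - amount;
    }
}

public class OracleSketch {
    public static void main(String[] args) {
        // Exceptional-behavior oracle, sketched by hand: because the comment
        // documents the exception, throwing it on a negative amount is the
        // expected behavior, not a failure.
        boolean oracleSatisfied;
        try {
            Accounts.withdraw(100, -5);
            oracleSatisfied = false; // documented exception missing: a fault
        } catch (IllegalArgumentException e) {
            oracleSatisfied = true;  // documented behavior observed: pass
        }
        System.out.println(oracleSatisfied); // prints "true"
    }
}
```

Without such an oracle, an input generator that hits this path must either discard the exception or report it as a suspected bug; the `@throws` clause resolves the ambiguity.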


Published in:
ISSTA 2016: Proceedings of the 25th International Symposium on Software Testing and Analysis
July 2016, 452 pages
ISBN: 9781450343909
DOI: 10.1145/2931037
Copyright © 2016 ACM
Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 58 of 213 submissions (27%)
