ABSTRACT
In previous decades, researchers have explored the formal foundations of program testing. By studying the foundations of testing largely independently of any specific testing method, these researchers provided a general account of the testing process, including its goals, underlying problems, and limitations. Unfortunately, no common, rigorous foundation has been widely adopted in empirical software testing research, making it difficult to generalize and compare empirical results.
We continue this foundational work, providing a framework intended to serve as a guide for future discussions and empirical studies concerning software testing. Specifically, we extend Gourlay's functional description of testing with the notion of a test oracle, an aspect of testing largely overlooked in previous foundational work and only lightly explored in general. We argue that additional work exploring the interrelationship between programs, tests, and oracles is needed, and we use our extension to clarify concepts presented in previous work, to introduce new concepts related to test oracles, and to demonstrate that oracle selection must be considered when discussing the efficacy of a testing process.
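To make the central notion concrete, the following is a minimal, illustrative sketch (not taken from the paper): an oracle is modeled as a predicate over a test input and the observed output, and two hypothetical oracles of different strength are applied to the same faulty program, illustrating the abstract's claim that oracle selection affects the efficacy of a testing process.

```python
from collections import Counter

def buggy_sort(xs):
    # Hypothetical faulty program under test: sorts, but drops duplicates.
    return sorted(set(xs))

def weak_oracle(test_input, observed):
    # Weak oracle: only checks that the observed output is ordered.
    return all(a <= b for a, b in zip(observed, observed[1:]))

def strong_oracle(test_input, observed):
    # Stronger oracle: the output must be a sorted permutation of the input.
    return weak_oracle(test_input, observed) and \
        Counter(observed) == Counter(test_input)

# The same test input exposes the fault only under the stronger oracle.
test_input = [2, 1, 2]
observed = buggy_sort(test_input)            # duplicate is lost
weak_verdict = weak_oracle(test_input, observed)      # passes: fault missed
strong_verdict = strong_oracle(test_input, observed)  # fails: fault detected
```

Under this (assumed) formulation, a test's verdict is a function of the program, the test input, and the chosen oracle together, which is why comparing testing techniques without fixing the oracle can be misleading.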
- Z. Al-Khanjari, M. Woodward, and H. Ramadhan. Critical analysis of the PIE testability technique. Software Quality Journal, 10(4):331--354, 2002.
- A. Avizienis, J.-C. Laprie, B. Randell, and C. Landwehr. Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. on Dependable and Secure Computing, 1(1):11--33, 2004.
- B. Beizer. Software Testing Techniques, 2nd Edition. Van Nostrand Reinhold, New York, 1990.
- G. Bernot. Testing against formal specifications: A theoretical view. In TAPSOFT'91: Colloquium on Trees in Algebra and Programming (CAAP'91), page 99. Springer, 1991.
- G. Bernot, M. Gaudel, B. Marre, and U. Liens. Software testing based on formal specifications: a theory and a tool. Software Engineering Journal, 6(6):387--405, 1991.
- A. Bertolino. Software testing research: Achievements, challenges, dreams. In L. Briand and A. Wolf, editors, Future of Software Engineering 2007. IEEE-CS Press, 2007.
- L. Briand. A critical analysis of empirical research in software testing. In Proc. of the First Int'l Symposium on Empirical Software Engineering and Measurement (ESEM 2007), pages 1--8, 2007.
- L. Briand, M. DiPenta, and Y. Labiche. Assessing and improving state-based class testing: A series of experiments. IEEE Trans. on Software Engineering, 30(11), 2004.
- T. Budd and D. Angluin. Two notions of correctness and their relation to testing. Acta Informatica, 18(1):31--45, 1982.
- W. Chen, R. Untch, G. Rothermel, S. Elbaum, and J. Von Ronne. Can fault-exposure-potential estimates improve the fault detection abilities of test suites? Software Testing, Verification and Reliability, 12(4):197--218, 2002.
- R. DeMillo, R. Lipton, and F. Sayward. Hints on test data selection: Help for the practicing programmer. IEEE Computer, 11(4):34--41, 1978.
- P. Frankl and E. Weyuker. A formal analysis of the fault-detecting ability of testing methods. IEEE Trans. on Software Engineering, 1993.
- A. Gargantini and C. Heitmeyer. Using model checking to generate tests from requirements specifications. Software Engineering Notes, 24(6):146--162, November 1999.
- M. Gaudel. Testing can be formal, too. Lecture Notes in Computer Science, 915:82--96, 1995.
- M. Geller. Test data as an aid in proving program correctness. In Proc. of the 3rd ACM SIGACT-SIGPLAN Symp. on Principles of Programming Languages, pages 209--218. ACM, 1976.
- J. B. Goodenough and S. L. Gerhart. Toward a theory of testing: Data selection criteria. In R. T. Yeh, editor, Current Trends in Programming Methodology. Prentice Hall, 1979.
- J. Gourlay. A mathematical framework for the investigation of testing. IEEE Trans. on Software Engineering, pages 686--709, 1983.
- D. Hamlet. Foundations of software testing: dependability theory. ACM SIGSOFT Software Engineering Notes, 19(5):128--139, 1994.
- R. Hierons. Comparing test sets and criteria in the presence of test hypotheses and fault domains. ACM Transactions on Software Engineering and Methodology (TOSEM), 11(4):448, 2002.
- W. Howden. Reliability of the path analysis testing strategy. IEEE Trans. on Software Engineering, 2(3), 1976.
- W. Howden. Weak mutation testing and completeness of test sets. IEEE Trans. on Software Engineering, pages 371--379, 1982.
- M. Hutchins, H. Foster, T. Goradia, and T. Ostrand. Experiments on the effectiveness of dataflow- and controlflow-based test adequacy criteria. In Proc. of the 16th Int'l Conference on Software Engineering, pages 191--200. IEEE Computer Society Press, 1994.
- J. Mayer and R. Guderlei. Test oracles using statistical methods. In Proc. of the First Int'l Workshop on Software Quality, pages 179--189, 2004.
- A. Memon, I. Banerjee, and A. Nagarajan. What test oracle should I use for effective GUI testing? In Proc. of the 18th IEEE Int'l Conf. on Automated Software Engineering, pages 164--173, 2003.
- L. Morell. A theory of fault-based testing. IEEE Trans. on Software Engineering, 16(8):844--857, 1990.
- A. Parrish and S. Zweben. Analysis and refinement of software test data adequacy properties. IEEE Trans. on Software Engineering, 17(6):565--581, 1991.
- A. Parrish and S. Zweben. Clarifying some fundamental concepts in software testing. IEEE Trans. on Software Engineering, 19(7):742--746, 1993.
- A. Rajan, M. Whalen, and M. Heimdahl. The effect of program and model structure on MC/DC test adequacy coverage. In Proc. of the 30th Int'l Conference on Software Engineering, pages 161--170. ACM, 2008.
- S. Rayadurgam and M. P. Heimdahl. Coverage based test-case generation using model checkers. In Proc. of the 8th IEEE Int'l Conf. and Workshop on the Engineering of Computer Based Systems, pages 83--91. IEEE Computer Society, April 2001.
- D. J. Richardson, S. L. Aha, and T. O'Malley. Specification-based test oracles for reactive systems. In Proc. of the 14th Int'l Conference on Software Engineering, pages 105--118, May 1992.
- G. Rothermel, M. J. Harrold, J. Ostrin, and C. Hong. An empirical study of the effects of minimization on the fault detection capabilities of test suites. In Proc. of the Int'l Conference on Software Maintenance, pages 34--43, November 1998.
- M. Staats, M. Whalen, and M. Heimdahl. Better testing through oracle selection (NIER track). In Proc. of the Int'l Conference on Software Engineering, 2011.
- J. Voas. Dynamic testing complexity metric. Software Quality Journal, 1(2):101--114, 1992.
- J. Voas. PIE: A dynamic failure-based technique. IEEE Trans. on Software Engineering, 18(8):717--727, 1992.
- J. Voas and K. Miller. Putting assertions in their place. In Proc. of the 5th Int'l Symposium on Software Reliability Engineering, pages 152--157, 1994.
- S. Weiss. Comparing test data adequacy criteria. ACM SIGSOFT Software Engineering Notes, 14(6):42--49, 1989.
- E. Weyuker. On testing non-testable programs. The Computer Journal, 25(4):465, 1982.
- E. Weyuker. Axiomatizing software test data adequacy. IEEE Trans. on Software Engineering, 12(12):1128--1138, 1986.
- E. Weyuker. The evaluation of program-based software test data adequacy criteria. Communications of the ACM, 31(6):668--675, 1988.
- E. Weyuker and T. Ostrand. Theories of program testing and the application of revealing subdomains. IEEE Trans. on Software Engineering, pages 236--246, 1980.
- E. Weyuker, S. Weiss, and D. Hamlet. Comparison of program testing strategies. In Proc. of the Symposium on Testing, Analysis, and Verification, page 10. ACM, 1991.
- W. Wong, J. Horgan, S. London, and A. Mathur. Effect of test set minimization on fault detection effectiveness. In Proc. of the 17th Int'l Conference on Software Engineering, pages 41--50. ACM, 1995.
- M. Woodward and K. Halewood. From weak to strong, dead or alive? An analysis of some mutation testing issues. In Proc. of the 2nd Workshop on Software Testing, Verification, and Analysis, pages 152--158, 1988.
- H. Zhu. Axiomatic assessment of control flow-based software test adequacy criteria. Software Engineering Journal, 10(5):194--204, 1995.
- H. Zhu and P. Hall. Test data adequacy measurement. Software Engineering Journal, 8(1):21--29, 1993.
- H. Zhu, P. Hall, and J. R. May. Software unit test coverage and adequacy. ACM Computing Surveys, 29(4):366--427, December 1997.
- H. Zhu and X. He. A theory of behaviour observation in software testing. Technical report, 1999.
Programs, tests, and oracles: the foundations of testing revisited