Empirical comparison of regression test selection algorithms

https://doi.org/10.1016/S0164-1212(00)00119-9Get rights and content

Abstract

In the maintenance phase, the regression test selection problem refers to selecting test cases from the initial suite of test cases used in the development phase. In this paper, we empirically compare five representative regression test selection algorithms, which include: Simulated Annealing, Reduction, Slicing, Dataflow, and Firewall algorithms. The comparison is based on eight quantitative and qualitative criteria. These criteria are: number of selected test cases, execution time, precision, inclusiveness, preprocessing requirements, type of maintenance, level of testing, and type of approach. The empirical results show that the five algorithms can be used for different requirements of regression testing. For example the Simulated Annealing algorithm can be used for emergency non-safety-critical maintenance situations with a large number of small modifications.

Introduction

Software maintenance involves modifying programs as a result of errors, or changes in the user requirements (Bennett, 1991). During such modifications, new errors may be introduced, causing unintended adverse side effects in the software. Regression testing aims to provide confidence that the modifications are correct and have not adversely affected other parts of the program. Regression testing can be progressive or corrective (Hartmann and Robson, 1988). The former involves retesting major changes to the program's specifications. The latter is performed on specifications that essentially remain unchanged, so that only minor modifications, which do not affect the overall program structure, require retesting.

For regression testing, it might be costly to repeat the whole set of test cases used in the initial development of the program and unreliable to choose a random subset of these test cases. Therefore, it may be useful to select a suitable subset of test cases that accomplishes the objectives of regression testing. This subproblem of regression testing is referred to as the regression test selection problem.

The selection of suitable test cases can be made in different ways, and a number of regression testing approaches and algorithms have been proposed. A strategy based on input partitioning and cause-effect graphing of the program specification has been described in Yau and Kishimoto (1987). Leung and White (1989) have suggested classifying the initial test cases as reusable, retestable, obsolete, and adding new-structural and new-specification test cases. Then test cases can be selected from some of or all five classes. They have also extended this approach to cover programs with global variables (Leung and White, 1990b) and proposed methods to limit integration regression testing to a small set of modules based on the “firewall” concept (Leung and White, 1990a, Leung and White, 1992). An analogous concept of class firewall has been proposed for regression testing object-oriented software systems (Hsia et al., 1997).

A slicing algorithm presented in Gallagher and Lyle (1988) decomposes a program and selects the test cases that detect “linkage” between the modified and unmodified code. Another slicing strategy, which is based on data flow testing and incremental data flow analysis, has been described in Harrold and Soffa (1988) and Gupta et al. (1996). A safe algorithm, which uses a module's dependence graph and selects test cases that will cause the modified module to produce different output than the original module, has been proposed in Rothermel and Harrold, 1997, Rothermel and Harrold, 1998. An incremental slicing algorithm has been proposed in Agrawal et al. (1993) which selects test cases whose outputs may be affected by the modifications made.

Path expressions are used in Benedusi et al. (1988) to represent the program and then algebraic operations are used to determine the paths affected by the modifications. This facilitates the selection of the initial test cases needed for retesting. A similar approach based on semantic differencing has been proposed in Binkley (1997). Also, textual differencing has been used in Vokolos and Frankl (1998). An optimal regression testing approach based on a 0–1 integer programming problem formulation has been proposed by Fischer et al. (1981) and extended in Hartmann and Robson (1990) and solved by natural optimization algorithms in Mansour and El-Fakih (1999). Further, some software tools that are based on regression testing techniques have been constructed (Chen et al., 1994; Vokolos and Frankl, 1997; White et al., 1993; von Mayrhauser and Zhang, 1999).

Despite this amount of research on regression testing algorithms, insufficient work has been done on effectiveness evaluation and comparison of these algorithms. A simple cost model has been suggested early in Leung and White (1991). Another coverage-based cost-effectiveness prediction model has been proposed in Rosenblum and Weyuker (1997). A theoretical analysis framework has been suggested in Rothermel and Harrold (1996). In particular, empirical work has been insufficient. We have reported some initial experimental comparative results in Baradhi and Mansour (1997). Wong et al. (1998) directly examines the costs and benefits of only minimizing the test suite size. Rothermel et al. (1998) compares the costs and benefits of minimizing the test suite to show that the fault-detection capabilities of the selected subset can be compromised by minimization. Graves et al. (1998) empirically compare three algorithms with random and retest-all techniques. The three algorithms are: edge-coverage-adequate minimization, safe, and dataflow-coverage-based. The comparison focuses on the abilities to reduce regression testing effort and uncover faults in modified programs.

In this paper, we describe a comparison framework and present an empirical comparative study of five representative regression test selection algorithms, which include: Simulated Annealing, Reduction, Slicing, Dataflow, and Firewall algorithms. The comparison framework is composed of the following eight criteria: the number of test cases selected for regression testing, execution time, precision, inclusiveness, preprocessing requirements, type of maintenance, level of testing, and type of approach. This study shows that the five algorithms have diverse properties for different criteria. This enables software engineers and project managers to make appropriate decisions and select the regression testing algorithm that best fits the requirements of their application and development process.

This paper is organized as follows. Section 2 describes the regression test selection problem and the program models used. Section 3 gives a brief description of the five implemented algorithms. Section 4 describes the comparison criteria. Section 5 presents and briefly discusses the experimental results. Section 6 summarizes these results and discusses the limitations of the empirical study. Section 7 contains conclusions.

Section snippets

Program modeling and regression test selection problem

We assume that a program is modeled by a control flow graph with M nodes. Each node represents a program segment, which corresponds to a control statement or to a contiguous sequence of assignment statements. The graph edges represent control flow. Also, the data flow information is kept for relevant nodes, which includes the definitions and uses of the variables in the program statements within segments. Uses are classified as either computation uses (c-uses) or predicate uses (p-uses)

Implemented algorithms

In this section, we briefly present the five algorithms that we have implemented for our comparative study.

Comparison criteria

In this section, we present eight criteria for evaluating and comparing the five regression test selection algorithms. These criteria are quantitative and qualitative. The quantitative criteria are:

(i) Number of test cases in R selected by an algorithm, from T, to be rerun for regression testing, which is denoted as #R. This number will be represented as a percentage value with respect to N, the cardinality of T.

(ii) Execution time of the program that implements a regression test selection

Empirical results and discussion

In this section, we experimentally and qualitatively compare the properties of the five regression test selection algorithms using the eight criteria. For the four quantitative criteria, the experiments have been done on a PC running a Pentium II 233 MHz CPU.

Fifteen subject program modules (M1–M15) are used, for which the flowgraphs and the initial suite of test cases, T, have been manually generated. We assume that the flowgraphs and the associated information are generated off-line so that

Summary of the results and limitations

Table 7 summarizes the comparison results for the eight quantitative and qualitative criteria. The terms used to describe the properties of the algorithms, such as slow, fast, high, etc…, are based on comparing the five algorithms with each other. However, we note the following threats and weaknesses in this empirical study:

(i) we are doing an assessment using only a limited number of test modules;

(ii) the subject modules used are not random and are small- to medium-size modules;

(iii) the test

Conclusions and further work

Table 7 shows that different algorithms have different favorable properties. Thus, choosing an algorithm from the five algorithms is dependent on the regression tester's particular requirements and the system under consideration. However, we make the following recommendations:

(i) SA for emergency non-safety-critical situations, where we have a large number of small modifications and that maintenance has to be done rapidly. SA is also recommended where initial test cases have different

Acknowledgements

We thank the anonymous referees for their suggestions, which improved the paper significantly. We also thank Nidal Araby for implementing the reduction algorithm.

Nashat Mansour is an Associate Professor of Computer Science at the Lebanese American University, Lebanon. He received his BE and MS degrees in Electrical Engineering from the University of New South Wales, Australia, and MS in Computer Engineering and Ph.D. in Computer Science from Syracuse University, New York. His research interests include software testing and natural computation algorithms. He is a member of IEEE-CS and is on the executive committee of the Lebanese Association for the

References (33)

  • K.H Bennett

    The software maintenance of large software systems: management, methods and tools

    Reliability Engineering and Systems Safety

    (1991)
  • Agrawal, H., Horgan, J.R., Krauser, E.W., 1993. Incremental regression testing. In: Proceedings of the Conference on...
  • Baradhi, G., Mansour, N., 1997. A comparative study of five regression testing algorithms. In: Proceedings of the...
  • B Beizer

    Software Testing Techniques

    (1990)
  • Benedusi, P., Cimitile, A., De Carlini, U., 1988. Post-maintenance testing based on path change analysis. In:...
  • D Binkley

    Semantics guided regression test cost reduction

    IEEE Trans. Software Eng.

    (1997)
  • Chen, Y., Rosenblum, D.S., Vo, K., 1994. Testtube: a system for selective regression testing. In: Proceedings of the...
  • Fischer, K., Raji, F., Chruscicki, A., 1981. A methodology for retesting modified software. In: Proceedings of the...
  • Gallagher, K.B., Lyle, J.R., 1988. Using program decomposition to guide modifications. In: Proceedings of the...
  • Graves, T.L., Harrold, M.J., Kim, J.M., Porter, A., Rothermel, G., 1998. An empirical study of regression test...
  • R Gupta et al.

    Program slicing-based regression testing techniques

    Software Testing, Verification and Reliability

    (1996)
  • Harrold, M.J., Gupta, R., Soffa, M.L., 1993. A methodology for controlling the size of a test suite. ACM Trans....
  • Harrold, M.J., Soffa, M.L., 1988. An incremental approach to unit testing during maintenance. In: Proceedings of the...
  • Hartmann, J., Robson, D.J., 1988. Approaches to regression testing. In: Proceedings of the Conference on Software...
  • Hartmann, J., Robson, D.J., 1990. Techniques for selective revalidation. IEEE Software, January,...
  • P Hsia et al.

    A technique for the selective revalidation of OO software

    J. Software Maintenance

    (1997)
  • Cited by (31)

    • A systematic review on search based mutation testing

      2017, Information and Software Technology
      Citation Excerpt :

      According to the authors, if the trend continues, over 1700 SBST papers will have been published before the end of this decade. Some issues addressed by SBST are test data generation [7–11], test case selection [12–15], test case prioritization [14,16–19], functional testing [20] and non-functional testing [21–23]. Mutation testing is the most widely known criterion of the fault injection testing technique [24].

    • Recovering test-to-code traceability using slicing and textual analysis

      2014, Journal of Systems and Software
      Citation Excerpt :

      Specifically, running some test suites can take hours, even days, so developers cannot exercise the system instantly or in reasonable time (Runeson, 2006). Past research has proposed several test case selection methods aimed at reducing the cost of regression testing (Engström et al., 2010; Mansour et al., 2001; Graves et al., 2001; Binkley, 1997; Hurdugaci and Zaidman, 2012). However, in industry there is still no systematic approach for test case selection where test case selection often relies on the developers expertise and judgment (Runeson, 2006; Causevic et al., 2010).

    • A systematic review on regression test selection techniques

      2010, Information and Software Technology
      Citation Excerpt :

      This review is focused on test selection techniques for regression testing. Although techniques for regression test selection have been evaluated in previous work [3,15,36,65], no general solution has been put forward since no technique could possibly respond adequately to the complexity of the problem and the great diversity in requirements and preconditions in software systems and development organizations. Neither does any single study evaluate every aspect of the problem; e.g. Kim et al. [27] evaluate the effects of regression test application frequency, Elbaum et al. [11] investigate the impact that different modifications have on regression test selection techniques, several studies examine the ability to reduce regression testing effort [3,11,15,27,36,65,66] and to reveal faults [11,15,27,49].

    • Reduction-based methods and metrics for selective regression testing

      2002, Information and Software Technology
      Citation Excerpt :

      In this section, we use the proposed STM and RTM metrics to evaluate the test-selection adequacy for two coverage-based regression test-selection algorithms on the 60 test problems. The algorithms are: Gupta et al.'s DF algorithms [13], and Leung and White's SFW algorithm [1,7]. DF and SFW define all the definition-use pairs that may be affected by the modification and select test cases accordingly.

    • Software testing

      2019, Handbook of Software Engineering
    View all citing articles on Scopus

    Nashat Mansour is an Associate Professor of Computer Science at the Lebanese American University, Lebanon. He received his BE and MS degrees in Electrical Engineering from the University of New South Wales, Australia, and MS in Computer Engineering and Ph.D. in Computer Science from Syracuse University, New York. His research interests include software testing and natural computation algorithms. He is a member of IEEE-CS and is on the executive committee of the Lebanese Association for the Advancement of Science.

    Rami Bahsoon received his BS and MS degrees in Computer Science from the Lebanese American University in 1997 and 2000, respectively. He is presently pursuing a Ph.D. in Computer Science at the University College of London. His research interest is in software architecture and testing methods.

    Ghinwa Baradhi received her BS and MS degrees in Computer Science from the Lebanese American University in 1994 and 1997, respectively. Currently, she is a project manager at Emirates Bank.

    View full text