An enhanced response surface methodology (RSM) algorithm using gradient deflection and second-order search strategies

https://doi.org/10.1016/S0305-0548(98)00014-8

Abstract

In the optimization of complex systems where at best one can only evaluate the function being optimized, often via a simulation experimental process, Response Surface Methodology (RSM) is frequently the method of choice. However, RSM invariably employs a first-order steepest descent method, presumably because of the prohibitive cost of employing second-order designs, except for the canonical analysis that is conducted at termination. In this paper, we suggest the use of deflected/conjugate gradient approaches that can be gainfully employed to improve the performance of RSM. These methods involve the same effort per iteration and require only first-order designs. However, instead of discarding previously generated information, the method utilizes these computations to advantageously build upon accumulated knowledge regarding the second-order curvature of the function being optimized, hence accelerating convergence by circumventing the characteristic zigzagging of steepest descent methods. The proposed scheme is illustrated using various test examples from the literature for which optimal solutions are known and that serve the purpose of exhibiting the potential of the recommended methodology.

This paper attempts to improve the search techniques currently used in standard Response Surface Methodology (RSM) algorithms. RSM is a collection of mathematical and statistical techniques for experimental optimization. This work presents a novel RSM algorithm that incorporates certain gradient deflection methods, augmented with appropriate restarting criteria, as opposed to using the path of steepest descent as the only search direction. In order to investigate the new RSM algorithm in comparison with existing standard RSM techniques, a set of standard test functions is used, both with and without random perturbations. Computational results exhibit the improvements achieved under the proposed algorithm.

Introduction

The focus of this paper is on improving the simulation-optimization technique known as response surface methodology (RSM). First proposed by Box and Wilson[3] in the context of optimization problems concerned with chemical process engineering, RSM is a collection of mathematical and statistical techniques for experimental optimization. The search procedures currently employed by RSM algorithms use the method of steepest descent only. This paper develops a novel RSM algorithm that incorporates certain gradient deflection methods, augmented with suitable restarting criteria, as opposed to using the path of steepest descent as the only search direction. In order to empirically investigate the use of the new RSM algorithm in contrast with the existing one, a set of standard test functions from the literature is used. Computational results exhibit the improvements achieved under the proposed algorithm, both with and without random perturbations on functional evaluations.

This paper is organized as follows. The remainder of this section presents an overview of RSM along with a statement of a standard RSM algorithm, and discusses various competitive gradient deflection techniques that we will be using in our approach. Section 2 develops the new RSM algorithm. Empirical studies are described in Section 3, and Section 4 presents conclusions and recommendations for future research.

Throughout this paper, it is assumed that the problem at hand is an unconstrained minimization problem, and that the functional form of the objective function is unknown. Typically, to start the RSM procedure, an experiment is designed in a small sub-region of the factor space, and a low degree polynomial (usually first-order) is used to represent the data obtained from the responses. This helps the practitioner deduce which region should be explored next, by conducting a search along the estimated direction of steepest descent. If the goal is to minimize the response, this method aims at climbing down the response surface rather than exploring the whole region, and its success depends on the assumption that the ultimate minimum can be reached via such a path of descent (see p. 503 of Davies[4]). That is, as the search moves down the surface (decreasing response), it is assumed that at some point, a curvature will become evident, after which the response will no longer improve or will start increasing. At this point another experiment is designed to detect curvature and determine effects up to second-order. A canonical analysis is then performed to locate the stationary point of the fitted surface, and to ascertain whether any further exploration needs to be conducted. For example, if the stationary point is found to be a saddle point, then a ridge analysis is performed, whereby the experimenter formulates a k-variable response surface problem in two dimensions and collects observations along the falling ridge. If the stationary point is found to be a minimum, then the region around this point is explored a bit further via some discrete step sizes and the incumbent solution is reported.

If the experimenter has a prior notion of the general vicinity of the optimum, then Myers[7] gives a stepwise procedure that can be generally described as follows (see p. 88 of[7]); a schematic code sketch of this loop is given after Step 4 below.

Step 1: The experimenter fits a first-order response model in some restricted region of the variables x1, x2, …, xk.

Step 2: The information from Step 1 is used to locate a path of steepest descent.

Step 3: A series of experiments is conducted along the path until no additional decrease in response is evident.

Step 4: Steps 1, 2, and 3 are repeated over the region about the current incumbent solution. However, if curvature is evident, and the experimenter is satisfied that little or no additional information can be obtained from the ongoing searches, a more elaborate experiment and analysis is conducted via the use of a second-order design.
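To make the flow of Steps 1 through 4 concrete, the following Python sketch outlines one way the loop could be organized. It is purely illustrative: the callables simulate (the black-box response), estimate_slopes (the design-and-regression work of Step 1, elaborated in the next sketch), and curvature_evident (a lack-of-fit style check) are hypothetical placeholders to be supplied by the experimenter, not constructs from the original paper.

```python
import numpy as np

def standard_rsm(simulate, estimate_slopes, curvature_evident, x0,
                 step=0.5, max_phases=20):
    """Skeleton of the standard first-order RSM loop (Steps 1-4).

    simulate          -- black-box (possibly noisy) response to be minimized
    estimate_slopes   -- returns slope estimates (b1, ..., bk) around a center point
    curvature_evident -- returns True when a second-order design is warranted
    """
    x = np.asarray(x0, dtype=float)
    y_best = simulate(x)
    for _ in range(max_phases):
        b = np.asarray(estimate_slopes(simulate, x), dtype=float)  # Step 1: local linear fit
        norm_b = np.linalg.norm(b)
        if norm_b == 0.0:                                          # flat local fit: stop
            break
        d = -b / norm_b                                            # Step 2: path of steepest descent
        while True:                                                # Step 3: fixed steps along d
            x_trial = x + step * d
            y_trial = simulate(x_trial)
            if y_trial >= y_best:                                  # no further decrease in response
                break
            x, y_best = x_trial, y_trial
        if curvature_evident(simulate, x):                         # Step 4: switch to a second-order design
            break
    return x, y_best
```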

In Step 1, typically, replications of a factorial or fractional factorial experiment are performed. The unknown parameters of the fitted linear model are then computed. A judicious selection of the experimental design having desirable properties such as minimum variance is needed in order to obtain better estimates of the unknown parameters. The first-order linear model is represented as

y_ij = β_0 + Σ_{l=1}^{k} β_l x_l + ε_ij,    (1)

where y_ij is the response at the jth replication and the ith design point, x_1, x_2, …, x_k are the k factor variables, β_l, for l=1, …, k, are the unknown parameters of the linear model, and ε_ij is the error term at the ith design point and the jth replication (i=1, 2, …, m; j=1, 2, …, r; l=1, 2, …, k).
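As one concrete illustration of such a Step 1 fit, the sketch below estimates the coefficients of Eq. (1) by ordinary least squares over a replicated 2^k factorial in coded ±1 units around the current center. The simulate callable and the radius and replications arguments are assumptions made for illustration, not prescriptions from the paper.

```python
import itertools
import numpy as np

def fit_first_order(simulate, center, radius=1.0, replications=2):
    """Least-squares estimates (b0, b1, ..., bk) of the first-order model (1)
    from a replicated 2^k factorial centered at `center`.

    Design points are coded +/-1 in each factor and scaled by `radius`
    into natural units before calling the black-box response `simulate`.
    """
    center = np.asarray(center, dtype=float)
    k = center.size
    coded_points = np.array(list(itertools.product((-1.0, 1.0), repeat=k)))
    rows, responses = [], []
    for point in coded_points:
        natural_x = center + radius * point
        for _ in range(replications):
            rows.append(np.concatenate(([1.0], point)))    # intercept column plus coded factors
            responses.append(simulate(natural_x))
    X = np.array(rows)
    y = np.array(responses)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)         # minimizes ||y - X beta||^2
    return coeffs[0], coeffs[1:]                           # b0 and (b1, ..., bk)
```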

If a full 2^k factorial experiment is used at Step 1, then estimates of the unknown parameters β_l in Eq. (1) can be obtained. Denote these estimates by b_l (for l=1, 2, …, k), respectively. These estimates are then used in Step 2 to locate the path of steepest descent, which is given by the vector d = (−b_1, −b_2, …, −b_k). Responses are observed along this path using some prescribed search strategy until there is no improvement. In the standard approach, a simple sequence of fixed steps along the direction d is explored. If curvature is evident, then a second-order model is derived as

y_ij = β_0 + Σ_{l=1}^{k} β_l x_l + Σ_{h>l} β_hl x_h x_l + Σ_{l=1}^{k} β_ll x_l² + ε_ij.    (2)

Here, β_l, β_hl, and β_ll are the unknown parameters of the second-order model (l=1, 2, …, k; h>l), and the remaining terms are as defined for Eq. (1). This model is used to evaluate a stationary point of the response surface. A canonical analysis is performed that involves, among other things, locating the stationary point of the system, and determining the nature of this point. Using this analysis, an optimum is reported, perhaps following some additional investigation in the case of a detected ridge system.
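The canonical analysis itself amounts to linear algebra on the fitted second-order coefficients. The sketch below adopts the common convention of collecting the quadratic estimates in a symmetric matrix B (b_ll on the diagonal, b_hl/2 off it), locates the stationary point of the fitted surface, and classifies it by the signs of the eigenvalues of B; this is a schematic of the computation, not the authors' implementation.

```python
import numpy as np

def canonical_analysis(b0, b, B):
    """Stationary point and nature of the fitted surface y = b0 + b'x + x'Bx.

    b is the vector of first-order estimates and B the symmetric matrix of
    second-order estimates (b_ll on the diagonal, b_hl / 2 off the diagonal).
    """
    b = np.asarray(b, dtype=float)
    B = np.asarray(B, dtype=float)
    x_s = -0.5 * np.linalg.solve(B, b)        # gradient of the fitted surface vanishes here
    y_s = b0 + b @ x_s + x_s @ B @ x_s        # predicted response at the stationary point
    eigenvalues = np.linalg.eigvalsh(B)
    if np.all(eigenvalues > 0):
        nature = "minimum"
    elif np.all(eigenvalues < 0):
        nature = "maximum"
    else:
        nature = "saddle point (ridge analysis warranted)"
    return x_s, y_s, nature
```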

The principal shortcomings of the foregoing standard RSM approach are two-fold. First, the initial (pure) gradient search can be prone to zigzagging and can be slow to converge (see[1]). Moreover, there is no information from previous iterations that is successively employed to provide improved search directions and the process is essentially memoryless. Second, after curvature is detected, there is no attempt to conduct a continuous iterative second-order search using the information available from the canonical analysis. Our intent in this paper is to rectify these shortcomings.

This sub-section discusses the gradient deflection methods to be used in the present study in order to address the first shortcoming of the standard RSM approach as cited above. Originally developed by Hestenes and Stiefel[6], conjugate gradient, or gradient deflection, methods were first applied to unconstrained minimization problems by Fletcher and Reeves[5]. In this approach, a sequence of points x_{j+1} and a sequence of directions d_j are generated iteratively as

x_{j+1} = x_j + λ_j d_j,    (3)

where

d_j = −g_j + ψ_j d_{j−1},    (4)

and where d_0 = 0, ψ_j is a multiplier that scales the direction vector of the previous iteration, being calculated differently for different gradient deflection methods, g_j is the gradient of the objective function at the operating point x_j (assumed to be nonzero, or else the method is terminated), and λ_j is the step-length adopted along the (descent) direction d_j. As seen from Eq. (4), the deflected direction d_j at the jth iteration is composed as a linear combination of the negative gradient at the jth iteration and the direction vector of the previous iteration. This implies a possible advantage of this method over the method of steepest descent, since the deflection strategy attempts to capture second-order curvature effects over successive iterations. Indeed, for a quadratic function in k variables, this approach can be made to converge to an optimum within k iterations (see[1]).
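Expressed as code, a single iteration of the recursion in Eqs. (3) and (4) is simply the following; the psi multiplier is supplied by whichever deflection scheme (GD1 through GD4 below) is in use, and line_search is a placeholder for the step-length rule, both assumptions of this illustration.

```python
import numpy as np

def deflected_gradient_step(x, g, d_prev, psi, line_search):
    """One iteration of Eqs. (3)-(4): deflect the negative gradient by the
    previous direction, then move along it with the chosen step length."""
    d = -np.asarray(g, dtype=float) + psi * np.asarray(d_prev, dtype=float)  # Eq. (4)
    lam = line_search(x, d)        # step length lambda_j along d_j
    return x + lam * d, d          # Eq. (3): x_{j+1} = x_j + lambda_j d_j
```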

Various gradient deflection methods use different techniques for computing the deflection parameters ψ_j at the jth iteration. In particular, one gradient deflection method proposed by Sherali and Ulular[9] that was found to be promising even for nondifferentiable functions (wherein g_j represents a subgradient of the function) uses the average direction strategy and is denoted here as GD1. The deflected direction at the jth iteration for this method is given by

d_j = −g_j + (‖g_j‖ / ‖d_{j−1}‖) d_{j−1},

where d_{j−1} is the search direction adopted at the (j−1)st iteration, with d_1 ≡ −g_1.
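A direct transcription of the GD1 formula might look as follows; handling the first iteration through d_prev being None is an implementation convenience assumed here, not part of the original statement.

```python
import numpy as np

def gd1_direction(g, d_prev):
    """GD1 (average direction strategy):
    d_j = -g_j + (||g_j|| / ||d_{j-1}||) d_{j-1}, with d_1 = -g_1."""
    g = np.asarray(g, dtype=float)
    if d_prev is None:                      # first iteration: pure steepest descent
        return -g
    d_prev = np.asarray(d_prev, dtype=float)
    return -g + (np.linalg.norm(g) / np.linalg.norm(d_prev)) * d_prev
```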

In the context of proposing a conjugate gradient algorithm that adopts quasi-Newton types of updates and permits inexact line searches, Sherali and Ulular[10] develop yet another scheme for computing ψ_j. Their method, denoted in this paper by GD2, generates the deflected direction at the jth iteration as

d_j = −g_j + [(q_j′g_j − (1/s_j) p_j′g_j) / (q_j′d_{j−1})] d_{j−1},

where

p_j = x_j − x_{j−1},
q_j = g_j − g_{j−1},

and where s_j is a scale parameter that, if suitably chosen, permits s_j d_j to be the Newton direction at the jth iteration, given that this direction is spanned by −g_j and d_{j−1}. From a computational viewpoint, the parameter s_j is prescribed as

s_j = λ_{j−1}.

Motivated by the computational success of BFGS quasi-Newton updates, Shanno proposed a related conjugate gradient strategy for which the direction at the jth (j=2, 3, …) iteration is given by

d_j = −[I − (p_j q_j′ + q_j p_j′)/(q_j′p_j)] g_j,

where p_j and q_j are as defined above. We denote this gradient deflection method by GD3. Finally, a fourth gradient deflection method explored in this study, denoted by GD4, is a modification of GD2 suggested by Sherali and Ulular[10] that converts this strategy into a symmetric memoryless BFGS type of update. The prescribed search direction under this modification is given by

d_j = −[I − (p_j q_j′ + q_j p_j′)/(q_j′p_j) + (1/s_j + q_j′q_j/(q_j′p_j)) p_j p_j′/(q_j′p_j)] g_j,

where p_j, q_j, and s_j are as defined above. The four gradient deflection methods described above will be used in our empirical study. In addition, as suggested in the literature, restarting techniques employed in conjunction with gradient deflection methods can significantly enhance the performance of these methods. We next specify the particular restarting procedures employed in this study.
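The remaining three deflections are equally mechanical to transcribe. The sketch below renders the GD2, GD3, and GD4 directions exactly as written above, with all arguments taken to be NumPy vectors; it is a minimal illustration with no safeguards against near-zero denominators such as q_j′d_{j−1} or q_j′p_j.

```python
import numpy as np

def gd2_direction(g, d_prev, p, q, s):
    """GD2: d_j = -g_j + [(q_j'g_j - (1/s_j) p_j'g_j) / (q_j'd_{j-1})] d_{j-1}."""
    psi = (q @ g - (p @ g) / s) / (q @ d_prev)
    return -g + psi * d_prev

def gd3_direction(g, p, q):
    """GD3 (Shanno-type): d_j = -[I - (p_j q_j' + q_j p_j') / (q_j'p_j)] g_j,
    evaluated without forming the k-by-k matrix explicitly."""
    qp = q @ p
    return -g + (p * (q @ g) + q * (p @ g)) / qp

def gd4_direction(g, p, q, s):
    """GD4 (symmetric memoryless BFGS-type modification of GD2):
    d_j = -[I - (p_j q_j' + q_j p_j')/(q_j'p_j)
            + (1/s_j + q_j'q_j/(q_j'p_j)) p_j p_j'/(q_j'p_j)] g_j."""
    qp = q @ p
    scale = (1.0 / s + (q @ q) / qp) * (p @ g) / qp
    return -g + (p * (q @ g) + q * (p @ g)) / qp - scale * p
```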

Beale[2] and Powell[8] have offered various restarting criteria that improve the performance of gradient deflection methods. In the present context of RSM, restarting at any iteration would imply following the path of steepest descent at that iteration, although as suggested by Powell[8] in the context of conjugate gradient methods, one could restart using the current deflected direction in order to preserve the accumulated second-order information. However, this would require an additional term in the deflection scheme. In this study, two restarting criteria, denoted by RSA and RSB, will be examined under the above four gradient deflection methods.

Under criterion RSA, for a k-variable problem, the optimization procedure is restarted at the jth iteration, by using d_j = −g_j, if any of the following three conditions is satisfied:

  • 1. if j = k, or
  • 2. if |g_j′g_{j+1}| ≥ 0.2‖g_j‖² holds, or
  • 3. if −1.2‖g_j‖² ≤ d_j^proj′ g_j ≤ −0.8(g_j′ g_j^proj) is violated,

where d_j^proj and g_j^proj respectively denote the projections of d_j and −g_j onto the box (boundary) constraints, obtained by zeroing out the components corresponding to active bounds that would be immediately violated if one were to start moving along the respective directions d_j and −g_j. The rationale for these conditions is provided in Section 4 of Powell[8], or on p. 336 in Bazaraa et al.[1].
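One compact way to encode criterion RSA is sketched below. The projection helper zeroes the components of a direction that would immediately violate an active bound, following the verbal description above; the tolerance for detecting active bounds and the sign conventions adopted for g_j^proj are assumptions of this illustration rather than details taken from the paper.

```python
import numpy as np

def project_onto_box(v, x, lower, upper, tol=1e-10):
    """Zero out components of direction v that would immediately leave the box:
    negative components at active lower bounds, positive ones at active upper bounds."""
    v = np.array(v, dtype=float)
    v[(np.abs(x - lower) <= tol) & (v < 0)] = 0.0
    v[(np.abs(x - upper) <= tol) & (v > 0)] = 0.0
    return v

def restart_rsa(j, k, g, g_next, d, x, lower, upper):
    """Return True if criterion RSA calls for restarting with d_j = -g_j."""
    d_proj = project_onto_box(d, x, lower, upper)        # projection of d_j
    g_proj = project_onto_box(-g, x, lower, upper)       # projection of -g_j
    gg = g @ g                                           # ||g_j||^2
    if j == k:                                           # condition 1: restart every k iterations
        return True
    if abs(g @ g_next) >= 0.2 * gg:                      # condition 2: successive gradients far from orthogonal
        return True
    slope = d_proj @ g                                   # projected directional derivative
    if not (-1.2 * gg <= slope <= -0.8 * (g @ g_proj)):  # condition 3: descent band violated
        return True
    return False
```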

Under criterion RSB, for a k-variable problem, the optimization procedure is restarted at the jth iteration if any of the following two conditions is satisfied:

  • 1. if j = k, or
  • 2. if d_j^proj′ g_j > −0.8(g_j′ g_j^proj) holds.

Notice that the conditions in RSB are a subset of the conditions in RSA, and that in particular, since restarting is triggered whenever d_j^proj′ g_j > −0.8(g_j′ g_j^proj), the adopted direction d_j^proj is always a descent direction having a sufficiently negative directional derivative. These restarting criteria are empirically tested in Section 3. The next section embeds the foregoing deflection schemes within the framework of an enhanced RSM algorithm that, in addition, adopts second-order search directions whenever significant curvature is evident.
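Criterion RSB then reduces to two of the same tests. The fragment below reuses the project_onto_box helper from the RSA sketch above and is again illustrative rather than a transcription of the authors' implementation.

```python
def restart_rsb(j, k, g, d, x, lower, upper):
    """Return True if criterion RSB calls for restarting with d_j = -g_j."""
    d_proj = project_onto_box(d, x, lower, upper)
    g_proj = project_onto_box(-g, x, lower, upper)
    # Restart every k iterations, or when the projected direction fails the
    # sufficient-descent test of condition 2.
    return j == k or (d_proj @ g) > -0.8 * (g @ g_proj)
```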

Section snippets

An enhanced RSM algorithm

This section presents a new RSM algorithm that incorporates certain gradient deflection methods as discussed in Section 1.2, along with second-order search directions, in order to improve upon the search process described for the standard RSM algorithm in Section 1.1. Before presenting the algorithm, some preliminary observations about the problem are warranted. It is assumed that the problem at hand is a minimization problem in k variables. All variables are assumed to have upper and lower bounds.

Computational experience

This section provides computational results to evaluate the standard vs the enhanced RSM algorithm using some test problems available in the literature. To begin with, we first tested the various gradient deflection methods with no restarting criterion in the context of RSM using the ten test functions compiled in Sherali and Ulular[10]. The corresponding computational results showing the best solutions obtained using the different search direction strategies along with the number of simulation

Conclusions and future research

The standard RSM approach uses a pure gradient search that can be prone to zigzagging and slow to converge. Moreover, there is no information from previous iterations that is successively employed to provide improved search directions, and the process is essentially memoryless. Also, after curvature is detected, there is no attempt to conduct a continuous iterative second-order search using the information available from the canonical analysis. This paper suggests strategies for rectifying these shortcomings.

References (10)

  • Sherali, H. D. and Ulular, O., Conjugate gradient methods using quasi-Newton updates with inexact line search. Journal of Mathematical Analysis and Applications, 1990.
  • Bazaraa, M. S., Sherali, H. D. and Shetty, C. M., Nonlinear Programming: Theory and Algorithms, 2nd edition. Wiley, New York, 1993.
  • Beale, E. M. L., A derivation of conjugate gradients. In Numerical Methods for Nonlinear Optimization, ed. F. A. Lootsma. Academic Press, London, 1972.
  • Box, G. E. P. and Wilson, K. B., On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society, Series B, 1951, 13, 1–45.
  • Davies, O. L., Design and Analysis of Industrial Experiments, 2nd edition. Hafner, New York, 1956.


Shirish Joshi is working with i2 Technologies, Inc. His research interests include simulation, supply chain management, business process re-engineering, transportation, and optimization. He has published in archival journals such as European Journal of Operational Research and Transportation Research, and has presented his work in leading conferences including the Winter Simulation Conference and INFORMS.


Hanif D. Sherali is the Charles O. Gordon Professor of Industrial and Systems Engineering at Virginia Polytechnic Institute and State University. He received his M.S. and Ph.D. degrees in Operations Research from Georgia Institute of Technology. His area of research is in convex and nonconvex (discrete and continuous) optimization with applications to production, location, transportation, and engineering design problems. He has co-authored four books and has published several papers in journals related to his research.


Jeffrey D. Tew is a Senior Scientist at the General Electric Research and Development Division. His area of research interest is in simulation and its applications.
