An enhanced response surface methodology (RSM) algorithm using gradient deflection and second-order search strategies

https://doi.org/10.1016/S0305-0548(98)00014-8

Abstract

In the optimization of complex systems where at best one can only evaluate the function being optimized, often via a simulation experimental process, Response Surface Methodology (RSM) is frequently the method of choice. However, RSM invariably employs a first-order steepest descent method, presumably because of the prohibitive cost of employing second-order designs, except for the canonical analysis that is conducted at termination. In this paper, we suggest the use of deflected/conjugate gradient approaches that can be gainfully employed to improve the performance of RSM. These methods involve the same effort per iteration and require only first-order designs. However, instead of discarding previously generated information, the method utilizes these computations to advantageously build upon accumulated knowledge regarding the second-order curvature of the function being optimized, hence accelerating convergence by circumventing the characteristic zigzagging of steepest descent methods. The proposed scheme is illustrated using various test examples from the literature for which optimal solutions are known and that serve the purpose of exhibiting the potential of the recommended methodology.

This paper attempts to improve the search techniques currently used in standard Response Surface Methodology (RSM) algorithms. RSM is a collection of mathematical and statistical techniques for experimental optimization. This work presents a novel RSM algorithm that incorporates certain gradient deflection methods, augmented with appropriate restarting criteria, as opposed to using the path of steepest descent as the only search direction. In order to investigate the new RSM algorithm in comparison with existing standard RSM techniques, a set of standard test functions is used, both with and without random perturbations. Computational results exhibit the improvements achieved under the proposed algorithm.

Introduction

The focus of this paper is on improving the simulation-optimization technique known as response surface methodology (RSM). First proposed by Box and Wilson[3] in the context of optimization problems concerned with chemical process engineering, RSM is a collection of mathematical and statistical techniques for experimental optimization. The search procedures currently employed by RSM algorithms use the method of steepest descent only. This paper develops a novel RSM algorithm that incorporates certain gradient deflection methods, augmented with suitable restarting criteria, as opposed to using the path of steepest descent as the only search direction. In order to empirically investigate the use of the new RSM algorithm in contrast with the existing one, a set of standard test functions from the literature is used. Computational results exhibit the improvements achieved under the proposed algorithm, both with and without random perturbations on functional evaluations.

This paper is organized as follows. The remainder of this section presents an overview of RSM along with a statement of a standard RSM algorithm, and discusses various competitive gradient deflection techniques that we will be using in our approach. Section 2 develops the new RSM algorithm. Empirical studies are described in Section 3, and Section 4 presents conclusions and recommendations for future research.

Throughout this paper, it is assumed that the problem at hand is an unconstrained minimization problem, and that the functional form of the objective function is unknown. Typically, to start the RSM procedure, an experiment is designed in a small sub-region of the factor space, and a low degree polynomial (usually first-order) is used to represent the data obtained from the responses. This helps the practitioner deduce which region should be explored next, by conducting a search along the estimated direction of steepest descent. If the goal is to minimize the response, this method aims at climbing down the response surface rather than exploring the whole region, and its success depends on the assumption that the ultimate minimum can be reached via such a path of descent (see p. 503 of Davies[4]). That is, as the search moves down the surface (decreasing response), it is assumed that at some point, a curvature will become evident, after which the response will no longer improve or will start increasing. At this point another experiment is designed to detect curvature and determine effects up to second-order. A canonical analysis is then performed to locate the stationary point of the fitted surface, and to ascertain whether any further exploration needs to be conducted. For example, if the stationary point is found to be a saddle point, then a ridge analysis is performed, whereby the experimenter formulates a k-variable response surface problem in two dimensions and collects observations along the falling ridge. If the stationary point is found to be a minimum, then the region around this point is explored a bit further via some discrete step sizes and the incumbent solution is reported.

If the experimenter has a prior notion of the general vicinity of the optimum, then Myers[7] gives a stepwise procedure that can be generally described as follows (see p. 88 of[7]); a schematic code sketch of this loop is given after Step 4 below.

Step 1: The experimenter fits a first-order response model in some restricted region of the variables x1, x2, …, xk.

Step 2: The information from Step 1 is used to locate a path of steepest descent.

Step 3: A series of experiments is conducted along the path until no additional decrease in response is evident.

Step 4: Steps 1, 2, and 3 are repeated over the region about the current incumbent solution. However, if curvature is evident, and the experimenter is satisfied that little or no additional information can be obtained from the ongoing searches, a more elaborate experiment and analysis is conducted via the use of a second-order design.
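To make the flow of Steps 1 through 4 concrete, the following Python sketch outlines one way the loop could be organized. It is purely illustrative: the callables simulate (the black-box response), estimate_slopes (the design-and-regression work of Step 1, elaborated in the next sketch), and curvature_evident (a lack-of-fit style check) are hypothetical placeholders to be supplied by the experimenter, not constructs from the original paper.

```python
import numpy as np

def standard_rsm(simulate, estimate_slopes, curvature_evident, x0,
                 step=0.5, max_phases=20):
    """Skeleton of the standard first-order RSM loop (Steps 1-4).

    simulate          -- black-box (possibly noisy) response to be minimized
    estimate_slopes   -- returns slope estimates (b1, ..., bk) around a center point
    curvature_evident -- returns True when a second-order design is warranted
    """
    x = np.asarray(x0, dtype=float)
    y_best = simulate(x)
    for _ in range(max_phases):
        b = np.asarray(estimate_slopes(simulate, x), dtype=float)  # Step 1: local linear fit
        norm_b = np.linalg.norm(b)
        if norm_b == 0.0:                                          # flat local fit: stop
            break
        d = -b / norm_b                                            # Step 2: path of steepest descent
        while True:                                                # Step 3: fixed steps along d
            x_trial = x + step * d
            y_trial = simulate(x_trial)
            if y_trial >= y_best:                                  # no further decrease in response
                break
            x, y_best = x_trial, y_trial
        if curvature_evident(simulate, x):                         # Step 4: switch to a second-order design
            break
    return x, y_best
```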

In Step 1, typically, replications of a factorial or fractional factorial experiment are performed. The unknown parameters of the fitted linear model are then computed. A judicious selection of the experimental design having desirable properties such as minimum variance is needed in order to obtain better estimates of the unknown parameters. The first-order linear model is represented as

y_ij = β_0 + Σ_{l=1}^{k} β_l x_l + ε_ij,    (1)

where y_ij is the response at the jth replication and the ith design point, x_1, x_2, …, x_k are the k factor variables, β_l, for l=1, …, k, are the unknown parameters of the linear model, and ε_ij is the error term at the ith design point and the jth replication (i=1, 2, …, m; j=1, 2, …, r; l=1, 2, …, k).
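As one concrete illustration of such a Step 1 fit, the sketch below estimates the coefficients of Eq. (1) by ordinary least squares over a replicated 2^k factorial in coded ±1 units around the current center. The simulate callable and the radius and replications arguments are assumptions made for illustration, not prescriptions from the paper.

```python
import itertools
import numpy as np

def fit_first_order(simulate, center, radius=1.0, replications=2):
    """Least-squares estimates (b0, b1, ..., bk) of the first-order model (1)
    from a replicated 2^k factorial centered at `center`.

    Design points are coded +/-1 in each factor and scaled by `radius`
    into natural units before calling the black-box response `simulate`.
    """
    center = np.asarray(center, dtype=float)
    k = center.size
    coded_points = np.array(list(itertools.product((-1.0, 1.0), repeat=k)))
    rows, responses = [], []
    for point in coded_points:
        natural_x = center + radius * point
        for _ in range(replications):
            rows.append(np.concatenate(([1.0], point)))    # intercept column plus coded factors
            responses.append(simulate(natural_x))
    X = np.array(rows)
    y = np.array(responses)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)         # minimizes ||y - X beta||^2
    return coeffs[0], coeffs[1:]                           # b0 and (b1, ..., bk)
```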

If a full 2^k factorial experiment is used at Step 1, then estimates of the unknown parameters β_l in Eq. (1) can be obtained. Denote these estimates by b_l (for l=1, 2, …, k), respectively. These estimates are then used in Step 2 to locate the path of steepest descent, which is given by the vector d = (−b_1, −b_2, …, −b_k). Responses are observed along this path using some prescribed search strategy until there is no improvement. In the standard approach, a simple sequence of fixed steps along the direction d is explored. If curvature is evident, then a second-order model is derived as

y_ij = β_0 + Σ_{l=1}^{k} β_l x_l + Σ_{h>l} β_hl x_h x_l + Σ_{l=1}^{k} β_ll x_l² + ε_ij.    (2)

Here, β_l, β_hl, and β_ll are the unknown parameters of the second-order model (l=1, 2, …, k; h>l), and the remaining terms are as defined for Eq. (1). This model is used to evaluate a stationary point of the response surface. A canonical analysis is performed that involves, among other things, locating the stationary point of the system, and determining the nature of this point. Using this analysis, an optimum is reported, perhaps following some additional investigation in the case of a detected ridge system.
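The canonical analysis itself amounts to linear algebra on the fitted second-order coefficients. The sketch below adopts the common convention of collecting the quadratic estimates in a symmetric matrix B (b_ll on the diagonal, b_hl/2 off it), locates the stationary point of the fitted surface, and classifies it by the signs of the eigenvalues of B; this is a schematic of the computation, not the authors' implementation.

```python
import numpy as np

def canonical_analysis(b0, b, B):
    """Stationary point and nature of the fitted surface y = b0 + b'x + x'Bx.

    b is the vector of first-order estimates and B the symmetric matrix of
    second-order estimates (b_ll on the diagonal, b_hl / 2 off the diagonal).
    """
    b = np.asarray(b, dtype=float)
    B = np.asarray(B, dtype=float)
    x_s = -0.5 * np.linalg.solve(B, b)        # gradient of the fitted surface vanishes here
    y_s = b0 + b @ x_s + x_s @ B @ x_s        # predicted response at the stationary point
    eigenvalues = np.linalg.eigvalsh(B)
    if np.all(eigenvalues > 0):
        nature = "minimum"
    elif np.all(eigenvalues < 0):
        nature = "maximum"
    else:
        nature = "saddle point (ridge analysis warranted)"
    return x_s, y_s, nature
```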

The principal shortcomings of the foregoing standard RSM approach are two-fold. First, the initial (pure) gradient search can be prone to zigzagging and can be slow to converge (see[1]). Moreover, there is no information from previous iterations that is successively employed to provide improved search directions and the process is essentially memoryless. Second, after curvature is detected, there is no attempt to conduct a continuous iterative second-order search using the information available from the canonical analysis. Our intent in this paper is to rectify these shortcomings.

This sub-section discusses the gradient deflection methods to be used in the present study in order to address the first shortcoming of the standard RSM approach as cited above. Originally developed by Hestenes and Stiefel[6], conjugate gradient, or gradient deflection, methods were first applied to unconstrained minimization problems by Fletcher and Reeves[5]. In this approach, a sequence of points x_{j+1} and a sequence of directions d_j are generated iteratively as

x_{j+1} = x_j + λ_j d_j,    (3)

where

d_j = −g_j + ψ_j d_{j−1},    (4)

and where d_0 = 0, ψ_j is a multiplier that scales the direction vector of the previous iteration, being calculated differently for different gradient deflection methods, g_j is the gradient of the objective function at the operating point x_j (assumed to be nonzero, or else the method is terminated), and λ_j is the step-length adopted along the (descent) direction d_j. As seen from Eq. (4), the deflected direction d_j at the jth iteration is composed as a linear combination of the negative gradient at the jth iteration and the direction vector of the previous iteration. This implies a possible advantage of this method over the method of steepest descent, since the deflection strategy attempts to capture second-order curvature effects over successive iterations. Indeed, for a quadratic function in k variables, this approach can be made to converge to an optimum within k iterations (see[1]).
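Expressed as code, a single iteration of the recursion in Eqs. (3) and (4) is simply the following; the psi multiplier is supplied by whichever deflection scheme (GD1 through GD4 below) is in use, and line_search is a placeholder for the step-length rule, both assumptions of this illustration.

```python
import numpy as np

def deflected_gradient_step(x, g, d_prev, psi, line_search):
    """One iteration of Eqs. (3)-(4): deflect the negative gradient by the
    previous direction, then move along it with the chosen step length."""
    d = -np.asarray(g, dtype=float) + psi * np.asarray(d_prev, dtype=float)  # Eq. (4)
    lam = line_search(x, d)        # step length lambda_j along d_j
    return x + lam * d, d          # Eq. (3): x_{j+1} = x_j + lambda_j d_j
```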

Various gradient deflection methods use different techniques for computing the deflection parameters ψ_j at the jth iteration. In particular, one gradient deflection method proposed by Sherali and Ulular[9] that was found to be promising even for nondifferentiable functions (wherein g_j represents a subgradient of the function) uses the average direction strategy and is denoted here as GD1. The deflected direction at the jth iteration for this method is given by

d_j = −g_j + (‖g_j‖ / ‖d_{j−1}‖) d_{j−1},

where d_{j−1} is the search direction adopted at the (j−1)st iteration, with d_1 ≡ −g_1.
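A direct transcription of the GD1 formula might look as follows; handling the first iteration through d_prev being None is an implementation convenience assumed here, not part of the original statement.

```python
import numpy as np

def gd1_direction(g, d_prev):
    """GD1 (average direction strategy):
    d_j = -g_j + (||g_j|| / ||d_{j-1}||) d_{j-1}, with d_1 = -g_1."""
    g = np.asarray(g, dtype=float)
    if d_prev is None:                      # first iteration: pure steepest descent
        return -g
    d_prev = np.asarray(d_prev, dtype=float)
    return -g + (np.linalg.norm(g) / np.linalg.norm(d_prev)) * d_prev
```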

In the context of proposing a conjugate gradient algorithm that adopts quasi-Newton types of updates and permits inexact line searches, Sherali and Ulular[10] develop yet another scheme for computing ψ_j. Their method, denoted in this paper by GD2, generates the deflected direction at the jth iteration as

d_j = −g_j + [(q_j′g_j − (1/s_j) p_j′g_j) / (q_j′d_{j−1})] d_{j−1},

where

p_j = x_j − x_{j−1},
q_j = g_j − g_{j−1},

and where s_j is a scale parameter that, if suitably chosen, permits s_j d_j to be the Newton direction at the jth iteration, given that this direction is spanned by −g_j and d_{j−1}. From a computational viewpoint, the parameter s_j is prescribed as

s_j = λ_{j−1}.

Motivated by the computational success of BFGS quasi-Newton updates, Shanno proposed a related conjugate gradient strategy for which the direction at the jth (j=2, 3, …) iteration is given by

d_j = −[I − (p_j q_j′ + q_j p_j′)/(q_j′p_j)] g_j,

where p_j and q_j are as defined above. We denote this gradient deflection method by GD3. Finally, a fourth gradient deflection method explored in this study, denoted by GD4, is a modification of GD2 suggested by Sherali and Ulular[10] that converts this strategy into a symmetric memoryless BFGS type of update. The prescribed search direction under this modification is given by

d_j = −[I − (p_j q_j′ + q_j p_j′)/(q_j′p_j) + (1/s_j + q_j′q_j/(q_j′p_j)) p_j p_j′/(q_j′p_j)] g_j,

where p_j, q_j, and s_j are as defined above. The four gradient deflection methods described above will be used in our empirical study. In addition, as suggested in the literature, restarting techniques employed in conjunction with gradient deflection methods can significantly enhance the performance of these methods. We next specify the particular restarting procedures employed in this study.
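The remaining three deflections are equally mechanical to transcribe. The sketch below renders the GD2, GD3, and GD4 directions exactly as written above, with all arguments taken to be NumPy vectors; it is a minimal illustration with no safeguards against near-zero denominators such as q_j′d_{j−1} or q_j′p_j.

```python
import numpy as np

def gd2_direction(g, d_prev, p, q, s):
    """GD2: d_j = -g_j + [(q_j'g_j - (1/s_j) p_j'g_j) / (q_j'd_{j-1})] d_{j-1}."""
    psi = (q @ g - (p @ g) / s) / (q @ d_prev)
    return -g + psi * d_prev

def gd3_direction(g, p, q):
    """GD3 (Shanno-type): d_j = -[I - (p_j q_j' + q_j p_j') / (q_j'p_j)] g_j,
    evaluated without forming the k-by-k matrix explicitly."""
    qp = q @ p
    return -g + (p * (q @ g) + q * (p @ g)) / qp

def gd4_direction(g, p, q, s):
    """GD4 (symmetric memoryless BFGS-type modification of GD2):
    d_j = -[I - (p_j q_j' + q_j p_j')/(q_j'p_j)
            + (1/s_j + q_j'q_j/(q_j'p_j)) p_j p_j'/(q_j'p_j)] g_j."""
    qp = q @ p
    scale = (1.0 / s + (q @ q) / qp) * (p @ g) / qp
    return -g + (p * (q @ g) + q * (p @ g)) / qp - scale * p
```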

Beale[2] and Powell[8] have offered various restarting criteria that improve the performance of gradient deflection methods. In the present context of RSM, restarting at any iteration would imply following the path of steepest descent at that iteration, although as suggested by Powell[8] in the context of conjugate gradient methods, one could restart using the current deflected direction in order to preserve the accumulated second-order information. However, this would require an additional term in the deflection scheme. In this study, two restarting criteria, denoted by RSA and RSB, will be examined under the above four gradient deflection methods.

Under criterion RSA, for a k-variable problem, the optimization procedure is restarted at the jth iteration, by using d_j = −g_j, if any of the following three conditions is satisfied:

  • 1. if j = k, or
  • 2. if |g_j′g_{j+1}| ≥ 0.2‖g_j‖² holds, or
  • 3. if −1.2‖g_j‖² ≤ d_j^proj′ g_j ≤ −0.8(g_j′ g_j^proj) is violated,

where d_j^proj and g_j^proj respectively denote the projections of d_j and −g_j onto the box (boundary) constraints, obtained by zeroing out the components corresponding to active bounds that would be immediately violated if one were to start moving along the respective directions d_j and −g_j. The rationale for these conditions is provided in Section 4 of Powell[8], or on p. 336 in Bazaraa et al.[1].
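One compact way to encode criterion RSA is sketched below. The projection helper zeroes the components of a direction that would immediately violate an active bound, following the verbal description above; the tolerance for detecting active bounds and the sign conventions adopted for g_j^proj are assumptions of this illustration rather than details taken from the paper.

```python
import numpy as np

def project_onto_box(v, x, lower, upper, tol=1e-10):
    """Zero out components of direction v that would immediately leave the box:
    negative components at active lower bounds, positive ones at active upper bounds."""
    v = np.array(v, dtype=float)
    v[(np.abs(x - lower) <= tol) & (v < 0)] = 0.0
    v[(np.abs(x - upper) <= tol) & (v > 0)] = 0.0
    return v

def restart_rsa(j, k, g, g_next, d, x, lower, upper):
    """Return True if criterion RSA calls for restarting with d_j = -g_j."""
    d_proj = project_onto_box(d, x, lower, upper)        # projection of d_j
    g_proj = project_onto_box(-g, x, lower, upper)       # projection of -g_j
    gg = g @ g                                           # ||g_j||^2
    if j == k:                                           # condition 1: restart every k iterations
        return True
    if abs(g @ g_next) >= 0.2 * gg:                      # condition 2: successive gradients far from orthogonal
        return True
    slope = d_proj @ g                                   # projected directional derivative
    if not (-1.2 * gg <= slope <= -0.8 * (g @ g_proj)):  # condition 3: descent band violated
        return True
    return False
```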

Under criterion RSB, for a k-variable problem, the optimization procedure is restarted at the jth iteration if any of the following two conditions is satisfied:

  • 1. if j = k, or
  • 2. if d_j^proj′ g_j > −0.8(g_j′ g_j^proj) holds.

Notice that the conditions in RSB are a subset of the conditions in RSA, and that in particular, since restarting is triggered whenever d_j^proj′ g_j > −0.8(g_j′ g_j^proj), the adopted direction d_j^proj is always a descent direction having a sufficiently negative directional derivative. These restarting criteria are empirically tested in Section 3. The next section embeds the foregoing deflection schemes within the framework of an enhanced RSM algorithm that, in addition, adopts second-order search directions whenever significant curvature is evident.
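Criterion RSB then reduces to two of the same tests. The fragment below reuses the project_onto_box helper from the RSA sketch above and is again illustrative rather than a transcription of the authors' implementation.

```python
def restart_rsb(j, k, g, d, x, lower, upper):
    """Return True if criterion RSB calls for restarting with d_j = -g_j."""
    d_proj = project_onto_box(d, x, lower, upper)
    g_proj = project_onto_box(-g, x, lower, upper)
    # Restart every k iterations, or when the projected direction fails the
    # sufficient-descent test of condition 2.
    return j == k or (d_proj @ g) > -0.8 * (g @ g_proj)
```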

Section snippets

An enhanced RSM algorithm

This section presents a new RSM algorithm that incorporates certain gradient deflection methods as discussed in Section 1.2, along with second-order search directions, in order to improve upon the search process described for the standard RSM algorithm in Section 1.1. Before presenting the algorithm, some preliminary observations about the problem are warranted. It is assumed that the problem at hand is a minimization problem in k variables. All variables are assumed to have upper and lower bounds.

Computational experience

This section provides computational results to evaluate the standard vs the enhanced RSM algorithm using some test problems available in the literature. To begin with, we first tested the various gradient deflection methods with no restarting criterion in the context of RSM using the ten test functions compiled in Sherali and Ulular[10]. The corresponding computational results showing the best solutions obtained using the different search direction strategies along with the number of simulation

Conclusions and future research

The standard RSM approach uses a pure gradient search that can be prone to zigzagging and slow to converge. Moreover, there is no information from previous iterations that is successively employed to provide improved search directions, and the process is essentially memoryless. Also, after curvature is detected, there is no attempt to conduct a continuous iterative second-order search using the information available from the canonical analysis. This paper suggests strategies for rectifying these shortcomings.

References (10)

  • Sherali, H. D. and Ulular, O., Conjugate gradient methods using quasi-Newton updates with inexact line search. Journal of Mathematical Analysis and Applications, 1990.
  • Bazaraa, M. S., Sherali, H. D. and Shetty, C. M., Nonlinear Programming: Theory and Algorithms, 2nd edition. Wiley, New York, 1993.
  • Beale, E. M. L., A derivation of conjugate gradients. In Numerical Methods for Nonlinear Optimization, ed. F. A. Lootsma. Academic Press, London, 1972.
  • Box, G. E. P. and Wilson, K. B., On the experimental attainment of optimum conditions. Journal of the Royal Statistical Society, Series B, 1951, 13, 1–45.
  • Davies, O. L., Design and Analysis of Industrial Experiments, 2nd edition. Hafner, New York, 1956.


Shirish Joshi is working with i2 Technologies, Inc. His research interests include simulation, supply chain management, business process re-engineering, transportation, and optimization. He has published in archival journals such as European Journal of Operational Research and Transportation Research, and has presented his work in leading conferences including the Winter Simulation Conference and INFORMS.


Hanif D. Sherali is the Charles O. Gordon Professor of Industrial and Systems Engineering at Virginia Polytechnic Institute and State University. He received his M.S. and Ph.D. degrees in Operations Research from Georgia Institute of Technology. His area of research is in convex and nonconvex (discrete and continuous) optimization with applications to production, location, transportation, and engineering design problems. He has co-authored four books and has published several papers in journals related to his research.


Jeffrey D. Tew is a Senior Scientist at the General Electric Research and Development Division. His area of research interest is in simulation and its applications.
