nach oben

Journal of Inequalities and Applications

Erschienen in:

Open Access 01.12.2020 | Research

A simple approximated solution method for solving fractional trust region subproblems of nonlinearly equality constrained optimization

verfasst von: Honglan Zhu, Qin Ni, Xuebing Zhang

Erschienen in: Journal of Inequalities and Applications | Ausgabe 1/2020

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

In this paper, a fractional model is used to solve nonlinearly constrained optimization problems. In order to solve the fractional trust region subproblems simply, we propose an approximated solution method by cyclically fixing the fractional coefficient part of the approximate function. The global convergence of the fractional trust region method is proved, and the numerical results show that the new algorithm is effective and stable.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

In this paper, we consider the nonlinear equality constrained optimization problem

$$\begin{aligned}& \min_{x\in R^{n}} f(x), \end{aligned}$$

(1.1)

$$\begin{aligned}& \text{s.t.}\quad C(x)=0, \end{aligned}$$

(1.2)

where $C(x)=(c_{1}(x), \ldots , c_{m}(x))^{T}$ ($m\leq n$) and $f(x)$, $c_{i}(x)$ ($i=1,\ldots ,m$) are continuously differentiable. The Lagrangian function for problem (1.1)–(1.2) is defined as follows:

$$\begin{aligned} L(x,\lambda ) = f(x) + \sum_{i=1}^{m} \lambda _{i} c_{i}(x), \end{aligned}$$

(1.3)

where $\lambda _{i}$ for $i=1,\ldots ,m$ are Lagrange multipliers. Problem (1.1)–(1.2) has been studied by many researchers, including Han [1], Powell [2], Yuan and Sun [3], Powell and Yuan [4], etc.

There are many efficient methods to solve problem (1.1)–(1.2), and the trust region method is a very effective method (see [4‐13]). In addition, the book of Conn, Gould, and Toint [14] is an excellent and comprehensive one on trust region methods. However, most of these methods use the quadratic model to approximate $f(x)$. The sequential quadratic programming method for (1.1)–(1.2) computes a search direction by minimizing a quadratic model of the Lagrangian subject to the linearized constraints. That is, at the kth iteration, the following subproblem

$$\begin{aligned}& \min_{s \in R^{n}} \varrho _{k} (s) = g_{k}^{T}s + \frac{1}{2} s ^{T}B_{k}s, \end{aligned}$$

(1.4)

$$\begin{aligned}& \text{s.t.}\quad A_{k}^{T}s + C_{k} =0 \end{aligned}$$

(1.5)

is solved to obtain a search direction $s_{k}$, where $x_{k}$ is the current iterate point, $g_{k}=\nabla f(x_{k})$, $B_{k}$ is symmetric and an approximation to the Hessian $\nabla _{xx} L(x_{k}, \lambda _{k})$ of the Lagrangian of problem (1.1)–(1.2), $A_{k}=[\nabla c_{1}(x _{k}),\ldots ,\nabla c_{m}(x_{k})]$ and $C_{k}=C(x_{k})$. The constraint gradients $\nabla c_{i}(x_{k})$ are assumed to be linearly independent for all $x_{k}$. However, if the objective function possesses high nonlinear property and the iterative point is far away from the minimum, the quadratic model could not approximate the original problem very well, which may lead to iteration proceeding slowly.

In 1980, Davidon [15] first proposed the collinear scaling of variables and conic model method for unconstrained optimization. The conic model is

$$\begin{aligned} \tilde{\phi }_{k} (s) = f_{k}+ \frac{g_{k}^{T}s}{1 - a_{k}^{T}s} + \frac{1}{2}\frac{s^{T}B_{k}s}{(1 - a_{k}^{T}s)^{2}}, \end{aligned}$$

(1.6)

where $a_{k}\in R^{n}$ is a horizontal vector. Then, Sorensen [16] published detailed results on a class of conic model methods and proved that a particular member of this class has the Q-superlinear convergence. Many other scholars have also studied the trust region algorithm of the conic model (see [17, 18]).

The trial step of a trust region algorithm is usually obtained by solving a trust region subproblem. In our trust region subproblem, the trust region bound constraint is

$$\begin{aligned} \Vert s \Vert \le \Delta _{k}, \end{aligned}$$

(1.7)

where $\Delta _{k}$ is the trust region radius at the kth iteration. It is easy to see that there is a possibility that the linearized constraints (1.5) may have no solutions in the trust region (1.7). To overcome this difficulty, we use a relaxed version of the linearized constraint, which was proposed by Byrd, Schnabel, and Schultz in [19]. In [20], Sun also used this relaxed version of the linearized constraint and proposed a conic trust region method for nonlinearly constrained optimization. That is, at the kth iteration, the trial step $s_{k}$ is computed by solving the following conic model trust region subproblem:

$$\begin{aligned}& \min_{s \in R^{n}} \phi _{k} (s) = \frac{g_{k}^{T}s}{1 - a_{k} ^{T}s} + \frac{1}{2}\frac{s^{T}B_{k}s}{(1 - a_{k}^{T}s)^{2}}, \end{aligned}$$

(1.8)

$$\begin{aligned}& \text{s.t.}\quad A_{k}^{T}s + \theta _{k} C_{k} =0, \end{aligned}$$

(1.9)

$$\begin{aligned}& \hphantom{\text{s.t.}\quad} \Vert s \Vert \le \Delta _{k}, \end{aligned}$$

(1.10)

where $\|\cdot \|$ refers to the Euclidean norm and $\theta _{k}$ is a relaxation parameter. $\theta _{k}\in (0,1]$ is chosen such that the feasible set of (1.9) and (1.10) is not empty. Geometrically speaking, the role of $\theta _{k}$ is to compress the feasible area of each constraints of (1.5) to the direction of origin (see [8, 21]).

The above conic model (1.8) has only one parameter $a_{k}$ and less degree of freedom, which affects the effect of the model approaching the objective function. Therefore, we consider selecting a fractional model with more parameters, which can make full use of the function and gradient information in the previous iteration and can approximate the original function well, thus obtaining a new method for solving the optimization problem. In [22], Zhu considered the unconstrained optimization problem and proposed a fractional model

$$\begin{aligned} \psi _{k} (s) = h_{k}(s) g^{T}_{k}s + \frac{1}{2}h_{k}^{2}(s) s^{T}B _{k}s, \end{aligned}$$

(1.11)

where

$$\begin{aligned} h_{k}(s) = \frac{(1 + c_{k}^{T}s)}{(1 - a_{k}^{T}s)(1 - b_{k}^{T}s)}, \end{aligned}$$

(1.12)

and horizontal vectors $a_{k}, b_{k}, c_{k}\in R^{n}$ are bounded. The gradient of $\psi _{k} (s)$ is

$$\begin{aligned} \nabla \psi _{k} (s)= \frac{1}{v_{s}^{3}} \omega (s)\bigl[v_{s} g_{k} + \bigl(1 + c_{k}^{T} s\bigr)B_{k}s\bigr], \end{aligned}$$

(1.13)

where

$$\begin{aligned}& v_{s}=\bigl(1 - a_{k}^{T}s \bigr) \bigl(1 - b_{k}^{T}s\bigr), \end{aligned}$$

(1.14)

$$\begin{aligned}& \omega (s)=\bigl(1 + c_{k}^{T}s \bigr)v_{s}I + \bigl[c_{k}v_{s} + \bigl(1 + c_{k}^{T}s\bigr) \bigl(a _{k} \bigl(1-b_{k}^{T}s\bigr) + b_{k} \bigl(1- a_{k}^{T}s\bigr)\bigr)\bigr] {s^{T}}. \end{aligned}$$

(1.15)

If $b_{k}=c_{k}=0$, then $\psi _{k} (s)$ is reduced to the conic model. If $a_{k}=b_{k}=c_{k}=0$, then $\psi _{k}(s)$ is the quadratic model $\varrho _{k} (s)$. The fractional model is new and it is an extension to conic model $\phi _{k} (s)$. It has more choice of parameter vectors and can make use of their function and gradient information. Based the new fractional model $\psi _{k} (s)$, a simplified fractional trust region subproblem

$$\begin{aligned}& \min_{s \in R^{n}} \psi _{k} (s), \end{aligned}$$

(1.16)

$$\begin{aligned}& \text{s.t.}\quad \Vert s \Vert \le \tilde{\Delta }_{k} \end{aligned}$$

(1.17)

is proposed, where

$$\begin{aligned}& \tilde{\Delta }_{k} = \min \biggl\{ {\Delta _{k}, \frac{\epsilon _{1}}{{ \Vert a_{k} \Vert }},\frac{\epsilon _{1}}{{ \Vert b_{k} \Vert }}, \frac{ \epsilon _{1}}{{ \Vert c_{k} \Vert }}} \biggr\} , \end{aligned}$$

(1.18)

$$\begin{aligned}& 0 < \epsilon _{1} < 1, \end{aligned}$$

(1.19)

and the parameters $a_{k}$, $b_{k}$, $c_{k}$ were chosen such that

$$ \Vert a_{k} \Vert \tilde{\Delta }_{k} \leq \epsilon _{1}, \qquad \Vert b_{k} \Vert \tilde{\Delta }_{k} \leq \epsilon _{1}, \qquad \Vert c_{k} \Vert \tilde{\Delta } _{k} \leq \epsilon _{1}. $$

(1.20)

Subproblem (1.16)–(1.17) is solved only in quasi-Newton direction in [22]. Based on this, we study in depth the Newton point and the steepest descent point of the fractional trust region subproblem, so as to construct a simple dogleg method to solve the subproblem (see [23]). Numerical experiments show that the fractional model trust region quasi-Newton algorithm seems to be superior to the conic model trust region algorithm in terms of the number of iterations and the running time as the dimension of the optimization problem increases. For the linear equality constrained optimization problem, the null space technique is used to delete the linear equality constraint, and the fractional trust region method for solving the linear equality constrained optimization problem is proposed (see [24]).

Now we use the fractional model to solve nonlinearly constrained optimization problems. In order to solve problem (1.1)–(1.2), we consider the following fractional trust region subproblem. That is, if the current iteration point is $x_{k}$, then the trial step $s_{k}$ is computed by

$$\begin{aligned}& \min_{s \in R^{n}} \psi _{k} (s) , \end{aligned}$$

(1.21)

$$\begin{aligned}& \text{s.t.}\quad A_{k}^{T}s+ \theta _{k} C_{k} =0, \end{aligned}$$

(1.22)

$$\begin{aligned}& \hphantom{\text{s.t.}\quad}\bigl\vert \bigl(1 - a_{k}^{T}s\bigr) \bigl(1 - b_{k}^{T}s\bigr) \bigr\vert \geq \varepsilon _{0}, \end{aligned}$$

(1.23)

$$\begin{aligned}& \hphantom{\text{s.t.}\quad} \Vert s \Vert \le \Delta _{k}, \end{aligned}$$

(1.24)

where $\varepsilon _{0}\in (0,1)$. The purpose of adding constraint (1.23) is to guarantee that the objective function $\psi _{k} (s)$ is bounded in the trust region and this also increases the difficulty of calculation.

We know that the exact solution of the trust region subproblem is often difficult to obtain, so many approximate solution methods have been spawned. For example, the trust region method is often combined with dogleg method, conjugate gradient method, inexact line search method, alternate direction search method, and other methods to obtain approximate solutions of trust region subproblems. In this paper, we also consider an approximate solution method for the subproblem of the trust region. First, the null space technique is used to remove the linear equality constraints of the trust region subproblem; then the fractional model of the subproblem is reduced to a quadratic model by cyclically fixing the coefficient part; finally, the problem can be easily solved by searching in the descending direction to obtain an approximate solution to the subproblem.

In the global algorithm, we propose a quasi-Newton trust region method with a fractional model and prove the global convergence of the new algorithm. The numerical experiment shows that the new algorithm is effective and robust, especially for large-scale test problems.

The organization of this paper is as follows. In Sect. 2, the fractional trust region subproblem and an algorithm for solving the subproblem are presented. In Sect. 3, we propose a quasi-Newton method with a fractional model for nonlinearly equality constrained optimization and prove its convergence under some reasonable conditions. Numerical results are given in Sect. 4.

Throughout this paper, we use $\|\cdot \|$ for the 2-norm.

2 The algorithm of fractional trust region subproblem

In order to solve (1.21)–(1.24), firstly we consider removing constraint (1.23) by the same process of simplification as in [22]. Therefore, when the parameters $a_{k}$, $b _{k}$, $c_{k}$ satisfy (1.20) where

$$\begin{aligned} 0 < \epsilon _{1}\leq 1-\sqrt{\varepsilon _{0}} < 1, \end{aligned}$$

(2.1)

then subproblem (1.21)–(1.24) can be rewritten as the following reduced subproblem:

$$\begin{aligned}& \min_{s \in R^{n}} \psi _{k} (s), \end{aligned}$$

(2.2)

$$\begin{aligned}& \text{s.t.}\quad A_{k}^{T}s+ \theta _{k} C_{k} =0, \end{aligned}$$

(2.3)

$$\begin{aligned}& \hphantom{\text{s.t.}\quad} \Vert s \Vert \le \tilde{\Delta }_{k}, \end{aligned}$$

(2.4)

where $\tilde{\Delta }_{k} $ is defined as (1.18).

The null-space technique (see [20, 25, 26]) is an important technique for solving optimization problems with equality constraints. In the following, we show the method to eliminate constraint (1.22). Assume that $A_{k}$ has full column rank, and there exist an orthogonal matrix $Q_{k}$ and a nonsingular upper triangular matrix $R_{k}$ such that

A_{k} = Q_{k} R_{k} = [\begin{array}{c} Q_{k}^{(1)} & Q_{k}^{(2)} \end{array}] [\begin{array}{c} R_{k}^{(1)} \\ 0 \end{array}] = Q_{k}^{(1)} R_{k}^{(1)},

(2.5)

where $Q_{k}^{(1)}\in R^{n\times m}$, $Q_{k}^{(2)}\in R^{n\times (n-m)}$, and $R^{(1)}_{k} \in R^{m \times m}$. Then (1.22) can be rewritten as

$$\begin{aligned} \bigl(R^{(1)}_{k}\bigr)^{T} \bigl(Q_{k}^{(1)}\bigr)^{T} s = -\theta _{k} C_{k}. \end{aligned}$$

(2.6)

Therefore the feasible point for (1.22) can be presented by

$$\begin{aligned} s = \tilde{s}_{k}+ Q_{k}^{(2)} u \end{aligned}$$

(2.7)

for any $u \in R^{n-m}$, where

$$\begin{aligned} \tilde{s}_{k} = -\theta _{k} Q_{k}^{(1)} \bigl(R^{(1)}_{k}\bigr)^{-T} C_{k}, \end{aligned}$$

(2.8)

and $Q_{k}^{(2)} u$ lies in the null space of $A_{k}$. We denote

$$\begin{aligned} l_{k} = - Q_{k}^{(1)} \bigl(R^{(1)}_{k} \bigr)^{-T} C_{k}, \end{aligned}$$

(2.9)

then

$$\begin{aligned} \tilde{s}_{k} = \theta _{k} l_{k}. \end{aligned}$$

(2.10)

In order to ensure that s lies in the trust region, we choose $\theta _{k}$ such that the norm of $\tilde{s}_{k}$ is at most $\iota \tilde{\Delta }_{k}$, where $\iota \in (0,1)$ is a given constant. That is,

$$\begin{aligned} \Vert \tilde{s}_{k} \Vert = \theta _{k} \Vert l_{k} \Vert \leq \iota \tilde{\Delta } _{k}. \end{aligned}$$

(2.11)

Define

$$\begin{aligned} \theta _{k} = \min \biggl\{ 1, \frac{\iota \tilde{\Delta }_{k}}{ \Vert l _{k} \Vert } \biggr\} ,\qquad \widehat{\Delta }_{k} = \sqrt{ \tilde{\Delta }^{2}_{k}- \Vert \tilde{s} _{k} \Vert ^{2} }. \end{aligned}$$

(2.12)

Then the fractional trust region subproblem (2.2)–(2.4) becomes

$$\begin{aligned}& \min_{u \in R^{n-m}} \tilde{\psi }_{k} (u) = h_{k} \bigl(s(u)\bigr) g^{T}_{k} s(u) + \frac{1}{2} h _{k}^{2} \bigl(s(u)\bigr) s(u)^{T} B_{k} s(u), \end{aligned}$$

(2.13)

$$\begin{aligned}& \text{s.t.}\quad \Vert u \Vert \le \widehat{ \Delta }_{k}, \end{aligned}$$

(2.14)

where $s(u)=\tilde{s}_{k}+ Q_{k}^{(2)} u$.

Remark 1

$\| u \| \le \widehat{ \Delta }_{k}$ is equivalent to $\| s \| \le \tilde{\Delta }_{k}$. It is easy to see that if $\| u \| \le \widehat{ \Delta }_{k}$, then from (2.7), (2.12), and $\|Q_{k}^{(2)}\|=1$, we have

$$\begin{aligned} \Vert s \Vert = \sqrt{ \Vert \tilde{s}_{k} \Vert ^{2} + \bigl\Vert Q_{k}^{(2)} u \bigr\Vert ^{2}} \leq \sqrt{ \Vert \tilde{s}_{k} \Vert ^{2}+ \Vert u \Vert ^{2} }\leq \sqrt{ \Vert \tilde{s}_{k} \Vert ^{2} +\widehat{\Delta }^{2}_{k}}= \tilde{\Delta }_{k}. \end{aligned}$$

On the other hand, if $\| s \| \le \tilde{\Delta }_{k}$, then from (2.5) we have

$$\begin{aligned} \Vert u \Vert = \bigl\Vert Q_{k}^{(2)} u \bigr\Vert = \sqrt{ \Vert s \Vert ^{2}- \Vert \tilde{s}_{k} \Vert ^{2} }\leq \sqrt{\tilde{\Delta }^{2}_{k}- \Vert \tilde{s}_{k} \Vert ^{2} }= \widehat{\Delta }_{k}. \end{aligned}$$

Therefore, if $u_{\ast }$ is the solution of (2.13)–(2.14), then $s_{\ast } = \tilde{s}_{k}+ Q_{k} ^{(2)} u_{\ast }$ is the solution of (2.2)–(2.4), whereas if $s_{\ast }$ is the solution of (2.2)–(2.4), then $u_{\ast } = (Q_{k}^{(2)})^{T}(s_{\ast }-\tilde{s}_{k})$ is the solution of (2.13)–(2.14).

In order to solve subproblem (2.13)–(2.14) by a simple method, we first fix the fractional part of $\tilde{\psi }_{k} (u)$ by letting $u=0$ in $h_{k} (s(u))$. At the same time, we set $u= - \tau \nabla \tilde{\psi }_{k} (0)$, where $\tau >0$, $\nabla \tilde{\psi }_{k} (0)\neq 0$. By direct computation, we know that $\nabla \tilde{\psi }_{k} (0)= (Q_{k}^{(2)})^{T} \nabla \psi _{k} ( \tilde{s}_{k})$, where $\nabla \psi _{k} (s)$ is defined by (1.13). Then from (2.7) we have that (2.13)–(2.14) reduce to the following simplified subproblem:

$$\begin{aligned}& \min_{\tau \in R} \varphi _{k} (\tau ), \end{aligned}$$

(2.15)

$$\begin{aligned}& \text{s.t.}\quad 0\leq \tau \le \tau _{ \Delta }, \end{aligned}$$

(2.16)

where

$$\begin{aligned}& \varphi _{k} (\tau )= h_{k} (\tilde{s}_{k}) g^{T}_{k} s(\tau ) + \frac{1}{2} h_{k}^{2} (\tilde{s}_{k}) s(\tau )^{T} B_{k} s(\tau ) \end{aligned}$$

(2.17)

$$\begin{aligned}& \hphantom{\varphi _{k} (\tau )} = \frac{1}{2} h_{k}( \tilde{s}_{k}) \bigl(a_{\tau }^{(1)} \tau ^{2} + b_{ \tau }^{(1)} \tau + c_{\tau }^{(1)} \bigr), \end{aligned}$$

(2.18)

$$\begin{aligned}& s(\tau )= \tilde{s}_{k}- \tau Q_{k}^{(2)} \bigl(Q_{k}^{(2)}\bigr)^{T}\nabla \psi _{k} (\tilde{s}_{k}), \end{aligned}$$

(2.19)

$$\begin{aligned}& a_{\tau }^{(1)} = h_{k}(\tilde{s}_{k}) \nabla \psi _{k} (\tilde{s}_{k})^{T} Q_{k}^{(2)} \bigl(Q_{k}^{(2)} \bigr)^{T} B_{k} Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T}\nabla \psi _{k} (\tilde{s}_{k}), \end{aligned}$$

(2.20)

$$\begin{aligned}& b_{\tau }^{(1)} = -2\nabla \psi _{k} ( \tilde{s}_{k})^{T} Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T}\bigl( g_{k} + h_{k}(\tilde{s}_{k}) B_{k} \tilde{s}_{k}\bigr), \end{aligned}$$

(2.21)

$$\begin{aligned}& c_{\tau }^{(1)} = 2 g_{k}^{T} \tilde{s}_{k}+h_{k}(\tilde{s}_{k}) \tilde{s}_{k}^{T} B_{k} \tilde{s}_{k} \end{aligned}$$

(2.22)

and

$$\begin{aligned} \tau _{ \Delta }=\frac{ \widehat{ \Delta }_{k}}{ \Vert (Q_{k}^{(2)})^{T} \nabla \psi (\tilde{s}_{k}) \Vert }. \end{aligned}$$

(2.23)

Combining with (1.17)–(1.20) and (2.11), we can obtain that $\|\tilde{s}_{k}\| \leq \tilde{\Delta }_{k}$ and

$$\begin{aligned} \zeta _{1} \leq h_{k} ( \tilde{s}_{k})\leq \zeta _{2}, \end{aligned}$$

(2.24)

where

$$\begin{aligned} \zeta _{1}= \frac{1-\epsilon _{1}}{(1 + {\epsilon _{1}})^{2}},\qquad \zeta _{2}= \frac{1+\epsilon _{1} }{{(1 -{\epsilon _{1}})}^{2}}. \end{aligned}$$

(2.25)

From (1.19), it is obvious that $0<\zeta _{1}<1$ and $\zeta _{2}>1$. From lines 2 and 3 before (2.15), we know that $(Q_{k}^{(2)})^{T} \nabla \psi _{k}(\tilde{s}_{k})\neq 0$, where $Q_{k}^{(2)}$ is defined in (2.5). For $Q_{k}^{(2)}\in R ^{n\times (n-m)}$, $\nabla \psi _{k} (\tilde{s}_{k})\in R^{n}$, then $Q_{k}^{(2)}(Q_{k}^{(2)})^{T}\nabla \psi _{k} (\tilde{s}_{k})\in R^{n}$ and $Q_{k}^{(2)}(Q_{k}^{(2)})^{T}\nabla \psi _{k} (\tilde{s}_{k}) \neq 0$ can be proved easily. Assume that $B_{k}$ is a symmetric positive definite matrix, then combining with (2.20), (2.24), and (2.25) we have

$$ a_{\tau }^{(1)}>0. $$

(2.26)

When $B_{k}$ is only symmetric but not positive definite matrix, then the modified BFGS formula can be used to modify the positive semi-definite matrix to a positive definite matrix (see [25]). Hence, it is easy to see that the solution of (2.15)–(2.16) is

$$\begin{aligned} \tau ^{(1)} = \textstyle\begin{cases} \min \{{\tau _{\mathrm{axis}}^{(1)}, \tau _{ \Delta }} \}, & \text{if } b_{\tau }^{(1)} < 0, \\ 0, & \text{otherwise,} \end{cases}\displaystyle \end{aligned}$$

(2.27)

where

$$\begin{aligned} \tau _{\mathrm{axis}}^{(1)} = \frac{ - b_{\tau }^{(1)} }{2a_{\tau } ^{(1)}}. \end{aligned}$$

(2.28)

We know that $u^{(1)} =-\tau ^{(1)}\nabla \tilde{\psi }_{k} (0)$ is an approximate solution of (2.13)–(2.14). In order to better approximate the solution of the subproblem, we are constantly repeating the above process similarly, except for fixing the fractional part. That is, in the ith iteration we substitute $u=-\tau ^{(i)}\nabla \tilde{\psi }_{k} (0)$ into $h_{k} (s(u))$, where $i=2,3,\ldots $ . We denote

$$\begin{aligned} h_{k}^{(i)}= h_{k} \bigl(s \bigl(-\tau ^{(i)}\nabla \tilde{\psi }_{k} (0)\bigr) \bigr). \end{aligned}$$

(2.29)

Then we can obtain a similar trust-region subproblem to (2.15)–(2.16). That is,

$$\begin{aligned}& \min_{\tau \in R} \varphi _{k}^{(i)} (\tau ), \end{aligned}$$

(2.30)

$$\begin{aligned}& \text{s.t.}\quad 0\leq \tau \le \tau _{ \Delta }, \end{aligned}$$

(2.31)

where

$$\begin{aligned}& \varphi _{k}^{(i)} (\tau )= h_{k}^{(i)} g^{T}_{k} s(\tau ) + \frac{1}{2} \bigl(h_{k}^{(i)}\bigr)^{2} s(\tau )^{T} B_{k} s(\tau ) \end{aligned}$$

(2.32)

$$\begin{aligned}& \hphantom{\varphi _{k}^{(i)} (\tau )} = \frac{1}{2} h_{k}^{(i)} \bigl(a_{\tau }^{(i)} \tau ^{2} + b_{\tau }^{(i)} \tau + c_{\tau }^{(i)} \bigr), \end{aligned}$$

(2.33)

$$\begin{aligned}& a_{\tau }^{(i)} = h_{k}^{(i)}\nabla \psi _{k} (\tilde{s}_{k})^{T} Q _{k}^{(2)} \bigl(Q_{k}^{(2)} \bigr)^{T} B_{k} Q_{k}^{(2)} \bigl(Q_{k}^{(2)}\bigr)^{T} \nabla \psi _{k} (\tilde{s}_{k}) , \end{aligned}$$

(2.34)

$$\begin{aligned}& b_{\tau }^{(i)} = -2\nabla \psi _{k} ( \tilde{s}_{k})^{T}Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T} \bigl( g_{k} + h_{k}^{(i)} B_{k} \tilde{s}_{k}\bigr), \end{aligned}$$

(2.35)

$$\begin{aligned}& c_{\tau }^{(i)} = 2 g_{k}^{T} \tilde{s}_{k}+h_{k}^{(i)} \tilde{s}_{k} ^{T} B_{k} \tilde{s}_{k}. \end{aligned}$$

(2.36)

By the computation, we have the solution of subproblem (2.30)–(2.31) as follows:

$$\begin{aligned} \tau ^{(i)} = \textstyle\begin{cases} \min \{{\tau _{\mathrm{axis}}^{(i)} , \tau _{ \Delta }} \}, & \text{if } b_{ \tau }^{(i)} < 0, \\ 0, & \text{otherwise,} \end{cases}\displaystyle \end{aligned}$$

(2.37)

where

$$\begin{aligned} \tau _{\mathrm{axis}}^{(i)} = \frac{ - b_{\tau }^{(i)} }{2a_{\tau } ^{(i)}}. \end{aligned}$$

(2.38)

However, if $\tau ^{(i)}= 0$, then subproblem (2.30)–(2.31) has no positive real roots. For this case, we set $b_{k} = c_{k}= 0$, and then the fractional model reduces to the conic model. Subproblem (1.21)–(1.24) can be solved by Algorithm 2.4 in [26]. That is, we calculate

$$\begin{aligned}& \tilde{a}_{k} = \bigl(Q_{k}^{(2)} \bigr)^{T}a_{k},\qquad \bar{a} = \frac{\tilde{a}_{k}}{1-a_{k}^{T}\tilde{s}_{k}}, \qquad \bar{g} = \frac{g_{k}}{1-a_{k}^{T}\tilde{s}_{k}}, \\& \bar{B} = \frac{B_{k}}{(1-a_{k}^{T}\tilde{s}_{k})^{2}},\qquad \alpha _{1} = 1- \Vert \bar{a} \Vert ^{2}\widehat{\Delta }^{2}, \\& \widehat{\Delta } = \min \biggl\{ \sqrt{ \tilde{\Delta }^{2} _{k}- \Vert \tilde{s}_{k} \Vert ^{2} }, \frac{1-a_{k}^{T}\tilde{s}_{k}- \varepsilon _{0}}{ \Vert a_{k} \Vert } \biggr\} ,\qquad m = \frac{ \Vert \bar{a} \Vert \widehat{\Delta }^{2}}{\alpha _{1} },\qquad \Delta _{1} = \frac{\widehat{\Delta }}{\sqrt{\alpha _{1}}}, \\& \hat{g} = \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{g}+ \bar{a}\tilde{s}_{k}^{T}\bar{g}+ \bar{a} \tilde{s}_{k}^{T}\bar{B}\tilde{s}_{k}+ \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{B} \tilde{s}_{k},\qquad V = \operatorname{diag}(\sqrt{\alpha _{1}},1,\ldots,1), \\& \hat{B} = \bigl(Q_{k}^{(2)}\bigr)^{T} \bar{B}Q_{k}^{(2)} + \tilde{s}_{k}^{T} \bar{B}\tilde{s}_{k}\bar{a}\bar{a}^{T} +\bar{a} \tilde{s}_{k}^{T} \bar{B}Q_{k}^{(2)}+ \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{B} \tilde{s}_{k}\bar{a}^{T}, \\& g_{1} = V^{-1}Z\hat{g} + mV^{-1}Z \hat{B}Z^{T}e_{1},\qquad B_{1} = V^{-1}Z\hat{B}Z^{T}V^{-1}, \end{aligned}$$

where Z is an orthonormal rotation matrix and satisfies $Z\bar{a}= \|\bar{a}e_{1}\|$, and $e_{1}=(1,0,\ldots,0)^{T}$. Then the solution of (2.13)–(2.14) is

$$\begin{aligned} u_{\ast }= \frac{(1-a_{k}^{T}\tilde{s}_{k})(Z^{T}Vd^{\ast }+m Z^{T}e _{1})}{1-a_{k}^{T}\tilde{s}_{k}+\tilde{a}_{k}^{T}(Z^{T}Vd^{\ast }+m Z ^{T}e_{1})}, \end{aligned}$$

(2.39)

where $d^{\ast }$ is the solution of the quadratic trust region subproblem

$$\begin{aligned}& \min_{d \in R^{n}} g_{1}^{T}d+ \frac{1}{2}d^{T} B_{1} d, \\& \text{s.t.}\quad \Vert d \Vert \leq \Delta _{1}. \end{aligned}$$

Now we give an algorithm for solving the fractional trust region subproblems (2.13)–(2.14).

Subalgorithm 2.1

Step 0.

Input the data of the kth iteration, i.e., $0<\epsilon _{1}$, $\iota <1 $, $\varepsilon >0$, $a_{k}$, $b_{k}$, $c_{k}$, $g_{k}$, $B_{k}$, $A _{k}$, $C_{k}$, and $\tilde{\Delta }_{k}$. Set $i=1$.

Step 1.

Compute $Q_{k}^{(1)}$, $Q_{k}^{(2)}$, and $R^{(1)}_{k}$ as defined in (2.5).

Step 2.

Compute $l_{k}$, $\theta _{k}$, $\widehat{\Delta }_{k}$, and $\tilde{s}_{k}$ as defined in (2.9)–(2.12). Substitute $\tilde{s}_{k}$ into formula (1.13) and obtain $\nabla \psi _{k} (\tilde{s}_{k})$.

Step 3.

Calculate $\tau _{\ast }$.

Step 3.1

Compute $h_{k} (\tilde{s}_{k})$ and $\nabla \psi _{k} ( \tilde{s}_{k})$. Compute $a_{\tau }^{(1)}$, $b_{\tau }^{(1)}$, $c_{\tau } ^{(1)}$, and $\tau _{ \Delta }$ from (2.20)–(2.23). Solve (2.15)–(2.16) obtaining $\tau ^{(1)}$ as defined in (2.27). If $\tau ^{(1)}=0$, go to Step 5; otherwise, $i=i+1$, go to Step 3.2.

Step 3.2

Compute $h_{k}^{(i)}$ and $a_{\tau }^{(i)}$, $b_{\tau } ^{(i)}$, $c_{\tau }^{(i)}$ from (2.29) and (2.34)–(2.36). Solve (2.30)–(2.31) obtaining $\tau ^{(i)}$ as defined in (2.37). If $\tau ^{(i)}=0$, go to Step 5; otherwise, $i=i+1$, go to Step 3.3.

Step 3.3

If $|\tau ^{(i)}-\tau ^{(i-1)}|<\varepsilon $, then $\tau _{\ast }=\tau ^{(i)}$, and stop. Otherwise, go to Step 3.2.

Step 4.

Calculate $u_{\ast } = -\tau _{\ast }(Q_{k}^{(2)})^{T} \nabla \psi _{k}(\tilde{s}_{k})$, then $s_{\ast } = \tilde{s}_{k}+ Q _{k}^{(2)} u_{\ast }$.

Step 5.

Set $b_{k} = c_{k}= 0$. Calculate $u_{\ast } $ as defined in (2.39), then $s_{\ast } = \tilde{s}_{k}+ Q_{k}^{(2)} u_{\ast }$.

From the above analysis, we know that the steepest descent point $u_{\ast } $ is an approximate solution of the fractional trust region subproblem (2.13)–(2.14).

3 Global convergence

In this section, we propose a quasi-Newton method with a fractional model for nonlinearly equality constrained optimization and prove its convergence under some reasonable conditions. In order to solve problem (1.1)–(1.2), we approximate $f(x)$ with a fractional model of the form

$$ m_{k}(s)=f_{k}+\frac{(1+c_{k}^{T}s)g_{k}^{T}s}{(1-a_{k}^{T}s) (1-b _{k}^{T}s)}+ \frac{1}{2}\frac{(1+c_{k}^{T}s)^{2}s^{T}B_{k}s}{(1-a_{k} ^{T}s)^{2}(1-b_{k}^{T}s)^{2}}, $$

(3.1)

where $f_{k}=f(x_{k})$, $g_{k}=\nabla f(x_{k})$, $B_{k}\in R^{n\times n}$ is a positive definite matrix, and $a_{k}, b_{k}, c_{k}\in R^{n}$ are parameter vectors. We choose these vectors such that (3.1) satisfies the following conditions:

$$\begin{aligned}& m_{k}(0)=f_{k},\qquad \nabla m_{k}(0)=g_{k}, \end{aligned}$$

(3.2)

$$\begin{aligned}& m_{k}(-s_{k-1}) = f_{k-1},\qquad \nabla m_{k}(-s_{k-1}) = g_{k-1}, \end{aligned}$$

(3.3)

where $x_{k}=x_{k-1}+s_{k-1}$. Obviously, (3.2) holds. Then from (3.3) we have where $v_{k-1} = (1-a_{k}^{T} s_{k-1})(1-b_{k}^{T} s_{k-1})$ and

$$\begin{aligned} Q_{k-1} =& \bigl(1-c_{k}^{T} s_{k-1}\bigr)v_{k-1}I-\bigl[c_{k}v_{k-1}+ \bigl(1-c_{k}^{T} s _{k-1}\bigr) \bigl(a_{k}\bigl(1+B_{k}^{T} s_{k-1}\bigr) \\ &{}+B_{k}\bigl(1+a_{k}^{T} s_{k-1}\bigr)\bigr)\bigr]s_{k-1} ^{T}. \end{aligned}$$

(3.6)

We choose

$$ a_{k}=k_{1}g_{k-1}, \qquad b_{k}=k_{2}B_{k-1}s_{k-1} , \qquad c_{k}=k_{3}g_{k}, $$

(3.7)

where $k_{1}$, $k_{2}$, $k_{3}$ are unknown parameters, and details of the choice of parameters $k_{1}$, $k_{2}$, $k_{3}$ can be found in [24].

The merit function we applied is the $L_{1}$ exact penalty function

$$\begin{aligned} P(x)=f(x)+\sigma \bigl\Vert C(x) \bigr\Vert _{1}, \end{aligned}$$

(3.8)

where $\sigma >0$ is a penalty parameter. We know that, for σ sufficiently large, any strong local minimizer of (1.1)–(1.2) is a local minimizer of $P(x)$ (see [27]). It is found that this function is very convenient to be used as a merit function to force global convergence in line search type algorithms (for example, see [1]). We define the actual reduction in the merit function by

$$\begin{aligned} \operatorname{Ared}_{k}=P(x_{k})-P(x_{k}+s_{k}), \end{aligned}$$

(3.9)

where $s_{k}$ is a trial step computed by the algorithm at $x_{k}$. The above choice of actual reduction is used to prove the global convergence of the algorithm conveniently. Correspondingly, the predicted reduction is defined as

$$\begin{aligned} \operatorname{Pred}_{k}=\psi _{k} (0)-\psi _{k} (s_{k})+\sigma _{k}\bigl( \Vert C_{k} \Vert _{1}- \bigl\Vert C_{k}+A_{k}^{T}s_{k} \bigr\Vert _{1}\bigr), \end{aligned}$$

(3.10)

where $\psi _{k} (s)$ is defined as (1.11). If

$$\begin{aligned} \psi _{k} (0)-\psi _{k} (s_{k})\geq - \frac{\sigma _{k-1} }{2}\bigl( \Vert C_{k} \Vert _{1}- \bigl\Vert C_{k}+A_{k}^{T}s_{k} \bigr\Vert _{1}\bigr), \end{aligned}$$

(3.11)

we set the penalty parameter

$$\begin{aligned} \sigma _{k}=\sigma _{k-1} ; \end{aligned}$$

(3.12)

otherwise,

$$\begin{aligned} \sigma _{k} = \max \biggl[2\sigma _{k-1}, \frac{2(\psi _{k} ( s_{k})-\psi _{k} (0))}{ \Vert C_{k} \Vert _{1}- \Vert C_{k}+A_{k}^{T}s_{k} \Vert _{1}} \biggr]. \end{aligned}$$

(3.13)

Then

$$\begin{aligned} \operatorname{Pred}_{k} \geq \frac{1 }{2}\sigma _{k}\bigl( \Vert C_{k} \Vert _{1}- \bigl\Vert C_{k}+A _{k}^{T}s_{k} \bigr\Vert _{1}\bigr) \end{aligned}$$

(3.14)

holds for all k.

In the following algorithm, we compute the ratio of the actual reduction and the predicted reduction to choose the next iterate point and update the new trust region. Now we give the new quasi-Newton algorithm based on the fractional model (3.1).

Algorithm 3.1

Step 0.

Choose $x_{0}\in R^{n}$, $\varepsilon >0$, $\varepsilon _{0}>0$, $\epsilon _{1} \in (0, 1)$, $\Delta _{\max }>0$, $0<\iota _{1}< \iota _{2}<1$, $0<\delta _{1} <1< \delta _{2}$, $\sigma _{0}>0$, $B_{0} =I$, and the initial trust region radius $\Delta _{0}\in (0,\Delta _{\max }]$. Set $k=0$.

Step 1.

Stopping criterion. Compute $g_{k}$, $C_{k}$, and $Q_{k}^{(2)}$ as defined in (2.5). If $\| (Q_{k}^{(2)} )^{T} g_{k} \|\leq \varepsilon $ and $\| C_{k} \|\leq \varepsilon $, then $x_{\ast }=x_{k}$, stop. If $k=0$, go to Step 3.

Step 2.

Update $B_{k+1}$ by

$$ B_{k+1} = B_{k}-\frac{B_{k}s_{k}s_{k}^{T}B_{k}}{s_{k}^{T}B_{k}s_{k}}+ \frac{z_{k}z_{k}^{T}}{z_{k}^{T}s_{k}}, $$

(3.15)

where

$$\begin{aligned}& z_{k}=\vartheta y_{k}+(1-\vartheta )B_{k}s_{k}, \quad \vartheta \in [0,1], \\& \vartheta =\textstyle\begin{cases} 1,& \text{if } y_{k}^{T}s_{k}\geq 0.2s_{k}^{T}B_{k}s_{k}, \\ \frac{0.8s_{k}^{T}B_{k}s_{k}}{s_{k}^{T}B_{k}s_{k}-y_{k}^{T}s_{k}}, &\text{otherwise}, \end{cases}\displaystyle \end{aligned}$$

and $y_{k}=g_{k+1}-g_{k}$.

Step 3.

If $k\leq 1$, then set $a_{k}=b_{k}=c_{k}=0$ and $d_{k}=- B_{k}^{-1}g _{k}$, compute $\alpha _{k}$ such that Wolfe–Powell conditions are satisfied, and set $x_{k+1}=x_{k}+s_{k}=x_{k}+\alpha _{k}d_{k}$. $k=k+1$ and go to Step 1.

Step 4.

Compute

$$\begin{aligned} \textstyle\begin{cases} \alpha = g_{k-1}^{T}s_{k-1},\qquad \tilde{\alpha }=g_{k-1}^{T} \xi _{1},\qquad \hat{\alpha } =g_{k-1}^{T}\xi _{2}, \\ \beta =s_{k-1}^{T}B_{k-1}s_{k-1},\qquad \tilde{\beta } =s_{k-1}^{T}B_{k-1}\xi _{1},\qquad \hat{\beta } =s_{k-1}^{T}B_{k-1}\xi _{2}, \\ \zeta =s_{k-1}^{T}B_{k}s_{k-1},\qquad \tilde{\zeta } =s_{k-1}^{T}B_{k}\xi _{1},\qquad \hat{\zeta } =s_{k-1}^{T}B_{k}\xi _{2}, \\ \gamma = g_{k}^{T}s_{k-1},\qquad \tilde{\gamma } =g_{k}^{T}\xi _{1},\qquad \hat{\gamma } =g_{k}^{T} \xi _{2}, \end{cases}\displaystyle \end{aligned}$$

(3.16)

where $\xi _{1}$ and $\xi _{2}$ are chosen such that

$$\begin{aligned} \tilde{\alpha } = \tilde{\gamma } = \hat{\zeta } = \hat{\gamma } = 0. \end{aligned}$$

(3.17)

Step 5.

If $\gamma =0$, then go to Step 6; otherwise, compute

$$\begin{aligned}& \eta = \bigl(\gamma +\sqrt{\gamma ^{2}+ 2\zeta (f_{k-1}- f_{k}) }\bigr)/ \zeta , \\& \ddot{\gamma } =\eta \tilde{\beta } \zeta -\eta \beta \tilde{\zeta }-\tilde{\beta }\gamma , \\& \tilde{\iota }= \gamma -\eta \zeta ,\qquad \dot{\iota }= \tilde{\iota }\bigl( \hat{\alpha }\tilde{\beta }- \eta ^{2} \tilde{\zeta }\hat{\beta } \bigr),\\& \dot{\gamma }= \alpha \dot{\iota }+\eta \hat{\alpha } \tilde{\beta } \tilde{\iota }^{2}. \end{aligned}$$

If $\dot{\gamma }=0$ or $\ddot{\gamma }=0$, then go to Step 6; otherwise, calculate

$$\begin{aligned}& k_{1} = -\dot{\iota }/\dot{\gamma },\qquad k_{2} = \eta \tilde{\zeta }/\ddot{\gamma }, \\& k_{3} = \frac{1-\eta (1+k_{1}\alpha )(1+k_{2}\beta )}{\gamma }, \end{aligned}$$

and $a_{k}$, $b_{k}$, $c_{k}$ as determined in (3.7), go to Step 7.

Step 6.

Let $b_{k}=c_{k}=0$. Calculate

$$\begin{aligned}& \rho _{k} = (f_{k-1}- f_{k})^{2}- \alpha \gamma , \end{aligned}$$

(3.18)

$$\begin{aligned}& \dot{\beta } = \textstyle\begin{cases} \frac{(f_{k-1}- f_{k})+\sqrt{\rho _{k}}}{-\alpha },&\text{if } \rho_{k} \geq 0, \\ 1,&\text{otherwise}, \end{cases}\displaystyle \end{aligned}$$

(3.19)

and set

$$\begin{aligned} a_{k}= \frac{1-\dot{\beta }}{\alpha }g_{k-1}. \end{aligned}$$

(3.20)

Step 7.

If $\|a_{k}\| > \frac{\epsilon _{1}}{\tilde{\Delta }_{k}}$, then $a_{k} = \frac{\epsilon _{1} a_{k}}{\tilde{\Delta }_{k}\|a_{k}\|}$. Update $b_{k}$ and $c_{k}$ in the same way so that (1.20) are satisfied.

Step 8.

Solve subproblem (2.2)–(2.4) by Subalgorithm 2.1 to get $s_{k}$.

Step 9.

If (3.11) holds, then $\sigma _{k}=\sigma _{k-1}$; otherwise, calculate $\sigma _{k}$ as defined in (3.13).

Step 10.

Compute

$$ \rho _{k}=\frac{\operatorname{Ared}_{k}}{\operatorname{Pred}_{k}}, $$

(3.21)

where $\operatorname{Ared}_{k}$, $\operatorname{Pred}_{k}$ are defined in (3.9) and (3.10).

Step 11.

Update the trust region radius

$$\begin{aligned} \tilde{\Delta }_{k+1}= \textstyle\begin{cases} \delta _{1} \tilde{\Delta }_{k}, &\text{if } \rho _{k}\leq \iota _{1}, \\ \min \{\delta _{2}\tilde{\Delta }_{k}, \Delta _{\max }\}, &\text{if } \rho _{k}\geq \iota _{2} \text{ and } \Vert s_{k} \Vert = \tilde{\Delta }_{k}, \\ \tilde{\Delta }_{k}, &\text{otherwise}. \end{cases}\displaystyle \end{aligned}$$

(3.22)

Step 12.

If $\rho _{k}\geq \iota _{1}$, then $x_{k+1}=x_{k}+s_{k}$. Set $k=k+1$, and go to Step 1; otherwise $x_{k+1}=x_{k}$, $k=k+1$, and go to Step 6.

Remark 2

In order to ensure the positive definite transfer of the Hessian matrix, we use the modified BFGS formula in Step 2 to iterate to get the positive definite matrix $B_{k+1}$ (see [25]).

In the following, we establish the convergence results of Algorithm 3.1. The focus of this paper is to transform the fractional model trust region subproblem of the equality constrained optimization model into a simple one-dimensional quadratic model subproblem by cyclically fixing the fractional coefficient part of the model, so as to obtain a new approximate solution method for solving the subproblem. Therefore, the framework of the global convergence proof is similar to the proof process of the conic model trust region algorithm in [20], with a major difference being the lower bound of reduction in each iteration (see Lemma 3.1), which is an important result required in the proof of convergence.

Lemma 3.1

Suppose that (1.20) holds, where $\epsilon _{1}$satisfies (2.1). If $\{x_{k}\}$, $\{B_{k}\}$, and $\{(A_{k}^{T}A_{k})^{-1} \}$are uniformly bounded. Let $s_{k}$be the solution of subproblem (2.2)–(2.4), then there exist positive constants $M_{1}$and $M_{2}$such that

$$\begin{aligned} \psi _{k} (0)-\psi _{k} (s_{k}) \geq & M_{1} \bigl\Vert \bigl(Q_{k}^{(2)}\bigr)^{T}g_{k} \bigr\Vert \min \biggl\{ \tilde{\Delta }_{k}, \frac{ \Vert (Q_{k}^{(2)})^{T} g _{k} \Vert }{ \Vert B_{k} \Vert } \biggr\} \\ & {}- M_{2} \bigl(1+ \Vert B_{k} \Vert \bigr) \min \bigl\{ \tilde{\Delta }_{k}, \Vert C _{k} \Vert \bigr\} \end{aligned}$$

(3.23)

for allk.

Proof

Define

$$\begin{aligned} s_{k} (t) = \theta _{k} l_{k} -t Q_{k}^{(2)}\bigl(Q_{k}^{(2)} \bigr)^{T} g_{k}, \end{aligned}$$

(3.24)

where $\theta _{k}$ and $l_{k} $ are defined in (2.12) and (2.9) respectively. Then we can see that $s_{k} (t)$ is in the feasible region of subproblem (2.2)–(2.4) for all $t\in [0, \widehat{\Delta }_{k}/{\| (Q_{k}^{(2)})^{T} g_{k} \|}]$. From the definition of $s_{k}$ and $s_{k} (t)$, we have

$$\begin{aligned} \psi _{k} (0)-\psi _{k} (s_{k}) \geq \psi _{k} (0)-\psi _{k} \bigl(s_{k}(t)\bigr) \end{aligned}$$

(3.25)

for all $t\in [0, \widehat{\Delta }_{k}/{\| (Q_{k}^{(2)})^{T} g_{k} \|}]$. From (1.20), (2.24), and $s_{k} (t)\leq \tilde{\Delta }_{k}$ it follows

$$\begin{aligned} \zeta _{1} \leq h_{k} \bigl(s_{k}(t)\bigr)\leq \zeta _{2}, \end{aligned}$$

(3.26)

where $\zeta _{1}$, $\zeta _{2}$ are defined in (2.25). By $\|Q_{k}^{(2)}\|=1$, the Cauchy–Schwarz inequality, and for all $t\in [0, \widehat{\Delta }_{k}/{\| (Q_{k}^{(2)})^{T} g_{k} \|}]$, we have

$$\begin{aligned}& \psi _{k} (0)-\psi _{k} \bigl(s_{k}(t)\bigr) \\& \quad = -h_{k} \bigl(s_{k}(t)\bigr) \theta _{k} g^{T}_{k} l_{k} + t h_{k} \bigl(s_{k}(t)\bigr) \bigl\Vert \bigl(Q_{k}^{(2)}\bigr)^{T} g_{k} \bigr\Vert ^{2} \\& \qquad {}- \frac{1}{2} \bigl(h_{k} \bigl(s_{k}(t) \bigr)\bigr)^{2} \theta _{k} ^{2}l_{k}^{T} B _{k} l_{k} -\bigl(h_{k} \bigl(s_{k}(t)\bigr)\bigr)^{2} \theta _{k}l_{k}^{T} B_{k}Q_{k}^{(2)} \bigl(-t\bigl(Q _{k}^{(2)}\bigr)^{T} g_{k}\bigr) \\& \qquad {} - \frac{1}{2} \bigl(h_{k} \bigl(s_{k}(t) \bigr)\bigr)^{2} t^{2} g_{k}^{T} Q_{k}^{(2)} \bigl(Q _{k}^{(2)} \bigr)^{T}B_{k} Q_{k}^{(2)} \bigl(Q_{k}^{(2)}\bigr)^{T} g_{k} \\& \quad \geq -\zeta _{2}\theta _{k} \Vert l_{k} \Vert \Vert g_{k} \Vert + t\zeta _{1} \bigl\Vert \bigl(Q_{k} ^{(2)} \bigr)^{T} g_{k} \bigr\Vert ^{2} \\& \qquad {} - \frac{1}{2} \zeta _{2}^{2} \bigl( \theta _{k} ^{2} \Vert l_{k} \Vert ^{2} \Vert B _{k} \Vert + 2 \theta _{k} \Vert l_{k} \Vert \Vert B_{k} \Vert \widehat{ \Delta }_{k} +t ^{2} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert ^{2} \Vert B_{k} \Vert \bigr) \\& \quad \geq t\zeta _{1} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert ^{2}- \frac{1}{2} t^{2} \zeta _{2}^{2} \bigl\Vert \bigl(Q_{k}^{(2)}\bigr)^{T} g_{k} \bigr\Vert ^{2} \Vert B_{k} \Vert \\& \qquad {} - \zeta _{2}\theta _{k} \Vert l_{k} \Vert \Vert g_{k} \Vert - \zeta _{2}^{2} \theta _{k} \Vert l_{k} \Vert \Vert B_{k} \Vert \bigl( \theta _{k} \Vert l_{k} \Vert +\widehat{\Delta } _{k} \bigr). \end{aligned}$$

(3.27)

By (2.12), we obtain

$$\begin{aligned}& \max_{t\in [0, \widehat{\Delta }_{k}/{ \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }]} \biggl\{ t\zeta _{1} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert ^{2}- \frac{1}{2} t^{2}\zeta _{2}^{2} \bigl\Vert \bigl(Q_{k}^{(2)}\bigr)^{T} g_{k} \bigr\Vert ^{2} \Vert B _{k} \Vert \biggr\} \\& \quad \geq \frac{\zeta _{1}}{2} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert ^{2} \min \biggl\{ \frac{\widehat{\Delta }_{k}}{ \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }, \frac{ \zeta _{1}}{\zeta _{2}^{2} \Vert B_{k} \Vert } \biggr\} \\& \quad \geq \frac{\zeta _{1}}{2} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert \min \biggl\{ \tilde{\Delta }_{k}, \frac{\zeta _{1} \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }{ \zeta _{2}^{2} \Vert B_{k} \Vert } \biggr\} - \frac{\zeta _{2}}{2} \bigl\Vert \bigl(Q_{k}^{(2)}\bigr)^{T} g_{k} \bigr\Vert \bigl(\theta _{k} \Vert l_{k} \Vert \bigr) \\& \quad \geq M_{1} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert \min \biggl\{ \tilde{\Delta }_{k}, \frac{ \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }{ \Vert B_{k} \Vert } \biggr\} - \zeta _{2}^{2} \theta _{k} \Vert l_{k} \Vert \Vert g_{k} \Vert , \end{aligned}$$

(3.28)

where $0<\zeta _{1}<1$, $\zeta _{2}>1$, $M_{1}= {\zeta _{1}^{2}}/(2\zeta _{2}^{2})$, and the second inequality is obtained from the triangular inequality properties. Then from (3.25)–(3.28) we have

$$\begin{aligned} \psi _{k} (0)-\psi _{k} (s_{k}) \geq & \max_{t\in [0, \widehat{\Delta }_{k}/{ \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }]} \psi _{k} (0)-\psi _{k} \bigl(s_{k}(t)\bigr) \\ \geq & M_{1} \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert \min \biggl\{ \tilde{\Delta }_{k}, \frac{ \Vert (Q_{k}^{(2)})^{T} g_{k} \Vert }{ \Vert B_{k} \Vert } \biggr\} \\ &{} - \zeta _{2}^{2}\theta _{k} \Vert l_{k} \Vert \bigl[ \Vert B_{k} \Vert \bigl( \theta _{k} \Vert l_{k} \Vert +\widehat{\Delta }_{k} \bigr)+2 \Vert g_{k} \Vert \bigr]. \end{aligned}$$

(3.29)

Besides, since $\{(A_{k}^{T}A_{k})^{-1}\}$ are uniformly bounded, then from Lemma 3.1 in [20] we know that there exists a positive constant $\zeta _{3}$ such that

$$\begin{aligned} \theta _{k} \Vert l_{k} \Vert \leq \min \bigl\{ \iota \tilde{\Delta }_{k}, \zeta _{3} \Vert C_{k} \Vert \bigr\} \end{aligned}$$

holds where $\iota \in (0,1)$ is defined in (2.11). Then we have

$$\begin{aligned} \theta _{k} \Vert l_{k} \Vert \leq \max \{ \iota , \zeta _{3} \} \min \bigl\{ \tilde{\Delta }_{k}, \Vert C_{k} \Vert \bigr\} . \end{aligned}$$

(3.30)

The boundedness of $\{x_{k}\}$, rule (3.22), and inequality (3.30) imply that $\theta _{k} \| l_{k}\|+ \widehat{\Delta }_{k} $ and $\| g_{k}\|$ are bounded above uniformly. For instance, if $\theta _{k} \| l_{k}\|+ \widehat{\Delta }_{k} \leq \kappa _{1}$ and $\| g_{k}\| \leq \kappa _{2}$ for all sufficiently large k, then we have

$$\begin{aligned} \Vert B_{k} \Vert \bigl( \theta _{k} \Vert l_{k} \Vert +\widehat{\Delta }_{k} \bigr)+2 \Vert g_{k} \Vert \leq \kappa _{1} \Vert B_{k} \Vert +2\kappa _{2} \leq \max \{ \kappa _{1}, 2 \kappa _{2} \} \bigl( \Vert B_{k} \Vert +1\bigr). \end{aligned}$$

(3.31)

Let

$$\begin{aligned} M_{2}= \zeta _{2}^{2} \max \{ \iota , \zeta _{3} \} \max \{ \kappa _{1}, 2\kappa _{2} \} . \end{aligned}$$

Then from (3.29)–(3.31) we know that (3.23) holds. Hence the theorem is proved. □

The proofs of the following Theorems 3.1–3.2 are similar to those in [20], so we only give the conclusion and omit the proofs.

Theorem 3.1

Under the conditions of Lemma 3.1, we have

$$\begin{aligned} \lim_{k\to \infty } \Vert C_{k} \Vert = 0. \end{aligned}$$

Theorem 3.2

Under the conditions of Lemma 3.1, we have

$$\begin{aligned} \lim_{k\to \infty } \inf \bigl\Vert \bigl(Q_{k}^{(2)} \bigr)^{T} g_{k} \bigr\Vert = 0. \end{aligned}$$

Theorem 3.3

Let the conditions of Lemma 3.1hold and $\lambda _{k} = (R^{(1)} _{k})^{-1} (Q_{k}^{(1)})^{T} g_{k} $. Then we have

$$\begin{aligned} \lim_{k\to \infty } \inf \Vert g_{k}-A_{k}\lambda _{k} \Vert = 0. \end{aligned}$$

(3.32)

Proof

From the definition of $\lambda _{k}$ and (2.5), we have

$$\begin{aligned} g_{k}-A_{k}\lambda _{k} =& \bigl(Q^{(1)}_{k} \bigl(Q_{k}^{(1)} \bigr)^{T} + Q^{(2)} _{k} \bigl(Q_{k}^{(2)} \bigr)^{T}\bigr)g_{k}- Q_{k}^{(1)}R^{(1)}_{k} \bigl(R^{(1)}_{k}\bigr)^{-1} \bigl(Q_{k}^{(1)}\bigr)^{T} g_{k} \\ =& Q^{(2)}_{k} \bigl(Q_{k}^{(2)} \bigr)^{T}g_{k}. \end{aligned}$$

From Theorem 3.2 it follows that (3.32) holds. □

From Theorems 3.1 and 3.3, we know that there exists a subsequence of $\{x_{k}\}$ produced by Algorithm 3.1 converging to the KT point of (1.1)–(1.2).

4 Numerical experiment

In this section, Algorithm 3.1 (abbreviated as FTR) is tested with some test problems, where eight problems are directly chosen from [28, 29] and are listed in Table 1.

Table 1

Test functions

Pro	Function Name	Pro	Function Name
1	HS46	2	HS47
3	HS48	4	HS49
5	HS50	6	HS51
7	HS52	8	HS53

Moreover, in order to test Algorithm 3.1 more generally, we designed three problems (Problems 9–11) where Problems 10–11 are chosen from [30, 31] and the nonlinear equality constraints are polynomials (see [26]).

Conic function

$$\begin{aligned}& f(x)=\sum_{i=1}^{n/2} \frac{x_{2i-1}^{2}+x_{2i}^{2}}{(1-x_{2i-1})^{2}} \\& \mbox{s.t.}\quad c_{i}(x)=x_{2i}^{2}-x_{2i-1}x_{2i}+10=0, \quad 1 \leq i \leq n/2. \end{aligned}$$

10.

Generalized Brown function

$$\begin{aligned}& f(x)= \sum_{i=1}^{n/2} \bigl( \bigl(x_{2i-1}^{2}\bigr)^{(x_{2i}^{2}+1)}+ \bigl(x_{2i} ^{2}\bigr)^{(x_{2i-1}^{2}+1)} \bigr) \\& \mbox{s.t.}\quad c_{i}(x)=(3-2x_{i+1})x_{i+1}+1-x_{i}-2 x_{i+2}=0,\quad 1 \leq i \leq n-2. \end{aligned}$$

11.

Penalty-I function

$$\begin{aligned}& f(x)= 10^{-5} \sum_{i=1}^{n} (x_{i}-1)^{2}+\Biggl(\sum_{i=1}^{n} x_{i} ^{2}-0.25\Biggr)^{2} \\& \mbox{s.t.}\quad c_{i}(x)=3x_{i}^{3} +4x_{i+1}^{2}=0, \quad 1 \leq i \leq n-1. \end{aligned}$$

Our fractional model is proposed on the basis of the conic model, and it is a generalized form of the conic model. Both of them are suitable for solving the optimization problem that the objective function is non-quadratic and the curvature changes severely. At this time, the quadratic model methods often produce a poor prediction of the minimizer of the function. In order to compare the calculation results of the fractional model and the conic model, we let $b_{k}=c_{k}=0$ in Algorithm 3.1. Then we can obtain the conic model algorithm and call this algorithm CTR. Therefore, we solve these test problems by FTR and CTR respectively and compare their results.

All the computations are carried out in Matlab R2013b on a microcomputer in double precision arithmetic. These tests use the same stopping criterion $\| (Q_{k}^{(2)} )^{T} g_{k} \|\leq \varepsilon $ and $\| C_{k} \|\leq \varepsilon $. If the stoping criterion is not satisfied with $\text{Iter} \leq 5000$, then the test is terminated. We mark these by an ∗ in such tables. The columns in the tables have the following meanings: Pro denotes the number of the test problems; n is the dimension of the test problems, Iter is the number of iterations; nf and ng are the numbers of function and gradient evaluations respectively; Qg is the value of Euclidean norm $\|(Q_{k}^{(2)} )^{T} g_{k}\|$ at the final iteration; CPU(s) denotes the total iteration time of the algorithm in seconds.

The parameters in these algorithms are as follows:

$$\begin{aligned}& B_{0} = I, \qquad \epsilon _{1} = 0.6,\qquad \varepsilon =10^{-6},\qquad \varepsilon _{0}=10^{-3}, \qquad \Delta _{0} = 2, \\& \Delta _{\max } = 10,\qquad \iota = \sqrt{0.5},\qquad \sigma _{0}=1,\qquad \iota _{1} = 0.01,\qquad \iota _{2} =0.7, \\& \delta _{1} =0.25,\qquad \delta _{2} =1.5. \end{aligned}$$

The numerical comparison for 11 test problems is listed in Table 2. Because of the difference in the initial iteration point, we actually performed 19 experiments. In terms of the number of iterations, our algorithm FTR may be somewhat superior to CTR for 17 tests, and the two algorithms are the same in efficiency for the other 2 tests. Because FTR needs some extra algebra computation for some parameters, FTR takes more time than CTR.

Table 2

The numerical results of Algorithm 3.1 for some test problems

Pro	n	Starting point	Algorithm	Iter	nf/ng	Qg	CPU (s)
1	8	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	CTR	37	37/32	2.325673(−7)	0.065033
1	8	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	FTR	30	30/24	8.903926(−7)	0.265808
2	5	(2,1.5,−1,0.5,2)	CTR	12	12/12	1.923822(−8)	0.069842
2	5	(2,1.5,−1,0.5,2)	FTR	12	12/11	1.208933(−7)	0.179698
3	5	(3,5,−3,3,5)	CTR	29	29/29	7.867587(−7)	0.104208
3	5	(3,5,−3,3,5)	FTR	28	28/27	7.233105(−7)	0.209572
4	5	(7,7,5,−3,0.2)	CTR	28	28/24	8.816453(−7)	0.056596
4	5	(7,7,5,−3,0.2)	FTR	25	25/25	1.794643(−7)	0.187379
5	5	(35,−31,11,5,−5)	CTR	20	20/20	8.080796(−7)	0.054710
5	5	(35,−31,11,5,−5)	FTR	19	19/19	9.688987(−7)	0.208662
6	5	(2.5,0.5,2,−1,0.5)	CTR	10	10/9	2.088608(−7)	0.050139
6	5	(2.5,0.5,2,−1,0.5)	FTR	10	10/9	5.040703(−7)	0.170110
7	5	(3,3,3,3,3)	CTR	23	23/20	7.777555(−9)	0.058845
7	5	(3,3,3,3,3)	FTR	20	20/19	1.238526(−9)	0.191213
8	5	(3,3,3,3,3)	CTR	34	34/24	2.452030(−7)	0.085232
8	5	(3,3,3,3,3)	FTR	19	19/17	3.532659(−7)	0.189874
9	4	(−2,2,−2,2)	CTR	110	110/98	8.314982(−11)	0.163601
	4	(−2,2,−2,2)	FTR	74	74/67	1.493034(−8)	0.331437
	4	(−3,−1,−3,−1)	CTR	113	113/110	8.342504(−7)	0.167646
	4	(−3,−1,−3,−1)	FTR	26	26/23	1.415245(−8)	0.209718
	10	(−3,−1,…,−3,−1)	CTR	100	100/92	1.493407(−9)	0.203343
	10	(−3,−1,…,−3,−1)	FTR	34	34/31	8.041779(−9)	0.285324
	80	(−1,1,…,−1,1)	CTR	178	178/164	9.716836(−7)	0.871734
	80	(−1,1,…,−1,1)	FTR	128	128/119	1.525727(−7)	3.198384
10	6	(−1,1,−1,1,−1,1)	CTR	22	22/19	1.976346(−7)	0.060672
	6	(−1,1,−1,1,−1,1)	FTR	14	14/12	2.538846(−7)	0.193613
	4	(−1.2,1,−1.2,1)	CTR	27	27/24	7.132702(−9)	0.081467
	4	(−1.2,1,−1.2,1)	FTR	24	24/22	6.336906(−9)	0.223280
11	4	(−1.2,1,−1.2,1)	CTR	118	118/103	1.144310(−7)	0.142733
	4	(−1.2,1,−1.2,1)	FTR	16	16/12	9.425581(−7)	0.191315
	4	(0.75,0.5,0.25,0)	CTR	59	59/45	3.391065(−8)	0.080867
	4	(0.75,0.5,0.25,0)	FTR	42	42/32	9.504125(−7)	0.230135
	4	(−3,−1,−3,−1)	CTR	15	15/15	1.422248(−9)	0.047370
	4	(−3,−1,−3,−1)	FTR	13	13/13	2.558427(−7)	0.180390
	4	(3,4,3,4)	CTR	538	538/515	2.226214(−7)	0.653460
	4	(3,4,3,4)	FTR	19	19/17	2.771759(−7)	0.189182
	4	(−1,1,−1,1)	CTR	160	160/142	1.398612(−8)	0.184020
	4	(−1,1,−1,1)	FTR	12	12/11	6.116235(−9)	0.190648

We know that large-scale problems with 10,000 or more dimensions can be tested for unconstrained optimization problems. But when the test problem contains nonlinear equality constraints, the difficulty of solving the problem is greatly increased. Therefore, the dimensions of the nonlinear equality constraint problems are limited. From Table 3, we can see that as the dimensions of each test problem range from 25 to 400, we have computed nine numerical comparisons experiments for three test questions. It can be found that our algorithm can get the minimum value of the function after a finite number of iterations, but the algorithm CTR can not, and some test problem algorithms failed. The number of iterations of FTR is less than that of CTR. That is, with the increase of the dimension of the problem, our algorithm FTR can be superior to CTR both in iteration number and algorithm stability.

Table 3

The numerical results of Algorithm 3.1 for some test problems ($\varepsilon =10^{-4}$)

Pro	n	Starting point	Algorithm	Iter	nf/ng	Qg	CPU (s)
1	50	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	CTR	66	66/59	7.675845(−5)	0.300518
	50	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	FTR	29	29/23	5.866993(−5)	0.627469
	101	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	CTR	137	137/128	9.543858(−5)	1.619127
	101	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	FTR	37	37/29	2.970168(−5)	1.184047
	200	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	CTR	∗	∗/∗	∗	∗
	200	(2,1.5,0.5,…,2,1.5,0.5,2,1.5)	FTR	33	33/27	7.135361(−5)	1.942937
5	25	(35,11,5,−5,…,35,11,5,−5,35,11,5)	CTR	76	76/64	7.377510(−5)	0.232441
	25	(35,11,5,−5,…,35,11,5,−5,35,11,5)	FTR	58	58/55	8.181980(−5)	0.546569
	61	(35,11,5,−5,…,35,11,5,−5,35,11,5)	CTR	∗	∗/∗	∗	∗
	61	(35,11,5,−5,…,35,11,5,−5,35,11,5)	FTR	74	74/64	7.910359(−5)	1.934600
10	30	(−1,1,…,−1,1)	CTR	81	81/75	5.527921(−5)	0.202645
	30	(−1,1,…,−1,1)	FTR	26	26/21	3.795910(−6)	0.355148
	100	(−1,1,…,−1,1)	CTR	26	26/21	1.689243(−5)	0.197726
	100	(−1,1,…,−1,1)	FTR	21	21/16	1.357598(−5)	0.657030
	200	(−1,1,…,−1,1)	CTR	45	45/36	7.697442(−6)	0.713368
	200	(−1,1,…,−1,1)	FTR	28	28/20	1.204414(−5)	1.409634
	400	(−1,1,…,−1,1)	CTR	80	80/70	1.815889(−5)	5.648017
	400	(−1,1,…,−1,1)	FTR	24	24/18	9.612549(−5)	3.302268

5 Conclusions

The fractional model in Algorithm 3.1 is the extension of a conic model. By using more information of function and gradient from the previous iterations and choosing parameters flexibly, the fractional model can be more approximate to the original problem. When solving the trust region subproblem, we reduce the fractional model of the subproblem to a quadratic model by cycling the fixed coefficient part, and iteratively search in the descending direction to obtain the approximate solution of the subproblem. Solving of the subproblems is creative and the algorithm is simple. The theoretical and numerical results show that the method is effective and robust.

Acknowledgements

We are grateful to the editors and referees for their suggestions and comments.

Availability of data and materials

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vorheriger Artikel Positive periodic solutions for multiparameter nonlinear differential systems with delays

Nächster Artikel A subgradient extragradient algorithm for solving monotone variational inequalities in Banach spaces

Han, S.P.: A globally convergent method for nonlinear programming. J. Optim. Theory Appl. 22(3), 297–309 (1977) MathSciNetCrossRef

Powell, M.J.D.: Variable Metric Methods for Constrained Optimization. Springer, Berlin Heidelberg (1983) CrossRef

Yuan, Y.X., Sun, W.Y.: Conic Methods for Unconstrained Minimization and Tensor Methods for Nonlinear Equations. Science Press, Beijing (1997)

Powell, M.J.D., Yuan, Y.: A trust region algorithm for equality constrained optimization. Math. Program. 49(1), 189–211 (1990) MathSciNetCrossRef

Vardi, A.: A trust region algorithm for equality constrained minimization: convergence properties and implementation. SIAM J. Numer. Anal. 22(3), 575–591 (1981) MathSciNetCrossRef

Boggs, P.T., Byrd, R.H., Schnabel, R.B.: A stable and efficient algorithm for nonlinear orthogonal distance regression. SIAM J. Sci. Stat. Comput. 8(6), 1052–1078 (1987) MathSciNetCrossRef

Toint, P.L.: Global convergence of a class of trust region methods for nonconvex minimization in Hilbert space. IMA J. Numer. Anal. 8, 231–252 (1988) MathSciNetCrossRef

Zhang, J.Z., Zhu, D.T.: Projected quasi-Newton algorithm with trust region for constrained optimization. J. Optim. Theory Appl. 67(2), 369–393 (1990) MathSciNetCrossRef

El-Alem, M.: A robust trust-region algorithm with a nonmonotonic penalty parameter scheme for constrained optimization. SIAM J. Optim. 5(2), 348–378 (1995) MathSciNetCrossRef

10.

Qu, S.J., Zhou, Y.Y., Zhang, Y.L., Wahab, M.I.M., Zhang, G., Ye, Y.Y.: Optimal strategy for a green supply chain considering shipping policy and default risk. Comput. Ind. Eng. 131, 172–186 (2019) CrossRef

11.

Huang, R.P., Qu, S.J., Yang, X.G., Liu, Z.M.: Multi-stage distributionally robust optimization with risk aversion. J. Ind. Manag. Optim. (2019). https://doi.org/10.3934/jimo.2019109 CrossRef

12.

Qin, X., Yao, J.C.: A viscosity iterative method for a split feasibility problem. J. Nonlinear Convex Anal. 20, 1497–1506 (2019) MathSciNet

13.

Dai, Y., Yamashita, N.: Convergence analysis of sparse quasi-Newton updates with positive definite matrix completion for two-dimensional functions. Numer. Algebra Control Optim. 1, 61–69 (2011) MathSciNetCrossRef

14.

Conn, A.R., Gould, N.I.M., Toint, P.L.: Trust-Region Methods. Society for Industrial and Applied Mathematics, Philadelphia (2000) CrossRef

15.

Davidon, W.C.: Conic approximations and collinear scalings for optimizers. SIAM J. Numer. Anal. 17(17), 268–281 (1980) MathSciNetCrossRef

16.

Sorensen, D.C.: The q-superlinear convergence of a collinear scaling algorithm for unconstrained optimization. SIAM J. Numer. Anal. 17(17), 84–114 (1980) MathSciNetCrossRef

17.

Zhang, X., Wen, X., Ni, Q.: Subspace trust-region algorithm with conic model for unconstrained optimization. Numer. Algebra Control Optim. 3, 223–234 (2013) MathSciNetCrossRef

18.

Zhu, H., Ni, Q.: A simple alternating direction method for the conic trust region subproblem. Math. Probl. Eng. 2018, Article ID 5358191 (2018) MathSciNetMATH

19.

Byrd, R.H., Shultz, G.A.: A trust region algorithm for nonlinearly constrained optimization. SIAM J. Numer. Anal. 24(5), 1152–1170 (1987) MathSciNetCrossRef

20.

Sun, W.Y., Yuan, Y.X.: A conic trust-region method for nonlinearly constrained optimization. Ann. Oper. Res. 103(1), 175–191 (2001) MathSciNetCrossRef

21.

Omojokun, E.O.: Trust region algorithms for optimization with nonlinear equality and inequality constraints. Ph.D. thesis, Boulder, CO, USA (1989)

22.

Zhu, H., Ni, Q., Zeng, M.L.: A quasi-Newton trust region method based on a new fractional model. Numer. Algebra Control Optim. 5(3), 237–249 (2015) MathSciNetCrossRef

23.

Zhu, H., Ni, Q., Dang, C., Zhang, H.: A trust region method based on the fractional model for unconstrained optimization. Sci. Sin., Math. 48(4), 531–546 (2018) CrossRef

24.

Zhu, H., Ni, Q., Zhang, L., Yang, W.: A fractional trust region method for linear equality constrained optimization. Discrete Dyn. Nat. Soc. 2016(2), 1–10 (2016) MathSciNetMATH

25.

Ni, Q.: Optimization Method and Program Design. Science Press, Beijing (2009)

26.

Zhang, L.W., Ni, Q.: Trust region algorithm of new conic model for nonlinearly equality constrained optimization. J. Numer. Methods Comput. Appl. 31(4), 279–289 (2010) MathSciNetMATH

27.

Coleman, T.F., Conn, A.R.: Second-order conditions for an exact penalty function. Math. Program. 19(1), 178–185 (1980) MathSciNetCrossRef

28.

Schittkowski, K.: More Test Examples for Nonlinear Programming Codes. Springer, New York (1979) MATH

29.

Lukšan, L., Vlček, J.: Indefinitely preconditioned inexact Newton method for large sparse equality constrained nonlinear programming problems. Numer. Linear Algebra Appl. 5(3), 219–247 (1998) MathSciNetCrossRef

30.

Zhu, M., Xue, Y., Sheng, Z.F.: A quasi-Newton type trust region method based on the conic model. Numer. Math. J. Chin. Univ. 17(1), 36–47 (1995) MathSciNetMATH

31.

Moré, J.J., Garbow, B.S., Hillstrom, K.E.: Testing unconstrained optimization software. ACM Trans. Math. Softw. 7(1), 17–41 (1981) MathSciNetCrossRef

Titel: A simple approximated solution method for solving fractional trust region subproblems of nonlinearly equality constrained optimization
verfasst von: Honglan Zhu
Qin Ni
Xuebing Zhang
Publikationsdatum: 01.12.2020
Verlag: Springer International Publishing
Erschienen in: Journal of Inequalities and Applications / Ausgabe 1/2020
Elektronische ISSN: 1029-242X
DOI: https://doi.org/10.1186/s13660-020-2300-7

Springer Professional

A simple approximated solution method for solving fractional trust region subproblems of nonlinearly equality constrained optimization

Abstract

Publisher’s Note

1 Introduction

2 The algorithm of fractional trust region subproblem

3 Global convergence

4 Numerical experiment

5 Conclusions

Acknowledgements

Availability of data and materials

Competing interests

Publisher’s Note

Premium Partner

Springer Professional

Abstract

Publisher’s Note

1 Introduction

2 The algorithm of fractional trust region subproblem

3 Global convergence

4 Numerical experiment

5 Conclusions

Acknowledgements

Availability of data and materials

Competing interests

Publisher’s Note

Weitere Artikel der Ausgabe 1/2020

Criteria for a certain class of the Carathéodory functions and their applications

Multivalued weak cyclic δ-contraction mappings

On -convergence in fuzzy normed spaces

Estimation of f-divergence and Shannon entropy by using Levinson type inequalities for higher order convex functions via Hermite interpolating polynomial

The functional inequality for the mixed quermassintegral

The solvability of a kind of generalized Riemann–Hilbert problems on function spaces

Premium Partner