The null-space technique (see [
20,
25,
26]) is an important technique for solving optimization problems with equality constraints. In the following, we show the method to eliminate constraint (
1.22). Assume that
\(A_{k}\) has full column rank, and there exist an orthogonal matrix
\(Q_{k}\) and a nonsingular upper triangular matrix
\(R_{k}\) such that
(2.5)
where
\(Q_{k}^{(1)}\in R^{n\times m}\),
\(Q_{k}^{(2)}\in R^{n\times (n-m)}\), and
\(R^{(1)}_{k} \in R^{m \times m}\). Then (
1.22) can be rewritten as
$$\begin{aligned} \bigl(R^{(1)}_{k}\bigr)^{T} \bigl(Q_{k}^{(1)}\bigr)^{T} s = -\theta _{k} C_{k}. \end{aligned}$$
(2.6)
Therefore the feasible point for (
1.22) can be presented by
$$\begin{aligned} s = \tilde{s}_{k}+ Q_{k}^{(2)} u \end{aligned}$$
(2.7)
for any
\(u \in R^{n-m}\), where
$$\begin{aligned} \tilde{s}_{k} = -\theta _{k} Q_{k}^{(1)} \bigl(R^{(1)}_{k}\bigr)^{-T} C_{k}, \end{aligned}$$
(2.8)
and
\(Q_{k}^{(2)} u\) lies in the null space of
\(A_{k}\). We denote
$$\begin{aligned} l_{k} = - Q_{k}^{(1)} \bigl(R^{(1)}_{k} \bigr)^{-T} C_{k}, \end{aligned}$$
(2.9)
then
$$\begin{aligned} \tilde{s}_{k} = \theta _{k} l_{k}. \end{aligned}$$
(2.10)
In order to ensure that
s lies in the trust region, we choose
\(\theta _{k}\) such that the norm of
\(\tilde{s}_{k}\) is at most
\(\iota \tilde{\Delta }_{k}\), where
\(\iota \in (0,1)\) is a given constant. That is,
$$\begin{aligned} \Vert \tilde{s}_{k} \Vert = \theta _{k} \Vert l_{k} \Vert \leq \iota \tilde{\Delta } _{k}. \end{aligned}$$
(2.11)
Define
$$\begin{aligned} \theta _{k} = \min \biggl\{ 1, \frac{\iota \tilde{\Delta }_{k}}{ \Vert l _{k} \Vert } \biggr\} ,\qquad \widehat{\Delta }_{k} = \sqrt{ \tilde{\Delta }^{2}_{k}- \Vert \tilde{s} _{k} \Vert ^{2} }. \end{aligned}$$
(2.12)
Then the fractional trust region subproblem (
2.2)–(
2.4) becomes
$$\begin{aligned}& \min_{u \in R^{n-m}} \tilde{\psi }_{k} (u) = h_{k} \bigl(s(u)\bigr) g^{T}_{k} s(u) + \frac{1}{2} h _{k}^{2} \bigl(s(u)\bigr) s(u)^{T} B_{k} s(u), \end{aligned}$$
(2.13)
$$\begin{aligned}& \text{s.t.}\quad \Vert u \Vert \le \widehat{ \Delta }_{k}, \end{aligned}$$
(2.14)
where
\(s(u)=\tilde{s}_{k}+ Q_{k}^{(2)} u\).
In order to solve subproblem (
2.13)–(
2.14) by a simple method, we first fix the fractional part of
\(\tilde{\psi }_{k} (u)\) by letting
\(u=0\) in
\(h_{k} (s(u))\). At the same time, we set
\(u= - \tau \nabla \tilde{\psi }_{k} (0)\), where
\(\tau >0\),
\(\nabla \tilde{\psi }_{k} (0)\neq 0\). By direct computation, we know that
\(\nabla \tilde{\psi }_{k} (0)= (Q_{k}^{(2)})^{T} \nabla \psi _{k} ( \tilde{s}_{k})\), where
\(\nabla \psi _{k} (s)\) is defined by (
1.13). Then from (
2.7) we have that (
2.13)–(
2.14) reduce to the following simplified subproblem:
$$\begin{aligned}& \min_{\tau \in R} \varphi _{k} (\tau ), \end{aligned}$$
(2.15)
$$\begin{aligned}& \text{s.t.}\quad 0\leq \tau \le \tau _{ \Delta }, \end{aligned}$$
(2.16)
where
$$\begin{aligned}& \varphi _{k} (\tau )= h_{k} (\tilde{s}_{k}) g^{T}_{k} s(\tau ) + \frac{1}{2} h_{k}^{2} (\tilde{s}_{k}) s(\tau )^{T} B_{k} s(\tau ) \end{aligned}$$
(2.17)
$$\begin{aligned}& \hphantom{\varphi _{k} (\tau )} = \frac{1}{2} h_{k}( \tilde{s}_{k}) \bigl(a_{\tau }^{(1)} \tau ^{2} + b_{ \tau }^{(1)} \tau + c_{\tau }^{(1)} \bigr), \end{aligned}$$
(2.18)
$$\begin{aligned}& s(\tau )= \tilde{s}_{k}- \tau Q_{k}^{(2)} \bigl(Q_{k}^{(2)}\bigr)^{T}\nabla \psi _{k} (\tilde{s}_{k}), \end{aligned}$$
(2.19)
$$\begin{aligned}& a_{\tau }^{(1)} = h_{k}(\tilde{s}_{k}) \nabla \psi _{k} (\tilde{s}_{k})^{T} Q_{k}^{(2)} \bigl(Q_{k}^{(2)} \bigr)^{T} B_{k} Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T}\nabla \psi _{k} (\tilde{s}_{k}), \end{aligned}$$
(2.20)
$$\begin{aligned}& b_{\tau }^{(1)} = -2\nabla \psi _{k} ( \tilde{s}_{k})^{T} Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T}\bigl( g_{k} + h_{k}(\tilde{s}_{k}) B_{k} \tilde{s}_{k}\bigr), \end{aligned}$$
(2.21)
$$\begin{aligned}& c_{\tau }^{(1)} = 2 g_{k}^{T} \tilde{s}_{k}+h_{k}(\tilde{s}_{k}) \tilde{s}_{k}^{T} B_{k} \tilde{s}_{k} \end{aligned}$$
(2.22)
and
$$\begin{aligned} \tau _{ \Delta }=\frac{ \widehat{ \Delta }_{k}}{ \Vert (Q_{k}^{(2)})^{T} \nabla \psi (\tilde{s}_{k}) \Vert }. \end{aligned}$$
(2.23)
Combining with (
1.17)–(
1.20) and (
2.11), we can obtain that
\(\|\tilde{s}_{k}\| \leq \tilde{\Delta }_{k}\) and
$$\begin{aligned} \zeta _{1} \leq h_{k} ( \tilde{s}_{k})\leq \zeta _{2}, \end{aligned}$$
(2.24)
where
$$\begin{aligned} \zeta _{1}= \frac{1-\epsilon _{1}}{(1 + {\epsilon _{1}})^{2}},\qquad \zeta _{2}= \frac{1+\epsilon _{1} }{{(1 -{\epsilon _{1}})}^{2}}. \end{aligned}$$
(2.25)
From (
1.19), it is obvious that
\(0<\zeta _{1}<1\) and
\(\zeta _{2}>1\). From lines 2 and 3 before (
2.15), we know that
\((Q_{k}^{(2)})^{T} \nabla \psi _{k}(\tilde{s}_{k})\neq 0\), where
\(Q_{k}^{(2)}\) is defined in (
2.5). For
\(Q_{k}^{(2)}\in R ^{n\times (n-m)}\),
\(\nabla \psi _{k} (\tilde{s}_{k})\in R^{n}\), then
\(Q_{k}^{(2)}(Q_{k}^{(2)})^{T}\nabla \psi _{k} (\tilde{s}_{k})\in R^{n}\) and
\(Q_{k}^{(2)}(Q_{k}^{(2)})^{T}\nabla \psi _{k} (\tilde{s}_{k}) \neq 0\) can be proved easily. Assume that
\(B_{k}\) is a symmetric positive definite matrix, then combining with (
2.20), (
2.24), and (
2.25) we have
$$ a_{\tau }^{(1)}>0. $$
(2.26)
When
\(B_{k}\) is only symmetric but not positive definite matrix, then the modified BFGS formula can be used to modify the positive semi-definite matrix to a positive definite matrix (see [
25]). Hence, it is easy to see that the solution of (
2.15)–(
2.16) is
$$\begin{aligned} \tau ^{(1)} = \textstyle\begin{cases} \min \{{\tau _{\mathrm{axis}}^{(1)}, \tau _{ \Delta }} \}, & \text{if } b_{\tau }^{(1)} < 0, \\ 0, & \text{otherwise,} \end{cases}\displaystyle \end{aligned}$$
(2.27)
where
$$\begin{aligned} \tau _{\mathrm{axis}}^{(1)} = \frac{ - b_{\tau }^{(1)} }{2a_{\tau } ^{(1)}}. \end{aligned}$$
(2.28)
We know that
\(u^{(1)} =-\tau ^{(1)}\nabla \tilde{\psi }_{k} (0)\) is an approximate solution of (
2.13)–(
2.14). In order to better approximate the solution of the subproblem, we are constantly repeating the above process similarly, except for fixing the fractional part. That is, in the
ith iteration we substitute
\(u=-\tau ^{(i)}\nabla \tilde{\psi }_{k} (0)\) into
\(h_{k} (s(u))\), where
\(i=2,3,\ldots \) . We denote
$$\begin{aligned} h_{k}^{(i)}= h_{k} \bigl(s \bigl(-\tau ^{(i)}\nabla \tilde{\psi }_{k} (0)\bigr) \bigr). \end{aligned}$$
(2.29)
Then we can obtain a similar trust-region subproblem to (
2.15)–(
2.16). That is,
$$\begin{aligned}& \min_{\tau \in R} \varphi _{k}^{(i)} (\tau ), \end{aligned}$$
(2.30)
$$\begin{aligned}& \text{s.t.}\quad 0\leq \tau \le \tau _{ \Delta }, \end{aligned}$$
(2.31)
where
$$\begin{aligned}& \varphi _{k}^{(i)} (\tau )= h_{k}^{(i)} g^{T}_{k} s(\tau ) + \frac{1}{2} \bigl(h_{k}^{(i)}\bigr)^{2} s(\tau )^{T} B_{k} s(\tau ) \end{aligned}$$
(2.32)
$$\begin{aligned}& \hphantom{\varphi _{k}^{(i)} (\tau )} = \frac{1}{2} h_{k}^{(i)} \bigl(a_{\tau }^{(i)} \tau ^{2} + b_{\tau }^{(i)} \tau + c_{\tau }^{(i)} \bigr), \end{aligned}$$
(2.33)
$$\begin{aligned}& a_{\tau }^{(i)} = h_{k}^{(i)}\nabla \psi _{k} (\tilde{s}_{k})^{T} Q _{k}^{(2)} \bigl(Q_{k}^{(2)} \bigr)^{T} B_{k} Q_{k}^{(2)} \bigl(Q_{k}^{(2)}\bigr)^{T} \nabla \psi _{k} (\tilde{s}_{k}) , \end{aligned}$$
(2.34)
$$\begin{aligned}& b_{\tau }^{(i)} = -2\nabla \psi _{k} ( \tilde{s}_{k})^{T}Q_{k}^{(2)} \bigl(Q _{k}^{(2)}\bigr)^{T} \bigl( g_{k} + h_{k}^{(i)} B_{k} \tilde{s}_{k}\bigr), \end{aligned}$$
(2.35)
$$\begin{aligned}& c_{\tau }^{(i)} = 2 g_{k}^{T} \tilde{s}_{k}+h_{k}^{(i)} \tilde{s}_{k} ^{T} B_{k} \tilde{s}_{k}. \end{aligned}$$
(2.36)
By the computation, we have the solution of subproblem (
2.30)–(
2.31) as follows:
$$\begin{aligned} \tau ^{(i)} = \textstyle\begin{cases} \min \{{\tau _{\mathrm{axis}}^{(i)} , \tau _{ \Delta }} \}, & \text{if } b_{ \tau }^{(i)} < 0, \\ 0, & \text{otherwise,} \end{cases}\displaystyle \end{aligned}$$
(2.37)
where
$$\begin{aligned} \tau _{\mathrm{axis}}^{(i)} = \frac{ - b_{\tau }^{(i)} }{2a_{\tau } ^{(i)}}. \end{aligned}$$
(2.38)
However, if
\(\tau ^{(i)}= 0\), then subproblem (
2.30)–(
2.31) has no positive real roots. For this case, we set
\(b_{k} = c_{k}= 0\), and then the fractional model reduces to the conic model. Subproblem (
1.21)–(
1.24) can be solved by Algorithm 2.4 in [
26]. That is, we calculate
$$\begin{aligned}& \tilde{a}_{k} = \bigl(Q_{k}^{(2)} \bigr)^{T}a_{k},\qquad \bar{a} = \frac{\tilde{a}_{k}}{1-a_{k}^{T}\tilde{s}_{k}}, \qquad \bar{g} = \frac{g_{k}}{1-a_{k}^{T}\tilde{s}_{k}}, \\& \bar{B} = \frac{B_{k}}{(1-a_{k}^{T}\tilde{s}_{k})^{2}},\qquad \alpha _{1} = 1- \Vert \bar{a} \Vert ^{2}\widehat{\Delta }^{2}, \\& \widehat{\Delta } = \min \biggl\{ \sqrt{ \tilde{\Delta }^{2} _{k}- \Vert \tilde{s}_{k} \Vert ^{2} }, \frac{1-a_{k}^{T}\tilde{s}_{k}- \varepsilon _{0}}{ \Vert a_{k} \Vert } \biggr\} ,\qquad m = \frac{ \Vert \bar{a} \Vert \widehat{\Delta }^{2}}{\alpha _{1} },\qquad \Delta _{1} = \frac{\widehat{\Delta }}{\sqrt{\alpha _{1}}}, \\& \hat{g} = \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{g}+ \bar{a}\tilde{s}_{k}^{T}\bar{g}+ \bar{a} \tilde{s}_{k}^{T}\bar{B}\tilde{s}_{k}+ \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{B} \tilde{s}_{k},\qquad V = \operatorname{diag}(\sqrt{\alpha _{1}},1,\ldots,1), \\& \hat{B} = \bigl(Q_{k}^{(2)}\bigr)^{T} \bar{B}Q_{k}^{(2)} + \tilde{s}_{k}^{T} \bar{B}\tilde{s}_{k}\bar{a}\bar{a}^{T} +\bar{a} \tilde{s}_{k}^{T} \bar{B}Q_{k}^{(2)}+ \bigl(Q_{k}^{(2)}\bigr)^{T}\bar{B} \tilde{s}_{k}\bar{a}^{T}, \\& g_{1} = V^{-1}Z\hat{g} + mV^{-1}Z \hat{B}Z^{T}e_{1},\qquad B_{1} = V^{-1}Z\hat{B}Z^{T}V^{-1}, \end{aligned}$$
where
Z is an orthonormal rotation matrix and satisfies
\(Z\bar{a}= \|\bar{a}e_{1}\|\), and
\(e_{1}=(1,0,\ldots,0)^{T}\). Then the solution of (
2.13)–(
2.14) is
$$\begin{aligned} u_{\ast }= \frac{(1-a_{k}^{T}\tilde{s}_{k})(Z^{T}Vd^{\ast }+m Z^{T}e _{1})}{1-a_{k}^{T}\tilde{s}_{k}+\tilde{a}_{k}^{T}(Z^{T}Vd^{\ast }+m Z ^{T}e_{1})}, \end{aligned}$$
(2.39)
where
\(d^{\ast }\) is the solution of the quadratic trust region subproblem
$$\begin{aligned}& \min_{d \in R^{n}} g_{1}^{T}d+ \frac{1}{2}d^{T} B_{1} d, \\& \text{s.t.}\quad \Vert d \Vert \leq \Delta _{1}. \end{aligned}$$