Abstract
In this paper, we propose a Dai–Liao (DL) conjugate gradient method for solving large-scale system of nonlinear equations. The method incorporates an extended secant equation developed from modified secant equations proposed by Zhang et al. (J Optim Theory Appl 102(1):147–157, 1999) and Wei et al. (Appl Math Comput 175(2):1156–1188, 2006) in the DL approach. It is shown that the proposed scheme satisfies the sufficient descent condition. The global convergence of the method is established under mild conditions, and computational experiments on some benchmark test problems show that the method is efficient and robust.
Similar content being viewed by others
1 Introduction
A typical system of nonlinear equations has the general form
where \(F:R^{n}\rightarrow R^{n}\) is a nonlinear mapping assumed to be continuously differentiable in a neighborhood of \(R^{n}\). Systems of nonlinear equations play important role in sciences and engineering fields; therefore, solving (1) has become a subject of interest to researchers in the aforementioned areas. Numerous algorithms or schemes have been developed for solving these systems of equations. Notable among them are the Newton and quasi-Newton schemes [14, 22, 34, 52], which converge rapidly from sufficiently good starting point. However, the requirement for computation and storage of the Jacobian matrix or an approximation of it at each iteration makes the two methods unattractive for large-scale nonlinear systems [51].
The ideal method for solving large-scale systems is the conjugate gradient (CG) method, which forms an important class of algorithms used in solving large-scale unconstrained optimization problems. The method is popular with mathematicians and engineers engaged in large-scale problems because of it low memory requirement and strong global convergence properties [19]. Generally, the nonlinear conjugate gradient method is used to solve large-scale problems in the following form;
where \(f:R^{n}\rightarrow R\) is a continuously differentiable function that is bounded from below and its gradient is available. The method generates a sequence of iterates \(x_k\) from an initial point \(x_0 \in R^{n}\) using the iterative formula
where \(x_k\) is the current iterate, \(\alpha _k >0\) is a step length computed using suitable line search technique, and \(d_k\) is the CG search direction defined by
where \(\beta _k\) is a scalar known as the CG update parameter, and \(F_k =\nabla f\left( {x_k } \right) \). It is worth noting that a crucial element in any CG algorithm is the formula definition of the update parameter \(\beta _k\) [4], which is why different CG algorithms corresponding to different choices of \(\beta _k\) in (4) have been proposed (see [8, 10,11,12,13,14, 17, 33, 50, 51, 53, 65]).
Also, some of the CG methods for unconstrained optimization are not globally convergent, so efforts have been made by researchers to develop CG methods that are not only globally convergent but also are numerically efficient. These new methods are based on secant equations. For nonlinear conjugate gradient methods, the conjugacy condition is given by
Perry [44] extended (5) by exploiting the following secant condition of quasi-Newton schemes:
and quasi-Newton search direction \(d_k\) given by
where \(B_k\) is a square matrix, which approximates the Hessian \(\nabla ^{2}f\left( x \right) \). By using (6) and (7), Perry gave an extension of (5) as:
and using (4), the Perry search direction is given as
where
and
Following Perry’s approach, Dai and Liao [18] incorporated a nonnegative parameter t to propose the following extension of (8):
It is noted that for \(t=0\), (12) reduces to (5), and if \(t=1\), we obtain Perry’s condition (8). Consequently, by substituting (4) into (12), Dai and Liao [18] proposed the following CG update parameter:
Numerical results have shown that the DL method is effective; however, it is much dependent on the nonnegative parameter t for which there is no optimal value [4], and it may not necessarily generate descent directions [8]. That is, the method may not satisfy the descent condition
or the sufficient descent condition, namely there exists a constant \(\lambda >0\) such that
Based on the DL conjugacy condition (12), conjugate gradient methods have been proposed over the years using modified secant equations. For example, Babaie-Kafaki et al. [13] and Yabe and Takano [55] proposed CG methods by applying a revised form of the modified secant equation proposed by Zhang and Xu [63] and Zhang et al.[64] and the modified secant equation proposed by Li and Fukushima [36]. Li et al.[37] applied the modified secant equation proposed by Wei et al. [54], while Ford et al. [26] employed the multi-step quasi-Newton conditions proposed by Ford and Moghrabi [27, 28]. CG methods based on modified secant equations have also been studied by Narushima and Yabe [57] and Reza Arazm et al. [7]. These methods have been found to be numerically efficient and globally convergent under suitable conditions, but like the DL method, they also fail to ensure sufficient descent.
Recently, by employing Perry’s idea [44], efficient CG methods with descent directions have been proposed. Liu and Shang [39] proposed a Perry conjugate gradient method, which provides prototypes for developing other special form of the Perry method like the HS method and the DL method [18]. Liu and Xu [40] presented a new Perry CG method with sufficient descent properties, which is independent of any line search. Also, based on the self-scaling memoryless BFGS update, Andrei [6] proposed an accelerated adaptive class of Perry conjugate gradient algorithms, whose search direction is determined by symmetrization of the scaled Perry CG direction [44].
CG methods for systems of nonlinear equations are rare as most of the methods are for unconstrained optimization. However, over the years, the method has been extended to large-scale nonlinear systems of equations by researchers. Using a combination of the Polak–Ribieré–Polyak (PRP) conjugate gradient method for unconstrained optimization [45, 47] and the hyperplane projection method of Solodov and Svaiter [48], Cheng [16] proposed a PRP-type method for systems of monotone equations. Yu [58, 59] extended the PRP method [45] to solve large-scale nonlinear systems with monotone line search strategies, which are modifications of the Grippo–Lampariello–Lucidi [29] and Li–Fukushima [35] schemes. As a further research of the Perry’s conjugate gradient method, Dai et al. [21] combined the modified Perry conjugate gradient method [41] and the hyperplane projection technique of Solodov and Svaiter [48] to propose a derivative-free method for solving large-scale nonlinear monotone equations. By combining the descent Dai–Liao CG method by Babaie-Kafaki and Ghanbari [54] and the projection method in [48], Abubakar and Pumam [2] proposed a descent Dai–Liao CG method for nonlinear equations. Numerical results show the method to be efficient. Based on the projection strategy [48], Liu and Feng [38] proposed a derivative-free iterative method for large-scale nonlinear monotone equations, which can be used to solved large-scale non-smooth problems due to its lower storage and derivative-free information. Abubakar and Kumam [1] proposed an improved three-term derivative-free method for solving large-scale nonlinear equations. The method is based on a modified HS method with the projection technique of Solodov and Svaiter [48]. Abubakar et al. [3] proposed a descent Dai–Liao CG method for solving nonlinear convex constraint monotone equations. The method is an extension of the method in [2]. By using a convex combination of two different positive spectral coefficients, Mohammed and Abubakar [42] proposed a combination of positive spectral gradient-like method and projection method for solving nonlinear monotone equations. Awwal et al. [43] proposed a hybrid spectral gradient algorithm for system of nonlinear monotone equations with convex constraints. The scheme is combination of a convex combination of two different positive spectral parameters and the projection technique.
Here, based on the work of Babaie-Kafaki and Ghanbari [9], and the Dai–Liao (DL) [18] approach, we propose a Dai–Liao conjugate gradient method for system of nonlinear equations by incorporating an extended secant equation in the classical DL update.
Throughout this work, we use \(\parallel .\parallel \) to denote the Euclidean norm of vectors, \(y_{k-1} =F_k -F_{k-1} ,s_{k-1} =x_k -x_{k-1}\) and \(F_k =F\left( {x_k } \right) \). We also assume that problem (1) is Lipschitz continuous and f in (2) is specified by
The paper is organized as follows: in Sect. 2, we present details of the method. Convergence analysis is presented in Sect. 3. Numerical results of the method are presented in Sect. 4. Finally, conclusions are made in Sect. 5.
2 Proposed method and its algorithm
Following the Dai–Liao approach, Babaie-Kafaki and Ghanbari [9] proposed the following extension of the PRP update parameter
where \(\beta _k^{PRP}\) is the classical PRP parameter and t is a nonnegative parameter, whose values were determined by carrying out eigenvalue analysis. Motivated by this, and employing similar approach, we propose a modification of the classical DL update parameter. In what follows, we suggest an extension of some previously modified secant equations.
By expanding (6), Zhang et al. [64] proposed the following modified secant equation
where
where \(\mu _{k-1} \in R^{n}\) is a vector parameter such that \(s_{k-1}^T \mu _{k-1} \ne 0\) (see [64]).
Similarly, Wei et al. [54] gave the following modified secant equation
with
where \(\mu _{k-1} \in R^{n}\) is a vector parameter such that \(s_{k-1}^T \mu _{k-1} \ne 0\) (see [60]). Also, in (18) and (20), the vector parameter \(\mu _{k-1} =s_{k-1}\) [55].
Here, we propose the following secant equation as an extension of (6), (18), and (20):
where \(\phi \) is a nonnegative parameter, \(\vartheta _{k-1}\) is defined by (21) and \(s_{k-1}^T \mu _{k-1} \ne 0\). We observe that for \(\phi =0\), (22) becomes the standard secant equation defined by (6), and if \(\phi =\frac{3}{2}\), (22) reduces to (19). Also, for \(\phi =\frac{1}{2}\), we see that (22) reduces to the modified secant equation proposed by Zhang et al. [64]. Substituting \(u_{k-1}\) in (22) for \(y_{k-1}\) in (13), we obtain the following version of the DL update parameter:
Observe that, in general, the denominator, \(d_{k-1}^T u_{k-1}\) may not be nonzero since \(\vartheta _{k-1}\) as defined in (22) may be non-positive. Therefore, we redefine \(u_{k-1}\) and obtain its revised form as
Consequently, we get the revised form of (23) as
Andrei [4] noted that the parameter t has no optimal choice and so, to obtain descent directions for our proposed method, we proceed to obtain appropriate values for t. From (4), and after some algebra, our search direction becomes:
Following Perry’s approach [44], search direction of our proposed method can be written as
where \(H_k \), called the search direction matrix is given by
and \(z_{k-1}\) is as defined by (24). And from (27) we can write
where
Proposition 2.1
The matrix \(\bar{H}_k\) defined by (30) is a symmetric matrix.
Proof
Using direct computation, we see that \(\bar{H}_k =\bar{H}_k^T \). Hence, \(\bar{H}_k\) is symmetric.
And so, to analyze the descent property of our method, we need to find eigenvalues of \(\bar{H}_k\) and their structure. \(\square \)
Theorem 2.2
Let the matrix \(\bar{H}_k\) be defined by (30). Then, the eigenvalues of \(\bar{H}_k\) consist of 1 with (\(n-2\) multiplicity), \(\lambda _k^+\) and \(\lambda _k^-\), where
and \(a_k =t\frac{\parallel s_{k-1} \parallel ^{2}}{s_{k-1}^T z_{k-1}}\), \(\quad b_k =\frac{\parallel s_{k-1} \parallel ^{2}\parallel z_{k-1} \parallel ^{2}}{(s_{k-1}^T z_{k-1} )^{2}}\).
Furthermore, all eigenvalues of \(\bar{H}_k\) are positive real numbers.
Proof
Since \(d_{k-1}^T z_{k-1} \ne 0\), then \(s_{k-1}^T z_{k-1} \ne 0\). And so, \(s_{k-1} \ne 0\) and \(z_{k-1} \ne 0\), which implies that the vectors \(s_{k-1}\) and \(z_{k-1}\) are nonzero vectors. Suppose V is the vector space spanned by \(\left\{ {s_{k-1} ,z_{k-1} } \right\} \). Then \(dim\left( V \right) \le 2\) and \(dim\left( {V^{\bot }} \right) \ge n-2\), where \(V^{\bot }\) is the orthogonal complement of V. Therefore, there exists a set of mutually orthogonal vectors \(\{\tau _{k-1}^i \}_{i=1}^{n-2} \subset V^{\bot }\) satisfying
By multiplying both sides of (30) by \(\tau _{k-1}^i\), we obtain
which can be viewed as an eigenvector equation. So, \(\tau _{k-1}^i \), for \(i=1,\ldots ,n-2\) are the eigenvectors of \(\bar{H}_k\) with eigenvalue 1 each. Let \(\lambda _k^+\) and \(\lambda _k^-\) be the remaining two eigenvalues, respectively. Observe that (30) can be written as
Clearly, \(\bar{H}_k\) represents a rank-two update, so from the fundamental algebra formula (see inequality \(\left( {1.2.70} \right) )\) of [49]
where
Since sum of the eigenvalues of a square symmetric matrix equals to its trace, from (30), we have
for which we obtain
Using the relationship between trace and determinant of a matrix and its eigenvalues, we can obtain \(\lambda _k^+\) and \(\lambda _k^-\) as roots of the following quadratic polynomial:
So, the remaining two eigenvalues are obtained from (40). And applying the quadratic formula with some rearrangements, we obtain
We can write (41) as
To obtain \(\lambda _k^+\) and \(\lambda _k^-\) as real numbers, we must have \(\Delta =(a_k -1)^{2}+b_{k-1} \ge 0\).
From Cauchy inequality, \(b_k =\frac{\parallel s_{k-1} \parallel ^{2}\parallel z_{k-1} \parallel ^{2}}{(s_{k-1}^T z_{k-1} )^{2}}\ge 1\), so, \(\Delta >0\). Consequently, both eigenvalues are real numbers and \(\lambda _k^+ >0\) since \(\left( {1+a_k } \right) \) is nonnegative. And to obtain \(\lambda _k^- >0\), the following must be satisfied:
After some algebra, we obtain the following estimation for the parameter t, which satisfies (43):
So, \(\lambda _k^- >0\) if (44) is satisfied. In addition, for t satisfying (44), \(\bar{H}_k\) is nonsingular.
Therefore, all the eigenvalues of the symmetric matrix \(\bar{H}_k\) are positive real numbers, which ensures that it is a positive-definite matrix. Moreover, using (42) and (44), we obtain the following estimation for \(\lambda _k^+\) and \(\lambda _k^- \):
And the proof is complete. Hence, from (29), we have
which shows that the descent condition is satisfied. We, therefore, propose the following formula for the parameter t in the modified DL method:
where \(\xi >\frac{1}{4}\) and \(\gamma <\frac{1}{4}\). \(\square \)
Remark 2.3
Since the DL parameter t is nonnegative, we restrict the values of the parameter \(\gamma \) in (47) to be negative so as to avoid a numerically unreasonable approximation [32]. So, based on the above remark, we can write the modified DL update parameter as
with \(\xi \ge \frac{1}{4}\) and \(\gamma <0\) satisfying (47) and guaranteeing the descent condition. We also write the search direction for the proposed method as
We use the derivative-free line search proposed by Li and Fukushima [34] to compute our step length \(\alpha _k \).
Let \(\sigma _1 >0\), \(\sigma _2 >0\) and \(r\in \left( {0,1} \right) \) be constants and let \(\left\{ {\eta _k } \right\} \) be a given positive sequence such that
and
Let \(i_k\) be the smallest non-negative integer i such that (51) holds for \(\alpha =r^{i}\). Let \(\alpha _k =r^{i_k }.\)
Now, we describe the algorithm of the proposed method as follows:
Algorithm 2.4
A Dai–Liao CG method (ADLCG)
- Step 1 :
-
Given \(\varepsilon >0\), choose an initial point \(x_0 \in R^{n}\), a positive sequence \(\left\{ {\eta _k } \right\} \) satisfying (50), and constants \(r\in \left( {0,1} \right) ,\sigma _1 ,\sigma _2 >0\), \(\xi \ge \frac{1}{4}\), \(\gamma <0\). Compute \(d_0 =-F_0\) and set \(k=0\).
- Step 2 :
-
Compute \(F\left( {x_k } \right) \). If \(\parallel F\left( {x_k } \right) \parallel \le \varepsilon \), stop. Otherwise, compute the search direction \(d_k\) by (49).
- Step 3 :
-
Compute \(\alpha _k\) via the line search in (51).
- Step 4 :
-
Set \(x_{k+1} =x_k +\alpha _k d_k \).
- Step 5 :
-
Set \(k:=k+1\) and go to Step 2.
3 Convergence analysis
The following assumptions are required to analyze the convergence of the ADLCG algorithm.
Assumption 3.1
The level set
is bounded.
Assumption 3.2
-
(1)
The solution set of problem (1) is not empty.
-
(2)
F is continuously differentiable on an open convex set \(\Phi _1\) containing \(\Phi \).
-
(3)
F is Lipschitz continuous in some neighborhood N of \(\Phi \); namely, there exists a positive constant \(L>0\) such that,
$$\begin{aligned} \parallel F\left( x \right) -F\left( y \right) \parallel \le L\parallel x-y\parallel , \end{aligned}$$(53)for all \(x,y\in N\).
Assumption (3.1) and condition (3) imply that there exists a positive constant \(\omega \) such that
$$\begin{aligned} \parallel F\left( {x_k } \right) \parallel \le \omega , \end{aligned}$$(54)for all \(x\in \Phi \), (see Proposition 1.3 of [13]).
-
(4)
The Jacobian of F is bounded, symmetric and positive-definite on \(\Phi _1 \), which implies that there exist constants \(m_2 \ge m_1 >0\) such that
$$\begin{aligned} \parallel F^{\prime }\left( x \right) \parallel \le m_2 ,\quad \forall x\in \Phi _1 , \end{aligned}$$(55)and
$$\begin{aligned} m_1 \parallel d\parallel ^{2}\le d^{T}F^{\prime }\left( x \right) d,\quad \forall x\in \Phi _1 ,d\in R^{n}. \end{aligned}$$(56)
Lemma 3.3
Let \(\left\{ {x_k } \right\} \) be generated by the Algorithm 2.4. Then \(d_k\) is a descent direction for \(F\left( {x_k } \right) \) at \(x_k\). i.e.,
Proof
By (46), the Lemma is true and we can deduce that the norm function \(f\left( {x_k } \right) \) is a descent along the direction \(d_k \). i.e., \(\parallel F\left( {x_{k+1} } \right) \parallel \le \parallel F\left( {x_k } \right) \parallel \) is true \(\forall k\). \(\square \)
Lemma 3.4
Suppose Assumptions 3.1 and 3.2 hold. Let \(\left\{ {x_k } \right\} \) be generated by the Algorithm 2.4. Then \(\left\{ {x_k } \right\} \subset \Omega \). Moreover, \(\parallel F_k \parallel \}\) converges.
Proof
By Lemma 3.3, we have \(\{\parallel F\left( {x_{k+1} } \right) \parallel \le \parallel F\left( {x_k } \right) \parallel \). So, by Lemma 3.3 in [20], we conclude that \(\left\{ {\parallel F_k \parallel } \right\} \) converges. Moreover, for all k, we have
This implies that \(\left\{ {x_k } \right\} \subset \Omega \)\(\square \)
Lemma 3.5
Suppose Assumption 3.1 and 3.2 hold. Let \(\left\{ {x_k } \right\} \) be generated by the Algorithm 2.4. Then
and
Proof
From the line search (51) and for all \(k>0\), we obtain
And by summing up the above k inequality, we obtain
Therefore, by (52) and since \(\left\{ {\eta _i } \right\} \) satisfies (50), then the series \(\mathop {\sum }\nolimits _{i=0}^k \parallel \alpha _k d_k \parallel ^{2}\) is convergent, which implies that (59) holds. Using the same argument as above, with \(\sigma _1 \parallel \alpha _k F\left( {x_k } \right) \parallel ^{2}\) on the left-hand sides, we obtain (60). \(\square \)
Lemma 3.6
[62] Suppose Assumptions 3.1 and 3.2 hold and \(\left\{ {x_k } \right\} \) be generated by Algorithm 2.4. Then, there exists a constant \(m>0\) such that,
Proof
By mean-value theorem, we have
where \(\varphi =\lambda x_k +\left( {1-\lambda } \right) x_{k+1}\), for some \(\lambda \in \left( {0,1} \right) \). We obtain the last inequality from (56). Letting \(m_1 =m\), the proof is established. \(\square \)
Lemma 3.7
Suppose Assumptions 3.1 and 3.2 hold. Let the sequence \(\left\{ {x_k } \right\} \) be generated by Algorithm 2.4 with update parameter \(\beta _k^{\mathrm{ADL}} \). Then, there exists \(M>0\) such that
Proof
Applying the mean-value theorem, we have
where \(\varphi =\lambda x_k +\left( {1-\lambda } \right) x_{k+1} \), for some \(\lambda \in \left( {0,1} \right) \).
Hence from (53), we have
Utilizing (24), (53), (68), and setting \(\mu _{k-1} =s_{k-1} \), we obtain
And using (47), (53) and (69), we get
By utilizing (4), (47), (48), (69) and (70) we obtain,
where \(c_1 =(m^{2}+m\left( {L+2\phi L} \right) +\left( {\left( {L+2\phi L)^{2}\xi +\left| \gamma \right| } \right) } \right) \).
Setting \(M:=\frac{c_1 \parallel F\left( {x_k } \right) \parallel }{m^{2}}\), we obtain the required result.
In the next, we prove the global convergence of the ADLCG method. \(\square \)
Theorem 3.8
Suppose Assumption 3.1 and 3.2 hold and that the sequence \(\left\{ {x_k } \right\} \) is generated by Algorithm 2.4. Also, assume that for all \(k>0\)
where c is some positive constant. Then, \(\left\{ {x_k } \right\} \) converges globally to a solution of problem (1); i.e,
Proof
By (59) and the boundedness of \(\left\{ {\parallel d_k \parallel } \right\} \), we have
On the other hand, from (46), and (45), we have
But from (45), we have
Thus, from (76) and applying the sandwich theorem, we obtain
Therefore,
And the proof is completed. \(\square \)
4 Numerical result
In this section, we test the efficiency and robustness of our proposed approach using the following method in the literature:
A new derivative-free conjugate gradient method for solving large-scale nonlinear systems of equations (NDFCG) [24]. All the codes used were written in MATLAB R2014a environment and run on a personal computer (2.20GHZ CPU, 8GB RAM). Also, the two algorithms used in the experiment were implemented with the same line search procedure, and the parameters are set to \(\sigma _1 =\sigma _2 =10^{-4}\), \(\alpha _0 =0.1\), \(r=0.2\) and \(\eta _k =\frac{1}{(k+1)^{2}}\). In addition, we set \(\xi =0.5\), \(\gamma =-0.5\) and \(\mu _{k-1} =s_{k-1}\) for the ADLCG method. Also, the iteration was set to terminate if it exceeds 2000 or the inequality \(\parallel F_k \parallel \le 10^{-10}\) is satisfied (Table 1).
The two algorithms were tested using the following test problems with various sizes:
Problem 4.1
[2] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.2
[2] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.3
[67] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.4
[56] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.5
[38] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.6
[51] The function \(F\left( x \right) \) is given by
where \(b_1 =(e_1^x -1,\ldots ,e_n^x -1)^{T}\), and
Problem 4.7
[61] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.8
[2] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.9
[2] The elements of the function \(F\left( x \right) \) are given by:
Problem 4.10
[2] The elements of the function \(F\left( x \right) \) are given by:
Using the performance profile of Dolan and More [23], we generate Figs. 1 and 2 to show the performance and efficiency of each of the two methods. To better illustrate the performance of the two methods, a summary of the results is presented in Table 2. The summarized data show the number of problems for which each method is a winner in terms of number of iterations and CPU time, respectively. The corresponding percentages of number of problems solved are also indicated
In Figs. 1 and 2, we observed that the curve representing the ADLCG method is above the curve representing the NDFCG method. This is a measure of the efficiency of the ADLCG method compared to the NDFCG scheme.
Similarly, the summary reported in Table 2 indicated that the ADLCG method is a winner with respect to number of iterations and CPU time. The table shows that the ADLCG method solves 95% (76 out of 80) of the problems with less number of iterations compared to the NDFCG method, which solves only 3.75% (3 out of 80). The summarized result also shows that both methods solve 1 problem with the same number of iteration, which translates to 1.25% and is reported as undecided. Also, the summary indicated that the ADLCG method outperforms the NDFCG scheme as it solves 72.5% (58 out of 80) of the problems with less CPU time compared to 27.5% (22 out of 80) solved by the NDFCG. Therefore, it is clear from Figs. 1 and 2 and the summarized result in Table 2 that our method is more efficient than the NDFCG method and better for large-scale nonlinear systems.
5 Conclusion
In this work, we proposed a Dai–Liao conjugate gradient method via modified secant equation for systems of nonlinear equations. This was achieved by finding appropriate values for the nonnegative parameter in the DL method using of an extended secant equation developed from the work of Zhang et al. [64] and Wei et al [54]. Numerical comparisons with some existing methods and Global convergence show that the method is efficient.
References
Abubakar, A.B.; Kumam, P.: An improved three-term derivative-free method for solving nonlinear equations. Comput. Appl. Math. 37(5), 6760–6773 (2018)
Abubakar, A.B.; Kumam, P.: A descent Dai–Liao conjugate gradient method for nonlinear equations. Numer. Algorithms (2018). https://doi.org/10.1007/s11075-018-0541-z
Abubakar, A.B.; Kumam, P.; Auwal, A.M.: A descent Dai–Liao projection method for convex constrained nonlinear monotone equations with applications. Thai J. Math. 17(1), 128–152 (2018)
Andrei, N.: Open problems in conjugate gradient algorithms for unconstrained optimization. Bull. Malays. Math. Sci. Soc. 34(2), 319–330 (2011)
Andrei, N.: An adaptive conjugate gradient algorithm for large-scale unconstrained optimization. J. Comput. Appl. Math. 292, 8391 (2016)
Andrei, N.: Accelerated adaptive Perry conjugate gradient algorithms based on the selfscaling BFGS update. J. Comput. Appl. Math. 325, 149–164 (2017)
Arazm, M.R.; Babaie-Kafaki, S.; Ghanbari, R.: An extended Dai–Liao conjugate gradient method with global convergence for nonconvex functions. Glas. Mat. 52(72), 361–375 (2017)
Babaie-Kafaki, S.; Ghanbari, R.: A descent family of Dai–Liao conjugate gradient methods. Optim. Methods Softw. 29(3), 583–591 (2013)
Babaie-Kafaki, S.; Ghanbari, R.: A descent extension of the Polak–Ribier’\(e\)–Polyak conjugate gradient method. Comput. Math. Appl. 68, 2005–2011 (2014)
Babaie-Kafaki, S.; Ghanbari, R.: Two modified three-term conjugate gradient methods with sufficient descent property. Optim. Lett. 8(8), 2285–2297 (2014)
Babaie-Kafaki, S.; Ghanbari, R.: The Dai–Liao nonlinear conjugate gradient method with optimal parameter choices. Eur. J. Oper. Res. 234, 625–630 (2014)
Babaie-Kafaki, S.; Ghanbari, R.: Two optimal Dai–Liao conjugate gradient methods. Optimization 64, 2277–2287 (2015)
Babaie-Kafaki, S.; Ghanbari, R.; Mahdavi-Amiri, N.: Two new conjugate gradient methods based on modified secant equations. J. Comput. Appl. Math. 234(5), 13741386 (2010)
Broyden, C.G.: A class of methods for solving nonlinear simultaneous equations. Math. Comput. 19, 577–593 (1965)
Cheng, W.: A two-term PRP-based descent method. Numer. Funct. Anal. Optim. 28, 1217–1230 (2007)
Cheng, W.: A PRP type method for systems of monotone equations. Math. Comput. Model. 50, 15–20 (2009)
Cheng, W.; Xiao, Y.; Hu, Q.: A family of derivative-free conjugate gradient methods for large-scale nonlinear systems of equations. J. Comput. Appl. Math. 224, 11–19 (2009)
Dai, Y.H.; Liao, L.Z.: New conjugacy conditions and related nonlinear conjugate gradient methods. Appl. Math. Optim. 43(1), 87–101 (2001)
Dai, Y.H.; Han, J.Y.; Liu, G.H.; Sun, D.F.; Yin, H.X.; Yuan, Y.X.: Convergence properties of nonlinear conjugate gradient methods. SIAM J. Optim. 10(2), 348–358 (1999)
Dai, Y.H.; Han, J.Y.; Liu, G.H.; Sun, D.F.; Yin, H.X.; Yuan, Y.X.: Convergence properties of nonlinear conjugate gradient methods. SIAM J. Optim. 10(2), 348–358 (1999)
Dai, Z.; Chen, X.; Wen, F.: A modified Perrys conjugate gradient method-based derivativefree method for solving large-scale nonlinear monotone equation. Appl. Math. Comput. 270, 378–386 (2015)
Dauda, M.K.; Mamat, M.; Mohamed, M.A.; Waziri, M.Y.: Improved quasi-Newton method via SR1 update for solving symmetric systems of nonlinear equations. Malay. J. Fundam. Appl. Sci. 15(1), 117–120 (2019)
Dolan, E.D.; Mor, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2, Ser. A), 201–2013 (2002)
Fang, X.; Ni, Q.: A new derivative-free conjugate gradient method for nonlinear system of equations. Bull. Aust. Math. Soc. 95, 1–12 (2017)
Feng, D.; Sun, M.; Wang, X.: A family of conjugate gradient methods for large-scale nonlinear equations. J. Inequal. Appl. 2017, 236 (2017)
Ford, J.A.; Moghrabi, I.A.: Multi-step quasi-Newton methods for optimization. J. Comput. Appl. Math. 50(1–3), 305323 (1994)
Ford, J.A.; Moghrabi, I.A.: Using function-values in multi-step quasi-Newton methods. J. Comput. Appl. Math. 66(12), 201211 (1996)
Ford, J.A.; Narushima, Y.; Yabe, H.: Multi-step nonlinear conjugate gradient methods for unconstrained minimization. Comput. Optim. Appl. 40(2), 191–216 (2008)
Grippo, L.; Lampariello, F.; Lucidi, S.: A nonmonotone linesearch technique for Newtons method. SIAM J. Numer. Anal. 23, 707–716 (1986)
Hager, W.; Zhang, H.: A survey of nonlinear conjugate gradient methods. Pac. J. Optim. 2(1), 35–58 (2006)
Hestenes, M.R.; Stiefel, E.L.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49, 409–436 (1952)
Kincaid, D.; Cheney, W.: Numerical Analysis. Brooks/Cole Publishing Company, California (1991)
Li, M.: A derivative-free PRP method for solving large-scale nonlinear systems of equations and its global convergence. Optim. Methods Softw. 29(3), 503–514 (2014)
Li, D.H.; Fukushima, M.: A globally and superlinearly convergent Gauss–Newton-based BFGS method for symmetric nonlinear equations. SIAM J. Numer. Anal. 37(1), 152–172 (2000)
Li, D.H.; Fukushima, M.: A derivative-free linesearch and global convergence of Broydenlike method for nonlinear equations. Optim. Methods Softw. 13, 583–599 (2000)
Li, D.H.; Fukushima, M.: A modified BFGS method and its global convergence in nonconvex minimization. J. Comput. Appl. Math. 129, 15–35 (2001)
Li, G.; Tang, C.; Wei, Z.: New conjugacy condition and related new conjugate gradient methods for unconstrained optimization. J. Comput. Appl. Math. 202(2), 523539 (2007)
Liu, J.; Feng, Y.: A derivative-free iterative method for nonlinear monotone equations with convex constraints. Numer. Algorithms (2018). https://doi.org/10.1007/s11075-018-06032
Liu, D.Y.; Shang, Y.F.: A new Perry conjugate gradient method with the generalized conjugacy condition. In: 2010 International Conference on Computational Intelligence and Software Engineering (CiSE), p. 1012 (2010)
Liu, D.Y.; Xu, G.Q.: A Perry descent conjugate gradient method with restricted spectrum, optimization online, nonlinear optimization (unconstrained optimization), pp. 1–19 (2011)
Livieris, I.E.; Pintelas, P.: Globally convergent modified Perrys conjugate gradient method. Appl. Math. Comput. 218, 9197–9207 (2012)
Mohammad, H.; Abubakar, A.B.: A positive spectral gradient-like method for nonlinear monotone equations. Bull. Comput. Appl. Math. 5(1), 99–115 (2017)
Muhammed, A.A.; Kumam, P.; Abubakar, A.B.; Wakili, A.; Pakkaranang, N.: A new hybrid spectral gradient projection method for monotone system of nonlinear equations with convex constraints. Thai J. Math. 16(4), 125–147 (2018)
Perry, A.: A modified conjugate gradient algorithm. Oper. Res. Tech. Notes 26(6), 10731078 (1978)
Polak, B.T.: The conjugate gradient method in extreme problems. USSR Comput. Math. Math. Phys. 4, 94–112 (1969)
Powell, M.J.D.: Restart procedures of the conjugate gradient method. Math. Prog. 2, 241–254 (1977)
Ribiere, G.; Polak, E.: Note sur la convergence de directions conjugees. Rev. Fr. Inf. Rech. Oper. 16, 35–43 (1969)
Solodov, M.V.; Svaiter, B.F.: A globally convergent inexact Newton method for systems of monotone equations. In: Reformulation: Nonsmooth, Piecewise Smooth, Semismooth and Smoothing Methods, pp. 355–369. Springer (2015)
Sun, W.; Yuan, Y.X.: Optimization Theory and Methods: Nonlinear Programming. Springer, NewYork (2006)
Waziri, M.Y.; Muhammad, L.: An accelerated three-term conjugate gradient algorithm for solving large-scale systems of nonlinear equations. Sohag J. Math. 4, 1–8 (2017)
Waziri, M.Y.; Sabiu, J.: A derivative-free conjugate gradient method and its global convergence for solving symmetric nonlinear equations, Hindawi Publishing Corporation. Int. J. Math. Math. Sci. 2015 (2015)
Waziri, M.Y.; Leong, W.J.; Hassan, M.A.: Jacobian free-diagonal Newton’s method for nonlinear systems with singular Jacobian. Malay. J. Math. Sci. 5(2), 241–255 (2011)
Waziri, M.Y.; Sabiu, J.; Muhammad, L.: A simple three-term conjugate gradient algorithm for solving symmetric systems of nonlinear equations. Int. J. Adv. Appl. Sci. (IJAAS) (2016)
Wei, Z.; Li, G.; Qi, L.: New quasi-Newton methods for unconstrained optimization problems. Appl. Math. Comput. 175(2), 1156–1188 (2006)
Yabe, H.; Takano, M.: Global convergence properties of nonlinear conjugate gradient methods with modified secant condition. Comput. Optim. Appl. 28(2), 203–225 (2004)
Yan, Q.R.; Peng, X.Z.; Li, D.H.: A globally convergent derivative-free method for solving large-scale nonlinear monotone equations. J. Comput. Appl. Math. 234, 649–657 (2010)
Yasushi, N.; Hiroshi, Y.: Conjugate gradient methods based on secant conditions that generate descent directions for unconstrained optimization. J. Comput. Appl. Math. 236, 4303–4317 (2012)
Yu, G.: A derivative-free method for solving large-scale nonlinear systems of equations. J. Ind. Manag. Optim. 6, 149–160 (2010)
Yu, G.: Nonmonotone spectral gradient-type methods for large-scale unconstrained optimization and nonlinear systems of equations. Pac. J. Optim. 7, 387–404 (2011)
Yuan, Y.X.: A modified BFGS algorithm for unconstrained optimization. IMA J. Numer. Anal. 11, 325–332 (1991)
Yuan, N.: A derivative-free projection method for solving convex constrained monotone equations. ScienceAsia 43, 195–200 (2017)
Yuan, G.; Lu, X.: A new backtracking inexact BFGS method for symmetric nonlinear equations. J. Comput. Math. Appl. 55, 116–129 (2008)
Zhang, J.; Xu, C.: Properties and numerical performance of quasi-Newton methods with modified quasi-Newton equations. J. Comput. Appl. Math. 137(2), 269–278 (2001)
Zhang, J.Z.; Deng, N.Y.; Chen, L.H.: New quasi-Newton equation and related methods for unconstrained optimization. J. Optim. Theory Appl. 102(1), 147–157 (1999)
Zhang, L.; Zhou, W.; Li, D.: Some descent three-term conjugate gradient methods and their global convergence. Optim. Methods Softw. 22(4), 697–711 (2007)
Zhou, W.; Li, D.: Limited memory BFGS method for nonlinear monotone equations. J. Comput. Math. 25(1), 89–96 (2007)
Zhou, W.; Li, D.: Limited memory BFGS method for nonlinear monotone equations. J. Comput. Math. 25(1), 89–96 (2007)
Zhou, W.; Shen, D.: An inexact PRP conjugate gradient method for symmetric nonlinear equations. Numer. Funct. Anal. Optim. 35(3), 370–388 (2014)
Zhou, W.; Shen, D.: Convergence properties of an iterative method for solving symmetric non-linear equations. J. Optim. Theory Appl. 164(1), 277–289 (2015)
Zhou, W.; Zhang, L.: A nonlinear conjugate gradient method based on the MBFGS secant condition. Optim. Methods Softw. 21(5), 707–714 (2006)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Waziri, M.Y., Ahmed, K. & Sabi’u, J. A Dai–Liao conjugate gradient method via modified secant equation for system of nonlinear equations. Arab. J. Math. 9, 443–457 (2020). https://doi.org/10.1007/s40065-019-0264-6
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s40065-019-0264-6