Published in: Numerical Algorithms 2/2024

Open Access 01.07.2023 | Original Paper

A nonsmooth Newton method for solving the generalized complementarity problem

Authors: Hevert Vivas, Rosana Pérez, Carlos A. Arias



Abstract

In this paper, we present a new nonsmooth Newton-type algorithm for solving the generalized complementarity problem, based on its reformulation as a system of nonlinear equations by means of a one-parametric family of complementarity functions. We demonstrate, under suitable hypotheses, that this algorithm converges locally and q-quadratically. In addition, we report numerical experiments that illustrate the good performance of the proposed algorithm.
Notes
Hevert Vivas, Rosana Pérez, and Carlos A. Arias contributed equally to this work.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Let \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}, \, F({\varvec{x}})=(F_1({\varvec{x}}), \ldots , F_n({\varvec{x}})), \) and \( G:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n},\,G({\varvec{x}})=(G_1({\varvec{x}}), \ldots , G_n({\varvec{x}})), \) be continuously differentiable functions. The generalized complementarity problem (GCP (FG)) is to find a solution of the following system of equations and inequalities
$$\begin{aligned} F_i{\varvec{(x)}}\ge 0, \hspace{0.5cm} G_i({\varvec{x}})\ge 0, \hspace{0.5cm} F_i{\varvec{(x)}}G_i({\varvec{x}})=0, \hspace{0.3cm} \forall i=1,\ldots ,n. \end{aligned}$$
(1)
Particular cases of GCP(FG), of great interest due to their numerous applications, are the linear complementarity problem (\(F({\varvec{x}})= M{\varvec{x}}+{\varvec{q}},\, M\in \mathbb {R}^{n\times n} \) and \(G({\varvec{x}})={\varvec{x}}\)) [1–5], the nonlinear complementarity problem (\(G({\varvec{x}})={\varvec{x}} \)) [2–4] and the implicit complementarity problem (\(\,G({\varvec{x}})={\varvec{x}}-E({\varvec{x}}), \) with E a continuously differentiable function) [5–8].
As in the particular cases mentioned previously, the GCP(FG) can be reformulated as the following system of nonlinear equations,
$$\begin{aligned} \Phi \left( {\varvec{x}}\right) =\left( \varphi (F_{1}({\varvec{x}}),G_{1}({\varvec{x}})),\, \,\ldots , \, \varphi (F_{n}({\varvec{x}}),G_{n}({\varvec{x}})) \right) ^T=0, \end{aligned}$$
(2)
where \(\Phi :\mathbb {R}^{n}\rightarrow \mathbb {R}^{n} \) and \(\,\varphi :\mathbb {R}^{2}\rightarrow \mathbb {R}. \) The latter is called a complementarity function and satisfies \( \varphi (a,b)=0 \Leftrightarrow a\ge 0,\, b\ge 0,\, ab=0. \) From this equivalence, we conclude two facts. First, the zero set of \(\varphi \) in the xy-plane is the curve formed by the nonnegative x and y semi-axes, which is nonsmooth; therefore, \(\varphi \) and \(\Phi \) are nonsmooth functions [9]. Second, \({\varvec{x}}^{*}\) solves GCP(FG) if and only if \(\Phi \left( {\varvec{x}}^{*} \right) =0, \) which guarantees that solving the generalized complementarity problem is equivalent to solving its reformulation.
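To make the defining equivalence concrete, here is a minimal numerical check (not from the paper) using the Fischer function \(\varphi (a,b)=\sqrt{a^2+b^2}-a-b,\) one standard choice of complementarity function:

```python
import math

def fischer(a, b):
    # Fischer complementarity function: vanishes iff a >= 0, b >= 0, a*b = 0
    return math.sqrt(a * a + b * b) - a - b

print(fischer(3.0, 0.0))   # 0.0: complementarity holds
print(fischer(0.0, 2.0))   # 0.0: complementarity holds
print(fischer(1.0, 1.0))   # nonzero: a*b > 0
print(fischer(-1.0, 0.0))  # nonzero: a < 0
```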
Alternatively, if \(\Psi :\mathbb {R}^{n}\rightarrow \mathbb {R} \) denotes the natural merit function defined by \(\Psi ({\varvec{x}}) = \frac{1}{2}\, \Phi ({\varvec{x}})^T\Phi ({\varvec{x}}) \), then we may rewrite the generalized complementarity problem as the following unconstrained minimization problem
$$\begin{aligned} \underset{{\varvec{x}} \in \mathbb {R}^{n}}{\text {Minimize}} \hspace{0.3cm} \Psi ({\varvec{x}}). \end{aligned}$$
(3)
Observe that any solution to (2) is a global minimizer of \(\Psi \) in \( \mathbb {R}^{n}. \) Reciprocally, any local solution \({\varvec{x}}\) of (3) such that \(\Psi ({\varvec{x}}) = 0 \) is a solution of (2).
Nonsmooth Newton-type methods are popular methods for solving nonlinear complementarity problems [2, 10–12]. They use the concept of generalized Jacobian [13]. Some of them have been extended to generalized complementarity problems [14, 15].
In this paper, we consider the reformulation (2) with \(\varphi =\varphi _\lambda , \) the one-parametric family of complementarity functions introduced in [12], defined by
$$\begin{aligned} \varphi _{\lambda }(a,b)=\sqrt{\left( a-b\right) ^{2}+\lambda ab}-a-b\,, \hspace{1cm} \lambda \in \left( 0,4\right) . \end{aligned}$$
(4)
Thus, we may rewrite the nonlinear system of equations (2) as \(\,\Phi _\lambda \left( {\varvec{x}}\right) = 0\, \) and the minimization problem (3) as
$$\begin{aligned} \underset{{\varvec{x}} \in \mathbb {R}^{n}}{\text {Minimize}} \hspace{0.3cm} \Psi _{\lambda }({\varvec{x}}) =\frac{1}{2} \left\| \Phi _{\lambda }({\varvec{x}})\right\| _2 ^2, \end{aligned}$$
(5)
whose objective function is continuously differentiable [12].
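As an illustration (not part of the paper), \(\Phi _\lambda \) and the merit function \(\Psi _\lambda \) can be assembled in a few lines; the pair (F, G) below is a hypothetical smooth example chosen only for the demonstration:

```python
import numpy as np

def phi_lambda(a, b, lam):
    # one-parametric complementarity function (4), lam in (0, 4)
    return np.sqrt((a - b) ** 2 + lam * a * b) - a - b

def Phi(x, F, G, lam):
    # componentwise reformulation (2) of GCP(F, G)
    return phi_lambda(F(x), G(x), lam)

def Psi(x, F, G, lam):
    # natural merit function (5): half the squared 2-norm of Phi_lambda
    r = Phi(x, F, G, lam)
    return 0.5 * r @ r

# hypothetical smooth pair (F, G), for illustration only
F = lambda x: np.array([x[0] ** 2 + x[1], x[1]])
G = lambda x: np.array([x[0], x[0] + x[1] - 1.0])

x = np.array([1.0, 2.0])
print(Phi(x, F, G, 2.0), Psi(x, F, G, 2.0))
```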
We chose the family \(\varphi _\lambda \) for two reasons. The first is its relationship with the two probably most prominent complementarity functions: the Fischer function [2, 16] and the Minimum function [17], defined by \(\varphi (a,b)=\sqrt{a^2+b^2}-a-b \) and \(\varphi (a,b)=\min \{a,b\}, \) respectively. Observe that, for \(\lambda =2,\,\varphi _\lambda \,\) reduces to the Fischer function and, when \(\lambda \) tends to 0,  \(\,\varphi _\lambda \,\) converges to a multiple of the Minimum function.
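Both limiting relationships can be verified numerically; the point (a, b) below is arbitrary:

```python
import math

def phi_lam(a, b, lam):
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b

a, b = 1.3, -0.7

# lam = 2: (a - b)^2 + 2ab = a^2 + b^2, so phi_2 is the Fischer function
fischer = math.sqrt(a * a + b * b) - a - b
print(abs(phi_lam(a, b, 2.0) - fischer))            # ~0

# lam -> 0: phi_lam -> |a - b| - a - b = -2 * min(a, b)
print(abs(phi_lam(a, b, 1e-12) + 2.0 * min(a, b)))  # ~0
```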
The second reason is that, as far as we know, the family has not been used in connection with the generalized complementarity problem: only the particular case \(\lambda =2 \) (the Fischer function) was used in [15] to propose a generalized Newton-type method to solve the GCP (FG), and the Minimum function was used in [14] to analyze the local convergence of a Levenberg-Marquardt-type method for the GCP(FG).
In this paper, we analyze and extend to generalized complementarity problems the properties of the operator \(\Phi _\lambda \) investigated in [12] for the particular case of nonlinear complementarity, and we present a new nonsmooth Newton-type algorithm for solving the GCP (FG) based on the reformulation (2) with the one-parametric family of complementarity functions (4). We demonstrate, under suitable hypotheses, that this algorithm converges locally and q-quadratically, and we report numerical tests that illustrate the good performance of the new algorithm.
We organize this article as follows. In Section 2, we recall some definitions and results from nonsmooth analysis which we use in the document. We also extend some definitions of regularity to generalized complementarity. In Section 3, we analyze properties of the operator \(\Phi _\lambda \) as extensions of those investigated in [12] for nonlinear complementarity. In Section 4, we present the algorithm and convergence results. In Section 5, we present the analysis of the numerical performance of the proposed algorithm. Finally, in Section 6, we present some conclusions.

2 Preliminaries

We recall some definitions and results from nonsmooth analysis which we use in this work.
Definition 1
[18]. Let \( K:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n} \) be a locally Lipschitzian function and let \(D_K\) be the set where K is differentiable. For all \(\varvec{x} \in {\mathbb {R}}^{n}, \) the set given by
$$\begin{aligned} \partial _{B} K(\varvec{x})=\left\{ \lim _{k\rightarrow \infty } \,K' (\varvec{x}_k) \in \mathbb {R}^{n\times n}: \,\lim _{k\rightarrow \infty }\, \varvec{x}_k \,=\, \varvec{x},\,\, \varvec{x}_k\in D_K \right\} \end{aligned}$$
(6)
is known as the generalized B-Jacobian of K at \(\varvec{x}\).
Definition 2
The convex hull of \(\partial _{B} K({\varvec{x}}) \) is called the generalized Jacobian of K at \({\varvec{x}}, \) denoted \(\partial K({\varvec{x}}).\)
Usually, the set \(\partial K({\varvec{x}}) \) is difficult to compute. An alternative is the following overestimation [13],
$$\begin{aligned} \partial K({\varvec{x}})^T \subseteq \partial K_1 ({\varvec{x}}) \times \cdots \times \partial K_n ({\varvec{x}}), \end{aligned}$$
(7)
where the right side, denoted \( \partial _C K({\varvec{x}}) \) for short, is called the C-subdifferential of K at \({\varvec{x}}\, \) [19]; it is the set of matrices in \(\mathbb {R}^{n\times n} \) whose ith column is the generalized gradient of the ith component function of K.
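A one-dimensional illustration (not from the paper) may help: for \(K(x)=|x|,\) the derivatives at nearby differentiable points accumulate at \(-1\) and \(+1\).

```python
# For K(x) = |x|, K'(x) = sign(x) away from the origin.  Limits of K' along
# sequences x_k -> 0 accumulate at -1 and +1, so the B-subdifferential at 0
# is {-1, +1}; its convex hull, the generalized Jacobian, is [-1, 1].
def deriv_abs(x):
    return 1.0 if x > 0 else -1.0  # derivative of |x| for x != 0

limits = {deriv_abs(t) for t in (1e-1, 1e-3, -1e-1, -1e-3)}
print(sorted(limits))  # [-1.0, 1.0]
```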
Definition 3
[20]. A locally Lipschitz continuous and directionally differentiable function \(\,K:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\,\) is called semismooth at a point \(\,{\varvec{x}}\in \mathbb {R}^{n}\,\) if
$$\,H{\varvec{d}}-K'({\varvec{x}};{\varvec{d}}) = o(\left\| {\varvec{d}} \right\| )\,$$
holds for every \(\,{\varvec{d}}\rightarrow {\varvec{0}}\,\) and every \(\,H \in \partial K({\varvec{x}}+{\varvec{d}})\,\) and, in addition
$$\begin{aligned} \lim _{H \in \, \partial K(\varvec{x}+t\varvec{h^{'}}), \, {\varvec{h}}^{'}\rightarrow {\varvec{h}}, \,t\rightarrow 0^{+}} \,\left\{ H {\varvec{h}}^{'}\right\} \end{aligned}$$
(8)
exists for any \(\, \varvec{h} \in \mathbb {R}^{n},\,\) and strongly semismooth at \(\,{\varvec{x}}\in \mathbb {R}^{n}\,\) if
$$\,H{\varvec{d}}-K'({\varvec{x}};{\varvec{d}}) = O(\left\| {\varvec{d}} \right\| ^{2})\,$$
holds for every \(\,{\varvec{d}}\rightarrow {\varvec{0}},\,\) and every \(\,H \in \partial K({\varvec{x}}+{\varvec{d}}).\,\) Here, \(\,K'({\varvec{x}};{\varvec{d}})\,\) denotes the usual directional derivative of \(\,K\,\) at \(\,{\varvec{x}}\,\) in the direction \(\,{\varvec{d}}.\,\)
Definition 4
[12]. The function \(K:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\,\) is of class \(LC^{1}\) if it is differentiable and its derivative is locally Lipschitz continuous.
Definition 5
[10] The function \( K :\mathbb {R}^{n} \longrightarrow \mathbb {R}^{n} \) is B-differentiable at \({\varvec{x}}, \) if it is directionally differentiable at \({\varvec{x}} \) and
$$\begin{aligned} K\left( {\varvec{x}}+{\varvec{h}}\right) -K\left( {\varvec{x}}\right) -K' ({\varvec{x}},{\varvec{h}})=o(\left\| {\varvec{h}}\right\| ). \end{aligned}$$
(9)
Moreover, K is B-differentiable of degree 2 at \({\varvec{x}}\) if and only if
$$\begin{aligned} K\left( {\varvec{x}}+{\varvec{h}}\right) = K\left( {\varvec{x}}\right) +K' ({\varvec{x}};{\varvec{h}})+O(\left\| {\varvec{h}}\right\| ^{2} ). \end{aligned}$$
(10)
Furthermore, the directional derivative \(K'(\cdot ,\cdot ) \) is semicontinuous of degree 2 at \({\varvec{x}}\) [10] if there exist a constant \(\,L\,\) and a neighborhood N of \({\varvec{x}}\) such that for all \(\,{\varvec{x}}+{\varvec{h}}\in N,\,\)
$$\begin{aligned} \,\left\| K'\left( {\varvec{x}}+{\varvec{h}}, {\varvec{h}}\right) -K'\left( {\varvec{x}}, {\varvec{h}}\right) \right\| \le L \left\| {\varvec{h}}\right\| ^{2}. \end{aligned}$$
(11)
The following two lemmas will be useful in determining the rate of convergence of our algorithm.
Lemma 1
(Lemma 2.2 [10]) Suppose that \(K:\mathbb {R}^{n}\rightarrow \mathbb {R}^{m}\) is directionally differentiable in a neighborhood of \({\varvec{x}}.\) The following statements are equivalent:
(1)
K is semismooth at \({\varvec{x}}.\)
 
(2)
\(K'(\cdot ,\cdot )\) is semicontinuous at \( {\varvec{x}}. \)
 
(3)
for any \( H \in \partial K({\varvec{x}}+{\varvec{h}}),\) \({\varvec{h}}\longrightarrow {\varvec{0}},\,\, H{\varvec{h}}-K'({\varvec{x}};{\varvec{h}})=o(\left\| {\varvec{h}}\right\| ).\)
 
Lemma 2
(Lemma 2.3 [10]) Suppose that \(K:\mathbb {R}^{n}\rightarrow \mathbb {R}^{m}\) is directionally differentiable in a neighborhood of \( {\varvec{x}}. \) The following statements are equivalent:
(1)
K is semicontinuous of degree 2 at \( {\varvec{x}}. \)
 
(2)
for any \( H \in \partial K({\varvec{x}}+{\varvec{h}}),\) \( {\varvec{h}}\longrightarrow {\varvec{0}},\,\, H{\varvec{h}}-K'({\varvec{x}};{\varvec{h}})=O(\left\| {\varvec{h}}\right\| ^2).\) If (1) or (2) holds, then K is B-differentiable of degree 2 at \({\varvec{x}}.\)
 
Next, we define the concepts of BD-regularity and R-regularity in the generalized complementarity context. For this, we consider a solution \({\varvec{x}}_{*} \) of GCP(FG) and the following index sets, \(\alpha _*=\left\{ i\in I:F_i({\varvec{x}}_{*})>0=G_i({\varvec{x}}_{*})\right\} , \,\) \(\,\beta _*= \left\{ i\in I:F_i({\varvec{x}}_{*})=0=G_i({\varvec{x}}_{*})\right\} \) and \(\gamma _*= \left\{ i\in I:F_i({\varvec{x}}_{*})=0<G_i({\varvec{x}}_{*})\right\} . \)
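As a sketch (with a hypothetical tolerance `tol` standing in for exact zero tests), the three index sets can be computed from the values of F and G at a candidate solution:

```python
import numpy as np

def index_sets(Fx, Gx, tol=1e-10):
    # active index sets at a candidate solution of GCP(F, G):
    # alpha: F_i > 0 = G_i, beta: F_i = 0 = G_i, gamma: F_i = 0 < G_i
    n = len(Fx)
    alpha = [i for i in range(n) if Fx[i] > tol and abs(Gx[i]) <= tol]
    beta = [i for i in range(n) if abs(Fx[i]) <= tol and abs(Gx[i]) <= tol]
    gamma = [i for i in range(n) if abs(Fx[i]) <= tol and Gx[i] > tol]
    return alpha, beta, gamma

Fx = np.array([2.0, 0.0, 0.0])
Gx = np.array([0.0, 0.0, 3.0])
print(index_sets(Fx, Gx))  # ([0], [1], [2])
```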
Definition 6
[4]. Let \(\,\varvec{x}_*\,\) be a solution of GCP(FG).
1.
If all matrices \(\,H \in \partial _{B} \Phi ({\varvec{x}}_{*})\,\) are nonsingular, \(\,\varvec{x}_*\,\) is called a BD-regular solution.
 
2.
Let \( F'({\varvec{x}}_{*}) \) be nonsingular and \(K=G'({\varvec{x}}_{*})F'({\varvec{x}}_{*})^{-1}.\) If the submatrix \( K_{\alpha \alpha } \) is nonsingular and the Schur complement of K
$$\, K_{\beta \beta }-K_{\beta \alpha }K^{-1}_{\alpha \alpha }K_{\alpha \beta }\in \mathbb {R}^{\left| \beta \right| \times \left| \beta \right| }\,$$
is a P-matrix, \({\varvec{x}}_{*}\) is called an R-regular solution.
 
Observe that, for the particular case \(G({\varvec{x}})={\varvec{x}}, \) Definition 6 reduces to that of BD-regularity and R-regularity for nonlinear complementarity.
Following [12], we will use the notation \(f_{\lambda } (a,b)\) for the first term on the right side of (4).
Lemma 3
[12]. Let \(f_\lambda :\mathbb {R}^{2}\rightarrow \mathbb {R} \) be defined by
$$\begin{aligned} f _{\lambda }(a,b)=\sqrt{\left( a-b\right) ^{2}+\lambda ab}, \hspace{0.5cm} \lambda \in (0,4). \end{aligned}$$
(12)
There exists a constant \(c_\lambda \in (0,2)\) such that \(\left\| \nabla f _{\lambda }(a,b)\right\| ^{2}\le c_{\lambda } \) for every nonzero vector \((a,b)\in \mathbb {R}^{2}.\)
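The bound of Lemma 3 can be probed numerically by sampling random nonzero points; this is only a sanity check under an arbitrary choice of \(\lambda \), not a proof:

```python
import math
import random

def grad_norm_sq(a, b, lam):
    # squared norm of the gradient of f_lambda at (a, b) != (0, 0)
    f = math.sqrt((a - b) ** 2 + lam * a * b)
    fa = (2 * (a - b) + lam * b) / (2 * f)
    fb = (-2 * (a - b) + lam * a) / (2 * f)
    return fa * fa + fb * fb

random.seed(0)
lam = 1.5
worst = max(grad_norm_sq(random.uniform(-5, 5), random.uniform(-5, 5), lam)
            for _ in range(10_000))
print(worst)  # stays below 2, consistent with c_lambda < 2
```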
Finally, we present a characterization of the class of P-matrices from the nonlinear complementarity setting, which will be useful later.
Proposition 1
(Proposition 2.7 [12]) A matrix of the form \(\,D_{a}+D_{b}N\,\) is nonsingular for all positive (negative) semidefinite diagonal matrices \(D_{a},D_{b}\,\in \mathbb {R}^{n \times n}\) such that \(\,D_a+D_b\,\) is positive (negative) definite if and only if \( N\in \mathbb {R}^{n \times n}\,\) is a P-matrix.

3 The operator \(\Phi _{{\lambda }}\)

As we mentioned earlier, properties of the operator \( \Phi _\lambda \) were investigated in detail in [12] for the reformulation of the nonlinear complementarity problem (i.e., when \( G({\varvec{x}}) ={\varvec{x}} \)). In this section, we verify that these properties extend easily to the generalized complementarity problem.
The following lemma and corollary give upper bounds for the partial derivatives of the function \(f_\lambda , \) defined by (12), which will be used in later results.
Lemma 4
The partial derivatives of the function \(f_\lambda \) are bounded from above by 1 for every nonzero vector \((a,b)\in \mathbb {R}^{2}.\)
Proof
We must prove that for every nonzero vector \((a,b)\in \mathbb {R}^{2},\) the following inequalities are satisfied:
$$\begin{aligned} \frac{\partial f_{\lambda }(a,b)}{\partial a} \le 1 \hspace{1cm} \text{ and }\hspace{1cm} \frac{\partial f_{\lambda }(a,b)}{\partial b} \le 1. \end{aligned}$$
(13)
The first inequality was proved in [9]. Analogously, we prove the second below. Starting from the inequality \(\lambda a^{2}(\lambda -4)\le 0, \) adding \(4(a-b)^{2}+4\lambda ab \) to both sides and performing some algebraic manipulations, we obtain the inequalities
$$4(a-b)^{2}+4\lambda ab+ \lambda ^{2}a^{2}-4\lambda a^{2} \le 4f^{2}_{\lambda }(a,b)$$
$$4(a-b)^{2}-4\lambda a(a-b)+ \lambda ^{2}a^{2}\le 4f^{2}_{\lambda }(a,b)$$
$$[-2(a-b)+\lambda a]^{2}\le 4f^{2}_{\lambda }(a,b)$$
which implies that \(\left| \, -2(a-b)+\lambda a\,\right| \le 2f_{\lambda }(a,b), \) with \(f_{\lambda }(a,b)>0; \) then
$$\left| \frac{ -2(a-b)+\lambda a}{2f_{\lambda }(a,b)}\right| \le 1,$$
thus
$$-1 \le \frac{ -2(a-b)+\lambda a}{2f_{\lambda }(a,b)}\le 1.$$
Therefore, we obtain the second inequality in (13).\(\square \)
Corollary 1
The partial derivatives of the function \(f_\lambda \) satisfy the inequality
$$\frac{\partial f_{\lambda }(a,b)}{\partial a} + \frac{\partial f_{\lambda }(a,b)}{\partial b} < 2, $$
for every nonzero vector \((a,b)\in \mathbb {R}^{2}.\)
Proof
From Lemma 4, \(\,\frac{\partial f_{\lambda }(a,b)}{\partial a} + \frac{\partial f_{\lambda }(a,b)}{\partial b} \le 2. \) Assume that equality holds. Since, by Lemma 4, each partial derivative is at most 1, this forces \(\,\frac{\partial f_{\lambda }(a,b)}{\partial a} =1\,\) and \(\, \frac{\partial f_{\lambda }(a,b)}{\partial b} = 1,\,\) so \(\left\| \nabla f _{\lambda }(a,b)\right\| ^{2} =2, \,\) which contradicts Lemma 3. Therefore, \(\,\frac{\partial f_{\lambda }(a,b)}{\partial a}+\frac{\partial f_{\lambda }(a,b)}{\partial b} < 2.\) \(\square \)
The following lemma gives a compact expression for the gradient of the component functions of \(\Phi _\lambda ,\) where \(\varphi _\lambda \) is differentiable. This expression will be useful to characterize the generalized Jacobian matrices of \(\,\Phi _\lambda \,\) in \(\,\varvec{x}.\)
Lemma 5
Let \(\,\Phi _{\lambda , i} (\varvec{x}) \,= \,\varphi _\lambda (F_i(\varvec{x}), G_i(\varvec{x})) \,\) for \(\,i \in \{1,2,\ldots ,n\},\,\) and let \(\,\varvec{x}\,\) be such that \(\,(F_i(\varvec{x}),G_i(\varvec{x})) \ne (0, 0). \) Then the gradient of \(\Phi _{\lambda , i} \) at \( \varvec{x} \) is given by
$$\begin{aligned} \nabla \Phi _{\lambda , i} (\varvec{x}) =(a_{i}({\varvec{x}})-1)\nabla F_{i} ({\varvec{x}})+ (b_{i}({\varvec{x}})- 1)\nabla G_{i} ({\varvec{x}}), \end{aligned}$$
(14)
where
$$\begin{aligned} a_{i}({\varvec{x}})= \frac{2\left( F_{i}({\varvec{x}})-G_{i}({\varvec{x}})\right) + \lambda G_{i}({\varvec{x}})}{2\sqrt{\left( F_{i}({\varvec{x}})-G_{i}({\varvec{x}})\right) ^{2}+\lambda F_{i}({\varvec{x}})G_{i}({\varvec{x}})}} \end{aligned}$$
(15)
$$\begin{aligned} b_{i}({\varvec{x}})= \frac{-2\left( F_{i}({\varvec{x}})-G_{i}({\varvec{x}})\right) +\lambda F_{i}({\varvec{x}})}{2\sqrt{\left( F_{i}({\varvec{x}})-G_{i}({\varvec{x}})\right) ^{2}+ \lambda F_{i}({\varvec{x}})G_{i}({\varvec{x}})}}\cdot \end{aligned}$$
(16)
Proof
Let \({\varvec{x}} \in \mathbb {R}^{n} \) be such that \((F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))\ne (0,0). \) Then \(\Phi _{\lambda , i} \) and \(\varphi _\lambda \) are differentiable at \({\varvec{x}}, \) and \( \nabla \Phi _{\lambda , i} (\varvec{x}) =\nabla {\varphi _\lambda }(F_i({\varvec{x}}),G_i({\varvec{x}})). \) After some algebraic calculations using the chain rule, we have that
$$\begin{aligned} \nabla \Phi _{\lambda , i}({\varvec{x}}) = \nabla \varphi _{\lambda }(F_i({\varvec{x}}),G_i({\varvec{x}}))&= \left[ \begin{array}{c} (a_{i}({\varvec{x}})-1)\dfrac{\partial F_{i}({\varvec{x}})}{\partial x_{1}}+ (b_{i}({\varvec{x}})-1)\dfrac{\partial G_{i}({\varvec{x}})}{\partial x_{1}}\\ \vdots \\ (a_{i}({\varvec{x}})-1)\dfrac{\partial F_{i}({\varvec{x}})}{\partial x_{n}}+ (b_{i}({\varvec{x}})-1)\dfrac{\partial G_{i}({\varvec{x}})}{\partial x_{n}} \end{array} \right] , \end{aligned}$$
from which it follows immediately that
$$\begin{aligned} \nabla \Phi _{\lambda , i} (\varvec{x})= & {} (a_{i}({\varvec{x}})-1)\nabla F_{i}({\varvec{x}})+ (b_{i}({\varvec{x}})-1)\nabla G_{i} ({\varvec{x}}), \end{aligned}$$
where \(a_i({\varvec{x}})\) and \(b_i({\varvec{x}})\) are given by (15) and (16), respectively. \(\square \)
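Formula (14) can be checked against finite differences at a point where \((F_i({\varvec{x}}),G_i({\varvec{x}}))\ne (0,0);\) the functions F and G below are hypothetical, chosen only for the test:

```python
import numpy as np

lam = 2.5  # any value in (0, 4)

def F(x):
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) + x[1]])

def G(x):
    return np.array([x[0] + 1.0, x[0] * x[1] + 2.0])

def Phi_i(x, i):
    a, b = F(x)[i], G(x)[i]
    return np.sqrt((a - b) ** 2 + lam * a * b) - a - b

def grad_formula(x, i, h=1e-6):
    # gradient (14) with a_i, b_i from (15)-(16); the Jacobian rows of F
    # and G are approximated by forward differences
    a, b = F(x)[i], G(x)[i]
    f = np.sqrt((a - b) ** 2 + lam * a * b)
    ai = (2 * (a - b) + lam * b) / (2 * f)
    bi = (-2 * (a - b) + lam * a) / (2 * f)
    n = len(x)
    gF = np.array([(F(x + h * np.eye(n)[j])[i] - a) / h for j in range(n)])
    gG = np.array([(G(x + h * np.eye(n)[j])[i] - b) / h for j in range(n)])
    return (ai - 1.0) * gF + (bi - 1.0) * gG

x, i, h = np.array([0.4, -0.3]), 0, 1e-6
num = np.array([(Phi_i(x + h * np.eye(2)[j], i) - Phi_i(x, i)) / h
                for j in range(2)])
print(np.allclose(grad_formula(x, i), num, atol=1e-4))  # True
```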
The following result extends Proposition 2.5 in [12] to generalized complementarity. It gives an overestimation of the generalized Jacobian of \(\Phi _{\lambda }\) at \({\varvec{x}}.\)
Proposition 2
For all \({\varvec{x}}\in \mathbb {R}^{n},\,\) we have
$$\, \partial \Phi _{\lambda }({\varvec{x}})\,\subseteq D_{a}({\varvec{x}})F'({\varvec{x}})+D_{b}({\varvec{x}})G'({\varvec{x}}),\,$$
where \(D_{a}({\varvec{x}})=diag \left( a_{1}({\varvec{x}})-1,\ldots ,a_{n}({\varvec{x}})-1\right) \,\) and \( \,D_{b}({\varvec{x}})=diag( b_{1}({\varvec{x}})-1,\) \(\ldots , \,b_{n}({\varvec{x}})-1)\) are diagonal matrices, with \(a_{i}({\varvec{x}})\,\) and \(b_{i}({\varvec{x}})\,\) given by (15) and (16), if \(\,(F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))\ne (0,0),\,\) and by
$$\,a_{i}({\varvec{x}})= \xi _{i},\,\,\, b_{i}({\varvec{x}})=\chi _{i}, \, \text{ for } \text{ all } (\xi _{i},\chi _{i})\in \mathbb {R}^{2}\,\text{ such } \text{ that } \,\Vert (\xi _{i},\chi _{i})\Vert \le \sqrt{c_{\lambda }}, $$
if \(\,(F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))= (0,0), \) where \(c_{\lambda } \) is the constant from Lemma 3.
Proof
From (7) and by Lipschitz continuity of \(\Phi _\lambda ,\) we have
$$\begin{aligned} \partial \Phi _\lambda ({\varvec{x}})^T\subseteq \partial \Phi _{\lambda ,1}({\varvec{x}})\times \cdots \times \partial \Phi _{\lambda ,n}({\varvec{x}}), \end{aligned}$$
where \(\Phi _{\lambda ,i} \) is the ith component function of \(\Phi _{\lambda } \) and \(\partial \Phi _{\lambda , i}({\varvec{x}}) \) denotes the generalized gradient of \(\Phi _{\lambda , i}\) at \({\varvec{x}}.\)
If \(\,(F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))\ne (0,0),\,\) the function \(\,\Phi _{\lambda ,i}\,\) is continuously differentiable. Therefore the generalized gradient \(\,\partial \Phi _{\lambda ,i}({\varvec{x}}) \) is the set whose only element is the gradient of \(\Phi _{\lambda ,i} \) at \({\varvec{x}}, \) which by Lemma 5 is given by
$$\begin{aligned} \partial \Phi _{\lambda , i}({\varvec{x}})=\{(a_{i}({\varvec{x}})-1)\nabla F_{i}({\varvec{x}})+ (b_{i}({\varvec{x}})-1)\nabla G_{i}({\varvec{x}})\}, \end{aligned}$$
(17)
where \(a_i({\varvec{x}})\) and \(b_i({\varvec{x}})\) are given by (15) and (16), respectively.
If \((F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))= (0,0) \) then \(\Phi _{\lambda ,i} \) is not differentiable at \({\varvec{x}}\,\) and therefore, for \(H \in \partial _{B}\Phi _{\lambda ,i}({\varvec{x}}),\) we have
$$\begin{aligned} H= \lim \limits _{{\varvec{x}}_{k}\rightarrow {\varvec{x}}} \nabla \Phi _{\lambda , i}({\varvec{x}}_{k}), \end{aligned}$$
(18)
where \(\left\{ {{\varvec{x}}_{k}}\right\} \) is a sequence of points in \(D_{\Phi _{\lambda , i}}\,\) such that \(\,{{\varvec{x}}_{k}\rightarrow {\varvec{x}}}\, \) as \(\,k\rightarrow \infty .\) Thus, by Lemma 5, for each \(\,{{\varvec{x}}_{k}} \in D_{\Phi _{\lambda , i}},\,\) the gradient of \(\,\Phi _{\lambda ,i}\,\) at \(\,{\varvec{x}}_{k}\,\) is given by
$$\begin{aligned} \nabla \Phi _{\lambda , i }({\varvec{x}}_{k}) =(a_{i}({\varvec{x}}_{k})-1)\nabla F_{i}({\varvec{x}}_{k})+(b_{i}({\varvec{x}}_{k})-1)\nabla G_{i}({\varvec{x}}_{k}). \end{aligned}$$
(19)
Since the limit in (18) exists and F and G are continuously differentiable, from (18) and (19) we have
$$\begin{aligned} H= & {} \lim \limits _{{\varvec{x}}_{k}\rightarrow {\varvec{x}}}\left[ \left( a_{i}({\varvec{x}}_{k})-1\right) \nabla F_{i}({\varvec{x}}_{k})+\left( b_{i}({\varvec{x}}_{k})-1\right) \nabla G_{i}({\varvec{x}}_{k})\right] \\= & {} (\xi _{i}-1)\nabla F_{i}({\varvec{x}})+(\chi _{i}-1)\nabla G_{i}({\varvec{x}}), \end{aligned}$$
where \(\xi _{i}=\lim \limits _{{\varvec{x}}_{k}\rightarrow {\varvec{x}}}a _{i}({\varvec{x}}_{k})\) and \(\chi _{i}=\lim \limits _{{\varvec{x}}_{k}\rightarrow {\varvec{x}}}b _{i}({\varvec{x}}_{k}).\) On the other hand, for each k
$$\begin{aligned} \left( a_{i}({\varvec{x}}_{k}), b_{i}({\varvec{x}}_{k})\right) ^{T} = \nabla f_{\lambda }(F_{i}({\varvec{x}}_{k}), G_{i}({\varvec{x}}_{k})), \end{aligned}$$
(20)
where \(f_{\lambda }\) is defined by (12). By Lemma 3, for each \(\,i=1,\ldots ,n\,\) and for each \(\,k \in \mathbb {N},\,\) \(\left\| \nabla f_{\lambda }(F_{i}({\varvec{x}}_{k}), G_{i}({\varvec{x}}_{k}))\right\| ^{2} \le c_{\lambda }; \) so
$$\,\lim \limits _{k\rightarrow \infty }\left\| \nabla f_{\lambda }(F_{i}({\varvec{x}}_{k}), G_{i}({\varvec{x}}_{k}))\right\| ^{2} \le c_\lambda .\,$$
Therefore, \(\Vert (\xi _{i},\chi _{i})\Vert \le \sqrt{c_{\lambda }}.\,\) From the above, the statement of the proposition follows easily. \(\square \)
The following result generalizes Proposition 2.6 in [12] to the case where G is any continuously differentiable function. Furthermore, it guarantees that the diagonal elements \(a_i({\varvec{x}})-1\) and \(b_i({\varvec{x}})-1\) defined in Proposition 2 are nonpositive (in particular for those indices for which \((F_{i}({\varvec{x}}),G_{i}({\varvec{x}}))= (0,0) \)).
Proposition 3
Any \(H\in \partial \Phi _{\lambda }({\varvec{x}})\) can be written in the form
$$\begin{aligned} H=(D_{a} -I)F'({\textbf {x}})+(D_{b} -I)G'({\textbf {x}}), \end{aligned}$$
(21)
where \((D_{a} -I)\) and \((D_{b} -I)\) are negative semidefinite diagonal matrices such that \(D_{a} +D_{b} -2I\) is negative definite.
Proof
It is analogous to the proof of Proposition 2.6 in [12], taking into account that we must use Proposition 2. \(\square \)
The next proposition gives a new characterization of the class of P-matrices in the context of generalized complementarity. It reduces to the one presented in [12] for nonlinear complementarity when G is the identity function.
Proposition 4
Let \(D_a \) and \(D_b \) be positive (negative) semidefinite diagonal matrices such that \(D_a+D_b\) is positive (negative) definite, and \(M,\, N\) \( \in \mathbb {R}^{n \times n},\) with M nonsingular. The matrix \(D_{a}M+D_{b}N\) is nonsingular if and only if \(NM^{-1} \in \mathbb {R}^{n \times n}\) is a P-matrix.
Proof
Clearly,
$$\begin{aligned} D_{a}M+D_{b}N=\left( D_a +D_{b}NM^{-1}\right) M. \end{aligned}$$
(22)
  • \(\Rightarrow \)) We assume that \(\,D_{a}M+D_{b}N\,\) is nonsingular. Since M is nonsingular (by hypothesis), from (22) we have that \( D_a +D_{b}NM^{-1}\,\) is also nonsingular and, by Proposition 1, \(NM^{-1}\) is a P-matrix.
  • \(\Leftarrow \)) If \(NM^{-1}\) is a P-matrix, Proposition 1 guarantees that the matrix \( D_a +D_{b}NM^{-1}\,\) is nonsingular. Using this fact in (22) and recalling that M is nonsingular, we conclude that \(\,D_{a}M+D_{b}N\,\) is also nonsingular.
\(\square \)
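A small numerical illustration of Proposition 4 (not from the paper) follows, with a brute-force P-matrix test via principal minors; the matrices M and N are arbitrary examples:

```python
import itertools
import numpy as np

def is_P_matrix(A, tol=1e-12):
    # brute-force test: all principal minors strictly positive
    n = A.shape[0]
    for r in range(1, n + 1):
        for S in itertools.combinations(range(n), r):
            if np.linalg.det(A[np.ix_(S, S)]) <= tol:
                return False
    return True

M = np.array([[2.0, 1.0], [0.0, 1.0]])  # nonsingular
N = np.array([[3.0, 1.0], [1.0, 2.0]])
K = N @ np.linalg.inv(M)
print(is_P_matrix(K))  # True

# with D_a, D_b >= 0 diagonal and D_a + D_b = I positive definite,
# D_a M + D_b N must then be nonsingular
rng = np.random.default_rng(1)
for _ in range(100):
    da = rng.uniform(0.0, 1.0, 2)
    A = np.diag(da) @ M + np.diag(1.0 - da) @ N
    assert abs(np.linalg.det(A)) > 1e-10
print("D_a M + D_b N nonsingular in all trials")
```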
The following theorem gives a sufficient condition to guarantee the nonsingularity of the elements of the generalized Jacobian of \(\,\Phi _\lambda \,\) in a solution \(\,{\varvec{x}}_{*}\,\) of the GCP(FG).
Theorem 1
If \(\,{\varvec{x}}_{*}\,\) is an R-regular solution of the GCP(FG) then all the matrices in the generalized Jacobian \( \partial \Phi _{\lambda }({\varvec{x}}_{*}) \) are nonsingular.
Proof
Let \(H\in \partial \Phi _{\lambda }({\varvec{x}}_{*}) \) be fixed but arbitrary. By Proposition 2, there are diagonal matrices \(D_a = D_a ({\varvec{x}}_{*}) \) and \( D_b = D_b ({\varvec{x}}_{*}) \) such that
$$\begin{aligned} H=D_{a}F'({\varvec{x}}_{*})+D_{b}G'({\varvec{x}}_{*}). \end{aligned}$$
(23)
Since \({\varvec{x}}_{*} \) is an R-regular solution of the GCP(FG), we have that \(F'({\varvec{x}}_{*})\) is nonsingular. Thus, from (23), we have
$$\begin{aligned} \overline{H}=HF'({\varvec{x}}_{*})^{-1}=D_{a}+D_{b}G'({\varvec{x}}_{*})F'({\varvec{x}}_{*})^{-1}=D_{a}+D_{b}K, \end{aligned}$$
where \(\,K=G'({\varvec{x}}_{*})F'({\varvec{x}}_{*})^{-1}. \) From here, the proof can be carried out in essentially the same way as the one for Theorem 2.8 in [12]. \(\square \)
Corollary 2
If \(\,{\varvec{x}}_{*} \) is an R-regular solution of the GCP(FG) then \( {\varvec{x}}_{*} \) is also a BD-regular solution.
Proof
Let \({\varvec{x}}_{*} \) be an R-regular solution of the GCP(FG). Then all matrices in \( \partial \Phi _{\lambda }({\varvec{x}}_{*}) \) are nonsingular. Since \(\, \partial _B\Phi _{\lambda }({\varvec{x}}_{*})\subseteq \partial \Phi _{\lambda }({\varvec{x}}_{*}), \, \) we have that if \( H \in \partial _B\Phi _{\lambda }({\varvec{x}}_{*}) \) then H is nonsingular. Therefore \({\varvec{x}}_{*} \) is a BD-regular solution of the GCP(FG). \(\square \)
The following theorem guarantees the semismoothness of \(\Phi _{{\lambda }} \) and gives a sufficient condition for its strong semismoothness.
Theorem 2
The following propositions are fulfilled:
1.
\(\Phi _{{\lambda }}\) is semismooth.
 
2.
If F and G are class \(LC^1\) then \(\Phi _{{\lambda }}\) is strongly semismooth.
 
Proof
The proof is analogous to the one given in [12] in the nonlinear complementarity context, and it is an immediate consequence of the fact that \(\varphi _{\lambda }\) is strongly semismooth. \(\square \)
Finally, we present the following result which will be very useful in the next section.
Theorem 3
The function \(\Psi _\lambda \) is continuously differentiable with \(\nabla \Psi _\lambda ({\varvec{x}})=H^T \Phi _\lambda ({\varvec{x}}), \) for any \(H\in \partial \Phi _\lambda ({\varvec{x}}).\)
Proof
Using Proposition 3 and properties of the family \(\varphi _\lambda \) [12], the proof is practically the same as the one for Proposition 3.4 in [21]. \(\square \)
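The identity \(\nabla \Psi _\lambda ({\varvec{x}})=H^T \Phi _\lambda ({\varvec{x}})\) can be verified numerically at a point of differentiability; the pair (F, G) below is again a hypothetical example with hand-written Jacobians:

```python
import numpy as np

lam = 2.0

def F(x):
    return np.array([x[0] + x[1] ** 2, x[1] + 1.0])

def G(x):
    return np.array([2.0 * x[0], x[0] + x[1]])

def Phi(x):
    a, b = F(x), G(x)
    return np.sqrt((a - b) ** 2 + lam * a * b) - a - b

def Psi(x):
    r = Phi(x)
    return 0.5 * r @ r

def H_at(x):
    # H = (D_a - I) F'(x) + (D_b - I) G'(x) at a point of differentiability;
    # the Jacobians of the sample F and G are written out by hand
    a, b = F(x), G(x)
    f = np.sqrt((a - b) ** 2 + lam * a * b)
    Da = np.diag((2 * (a - b) + lam * b) / (2 * f) - 1.0)
    Db = np.diag((-2 * (a - b) + lam * a) / (2 * f) - 1.0)
    JF = np.array([[1.0, 2.0 * x[1]], [0.0, 1.0]])
    JG = np.array([[2.0, 0.0], [1.0, 1.0]])
    return Da @ JF + Db @ JG

x, h = np.array([0.7, -0.2]), 1e-6
g = H_at(x).T @ Phi(x)
fd = np.array([(Psi(x + h * np.eye(2)[j]) - Psi(x - h * np.eye(2)[j])) / (2 * h)
               for j in range(2)])
print(np.allclose(g, fd, atol=1e-6))  # True
```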

4 Algorithm and convergence results

In this section, we present a new global nonsmooth Newton-type algorithm to solve the GCP(FG) and some convergence results of this algorithm.
Remark 1
The proposed algorithm uses at each iteration the matrix H defined in (21), which in turn uses the Jacobian matrices of F and G. It can be seen as an extension of the algorithm proposed in [15] to all members of the family of complementarity functions \(\varphi _\lambda \) defined in (4).
Remark 2
Algorithm 1 differs from the one proposed in [15] not only in the complementarity function but also in the line search. Our algorithm uses the one-parametric family of complementarity functions defined in (4), and its line search guarantees that the step satisfies the Armijo condition [22], while the other algorithm uses the Fischer function and its line search guarantees that the step satisfies the Goldstein conditions [22].
Remark 3
From the previous remark, Algorithm 1 can be seen as a generalization of the algorithm proposed in [15] to the family of complementarity functions (4); moreover, comparing their line searches, the one in Algorithm 1 is less restrictive.
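The Armijo backtracking with stepsizes \(2^{-i}\) mentioned above can be sketched as follows; the constants and the toy merit function are illustrative, not those of Algorithm 1:

```python
import numpy as np

def armijo_step(Psi, grad, x, d, sigma=1e-4, max_halvings=30):
    # backtracking: accept the first t = 2**-i with
    # Psi(x + t*d) <= Psi(x) + sigma * t * grad(x)^T d
    gd = grad(x) @ d
    psi0 = Psi(x)
    for i in range(max_halvings):
        t = 2.0 ** (-i)
        if Psi(x + t * d) <= psi0 + sigma * t * gd:
            return t
    return 2.0 ** (-max_halvings)

# toy quadratic merit function, for illustration only
Psi = lambda x: 0.5 * x @ x
grad = lambda x: x
x = np.array([3.0, -4.0])
d = -grad(x)  # descent direction
t = armijo_step(Psi, grad, x, d)
print(t, Psi(x + t * d) < Psi(x))
```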
We now develop the global convergence theory of Algorithm 1. For this purpose, we present two lemmas that will be useful in proving the convergence theorems of Algorithm 1 and that give a characterization of the accumulation points of the sequence \(\,\left\{ {\varvec{x}}_{k}\right\} \,\) generated by the algorithm.
Theorem 4
Each of the accumulation points of the sequence \(\,\left\{ {\varvec{x}}_{k}\right\} \,\) generated by Algorithm 1 is a stationary point of \(\Psi _{\lambda }.\)
Proof
Let \({\varvec{x}}_{*} \) be an accumulation point of the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) generated by Algorithm 1. If, for an infinite set of indices J, the direction is given by Step 4 of the algorithm (\(\,{\varvec{d}}_{k}=-\nabla \Psi _{\lambda }({\varvec{x}}_{k}) \) for all \(k\in J\)), then any limit point of the subsequence \( \left\{ {\varvec{x}}_{k}\right\} _J \) is a stationary point of \(\,\Psi _{\lambda }\,\) [23].
We assume, without loss of generality, that \({\varvec{d}}_{k} \) is given by (24). We will prove that there are positive constants m and M such that, for all k,
$$\begin{aligned} m \le \left\| {\varvec{d}}_{k}\right\| \le M. \end{aligned}$$
(27)
Let k be any index of the sequence generated by Algorithm 1. Since \({\varvec{d}}_{k}\) is given by (24), we have \( H_{k}{\varvec{d}}_{k}=-\Phi _{\lambda }({\varvec{x}}_{k}), \) and then
$$\begin{aligned} \left\| \Phi _{\lambda }({\varvec{x}}_{k})\right\| \le \left\| H_{k}\right\| \left\| {\varvec{d}}_{k}\right\| , \end{aligned}$$
(28)
for consistent vector and matrix norms \(\left\| \, \cdot \,\right\| . \) Clearly \(\,\left\| H_k\right\| \ne 0;\,\) otherwise, by Theorem 3 we would have \(\nabla \Psi _{\lambda }({\varvec{x}}_{k})=0; \) that is, \({\varvec{x}}_{k} \) would be a stationary point of \(\Psi _\lambda \) and Algorithm 1 would stop. Thus, from (28), we have the inequality
$$\begin{aligned} \frac{\left\| \Phi _{\lambda }({\varvec{x}}_{k})\right\| }{ \left\| H_{k}\right\| } \le \left\| {\varvec{d}}_{k}\right\| . \end{aligned}$$
(29)
We suppose that \({\varvec{x}}_{*} \) is not a stationary point of \(\Psi _{\lambda }\) (that is, \(\nabla \Psi _{\lambda }({\varvec{x}}_{*})\ne 0\)). If, for an infinite set of indices \(J, \,\left\{ \left\| {\varvec{d}}_{k}\right\| \right\} _J\rightarrow 0 \) as \(\,k \rightarrow \infty, \,\) then, since the generalized Jacobian is compact, \( H_k \) is bounded and from (28) we get \(\,\left\{ \left\| \Phi _{\lambda }({\varvec{x}}_{k})\right\| \right\} _J\rightarrow 0. \)
Now, by the continuity of \( \, \Phi _{\lambda } \, \) and by taking the limit \( k \rightarrow \infty \) in (29), we obtain \(\,\Phi _{\lambda }({\varvec{x}}_{*})=0 \) so \( \nabla \Psi _{\lambda }({\varvec{x}}_{*})= 0 \) which contradicts that \( {\textbf {\textit{x}}}_{*} \) is not a stationary point. Then \(\,\left\{ \left\| {\varvec{d}}_{k}\right\| \right\} \,\) cannot converge to 0. Therefore there exists \( m> 0 \) such that \( m \le \left\| {\varvec{d}}_{k} \right\| , \) for all k.
On the other hand, if \( \, \left\| {\varvec{d}}_{k} \right\| \, \) were not bounded above then, since \( p> 2 \) and \(\nabla \Psi _{\lambda }( {\varvec{x}}_{k}) \) is bounded, we have
$$\begin{aligned} \lim \limits _{k\rightarrow \infty }\frac{\left\| \nabla \Psi _{\lambda }({\varvec{x}}_{k})\right\| \cos \theta }{\Vert {\varvec{d}}_{k}\Vert ^{p-1}} =0, \end{aligned}$$
where \(\,\theta \,\) is the angle between \(\nabla \Psi _{\lambda }({\varvec{x}}_{k})\) and \( {\varvec{d}}_{k}.\) Equivalently
$$\begin{aligned} \lim \limits _{k\rightarrow \infty }\frac{ \nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k}}{\Vert {\varvec{d}}_{k}\Vert ^{p}} =0, \end{aligned}$$
which contradicts (25). Then there exists \( \, M> 0 \, \) such that \(\,\left\| {\varvec{d}}_{k}\right\| \le M.\)
Finally, since (25) is satisfied at each iteration and \( \Psi _{\lambda } \) is continuously differentiable, \(\,\left\{ \Psi _{\lambda }({\varvec{x}}_{k+1})-\Psi _{\lambda }({\varvec{x}}_{k})\right\} \rightarrow 0\,\) when \(\,k \rightarrow \infty ,\,\) which implies, by (25), that \(\,\left\{ \sigma 2^{-i_k}\nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k}\right\} \rightarrow 0; \,\) therefore
$$\begin{aligned} \left\{ 2^{-i_k}\nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k}\right\} \rightarrow 0. \end{aligned}$$
(30)
Now, we want to show that \(2^{-i_k}\) is bounded away from 0. Suppose the contrary. Then, passing to a subsequence if necessary, we have \(\{2^{-i_k}\} \rightarrow 0, \) so that eventually the stepsize is reduced at least once at each iteration and (26) gives
$$\begin{aligned} \frac{\Psi _{\lambda }({\varvec{x}}_{k}+2^{-\left( i_{k}-1\right) }{\varvec{d}}_{k})-\Psi _{\lambda }({\varvec{x}}_{k})}{2^{-\left( i_{k}-1\right) }} > \sigma \nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T} {\varvec{d}}_{k}. \end{aligned}$$
(31)
From (27), we can assume that \(\,\left\{ {\varvec{d}}_k\right\} \rightarrow \overline{{\varvec{d}}}\ne 0,\,\) and passing to the limit in (31), we obtain
$$\begin{aligned} \nabla \Psi _{\lambda }({\varvec{x}}_{*})^{T}\overline{{\varvec{d}}} \,\ge \, \sigma \nabla \Psi _{\lambda }({\varvec{x}}_{*})^{T}\overline{{\varvec{d}}} \end{aligned}$$
(32)
In addition, by (25), we have
$$\begin{aligned} \nabla \Psi _{\lambda }({\varvec{x}}_{*})^{T}\overline{{\varvec{d}}} \,\le \, -\rho \Vert \overline{{\varvec{d}}}\Vert ^{p}<0, \end{aligned}$$
(33)
which contradicts (32) since \(\sigma \in (0,1/2). \) Thus, the sequence \(\{2^{-i_k}\} \) is bounded away from 0. On the other hand, (25) and (30) then imply that \(\left\{ {\varvec{d}}_k\right\} \rightarrow 0, \) which contradicts (27). Therefore, \(\nabla \Psi _{\lambda }({\varvec{x}}_{*})= 0; \) that is, \( {\varvec{x}}_{*} \) is a stationary point of \(\,\Psi _{\lambda }.\) \(\square \)
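For concreteness, the stepsize rule used in the proof above (halving \(t=2^{-i}\) until the sufficient-decrease condition (26) holds) can be sketched as follows. This is a minimal Python sketch under our reading of the algorithm; the names `psi` and `grad_dot_d` are ours, and the value of \(\sigma\) is the one used in the experiments of Section 5.

```python
def armijo_halving(psi, x, d, grad_dot_d, sigma=1e-4, i_max=30):
    """Return t = 2^{-i} for the smallest i >= 0 satisfying the
    sufficient-decrease condition
        psi(x + t*d) - psi(x) <= sigma * t * grad_dot_d,
    where grad_dot_d = grad(psi)(x)^T d < 0 for a descent direction d."""
    psi_x = psi(x)
    for i in range(i_max + 1):
        t = 2.0 ** (-i)
        if psi(x + t * d) - psi_x <= sigma * t * grad_dot_d:
            return t
    return 2.0 ** (-i_max)  # safeguard; in theory the loop terminates earlier

# Toy check: psi(x) = x^2 at x = 1 with Newton direction d = -1
# accepts the full step t = 1.
t = armijo_halving(lambda x: x * x, 1.0, -1.0, grad_dot_d=-2.0)
```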
Theorem 5
Let \({\varvec{x}}_{*} \) be an isolated accumulation point of the sequence \( \left\{ {\varvec{x}}_{k}\right\} \) generated by Algorithm 1. Then the sequence converges to \({\varvec{x}}_{*}.\)
Proof
Let \(\left\{ {\varvec{x}}_{k}\right\} \) be the sequence generated by Algorithm 1 and \({\varvec{x}}_{*}\) an isolated accumulation point of the sequence. By Theorem 4, \({\varvec{x}}_{*}\) is a stationary point of the convex function \(\Psi _{\lambda };\) then \({\varvec{x}}_{*}\) is an isolated global minimizer of \(\Psi _{\lambda }.\)
Let \(\Omega \) be the set of accumulation points of \(\left\{ {\varvec{x}}_{k}\right\} . \) Then \( \Omega \ne \emptyset \) since \({\varvec{x}}_{*} \in \Omega . \) We define
$$\delta ={\left\{ \begin{array}{ll} dist({\varvec{x}}_{*};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} ) \,\, \,\, if &{} \Omega \backslash \left\{ {\varvec{x}}_{*}\right\} \ne \emptyset \\ 1 \,\,\,\,\hspace{2.5 cm} if &{} \Omega =\left\{ {\varvec{x}}_{*}\right\} . \end{array}\right. }$$
\(\delta > 0 \, \) since \({\varvec{x}}_{*} \) is an isolated accumulation point. Now, if
$$\Omega _1 = \left\{ {\varvec{y}}\in \mathbb {R}^n: dist({\varvec{y}};\Omega )\le \delta /4 \right\} ,\,$$
then there exists \(\overline{k} \) such that \({\varvec{x}}_{k}\in \Omega _1 \) for all \(k \ge \overline{k}. \) If
$$K = \left\{ k\in \mathbb {N}: dist({\varvec{x}}_{k};{\varvec{x}}_{*})\le \delta /4 \right\} \,$$
(K is obviously not empty because \(\,{\varvec{x}}_{*}\,\) is a limit point of the sequence) then \(\,\left\{ {\varvec{x}}_{k}\right\} _K \subset \overline{B}\left( {\varvec{x}}_{*};\delta /4\right) .\,\)
Since all points of the subsequence \(\left\{ {\varvec{x}}_{k}\right\} _K \) belong to the compact set \( \overline{B}\left( {\varvec{x}}_{*};\delta /4\right) \) and all its limit points are also limit points of \(\left\{ {\varvec{x}}_{k}\right\} , \) we conclude that \(\left\{ {\varvec{x}}_{k}\right\} _K \) converges to \( {\varvec{x}}_{*}.\) Furthermore, because \(\nabla \Psi _{\lambda }({\varvec{x}}_{*})=0,\) Theorem 4 guarantees that \(\left\{ \left\| \nabla \Psi _{\lambda }({\varvec{x}}_{k})\right\| \right\} _K \) converges to zero, which by (25) implies that the sequence \(\left\{ {\varvec{d}}_{k}\right\} \) converges to the zero vector. Thus, for all \(\epsilon > 0,\) there exists \( k_1\in \mathbb {N} \) such that if \( k\in K \) and \( k \ge k_1 \ge \overline{k}, \) then \( \left\| {\varvec{d}}_{k}\right\| \le \epsilon . \) In particular, for \( \epsilon =\delta /4, \) we have \(\left\| {\varvec{d}}_{k}\right\| \le \delta /4. \)
Let \(\, k_2\in K \,\) be such that \(\,k_2\ge k_1.\,\) By Algorithm 1, we have that
$$\,{\varvec{x}}_{ k_{2}+1}={\varvec{x}}_{ k_2} +t_ {k_2}{\varvec{d}}_{ k_2}\,$$
with \(\,t_ {k_2}\in \left( 0,1\right] , \,\) whereby \(\,\left\| {\varvec{x}}_{ k_{2}+1}-{\varvec{x}}_{ k_2}\right\| \le \left\| {\varvec{d}}_{ k_2}\right\| \le \delta /4. \) Thus,
$$\begin{aligned} dist({\varvec{x}}_{*};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} )\le & {} dist({\varvec{x}}_{k_{2}+1};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} )+\left\| {\varvec{x}}_{ *}-{\varvec{x}}_{ k_2+1} \right\| \\[0.1cm]\le & {} dist({\varvec{x}}_{k_{2}+1};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} )+\left\| {\varvec{x}}_{ *}-{\varvec{x}}_{ k_2} \right\| +\left\| {\varvec{x}}_{ k_2}-{\varvec{x}}_{ k_{2}+1} \right\| \\[0.1cm]\le & {} dist({\varvec{x}}_{k_{2}+1};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} )+\delta /4 + \delta /4. \end{aligned}$$
That is,
$$dist({\varvec{x}}_{k_{2}+1};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} ) \ge dist({\varvec{x}}_{*};\Omega \backslash \left\{ {\varvec{x}}_{*}\right\} )-\delta /2 \ge \delta -\delta /2=\delta /2, $$
consequently, \({\varvec{x}}_{k_{2}+1} \notin \Omega _1 \backslash \overline{B}\left( {\varvec{x}}_{*};\delta /4\right) , \) and since \({\varvec{x}}_{k_{2}+1} \in \Omega _1, \) then \( {\varvec{x}}_{k_{2}+1} \in \overline{B}\left( {\varvec{x}}_{*};\delta /4\right) ; \) that is, \(k_{2}+1 \in K \) and \( k_{2}+1 > k_1. \) By induction, \( k \in K \) for all \( k > k_1, \) which implies that \( {\varvec{x}}_{k} \in \overline{B}\left( {\varvec{x}}_{*};\delta /4\right) \) for all \(\,k > k_1. \) Thus, \( \left\{ {\varvec{x}}_{k}\right\} \) converges to \( {\varvec{x}}_{*}.\) \(\square \)
Theorem 6
Let \({\varvec{x}}_{*} \) be an accumulation point of the sequence \( \left\{ {\varvec{x}}_{k}\right\} \) generated by Algorithm 1 such that \( {\varvec{x}}_{*} \) is an R-regular solution of GCP(FG). Then the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) converges to \({\varvec{x}}_{*}, \) the search direction \({\varvec{d}}_k\) is eventually given by the solution of the linear system (24), and the full stepsize \(t_k = 1 \) is accepted for all k sufficiently large.
Proof
By hypothesis, \(\,{\varvec{x}}_{*}\,\) is an R-regular solution of the GCP(FG), and by Corollary 2, \({\varvec{x}}_{*} \) is also a BD-regular solution; then Proposition 3 in [8] guarantees that \({\varvec{x}}_{*}\) is an isolated accumulation point. Therefore, the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) converges to \({\varvec{x}}_{*} \) by Theorem 5.
Since \(\left\{ {\varvec{x}}_{k}\right\} \) converges to a BD-regular solution of the system \( \Phi _{\lambda }({\varvec{x}})=0, \) for k large enough \( H_k \) is nonsingular; therefore, the system (24) always has a solution [10]. Let us show that this solution satisfies the inequality
$$\begin{aligned} \nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k} \le -\rho _1 \Vert {\varvec{d}}_{k}\Vert ^{2}, \end{aligned}$$
(34)
for some \( \rho _1 >0. \)
From (24),
$$\begin{aligned} \left\| {\varvec{d}}_{k}\right\| \le \left\| H_{k}^{-1}\right\| \left\| \Phi _{\lambda }({\varvec{x}}_{k})\right\| . \end{aligned}$$
(35)
Furthermore, since \(\nabla \Psi _{\lambda }({\varvec{x}}_{k})=H_{k}^{T}\Phi _{\lambda }({\varvec{x}}_{k}) \) and \( H_{k}{\varvec{d}}_{k}=-\Phi _{\lambda }({\varvec{x}}_{k}), \) we get
$$\begin{aligned} \nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k} = - \Vert \Phi _{\lambda }({\varvec{x}}_{k})\Vert ^{2}. \end{aligned}$$
(36)
Combining (35) and (36), we have
$$\begin{aligned} \nabla \Psi _{\lambda }({\varvec{x}}_{k})^{T}{\varvec{d}}_{k} \le -\frac{\Vert {\varvec{d}}_{k}\Vert ^{2}}{N^{2}}, \end{aligned}$$
(37)
where N is an upper bound of \(\left\| H_{k}^{-1}\right\| \) (which exists by Lemma 2.6 of [10]). Taking \(\rho _1 = 1/N^2\) in (37), we get (34). Since the sequence \(\left\{ \left\| {\varvec{d}}_{k}\right\| \right\} \) converges to 0, it follows from (34) that (25) holds for any \( p>2 \) and any positive constant \( \rho .\)
Finally, let us see that the full stepsize is achieved; that is, \(i_k=0. \) First, \( \Psi _{\lambda } \) is of class \( LC^1 \) by Theorem 2.3 in [12]. Under this hypothesis, Theorem 3.2 in [3] guarantees that \( t_{k}=1 \) from a certain k onward. \(\square \)
Finally, we present a result that establishes the rate of convergence of the algorithm.
Theorem 7
If \({\varvec{x}}_{*}\) is an accumulation point of the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) generated by Algorithm 1, such that \( {\varvec{x}}_{*} \) is an R-regular solution of the GCP(F,G) then the sequence \( \left\{ {\varvec{x}}_{k}\right\} \) generated by Algorithm 1 converges q-superlinearly to \( {\varvec{x}}_{*}. \) In addition, if \( \Phi _{\lambda } \) is an \( LC^{1} \) mapping then the convergence rate is q-quadratic.
Proof
Since \({\varvec{x}}_{*} \) is R-regular, by Lemma 2.6 of [10], \( H_k \) is nonsingular and \( \left\| H_{k}^{-1}\right\| \le N \) for some positive constant N; so \(\,t_k=1\,\) is accepted from a certain value of k onward, the sequence \( \left\{ {\varvec{x}}_{k}\right\} \) is well defined, and
$$\,{\varvec{x}}_{ k+1}={\varvec{x}}_{ k} -H_{k}^{-1}\Phi _{\lambda }({\varvec{x}}_{ k}).\,$$
Subtracting \( {\varvec{x}}_{*} \) from both sides and taking norms \( \left\| \, \cdot \,\right\| , \) we have
$$ \left\| {\varvec{x}}_{ k+1}-{\varvec{x}}_{*} \right\| = \left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*}-H_{k}^{-1}\Phi _{\lambda }({\varvec{x}}_{ k}) \right\| .$$
After some algebraic manipulations, we obtain
$$\begin{aligned} \left\| {\varvec{x}}_{ k+1}-{\varvec{x}}_{*} \right\|= & {} \left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*}-H_{k}^{-1}\Phi _{\lambda }({\varvec{x}}_{ k})\right\| \nonumber \\[0.2cm]\le & {} \left\| H_{k}^{-1}\left[ \Phi _{\lambda }({\varvec{x}}_{ k})-\Phi _{\lambda }({\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *})\right] \right\| \\[0.2cm]\nonumber{} & {} \hspace{0.4cm}+ \left\| H_{k}^{-1}\,\left[ H_{k}({\varvec{x}}_{ k} -{\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *})\,\right] \right\| . \end{aligned}$$
(38)
To bound the two terms on the right-hand side of (38), we take into account that \(\Phi _\lambda \) is semismooth (Theorem 2) and, therefore, directionally differentiable. Then, from (9),
$$\,\left\| \Phi _{\lambda }({\varvec{x}}_{ k})-\Phi _{\lambda }({\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *}) \right\| =o\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| ,$$
and by Lemma 1,
$$ \left\| H_{k}({\varvec{x}}_{ k} -{\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *}) \right\| = o\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| ,$$
Using the last two equalities and the bound \(\left\| H_{k}^{-1}\right\| \le N \) in (38), we conclude
$$\begin{aligned} \left\| {\varvec{x}}_{ k+1}-{\varvec{x}}_{*} \right\| = o\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| , \end{aligned}$$
(39)
which shows that the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) converges q-superlinearly to \({\varvec{x}}_{*}.\)
Now, we assume that \(\Phi _{\lambda } \) is an \(LC^{1} \) mapping. Then, by Theorem 2, it is also strongly semismooth at \( {\varvec{x}}_{*}. \) Moreover, the directional derivative \( \Phi '_{\lambda }(\cdot , \cdot )\) is semi-continuous of degree 2 [2]. Then, from (11) and by Lemma 2, we have
$$\begin{aligned} \left\| H_{k}({\varvec{x}}_{ k} -{\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *})\right\| = O\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| ^2 \end{aligned}$$
and \(\,\left\| \Phi _{\lambda }({\varvec{x}}_{ k})-\Phi _{\lambda }({\varvec{x}}_{ *})-\Phi _{\lambda }'({\varvec{x}}_{ *};{\varvec{x}}_{ k}-{\varvec{x}}_{ *}) \right\| =O\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| ^2. \) Hence, from (38),
$$\,\left\| {\varvec{x}}_{ k+1}-{\varvec{x}}_{*} \right\| = O\left\| {\varvec{x}}_{ k}-{\varvec{x}}_{*} \right\| ^2,\,$$
which shows that the sequence \(\left\{ {\varvec{x}}_{k}\right\} \) converges q-quadratically to \({\varvec{x}}_{*}.\) \(\square \)
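To make the local iteration \({\varvec{x}}_{k+1}={\varvec{x}}_{k}-H_{k}^{-1}\Phi _{\lambda }({\varvec{x}}_{k})\) concrete, the following Python sketch applies it to a small nonlinear complementarity problem (\(G({\varvec{x}})={\varvec{x}}\)), using \(\varphi_\lambda(a,b)=\sqrt{(a-b)^2+\lambda ab}-a-b\), \(\lambda \in (0,4)\), as a representative one-parametric family of complementarity functions. The problem data and the assumption that no kink of \(\Phi_\lambda\) is hit along the iteration (so the ordinary Jacobian serves as \(H_k\)) are ours; the authors' experiments used Matlab.

```python
import numpy as np

def newton_gcp(F, JF, G, JG, x, lam=2.0, tol=1e-10, kmax=50):
    """Local nonsmooth Newton iteration x_{k+1} = x_k - H_k^{-1} Phi_lambda(x_k)
    for GCP(F, G), with Phi_lambda built componentwise from
    phi_lambda(a, b) = sqrt((a - b)^2 + lam*a*b) - a - b, lam in (0, 4).
    Wherever (F_i, G_i) != (0, 0), phi_lambda is differentiable and the
    ordinary Jacobian of Phi_lambda serves as H_k."""
    for _ in range(kmax):
        a, b = F(x), G(x)
        r = np.sqrt((a - b) ** 2 + lam * a * b)
        Phi = r - a - b
        if np.linalg.norm(Phi) < tol:
            break
        da = (2 * (a - b) + lam * b) / (2 * r) - 1.0  # d phi_lambda / d a
        db = (lam * a - 2 * (a - b)) / (2 * r) - 1.0  # d phi_lambda / d b
        H = da[:, None] * JF(x) + db[:, None] * JG(x)
        x = x + np.linalg.solve(H, -Phi)
    return x

# Toy NCP (G(x) = x): F(x) = (x1 - 1, x2 + 1); the solution is x* = (1, 0),
# where F1 = 0 with x1 > 0, and F2 = 1 > 0 with x2 = 0.
F = lambda x: np.array([x[0] - 1.0, x[1] + 1.0])
G = lambda x: x
I = lambda x: np.eye(2)

x_star = newton_gcp(F, I, G, I, np.array([2.0, 2.0]))
```

With \(\lambda = 2\) the family reduces to a Fischer-Burmeister-type function, and the iteration from \((2,2)\) converges rapidly to \((1,0)\), consistent with the q-superlinear rate above.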

5 Numerical experiments

In this section, we analyze the numerical performance of Algorithm 1 proposed in Section 4, which we call Algorithm NG, and compare it with the algorithm proposed in [15], which we call Algorithm NGS. Moreover, we incorporate into Algorithm NG the dynamic procedure proposed in [12] to update the parameter \(\lambda \) at each iteration, obtaining a new algorithm that we call Algorithm NGD, which we also compare with the two previous ones.
The algorithms were implemented in Matlab \(^{\circledR }\)  R2019a and tested on a computer with an AMD Sempron (tm) processor of 2.21 GHz. The parameters are the following: \( \epsilon =10^{-4},\,N=100,\,\rho =10^{-8},\,p=2.1\) and \(\,\sigma =10^{-4}.\,\)
We did four types of experiments. First, we compare the performance of Algorithms NGD and NGS in terms of number of iterations and CPU time. Second, we analyze the global performance of Algorithms NGD and NGS. Third, we analyze the behavior of Algorithm NG in terms of number of iterations for different values of \( \lambda . \) Finally, in the fourth experiment, we compare Algorithms NG and NGD.
The experiments were carried out with seven problems. Below, we describe each of them, its solutions, and the starting points used. For simplicity, we denote by \( {\varvec{e}}_n \) the vector of ones of order n.
Problem 1  [24]. F is the Kojima-Shindo function, and G is the identity function. Thus, the GCP(FG) reduces to a nonlinear complementarity problem. Explicitly \(F,G:\mathbb {R}^{4}\rightarrow \mathbb {R}^{4}\) are defined by
$$\begin{aligned}F({\varvec{x}})=\left( \begin{array}{c} 3x_{1}^{2}+2x_{1}x_{2}+2x_{2}^{2}+x_{3}+3x_{4}-6 \\ 2x_{1}^{2}+x_{2}^{2}+x_{1}+10x_{3}+2x_{4}-2 \\ 3x_{1}^{2}+x_{1}x_{2}+2x_{2}^{2}+2x_{3}+9x_{4}-9 \\ x_{1}^{2}+3x_{2}^{2}+2x_{3}+3x_{4}-3\end{array} \right) \hspace{0.3cm} \text {and}\hspace{0.3cm} G({\varvec{x}})={\varvec{x}}\,. \end{aligned}$$
The solutions are \( {\varvec{x}}_*= \left( 1,\, 0,\, 3,\, 0\right) \) and \( {\varvec{x}}_*= \left( \sqrt{6}/2,\, 0,\, 0,\, 1/2\right) \, \) and the starting points are \( {\varvec{x}}_{1}= \left( 0,\, 0,\, 0,\,0\right) ,\,\) \(\,{\varvec{x}}_{2}= \left( 1,\, 0,\, 1,\,0\right) ,\,\) \(\,{\varvec{x}}_{3}= \left( 1,\, 0,\, 0,\,0\right) ,\,\) \(\,{\varvec{x}}_{4}= \left( 0,\, 1,\, 1,\,0\right) .\,\)
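As a sanity check, the complementarity conditions (1) can be verified numerically at both reported solutions; the following small Python snippet is our own and is not part of the original Matlab experiments:

```python
import numpy as np

# Kojima-Shindo function F (Problem 1); G is the identity.
def F(x):
    x1, x2, x3, x4 = x
    return np.array([
        3*x1**2 + 2*x1*x2 + 2*x2**2 + x3 + 3*x4 - 6,
        2*x1**2 + x2**2 + x1 + 10*x3 + 2*x4 - 2,
        3*x1**2 + x1*x2 + 2*x2**2 + 2*x3 + 9*x4 - 9,
        x1**2 + 3*x2**2 + 2*x3 + 3*x4 - 3,
    ])

# Check (1) at both solutions: F(x) >= 0, x >= 0, and F_i(x) * x_i = 0.
for xs in (np.array([1.0, 0.0, 3.0, 0.0]),
           np.array([np.sqrt(6)/2, 0.0, 0.0, 0.5])):
    Fx = F(xs)
    assert np.all(Fx >= -1e-12) and np.all(xs >= 0)
    assert np.max(np.abs(Fx * xs)) < 1e-12
```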
Problem 2  [15]. The functions \( F,G:\mathbb {R}^{2}\rightarrow \mathbb {R}^{2} \) are defined by
$$\begin{aligned} \,F({\varvec{x}})=\left( \begin{array}{c} x_{1}^{2}\\ x_{2}^{2} \\ \end{array} \right) \hspace{0.3cm} \text {and}\hspace{0.3cm}G({\varvec{x}})=\left( \begin{array}{c} x_{1}^{2}+10\\ x_{2}^{2} +1 \\ \end{array} \right) .\end{aligned}$$
The unique solution is \( {\varvec{x}}_*=(0,\, 0) \) and the starting points are \(\,{\varvec{x}}_{1}= \left( 10,\, 1\right) ,\,\) \(\,{\varvec{x}}_{2}=100 {\varvec{e}}_2,\,\) \(\,{\varvec{x}}_{3}=1000 {\varvec{e}}_2\,\) and \(\,{\varvec{x}}_{4}=10000 {\varvec{e}}_2.\,\)
Problem 3  [5]. The functions \( F,G:\mathbb {R}^{2}\rightarrow \mathbb {R}^{2} \) are defined by
$$\begin{aligned}F({\varvec{x}})=\left( \begin{array}{c} -\frac{100}{3}+2x_{1}+\frac{8}{3}x_{2}\\ -22.5+2x_{2}+\frac{5}{4}x_{1} \\ \end{array} \right) \hspace{0.3cm} \text {and}\hspace{0.3cm} G({\varvec{x}})=\left( \begin{array}{c} 15-x_{2}\\ 20-x_{1} \\ \end{array} \right) .\end{aligned}$$
The solutions are \( {\varvec{x}}_*= \left( 10,\, 5\right) \) and \( {\varvec{x}}_*=\left( 20,\, 15\right) .\,\) The starting points are \(\,{\varvec{x}}_{1}= \left( 0,\, 0\right) ,\,\) \(\,{\varvec{x}}_{2}= \left( 5,\, 0\right) \,\) and \(\,{\varvec{x}}_{3}= \left( 11,\, 0\right) .\)
Problem 4  [5]. This is an implicit complementarity problem where the functions \(F,G:\mathbb {R}^{4}\rightarrow \mathbb {R}^{4}\) are defined by \( F({\varvec{x}})=A{\varvec{x}}+{\varvec{b}}\) and \(\, G({\varvec{x}})={\varvec{x}}-h({\varvec{x}}), \,\) where \(A=tridiag(-1,2,-1),\) \({\varvec{b}}={\varvec{e}}_4\) and \(\,h_i({\varvec{x}})= -0{.}5-x_i. \,\)
The solution is \( {\varvec{x}}_*=\left( -0{.}9,\, -1{.}2, -1{.}2,\, -0{.}9\right) \) and the starting points are \(\,{\varvec{x}}_{1}= \left( 0,\, 0,\,0,\, 0\right) ,\,\) \(\,{\varvec{x}}_{2}=-0{.}5 {\varvec{e}}_4\,\) and \(\,{\varvec{x}}_{3}=-{\varvec{e}}_4.\)
Problem 5  [8]. The functions \(F,G:\mathbb {R}_+^{5}\rightarrow \mathbb {R}^{5}\) are defined by
$$\begin{aligned} G({\varvec{x}})=F({\varvec{x}})={\varvec{c}}+L^{\frac{1}{{\varvec{b}}}}{\varvec{x}}^{\frac{1}{{\varvec{b}}}}-\left( \frac{5000}{ \sum \limits _{i=1}^{5}x_{i} }\right) ^{\frac{1}{\gamma }}\left( {\varvec{e}}_5-\frac{1}{\gamma \left( \sum \limits _{i=1}^{5}x_{i}\right) }{\varvec{x}}\right) , \end{aligned}$$
where \({\varvec{c}}=(10,\,8,\,6,\,4,\,2)^{T},\) \( {\varvec{b}}=(1{.}2,\,1{.}1,\,1,\,0{.}9,\,0{.}8)^{T}, \) \( L=5{\varvec{e}}_5, \) and \( \gamma =1{.}1. \) In this problem all operations are component-wise.
The solution is \( {\varvec{x}}_*= \left( 15{.}4293,\, 12{.}4986,\, 9{.}6635,\, 7{.}1651,\, 5{.}1326\right) ^{T}\) and the starting points are \({\varvec{x}}_{1}={\varvec{e}}_5,\) \({\varvec{x}}_{2}=10{\varvec{e}}_5\) and \(\,{\varvec{x}}_{3}=20{\varvec{e}}_5.\,\)
Problems 6 and 7 [25]. The functions \(F,G:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) are defined by
$$\begin{aligned} F({\varvec{x}})=A{\varvec{x}}+{\varvec{q}}+\varGamma ({\varvec{x}}) \hspace{0.5cm} \text{ and } \hspace{0.5cm} G({\varvec{x}})={\varvec{x}}-\varUpsilon ({\varvec{x}}), \end{aligned}$$
where \(A\in \mathbb {R}^{n\times n}, \, {\varvec{q}}=\left( -1,1,\ldots ,(-1)^n\right) \in \mathbb {R}^{n},\) \(\varUpsilon ({\varvec{x}})=\left( x_{1}^3,x_{2}^3,\ldots ,x_{n}^3\right) \) and \(\varGamma ({\varvec{x}})=\left( x_{1}^2,x_{2}^2,\ldots ,x_{n}^2\right) .\)
The solution for both problems is \( {\varvec{x}}_*={\varvec{e}}_n \) and the starting points are \({\varvec{x}}_{1}=\left( 1, 0.6, 1, 0.6,\ldots \right) ,\) \({\varvec{x}}_{2}= 5 {\varvec{e}}_n \) and \({\varvec{x}}_{3}=15 {\varvec{e}}_n. \)
For Problem 6, we define as in [25]:
$$\begin{aligned}A=tridiag(-I,S,-I)= \left( \begin{array}{rrrrrr} S&{}-I&{}0&{}\ldots &{}0&{}0\\ -I&{}S&{}-I&{}\ldots &{}0&{}0\\ 0&{}-I&{}S&{}\ldots &{}0&{}0\\ \vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots \\ 0&{}0 &{}0 &{}\ldots &{}S&{}-I\\ 0&{}0 &{}0 &{}\ldots &{}-I&{}S\\ \end{array} \right) ,\end{aligned}$$
with
$$\begin{aligned} S=tridiag(-1,4,-1)= \left( \begin{array}{rrrrrr} 4&{}-1&{}0&{}\ldots &{}0&{}0\\ -1&{}4&{}-1&{}\ldots &{}0&{}0\\ 0&{}-1&{}4&{}\ldots &{}0&{}0\\ \vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots &{}\vdots \\ 0&{}0 &{}0 &{}\ldots &{}4&{}-1\\ 0&{}0 &{}0 &{}\ldots &{}-1&{}4\\ \end{array} \right) \in \mathbb {R}^{m\times m}.\end{aligned}$$
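The block-tridiagonal matrix A of Problem 6 can be assembled compactly with Kronecker products; the following sketch is our own construction in NumPy (the authors used Matlab):

```python
import numpy as np

def problem6_matrix(m):
    """A = tridiag(-I, S, -I) with S = tridiag(-1, 4, -1) in R^{m x m},
    assembled via Kronecker products: A = I (x) S - T (x) I,
    where T = tridiag(1, 0, 1) marks the off-diagonal block positions."""
    S = 4*np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    T = np.eye(m, k=1) + np.eye(m, k=-1)
    return np.kron(np.eye(m), S) - np.kron(T, np.eye(m))

A = problem6_matrix(8)  # n = m^2 = 64, as in the experiments
```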
For Problem 7, the matrix A is defined by \(A=tridiag(-1.5I,S,-0.5I), \) with \(S=tridiag(-1.5,4,-0.5).\)
The results of the experiments are presented in Fig. 1 and Tables 1, 2, and 3 with the following information: number of the test problem (Prob), starting point (SP), number of iterations (k), CPU time (CPU), value of the objective function at the final iteration (\(\Psi \)), success rate (S), divergence (\(\,-\,\)).

5.1 Experiment 1

In this experiment, we compared the Algorithms NGD and NGS in terms of number of iterations, the CPU time and the accuracy at the solution. In Table 1, we report the results obtained when executing the algorithms to solve each of the problems described above.
Table 1
Results for the algorithms NGD and NGS

Prob         SP                     NGD                                            NGS
                                    k     CPU    \(\Psi \)                         k      CPU    \(\Psi \)
1            \({\varvec{x}}_{1}\)   12    1.1    \(1{.}2\times 10^{-10}\)          12     1.5    \(3{.}8\times 10^{-8}\)
             \({\varvec{x}}_{2}\)   5     0.4    \(1{.}6\times 10^{-11}\)          5      0.7    \(5{.}8\times 10^{-10}\)
             \({\varvec{x}}_{3}\)   6     0.5    \(1{.}4\times 10^{-8}\)           6      0.6    \(1{.}8\times 10^{-7}\)
             \({\varvec{x}}_{4}\)   12    1.1    \(1{.}9\times 10^{-12}\)          \(-\)   \(-\)   \(-\)
2            \({\varvec{x}}_{1}\)   14    0.5    \(1{.}4\times 10^{-6}\)           13     0.5    \(2{.}8\times 10^{-6}\)
             \({\varvec{x}}_{2}\)   17    0.6    \(3{.}1\times 10^{-6}\)           17     0.6    \(1{.}5\times 10^{-6}\)
             \({\varvec{x}}_{3}\)   24    1.3    \(1{.}3\times 10^{-6}\)           23     0.8    \(2{.}6\times 10^{-6}\)
             \({\varvec{x}}_{4}\)   24    0.9    \(1{.}8\times 10^{-6}\)           23     0.9    \(3{.}6\times 10^{-6}\)
3            \({\varvec{x}}_{1}\)   5     0.2    \(1{.}1\times 10^{-6}\)           5      0.3    \(1{.}6\times 10^{-6}\)
             \({\varvec{x}}_{2}\)   5     0.3    \(1{.}5\times 10^{-6}\)           5      0.2    \(3{.}9\times 10^{-6}\)
             \({\varvec{x}}_{3}\)   5     0.3    \(2{.}1\times 10^{-10}\)          5      0.2    \(2{.}3\times 10^{-8}\)
4            \({\varvec{x}}_{1}\)   6     0.3    \(2{.}4\times 10^{-12}\)          6      0.3    \(7{.}5\times 10^{-7}\)
             \({\varvec{x}}_{2}\)   5     0.2    \(2{.}7\times 10^{-10}\)          5      0.2    \(1{.}1\times 10^{-6}\)
             \({\varvec{x}}_{3}\)   5     0.2    \(3{.}8\times 10^{-10}\)          5      0.2    \(1{.}7\times 10^{-6}\)
5            \({\varvec{x}}_{1}\)   8     0.8    0.0                               \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{2}\)   4     0.3    0.0                               \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{3}\)   5     0.4    \(1{.}0\times 10^{-15}\)          \(-\)   \(-\)   \(-\)
6\(\,(64)\)  \({\varvec{x}}_{1}\)   6     13.3   \(3{.}6\times 10^{-13}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{2}\)   10    11.9   \(1{.}3\times 10^{-8}\)           \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{3}\)   12    14.3   \(9{.}1\times 10^{-10}\)          \(-\)   \(-\)   \(-\)
6\(\,(100)\) \({\varvec{x}}_{1}\)   6     37.26  \(4{.}7\times 10^{-11}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{2}\)   14    41.7   \(3{.}2\times 10^{-12}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{3}\)   13    73.2   \(4{.}7\times 10^{-9}\)           \(-\)   \(-\)   \(-\)
7\(\,(64)\)  \({\varvec{x}}_{1}\)   6     17.1   \(5{.}5\times 10^{-13}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{2}\)   10    20.3   \(3{.}51\times 10^{-9}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{3}\)   13    22.4   \(1{.}1\times 10^{-10}\)          \(-\)   \(-\)   \(-\)
7\(\,(100)\) \({\varvec{x}}_{1}\)   6     71.04  \(5{.}4\times 10^{-11}\)          \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{2}\)   11    92.4   \(4{.}7\times 10^{-8}\)           \(-\)   \(-\)   \(-\)
             \({\varvec{x}}_{3}\)   13    101.2  \(4{.}3\times 10^{-10}\)          \(-\)   \(-\)   \(-\)

In this table, the notation \({\textbf {6}}\,(n)\) and \({\textbf {7}}\,(n)\), in the first column, means that the algorithms were executed to solve Problems 6 and 7 for \(n=m^2, \) with \(m=8\) and \(m=10.\) A dash (\(-\)) indicates divergence.
From Table 1, we see that for Problems 1 to 4 the algorithms behave similarly in terms of number of iterations and CPU time, with a slight CPU-time advantage for Algorithm NGD. However, when we compare the value of the merit function \( \Psi ({\varvec{x}}) \) at the final iteration, Algorithm NGD reached, in \(90\%\) of the cases, a value lower than that reached by Algorithm NGS. In some cases, the difference between these values was of the order of \(10^{-5};\) this suggests that Algorithm NGD probably gives a better approximation to the solution of GCP(FG) in about the same number of iterations as Algorithm NGS. For the remaining problems (5, 6, and 7), Algorithm NGS did not converge, while Algorithm NGD did; furthermore, it converged in few iterations, with an acceptable CPU time.
On the other hand, this experiment shows that the \(\lambda \)-dynamic strategy implemented in Algorithm NGD takes advantage when the iterates are close to the solution \({\varvec{x}}_*; \) this is evidenced by the fast decrease in the value of \( \Psi ({\varvec{x}}) \) in the last iterations.

5.2 Experiment 2

In this experiment, we study the global performance of Algorithms NGD and NGS. For this, we tested the algorithms taking one hundred random starting points for each of the test problems. For Problems 1 to 4, we take each component of the starting points in the interval \([-30, 30], \) and for the remaining problems, in the interval [1, 50]. The results of this experiment are summarized in Table 2, in which \(\overline{k} \) is the mean number of iterations over the hundred runs for each problem.
Table 2
Success rate for the Algorithms NGD and NGS

              NGD                              NGS
Prob          \(\overline{k}\)   \(S\,(\%)\)   \(\overline{k}\)   \(S\,(\%)\)
1             10      96                       9       87
2             14      98                       14      98
3             6       100                      6       97
4             6       100                      6       100
5             34      100                      35      100
6\(\,(64)\)   41      98                       40      96
6\(\,(100)\)  23.1    84                       24      79
7\(\,(64)\)   42      94                       42      94
7\(\,(100)\)  25      97                       27      95
Table 3
Performance of the Algorithms NG and NGD

                                 NG                                                                NGD
Prob       \(\textit{SP} \)      \(\lambda _B\)     \(\lambda _W\)     \(k_B\)  \(k_W\)  \(\overline{k}\)   k
1          \({\varvec{x}}_1\)    \((0,\,0.4)\)      \((3.5,\,3.8)\)    11       19       15                 12
           \({\varvec{x}}_2\)    \((0,\,1.7)\)      \((3.7,\,3.8)\)    4        20       12                 5
           \({\varvec{x}}_3\)    \((0,\,0.9)\)      \((3.7,\,3.8)\)    4        22       13                 6
           \({\varvec{x}}_4\)    \((0,\,0.4)\)      \((3.7,\,3.8)\)    6        21       13.5               12
2          \({\varvec{x}}_1\)    \((3.7,\,3.8)\)    \((0,\,1.2)\)      11       14       12.5               9
           \({\varvec{x}}_2\)    \((3.5,\,3.8)\)    \((0,\,0.2)\)      15       17       16                 12
           \({\varvec{x}}_3\)    \((3.7,\,3.8)\)    \((0,\,1.0)\)      21       24       22.5               19
           \({\varvec{x}}_4\)    \((3.2,\,3.8)\)    \((0,\,1.6)\)      22       24       23                 19
3          \({\varvec{x}}_1\)    \((0,\,0.7)\)      \((3.7,\,3.8)\)    4        8        6                  5
           \({\varvec{x}}_2\)    \((0,\,0.7)\)      \((3.3,\,3.8)\)    4        7        5.5                5
           \({\varvec{x}}_3\)    \((0,\,0.9)\)      \((3.6,\,3.7)\)    4        7        5.5                5
4          \({\varvec{x}}_1\)    \((0,\,0.1)\)      \((2.3,\,3.7)\)    4        7        5.5                6
           \({\varvec{x}}_2\)    \((0,\,0.1)\)      \((2.7,\,3.7)\)    4        6        5                  5
           \({\varvec{x}}_3\)    \((0,\,0.7)\)      \((3.6,\,3.7)\)    4        10       7                  5
5          \({\varvec{x}}_1\)    \((0,\,1.7)\)      \((1.7,\,3.8)\)    50       51       50.5               49
           \({\varvec{x}}_2\)    \((3.3,\,3.5)\)    \((0,\,0.9)\)      23       28       25.5               29
           \({\varvec{x}}_3\)    \((3.6,\,3.8)\)    \((0,\,0.5)\)      32       38       35                 39
6  (100)   \({\varvec{x}}_1\)    \((0,\,1.2)\)      \((3.4,\,3.8)\)    4        6        5                  6
           \({\varvec{x}}_2\)    \((0.2,\,1.6)\)    \((3.3,\,3.8)\)    4        6        5                  14
           \({\varvec{x}}_3\)    \((1.2,\,1.8)\)    \((3.2,\,3.8)\)    4        6        5                  13
7  (100)   \({\varvec{x}}_1\)    \((0.8,\,1.3)\)    \((3.5,\,3.8)\)    5        14       9.5                6
           \({\varvec{x}}_2\)    \((0.2,\,1.6)\)    \((3.3,\,0.9)\)    4        6        5                  11
           \({\varvec{x}}_3\)    \((1.2,\,1.8)\)    \((3.3,\,0.5)\)    4        6        5                  13
As we expected, Algorithm NGD seems to be a robust algorithm. We want to highlight the performance of our algorithm on the first problem, since it is considered a hard one [12]. In this problem, Algorithm NGD had a success rate of \(96\%\) while its counterpart, Algorithm NGS, had a success rate of \(87\%;\) so we can conclude that the \(\lambda \)-dynamic strategy combined with a less restrictive line search, such as an Armijo-type line search, works very well and can lead to a robust and efficient algorithm.

5.3 Experiment 3

The above experiments show that the \(\lambda \)-dynamic strategy is a good complement to Algorithm NG, but some reasonable questions arise: What is the performance of this algorithm without the \(\lambda \)-dynamic strategy? What is the behavior of the algorithm when it works with the same value of \(\lambda \) throughout the execution? And what is the best choice of \(\lambda \) to run the algorithm? In this experiment, we investigate these questions by testing the algorithm with different values of \( \lambda . \)
In Fig. 1, we show the number of iterations required by Algorithm NG to solve each of the test problems when it was executed with 38 values of \( \lambda \) equally spaced in the interval \((0,\,4).\) We observe that Algorithm NG converges in all experiments and for all values of \(\lambda ,\) which reflects, in a sense, the robustness of the algorithm. On the other hand, we can see that Algorithm NG had its best performance for values of \(\lambda \) at the beginning or at the end of the interval \((0,\,4).\)
Comparing the results of Fig. 1 with those in Table 1, we conclude that in all the experiments it is possible to find a value of \(\lambda \) for which Algorithm NG is faster than Algorithm NGS; however, there is no single value of \(\lambda \) that works well for all problems. Therefore, the \(\lambda \)-dynamic strategy seems to be the best option to complement Algorithm NG.

5.4 Experiment 4

In this experiment, we summarize the results obtained in Experiments 1 and 3 for Algorithms NGD and NG with the aim of comparing the best and worst performance of Algorithm NG with that of Algorithm NGD.
The results are shown in Table 3, where \( \lambda _B \) indicates the subintervals of values of \(\lambda \) for which Algorithm NG shows its best behavior, and \( \lambda _W \) indicates the subintervals for which Algorithm NG had its worst behavior. Analogously, \(k_B\) and \(k_W\) are the numbers of iterations needed by Algorithm NG in its best and worst cases, respectively.
In Table 3, we observe that, for all problems except Problem 2, it is possible to find a value of \(\lambda \) for which Algorithm NG solved the corresponding GCP(FG) in fewer iterations than Algorithm NGD. However, there is no single value of \(\lambda \) that makes Algorithm NG perform better than Algorithm NGD in all experiments.
On the other hand, we observe that, in general, in the worst case Algorithm NG needed many more iterations than Algorithm NGD to solve the GCP(FG), which significantly increases its average number of iterations.
From the above observations, we can conclude that when the best value of \(\lambda \) for Algorithm NG is not available, the \(\lambda \)-dynamic strategy is a good option. However, we made some preliminary tests varying the initial \(\lambda \) in the strategy proposed in [12], and the results showed that, by changing the initial value of \(\lambda ,\) Algorithm NGD can perform better; so we think this is an open problem in this area.

6 Conclusions

In this work, we present a new nonsmooth Newton-type algorithm for solving the generalized complementarity problem indirectly, reformulating it as a system of nonlinear equations using a one-parametric family of complementarity functions.
We demonstrate that this algorithm converges locally and q-quadratically. In addition, we verify that the properties of the operator \( \Phi _\lambda , \) investigated in [12] for the nonlinear complementarity case, can easily be extended to the generalized complementarity problem. For this, it was necessary to introduce some new definitions, such as the concepts of BD-regularity and R-regularity for generalized complementarity.
We also present some numerical experiments that show the good performance of the new algorithm. In particular, these experiments showed that for each problem there exists a subinterval of values of \(\lambda \) that yields better performance of Algorithm NG. For this reason, we consider it important to modify the procedure for varying the parameter \(\lambda ,\) changing its initial value to one belonging to the subintervals previously mentioned. We leave this as a future research topic.

Acknowledgements

We are grateful to the University of Cauca for the time allowed for this research.

Declarations

Ethical approval

The authors declare that they have followed all the rules of good scientific practice.
All authors approve their participation in this work and the publication of this research.

Conflict of interest

The authors declare no competing interests.

Human and animal ethics

Not applicable
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Footnotes
1
Let \(A=(a_{ij}) \in \mathbb {R}^{m \times n}. \) \(A_{\alpha \beta }\) denotes the submatrix of \(A\) formed by the elements \(a_{ij}\) with \(i\in \alpha \) and \(j\in \beta .\)
 
2
A matrix \(M \in \mathbb {R}^{n \times n}\) is a P-matrix if for every nonzero vector \({\varvec{z}}\) there is an index \(j \in \left\{ 1, \ldots , n\right\} \,\) such that \({\varvec{z}}_{j}[M{\varvec{z}}]_{j}>0.\)
 
References
1.
Cottle, R.W., Pang, J.S., Stone, R.E.: The Linear Complementarity Problem. SIAM, Philadelphia (2009)
2.
De Luca, T., Facchinei, F., Kanzow, C.: A semismooth equation approach to the solution of nonlinear complementarity problems. Math. Program. 75, 407–439 (1996)
3.
Facchinei, F.: Minimization of SC1 functions and the Maratos effect. Oper. Res. Lett. 17(3), 131–137 (1995)
4.
Arias, C., Martínez, H.J., Pérez, R.: A global quasi-Newton algorithm for nonlinear complementarity problems. Pacific J. Optim. 13(1), 1–15 (2017)
5.
Outrata, J.V., Zowe, J.: A Newton method for a class of quasi-variational inequalities. Comput. Optim. Appl. 4(1), 5–21 (1995)
6.
Pang, J.S.: On the convergence of a basic iterative method for the implicit complementarity problem. J. Optim. Theory Appl. 37(2), 149–162 (1982)
7.
Dolcetta, I.C., Mosco, U.: Implicit complementarity problems and quasi-variational inequalities. In: Cottle, R.W., Giannessi, F., Lions, J.L. (eds.) Variational Inequalities and Complementarity Problems: Theory and Applications, pp. 75–87 (1980)
8.
Pang, J.S.: On the convergence of a basic iterative method for the implicit complementarity problem. J. Optim. Theory Appl. 37(2), 149–162 (1982)
9.
Arenas, F., Martínez, H.J., Pérez, R.: Least change secant update methods for nonlinear complementarity problem. Ingeniería y Ciencia 11(21), 11–36 (2015)
10.
Qi, L.: Convergence analysis of some algorithms for solving nonsmooth equations. Math. Oper. Res. 18(1), 227–244 (1993)
11.
Jiang, H., Fukushima, M., Qi, L., Sun, D.: A trust region method for solving generalized complementarity problems. SIAM J. Optim. 8(1), 140–157 (1998)
12.
Kanzow, C., Kleinmichel, H.: A new class of semismooth Newton-type methods for nonlinear complementarity problems. Comput. Optim. Appl. 11(3), 227–251 (1998)
13.
Clarke, F.H.: Necessary conditions for nonsmooth problems in optimal control and the calculus of variations. PhD thesis, University of Washington (1973)
14.
Song, L., Gao, Y.: On the local convergence of a Levenberg-Marquardt method for nonsmooth nonlinear complementarity problem. ScienceAsia 43(1), 377–382 (2017)
15.
Du, S.-Q.: Generalized Newton method for a kind of complementarity problem. Abstr. Appl. Anal. 2014, 1–5 (2014)
16.
Fischer, A., Kanzow, C.: On finite termination of an iterative method for linear complementarity problems. Math. Program. 74(3), 279–292 (1996)
17.
18.
Lopes, V.L.R., Martínez, J.M., Pérez, R.: On the local convergence of quasi-Newton methods for nonlinear complementarity problems. Appl. Numer. Math. 30(1), 3–22 (1999)
19.
Qi, L.: C-differentiability, C-differential operators and generalized Newton methods. Technical report, Applied Mathematics Report AMR96/5, University of New South Wales, Sydney, Australia (1996)
20.
Sun, D., Han, J.: Newton and quasi-Newton methods for a class of nonsmooth equations and related problems. SIAM J. Optim. 7(2), 463–480 (1997)
21.
Facchinei, F., Soares, J.: A new merit function for nonlinear complementarity problems and a related algorithm. SIAM J. Optim. 7(1), 225–247 (1997)
22.
Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (2006)
23.
Bertsekas, D.P.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York (2014)
24.
Kojima, M., Shindo, S.: Extension of Newton and quasi-Newton methods to systems of PC1 equations. J. Oper. Res. Soc. Japan 29(4), 352–375 (1986)
25.
Wu, S.L., Guo, P.: Modulus-based matrix splitting algorithms for the quasi-complementarity problems. Appl. Numer. Math. 132, 127–137 (2018)
Metadata
Title
A nonsmooth Newton method for solving the generalized complementarity problem
Authors
Hevert Vivas
Rosana Pérez
Carlos A. Arias
Publication date
01.07.2023
Publisher
Springer US
Published in
Numerical Algorithms / Issue 2/2024
Print ISSN: 1017-1398
Electronic ISSN: 1572-9265
DOI
https://doi.org/10.1007/s11075-023-01581-2
