
Open Access 01.12.2019 | Research

Split proximal linearized algorithm and convergence theorems for the split DC program

Authors: Chih-Sheng Chuang, Chi-Ming Chen

Published in: Journal of Inequalities and Applications | Issue 1/2019

Abstract

In this paper, we study the split DC program by using the split proximal linearized algorithm. Further, a linear convergence theorem for the proposed algorithm is established under suitable conditions. As an application, we study the DC program (DCP). Finally, we give numerical results for the proposed convergence theorems.

1 Introduction

First, we recall the minimization problem for convex functions:
$$ \text{Find }\bar{x}\in \operatorname{arg}\min\bigl\{ f(x): x\in H\bigr\} , $$
(MP1)
where H is a real Hilbert space and \(f: H\rightarrow(-\infty, \infty]\) is a proper, lower semicontinuous, and convex function. This is a classical convex minimization problem with many applications. To study this problem, Martinet [11] introduced the proximal point algorithm
$$ x_{n+1}=\operatorname{arg}\min_{y\in H} \biggl\{ f(y)+ \frac{1}{2\beta_{n}} \Vert y-x_{n} \Vert ^{2} \biggr\} ,\quad n\in\mathbb{N}, $$
(PPA)
and showed that \(\{x_{n}\}_{n\in\mathbb{N}}\) converges weakly to a minimizer of f under suitable conditions. This algorithm, however, is useful only for convex problems, because it relies on the monotonicity of the subdifferential operators of convex functions. It is therefore natural to ask how proximal point methods can be extended to nonconvex problems.
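For intuition, the following is a minimal numerical sketch of (PPA), not part of the original text, for the one-dimensional convex function \(f(x)=\vert x\vert\), whose proximal step has the well-known closed form of soft thresholding; the starting point and step sizes \(\beta_{n}\) are arbitrary illustrative choices.

```python
def prox_abs(x, beta):
    # Proximal step for f(y) = |y|:  arg min_y { |y| + (1/(2*beta)) * (y - x)**2 },
    # which is the soft-thresholding operator.
    sign = 1.0 if x >= 0 else -1.0
    return sign * max(abs(x) - beta, 0.0)

x = 5.0                    # arbitrary starting point
for n in range(1, 11):     # beta_n = 1 for all n (illustrative choice)
    x = prox_abs(x, beta=1.0)
    print(n, x)
# The iterates decrease monotonically to 0, the unique minimizer of f.
```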
The following is a well-known nonconvex problem, known as DC program:
$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in\mathbb{R}^{n}}\bigl\{ f(x)=g(x)-h(x) \bigr\} , $$
(DCP)
where \(g,h:\mathbb{R}^{n}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions. Here, the function f is called a DC function, and g and h are called the DC components of f. In the DC program, the convention \((+\infty)-(+\infty)=+\infty\) is adopted to avoid the meaningless expression \((+\infty)-(+\infty)\). It is well known that a necessary condition for \(x\in\operatorname{dom}(f):=\{x\in\mathbb{R}^{n}: f(x)<\infty\}\) to be a local minimizer of f is \(\partial h(x)\subseteq \partial g(x)\), where \(\partial g(x)\) and \(\partial h(x)\) are the subdifferentials of g and h, respectively (see Definition 2.4). This condition, however, is difficult to attain, so many researchers focus on finding points x with \(\partial h(x)\cap\partial g(x) \neq\emptyset\); such a point x is called a critical point of f [8].
It is worth mentioning the richness of the class of DC functions, which forms a subspace containing the class of lower-\(\mathcal{C}^{2}\) functions. In particular, \(\mathcal{DC}(\mathbb{R}^{n})\) contains the space \(\mathcal{C}^{1,1}\) of functions whose gradient is locally Lipschitz continuous. Further, \(\mathcal{DC}(\mathbb{R}^{n})\) is closed under the operations usually considered in optimization; for example, a linear combination, a finite supremum, or the product of two DC functions is again DC. It is also known that the set of DC functions defined on a compact convex subset of \(\mathbb{R}^{n}\) is dense in the set of continuous functions on this set.
Interest in the theory of DC functions has increased considerably in recent years. Interesting optimality conditions and duality theorems related to the DC program have been given (for example, see [6, 7, 14]), and several algorithms for the DC program have been proposed to analyze and solve a variety of highly structured practical problems (for example, see [13]).
In 2003, Sun, Sampaio, and Candido [16] gave the following algorithm to study problem (DCP).
Algorithm 1.1
(Proximal point algorithm for (DCP) [16])
Let \(\{\beta_{n}\} _{n\in\mathbb{N}}\) be a sequence in \((0,\infty)\), and let \(g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}\) be proper lower semicontinuous and convex functions. Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by
$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{k}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n})\text{ and set }y_{n}=x_{n}+ \beta_{n}w_{n}, \\ x_{n+1}:=(I+\beta_{n}\partial g)^{-1}(y_{n}), \quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$
In 2016, Souza, Oliveira, and Soubeyran [15] gave the following algorithm to study the DC program.
Algorithm 1.2
(Proximal linearized algorithm for (DCP) [15])
Let \(\{\beta_{n}\} _{n\in\mathbb{N}}\) be a sequence in \((0,\infty)\), and let \(g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}\) be proper lower semicontinuous and convex functions. Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by
$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{k}\quad \text{is chosen arbitrarily}, \\ \text{compute }w_{n}\in\partial h(x_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in\mathbb{R}^{k}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle w_{n},u-x_{n}\rangle\},\quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$
In fact, if h is differentiable, then Algorithm 1.2 reduces to the following:
$$ \textstyle\begin{cases} x_{1}\in\mathbb{R}^{k}\quad \text{is chosen arbitrarily}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in\mathbb{R}^{k}}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\},\quad n\in\mathbb{N}, \\ \text{stopping criterion: } x_{n+1}=x_{n}. \end{cases} $$
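As a toy illustration of this scheme (our own example, not taken from [15]): let \(g(x)=2x^{2}\) (which is 4-strongly convex) and \(h(x)=4x\), so the unique critical point is \(x=1\), since \(\nabla h(x)=4\in\partial g(x)=\{4x\}\) exactly when \(x=1\). Because g is quadratic, the arg min has a closed form, and a minimal Python sketch of the iteration reads:

```python
def prox_linearized_step(x, beta):
    # arg min_u { 2*u**2 + (1/(2*beta))*(u - x)**2 - 4*(u - x) }
    # First-order condition: 4*u + (u - x)/beta - 4 = 0, hence:
    return (x + 4.0 * beta) / (1.0 + 4.0 * beta)

x = 50.0            # arbitrary starting point
beta = 1.0          # illustrative step size
for n in range(100):
    x_next = prox_linearized_step(x, beta)
    if x_next == x:             # stopping criterion x_{n+1} = x_n
        break
    x = x_next
print(x)            # approaches the critical point x = 1
```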
Further, Souza, Oliveira, and Soubeyran [15] gave the following convergence theorem for problem (DCP).
Theorem 1.1
([15, Theorem 3])
Let \(g,h:\mathbb{R}^{k}\rightarrow \mathbb{R}\cup\{+\infty\}\) be proper, lower semicontinuous, and convex functions, and \(g-h\) be bounded from below. Suppose that g is ρ-strongly convex, h is differentiable, and \(\nabla h(x)\) is L-Lipschitz continuous. Let \(\{\beta_{n}\}_{n\in \mathbb{N}}\) be a bounded sequence with \(\liminf_{n\rightarrow \infty}\beta _{n}>0\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 1.2. If \(\rho>2L\), then \(\{x_{n}\}_{n\in\mathbb {N}}\) converges linearly to a critical point of problem (DCP).
In this paper, we want to study the split DC program:
$$ \text{Find } \bar{x}\in H_{1} \text{ such that }\bar{x}\in \operatorname{arg}\min _{x\in H_{1}}f_{1}(x) \text{ and } A\bar{x}\in \operatorname{arg}\min _{y\in H _{2}}f_{2}(y), $$
(SDCP)
where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a nonzero linear and bounded mapping with adjoint operator \(A^{*}\), \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) are proper lower semicontinuous and convex functions, and \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) are proper lower semicontinuous and convex functions, and \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g_{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\).
Clearly, (SDCP) is a generalization of problem (DCP). Indeed, if \(H_{1}=H_{2}=\mathbb{R}^{n}\), \(A:\mathbb{R}^{n}\rightarrow \mathbb{R}^{n}\) is the identity mapping, \(g_{1}=g_{2}\), and \(h_{1}=h_{2}\), then problem (SDCP) is reduced to problem (DCP).
If \(h_{1}(x)=0\) and \(h_{2}(y)=0\) for all \(x\in H_{1}\) and \(y\in H_{2}\), then (SDCP) reduces to the split minimization problem (SMP) for convex functions:
$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }g_{1}( \bar{x})= \min_{u\in H_{1}}g_{1}(u)\mbox{ and }g_{2}(A\bar{x})=\min_{v\in H_{2}}g _{2}(v), \end{aligned}$$
(SMP)
where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a linear and bounded mapping with adjoint \(A^{*}\), \(g_{1}:H_{1}\rightarrow \mathbb{R}\) and \(g_{2}:H_{2}\rightarrow \mathbb{R}\) are proper, lower semicontinuous, and convex functions. For example, one can see [4] and the related references.
If \(H_{1}=H_{2}=H\) and \(A:H\rightarrow H\) is the identity mapping, then problem (SMP) is reduced to the common minimization problem for convex functions:
$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})= \min _{u\in H}g_{1}(u)\mbox{ and }g_{2}(\bar{x})= \min_{v\in H}g_{2}(v), \end{aligned}$$
(CMP)
where H is a real Hilbert space, \(g_{1},g_{2}:H\rightarrow \mathbb {R}\) are proper, lower semicontinuous, and convex functions. Further, if the solution set of problem (CMP) is nonempty, then problem (CMP) is equivalent to the following problem:
$$\begin{aligned}& \mbox{Find }\bar{x}\in H\mbox{ such that }g_{1}(\bar{x})+g _{2}(\bar{x})=\min_{u\in H}g_{1}(u)+g_{2}(u), \end{aligned}$$
(MP2)
where H is a real Hilbert space, \(g_{1},g_{2}:H\rightarrow \mathbb {R}\) are proper, lower semicontinuous, and convex functions. This problem is well known and it has many important applications, including multiresolution sparse regularization, Fourier regularization, hard-constrained inconsistent feasibility, and alternating projection signal synthesis problems. For example, one can refer to [5, 9] and the related references.
On the other hand, Moudafi [12] introduced the split variational inclusion problem, which is a generalization of problem (SMP):
$$\begin{aligned}& \mbox{Find }\bar{x}\in H_{1}\mbox{ such that }0_{H_{1}}\in B_{1}( \bar{x})\mbox{ and }0_{H_{2}}\in B_{2}(A \bar{x}), \end{aligned}$$
(SVIP)
where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(B_{1}:H_{1}\multimap H _{1}\) and \(B_{2}:H_{2}\multimap H_{2}\) are set-valued maximal monotone mappings, \(A:H_{1}\rightarrow H_{2}\) is a linear and bounded operator, and \(A^{*}\) is the adjoint of A. Here, \(0_{H_{1}}\) and \(0_{H_{2}}\) are the zero elements of the real Hilbert spaces \(H_{1}\) and \(H_{2}\), respectively. To study problem (SVIP), Byrne et al. [3] gave the following algorithm and related convergence theorem in infinite-dimensional Hilbert spaces.
Theorem 1.2
([3])
Let \(H_{1}\) and \(H_{2}\) be real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) be a nonzero linear and bounded operator, and \(A^{*}\) denote the adjoint operator of A. Let \(B_{1}:H_{1}\multimap H_{1}\) and \(B_{2}:H_{2}\multimap H_{2}\) be set-valued maximal monotone mappings, \(\beta>0\), and \(\gamma\in(0,\frac {2}{\Vert A\Vert ^{2}})\). Let Ω be the solution set of (SVIP), and suppose that \(\varOmega\neq\emptyset\). Let \(\{x_{n}\} _{n\in\mathbb{N}}\) be defined by
$$ x_{n+1}:=J_{\beta}^{B_{1}}\bigl[x_{n}-\gamma A^{*}\bigl(I-J_{\beta}^{B_{2}}\bigr)Ax_{n} \bigr],\quad n\in\mathbb{N}. $$
Then \(\{x_{n}\}\) converges weakly to an element \(\bar{x}\in\varOmega\).
If \(B_{1}=\partial g_{1}\) and \(B_{2}=\partial g_{2}\) (the subdifferentials of \(g_{i}\), \(i=1,2\)), then the algorithm given by Theorem 1.2 reduces to the following algorithm:
$$ \textstyle\begin{cases} y_{n}=\operatorname{arg}\min_{z\in H_{2}}\{g_{2}(z)+\frac{1}{2\beta} \Vert z-Ax_{n} \Vert ^{2} \}, \\ z_{n}=x_{n}-\gamma A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}=\operatorname{arg}\min_{y\in H_{1}}\{g_{1}(y)+\frac{1}{2\beta} \Vert y-z_{n} \Vert ^{2} \},\quad n\in\mathbb{N}. \end{cases} $$
In this paper, motivated by the above works on DC programs and related problems, we want to study problem (SDCP) by using the split proximal linearized algorithm:
$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}, \end{cases} $$
where \(H_{1}\) and \(H_{2}\) are real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) is a linear and bounded mapping with adjoint \(A^{*}\), \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) are proper lower semicontinuous and convex functions, and \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) are proper lower semicontinuous and convex functions, \(g_{1}\) and \(g_{2}\) are strongly convex, \(h_{1}\) and \(h_{2}\) are Fréchet differentiable, \(\nabla h_{1}\) and \(\nabla h_{2}\) are L-Lipschitz continuous, and \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g _{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\). Further, linear convergence theorems for the proposed algorithms are established under suitable conditions. Finally, we give numerical results for the proposed convergence theorems.

2 Preliminaries

Let H be a (real) Hilbert space with the inner product \(\langle \cdot,\cdot\rangle\) and the norm \(\Vert \cdot\Vert \). We denote the strong and weak convergence of \(\{x_{n}\}_{n\in\mathbb{N}}\) to \(x\in H\) by \(x_{n}\rightarrow x\) and \(x_{n}\rightharpoonup x\), respectively. For each \(x,y,u,v\in H\) and \(\lambda\in\mathbb{R}\), we have
$$\begin{aligned}& \Vert x+y \Vert ^{2}= \Vert x \Vert ^{2}+2\langle x,y\rangle+ \Vert y \Vert ^{2}, \end{aligned}$$
(2.1)
$$\begin{aligned}& \bigl\Vert \lambda x+(1-\lambda)y \bigr\Vert ^{2}= \lambda \Vert x \Vert ^{2}+(1-\lambda) \Vert y \Vert ^{2}- \lambda(1-\lambda) \Vert x-y \Vert ^{2}, \end{aligned}$$
(2.2)
$$\begin{aligned}& 2\langle x-y,u-v\rangle= \Vert x-v \Vert ^{2}+ \Vert y-u \Vert ^{2}- \Vert x-u \Vert ^{2}- \Vert y-v \Vert ^{2}. \end{aligned}$$
(2.3)
Definition 2.1
Let H be a real Hilbert space, \(B:H\rightarrow H\) be a mapping, and \(\rho>0\). Then:
(i)
B is monotone if \(\langle x-y,Bx-By\rangle\geq0\) for all \(x,y\in H\).
 
(ii)
B is ρ-strongly monotone if \(\langle x-y,Bx-By \rangle\geq\rho\Vert x-y\Vert ^{2}\) for all \(x,y\in H\).
 
Definition 2.2
Let H be a real Hilbert space and \(B:H\multimap H\) be a set-valued mapping with domain \(\mathcal{D}(B):=\{x\in H:B(x)\neq\emptyset\}\). Then:
(i)
B is called monotone if \(\langle u-v,x-y\rangle\geq0\) for any \(u\in B(x)\) and \(v\in B(y)\).
 
(ii)
B is maximal monotone if its graph \(\{(x,y):x\in \mathcal{D}(B), y\in B(x)\}\) is not properly contained in the graph of any other monotone mapping.
 
(iii)
B is ρ-strongly monotone if \(\langle x-y,u-v \rangle\geq\rho\Vert x-y\Vert ^{2}\) for all \(x,y\in H\) and all \(u\in B(x)\), and \(v\in B(y)\).
 
Definition 2.3
Let H be a real Hilbert space and \(f:H\rightarrow\mathbb{R}\) be a function. Then:
(i)
f is proper if \(\operatorname{dom}(f):=\{x\in H: f(x)<\infty\}\neq \emptyset\).
 
(ii)
f is lower semicontinuous if \(\{x\in H:f(x)\leq r\}\) is closed for each \(r\in\mathbb{R}\).
 
(iii)
f is convex if \(f(tx+(1-t)y)\leq t f(x)+(1-t)f(y)\) for every \(x,y\in H\) and \(t\in[0,1]\).
 
(iv)
f is ρ-strongly convex (\(\rho>0\)) if
$$ f\bigl(tx+(1-t)y\bigr)+\frac{\rho}{2}\cdot t(1-t) \Vert x-y \Vert ^{2}\leq tf(x)+(1-t)f(y) $$
for all \(x,y\in H\) and \(t\in(0,1)\).
 
(v)
f is Gâteaux differentiable at \(x\in H\) if there is \(\nabla f(x)\in H\) such that
$$ \lim_{t\rightarrow0}\frac{f(x+ty)-f(x)}{t}=\bigl\langle y, \nabla f(x) \bigr\rangle $$
for each \(y\in H\).
 
(vi)
f is Fréchet differentiable at x if there is \(\nabla f(x)\) such that
$$ \lim_{y\rightarrow0} \frac{f(x+y)-f(x)-\langle\nabla f(x),y\rangle}{ \Vert y \Vert }=0. $$
 
Remark 2.1
Let H be a real Hilbert space. Then \(f(x):=\Vert x\Vert ^{2}\) is a 2-strongly convex function. Besides, we know \(g:H\rightarrow \mathbb {R}\) is ρ-strongly convex if and only if \(g-\frac{\rho}{2}\Vert \cdot \Vert ^{2}\) is convex [1, Proposition 10.6].
Example 2.1
Let \(g(x):=\frac{1}{2}\langle Qx,x\rangle-\langle x,b\rangle\), where \(Q\in\mathbb{R}^{n\times n}\) is a real symmetric positive definite matrix and \(b\in\mathbb{R}^{n}\). Then g is a strongly convex function.
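A small numerical check of Example 2.1 (our own sketch, under the assumption that ρ may be taken as the smallest eigenvalue of Q): for random points and convex weights, the strong convexity inequality of Definition 2.3(iv) holds.

```python
import numpy as np

rng = np.random.default_rng(0)
Q = np.array([[4.0, 1.0], [1.0, 3.0]])      # symmetric positive definite (illustrative choice)
b = np.array([1.0, -2.0])
rho = np.linalg.eigvalsh(Q).min()           # candidate strong-convexity modulus

def g(x):
    return 0.5 * x @ Q @ x - x @ b

for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    t = rng.uniform(0.01, 0.99)
    lhs = g(t * x + (1 - t) * y) + 0.5 * rho * t * (1 - t) * np.linalg.norm(x - y) ** 2
    rhs = t * g(x) + (1 - t) * g(y)
    assert lhs <= rhs + 1e-10               # Definition 2.3(iv) with rho = lambda_min(Q)
print("strong convexity inequality verified on random samples")
```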
Definition 2.4
Let \(f:H\rightarrow(-\infty,\infty]\) be a proper lower semicontinuous and convex function. Then the subdifferential ∂f of f is defined by
$$ \partial f(x):=\bigl\{ x^{*}\in H: f(x)+\bigl\langle y-x,x^{*} \bigr\rangle \leq f(y)\text{ for each }y\in H\bigr\} $$
for each \(x\in H\).
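As a standard one-dimensional illustration (not from the original text), for \(f(x)=\vert x\vert \) on \(H=\mathbb{R}\) this definition gives
$$ \partial f(x)= \textstyle\begin{cases} \{-1\}, & x< 0, \\ {}[-1,1], & x=0, \\ \{1\}, & x>0. \end{cases} $$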
Lemma 2.1
Let \(f:H\rightarrow(-\infty,\infty]\) be a proper lower semicontinuous and convex function. Then the following are satisfied:
(i)
∂f is a set-valued maximal monotone mapping.
 
(ii)
f is Gâteaux differentiable at \(x\in \operatorname{int}(\operatorname{dom}(f))\) if and only if \(\partial f(x)\) consists of a single element, that is, \(\partial f(x)=\{\nabla f(x)\}\) [2, Proposition 1.1.10].
 
(iii)
Suppose that f is Fréchet differentiable. Then f is convex if and only if \(\nabla f\) is a monotone mapping.
 
Lemma 2.2
([1, Example 22.3(iv)])
Let \(\rho>0\), H be a real Hilbert space and \(f:H\rightarrow \mathbb {R}\) be a proper, lower-semicontinuous, and convex function. If f is ρ-strongly convex, then ∂f is ρ-strongly monotone.
Lemma 2.3
([1, Proposition 16.26])
Let H be a real Hilbert space and \(f:H\rightarrow (-\infty,\infty]\) be a proper, lower semicontinuous, and convex function. If \(\{u_{n}\}_{n\in \mathbb{N}}\) and \(\{x_{n}\}_{n\in\mathbb{N}}\) are sequences in H with \(u_{n}\in\partial f(x_{n})\) for all \(n\in\mathbb{N}\), and \(x_{n}\rightharpoonup x\) and \(u_{n}\rightarrow u\), then \(u\in\partial f(x)\).
Lemma 2.4
([17, p. 114])
Let \(\{a_{n}\}_{n\in \mathbb{N}}\) and \(\{b_{n}\}_{n\in\mathbb{N}}\) be sequences of nonnegative real numbers. If \(\sum_{n=1}^{\infty}a_{n}=\infty\) and \(\sum_{n=1}^{\infty}a_{n} b_{n}<\infty\), then \(\liminf_{n\rightarrow \infty }b_{n}=0\).
Lemma 2.5
([10])
Let H be a real Hilbert space, \(B:H\multimap H\) be a set-valued maximal monotone mapping, \(\beta >0\), and \(J_{\beta}^{B}\) be defined by \(J_{\beta}^{B}(x):=(I+\beta B)^{-1}(x)\) for each \(x\in H\). Then \(J_{\beta}^{B}\) is a single-valued mapping.

3 Split proximal linearized algorithm

Throughout this section, we use the following notations and assumptions. Let \(\rho\geq L>0\). Let \(H_{1}\) and \(H_{2}\) be finite dimensional real Hilbert spaces, \(A:H_{1}\rightarrow H_{2}\) be a nonzero linear and bounded mapping, \(A^{*}\) be the adjoint of A, \(g_{1},h_{1}:H_{1}\rightarrow \mathbb{R}\) be proper lower semicontinuous and convex functions, \(g_{2},h_{2}:H_{2}\rightarrow \mathbb{R}\) be proper lower semicontinuous and convex functions, \(f_{1}(x)=g_{1}(x)-h_{1}(x)\) for all \(x\in H_{1}\), and \(f_{2}(y)=g_{2}(y)-h_{2}(y)\) for all \(y\in H_{2}\). Further, we assume that \(f_{1}\) and \(f_{2}\) are bounded from below, \(h_{1}\) and \(h_{2}\) are Fréchet differentiable, \(\nabla h_{1}\) and \(\nabla h _{2}\) are L-Lipschitz continuous, \(g_{1}\) and \(g_{2}\) are ρ-strongly convex. Let \(\beta\in(0,\infty)\), and let \(\{\beta_{n}\}_{n\in\mathbb{N}}\) be a sequence in \([a,b]\subseteq(0, \infty)\). Let \(r\in(0,\frac{1}{\Vert A\Vert ^{2}})\) and \(\{r_{n}\}_{n\in \mathbb{N}}\) be a sequence in \((0,\frac{1}{\Vert A\Vert ^{2}})\). Let \(\varOmega_{\text{SDCP}}\) be defined by
$$\varOmega_{\text{SDCP}}:=\bigl\{ x\in H_{1}: \nabla h_{1}(x)\in\partial g _{1}(x), \nabla h_{2}(Ax) \in\partial g_{2}(Ax)\bigr\} , $$
and assume that \(\varOmega_{\text{SDCP}}\neq\emptyset\).
Proposition 3.1
If \(\rho>L\) and \(\varOmega_{\mathrm{SDCP}}\neq\emptyset\), then the set \(\varOmega_{\mathrm{SDCP}}\) is a singleton.
Proof
If \(x,y\in\varOmega_{\mathrm{SDCP}}\), then
$$\begin{aligned}& \nabla h_{1}(x)\in\partial g_{1}(x),\qquad \nabla h_{1}(y)\in\partial g_{1}(y), \\& \nabla h_{2}(Ax)\in\partial g_{2}(Ax),\qquad \nabla h_{2}(Ay) \in\partial g_{2}(Ay). \end{aligned}$$
Since \(g_{1}\) is ρ-strongly convex, we know \(\partial g_{1}\) is ρ-strongly monotone. Thus,
$$ \rho \Vert x-y \Vert ^{2}\leq\bigl\langle x-y,\nabla h_{1}(x)-\nabla h_{1}(y) \bigr\rangle \leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert . $$
Since \(\nabla h_{1}\) is L-Lipschitz continuous, we have
$$ \rho \Vert x-y \Vert ^{2}\leq \Vert x-y \Vert \cdot \bigl\Vert \nabla h_{1}(x)-\nabla h_{1}(y) \bigr\Vert \leq L \Vert x-y \Vert ^{2}. $$
Since \(\rho>L\), we have \(x=y\). The proof is completed. □
In this section, we study the split DC program by the following algorithm.
Algorithm 3.1
(Split proximal linearized algorithm)
$$ \textstyle\begin{cases} x_{1}\in H_{1}\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta_{n}} \Vert v-Ax _{n} \Vert ^{2}-\langle\nabla h_{2}(Ax_{n}),v-Ax_{n}\rangle\}, \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta_{n}} \Vert u-z _{n} \Vert ^{2}-\langle\nabla h_{1}(z_{n}),u-z_{n}\rangle\},\quad n\in \mathbb{N}. \end{cases} $$
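The following is a minimal Python sketch of Algorithm 3.1 (our own illustration, not from the paper). The two arg-min subproblems are passed in as callables, since in general they require an inner solver; when \(g_{1}\) and \(g_{2}\) are quadratic they have closed forms (cf. Remark 3.2 below). The smoke test at the end uses \(H_{1}=H_{2}=\mathbb{R}\), A the identity, \(g_{1}(u)=g_{2}(u)=u^{2}\), and \(h_{1}(u)=h_{2}(u)=2u\), whose unique critical point is \(u=1\).

```python
import numpy as np

def split_proximal_linearized(x1, A, prox_lin_1, prox_lin_2, betas, rs, tol=1e-12):
    """Sketch of Algorithm 3.1.

    prox_lin_2(v, beta) must return arg min_u { g2(u) + (1/(2*beta))*||u - v||^2
                                                - <grad h2(v), u - v> } over H2;
    prox_lin_1(z, beta) is the analogous step for g1, h1 over H1.
    """
    x = np.asarray(x1, dtype=float)
    for beta, r in zip(betas, rs):
        y = prox_lin_2(A @ x, beta)
        z = x - r * (A.T @ (A @ x - y))
        x_next = prox_lin_1(z, beta)
        if np.linalg.norm(x_next - x) <= tol:   # stopping criterion x_{n+1} = x_n (up to tol)
            return x_next
        x = x_next
    return x

# Smoke test: g(u) = u^2 and h(u) = 2u give the closed form
# arg min_u { u^2 + (1/(2*beta))*(u - v)^2 - 2*(u - v) } = (v + 2*beta) / (1 + 2*beta).
prox = lambda v, beta: (v + 2.0 * beta) / (1.0 + 2.0 * beta)
A = np.eye(1)
print(split_proximal_linearized(np.array([10.0]), A, prox, prox,
                                betas=[1.0] * 100, rs=[0.5] * 100))   # -> approx. [1.]
```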
Theorem 3.1
Let \(\{r_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,\frac {1}{\Vert A\Vert ^{2}})\) with \(0<\liminf_{n\rightarrow \infty}r_{n} \leq \limsup_{n\rightarrow \infty}r_{n}<\frac{1}{\Vert A\Vert ^{2}}\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 3.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar{x}\in\varOmega_{\mathrm{SDCP}}\).
Proof
Take any \(w\in\varOmega_{\text{SDCP}}\) and \(n\in\mathbb{N}\), and let w and n be fixed. First, from the second line of Algorithm 3.1, we get
$$ 0\in\partial g_{2}(y_{n})+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n})-\nabla h _{2}(Ax_{n}). $$
(3.1)
By (3.1), there exists \(u_{n}\in\partial g _{2}(y_{n})\) such that
$$ \nabla h_{2}(Ax_{n})=u_{n}+ \frac{1}{\beta_{n}}(y_{n}-Ax_{n}). $$
(3.2)
Since \(w\in\varOmega_{\text{SDCP}}\), we know that \(\nabla h_{2}(Aw) \in\partial g_{2}(Aw)\). By Lemma 2.2, \(\partial g_{2}\) is ρ-strongly monotone, and then
$$ 0\leq\bigl\langle y_{n}-Aw,u_{n}-\nabla h_{2}(Aw)\bigr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$
(3.3)
By (3.2) and (3.3),
$$ 0\leq\biggl\langle y_{n}-Aw, \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax _{n}-y_{n})- \nabla h_{2}(Aw)\biggr\rangle -\rho \Vert y_{n}-Aw \Vert ^{2}. $$
(3.4)
Hence, by (3.4), we have
$$\begin{aligned} 0 \leq&2\beta_{n}\bigl\langle y_{n}-Aw,\nabla h_{2}(Ax_{n})-\nabla h_{2}(Aw) \bigr\rangle +2 \langle y_{n}-Aw,Ax_{n}-y_{n}\rangle \\ & {}-2\beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ \leq&2\beta_{n}L \Vert y_{n}-Aw \Vert \cdot \Vert Ax_{n}-Aw \Vert -2\beta_{n}\rho \Vert y _{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2} \\ \leq& \beta_{n} L \Vert y_{n}-Aw \Vert ^{2}+ \beta_{n} L \Vert Ax_{n}-Aw \Vert ^{2}-2 \beta_{n}\rho \Vert y_{n}-Aw \Vert ^{2} \\ &{}+ \Vert Ax_{n}-Aw \Vert ^{2}- \Vert y_{n}-Ax_{n} \Vert ^{2}- \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$
(3.5)
By (3.5), we obtain
$$\begin{aligned} \Vert y_{n}-Aw \Vert ^{2} \leq& \frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L} \Vert Ax_{n}-Aw \Vert ^{2}- \frac{ \Vert y _{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n}\rho-\beta_{n}L} \\ \leq& \Vert Ax_{n}-Aw \Vert ^{2}-\frac{ \Vert y_{n}-Ax_{n} \Vert ^{2}}{1+2\beta_{n} \rho-\beta_{n}L}. \end{aligned}$$
(3.6)
In the same way, one obtains
$$ \Vert x_{n+1}-w \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2} -\frac{1}{1+2\beta_{n}\rho- \beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2}\leq \Vert z_{n}-w \Vert ^{2}. $$
(3.7)
Next, we have
$$\begin{aligned} 2 \Vert z_{n}-w \Vert ^{2} =& 2\bigl\langle z_{n}-w,x_{n}-r_{n} A^{*}(Ax_{n}-y_{n})-w \bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \bigl\langle z_{n}-w,A^{*}(Ax _{n}-y_{n})\bigr\rangle \\ =& 2\langle z_{n}-w,x_{n}-w\rangle-2r_{n} \langle Az_{n}-Aw,Ax_{n}-y _{n}\rangle \\ =& \Vert z_{n}-w \Vert ^{2}+ \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y _{n} \Vert ^{2} \\ &{}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2}. \end{aligned}$$
(3.8)
By (3.6), (3.7), and (3.8),
$$\begin{aligned}& \Vert x_{n+1}-w \Vert ^{2} \\& \quad \leq \Vert z_{n}-w \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n} \Vert Ax_{n}-Aw \Vert ^{2}+r_{n} \Vert Az_{n}-Ax_{n} \Vert ^{2}+r_{n} \Vert y_{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}- \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}-r _{n} \Vert Ax_{n}-Aw \Vert ^{2} \\& \qquad {}+r_{n} \Vert A \Vert ^{2}\cdot \Vert z_{n}-x_{n} \Vert ^{2}+r_{n}\cdot \frac{\beta_{n}L+1}{1+2 \beta_{n}\rho-\beta_{n}L} \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(1-\frac{\beta_{n}L+1}{1+2\beta_{n}\rho-\beta _{n}L}\biggr) \Vert Ax_{n}-Aw \Vert ^{2} \\& \quad = \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}-r_{n} \Vert Az _{n}-y_{n} \Vert ^{2} \\& \qquad {}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2} \\& \quad \leq \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$
(3.9)
By (3.9), \(\lim_{n\rightarrow \infty}\Vert x_{n}-w\Vert \) exists and \(\{x_{n}\}_{n\in\mathbb{N}}\) is a bounded sequence. Further, \(\{Ax_{n}\}_{n\in\mathbb{N}}\), \(\{y_{n}\}_{n\in\mathbb{N}}\), \(\{z_{n}\}_{n\in\mathbb{N}}\) are bounded sequences. By (3.9) again, we know that
$$ \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert = \lim_{n\rightarrow \infty } \Vert z_{n}-w \Vert , $$
(3.10)
and
$$ \lim_{n\rightarrow \infty}\frac{ \Vert x_{n+1}-z_{n} \Vert ^{2}}{1+2\beta _{n}\rho- \beta_{n}L}=\lim _{n\rightarrow \infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} = \lim_{n\rightarrow \infty}\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z_{n} \Vert ^{2}=0. $$
(3.11)
It follows from \(\{\beta_{n}\}_{n\in\mathbb{N}}\subseteq [a,b]\), \(0<\liminf_{n\rightarrow \infty}r_{n}\leq\limsup_{n\rightarrow \infty}r_{n}< \frac{1}{\Vert A\Vert ^{2}}\), and (3.11) that
$$ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =\lim_{n\rightarrow \infty} \Vert Az_{n}-y _{n} \Vert =\lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$
(3.12)
Since \(\{x_{n}\}_{n\in\mathbb{N}}\) is bounded, there exists a subsequence \(\{x_{n_{k}}\}_{k\in\mathbb{N}}\) of \(\{x_{n}\}_{n\in \mathbb{N}}\) such that \(x_{n_{k}}\rightarrow \bar{x}\in H_{1}\). Clearly, \(Ax_{n_{k}}\rightarrow A\bar{x}\), \(z_{n_{k}}\rightarrow \bar{x}\), \(Az_{n_{k}}\rightarrow A \bar{x}\), \(y_{n_{k}}\rightarrow A\bar{x}\), and \(x_{n_{k}+1}\rightarrow \bar{x}\). Further, by (3.2), we obtain
$$ \nabla h_{2}(Ax_{n_{k}})\in\partial g_{2}(y_{n_{k}})+\frac{1}{ \beta_{n_{k}}}(y_{n_{k}}-Ax_{n_{k}}). $$
(3.13)
By (3.12), (3.13), Lemma 2.3, and \(\{\beta_{n}\}_{n\in \mathbb{N}}\subseteq [a,b]\), we determine that
$$ \nabla h_{2}(A\bar{x})\in\partial g_{2}(A \bar{x}). $$
(3.14)
Similarly, we have
$$ \nabla h_{1}(\bar{x})\in\partial g_{1}( \bar{x}). $$
(3.15)
By (3.14) and (3.15), we know that \(\bar{x}\in\varOmega_{\text{SDCP}}\). Further, \(\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty }\Vert x_{n_{k}}-\bar{x}\Vert =0\). Therefore, the proof is completed. □
Remark 3.1
(i)
In Algorithm 3.1, if \(y_{n}=Ax _{n}\) and \(x_{n+1}=z_{n}\), then \(x_{n}=z_{n}\), and this implies that \(\nabla h_{1}(x_{n})\in\partial g_{1}(x_{n})\) and \(\nabla h_{2}(Ax _{n})\in\partial g_{2}(Ax_{n})\). Thus, \(x_{n}\in\varOmega_{\text{SDCP}}\).
 
(ii)
In Algorithm 3.1, if \(x_{n+1} \neq z_{n}\), then \(f_{1}(x_{n+1})< f_{1}(z_{n})\). Indeed, it follows from \(\partial h_{1}(z_{n})=\{\nabla h_{1}(z_{n})\}\) and the definition of \(x_{n+1}\) that
$$ g_{1}(x_{n+1})-h_{1}(x_{n+1})+ \frac{1}{2\beta_{n}} \Vert x_{n+1}-z_{n} \Vert ^{2} \leq g_{1}(z_{n})-h_{1}(z_{n}). $$
So, if \(x_{n+1}\neq z_{n}\), then \(f_{1}(x_{n+1})< f_{1}(z_{n})\).
 
(iii)
In Algorithm 3.1, if \(y_{n} \neq Ax_{n}\), then \(f_{2}(y_{n})< f_{2}(Ax_{n})\). Indeed, it follows from \(\partial h_{2}(Ax_{n})=\{\nabla h_{2}(Ax_{n})\}\) and the definition of \(y_{n}\) that
$$ g_{2}(y_{n})-h_{2}(y_{n})+ \frac{1}{2\beta_{n}} \Vert y_{n}-Ax_{n} \Vert ^{2} \leq g_{2}(Ax_{n})-h_{2}(Ax_{n}). $$
So, if \(y_{n}\neq Ax_{n}\), then \(f_{2}(y_{n})< f_{2}(Ax_{n})\).
 
(iv)
If \(\rho> L\), then it follows from (3.7) that (3.9) can be rewritten as
$$ \Vert x_{n+1}-w \Vert ^{2}\leq k_{n} \Vert z_{n}-w \Vert ^{2}\leq k_{n} \Vert x_{n}-w \Vert ^{2}, $$
where
$$ k_{n}:=\frac{1+\beta_{n}L}{1+2\beta_{n}\rho-\beta_{n}L}\in(0,1). $$
Hence, \(\{x_{n}\}_{n\in\mathbb{N}}\) converges linearly to \(\bar{x}\), where \(\varOmega_{\text{SDCP}}=\{\bar{x}\}\).
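For instance (an illustrative computation with made-up constants, not taken from the paper), with \(\beta_{n}=1\), \(\rho=4\), and \(L=1\) we get
$$ k_{n}=\frac{1+\beta_{n}L}{1+2\beta_{n}\rho-\beta_{n}L}=\frac{1+1}{1+8-1}=\frac{1}{4}, $$
so the squared distance to \(\bar{x}\) shrinks by at least a factor of 4 at every iteration.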
 
Remark 3.2
From the proof of Theorem 3.1, we know that
$$ \nabla h_{2}(Ax_{n})+\frac{1}{\beta_{n}}(Ax_{n}-y_{n}) \in\partial g _{2}(y_{n}), $$
(3.16)
and this implies that
$$ Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n}) \in y_{n}+\beta_{n}\partial g_{2}(y _{n})=(I_{H_{2}}+\beta_{n}\partial g_{2}) (y_{n}), $$
(3.17)
where \(I_{H_{2}}\) is the identity mapping on \(H_{2}\). Since \(g_{2}\) is proper, lower semicontinuous, and convex, we know that \(\partial g_{2}\) is maximal monotone. So, by Lemma 2.5, we determine that
$$ y_{n}=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}\bigl(Ax_{n}+\beta_{n}\nabla h_{2}(Ax_{n})\bigr). $$
(3.18)
Similarly, we have
$$ x_{n+1}=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}\bigl(z_{n}+\beta_{n} \nabla h_{1}(z_{n})\bigr), $$
(3.19)
where \(I_{H_{1}}\) is the identity mapping on \(H_{1}\). Therefore, Algorithm 3.1 can be rewritten as the following algorithm:
$$ \textstyle\begin{cases} y_{n}:=(I_{H_{2}}+\beta_{n}\partial g_{2})^{-1}(Ax_{n}+\beta_{n} \nabla h_{2}(Ax_{n})), \\ z_{n}:=x_{n}-r_{n} A^{*}(Ax_{n}-y_{n}), \\ x_{n+1}:=(I_{H_{1}}+\beta_{n}\partial g_{1})^{-1}(z_{n}+\beta_{n} \nabla h_{1}(z_{n})), \quad n\in\mathbb{N}. \end{cases} $$
(Algorithm 3.2)
In fact, we observe that the idea of Algorithm 3.2 is the same as that of the algorithm proposed by Sun, Sampaio, and Candido [16, Algorithm 2.3]. Hence, a modified algorithm and related convergence theorems could be obtained by using the idea of [16, Algorithm 5.3].
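As a quick numerical sanity check of identity (3.18) (our own sketch, not from the paper): for the quadratic \(g_{2}(v)=\Vert v\Vert ^{2}\) and the linear \(h_{2}(v)=\langle c,v\rangle\) with an arbitrary vector c, the prox-linearized arg min and the resolvent form coincide, since both equal \((v_{0}+\beta c)/(1+2\beta)\), where \(v_{0}\) plays the role of \(Ax_{n}\).

```python
import numpy as np
from scipy.optimize import minimize

beta = 0.7
c = np.array([28.0, 64.0])          # gradient of the linear part h2
v0 = np.array([3.0, -1.5])          # plays the role of A x_n

def objective(v):
    # g2(v) + (1/(2*beta))*||v - v0||^2 - <grad h2(v0), v - v0>, with g2(v) = ||v||^2
    return v @ v + (1.0 / (2.0 * beta)) * np.sum((v - v0) ** 2) - c @ (v - v0)

def gradient(v):
    return 2.0 * v + (v - v0) / beta - c

y_argmin = minimize(objective, x0=np.zeros(2), jac=gradient).x
# Resolvent form (3.18): (I + beta * dg2)^{-1}(v0 + beta * grad h2(v0)) with dg2(v) = 2v,
# i.e. y = (v0 + beta * c) / (1 + 2*beta).
y_resolvent = (v0 + beta * c) / (1.0 + 2.0 * beta)
print(np.allclose(y_argmin, y_resolvent, atol=1e-6))    # expected: True
```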
Remark 3.3
Under the assumptions in this section, consider the following:
$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}, \end{cases} $$
(3.20)
that is,
$$ \textstyle\begin{cases} y:=(I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax)), \\ z:=x-r A^{*}(Ax-y), \\ w:=(I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z)), \end{cases} $$
(3.21)
Then we have
$$ Ax=y\quad \text{and}\quad z=w \quad \Leftrightarrow\quad x=z\in \varOmega_{\text{SDCP}}. $$
(3.22)
Proof
For this equivalence relation, we only need to show that \(x=z\in \varOmega_{\text{SDCP}}\) implies that \(Ax=y\) and \(z=w\). Indeed, since \(x=z\in\varOmega_{\text{SDCP}}\), we know that \(\nabla h_{1}(z)\in \partial g_{1}(z)\) and \(\nabla h_{2}(Ax)\in\partial g_{2}(Ax)\). By Lemma 2.5,
$$ \textstyle\begin{cases} (I_{H_{2}}+\beta\partial g_{2})^{-1}(Ax+\beta\nabla h_{2}(Ax))=Ax, \\ (I_{H_{1}}+\beta\partial g_{1})^{-1}(z+\beta\nabla h_{1}(z))=z. \end{cases} $$
(3.23)
By (3.21) and (3.23), we know that \(Ax=y\) and \(z=w\). □
Remark 3.4
In Algorithm 3.1, if \(\beta_{n}=\beta\) and \(r_{n}=r\) for each \(n\in\mathbb{N}\), and \(x_{N+1}=x_{N}\) for some \(N\in\mathbb{N}\), then \(x_{n}=x_{N}\), \(y_{n}=y_{N}\), and \(z_{n}=z _{N}\) for each \(n\in\mathbb{N}\) with \(n\geq N\). By Theorem 3.1, we know that \(\lim_{n\rightarrow \infty}x_{n}=x _{N}\in\varOmega_{\text{SDCP}}\). So, \(x_{n+1}=x_{n}\) could be set as a stopping criterion in Algorithm 3.1. Further, from (3.21), we have
$$\begin{aligned} x=w\quad \Rightarrow& \quad x\in\varOmega_{\text{SDCP}} \\ \Rightarrow&\quad x\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=z=w\in\varOmega_{\text{SDCP}}\quad \text{and}\quad y=Ax \\ \Rightarrow&\quad x=w. \end{aligned}$$
This equivalence relation is important for the split DC program.
By Remark 3.4, we give the following result.
Proposition 3.2
Under the assumptions in this section, let \(x\in H_{1}\) and define
$$ \textstyle\begin{cases} y:=\operatorname{arg}\min_{v\in H_{2}}\{g_{2}(v)+\frac{1}{2\beta} \Vert v-Ax \Vert ^{2}- \langle\nabla h_{2}(Ax),v-Ax\rangle\}, \\ z:=x-rA^{*}(Ax-y), \\ w:=\operatorname{arg}\min_{u\in H_{1}}\{g_{1}(u)+\frac{1}{2\beta} \Vert u-z \Vert ^{2}- \langle\nabla h_{1}(z),u-z\rangle\}. \end{cases} $$
(3.24)
Then \(x\in\varOmega_{\mathrm{SDCP}}\) if and only if \(x=w\).
Next, we give another convergence theorem for the split proximal linearized algorithm under different assumptions on \(\{r_{n}\}_{n \in\mathbb{N}}\).
Theorem 3.2
Let \(\{r_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((0,\frac {1}{\Vert A\Vert ^{2}})\) with \(\lim_{n\rightarrow \infty}r_{n}=0\) and \(\sum_{n=1}^{\infty }r_{n}=\infty\). Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 3.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar{x}\in\varOmega_{\mathrm{SDCP}}\).
Proof
Take any \(w\in\varOmega_{\mathrm{SDCP}}\) and \(n\in\mathbb{N}\), and let w and n be fixed. From the proof of Theorem 3.1, we have
$$\begin{aligned}& \nabla h_{2}(Ax_{n})-\frac{1}{\beta_{n}}(y_{n}-Ax_{n}) \in\partial g _{2}(y_{n}), \end{aligned}$$
(3.25)
$$\begin{aligned}& \nabla h_{1}(z_{n})-\frac{1}{\beta_{n}}(x_{n+1}-z_{n}) \in\partial g _{1}(x_{n+1}), \end{aligned}$$
(3.26)
and
$$\begin{aligned} \Vert x_{n+1}-w \Vert ^{2} \leq& \Vert x_{n}-w \Vert ^{2}-\bigl(1-r_{n} \Vert A \Vert ^{2}\bigr) \Vert x_{n}-z _{n} \Vert ^{2}-r_{n} \Vert Az_{n}-y_{n} \Vert ^{2} \\ &{}-r_{n}\biggl(\frac{2\beta_{n}(\rho-L)}{1+2\beta_{n}\rho-\beta_{n}L}\biggr) \Vert Ax _{n}-Aw \Vert ^{2}-\frac{1}{1+2\beta_{n}\rho-\beta_{n}L} \Vert x_{n+1}-z_{n} \Vert ^{2} \\ \leq& \Vert x_{n}-w \Vert ^{2}. \end{aligned}$$
(3.27)
Further, the following are satisfied:
$$ \textstyle\begin{cases} \lim_{n\rightarrow \infty} \Vert x_{n}-w \Vert \text{ exists,} \\ \{x_{n}\}_{n\in\mathbb{N}}, \{Ax_{n}\}_{n\in\mathbb{N}}, \{y_{n}\} _{n\in\mathbb{N}}, \{z_{n}\}_{n\in\mathbb{N}} \text{ are bounded sequences,} \\ \lim_{n\rightarrow \infty} \Vert x_{n+1}-z_{n} \Vert =0, \\ \lim_{n\rightarrow \infty}(1-r_{n} \Vert A \Vert ^{2}) \Vert x_{n}-z_{n} \Vert ^{2}=0. \end{cases} $$
Since \(\lim_{n\rightarrow \infty}r_{n}=0\), we know that
$$ \lim_{n\rightarrow \infty} \Vert x_{n}-z_{n} \Vert =0. $$
(3.28)
By (3.27), we have
$$ \sum_{n=1}^{\infty}r_{n} \Vert Az_{n}-y_{n} \Vert ^{2}\leq\sum _{n=1}^{\infty }\bigl( \Vert x_{n}-w \Vert ^{2}- \Vert x_{n+1}-w \Vert ^{2}\bigr)< \infty. $$
(3.29)
By (3.29) and Lemma 2.4, we determine that
$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=0. $$
(3.30)
Then there exist a subsequence \(\{y_{n_{k}}\}_{k\in\mathbb{N}}\) of \(\{y_{n}\}_{n\in\mathbb{N}}\), a subsequence \(\{z_{n_{k}}\}_{k\in \mathbb{N}}\) of \(\{z_{n}\}_{n\in\mathbb{N}}\), and \(\bar{x}\in H_{1}\) such that \(z_{n_{k}}\rightarrow \bar{x}\), \(y_{n_{k}}\rightarrow A\bar{x}\), and
$$ \liminf_{n\rightarrow \infty} \Vert Az_{n}-y_{n} \Vert ^{2}=\lim_{k\rightarrow \infty} \Vert Az _{n_{k}}-y_{n_{k}} \Vert ^{2}=0. $$
(3.31)
Clearly, \(x_{n_{k}}\rightarrow \bar{x}\), \(x_{n_{k}+1}\rightarrow \bar{x}\), and \(Ax_{n_{k}}\rightarrow A\bar{x}\). By (3.25), (3.26), and Lemma 2.3, we know that \(\bar{x}\in \varOmega_{\text{SDCP}}\). Since \(\lim_{n\rightarrow \infty}\Vert x_{n}-\bar{x}\Vert \) exists, we know \(\lim_{n\rightarrow \infty}\Vert x _{n}-\bar{x}\Vert =\lim_{k\rightarrow \infty}\Vert x_{n_{k}}-\bar{x}\Vert =0\). Therefore, the proof is completed. □

4 Application to the DC program and numerical results

Let \(\rho>L\geq0\). Let H be a finite dimensional Hilbert space, \(g,h:H\rightarrow \mathbb{R}\) be proper, lower semicontinuous, and convex functions. Besides, we also assume that h is Fréchet differentiable, ∇h is L-Lipschitz continuous, g is ρ-strongly convex. Let \(\{\beta_{n}\}_{n\in\mathbb{N}}\) be a sequence in \((a,b)\subseteq(0,\infty)\). Let \(\{r_{n}\}_{n\in \mathbb{N}}\) be a sequence in \((0,1)\) with \(0<\liminf_{n\rightarrow \infty}r _{n}\leq \limsup_{n\rightarrow \infty}r_{n}<1\).
Now, we recall the DC program:
$$ \text{Find }\bar{x}\in \operatorname{arg}\min_{x\in H}\bigl\{ f(x)=g(x)-h(x)\bigr\} . $$
(DCP)
Let \(\varOmega_{\text{DCP}}\) be defined by
$$\varOmega_{\text{DCP}}:=\bigl\{ x\in H: \nabla h(x)\in\partial g(x)\bigr\} , $$
and assume that \(\varOmega_{\text{DCP}}\neq\emptyset\). If \(H_{1}=H_{2}=H\), A is the identity mapping on H, \(g_{1}=g_{2}=g\), and \(h_{1}=h_{2}=h\), then we get the following algorithm and convergence theorem from Algorithm 3.1 and Theorem 3.1, respectively.
Algorithm 4.1
$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is chosen arbitrarily}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),v-x_{n}\rangle\}, \\ z_{n}:=(1-r_{n})x_{n}+r_{n}y_{n}, \\ x_{n+1}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),u-z_{n}\rangle\},\quad n\in\mathbb{N}. \end{cases} $$
Theorem 4.1
Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 4.1. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar {x}\in\varOmega_{\mathrm{DCP}}\).
In fact, we can obtain the following algorithm and convergence theorem from Algorithm 4.1 and Theorem 4.1, respectively.
Algorithm 4.2
$$ \textstyle\begin{cases} x_{1}\in H\quad \text{is given}, \\ z_{n}:=\operatorname{arg}\min_{u\in H}\{g(u)+\frac{1}{2\beta_{n}} \Vert u-x_{n} \Vert ^{2}- \langle\nabla h(x_{n}),u-x_{n}\rangle\}, \\ y_{n}:=\operatorname{arg}\min_{v\in H}\{g(v)+\frac{1}{2\beta_{n}} \Vert v-z_{n} \Vert ^{2}- \langle\nabla h(z_{n}),v-z_{n}\rangle\}, \\ x_{n+1}:=(1-r_{n})z_{n}+r_{n}y_{n}, \quad n\in\mathbb{N}. \end{cases} $$
Theorem 4.2
Let \(\{x_{n}\}_{n\in\mathbb{N}}\) be generated by Algorithm 4.2. Then \(\{x_{n}\}_{n\in\mathbb{N}}\) converges to an element \(\bar {x}\in\varOmega_{\mathrm{DCP}}\).
Example 4.1
Let \(g,h:\mathbb{R}^{3}\rightarrow \mathbb{R}\) be defined by \(g(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}\) and \(h(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Then \(\varOmega_{\text{DCP}}:=\{x\in H: \nabla h(x)\in\partial g(x)\}=\{ (1,2,3)\}\).
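A minimal Python sketch applying Algorithm 4.2 to Example 4.1 (our own illustration): since g is quadratic with \(\partial g(u)=4u\) and \(\nabla h\) is the constant vector \((4,8,12)\), each arg min has the closed form \((w+\beta\nabla h)/(1+4\beta)\). The starting point and parameters below follow the setting of Table 1, but the iteration counts need not match the table exactly, since they depend on the precise stopping rule.

```python
import numpy as np

grad_h = np.array([4.0, 8.0, 12.0])      # h(x) = 4*x1 + 8*x2 + 12*x3, so grad h is constant

def step(w, beta):
    # arg min_u { 2*||u||^2 + (1/(2*beta))*||u - w||^2 - <grad_h, u - w> }
    # First-order condition: 4*u + (u - w)/beta - grad_h = 0.
    return (w + beta * grad_h) / (1.0 + 4.0 * beta)

x = np.array([88.0, 2000.0, 500.0])      # starting point of Table 1
beta, r, eps = 10.0, 0.5, 1e-12
for n in range(1, 1000):
    z = step(x, beta)
    y = step(z, beta)
    x_next = (1.0 - r) * z + r * y       # Algorithm 4.2
    if np.linalg.norm(x_next - x) <= eps:
        break
    x = x_next
print(n, x)                              # converges to (1, 2, 3)
```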
Example 4.2
Let \(g_{1},h_{1}:\mathbb{R}^{3}\rightarrow \mathbb{R}\) be defined by \(g_{1}(x_{1},x_{2},x_{3}):=2x_{1}^{2}+2x_{2}^{2}+2x_{3}^{2}\) and \(h_{1}(x_{1},x_{2},x_{3}):=4x_{1}+8x_{2}+12x_{3}\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Let \(g_{2},h_{2}:\mathbb{R}^{2}\rightarrow \mathbb{R}\) be defined by \(g_{2}(y_{1},y_{2}):=y_{1}^{2}+y_{2}^{2}\) and \(h_{2}(y_{1},y_{2}):=28y_{1}+64y_{2}\) for all \((y_{1},y_{2})\in\mathbb {R}^{2}\). Let \(A:\mathbb{R}^{3}\rightarrow \mathbb{R}^{2}\) be defined by \(A(x_{1},x_{2},x_{3})=(x_{1}+2x_{2}+3x_{3},4x_{1}+5x_{2}+6x_{3})\) for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\). Here, \(\Vert A\Vert \approx9.508\) (so \(1/\Vert A\Vert \approx0.10517\)) and \(\varOmega_{\text{SDCP}}=\{(1,2,3)\}\).
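Similarly, here is a self-contained sketch of Algorithm 3.1 for Example 4.2 (again our own illustration): both subproblems have closed forms because \(g_{1}\) and \(g_{2}\) are quadratic. The parameters follow one row of Table 3 (\(\beta_{n}=10\), \(r_{n}=0.05\)), but, as above, the exact iteration count depends on the stopping rule.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
grad_h1 = np.array([4.0, 8.0, 12.0])     # h1(x) = 4*x1 + 8*x2 + 12*x3
grad_h2 = np.array([28.0, 64.0])         # h2(y) = 28*y1 + 64*y2

def step_g2(v, beta):
    # arg min over H2: ||u||^2 + (1/(2*beta))*||u - v||^2 - <grad_h2, u - v>
    return (v + beta * grad_h2) / (1.0 + 2.0 * beta)

def step_g1(z, beta):
    # arg min over H1: 2*||u||^2 + (1/(2*beta))*||u - z||^2 - <grad_h1, u - z>
    return (z + beta * grad_h1) / (1.0 + 4.0 * beta)

x = np.array([123.0, 456.0, 789.0])      # starting point of Tables 3-5
beta, r, eps = 10.0, 0.05, 1e-12
for n in range(1, 100000):
    y = step_g2(A @ x, beta)
    z = x - r * (A.T @ (A @ x - y))
    x_next = step_g1(z, beta)
    if np.linalg.norm(x_next - x) <= eps:
        break
    x = x_next
print(n, x)                               # converges to (1, 2, 3)
```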
From Table 1, we see that Algorithm 4.2 reaches the required accuracy in only six iterations when \(\beta_{n}=500\) for all \(n\in \mathbb{N}\), whereas it needs 73 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).
Table 1
Numerical results for Example 4.1: Algorithm 4.2 with \(x_{1}=(88,2000,500)\), \(r_{n}=0.5\) for all \(n\in\mathbb{N}\), and tolerance \(\varepsilon=10^{-12}\)

Setting | Iterations | Approximate solution
\(\beta_{n}=0.1\) | 73 | (1.00000000000004, 2.00000000000091, 3.00000000000023)
\(\beta_{n}=1\) | 18 | (1.00000000000002, 2.00000000000044, 3.00000000000011)
\(\beta_{n}=10\) | 10 | (1.00000000000000, 2.00000000000002, 3.00000000000000)
\(\beta_{n}=20\) | 8 | (1.00000000000003, 2.00000000000074, 3.00000000000018)
\(\beta_{n}=30\) | 8 | (1.00000000000000, 2.00000000000004, 3.00000000000001)
\(\beta_{n}=40\) | 8 | (1.00000000000000, 2.00000000000001, 3.00000000000000)
\(\beta_{n}=50\) | 7 | (1.00000000000002, 2.00000000000049, 3.00000000000012)
\(\beta_{n}=100\) | 7 | (1.00000000000000, 2.00000000000001, 3.00000000000030)
\(\beta_{n}=500\) | 6 | (1, 2, 3)
From Table 2, we see that Algorithm 4.2 reaches the required accuracy in only seven iterations when \(\beta_{n}=100\) for all \(n\in\mathbb{N}\), whereas it needs 72 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).
Table 2
Numerical results for Example 4.1: Algorithm 4.2 with \(x_{1}=(123,456,789)\), \(r_{n}=0.5\) for all \(n\in\mathbb{N}\), and tolerance \(\varepsilon=10^{-12}\)

Setting | Iterations | Approximate solution
\(\beta_{n}=0.1\) | 72 | (1.00000000000009, 2.00000000000034, 3.00000000000059)
\(\beta_{n}=1\) | 18 | (1.00000000000003, 2.00000000000010, 3.00000000000017)
\(\beta_{n}=10\) | 9 | (1.00000000000007, 2.00000000000027, 3.00000000000047)
\(\beta_{n}=20\) | 8 | (1.00000000000005, 2.00000000000017, 3.00000000000029)
\(\beta_{n}=30\) | 8 | (1.00000000000000, 2.00000000000001, 3.00000000000002)
\(\beta_{n}=40\) | 7 | (1.00000000000011, 2.00000000000042, 3.00000000000073)
\(\beta_{n}=50\) | 7 | (1.00000000000003, 2.00000000000011, 3.00000000000019)
\(\beta_{n}=100\) | 7 | (1, 2, 3)
From Table 3, we see that Algorithm 3.1 reaches the required accuracy in only seven iterations when \(\beta_{n}=700\) for all \(n\in\mathbb{N}\), whereas it needs 99 iterations when \(\beta_{n}=0.1\) for all \(n\in\mathbb{N}\).
Table 3
Numerical results for Example 4.2: Algorithm 3.1 with \(x_{1}=(123,456,789)\), \(r_{n}=0.05\) for all \(n\in\mathbb{N}\), and tolerance \(\varepsilon=10^{-12}\)

Setting | Iterations | Approximate solution
\(\beta_{n}=0.1\) | 99 | (0.99999999999927, 1.99999999999990, 3.00000000000052)
\(\beta_{n}=1\) | 39 | (1.00000000000036, 2.00000000000048, 3.00000000000059)
\(\beta_{n}=10\) | 15 | (1.00000000000018, 2.00000000000024, 3.00000000000030)
\(\beta_{n}=20\) | 12 | (0.99999999999973, 1.99999999999964, 2.99999999999955)
\(\beta_{n}=30\) | 11 | (1.00000000000013, 2.00000000000017, 3.00000000000021)
\(\beta_{n}=40\) | 10 | (0.99999999999963, 1.99999999999952, 2.99999999999940)
\(\beta_{n}=50\) | 10 | (0.99999999999995, 1.99999999999993, 2.99999999999992)
\(\beta_{n}=100\) | 9 | (1.00000000000001, 2.00000000000002, 3.00000000000002)
\(\beta_{n}=700\) | 7 | (1, 2, 3)
From Table 3 and Table 4, we see that Algorithm 3.1 needs 39 iterations to reach the required accuracy when \(\beta_{n}=1\) and \(r_{n}=0.05\) for all \(n\in\mathbb{N}\), but 283 iterations when \(\beta_{n}=1\) and \(r_{n}=0.09\) for all \(n\in\mathbb{N}\). On the other hand, for the other settings of \(\beta_{n}\), the numerical results in Table 3 and Table 4 show no significant differences between the two settings of \(\{r_{n}\}_{n\in\mathbb{N}}\).
Table 4
Numerical results for Example 4.2: Algorithm 3.1 with \(x_{1}=(123,456,789)\), \(r_{n}=0.09\) for all \(n\in\mathbb{N}\), and tolerance \(\varepsilon=10^{-12}\)

Setting | Iterations | Approximate solution
\(\beta_{n}=0.1\) | 98 | (0.99999999999931, 1.99999999999990, 3.00000000000050)
\(\beta_{n}=1\) | 283 | (1.00000000000038, 2.00000000000051, 3.00000000000063)
\(\beta_{n}=10\) | 21 | (1.00000000000008, 2.00000000000010, 3.00000000000013)
\(\beta_{n}=20\) | 15 | (1.00000000000042, 2.00000000000056, 3.00000000000069)
\(\beta_{n}=30\) | 14 | (0.99999999999997, 1.99999999999996, 2.99999999999995)
\(\beta_{n}=40\) | 12 | (0.99999999999959, 1.99999999999946, 2.99999999999933)
\(\beta_{n}=50\) | 12 | (0.99999999999996, 1.99999999999995, 2.99999999999994)
\(\beta_{n}=100\) | 10 | (0.99999999999994, 1.99999999999992, 2.99999999999990)
\(\beta_{n}=1300\) | 7 | (1, 2, 3)
However, Table 5 shows that, when \(\beta_{n}=1\) for all \(n\in\mathbb{N}\), the numerical results differ considerably for different settings of \(\{r_{n}\}_{n\in\mathbb{N}}\).
Table 5
Numerical results for Example 4.2: Algorithm 3.1 with \(x_{1}=(123,456,789)\), \(\beta_{n}=1\) for all \(n\in\mathbb{N}\), and tolerance \(\varepsilon=10^{-12}\)

Setting | Iterations | Approximate solution
\(r_{n}=0.05\) | 39 | (1.00000000000036, 2.00000000000048, 3.00000000000059)
\(r_{n}=0.09\) | 283 | (1.00000000000038, 2.00000000000051, 3.00000000000063)
\(r_{n}=0.095\) | 611 | (1.00000000000041, 2.00000000000054, 3.00000000000067)
\(r_{n}=0.099\) | 5129 | (1.00000000000043, 2.00000000000056, 3.00000000000070)
\(r_{n}=0.0994\) | 18,434 | (0.99999999999957, 1.99999999999943, 2.99999999999930)

Acknowledgements

The authors would like to thank the referees for many useful comments and suggestions that helped improve the article. Prof. Chi-Ming Chen was supported by Grant No. MOST 107-2115-M-007-008 of the Ministry of Science and Technology of the Republic of China.

Competing interests

The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1.
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. Springer, Berlin (2011)
2.
Butnariu, D., Iusem, A.N.: Totally Convex Functions for Fixed Points Computation and Infinite Dimensional Optimization. Kluwer Academic, Norwell (2000)
3.
Byrne, C., Censor, Y., Gibali, A., Reich, S.: Weak and strong convergence of algorithms for the split common null point problem. J. Nonlinear Convex Anal. 13, 759–775 (2011)
4.
Chuang, C.S.: Strong convergence theorems for the split variational inclusion problem in Hilbert spaces. Fixed Point Theory Appl. 2013, 350 (2013)
5.
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
6.
Fujikara, Y., Kuroiwa, D.: Lagrange duality in canonical DC programming. J. Math. Anal. Appl. 408, 476–483 (2013)
7.
8.
Hiriart-Urruty, J.B., Tuy, H.: Essays on Nonconvex Optimization. Mathematical Programming, vol. 41. North-Holland, Amsterdam (1988)
9.
Levy, A.J.: A fast quadratic programming algorithm for positive signal restoration. IEEE Trans. Acoust. Speech Signal Process. 31, 1337–1341 (1983)
10.
Marino, G., Xu, H.K.: Convergence of generalized proximal point algorithm. Commun. Pure Appl. Anal. 3, 791–808 (2004)
11.
Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Rev. Fr. Inform. Rech. Opér. 4(Ser. R-3), 154–158 (1970)
12.
Moudafi, A.: Split monotone variational inclusions. J. Optim. Theory Appl. 150, 275–283 (2011)
13.
Pham, D.T., An, L.T.H., Akoa, F.: The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133, 23–46 (2005)
14.
Saeki, Y., Kuroiwa, D.: Optimality conditions for DC programming problems with reverse convex constraints. Nonlinear Anal. 80, 18–27 (2013)
15.
Souza, J.C.O., Oliveira, R.P., Soubeyran, A.: Global convergence of a proximal linearized algorithm for difference of convex functions. Optim. Lett. 10, 1529–1539 (2016)
16.
Sun, W., Sampaio, R.J.B., Candido, M.A.B.: Proximal point algorithm for minimization of DC functions. J. Comput. Math. 21, 451–462 (2003)
17.
Takahashi, W.: Introduction to Nonlinear and Convex Analysis. Yokohama Publishers, Yokohama (2009)