4.1 Rank-one approximation
It is well known that finding a best rank-one approximation of a real tensor is equivalent to finding the largest singular value [29] of the tensor; see e.g., [28]. For a real symmetric tensor, a best rank-one approximation can be attained at a symmetric rank-one tensor [3, 11], and finding it is equivalent to finding the largest eigenvalue [37] of the tensor or to solving a spherically constrained polynomial optimization problem [17, 46]. In the complex field, Sorber et al. [42] proposed line search and plane search methods for tensor optimization, which include best rank-one approximation as a special case. Ni et al. [33] studied best symmetric rank-one approximations of symmetric complex tensors. Along this line, PS and CPS tensors possess similar properties. Let us first introduce eigenvalues of these tensors.
All the C-eigenvalues of CPS tensors are real [24]. The C-eigenvalue of PS tensors has not been defined, and we can simply adopt Definition 4.1 as the definition of C-eigenvalue and C-eigenvector for PS tensors. In this paper, as long as there is no ambiguity, we refer to a C-eigenvalue and a C-eigenvector as the eigenvalue and the eigenvector of a PS tensor (including CPS tensors), respectively, and refer to \((\lambda ,\varvec{x})\) in Definition 4.1 as an eigenpair. We also need to clarify a couple of terms. For a CPS tensor
\({\mathcal {T}}\), if
\(\mathrm{rank}\,({\mathcal {T}})=1\), then
\(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=1\), and so the term
rank-one CPS tensor has no ambiguity. However, for a PS tensor
\({\mathcal {T}}\),
\(\mathrm{rank}\,({\mathcal {T}})=1\) does not imply
\(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=1\). Here, the term
rank-one PS tensor stands for a PS tensor
\({\mathcal {T}}\) with
\(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=1\).
Since CPS tensors are PS tensors, for a best rank-one PS tensor
\(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) approximation in (
14),
\(\lambda \) must be an eigenvalue. As all the eigenvalues of CPS tensors are real,
\(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) becomes a best rank-one CPS tensor approximation. Therefore, Theorem
4.2 immediately implies a similar result for CPS tensors.
For CPS tensors, one may consider different rank-one approximation problems. The following result is interesting: it echoes the inequivalence of ranks discussed earlier in Theorem 3.5.
In fact, the equality in (
17) is a consequence of Theorem
4.2 and Corollary
4.3. The inequality in (
17) is obvious since
\({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\subseteq {\mathbb {C}}^{n^{2d}}\) and
\(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {X}})=1\) implies that
\(\mathrm{rank}\,({\mathcal {X}})=1\), and its strictness can be validated by the following example.
We remark that the equivalence between best rank-one CPS tensor approximations and best rank-one complex tensor approximations actually holds for Hermitian matrices (
\(d=1\)), i.e.,
$$\begin{aligned} \min _{\mathrm{rank}\,_\mathrm{CPS}(X)=1,\,X\in {\mathbb {C}}_{\mathrm{cps}}^{n^2}}\Vert A-X\Vert =\min _{\mathrm{rank}\,_\mathrm{PS}(X)=1,\,X\in {\mathbb {C}}_{\mathrm{ps}}^{n^2}}\Vert A-X\Vert =\min _{\mathrm{rank}\,(X)=1,\,X\in {\mathbb {C}}^{n^2}}\Vert A-X\Vert \end{aligned}$$
when
\(A\in {\mathbb {C}}^{n^2}\) is Hermitian. This is because of the trivial fact
\({\mathbb {C}}^{n^2}={\mathbb {C}}_{\mathrm{ps}}^{n^2}\) and the equality in (
17).
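The \(d=1\) case can be checked numerically. The sketch below (a numpy illustration added here, not from the original text; the variable names are ours) compares the best rank-one approximation error over all complex matrices, obtained by truncating the SVD (Eckart–Young), with the error of the Hermitian rank-one approximation \(\lambda \varvec{x}\varvec{x}^{\mathrm{H}}\) built from the eigenvalue of largest magnitude:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random Hermitian matrix A (the d = 1 case of a CPS tensor).
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

# Best rank-one approximation over all complex matrices (Eckart-Young):
# keep only the leading term of the SVD.
U, s, Vh = np.linalg.svd(A)
err_generic = np.linalg.norm(A - s[0] * np.outer(U[:, 0], Vh[0]))

# Best rank-one CPS (Hermitian) approximation: lambda * x x^H with
# (lambda, x) the eigenpair of largest |lambda|.
w, V = np.linalg.eigh(A)
k = np.argmax(np.abs(w))
err_cps = np.linalg.norm(A - w[k] * np.outer(V[:, k], V[:, k].conj()))

# The two minima coincide, as claimed for Hermitian matrices.
assert np.isclose(err_generic, err_cps)
```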
4.2 Rank-one equivalence via matricization
Matricization, also known as matrix flattening or matrix unfolding, is a widely used tool to study high-order tensors. When a tensor is rank-one, any matricization of the tensor is obviously rank-one, while the converse is not true in general. The rank-one equivalence via general matricization has been studied for real tensors [
43]. For an even-order symmetric tensor
\({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{s}}^{n^{2d}}\), it is known that if its square matricization (unfolding
\({\mathcal {T}}\) as an
\(n^d\times n^d\) matrix) is rank-one, then the original tensor
\({\mathcal {T}}\) must be rank-one; see e.g., [
25,
35]. In the real field, this rank-one equivalence suggests some convex optimization methods to compute the largest eigenvalue or best rank-one approximations of a symmetric tensor. In practice, these methods are very likely to find globally optimal solutions [
25,
35]. Inspired by these results, let us look into the rank-one equivalence for CPS tensors.
For a CPS tensor, one hopes that its square matricization being rank-one implies the original tensor being rank-one. Unfortunately, this may not hold if the CPS tensor is not unfolded in the right way. The following example shows that the standard square matricization of a non-rank-one CPS tensor turns out to be a rank-one Hermitian matrix.
The role of symmetry is important in nonlinear elasticity and materials science [
20,
38]. The following example of elasticity tensors points out a structural difference for the rank-one equivalence via matricization.
We notice that square matricization is unique for symmetric tensors, but not for CPS tensors. Examples
4.6 and
4.7 motivate us to consider other ways of matricization, with a hope to establish certain rank-one equivalence. To this end, it is necessary to introduce tensor transpose, extending the concept of matrix transpose.
In a plain language, mode 1 of \({\mathcal {T}}^\pi \) originates from mode \(\pi _1\) of \({\mathcal {T}}\), mode 2 of \({\mathcal {T}}^\pi \) originates from mode \(\pi _2\) of \({\mathcal {T}}\), and so on. As a matter of fact, for a matrix \(A\in {\mathbb {C}}^{n^2}\) and \(\pi =(2,1)\), \(A^\pi =A^{\mathrm{T}}\). For a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\) and \(\pi =(d+1,\ldots ,2d,1,\ldots ,d)\), \({\mathcal {T}}^{\mathrm{H}}=\overline{{\mathcal {T}}^\pi }\).
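In numpy terms, the \(\pi \)-transpose is simply an axis permutation (shifted to 0-based indexing). The sketch below (ours, not from the original text) verifies the two facts just stated, using a rank-one CPS tensor \(\varvec{x}^{\otimes 2}\otimes \overline{\varvec{x}}^{\otimes 2}\) with \(d=2\) for illustration:

```python
import numpy as np

def pi_transpose(T, pi):
    # Mode k of T^pi originates from mode pi[k] of T (pi is 1-based in
    # the text; numpy's transpose takes 0-based axes).
    return np.transpose(T, [p - 1 for p in pi])

rng = np.random.default_rng(1)

# For a matrix and pi = (2, 1), the pi-transpose is the usual transpose.
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
assert np.allclose(pi_transpose(A, (2, 1)), A.T)

# For a CPS tensor (here rank-one: x (x) x (x) conj(x) (x) conj(x), d = 2)
# and pi = (d+1, ..., 2d, 1, ..., d), conjugating T^pi recovers T,
# i.e. T^H = conj(T^pi) = T.
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
T = np.einsum('i,j,k,l->ijkl', x, x, x.conj(), x.conj())
assert np.allclose(np.conj(pi_transpose(T, (3, 4, 1, 2))), T)
```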
Given any integers
\(1\le i_1,\ldots ,i_d\le n\), let us denote
$$\begin{aligned} n(i_1\ldots i_d):=\sum _{k = 1}^{d}(i_k-1)n^{d-k} + 1 \end{aligned}$$
to be the decimal of the tuple
\(i_1\ldots i_d\) in the base-
n numeral system. We now discuss matricization and vectorization.
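The decimal index \(n(i_1\ldots i_d)\) above is the 1-based, row-major (base-\(n\)) linearization of the tuple; a minimal sketch (ours):

```python
import numpy as np

def dec_index(idx, n):
    # n(i_1 ... i_d) = sum_k (i_k - 1) n^(d-k) + 1: the 1-based decimal
    # of the tuple i_1 ... i_d in the base-n numeral system.
    d = len(idx)
    return sum((i - 1) * n ** (d - 1 - k) for k, i in enumerate(idx)) + 1

# (1,...,1) maps to 1 and the last tuple (n,...,n) maps to n^d.
assert dec_index((1, 1, 1), 3) == 1
assert dec_index((3, 3, 3), 3) == 27

# It is the 1-based counterpart of numpy's row-major ravel_multi_index.
assert dec_index((2, 3), 3) == np.ravel_multi_index((1, 2), (3, 3)) + 1
```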
Obviously the standard square matricization is a \(\pi \)-matricization when \(\pi =(1,2,\ldots ,2d)\). Vectorization, \(\pi \)-matricization and \(\pi \)-transpose are all one-to-one. They are different ways of representing tensor data. The following property on ranks is straightforward.
As mentioned earlier, the \(\pi \)-matricization of a symmetric tensor is unique for any permutation \(\pi \), since \({\mathcal {T}}^\pi ={\mathcal {T}}\) if \({\mathcal {T}}\) is symmetric. However, CPS tensors only possess partial symmetry together with a certain conjugate property. Therefore, conditions on \(\pi \) are necessary to guarantee the rank-one equivalence, as well as for the \(\pi \)-matricization to be Hermitian.
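As an illustration of a suitable unfolding (the function name is ours, and we use the permutation \(\pi =(1,3,4,2)\), which reappears later in (23)), the sketch below forms a \(\pi \)-matricization by a \(\pi \)-transpose followed by a row-major reshape, and checks that a rank-one CPS tensor with \(d=2\) unfolds to a rank-one Hermitian matrix:

```python
import numpy as np

def pi_matricize(T, pi):
    # M_pi(T): pi-transpose (1-based pi, 0-based axes), then unfold the
    # first d modes into rows and the last d into columns (row-major,
    # matching the base-n index n(i_1...i_d)).
    n, d = T.shape[0], T.ndim // 2
    return np.transpose(T, [p - 1 for p in pi]).reshape(n ** d, n ** d)

rng = np.random.default_rng(2)
n = 3
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Rank-one CPS tensor x (x) x (x) conj(x) (x) conj(x) with d = 2.
T = np.einsum('i,j,k,l->ijkl', x, x, x.conj(), x.conj())

# Unfolded with pi = (1,3,4,2), it becomes a rank-one Hermitian matrix.
X = pi_matricize(T, (1, 3, 4, 2))
assert np.linalg.matrix_rank(X) == 1
assert np.allclose(X, X.conj().T)
```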
We remark that the condition on
\(\pi \) in (
19) is in fact necessary for
\(M_\pi ({\mathcal {T}})\) to be a Hermitian matrix for a general CPS tensor
\({\mathcal {T}}\). Essentially, if mode
k of
\({\mathcal {T}}^\pi \) originates from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\), then mode
\(d+k\) of
\({\mathcal {T}}^\pi \) must originate from modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\), and vice versa.
Proof
Let
\(M_\pi ({\mathcal {T}})=\varvec{x}\otimes \varvec{y}\) where
\(\varvec{x}\) and
\(\varvec{y}\) are
\(n^d\)-dimensional vectors, and further let
\(\varvec{x}=\varvec{v}({\mathcal {X}})\) and
\(\varvec{y}=\varvec{v}({\mathcal {Y}})\) where
\({\mathcal {X}},{\mathcal {Y}}\in {\mathbb {C}}^{n^d}\). Since
$$\begin{aligned} M({\mathcal {T}}^\pi )=M_\pi ({\mathcal {T}})=\varvec{x}\otimes \varvec{y}=\varvec{v}({\mathcal {X}})\otimes \varvec{v}({\mathcal {Y}}), \end{aligned}$$
one has
\({\mathcal {T}}^\pi ={\mathcal {X}}\otimes {\mathcal {Y}}\). To prove
\(\mathrm{rank}\,({\mathcal {T}})=1\), it suffices to show that
\(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\).
Modes
\(1,\ldots ,d\) of
\({\mathcal {T}}^\pi \) originate from modes
\(\pi _1,\ldots ,\pi _d\) of
\({\mathcal {T}}\), respectively. By (
20), almost half of them (either
\(\left\lfloor d/2\right\rfloor \) or
\(\left\lceil d/2 \right\rceil \)) come from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) and the remaining come from modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\). This is also true for the modes of
\({\mathcal {X}}\). To provide a clearer presentation, we may construct a permutation
\(\tau \) such that the first half of the modes of
\({\mathcal {X}}^\tau \) are from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) and the remaining half are from modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\). Explicitly,
\(\tau \in \Pi (1,\ldots ,d)\) needs to satisfy
$$\begin{aligned} \left\{ \tau _1,\ldots ,\tau _{\left| \{\pi _1,\ldots ,\pi _{d}\} \cap \{1,\ldots ,d\}\right| }\right\} =\left\{ 1\le k\le d: 1\le \pi _k\le d \right\} . \end{aligned}$$
(21)
As
\({\mathcal {T}}\) is symmetric to modes
\(\{1,\ldots ,d\}\) and to modes
\(\{d+1,\ldots ,2d\}\), respectively, the order of
\(\tau _k\)’s in (
21) does not matter. Observing that
\(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {X}}^\tau )\) and
\(M({\mathcal {X}}^\tau \otimes {\mathcal {Y}})\) is rank-one, we may without loss of generality assume that the first
\(\left\lceil d/2 \right\rceil \) modes of
\({\mathcal {X}}\) are from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) and the remaining
\(\left\lfloor d/2 \right\rfloor \) from modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\), and for the same reason assume that the first
\(\left\lfloor d/2 \right\rfloor \) modes of
\({\mathcal {Y}}\) are from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) and the remaining
\(\left\lceil d/2 \right\rceil \) from modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\). In a nutshell, we can assume without loss of generality that
$$\begin{aligned} \pi =\left( 1,\ldots , \left\lceil \frac{d}{2}\right\rceil , d+1, \ldots , d+ \left\lfloor \frac{d}{2}\right\rfloor , \left\lceil \frac{d}{2}\right\rceil +1, \ldots , d, d +\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots , 2d \right) . \end{aligned}$$
This implies that
\({\mathcal {X}}\otimes {\mathcal {Y}}={\mathcal {T}}^\pi \) is symmetric to modes
$$\begin{aligned} {\mathbb {I}}_d^1:=\left\{ 1,\ldots ,\left\lceil \frac{d}{2}\right\rceil ,d+1, \ldots ,d+\left\lfloor \frac{d}{2}\right\rfloor \right\} \end{aligned}$$
and symmetric to modes
$$\begin{aligned} {\mathbb {I}}_d^2:=\left\{ \left\lceil \frac{d}{2} \right\rceil +1,\ldots ,d,d +\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots ,2d\right\} . \end{aligned}$$
We proceed to prove that under the condition of
\({\mathcal {X}}\otimes {\mathcal {Y}}\) being symmetric to modes
\({\mathbb {I}}_d^1\) and symmetric to modes
\({\mathbb {I}}_d^2\),
\(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\) holds, by induction on
d.
When
\(d=1\), both
\({\mathcal {X}}\) and
\({\mathcal {Y}}\) are obviously rank-one as they are vectors. Suppose that the claim holds for
\(d-1\). For general
d, as
\({\mathcal {X}}\otimes {\mathcal {Y}}\) is symmetric to modes
\({\mathbb {I}}_d^1\) and symmetric to modes
\({\mathbb {I}}_d^2\), we can swap all but the first mode of
\({\mathcal {X}}\) with some modes of
\({\mathcal {Y}}\). In particular, modes
\(\left\{ 2,\ldots , \left\lceil \frac{d}{2}\right\rceil \right\} \) of
\({\mathcal {X}}\) are swapped with modes
\(\left\{ 1,\ldots , \left\lceil \frac{d}{2}\right\rceil -1\right\} \) of
\({\mathcal {Y}}\), respectively, and modes
\(\left\{ \left\lceil \frac{d}{2}\right\rceil +1, \ldots ,d\right\} \) of
\({\mathcal {X}}\) are swapped with modes
\(\left\{ \left\lceil \frac{d}{2}\right\rceil +1, \ldots ,d\right\} \) of
\({\mathcal {Y}}\), respectively. Consequently, one has for any
\(1 \le i_1, \ldots , i_{2d}\le n\),
$$\begin{aligned} {\mathcal {X}}_{i_1\ldots i_d}{\mathcal {Y}}_{i_{d+1}\ldots i_{2d}} ={\mathcal {X}}_{ i_1 i_{d+1} \ldots i_{d+\lceil d/2\rceil -1} i_{d+\lceil d/2\rceil +1} \ldots i_{2d} } {\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } i_{d+\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }. \end{aligned}$$
(22)
Pick any nonzero entry of
\({\mathcal {Y}}\), say
\({\mathcal {Y}}_{k_1\ldots k_d}\ne 0\). Let
\((i_{d+1}, \ldots ,i_{2d})=(k_1,\ldots ,k_d)\) in (
22) and we have
$$\begin{aligned} {\mathcal {X}}_{i_1\ldots i_d}{\mathcal {Y}}_{k_1\ldots k_d} ={\mathcal {X}}_{ i_1 k_1 \ldots k_{\lceil d/2\rceil -1} k_{\lceil d/2\rceil +1} \ldots k_{d} } {\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } k_{\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }. \end{aligned}$$
By defining
\(\varvec{a}\in {\mathbb {C}}^n\) and
\({\mathcal {U}}\in {\mathbb {C}}^{n^{d-1}}\) where
$$\begin{aligned} a_{i_1}:=\frac{{\mathcal {X}}_{ i_1 k_1 \ldots k_{\lceil d/2\rceil -1} k_{\lceil d/2\rceil +1} \ldots k_{d} }}{{\mathcal {Y}}_{k_1\ldots k_d}} \text { and } {\mathcal {U}}_{i_2\ldots i_d}:={\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } k_{\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }, \end{aligned}$$
we obtain that
\({\mathcal {X}}=\varvec{a}\otimes {\mathcal {U}}\). Similarly to (
22), one may swap all but the last mode of
\({\mathcal {Y}}\) to some modes of
\({\mathcal {X}}\) and obtain
\({\mathcal {Y}}={\mathcal {V}}\otimes \varvec{b}\) where
\({\mathcal {V}}\in {\mathbb {C}}^{n^{d-1}}\) and
\(\varvec{b}\in {\mathbb {C}}^n\). Since
$$\begin{aligned} {\mathcal {X}}\otimes {\mathcal {Y}}= \varvec{a}\otimes {\mathcal {U}}\otimes {\mathcal {V}}\otimes \varvec{b}\end{aligned}$$
is symmetric with respect to modes
\({\mathbb {I}}^1_d\) and symmetric with respect to modes
\({\mathbb {I}}^2_d\),
\({\mathcal {U}}\otimes {\mathcal {V}}\) is symmetric to modes
\(\left\{ 1,\ldots ,\lceil \frac{d}{2}\rceil -1,d,\ldots ,d +\lfloor \frac{d}{2}\rfloor -1\right\} \) and symmetric to modes
\(\left\{ \lceil \frac{d}{2}\rceil ,\ldots ,d-1,d+\lfloor \frac{d}{2}\rfloor ,\ldots ,2d-2\right\} \). As a result,
\({\mathcal {V}}\otimes {\mathcal {U}}\) is symmetric with respect to modes
\(\left\{ 1,\ldots ,\lfloor \frac{d}{2} \rfloor ,d,\ldots ,d+\lceil \frac{d}{2}\rceil -2\right\} \) and symmetric with respect to modes
\(\left\{ \lfloor \frac{d}{2}\rfloor +1,\ldots ,d-1,d +\lceil \frac{d}{2}\rceil -1,\ldots ,2d-2\right\} \).
If d is odd, by noticing that \(\lceil \frac{d}{2}\rceil -1=\lceil \frac{d-1}{2}\rceil \) and \(\lfloor \frac{d}{2}\rfloor =\lfloor \frac{d-1}{2}\rfloor \), we have that \({\mathcal {U}}\otimes {\mathcal {V}}\) is symmetric to modes \({\mathbb {I}}^1_{d-1}\) and symmetric to modes \({\mathbb {I}}^2_{d-1}\). By induction, we obtain \(\mathrm{rank}\,({\mathcal {U}})=\mathrm{rank}\,({\mathcal {V}})=1\), proving that \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\).
If d is even, again by noticing that \(\lfloor \frac{d}{2}\rfloor =\lceil \frac{d-1}{2}\rceil \) and \(\lceil \frac{d}{2}\rceil =\lfloor \frac{d-1}{2}\rfloor +1\), we have that \({\mathcal {V}}\otimes {\mathcal {U}}\) is symmetric to the modes \({\mathbb {I}}^1_{d-1}\) and symmetric to the modes \({\mathbb {I}}^2_{d-1}\). By induction, we obtain \(\mathrm{rank}\,({\mathcal {V}})=\mathrm{rank}\,({\mathcal {U}})=1\), and so \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\). \(\square \)
In fact, the condition on
\(\pi \) in (
20) is also a necessary condition for the rank-one equivalence in Theorem
4.13 for a general CPS tensor. The proof, or the construction of a counterexample, involves heavy notation, and we leave it to interested readers. The key step leading to Theorem
4.13 is the identity (
22), which is a consequence of mode swapping enabled by partial symmetry. This is doable because the modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) are (almost) equally allocated to the modes of
\({\mathcal {X}}\) (the first half of the modes of
\({\mathcal {T}}^\pi \)) and to modes of
\({\mathcal {Y}}\) (the last half of the modes of
\({\mathcal {T}}^\pi \)), and the same holds for modes
\(\{d+1,\ldots ,2d\}\) of
\({\mathcal {T}}\). If the number of modes of
\({\mathcal {X}}\) that originate from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) differs from the number of modes of
\({\mathcal {Y}}\) that originate from modes
\(\{1,\ldots ,d\}\) of
\({\mathcal {T}}\) by more than one (such as Example
4.6 with
\(\pi =(1,2,3,4)\)), then (
22) cannot be obtained. This makes some modes binding, i.e., not separable into the outer product of a vector and a lower-order tensor.
Combining Propositions
4.11 and
4.12, Theorem
4.13, and the discussion regarding necessity, we arrive at the following result.
In practice, such as the discussion in modelling (Section
4.3) and the numerical experiments (Section
5), we may focus on a particular permutation satisfying (
19) and (
20). The most straightforward one is
$$\begin{aligned} \pi =\left( 1,\ldots ,\left\lceil \frac{d}{2}\right\rceil , d+1,\ldots ,d+\left\lfloor \frac{d}{2}\right\rfloor , d+\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots ,2d, \left\lceil \frac{d}{2}\right\rceil +1,\ldots ,d \right) . \end{aligned}$$
(23)
In particular,
\(\pi =(1,3,4,2)\) for fourth-order tensors (
\(d=2\)),
\(\pi =(1,2,4,5,6,3)\) for sixth-order tensors (
\(d=3\)), and
\(\pi =(1,2,5,6,7,8,3,4)\) for eighth-order tensors (
\(d=4\)).
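A short sketch (ours) generating the permutation in (23) and checking it against the three instances listed above:

```python
from math import ceil, floor

def pi_23(d):
    # The permutation in (23), written 1-based as in the text:
    # (1..ceil(d/2), d+1..d+floor(d/2), d+floor(d/2)+1..2d, ceil(d/2)+1..d).
    h, f = ceil(d / 2), floor(d / 2)
    return (list(range(1, h + 1))
            + list(range(d + 1, d + f + 1))
            + list(range(d + f + 1, 2 * d + 1))
            + list(range(h + 1, d + 1)))

assert pi_23(2) == [1, 3, 4, 2]              # fourth-order tensors
assert pi_23(3) == [1, 2, 4, 5, 6, 3]        # sixth-order tensors
assert pi_23(4) == [1, 2, 5, 6, 7, 8, 3, 4]  # eighth-order tensors
```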
Before concluding this part, let us look at elasticity tensors (Example
4.7) again with Theorem
4.14 at hand. Since every elasticity tensor is CPS, Theorem
4.14 leads to the following interesting result from the rank-one equivalence.
4.3 Computing best rank-one approximations
As an immediate application of the rank-one matricization equivalence, we now discuss how it can be used to compute best rank-one approximations of CPS tensors. Specifically, we consider the problem
$$\begin{aligned} \min _{\Vert \varvec{x}\Vert =1,\lambda \in {\mathbb {R}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d} \otimes \overline{\varvec{x}}^{\otimes d}\Vert . \end{aligned}$$
(24)
As mentioned in Corollary
4.3,
$$\begin{aligned} \min _{\Vert \varvec{x}\Vert =1,\lambda \in {\mathbb {R}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d} \otimes \overline{\varvec{x}}^{\otimes d} \Vert \Longleftrightarrow \max _{ {\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}, \,\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {R}}}|\lambda | \Longleftrightarrow \max _{\Vert \varvec{x}\Vert =1}|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|. \end{aligned}$$
Since
\({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)\) is a real-valued function, the maximum of
\(|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|\) is attained by maximizing either
\({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)\) or
\((-{\mathcal {T}})(\overline{\varvec{x}}^d\varvec{x}^d)\) for a given
\({\mathcal {T}}\). Therefore, the above problem is essentially
$$\begin{aligned} \max _{\Vert \varvec{x}\Vert =1}{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d), \end{aligned}$$
(25)
i.e., finding the largest eigenvalue of the CPS tensor
\({\mathcal {T}}\).
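For \(d=1\), problem (25) is the classical Rayleigh-quotient maximization \(\max _{\Vert \varvec{x}\Vert =1}\varvec{x}^{\mathrm{H}}A\varvec{x}=\lambda _{\max }(A)\); a quick numerical check (ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2   # Hermitian: the d = 1 case of a CPS tensor

# (25) for d = 1: the real-valued form x^H A x over the unit sphere is
# maximized at the top eigenvector, with value lambda_max.
w, V = np.linalg.eigh(A)
x = V[:, -1]
assert np.isclose((x.conj() @ A @ x).real, w[-1])

# Random unit vectors never exceed lambda_max.
for _ in range(100):
    y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    y /= np.linalg.norm(y)
    assert (y.conj() @ A @ y).real <= w[-1] + 1e-9
```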
The model (
25) is NP-hard when the order of
\({\mathcal {T}}\) is larger than two, even in the real field [
16,
18]. Let us reformulate the tensor-based optimization model as a matrix optimization model.
We remark that both
\(\mathrm{tr}\,(X)=1\) and
\(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\) in the model (
26) are linear equality constraints. In particular,
\(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\) contains
\(O(n^d)\) equalities, which are the requirements of partial symmetry and conjugate property for CPS tensors. As an example, when
\(n=d=2\) and
\(\pi =(1,3,4,2)\) as in (
23),
\(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{2^{4}})\) can be explicitly written as
$$\begin{aligned} X_{14}=X_{22}= X_{33}=X_{41},\, X_{12}=X_{31}, \, X_{24}=X_{43}, \text { and } X^{\mathrm{H}}=X. \end{aligned}$$
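These constraints can be verified numerically. The sketch below (ours) builds a generic CPS tensor with \(n=d=2\) as a real combination of terms \(\varvec{x}^{\otimes 2}\otimes \overline{\varvec{x}}^{\otimes 2}\), unfolds it with \(\pi =(1,3,4,2)\), and checks the listed equalities (indices shifted to 0-based):

```python
import numpy as np

rng = np.random.default_rng(4)

# A (generally non-rank-one) CPS tensor for n = d = 2: a real combination
# of rank-one terms x (x) x (x) conj(x) (x) conj(x).
T = np.zeros((2, 2, 2, 2), dtype=complex)
for lam in rng.standard_normal(3):
    x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    T += lam * np.einsum('i,j,k,l->ijkl', x, x, x.conj(), x.conj())

# Unfold with pi = (1,3,4,2): pi-transpose (0-based axes), then reshape.
X = np.transpose(T, (0, 2, 3, 1)).reshape(4, 4)

# The linear constraints X in M_pi(C_cps^{2^4}) listed above.
assert np.isclose(X[0, 3], X[1, 1]) and np.isclose(X[1, 1], X[2, 2]) \
       and np.isclose(X[2, 2], X[3, 0])    # X_14 = X_22 = X_33 = X_41
assert np.isclose(X[0, 1], X[2, 0])        # X_12 = X_31
assert np.isclose(X[1, 3], X[3, 2])        # X_24 = X_43
assert np.allclose(X, X.conj().T)          # X^H = X
```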
In fact,
\(X^{\mathrm{H}}=X\) is included in the constraints
\(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\), but we leave it in (
26) to emphasize that the decision variable sits in the space of Hermitian matrices.
The problem (
26) remains hard because of the rank-one constraint. However, it opens the door to various matrix optimization tools, particularly from convex optimization. We now propose two convex relaxation methods. First, in the proof of Theorem
4.16, we observe that
X being a rank-one Hermitian matrix with
\(\mathrm{tr}\,(X)=1\) actually implies that
X is positive semidefinite. By dropping the rank-one constraint, (
26) is relaxed to a semidefinite program (SDP):
$$\begin{aligned} \max \left\{ \langle \overline{M_\pi ({\mathcal {T}})}, X \rangle : \mathrm{tr}\,(X)=1, \, X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}), \, X\succeq O \right\} , \end{aligned}$$
(28)
where
\(X\succeq O\) denotes that
X is Hermitian positive semidefinite. The convex optimization model (
28) can be easily solved by some SDP solvers in CVX [
14]. Alternatively, one may resort to first order methods such as the alternating direction method of multipliers (ADMM).
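For \(d=1\), the SDP (28) reduces to maximizing \(\mathrm{tr}\,(AX)\) over Hermitian \(X\succeq O\) with \(\mathrm{tr}\,(X)=1\), whose optimal value is \(\lambda _{\max }(A)\), attained at the rank-one matrix \(X=\varvec{x}\varvec{x}^{\mathrm{H}}\). A numerical sketch (ours, using this closed form rather than an SDP solver):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

# Closed-form SDP optimum for d = 1: X = x x^H with x the top eigenvector.
w, V = np.linalg.eigh(A)
x = V[:, -1]
X = np.outer(x, x.conj())

assert np.isclose(np.real(np.trace(A @ X)), w[-1])  # objective = lambda_max
assert np.linalg.matrix_rank(X) == 1                 # optimum is rank-one

# Any feasible point (a trace-one PSD matrix) does no better.
for _ in range(50):
    C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    P = C @ C.conj().T
    P /= np.trace(P).real
    assert np.real(np.trace(A @ P)) <= w[-1] + 1e-10
```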
The second relaxation method is to add a penalty of the nuclear norm of the decision matrix in the objective function [
25]. By dropping the rank-one constraint, this leads to the following convex optimization model
$$\begin{aligned} \max \left\{ \langle \overline{M_\pi ({\mathcal {T}})}, X \rangle -\rho \Vert X\Vert _*: \mathrm{tr}\,(X)=1, \, X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}), \, X^{\mathrm{H}}=X \right\} , \end{aligned}$$
(29)
where
\(\rho >0\) is a penalty parameter and
\(\Vert X\Vert _*\) denotes the nuclear norm of
X, a convex surrogate for
\(\mathrm{rank}\,(X)\). To see why (
29) is a convex relaxation of (
26), we notice that
\(\Vert X\Vert _*\) is a convex function, and so the objective of (
29) is concave. Moreover, for an optimal solution of (
26), say
X, the facts that X is rank-one and
\(\mathrm{tr}\,(X) = 1\) imply that
X is positive semidefinite. Thus,
\(\Vert X\Vert _* = \mathrm{tr}\,(X) = 1\), which implies that the term
\(-\rho \Vert X\Vert _*\) added to the objective function is actually a constant.
Our observations in several numerical examples show that the solutions obtained by the two convex relaxation models (
28) and (
29) are often rank-one (see Section
5). Once a rank-one solution
X is obtained, one may use
\(X=M_\pi (\overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d})\) to find a solution
\(\varvec{x}\) for (
25), as stipulated in the proof of Theorem
4.16. This
\(\varvec{x}\) provides a solution to the best rank-one approximation problem (
24). We remark that the convex relaxation methods can also be used to find good approximate solutions to the best rank-
r approximation problem, i.e.,
$$\begin{aligned} \min _{\Vert \varvec{x}_j\Vert =1,\lambda _j\in {\mathbb {R}},\,j=1,\ldots ,r} \left\| {\mathcal {T}}-\sum _{j=1}^r \lambda _j {\varvec{x}_j}^{\otimes d}\otimes \overline{\varvec{x}_j}^{\otimes d}\right\| , \end{aligned}$$
in a successive (greedy) manner.
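For \(d=1\), the successive (greedy) scheme amounts to repeatedly deflating the best Hermitian rank-one approximation of the residual, which reproduces the spectral decomposition; a sketch (ours):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

# Greedy rank-r approximation for d = 1: at each step, subtract the best
# rank-one Hermitian approximation lambda * x x^H of the current residual,
# where lambda is the residual's eigenvalue of largest magnitude.
R = A.copy()
errors = []
for _ in range(n):
    w, V = np.linalg.eigh(R)
    k = np.argmax(np.abs(w))
    R = R - w[k] * np.outer(V[:, k], V[:, k].conj())
    errors.append(np.linalg.norm(R))

# Errors decrease monotonically, and n greedy steps recover A exactly.
assert all(e2 <= e1 + 1e-10 for e1, e2 in zip(errors, errors[1:]))
assert errors[-1] < 1e-8
```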