
Open Access 01-12-2021

On decompositions and approximations of conjugate partial-symmetric tensors

Authors: Taoran Fu, Bo Jiang, Zhening Li

Published in: Calcolo | Issue 4/2021


Abstract

Hermitian matrices have played an important role in matrix theory and complex quadratic optimization. Their high-order generalization, conjugate partial-symmetric (CPS) tensors, has recently attracted growing interest in tensor theory and computation, particularly in application-driven complex polynomial optimization problems. In this paper, we study CPS tensors with a focus on ranks, computing rank-one decompositions and approximations, as well as their applications. We prove constructively that any CPS tensor can be decomposed into a sum of rank-one CPS tensors, which provides an explicit method to compute such rank-one decompositions. Three types of ranks for CPS tensors are defined and shown to be different in general. This leads to the invalidity of the conjugate version of Comon’s conjecture. We then study rank-one approximations and matricizations of CPS tensors. By carefully unfolding CPS tensors to Hermitian matrices, rank-one equivalence can be preserved. This enables us to develop new convex optimization models and algorithms to compute best rank-one approximations of CPS tensors. Numerical experiments on data sets from radar waveform design, elasticity tensors, and quantum entanglement are performed to demonstrate the capability of our methods.

1 Introduction

As the complex counterpart of real symmetric matrices, Hermitian matrices are often considered to play a more important role than complex symmetric matrices in practice. This is mainly due to the fact that any complex quadratic form generated by a Hermitian matrix always takes real values and all the eigenvalues of a Hermitian matrix are real. This is especially important in many applications, for instance, in quantum physics, where Hermitian matrices are operators that measure properties of a system, e.g., total spin, which has to be real, and in mathematical optimization, where objective functions need to be real-valued. Generalizing to high-order tensors, symmetric tensors, whether over the real or the complex field, have received enormous attention in the past decade. However, the high-order generalization of Hermitian matrices was not formally proposed in mathematics until recently, by Jiang et al. [24], who named them conjugate partial-symmetric (CPS) tensors. The concept of CPS tensors was later generalized to conjugate non-symmetric tensors, called Hermitian tensors, by Ni [32]. Nevertheless, various examples of CPS tensors can be extracted from real applications in the form of complex polynomial optimization problems. Aittomaki and Koivunen [1] considered beampattern optimization and formulated it as a complex multivariate quartic minimization model. Sorber et al. [41] applied unconstrained complex optimization to study the simulation of nonlinear circuits in the frequency domain. Aubry et al. [2] modeled a radar signal processing problem by optimizing a complex quartic polynomial which always takes real values. Josz [26] investigated applications of complex polynomial optimization to electricity transmission networks. Moreover, Madani et al. [30] studied power system state estimation via complex polynomial optimization.
There have been several discussions on high-order generalizations of Hermitian matrices, both earlier and more recently. The Hermitian tensor product, defined to be the Kronecker product of a Hermitian matrix, has been studied since the 1960s [27, 31]. It has many applications in quantum entanglement [9] and enjoys certain nice properties; for instance, the Kronecker product of a Hermitian matrix remains a Hermitian matrix. Fourth-order cumulant tensors in multivariate statistics were proposed and applied to blind source separation [6]. Such tensors are called quadricovariance tensors, which form a special class of fourth-order CPS tensors. In materials science [20, 38], an elasticity tensor is a three-dimensional fourth-order CPS tensor. Ni et al. [33] proposed the unitary eigenvalues and unitary symmetric eigenvalues for complex tensors and symmetric complex tensors, respectively, and demonstrated a relation to the geometric measure of quantum entanglement. Zhang et al. [44] studied the unitary eigenvalues of non-symmetric complex tensors. Jiang et al. [24] characterized real-valued complex polynomial functions and their symmetric tensor representations, which naturally led to the definition of CPS tensors as well as their generalization called conjugate super-symmetric tensors. Eigenvalues and applications of these tensors were discussed as well. Derksen et al. [8] studied the entanglement of d-partite systems in the field of quantum mechanics and introduced the notion of bisymmetric Hermitian tensors, which is essentially the same as the definition of CPS tensors in [24]. Some elementary properties of bisymmetric Hermitian tensors were discussed. Recently, Nie and Yang [36] studied decompositions of Hermitian tensors.
CPS tensors inherit many nice properties of Hermitian matrices. For instance, every symmetric complex form generated by a CPS tensor is real-valued and all the eigenvalues of a CPS tensor are real [24]. In contrast to the many efforts on the optimization aspects [12, 13, 19, 23, 40, 45] of CPS tensors, the current paper focuses on their decompositions, ranks and approximations, which are important topics for high-order tensors. As is well known, the generalization of matrices to high-order tensors has led to interesting new findings while keeping many nice properties, and CPS tensors, as a generalization of Hermitian matrices in terms of the order and of real symmetric tensors in terms of the complex field, should be expected to behave likewise. One of our findings states that the Comon-type conjecture, i.e., that the symmetric rank of a symmetric tensor is equal to the rank of the tensor, is actually invalid for CPS tensors, as a simple example shows. We believe the analysis along this line will provide novel insights into CPS tensors, and hope these new findings will help in future modelling of practical applications. In fact, one of our results on rank-one equivalence via square matricization helps to develop new models and algorithms to compute best rank-one approximations and the extreme eigenvalue of CPS tensors.
The study of CPS tensors in this paper focuses on ranks, rank-one decompositions and approximations, as well as their applications. The analysis is conducted alongside a more general class of complex tensors called partial-symmetric (PS) tensors. We propose the Hermitian and skew-Hermitian parts of PS tensors, which are helpful to understand the structures of PS tensors and CPS tensors. We prove constructively that any CPS tensor can be decomposed into a sum of rank-one CPS tensors, using tools from additive number theory, specifically Hilbert’s identity [4, 22]. This provides an alternative definition of CPS tensors via real linear combinations of rank-one CPS tensors and a computational method to obtain such decompositions. As a consequence, perhaps surprisingly, any PS tensor can be decomposed into a complex linear combination of rank-one CPS tensors. We then define three types of ranks for CPS tensors and show that they are non-identical in general. For CPS tensors, this leads to the invalidity of the conjugate version of Comon’s conjecture, although it is not the exact form of Comon’s conjecture in [7]. We further study rank-one approximations of CPS tensors. Depending on the types of rank-one tensors to be considered, rank-one approximations could also be different. As is known in the literature, if the square matricization of an even-order symmetric tensor is rank-one, then the original symmetric tensor is also rank-one. We show that the same property holds for CPS tensors when they are unfolded to Hermitian matrices under a careful way of matricization. Based on the rank-one matricization equivalence, we propose two convex optimization models to compute best rank-one approximations of CPS tensors. Several numerical experiments, from real data to simulated data, are performed to demonstrate the capability of our methods.
This paper is organized as follows. We start with preparations of various notations, definitions, and elementary properties in Section 2. In Section 3, we study decompositions of CPS tensors and discuss several concepts of ranks. We then focus on rank-one approximations and rank-one equivalence via matricization for CPS tensors in Section 4, which provides an immediate application to compute best rank-one approximations of CPS tensors via convex optimization. Finally, in Section 5, we conduct several numerical experiments to illustrate the effective performance of the methods proposed in Section 4.

2 Preparations

Throughout this paper, we uniformly use lowercase letters, boldface lowercase letters, capital letters, and calligraphic letters to denote scalars, vectors, matrices, and high-order tensors, respectively, e.g., a scalar x, a vector \(\varvec{x}\), a matrix X, and a tensor \({\mathcal {X}}\). We use subscripts to denote their components, e.g., \(x_i\) being the i-th entry of a vector \(\varvec{x}\), \(X_{ij}\) being the (ij)-th entry of a matrix X, and \({\mathcal {X}}_{ijk}\) being the (ijk)-th entry of a third-order tensor \({\mathcal {X}}\). As usual, the field of real numbers and the field of complex numbers are denoted by \({\mathbb {R}}\) and \({\mathbb {C}}\), respectively.
For any complex number \(z = x + \mathbf{i}y\in {\mathbb {C}}\) with \(x,y\in {\mathbb {R}}\), its real part and imaginary part are denoted by \(\mathrm{Re}\,z:= x\) and \(\mathrm{Im}\,z:= y\), respectively. Its argument is denoted by \(\arg (z)\) and its modulus is denoted by \(|z|: = \sqrt{\overline{z}z} = \sqrt{x^2 + y^2}\), where \(\overline{z}: = x-\mathbf{i}y\) denotes the conjugate of z. For any vector \(\varvec{z}\in {\mathbb {C}}^n\), we denote \(\varvec{z}^{\mathrm{H}}: = \overline{\varvec{z}}^{\mathrm{T}}\) to be the transpose of its conjugate, and we define it analogously for matrices. The norm of a complex vector \(\varvec{z}\in {\mathbb {C}}^n\) is defined as
$$\begin{aligned} \Vert \varvec{z}\Vert : = \sqrt{\varvec{z}^{\mathrm{H}}\varvec{z}} = \sqrt{\sum _{i=1}^n |z_i|^2}. \end{aligned}$$

2.1 CPS tensor

Let us consider \({\mathbb {C}}^{n^d}\), the space of complex tensors of order d and dimension n. A tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^d}\) is called symmetric if every entry of \({\mathcal {T}}\) is invariant under all permutations of its indices, i.e., for every \(1\le i_1\le \ldots \le i_{d}\le n\),
$$\begin{aligned} {\mathcal {T}}_{j_1\ldots j_d} = {\mathcal {T}}_{i_1\ldots i_d} \quad \forall \, (j_1,\ldots , j_d) \in \Pi (i_1,\ldots , i_d), \end{aligned}$$
where \(\Pi (i_1,\ldots , i_d)\) denotes the set of all distinct permutations of \(\{i_1,\ldots ,i_d\}\). The set of symmetric tensors in \({\mathbb {C}}^{n^d}\) is denoted by \({\mathbb {C}}_{\mathrm{s}}^{n^d}\).
Definition 2.1
(Jiang et al. [24, Definition 2.3]). An even-order complex tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) is called partial-symmetric (PS) if for every \(1\le i_1\le \ldots \le i_{d}\le n\) and \(1\le i_{d+1}\le \ldots \le i_{2d}\le n\),
$$\begin{aligned} {\mathcal{T}}_{j_1\ldots j_d j_{d+1}\ldots j_{2d}} = {\mathcal{T}}_{i_1\ldots i_d i_{d+1}\ldots i_{2d}} \quad &\forall (j_1,\ldots , j_d) \in \Pi (i_1,\ldots , i_d),\\ & \,\,\, (j_{d+1}, \ldots , j_{2d}) \in \Pi (i_{d+1},\ldots , i_{2d}). \end{aligned}$$
Essentially, a PS tensor is symmetric with respect to its first half of the modes, and symmetric with respect to its last half of the modes as well, while a symmetric tensor is symmetric with respect to all the modes. The set of PS complex tensors in \({\mathbb {C}}^{n^{2d}}\) is denoted by \({\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\). When \(d=1\), one has \({\mathbb {C}}_{\mathrm{s}}^{n^2}\subsetneq {\mathbb {C}}_{\mathrm{ps}}^{n^2}={\mathbb {C}}^{n^2}\). However, for \(d\ge 2\), it is obvious that \({\mathbb {C}}_{\mathrm{s}}^{n^{2d}}\subsetneq {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\subsetneq {\mathbb {C}}^{n^{2d}}\).
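For instance, for a fourth-order tensor \({\mathcal {T}}\in {\mathbb {C}}^{2^{4}}\) (\(d=2\)), partial symmetry requires, among other constraints,
$$\begin{aligned} {\mathcal {T}}_{1212}={\mathcal {T}}_{2112}={\mathcal {T}}_{1221}={\mathcal {T}}_{2121}, \end{aligned}$$
since the first two indices and the last two indices can be permuted independently.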
A special class of PS tensors, called conjugate partial-symmetric tensors, generalizes Hermitian matrices to high-order tensor spaces.
Definition 2.2
(Jiang et al. [24, Definition 3.7]). An even-order complex tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) is called conjugate partial-symmetric (CPS) if it is partial-symmetric and
$$\begin{aligned} {\mathcal {T}}_{i_1\ldots i_d i_{d+1}\ldots i_{2d}} = \overline{{\mathcal {T}}_{i_{d+1} \ldots i_{2d} i_1\ldots i_d}} \quad \forall \, 1\le i_1 \le \ldots \le i_{d}\le n,\,1\le i_{d+1}\le \ldots \le i_{2d}\le n. \end{aligned}$$
The set of CPS tensors in \({\mathbb {C}}^{n^{2d}}\) is denoted by \({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\). Obviously when \(d=1\), \({\mathbb {C}}_{\mathrm{cps}}^{n^{2}}\) is nothing but the set of Hermitian matrices in \({\mathbb {C}}^{n^2}\). CPS tensors are the high-order generalization of Hermitian matrices. For \(d\ge 2\), one has \({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\subsetneq {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\subsetneq {\mathbb {C}}^{n^{2d}}\). However, \({\mathbb {C}}_{\mathrm{s}}^{n^{2d}}\) and \({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) are not comparable. Actually one has \({\mathbb {C}}_{\mathrm{s}}^{n^{2d}} \cap {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}} = {\mathbb {R}}_{\mathrm{s}}^{n^{2d}}\), the set of real symmetric tensors. The same holds for complex symmetric matrices and Hermitian matrices, i.e., the case \(d=1\).
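Continuing the example above with \(n=d=2\), the conjugate condition additionally requires, e.g., \({\mathcal {T}}_{1122}=\overline{{\mathcal {T}}_{2211}}\), and it forces entries such as \({\mathcal {T}}_{1111}\), \({\mathcal {T}}_{2222}\), and \({\mathcal {T}}_{1212}\) to be real.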
We remark that PS tensors are closed under addition and multiplication by complex numbers, while CPS tensors are closed under addition and multiplication by real numbers only. This fact may not be obvious from their definitions. In fact, it can be easily seen from their equivalent definitions via partial-symmetric decompositions; see Section 3.1.

2.2 Complex form

The Frobenius inner product of two complex tensors \({\mathcal {U}},{\mathcal {V}}\in {\mathbb {C}}^{n^d}\) is defined as
$$\begin{aligned} \langle {\mathcal {U}}, {\mathcal {V}}\rangle := \sum _{i_1=1}^n\ldots \sum _{i_d=1}^n \overline{{\mathcal {U}}_{i_1\ldots i_d}} {\mathcal {V}}_{i_1\ldots i_d}, \end{aligned}$$
and the induced Frobenius norm of a complex tensor \({\mathcal {T}}\) is naturally defined as \(\Vert {\mathcal {T}}\Vert :=\sqrt{\left\langle {\mathcal {T}},{\mathcal {T}}\right\rangle }.\) We remark that these two notations naturally apply to vectors and matrices, which are tensors of order one and order two, respectively. A rank-one tensor, also called a simple tensor, is a tensor that can be written as an outer product of vectors, i.e., \(\varvec{x}_1\otimes \ldots \otimes \varvec{x}_d\in {\mathbb {C}}^{n^d}\) where \(\varvec{x}_k\in {\mathbb {C}}^n\) for \(k=1,\ldots ,d\).
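As a quick numerical illustration (a minimal sketch, not part of the paper's development), these quantities can be evaluated for tensors stored as numpy arrays as follows; note that the first argument of the inner product is conjugated.

```python
import numpy as np

# Minimal sketch: Frobenius inner product <U, V> = sum conj(U) * V and the
# induced norm for complex tensors stored as numpy arrays.
U = np.array([[1 + 2j, 0], [3j, 1]])
V = np.array([[2, 1j], [1, -1]])
inner = np.vdot(U, V)                  # np.vdot flattens and conjugates its first argument
norm_U = np.sqrt(np.vdot(U, U).real)   # agrees with np.linalg.norm(U)
print(inner, np.isclose(norm_U, np.linalg.norm(U)))
```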
Given a complex tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^d}\), the multilinear form of \({\mathcal {T}}\) is defined as
$$\begin{aligned} {\mathcal {T}}(\varvec{x}_1,\ldots ,\varvec{x}_d):=\sum _{i_1=1}^n\ldots \sum _{i_d=1}^n {\mathcal {T}}_{i_1\ldots i_d}(x_1)_{i_1}\ldots (x_d)_{i_d} = \langle \overline{{\mathcal {T}}}, \varvec{x}_1\otimes \ldots \otimes \varvec{x}_d \rangle , \end{aligned}$$
(1)
where the variable \(\varvec{x}_k\in {\mathbb {C}}^n\) for \(k=1,\ldots ,d\).
If a vector in a multilinear form (1) is missing and replaced by a ‘\(\bullet \)’, say \({\mathcal {T}}(\bullet ,\varvec{x}_2, \ldots ,\varvec{x}_d)\), it then becomes a vector in \({\mathbb {C}}^n\). Explicitly, the i-th entry of \({\mathcal {T}}(\bullet ,\varvec{x}_2,\ldots ,\varvec{x}_d)\) is \({\mathcal {T}}(\varvec{e}_i,\varvec{x}_2,\ldots ,\varvec{x}_d)\) for \(i=1,\ldots ,n\), where \(\varvec{e}_i\) is the i-th unit vector of \({\mathbb {R}}^n\). When all \(\varvec{x}_k\)’s in (1) are the same, a multilinear form becomes a complex homogeneous polynomial function (or complex form) of \(\varvec{x}\in {\mathbb {C}}^n\), i.e.,
$$\begin{aligned} {\mathcal {T}}(\varvec{x}^d):= {\mathcal {T}}(\underbrace{\varvec{x}, \ldots , \varvec{x}}_d) = \langle \overline{{\mathcal {T}}}, \underbrace{\varvec{x}\otimes \ldots \otimes \varvec{x}}_d \rangle =: \langle \overline{{\mathcal {T}}}, \varvec{x}^{\otimes d} \rangle . \end{aligned}$$
The notations, \(\varvec{x}^d\) standing for d copies of \(\varvec{x}\) in a multilinear form, and \(\varvec{x}^{\otimes d}\) standing for outer products of d copies of \(\varvec{x}\), will be used throughout this paper as long as there is no ambiguity.
To our particular interest in this paper, the following conjugate complex form, or conjugate form, is defined by a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\),
$$\begin{aligned} {\mathcal {T}}({\overline{\varvec{x}}}^d\varvec{x}^d):= {\mathcal {T}}(\underbrace{\overline{\varvec{x}},\ldots ,\overline{\varvec{x}}}_d, \underbrace{\varvec{x}, \ldots , \varvec{x}}_d) = \langle \overline{{\mathcal {T}}}, \overline{\varvec{x}}^{\otimes d} \otimes \varvec{x}^{\otimes d} \rangle . \end{aligned}$$
Remark that \({\overline{\varvec{x}}}^d\) and \(\varvec{x}^d\) in \({\mathcal {T}}({\overline{\varvec{x}}}^d\varvec{x}^d)\) cannot be swapped as otherwise it becomes a different form. Similarly, we may use the notation \({\mathcal {T}}(\bullet \,{\overline{\varvec{x}}}^{d-1}\varvec{x}^d)\in {\mathbb {C}}^n\), which equals \({\mathcal {T}}({\overline{\varvec{x}}}^{d-1}\bullet \varvec{x}^d)\) since \({\mathcal {T}}\) is a PS tensor.
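For concreteness, a conjugate form with \(d=2\) can be evaluated by a single tensor contraction, \({\mathcal {T}}({\overline{\varvec{x}}}^2\varvec{x}^2)=\sum _{i,j,k,l}{\mathcal {T}}_{ijkl}\overline{x_i}\,\overline{x_j}x_kx_l\). The following minimal numpy sketch (an illustration, not part of the paper; the tensor is built as a real combination of rank-one CPS terms, anticipating Section 3) also shows numerically that conjugate forms of CPS tensors are real-valued.

```python
import numpy as np

# Minimal sketch (d = 2): evaluate the conjugate form
# T(conj(x)^d x^d) = sum_{ijkl} T_{ijkl} conj(x_i) conj(x_j) x_k x_l.
def conjugate_form(T, x):
    xc = np.conj(x)
    return np.einsum('ijkl,i,j,k,l->', T, xc, xc, x, x)

rng = np.random.default_rng(0)
n = 3
# A CPS tensor built as a real linear combination of rank-one CPS terms
# conj(a) (x) conj(a) (x) a (x) a.
T = np.zeros((n, n, n, n), dtype=complex)
for _ in range(4):
    a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    T += rng.standard_normal() * np.einsum('i,j,k,l->ijkl', np.conj(a), np.conj(a), a, a)

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.isclose(conjugate_form(T, x).imag, 0.0))  # True: the form is real-valued
```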

2.3 Hermitian part and skew-Hermitian part

It is shown in [24] that \({\mathcal {T}}({\overline{\varvec{x}}}^d\varvec{x}^d)\) is real-valued if and only if \({\mathcal {T}}\) is CPS, extending the case of \(d=1\), i.e., \(A(\overline{\varvec{x}}\varvec{x})\) is real-valued if and only if A is a Hermitian matrix. As is well known, any complex matrix \(A\in {\mathbb {C}}^{n^2}\) can be written as \(A= H(A) + S(A)\), where
$$\begin{aligned} H(A) = \frac{1}{2}(A + A^{\mathrm{H}}) \text { and } S(A) =\frac{1}{2}(A - A^{\mathrm{H}}) \end{aligned}$$
are the Hermitian part and the skew-Hermitian part of A, respectively. We extend this concept to high-order PS tensors, which is helpful in the analysis of our results.
Definition 2.3
The conjugate transpose of a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\), denoted by \({\mathcal {T}}^{\mathrm{H}}\), satisfies
$$\begin{aligned} ({\mathcal {T}}^{\mathrm{H}})_{i_{1}\ldots i_{d}i_{d+1}\ldots i_{2d}} =\overline{{\mathcal {T}}_{i_{d+1}\ldots i_{2d}i_{1}\ldots i_{d}}} \quad \forall \, 1\le i_{1}\le \ldots \le i_{d}\le n,\,1\le i_{d+1}\le \ldots \le i_{2d}\le n. \end{aligned}$$
The Hermitian part \({\mathcal {H}}({\mathcal {T}})\) and the skew-Hermitian part \({\mathcal {S}}({\mathcal {T}})\) of a PS tensor \({\mathcal {T}}\) are defined as
$$\begin{aligned} {\mathcal {H}}({\mathcal {T}}) := \frac{1}{2}({\mathcal {T}}+ {\mathcal {T}}^{\mathrm{H}}) \text { and } {\mathcal {S}}({\mathcal {T}}) :=\frac{1}{2}({\mathcal {T}}-{\mathcal {T}}^{\mathrm{H}}), \end{aligned}$$
respectively.
Obviously, one has \(({\mathcal {T}}^{\mathrm{H}})^{\mathrm{H}}={\mathcal {T}}\) for a PS tensor \({\mathcal {T}}\). It is clear from Definition 2.2 that a PS tensor \({\mathcal {T}}\) is CPS if and only if \({\mathcal {T}}^{\mathrm{H}}={\mathcal {T}}\), or \({\mathcal {S}}({\mathcal {T}})={\mathcal {O}}\), a zero tensor. The following property can be verified straightforwardly, similar to Hermitian matrices.
Proposition 2.4
Any PS tensor \({\mathcal {T}}\) can be uniquely written as \({\mathcal {T}}= {\mathcal {U}}+ \mathbf{i}{\mathcal {V}}\), where both \({\mathcal {U}}\) and \({\mathcal {V}}\) are CPS tensors. In particular, \({\mathcal {U}}={\mathcal {H}}({\mathcal {T}})\) and \({\mathcal {V}}=-\mathbf{i}{\mathcal {S}}({\mathcal {T}})\). Moreover, \({\mathcal {T}}\) is a CPS tensor if and only if \({\mathcal {V}}={\mathcal {O}}\).
Proof
Obviously \({\mathcal {T}}={\mathcal {H}}({\mathcal {T}})+\mathbf{i}(-\mathbf{i}{\mathcal {S}}({\mathcal {T}}))\). According to Definition 2.3, we have
$$\begin{aligned} {\mathcal {H}}({\mathcal {T}})^{\mathrm{H}}&=\frac{1}{2}({\mathcal {T}}+ {\mathcal {T}}^{\mathrm{H}})^{\mathrm{H}} =\frac{1}{2}( {\mathcal {T}}^{\mathrm{H}}+{\mathcal {T}})= {\mathcal {H}}({\mathcal {T}}),\\ (-\mathbf{i}{\mathcal {S}}({\mathcal {T}}))^{\mathrm{H}}&= \left( -\frac{\mathbf{i}}{2}({\mathcal {T}}-{\mathcal {T}}^{\mathrm{H}})\right) ^{\mathrm{H}} =\frac{\mathbf{i}}{2}({\mathcal {T}}^{\mathrm{H}}-{\mathcal {T}})= -\mathbf{i}{\mathcal {S}}({\mathcal {T}}), \end{aligned}$$
implying that both \({\mathcal {U}}={\mathcal {H}}({\mathcal {T}})\) and \({\mathcal {V}}=-\mathbf{i}{\mathcal {S}}({\mathcal {T}})\) are CPS.
For the uniqueness, suppose that \({\mathcal {T}}={\mathcal {X}}+\mathbf{i}{\mathcal {Y}}\) with \({\mathcal {X}},{\mathcal {Y}}\) being CPS. We have
$$\begin{aligned} {\mathcal {U}}={\mathcal {H}}({\mathcal {T}})=\frac{{\mathcal {T}}+{\mathcal {T}}^{\mathrm{H}}}{2}= \frac{{\mathcal {X}}+\mathbf{i}{\mathcal {Y}}+({\mathcal {X}}+\mathbf{i}{\mathcal {Y}})^{\mathrm{H}}}{2} =\frac{{\mathcal {X}}+\mathbf{i}{\mathcal {Y}}+ {\mathcal {X}}^{\mathrm{H}}-\mathbf{i}{\mathcal {Y}}^{\mathrm{H}}}{2} = {\mathcal {X}}. \end{aligned}$$
Using a similar argument, one can show that \({\mathcal {V}}={\mathcal {Y}}\).
If \({\mathcal {T}}\) is a CPS tensor, then
$$\begin{aligned} {\mathcal {U}}+\mathbf{i}{\mathcal {V}}={\mathcal {T}}={\mathcal {T}}^{\mathrm{H}}={\mathcal {U}}^{\mathrm{H}}-\mathbf{i}{\mathcal {V}}^{\mathrm{H}}={\mathcal {U}}-\mathbf{i}{\mathcal {V}}, \end{aligned}$$
which implies that \({\mathcal {V}}={\mathcal {O}}\). If \({\mathcal {V}}={\mathcal {O}}\), then obviously \({\mathcal {T}}\) is a CPS tensor. \(\square \)
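As a minimal numerical illustration of Definition 2.3 and Proposition 2.4 (a sketch with \(d=2\), assuming tensors stored as numpy arrays; not part of the paper), the conjugate transpose simply swaps the two halves of the modes and conjugates the entries.

```python
import numpy as np

# Minimal sketch (d = 2): conjugate transpose, Hermitian part and
# skew-Hermitian part of a fourth-order PS tensor.
def conj_transpose(T):
    return np.conj(T.transpose(2, 3, 0, 1))      # (T^H)_{ijkl} = conj(T_{klij})

def hermitian_part(T):
    return 0.5 * (T + conj_transpose(T))

def skew_hermitian_part(T):
    return 0.5 * (T - conj_transpose(T))

rng = np.random.default_rng(1)
n = 2
T = rng.standard_normal((n,) * 4) + 1j * rng.standard_normal((n,) * 4)
# Symmetrize the first two and the last two modes so that T is PS.
T = 0.25 * (T + T.transpose(1, 0, 2, 3) + T.transpose(0, 1, 3, 2) + T.transpose(1, 0, 3, 2))

U = hermitian_part(T)                 # U = H(T)
V = -1j * skew_hermitian_part(T)      # V = -i S(T)
print(np.allclose(T, U + 1j * V))                                            # T = U + iV
print(np.allclose(U, conj_transpose(U)), np.allclose(V, conj_transpose(V)))  # both CPS
```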

3 CPS decomposition and rank

This section is devoted to decompositions of CPS tensors as well as PS tensors. It is an extension of the symmetric decomposition of symmetric tensors. One main result is a constructive method to decompose a CPS tensor into a sum of rank-one CPS tensors, which provides an alternative definition of CPS tensors via real linear combinations of rank-one CPS tensors. Based on these results, we discuss several ranks for PS tensors and CPS tensors, which can be viewed as the conjugate version of Waring’s decomposition [5, 10].

3.1 CPS decomposition

Rank-one decompositions play an essential role in exploring structures of high-order tensors. Hermitian matrices enjoy the following type of conjugate symmetric decomposition: if \(A\in {\mathbb {C}}_{\mathrm{cps}}^{n^2}\), then
$$\begin{aligned} A = \sum _{j = 1}^r \lambda _j\overline{\varvec{a}_j}{\varvec{a}_j}^{\mathrm{T}} =\sum _{j = 1}^r \lambda _j\overline{\varvec{a}_j}\otimes \varvec{a}_j, \end{aligned}$$
(2)
where \(\lambda _j\in {\mathbb {R}}\) and \(\varvec{a}_j\in {\mathbb {C}}^n\) for \(j = 1,\ldots ,r\) with some \(r\le n\). As the high-order generalization of Hermitian matrices, CPS tensors do inherit this important property. Before presenting the main result in this section, let us first prove a technical result.
Lemma 3.1
For any positive integers d and n, there exist \(\varvec{a}_j\in {\mathbb {C}}^n\) and \(\lambda _j\in {\mathbb {R}}\) for \(j=1,\ldots ,m\) with finite m, such that
$$\begin{aligned} \left| \sum _{i=1}^n{x_i}^d\right| ^2 =\sum _{j=1}^m\lambda _j \left| {\varvec{a}_j}^{\mathrm{T}}\varvec{x}\right| ^{2d}. \end{aligned}$$
Proof
For any nonzero \(\alpha _0, \alpha _1, \ldots , \alpha _{2d}\in {\mathbb {R}}\) with \(\alpha _i\ne \alpha _j\) if \(i\ne j\), consider the following \(2d+1\) linear equations with \(2d+1\) variables:
$$\begin{aligned} {\alpha _0}^k z_0 + {\alpha _1}^k z_1 + \ldots +{\alpha _{2d}}^k z_{2d} = \gamma _k\quad k=0,1,\ldots ,2d, \end{aligned}$$
(3)
where \(\gamma _0=1,\,\gamma _d=\sqrt{d!},\,\gamma _{2d}=d!\), and other \(\gamma _k\)’s are zeros. The determinant of the coefficients of (3) is the Vandermonde determinant
$$\begin{aligned} \left| \begin{array}{cccc} 1 &{} 1 &{} \ldots &{} 1 \\ \alpha _0 &{} \alpha _1 &{} \ldots &{}\alpha _{2d} \\ \vdots &{}\vdots &{}\vdots &{} \vdots \\ {\alpha _0}^{2d} &{} {\alpha _1}^{2d} &{} \ldots &{}{\alpha _{2d}}^{2d}\\ \end{array} \right| = \prod _{0 \le i < j \le 2d} (\alpha _j - \alpha _i) \ne 0, \end{aligned}$$
and so (3) has a unique real solution, which is denoted by \((z_0, z_1, \ldots , z_{2d})\) for simplicity.
Denote \(\Omega _{d}:=\{e^{\mathbf{i}\frac{2k\pi }{d}}:k=0,1,\ldots ,d-1\}\), the set of all complex solutions to \(z^d=1\). Let \(\xi _1,\ldots , \xi _n\) be i.i.d. random variables uniformly distributed on \(\Omega _{d}\). For any linear function of \(\xi _i\)’s, say, \(c_1\xi _1+\ldots + c_n\xi _n\), it follows that
$$\begin{aligned}&\mathrm{E}\left[ \left( \overline{ c_1\xi _1+\ldots + c_n\xi _n}\right) ^d \left( c_1\xi _1+\ldots + c_n\xi _n\right) ^d \right] \nonumber \\&\quad =\sum _{d_1+\ldots +d_n=d}\left( \frac{d!}{d_1 ! \ldots d_n !} \right) ^2 \prod _{i=1}^{n} \left| c_i\right| ^{2d_i} +\sum _{i \ne j} \overline{c_i}^d {c_j}^d. \end{aligned}$$
(4)
To see why (4) holds, we consider all terms
$$\begin{aligned} \prod _{i=1}^n\left( \overline{c_i\xi _i}\right) ^{d_i} \left( c_i\xi _i\right) ^{t_i} \text { with } d_1 +\ldots +d_n=t_1+\ldots +t_n=d \end{aligned}$$
in the left hand side of (4). They can be classified into three mutually exclusive cases: (i) If there is some i such that \(1\le |d_i-t_i|\le d-1\), then
$$\begin{aligned} \mathrm{E}\left[ \prod _{i=1}^n\left( \overline{c_i\xi _i}\right) ^{d_i} \left( c_i\xi _i\right) ^{t_i}\right] = \mathrm{E}\left[ \overline{c_i}^{d_i}{c_i}^{t_i} \overline{\xi _i}^{d_i}{\xi _i}^{t_i}\right] \mathrm{E}\left[ \prod _{j\ne i}\left( \overline{c_j\xi _j}\right) ^{d_j} \left( c_j\xi _j\right) ^{t_j}\right] = 0 \end{aligned}$$
since \(\mathrm{E}[{\xi _i}^k]=0\) for any integer k with \(1\le |k|\le d-1\); (ii) If \(|d_i-t_i|=0\) or d for \(i=1,\ldots ,n\) and there is some k such that \(|d_k-t_k|=d\), then there is another index \(\ell \ne k\) such that \(|d_\ell -t_\ell |=d\) since \(\sum _{i=1}^n d_i=\sum _{i=1}^n t_i=d\). As a result, there are \(i\ne j\) such that \(d_i=t_j=d\) and the rest are zeros. Therefore,
$$\begin{aligned} \mathrm{E}\left[ \prod _{i=1}^n\left( \overline{c_i\xi _i}\right) ^{d_i} \left( c_i\xi _i\right) ^{t_i}\right] = \mathrm{E}\left[ \left( \overline{c_i\xi _i}\right) ^d \left( c_j\xi _j\right) ^d\right] =\overline{c_i}^d {c_j}^d; \end{aligned}$$
and (iii) If \(|d_i-t_i|=0\) for \(i=1,\ldots ,n\), then
$$\begin{aligned} \mathrm{E}\left[ \prod _{i=1}^n\left( \overline{c_i\xi _i}\right) ^{d_i} \left( c_i\xi _i\right) ^{t_i}\right] = \prod _{i=1}^n\mathrm{E}\left[ \left( \overline{c_i\xi _i}\right) ^{d_i}\left( c_i\xi _i\right) ^{d_i}\right] =\prod _{i=1}^{n}\left| c_i\right| ^{2d_i} \end{aligned}$$
and the number of such terms in (4) is \(\left( \frac{d!}{d_1 ! \ldots d_n !} \right) ^2\).
By applying (4) and (3), we obtain
$$\begin{aligned}&\frac{1}{d!} \sum _{k_1 = 0}^{2d} \ldots \sum _{k_n = 0}^{2d} \left( \prod _{\ell =1}^nz_{k_\ell }\right) \mathrm{E}\left[ \left( \overline{ \alpha _{k_1}\xi _1x_1 + \ldots +\alpha _{k_n}\xi _n x_n}\right) ^d \left( \alpha _{k_1} \xi _1 x_1 + \ldots + \alpha _{k_n} \xi _n x_n\right) ^d \right] \nonumber \\&\quad = \frac{1}{d!} \sum _{k_1 = 0}^{2d} \ldots \sum _{k_n =0}^{2d} \left( \prod _{\ell =1}^nz_{k_\ell }\right) \nonumber \\&\qquad \left( \sum _{i \ne j} \overline{\alpha _{k_i}}^d \overline{x_i}^d {\alpha _{k_j}}^d{x_j}^d + \sum _{d_1+ \ldots + d_n=d} \left( \frac{d!}{d_1 ! \ldots d_n !} \right) ^2 \prod _{i=1}^{n} \left| \alpha _{k_i}x_i \right| ^{2d_i}\right) \nonumber \\&\quad = \frac{1}{d!} \sum _{i \ne j} \sum _{k_1 = 0}^{2d} \ldots \sum _{k_n = 0}^{2d} \left( \prod _{\ell =1}^nz_{k_\ell }\right) \overline{\alpha _{k_i}}^d {\alpha _{k_j}}^d \overline{x_i}^d{x_j}^d \nonumber \\&\qquad + d! \sum _{d_1+ \ldots + d_n=d} \sum _{k_1 = 0}^{2d} \ldots \sum _{k_n = 0}^{2d} \prod _{i=1}^{n} \frac{\left| x_i \right| ^{2d_i}}{\left( d_i !\right) ^2} \left( {\alpha _{k_i}} ^{2d_i} z_{k_i} \right) \nonumber \\&\quad = \frac{1}{d!} \sum _{i \ne j} \overline{x_i}^d{x_j}^d \left( \sum _{k=0}^{2d} {\alpha _k}^{d} z_k \right) ^2 \left( \sum _{k=0}^{2d} z_k \right) ^{n-2} + d!\sum _{d_1+ \ldots + d_n=d} \prod _{i=1}^{n} \frac{\left| x_i \right| ^{2d_i}}{\left( d_i !\right) ^2} \left( \sum _{k=0}^{2d} {\alpha _k}^{2d_i} z_k \right) \nonumber \\&\quad = \sum _{i \ne j} \overline{x_i}^d{x_j}^d + \sum _{i =1}^{n} |x_i|^{2d} + \mathrm{even}(d) \frac{\left( d!\right) ^2}{\left( \left( d/2\right) !\right) ^4}\sum _{i< j} |x_i|^d |x_j|^d \nonumber \\&\quad = \left| \sum _{i=1}^n {x_i}^d \right| ^2 + \mathrm{even}(d) \frac{\left( d!\right) ^2}{\left( \left( d/2\right) !\right) ^4}\sum _{i < j} |x_i|^d |x_j|^d, \end{aligned}$$
(5)
where \(\mathrm{even}(d)\) is one if d is even and zero otherwise. Notice that \(\frac{1}{d!}\left( \prod _{\ell =1}^nz_{k_\ell }\right) \) is real, and so (5) provides a constructive expression of \(\sum _{j=1}^m\lambda _j \left| {\varvec{a}_j}^{\mathrm{T}}\varvec{x}\right| ^{2d}\) with finite m for \(\left| \sum _{i=1}^n {x_i}^d \right| ^2\) when d is odd.
When d is even, we consider another system of \(d+1\) linear equations with \(d+1\) variables:
$$\begin{aligned} {\beta _0}^{2k} y_0 + {\beta _1}^{2k} y_1 + \ldots +{\beta _d}^{2k} y_{d} = \delta _k\quad k=0,1,\ldots ,d, \end{aligned}$$
(6)
where nonzero \(\beta _0, \beta _1, \ldots , \beta _{d}\in {\mathbb {R}}\) with \({\beta _i}^2\ne {\beta _j}^2\) if \(i\ne j\), and \(\delta _0=\delta _{d/2}=1\) and other \(\delta _k\)’s are zeros. Similar to the linear system (3) whose Vandermonde determinant is nonzero, (6) also has a unique real solution, denoted by \((y_0, y_1, \ldots , y_d)\) for simplicity.
Let \(\eta _1,\ldots , \eta _n\) be i.i.d. random variables uniformly distributed on \(\Omega _{d+1}\). Similar to the proof of (4), we can obtain
$$\begin{aligned} \mathrm{E}\left[ \left( \overline{ c_1\eta _1+\ldots + c_n\eta _n}\right) ^d \left( c_1\eta _1+\ldots + c_n\eta _n\right) ^d \right] =\sum _{d_1+\ldots +d_n=d}\left( \frac{d!}{d_1 ! \ldots d_n !} \right) ^2 \prod _{i=1}^{n} \left| c_i\right| ^{2d_i}. \end{aligned}$$
Therefore,
$$\begin{aligned}&\sum _{k_1 = 0}^{d} \ldots \sum _{k_n = 0}^{d} \left( \prod _{\ell =1}^ny_{k_\ell }\right) \mathrm{E}\left[ \left( \overline{\beta _{k_1} \eta _1x_1 + \ldots + \beta _{k_n}\eta _nx_n}\right) ^d \left( \beta _{k_1}\eta _1x_1 + \ldots + \beta _{k_n}\eta _nx_n\right) ^d \right] \\&\quad = \sum _{k_1 = 0}^{d} \ldots \sum _{k_n = 0}^{d} \left( \prod _{\ell =1}^ny_{k_\ell }\right) \sum _{d_1+ \ldots + d_n=d} \left( \frac{d!}{d_1! \ldots d_n!} \right) ^2 \prod _{i=1}^{n} \left| \beta _{k_i}x_i \right| ^{2d_i}\\&\quad = (d!)^2 \sum _{d_1+ \ldots + d_n=d} \sum _{k_1 = 0}^{d} \ldots \sum _{k_n = 0}^{d} \prod _{i=1}^n\frac{\left| x_i \right| ^{2d_i}}{(d_i!)^2} \left( {\beta _{k_i}}^{2d_i}y_{k_i}\right) \\&\quad = (d!)^2 \sum _{d_1+ \ldots + d_n=d} \prod _{i=1}^{n} \frac{\left| x_i \right| ^{2d_i}}{(d_i !)^2} \left( \sum _{k=0}^d {\beta _k} ^{2d_i} y_k \right) \\&\quad = \sum _{i<j} \frac{(d!)^2}{((d/2)!)^4}|x_i|^d|x_j|^d, \end{aligned}$$
where the last equality is due to (6). This, together with (5) for even d, gives that
$$\begin{aligned} \left| \sum _{i=1}^n {x_i}^d \right| ^2&\\&=\frac{1}{d!} \sum _{k_1 =0}^{2d} \ldots \sum _{k_n = 0}^{2d} \left( \prod _{\ell =1}^nz_{k_\ell }\right) \mathrm{E}\left[ \left( \overline{\alpha _{k_1} \xi _1x_1 + \ldots + \alpha _{k_n} \xi _n x_n}\right) ^d \left( \alpha _{k_1} \xi _1 x_1 + \ldots + \alpha _{k_n} \xi _n x_n\right) ^d \right] \\&\quad - \sum _{k_1 = 0}^{d} \ldots \sum _{k_n = 0}^{d} \left( \prod _{\ell =1}^ny_{k_\ell }\right) \mathrm{E} \left[ \left( \overline{ \beta _{k_1} \eta _1x_1 + \ldots + \beta _{k_n} \eta _n x_n}\right) ^d \left( \beta _{k_1} \eta _1 x_1 + \ldots + \beta _{k_n} \eta _n x_n\right) ^d \right] . \end{aligned}$$
By taking the coefficients of \(\varvec{x}\), i.e., \((\alpha _{k_1}\xi _1,\ldots ,\alpha _{k_n}\xi _n)\) or \((\beta _{k_1}\eta _1,\ldots ,\beta _{k_n}\eta _n)\), and enumerating every possible value in the support sets of the \(\xi _i\)’s or \(\eta _i\)’s to form each \(\varvec{a}_j \in {\mathbb {C}}^n\), the above equality provides an expression of \(\sum _{j=1}^m\lambda _j \left| {\varvec{a}_j}^{\mathrm{T}}\varvec{x}\right| ^{2d}\) with finite m for \(\left| \sum _{i=1}^n {x_i}^d \right| ^2\) when d is even. \(\square \)
The above lemma can be viewed as a conjugate type of Hilbert’s identity in the literature (see e.g., [4]), which states that for any positive integers d and n, there always exist \(\varvec{a}_1,\ldots ,\varvec{a}_m\in {\mathbb {R}}^n\), such that
$$\begin{aligned} (\varvec{x}^{\mathrm{T}}\varvec{x})^d = \sum _{j=1}^m ({\varvec{a}_j}^{\mathrm{T}}\varvec{x})^{2d}. \end{aligned}$$
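For example, when \(n=d=2\), one such identity (with positive coefficients, which can be absorbed into the vectors \(\varvec{a}_j\)) reads
$$\begin{aligned} ({x_1}^2+{x_2}^2)^2 = \frac{1}{6}\left( (x_1+x_2)^4+(x_1-x_2)^4+4{x_1}^4+4{x_2}^4\right) . \end{aligned}$$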
With Lemma 3.1 in hand, we are ready to show that any CPS tensor can be decomposed into a sum of rank-one CPS tensors.
Theorem 3.2
An even-order tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) is CPS if and only if \({\mathcal {T}}\) has the following CPS decomposition
$$\begin{aligned} {\mathcal {T}}= \sum _{j = 1}^{m}\lambda _j \overline{\varvec{a}_j}^{\otimes d} \otimes {\varvec{a}_j}^{\otimes d}, \end{aligned}$$
(7)
where \(\lambda _j \in {\mathbb {R}}\) and \(\varvec{a}_j\in {\mathbb {C}}^{n}\) for \(j=1,\ldots ,m\) with finite m.
Proof
For any \(\varvec{a}\in {\mathbb {C}}^n\), it is straightforward to check that the rank-one tensor \(\overline{\varvec{a}}^{\otimes d}\otimes \varvec{a}^{\otimes d}\) is CPS by Definition 2.2. In fact, it is symmetric with respect to its first half of the modes, and also symmetric with respect to its last half of the modes, resulting in a PS tensor. Besides, this PS tensor satisfies
$$\begin{aligned} ({\overline{\varvec{a}}}^{\otimes d}\otimes \varvec{a}^{\otimes d})^{\mathrm{H}} = \overline{\varvec{a}^{\otimes d}} \otimes \overline{{\overline{\varvec{a}}}^{\otimes d}} = {\overline{\varvec{a}}}^{\otimes d}\otimes \varvec{a}^{\otimes d}, \end{aligned}$$
resulting in a CPS tensor. Therefore, as a real linear combination of such rank-one CPS tensors in (7), \({\mathcal {T}}\) must also be CPS.
On the other hand, according to [24, Proposition 3.9] with a constructive algorithm, any CPS tensor \({\mathcal {T}}\) can be written as
$$\begin{aligned} {\mathcal {T}}= \sum _{j = 1}^q\lambda _j\overline{{\mathcal {Z}}_j}\otimes {\mathcal {Z}}_j \end{aligned}$$
(8)
where \(\lambda _j\in \{-1,1\}\) and \({\mathcal {Z}}_j\in {\mathbb {C}}_{\mathrm{s}}^{n^{d}}\) for \(j=1,\ldots ,q\) with finite q. It suffices to prove that \(\overline{{\mathcal {Z}}}\otimes {\mathcal {Z}}\) admits a decomposition of (7) if \({\mathcal {Z}}\in {\mathbb {C}}_{\mathrm{s}}^{n^d}\).
Since any symmetric complex tensor admits a finite symmetric decomposition (see e.g., [5, Algorithm 5.1] with a theoretical guarantee and [34, Algorithm 4.3] with numerical efficiency), we may let
$$\begin{aligned} {\mathcal {Z}}=\sum _{j=1}^r{\varvec{a}_j}^{\otimes d}, \end{aligned}$$
where \(\varvec{a}_j\in {\mathbb {C}}^n\) for \(j=1,\ldots ,r\) with finite r. For any \(\varvec{x}\in {\mathbb {C}}^{n}\), one has
$$\begin{aligned} (\overline{{\mathcal {Z}}}\otimes {\mathcal {Z}})(\overline{\varvec{x}}^d\varvec{x}^d) =\langle {\mathcal {Z}}\otimes \overline{{\mathcal {Z}}}, \overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d} \rangle =\langle {\mathcal {Z}}, \overline{\varvec{x}}^{\otimes d} \rangle \cdot \langle \overline{{\mathcal {Z}}}, \varvec{x}^{\otimes d} \rangle =\overline{{\mathcal {Z}}}(\overline{\varvec{x}}^d){\mathcal {Z}}(\varvec{x}^d)=|{\mathcal {Z}}(\varvec{x}^d)|^2 \end{aligned}$$
and
$$\begin{aligned} {\mathcal {Z}}(\varvec{x}^d)=\langle \overline{{\mathcal {Z}}}, \varvec{x}^{\otimes d}\rangle =\left\langle \sum _{j=1}^r\overline{\varvec{a}_j}^{\otimes d}, \varvec{x}^{\otimes d} \right\rangle =\sum _{j=1}^r \langle \overline{\varvec{a}_j}^{\otimes d}, \varvec{x}^{\otimes d}\rangle = \sum _{j = 1}^r ({\varvec{a}_j}^{\mathrm{T}}\varvec{x})^{d} = \sum _{j = 1}^r{y_j}^{d}, \end{aligned}$$
where we let \(\varvec{y}=A\varvec{x}\in {\mathbb {C}}^r\) and \(A=(\varvec{a}_1,\ldots ,\varvec{a}_r)^{\mathrm{T}}\in {\mathbb {C}}^{r\times n}\). Therefore, by Lemma 3.1, there exist \(\alpha _k\in {\mathbb {R}}\) and \(\varvec{b}_k\in {\mathbb {C}}^r\) for \(k=1,\ldots ,s\) with finite s, such that
$$\begin{aligned} (\overline{{\mathcal {Z}}}\otimes {\mathcal {Z}})(\overline{\varvec{x}}^d\varvec{x}^d)=|{\mathcal {Z}}(\varvec{x}^d)|^2 =\left| \sum _{j = 1}^r{y_j}^{d}\right| ^{2} = \sum _{k = 1}^{s}\alpha _k\left| {\varvec{b}_k}^{\mathrm{T}}\varvec{y}\right| ^{2d} = \sum _{k = 1}^{s}\alpha _k\left| {\varvec{b}_k}^{\mathrm{T}}A\varvec{x}\right| ^{2d}. \end{aligned}$$
Finally, by letting \(\varvec{c}_k=A^{\mathrm{T}}\varvec{b}_k\in {\mathbb {C}}^n\) for \(k=1,\ldots ,s\), we obtain
$$\begin{aligned} (\overline{{\mathcal {Z}}}\otimes {\mathcal {Z}})(\overline{\varvec{x}}^d\varvec{x}^d)&= \sum _{k = 1}^{s} \alpha _k \left| {\varvec{c}_k}^{\mathrm{T}}\varvec{x}\right| ^{2d} \\&= \sum _{k = 1}^{s} \alpha _k \langle {\varvec{c}_k}^{\otimes d}\otimes \overline{\varvec{c}_k}^{\otimes d} ,\overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d}\rangle \\&= \left\langle \sum _{k = 1}^{s}\alpha _k {\varvec{c}_k}^{\otimes d}\otimes \overline{\varvec{c}_k}^{\otimes d}, \overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d} \right\rangle \\&= \left( \sum _{k = 1}^{s}\alpha _k\overline{\varvec{c}_k}^{\otimes d}\otimes {\varvec{c}_k}^{\otimes d}\right) (\overline{\varvec{x}}^d\varvec{x}^d). \end{aligned}$$
This implies that \(\overline{{\mathcal {Z}}}\otimes {\mathcal {Z}}=\sum _{k =1}^{s} \alpha _k\overline{\varvec{c}_k}^{\otimes d}\otimes {\varvec{c}_k}^{\otimes d}\), completing the whole proof. \(\square \)
Theorem 3.2 provides an alternative definition of a CPS tensor via a real linear combination of rank-one CPS tensors, i.e.,
$$\begin{aligned} {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}:=\left\{ \sum _{j = 1}^{m}\lambda _j \overline{\varvec{a}_j}^{\otimes d} \otimes {\varvec{a}_j}^{\otimes d}: \lambda _j\in {\mathbb {R}},\,\varvec{a}_j\in {\mathbb {C}}^n,\, j=1,\ldots ,m\right\} . \end{aligned}$$
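In the matrix case \(d=1\), this real linear combination is nothing but the spectral decomposition (2) of a Hermitian matrix, with \(\varvec{a}_j=\overline{\varvec{u}_j}\) for orthonormal eigenvectors \(\varvec{u}_j\) and \(\lambda _j\) the corresponding eigenvalues. The following minimal numpy sketch illustrates this instance (an illustration only, not the decomposition procedure described next).

```python
import numpy as np

# Minimal sketch (d = 1): the CPS decomposition (7) of a Hermitian matrix is
# recovered from its eigendecomposition with a_j = conj(u_j) and lambda_j = mu_j.
rng = np.random.default_rng(3)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = 0.5 * (B + B.conj().T)                 # a Hermitian matrix
mu, U = np.linalg.eigh(A)                  # real eigenvalues, orthonormal columns of U

A_rec = np.zeros_like(A)
for j in range(n):
    a_j = np.conj(U[:, j])
    A_rec += mu[j] * np.outer(np.conj(a_j), a_j)   # lambda_j * conj(a_j) a_j^T
print(np.allclose(A, A_rec))               # True
```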
The proof of Theorem 3.2 actually develops an explicit algorithm to decompose a general CPS tensor into a sum of rank-one CPS tensors. The procedure involves the following three main steps:
(i)
Find \({\mathcal {T}}= \sum _{j = 1}^q\lambda _j\overline{{\mathcal {Z}}_j}\otimes {\mathcal {Z}}_j\) where \(\lambda _j\in \{-1,1\}\) and \({\mathcal {Z}}_j\in {\mathbb {C}}_{\mathrm{s}}^{n^{d}}\) is symmetric;
 
(ii)
Find a symmetric rank-one decomposition for every \({\mathcal {Z}}_j\), i.e., \({\mathcal {Z}}_j=\sum _{k=1}^{r_j}{\varvec{a}_{jk}}^{\otimes d}\) where \(\varvec{a}_{jk}\in {\mathbb {C}}^n\);
 
(iii)
Find \(\overline{\sum _{k=1}^{r_j}{\varvec{a}_{jk}}^{\otimes d}}\otimes \left( \sum _{k=1}^{r_j}{\varvec{a}_{jk}}^{\otimes d}\right) = \sum _{k = 1}^{s_j} \alpha _{jk}\overline{\varvec{c}_{jk}}^{\otimes d}\otimes {\varvec{c}_{jk}}^{\otimes d}\) where \(\alpha _{jk}\in {\mathbb {R}}\) and \(\varvec{c}_{jk}\in {\mathbb {C}}^n\) for every j.
 
In fact, as a consequence of Theorem 3.2, perhaps surprisingly, PS tensors also enjoy similar decompositions, via complex linear combinations of rank-one CPS tensors.
Corollary 3.3
An even-order tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) is PS if and only if \({\mathcal {T}}\) has the following PS decomposition
$$\begin{aligned} {\mathcal {T}}= \sum _{j = 1}^{m}\lambda _j \overline{\varvec{a}_j}^{\otimes d} \otimes {\varvec{a}_j}^{\otimes d}, \end{aligned}$$
(9)
where \(\lambda _j \in {\mathbb {C}}\) and \(\varvec{a}_j\in {\mathbb {C}}^{n}\) for \(j=1,\ldots ,m\) with finite m.
Proof
The ‘if’ part can be verified straightforwardly from Definition 2.1, using a PS decomposition as in the proof of Theorem 3.2.
For the ‘only if’ part, by Proposition 2.4, \({\mathcal {T}}={\mathcal {H}}({\mathcal {T}})+{\mathcal {S}}({\mathcal {T}})\) where \({\mathcal {H}}({\mathcal {T}})\) and \(\mathbf{i}{\mathcal {S}}({\mathcal {T}})\) are CPS. By Theorem 3.2, both \({\mathcal {H}}({\mathcal {T}})\) and \(\mathbf{i}{\mathcal {S}}({\mathcal {T}})\) can be decomposed into a sum of rank-one CPS tensors as in (7), with coefficients being real numbers. Therefore, \({\mathcal {T}}={\mathcal {H}}({\mathcal {T}})+ (-\mathbf{i})\mathbf{i}{\mathcal {S}}({\mathcal {T}})\) can be decomposed into a sum of rank-one CPS tensors as in (9), with coefficients being complex numbers. \(\square \)
Corollary 3.3 also provides an alternative definition of a PS tensor via a complex linear combination of rank-one CPS tensors, i.e.,
$$\begin{aligned} {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}:=\left\{ \sum _{j = 1}^{m} \lambda _j \overline{\varvec{a}_j}^{\otimes d} \otimes {\varvec{a}_j}^{\otimes d}: \lambda _j\in {\mathbb {C}},\,\varvec{a}_j\in {\mathbb {C}}^n,\, j=1,\ldots ,m\right\} . \end{aligned}$$
In terms of the rank-one decompositions shown in Theorem 3.2 and Corollary 3.3, CPS tensors and PS tensors are straightforward generalizations of Hermitian matrices and complex matrices, respectively.
Some remarks on decompositions of PS tensors are in order. From Definition 2.1, in particular the symmetry with respect to the first half of the modes and the symmetry with respect to the last half of the modes, it can be shown that any PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\) can also be decomposed as
$$\begin{aligned} {\mathcal {T}}=\sum _{j=1}^m{\varvec{a}_j}^{\otimes d}\otimes {\varvec{b}_j}^{\otimes d}, \end{aligned}$$
(10)
where \(\varvec{a}_j,\,\varvec{b}_j\in {\mathbb {C}}^n\) for \(j=1,\ldots ,m\) with some finite m. This decomposition seems natural from the original definition, but is quite different from and less symmetric than (9) in Corollary 3.3. In fact, (10) can be immediately obtained from (9) by absorbing each \(\lambda _j\) into \(\overline{\varvec{a}_j}^{\otimes d}\). This makes the decomposition (9) interesting as it links the first half and the last half of the modes of a PS tensor, which is not obvious either from Definition 2.1 or from the decomposition (10). Even for \(d=1\), (9) reduces to the statement that any complex matrix \(A\in {\mathbb {C}}^{n^2}\) can be written as
$$\begin{aligned} A = \sum _{j = 1}^m \lambda _j{\varvec{a}_j}{\varvec{a}_j}^{\mathrm{H}} \text { with } \lambda _j\in {\mathbb {C}}\text { and } \varvec{a}_j\in {\mathbb {C}}^n \text { for }j = 1,\ldots ,m, \end{aligned}$$
which, to the best of our knowledge, has not been seen in the literature. However, the connection between CPS tensors and PS tensors makes (9) more straightforward as a consequence of Theorem 3.2.

3.2 CPS rank

The discussion in Section 3.1 obviously raises the question of the shortest CPS decomposition, a common question of interest for matrices and high-order tensors, known as the rank. For any tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^d}\), the rank of \({\mathcal {T}}\), denoted by \(\mathrm{rank}\,({\mathcal {T}})\), is the smallest number r such that \({\mathcal {T}}\) can be written as a sum of r rank-one complex tensors, i.e.,
$$\begin{aligned} \mathrm{rank}\,({\mathcal {T}}): = \min \left\{ r :{\mathcal {T}}= \sum _{j = 1}^r \varvec{a}_{j1} \otimes \ldots \otimes \varvec{a}_{jd}, \,\varvec{a}_{jk}\in {\mathbb {C}}^n,\,j = 1, \ldots ,r, \, k=1,\ldots ,d \right\} . \end{aligned}$$
Depending on the types of rank-one tensors, we define the partial-symmetric rank and the conjugate partial-symmetric rank as follows.
Definition 3.4
The partial-symmetric rank (PS rank) of a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\), denoted by \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\), is defined as
$$\begin{aligned} \mathrm{rank}\,_\mathrm{PS}({\mathcal {T}}): = \min \left\{ r :{\mathcal {T}}= \sum _{j = 1}^r \lambda _j \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d},\, \lambda _j\in {\mathbb {C}},\,\varvec{a}_j\in {\mathbb {C}}^n,\,j = 1,\ldots ,r \right\} . \end{aligned}$$
The conjugate partial-symmetric rank (CPS rank) of a CPS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\), denoted by \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\), is defined as
$$\begin{aligned} \mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}}): = \min \left\{ r :{\mathcal {T}}= \sum _{j = 1}^r \lambda _j \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d},\, \lambda _j\in {\mathbb {R}},\,\varvec{a}_j\in {\mathbb {C}}^n,\,j = 1,\ldots ,r \right\} . \end{aligned}$$
To echo the discussion at the end of Section 3.1, we remark that by the original definition (Definition 2.1) of PS tensors, another rank for PS tensors can be defined based on the decomposition (10), i.e., the minimum r such that \({\mathcal {T}}= \sum _{j = 1}^r {\varvec{a}_j}^{\otimes d}\otimes {\varvec{b}_j}^{\otimes d}\). This rank is different from the PS rank in Definition 3.4 (see Example 3.6). Our interest here is to emphasize the conjugate property and to better understand CPS tensors.
Obviously by Definition 3.4, for a CPS tensor \({\mathcal {T}}\), one has
$$\begin{aligned} \mathrm{rank}\,({\mathcal {T}}) \le \mathrm{rank}\,_\mathrm{PS}({\mathcal {T}}) \le \mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}}). \end{aligned}$$
(11)
An interesting question is whether the above inequalities are equalities or not. It is obvious that the inequalities in (11) become equalities when the rank, PS rank, or CPS rank of a CPS tensor is one. The equality also holds in the case of matrices, i.e., for any Hermitian matrix, the three ranks must be the same. However, this is not true in general for high-order CPS tensors, as stipulated in Theorem 3.5.
In \({\mathbb {C}}_{\mathrm{s}}^{n^d}\), the space of symmetric tensors, a similar problem was posed by Comon: whether the symmetric rank of a symmetric tensor is equal to the rank of the tensor, known as Comon’s conjecture [7]. It received a considerable amount of attention in recent years; see e.g., [39] and the references therein. Comon’s conjecture was shown to be true in various special cases and was only recently disproved by Shitov [39] using a sophisticated counterexample of a complex symmetric tensor. Nevertheless, the real version of Comon’s conjecture remains open. Our result on the ranks of CPS tensors below can be taken as a disproof of the conjugate version of Comon’s conjecture. In fact, our counterexample (Example 3.6) is very simple.
Theorem 3.5
If \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) is a CPS tensor, then
$$\begin{aligned} \mathrm{rank}\,({\mathcal {T}})\le \mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}}). \end{aligned}$$
Moreover, there exists a CPS tensor \({\mathcal {T}}\) such that \(\mathrm{rank}\,({\mathcal {T}}) <\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\).
Proof
Let \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=r\) and let \({\mathcal {T}}\) have the following PS decomposition
$$\begin{aligned} {\mathcal {T}}=\sum _{j = 1}^r \lambda _j \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}, \end{aligned}$$
where \(\lambda _j\in {\mathbb {C}}\) and \(\varvec{a}_j\in {\mathbb {C}}^n\) for \(j = 1,\ldots ,r\). It is easy to see that \({\mathcal {T}}\) can be written as
$$\begin{aligned} {\mathcal {T}}= \sum _{j = 1}^r (\mathrm{Re}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d} \otimes {\varvec{a}_j}^{\otimes d} + \mathbf{i}\sum _{j = 1}^r (\mathrm{Im}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}. \end{aligned}$$
We notice that both \(\sum _{j = 1}^r (\mathrm{Re}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}\) and \(\sum _{j = 1}^r (\mathrm{Im}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}\) are CPS tensors. By the uniqueness result in Proposition 2.4 and the fact that \({\mathcal {T}}\) is already CPS, \(\sum _{j = 1}^r (\mathrm{Im}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}\) must be a zero tensor. Therefore,
$$\begin{aligned} {\mathcal {T}}= \sum _{j = 1}^r (\mathrm{Re}\,\lambda _j) \overline{\varvec{a}_j}^{\otimes d}\otimes {\varvec{a}_j}^{\otimes d}. \end{aligned}$$
This implies that \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\le r=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\) since \(\mathrm{Re}\,\lambda _j\in {\mathbb {R}}\). Together with the obvious fact that \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\ge \mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\) we conclude \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\).
Example 3.6 shows a CPS tensor \({\mathcal {T}}\) with \(\mathrm{rank}\,({\mathcal {T}}) <\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\). \(\square \)
Example 3.6
Let \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{2^4}\) where \({\mathcal {T}}_{1122}={\mathcal {T}}_{2211}=1\) and other entries are zeros. It follows that
$$\begin{aligned} \mathrm{rank}\,({\mathcal {T}})=2<\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}}). \end{aligned}$$
Proof
Obviously \({\mathcal {T}}\) can be written as a sum of two rank-one tensors, each matching a nonzero entry of \({\mathcal {T}}\). It is also easy to show that \(\mathrm{rank}\,({\mathcal {T}})\ne 1\) by contradiction. Therefore, \(\mathrm{rank}\,({\mathcal {T}})=2\).
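Explicitly, with \(\varvec{e}_1\) and \(\varvec{e}_2\) the unit vectors of \({\mathbb {R}}^2\), one such decomposition is
$$\begin{aligned} {\mathcal {T}}=\varvec{e}_1\otimes \varvec{e}_1\otimes \varvec{e}_2\otimes \varvec{e}_2+\varvec{e}_2\otimes \varvec{e}_2\otimes \varvec{e}_1\otimes \varvec{e}_1. \end{aligned}$$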
We now prove \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\ge 3\) by contradiction. According to Theorem 3.5, \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\ge 2\). Suppose that one has \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=2\). There exist nonzero \(\lambda _1,\lambda _2\in {\mathbb {R}}\) and \(\varvec{u}=(u_1,u_2)^{\mathrm{T}},\varvec{v}=(v_1,v_2)^{\mathrm{T}}\in {\mathbb {C}}^2\) such that
$$\begin{aligned} {\mathcal {T}}=\lambda _1\overline{\varvec{u}}\otimes \overline{\varvec{u}}\otimes \varvec{u}\otimes \varvec{u}+\lambda _2\overline{\varvec{v}}\otimes \overline{\varvec{v}}\otimes \varvec{v}\otimes \varvec{v}. \end{aligned}$$
By comparing the entries \({\mathcal {T}}_{1111}={\mathcal {T}}_{1112}={\mathcal {T}}_{2222}=0\) and \({\mathcal {T}}_{1122}=1\), we obtain
$$\begin{aligned} 0&=\lambda _1|u_1|^4+\lambda _2|v_1|^4, \end{aligned}$$
(12a)
$$\begin{aligned} 0&=\lambda _1|u_1|^2\overline{u_1}u_2+\lambda _2|v_1|^2\overline{v_1}v_2, \end{aligned}$$
(12b)
$$\begin{aligned} 0&=\lambda _1|u_2|^4+\lambda _2|v_2|^4, \end{aligned}$$
(12c)
$$\begin{aligned} 1&=\lambda _1\overline{u_1}^2{u_2}^2+\lambda _2\overline{v_1}^2{v_2}^2. \end{aligned}$$
(12d)
First, we claim that none of \(u_1,u_2,v_1,v_2\) can be zero. Otherwise, if either \(u_1\) or \(v_1\) is zero, then both \(u_1\) and \(v_1\) are zero by (12a), which invalidates (12d). In the other case, if either \(u_2\) or \(v_2\) is zero, then both \(u_2\) and \(v_2\) are zero by (12c), which also invalidates (12d).
Let us now multiply (12a) by \(u_2\) and (12b) by \(u_1\), and we obtain
$$\begin{aligned} 0&=\lambda _1|u_1|^4u_2+\lambda _2|v_1|^4u_2, \\ 0&=\lambda _1|u_1|^4u_2+\lambda _2|v_1|^2\overline{v_1}v_2u_1. \end{aligned}$$
Combining the above two equations leads to
$$\begin{aligned} \lambda _2|v_1|^2\overline{v_1}(v_1u_2-v_2u_1)=0, \end{aligned}$$
which implies that \(v_1u_2=v_2u_1\), i.e., \(\frac{u_1}{v_1}=\frac{u_2}{v_2}\). There exists \(\alpha \in {\mathbb {C}}\) such that \(\varvec{u}=\alpha \varvec{v}\), and so we get
$$\begin{aligned} {\mathcal {T}}=\lambda _1\overline{\alpha \varvec{v}}\otimes \overline{\alpha \varvec{v}}\otimes (\alpha \varvec{v}) \otimes (\alpha \varvec{v}) +\lambda _2\overline{\varvec{v}}\otimes \overline{\varvec{v}}\otimes \varvec{v}\otimes \varvec{v}= (\lambda _1|\alpha |^4+\lambda _2)\overline{\varvec{v}} \otimes \overline{\varvec{v}}\otimes \varvec{v}\otimes \varvec{v}. \end{aligned}$$
Therefore, we arrive at \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\le 1\), which is obviously a contradiction. \(\square \)
Although Example 3.6 invalidates the conjugate version of Comon’s conjecture, the rank and the PS rank of a generic PS tensor (including CPS tensors) can still be the same when its PS rank is no more than its dimension; see Proposition 3.7. This is similar to [7, Proposition 5.3] for a generic symmetric complex tensor.
Proposition 3.7
If a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\) satisfies \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\le n\), then generically \(\mathrm{rank}\,({\mathcal {T}})=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\).
Proof
Let \(\mathrm{rank}\,({\mathcal {T}})=r\) and \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=m\le n\). There exist decompositions
$$\begin{aligned} \sum _{j=1}^r\varvec{c}_{j1}\otimes \ldots \otimes \varvec{c}_{j2d} ={\mathcal {T}}=\sum _{j=1}^{m}\lambda _j {\varvec{a}_j}^{\otimes d}\otimes \overline{\varvec{a}_j}^{\otimes d}, \end{aligned}$$
(13)
where \(\varvec{c}_{jk}\in {\mathbb {C}}^n\) for \(j=1,\ldots ,r\) and \(k=1,\ldots ,2d\), and nonzero \(\lambda _j\in {\mathbb {C}}\) and \(\varvec{a}_j\in {\mathbb {C}}^n\) for \(j=1,\ldots ,m\). As \(m\le n\), it is not difficult to show that the n-dimensional vectors \(\varvec{a}_1,\ldots ,\varvec{a}_m\) are generically linearly independent; see e.g., [7, Lemma 5.2]. As a consequence, one may find \(\varvec{x}_j\in {\mathbb {C}}^n\) for \(j=1,\ldots ,m\), such that
$$\begin{aligned} \langle \varvec{x}_i, \varvec{a}_j \rangle = \left\{ \begin{array}{ll} 1&{} i=j\\ 0&{} i\ne j. \end{array}\right. \end{aligned}$$
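For instance, writing \(A=(\varvec{a}_1,\ldots ,\varvec{a}_m)\in {\mathbb {C}}^{n\times m}\), which generically has full column rank, one may take \((\varvec{x}_1,\ldots ,\varvec{x}_m)=A(A^{\mathrm{H}}A)^{-1}\).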
By applying the multilinear form of \(\bullet \,\overline{\varvec{x}_k}^{d-1}{\varvec{x}_k}^{d}\) on both sides of (13), we obtain
$$\begin{aligned} \left( \sum _{j=1}^r\varvec{c}_{j1}\otimes \ldots \otimes \varvec{c}_{j2d}\right) (\bullet \,\overline{\varvec{x}_k}^{d-1}{\varvec{x}_k}^{d})&= \sum _{j=1}^r \left( \prod _{i=2}^{d} \langle \overline{\varvec{c}_{ji}}, \overline{\varvec{x}_k} \rangle \right) \left( \prod _{i=d+1}^{2d} \langle \overline{\varvec{c}_{ji}}, \varvec{x}_k \rangle \right) \varvec{c}_{j1} \\ \left( \sum _{j=1}^{m}\lambda _j {\varvec{a}_j}^{\otimes d} \otimes \overline{\varvec{a}_j}^{\otimes d}\right) (\bullet \,\overline{\varvec{x}_k}^{d-1} {\varvec{x}_k}^{d})&= \lambda _k \varvec{a}_k. \end{aligned}$$
Therefore, for any \(1\le k\le m\)
$$\begin{aligned} \varvec{a}_k = \frac{1}{\lambda _k}\sum _{j=1}^r \left( \prod _{i=2}^{d} \langle \overline{\varvec{c}_{ji}}, \overline{\varvec{x}_k} \rangle \right) \left( \prod _{i=d+1}^{2d} \langle \overline{\varvec{c}_{ji}}, \varvec{x}_k \rangle \right) \varvec{c}_{j1}, \end{aligned}$$
i.e., a complex linear combination of \(\{\varvec{c}_{11},\ldots ,\varvec{c}_{r1}\}\). This implies that \(m\le r\). Combining with the obvious fact that \(r=\mathrm{rank}\,({\mathcal {T}})\le \mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=m\), we obtain that \(r=m\). In other words, \(\mathrm{rank}\,({\mathcal {T}})=\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})\) holds generically. \(\square \)
We remark that the above result also holds for CPS tensors. This is because CPS tensors are PS, and the PS rank of a CPS tensor is equal to its CPS rank (Theorem 3.5).
Corollary 3.8
If a CPS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) satisfies \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\le n\), then generically \(\mathrm{rank}\,({\mathcal {T}})=\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})\).

4 Rank-one approximation and matricization equivalence

Finding tensor ranks is in general very hard [18]. This makes low-rank approximations important, and in fact they have been one of the main topics for high-order tensors. Along this line, the rank-one approximation is perhaps the simplest and most important problem. In this section, we study several rank-one approximations and the rank-one equivalence via matricization for CPS tensors. As an application of the matricization equivalence, new convex optimization models are developed to find best rank-one approximations of CPS tensors.

4.1 Rank-one approximation

It is well known that finding best rank-one approximations of a real tensor is equivalent to finding the largest singular value [29] of the tensor; see e.g., [28]. For a real symmetric tensor, a best rank-one approximation can be obtained at a symmetric rank-one tensor [3, 11], and this is equivalent to finding the largest eigenvalue [37] of the tensor or to spherically constrained polynomial optimization [17, 46]. In the complex field, Sorber et al. [42] proposed line search and plane search for tensor optimization, including best rank-one approximation as a special case. Ni et al. [33] studied best symmetric rank-one approximations of symmetric complex tensors. Along this line, PS and CPS tensors possess similar properties. Let us first introduce eigenvalues of these tensors.
Definition 4.1
(Jiang et al. [24, Definition 4.4]). \(\lambda \in {\mathbb {C}}\) is called a C-eigenvalue of a CPS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) if there exists a vector \(\varvec{x}\in {\mathbb {C}}^n\) called C-eigenvector, such that \({\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) =\lambda \varvec{x}\) and \(\Vert \varvec{x}\Vert =1\).
All the C-eigenvalues of CPS tensors are real [24]. The C-eigenvalue of PS tensors has not been defined, and we simply adopt Definition 4.1 as the definition of C-eigenvalue and C-eigenvector for PS tensors. In this paper, as long as there is no ambiguity, we refer to a C-eigenvalue and a C-eigenvector as the eigenvalue and the eigenvector of a PS tensor (including CPS tensors), respectively, and call \((\lambda ,\varvec{x})\) in Definition 4.1 an eigenpair. We also need to clarify a couple of terms. For a CPS tensor \({\mathcal {T}}\), if \(\mathrm{rank}\,({\mathcal {T}})=1\), then \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=1\), and so the term rank-one CPS tensor has no ambiguity. However, for a PS tensor \({\mathcal {T}}\), \(\mathrm{rank}\,({\mathcal {T}})=1\) does not imply \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=1\). Here, the term rank-one PS tensor stands for a PS tensor \({\mathcal {T}}\) with \(\mathrm{rank}\,_\mathrm{PS}({\mathcal {T}})=1\).
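To make Definition 4.1 concrete, the following small numerical sketch (ours, not part of the paper; Python with numpy and \(d=2\)) builds a rank-one CPS tensor \(\lambda \,\varvec{x}^{\otimes 2}\otimes \overline{\varvec{x}}^{\otimes 2}\) and checks that \((\lambda ,\varvec{x})\) is an eigenpair in the sense of Definition 4.1.
```python
import numpy as np

# A minimal numerical sketch (not from the paper) of Definition 4.1 for d = 2:
# the rank-one CPS tensor lam * x⊗x⊗conj(x)⊗conj(x) with ||x|| = 1 has (lam, x)
# as a C-eigenpair, i.e., T(• conj(x)^{d-1} x^d) = lam * x.
n = 3
rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x /= np.linalg.norm(x)                       # unit C-eigenvector
lam = 1.7                                    # real C-eigenvalue

# T_{ijkl} = lam * x_i * x_j * conj(x_k) * conj(x_l)
T = lam * np.einsum('i,j,k,l->ijkl', x, x, x.conj(), x.conj())

# Mode 1 is left free; mode 2 pairs with conj(x), modes 3 and 4 pair with x.
Tx = np.einsum('ijkl,j,k,l->i', T, x.conj(), x, x)
print(np.allclose(Tx, lam * x))              # True: (lam, x) is an eigenpair
# The eigenvalue is recovered by the full contraction T(conj(x)^2 x^2):
print(np.allclose(np.einsum('ijkl,i,j,k,l->', T, x.conj(), x.conj(), x, x), lam))
```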
Theorem 4.2
For a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\), \(\lambda \in {\mathbb {C}}\) is a largest (in terms of the modulus) eigenvalue in an eigenpair \((\lambda ,\varvec{x})\) of \({\mathcal {T}}\) if and only if \(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) is a best rank-one PS tensor approximation of \({\mathcal {T}}\), i.e.,
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{max}}\limits _{{\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) =\lambda \varvec{x},\,\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {C}}}|\lambda | =\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {C}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\Vert . \end{aligned}$$
(14)
Proof
Straightforward computation shows that
$$\begin{aligned} \Vert {\mathcal {T}}-\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\Vert ^2 = \Vert {\mathcal {T}}\Vert ^2 +|\lambda |^2 - 2\,\mathrm{Re}\,(\overline{\lambda } {\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)). \end{aligned}$$
To minimize the right hand side of the above for given \({\mathcal {T}}\) and fixed \(\varvec{x}\), \(\lambda \in {\mathbb {C}}\) must satisfy \(\arg (\lambda )=\arg ({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d))\), which implies that \(\mathrm{Re}\,(\overline{\lambda } {\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d))= |\lambda |\cdot |{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|\). We have
$$\begin{aligned} \min _{\lambda \in {\mathbb {C}}}\left( \Vert {\mathcal {T}}\Vert ^2 + |\lambda |^2 -2|\lambda |\cdot |{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|\right) =\Vert {\mathcal {T}}\Vert ^2 -|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|^2, \end{aligned}$$
where the minimum is attained if and only if \(|\lambda |=|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|\). This further implies that \(\lambda ={\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)\) is an optimal solution of the right hand side of (14). Therefore, we arrive at
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{min}}\limits _{\Vert \varvec{x}\Vert =1,\lambda \in {\mathbb {C}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d} \otimes \overline{\varvec{x}}^{\otimes d}\Vert =\mathop {\mathrm{arg}\,\mathrm{max}}\limits _{\Vert \varvec{x}\Vert =1, \,{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)=\lambda } |{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)| =\mathop {\mathrm{arg}\,\mathrm{max}}\limits _{\Vert \varvec{x}\Vert =1,\,{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)=\lambda } |\lambda |. \end{aligned}$$
By comparing to the left hand side of (14), it suffices to show that
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{max}}\limits _{{\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}, \,\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {C}}}|\lambda | =\mathop {\mathrm{arg}\,\mathrm{max}}\limits _{\Vert \varvec{x}\Vert =1,\,{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)=\lambda } |\lambda |. \end{aligned}$$
(15)
It is obvious that \({\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}\) implies \({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)=\lambda \) by pre-multiplying \(\overline{\varvec{x}}\) on both sides. It remains to prove that an optimal solution of the right hand side of (15) satisfies \({\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}\), i.e., \(\varvec{x}\) is an eigenvector of \({\mathcal {T}}\).
The right hand side of (15) is equivalent to \(\max _{\Vert \varvec{x}\Vert ^2=1}|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|^2\). Let \({\mathcal {T}}={\mathcal {U}}+\mathbf{i}{\mathcal {V}}\) where \({\mathcal {U}},{\mathcal {V}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) as in Proposition 2.4. This problem is further equivalent to
$$\begin{aligned} \max _{\Vert \varvec{x}-\varvec{y}\Vert ^2 = 0,\, \Vert \varvec{x}\Vert ^2 = 1,\,\Vert \varvec{y}\Vert ^2 = 1} \left( {\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d)\right) ^2 + \left( {\mathcal {V}}(\overline{\varvec{y}}^d\varvec{y}^d)\right) ^2. \end{aligned}$$
(16)
Since both \({\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d)\) and \({\mathcal {V}}(\overline{\varvec{y}}^d\varvec{y}^d)\) are real, the Lagrangian function is
$$\begin{aligned} L(\varvec{x},\varvec{y},\gamma _1,\gamma _2,\gamma _3) =\left( {\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d)\right) ^2 +\left( {\mathcal {V}}(\overline{\varvec{y}}^d\varvec{y}^d)\right) ^2 + \gamma _1\Vert \varvec{x}-\varvec{y}\Vert ^2 +\gamma _2(1-\Vert \varvec{x}\Vert ^2) + \gamma _3(1-\Vert \varvec{y}\Vert ^2). \end{aligned}$$
This provides (part of) the first-order optimality condition:
$$\begin{aligned} \frac{\partial L}{\partial \overline{\varvec{x}}}&= 2d\,{\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d) \, {\mathcal {U}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) + \gamma _1(\varvec{x}-\varvec{y})-\gamma _2\varvec{x}= 0, \\ \frac{\partial L}{\partial \overline{\varvec{y}}}&= 2d\,{\mathcal {V}}(\overline{\varvec{y}}^d\varvec{y}^d) \, {\mathcal {V}}(\bullet \,\overline{\varvec{y}}^{d-1}\varvec{y}^d) - \gamma _1(\varvec{x}-\varvec{y})-\gamma _3\varvec{y}= 0. \end{aligned}$$
Since \(\varvec{x}=\varvec{y}\) by the constraints of (16), the above equations lead to
$$\begin{aligned} {\mathcal {U}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)&=\frac{\gamma _2\varvec{x}}{2d\,{\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d)},\\ {\mathcal {V}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)&= \frac{\gamma _3\varvec{x}}{2d\,{\mathcal {V}}(\overline{\varvec{x}}^d\varvec{x}^d)}, \end{aligned}$$
which implies that
$$\begin{aligned} {\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) ={\mathcal {U}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) +\mathbf{i}{\mathcal {V}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d) =\left( \frac{\gamma _2}{2d\,{\mathcal {U}}(\overline{\varvec{x}}^d\varvec{x}^d)} +\frac{\mathbf{i}\gamma _3}{2d\,{\mathcal {V}}(\overline{\varvec{x}}^d\varvec{x}^d)} \right) \varvec{x}. \end{aligned}$$
Therefore, \(\varvec{x}\) is an eigenvector of \({\mathcal {T}}\). \(\square \)
Since CPS tensors are PS tensors, for a best rank-one PS tensor approximation \(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) in (14), \(\lambda \) must be an eigenvalue. As all the eigenvalues of CPS tensors are real, \(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) becomes a best rank-one CPS tensor approximation. Therefore, Theorem 4.2 immediately implies a similar result for CPS tensors.
Corollary 4.3
For a CPS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\), \(\lambda \in {\mathbb {R}}\) is a largest (in terms of the absolute value) eigenvalue in an eigenpair \((\lambda ,\varvec{x})\) of \({\mathcal {T}}\) if and only if \(\lambda \,\varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\) is a best rank-one CPS tensor approximation of \({\mathcal {T}}\), i.e.,
$$\begin{aligned} \mathop {\mathrm{arg}\,\mathrm{max}}\limits _{{\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}, \,\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {R}}}|\lambda | =\mathop {\mathrm{arg}\,\mathrm{min}}\limits _{\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {R}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d}\otimes \overline{\varvec{x}}^{\otimes d}\Vert . \end{aligned}$$
For CPS tensors, one may consider different rank-one approximation problems. The following result is interesting; it echoes the inequivalence of ranks discussed earlier in Theorem 3.5.
Theorem 4.4
If \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) is a CPS tensor with \(d\ge 2\), then the best rank-one CPS tensor approximation of \({\mathcal {T}}\) is equivalent to the best rank-one PS tensor approximation of \({\mathcal {T}}\), but is not equivalent to the best rank-one complex tensor approximation of \({\mathcal {T}}\), i.e.,
$$\begin{aligned} \min _{\mathrm{rank}\,_\mathrm{CPS}({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert =\min _{\mathrm{rank}\,_\mathrm{PS}({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert \ge \min _{\mathrm{rank}\,({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}^{n^{2d}}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert . \end{aligned}$$
(17)
Moreover, there exists a CPS tensor \({\mathcal {T}}\) such that
$$\begin{aligned} \min _{\mathrm{rank}\,_\mathrm{CPS}({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert > \min _{\mathrm{rank}\,({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}^{n^{2d}}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert . \end{aligned}$$
In fact, the equality in (17) is a consequence of Theorem 4.2 and Corollary 4.3. The inequality in (17) is obvious since \({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\subseteq {\mathbb {C}}^{n^{2d}}\) and \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {X}})=1\) implies that \(\mathrm{rank}\,({\mathcal {X}})=1\), and its strictness can be validated by the following example.
Example 4.5
Let \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{2^4}\) where \({\mathcal {T}}_{1122}={\mathcal {T}}_{2211}=1\) and other entries are zeros. For any \(\varvec{z}\in {\mathbb {C}}^2\) with \(\Vert \varvec{z}\Vert =1\), one has
$$\begin{aligned} |{\mathcal {T}}(\overline{\varvec{z}}^2\varvec{z}^2)|=|{\overline{z_1}}^2{z_2}^2+{\overline{z_2}}^2{z_1}^2| \le 2|z_1|^2|z_2|^2\le \frac{1}{2}(|z_1|^2+|z_2|^2)^2=\frac{1}{2}, \end{aligned}$$
implying that
$$\begin{aligned} \Vert {\mathcal {T}}-\lambda \varvec{z}^{\otimes 2}\otimes \overline{\varvec{z}}^{\otimes 2}\Vert ^2 =\Vert {\mathcal {T}}\Vert ^2-2\lambda {\mathcal {T}}(\overline{\varvec{z}}^2\varvec{z}^2)+\lambda ^2\ge \Vert {\mathcal {T}}\Vert ^2 -|{\mathcal {T}}(\overline{\varvec{z}}^2\varvec{z}^2)|^2 \ge 2-\frac{1}{4}=\frac{7}{4}. \end{aligned}$$
However,
$$\begin{aligned} \Vert {\mathcal {T}}-\varvec{e}_1\otimes \varvec{e}_1\otimes \varvec{e}_2\otimes \varvec{e}_2\Vert ^2 =1. \end{aligned}$$
This shows that \(\min _{\mathrm{rank}\,_\mathrm{CPS}({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}_{\mathrm{cps}}^{2^4}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert > \min _{\mathrm{rank}\,({\mathcal {X}})=1,\,{\mathcal {X}}\in {\mathbb {C}}^{2^4}}\Vert {\mathcal {T}}-{\mathcal {X}}\Vert \).
We remark that the equivalence between best rank-one CPS tensor approximations and best rank-one complex tensor approximations actually holds for Hermitian matrices (\(d=1\)), i.e.,
$$\begin{aligned} \min _{\mathrm{rank}\,_\mathrm{CPS}(X)=1,\,X\in {\mathbb {C}}_{\mathrm{cps}}^{n^2}}\Vert A-X\Vert =\min _{\mathrm{rank}\,_\mathrm{PS}(X)=1,\,X\in {\mathbb {C}}_{\mathrm{ps}}^{n^2}}\Vert A-X\Vert =\min _{\mathrm{rank}\,(X)=1,\,X\in {\mathbb {C}}^{n^2}}\Vert A-X\Vert \end{aligned}$$
when \(A\in {\mathbb {C}}^{n^2}\) is Hermitian. This is because of the trivial fact \({\mathbb {C}}^{n^2}={\mathbb {C}}_{\mathrm{ps}}^{n^2}\) and the equality in (17).

4.2 Rank-one equivalence via matricization

Matricization, also known as matrix flattening or matrix unfolding, of a tensor is a widely used tool to study high-order tensors. When a tensor is rank-one, it is obvious that any matricization of the tensor is rank-one, while the reverse is not true in general. The rank-one equivalence via general matricization has been studied for real tensors [43]. For an even-order symmetric tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{s}}^{n^{2d}}\), it is known that if its square matricization (unfolding \({\mathcal {T}}\) as an \(n^d\times n^d\) matrix) is rank-one, then the original tensor \({\mathcal {T}}\) must be rank-one; see e.g., [25, 35]. In the real field, this rank-one equivalence suggests some convex optimization methods to compute the largest eigenvalue or best rank-one approximations of a symmetric tensor. In practice, the methods are very likely to find global optimal solutions [25, 35]. Inspired by these results, let us look into the rank-one equivalence for CPS tensors.
For a CPS tensor, one hopes that its square matricization being rank-one implies the original tensor being rank-one. Unfortunately, this may not hold true if a CPS tensor is not unfolded in the right way. The following example shows that the standard square matricization of a non-rank-one CPS tensor turns out to be a rank-one Hermitian matrix.
Example 4.6
Let \({\mathcal {T}}=\overline{A}\otimes A\in {\mathbb {C}}^{2^4}\) where \(A=\left( \begin{array}{cc} 1 &{} 1 + \mathbf{i}\\ 1+\mathbf{i}&{} 2 \end{array}\right) \in {\mathbb {C}}^{2^2}\). Explicitly, \({\mathcal {T}}\) can be written as
$$\begin{aligned} \begin{array}{c|c} {\mathcal {T}}(\varvec{e}_1\varvec{e}_1\bullet \bullet ) &{} {\mathcal {T}}(\varvec{e}_1\varvec{e}_2\bullet \bullet ) \\ \hline {\mathcal {T}}(\varvec{e}_2\varvec{e}_1\bullet \bullet ) &{} {\mathcal {T}}(\varvec{e}_2\varvec{e}_2\bullet \bullet ) \end{array} =\begin{array}{c|c} \left( \begin{array}{cc} 1 &{} 1+\mathbf{i}\\ 1+\mathbf{i}&{} 2 \end{array}\right) &{} \left( \begin{array}{cc} 1-\mathbf{i}&{} 2 \\ 2 &{} 2-2\mathbf{i}\end{array}\right) \\ \hline \left( \begin{array}{cc} 1-\mathbf{i}&{} 2 \\ 2 &{} 2-2\mathbf{i}\end{array}\right) &{} \left( \begin{array}{cc} 2 &{} 2+2\mathbf{i}\\ 2+2\mathbf{i}&{} 4 \end{array}\right) \end{array}, \end{aligned}$$
which can be straightforwardly verified as a CPS tensor. However, \(\mathrm{rank}\,({\mathcal {T}})\ge 2\) but the standard square matricization of \({\mathcal {T}}\) is a rank-one Hermitian matrix.
Proof
By the construction of \({\mathcal {T}}\) via the outer product of two matrices, it is obvious that its standard square matricization is \((1,1+\mathbf{i},1+\mathbf{i},2)^{\mathrm{H}}(1,1+\mathbf{i},1+\mathbf{i},2)\), which is a rank-one Hermitian matrix.
On the other hand, suppose that \(\mathrm{rank}\,({\mathcal {T}})=1\). This implies that \(\mathrm{rank}\,_\mathrm{CPS}({\mathcal {T}})=1\) and so we may let \({\mathcal {T}}=\overline{\varvec{x}}\otimes \overline{\varvec{x}}\otimes \varvec{x}\otimes \varvec{x}\) for some \(\varvec{x}\in {\mathbb {C}}^2\). By comparing some entries, one has
$$\begin{aligned}&|x_1|^4={\mathcal {T}}_{1111}=1, \,|x_2|^4={\mathcal {T}}_{2222}=4, \\&\, \overline{x_1}^2x_1x_2={\mathcal {T}}_{1112}=1+\mathbf{i}, \,\overline{x_2}\overline{x_1}x_2x_2={\mathcal {T}}_{2122}=2-2\mathbf{i}. \end{aligned}$$
Clearly \(|x_1|^2=1\) and \(|x_2|^2=2\), and this leads to
$$\begin{aligned} 2-2\mathbf{i}= \overline{x_2}\overline{x_1}x_2x_2 = 2\,\overline{x_1}x_2 = 2 \, \overline{x_1}^2x_1x_2 = 2 + 2\mathbf{i}, \end{aligned}$$
a contradiction. Therefore, \(\mathrm{rank}\,({\mathcal {T}})\ge 2\). \(\square \)
The role of symmetry is important in nonlinear elasticity and material sciences [20, 38]. The following example of elasticity tensors points out a structural difference for the rank-one equivalence via matricization.
Example 4.7
(Itin and Hehl [21]). Elasticity tensors are three-dimensional fourth-order real positive definite tensors with certain symmetry. Specifically, \({\mathcal {A}}\in {\mathbb {R}}^{3^4}\) is an elasticity tensor if
$$\begin{aligned} \begin{array}{ll} {\mathcal {A}}_{ijk\ell }={\mathcal {A}}_{jik\ell }={\mathcal {A}}_{ij\ell k}={\mathcal {A}}_{k\ell ij}&{} \forall \, 1\le i,j,k,\ell \le 3,\\ {\mathcal {A}}(\varvec{x}^4)> 0&{}\forall \,\varvec{x}\in {\mathbb {R}}^3,\,\varvec{x}\ne \mathbf{0}. \end{array} \end{aligned}$$
It is straightforward to check that an elasticity tensor is always CPS. After applying the standard square matricization, deleting identical rows and columns, and swapping some rows and columns, \({\mathcal {A}}\) can be one-to-one represented by a \(6\times 6\) real symmetric matrix, known as Voigt’s notation:
$$\begin{aligned} \left( \begin{array}{cccccc} {\mathcal {A}}_{1111} &{} {\mathcal {A}}_{1122} &{} {\mathcal {A}}_{1133} &{} {\mathcal {A}}_{1123} &{} {\mathcal {A}}_{1131} &{} {\mathcal {A}}_{1112} \\ &{} {\mathcal {A}}_{2222} &{} {\mathcal {A}}_{2233} &{} {\mathcal {A}}_{2223} &{} {\mathcal {A}}_{2231} &{} {\mathcal {A}}_{2212} \\ &{} &{} {\mathcal {A}}_{3333} &{} {\mathcal {A}}_{3323} &{} {\mathcal {A}}_{3331} &{} {\mathcal {A}}_{3312} \\ &{} &{} &{} {\mathcal {A}}_{2323} &{} {\mathcal {A}}_{2331} &{} {\mathcal {A}}_{2312} \\ &{} &{} &{} &{} {\mathcal {A}}_{3131} &{} {\mathcal {A}}_{3112} \\ &{} &{} &{} &{} &{} {\mathcal {A}}_{1212} \\ \end{array}\right) , \end{aligned}$$
(18)
whose degree of freedom is 21. These independent entries are called elasticities. If the Voigt's matrix, or equivalently the standard square matricization, is rank-one, then this positive definite matrix can be written as \(\varvec{x}\varvec{x}^{\mathrm{T}}\) with \(\varvec{x}\in {\mathbb {R}}^6\), whose degree of freedom is 6. However, if the elasticity tensor \({\mathcal {A}}\) is rank-one, then \({\mathcal {A}}\) can be written as \(\overline{\varvec{y}}\otimes \overline{\varvec{y}} \otimes \varvec{y}\otimes \varvec{y}\) with \(\varvec{y}\in {\mathbb {C}}^3\) since \({\mathcal {A}}\) is CPS and positive definite, and further it can be easily shown that \(\varvec{y}\in {\mathbb {R}}^3\), whose degree of freedom is then 3. Therefore, the standard square matricization of an elasticity tensor being rank-one cannot guarantee that the original elasticity tensor is rank-one.
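For concreteness, the following short numpy sketch (ours; 0-based indices, and the helper name voigt_matrix is hypothetical) extracts the Voigt matrix (18) from a tensor in \({\mathbb {R}}^{3^4}\) with the above symmetries, and checks that a rank-one symmetric tensor indeed yields a rank-one Voigt matrix; the converse direction is exactly what fails in the discussion above.
```python
import numpy as np

# A small sketch (ours; 0-based numpy indexing, hypothetical helper name) that
# extracts Voigt's 6x6 matrix (18) from a tensor A in R^{3^4} with the
# elasticity symmetries. The Voigt pairs follow the order 11, 22, 33, 23, 31, 12.
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (2, 0), (0, 1)]

def voigt_matrix(A):
    V = np.empty((6, 6))
    for p, (i, j) in enumerate(VOIGT_PAIRS):
        for q, (k, l) in enumerate(VOIGT_PAIRS):
            V[p, q] = A[i, j, k, l]
    return V

# A rank-one symmetric tensor y⊗y⊗y⊗y produces a rank-one Voigt matrix.
y = np.array([1.0, 2.0, -1.0])
A = np.einsum('i,j,k,l->ijkl', y, y, y, y)
print(np.linalg.matrix_rank(voigt_matrix(A)))   # 1
```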
We notice that square matricization is unique for symmetric tensors, but not for CPS tensors. Examples 4.6 and 4.7 motivate us to consider other ways of matricization, with a hope to establish certain rank-one equivalence. To this end, it is necessary to introduce tensor transpose, extending the concept of matrix transpose.
Definition 4.8
Given a tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^d}\) and a permutation \(\pi =(\pi _1,\ldots ,\pi _d)\in \Pi (1,\ldots ,d)\), the \(\pi \)-transpose of \({\mathcal {T}}\), denoted by \({\mathcal {T}}^\pi \in {\mathbb {C}}^{n^d}\), satisfies
$$\begin{aligned} {\mathcal {T}}_{i_1\ldots i_d}=({\mathcal {T}}^\pi )_{i_{\pi _1} \ldots i_{\pi _{d}}} \quad \forall \, 1 \le i_1, \ldots , i_d\le n. \end{aligned}$$
In a plain language, mode 1 of \({\mathcal {T}}^\pi \) originates from mode \(\pi _1\) of \({\mathcal {T}}\), mode 2 of \({\mathcal {T}}^\pi \) originates from mode \(\pi _2\) of \({\mathcal {T}}\), and so on. As a matter of fact, for a matrix \(A\in {\mathbb {C}}^{n^2}\) and \(\pi =(2,1)\), \(A^\pi =A^{\mathrm{T}}\). For a PS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\) and \(\pi =(d+1,\ldots ,2d,1,\ldots ,d)\), \({\mathcal {T}}^{\mathrm{H}}=\overline{{\mathcal {T}}^\pi }\).
Given any integers \(1\le i_1,\ldots ,i_d\le n\), let us denote
$$\begin{aligned} n(i_1\ldots i_d):=\sum _{k = 1}^{d}(i_k-1)n^{d-k} + 1 \end{aligned}$$
to be the decimal of the tuple \(i_1\ldots i_d\) in the base-n numeral system. We now discuss matricization and vectorization.
Definition 4.9
Given a tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{d}}\), the vectorization of \({\mathcal {T}}\), denoted by \(\varvec{v}({\mathcal {T}})\), is an \(n^{d}\)-dimensional vector satisfying
$$\begin{aligned} v({\mathcal {T}})_{n(i_1\ldots i_d)} = {\mathcal {T}}_{i_1\ldots i_{d}} \quad \forall \, 1\le i_1,\ldots ,i_{d}\le n. \end{aligned}$$
Given an even-order tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\), the standard square matricization (or simply matricization) of \({\mathcal {T}}\), denoted by \(M({\mathcal {T}})\), is an \(n^d\times n^d\) matrix satisfying
$$\begin{aligned} M({\mathcal {T}})_{n(i_1\ldots i_d)\, n(i_{d+1}\ldots i_{2d})} ={\mathcal {T}}_{i_1\ldots i_{2d}} \quad \forall \, 1\le i_1,\ldots ,i_{2d}\le n. \end{aligned}$$
Definition 4.10
Given an even-order tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) and a permutation \(\pi \in \Pi (1,\ldots ,2d)\), the \(\pi \)-matricization of \({\mathcal {T}}\), denoted by \(M_\pi ({\mathcal {T}})\), satisfies
$$\begin{aligned} M_\pi ({\mathcal {T}})_{n(i_{\pi _1}\ldots i_{\pi _d})\, n(i_{\pi _{d+1}} \ldots i_{\pi _{2d}})}= {\mathcal {T}}_{i_{1} \ldots i_{{2d}}} \quad \forall \, 1 \le i_1, \ldots , i_{2d}\le n, \end{aligned}$$
in other words, \(M_\pi ({\mathcal {T}})=M({\mathcal {T}}^\pi )\).
Obviously the standard square matricization is a \(\pi \)-matricization when \(\pi =(1,2,\ldots ,2d)\). Vectorization, \(\pi \)-matricization and \(\pi \)-transpose are all one-to-one. They are different ways of representing tensor data. The following properties on ranks are straightforward.
Proposition 4.11
Given a tensor \({\mathcal {T}}\in {\mathbb {C}}^{n^{2d}}\) and any \(\pi \in \Pi (1,\ldots ,2d)\), it follows that \(\mathrm{rank}\,({\mathcal {T}})=\mathrm{rank}\,({\mathcal {T}}^\pi )\) and \(\mathrm{rank}\,(M_\pi ({\mathcal {T}}))\le \mathrm{rank}\,({\mathcal {T}})\), in particular, \(\mathrm{rank}\,({\mathcal {T}})=1 \Longrightarrow \mathrm{rank}\,(M_\pi ({\mathcal {T}}))=1\).
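The following short numpy sketch (ours, not part of the paper) implements Definitions 4.8-4.10 with 0-based indices; a row-major reshape realizes the base-n index \(n(i_1\ldots i_d)\). It also checks Proposition 4.11 on a random rank-one tensor.
```python
import numpy as np

# A minimal numpy sketch (ours) of Definitions 4.8-4.10 with 0-based indices;
# the row-major reshape realizes the base-n index n(i_1...i_d).

def pi_transpose(T, pi):
    """T^pi: mode k of the result originates from mode pi[k] of T (pi is 1-based)."""
    return np.transpose(T, [p - 1 for p in pi])

def matricization(T):
    """Standard square matricization M(T) of an order-2d tensor with all modes of size n."""
    n, d = T.shape[0], T.ndim // 2
    return T.reshape(n**d, n**d)

def pi_matricization(T, pi):
    """M_pi(T) = M(T^pi)."""
    return matricization(pi_transpose(T, pi))

# Quick check of Proposition 4.11: a rank-one tensor has a rank-one pi-matricization.
rng = np.random.default_rng(0)
a, b, c, e = [rng.standard_normal(3) + 1j * rng.standard_normal(3) for _ in range(4)]
T = np.einsum('i,j,k,l->ijkl', a, b, c, e)
print(np.linalg.matrix_rank(pi_matricization(T, (2, 4, 1, 3))) == 1)   # True
```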
As mentioned earlier, the \(\pi \)-matricization of a symmetric tensor is unique for any permutation \(\pi \), since \({\mathcal {T}}^\pi ={\mathcal {T}}\) if \({\mathcal {T}}\) is symmetric. However, CPS tensors only possess partial symmetry as well as certain conjugate property. Therefore, conditions on \(\pi \) are necessary to guarantee the rank-one equivalence, as well as for the \(\pi \)-matricization to be Hermitian.
Proposition 4.12
If \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) and a permutation \(\pi \in \Pi (1,\ldots ,2d)\) satisfies
$$\begin{aligned} \left| \{\pi _k,\pi _{d+k}\}\cap \{1,\ldots ,d\}\right| =1 \quad \forall \, 1\le k\le d, \end{aligned}$$
(19)
then \(M_\pi ({\mathcal {T}})\) is a CPS (Hermitian) matrix.
Proof
Since any CPS tensor can be written as a sum of rank-one CPS tensors (Theorem 3.2), we only need to consider the case where \({\mathcal {T}}\) is rank-one. Suppose that \({\mathcal {T}}=\varvec{a}_1\otimes \ldots \otimes \varvec{a}_{2d}\), where \(\varvec{a}_1=\ldots =\varvec{a}_d=\overline{\varvec{x}}\) and \(\varvec{a}_{d+1}=\ldots =\varvec{a}_{2d}=\varvec{x}\), and so \({\mathcal {T}}^\pi =\varvec{a}_{\pi _1}\otimes \ldots \otimes \varvec{a}_{\pi _{2d}}\).
For any \(1\le k\le d\), as exactly one of \(\{\pi _k,\pi _{d+k}\}\) belongs to \(\{1,\ldots ,d\}\) and the other belongs to \(\{d+1,\ldots ,2d\}\), we have \(\overline{\varvec{a}_{\pi _k}}=\varvec{a}_{\pi _{d+k}}\). Therefore,
$$\begin{aligned} M({\mathcal {T}}^\pi )^{\mathrm{H}}&= \left( \varvec{v}(\varvec{a}_{\pi _1}\otimes \ldots \otimes \varvec{a}_{\pi _d}) \otimes \varvec{v}(\varvec{a}_{\pi _{d+1}} \otimes \ldots \otimes \varvec{a}_{\pi _{2d}}) \right) ^{\mathrm{H}}\\&= \overline{\varvec{v}(\varvec{a}_{\pi _{d+1}}\otimes \ldots \otimes \varvec{a}_{\pi _{2d}}) \otimes \varvec{v}(\varvec{a}_{\pi _1}\otimes \ldots \otimes \varvec{a}_{\pi _d})} \\&= \varvec{v}(\overline{\varvec{a}_{\pi _{d+1}}}\otimes \ldots \otimes \overline{\varvec{a}_{\pi _{2d}}}) \otimes \varvec{v}(\overline{\varvec{a}_{\pi _1}}\otimes \ldots \otimes \overline{\varvec{a}_{\pi _d}}) \\&= \varvec{v}(\varvec{a}_{\pi _{1}}\otimes \ldots \otimes \varvec{a}_{\pi _{d}}) \otimes \varvec{v}(\varvec{a}_{\pi _{d+1}}\otimes \ldots \otimes \varvec{a}_{\pi _{2d}}) \\&= M({\mathcal {T}}^\pi ), \end{aligned}$$
proving that \(M_\pi ({\mathcal {T}})=M({\mathcal {T}}^\pi )\) is a Hermitian matrix. \(\square \)
We remark that the condition of \(\pi \) in (19) is in fact necessary for \(M_\pi ({\mathcal {T}})\) to be a Hermitian matrix for a general CPS tensor \({\mathcal {T}}\). Essentially, if mode k of \({\mathcal {T}}^\pi \) originates from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\), then mode \(d+k\) of \({\mathcal {T}}^\pi \) must originate from modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\), and vice versa.
Theorem 4.13
If \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^{2d}}\) and \(\pi \in \Pi (1,\ldots ,2d)\) satisfies
$$\begin{aligned} \left\lfloor \frac{d}{2}\right\rfloor \le \left| \{\pi _1,\ldots ,\pi _{d}\} \cap \{1,\ldots ,d\}\right| \le \left\lceil \frac{d}{2}\right\rceil , \end{aligned}$$
(20)
then \(\mathrm{rank}\,(M_\pi ({\mathcal {T}}))=1 \Longrightarrow \mathrm{rank}\,({\mathcal {T}})=1\).
Proof
Let \(M_\pi ({\mathcal {T}})=\varvec{x}\otimes \varvec{y}\) where \(\varvec{x}\) and \(\varvec{y}\) are \(n^d\)-dimensional vectors, and further let \(\varvec{x}=\varvec{v}({\mathcal {X}})\) and \(\varvec{y}=\varvec{v}({\mathcal {Y}})\) where \({\mathcal {X}},{\mathcal {Y}}\in {\mathbb {C}}^{n^d}\). Since
$$\begin{aligned} M({\mathcal {T}}^\pi )=M_\pi ({\mathcal {T}})=\varvec{x}\otimes \varvec{y}=\varvec{v}({\mathcal {X}})\otimes \varvec{v}({\mathcal {Y}}), \end{aligned}$$
one has \({\mathcal {T}}^\pi ={\mathcal {X}}\otimes {\mathcal {Y}}\). To prove \(\mathrm{rank}\,({\mathcal {T}})=1\), it suffices to show that \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\).
Modes \(1,\ldots ,d\) of \({\mathcal {T}}^\pi \) originate from modes \(\pi _1,\ldots ,\pi _d\) of \({\mathcal {T}}\), respectively. By (20), about half of them (either \(\left\lfloor d/2\right\rfloor \) or \(\left\lceil d/2 \right\rceil \)) come from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) and the remaining ones come from modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\). This is also true for the modes of \({\mathcal {X}}\). To provide a clearer presentation, we may construct a permutation \(\tau \) such that the first half of the modes of \({\mathcal {X}}^\tau \) are from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) and the remaining half are from modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\). Explicitly, \(\tau \in \Pi (1,\ldots ,d)\) needs to satisfy
$$\begin{aligned} \left\{ \tau _1,\ldots ,\tau _{\left| \{\pi _1,\ldots ,\pi _{d}\} \cap \{1,\ldots ,d\}\right| }\right\} =\left\{ 1\le k\le d: 1\le \pi _k\le d \right\} . \end{aligned}$$
(21)
As \({\mathcal {T}}\) is symmetric to modes \(\{1,\ldots ,d\}\) and to modes \(\{d+1,\ldots ,2d\}\), respectively, the order of \(\tau _k\)’s in (21) does not matter. Observing that \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {X}}^\tau )\) and \(M({\mathcal {X}}^\tau \otimes {\mathcal {Y}})\) is rank-one, we may without loss of generality assume that the first \(\left\lceil d/2 \right\rceil \) modes of \({\mathcal {X}}\) are from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) and the remaining \(\left\lfloor d/2 \right\rfloor \) from modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\), and for the same reason assume that the first \(\left\lfloor d/2 \right\rfloor \) modes of \({\mathcal {Y}}\) are from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) and the remaining \(\left\lceil d/2 \right\rceil \) from modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\). In a nutshell, we can assume without loss of generality that
$$\begin{aligned} \pi =\left( 1,\ldots , \left\lceil \frac{d}{2}\right\rceil , d+1, \ldots , d+ \left\lfloor \frac{d}{2}\right\rfloor , \left\lceil \frac{d}{2}\right\rceil +1, \ldots , d, d +\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots , 2d \right) . \end{aligned}$$
This implies that \({\mathcal {X}}\otimes {\mathcal {Y}}={\mathcal {T}}^\pi \) is symmetric to modes
$$\begin{aligned} {\mathbb {I}}_d^1:=\left\{ 1,\ldots ,\left\lceil \frac{d}{2}\right\rceil ,d+1, \ldots ,d+\left\lfloor \frac{d}{2}\right\rfloor \right\} \end{aligned}$$
and symmetric to modes
$$\begin{aligned} {\mathbb {I}}_d^2:=\left\{ \left\lceil \frac{d}{2} \right\rceil +1,\ldots ,d,d +\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots ,2d\right\} . \end{aligned}$$
We proceed to prove that under the condition of \({\mathcal {X}}\otimes {\mathcal {Y}}\) being symmetric to modes \({\mathbb {I}}_d^1\) and symmetric to modes \({\mathbb {I}}_d^2\), \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\) holds, by induction on d.
When \(d=1\), both \({\mathcal {X}}\) and \({\mathcal {Y}}\) are obviously rank-one as they are vectors. Suppose that the claim holds for \(d-1\). For general d, as \({\mathcal {X}}\otimes {\mathcal {Y}}\) is symmetric to modes \({\mathbb {I}}_d^1\) and symmetric to modes \({\mathbb {I}}_d^2\), we can swap all but the first mode of \({\mathcal {X}}\) with some modes of \({\mathcal {Y}}\). In particular, modes \(\left\{ 2,\ldots , \left\lceil \frac{d}{2}\right\rceil \right\} \) of \({\mathcal {X}}\) are swapped with modes \(\left\{ 1,\ldots , \left\lceil \frac{d}{2}\right\rceil -1\right\} \) of \({\mathcal {Y}}\), respectively, and modes \(\left\{ \left\lceil \frac{d}{2}\right\rceil +1, \ldots ,d\right\} \) of \({\mathcal {X}}\) are swapped with modes \(\left\{ \left\lceil \frac{d}{2}\right\rceil +1, \ldots ,d\right\} \) of \({\mathcal {Y}}\), respectively. Consequently, one has for any \(1 \le i_1, \ldots , i_{2d}\le n\),
$$\begin{aligned} {\mathcal {X}}_{i_1\ldots i_d}{\mathcal {Y}}_{i_{d+1}\ldots i_{2d}} ={\mathcal {X}}_{ i_1 i_{d+1} \ldots i_{d+\lceil d/2\rceil -1} i_{d+\lceil d/2\rceil +1} \ldots i_{2d} } {\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } i_{d+\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }. \end{aligned}$$
(22)
Pick any nonzero entry of \({\mathcal {Y}}\), say \({\mathcal {Y}}_{k_1\ldots k_d}\ne 0\). Let \((i_{d+1}, \ldots ,i_{2d})=(k_1,\ldots ,k_d)\) in (22) and we have
$$\begin{aligned} {\mathcal {X}}_{i_1\ldots i_d}{\mathcal {Y}}_{k_1\ldots k_d} ={\mathcal {X}}_{ i_1 k_1 \ldots k_{\lceil d/2\rceil -1} k_{\lceil d/2\rceil +1} \ldots k_{d} } {\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } k_{\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }. \end{aligned}$$
By defining \(\varvec{a}\in {\mathbb {C}}^n\) and \({\mathcal {U}}\in {\mathbb {C}}^{n^{d-1}}\) where
$$\begin{aligned} a_{i_1}:=\frac{{\mathcal {X}}_{ i_1 k_1 \ldots k_{\lceil d/2\rceil -1} k_{\lceil d/2\rceil +1} \ldots k_{d} }}{{\mathcal {Y}}_{k_1\ldots k_d}} \text { and } {\mathcal {U}}_{i_2\ldots i_d}:={\mathcal {Y}}_{ i_2 \ldots i_{\lceil d/2\rceil } k_{\lceil d/2\rceil } i_{\lceil d/2\rceil +1}\ldots i_d }, \end{aligned}$$
we obtain that \({\mathcal {X}}=\varvec{a}\otimes {\mathcal {U}}\). Similarly to (22), one may swap all but the last mode of \({\mathcal {Y}}\) with some modes of \({\mathcal {X}}\) and obtain \({\mathcal {Y}}={\mathcal {V}}\otimes \varvec{b}\) where \({\mathcal {V}}\in {\mathbb {C}}^{n^{d-1}}\) and \(\varvec{b}\in {\mathbb {C}}^n\). Since
$$\begin{aligned} {\mathcal {X}}\otimes {\mathcal {Y}}= \varvec{a}\otimes {\mathcal {U}}\otimes {\mathcal {V}}\otimes \varvec{b}\end{aligned}$$
is symmetric with respect to modes \({\mathbb {I}}^1_d\) and symmetric with respect to modes \({\mathbb {I}}^2_d\), \({\mathcal {U}}\otimes {\mathcal {V}}\) is symmetric to modes \(\left\{ 1,\ldots ,\lceil \frac{d}{2}\rceil -1,d,\ldots ,d +\lfloor \frac{d}{2}\rfloor -1\right\} \) and symmetric to modes \(\left\{ \lceil \frac{d}{2}\rceil ,\ldots ,d-1,d+\lfloor \frac{d}{2}\rfloor ,\ldots ,2d-2\right\} \). As a result, \({\mathcal {V}}\otimes {\mathcal {U}}\) is symmetric with respect to modes \(\left\{ 1,\ldots ,\lfloor \frac{d}{2} \rfloor ,d,\ldots ,d+\lceil \frac{d}{2}\rceil -2\right\} \) and symmetric with respect to modes \(\left\{ \lfloor \frac{d}{2}\rfloor +1,\ldots ,d-1,d +\lceil \frac{d}{2}\rceil -1,\ldots ,2d-2\right\} \).
If d is odd, by noticing that \(\lceil \frac{d}{2}\rceil -1=\lceil \frac{d-1}{2}\rceil \) and \(\lfloor \frac{d}{2}\rfloor =\lfloor \frac{d-1}{2}\rfloor \), we have that \({\mathcal {U}}\otimes {\mathcal {V}}\) is symmetric to modes \({\mathbb {I}}^1_{d-1}\) and symmetric to modes \({\mathbb {I}}^2_{d-1}\). By induction, we obtain \(\mathrm{rank}\,({\mathcal {U}})=\mathrm{rank}\,({\mathcal {V}})=1\), proving that \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\).
If d is even, again by noticing that \(\lfloor \frac{d}{2}\rfloor =\lceil \frac{d-1}{2}\rceil \) and \(\lceil \frac{d}{2}\rceil =\lfloor \frac{d-1}{2}\rfloor +1\), we have that \({\mathcal {V}}\otimes {\mathcal {U}}\) is symmetric to the modes \({\mathbb {I}}^1_{d-1}\) and symmetric to the modes \({\mathbb {I}}^2_{d-1}\). By induction, we obtain \(\mathrm{rank}\,({\mathcal {V}})=\mathrm{rank}\,({\mathcal {U}})=1\), and so \(\mathrm{rank}\,({\mathcal {X}})=\mathrm{rank}\,({\mathcal {Y}})=1\). \(\square \)
In fact, the condition of \(\pi \) in (20) is also a necessary condition for the rank-one equivalence in Theorem 4.13 for a general CPS tensor. The proof, or an explanation of a counter example, involves heavy notations and we leave it to interested readers. The key step leading to Theorem 4.13 is the identity (22), which is a consequence of mode swapping due to partial symmetry. This is doable because, among modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\), they are (almost) equally allocated to the modes of \({\mathcal {X}}\) (the first half of the modes of \({\mathcal {T}}^\pi \)) and to the modes of \({\mathcal {Y}}\) (the last half of the modes of \({\mathcal {T}}^\pi \)), and the same holds for modes \(\{d+1,\ldots ,2d\}\) of \({\mathcal {T}}\). If the number of modes of \({\mathcal {X}}\) that originate from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) differs from the number of modes of \({\mathcal {Y}}\) that originate from modes \(\{1,\ldots ,d\}\) of \({\mathcal {T}}\) by more than one (such as Example 4.6 with \(\pi =(1,2,3,4)\)), then (22) cannot be obtained. This makes some modes binding, i.e., not separable into the outer product of a vector and a lower-order tensor.
Combining Propositions 4.11 and 4.12, Theorem 4.13, and the discussion regarding necessity, we arrive at the following result.
Theorem 4.14
Given a permutation \(\pi \in \Pi (1,\ldots ,2d)\), the two properties that \(M_\pi ({\mathcal {T}})\) is Hermitian and that \(\mathrm{rank}\,(M_\pi ({\mathcal {T}}))=1 \Longleftrightarrow \mathrm{rank}\,({\mathcal {T}})=1\) hold for every CPS tensor \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) if and only if \(\pi \) satisfies both (19) and (20).
In practice, such as in the modelling discussion (Section 4.3) and the numerical experiments (Section 5), we may focus on a particular permutation satisfying (19) and (20). The most straightforward one is
$$\begin{aligned} \pi =\left( 1,\ldots ,\left\lceil \frac{d}{2}\right\rceil , d+1,\ldots ,d+\left\lfloor \frac{d}{2}\right\rfloor , d+\left\lfloor \frac{d}{2}\right\rfloor +1,\ldots ,2d, \left\lceil \frac{d}{2}\right\rceil +1,\ldots ,d \right) . \end{aligned}$$
(23)
In particular, \(\pi =(1,3,4,2)\) for fourth-order tensors (\(d=2\)), \(\pi =(1,2,4,5,6,3)\) for sixth-order tensors (\(d=3\)), and \(\pi =(1,2,5,6,7,8,3,4)\) for eighth-order tensors (\(d=4\)).
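As a quick numerical illustration (ours, with \(d=2\), \(\pi =(1,3,4,2)\), and the pi_matricization helper sketched after Proposition 4.11), the following snippet checks Theorem 4.14 on a rank-one CPS tensor and on the tensor of Example 4.6.
```python
import numpy as np

# A quick numerical check (ours) of Theorem 4.14 for d = 2 and pi = (1, 3, 4, 2).
def pi_matricization(T, pi):
    n, d = T.shape[0], T.ndim // 2
    return np.transpose(T, [p - 1 for p in pi]).reshape(n**d, n**d)

rng = np.random.default_rng(1)
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
T1 = np.einsum('i,j,k,l->ijkl', x.conj(), x.conj(), x, x)   # rank-one CPS tensor

A = np.array([[1, 1 + 1j], [1 + 1j, 2]])
T2 = np.einsum('ij,kl->ijkl', A.conj(), A)                  # Example 4.6: CPS, not rank-one

for T in (T1, T2):
    M = pi_matricization(T, (1, 3, 4, 2))
    print(np.allclose(M, M.conj().T), np.linalg.matrix_rank(M) == 1)
# Expected: (True, True) for T1 and (True, False) for T2, consistent with
# Theorem 4.14, whereas the standard square matricization of T2 is rank-one:
print(np.linalg.matrix_rank(T2.reshape(4, 4)) == 1)         # True
```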
Before concluding this part, let us look at elasticity tensors (Example 4.7) again with Theorem 4.14 at hand. Since every elasticity tensor is CPS, Theorem 4.14 leads to the following interesting result from the rank-one equivalence.
Proposition 4.15
Let \({\mathcal {A}}\in {\mathbb {R}}^{n^4}\) be a real CPS tensor (including the case of elasticity tensor where \(n=3\) and \({\mathcal {A}}\) is positive definite), i.e.,
$$\begin{aligned} {\mathcal {A}}_{ijk\ell }={\mathcal {A}}_{jik\ell }={\mathcal {A}}_{ij\ell k}={\mathcal {A}}_{k\ell ij} \quad \forall \, 1\le i,j,k,\ell \le n, \end{aligned}$$
and let \(\pi =(1,3,4,2)\). If \(\mathrm{rank}\,(M_\pi ({\mathcal {A}})) = 1\), then \({\mathcal {A}}\) is a rank-one symmetric tensor.
Proof
Since \({\mathcal {A}}\) is CPS and \(\pi \) satisfies the condition of Theorem 4.14, we have that \(\mathrm{rank}\,({\mathcal {A}}^\pi )=1\) and \(M({\mathcal {A}}^\pi )\) is a Hermitian matrix, and hence a real symmetric matrix. Therefore, we may let
$$\begin{aligned} {\mathcal {A}}^\pi =\varvec{a}\otimes \varvec{b}\otimes \varvec{a}\otimes \varvec{b}\quad \text {for some } \varvec{a},\varvec{b}\in {\mathbb {C}}^n. \end{aligned}$$
This leads to \({\mathcal {A}}=\varvec{a}\otimes \varvec{b}\otimes \varvec{b}\otimes \varvec{a}\). Since \({\mathcal {A}}\) is CPS, it is symmetric to the first two modes. This implies that \(\varvec{a}\otimes \varvec{b}\) is a symmetric matrix, say \(\varvec{a}\otimes \varvec{b}=\varvec{x}\otimes \varvec{x}\). We then also have
$$\begin{aligned} \varvec{b}\otimes \varvec{a}=(\varvec{a}\otimes \varvec{b})^{\mathrm{T}}=(\varvec{x}\otimes \varvec{x})^{\mathrm{T}}=\varvec{x}\otimes \varvec{x}, \end{aligned}$$
implying that \({\mathcal {A}}=\varvec{x}\otimes \varvec{x}\otimes \varvec{x}\otimes \varvec{x}\), a rank-one symmetric tensor. \(\square \)

4.3 Computing best rank-one approximations

As an immediate application of the rank-one matricization equivalence, we now discuss how it can be used to compute best rank-one approximations of CPS tensors. Specifically, we consider the problem
$$\begin{aligned} \min _{\Vert \varvec{x}\Vert =1,\lambda \in {\mathbb {R}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d} \otimes \overline{\varvec{x}}^{\otimes d}\Vert . \end{aligned}$$
(24)
As mentioned in Corollary 4.3,
$$\begin{aligned} \min _{\Vert \varvec{x}\Vert =1,\lambda \in {\mathbb {R}}} \Vert {\mathcal {T}}-\lambda \, \varvec{x}^{\otimes d} \otimes \overline{\varvec{x}}^{\otimes d} \Vert \Longleftrightarrow \max _{ {\mathcal {T}}(\bullet \,\overline{\varvec{x}}^{d-1}\varvec{x}^d)=\lambda \varvec{x}, \,\Vert \varvec{x}\Vert =1,\,\lambda \in {\mathbb {R}}}|\lambda | \Longleftrightarrow \max _{\Vert \varvec{x}\Vert =1}|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|. \end{aligned}$$
Since \({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)\) is a real-valued function, the maximum of \(|{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)|\) is attained by maximizing either \({\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d)\) or \((-{\mathcal {T}})(\overline{\varvec{x}}^d\varvec{x}^d)\) for a given \({\mathcal {T}}\). Therefore, the above problem is essentially
$$\begin{aligned} \max _{\Vert \varvec{x}\Vert =1}{\mathcal {T}}(\overline{\varvec{x}}^d\varvec{x}^d), \end{aligned}$$
(25)
i.e., finding the largest eigenvalue of the CPS tensor \({\mathcal {T}}\).
The model (25) is NP-hard when the order of \({\mathcal {T}}\) is larger than two, even in the real field [16, 18]. Let us reformulate the tensor-based optimization model as a matrix optimization model.
Theorem 4.16
If \({\mathcal {T}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) and \(\pi \) satisfies (19) and (20), then (25) is equivalent to
$$\begin{aligned} \max \left\{ \langle \overline{M_\pi ({\mathcal {T}})}, X \rangle : \mathrm{tr}\,(X)=1, \, \mathrm{rank}\,(X)=1,\, X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}), \, X^{\mathrm{H}}=X \right\} , \end{aligned}$$
(26)
where \(M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}):=\left\{ M_\pi ({\mathcal {X}}): {\mathcal {X}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}} \right\} \).
Proof
The equivalence between (26) and (25) can be established via \(X=M_\pi (\overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d})\), where X and \(\varvec{x}\) are feasible solutions of (26) and (25), respectively.
Given an optimal solution \(\varvec{z}\) of (25), by Propositions 4.11 and 4.12, \(Z=M_\pi (\overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d})\) is a rank-one Hermitian matrix, and so \(Z=\overline{\varvec{y}}\otimes \varvec{y}\) where \(\varvec{y}\) is an \(n^d\)-dimensional vector. Moreover,
$$\begin{aligned} \mathrm{tr}\,(Z)=\langle I, \overline{\varvec{y}}\otimes \varvec{y}\rangle = \Vert \varvec{y}\Vert ^2 =\Vert Z\Vert =\Vert \overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d}\Vert =\Vert \varvec{z}\Vert ^{2d}=1. \end{aligned}$$
(27)
This shows that Z is a feasible solution of (26), whose objective value is
$$\begin{aligned} \langle \overline{M_\pi ({\mathcal {T}})}, Z\rangle = \langle M_\pi (\overline{{\mathcal {T}}}), M_\pi (\overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d})\rangle = \langle \overline{{\mathcal {T}}}, \overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d}\rangle ={\mathcal {T}}(\overline{\varvec{z}}^d\varvec{z}^d). \end{aligned}$$
On the other hand, given an optimal solution Z of (26), let \({\mathcal {Z}}\in {\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}\) such that \(Z=M_\pi ({\mathcal {Z}})\). As Z is a rank-one Hermitian matrix, \(Z=\alpha \overline{\varvec{y}}\otimes \varvec{y}\) for some \(\alpha \in {\mathbb {R}}\) and \(\Vert \varvec{y}\Vert =1\). Further by \(\mathrm{tr}\,(Z)=1\) and (27), we observe that \(\alpha =1\) and so \(Z=\overline{\varvec{y}}\otimes \varvec{y}\). Moreover, by Theorem 4.13, \({\mathcal {Z}}\) is a rank-one CPS tensor, i.e., \({\mathcal {Z}}=\lambda \,\overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d}\) for some \(\lambda \in {\mathbb {R}}\) and \(\Vert \varvec{z}\Vert =1\). Noticing that
$$\begin{aligned} \overline{\varvec{y}}\otimes \varvec{y}= Z = M_\pi ({\mathcal {Z}}) = M_\pi (\lambda \,\overline{\varvec{z}}^{\otimes d} \otimes \varvec{z}^{\otimes d}) \text { and } \Vert \varvec{y}\Vert =\Vert \varvec{z}\Vert =1, \end{aligned}$$
it is easy to see that \(\lambda =1\), resulting in \(Z=M_\pi (\overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d})\). Therefore, \(\varvec{z}\) is a feasible solution of (25), whose objective value is
$$\begin{aligned} {\mathcal {T}}(\overline{\varvec{z}}^d\varvec{z}^d) = \langle \overline{{\mathcal {T}}}, \overline{\varvec{z}}^{\otimes d} \otimes \varvec{z}^{\otimes d}\rangle = \langle M_\pi (\overline{{\mathcal {T}}}), M_\pi (\overline{\varvec{z}}^{\otimes d}\otimes \varvec{z}^{\otimes d})\rangle =\langle \overline{M_\pi ({\mathcal {T}})}, Z\rangle . \end{aligned}$$
\(\square \)
We remark that both \(\mathrm{tr}\,(X)=1\) and \(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\) in the model (26) are linear equality constraints. In particular, \(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\) contains \(O(n^d)\) equalities, which are the requirements of partial symmetry and the conjugate property for CPS tensors. As an example, when \(n=d=2\) and \(\pi =(1,3,4,2)\) is chosen as in (23), \(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{2^{4}})\) can be explicitly written as
$$\begin{aligned} X_{14}=X_{22}= X_{33}=X_{41},\, X_{12}=X_{31}, \, X_{24}=X_{43}, \text { and } X^{\mathrm{H}}=X. \end{aligned}$$
In fact, \(X^{\mathrm{H}}=X\) is included in the constraints \(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\), but we leave it in (26) to emphasize that the decision variable sits in the space of Hermitian matrices.
The problem (26) remains hard because of the rank-one constraint. However, the matrix formulation opens the door to various matrix optimization tools, particularly in convex optimization. We now propose two convex relaxation methods. First, in the proof of Theorem 4.16, we observe that X being a rank-one Hermitian matrix with \(\mathrm{tr}\,(X)=1\) actually implies that X is positive semidefinite. By dropping the rank-one constraint, (26) is relaxed to a semidefinite program (SDP):
$$\begin{aligned} \max \left\{ \langle \overline{M_\pi ({\mathcal {T}})}, X \rangle : \mathrm{tr}\,(X)=1, \, X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}), \, X\succeq O \right\} , \end{aligned}$$
(28)
where \(X\succeq O\) denotes that X is Hermitian positive semidefinite. The convex optimization model (28) can be easily solved by some SDP solvers in CVX [14]. Alternatively, one may resort to first order methods such as the alternating direction method of multipliers (ADMM).
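To give a sense of the implementation, the following is a hedged CVXPY sketch (ours) of (28) for the case \(n=d=2\) with \(\pi =(1,3,4,2)\), using the explicit CPS equalities listed above; the paper's experiments instead use CVX in MATLAB, and for general n and d the constraints \(X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}})\) would have to be generated accordingly. The function name sdp_relaxation is ours.
```python
import cvxpy as cp
import numpy as np

# A hedged CVXPY sketch (ours) of the SDP relaxation (28) for n = d = 2 and
# pi = (1, 3, 4, 2); it assumes an SDP-capable solver (e.g., SCS) is installed.
# T is a given CPS tensor of shape (2, 2, 2, 2).
def sdp_relaxation(T):
    M = np.transpose(T, (0, 2, 3, 1)).reshape(4, 4)      # M_pi(T) with pi = (1, 3, 4, 2)
    X = cp.Variable((4, 4), hermitian=True)
    constraints = [
        X >> 0, cp.trace(X) == 1,
        # X in M_pi(C_cps^{2^4}): the explicit equalities listed above (0-based)
        X[0, 3] == X[1, 1], X[1, 1] == X[2, 2], X[2, 2] == X[3, 0],
        X[0, 1] == X[2, 0], X[1, 3] == X[3, 2],
    ]
    # objective <conj(M_pi(T)), X> = sum_ij M_ij X_ij, real since M and X are Hermitian
    objective = cp.Maximize(cp.real(cp.sum(cp.multiply(M, X))))
    prob = cp.Problem(objective, constraints)
    prob.solve()
    return prob.value, X.value
```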
The second relaxation method is to add a penalty of the nuclear norm of the decision matrix in the objective function [25]. By dropping the rank-one constraint, this leads to the following convex optimization model
$$\begin{aligned} \max \left\{ \langle \overline{M_\pi ({\mathcal {T}})}, X \rangle -\rho \Vert X\Vert _*: \mathrm{tr}\,(X)=1, \, X\in M_\pi ({\mathbb {C}}_{\mathrm{cps}}^{n^{2d}}), \, X^{\mathrm{H}}=X \right\} , \end{aligned}$$
(29)
where \(\rho >0\) is a penalty parameter and \(\Vert X\Vert _*\) denotes the nuclear norm of X, a convex surrogate for \(\mathrm{rank}\,(X)\). To see why (29) is a convex relaxation of (26), we notice that \(\Vert X\Vert _*\) is a convex function, and so the objective of (29) is concave. Moreover, for an optimal solution X of (26), being rank-one with \(\mathrm{tr}\,(X) = 1\) implies that X is positive semidefinite. Thus, \(\Vert X\Vert _* = \mathrm{tr}\,(X) = 1\), which implies that the term \(-\rho \Vert X\Vert _*\) added to the objective function is actually a constant.
Our observations in several numerical examples show that the solutions obtained by the two convex relaxation models (28) and (29) are often rank-one (see Section 5). Once a rank-one solution X is obtained, one may resort to the relation \(X=M_\pi (\overline{\varvec{x}}^{\otimes d}\otimes \varvec{x}^{\otimes d})\) to recover a solution \(\varvec{x}\) of (25), as stipulated in the proof of Theorem 4.16. This \(\varvec{x}\) provides a solution to the best rank-one approximation problem (24). We remark that the convex relaxation methods can also be used to find good approximate solutions to the best rank-r approximation problem, i.e.,
$$\begin{aligned} \min _{\Vert \varvec{x}_j\Vert =1,\lambda _j\in {\mathbb {R}},\,j=1,\ldots ,r} \left\| {\mathcal {T}}-\sum _{j=1}^r \lambda _j {\varvec{x}_j}^{\otimes d}\otimes \overline{\varvec{x}_j}^{\otimes d}\right\| , \end{aligned}$$
in a successive (greedy) manner.
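A minimal sketch of this successive scheme (ours; it assumes a routine best_rank_one_cps that returns \((\lambda ,\varvec{x})\) for its argument, e.g., obtained from (28) or (29), and is written for \(d=2\)) is the following.
```python
import numpy as np

# A hedged sketch (ours) of the successive (greedy) scheme mentioned above,
# assuming a routine best_rank_one_cps(T) that returns (lam, x) such that
# lam * x^{⊗d} ⊗ conj(x)^{⊗d} is a best rank-one CPS approximation of T.
def greedy_rank_r(T, r, best_rank_one_cps):
    residual, terms = T.copy(), []
    for _ in range(r):
        lam, x = best_rank_one_cps(residual)
        # d = 2 case: subtract lam * x⊗x⊗conj(x)⊗conj(x) from the residual
        rank_one = lam * np.einsum('i,j,k,l->ijkl', x, x, x.conj(), x.conj())
        residual = residual - rank_one
        terms.append((lam, x))
    return terms, residual
```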

5 Numerical experiments

In this section, we conduct numerical experiments to test the methods proposed in Section 4.3 to compute best rank-one approximations of CPS tensors. This also justifies the applicability of the rank-one equivalence in Theorem 4.13 or Theorem 4.14. Both the nuclear norm penalty model (29) and the SDP relaxation method (28) are applied to solve three types of instances. Interestingly, both methods are able to return rank-one solutions for almost all the test instances, and thus guarantee the optimality of the original problem (25). In case a rank-one solution is not obtained, one can slightly perturb the original tensor to achieve success (see Example 5.2). All the numerical experiments are conducted using an Intel Core i5-4200M 2.5GHz computer with 4GB of RAM. The supporting software is MATLAB R2015a. To solve the convex optimization problems, CVX 2.1 [14] and the ADMM approach in [25] are called.

5.1 Quartic minimization from radar wave form design

In radar systems, one regulates the interference power produced by unwanted returns by controlling the range-Doppler response [2]. It is important to design a suitable radar waveform that minimizes the disturbance power at the output of the matched filter. This can be written as
$$\begin{aligned} \phi (\varvec{s})=\sum _{r=0}^{n-1} \sum _{j=1}^{m} \rho (r,j) \left| \varvec{s}^{\mathrm{H}}J^r (\varvec{s}\odot \varvec{p}(x_j))\right| ^2, \end{aligned}$$
where \(J^r\in {\mathbb {R}}^{n^2}\) is the shift matrix for \(r=0,1,\ldots ,n-1\), \(\odot \) denotes the Hadamard product, \(\varvec{p}(v)=(1,e^{\mathbf{i}2\pi v}, \ldots , e^{\mathbf{i}2(n-1)\pi v})^{\mathrm{T}}\), and \(\rho (r,j) =\sum _{k=1}^{n_0}{\delta _{r,r_k}}{1}_{\Delta _k}(j) \frac{\sigma _k^2}{|\Delta _k|}\) with \(\delta _{r,r_k}\) being the Kronecker delta and \({1}_{\Delta _k}(j)\) being an indicator function for the index set \(\Delta _k\) of discrete frequencies. Interested readers are referred to [2] for more details of the ambiguity function and radar waveform design.
To account for the finite energy transmitted by the radar, it is assumed that \(\Vert \varvec{s}\Vert ^2 = 1\), and a similarity constraint \(\Vert \varvec{s}-\varvec{s}_0 \Vert ^2 \le \gamma \) needs to be enforced to obtain phase-only modulated waveforms, where \(\varvec{s}_0\) is a known code sharing some nice properties. Noticing that \(\Vert \varvec{s}\Vert =1\) and \(\varvec{s}_0\) is known, this similarity constraint can be realized by penalizing the quantity \(-|\varvec{s}^{\mathrm{H}}\varvec{s}_0|\) in the objective function \(\phi (\varvec{s})\). Therefore, we arrive at the following quartic minimization problem (see [24] for a detailed discussion of the modelling):
$$\begin{aligned} \min _{\Vert \varvec{s}\Vert = 1} \left( \phi (\varvec{s}) - \rho |\varvec{s}^{\mathrm{H}}\varvec{s}_0|^2 \Vert \varvec{s}\Vert ^2\right) \end{aligned}$$
(30)
with a penalty parameter \(\rho >0\). The objective function of (30) is a real-valued quartic conjugate form, i.e., there is a fourth-order CPS tensor \({\mathcal {T}}\) such that \({\mathcal {T}}(\overline{\varvec{s}}^2\varvec{s}^2)=\phi (\varvec{s}) - \rho |\varvec{s}^{\mathrm{H}}\varvec{s}_0|^2 \Vert \varvec{s}\Vert ^2\). This shows that (30) is an instance of (25), which is equivalent to (26).
We use the data considered in [2] to construct \(\phi (\varvec{s})\) and let \(\rho =30\) in (30). To obtain phase-only modulated waveforms, a known code \(\varvec{s}_0\) (see e.g., [15]) with \(|(s_0)_i|=1\) for \(i=1,\ldots ,n\) is chosen and further normalized such that \(\Vert \varvec{s}_0\Vert =1\). The problem is solved by the nuclear norm penalty model (29) and the SDP relaxation method (28), respectively. In the experiment, we randomly generate \(\varvec{s}_0\) for 100 instances and record in Table 1 the percentage of instances for which the corresponding method outputs a rank-one solution. The convex relaxation models are solved by the ADMM algorithm in [25], whose average CPU time (in seconds) is also reported. As observed in Table 1, both convex relaxation methods always obtain rank-one solutions, leading to optimal solutions of (30). In terms of speed, the nuclear norm penalty method generally runs faster than the SDP relaxation.
Table 1 Efficiency for the radar wave form design

n     Nuclear norm penalty (29)         SDP relaxation (28)
      rank-one        CPU (s)           rank-one        CPU (s)
5     100 %           0.371             100 %           0.693
10    100 %           4.552             100 %           11.261

5.2 Determining elasticity tensors

Elasticity tensors (see Example 4.7) are three-dimensional fourth-order real positive definite tensors with certain symmetry. For a given tensor \({\mathcal {A}}\in {\mathbb {R}}^{3^4}\), this symmetry can be easily encoded by a symmetric matrix in Voigt's notation (18). Therefore, \({\mathcal {A}}\) in such a form is an elasticity tensor if and only if
$$\begin{aligned} {\mathcal {A}}(\varvec{x}^4)> 0 \quad \forall \,\varvec{x}\in {\mathbb {R}}^3,\,\varvec{x}\ne \mathbf{0}. \end{aligned}$$
(31)
Let \({\mathcal {T}}= - {\mathcal {A}}\) and \(d=2\) in (25), i.e., \(\max _{\Vert \varvec{x}\Vert =1}-{\mathcal {A}}(\overline{\varvec{x}}^2\varvec{x}^2)\). If we solve this problem and obtain a real optimal solution, then its optimal value verifies (31): \({\mathcal {A}}\) is an elasticity tensor if and only if the optimal value is negative. In the following, we implement this idea to determine whether given tensors from the literature are elasticity tensors.
The first instance is taken from [38, Example 8.1], where
$$\begin{aligned} {\mathcal {A}}_{1111}=13,\, {\mathcal {A}}_{1122}=-4,\, {\mathcal {A}}_{1112}=-2,\, {\mathcal {A}}_{2222}=12, \, {\mathcal {A}}_{2212}=-1,\, {\mathcal {A}}_{1212}=2, \end{aligned}$$
and others are zeros in (18). Both the nuclear norm penalty model (29) and the SDP relaxation method (28) output a rank-one matrix which yields an optimal solution \((-0.7184, -0.6956, 0)\) for \(\max _{\Vert \varvec{x}\Vert =1} -{\mathcal {A}}(\overline{\varvec{x}}^2\varvec{x}^2)\). The corresponding optimal value is \(-3.2420\), and this validates that the given tensor \({\mathcal {A}}\) is an elasticity tensor.
In our second example, we consider the most general type of an anisotropic medium that allows propagation of purely polarized waves in an arbitrary direction [21, Proposition 17]. Its Voigt’s matrix has the following form:
$$\begin{aligned} \left( \begin{array}{cccccc} \alpha &{} \alpha /3+2 \rho _1 &{} \alpha /3+2 \rho _2&{} 2 \rho _3 &{} 0 &{} 0 \\ &{} \alpha &{} \alpha /3+2 \rho _4 &{} 0 &{} 2 \rho _5 &{} 0 \\ &{} &{} \alpha &{} 0 &{} 0 &{}2 \rho _6 \\ &{} &{} &{} \alpha /3 - \rho _4 &{} - \rho _6 &{} - \rho _5 \\ &{} &{} &{} &{} \alpha /3 - \rho _2 &{} - \rho _3\\ &{} &{} &{} &{} &{} \alpha /3 - \rho _1 \\ \end{array}\right) . \end{aligned}$$
(32)
We randomly generate the data \((\alpha , \rho _1, \rho _2, \rho _3, \rho _4, \rho _5, \rho _6)\) from i.i.d. standard normal distributions and solve the problem by models (28) and (29), respectively. For all the random instances, both (28) and (29) return a real-valued optimal solution with a positive optimal value, which implies that the generated tensor instance is not an elasticity tensor. For example, in one instance we have
$$\begin{aligned}&\alpha = 0.3919,\, \rho _1=-1.2507,\, \rho _2=-0.9480, \, \rho _3=-0.7411,\\&\rho _4=-0.5078, \, \rho _5=-0.3206, \, \rho _6=0.0125. \end{aligned}$$
The optimal solution obtained by (28) and (29) is (0.6978, 0.2104, 0.0918) and the optimal value is 4.4558.
In order to see if an elasticity tensor can be obtained in the form of (32), we increase its diagonal terms. For example, if we increase \(\alpha \) by 5 and keep all the \(\rho \)’s unchanged in the previous instance, then both (28) and (29) return a real optimal solution (0.0005, 0.9862, 0.0133) with its optimal value being \(-2.2327\). Therefore, the modified tensor instance is an elasticity tensor.

5.3 Randomly generated CPS tensors

The data from (30) has its own structure. In this part, we test the two relaxation methods extensively using randomly generated CPS tensors. The aim is to check the chance of getting rank-one solutions, and hence optimal solutions for the largest eigenvalue problem (25), given the tractability of the two convex relaxation models. These CPS tensors are generated as follows. First, we randomly generate two real tensors \({\mathcal {U}},{\mathcal {V}}\in {\mathbb {R}}^{n^4}\) whose entries follow i.i.d. standard normal distributions, independently. We then let \({\mathcal {W}}={\mathcal {U}}+\mathbf{i}{\mathcal {V}}\) to define a complex tensor in \({\mathbb {C}}^{n^4}\). To make it PS, we further let \({\mathcal {X}}\in {\mathbb {C}}_{\mathrm{ps}}^{n^4}\) where
$$\begin{aligned} {\mathcal {X}}_{ijk\ell }=\frac{1}{4}\left( {\mathcal {W}}_{ijk\ell }+{\mathcal {W}}_{jik\ell } +{\mathcal {W}}_{ij\ell k}+{\mathcal {W}}_{ji\ell k}\right) \quad \forall \,1\le i,j,k,\ell \le n. \end{aligned}$$
Finally, to make it CPS, we let \({\mathcal {T}}=\frac{1}{2}({\mathcal {X}}+{\mathcal {X}}^{\mathrm{H}})\in {\mathbb {C}}_{\mathrm{cps}}^{n^4}\).
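For reproducibility, the construction above can be sketched in a few lines of numpy (ours; \(d=2\), 0-based indices):
```python
import numpy as np

# A short numpy sketch (ours) of the random CPS tensor construction above (d = 2).
def random_cps_tensor(n, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    W = rng.standard_normal((n,) * 4) + 1j * rng.standard_normal((n,) * 4)
    # partial symmetrization over the first two and the last two modes
    X = (W + W.transpose(1, 0, 2, 3) + W.transpose(0, 1, 3, 2)
           + W.transpose(1, 0, 3, 2)) / 4
    # conjugate symmetrization: T = (X + X^H)/2, where X^H = conj(X^pi), pi = (3,4,1,2)
    T = (X + np.conj(X.transpose(2, 3, 0, 1))) / 2
    assert np.allclose(T, T.transpose(1, 0, 2, 3))           # partial symmetry
    assert np.allclose(T, np.conj(T.transpose(2, 3, 0, 1)))  # conjugate property
    return T
```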
For various \(n\le 15\), 100 random CPS tensor instances are generated. We then solve the two convex relaxation models (29) and (28) using the ADMM algorithm, and record the percentage of instances that produce rank-one solutions. The results are shown in Table 2 together with the average CPU time (in seconds). It shows that both the nuclear norm penalty model (29) and the SDP relaxation model (28) are able to generate rank-one solutions for most randomly generated instances, and thus find the largest eigenvalue of CPS tensors. In contrast to the radar wave form design data, the SDP relaxation outperforms the nuclear norm penalty method, both in speed and in the chance of attaining optimality, as the dimension of the problem increases.
Table 2 Efficiency for largest eigenvalue of random CPS tensors

n     Nuclear norm penalty (29)         SDP relaxation (28)
      rank-one        CPU (s)           rank-one        CPU (s)
4     100 %           0.231             100 %           0.083
6     100 %           1.434             100 %           0.473
8     100 %           5.570             100 %           1.824
9     100 %           11.414            100 %           3.519
10    100 %           19.339            100 %           5.740
12    99 %            56.299            99 %            16.050
15    45 %            206.879           82 %            90.184

5.4 Computing largest US-eigenvalues

Motivated by the geometric measure of quantum entanglement, Ni et al. [33] introduced the notions of unitary symmetric eigenvalue (US-eigenvalue) and unitary symmetric eigenvector (US-eigenvector). The geometric measure of entanglement has various applications such as entanglement witnesses and quantum computation, and US-eigenvalues and US-eigenvectors reflect, to a certain extent, specific states of the composite quantum system. Specifically, \(\lambda \in {\mathbb {C}}\) is a US-eigenvalue associated with a US-eigenvector \(\varvec{x}\in {\mathbb {C}}^n\) of a symmetric tensor \({\mathcal {Z}}\in {\mathbb {C}}^{n^d}\) if
$$\begin{aligned} \overline{{\mathcal {Z}}}(\bullet \,\varvec{x}^{d-1})=\lambda \,\overline{\varvec{x}},\,{\mathcal {Z}}(\bullet \overline{\varvec{x}}^{d-1}) =\lambda \,\varvec{x},\text { and }\Vert \varvec{x}\Vert =1. \end{aligned}$$
It is known that US-eigenvalues must be real. Jiang et al. [24] showed that \(\lambda \in {\mathbb {R}}\) is a US-eigenvalue of a symmetric tensor \({\mathcal {Z}}\in {\mathbb {C}}^{n^d}\) if and only if \(\lambda ^2\) is a C-eigenvalue of the CPS tensor \({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}}\in {\mathbb {C}}^{n^{2d}}\), and their eigenvectors are closely related. Therefore, we may resort to model (25) to find the largest US-eigenvalue of a symmetric tensor \({\mathcal {Z}}\), together with its corresponding eigenvectors, i.e., to solve
$$\begin{aligned} \max _{\Vert \varvec{x}\Vert =1}({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}})(\overline{\varvec{x}}^{d}\varvec{x}^d). \end{aligned}$$
(33)
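For illustration, a small NumPy sketch (ours, not code from the paper) that forms the lifted CPS tensor \({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}}\) and evaluates the objective of (33) via the identity \(({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}})(\overline{\varvec{x}}^{d}\varvec{x}^d)=|\overline{{\mathcal {Z}}}(\varvec{x}^d)|^2\), which is spelled out below Example 5.1:
```python
import numpy as np

def cps_lift(Z):
    """Form the CPS tensor Z (x) conj(Z) in C^{n^{2d}} from a symmetric tensor Z."""
    return np.multiply.outer(Z, np.conj(Z))

def objective_33(Z, x):
    """Objective of (33): (Z (x) conj(Z))(conj(x)^d x^d) = |conj(Z)(x^d)|^2 for unit x."""
    val = np.conj(Z)
    for _ in range(Z.ndim):
        val = np.tensordot(val, x, axes=([0], [0]))  # contract one mode with x
    return np.abs(val) ** 2
```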
In these tests, we look into the two examples in [33]. We first transform the largest US-eigenvalue problem into (33) and then apply the SDP relaxation model (28). The hope is to find rank-one solutions and hence obtain the largest US-eigenvalue with its corresponding eigenvectors.
Example 5.1
([33, Table 1]). Let a symmetric tensor \({\mathcal {Z}}\in {\mathbb {C}}_{\mathrm{s}}^{2^3}\) have entries \({\mathcal {Z}}_{111}=2\), \({\mathcal {Z}}_{112}={\mathcal {Z}}_{121}={\mathcal {Z}}_{211}=1\), \({\mathcal {Z}}_{122}={\mathcal {Z}}_{212}={\mathcal {Z}}_{221}=-1\), \({\mathcal {Z}}_{222}=1\), and others being zeros.
By applying the SDP relaxation method to (33) and solving it using CVX, we directly generate a rank-one solution. In other words, we obtain a C-eigenpair \((\lambda ^2,\varvec{x})\) with \(\lambda \in {\mathbb {R}}\) of the CPS tensor \({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}}\), i.e.,
$$\begin{aligned} \lambda ^2 = ({\mathcal {Z}}\otimes \overline{{\mathcal {Z}}})(\overline{\varvec{x}}^3\varvec{x}^3) =\langle \overline{{\mathcal {Z}}}\otimes {\mathcal {Z}}, \overline{\varvec{x}}^{\otimes 3} \otimes \varvec{x}^{\otimes 3} \rangle =|\overline{{\mathcal {Z}}}(\varvec{x}^3)|^2. \end{aligned}$$
However, \(\overline{{\mathcal {Z}}}(\varvec{x}^3)\) may not be real, whereas a US-eigenpair requires it to be; this can be easily fixed by rotating \(\varvec{x}\). In particular, by letting \(\varvec{z}=e^{-\mathbf{i}\theta /3}\varvec{x}\) where \(\theta =\arg (\overline{{\mathcal {Z}}}(\varvec{x}^3))\), one has \(\overline{{\mathcal {Z}}}(\varvec{z}^3) = e^{-\mathbf{i}\theta } \overline{{\mathcal {Z}}}(\varvec{x}^3) \in {\mathbb {R}}\). This implies that \((\lambda ,\varvec{z})\) is the corresponding US-eigenpair of \({\mathcal {Z}}\). In this example, it recovers the largest US-eigenvalue 2.3547 with its corresponding US-eigenvector \((0.9726,0.2326)^{\mathrm{T}}\).
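The rotation step can be sketched as follows (our NumPy illustration for the order-3 case of this example):
```python
import numpy as np

def rotate_to_us_eigenpair(Z, x):
    """Rotate a solution x of (33) (order-3 case) so that conj(Z)(z^3) is real."""
    val = np.einsum('ijk,i,j,k->', np.conj(Z), x, x, x)  # conj(Z)(x^3)
    theta = np.angle(val)                                 # theta = arg(conj(Z)(x^3))
    z = np.exp(-1j * theta / 3) * x                       # z = e^{-i*theta/3} x
    lam = (np.exp(-1j * theta) * val).real                # conj(Z)(z^3) = |conj(Z)(x^3)|, real
    return lam, z
```
Applied to the rank-one solution recovered from (28) in Example 5.1, this rotation would be expected to reproduce the US-eigenpair (2.3547, (0.9726, 0.2326)^T) reported above.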
Example 5.2
([33, Table 2]). Let a symmetric tensor \({\mathcal {Z}}\in {\mathbb {C}}_{\mathrm{s}}^{2^3}\) have entries \({\mathcal {Z}}_{111}=2\), \({\mathcal {Z}}_{112}={\mathcal {Z}}_{121}={\mathcal {Z}}_{211}=-1\), \({\mathcal {Z}}_{122}={\mathcal {Z}}_{212}={\mathcal {Z}}_{221}=-2\), \({\mathcal {Z}}_{222}=1\), and others being zeros.
We again consider the SDP relaxation method and resort to CVX for solutions. Unfortunately, it fails to give us a rank-one solution. Motivated by the high frequency of rank-one solutions obtained when solving randomly generated tensors, as shown in Table 2, we add a tiny random perturbation \({\mathcal {E}}\in {\mathbb {C}}_{\mathrm{s}}^{2^3}\) with \(\Vert {\mathcal {E}}\Vert =10^{-4}\) to the original tensor \({\mathcal {Z}}\). The hope is to generate a rank-one solution via SDP relaxation while keeping the original US-eigenpair almost unchanged, since \({\mathcal {E}}\) is small. Furthermore, the largest US-eigenvalue may have more than one US-eigenvector, i.e., (33) may admit multiple global optimal solutions. In our experiments, adding tiny perturbations not only yields a rank-one solution, but also produces different rank-one solutions under different perturbations. Using this approach, we successfully obtain the largest US-eigenvalue 3.1623 and its four US-eigenvectors:
$$\begin{aligned}&(0.6987+0.1088\,\mathbf{i},-0.1088+0.6987\,\mathbf{i})^{\mathrm{T}}, (0.6987-0.1088\,\mathbf{i},-0.1088-0.6987\,\mathbf{i})^{\mathrm{T}}, \\&(-0.2551+0.6595\,\mathbf{i},0.6595+0.2551\,\mathbf{i})^{\mathrm{T}}, (-0.2551-0.6595\,\mathbf{i},0.6595-0.2551\,\mathbf{i})^{\mathrm{T}}, \end{aligned}$$
which are consistent with the results in [33]. In fact, our convex relaxation approach certifies that the obtained eigenvalue is globally the largest whenever the solution to (28) is rank-one. This certificate cannot be deduced from the solutions obtained in [33]. Therefore, our experiment on Example 5.2 verifies that the largest among all the eigenvalues reported in [33, Table 2] is indeed the largest eigenvalue of \({\mathcal {Z}}\).
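One possible realization of the perturbation trick is sketched below; only \(\Vert {\mathcal {E}}\Vert =10^{-4}\) and the symmetry of \({\mathcal {E}}\) are specified above, so drawing i.i.d. complex Gaussian entries, symmetrizing over all index permutations, and rescaling to the prescribed norm is our assumption.
```python
import numpy as np
from itertools import permutations
from math import factorial

def perturb_symmetric(Z, eps=1e-4, rng=None):
    """Add a fully symmetric complex perturbation of Frobenius norm eps to Z."""
    rng = np.random.default_rng() if rng is None else rng
    E = rng.standard_normal(Z.shape) + 1j * rng.standard_normal(Z.shape)
    # symmetrize over all permutations of the d indices so that E stays symmetric
    E = sum(E.transpose(p) for p in permutations(range(Z.ndim))) / factorial(Z.ndim)
    E *= eps / np.linalg.norm(E)  # rescale to ||E|| = eps
    return Z + E
```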
To conclude the numerical results, the convex relaxation methods proposed in Section 4.3, built on the rank-one equivalence, are capable of finding optimal solutions of best rank-one approximations or the largest eigenvalue of CPS tensors; in particular, they generate rank-one solutions (and hence guarantee optimality) for the three types of instances discussed above. When a rank-one solution is not obtained, one may slightly perturb the original tensor to increase the chance of obtaining one. This is a topic for further research.

Acknowledgements

The research of the second author was partially supported by National Natural Science Foundation of China (Grants 11771269 and 11831002) and Program for Innovative Research Team of Shanghai University of Finance and Economics. The research of the third author was partially supported by CORL Visiting Researcher Fund of the University of Portsmouth. We thank Professor Liqun Qi for bringing the elasticity tensors to our attention.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Literature
1. Aittomaki, T., Koivunen, V.: Beampattern optimization by minimization of quartic polynomial. Proceedings of the 2009 IEEE/SP 15th Workshop on Statistical Signal Processing, 437–440 (2009)
2. Aubry, A., De Maio, A., Jiang, B., Zhang, S.: Ambiguity function shaping for cognitive radar via complex quartic optimization. IEEE Trans. Signal Process. 61, 5603–5619 (2013)
3. Banach, S.: Über homogene Polynome in \((L^2)\). Studia Mathematica 7, 36–44 (1938)
4. Barvinok, A.: A Course in Convexity, Graduate Studies in Mathematics, vol. 54. American Mathematical Society, Providence, RI (2002)
5. Brachat, J., Comon, P., Mourrain, B., Tsigaridas, E.: Symmetric tensor decomposition. Lin. Algebra Appl. 433, 1851–1872 (2010)
6. Cardoso, J.-F.: Eigen-structure of the fourth-order cumulant tensor with application to the blind source separation problem. Proc. Int. Conf. Acoust., Speech, Signal Process. 5, 2655–2658 (1990)
7. Comon, P., Golub, G., Lim, L.-H., Mourrain, B.: Symmetric tensor and symmetric tensor rank. SIAM J. Matrix Anal. Appl. 30, 1254–1279 (2008)
8.
9. Fei, S.-M., Jing, N., Sun, B.-Z.: Hermitian tensor product approximation of complex matrices and separability. Rep. Math. Phys. 57, 271–288 (2006)
10. Fontanari, C.: On Waring's problem for many forms and Grassmann defective varieties. J. Pure Appl. Algebra 174, 243–247 (2002)
12. Fu, T., Fan, J.: Successive partial-symmetric rank-one algorithms for almost unitarily decomposable conjugate partial-symmetric tensors. J. Oper. Res. Soc. China 7, 147–167 (2019)
13. Fu, T., Jiang, B., Li, Z.: Approximation algorithms for optimization of real-valued general conjugate complex forms. J. Global Optim. 70, 99–130 (2018)
15. He, H., Stoica, P., Li, J.: Waveform design with stopband and correlation constraints for cognitive radar. Proceedings of the 2010 2nd International Workshop on Cognitive Information Processing, 344–349 (2010)
16. He, S., Li, Z., Zhang, S.: Approximation algorithms for homogeneous polynomial optimization with quadratic constraints. Math. Program. 125, 353–383 (2010)
17. He, S., Li, Z., Zhang, S.: Inhomogeneous polynomial optimization over a convex set: An approximation approach. Math. Comput. 84, 715–741 (2015)
19. Huang, Y., Zhang, S.: Approximation algorithms for indefinite complex quadratic maximization problems. Sci. China Math. 53, 2697–2708 (2010)
20. Huang, Z., Qi, L.: Positive definiteness of paired symmetric tensors and elasticity tensors. J. Comput. Appl. Math. 338, 22–43 (2018)
21. Itin, Y., Hehl, F.W.: The constitutive tensor of linear elasticity: Its decompositions, Cauchy relations, null Lagrangians, and wave propagation. J. Math. Phys. 54, 042903 (2013)
22. Jiang, B., He, S., Li, Z., Zhang, S.: Moment tensors, Hilbert's identity and \(k\)-wise uncorrelated random variables. Math. Oper. Res. 39, 775–788 (2014)
23. Jiang, B., Li, Z., Zhang, S.: Approximation methods for complex polynomial optimization. Comput. Optim. Appl. 59, 219–248 (2014)
24. Jiang, B., Li, Z., Zhang, S.: Characterizing real-valued multivariate complex polynomials and their symmetric tensor representations. SIAM J. Matrix Anal. Appl. 37, 381–408 (2016)
25. Jiang, B., Ma, S., Zhang, S.: Tensor principal component analysis via convex optimization. Math. Program. 150, 423–457 (2015)
26. Josz, C.: Application of Polynomial Optimization to Electricity Transmission Networks. Ph.D. Dissertation, Université Pierre et Marie Curie, Paris (2016)
28. Li, Z.: Polynomial Optimization Problems—Approximation Algorithms and Applications. Ph.D. Dissertation, The Chinese University of Hong Kong, Hong Kong (2011)
29. Lim, L.-H.: Singular values and eigenvalues of tensors: A variational approach. Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing 1, 129–132 (2005)
30. Madani, R., Lavaei, J., Baldick, R.: Convexification of power flow equations for power systems in presence of noisy measurements. IEEE Trans. Autom. Control 64, 3101–3116 (2019)
31. Marcus, M., Minc, H.: A Survey of Matrix Theory and Matrix Inequalities. Dover Publications, Boston, MA (1964)
33. Ni, G., Qi, L., Bai, M.: Geometric measure of entanglement and U-eigenvalues of tensors. SIAM J. Matrix Anal. Appl. 35, 73–87 (2014)
34.
35. Nie, J., Wang, L.: Semidefinite relaxations for best rank-1 tensor approximations. SIAM J. Matrix Anal. Appl. 35, 1155–1179 (2014)
38. Qi, L., Chen, H., Chen, Y.: Tensor Eigenvalues and Their Applications. Springer, Singapore (2018)
40. So, A.M.-C., Zhang, J., Ye, Y.: On approximating complex quadratic optimization problems via semidefinite programming relaxations. Math. Program. 110, 93–110 (2007)
41. Sorber, L., Van Barel, M., De Lathauwer, L.: Unconstrained optimization of real functions in complex variables. SIAM J. Optim. 22, 879–898 (2012)
42. Sorber, L., Domanov, I., Van Barel, M., De Lathauwer, L.: Exact line and plane search for tensor optimization. Comput. Optim. Appl. 63, 121–142 (2016)
43. Yang, Y., Feng, Y., Huang, X., Suykens, J.A.K.: Rank-\(1\) tensor properties with applications to a class of tensor optimization problems. SIAM J. Optim. 26, 171–196 (2016)
44. Zhang, M., Ni, G., Zhang, G.: Iterative methods for computing U-eigenvalues of non-symmetric complex tensors with application in quantum entanglement. Comput. Optim. Appl. 75, 609–628 (2020)
45. Zhang, S., Huang, Y.: Complex quadratic optimization and semidefinite programming. SIAM J. Optim. 16, 871–890 (2006)
46.