We consider the nonlinear eigenvalue problem (NEP), which consists of computing
\((\lambda , x) \in D \times {{\mathbb {C}}}^n {\setminus } \left\{ 0 \right\} \) such that
$$\begin{aligned} M(\lambda ) x = 0, \end{aligned}$$
(1)
where
$$\begin{aligned} M(\lambda ) = \sum _{m=1}^p f_m(\lambda ) A_m, \end{aligned}$$
(2)
with
\(D \subset {{\mathbb {C}}}\) an open disk,
\(f_m: D \rightarrow {{\mathbb {C}}}\) analytic functions, and
\(A_m \in {{\mathbb {C}}}^{n \times n}\) for
\(m=1, \dots , p\). In this work we focus on the symmetric NEP, namely we assume that
$$\begin{aligned} M(\lambda )^T = M(\lambda ) \quad \forall \lambda \in D. \end{aligned}$$
(3)
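As a quick numerical illustration of the symmetry condition (3), consider a small hypothetical NEP with \(p=3\), \(f_1 = 1\), \(f_2(\lambda ) = \lambda \), \(f_3(\lambda ) = e^{\lambda }\), and random symmetric matrices (all names and values below are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical small symmetric NEP with p = 3 terms:
# M(lam) = A_1 + lam * A_2 + exp(lam) * A_3, all A_m symmetric.
rng = np.random.default_rng(0)
n = 4
As = [rng.standard_normal((n, n)) for _ in range(3)]
As = [X + X.T for X in As]                      # enforce A_m^T = A_m

def M(lam):
    f = [1.0, lam, np.exp(lam)]                 # analytic functions f_m
    return sum(fm * Am for fm, Am in zip(f, As))

lam = 0.3 + 0.7j
sym_err = np.linalg.norm(M(lam).T - M(lam))        # condition (3): zero
herm_err = np.linalg.norm(M(lam).conj().T - M(lam))  # not Hermitian: O(1)
```

Note that at a non-real \(\lambda \) the matrix \(M(\lambda )\) is complex symmetric but not Hermitian, which is exactly the distinction made below.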
Equivalently,
\(M(\lambda )\) can be expressed as (2) with symmetric matrices
\(A_m^T = A_m\) for
\(m=1, \dots , p\). Notice that the complex matrices
\(A_m\) and
\(M(\lambda )\) are assumed to be symmetric but not necessarily Hermitian. The NEP arises in many areas, such as stability analysis, control theory, and wave propagation, and it has been studied in various settings. See the review papers [
21,
39], the PhD theses [
15,
52], and the problem collection [
6]. Specialized software for NEPs has recently been developed: the package NEP-PACK [
26], the library SLEPc [
23], and other open-source software. An approach for solving the NEP consists of constructing a linear eigenvalue problem (a linearization) whose eigenvalues approximate, or correspond to, the eigenvalues of the original NEP [
2,
3,
14,
20,
32,
35,
47]. When the NEP has a specific structure, such as being symmetric, Hermitian, Hamiltonian, or palindromic, it is preferable to construct a linearization that preserves this structure. Theoretical and algorithmic aspects of structured linearizations have been extensively analyzed [
10,
12,
18,
36,
37,
41,
51]. In particular, it has been shown that, in certain applications, methods based on structure-preserving linearizations are more efficient than methods that do not take the structure into account [
38,
40]. For the polynomial eigenvalue problem (PEP), i.e., the special case where
\(f_m(\lambda )\) in (2) are polynomials, symmetric linearizations are extensively characterized in [
11,
24]. A well-established class of methods for solving symmetric eigenvalue problems is that of Lanczos-like methods. More precisely, the Lanczos method and its variants can be applied to symmetric generalized eigenvalue problems
\(A x = \lambda B x\) where
\(A,B \in {{\mathbb {C}}}^{n \times n}\) are symmetric matrices. The original approach [
31] was developed for the case
\(B=I\); a generalization for
B positive definite is presented in [
44, Ch.15, Sect.11]. A further extension of this approach, known as indefinite Lanczos method, for the case where
A and
B are Hermitian or symmetric matrices is discussed in [
45] and [
4, Section 8.6.1]. Lanczos methods belong to the class of Krylov methods. They exploit the fact that, due to the symmetry of the problem, the orthogonalization performed during the construction of the orthogonal basis of the Krylov space reduces to a three-term recurrence. In exact arithmetic, this approach drastically reduces the complexity. However, in floating-point arithmetic and without further specializations, the basis vectors are often affected by loss of orthogonality, resulting in slow convergence of the Ritz pairs and numerical instability [
1,
46,
48,
55]. Although Lanczos methods are in certain cases unstable and, in terms of number of iterations, generally less efficient than Arnoldi methods, they can be efficiently implemented in a distributed computing framework, whereas Arnoldi methods are harder to parallelize due to the full orthogonalization procedure. Moreover, there are various ways to resolve the instability and improve the efficiency of Lanczos methods, such as selective and partial re-orthogonalization, deflation, and structured restart techniques. See [
1,
46,
48,
55] and references therein.
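To make the three-term recurrence concrete, the following is a minimal sketch of the plain symmetric Lanczos iteration (the case \(B = I\)); the matrix and starting vector are illustrative, and no re-orthogonalization is performed, so for larger problems the columns of \(Q\) would exhibit the loss of orthogonality discussed above:

```python
import numpy as np

def lanczos(A, q1, k):
    """Plain symmetric Lanczos (B = I) with the three-term recurrence.
    No re-orthogonalization: in floating point the basis can lose
    orthogonality; this sketch does not address that."""
    n = len(q1)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    b = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b * q_prev        # three-term recurrence
        alpha[j] = q @ w
        w = w - alpha[j] * q
        if j < k - 1:
            b = np.linalg.norm(w)
            beta[j] = b
            q_prev, q = q, w / b
    return Q, alpha, beta

# Demo: for k = n the tridiagonal T has the same eigenvalues as A
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n)); A = A + A.T
Q, alpha, beta = lanczos(A, rng.standard_normal(n), n)
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
```

In contrast to Arnoldi-type full orthogonalization, each step here touches only the two most recent basis vectors, which is the source of the efficiency gain mentioned above.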
In this work, we present a new symmetric linearization for the symmetric NEP, resulting in a symmetric, linear, and infinite dimensional eigenvalue problem. Symmetric generalized eigenvalue problems can be solved with the indefinite Lanczos method [
4, Section 8.6.1]. We present a new method that corresponds to adapting, in an efficient and robust way, the indefinite Lanczos method to the derived linearization. To cure the slow convergence caused by the loss of orthogonality, we extract the eigenpair approximations by using the nonlinear Rayleigh–Ritz procedure combined with a proper choice of the projection space, which is derived from the structure of the linearization. We also present numerical experiments showing that the proposed method is competitive, in terms of robustness and complexity, with Arnoldi-like methods for NEPs that perform the full orthogonalization.
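The indefinite Lanczos idea can be sketched as follows. Since \(B^{-1}A\) is self-adjoint with respect to the bilinear form \(\langle x, y \rangle = x^T B y\), orthogonalizing against only the last two vectors suffices even when \(B\) is indefinite. The sketch below is a simplified illustration, not the algorithm developed in this paper: \(B\) is inverted explicitly only for clarity, and serious breakdown (\(x^T B x = 0\)) is not handled.

```python
import numpy as np

def indefinite_lanczos(A, B, q1, k):
    """Lanczos for the symmetric pencil (A, B) in the possibly indefinite
    bilinear form <x, y> = x^T B y; three-term recurrence only."""
    n = len(q1)
    Binv = np.linalg.inv(B)            # illustration only; solve in practice
    Q = np.zeros((n, k))
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        r = Binv @ (A @ q)                       # apply M = B^{-1} A
        for u in (q, q_prev):                    # three-term recurrence
            d = u @ (B @ u)                      # pseudo-norm, may be < 0
            if d != 0.0:
                r = r - ((u @ (B @ r)) / d) * u  # B-pseudo-orthogonalize
        nrm = np.linalg.norm(r)
        if nrm == 0.0:                           # invariant subspace reached
            break
        q_prev, q = q, r / nrm
    return Q

# Demo: with k = n, the projected pencil (Q^T A Q, Q^T B Q) is congruent
# to (A, B) and hence has the same eigenvalues (possibly complex, since B
# is indefinite)
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)); A = A + A.T
B = np.diag([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])   # symmetric, indefinite
Q = indefinite_lanczos(A, B, rng.standard_normal(n), n)
ritz = np.sort_complex(np.linalg.eigvals(np.linalg.solve(Q.T @ B @ Q, Q.T @ A @ Q)))
exact = np.sort_complex(np.linalg.eigvals(np.linalg.solve(B, A)))
```

Because the bilinear form is indefinite, the basis vectors are only pseudo-orthogonal (\(Q^T B Q\) diagonal with possibly negative entries), which is what distinguishes this iteration from the positive definite case.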
The paper is organized as follows: in Sect.
2 we prove that the symmetric NEP is equivalent to a symmetric, linear, and infinite dimensional eigenvalue problem. In Sect.
3 we derive a method, in finite arithmetic, which consists of applying the indefinite Lanczos method to the derived linearization. In Sect.
4 we show how the computation time of the resulting method can be considerably reduced by exploiting additional NEP structures. In Sect.
5 we illustrate the performance of this new approach with numerical simulations by solving large and sparse NEPs. The simulations were carried out in the Julia programming language [
9] with NEP-PACK [
26], which is an open source Julia package for NEPs.
The method we derive can be seen as an Arnoldi-like method applied to an iteratively expanding linearization, or equivalently to an infinite dimensional linear operator. Other methods based on these ideas include the infinite Arnoldi method [
29] and its tensor variant [
28], NLEIGS [
22], and CORK [
53]. There are also methods based on the bi-orthogonalization procedure, which also lead to a three-term recurrence, presented in [
19,
33]. However, these methods and their variations, as presented and without further research, are not able to exploit the symmetry of the NEP.
In the rest of this work, vectors and matrices are denoted as
\(v = [v_i]_{i=1}^{n}\) and
\(A=[a_{i,j}]_{i,j=1}^{n,m}\) respectively, whereas bold letters represent block-vectors and block-matrices with infinite size, namely
\(\varvec{v} = [v_i]_{i=1}^{\infty }\) with
\(v_i \in {{\mathbb {C}}}^n\) and
\(\varvec{A}=[A_{i,j}]_{i,j=1}^{\infty }\) with
\(A_{i,j} \in {{\mathbb {C}}}^{n \times n}\). The matrix
\([\varvec{A}]_k = [A_{i,j}]_{i,j=1}^{k} \in {{\mathbb {C}}}^{n k \times n k}\) is the leading principal submatrix obtained by extracting the first
k blocks. The Kronecker product and the Hadamard (element-wise) product are denoted by
\(\otimes \) and
\(\circ \) respectively. The vectors
\(e_j\) and
\(\varvec{e}_j\) have all entries equal to zero except for a one in the
j-th position, whereas
e and
\(\varvec{e}\) are the vectors with all ones. Without loss of generality, after a change of variables in (2), we assume that the region of interest
\(D \subset {{\mathbb {C}}}\) is a disk centered at the origin. Unless otherwise specified, the matrices
\(M_i\) will denote the derivatives
\(M^{(i)}(0)\). We will denote by
\(A^T\) the transpose (not conjugate transpose) of the matrix
\(A \in {{\mathbb {C}}}^{n \times n}\).
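The notational conventions above translate directly into code; the following small sketch (with hypothetical block values) illustrates a finite section of a block matrix, the extraction \([\varvec{A}]_k\), and the Kronecker and Hadamard products:

```python
import numpy as np

n, k = 2, 2
# Finite 3x3-block section of a block matrix, with illustrative n-by-n
# blocks A_{i,j} filled with the constant 10*i + j
blocks = [[np.full((n, n), 10.0 * i + j) for j in range(1, 4)]
          for i in range(1, 4)]
A = np.block(blocks)

# [A]_k : leading principal submatrix made of the first k blocks
A_k = A[: n * k, : n * k]

# Kronecker and Hadamard (element-wise) products
K = np.kron(np.eye(2), np.ones((n, n)))   # block diagonal of ones-blocks
H = A_k * A_k                             # Hadamard product: element-wise

# e_1 : all zeros except a one in the first position
e1 = np.zeros(3 * n); e1[0] = 1.0
```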