We consider the nonlinear eigenvalue problem (NEP), which consists of computing
\((\lambda , x) \in D \times {{\mathbb {C}}}^n {\setminus } \left\{ 0 \right\} \) such that
$$\begin{aligned} M(\lambda ) x = 0, \end{aligned}$$
(1)
where
$$\begin{aligned} M(\lambda ) = \sum _{m=1}^p f_m(\lambda ) A_m, \end{aligned}$$
(2)
with
\(D \subset {{\mathbb {C}}}\) an open disk,
\(f_m: D \rightarrow {{\mathbb {C}}}\) analytic functions, and
\(A_m \in {{\mathbb {C}}}^{n \times n}\) for
\(m=1, \dots , p\). In this work we focus on the symmetric NEP, namely we assume that
$$\begin{aligned} M(\lambda )^T = M(\lambda ) \quad \forall \lambda \in D. \end{aligned}$$
(3)
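As a quick numerical illustration of the symmetry condition (3), consider a small hypothetical NEP with \(p=3\), \(f_1 = 1\), \(f_2(\lambda ) = \lambda \), \(f_3(\lambda ) = e^{\lambda }\), and random symmetric matrices (all names and values below are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical small symmetric NEP with p = 3 terms:
# M(lam) = A_1 + lam * A_2 + exp(lam) * A_3, all A_m symmetric.
rng = np.random.default_rng(0)
n = 4
As = [rng.standard_normal((n, n)) for _ in range(3)]
As = [X + X.T for X in As]                      # enforce A_m^T = A_m

def M(lam):
    f = [1.0, lam, np.exp(lam)]                 # analytic functions f_m
    return sum(fm * Am for fm, Am in zip(f, As))

lam = 0.3 + 0.7j
sym_err = np.linalg.norm(M(lam).T - M(lam))        # condition (3): zero
herm_err = np.linalg.norm(M(lam).conj().T - M(lam))  # not Hermitian: O(1)
```

Note that at a non-real \(\lambda \) the matrix \(M(\lambda )\) is complex symmetric but not Hermitian, which is exactly the distinction made below.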
Equivalently,
\(M(\lambda )\) can be expressed as (2) with symmetric matrices
\(A_m^T = A_m\) for
\(m=1, \dots , p\). Notice that the complex matrices
\(A_m\) and
\(M(\lambda )\) are assumed to be symmetric but not necessarily Hermitian. The NEP arises in many areas, such as stability analysis, control theory, and wave propagation, and it has been studied in various settings. See the review papers [
21,
39], the PhD theses [
15,
52], and the problem collection [
6]. Specialized software for NEPs has recently been developed: the package NEP-PACK [
26], the library SLEPc [
23], and other open-source software. An approach for solving the NEP consists of constructing a linear eigenvalue problem (a linearization) whose eigenvalues approximate, or correspond to, the eigenvalues of the original NEP [
2,
3,
14,
20,
32,
35,
47]. When the NEP has a specific structure, such as being symmetric, Hermitian, Hamiltonian, or palindromic, it is preferable to construct a linearization that preserves this structure. Theoretical and algorithmic aspects of structured linearizations have been extensively analyzed [
10,
12,
18,
36,
37,
41,
51]. In particular, it has been shown that, in certain applications, methods based on structure-preserving linearizations are more efficient than methods that do not take the structure into account [
38,
40]. For the polynomial eigenvalue problem (PEP), i.e., the special case where
\(f_m(\lambda )\) in (2) are polynomials, symmetric linearizations are extensively characterized in [
11,
24]. A well-established class of methods for solving symmetric eigenvalue problems is that of Lanczos-like methods. More precisely, the Lanczos method and its variants can be applied to symmetric generalized eigenvalue problems
\(A x = \lambda B x\) where
\(A,B \in {{\mathbb {C}}}^{n \times n}\) are symmetric matrices. The original approach [
31] was developed for the case
\(B=I\); a generalization for
B positive definite is presented in [
44, Ch.15, Sect.11]. A further extension of this approach, known as indefinite Lanczos method, for the case where
A and
B are Hermitian or symmetric matrices is discussed in [
45] and [
4, Section 8.6.1]. Lanczos methods belong to the class of Krylov methods. They exploit the fact that, due to the symmetry of the problem, the orthogonalization performed during the construction of the orthogonal basis of the Krylov space reduces to a three-term recurrence. In exact arithmetic, this approach drastically reduces the complexity. However, in floating-point arithmetic and without further specializations, the basis vectors are often affected by loss of orthogonality, resulting in slow convergence of the Ritz pairs and numerical instability [
1,
46,
48,
55]. Although Lanczos methods are in certain cases unstable and, in terms of number of iterations, generally less efficient than Arnoldi methods, they can be efficiently implemented in a distributed computing framework, whereas Arnoldi methods are harder to parallelize due to the full orthogonalization procedure. Moreover, there are various ways to resolve the instability and improve the efficiency of Lanczos methods, such as selective and partial re-orthogonalization, deflation, and structured restart techniques. See [
1,
46,
48,
55] and references therein.
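To make the three-term recurrence concrete, the following is a minimal sketch of the plain symmetric Lanczos iteration (the case \(B = I\)); the matrix and starting vector are illustrative, and no re-orthogonalization is performed, so for larger problems the columns of \(Q\) would exhibit the loss of orthogonality discussed above:

```python
import numpy as np

def lanczos(A, q1, k):
    """Plain symmetric Lanczos (B = I) with the three-term recurrence.
    No re-orthogonalization: in floating point the basis can lose
    orthogonality; this sketch does not address that."""
    n = len(q1)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(max(k - 1, 0))
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    b = 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - b * q_prev        # three-term recurrence
        alpha[j] = q @ w
        w = w - alpha[j] * q
        if j < k - 1:
            b = np.linalg.norm(w)
            beta[j] = b
            q_prev, q = q, w / b
    return Q, alpha, beta

# Demo: for k = n the tridiagonal T has the same eigenvalues as A
rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n)); A = A + A.T
Q, alpha, beta = lanczos(A, rng.standard_normal(n), n)
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
```

In contrast to Arnoldi-type full orthogonalization, each step here touches only the two most recent basis vectors, which is the source of the efficiency gain mentioned above.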
In this work, we present a new symmetric linearization for the symmetric NEP, resulting in a symmetric, linear, and infinite dimensional eigenvalue problem. Symmetric generalized eigenvalue problems can be solved with the indefinite Lanczos method [
4, Section 8.6.1]. We present a new method that corresponds to adapting, in an efficient and robust way, the indefinite Lanczos method to the derived linearization. To cure the slow convergence caused by the loss of orthogonality, we extract the eigenpair approximations by using the nonlinear Rayleigh–Ritz procedure combined with a proper choice of the projection space, which is derived from the structure of the linearization. We also present numerical experiments showing that the proposed method is competitive, in terms of robustness and complexity, with Arnoldi-like methods for NEPs that perform the full orthogonalization.
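The indefinite Lanczos idea can be sketched as follows. Since \(B^{-1}A\) is self-adjoint with respect to the bilinear form \(\langle x, y \rangle = x^T B y\), orthogonalizing against only the last two vectors suffices even when \(B\) is indefinite. The sketch below is a simplified illustration, not the algorithm developed in this paper: \(B\) is inverted explicitly only for clarity, and serious breakdown (\(x^T B x = 0\)) is not handled.

```python
import numpy as np

def indefinite_lanczos(A, B, q1, k):
    """Lanczos for the symmetric pencil (A, B) in the possibly indefinite
    bilinear form <x, y> = x^T B y; three-term recurrence only."""
    n = len(q1)
    Binv = np.linalg.inv(B)            # illustration only; solve in practice
    Q = np.zeros((n, k))
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        r = Binv @ (A @ q)                       # apply M = B^{-1} A
        for u in (q, q_prev):                    # three-term recurrence
            d = u @ (B @ u)                      # pseudo-norm, may be < 0
            if d != 0.0:
                r = r - ((u @ (B @ r)) / d) * u  # B-pseudo-orthogonalize
        nrm = np.linalg.norm(r)
        if nrm == 0.0:                           # invariant subspace reached
            break
        q_prev, q = q, r / nrm
    return Q

# Demo: with k = n, the projected pencil (Q^T A Q, Q^T B Q) is congruent
# to (A, B) and hence has the same eigenvalues (possibly complex, since B
# is indefinite)
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n)); A = A + A.T
B = np.diag([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])   # symmetric, indefinite
Q = indefinite_lanczos(A, B, rng.standard_normal(n), n)
ritz = np.sort_complex(np.linalg.eigvals(np.linalg.solve(Q.T @ B @ Q, Q.T @ A @ Q)))
exact = np.sort_complex(np.linalg.eigvals(np.linalg.solve(B, A)))
```

Because the bilinear form is indefinite, the basis vectors are only pseudo-orthogonal (\(Q^T B Q\) diagonal with possibly negative entries), which is what distinguishes this iteration from the positive definite case.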
The paper is organized as follows: in Sect.
2 we prove that the symmetric NEP is equivalent to a symmetric, linear, and infinite dimensional eigenvalue problem. In Sect.
3 we derive a method, in finite arithmetic, which consists of applying the indefinite Lanczos method to the derived linearization. In Sect.
4 we show how the computation time of the resulting method can be considerably reduced by exploiting additional NEP structures. In Sect.
5 we illustrate the performance of this new approach with numerical simulations by solving large and sparse NEPs. The simulations were carried out in the Julia programming language [
9] with NEP-PACK [
26], which is an open source Julia package for NEPs.
The method we derive can be seen as an Arnoldi-like method applied to an iteratively expanding linearization, or equivalently to an infinite dimensional linear operator. Other methods based on these ideas include the infinite Arnoldi method [
29] and its tensor variant [
28], NLEIGS [
22], and CORK [
53]. There are also methods based on the bi-orthogonalization procedure, which also lead to a three-term recurrence, presented in [
19,
33]. However, these methods and their variations, as presented and without further research, are not able to exploit the symmetry of the NEP.
In the rest of this work, vectors and matrices are denoted as
\(v = [v_i]_{i=1}^{n}\) and
\(A=[a_{i,j}]_{i,j=1}^{n,m}\) respectively, whereas bold letters represent block-vectors and block-matrices with infinite size, namely
\(\varvec{v} = [v_i]_{i=1}^{\infty }\) with
\(v_i \in {{\mathbb {C}}}^n\) and
\(\varvec{A}=[A_{i,j}]_{i,j=1}^{\infty }\) with
\(A_{i,j} \in {{\mathbb {C}}}^{n \times n}\). The matrix
\([\varvec{A}]_k = [A_{i,j}]_{i,j=1}^{k} \in {{\mathbb {C}}}^{n k \times n k}\) is the leading principal submatrix obtained by extracting the first
k blocks. The Kronecker product and the Hadamard (element-wise) product are denoted by
\(\otimes \) and
\(\circ \) respectively. The vectors
\(e_j\) and
\(\varvec{e}_j\) have all entries equal to zero except for a one in the
j-th position, whereas
e and
\(\varvec{e}\) are the vectors with all ones. Without loss of generality, after a change of variables in (2), we assume that the region of interest
\(D \subset {{\mathbb {C}}}\) is a disk centered at the origin. Unless otherwise specified, the matrices
\(M_i\) will denote the derivatives
\(M^{(i)}(0)\). We will denote by
\(A^T\) the transpose (not conjugate transpose) of the matrix
\(A \in {{\mathbb {C}}}^{n \times n}\).
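The notational conventions above translate directly into code; the following small sketch (with hypothetical block values) illustrates a finite section of a block matrix, the extraction \([\varvec{A}]_k\), and the Kronecker and Hadamard products:

```python
import numpy as np

n, k = 2, 2
# Finite 3x3-block section of a block matrix, with illustrative n-by-n
# blocks A_{i,j} filled with the constant 10*i + j
blocks = [[np.full((n, n), 10.0 * i + j) for j in range(1, 4)]
          for i in range(1, 4)]
A = np.block(blocks)

# [A]_k : leading principal submatrix made of the first k blocks
A_k = A[: n * k, : n * k]

# Kronecker and Hadamard (element-wise) products
K = np.kron(np.eye(2), np.ones((n, n)))   # block diagonal of ones-blocks
H = A_k * A_k                             # Hadamard product: element-wise

# e_1 : all zeros except a one in the first position
e1 = np.zeros(3 * n); e1[0] = 1.0
```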