Pseudospectral Shattering, the Sign Function, and Diagonalization in Nearly Matrix Multiplication Time
DOI: 10.1007/s10208-022-09577-5
© The Author(s) 2022
Received: 11 February 2021
Accepted: 4 March 2022
Published: 24 August 2022
Abstract
We exhibit a randomized algorithm which, given a square matrix \(A\in \mathbb {C}^{n\times n}\) with \(\Vert A\Vert \le 1\) and \(\delta >0\), computes with high probability an invertible V and diagonal D such that \( \Vert A-VDV^{-1}\Vert \le \delta \) using \(O(T_\mathsf {MM}(n)\log ^2(n/\delta ))\) arithmetic operations, in finite arithmetic with \(O(\log ^4(n/\delta )\log n)\) bits of precision. The computed similarity V additionally satisfies \(\Vert V\Vert \Vert V^{-1}\Vert \le O(n^{2.5}/\delta )\). Here \(T_\mathsf {MM}(n)\) is the number of arithmetic operations required to multiply two \(n\times n\) complex matrices numerically stably, known to satisfy \(T_\mathsf {MM}(n)=O(n^{\omega +\eta })\) for every \(\eta >0\) where \(\omega \) is the exponent of matrix multiplication (Demmel et al. in Numer Math 108(1):59–91, 2007). The algorithm is a variant of the spectral bisection algorithm in numerical linear algebra (Beavers Jr. and Denman in Numer Math 21(1-2):143–169, 1974) with a crucial Gaussian perturbation preprocessing step. Our result significantly improves the previously best-known provable running times of \(O(n^{10}/\delta ^2)\) arithmetic operations for diagonalization of general matrices (Armentano et al. in J Eur Math Soc 20(6):1375–1437, 2018) and (with regard to the dependence on n) \(O(n^3)\) arithmetic operations for Hermitian matrices (Dekker and Traub in Linear Algebra Appl 4:137–154, 1971). It is the first algorithm to achieve nearly matrix multiplication time for diagonalization in any model of computation (real arithmetic, rational arithmetic, or finite arithmetic), thereby matching the complexity of other dense linear algebra operations such as inversion and QR factorization up to polylogarithmic factors. The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into n small well-separated components. In particular, this implies that the eigenvalues of the perturbed matrix have a large minimum gap, a property of independent interest in random matrix theory. (2) We give a rigorous analysis of Roberts’ Newton iteration method (Roberts in Int J Control 32(4):677–687, 1980) for computing the sign function of a matrix in finite arithmetic, itself an open problem in numerical analysis since at least 1986.
Keywords
Linear algebra · Random matrix theory · Numerical analysis · Computational complexity
Mathematics Subject Classification
60B20 · 65F15 · 68Q25
1 Introduction
We study the algorithmic problem of approximately finding all of the eigenvalues and eigenvectors of a given arbitrary \(n\times n\) complex matrix. While this problem is quite well-understood in the special case of Hermitian matrices (see, e.g., [52]), the general non-Hermitian case has remained mysterious from a theoretical standpoint even after several decades of research. In particular, the currently best-known provable algorithms for this problem run in time \(O(n^{10}/\delta ^2)\) [2] or \(O(n^c\log (1/\delta ))\) [17] with \(c\ge 12\) where \(\delta >0\) is the desired accuracy, depending on the model of computation and notion of approximation considered.1 To be sure, the non-Hermitian case is well-motivated: coupled systems of differential equations, linear dynamical systems in control theory, transfer operators in mathematical physics, and the nonbacktracking matrix in spectral graph theory are but a few situations where finding the eigenvalues and eigenvectors of a non-Hermitian matrix is important.
The key difficulties in dealing with non-normal matrices are the interrelated phenomena of non-orthogonal eigenvectors and spectral instability, the latter referring to extreme sensitivity of the eigenvalues and invariant subspaces to perturbations of the matrix. Non-orthogonality slows down convergence of standard algorithms such as the power method, and spectral instability can force the use of very high precision arithmetic, also leading to slower algorithms. Both phenomena together make it difficult to reduce the eigenproblem to a subproblem by “removing” an eigenvector or invariant subspace, since this can only be done approximately and one must control the spectral stability of the subproblem in order to be able to rigorously reason about it.
In this paper, we overcome these difficulties by identifying and leveraging a phenomenon we refer to as pseudospectral shattering: adding a small complex Gaussian perturbation to any matrix typically yields a matrix with well-conditioned eigenvectors and a large minimum gap between the eigenvalues, implying spectral stability. Previously, even the existence of such a regularizing perturbation with favorable parameters was not known [20]. This result builds on the recent solution of Davies’ conjecture [9] and is of independent interest in random matrix theory, where minimum eigenvalue gap bounds in the non-Hermitian case were previously only known for i.i.d. models [33, 55].
We complement the above by proving that a variant of the well-known spectral bisection algorithm in numerical linear algebra [11] is both fast and numerically stable when run on a pseudospectrally shattered matrix—we call an iterative algorithm numerically stable if it can be implemented using finite precision arithmetic with polylogarithmically many bits, corresponding to a dynamical system whose trajectory to the approximate solution is robust to adversarial noise (see, e.g. [57]). The key step in the bisection algorithm is computing the sign function of a matrix, a problem of independent interest in many areas such as control theory and approximation theory [44]. Our main algorithmic contribution is a rigorous analysis of the well-known Newton iteration method [53] for computing the sign function in finite arithmetic, showing that it converges quickly and numerically stably on matrices for which the sign function is well-conditioned, in particular on pseudospectrally shattered ones.
The end result is an algorithm which reduces the general diagonalization problem to a polylogarithmic (in the desired accuracy and dimension n) number of invocations of standard numerical linear algebra routines (multiplication, inversion, and QR factorization), each of which is reducible to matrix multiplication [22], yielding a nearly matrix multiplication runtime for the whole algorithm. This improves on the previously best-known running time of \(O(n^3+n^2\log (1/\delta ))\) arithmetic operations even in the Hermitian case ([21], see also [41, 52]), and yields the same improvement for the related problem of computing the singular value decomposition of a matrix.
We now proceed to give precise mathematical formulations of the eigenproblem and computational model, followed by statements of our results and a detailed discussion of related work.
1.1 Problem Statement
1.1.1 Accuracy and Conditioning
Due to the Abel–Ruffini theorem, it is impossible to have a finite-time algorithm which solves the eigenproblem exactly using arithmetic operations and radicals. Thus, all we can hope for is approximate eigenvalues and eigenvectors, up to a desired accuracy \(\delta >0\). There are two standard notions of approximation. We assume \(\Vert A\Vert \le 1\) for normalization, where throughout this work, \(\Vert \cdot \Vert \) denotes the spectral norm (the \(\ell ^2 \rightarrow \ell ^2\) operator norm).
Proposition 1.1
Note that \(\kappa _{\mathrm {eig}}=\infty \) if and only if A has a double eigenvalue; in this case, a relation like (4) is not possible since different infinitesimal changes to A can produce macroscopically different eigenpairs.
In this paper we will present a backward approximation for the eigenproblem with running time scaling polynomially in \(\log (1/\delta )\), which by (4) yields a forward approximation algorithm with running time scaling polynomially in \(\log (1/\kappa _{\mathrm {eig}}\delta )\).
Remark 1.2
(Multiple Eigenvalues) A backward approximation algorithm for the eigenproblem can be used to accurately find bases for the eigenspaces of matrices with multiple eigenvalues, but quantifying the forward error requires introducing condition numbers for invariant subspaces rather than eigenpairs. A standard treatment of this can be found in any numerical linear algebra textbook, e.g. [26], and we do not discuss it further in this paper for simplicity of exposition.
1.1.2 Models of Computation
These questions may be studied in various computational models: exact real arithmetic (i.e., infinite precision), variable precision rational arithmetic (rationals are stored exactly as numerators and denominators), and finite precision arithmetic (real numbers are rounded to a fixed number of bits which may depend on the input size and accuracy). Only the last two models yield actual Boolean complexity bounds, but introduce a second source of error stemming from the fact that computers cannot exactly represent real numbers.
We study the third model in this paper, axiomatized as follows.
Thus, the outcomes of all operations are adversarially noisy due to roundoff. The bit lengths of numbers stored in this form remain fixed at \(\lg (1/{\textbf {u }})\), where \(\lg \) denotes the logarithm base 2. The bit complexity of an algorithm is therefore the number of arithmetic operations times \(O^*(\log (1/{\textbf {u }}))\), the running time of standard floating point arithmetic, where the \(*\) suppresses \(\log \log (1/{\textbf {u }})\) factors. We will state all running times in terms of arithmetic operations accompanied by the required number of bits of precision, which thereby immediately imply bit complexity bounds.
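For intuition, this is exactly the guarantee of IEEE arithmetic: in double precision \({\textbf {u }} = 2^{-53}\), and each operation satisfies the standard floating-point axiom \(\mathrm {fl}(x \circ y) = (x \circ y)(1+\Delta )\) with \(|\Delta | \le {\textbf {u }}\). A two-line check in Python:

```python
u = 2.0 ** -53                        # unit roundoff of IEEE double precision
s = 0.1 + 0.2                         # each of fl(0.1), fl(0.2), fl(+) rounds once
print(s)                              # 0.30000000000000004
print(abs(s - 0.3) <= 4 * u * 0.3)    # True: the accumulated relative error is O(u)
```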
Remark 1.3
(Overflow, Underflow, and Additive Error) Using p bits for the exponent in the floating-point representation allows one to represent numbers with magnitude in the range \([2^{-2^p},2^{2^p}]\). It can be easily checked that all of the nonzero numbers, norms, and condition numbers appearing during the execution of our algorithms lie in the range \([2^{-\lg ^c(n/\delta )},2^{\lg ^c(n/\delta )}]\) for some small c, so overflow and underflow do not occur. In fact, we could have analyzed our algorithm in a computational model where every number is simply rounded to the nearest rational with denominator \(2^{\lg ^c(n/\delta )}\)—corresponding to additive arithmetic errors. We have chosen to use the multiplicative error floating point model since it is the standard in numerical analysis, but our algorithms do not exploit any subtleties arising from the difference between the two models.
The advantages of the floating point model are that it is realistic and potentially yields very fast algorithms by using a small number of bits of precision (polylogarithmic in n and \(1/\delta \)), in contrast to rational arithmetic, where even a simple operation such as inverting an \(n\times n\) integer matrix requires n extra bits of precision (see, e.g., Chapter 1 of [35]). An iterative algorithm that can be implemented in finite precision (typically, polylogarithmic in the input size and desired accuracy) is called numerically stable.
The disadvantage of the model is that it is only possible to compute forward approximations of quantities which are well-conditioned in the input—in particular, discontinuous quantities such as eigenvalue multiplicity cannot be computed in the floating point model, since it is not even assumed that the input is stored exactly.
1.2 Results and Techniques
Eigenvalue Gaps, \(\kappa _V\), and Pseudospectral Shattering. The key probabilistic result of the paper is that a random complex Gaussian perturbation of any matrix yields a nearby matrix with large minimum eigenvalue gap and small \(\kappa _V\).
Theorem 1.4
The proof of Theorem 1.4 appears in Sect. 3.1. The key idea is to first control \(\kappa _V(X)\) using [9] and then observe that for a matrix with small \(\kappa _V\), two eigenvalues of X near a complex number z imply a small second-least singular value of \(z-X\), which we are able to control.
In Sect. 3.2 we develop the notion of pseudospectral shattering, which is implied by Theorem 1.4 and says roughly that the pseudospectrum consists of n components that lie in separate squares of an appropriately coarse grid in the complex plane. This is useful in the analysis of the spectral bisection algorithm in Sect. 5.
Theorem 1.5
The main new idea in the proof of Theorem 1.5 is to control the evolution of the pseudospectra \(\Lambda _{\epsilon _k}(A_k)\) of the iterates with appropriately decreasing (in k) parameters \(\epsilon _k\), using a sequence of carefully chosen shrinking contour integrals in the complex plane. The pseudospectrum provides a richer induction hypothesis than scalar quantities such as condition numbers, and allows one to control all quantities of interest using the holomorphic functional calculus. This technique is introduced in Sects. 4.1 and 4.2, and carried out in finite arithmetic in Sect. 4.3, yielding Theorem 1.5.
Diagonalization by Spectral Bisection. Given an algorithm for computing the sign function, there is a natural and well-known approach to the eigenproblem pioneered in [11]. The idea is that the matrices \((I\pm \mathrm {sgn}(A))/2\) are spectral projectors onto the invariant subspaces corresponding to the eigenvalues of A in the left and right open half planes, so if some shifted matrix \(z + A\) or \(z + iA\) has roughly half its eigenvalues in each half plane, the problem can be reduced to smaller subproblems appropriate for recursion.
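To make the idea concrete, here is a minimal exact-arithmetic sketch in NumPy (a toy illustration only, not the algorithm analyzed in this paper: no shattering, no finite-arithmetic safeguards, and `sign_newton` is our hypothetical helper name):

```python
import numpy as np

def sign_newton(A, iters=50):
    # Roberts' Newton iteration X <- (X + X^{-1})/2 converges to sgn(A)
    # whenever no eigenvalue of A lies on the imaginary axis.
    X = np.asarray(A, dtype=complex)
    for _ in range(iters):
        X = 0.5 * (X + np.linalg.inv(X))
    return X

# A diagonalizable matrix with eigenvalues {2, 1, -1, -3}
rng = np.random.default_rng(0)
V = rng.standard_normal((4, 4))
A = V @ np.diag([2.0, 1.0, -1.0, -3.0]) @ np.linalg.inv(V)

S = sign_newton(A)
P_plus, P_minus = (np.eye(4) + S) / 2, (np.eye(4) - S) / 2
# The traces count the eigenvalues in each open half-plane: here 2 and 2,
# so recursing on the two invariant subspaces halves the problem.
print(np.trace(P_plus).real, np.trace(P_minus).real)
```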
The two difficulties in carrying out the above approach are: (a) efficiently computing the sign function (b) finding a balanced splitting along an axis that is well-separated from the spectrum. These are nontrivial even in exact arithmetic, since the iteration (8) converges slowly if (b) is not satisfied, even without roundoff error. We use Theorem 1.4 to ensure that a good splitting always exists after a small Gaussian perturbation of order \(\delta \), and Theorem 1.5 to compute splittings efficiently in finite precision. Combining this with well-understood techniques such as rank-revealing QR factorization, we obtain the following theorem, whose proof appears in Sect. 5.1.
Theorem 1.6
Since there is a correspondence, in terms of the condition number, between backward and forward approximations, and as is customary in numerical analysis, our discussion revolves around backward approximation guarantees. For the convenience of the reader, we write down below the explicit guarantees one gets by using (4) and invoking \(\mathsf {EIG}\) with accuracy \(\frac{\delta }{6n \kappa _{\mathrm {eig}}}\).
Corollary 1.7
Remark 1.8
1.3 Related Work
Minimum Eigenvalue Gap. The minimum eigenvalue gap of random matrices has been studied in the case of Hermitian and unitary matrices, beginning with the work of Vinson [64], who proved an \(\Omega (n^{-4/3})\) lower bound on this gap in the case of the Gaussian Unitary Ensemble (GUE) and the Circular Unitary Ensemble (CUE). Bourgade and Ben Arous [3] derived exact limiting formulas for the distributions of all the gaps for the same ensembles. Nguyen, Tao, and Vu [50] obtained non-asymptotic inverse polynomial bounds for a large class of non-integrable Hermitian models with i.i.d. entries (including Bernoulli matrices).
In a different direction, Aizenman et al. proved an inverse-polynomial bound [1] in the case of an arbitrary Hermitian matrix plus a GUE matrix or a Gaussian orthogonal ensemble (GOE) matrix, which may be viewed as a smoothed analysis of the minimum gap. Theorem 3.6 may be viewed as a non-Hermitian analogue of the last result.
In the non-Hermitian case, Ge [33] obtained an inverse polynomial bound for i.i.d. matrices with real entries satisfying some mild moment conditions, and [55]5 proved an inverse polynomial lower bound for the complex Ginibre ensemble. Theorem 3.6 may be seen as a generalization of these results to non-centered complex Gaussian matrices.
Smoothed Analysis and Free Probability. The study of numerical algorithms on Gaussian random matrices (i.e., the case \(A=0\) of smoothed analysis) dates back to [25, 29, 56, 65]. The powerful idea of improving the conditioning of a numerical computation by adding a small amount of Gaussian noise was introduced by Spielman and Teng in [59], in the context of the simplex algorithm. Sankar, Spielman, and Teng [54] showed that adding real Gaussian noise to any matrix yields a matrix with polynomially bounded condition number; [9] can be seen as an extension of this result to the condition number of the eigenvector matrix, where the proof crucially requires that the Gaussian perturbation is complex rather than real. The main difference between our results and most of the results on smoothed analysis (including [2]) is that our running time depends logarithmically rather than polynomially on the size of the perturbation.
The broad idea of regularizing the spectral instability of a nonnormal matrix by adding a random matrix can be traced back to the work of Śniady [58] and Haagerup and Larsen [37] in the context of Free Probability theory.
Matrix Sign Function. “Of course, to obtain a complete picture, we also need to understand the effect of rounding errors on the iteration prior to convergence. This effect is surprisingly difficult to analyze. \(\ldots \) Since errors will in general occur on each iteration, the overall error will be a complicated function of \(\kappa _{sign}(X_k)\) and \(E_k\) for all k. \(\ldots \) We are not aware of any published rounding error analysis for the computation of sign(A) via the Newton iteration.”—[40, Section 5.7]
This is precisely the problem solved by Theorem 1.5, which is, as far as we know, the first provable algorithm for computing the sign function of an arbitrary matrix that does not require computing the Jordan form.
In the special case of Hermitian matrices, Higham [38] established efficient reductions between the sign function and the polar decomposition. Byers and Xu [16] proved backward stability of a certain scaled version of the Newton iteration for Hermitian matrices, in the context of computing the polar decomposition. Higham and Nakatsukasa [49] (see also the improvement [48]) proved backward stability of a different iterative scheme for computing the polar decomposition, and used it to give backward stable spectral bisection algorithms for the Hermitian eigenproblem with \(O(n^3)\)-type complexity.
Non-Hermitian Eigenproblem. Floating Point Arithmetic. The eigenproblem has been thoroughly studied in the numerical analysis community, in the floating point model of computation. While there are provably fast and accurate algorithms in the Hermitian case (see the next subsection) and a large body of work for various structured matrices (see, e.g., [13]), the general case is not nearly as well-understood. As recently as 1997, J. Demmel remarked in his well-known textbook [26]: “\(\ldots \) the problem of devising an algorithm [for the non-Hermitian eigenproblem] that is numerically stable and globally (and quickly!) convergent remains open.”
Results for finite-precision floating-point arithmetic
Result | Error | Arithmetic Ops | Boolean Ops | Restrictions |
---|---|---|---|---|
[52] | Backward | \(n^3+n^2\log (1/\delta )\) | \(n^3\log (n/\delta )+n^2\log (1/\delta )\log (n/\delta )\) | Hermitian |
[2] | Backward | \(n^{10}/\delta ^2\) | \(n^{10}/\delta ^2\cdot \mathrm {polylog}(n/\delta )\)\(^{a}\) | |
[12] | Backward | \(n^{\omega +1}\mathrm {polylog}(n)\log (1/\delta )\) | \(n^{\omega +1}\mathrm {polylog}(n)\log (1/\delta )\) | Hermitian |
Theorem 1.6\(^{b}\) | Backward | \(T_\mathsf {MM}(n)\log ^2(n/\delta )\) | \(T_\mathsf {MM}(n)\log ^6(n/\delta )\log (n)\) | |
Corollary 1.7 | Forward | \(T_\mathsf {MM}(n)\log ^2(n\kappa _{\mathrm {eig}}/\delta )\) | \(T_\mathsf {MM}(n)\log ^6(n\kappa _{\mathrm {eig}}/\delta )\log (n)\) | |
Results for other models of arithmetic
Result | Model | Error | Arithmetic Ops | Boolean Ops | Restrictions |
---|---|---|---|---|---|
[17] | Rational | Forward\(^{a}\) | \(\mathrm {poly}(a,n,\log (1/\delta ))\)\(^{b}\) | \(\mathrm {poly}(a,n,\log (1/\delta ))\) | |
[51] | Rational | Forward | \(n^{\omega }+n\log \log (1/\delta )\) | \(n^{\omega +1}a+n^2\log (1/\delta )\log \log (1/\delta )\) | Eigenvalues only\(^{c}\) |
[45] | Finite\(^{c}\) | Forward | \(n^\omega \log (n)\log (1/\delta )\) | \(n^\omega \log ^4(n)\log ^2(n/\delta )\) | Hermitian, \(\lambda _1\) only |
As far as we are aware, there are no other published provably polynomial-time algorithms for the general eigenproblem. The two standard references for diagonalization appearing most often in theoretical computer science papers do not meet this criterion. In particular, the widely cited work by Pan and Chen [51] proves that one can compute the eigenvalues of A in \(O(n^\omega + n\log \log (1/\delta ))\) (suppressing logarithmic factors) arithmetic operations by finding the roots of its characteristic polynomial, which becomes a bound of \(O(n^{\omega +1}a+n^2\log (1/\delta )\log \log (1/\delta ))\) bit operations if the characteristic polynomial is computed exactly in rational arithmetic and the matrix has entries of bit length a. However, that paper does not give any bound on the time needed to find approximate eigenvectors from approximate eigenvalues, and states this as an open problem.9
Finally, the important work of Demmel et al. [22] (see also the followup [6]), which we rely on heavily, does not claim to provably solve the eigenproblem either—it bounds the running time of one iteration of a specific algorithm, and shows that such an iteration can be implemented numerically stably, without proving any bound on the number of iterations required in general.
Hermitian Eigenproblem. For comparison, the eigenproblem for Hermitian matrices is much better understood. We cannot give a complete bibliography of this huge area, but mention two relevant landmark results: the work of Wilkinson [66], who exhibited a globally convergent diagonalization algorithm, and the work of Dekker and Traub [21], who quantified the rate of convergence of Wilkinson’s algorithm, from which it follows that the Hermitian eigenproblem can be solved with backward error \(\delta \) in \(O(n^3+n^2\log (1/\delta ))\) arithmetic operations in exact arithmetic.10 We refer the reader to [52, §8.10] for the simplest and most insightful proof of this result, due to Hoffman and Parlett [41].
There has also recently been renewed interest in this problem in the theoretical computer science community, with the goal of bringing the runtime close to \(O(n^\omega )\): Louis and Vempala [45] show how to find a \(\delta \)-approximation of just the largest eigenvalue in \(O(n^\omega \log ^4(n)\log ^2(1/\delta ))\) bit operations, and Ben-Or and Eldar [12] give an \(O(n^{\omega +1}\mathrm {polylog}(n))\)-bit-operation algorithm for finding a \(1/\mathrm {poly}(n)\)-approximate diagonalization of an \(n\times n\) Hermitian matrix normalized to have \(\Vert A\Vert \le 1\).
Remark 1.9
(Davies’ Conjecture) The beautiful paper [20] introduced the idea of approximating a matrix function f(A) for nonnormal A by \(f(A+E)\) for some well-chosen E regularizing the eigenvectors of A. This directly inspired our approach to solving the eigenproblem via regularization.
The existence of an approximate diagonalization (1) for every A with a well-conditioned similarity V (i.e., \(\kappa (V)\) depending polynomially on \(\delta \) and n) was precisely the content of Davies’ conjecture [20], which was recently solved by some of the authors and Mukherjee in [9]. The existence of such a V is a prerequisite for proving that one can always efficiently find an approximate diagonalization in finite arithmetic, since if \(\Vert V\Vert \Vert V^{-1}\Vert \) is very large it may require many bits of precision to represent. Thus, Theorem 1.6 can be viewed as an efficient algorithmic answer to Davies’ question.
Remark 1.10
(Subsequent work in Random Matrix Theory) Since the first version of the present paper was made public there have been some advances in random matrix theory [8, 43] that prove analogues of Theorem 1.4 in the case where \(G_n\) is replaced by a perturbation with random real independent entries. These results formally articulate that, in the context of this paper, there is nothing special about complex Ginibre matrices, and that the same regularization effect can be achieved using a broader class of perturbations. Bounding the eigenvector condition number and the eigenvalue gap when the random perturbation has real entries poses interesting technical challenges that were tackled in different ways in the aforementioned papers. We also refer the reader to [18] where optimal results were obtained in the case where \(A=0\) and \(G_n\) has real Gaussian entries.
Remark 1.11
(Alternate Proofs using [2]) In October 2021 (about two years after the first appearance of this paper), we noticed that a version of Theorem 1.4 (with a worse \(\kappa _V\) bound but a better eigenvalue gap bound) as well as the main theorem of [9] (with a slightly worse dependence on n) can be easily derived from some auxiliary results shown in [2] (specifically Proposition 2.7 and Theorem 2.14 of that paper), which we were not previously aware of. We present these short alternate proofs in “Appendix D”. We remark that our original proofs are essentially different from those appearing in [2]—in particular, they rely on studying the area of pseudospectra, whereas the proof of Theorem 2.14 of [2] relies on geometric concepts and the coarea formula for Gaussian integrals of certain determinantal quantities on Riemannian manifolds. The proofs based on pseudospectra are arguably more flexible; as mentioned in Remark 1.10, they have been recently generalized to ensembles besides the complex Ginibre ensemble, which seems difficult to do for the more algebraic proofs of [2].
Reader Guide. This paper contains a lot of parameters and constants. On first reading, it may be good to largely ignore the constants not appearing in exponents and to keep in mind the typical setting \(\delta =1/\mathrm {poly}(n)\) for the accuracy, in which case the important auxiliary parameters \(\omega , 1-\alpha , \epsilon , \beta , \eta \) are all \(1/\mathrm {poly}(n)\), and the machine precision is \(\log (1/{\textbf {u }})=\mathrm {polylog}(n)\).
2 Preliminaries
Let \(M \in \mathbb {C}^{n\times n}\) be a complex matrix, not necessarily normal. We will write matrices and vectors with uppercase and lowercase letters, respectively. Let us denote by \(\Lambda (M)\) the spectrum of M and by \(\lambda _i(M)\) its individual eigenvalues. In the same way we denote the singular values of M by \(\sigma _i(M)\) and we adopt the convention \(\sigma _1(M) \ge \sigma _2(M) \ge \cdots \ge \sigma _n(M)\). When M is clear from the context we will simplify notation and just write \(\Lambda , \lambda _i\) or \(\sigma _i\), respectively.
2.1 Spectral Projectors and Holomorphic Functional Calculus
Since every spectral projector P commutes with M, its range agrees exactly with an invariant subspace of M. We will often find it useful to choose some region of the complex plane bounded by a simple closed positively oriented rectifiable curve \(\Gamma \), and compute the spectral projector onto the invariant subspace spanned by those eigenvectors whose eigenvalues lie inside \(\Gamma \). Such a projector can be computed by a contour integral analogous to the above.
Recall that if f is any function, and M is diagonalizable, then we can meaningfully define \(f(M) := V f(D) V^{-1}\), where f(D) is simply the result of applying f to each element of the diagonal matrix D. The holomorphic functional calculus gives an equivalent definition that extends to the case when M is non-diagonalizable. As we will see, it has the added benefit that bounds on the norm of the resolvent of M can be converted into bounds on the norm of f(M).
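For reference, the formula in question is the standard one from the holomorphic functional calculus (we record it here for the reader's convenience; the precise statement used in the paper is Proposition 2.1 below): for \(f\) holomorphic on a neighborhood of the region bounded by \(\Gamma \),
$$\begin{aligned} f(M) = \frac{1}{2\pi i}\oint _\Gamma f(z)(z-M)^{-1}\,\mathrm {d}z, \end{aligned}$$
when \(\Gamma \) encloses all of \(\Lambda (M)\); taking \(f \equiv 1\) with \(\Gamma \) enclosing only part of the spectrum yields the spectral projector onto the corresponding invariant subspace,
$$\begin{aligned} P = \frac{1}{2\pi i}\oint _\Gamma (z-M)^{-1}\,\mathrm {d}z. \end{aligned}$$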
Proposition 2.1
2.2 Pseudospectrum and Spectral Stability
The \(\epsilon \)-pseudospectrum of a matrix is defined in (5). Directly from this definition, we can relate the pseudospectra of a matrix and a perturbation of it.
Proposition 2.2
([62], Theorem 52.4) For any \(n \times n\) matrices M and E and any \(\epsilon > 0\), \(\Lambda _{\epsilon - \Vert E\Vert }(M) \subseteq \Lambda _\epsilon (M+E)\).
It is also immediate that \(\Lambda (M) \subset \Lambda _\epsilon (M)\), and in fact a stronger relationship holds as well:
Proposition 2.3
([62], Theorem 4.3) For any \(n \times n\) matrix M, any bounded connected component of \(\Lambda _\epsilon (M)\) must contain an eigenvalue of M.
Lemma 2.4
In this paper we will repeatedly use that assumptions about the pseudospectrum of a matrix can be turned into stability statements about functions applied to the matrix via the holomorphic functional calculus. Here we describe an instance of particular importance.
Proof of Proposition 1.1
For \(t\in [0, 1]\) define \(A(t) = (1-t)A+ tA' \). Since \(\delta <\frac{\mathrm {gap}(A)}{8\kappa _V(A)}\), the Bauer–Fike theorem implies that A(t) has distinct eigenvalues for all t, and in fact \(\mathrm {gap}(A(t))\ge \frac{3\mathrm {gap}(A)}{4}\). Standard results in perturbation theory (for instance [34, Theorem 1] or any of the references therein) imply that for every \(i=1, \dots , n\), A(t) has a unique eigenvalue \(\lambda _i(t)\) such that \(\lambda _i(t)\) is a differentiable trajectory, \(\lambda _i(0) =\lambda _i\) and \(\lambda _i(1)=\lambda _i'\). Let \(P_i(t)\) be the associated spectral projector of \(\lambda _i(t)\), which is uniquely defined via a contour integral, and write \(P_i = P_i(0)\).
2.3 Finite-Precision Arithmetic
Throughout the paper, we will take the pedagogical perspective that our algorithms are games played between the practitioner and an adversary who may additively corrupt each operation. In particular, we will include explicit error terms (always denoted by \(E_{(\cdot )}\)) in each appropriate step of every algorithm. In many cases we will first analyze a routine in exact arithmetic—in which case the error terms will all be set to zero—and subsequently determine the machine precision \({\textbf {u }}\) necessary so that the errors are small enough to guarantee convergence.
2.4 Sampling Gaussians in Finite Precision
For various parts of the algorithm, we will need to sample from normal distributions. For our model of arithmetic, we assume that the complex normal distribution can be sampled up to machine precision in O(1) arithmetic operations. To be precise, we assume the existence of the following sampler:
Definition 2.5
(Complex Gaussian Sampling)
Note that, since the Gaussian distribution has unbounded support, one should only expect the sampler \(\mathsf {N}(\sigma )\) to have a relative error guarantee of the sort \(|{\widetilde{G}} - G| \le c_{\mathsf {N}}\sigma |G| \cdot {\textbf {u }}\). However, as will become clear below, we only care about realizations of Gaussians satisfying \(|G|<R\), for a certain prespecified \(R>0\), and the rare event \(|G|>R\) will be accounted for in the failure probability of the algorithm. So, for the sake of exposition, we have omitted the |G| in the bound on \(|{\widetilde{G}}-G|\).
We will only sample \(O(n^2)\) Gaussians during the algorithm, so this sampling will not contribute significantly to the runtime. Here as everywhere in the paper, we will omit issues of underflow or overflow. Throughout this paper, to simplify some of our bounds, we will also assume that \(c_{\mathsf {N}}\ge 1\).
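A sketch of such a sampler, assuming access to exact real Gaussian draws and modeling the rounding explicitly (the variance convention and the resulting constant \(c_{\mathsf {N}}\) here are our illustrative assumptions, since Definition 2.5 is only summarized above):

```python
import numpy as np

def sample_complex_gaussian(sigma, u=2.0 ** -53, rng=None):
    """Return a machine-representable draw G~ with |G~ - G| <= c_N * sigma * u
    for some true complex Gaussian G (here: independent real and imaginary
    parts, each of variance sigma^2 / 2)."""
    rng = np.random.default_rng() if rng is None else rng
    g = sigma * (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    step = sigma * u  # rounding to a grid of spacing sigma*u gives c_N = O(1)
    return complex(round(g.real / step) * step, round(g.imag / step) * step)
```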
2.5 Black-box Error Assumptions for Multiplication, Inversion, and QR
Our algorithm uses matrix–matrix multiplication, matrix inversion, and QR factorization as primitives. For our analysis, we must therefore assume some bounds on the error and runtime costs incurred by these subroutines. In this section, we first formally state the kind of error and runtime bounds we require, and then discuss some implementations known in the literature that satisfy each of our requirements with modest constants.
Our definitions are inspired by the definition of logarithmic stability introduced in [22]. Roughly speaking, they say that implementing the algorithm with floating point precision \({\textbf {u }}\) yields an accuracy which is at most polynomially or quasipolynomially in n worse than \({\textbf {u }}\) (possibly also depending on the condition number in the case of inversion). Their definition has the property that while a logarithmically stable algorithm is not strictly speaking backward stable, it can attain the same forward error bound as a backward stable algorithm at the cost of increasing the bit length by a polylogarithmic factor. See Section 3 of their paper for a precise definition and a more detailed discussion of how their definition relates to standard numerical stability notions.
Definition 2.6
Definition 2.7
Definition 2.8
- 1.
R is exactly upper triangular.
- 2.
There is a unitary \(Q'\) and a matrix \(A'\) such that
$$\begin{aligned} Q' A'= R, \end{aligned}$$
(17)
and
$$\begin{aligned} \Vert Q' - Q\Vert \le \mu _\mathsf {QR}(n){\textbf {u }}, \quad \text {and} \quad \Vert A'-A\Vert \le \mu _\mathsf {QR}(n){\textbf {u }}\Vert A\Vert . \end{aligned}$$
Remark 2.9
The above definitions can be instantiated with traditional \(O(n^3)\)-complexity algorithms for which \(\mu _{\mathsf {MM}}, \mu _\mathsf {QR}, \mu _{\mathsf {INV}}\) are all O(n) and \(c_\mathsf {INV}=1\) [39]. This yields easily implementable practical algorithms with running times depending cubically on n.
In order to achieve \(O(n^\omega )\)-type efficiency, we instantiate them with fast-matrix-multiplication-based algorithms and with \(\mu (n)\) taken to be a low-degree polynomial [22]. Specifically, the following parameters are known to be achievable.
Theorem 2.10
- 1.
If \(\omega \) is the exponent of matrix multiplication, then for every \(\eta >0\) there is a \(\mu _{\mathsf {MM}}(n)\)-stable multiplication algorithm with \(\mu _{\mathsf {MM}}(n)=n^{c_\eta }\) and \(T_\mathsf {MM}(n)=O(n^{\omega +\eta })\), where \(c_\eta \) does not depend on n.
- 2.
Given an algorithm for matrix multiplication satisfying (1), there is a \((\mu _{\mathsf {INV}}(n),c_\mathsf {INV})\)-stable inversion algorithm with
$$\begin{aligned} \mu _{\mathsf {INV}}(n)\le O(\mu _{\mathsf {MM}}(n)n^{\lg (10)}),\quad \quad c_\mathsf {INV}\le 8, \end{aligned}$$
and \(T_\mathsf {INV}(n)\le T_\mathsf {MM}(3n)=O(T_\mathsf {MM}(n))\).
- 3.
Given an algorithm for matrix multiplication satisfying (1), there is a \(\mu _\mathsf {QR}(n)\)-stable QR factorization algorithm with
$$\begin{aligned} \mu _\mathsf {QR}(n)=O(n^{c_\mathsf {QR}} \mu _{\mathsf {MM}}(n)), \end{aligned}$$
where \(c_\mathsf {QR}\) is an absolute constant, and \(T_\mathsf {QR}(n)=O(T_\mathsf {MM}(n))\).
Proof
(1) is Theorem 3.3 of [23]. (2) is Theorem 3.3 (see also equation (9) above its statement) of [22]. The final claim follows by noting that \(T_\mathsf {MM}(3n)=O(T_\mathsf {MM}(n))\) by dividing a \(3n\times 3n\) matrix into nine \(n\times n\) blocks and proceeding blockwise, at the cost of a factor of 9 in \(\mu _{\mathsf {INV}}(n)\). (3) appears in Section 4.1 of [22]. \(\square \)
We remark that for specific existing fast matrix multiplication algorithms such as Strassen’s algorithm, specific small values of \(\mu _\mathsf {MM}(n)\) are known (see [23] and its references for details), so these may also be used as a black box, though we will not do this in this paper.
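To illustrate the adversarial-roundoff point of view, the sketch below wraps NumPy routines as stand-ins for the \(\mathsf {MM}\) and \(\mathsf {QR}\) black boxes. The error envelopes shown are only the qualitative shape of Definitions 2.6 and 2.8 (whose exact statements appear above only by name), with a random perturbation standing in for the worst case:

```python
import numpy as np

U = 2.0 ** -40                       # machine precision u (illustrative)
rng = np.random.default_rng(1)

def adversary(M, envelope):
    # Any perturbation of spectral norm <= envelope is allowed; for testing,
    # use a random direction of exactly that norm.
    E = rng.standard_normal(M.shape) + 1j * rng.standard_normal(M.shape)
    return M + envelope * E / np.linalg.norm(E, 2)

def MM(A, B, mu=lambda n: 10 * n):
    # Error envelope mu(n) * u * ||A|| ||B||, as for a mu_MM(n)-stable multiply.
    n = A.shape[0]
    return adversary(A @ B, mu(n) * U * np.linalg.norm(A, 2) * np.linalg.norm(B, 2))

def QR(A, mu=lambda n: 10 * n):
    # R is exactly upper triangular; Q is within mu(n) * u of an exact unitary
    # factor of a nearby matrix.
    Q, R = np.linalg.qr(A)
    return adversary(Q, mu(A.shape[0]) * U), R
```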
3 Pseudospectral Shattering
This section is devoted to our central probabilistic result, Theorem 1.4, and the accompanying notion of pseudospectral shattering which will be used extensively in our analysis of the spectral bisection algorithm in Sect. 5.
3.1 Smoothed Analysis of Gap and Eigenvector Condition Number
As is customary in the literature, we will refer to an \(n\times n\) random matrix \(G_n\) whose entries are independent complex Gaussians drawn from \(\mathcal {N}(0,1_\mathbb {C}/n)\) as a normalized complex Ginibre random matrix. To be absolutely clear, and because other choices of scaling are quite common, we mean that \(\mathbb {E}G_{i,j} = 0\) and \(\mathbb {E}|G_{i,j}|^2 = 1/n\).
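Concretely, a normalized complex Ginibre sample and the smoothed-analysis perturbation \(X = A + \gamma G_n\) of Theorem 1.4 can be generated as follows (ignoring the finite-precision considerations of Sect. 2.4):

```python
import numpy as np

def ginibre(n, rng=None):
    # Entries are i.i.d. with E G_ij = 0 and E |G_ij|^2 = 1/n: real and
    # imaginary parts are independent N(0, 1/(2n)).
    rng = np.random.default_rng() if rng is None else rng
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return G / np.sqrt(2 * n)

n, gamma = 100, 1e-3
A = np.diag(np.ones(n - 1), 1)   # a single Jordan block: maximally non-normal
X = A + gamma * ginibre(n)       # typically has small kappa_V and large gap(X)
```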
In the course of proving Theorem 1.4, we will need to bound the probability that the second-smallest singular value of an arbitrary matrix with small Ginibre perturbation is atypically small. We begin with a well-known lower tail bound on the singular values of a Ginibre matrix alone.
Theorem 3.1
As in earlier work by several of the authors [9], we can transfer this result to the case of a Ginibre perturbation via a remarkable coupling result of P. Śniady.
Theorem 3.2
- 1.
the marginals \(G_1\) and \(G_2\) are distributed as normalized complex Ginibre matrices, and
- 2.
almost surely \(\sigma _i(A_1 + \sqrt{t} G_1) \le \sigma _i(A_2 + \sqrt{t} G_2)\) for every i.
Corollary 3.3
Proof
\(\square \)
We will need as well the main theorem of [9], which shows that the addition of a small complex Ginibre to an arbitrary matrix tames its eigenvalue condition numbers.
Theorem 3.4
Our final lemma before embarking on the proof in earnest shows that bounds on the j-th smallest singular value and eigenvector condition number are sufficient to rule out the presence of j eigenvalues in a small region. For our particular application, we will take \(j=2\).
Lemma 3.5
Proof
We now present the main tail bound that we use to control the minimum gap and eigenvector condition number.
Theorem 3.6
Proof
Write \(\Lambda (X):=\{\lambda _1,\ldots ,\lambda _n\}\) for the (random) eigenvalues of \(X:=A+\gamma G_n\), in increasing order of magnitude (there are no ties almost surely). Let \(\mathcal {N}\subset \mathbb {C}\) be a minimal r/2-net of \(B:=D(0,3)\), recalling the standard fact that one exists of size no more than \((3\cdot 4/r)^2=144/r^2\). The most useful feature of such a net is that, by the triangle inequality, for any \(a,b \in D(0,3)\) with distance at most r, there is a point \(y\in \mathcal {N}\) with \(|y-(a+b)/2|<r/2\) satisfying \(a,b\in D(y,r)\). In particular, if \(\mathrm {gap}(X) < r\), then there are two eigenvalues in the disk of radius r centered at some point \(y \in \mathcal {N}\).
A specific setting of parameters in Theorem 3.6 immediately yields Theorem 1.4.
Proof of Theorem 1.4
Since it is of independent interest in random matrix theory, we record the best bound on the gap alone that is possible to extract from the theorem above.
Corollary 3.7
(Minimum Gap Bound)
Proof
3.2 Shattering
Propositions 2.2 and 2.3 in the preliminaries together tell us that if the \(\epsilon \)-pseudospectrum of an \(n\times n\) matrix A has n connected components, then each eigenvalue of any size-\(\epsilon \) perturbation \({\widetilde{A}}\) will lie in its own connected component of \(\Lambda _\epsilon (A)\). The following key definitions make this phenomenon quantitative in a sense which is useful for our analysis of spectral bisection.
Definition 3.8
Definition 3.9
- 1.
Every square of \(\mathsf {g}\) has at most one eigenvalue of A.
- 2.
\(\Lambda _\epsilon (A)\cap \mathsf {g}=\emptyset \).
Observation 3.10
As \(\Lambda _\epsilon (A)\) contains a ball of radius \(\epsilon \) about each eigenvalue of A, shattering of the \(\epsilon \)-pseudospectrum with respect to a grid with side length \(\omega \) implies \(\epsilon \le \omega /2\).
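Numerically, condition 2 of Definition 3.9 can be probed using the standard characterization \(\Lambda _\epsilon (A) = \{z : \sigma _n(z-A) < \epsilon \}\). The following hypothetical checker samples points along the grid lines (a discretization, so it provides evidence rather than a certificate):

```python
import numpy as np

def sigma_min(A, z):
    # Smallest singular value of z*I - A; z lies in Lambda_eps(A) iff this < eps.
    return np.linalg.svd(z * np.eye(A.shape[0]) - A, compute_uv=False)[-1]

def grid_avoided(A, xs, ys, eps, samples=200):
    """Check sigma_min(z - A) >= eps at sampled points z on the grid whose
    vertical lines have real parts xs and horizontal lines imaginary parts ys."""
    for x in xs:
        for y in np.linspace(min(ys), max(ys), samples):
            if sigma_min(A, x + 1j * y) < eps:
                return False
    for y in ys:
        for x in np.linspace(min(xs), max(xs), samples):
            if sigma_min(A, x + 1j * y) < eps:
                return False
    return True
```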
As a warm-up for more sophisticated arguments later on, we give here an easy consequence of the shattering property.
Lemma 3.11
If \(\lambda _1, \dots , \lambda _n\) are the eigenvalues of A, and \(\Lambda _\epsilon (A)\) is shattered with respect to a grid \(\mathsf {g}\) with side length \(\omega \), then every eigenvalue condition number satisfies \(\kappa (\lambda _i) \le \frac{2\omega }{\pi \epsilon }\).
Proof
Theorem 3.12
Proof
Theorem 3.13
Proof
- 1.
An additive error of operator norm at most \(n\cdot c_{\mathsf {N}}\cdot (1/\sqrt{n})\cdot {\textbf {u }}\le c_{\mathsf {N}}\sqrt{n}\cdot {\textbf {u }}\) from \(\mathsf {N}\), by Definition 2.5.
- 2.
An additive error of norm at most \(\sqrt{n}\cdot \Vert X\Vert \cdot {\textbf {u }}\le 3\sqrt{n}{\textbf {u }}\), with probability at least \(1-1/n\), from the roundoff E in step 2.
4 Matrix Sign Function
In Sect. 4.1 we briefly discuss the specific preliminaries that will be used throughout this section. In Sect. 4.2 we give a pseudospectral proof of the rapid global convergence of the Newton iteration when implemented in exact arithmetic. In Sect. 4.3 we show that the proof of Sect. 4.2 is robust enough to handle the finite arithmetic case; a formal statement of this main result is the content of Theorem 4.9.
4.1 Circles of Apollonius
It has been known since antiquity that a circle in the plane may be described as the set of points with a fixed ratio of distances to two focal points. By fixing the focal points and varying the ratio in question, we get a family of circles named for the Greek geometer Apollonius of Perga. We will exploit several interesting properties enjoyed by these Circles of Apollonius in the analysis below.
Observation 4.1
([53]) The Newton map \(g\) is a two-to-one map from \(\mathsf {C}^{+}_{\alpha }\) to \(\mathsf {C}^{+}_{\alpha ^2}\), and a two-to-one map from \(\mathsf {C}^{-}_{\alpha }\) to \(\mathsf {C}^{-}_{\alpha ^2}\).
Proof
It follows from Observation 4.1 that under repeated application of the Newton map g, any point in the right or left half-plane converges to \(+1\) or \(-1\), respectively.
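The scalar computation behind Observation 4.1 is short: writing \(m(z) = |z-1|/|z+1|\) for the Apollonian ratio, one checks that \(g(z)-1 = (z-1)^2/2z\) and \(g(z)+1 = (z+1)^2/2z\), so \(m(g(z)) = m(z)^2\). A few iterations make the quadratic convergence visible:

```python
g = lambda z: (z + 1 / z) / 2          # the Newton map
m = lambda z: abs(z - 1) / abs(z + 1)  # Apollonian ratio; m < 1 iff Re z > 0

z = 0.3 + 2.1j                         # any point in the open right half-plane
for k in range(6):
    print(k, z, m(z))                  # m squares at every step
    z = g(z)                           # z -> +1; for Re z < 0, z -> -1 instead
```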
4.2 Exact Arithmetic
In this section, we set \(A_0 := A\) and \(A_{k+1} := g(A_k)\) for all \(k \ge 0\). In the case of exact arithmetic, Observation 4.1 implies global convergence of the Newton iteration when A is diagonalizable. For the convenience of the reader, we provide this argument (due to [53]) below.
Proposition 4.2
Proof
Consider the spectral decomposition \(A = \sum _{i=1}^n \lambda _i v_i w_i^*,\) and denote by \(\lambda _i^{(N)}\) the eigenvalues of \(A_N\).
The above analysis becomes useless when trying to prove the same statement in the framework of finite arithmetic. This is due to the fact that at each step of the iteration the roundoff error can make the eigenvector condition numbers of the \(A_k\) grow. In fact, since \(\kappa _V(A_k)\) is sensitive to infinitesimal perturbations whenever \(A_k\) has a multiple eigenvalue, it seems difficult to control it against adversarial perturbations as the iteration converges to \(\mathrm {sgn}(A_k)\) (which has very high multiplicity eigenvalues). A different approach, also due to [53], yields a proof of convergence in exact arithmetic even when A is not diagonalizable. However, that proof relies heavily on the fact that \(m(A_N)\) is an exact power of \(m(A_0)\), or more precisely, it requires the sequence \(A_k\) to have the same generalized eigenvectors, which is again not the case in the finite arithmetic setting.
Therefore, a robust version, tolerant to perturbations, of the above proof is needed. To this end, instead of simultaneously keeping track of the eigenvector condition number and the spectrum of the matrices \(A_k\), we will just show that for certain \(\epsilon _k > 0\), the \(\epsilon _k\)-pseudospectra of these matrices are contained in a certain shrinking region dependent on k. This invariant is inherently robust to perturbations smaller than \(\epsilon _k\), unaffected by clustering of eigenvalues due to convergence, and allows us to bound the accuracy and other quantities of interest via the functional calculus. For example, the following lemma shows how to obtain a bound on \(\Vert A_N- \mathrm {sgn}(A)\Vert \) solely using information from the pseudospectrum of \(A_N\).
Lemma 4.3
Proof
The lemma below is instrumental in determining the sequences \(\alpha _k, \epsilon _k\).
Lemma 4.4
Proof
From the definition of pseudospectrum, our hypothesis implies \(\Vert (z - A)^{-1}\Vert < 1/\epsilon \) for every z outside of \(\mathsf {C}_{\alpha }\). The proof will hinge on the observation that, for each \(\alpha ' \in (\alpha ^2,\alpha )\), this resolvent bound allows us to bound the resolvent of \(g(A)\) everywhere in the Apollonian annulus \(\mathsf {A}_{\alpha ,\alpha '}\).
Lemma 4.5
Let \(1> \alpha , \beta > 0\) be given. Then for any \(x \in \partial \mathsf {C}_{\alpha }\) and \(y \in \partial \mathsf {C}_{\beta }\), we have \(|x-y| \ge (\alpha -\beta )/2\).
Proof
Lemma 4.4 will also be useful in bounding the condition numbers of the \(A_k\), which is necessary for the finite arithmetic analysis.
Corollary 4.6
Proof
Another direct application of Lemma 4.4 yields the following.
Lemma 4.7
Proof
Define recursively \(\alpha _0 = \alpha \), \(\epsilon _0 = \epsilon \), \(\alpha _{k+1} = D \alpha _k^2\) and \(\epsilon _{k+1}= \frac{1}{8} \epsilon _k \alpha _k (D-1)(1-\alpha _0^2).\) It is easy to see by induction that this definition is consistent with the definition of \(\alpha _N\) and \(\epsilon _N\) given in the statement.
We are now ready to prove the main result of this section, a pseudospectral version of Proposition 4.2.
Proposition 4.8
Proof
4.3 Finite Arithmetic
Finally, we turn to the analysis of \(\mathsf {SGN}\) in finite arithmetic. By making the machine precision small enough, we can bound the effect of roundoff to ensure that the parameters \(\alpha _k\), \(\epsilon _k\) are not too far from what they would have been in the exact arithmetic analysis above. We will stop the iteration before any of the quantities involved become prohibitively small, so we will only need \(\mathrm {polylog}(1-\alpha _0, \epsilon _0, \beta )\) bits of precision, where \(\beta \) is the accuracy parameter.
In exact arithmetic, recall that the Newton iteration is given by \(A_{k+1} = g(A_{k}) = \frac{1}{2} (A_k + A_k^{-1}).\) Here we will consider the finite arithmetic version \(\mathsf {G}\) of the Newton map \(g\), defined as \(\mathsf {G}(A) := g(A)+E_A\) where \(E_A\) is an adversarial perturbation coming from the roundoff error. Hence, the sequence of interest is given by \({\widetilde{A}}_0 := A\) and \({\widetilde{A}}_{k+1} := \mathsf {G}({\widetilde{A}}_k)\) .
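A minimal sketch of this perturbed iteration, simulating the roundoff \(E_A\) by computing each exact step in double precision and storing the result in single precision (the actual \(\mathsf {SGN}\) additionally fixes the iteration count N as in Theorem 4.9):

```python
import numpy as np

def SGN_finite(A, N, lowp=np.complex64):
    # Each step evaluates g(X) = (X + X^{-1})/2 in double precision and then
    # rounds to single precision, so the stored iterates follow
    # X_{k+1} = g(X_k) + E_k with ||E_k|| on the order of single-precision u.
    X = np.asarray(A, dtype=lowp)
    for _ in range(N):
        Xd = X.astype(np.complex128)
        X = (0.5 * (Xd + np.linalg.inv(Xd))).astype(lowp)  # rounding plays E_k
    return X
```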
In this subsection we will prove the following theorem concerning the runtime and precision of \(\mathsf {SGN}\). Our assumptions on the size of the parameters \(\alpha _0, \beta , \mu _{\mathsf {INV}}(n)\) and \(c_\mathsf {INV}\) are in place only to simplify the analysis of constants; these assumptions are not required for the execution of the algorithm.
Theorem 4.9
Later on, we will need to call \(\mathsf {SGN}\) on a matrix with shattered pseudospectrum; the lemma below calculates acceptable parameter settings for shattering so that the pseudospectrum is contained in the required pair of Apollonian circles, satisfying the hypothesis of Theorem 4.9.
Lemma 4.10
Proof
The proof of Theorem 4.9 will proceed as in the exact arithmetic case, with the modification that \(\epsilon _k\) must be decreased by an additional factor after each iteration to account for roundoff. At each step, we set the machine precision \({\textbf {u }}\) small enough so that the \(\epsilon _k\) remain close to what they would be in exact arithmetic. For the analysis we will introduce an explicit auxiliary sequence \(e_k\) that lower bounds the \(\epsilon _k\), provided that \({\textbf {u }}\) is small enough.
Lemma 4.11
The proof of this lemma is deferred to “Appendix A”.
- 1.
The base case \(k=0\) holds because by assumption, \(\Lambda _{\epsilon _0} \subset \mathsf {C}_{\alpha _0}\).
- 2.
Here we recursively define \(\alpha _{k+1}\). Set
$$\begin{aligned} \alpha _{k+1} := (1 + s/4) \alpha _k^2. \end{aligned}$$
In the notation of Sect. 4.2, this corresponds to setting \(D = 1+s/4\). This definition ensures that \(\alpha _k^2 \le \alpha _{k+1} \le \alpha _k\) for all k, and also gives us the bound \((1+s/4)\alpha _0 \le 1-s/2\). We also have the closed form
$$\begin{aligned} \alpha _k = (1+s/4)^{2^k - 1} \alpha _0^{2^k}, \end{aligned}$$
which implies the useful bound
$$\begin{aligned} \alpha _k \le (1-s/2)^{2^k}. \end{aligned}$$
(30)
- 3.
Here we recursively define \(\epsilon _{k+1}\). Combining Lemma 4.4, the recursive definition of \(\alpha _{k+1}\), and the fact that \(1 - \alpha _k^2 \ge 1 - \alpha _0^2 \ge 1 - \alpha _0 = s\), we find that \(\Lambda _{\epsilon '}\left( g({\widetilde{A}}_k)\right) \subset \mathsf {C}_{\alpha _{k+1}}\), where
$$\begin{aligned} \epsilon ' = \epsilon _k \frac{\left( \alpha _{k+1} - \alpha _k^2\right) (1-\alpha _k^2)}{8\alpha _k} = \epsilon _k \frac{s\alpha _k(1-\alpha _k^2)}{32} \ge \epsilon _k\frac{ \alpha _k s^2}{32}. \end{aligned}$$
Thus in particular
$$\begin{aligned} \Lambda _{\epsilon _k\alpha _k s^2/32} \left( g({\widetilde{A}}_k)\right) \subset \mathsf {C}_{\alpha _{k+1}}. \end{aligned}$$
Since \({\widetilde{A}}_{k+1} = \mathsf {G}({\widetilde{A}}_k) = g({\widetilde{A}}_k) + E_k\), for some error matrix \(E_k\) arising from roundoff, Proposition 2.2 ensures that if we set
$$\begin{aligned} \epsilon _{k+1} := \epsilon _k\frac{ s^2 \alpha _k }{32} - \Vert E_k \Vert \end{aligned}$$
(31)
we will have \(\Lambda _{\epsilon _{k+1}}({\widetilde{A}}_{k+1}) \subset \mathsf {C}_{\alpha _{k+1}}\), as desired. (A numerical sanity check of these recursions appears just below.)
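As that sanity check, one can tabulate \(\alpha _k\) together with the idealized \(\epsilon _k\) obtained by setting \(\Vert E_k\Vert = 0\) in (31); a few lines of Python suffice:

```python
def sequences(alpha0, eps0, K):
    # Exact-arithmetic specialization: alpha_{k+1} = (1 + s/4) * alpha_k^2 and
    # eps_{k+1} = eps_k * s^2 * alpha_k / 32, i.e. (31) with ||E_k|| = 0.
    s = 1 - alpha0
    alpha, eps = alpha0, eps0
    for k in range(K):
        print(k, alpha, eps)
        alpha, eps = (1 + s / 4) * alpha ** 2, eps * s ** 2 * alpha / 32
    return alpha, eps

sequences(alpha0=0.9, eps0=1e-3, K=8)  # alpha_k collapses doubly exponentially
```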
Lemma 4.12
Proof
The last statement follows from the fact that \(\epsilon _i\) is decreasing in i and \(K_{\epsilon _i}\) is increasing in i.
We now show that the conclusion of Lemma 4.12 still holds if we replace \(\epsilon _i\) everywhere in the hypothesis by \(e_i\), which is an explicit function of \(\epsilon _0\) and \(\alpha _0\) defined in Lemma 4.12. Note that we do not know \(\epsilon _i \ge e_i\) a priori, so to avoid circularity we must use a short inductive argument.
Corollary 4.13
Proof
The last statement follows from the fact that \(e_i\) is decreasing in i and \(K_{e_i}\) is increasing in i.
Assuming the full hypothesis of this lemma, we prove \(\epsilon _i \ge e_i\) for \(0 \le i \le k\) by induction on i. For the base case, we have \(\epsilon _0 \ge e_0 = \epsilon _0 \alpha _0\).
Lemma 4.14
Proof
Combining the above with the triangle inequality, we obtain the desired bound.
\(\square \)
We would like to apply Lemma 4.14 to ensure \(\Vert \widetilde{A_N} - \mathrm {sgn}(A) \Vert \) is at most \(\beta \), the desired accuracy parameter. The upper bound (38) in Lemma 4.14 is the sum of two terms; we will make each term less than \(\beta /2\). The bound for the second term will yield a sufficient condition on the number of iterations N. Given that, the bound on the first term will then give a sufficient condition on the machine precision \({\textbf {u }}\). This will be the content of Lemmas 4.16 and 4.17.
We start with the second term. The following preliminary lemma will be useful:
Lemma 4.15
The proof is deferred to “Appendix A”.
Lemma 4.16
Proof
Now we move to the first term in the bound of Lemma 4.14.
Lemma 4.17
(Bound on first term of (38))
Proof
Matching the statement of Theorem 4.9, we give a slightly cleaner sufficient condition on N that implies the hypothesis on N appearing in the above lemmas. The proof is deferred to “Appendix A”.
Lemma 4.18
Taking the logarithm of the machine precision yields the number of bits required:
Lemma 4.19
Proof
In the course of the proof, for convenience we also record a nonasymptotic bound (for \(s<1/100\), \(\beta < 1/12\), \(\epsilon _0 < 1\) and \(c_\mathsf {INV}\log n > 1\) as in the hypothesis of Theorem 4.9), at the cost of making the computation somewhat messier.
Using that \(\mu _{\mathsf {INV}}(n) = \mathrm {poly}(n)\) and discarding subdominant terms, we obtain the desired asymptotic bound. \(\square \)
This completes the proof of Theorem 4.9. Finally, we may prove the theorem advertised in Sect. 1.
Proof of Theorem 1.5
Set \(\epsilon := \min \{ \frac{1}{K}, 1\}\). Then \(\Lambda _\epsilon (A)\) does not intersect the imaginary axis, and furthermore \(\Lambda _\epsilon (A) \subseteq D(0, 2)\) because \(\Vert A \Vert \le 1\). Thus, we may apply Lemma 4.10 with \({{\,\mathrm{diam}\,}}(\mathsf {g}) = 4\sqrt{2}\) to obtain parameters \(\alpha _0, \epsilon _0\) with the property that \(\log (1/(1-\alpha _0))\) and \(\log (1/\epsilon _0)\) are both \(O(\log K)\). Theorem 4.9 now yields the desired conclusion. \(\square \)
5 Spectral Bisection Algorithm
- Split::
-
Given an \(n\times n\) matrix A, find a partition of the spectrum into pieces of roughly equal size, and output spectral projectors \(P_{\pm }\) onto each of these pieces.
- Deflate::
-
Given an \(n\times n\) rank-k projector P, output an \(n\times k\) matrix Q with orthogonal columns that span the range of P.
Observation 5.1
The spectrum of A is exactly \(\Lambda (A_+) \sqcup \Lambda (A_-)\), and every eigenvector of A is of the form \(Q_{\pm }v\) for some eigenvector v of one of \(A_{\pm }\).
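In exact arithmetic, both routines and the reduction of Observation 5.1 fit in a few lines (an idealized sketch: the paper's \(\mathsf {DEFLATE}\) uses a randomized rank-revealing QR factorization, for which the SVD below is only a simple stand-in):

```python
import numpy as np

def deflate(P, k):
    # Orthonormal basis for range(P), for a (near) rank-k projector P.
    U, _, _ = np.linalg.svd(P)
    return U[:, :k]

def compress(A, P, k):
    # If range(P) is A-invariant, Q*AQ is a k x k matrix whose spectrum is
    # exactly the set of eigenvalues of A inside that invariant subspace.
    Q = deflate(P, k)
    return Q.conj().T @ A @ Q
```

Running `compress` on the two projectors produced by Split yields the subinstances \(A_{\pm }\) on which the algorithm recurses.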
The difficulty, of course, is that neither of these routines can be executed exactly: we will never have access to true projectors \(P_{\pm }\), nor to the actual orthogonal matrices \(Q_{\pm }\) whose columns span their range, and must instead make do with approximations. Because our algorithm is recursive and our matrices nonnormal, we must take care that the errors in the sub-instances \(A_{\pm }\) do not corrupt the eigenvectors and eigenvalues we are hoping to find. Additionally, the Newton iteration we will use to split the spectrum behaves poorly when an eigenvalue is close to the imaginary axis, and it is not clear how to find a splitting which is balanced.
Our tactic in resolving these issues will be to pass to our algorithms a matrix and a grid with respect to which its \(\epsilon \)-pseudospectrum is shattered. To find an approximate eigenvalue, then, one can settle for locating the grid square it lies in; containment in a grid square is robust to perturbations of size smaller than \(\epsilon \). The shattering property is robust to small perturbations, inherited by the subproblems we pass to, and—because the spectrum is quantifiably far from the grid lines—allows us to run the Newton iteration in the first place.
Theorem 5.2
Theorem 5.3
Remark 5.4
The proof of the above theorem, which is deferred to “Appendix C”, closely follows and builds on the analysis of the randomized rank revealing factorization algorithm (\(\mathsf {RURV}\)) introduced in [22] and further studied in [7]. The parameters in the theorem are optimized for the particular application of finding a basis for a deflating subspace given an approximate spectral projector.
The main difference from the analysis in [22] and [7] is that here, to make it applicable to complex matrices, we use Haar unitary random matrices instead of Haar orthogonal random matrices. In our analysis of the unitary case, we discovered a strikingly simple formula (Corollary C.6) for the density of the smallest singular value of an \(r\times r\) sub-matrix of an \(n\times n\) Haar unitary; this formula is leveraged to obtain guarantees that work for any n and r, and not only when \(n-r \ge 30\), as was the case in [7]. Finally, we explicitly account for finite arithmetic considerations in the Gaussian randomness used in the algorithm, where true Haar unitary matrices can never be produced.
Theorem 5.5
Remark 5.6
We have not fully optimized the large constant \(2^{9.59}\) appearing in the bit length above.
Theorem 5.5 easily implies Theorem 1.6 when combined with \(\mathsf {SHATTER}\).
Theorem 5.7
Proof
- 1.
\((X, \mathsf {g}, \epsilon )\leftarrow \mathsf {SHATTER}(A,\delta /8)\).
- 2.
\((V,D)\leftarrow \mathsf {EIG}(X,\delta ',\mathsf {g},\epsilon ,1/n,n)\), where
$$\begin{aligned} \delta ' := \frac{\delta ^3}{n^{4.5} \cdot 6 \cdot 128 \cdot 2}. \end{aligned}$$
(42)
5.1 Proof of Theorem 5.5
A key stepping-stone in our proof will be the following elementary result controlling the spectrum, pseudospectrum, and eigenvectors after perturbing a shattered matrix.
Lemma 5.8
Proof
The algorithm \(\mathsf {EIG}\) works by recursively reducing to subinstances of smaller size, but requires a pseudospectral guarantee to ensure speed and stability. We thus need to verify that the pseudospectrum does not deteriorate too substantially when we pass to a sub-problem.
Lemma 5.9
Proof
For the case in which the columns of Q span the rows of P, the above proof can be easily modified by now taking v with the property that \(\Vert v^* Q^* (z-A)Q\Vert \le \epsilon \Vert v\Vert \). \(\square \)
Observation 5.10
Initial lemmas in hand, let us begin to analyze the algorithm. At several points we will make an assumption on the machine precision in the margin. These will be collected at the end of the proof, where we will verify that they follow from the precision hypothesis of Theorem 5.5.
Correctness.
Lemma 5.11
(Accuracy of \(\widetilde{\lambda _i}\)) When \(\mathsf {DEFLATE}\) succeeds, each eigenvalue of A shares a square of \(\mathsf {g}\) with a unique eigenvalue of either \(\widetilde{A_{+}}\) or \(\widetilde{A_{-}}\), and furthermore \(\Lambda _{4\epsilon /5} (\widetilde{A_{\pm }}) \subset \Lambda _\epsilon (A)\).
Proof
Everything is now in place to show that, if every call to \(\mathsf {DEFLATE}\) succeeds, \(\mathsf {EIG}\) has the advertised accuracy guarantees. After we show this, we will lower bound the probability that every such call succeeds and compute the running time.
Running Time and Failure Probability. Let us begin with a simple lemma bounding the depth of \(\mathsf {EIG}\)’s recursion tree.
Lemma 5.12
(Recursion Depth) The recursion tree of \(\mathsf {EIG}\) has depth at most \(\log _{5/4} n\), and every branch ends with an instance of size \(1\times 1\).
Proof
On each new call to \(\mathsf {EIG}\) the grid only decreases in size, so the initial assumption is sufficient. Finally, we need that every matrix passed to \(\mathsf {SPLIT}\) throughout the course of the algorithm has norm at most 4. Lemma 5.11 shows that if \(\Vert A\Vert \le 4\) and its \(\epsilon \)-pseudospectrum is shattered, then \(\Vert \widetilde{A_{\pm }} - A_{\pm }\Vert \le \epsilon /5\), and since \(\Vert A_{\pm }\Vert \le \Vert A\Vert \) (the \(A_{\pm }\) are compressions of A by matrices with orthonormal columns), this means \(\Vert \widetilde{A_{\pm }}\Vert \le \Vert A\Vert + \epsilon /5\). Thus each time we pass to a subproblem, the norm of the matrix we pass to \(\mathsf {EIG}\) (and thus to \(\mathsf {SPLIT}\)) increases by at most an additive \(\epsilon /5\), where \(\epsilon \) is the input to the outermost call to \(\mathsf {EIG}\). Since \(\epsilon \) decreases by a factor of 4/5 on each recursion step, by the end of the algorithm the norm of the matrix passed to \(\mathsf {EIG}\) will have increased by at most an additive \((\epsilon + (4/5)\epsilon + (4/5)^2 \epsilon + \cdots )/5 = \epsilon \le 1/2\). Thus we will be safe if our initial matrix has norm at most 3.5, as assumed.
Lemma 5.13
Proof
Along each branch of the recursion tree, we replace \(\epsilon \leftarrow 4\epsilon /5\) and \(\delta \leftarrow 4\delta /5\) at most \(\log _{5/4}n\) times, so each is multiplied by at most \((4/5)^{\log _{5/4} n} = n^{-1}\), i.e., each can only decrease by a factor of n from its initial setting. The parameters \(\eta '\) and \(\beta '\) are computed directly from \(\epsilon '\) and \(\delta '\). \(\square \)
Lemma 5.14
(Failure Probability) \(\mathsf {EIG}\) fails with probability no more than \(\theta \).
Proof
Required Bits of Precision. We will need the following bound on the norms of all spectral projectors.
Lemma 5.15
(Sizes of Spectral Projectors) Throughout the algorithm, every approximate spectral projector \({\widetilde{P}}\) given to \(\mathsf {DEFLATE}\) satisfies \(\Vert {\widetilde{P}}\Vert \le 10n/\epsilon \).
Proof
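Although the proof is deferred, the mechanism behind bounds of this type is the standard resolvent estimate: shattering keeps the \(\epsilon \)-pseudospectrum away from the grid lines, so \(\Vert (z-A)^{-1}\Vert \le 1/\epsilon \) for every z on the boundary \(\partial \Gamma \) of any union \(\Gamma \) of grid squares, and the corresponding exact spectral projector satisfies
$$\begin{aligned} \Vert P\Vert = \left\| \frac{1}{2\pi i}\oint _{\partial \Gamma } (z-A)^{-1}\,\mathrm {d}z \right\| \le \frac{\ell (\partial \Gamma )}{2\pi }\sup _{z\in \partial \Gamma }\Vert (z-A)^{-1}\Vert \le \frac{\ell (\partial \Gamma )}{2\pi \epsilon }, \end{aligned}$$
where \(\ell (\partial \Gamma )\) is controlled by the total perimeter of the grid. The approximate projectors handed to \(\mathsf {DEFLATE}\) then inherit a bound of the same order.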
Remark 5.16
A constant may be extracted directly from the expression above: leaving \(\epsilon ,\delta ,\theta \) fixed, a crude bound on it is \(2^{9.59} \cdot 26 \cdot 8 \cdot c_{\mathsf {INV}} \approx 160303\, c_{\mathsf {INV}}\). This can certainly be optimized; the improvement with the highest impact would be a tighter analysis of \(\mathsf {SPLIT}\), with the aim of eliminating the additive 7.59 term in \(N_{\mathsf {SPLIT}}\).
6 Conclusion and Open Questions
In this paper, we reduced the approximate diagonalization problem to a polylogarithmic number of matrix multiplications, inversions, and QR factorizations on a floating point machine with precision depending only polylogarithmically on n and \(1/\delta \). The key phenomena enabling this were: (a) every matrix is \(\delta \)-close to a matrix with well-behaved pseudospectrum, and such a matrix can be found by a complex Gaussian perturbation; and (b) the spectral bisection algorithm can be shown to converge rapidly to a forward approximate solution on such a well-behaved matrix, using a number of iterations and bits of precision that are polylogarithmic in n and \(1/\delta \). The combination of these facts yields a \(\delta \)-backward approximate solution for the original problem.
Using fast matrix multiplication, we obtain algorithms with nearly optimal asymptotic computational complexity (as a function of n, compared to matrix multiplication), for general complex matrices with no assumptions. Using naïve matrix multiplication, we get easily implementable algorithms with \(O(n^3)\)-type complexity and much better constants, and these are likely faster in practice. The constants in our bit complexity and precision estimates (see Theorem 5.5 and equations (41) and (42)), while not huge, are likely suboptimal. The reasonable practical performance of spectral-bisection-based algorithms is witnessed by the many empirical papers (see, e.g., [5]) that have studied them. The more recent of these works further show that such algorithms are communication-avoiding and have good parallelizability properties.
Remark 6.1
(Hermitian Matrices) A curious feature of our algorithm is that even when the input matrix is Hermitian or real symmetric, it begins by adding a complex non-Hermitian perturbation to regularize the spectrum. If one is only interested in this special case, one can replace this first step by a Hermitian GUE or symmetric GOE perturbation and appeal to the result of [1] instead of Theorem 1.4, which also yields a polynomial lower bound on the minimum gap of the perturbed matrix. It is also possible to obtain a much stronger analysis of the Newton iteration in the Hermitian case, since the iterates are all Hermitian and \(\kappa _V=1\) for such matrices. By combining these observations, one can obtain a running time for Hermitian matrices which is significantly better (in logarithmic factors) than our main theorem. We do not pursue this further since our main goal was to address the more difficult non-Hermitian case.
We conclude with some open questions.
- 1. Devise a deterministic algorithm with similar guarantees. The main bottleneck to doing this is deterministically finding a regularizing perturbation, which seems quite mysterious. Another bottleneck is computing a rank-revealing QR factorization in nearly matrix multiplication time deterministically (all of the currently known deterministic algorithms require \(\Omega (n^3)\) time).
- 2. Determine the correct exponent in the smoothed analysis of the eigenvalue gap of \(A+\gamma G\), where G is a complex Ginibre matrix. We currently obtain roughly \((\gamma /n)^{8/3}\) in Theorem 3.6. Is it possible to match the \(n^{-4/3}\)-type dependence [64] which is known for a pure Ginibre matrix?
- 3. Reduce the dependence of the running time and precision on \(\log (1/\delta )\) to a smaller power. The bottleneck in the current algorithm is the number of bits of precision required for stable convergence of the Newton iteration for computing the sign function. Other, “inverse-free” iterative schemes have been proposed for this purpose, which conceivably require lower precision.
- 4. Study the convergence of the “scaled Newton iteration” and other rational approximation methods (see [40, 48]) for computing the sign function of non-Hermitian matrices; a minimal sketch of one such scaling appears below. Perhaps these have even faster convergence and better stability properties?
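For the experimentally minded, here is a sketch of the determinantally scaled Newton iteration (Byers' scaling, one of the accelerations treated in [40]); it is a toy for exploration, not a finite-arithmetic implementation.

```python
import numpy as np

def sign_newton_scaled(A, tol=1e-12, max_iters=50):
    # Newton iteration for sgn(A) with determinantal scaling: each iterate
    # is rescaled by mu = |det X|^(-1/n) before the Newton step, which
    # typically cuts the iteration count substantially.
    X = A.astype(complex)
    n = X.shape[0]
    for _ in range(max_iters):
        _, logabsdet = np.linalg.slogdet(X)
        Y = np.exp(-logabsdet / n) * X          # mu * X with mu = |det X|^{-1/n}
        X_new = 0.5 * (Y + np.linalg.inv(Y))
        if np.linalg.norm(X_new - X, 1) <= tol * np.linalg.norm(X_new, 1):
            return X_new
        X = X_new
    return X
```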
Acknowledgements
We thank Peter Bürgisser for introducing us to this problem, and Ming Gu, Olga Holtz, Vishesh Jain, Ravi Kannan, Pravesh Kothari, Lin Lin, Satyaki Mukherjee, Yuji Nakatsukasa, and Nick Trefethen for helpful conversations. We thank the referees for a careful reading of the paper and many helpful comments which improved it. We thank the Institute for Pure and Applied Mathematics, where part of this work was carried out.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.