
2015 | Book

Hierarchical Matrices: Algorithms and Analysis


About this book

This self-contained monograph presents matrix algorithms and their analysis. The new technique enables not only the solution of linear systems but also the approximation of matrix functions, e.g., the matrix exponential. Other applications include the solution of matrix equations, e.g., the Lyapunov or Riccati equation. The required mathematical background can be found in the appendix.

The numerical treatment of fully populated large-scale matrices is usually rather costly. However, the technique of hierarchical matrices makes it possible to store matrices and to perform matrix operations approximately with almost linear cost and a controllable degree of approximation error. For important classes of matrices, the computational cost increases only logarithmically with the approximation error. The operations provided include the matrix inversion and LU decomposition.

Since large-scale linear algebra problems are standard in scientific computing, the subject of hierarchical matrices is of interest to scientists in computational mathematics, physics, chemistry and engineering.

Table of Contents

Frontmatter

Introductory and Preparatory Topics

Frontmatter
Chapter 1. Introduction
Abstract
We introduce basic problems in Section 1.1 and discuss various possibilities for representing matrices in Section 1.3. To introduce examples of ‘large-scale problems’, we recall the discretisation of boundary value problems (cf. §1.5.1) and integral equations (cf. §1.5.2).
Wolfgang Hackbusch
Chapter 2. Rank-r Matrices
Abstract
Rank-r matrices will be an important stepping stone towards hierarchical matrices. Since we typically assume r to be of moderate size, the term low-rank matrices is also used. The storage of rank-r matrices as well as operations involving rank-r matrices form the basis of the hierarchical matrix representation and the hierarchical matrix operations, since the latter are reduced to additions and multiplications by rank-r or small full matrices.
In Section 2.1 we recall the rank of a matrix and various properties related to the matrix rank. As shown in Section 2.2, rank-r matrices allow for a suitable representation with low storage cost. We introduce the notation \(\mathcal{R}(r, I, J)\) for matrices presented in this format. To avoid a possible misunderstanding, we note that rank-r matrices need not have the exact rank r, but are at most of rank r. In Section 2.3 we discuss the arithmetical work of matrix-vector multiplication, matrix-matrix addition, and matrix-matrix multiplication by rank-r matrices. In Section 2.4 we recall that the best approximation of a general matrix by a rank-r matrix makes use of the singular value decomposition (SVD) (details and proofs in Appendix C.2). Approximating a rank-s matrix by another matrix of smaller rank r < s, as discussed in Section 2.5, will become an important tool. The QR decomposition and the reduced QR decomposition are defined. In Section 2.6 we apply tools of the previous subsections and introduce formatted addition involving the characteristic truncation to the low-rank format. In Section 2.7 we mention a modification of the standard representation \(\mathcal{R}(r, I, J)\).
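For illustration, here is a minimal NumPy sketch (ours, not the book's code) of the rank-r format: a matrix is stored as a factor pair A, B with \(M = AB^{\mathsf{T}}\), and the SVD yields the best rank-r approximation recalled in Section 2.4. The storage drops from nm entries to (n + m)r.

```python
import numpy as np

def truncate_to_rank(M, r):
    """Best rank-r approximation of M (SVD-based, cf. Section 2.4),
    returned as factors A, B with M ~ A @ B.T."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] * s[:r], Vt[:r, :].T

rng = np.random.default_rng(0)
M = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 150))  # exact rank 10
A, B = truncate_to_rank(M, 10)
print(np.linalg.norm(M - A @ B.T))   # ~ machine precision, since rank(M) = 10
```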
Wolfgang Hackbusch
Chapter 3. Introductory Example
Abstract
The fact that a matrix can be simplified by compressing its submatrices is not new. The panel clustering method (cf. [149], [225, §7]), the multipole method (cf. [115], [225, §7.1.3.2]), mosaic approximation (cf. [238]), and matrix compression techniques by wavelets (cf. [76]) are based on the same concept. However, in none of these cases is it possible to efficiently perform matrix operations other than matrix-vector multiplication. Therefore, in this chapter we want to illustrate how matrix operations are performed and how costly they are. In particular, it will turn out that all matrix operations can be computed with almost linear work (instead of \(O(n^2)\) or \(O(n^3)\) for the full matrix representation). On the other hand, we have to notice that, in general, the results contain approximation errors (these will be analysed later). In the following model example we use a very simple and regularly structured block partition. We remark that in realistic applications the block partition will be adapted individually to the actual problem (cf. Chapter 5).
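As a concrete illustration of such a model format, the following sketch (ours; the simple recursive 2×2 structure stands in for the book's regular block partition) performs matrix-vector multiplication when the off-diagonal blocks are stored as rank-r factor pairs; each off-diagonal contribution then costs O(rn) instead of \(O(n^2)\).

```python
import numpy as np

def hodlr_matvec(block, x):
    """y = M @ x for a model format: leaves are dense, off-diagonal
    blocks are factor pairs (A, B) representing A @ B.T."""
    if "dense" in block:                      # small leaf: full format
        return block["dense"] @ x
    n1 = block["split"]
    x1, x2 = x[:n1], x[n1:]
    A12, B12 = block["offdiag12"]
    A21, B21 = block["offdiag21"]
    y1 = hodlr_matvec(block["diag1"], x1) + A12 @ (B12.T @ x2)
    y2 = hodlr_matvec(block["diag2"], x2) + A21 @ (B21.T @ x1)
    return np.concatenate([y1, y2])

rng = np.random.default_rng(1)
lr = lambda n, m, r=2: (rng.standard_normal((n, r)), rng.standard_normal((m, r)))
leaf = lambda n: {"dense": rng.standard_normal((n, n))}
M = {"split": 4, "diag1": leaf(4), "diag2": leaf(4),
     "offdiag12": lr(4, 4), "offdiag21": lr(4, 4)}
print(hodlr_matvec(M, rng.standard_normal(8)))
```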
Wolfgang Hackbusch
Chapter 4. Separable Expansions and Low-Rank Matrices
Abstract
In the previous chapters we studied low-rank matrices and model formats with low-rank matrices as matrix blocks. The essential question remains whether and in which cases low-rank matrices may yield a good approximation. In many cases, this property follows from the existence of a separable expansion which is the subject of this chapter.
In the case of an integral operator with a kernel function κ, the discretisation matrix inherits properties of the function κ. In Section 4.1 we demonstrate how certain separability properties of κ can be exploited to construct approximating rank-r matrices. The following two properties will be related:
  • Approximability of a submatrix \(M|_b\) (b a suitable block) by a rank-r matrix.
  • Approximability of the function κ(x, y)—restricted to a suitable subdomain corresponding to the block b—by a separable expansion with r terms.
In Section 4.2 the basic terms needed in the sequel are explained. The separable expansion (§4.2.1) is the starting point. In the most favourable case, exponential convergence holds (§4.2.2). Under certain conditions on the kernel function κ, separability follows from an admissibility condition (§4.2.3) for the domain X × Y where κ(·, ·) is evaluated.
In Section 4.3 we discuss separable expansions via polynomials. The Taylor expansion (§4.3.1) is a possible tool to obtain approximating polynomials, but interpolation (§4.3.2) is the easiest method to apply. A suitable regularity condition on the kernel function κ is its asymptotic smoothness, since it ensures exponential convergence (§4.3.3). Next we discuss error estimates of the Taylor expansion (§4.3.5) and interpolation (§§4.3.6–4.3.8).
Polynomial approximation is not the only choice. In Section 4.4 we consider further techniques (§§4.4.1–4.4.5). For theoretical purposes, we introduce the optimal separable expansion by the infinite singular value decomposition (§4.4.7).
Section 4.5 reveals the crucial role of a separable expansion: The discretisation of integral kernels with separation rank r yields matrices of rank ≤ r.
Section 4.6 is devoted to the error analysis.
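A minimal numerical sketch of this mechanism (ours; the kernel, intervals, and rank are chosen for illustration): for the asymptotically smooth kernel κ(x, y) = log|x − y| and the well-separated sets X = [0, 1], Y = [2, 3], Chebyshev interpolation in x with r nodes yields a separable expansion with r terms and hence a rank-r factorisation of the kernel matrix.

```python
import numpy as np

r = 8
xs = np.linspace(0.0, 1.0, 50)            # points in X
ys = np.linspace(2.0, 3.0, 60)            # points in Y
nodes = 0.5 + 0.5 * np.cos((2*np.arange(r)+1) * np.pi / (2*r))  # Chebyshev in X

def lagrange(x, nodes):
    """Matrix of Lagrange basis polynomials L_nu(x_i) for the given nodes."""
    L = np.ones((len(x), len(nodes)))
    for nu in range(len(nodes)):
        for mu in range(len(nodes)):
            if mu != nu:
                L[:, nu] *= (x - nodes[mu]) / (nodes[nu] - nodes[mu])
    return L

kernel = lambda x, y: np.log(np.abs(x[:, None] - y[None, :]))
M = kernel(xs, ys)                         # exact kernel matrix
A = lagrange(xs, nodes)                    # 50 x r
B = kernel(nodes, ys).T                    # 60 x r, so M ~ A @ B.T
print(np.abs(M - A @ B.T).max())           # decays exponentially in r
```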
Wolfgang Hackbusch
Chapter 5. Matrix Partition
Abstract
After the introduction in Section 5.1, the concept of admissible blocks is presented in Section 5.2. It will turn out that blocks satisfying this property allow for a good approximation by a low-rank matrix block. For the partition of the matrix into (admissible) blocks, we need a block cluster tree. Its construction starts with the cluster tree T(I) introduced in Section 5.3. The practical generation is discussed in Section 5.4. The block cluster tree T(I×J) (see Section 5.5) can easily be obtained from cluster trees T(I) and T(J). Together, these trees yield the admissible partition of the matrix (cf. §5.6), which is the basis of the definition of hierarchical matrices in the next chapter.
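The following sketch (ours, for 1D points; practical codes use bounding boxes in \(\mathbb{R}^d\)) builds a cluster tree by geometric bisection and tests the standard admissibility condition min(diam(X), diam(Y)) ≤ η dist(X, Y).

```python
import numpy as np

def cluster_tree(idx, pts, n_min=4):
    """Binary cluster tree; each node keeps its index set and bounding interval."""
    node = {"idx": idx, "lo": pts[idx].min(), "hi": pts[idx].max()}
    if len(idx) > n_min:
        mid = 0.5 * (node["lo"] + node["hi"])      # geometric bisection
        left, right = idx[pts[idx] <= mid], idx[pts[idx] > mid]
        if len(left) and len(right):
            node["children"] = [cluster_tree(left, pts, n_min),
                                cluster_tree(right, pts, n_min)]
    return node

def admissible(t, s, eta=1.0):
    """min(diam) <= eta * dist for the bounding intervals of clusters t, s."""
    diam = min(t["hi"] - t["lo"], s["hi"] - s["lo"])
    dist = max(0.0, max(t["lo"], s["lo"]) - min(t["hi"], s["hi"]))
    return diam <= eta * dist

pts = np.sort(np.random.default_rng(2).random(32))
root = cluster_tree(np.arange(32), pts)
t, s = root["children"][0], root["children"][1]
print(admissible(t, s))   # adjacent clusters: typically not admissible
```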
Wolfgang Hackbusch

\(\mathcal{H}\)-Matrices and Their Arithmetic

Chapter 6. Definition and Properties of Hierarchical Matrices
Abstract
The set \(\mathcal{H}(r,P)\) of hierarchical matrices (\(\mathcal{H}\)-matrices) is defined in Section 6.1. Section 6.2 mentions elementary properties; e.g., the \(\mathcal{H}\)-matrix structure is invariant with respect to transformations by diagonal matrices and transposition. The first essential property of \(\mathcal{H}\)-matrices is data-sparsity, proved in Section 6.3. The storage cost of an n × n matrix is \(O(n \log^* n)\). The precise estimate together with a description of the constants is given in §6.3.2 using the quantity \(C_{\mathrm{sp}}\) from (6.5b). In Section 6.4 we prove that matrices arising from a finite element discretisation lead to a constant \(C_{\mathrm{sp}}\) depending only on the shape regularity of the finite elements. In Section 6.5 we analyse how approximation errors of the submatrices affect the whole matrix. In the definition of \(\mathcal{H}(r,P)\), the parameter r can be understood as a fixed local rank. In practice, an adaptive computation of the ranks is more interesting, as described in Section 6.6. The construction of the partition yields an a priori choice of the local ranks. These may be too large. Therefore, a subsequent reduction of the rank (‘recompression’) is advisable, as explained in Section 6.7. In Section 6.8 we discuss how additional side conditions can be taken into consideration.
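A minimal sketch of the adaptive rank choice (ours; the tolerance strategy shown is one of several possibilities): keep the smallest rank whose discarded singular values lie below a relative tolerance ε, so that the blockwise spectral error is bounded by ε‖M‖₂.

```python
import numpy as np

def adaptive_truncation(M, eps=1e-6):
    """Smallest-rank factors A, B with ||M - A @ B.T||_2 <= eps * ||M||_2."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    r = max(1, int(np.sum(s > eps * s[0])))
    return U[:, :r] * s[:r], Vt[:r, :].T

M = np.fromfunction(lambda i, j: 1.0 / (1.0 + i + j), (40, 40))  # Hilbert-type block
A, B = adaptive_truncation(M, 1e-8)
print(A.shape[1], np.linalg.norm(M - A @ B.T) / np.linalg.norm(M))
```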
Wolfgang Hackbusch
Chapter 7. Formatted Matrix Operations for Hierarchical Matrices
Abstract
An essential advance brought by the hierarchical matrix technique is the possibility of performing all matrix operations with almost linear complexity. This chapter therefore plays a crucial role for the implementation. Since the input and output data are represented in the \(\mathcal{H}\)-matrix format, the matrix operations are called formatted operations. These are described algorithmically in Sections 7.1–7.6. More precisely, Section 7.1 is concerned with the matrix-vector multiplication. Section 7.2 describes truncations and conversions that are required for the formatted matrix addition explained in Section 7.3. A more involved operation is the formatted matrix-matrix multiplication outlined in Section 7.4. In the case of matrix addition and multiplication there is also the option of an exact computation at the cost of an increased rank (cf. Corollary 7.8 and Lemma 7.13). Section 7.5 deals with the formatted matrix inversion, while the formatted LU and Cholesky decompositions are described in Section 7.6. Section 7.7 concerns the Hadamard product. The analysis of the computational work required by these operations is given in Section 7.8. Details concerning parallel implementations can be found in Kriemann [176, 177, 178, 181], Bebendorf–Kriemann [28], and Izadi [163].
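To make the formatted addition concrete for a single low-rank block, here is a sketch (ours): the exact sum of two rank-r blocks has rank at most 2r (cf. Corollary 7.8), and a QR/SVD recompression truncates it back to rank r without ever forming the full matrix, at a cost of O((n + m)r²).

```python
import numpy as np

def formatted_add(A1, B1, A2, B2, r):
    """(A1 @ B1.T + A2 @ B2.T) truncated back to rank r, in factored form."""
    A, B = np.hstack([A1, A2]), np.hstack([B1, B2])   # exact sum: rank <= 2r
    QA, RA = np.linalg.qr(A)
    QB, RB = np.linalg.qr(B)
    U, s, Vt = np.linalg.svd(RA @ RB.T)               # small (2r x 2r) SVD
    return QA @ (U[:, :r] * s[:r]), QB @ Vt[:r, :].T

rng = np.random.default_rng(5)
n, m, r = 30, 25, 3
A1, B1 = rng.standard_normal((n, r)), rng.standard_normal((m, r))
A2, B2 = rng.standard_normal((n, r)), rng.standard_normal((m, r))
A, B = formatted_add(A1, B1, A2, B2, r)
exact = A1 @ B1.T + A2 @ B2.T
print(np.linalg.norm(exact - A @ B.T))   # truncation error of the formatted sum
```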
Wolfgang Hackbusch
Chapter 8. \(\mathcal{H}^{2}\)-Matrices
Abstract
Combining \(\mathcal{H}\)-matrices with a second hierarchical structure, we are led to the \(\mathcal{H}^{2}\)-matrices. Here the storage cost and the arithmetical cost of the matrix operations are clearly reduced. In many cases, one can avoid the logarithmic factor in the theoretical estimates. In this chapter we need some new notation, introduced in Section 8.1. Next we discuss pre-versions of the \(\mathcal{H}^{2}\)-matrix format in Sections 8.2–8.3. Section 8.4 contains the final definition of an \(\mathcal{H}^{2}\)-matrix, requiring special nestedness conditions for a family of vector spaces. Special topics are the transfer matrices (cf. §8.4.2), transformations (cf. §8.4.4), orthonormal bases (cf. §8.4.5), SVD bases (cf. §8.4.7), and truncation (cf. §8.4.8). The characteristic nestedness condition can be inherited from the continuous problem, as studied in Section 8.5. For suitable rank distributions we even prove a linear estimate of the cost without any logarithmic factor (see Section 8.6). The matrix-vector multiplication by \(\mathcal{H}^{2}\)-matrices and the corresponding work are described in Section 8.7. The multiplication algorithm for two \(\mathcal{H}^{2}\)-matrices is given in Section 8.9. We shall refer to \(\mathcal{H}^{2}\)-matrices again in §9.3.3.
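A minimal numerical sketch of the nestedness condition (ours, assuming a polynomial cluster basis as in interpolation-based constructions): the basis of a parent cluster is reproduced from the bases of its children via small r × r transfer matrices, because re-interpolating a polynomial of degree < r on r child nodes is exact.

```python
import numpy as np

r = 5
cheb = lambda a, b: 0.5*(a+b) + 0.5*(b-a)*np.cos((2*np.arange(r)+1)*np.pi/(2*r))

def lagrange(x, nodes):
    """Matrix of Lagrange basis polynomials L_nu(x_i) for the given nodes."""
    L = np.ones((len(x), len(nodes)))
    for nu in range(len(nodes)):
        for mu in range(len(nodes)):
            if mu != nu:
                L[:, nu] *= (x - nodes[mu]) / (nodes[nu] - nodes[mu])
    return L

parent, child1, child2 = cheb(0, 1), cheb(0, 0.5), cheb(0.5, 1)
pts1, pts2 = np.linspace(0, 0.5, 20), np.linspace(0.5, 1, 20)

V_parent = np.vstack([lagrange(pts1, parent), lagrange(pts2, parent)])
E1, E2 = lagrange(child1, parent), lagrange(child2, parent)   # r x r transfers
V_nested = np.vstack([lagrange(pts1, child1) @ E1,
                      lagrange(pts2, child2) @ E2])
print(np.abs(V_parent - V_nested).max())   # ~ 0 up to rounding: nestedness holds
```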
Wolfgang Hackbusch
Chapter 9. Miscellaneous Supplements
Abstract
This chapter is devoted to six different topics. Section 9.1 concerns the solution of linear systems. Since the \(\mathcal{H}\)-matrix technique allows us to compute the approximate LU decomposition, this approximation can be used as a perfect preconditioner in the iterative solution. The hierarchical LU decomposition of sparse matrices can be improved by the choice of a special ternary cluster tree (see Section 9.2). Section 9.3 introduces the weak admissibility which can be useful in particular cases. The cross approximation, explained in Section 9.4, is an important algorithm for the efficient computation of low-rank approximations. A fundamental question is the approximability by an \(\mathcal{H}\)-format. Some criteria are discussed in Section 9.5. The efficient updating after a local grid refinement is touched upon in Section 9.6.
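For the cross approximation, here is a sketch (ours) with full pivoting for clarity; the practical ACA algorithm uses partial pivoting so that only a few entries of M need to be evaluated at all.

```python
import numpy as np

def cross_approximation(M, r):
    """Rank-r approximation of M assembled from r matrix crosses
    (one row and one column per step), without an SVD."""
    R = M.copy()
    A = np.zeros((M.shape[0], r)); B = np.zeros((M.shape[1], r))
    for k in range(r):
        i, j = np.unravel_index(np.abs(R).argmax(), R.shape)  # full pivoting
        A[:, k] = R[:, j] / R[i, j]
        B[:, k] = R[i, :]
        R = R - np.outer(A[:, k], B[:, k])                    # remove the cross
    return A, B                                               # M ~ A @ B.T

x, y = np.linspace(0, 1, 40), np.linspace(2, 3, 40)
M = 1.0 / np.abs(x[:, None] - y[None, :])   # asymptotically smooth kernel
A, B = cross_approximation(M, 6)
print(np.abs(M - A @ B.T).max())
```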
Wolfgang Hackbusch

Applications

Frontmatter
Chapter 10. Applications to Discretised Integral Operators
Abstract
The discretisation of integral operators leads to fully populated matrices. In Sections 10.1–10.5, those integral operators are treated which arise from the reformulation of elliptic boundary value problems. The particular case of integral formulations for the Helmholtz equation is studied in Section 10.4. The inversion of the BEM matrix is considered in Section 10.5. Finally, we discuss general Fredholm and Volterra integral operators in Sections 10.6–10.7. A related problem is the evaluation of convolution integrals (see Section 10.8).
There are several \(\mathcal{H}\)-matrix applications involving integral formulations beyond the topics considered below. We give a list of subjects and related papers.
  • Partial differential equations with stochastic surface: Dölz–Harbrecht–Peters [80].
  • Computation of covariances of a random solution: Dölz–Harbrecht–Schwab [81].
  • Homogenisation: Cazeaux–Zahm [71].
  • Multi-level methods: Börm [41].
  • Linear elasticity: Bebendorf–Grzhibovskis [24].
  • Helmholtz equation: Engquist–Ying [86] and citations therein, Brunner et al. [68], Ansari-Oghol-Beig et al. [6], Bebendorf–Kuske–Venn [29]. See also §10.4.
  • Eddy-current problems: Börm–Ostrowski [60], Smajić et al. [233].
  • Karhunen–Loève decomposition: Khoromskij–Litvinenko–Matthies [170], Allaix–Carbone [3].
  • Maxwell equations and electron dynamics: Ballani et al. [7], Koch et al. [172].
  • Gaussian processes: Börm–Garcke [51].
  • Radiosity equation: [164].
  • FEM–BEM coupling: Steinmetz et al. [234].
Wolfgang Hackbusch
Chapter 11. Applications to Finite Element Matrices
Abstract
In §9.2.2 it was shown that the matrices arising from finite element discretisations (in what follows called the finite element matrices) are not only sparse but also belong to the \(\mathcal{H}\)-matrix set \(\mathcal{H}(r, P)\) for all \( r \in \mathbb{N}_0 \), where P is the standard partition. This allows us to consider all finite element matrices as hierarchical matrices. In particular, no truncation is needed to use a finite element matrix as an input parameter for the inversion or for the LU algorithm.
In Section 11.1 we discuss the inverse of the mass matrix. Using tools from §9.5, we show that the inverse can be approximated by a hierarchical matrix. This result is required in the later analysis.
Section 11.2 is concerned with the continuous and discrete Green operator.
The analysis of the Green function in Section 11.3 yields the \(\mathcal{H}\)-matrix property of the inverse finite element matrix.
The results of this chapter have been improved by recent contributions mentioned in Section 11.4.
Wolfgang Hackbusch
Chapter 12. Inversion with Partial Evaluation
Abstract
The complete inverse \(A^{-1}\) is required if all components of \( x = A^{-1}b \) are needed. If only a smaller part of the solution x or a number of functionals \( \varphi_i(x) \) is of interest, the question arises as to whether the computation of the complete inverse can be avoided and whether a partial evaluation of \(A^{-1}\) is cheaper.
The viewpoints of linear algebra and analysis contrast sharply here. The usual understanding in linear algebra is that the respective data (vectors, matrices) must be defined completely. In the case of a linear system Ax = b, only the full solution vector \( x \in \mathbb{R}^I \) is the correct answer, while the inversion of A requires all entries of the matrix \( A^{-1} \in \mathbb{R}^{I \times I} \).
Wolfgang Hackbusch
Chapter 13. Eigenvalue Problems
Abstract
In this chapter, we consider various numerical methods for solving the eigenvalue problem and the application of hierarchical matrix techniques. After the introduction in Section 13.1, we discuss the LR and QR methods in Section 13.2, the vector iteration and Krylov methods in Section 13.3, and the preconditioned inverse iteration in Section 13.4. Furthermore, we consider the bisection method in Section 13.5 and the divide-and-conquer method in Section 13.6. A quite different approach is the computation of an eigenvalue distribution in Section 13.7. Finally, spectral projections are studied in Section 13.8.
Wolfgang Hackbusch
Chapter 14. Matrix Functions
Abstract
The most prominent example of a matrix function is the matrix exponential \(e^{M}\). It appears as the solution \(u(t) = e^{tM}u_{0}\) of the ordinary differential equation \(u'(t) = Mu(t)\) with initial value \(u(0) = u_{0}\). The matrix exponential will be an important tool in Chapter 16. Matrix functions are defined in Section 14.1. For their construction, different methods can be used, which are explained in Section 14.2. In general, matrix functions are fully populated matrices. The essential question is whether the (exact) matrix function can be approximated as an \(\mathcal{H}\)-matrix. This is the subject of Section 14.3.
In this chapter we use the field of complex numbers since, also in the case of real matrices, complex eigenvalues and complex contour integrals may appear.
A highly recommended introduction to the theory and practical treatment of matrix functions is the monograph of Higham [157]. We remark that most of the following statements can be generalised from matrices to operators.
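As a small numerical illustration (ours; scipy.linalg.expm uses scaling and squaring with Padé approximation, one of several construction methods), we check that \(u(t) = e^{tM}u_0\) indeed satisfies \(u'(t) = Mu(t)\):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
u0 = rng.standard_normal(5)
u = lambda t: expm(t * M) @ u0          # u(t) = exp(tM) u0

t, h = 1.0, 1e-6
# central difference of u at t minus M u(t): ~ 0, confirming u' = M u
print((u(t + h) - u(t - h)) / (2 * h) - M @ u(t))
```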
Wolfgang Hackbusch
Chapter 15. Matrix Equations
Abstract
The usual solution methods for discretised partial differential equations are based exclusively on matrix-vector multiplication as the basic operation. On the one hand, this is supported by the use of sparse matrices (cf. §1.3.2.5); on the other hand, one tries to apply fast iterative methods (e.g., the multigrid method [124], [119, §12]) whose basic steps are matrix-vector multiplications. Krylov methods are based on the same concept.
However, there are interesting problems which require the solution of a linear or nonlinear matrix equation and cannot be solved via the matrix-vector multiplication. Examples are the linear Lyapunov and Sylvester equations as well as the quadratic Riccati equation, which arise, e.g., in optimal control problems for partial differential equations and in model reduction methods.
The \(\mathcal{H}\)-matrix arithmetic allows an efficient solution of these matrix equations. Here, the use of hierarchical matrix operations and matrix-valued functions is only one part of the solution method. Another essential fact is that the solution \( X \in \mathbb{R}^{I \times I} \) (\(n := \#I\)) can be replaced by an \(\mathcal{H}\)-matrix \(X_{\mathcal{H}}\). If one considers the equation \(f(X) = 0\) as a system of \(n^2\) equations for the \(n^2\) components of X, even an optimal solution method has complexity \(\mathcal{O}(n^2)\), since this is linear complexity in the number of unknowns (cf. Remark 1.1). Using traditional techniques, the solution of large-scale matrix equations is therefore not feasible. Only an \(\mathcal{H}\)-matrix \(X_{\mathcal{H}}\) with \(\mathcal{O}(n \log^* n)\) data instead of \(n^2\) admits a solution with a cost almost linear in n.
Section 15.1 introduces Lyapunov and Sylvester equations and discusses their solution. In Section 15.2 we consider the quadratic Riccati equation. An interesting approach uses the matrix version of the sign function from §14.1.1. General nonlinear matrix equations may be solved iteratively by Newton's method or related methods (cf. Section 15.3). As an example, computing the square root of a positive definite matrix is described in §15.3.1. The influence of the truncation error introduced by the \(\mathcal{H}\)-matrix arithmetic is investigated in Section 15.3.
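A minimal dense sketch of the sign-function approach (ours; the matrices A, Q are illustrative, and in the \(\mathcal{H}\)-matrix setting the inversion and addition below become the formatted operations of Chapter 7): for the Lyapunov equation \(A^{\mathsf{T}}X + XA + Q = 0\) with stable A, the Newton iteration \(S \leftarrow \frac{1}{2}(S + S^{-1})\) applied to the block matrix Z = [[A, 0], [Q, −Aᵀ]] converges to sign(Z), whose lower-left block equals 2X.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
C = rng.standard_normal((n, n))
A = -(C @ C.T + np.eye(n))      # symmetric negative definite, hence stable
Q = np.eye(n)

Z = np.block([[A, np.zeros((n, n))], [Q, -A.T]])
S = Z.copy()
for _ in range(30):             # quadratically convergent sign iteration
    S = 0.5 * (S + np.linalg.inv(S))
X = 0.5 * S[n:, :n]             # sign(Z) = [[-I, 0], [2X, I]]

print(np.abs(A.T @ X + X @ A + Q).max())   # Lyapunov residual ~ 0
```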
Wolfgang Hackbusch
Chapter 16. Tensor Spaces
Abstract
Similar to fully populated matrices, tensors are mathematical objects with such a large amount of data that naive representations must fail. Sparse representations of tensors are an active field of research (see Hackbusch [132]). Here, we restrict the discussion to tensor techniques which make use of hierarchical matrices. General vector spaces \( V_j \) (\(1 \le j \le d\)) can be used to form the tensor space \( V = \bigotimes_{j=1}^{d} V_j \). Concerning the vector spaces \(V_j\), we discuss two cases in Section 16.1: finite-dimensional model vector spaces \( V_j = \mathbb{R}^{I_j} \) (cf. §16.1.2), and matrix spaces \( V_j = \mathbb{R}^{I_j \times J_j} \) (cf. §16.1.3). In the latter case, the tensor product is also called a Kronecker matrix. The crucial problem is the sparse representation of tensors and their approximation; see Section 16.2. In §16.2.1, we discuss the r-term representation. For Kronecker products, the r-term representation can be combined with the hierarchical matrix format, resulting in the “HKT representation” (cf. §16.2.5). In Section 16.3 we present two applications. The first concerns an integral operator giving rise to a representation by (Kronecker) tensors of order d = 2. The second application shows that the tensor approach can be used to solve differential equations in a high number of spatial variables \(d \gg 2\). The latter application is based on a stable r-term approximation constructed using exponential sums for the function 1/x. In general, the r-term approximation is not easy to apply to tensors of order d ≥ 3. Better tensor representations are described in Hackbusch [132, 133].
The tensor applications in this chapter concern matrices, since this is the subject of the monograph. In general, tensor approximations are directed to “vectors” represented by tensors. In the context of hierarchical matrices, it is assumed that vectors in \(\mathbb{R}^n\) can be treated in full format. However, considering a boundary value problem in a cube [0, 1]³ discretised by N × N × N grid points with \(N = 10^6\), the size of the grid function (“vector”) is \(n = N^3 = 10^{18}\) and a direct approach is difficult. Regarding such a vector as a tensor of order d = 3, there may be good tensor approximations reducing the data size to \(\mathcal{O}(\log n)\) (cf. [132, §3.2], [133, §12]).
Tensor approximations are also based on low-rank approximations (involving different kinds of ranks!), but these are global approximations, in contrast to the local low-rank approximations of hierarchical matrices.
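A minimal sketch of the exponential-sum application (ours; the quadrature parameters are chosen for illustration): from \(1/x = \int_0^\infty e^{-tx}\,dt\), the substitution \(t = e^s\) and the trapezoidal rule give \(1/x \approx \sum_k \omega_k e^{-\alpha_k x}\); for a Kronecker sum \(A = A_1 \otimes I + I \otimes A_2\) (e.g., a 2D Laplacian) this yields \(A^{-1} \approx \sum_k \omega_k \exp(-\alpha_k A_1) \otimes \exp(-\alpha_k A_2)\), a short sum of Kronecker products in place of a large dense inverse.

```python
import numpy as np
from scipy.linalg import expm

n = 10
A1 = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian (spd)
A2 = A1.copy()
A = np.kron(A1, np.eye(n)) + np.kron(np.eye(n), A2)   # Kronecker sum

h = 0.4
ks = np.arange(-30, 31)
alpha, w = np.exp(ks * h), h * np.exp(ks * h)         # t = e^s substitution

# exp(-a*(A1 (x) I + I (x) A2)) = exp(-a*A1) (x) exp(-a*A2), since both
# Kronecker factors commute.
Ainv = sum(wk * np.kron(expm(-ak * A1), expm(-ak * A2))
           for ak, wk in zip(alpha, w))
print(np.abs(Ainv - np.linalg.inv(A)).max())          # small for moderate h
```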
Wolfgang Hackbusch
Backmatter
Metadata
Title
Hierarchical Matrices: Algorithms and Analysis
Author
Wolfgang Hackbusch
Copyright Year
2015
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-47324-5
Print ISBN
978-3-662-47323-8
DOI
https://doi.org/10.1007/978-3-662-47324-5
