
Open Access 17-08-2021

Convergence Rates of First- and Higher-Order Dynamics for Solving Linear Ill-Posed Problems

Authors: Radu Boţ, Guozhi Dong, Peter Elbau, Otmar Scherzer

Published in: Foundations of Computational Mathematics | Issue 5/2022


Abstract

Recently, there has been great interest in analysing dynamical flows whose stationary limit is the minimiser of a convex energy. Flows of particular interest have been continuous limits of Nesterov’s algorithm and of the fast iterative shrinkage-thresholding algorithm. In this paper, we approach the solutions of linear ill-posed problems by dynamical flows. Because the squared norm of the residual of a linear operator equation is a convex functional, the theoretical results from convex analysis for energy minimising flows are applicable. However, in the restricted situation of this paper they can often be significantly improved. Moreover, since we show that the proposed flows for minimising the norm of the residual of a linear operator equation are optimal regularisation methods and that they provide optimal convergence rates for the regularised solutions, the given rates can be considered benchmarks for further studies in convex analysis.
Notes
Communicated by Jim Renegar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

We consider the problem of solving a linear operator equation
$$\begin{aligned} L x=y, \end{aligned}$$
(1)
where \(L:{\mathcal {X}} \rightarrow {\mathcal {Y}}\) is a bounded linear operator between (infinite dimensional) real Hilbert spaces \({\mathcal {X}}\) and \({\mathcal {Y}}\). If the range of L is not closed, Eq. 1 is ill-posed, see [13], in the sense that small perturbations in the data y can render Eq. 1 unsolvable or cause large perturbations in the corresponding solution. These undesirable effects are prevented by regularisation.
In this paper, we consider dynamical regularisation methods for solving Eq. 1. That is, we approximate the minimum norm solution \(x^\dagger \) of Eq. 1 by the solution \(\xi \) of a dynamical system of the form
$$\begin{aligned} \begin{aligned} \xi ^{(N)} (t) + \sum _{k=1}^{N-1} a_k(t)\xi ^{(k)}(t)&= - L^*L \xi (t) + L^*{\tilde{y}}&\text {for all } t \in \left( 0,\infty \right) , \\ \xi ^{(k)}(0)&= 0&\text {for all } k=0,\ldots ,N-1, \end{aligned} \end{aligned}$$
(2)
at an appropriate time, where \(N \in \mathbb {N}\), \(a_k:(0,\infty ) \rightarrow \mathbb {R}\), \(k=1,\ldots ,N-1\), are continuous functions, and \({\tilde{y}}\) is a perturbation of y. The stopping time is in practice often chosen via a standard discrepancy principle, see [13, Chapter 3.3]. We are interested in the conditions under which the regularised solution \(\xi (t)\) can be guaranteed to converge to the solution \(x^\dagger \) as \(t\rightarrow \infty \), and in how fast this convergence happens.
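As a concrete illustration (a minimal toy sketch of ours, not part of the paper’s analysis), the following Python snippet integrates the first-order case \(N=1\) of Eq. 2 by explicit Euler steps for a diagonal surrogate of L and stops at the first time the discrepancy principle \(\Vert L\xi (t)-{\tilde{y}}\Vert \le \tau \delta \) is satisfied; the operator, the noise level \(\delta \), and the parameters \(\tau \) and dt are illustrative choices.

```python
import numpy as np

# Toy surrogate of an ill-posed problem: L = diag(s) with rapidly decaying
# singular values (all concrete choices here are illustrative).
n = 100
s = 1.0 / np.arange(1, n + 1) ** 2
x_dagger = 1.0 / np.arange(1, n + 1) ** 1.5    # minimum norm solution
y = s * x_dagger                               # exact data y = L x^dagger

rng = np.random.default_rng(0)
delta = 1e-4
noise = rng.standard_normal(n)
y_tilde = y + delta * noise / np.linalg.norm(noise)   # ||y_tilde - y|| = delta

# Explicit Euler for the case N = 1 of Eq. 2 (Showalter's method, Eq. 3 below):
#   xi'(t) = -L^*L xi(t) + L^* y_tilde,   xi(0) = 0,
# stopped by the discrepancy principle ||L xi(t) - y_tilde|| <= tau * delta.
tau, dt = 2.0, 0.5
xi, t = np.zeros(n), 0.0
while np.linalg.norm(s * xi - y_tilde) > tau * delta:
    xi += dt * (-s**2 * xi + s * y_tilde)
    t += dt

print(f"stopping time t = {t:.1f}, "
      f"error ||xi(t) - x^dagger||^2 = {np.linalg.norm(xi - x_dagger)**2:.3e}")
```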
When we first study the case of exact data \({\tilde{y}}=y\), it turns out that the convergence rate, that is, the decay of \(\Vert \xi (t)-x^\dagger \Vert ^2\) in the limit \(t\rightarrow \infty \), can be uniquely characterised by the spectral decomposition of the minimum norm solution \(x^\dagger \) with respect to the operator \(L^*L\), which allows us to obtain optimal convergence rates as a function of the “regularity” of the source \(x^\dagger \). This regularity is usually described by so-called source conditions, the most common ones being of the form \(x^\dagger \in {\mathcal {R}}((L^*L)^\frac{\mu }{2})\) for some \(\mu >0\); we refer to [13, Chapter 2.2] and [9, Chapter 3.2] for an introduction to the use of these source conditions for obtaining convergence rates. Moreover, these convergence rates for exact data are seen to be in one-to-one correspondence with certain convergence rates for perturbed data as the perturbation \(\Vert {\tilde{y}}-y\Vert ^2\) goes to zero.
Outside the regularisation community, source conditions might appear technical because they involve the operator L. However, it has been demonstrated that for differential and integral operators L, these conditions coincide with smoothness conditions in Sobolev spaces. See for instance [14], where the analogy between smoothness and source conditions has been explained for the problem of numerical differentiation. Because of this analogy, these conditions are also often termed smoothness conditions.
In particular, we will apply the general theory of this equivalent characterisation of convergence rates to the following three well-studied examples (a small numerical sketch of all three flows follows Table 3):
1.
Showalter’s method (also known as the gradient flow method), see [27, 28], which corresponds to the case \(N=1\) in Eq. 2:
$$\begin{aligned} \begin{aligned} \xi '(t)&= -L^*L \xi (t)+ L^*{\tilde{y}} \text { for all } t \in \left( 0,\infty \right) , \\ \xi (0)&= 0, \end{aligned} \end{aligned}$$
(3)
see Table 1 for an overview of the available convergence rates results;
 
2.
the heavy ball method, introduced in [22], corresponding to \(N=2\) with a constant function \(a_1(t)=b>0\) in Eq. 2:
$$\begin{aligned} \partial _{t t}\xi (t;{\tilde{y}}) + b\partial _t\xi (t;{\tilde{y}})&= - L^*L \xi (t;{\tilde{y}}) + L^*{\tilde{y}}\text { for all } t \in \left( 0,\infty \right) , \nonumber \\ \partial _t\xi (0;{\tilde{y}})&= 0, \nonumber \\ \xi (0;{\tilde{y}})&= 0, \end{aligned}$$
(4)
where known convergence rates results are collected in Table 2;
 
3.
the vanishing viscosity method, see [29], which is the case of \(N=2\) with \(a_1(t)=\frac{b}{t}\) for some \(b>0\) in Eq. 2:
$$\begin{aligned} \partial _{t t}\xi (t;{\tilde{y}}) + \frac{b}{t}\partial _t\xi (t;{\tilde{y}})&= - L^*L \xi (t;{\tilde{y}}) + L^*{\tilde{y}}\text { for all } t \in \left( 0,\infty \right) , \nonumber \\ \partial _t\xi (0;{\tilde{y}})&= 0, \nonumber \\ \xi (0;{\tilde{y}})&= 0. \end{aligned}$$
(5)
Some convergence rates from the literature are listed in Table 3.
 
Table 1
Convergence rates for Showalter’s method

| Source condition | \(\Vert \xi (t)-x^\dagger \Vert ^2\) | \(\left\| L\xi (t)-y \right\| ^2\) |
| --- | --- | --- |
| \(\Vert L^\dag \Vert <\infty \) | \({\mathcal {O}}(\mathrm e^{-\Vert L^\dag \Vert ^{-2}t})\) [27, Theorem 1] | \({\mathcal {O}}(\mathrm e^{-\Vert L^\dag \Vert ^{-2}t})\) |
| \(x^\dagger \in {\mathcal {R}}((L^*L)^{\frac{\mu }{2}})\) | \({\mathcal {O}}(t^{-\mu })\) [28, Theorem 1] (\(\mu =1\)), Corollary 6 | \({\mathcal {O}}(t^{-\mu -1})\) Corollary 6 |
| \(x^\dagger \in {\mathcal {N}}(L)^\perp \) | \(o(1)\) [28, Theorem 1], Proposition 3 with Corollary 1 | \({\mathcal {O}}(t^{-1})\) [28, Theorem 1], \(o(t^{-1})\) Proposition 3 with Corollary 5 |

To compare the results from [28], we remark that the condition \(y\in {\mathcal {R}}(LL^*)\) given therein is equivalent to \(x^\dagger \in {\mathcal {R}}((L^*L)^{\frac{1}{2}})\), which can be directly seen, for example, from the characterisation of the range of a dual operator given in [26, Lemma 8.31]. We also remark that in the well-posed case \(\Vert L^\dag \Vert <\infty \), the rates for \(\Vert \xi (t)-x^\dagger \Vert ^2\) and for \(\Vert L\xi (t)-y\Vert ^2\) are always the same, since \(\Vert L^\dag \Vert ^{-2}\Vert \xi (t)-x^\dagger \Vert ^2\le \Vert L\xi (t)-y\Vert ^2\le \Vert L\Vert ^2\Vert \xi (t)-x^\dagger \Vert ^2\).
Table 2
Convergence rates for the heavy ball method

| Source condition | \(\Vert \xi (t)-x^\dagger \Vert ^2\) | \(\left\| L\xi (t)-y \right\| ^2\) |
| --- | --- | --- |
| \(\Vert L^\dag \Vert <\infty \) | \({\mathcal {O}}(\mathrm e^{\varepsilon t-\beta (\Vert L^\dag \Vert )\frac{bt}{2}})\) [22, Theorem 9.(5)] | \({\mathcal {O}}(\mathrm e^{\varepsilon t-\beta (\Vert L^\dag \Vert )\frac{bt}{2}})\) |
| \(x^\dagger \in {\mathcal {R}}((L^*L)^{\frac{\mu }{2}})\) | \({\mathcal {O}}(t^{-\mu })\) [33, Theorem 5.1], Corollary 7 | \({\mathcal {O}}(t^{-\mu -1})\) Corollary 7 |
| \(x^\dagger \in {\mathcal {N}}(L)^\perp \) | \(o(1)\) Proposition 4 with Lemma 22 and Corollary 1 | \(o(t^{-1})\) [33, Lemma 3.2] (\(b\ge \Vert L\Vert \)), Proposition 4 with Lemma 22 and Corollary 5 |

Here, \(\varepsilon >0\) denotes an arbitrarily small parameter and we have set \(\beta (\Vert L^\dag \Vert )=1-(1-\frac{4}{b^2\Vert L^\dag \Vert ^2})^{\frac{1}{2}}\) for \(\Vert L^\dag \Vert \ge \frac{2}{b}\) and \(\beta (\Vert L^\dag \Vert )=1\) for \(\Vert L^\dag \Vert <\frac{2}{b}\).
Table 3
Convergence rates for the vanishing viscosity flow

| Source condition | Parameters | \(\Vert \xi (t)-x^\dagger \Vert ^2\) | \(\left\| L\xi (t)-y \right\| ^2\) |
| --- | --- | --- | --- |
| \(\Vert L^\dag \Vert <\infty \) | \(b>3\) | \(o(t^{-2})\) | \(o(t^{-2})\) [4, Theorem 4.16] |
| | | \({\mathcal {O}}(t^{-\frac{2b}{3}})\) [5, Theorem 3.4] | \({\mathcal {O}}(t^{-\frac{2b}{3}})\) [5, Theorem 3.4] |
| \(\Vert L^\dag \Vert <\infty \) | \(b>2\) | \({\mathcal {O}}(t^{-2})\) | \({\mathcal {O}}(t^{-2})\) [29, Theorem 7] |
| | | \({\mathcal {O}}(t^{-b})\) | \({\mathcal {O}}(t^{-b})\) [7, Theorem 4.2] |
| \(\Vert L^\dag \Vert <\infty \) | \(0<b<3\) | \({\mathcal {O}}(t^{-\frac{2b}{3}})\) | \({\mathcal {O}}(t^{-\frac{2b}{3}})\) [4, Theorem 4.19] |
| \(x^\dagger \in {\mathcal {R}}((L^*L)^{\frac{\mu }{2}})\) | \(0<\mu <\frac{b}{2}\) | \({\mathcal {O}}(t^{-2\mu })\) Corollary 9 | |
| \(x^\dagger \in {\mathcal {R}}((L^*L)^{\frac{\mu }{2}})\) | \(0<\mu <\frac{b}{2}-1\) | | \({\mathcal {O}}(t^{-2\mu -2})\) Corollary 9 |
| \(x^\dagger \in {\mathcal {N}}(L)^\perp \) | \(b\ge 3\) | | \({\mathcal {O}}(t^{-2})\) [5, Theorem 2.7] |
| \(x^\dagger \in {\mathcal {N}}(L)^\perp \) | \(b>0\) | \(o(1)\) Proposition 5 with Lemma 27 and Corollary 1 | \({\mathcal {O}}(t^{-b+\varepsilon })+o(t^{-2})\) Proposition 5 with Lemma 27, Corollary 4, and Corollary 5 |

As before, \(\varepsilon >0\) denotes an arbitrarily small parameter.
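To complement the tables, here is a minimal numerical sketch (our own illustration, with ad hoc discretisations and parameters) that integrates the three flows of Eqs. 3, 4, and 5 for a diagonal toy operator with exact data; the damping terms of the two second-order flows are treated implicitly to avoid stiffness near \(t=0\).

```python
import numpy as np

# Toy diagonal operator L = diag(s) with exact data y = L x^dagger; all
# discretisations and parameter values are ad hoc illustrative choices.
n = 200
s = 1.0 / np.arange(1, n + 1)
x_dagger = 1.0 / np.arange(1, n + 1) ** 1.5
y = s * x_dagger

dt, T, b = 0.01, 200.0, 3.0
xi1 = np.zeros(n)                    # Showalter (Eq. 3)
xi2, v2 = np.zeros(n), np.zeros(n)   # heavy ball (Eq. 4), a_1(t) = b
xi3, v3 = np.zeros(n), np.zeros(n)   # vanishing viscosity (Eq. 5), a_1(t) = b/t

for k in range(1, int(T / dt) + 1):
    t = k * dt
    # first-order flow: explicit Euler
    xi1 += dt * (-s**2 * xi1 + s * y)
    # second-order flows: the damping term is treated implicitly for
    # stability; starting the time grid at t = dt is a crude treatment
    # of the singular damping b/t near t = 0.
    v2 = (v2 + dt * (-s**2 * xi2 + s * y)) / (1.0 + dt * b)
    xi2 += dt * v2
    v3 = (v3 + dt * (-s**2 * xi3 + s * y)) / (1.0 + dt * b / t)
    xi3 += dt * v3

for name, xi in [("Showalter", xi1), ("heavy ball", xi2),
                 ("vanishing viscosity", xi3)]:
    print(f"{name}: ||xi(T) - x^dagger||^2 = "
          f"{np.linalg.norm(xi - x_dagger)**2:.2e}")
```

On such examples one typically observes the faster decay of the vanishing viscosity error predicted by the \({\mathcal {O}}(t^{-2\mu })\) rate in Table 3 compared to the \({\mathcal {O}}(t^{-\mu })\) rates of Tables 1 and 2.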
The vanishing viscosity method in particular has recently been intensively investigated, see [5, 6, 8, 29], for example, as it exhibits faster convergence than the other two methods, and it was demonstrated to be a time-continuous formulation of Nesterov’s algorithm, see [20], providing an explanation for the rapid convergence of this algorithm. Consequently, it was studied not only in the form of Eq. 5, but more generally with the right-hand side (which in Eq. 5 is the negative gradient of \({\mathcal {J}}_0(x)=\frac{1}{2}\Vert L x-y\Vert ^2\)) replaced by the negative gradient of an arbitrary convex and differentiable functional \({\mathcal {J}}\). But since our theory relies on spectral analysis, we limit our discussion to the quadratic functional \({\mathcal {J}}_0\).
In terms of convergence rates, however, the discussions for general functionals \({\mathcal {J}}\) are often limited to the estimation of the convergence of \({\mathcal {J}}(\xi (t))-\min _{x\in {\mathcal {X}}}{\mathcal {J}}(x)\), which for \({\mathcal {J}}={\mathcal {J}}_0\) is given by \(\frac{1}{2}\Vert L\xi (t)-y\Vert ^2\). In the well-posed case where the operator L has a bounded pseudoinverse \(L^\dag \), this convergence of the squared norm of the residual is equivalent to the convergence of the error \(\Vert \xi (t)-x^\dagger \Vert ^2\), but this is no longer true in the ill-posed case where the pseudoinverse is unbounded. In contrast to this, our approach directly gives convergence rates for \(\Vert \xi (t)-x^\dagger \Vert ^2\), which then imply a convergence (typically of higher order) of the squared norm of the residual.
We will proceed as follows:
  • In Sect. 2, we revisit convergence rates results of regularisation methods from [3], which, in particular, allow to analyse first- and higher-order dynamics.
  • In the following sections, we apply the general results of Sect. 2 to regularising flow equations. In Sect. 4, we derive well-known convergence rates results for Showalter’s method and prove its optimality. In Sect. 5, we prove regularising properties, optimality, and convergence rates of the heavy ball dynamical flow. In the context of inverse problems, this method has already been analysed in [33], though not in terms of optimality, as is done here.
  • In Sect. 6, we consider the vanishing viscosity flow. We apply the general theory of Sect. 2 and prove optimality of this method. In particular, we prove optimal convergence rates (in the sense of regularisation theory) for \(\Vert \xi (t)-x^\dagger \Vert ^2\) under source conditions (see for instance [9, 13]). These rates (and the resulting ones for the squared norm of the residual) are seen to interpolate nicely between the known rates in the well-posed (finite-dimensional) setting and those in the ill-posed setting when the regularity of the solution \(x^\dagger \) is varied (via changing the parameter \(\mu \) in Table 3).
We want to emphasise that the notions of optimality in [7] (a representative reference for this field) and in [3] differ in the class of problems and in the amount of a priori information taken into account. In [7], best worst-case error rates in the class of convex energies are derived, while we focus on squared functionals \({\mathcal {J}}\). Moreover, we take into account prior knowledge about the solution. In view of this, it is not surprising that we obtain different “optimal” rates.

2 Generalisations of Convergence Rates Results

In the following, we slightly generalise convergence rates and saturation results from [3] so that they can be applied to prove convergence of the second-order regularising flows in Sects. 5 and 6. One needs to be aware here that in classical regularisation theory, the regularisation parameter \(\alpha > 0\) is considered a small parameter, meaning that we consider small perturbations of Eq. 1, whereas for dynamical regularisation methods of the form of Eq. 2, we take large times to approximate the stationary state. To link these two theories, we will apply an inverse polynomial identification of the optimal regularisation time with the regularisation parameter.
Let \(L:{\mathcal {X}}\rightarrow {\mathcal {Y}}\) be a bounded linear operator between two real Hilbert spaces \({\mathcal {X}}\) and \({\mathcal {Y}}\) with operator norm \(\left\| L \right\| \), \(y \in {\mathcal {R}}(L)\), and let \(x^\dagger \in {\mathcal {X}}\) be the minimum norm solution of \(L x = y\) defined by
$$\begin{aligned} L x^\dag =y\text { and }\Vert x^\dag \Vert =\inf \{\left\| x \right\| \mid L x=y\}. \end{aligned}$$
Definition 1
We call a family \((r_\alpha )_{\alpha >0}\) of continuous functions \(r_\alpha :[0,\infty )\rightarrow [0,\infty )\) the generator of a regularisation method if
1.
there exists a constant \(\sigma \in (0,1)\) such that
$$\begin{aligned} r_\alpha (\lambda ) \le \min \left\{ \frac{2}{\lambda },\frac{\sigma }{\sqrt{\alpha \lambda }}\right\} \text { for every } \lambda>0,\;\alpha >0; \end{aligned}$$
(6)
 
2.
the error function \(\tilde{r}_\alpha :(0,\infty )\rightarrow [-1,1]\), defined by
$$\begin{aligned} \tilde{r}_\alpha (\lambda )=1-\lambda r_\alpha (\lambda ),\;\lambda > 0, \end{aligned}$$
(7)
is non-negative and monotonically decreasing on the interval \((0,\alpha )\);
 
3.
there exists for every \(\alpha >0\) a monotonically decreasing, continuous function \(\tilde{R}_\alpha :(0,\infty )\rightarrow \left[ 0,1\right] \) such that \(\tilde{R}_\alpha \ge |\tilde{r}_\alpha |\) and \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is continuous and monotonically increasing for every fixed \(\lambda > 0\);
 
4.
there exists for every \({\bar{\alpha }}>0\) a constant \({\tilde{\sigma }}\in (0,1)\) such that
$$\begin{aligned} \tilde{R}_\alpha (\alpha )<{\tilde{\sigma }} \text { for all } \alpha \in (0,{\bar{\alpha }}). \end{aligned}$$
 
Remark 1
The definition of the generator of a regularisation method differs from the one in [3] by allowing the regularisation method to overshoot, meaning that \(r_\alpha (\lambda )>\frac{1}{\lambda }\) is possible at some points \(\lambda >0\) (the choice \(r_\alpha (\lambda )=\frac{1}{\lambda }\), which is not a regularisation method in the sense of Definition 1, would correspond to taking the inverse without regularisation, see Eq. 8). Consequently, we also relaxed the assumption that the error function \(\tilde{r}_\alpha \) is monotonically decreasing to the existence of a monotonically decreasing upper bound \(\tilde{R}_\alpha \) for \(\tilde{r}_\alpha \). We also want to remark that in the definition of the error function in [3], \({\tilde{r}}_\alpha ^{[3]}\), there is an additional square included, that is, \({\tilde{r}}_\alpha ^{[3]}=\tilde{r}_\alpha ^2\).
Definition 2
Let \((r_\alpha )_{\alpha >0}\) be the generator of a regularisation method.
1.
The regularised solutions according to a generator \((r_\alpha )_{\alpha >0}\) and data \(\tilde{y}\) are defined by
$$\begin{aligned} x_\alpha :{\mathcal {Y}}\rightarrow {\mathcal {X}},\;x_\alpha (\tilde{y}) = r_\alpha (L^*L)L^* \tilde{y}, \end{aligned}$$
(8)
where we use the bounded Borel functional calculus to identify the function \(r_\alpha :[0,\infty )\rightarrow [0,\infty )\) with a function acting on the space of positive semi-definite self-adjoint operators, see [32, Chapter XI.12], for example.
 
2.
Let \((\tilde{R}_\alpha )_{\alpha >0}\) be as in Definition 1 item 3. Then, we define for all \(\alpha >0\) the envelopes
$$\begin{aligned} R_\alpha :(0,\infty )\rightarrow [0,\infty ),\;R_\alpha (\lambda ) = \frac{1}{\lambda }\left( 1-\tilde{R}_\alpha (\lambda )\right) , \end{aligned}$$
(9)
and the corresponding regularised solutions
$$\begin{aligned} X_\alpha :{\mathcal {Y}}\rightarrow {\mathcal {X}},\;X_\alpha (\tilde{y})=R_\alpha (L^*L)L^*\tilde{y}. \end{aligned}$$
(10)
 
Remark 2
The family \((R_\alpha )_{\alpha >0}\) is also a generator of a regularisation method, since we have
$$\begin{aligned} R_\alpha (\lambda ) = \frac{1-\tilde{R}_\alpha (\lambda )}{\lambda }\le \frac{1-\tilde{r}_\alpha (\lambda )}{\lambda }= r_\alpha (\lambda ) \le \min \left\{ \frac{2}{\lambda },\frac{\sigma }{\sqrt{\alpha \lambda }} \right\} \end{aligned}$$
(11)
for every \(\lambda >0\) and \(\alpha >0\), which verifies Definition 1 item 1; the other three conditions of Definition 1 are trivially fulfilled: Definition 1 item 2 by the definition of \(\tilde{R}_\alpha \) via Definition 1 item 3, and Definition 1 item 3 and item 4 by choosing \(\tilde{R}_\alpha \) itself as upper bound for \(|\tilde{R}_\alpha |\).
The idea of these regularised solutions is to replace the unbounded inverse of \(L:{\mathcal {N}}(L)^\perp \rightarrow {\mathcal {R}}(L)\) by the bounded approximation \(x_\alpha \), where the parameter \(\alpha >0\) quantifies the regularisation. It should disappear in the limit \(\alpha \rightarrow 0\), where we typically expect \(r_\alpha (\lambda )\rightarrow \frac{1}{\lambda }\) corresponding to \(x_\alpha (y)\rightarrow (L^*L)^\dag L^*y = x^\dag \) (this is, however, not enforced by Definition 1, but we will add in Definition 4 a compatibility condition to ensure this).
Example 1
The most prominent regularisation method is probably Tikhonov regularisation, where the regularised solution \(x_\alpha ({\tilde{y}})\) is defined as the minimisation point of the Tikhonov functional
$$\begin{aligned} {\mathcal {T}}_{\alpha ,{\tilde{y}}}:{\mathcal {X}}\rightarrow \mathbb {R},\;{\mathcal {T}}_{\alpha ,{\tilde{y}}}(x) = \Vert L x-{\tilde{y}}\Vert ^2+\alpha \Vert x\Vert ^2. \end{aligned}$$
Solving the optimality condition gives for \(x_\alpha ({\tilde{y}})\) the expression
$$\begin{aligned} x_\alpha ({\tilde{y}}) = (L^*L+\alpha I)^{-1}L^*{\tilde{y}}, \end{aligned}$$
where \(I:{\mathcal {X}}\rightarrow {\mathcal {X}}\) denotes the identity map on \({\mathcal {X}}\), which has with \(r_\alpha (\lambda ):=\frac{1}{\lambda +\alpha }\) the form of Eq. 8 and \(r_\alpha \) satisfies all the conditions of Definition 1, see [3, Example 2.4].
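As a quick numerical sanity check (a sketch of ours on a random toy matrix, not taken from [3]), the closed form above coincides with the spectral-calculus expression of Eq. 8 for \(r_\alpha (\lambda )=\frac{1}{\lambda +\alpha }\), here evaluated via the singular value decomposition:

```python
import numpy as np

# Toy check: the closed form (L^*L + alpha I)^{-1} L^* y_tilde agrees with the
# spectral-calculus form r_alpha(L^*L) L^* y_tilde for r_alpha(l) = 1/(l + alpha).
rng = np.random.default_rng(1)
L = rng.standard_normal((30, 20)) / 10.0
y_tilde = rng.standard_normal(30)
alpha = 1e-2

x_closed = np.linalg.solve(L.T @ L + alpha * np.eye(20), L.T @ y_tilde)

# Spectral calculus via the SVD: L = U diag(s) V^T, so L^*L = V diag(s^2) V^T
# and r_alpha(L^*L) L^* y_tilde = V diag(r_alpha(s^2) s) U^T y_tilde.
U, sng, Vt = np.linalg.svd(L, full_matrices=False)
x_spectral = Vt.T @ ((sng / (sng**2 + alpha)) * (U.T @ y_tilde))

print(np.allclose(x_closed, x_spectral))  # True
```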
We will show later, in Sections 4, 5, and 6, that some common dynamical regularisation methods also fall into this regularisation scheme, so that all the convergence rates results from this section can be applied to these methods.
Definition 3
We denote by \(A\mapsto {\mathbf {E}}_A\) and \(A \mapsto {\mathbf {F}}_A\) the spectral measures of the operators \(L^*L\) and \(LL^*\), respectively, on all Borel sets \(A \subseteq [0,\infty )\); and we define the right-continuous and monotonically increasing function
$$\begin{aligned} e:(0,\infty ) \rightarrow [0,\infty ),\; e(\lambda )=\Vert {\mathbf {E}}_{[0,\lambda ]}x^\dagger \Vert ^2. \end{aligned}$$
(12)
We remark that the minimum norm solution \(x^\dagger \) is in the orthogonal complement of the null space \({\mathcal {N}}(L)\) of L and we therefore have \({\mathbf {E}}_{[0,\lambda ]}x^\dagger ={\mathbf {E}}_{(0,\lambda ]}x^\dagger \).
Moreover, if \(f:(0,\infty )\rightarrow \mathbb {R}\) is a right-continuous, monotonically increasing, and bounded function, we write
$$\begin{aligned} \int _a^b g(\lambda )\,\mathrm df(\lambda ) = \int _{(a,b]} g(\lambda )\,\mathrm d\mu _f(\lambda ) \end{aligned}$$
for the Lebesgue–Stieltjes integral of f, where \(\mu _f\) denotes the unique non-negative Borel measure defined by \(\mu _f((\lambda _1,\lambda _2])=f(\lambda _2)-f(\lambda _1)\) and \(g\in L^1(\mu _f)\).
We introduce the following quantities, whose behaviour we want to relate to each other:
  • the spectral tail of the minimum norm solution \(x^\dagger \) with respect to the operator \(L^*L\), that is, the asymptotic behaviour of \(e(\lambda )\) as \(\lambda \) tends to zero, see [21];
  • the error between the minimum norm solution \(x^\dagger \) and the regularised solution \(x_\alpha (y)\) or \(X_\alpha (y)\) for the exact data y, called the noise-free regularisation error, that is,
    $$\begin{aligned} d(\alpha ) :=\left\| x_\alpha (y)-x^\dagger \right\| ^2 \text { and } D(\alpha ):=\left\| X_\alpha (y)-x^\dagger \right\| ^2, \end{aligned}$$
    (13)
    respectively, as \(\alpha \) tends to zero;
  • the best worst-case error between the minimum norm solution \(x^\dagger \) and the regularised solution \(x_\alpha (\tilde{y})\) or \(X_\alpha (\tilde{y})\) for some data \(\tilde{y}\) with distance less than or equal to \(\delta >0\) to the exact data y under optimal choice of the regularisation parameter \(\alpha \), that is,
    $$\begin{aligned} \begin{aligned}&{\tilde{d}}(\delta ) :=\sup _{\tilde{y}\in {\bar{B}}_\delta (y)}\inf _{\alpha>0} \left\| x_\alpha ({\tilde{y}})-x^\dagger \right\| ^2 \text { and} \\&{\tilde{D}}(\delta ) :=\sup _{\tilde{y}\in {\bar{B}}_\delta (y)}\inf _{\alpha >0} \left\| X_\alpha ({\tilde{y}})-x^\dagger \right\| ^2, \end{aligned} \end{aligned}$$
    (14)
    respectively, as \(\delta \) tends to zero;
  • the noise-free residual error, which is the error between the image of the regularised solution \(x_\alpha (y)\) or \(X_\alpha (y)\) and the exact data y, that is,
    $$\begin{aligned} q(\alpha ) :=\Vert Lx_\alpha (y)-y\Vert ^2\text { and } Q(\alpha ) :=\Vert LX_\alpha (y)-y\Vert ^2, \end{aligned}$$
    (15)
    respectively, as \(\alpha \) tends to zero.
To describe the behaviour of these quantities, we consider, for example, convergence rates of the form
$$\begin{aligned} d(\alpha ) = \Vert x_\alpha (y) - x^\dagger \Vert ^2 \le C_d\varphi (\alpha )\text { for all }\alpha >0, \end{aligned}$$
with some constant \(C_d>0\) for the noise-free regularisation error d, characterised by the decay of a monotonically increasing function \(\varphi :(0,\infty )\rightarrow (0,\infty )\) for \(\alpha \rightarrow 0\), and look for a corresponding (equivalent) characterisation of the convergence rates of the other quantities, such as \(e(\lambda )=\Vert {\mathbf {E}}_{[0,\lambda ]}x^\dagger \Vert ^2\) or \(q(\alpha )=\left\| Lx_\alpha (y)-y \right\| ^2\).
Example 2
Common families of functions \(\varphi \) used to describe the convergence rates are Hölder functions
$$\begin{aligned} \varphi ^{\mathrm H}_\mu :(0,\infty )\rightarrow \mathbb {R},\;\varphi ^{\mathrm H}_\mu (\alpha ) = \alpha ^\mu \text { for all }\mu >0, \end{aligned}$$
(16)
see [13], for example; and logarithmic
$$\begin{aligned} \varphi ^{\mathrm L}_\mu :(0,\infty )\rightarrow \mathbb {R},\;\varphi ^{\mathrm L}_\mu (\alpha ) = {\left\{ \begin{array}{ll}\left| \log \alpha \right| ^{-\mu },&{}\alpha <\mathrm e^{-1}, \\ 1,&{}\alpha \ge \mathrm e^{-1},\end{array}\right. }\text { for all }\mu >0, \end{aligned}$$
(17)
or even double logarithmic functions, see for instance [17, 25]. See Fig. 1 for a sketch of the graphs of these functions.
The main results are collected in Theorem 1 and Corollary 3. We proceed in the following way to derive them:
  • In Lemma 1 and Corollary 1, we write the different regularisation errors in spectral form.
  • In Lemma 2 and Corollary 3, we show the relations between the convergence rates of the noise-free quantities e, d, and D. For this, we require the function \(\varphi \), which describes the rate of convergence and is the same for all three quantities, to be compatible with the regularisation method, see Definition 4.
  • In Lemma 10 and Lemma 11, we derive the relations of the best worst-case errors \({\tilde{d}}\) and \({\tilde{D}}\) to the quantities e and D. The corresponding rate of convergence is hereby of the form \(\varPhi [\varphi ]\), where the mapping \(\varPhi \) is introduced in Definition 5 and some of its elementary properties are shown in Lemma 6, Lemma 7, Lemma 8, and Lemma 9.
  • The statements for the residual errors q and Q are then concluded from Theorem 1 by using the identification of q and Q for the minimum norm solution \(x^\dag \) with the noise-free errors d and D for the minimum norm solution \({\bar{x}}^\dag =(L^*L)^{\frac{1}{2}}x^\dag \) of the problem \(L x={\bar{y}}\) with \({\bar{y}}=L{\bar{x}}^\dag \), and they are summarised in Corollary 3, Corollary 4, and Corollary 5.
In the remainder of this section, we will always consider \((r_\alpha )_{\alpha >0}\) to be the generator of a regularisation method with an envelope \((R_\alpha )_{\alpha >0}\) and corresponding regularised solutions \((x_\alpha )_{\alpha >0}\) and \((X_\alpha )_{\alpha >0}\), respectively. Moreover, we use the functions e, d, D, \({\tilde{d}}\), \({\tilde{D}}\), q, and Q as defined in Definition 3, see Table 4 for a summary of the notation.
Table 4
Used variables and references to their definitions

| Abbreviation | Description | References |
| --- | --- | --- |
| \(r_\alpha \) | Generator | Definition 1 |
| \(R_\alpha \) | Envelope generator | Equation 9 |
| \(\tilde{r}_\alpha \) | Error function | Equation 7 |
| \(\tilde{R}_\alpha \) | Envelope error function | Definition 1 item 3 |
| \(x_\alpha ({\tilde{y}})=r_\alpha (L^*L)L^*{\tilde{y}}\) | Regularised solution according to \(r_\alpha \) | Equation 8 |
| \(X_\alpha ({\tilde{y}})=R_\alpha (L^*L)L^*{\tilde{y}}\) | Regularised solution according to \(R_\alpha \) | Equation 10 |
| \(d(\alpha )=\left\| x_\alpha (y)-x^\dagger \right\| ^2\) | Noise-free regularisation error for \(r_\alpha \) | Equation 13 |
| \(D(\alpha )=\left\| X_\alpha (y)-x^\dagger \right\| ^2\) | Noise-free regularisation error for \(R_\alpha \) | Equation 13 |
| \(\tilde{d}(\delta )\) | Best worst-case error for \(r_\alpha \) | Equation 14 |
| \(\tilde{D}(\delta )\) | Best worst-case error for \(R_\alpha \) | Equation 14 |
| \(q(\alpha )=\left\| Lx_\alpha (y)-y \right\| ^2\) | Noise-free residual error for \(r_\alpha \) | Equation 15 |
| \(Q(\alpha )=\left\| LX_\alpha (y)-y \right\| ^2\) | Noise-free residual error for \(R_\alpha \) | Equation 15 |
| \({\mathbf {E}}_A, {\mathbf {F}}_A\) | Spectral measures of \(L^*L, LL^*\) | Definition 3 |
| \(e(\lambda )=\Vert {\mathbf {E}}_{[0,\lambda ]}x^\dagger \Vert ^2\) | Spectral tail of \(x^\dagger \) | Equation 12 |
| \(\hat{\varphi }\) | \(\hat{\varphi }(\alpha ) = \sqrt{\alpha \varphi (\alpha )}\) | Definition 5 |
| \({\hat{\varphi }}^{-1}\) | Generalised inverse of a function \({\hat{\varphi }}\) | Definition 5 |
| \(\varPhi \) | Noise-free to noisy transform | Definition 5 |

2.1 Spectral Representations of the Regularisation Errors

For the analysis, we will expand the quantities of interest with respect to the measure \(A\mapsto \left\| {\mathbf {E}}_Ax^\dagger \right\| ^2\), which describes the spectral decomposition of \(x^\dagger \) with respect to the operator \(L^*L\). With the function e defined in Eq. 12, we can write the resulting integrals as Lebesgue–Stieltjes integrals.
Lemma 1
We have the representations
$$\begin{aligned} d(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \text { and } D(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \end{aligned}$$
(18)
for the regularisation errors d and D, respectively, and
$$\begin{aligned} q(\alpha ) = \int _0^{\left\| L \right\| ^2}\lambda \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \text { and } Q(\alpha ) = \int _0^{\left\| L \right\| ^2}\lambda \tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \end{aligned}$$
(19)
for the residuals q and Q, respectively.
Proof
We can write the differences between one of the regularised solutions \(x_\alpha (y)\) or \(X_\alpha (y)\) and the minimum norm solution \(x^\dag \) in the form
$$\begin{aligned} x_\alpha (y)-x^\dagger&= r_\alpha (L^*L)L^*y - x^\dagger = (r_\alpha (L^*L)L^*L-I)x^\dagger \text { and} \\ X_\alpha (y)-x^\dagger&= (R_\alpha (L^*L)L^*L-I)x^\dagger , \end{aligned}$$
respectively, where \(I:{\mathcal {X}}\rightarrow {\mathcal {X}}\) denotes the identity map on \({\mathcal {X}}\). According to spectral theory, we can formulate this with the definition of the error functions \(\tilde{r}_\alpha \) and \(\tilde{R}_\alpha \), see Eqs. 7 and 9, as
$$\begin{aligned} \left\| x_\alpha (y)-x^\dagger \right\| ^2 = \int _0^{\left\| L \right\| ^2}\tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \text { and } \left\| X_\alpha (y)-x^\dagger \right\| ^2 = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ). \end{aligned}$$
For the differences between the image of the regularised solution \(x_\alpha (y)\) or \(X_\alpha (y)\) and the exact data, we find similarly
$$\begin{aligned}&\Vert Lx_\alpha (y)-y\Vert ^2 = \Vert Lr_\alpha (L^*L)L^*L x^\dag - Lx^\dagger \Vert ^2 = \left\langle x^\dag ,L^*L(r_\alpha (L^*L)L^*L-I)^2x^\dag \right\rangle \text { and} \\&\Vert LX_\alpha (y)-y\Vert ^2 = \left\langle x^\dag ,L^*L(R_\alpha (L^*L)L^*L-I)^2x^\dag \right\rangle . \end{aligned}$$
Thus, we have
$$\begin{aligned} \left\| Lx_\alpha (y)-y \right\| ^2 = \int _0^{\left\| L \right\| ^2}\lambda \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda )\text { and } \left\| LX_\alpha (y)-y \right\| ^2 = \int _0^{\left\| L \right\| ^2}\lambda \tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ). \end{aligned}$$
\(\square \)
From this representation, we immediately get that the regularised solutions \((x_\alpha )_{\alpha >0}\) and \((X_\alpha )_{\alpha >0}\) converge to the minimum norm solution \(x^\dagger \) if the error functions \((\tilde{r}_\alpha )_{\alpha >0}\) and \((\tilde{R}_\alpha )_{\alpha >0}\) tend to zero as \(\alpha \rightarrow 0\).
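For a diagonal operator \(L\), the measure \(\mathrm de\) consists of point masses \((x^\dagger _i)^2\) at the squared singular values, so the integrals of Lemma 1 reduce to finite sums. The following sketch (our own toy check, using Tikhonov regularisation with \(\tilde{r}_\alpha (\lambda )=\frac{\alpha }{\lambda +\alpha }\)) verifies the representation of \(d(\alpha )\) in Eq. 18 numerically:

```python
import numpy as np

# For L = diag(s), the measure de consists of point masses (x^dagger_i)^2 at
# lambda = s_i^2, so the integral in Eq. 18 becomes a finite sum. We check
# this for Tikhonov regularisation, where r~_alpha(l) = alpha/(l + alpha).
s = 1.0 / np.arange(1, 51)
x_dagger = 1.0 / np.arange(1, 51) ** 1.5
y = s * x_dagger
alpha = 1e-3

x_alpha = (s / (s**2 + alpha)) * y                       # r_alpha(L^*L) L^* y
d_direct = np.linalg.norm(x_alpha - x_dagger) ** 2
d_spectral = np.sum((alpha / (s**2 + alpha)) ** 2 * x_dagger**2)

print(np.isclose(d_direct, d_spectral))  # True
```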
Corollary 1
The regularisation errors D, Q, \({\tilde{d}}\), and \({\tilde{D}}\) (but not necessarily d and q) are monotonically increasing functions and the functions D and Q are also continuous.
Moreover, if \(\lim _{\alpha \rightarrow 0}\tilde{r}_\alpha (\lambda )=0\) (or \(\lim _{\alpha \rightarrow 0}\tilde{R}_\alpha (\lambda )=0\), respectively) for every \(\lambda >0\), then the regularised solutions \(x_\alpha (y)\) (or \(X_\alpha (y)\), respectively) converge for \(\alpha \rightarrow 0\) in the norm topology to the minimum norm solution \(x^\dagger \).
Proof
By assumption, see Definition 1 item 3, \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is monotonically increasing, and so are the functions
$$\begin{aligned} \alpha \mapsto D(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda )\text { and }\alpha \mapsto Q(\alpha ) = \int _0^{\left\| L \right\| ^2}\lambda \tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ). \end{aligned}$$
The monotonicity of \({\tilde{d}}\) and \({\tilde{D}}\) follows directly from their definition in Eq. 14 as suprema over the increasing sets \({\bar{B}}_\delta (y)\), \(\delta >0\).
Since \(\tilde{R}_\alpha (\lambda )\in [0,1]\) for every \(\alpha >0\) and every \(\lambda >0\) and \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is for every \(\lambda >0\) continuous, see Definition 1 item 3, Lebesgue’s dominated convergence theorem implies for every \(\alpha _0>0\):
$$\begin{aligned}&\lim _{\alpha \rightarrow \alpha _0}D(\alpha ) = \int _0^{\left\| L \right\| ^2}\lim _{\alpha \rightarrow \alpha _0}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = D(\alpha _0)\text { and} \\&\lim _{\alpha \rightarrow \alpha _0}Q(\alpha ) = \int _0^{\left\| L \right\| ^2}\lim _{\alpha \rightarrow \alpha _0}\lambda \tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = Q(\alpha _0), \end{aligned}$$
which proves the continuity of D and Q.
Similarly, we get with \(|\tilde{r}_\alpha (\lambda )|\le \tilde{R}_\alpha (\lambda )\le 1\) for every \(\alpha >0\) and every \(\lambda >0\) from Lebesgue’s dominated convergence theorem that
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\left\| x_\alpha (y)-x^\dagger \right\| ^2 = \lim _{\alpha \rightarrow 0}d(\alpha ) = \int _0^{\left\| L \right\| ^2}\lim _{\alpha \rightarrow 0}\tilde{r}_\alpha ^2(\lambda )\,\mathrm de (\lambda ) = 0\text { if }\lim _{\alpha \rightarrow 0}\tilde{r}_\alpha (\lambda )=0, \end{aligned}$$
and also
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\left\| X_\alpha (y)-x^\dagger \right\| ^2 = \lim _{\alpha \rightarrow 0}D(\alpha ) = \int _0^{\left\| L \right\| ^2}\lim _{\alpha \rightarrow 0}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de (\lambda ) = 0\text { if }\lim _{\alpha \rightarrow 0}\tilde{R}_\alpha (\lambda )=0. \end{aligned}$$
\(\square \)

2.2 Bounds for the Noise-Free Regularisation Errors

The representations of the noise-free regularisation errors as integrals over the spectral tail e allow us to characterise the convergence of the regularisation errors \(d(\alpha )\) and \(D(\alpha )\) in the limit \(\alpha \rightarrow 0\) in terms of the behaviour of the spectral tail \(e(\lambda )\) for \(\lambda \rightarrow 0\).
Lemma 2
With the constant \(\sigma \in (0,1)\) from Definition 1 item 1, we have for every \(\alpha >0\) the relation
$$\begin{aligned} (1-\sigma )^2e(\alpha ) \le d(\alpha ) \le D(\alpha ). \end{aligned}$$
(20)
That is, \((1-\sigma )^2\) times the spectral tail is a lower bound for the noise-free regularisation error of the regularisation method, which in turn is a lower bound for the error of the regularisation method of the envelope generator.
Proof
Let \(\alpha >0\) be fixed. With Eq. 18 and \(\tilde{R}_\alpha \ge |\tilde{r}_\alpha |\), according to Definition 1 item 3, we find for the errors d and D that
$$\begin{aligned} D(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \ge \int _0^{\left\| L \right\| ^2}\tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = d(\alpha ). \end{aligned}$$
Furthermore, since \(\tilde{r}_\alpha ^2\) is monotonically decreasing on \([0,\alpha ]\), according to Definition 1 item 2, and \(e(\lambda )=e(\left\| L \right\| ^2)\) for all \(\lambda \ge \left\| L \right\| ^2\), we can estimate
$$\begin{aligned} d(\alpha ) \ge \int _0^{\min \{\alpha ,\left\| L \right\| ^2\}}\tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = \int _0^\alpha \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \ge \tilde{r}_\alpha ^2(\alpha )e(\alpha ). \end{aligned}$$
Inserting the expression of Eq. 7 for \(\tilde{r}_\alpha \) and using the upper bound from Definition 1 item 1, we thus have
$$\begin{aligned} d(\alpha ) \ge (1-\alpha r_\alpha (\alpha ))^2e(\alpha ) \ge (1-\sigma )^2e(\alpha ). \end{aligned}$$
\(\square \)
Since we have not yet required that the error functions \(\tilde{r}_\alpha \) and \(\tilde{R}_\alpha \) vanish as \(\alpha \rightarrow 0\), we cannot ensure that the regularised solutions \(x_\alpha (y)\) and \(X_\alpha (y)\) converge to the minimum norm solution as \(\alpha \rightarrow 0\), or even obtain an upper bound on the regularisation errors d and D. We therefore impose the following additional constraint on a function \(\varphi \) for it to serve as an upper bound for the regularisation error.
Definition 4
We call a monotonically increasing function \(\varphi :(0,\infty )\rightarrow (0,\infty )\) compatible with the regularisation method \((r_\alpha )_{\alpha >0}\) with correspondingly chosen error functions \((\tilde{R}_\alpha )_{\alpha >0}\) according to Definition 1 item 3 if there exists for arbitrary \(\varLambda >0\) a monotonically decreasing, integrable function \(F:[1,\infty )\rightarrow \mathbb {R}\) such that
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) \le F\left( \frac{\varphi (\lambda )}{\varphi (\alpha )}\right) \text { for }0<\alpha \le \lambda \le \varLambda . \end{aligned}$$
(21)
In particular, a monotonically increasing function \(\varphi :(0,\infty )\rightarrow (0,\infty )\) with \(\lim _{\alpha \rightarrow 0}\varphi (\alpha )=0\) can only be compatible with \((r_\alpha )_{\alpha >0}\) if
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\frac{\tilde{R}_\alpha ^2(\lambda )}{\varphi (\alpha )} = 0\text { for every }\lambda >0, \end{aligned}$$
(22)
since the integrability of the monotonically decreasing function F in Eq. 21 implies the asymptotic behaviour \(\lim _{z\rightarrow \infty }z F(z)=0\).
Remark 3
With \(F(z)=(A z)^{-\frac{1}{\mu }}\), \(A\in (0,\infty )\), \(\mu \in (0,1)\), Eq. 21 is exactly the condition from [3, Equation 7] for the error function \(\tilde{R}_\alpha \) (there we assume that \(\tilde{r}_\alpha \) satisfies Definition 1 item 3 and item 4 such that we can take \(\tilde{R}_\alpha =\tilde{r}_\alpha \)).
Conditions of this sort for ensuring convergence rates of the method have a long history. For the special choice \(F(z)=Az^{-2}\), the condition was introduced as the qualification of the regularisation method in [19, Definitions 1 and 2], which is now commonly used for characterising convergence rates, see [12, 16], for example. Even before that, the condition was used for the convergence rates \(\varphi ^{\mathrm H}_\mu \), see, for example, the textbooks [30, Theorem 4.3], [31, Theorem 1.1 in Chapter 3], and [9, Theorem 4.3, Corollary 4.4].
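As a concrete instance of Definition 4 (a direct check from the definitions, included for illustration): for Tikhonov regularisation one may take \(\tilde{R}_\alpha =\tilde{r}_\alpha \) with \(\tilde{r}_\alpha (\lambda )=\frac{\alpha }{\lambda +\alpha }\), and for the Hölder rate \(\varphi =\varphi ^{\mathrm H}_\mu \) with \(\mu \in (0,2)\) we have for \(0<\alpha \le \lambda \) that
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) = \left( \frac{\alpha }{\lambda +\alpha }\right) ^2 \le \left( \frac{\alpha }{\lambda }\right) ^2 = \left( \frac{\varphi ^{\mathrm H}_\mu (\lambda )}{\varphi ^{\mathrm H}_\mu (\alpha )}\right) ^{-\frac{2}{\mu }}, \end{aligned}$$
so Eq. 21 holds with \(F(z)=z^{-\frac{2}{\mu }}\), which is monotonically decreasing and integrable on \([1,\infty )\) exactly because \(\frac{2}{\mu }>1\).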
Lemma 3
Let \(\varphi :(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing function which is compatible with \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4 and dominates the spectral tail, that is,
$$\begin{aligned} e(\lambda ) \le \varphi (\lambda ) \text { for all }\lambda >0. \end{aligned}$$
(23)
Then, with a monotonically decreasing and integrable function \(F:[1,\infty )\rightarrow \mathbb {R}\) fulfilling Eq. 21 for \(\varLambda =\Vert L\Vert ^2\), we get
$$\begin{aligned} D(\alpha )\le (\max \{1,F(1)\}+\Vert F\Vert _{L^1})\varphi (\alpha )\text { for all }\alpha >0. \end{aligned}$$
That is, the order of the noise-free regularisation error D of the envelope generator \((R_\alpha )_{\alpha >0}\) is given by the function \(\varphi \).
Proof
We first extend the function F to \({\tilde{F}}:[0,\infty )\rightarrow \mathbb {R}\) via \({\tilde{F}}(z):=\max \{1,F(1)\}\) for \(z\in [0,1]\) and \({\tilde{F}}(z):=F(z)\) for \(z\in (1,\infty )\) so that we have (because of \(\tilde{R}_\alpha ^2(\lambda )\le 1\) for all \(\alpha >0\) and \(\lambda >0\))
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) \le {\tilde{F}}\left( \frac{\varphi (\lambda )}{\varphi (\alpha )}\right) \text { for all }\alpha >0\text { and }0<\lambda \le \left\| L \right\| ^2. \end{aligned}$$
Taking for D the representation from Eq. 18 and using that \({\tilde{F}}\) is monotonically decreasing, we get
$$\begin{aligned} D(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \le \int _0^{\left\| L \right\| ^2}{\tilde{F}}\left( \frac{\varphi (\lambda )}{\varphi (\alpha )}\right) \,\mathrm de(\lambda )\le \int _0^{\left\| L \right\| ^2}{\tilde{F}}\left( \frac{e(\lambda )}{\varphi (\alpha )}\right) \,\mathrm de(\lambda ). \end{aligned}$$
Then, the substitution \(z=\frac{e(\lambda )}{\varphi (\alpha )}\) gives us
$$\begin{aligned} D(\alpha ) \le \varphi (\alpha )\int _0^\infty {\tilde{F}}(z)\,\mathrm dz = (\max \{1,F(1)\}+\Vert F\Vert _{L^1})\varphi (\alpha ). \end{aligned}$$
\(\square \)
Remark 4
The result of Lemma 3 is analogous to [3, Proposition 2.3] where the noise-free regularisation error produced by a generator \((r_\alpha )_{\alpha >0}\) is estimated.
The compatibility condition in Eq. 21 is essentially a way to measure whether the regularisation method converges at each spectral value faster than a given convergence rate \(\varphi \), see Eq. 22. It is therefore not surprising that if some convergence rate is compatible with \((r_\alpha )_{\alpha >0}\), then all slower convergence rates are compatible with it as well.
Lemma 4
Let \(\varphi _1,\varphi _2:(0,\infty )\rightarrow (0,\infty )\) be two monotonically increasing, continuous functions such that the ratio \(\psi :=\frac{\varphi _1}{\varphi _2}\) is monotonically increasing on \((0,\alpha _0]\) for some \(\alpha _0>0\).
Then \(\varphi _2\) is compatible with the regularisation method \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4 if \(\varphi _1\) is compatible with \((r_\alpha )_{\alpha >0}\).
Proof
Let \(\varLambda >0\) be arbitrary. Since \(\psi \) is continuous and everywhere positive, we have the positive bounds \(m:=\min _{\alpha \in [\alpha _0,\varLambda ]}\psi (\alpha )>0\) and \(M:=\max _{\alpha \in [\alpha _0,\varLambda ]}\psi (\alpha )\ge m\). Then, the monotonicity of \(\psi \) on the interval \((0,\alpha _0]\) implies for every \(\alpha \in (0,\varLambda ]\) that
$$\begin{aligned} \min _{\lambda \in [\alpha ,\varLambda ]}\frac{\psi (\lambda )}{\psi (\alpha )} \ge \frac{\min \{\psi (\alpha ),m\}}{\psi (\alpha )} = \min \left\{ 1,\frac{m}{\psi (\alpha )}\right\} \ge \min \left\{ 1,\frac{m}{M}\right\} = \frac{m}{M}. \end{aligned}$$
By definition of \(\psi \), this means that
$$\begin{aligned} \frac{\varphi _1(\lambda )}{\varphi _1(\alpha )} \ge \frac{m}{M}\,\frac{\varphi _2(\lambda )}{\varphi _2(\alpha )}\text { for all }0<\alpha \le \lambda \le \varLambda . \end{aligned}$$
Thus, if F is a monotonically decreasing, integrable function \(F:[1,\infty )\rightarrow \mathbb {R}\) such that Eq. 21 holds for \(\varphi =\varphi _1\), then
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) \le F\left( \frac{\varphi _1(\lambda )}{\varphi _1(\alpha )}\right) \le F\left( \frac{m}{M}\,\frac{\varphi _2(\lambda )}{\varphi _2(\alpha )}\right) \text { for all }0<\alpha \le \lambda \le \varLambda . \end{aligned}$$
Since the function \({\tilde{F}}:[1,\infty )\rightarrow \mathbb {R}\) given by \({\tilde{F}}(z):=F(\frac{m}{M} z)\) is also monotonically decreasing and integrable, this proves that \(\varphi _2\) is compatible with \((r_\alpha )_{\alpha >0}\), too. \(\square \)
In particular, if one of the Hölder rates from Example 2 is compatible with \((r_\alpha )_{\alpha >0}\), then all the logarithmic rates are compatible.
Corollary 2
Let \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \), \(\mu >0\), be the rates defined in Example 2.
Then \(\varphi ^{\mathrm L}_\mu \) is for every \(\mu >0\) compatible with the regularisation method \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4 if there exists a parameter \(\nu >0\) such that \(\varphi ^{\mathrm H}_\nu \) is compatible with \((r_\alpha )_{\alpha >0}\).
Proof
Let \(\varphi ^{\mathrm H}_\nu \) be compatible with \((r_\alpha )_{\alpha >0}\) for some \(\nu >0\) and consider for arbitrary \(\mu >0\) the function \(\psi :=\frac{\varphi ^{\mathrm H}_\nu }{\varphi ^{\mathrm L}_\mu }\). Since
$$\begin{aligned} \psi '(\alpha )=\alpha ^{\nu -1}\left| \log \alpha \right| ^{\mu -1}\left( \nu \left| \log \alpha \right| -\mu \right) >0\text { for }0<\alpha \le \min \{\mathrm e^{-1},\mathrm e^{-\frac{\mu }{\nu }}\}=:\alpha _0, \end{aligned}$$
the function \(\psi \) is monotonically increasing on \((0,\alpha _0]\). Thus, Lemma 4 implies the compatibility of the function \(\varphi ^{\mathrm L}_\mu \). \(\square \)

2.3 Relation Between Convergence Rates for Noise Free and for Noisy Data

We will see that, when the regularisation is applied to noisy data, a convergence rate for D gives rise to a convergence rate of the form \({\tilde{D}}(\delta )\le C_{{\tilde{D}}}\varPhi [D](\delta )\) for some constant \(C_{{\tilde{D}}}>0\), where the transform \(\varPhi [D]\) of the function D satisfies the system of equations
$$\begin{aligned} \varPhi [D](\delta ) = D(\alpha _\delta ) = \frac{\delta ^2}{\alpha _\delta } \end{aligned}$$
for some suitable function \(\delta \mapsto \alpha _\delta \).
Definition 5
Let \(\varphi :(0,\infty ) \rightarrow [0,\infty )\) be a monotonically increasing function which is not everywhere zero. We define the noise-free to noisy transform \(\varPhi [\varphi ]:(0,\infty )\rightarrow (0,\infty )\) of \(\varphi \) by
$$\begin{aligned} \varPhi [\varphi ](\delta ) :=\frac{\delta ^2}{{\hat{\varphi }}^{-1}(\delta )}, \end{aligned}$$
where we introduce the function
$$\begin{aligned} {\hat{\varphi }}:(0,\infty )\rightarrow [0,\infty ),\;{\hat{\varphi }}(\alpha )=\sqrt{\alpha \varphi (\alpha )} \end{aligned}$$
and write \({\hat{\varphi }}^{-1}\) for the generalised inverse
$$\begin{aligned} {\hat{\varphi }}^{-1}(\delta ):=\inf \{\alpha >0\mid {\hat{\varphi }}(\alpha )\ge \delta \}. \end{aligned}$$
Remark 5
We emphasise that the functions considered here need be neither continuous nor surjective for the generalised inverse to be well defined. In particular, the function \({\hat{e}}:(0,\infty )\rightarrow [0,\infty )\), \(\lambda \mapsto \sqrt{\lambda e(\lambda )}\), with e defined in Eq. 12, is only right-continuous and in general not surjective. Nevertheless, a generalised inverse exists.
We also note that if \(\varphi :(0,\infty )\rightarrow [0,\infty )\) is a monotonically increasing function which is not everywhere zero and \(\alpha _0:=\inf \left\{ \alpha>0\mid \varphi (\alpha )>0\right\} \), then we have that \({\hat{\varphi }}:(0,\infty )\rightarrow [0,\infty )\), \(\alpha \mapsto \sqrt{\alpha \varphi (\alpha )}\) is a strictly increasing function on \((\alpha _0,\infty )\) so that \(\alpha ={\hat{\varphi }}^{-1}({\hat{\varphi }}(\alpha ))\) for every \(\alpha \in (\alpha _0,\infty )\).
Later on, we will apply this transform to the functions describing the convergence rates. We therefore calculate (at least in leading order) the noise-free to noisy transforms for the families of convergence rates introduced in Example 2.
Lemma 5
Let \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) be the functions introduced in Example 2.
Then, we have for every \(\mu >0\) that
1.
\(\displaystyle \varPhi [\varphi ^{\mathrm H}_\mu ] = \varphi ^{\mathrm H}_{\frac{2\mu }{\mu +1}}\) and
 
2.
\(\displaystyle 0<\liminf _{\delta \rightarrow 0}\frac{\varPhi [\varphi ^{\mathrm L}_\mu ](\delta )}{\varphi ^{\mathrm L}_\mu (\delta )}\le \limsup _{\delta \rightarrow 0}\frac{\varPhi [\varphi ^{\mathrm L}_\mu ](\delta )}{\varphi ^{\mathrm L}_\mu (\delta )}<\infty \).
 
Proof
1.
We find directly from Definition 5 that
$$\begin{aligned} \varPhi [\varphi ^{\mathrm H}_\mu ](\delta )=\frac{\delta ^2}{({\hat{\varphi }}^{\mathrm H}_\mu )^{-1}(\delta )}\text { with }{\hat{\varphi }}^{\mathrm H}_\mu (\alpha )=\alpha ^{\frac{1+\mu }{2}}, \end{aligned}$$
which gives
$$\begin{aligned} \varPhi [\varphi ^{\mathrm H}_\mu ](\delta )=\frac{\delta ^2}{\delta ^{\frac{2}{1+\mu }}} = \delta ^{\frac{2\mu }{\mu +1}}. \end{aligned}$$
 
2.
This is shown in [3, Example 3.4 (ii)].
 
\(\square \)
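The transform of Definition 5 is also easy to evaluate numerically. The following sketch (our own illustration; the grid and its bounds are ad hoc choices) computes the generalised inverse on a logarithmic grid and reproduces item 1 of Lemma 5 for a Hölder rate:

```python
import numpy as np

# Sketch of Definition 5 on a logarithmic grid (grid bounds are ad hoc):
# phi_hat(a) = sqrt(a * phi(a)), Phi[phi](delta) = delta^2 / phi_hat^{-1}(delta),
# with the generalised inverse taken as the smallest grid point alpha
# satisfying phi_hat(alpha) >= delta.
grid = np.logspace(-12, 2, 200001)

def Phi(phi, delta):
    phi_hat = np.sqrt(grid * phi(grid))   # increasing, since phi is increasing
    alpha = grid[np.searchsorted(phi_hat, delta)]
    return delta**2 / alpha

mu = 0.5
for delta in [1e-2, 1e-4, 1e-6]:
    # Lemma 5 item 1 predicts Phi[phi^H_mu](delta) = delta^(2 mu / (mu + 1)).
    print(Phi(lambda a: a**mu, delta), delta ** (2 * mu / (mu + 1)))
```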
Let us collect some elementary properties of the transform \(\varPhi \) before estimating the quantities \({\tilde{d}}\) and \({\tilde{D}}\).
Lemma 6
Let \(\varphi :(0,\infty )\rightarrow [0,\infty )\) be a monotonically increasing function which is not everywhere zero and \({\hat{\varphi }}(\alpha ):=\sqrt{\alpha \varphi (\alpha )}\).
Then, we have
1.
for every \(\delta \in {\hat{\varphi }}\big ((0,\infty )\big )\setminus \left\{ 0\right\} \) that
$$\begin{aligned} \varPhi [\varphi ](\delta ) = \varphi ({\hat{\varphi }}^{-1}(\delta ))\text { and,} \end{aligned}$$
 
2.
if \(\varphi \) is additionally right-continuous, that
$$\begin{aligned} \varPhi [\varphi ](\delta ) \le \varphi ({\hat{\varphi }}^{-1}(\delta ))\hbox { for every}\ \delta >0. \end{aligned}$$
 
Proof
1.
Since \({\hat{\varphi }}\) is strictly increasing on \(\{\alpha>0\mid {\hat{\varphi }}(\alpha )>0\}\) and \(\delta \in {\hat{\varphi }}\big ((0,\infty )\big )\setminus \left\{ 0\right\} \), there exists exactly one point \(\alpha >0\) with \({\hat{\varphi }}(\alpha )=\delta \), which then is by definition \(\alpha ={\hat{\varphi }}^{-1}(\delta )\). Thus, we have that \({\hat{\varphi }}({\hat{\varphi }}^{-1}(\delta ))=\delta \), which means that
$$\begin{aligned} \varphi ({\hat{\varphi }}^{-1}(\delta )) = \frac{\delta ^2}{{\hat{\varphi }}^{-1}(\delta )} = \varPhi [\varphi ](\delta ). \end{aligned}$$
 
2.
Since \(\varphi \) is right-continuous and monotonically increasing, it is upper semi-continuous and so is \({\hat{\varphi }}\). Thus, the set \(\{\alpha >0\mid {\hat{\varphi }}(\alpha )\ge \delta \}\) is closed and therefore \({\hat{\varphi }}^{-1}(\delta )=\min \{\alpha >0\mid {\hat{\varphi }}(\alpha )\ge \delta \}\). In particular, we have that the inequality
$$\begin{aligned} {\hat{\varphi }}({\hat{\varphi }}^{-1}(\delta ))\ge \delta ,\text { that is, } \varphi ({\hat{\varphi }}^{-1}(\delta ))\ge \frac{\delta ^2}{{\hat{\varphi }}^{-1}(\delta )}=\varPhi [\varphi ](\delta ), \end{aligned}$$
(24)
holds.
 
\(\square \)
Lemma 7
Let \(\varphi ,\psi :(0,\infty )\rightarrow [0,\infty )\) be monotonically increasing functions which are not everywhere zero.
Then,
1.
\(\psi \le \varphi \) implies that \(\varPhi [\psi ]\le \varPhi [\varphi ]\) and,
 
2.
if \(\varphi \) is additionally right-continuous, then \(\varPhi [\psi ]\le \varPhi [\varphi ]\) also implies \(\psi \le \varphi \).
 
Proof
We set \({\hat{\varphi }}(\alpha ):=\sqrt{\alpha \varphi (\alpha )}\) and \({\hat{\psi }}(\alpha ):=\sqrt{\alpha \psi (\alpha )}\).
1.
Let \(\psi \le \varphi \). Then, we have
$$\begin{aligned} {\hat{\psi }}^{-1}(\delta ) = \inf \{\alpha>0\mid \alpha \psi (\alpha )\ge \delta ^2\} \ge \inf \{\alpha >0\mid \alpha \varphi (\alpha )\ge \delta ^2\} = {\hat{\varphi }}^{-1}(\delta ) \end{aligned}$$
and thus
$$\begin{aligned} \varPhi [\psi ](\delta ) = \frac{\delta ^2}{{\hat{\psi }}^{-1}(\delta )} \le \frac{\delta ^2}{{\hat{\varphi }}^{-1}(\delta )} = \varPhi [\varphi ](\delta ). \end{aligned}$$
 
2.
Conversely, if \(\varPhi [\psi ]\le \varPhi [\varphi ]\), then we get immediately that \({\hat{\varphi }}^{-1}\le {\hat{\psi }}^{-1}\).
Now, let \(\alpha >0\) be arbitrary. If \({\hat{\psi }}(\alpha )=0\), there is nothing to show; so we assume \({\hat{\psi }}(\alpha )>0\) and define \(\delta :={\hat{\psi }}(\alpha )\). Then, \(\alpha ={\hat{\psi }}^{-1}(\delta )\ge {\hat{\varphi }}^{-1}(\delta )\), so that we find with Eq. 24 (using the right-continuity of \(\varphi \)) that
$$\begin{aligned} \sqrt{\alpha \varphi (\alpha )}\ge {\hat{\varphi }}({\hat{\varphi }}^{-1}(\delta ))\ge \delta =\sqrt{\alpha \psi (\alpha )}. \end{aligned}$$
So, \(\varphi (\alpha )\ge \psi (\alpha )\).
 
\(\square \)
Lemma 8
Let \(C>0\), \(c>0\), and \(\varphi :(0,\infty )\rightarrow [0,\infty )\) be a monotonically increasing function which is not everywhere zero. We set
$$\begin{aligned} \psi (\alpha ) :=C^2\varphi (c^2\alpha ). \end{aligned}$$
Then,
$$\begin{aligned} \varPhi [\psi ](\delta ) = C^2\varPhi [\varphi ](\tfrac{c}{C}\delta ). \end{aligned}$$
Proof
We define again \({\hat{\varphi }}(\alpha ):=\sqrt{\alpha \varphi (\alpha )}\) and \({\hat{\psi }}(\alpha ):=\sqrt{\alpha \psi (\alpha )}\). Then, we have for every \(\delta >0\) that
$$\begin{aligned} {\hat{\psi }}^{-1}(\delta )&= \inf \{\alpha>0\mid \alpha \psi (\alpha )\ge \delta ^2\} = \inf \{\alpha>0\mid C^2\alpha \varphi (c^2\alpha )\ge \delta ^2\} \\&= \frac{1}{c^2}\inf \{{\tilde{\alpha }}>0\mid {\tilde{\alpha }}\varphi ({\tilde{\alpha }})\ge (\tfrac{c}{C}\delta )^2\} = \frac{1}{c^2}{\hat{\varphi }}^{-1}(\tfrac{c}{C}\delta ), \end{aligned}$$
which gives us
$$\begin{aligned} \varPhi [\psi ](\delta ) = \frac{\delta ^2}{{\hat{\psi }}^{-1}(\delta )} = \frac{(c\delta )^2}{{\hat{\varphi }}^{-1}(\tfrac{c}{C}\delta )} = C^2\varPhi [\varphi ](\tfrac{c}{C}\delta ). \end{aligned}$$
\(\square \)
Lemma 9
Let \(\varphi :(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing function and assume there exists a continuous, monotonically increasing function \(G:(0,\infty )\rightarrow (0,\infty )\) such that
$$\begin{aligned} \varphi (\gamma \alpha )\le G(\gamma )\varphi (\alpha )\text { for all } \gamma>0,\;\alpha >0. \end{aligned}$$
Then,
$$\begin{aligned} \varPhi [\varphi ]({\tilde{\gamma }}\delta )\le \varPhi [G]({\tilde{\gamma }})\varPhi [\varphi ](\delta )\text { for all }{\tilde{\gamma }}>0,\;\delta >0. \end{aligned}$$
Proof
We get from \(\varphi ({\tilde{\alpha }})\le G(\gamma )\varphi (\tfrac{1}{\gamma }{\tilde{\alpha }})\) with Lemma 7 and Lemma 8 that
$$\begin{aligned} \varPhi [\varphi ]({\tilde{\delta }}) \le G(\gamma )\varPhi [\varphi ]\left( \tfrac{1}{\sqrt{\gamma G(\gamma )}}{\tilde{\delta }}\right) . \end{aligned}$$
Thus, switching to the variable \({\tilde{\gamma }}:={{\hat{G}}}(\gamma ):=\sqrt{\gamma G(\gamma )}\) (which means that \(\gamma ={{\hat{G}}}^{-1}({\tilde{\gamma }})\) and thus, by Lemma 6, \(\varPhi [G]({\tilde{\gamma }})=G(\gamma )\)), we find with \(\delta :=\frac{1}{{\tilde{\gamma }}}{\tilde{\delta }}\):
$$\begin{aligned} \varPhi [\varphi ]({\tilde{\gamma }}\delta ) \le \varPhi [G]({\tilde{\gamma }})\varPhi [\varphi ](\delta ). \end{aligned}$$
\(\square \)

2.4 Bounds for the Best Worst-Case Errors

Let us finally come back to the functions \({\tilde{d}}\) and \({\tilde{D}}\), the best worst-case errors of the regularisation methods defined by the generators \((r_\alpha )_{\alpha >0}\) and \((R_\alpha )_{\alpha >0}\), respectively. Here, we derive estimates relating the best worst-case errors to the noise-free regularisation errors.
Lemma 10
Let \(x^\dagger \ne 0\). Then, we have with the constant \(\sigma \in (0,1)\) from Definition 1 item 1 that
$$\begin{aligned} {\tilde{d}}(\delta ) \le (1+\sigma )^2\varPhi [D](\delta ) \text { and }{\tilde{D}}(\delta ) \le (1+\sigma )^2\varPhi [D](\delta )\text { for all }\delta >0. \end{aligned}$$
Proof
To estimate the distance between the regularised solutions for exact data y and inexact data \(\tilde{y} \in {\bar{B}}_\delta (y)\), we define the Borel measure
$$\begin{aligned} \mu (A)=\Vert {\mathbf {F}}_A(\tilde{y}-y)\Vert ^2, \end{aligned}$$
where \({\mathbf {F}}\) denotes the spectral measure of the operator \(LL^*\). Then, we get with Eq. 11 the relation
$$\begin{aligned} \left\| X_\alpha ({\tilde{y}})-X_\alpha (y) \right\| ^2&= \left\langle \tilde{y}-y,R_\alpha ^2(LL^*)LL^*(\tilde{y}-y)\right\rangle = \int _{(0,\left\| L \right\| ^2]}\lambda R_\alpha ^2(\lambda )\,\mathrm d\mu (\lambda ) \\&\le \int _{(0,\left\| L \right\| ^2]}\lambda r_\alpha ^2(\lambda )\,\mathrm d\mu (\lambda ) = \left\| x_\alpha ({\tilde{y}})-x_\alpha (y) \right\| ^2. \end{aligned}$$
Thus, we have with Eq. 6 the upper bound
$$\begin{aligned} \left\| X_\alpha ({\tilde{y}})-X_\alpha (y) \right\| ^2&\le \left\| x_\alpha ({\tilde{y}})-x_\alpha (y) \right\| ^2 = \int _{(0,\left\| L \right\| ^2]}\lambda r_\alpha ^2(\lambda )\,\mathrm d\mu (\lambda ) \\&\le \delta ^2\sup _{\lambda \in (0,\Vert L\Vert ^2]}\lambda r_\alpha ^2(\lambda ) \le \sigma ^2\frac{\delta ^2}{\alpha }. \end{aligned}$$
The triangle inequality then gives
$$\begin{aligned} {\tilde{D}}(\delta ) = \sup _{\tilde{y}\in {\bar{B}}_\delta (y)}\inf _{\alpha>0} \left\| X_\alpha ({\tilde{y}})-x^\dagger \right\| ^2 \le \inf _{\alpha >0}\left( \left\| X_\alpha (y)-x^\dagger \right\| +\sigma \frac{\delta }{\sqrt{\alpha }}\right) ^2. \end{aligned}$$
(25)
We estimate the infimum therein from above by the value at \(\alpha :={{\hat{D}}}^{-1}(\delta )\), where we set \({{\hat{D}}}(\alpha ):=\sqrt{\alpha D(\alpha )}\). Since the function D is according to Corollary 1 monotonically increasing and continuous, we get from Lemma 6 and Definition 5 the identity \(D({{\hat{D}}}^{-1}(\delta ))=\frac{\delta ^2}{{{\hat{D}}}^{-1}(\delta )}=\varPhi [D](\delta )\), so that both terms in the infimum are for this choice of \(\alpha \) of the same order. This gives us
$$\begin{aligned} {\tilde{D}}(\delta ) \le \left( \sqrt{D({{\hat{D}}}^{-1}(\delta ))} + \sigma \sqrt{\frac{\delta ^2}{{{\hat{D}}}^{-1}(\delta )}} \right) ^2 = (1+\sigma )^2\varPhi [D](\delta ). \end{aligned}$$
(26)
Because of Eq. 20, we get in the same way
$$\begin{aligned} \begin{aligned} {\tilde{d}}(\delta ) = \sup _{\tilde{y}\in {\bar{B}}_\delta (y)}\inf _{\alpha>0} \left\| x_\alpha ({\tilde{y}})-x^\dagger \right\| ^2&\le \inf _{\alpha>0}\left( \left\| x_\alpha (y)-x^\dagger \right\| +\sigma \frac{\delta }{\sqrt{\alpha }}\right) ^2 \\&\le \inf _{\alpha >0}\left( \left\| X_\alpha (y)-x^\dagger \right\| +\sigma \frac{\delta }{\sqrt{\alpha }}\right) ^2 \\&\le (1+\sigma )^2\varPhi [D](\delta ), \end{aligned} \end{aligned}$$
(27)
where we used Eq. 26 in the last inequality. \(\square \)
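The balancing choice \(\alpha ={{\hat{D}}}^{-1}(\delta )\) in the proof above can also be carried out numerically. The following Python sketch (ours, for illustration only) evaluates the noise-free to noisy transform \(\varPhi [\varphi ](\delta )=\delta ^2/{{\hat{\varphi }}}^{-1}(\delta )\) with \({{\hat{\varphi }}}(\alpha ):=\sqrt{\alpha \varphi (\alpha )}\) by a root finder and compares it, for the Hölder-type rate \(\varphi (\alpha )=\alpha ^\mu \), with the closed form \(\delta ^{\frac{2\mu }{\mu +1}}\); the helper noise_free_to_noisy and the bracketing interval are our own choices.

```python
# Minimal sketch (ours): evaluate the noise-free to noisy transform
# Phi[phi](delta) = delta^2 / phihat^{-1}(delta), phihat(a) = sqrt(a*phi(a)),
# by a root finder, and compare with the closed form for phi(a) = a^mu.
import numpy as np
from scipy.optimize import brentq

def noise_free_to_noisy(phi, delta, lo=1e-12, hi=1e12):
    """Phi[phi](delta); lo/hi bracket the (monotone) inverse of phihat."""
    phihat = lambda a: np.sqrt(a * phi(a))
    alpha = brentq(lambda a: phihat(a) - delta, lo, hi)  # alpha = phihat^{-1}(delta)
    return delta**2 / alpha

mu = 2.0
phi = lambda a: a**mu                       # Hoelder-type rate phi^H_mu
for delta in [1e-1, 1e-2, 1e-3]:
    print(noise_free_to_noisy(phi, delta), delta**(2 * mu / (mu + 1)))
```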
The following lemma provides relations between the best worst-case errors \({\tilde{d}}\) and \({\tilde{D}}\) of the regularisation methods generated by \((r_\alpha )_{\alpha >0}\) and \((R_\alpha )_{\alpha >0}\), respectively, and the spectral tail e.
Lemma 11
Let \(x^\dagger \ne 0\). Then, there exist constants \(c>0\) and \(C>0\) such that we have the inequalities
$$\begin{aligned} {\tilde{d}}(\delta ) \ge c\varPhi [e](\delta ) \text { and }{\tilde{D}}(\delta ) \ge C\varPhi [e](\delta )\text { for all }\delta >0. \end{aligned}$$
Proof
To obtain a lower bound on \({\tilde{d}}\), we write
$$\begin{aligned} \begin{aligned} \left\| x_\alpha ({\tilde{y}})-x^\dagger \right\| ^2&= \left\| x_\alpha (y)-x^\dagger \right\| ^2+\left\| x_\alpha ({\tilde{y}})-x_\alpha (y) \right\| ^2\\&\quad +2\left\langle x_\alpha (\tilde{y})-x_\alpha (y),x_\alpha (y)-x^\dagger \right\rangle \\&=\left\| x_\alpha (y)-x^\dagger \right\| ^2+\left\langle \tilde{y}-y,r_\alpha ^2(LL^*)LL^*(\tilde{y}-y)\right\rangle \\&\quad +2\left\langle r_\alpha (LL^*)(\tilde{y}-y),r_\alpha (LL^*)LL^*y-y\right\rangle . \end{aligned} \end{aligned}$$
(28)
We set \({{\hat{e}}}(\alpha ):=\sqrt{\alpha e(\alpha )}\) and choose an arbitrary \({\bar{\alpha }}>0\) with the property that \({\bar{\delta }}:={{\hat{e}}}({\bar{\alpha }})>0\). Then, we find according to Definition 1 item 4 a parameter \({\tilde{\sigma }}\in (0,1)\) with
$$\begin{aligned} \tilde{R}_\alpha (\alpha )<{\tilde{\sigma }}\text { for all }\alpha \in (0,{\bar{\alpha }}). \end{aligned}$$
(29)
We now consider for \(\delta \in (0,{\bar{\delta }})\) the two cases \({{\hat{e}}}^{-1}(\delta )\in \varvec{\sigma }(L^*L)\setminus \{0\}\) and \({{\hat{e}}}^{-1}(\delta )\notin \varvec{\sigma }(L^*L)\setminus \{0\}\), where \(\varvec{\sigma }(L^*L)\) denotes the spectrum of the operator \(L^*L\).
  • Assume that \(\delta \in (0,{\bar{\delta }})\) is such that \(\alpha _\delta :={{\hat{e}}}^{-1}(\delta )\in \varvec{\sigma }(L^*L)\setminus \left\{ 0\right\} \). From the continuity of \({\tilde{R}}_{\alpha _\delta }\) and Eq. 29, we find that there exists a parameter \(a_\delta \in (0,\alpha _\delta )\) such that
    $$\begin{aligned} {\tilde{R}}_{\alpha _\delta }(a_\delta )<{\tilde{\sigma }}. \end{aligned}$$
    (30)
    Then, the assumption \(\alpha _\delta \in \varvec{\sigma }(L^*L)\setminus \left\{ 0\right\} \) implies that the spectral measure \({\mathbf {F}}\) of the operator \(LL^*\) fulfils \({\mathbf {F}}_{[a_\delta ,2\alpha _\delta ]}\ne 0\). To estimate Eq. 28 further, we will choose for given values of \(\alpha >0\) and \(\delta \in (0,{\bar{\delta }})\) a particular point \({\tilde{y}}\). For this choice, we again distinguish between two cases.
    • If
      $$\begin{aligned} z_{\alpha ,\delta } :={\mathbf {F}}_{[a_\delta ,2\alpha _\delta ]}(r_\alpha (LL^*)LL^*y-y) \ne 0, \end{aligned}$$
      we pick
      $$\begin{aligned} \tilde{y}=y+\delta \frac{z_{\alpha ,\delta }}{\left\| z_{\alpha ,\delta } \right\| } \end{aligned}$$
      in Eq. 28 and obtain
      $$\begin{aligned}&\left\| x_\alpha \left( y+\delta \tfrac{z_{\alpha ,\delta }}{\left\| z_{\alpha ,\delta } \right\| } \right) - x^\dagger \right\| ^2 = \left\| x_\alpha (y)-x^\dagger \right\| ^2 \\&\quad +\frac{\delta ^2}{\left\| z_{\alpha ,\delta } \right\| ^2}\left\langle z_{\alpha ,\delta },r_\alpha ^2(LL^*)LL^*z_{\alpha ,\delta }\right\rangle +\frac{2\delta }{\left\| z_{\alpha ,\delta } \right\| }\left\langle r_\alpha (LL^*)z_{\alpha ,\delta },z_{\alpha ,\delta }\right\rangle . \end{aligned}$$
      Here, we may drop the last term as it is non-negative, which gives us the lower bound
      $$\begin{aligned} \left\| x_\alpha \left( y+\delta \tfrac{z_{\alpha ,\delta }}{\left\| z_{\alpha ,\delta } \right\| } \right) - x^\dagger \right\| ^2 \ge \left\| x_\alpha (y)-x^\dagger \right\| ^2+\delta ^2\min _{\lambda \in [a_\delta ,2\alpha _\delta ]}\lambda r_\alpha ^2(\lambda ). \end{aligned}$$
    • Otherwise, if
      $$\begin{aligned} {\mathbf {F}}_{[a_\delta ,2\alpha _\delta ]}(r_\alpha (LL^*)LL^*y-y) = 0, \end{aligned}$$
      we choose \(z_{\alpha ,\delta }\in {\mathcal {R}}({\mathbf {F}}_{[a_\delta ,2\alpha _\delta ]})\setminus \left\{ 0\right\} \) arbitrarily. Then, with \(\tilde{y}=y+\delta \frac{z_{\alpha ,\delta }}{\left\| z_{\alpha ,\delta } \right\| }\), the last term in Eq. 28 vanishes and we find again
      $$\begin{aligned} \left\| x_\alpha \left( y+\delta \tfrac{z_{\alpha ,\delta }}{\left\| z_{\alpha ,\delta } \right\| } \right) - x^\dagger \right\| ^2 \ge \left\| x_\alpha (y)-x^\dagger \right\| ^2+\delta ^2\min _{\lambda \in [a_\delta ,2\alpha _\delta ]}\lambda r_\alpha ^2(\lambda ). \end{aligned}$$
    Therefore, we end up with
    $$\begin{aligned} {\tilde{d}}(\delta ) = \sup _{\tilde{y}\in {\bar{B}}_\delta (y)}\inf _{\alpha>0} \left\| x_\alpha ({\tilde{y}})-x^\dagger \right\| ^2 \ge \inf _{\alpha >0}\left( \left\| x_\alpha (y)-x^\dagger \right\| ^2+\delta ^2\min _{\lambda \in [a_\delta ,2\alpha _\delta ]}\lambda r_\alpha ^2(\lambda )\right) . \end{aligned}$$
    Using Eq. 11 and that \(\tilde{R}_\alpha \) is by Definition 1 item 3 monotonically decreasing, we get the inequality
    $$\begin{aligned} \lambda r_\alpha ^2(\lambda ) \ge \frac{1}{\lambda }\left( 1-\tilde{R}_\alpha (\lambda )\right) ^2 \ge \frac{1}{2\alpha _\delta }\left( 1-\tilde{R}_\alpha (a_\delta )\right) ^2\text { for all } \lambda \in [a_\delta ,2\alpha _\delta ], \end{aligned}$$
    and since we already proved in Lemma 2 that \(d\ge (1-\sigma )^2e\), we can estimate further
    $$\begin{aligned} {\tilde{d}}(\delta ) \ge \inf _{\alpha >0}\left( (1-\sigma )^2e(\alpha )+\frac{\delta ^2}{2\alpha _\delta }\left( 1-\tilde{R}_\alpha (a_\delta )\right) ^2\right) . \end{aligned}$$
    Now, the first term is monotonically increasing in \(\alpha \) and, since \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is for every \(\lambda >0\) monotonically increasing, see Definition 1 item 3, the second term is monotonically decreasing in \(\alpha \). Thus, we can estimate the expression for \(\alpha <\alpha _\delta \) from below by the second term at \(\alpha =\alpha _\delta \), and for \(\alpha \ge \alpha _\delta \) by the first term at \(\alpha =\alpha _\delta \):
    $$\begin{aligned} {\tilde{d}}(\delta ) \ge \min \left\{ (1-\sigma )^2e(\alpha _\delta ), \frac{\delta ^2}{2\alpha _\delta } \left( 1-{\tilde{R}}_{\alpha _\delta }(a_\delta )\right) ^2\right\} . \end{aligned}$$
    Recalling that \(\alpha _\delta ={{\hat{e}}}^{-1}(\delta )\) and that the function e is right-continuous, we get from Lemma 6 that \(e(\alpha _\delta )\ge \varPhi [e](\delta )\) and have by Definition 5 that \(\frac{\delta ^2}{\alpha _\delta }=\varPhi [e](\delta )\). Thus, we obtain with Eq. 30 that
    $$\begin{aligned} {\tilde{d}}(\delta ) \ge c_0\varPhi [e](\delta )\text { with }c_0 :=\min \left\{ (1-\sigma )^2,\tfrac{1}{2}(1-{\tilde{\sigma }})^2\right\} . \end{aligned}$$
    (31)
  • It remains to consider the case where \(\alpha _\delta :={{\hat{e}}}^{-1}(\delta )\notin \varvec{\sigma }(L^*L)\setminus \left\{ 0\right\} \). We define
    $$\begin{aligned} \alpha _0:=\inf \{\alpha >0\mid e(\alpha )\ge e(\alpha _\delta )\} \in (0,\alpha _\delta ]. \end{aligned}$$
    Since e is right-continuous and monotonically increasing, the infimum is achieved and we have that \(e(\alpha _0)=e(\alpha _\delta )\). Moreover, \(\alpha _0\in \varvec{\sigma }(L^*L)\), since e is constant on every interval in \((0,\infty )\setminus \varvec{\sigma }(L^*L)\) and so \(\alpha _0\notin \varvec{\sigma }(L^*L)\) would imply that \(e(\lambda )=e(\alpha _\delta )\) for all \(\lambda \in (\alpha _0-\varepsilon ,\alpha _0+\varepsilon )\) for some \(\varepsilon >0\) which would contradict the minimality of \(\alpha _0\).
    Setting \(\delta _0:={{\hat{e}}}(\alpha _0)\) (so \({{\hat{e}}}^{-1}(\delta _0)=\alpha _0\) and, according to Lemma 6, \(e(\alpha _0)=\varPhi [e](\delta _0)\)), we have that \(\delta _0={{\hat{e}}}(\alpha _0)\le {{\hat{e}}}(\alpha _\delta )=\delta \) and we therefore find with the monotonicity of \({\tilde{d}}\), see Corollary 1, Eq. 31, and Lemma 6 that
    $$\begin{aligned} {\tilde{d}}(\delta ) \ge {\tilde{d}}(\delta _0)\ge c_0\varPhi [e](\delta _0) = c_0e(\alpha _0) = c_0e(\alpha _\delta ) \ge c_0\varPhi [e](\delta ). \end{aligned}$$
Thus, we have shown for every \(\delta \in (0,{\bar{\delta }})\) that
$$\begin{aligned} {\tilde{d}}(\delta ) \ge c_0\varPhi [e](\delta ), \end{aligned}$$
(32)
where \(c_0\) is given by Eq. 31.
Now, we know from Lemma 6 that \(\varPhi [e](\delta )\le e({{\hat{e}}}^{-1}(\delta ))\le e(\Vert L\Vert ^2)\) for every \(\delta >0\). Thus, setting \(c:=\min \{c_0,\frac{{\tilde{d}}({\bar{\delta }})}{e(\left\| L \right\| ^2)}\}\), it follows with Eq. 32 that the inequality \({\tilde{d}}(\delta )\ge c\varPhi [e](\delta )\) holds for every \(\delta >0\).
Following exactly the same lines, we also get that there exists a constant \(C>0\) with
$$\begin{aligned} {\tilde{D}}(\delta ) \ge C\varPhi [e](\delta )\text { for every }\delta >0. \end{aligned}$$
\(\square \)

2.5 Optimal Convergence Rates

Putting together all these results, we can characterise the convergence of the regularisation errors for noise-free data and of the best worst-case errors equivalently in terms of the regularity of the minimum norm solution, concretely, in terms of the behaviour of the spectral tail. Moreover, as we have shown in [3], this can also be written in the form of variational source conditions.
Theorem 1
Let \(\eta \in (0,1)\) be an arbitrary parameter and \(\varphi :(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing function which is compatible with \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4. (The function \(\varphi \) represents the expected convergence rate of the regularisation method.)
Then, the following statements are equivalent:
1.
There exists a constant \(C_e>0\) such that \(e(\lambda )\le C_e\varphi (\lambda )\) for every \(\lambda >0\), meaning that the ratio of the spectral tail and the expected convergence rate is bounded.
 
2.
There exists a constant \(C_d>0\) such that \(d(\alpha )\le C_d\varphi (\alpha )\) for every \(\alpha >0\), meaning that the ratio of the noise-free rate of the regularisation method and the expected convergence rate is bounded.
 
3.
There exists a constant \(C_D>0\) such that \(D(\alpha )\le C_D\varphi (\alpha )\) for every \(\alpha >0\), meaning that the ratio of the noise-free rate of the envelope-generated regularisation method and the expected convergence rate is bounded.
 
4.
The expected convergence rate satisfies the variational source condition that there exists a constant \(C_\eta >0\) with
$$\begin{aligned} \left\langle x^\dagger ,x\right\rangle \le C_\eta \Vert \varphi ^{\frac{1}{2\eta }}(L^*L)x\Vert ^\eta \Vert x\Vert ^{1-\eta }\text { for all } x\in {\mathcal {X}}. \end{aligned}$$
(33)
 
If the function \(\varphi \) is additionally right-continuous and G-subhomogeneous in the sense that there exists a continuous and monotonically increasing function \(G:(0,\infty )\rightarrow (0,\infty )\) such that
$$\begin{aligned} \varphi (\gamma \alpha )\le G(\gamma )\varphi (\alpha )\text { for all } \gamma>0,\;\alpha >0, \end{aligned}$$
(34)
then every one of these statements is also equivalent to each of the following two:
5.
There exists a constant \(C_{{\tilde{d}}}>0\) such that \({\tilde{d}}(\delta )\le C_{{\tilde{d}}}\varPhi [\varphi ](\delta )\) for every \(\delta >0\), meaning that the ratio of the best worst-case error of the regularisation method and the noise-free to noisy transformed expected convergence rate is bounded (in fact, this justifies the name of the noise-free to noisy transform).
 
6.
There exists a constant \(C_{{\tilde{D}}}>0\) such that \({\tilde{D}}(\delta )\le C_{{\tilde{D}}}\varPhi [\varphi ](\delta )\) for every \(\delta >0\), meaning that the ratio of the best worst-case error of the envelope regularisation method and the noise-free to noisy transformed expected convergence rate is bounded.
 
Proof
We first note that there is nothing to show if \(x^\dagger =0\), since then \(e=d=D={\tilde{d}}={\tilde{D}}=0\), see Eqs. 18, 25, and 27. So, we assume that \(x^\dagger \ne 0\).
We also remark that if \(\varphi \) is compatible with a regularisation method in the sense of Definition 4 and \(C>0\), then \(C\varphi \) is compatible with the regularisation method.
1\(\implies \)3:
This follows directly from Lemma 3.
3\(\implies \)2:
This follows directly from Lemma 2.
2\(\implies \)1:
This follows again directly from Lemma 2.
1\(\iff \)4:
This equivalence was proved in [3, Proposition 4.1].
3\(\implies \)5:
Since \(D\le C_D\varphi \), we get from Lemma 7 and Lemma 8 that
$$\begin{aligned} \varPhi [D](\delta )\le \varPhi [C_D\varphi ](\delta )=C_D\varPhi [\varphi ](C_D^{-\frac{1}{2}}\delta )\text { for every }\delta >0. \end{aligned}$$
Now, using the assumption from Eq. 34, we find with Lemma 9 that
$$\begin{aligned} \varPhi [D](\delta )\le C_D\varPhi [G](C_D^{-\frac{1}{2}})\varPhi [\varphi ](\delta )\text { for every }\delta >0. \end{aligned}$$
We therefore get from Lemma 10 that
$$\begin{aligned} {\tilde{d}}(\delta )\le (1+\sigma )^2\varPhi [D](\delta ) \le (1+\sigma )^2C_D\varPhi [G](C_D^{-\frac{1}{2}})\varPhi [\varphi ](\delta )\text { for every }\delta >0, \end{aligned}$$
where \(\sigma \in (0,1)\) is the constant from Definition 1 item 1.
3\(\implies \)6:
As before, Lemma 10 implies
$$\begin{aligned} {\tilde{D}}(\delta )\le (1+\sigma )^2\varPhi [D](\delta ) \le (1+\sigma )^2C_D\varPhi [G](C_D^{-\frac{1}{2}})\varPhi [\varphi ](\delta )\text { for every }\delta >0. \end{aligned}$$
5\(\implies \)1:
The estimate \({\tilde{d}}\le C_{{\tilde{d}}}\varPhi [\varphi ]\) together with the constant \(c>0\) found in Lemma 11 yields that
$$\begin{aligned} \varPhi [e](\delta ) \le \frac{1}{c}{\tilde{d}}(\delta ) \le \frac{C_{{\tilde{d}}}}{c}\varPhi [\varphi ](\delta )\text { for every }\delta >0. \end{aligned}$$
Since we know from Lemma 8 that the function \(\psi :(0,\infty )\rightarrow (0,\infty )\), defined by
$$\begin{aligned} \psi (\alpha ):=\frac{C_{{\tilde{d}}}}{c}\varphi \left( \frac{C_{{\tilde{d}}}}{c}\alpha \right) , \text { fulfils } \varPhi [\psi ](\delta ) = \frac{C_{{\tilde{d}}}}{c}\varPhi [\varphi ](\delta )\text { for every }\delta >0, \end{aligned}$$
it follows that \(\varPhi [e] \le \varPhi [\psi ]\) and we get with Lemma 7 and Eq. 34 that
$$\begin{aligned} e(\alpha ) \le \psi (\alpha ) = \frac{C_{{\tilde{d}}}}{c}\varphi \left( \frac{C_{{\tilde{d}}}}{c}\alpha \right) \le \frac{C_{{\tilde{d}}}}{c} G\left( \frac{C_{{\tilde{d}}}}{c}\right) \varphi (\alpha )\text { for every }\alpha >0. \end{aligned}$$
6\(\implies \)1:
The estimate \({\tilde{D}}\le C_{{\tilde{D}}}\varPhi [\varphi ]\) yields with the constant \(C>0\) found in Lemma 11 the inequality
$$\begin{aligned} \varPhi [e](\delta ) \le \frac{1}{C}{\tilde{D}}(\delta ) \le \frac{C_{{\tilde{D}}}}{C}\varPhi [\varphi ](\delta )\text { for every }\delta >0 \end{aligned}$$
and thus with Eq. 34 as above:
$$\begin{aligned} e(\alpha ) \le \frac{C_{{\tilde{D}}}}{C}\varphi \left( \frac{C_{{\tilde{D}}}}{C}\alpha \right) \le \frac{C_{{\tilde{D}}}}{C}G\left( \frac{C_{{\tilde{D}}}}{C}\right) \varphi (\alpha )\text { for every }\alpha >0. \end{aligned}$$
\(\square \)
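To make the equivalences concrete, the following sketch (ours, not part of the analysis) instantiates items 1 and 2 for a diagonal model, using as a stand-in generator Tikhonov regularisation with error function \(\tilde{r}_\alpha (\lambda )=\frac{\alpha }{\alpha +\lambda }\): a spectral tail \(e(\lambda )=\lambda ^\mu \) yields a noise-free error \(d(\alpha )=\int \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda )\) with bounded ratio \(d(\alpha )/\alpha ^\mu \). The spectrum, grid, and exponent are arbitrary test choices.

```python
# Sketch (ours): diagonal model with spectral tail e(l) = l^mu and the
# Tikhonov generator, whose error function is r_tilde_alpha(l) = alpha/(alpha+l);
# then d(alpha) = int r_tilde_alpha^2(l) de(l) stays within a constant of alpha^mu.
import numpy as np

mu = 0.6                                          # test exponent (below saturation)
lam = np.logspace(-8, 0, 4000)                    # spectrum of L*L
w2 = np.diff(np.concatenate([[0.0], lam**mu]))    # (x_i^dag)^2 so that e(l) = l^mu
alphas = np.logspace(-6, 0, 13)
d = np.array([np.sum((a / (a + lam))**2 * w2) for a in alphas])
print("sup_l e/l^mu =", np.max(np.cumsum(w2) / lam**mu))   # = 1 by construction
print("sup_a d/a^mu =", np.max(d / alphas**mu))            # bounded ratio: item 2
```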
Remark 6
We note that the conditions in Theorem 1 item 2, item 3, item 5, and item 6 are convergence rates for the regularised solutions, which are equivalent to the spectral tail condition in Theorem 1 item 1 and to the variational source conditions in Theorem 1 item 4. We also want to stress, and this is a new result in comparison to [3], that this holds for regularisation methods \((r_\alpha )_{\alpha >0}\) whose error functions \(\tilde{r}_\alpha \) are not necessarily non-negative and monotonically decreasing and that this also enforces optimal convergence rates for the regularisation methods generated by the envelopes \((R_\alpha )_{\alpha >0}\).
The first work on the equivalence of optimality of regularisation methods is [21], which has served as a basis for the results in [3]. The equivalence of the optimal rate in Theorem 1 item 1 and the variational source condition in Theorem 1 item 4 has been analysed in a more general setting in [10–12, 15].
In particular, all the equivalent statements of Theorem 1 follow (under the assumptions of Theorem 1) from the standard source condition, see [13, e.g. Corollary 3.1.1]. However, the standard source condition is not equivalent to these statements, see, for example, [3, Corollary 4.2].
Proposition 1
Let \(\varphi :(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing, continuous function such that the standard source condition
$$\begin{aligned} x^\dagger \in {\mathcal {R}}(\varphi ^{\frac{1}{2}}(L^*L)) \end{aligned}$$
is fulfilled.
Then, there exists for every \(\eta \in (0,1]\) a constant \(C_\eta >0\) such that
$$\begin{aligned} \left\langle x^\dagger ,x\right\rangle \le C_\eta \Vert \varphi ^{\frac{1}{2\eta }}(L^*L)x\Vert ^\eta \Vert x\Vert ^{1-\eta }\text { for all } x\in {\mathcal {X}}. \end{aligned}$$
Proof
This statement is shown in [3, Corollary 4.2]. \(\square \)
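For a diagonal model, the passage from the standard source condition to Eq. 33 can be made concrete: with \(L^*L=\mathrm {diag}(\lambda _i)\), \(\varphi (\lambda )=\lambda ^\mu \), and \(x^\dagger =(L^*L)^{\frac{\mu }{2}}w\), the Cauchy–Schwarz and Hölder inequalities give Eq. 33 with the admissible constant \(C_\eta =\Vert w\Vert \). The following sketch (ours) checks this numerically over random test vectors; all sizes and distributions are arbitrary.

```python
# Sketch (ours): Eq. 33 for L*L = diag(lambda_i), phi(l) = l^mu and
# xdag = (L*L)^(mu/2) w; then C_eta = ||w|| is admissible for eta in (0,1].
import numpy as np

rng = np.random.default_rng(0)
n, mu, eta = 200, 1.5, 0.5
lam = rng.uniform(1e-6, 1.0, n)           # spectrum of L*L (diagonal model)
w = rng.standard_normal(n)
xdag = lam**(mu / 2) * w                  # standard source condition

worst = 0.0
for _ in range(1000):
    x = rng.standard_normal(n)
    lhs = abs(np.dot(xdag, x))            # with |.|, even stronger than Eq. 33
    rhs = np.linalg.norm(lam**(mu / (2 * eta)) * x)**eta * np.linalg.norm(x)**(1 - eta)
    worst = max(worst, lhs / rhs)
print(f"max ratio over samples: {worst:.4f} <= C_eta = {np.linalg.norm(w):.4f}")
```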
Let us finally take a look at the additional condition of G-subhomogeneity, introduced in Eq. 34, which we needed in Theorem 1 to prove optimal convergence rates for the best worst-case errors, and check that the convergence rates from Example 2 satisfy this condition.
Lemma 12
Let \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) denote the families of convergence rates defined in Example 2.
Then, we have for every parameter \(\mu >0\) that
1.
the function \(\varphi ^{\mathrm H}_\mu \) is G-subhomogeneous for \(G(\gamma ):=\gamma ^\mu \) in the sense of Eq. 34 and
 
2.
there exists a monotonically increasing, continuous function \(G:(0,\infty )\rightarrow (0,\infty )\) such that the function \(\varphi ^{\mathrm L}_\mu \) is G-subhomogeneous in the sense of Eq. 34.
 
Proof
1.
We clearly have \(\varphi ^{\mathrm H}_\mu (\gamma \alpha )=\gamma ^\mu \varphi ^{\mathrm H}_\mu (\alpha )\) for all \(\gamma >0\) and \(\alpha >0\).
 
2.
We consider the function \(g(\alpha ;\gamma ):=\frac{\varphi ^{\mathrm L}_\mu (\gamma \alpha )}{\varphi ^{\mathrm L}_\mu (\alpha )}\). Since \(g:(0,\infty )\times (0,\infty )\rightarrow (0,\infty )\) is continuous, \(g(\alpha ;\gamma ) \le 1\) for \(\alpha \ge \mathrm e^{-1}\), and
$$\begin{aligned} \lim _{\alpha \rightarrow 0}g(\alpha ;\gamma ) = \lim _{\alpha \rightarrow 0}\left( \frac{\left| \log \alpha \right| }{\left| \log \alpha \right| -\log \gamma }\right) ^\mu = 1, \end{aligned}$$
the function \({\tilde{G}}:(0,\infty )\rightarrow (0,\infty )\), \({\tilde{G}}(\gamma ):=\sup _{\alpha \in (0,\infty )}g(\alpha ;\gamma )\), is well-defined and monotonically increasing, and it satisfies by construction \(\varphi ^{\mathrm L}_\mu (\gamma \alpha )\le {\tilde{G}}(\gamma )\varphi ^{\mathrm L}_\mu (\alpha )\) for all \(\gamma >0\) and \(\alpha >0\). Thus, \(\varphi ^{\mathrm L}_\mu \) is G-subhomogeneous for every monotonically increasing, continuous function G with \(G\ge {\tilde{G}}\).
 
\(\square \)
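The supremum \({\tilde{G}}(\gamma )=\sup _{\alpha \in (0,\infty )}g(\alpha ;\gamma )\) from the proof can be approximated numerically. The sketch below (ours) assumes the logarithmic rate of Example 2 to be of the form \(\varphi ^{\mathrm L}_\mu (\alpha )=\left| \log \alpha \right| ^{-\mu }\) for \(\alpha \le \mathrm e^{-1}\), capped by the constant 1 beyond \(\mathrm e^{-1}\) (Example 2 is not restated here, so this concrete form is our assumption), and evaluates the supremum on a logarithmic grid.

```python
# Sketch (ours): approximate G_tilde(gamma) = sup_alpha g(alpha; gamma) on a grid.
# Assumed form of the logarithmic rate: phi_L(a) = |log a|^(-mu) for a <= 1/e,
# capped by the constant 1 for a > 1/e.
import numpy as np

mu = 1.0
def phi_L(a):
    a = np.asarray(a, dtype=float)
    out = np.ones_like(a)
    small = a <= np.exp(-1)
    out[small] = np.abs(np.log(a[small]))**(-mu)
    return out

alphas = np.logspace(-12, 2, 20000)
for gamma in [0.1, 1.0, 10.0, 100.0]:
    sup_g = np.max(phi_L(gamma * alphas) / phi_L(alphas))
    print(f"gamma={gamma:6.1f}: sup_alpha g = {sup_g:.4f}")
# finite for every gamma, and g(alpha; gamma) -> 1 as alpha -> 0, as in the proof
```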

2.6 Optimal Convergence Rates for the Residual Error

By applying Theorem 1 to the source \((L^*L)^{\frac{1}{2}}x^\dag \), we can directly establish a relation to the convergence rates for the noise-free residual errors q and Q of the regularisation method and the envelope-generated regularisation method as defined in Eq. 15.
Corollary 3
We introduce the squared norm of the spectral projection of \({\bar{x}}^\dag =(L^*L)^{\frac{1}{2}}x^\dag \) as
$$\begin{aligned} {\bar{e}}(\lambda ):=\left\| {\mathbf {E}}_{[0,\lambda ]}{\bar{x}}^\dag \right\| ^2=\int _0^\lambda {\tilde{\lambda }}\,\mathrm de({\tilde{\lambda }}). \end{aligned}$$
(35)
Let \({\bar{\varphi }}:(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing function which is compatible with \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4. Then, the following statements are equivalent:
1.
There exists a constant \(C_{{\bar{e}}}>0\) such that \({\bar{e}}(\lambda )\le C_{{\bar{e}}}{\bar{\varphi }}(\lambda )\) for every \(\lambda >0\).
 
2.
There exists a constant \(C_q>0\) such that \(q(\alpha )\le C_q{\bar{\varphi }}(\alpha )\) for every \(\alpha >0\).
 
3.
There exists a constant \(C_Q>0\) such that \(Q(\alpha )\le C_Q{\bar{\varphi }}(\alpha )\) for every \(\alpha >0\).
 
Proof
We first remark that since \(x^\dag \in {\mathcal {N}}(L)^\perp ={\mathcal {N}}(L^*L)^\perp \), also \({\bar{x}}^\dag \in {\mathcal {N}}(L)^\perp \), and \({\bar{x}}^\dag \) is therefore the minimum norm solution of the equation \(L x={\bar{y}}\) with \({\bar{y}}=L{\bar{x}}^\dag =L(L^*L)^{\frac{1}{2}}x^\dag \). The claim now follows from Theorem 1 for the minimum norm solution \({\bar{x}}^\dag \) by identifying the function e with \({\bar{e}}\) and, because of
$$\begin{aligned} \begin{aligned} q(\alpha )&= \int _0^{\left\| L \right\| ^2}\lambda \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = \int _0^{\left\| L \right\| ^2}\tilde{r}_\alpha ^2(\lambda )\,\mathrm d{\bar{e}}(\lambda )\text { and }\\ Q(\alpha )&= \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm d{\bar{e}}(\lambda ), \end{aligned} \end{aligned}$$
(36)
see Lemma 1, the distances d and D with q and Q, respectively. \(\square \)
From Corollary 3, we can obtain a (in general non-optimal) sufficient condition for the convergence rates of the noise-free residual errors q and Q in terms of the spectral tail e of the minimum norm solution \(x^\dagger \) itself, instead of having to rely on the spectral tail \({\bar{e}}\) of the point \((L^*L)^{\frac{1}{2}}x^\dagger \).
Corollary 4
Let \({\bar{\varphi }}:(0,\infty )\rightarrow (0,\infty )\) be a monotonically increasing function which is compatible with \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4 and fulfils
$$\begin{aligned} \lambda e(\lambda )\le {\bar{\varphi }}(\lambda )\text { for all }\lambda >0, \end{aligned}$$
(37)
meaning that the ratio of the spectral tail and \({\bar{\varphi }}\) is bounded by \(\lambda \mapsto \frac{1}{\lambda }\), the spectral representation of the inverse of \(L^*L\).
Then, there exists a constant \(C>0\) such that we have
$$\begin{aligned} q(\alpha )\le Q(\alpha )\le C{\bar{\varphi }}(\alpha )\text { for all }\alpha >0. \end{aligned}$$
(38)
Proof
The first inequality follows with Definition 1 item 3 directly from the representation in Eq. 19 for q and Q:
$$\begin{aligned} q(\alpha ) = \int _0^{\left\| L \right\| ^2}\lambda \tilde{r}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) \le \int _0^{\left\| L \right\| ^2}\lambda \tilde{R}_\alpha ^2(\lambda )\,\mathrm de(\lambda ) = Q(\alpha ). \end{aligned}$$
(39)
For the second inequality, we use that the function \({\bar{e}}\) defined in Eq. 35 fulfils
$$\begin{aligned} {\bar{e}}(\lambda ) = \int _0^\lambda {\tilde{\lambda }}\,\mathrm de({\tilde{\lambda }}) \le \lambda \int _0^\lambda \,\mathrm de({\tilde{\lambda }}) = \lambda e(\lambda ) \le {\bar{\varphi }}(\lambda )\text { for every }\lambda >0. \end{aligned}$$
(40)
Thus, Corollary 3 implies that there exists a constant \(C>0\) with \(Q(\alpha )\le C{\bar{\varphi }}(\alpha )\) for all \(\alpha >0\). \(\square \)
Remark 7
In particular, Corollary 3 implies that Eq. 38 holds for all monotonically increasing functions \({\bar{\varphi }}\) with \({\bar{\varphi }}(\alpha )\ge c\alpha \) for some \(c>0\) which are compatible with \((r_\alpha )_{\alpha >0}\).
The condition in Eq. 37 is, however, not equivalent to those in Corollary 3.
Example 3
Let \(x^\dagger \) be such that its spectral tail e has the form
$$\begin{aligned} e(\lambda ) = \frac{1}{\left| \log \lambda \right| }\text { for }\lambda \in (0,\lambda _0] \end{aligned}$$
(41)
for some \(\lambda _0\in (0,1)\).
Then, we claim that \({\bar{e}}\), defined by Eq. 35, converges faster to zero than \(\lambda \mapsto \lambda e(\lambda )\), that is,
$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{{\bar{e}}(\lambda )}{\lambda e(\lambda )} = 0, \end{aligned}$$
(42)
proving that the condition in Eq. 37 is stronger than those in Corollary 3.
To verify Eq. 42, we plug in Eq. 35 and perform an integration by parts in the numerator to obtain
$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{{\bar{e}}(\lambda )}{\lambda e(\lambda )} = 1-\lim _{\lambda \rightarrow 0}\frac{\int _0^\lambda e({\tilde{\lambda }})\,\mathrm d{\tilde{\lambda }}}{\lambda e(\lambda )}. \end{aligned}$$
Now, L’Hospital’s rule implies that
$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{{\bar{e}}(\lambda )}{\lambda e(\lambda )} = 1-\lim _{\lambda \rightarrow 0}\frac{e(\lambda )}{e(\lambda )+\lambda e'(\lambda )} = 1-\frac{1}{1+\lim _{\lambda \rightarrow 0}\frac{\lambda e'(\lambda )}{e(\lambda )}}. \end{aligned}$$
Inserting our expression for e from Eq. 41, we find that
$$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{\lambda e'(\lambda )}{e(\lambda )} = \lim _{\lambda \rightarrow 0}\frac{1}{\left| \log \lambda \right| } = 0 \end{aligned}$$
herein, which shows Eq. 42.
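The limit in Eq. 42 can also be observed numerically via the integration by parts above, \({\bar{e}}(\lambda )=\lambda e(\lambda )-\int _0^\lambda e(s)\,\mathrm ds\). The following sketch (ours) evaluates the ratio by quadrature and shows that its decay is only logarithmic, roughly of the order \(\frac{1}{\left| \log \lambda \right| }\):

```python
# Sketch (ours): the ratio ebar/(lambda*e) from Eq. 42 via quadrature, using
# ebar(lambda) = lambda*e(lambda) - int_0^lambda e(s) ds for e(s) = 1/|log s|.
import numpy as np
from scipy.integrate import quad

def e(s):
    return 0.0 if s <= 0.0 else 1.0 / abs(np.log(s))

for lam in [1e-2, 1e-4, 1e-8, 1e-16]:
    integral, _ = quad(e, 0.0, lam)
    print(f"lambda={lam:.0e}: ratio = {1.0 - integral / (lam * e(lam)):.4f}")
```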
Since \({\bar{e}}\) tends by its definition faster to zero than the identity \({\bar{\varphi }}:(0,\infty )\rightarrow (0,\infty )\), \({\bar{\varphi }}(\alpha )=\alpha \), the noise-free residual errors q and Q also converge faster than the identity (without imposing an additional source condition), provided that \({\bar{\varphi }}\) is compatible with \((r_\alpha )_{\alpha >0}\).
Corollary 5
If the convergence rate \({\bar{\varphi }}:(0,\infty )\rightarrow (0,\infty )\), \({\bar{\varphi }}(\alpha )=\alpha \) is compatible with \((r_\alpha )_{\alpha >0}\) in the sense of Definition 4, then we have that
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\frac{q(\alpha )}{\alpha }= \lim _{\alpha \rightarrow 0}\frac{Q(\alpha )}{\alpha }= 0. \end{aligned}$$
Proof
Since \(q\le Q\), see Eq. 39, it is enough to prove the claim for the function Q. We define \({\bar{e}}\) as in Eq. 35 and distinguish between two cases.
  • If \({\bar{e}}(\lambda )=0\) for all \(\lambda \in [0,\lambda _0]\) for some \(\lambda _0>0\), then we estimate, using the integral representation for Q from Eq. 36,
    $$\begin{aligned} Q(\alpha ) = \int _{\lambda _0}^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm d{\bar{e}}(\lambda ) \le \tilde{R}_\alpha ^2(\lambda _0)\Vert (L^*L)^{\frac{1}{2}}x^\dagger \Vert ^2. \end{aligned}$$
    Since \({\bar{\varphi }}\) is compatible with \((r_\alpha )_{\alpha >0}\), we know from Eq. 22 that
    $$\begin{aligned} \lim _{\alpha \rightarrow 0}\frac{Q(\alpha )}{\alpha }\le \Vert Lx^\dagger \Vert ^2\lim _{\alpha \rightarrow 0}\frac{\tilde{R}_\alpha ^2(\lambda _0)}{\alpha }= 0. \end{aligned}$$
  • If \({\bar{e}}(\lambda )>0\) for all \(\lambda >0\), then we first construct using the compatibility of \({\bar{\varphi }}\), as in the proof of Lemma 3, a monotonically decreasing and integrable function \({\tilde{F}}:[0,\infty )\rightarrow \mathbb {R}\) with
    $$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) \le {\tilde{F}}(\tfrac{\lambda }{\alpha })\text { for all }\alpha >0\text { and }0<\lambda \le \left\| L \right\| ^2. \end{aligned}$$
    Next, we pick a monotonically increasing function \(f:(0,\infty )\rightarrow (0,\left\| L \right\| ^2)\) with
    $$\begin{aligned} \lim _{\alpha \rightarrow 0}f(\alpha ) = 0\text { and }\lim _{\alpha \rightarrow 0}\frac{{\bar{e}}(f(\alpha ))}{\alpha }= \infty \end{aligned}$$
    (43)
    and split the integral in Eq. 36 for Q at the point \(f(\alpha )\) into two giving us
    $$\begin{aligned} Q(\alpha ) = \int _0^{\left\| L \right\| ^2}\tilde{R}_\alpha ^2(\lambda )\,\mathrm d{\bar{e}}(\lambda ) \le \int _0^{f(\alpha )}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda )+\int _{f(\alpha )}^{\left\| L \right\| ^2}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda ). \end{aligned}$$
    (44)
    We check that both terms decay faster than \(\alpha \).
    • Since \({\bar{e}}\) fulfils by its definition in Eq. 35 that
      $$\begin{aligned} \lim _{\lambda \rightarrow 0}\frac{{\bar{e}}(\lambda )}{\lambda }= 0, \end{aligned}$$
      we find for every \(\varepsilon >0\) a value \(\alpha _0>0\) such that
      $$\begin{aligned} {\bar{e}}(\lambda ) \le \varepsilon \lambda \text { for all }0<\lambda <f(\alpha _0). \end{aligned}$$
      (45)
      Therefore, we get for the first term in Eq. 44 with the substitution \(z=\frac{{\bar{e}}(\lambda )}{\varepsilon \alpha }\) that
      $$\begin{aligned} \int _0^{f(\alpha )}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda ) \le \int _0^{f(\alpha )}{\tilde{F}}(\tfrac{{\bar{e}}(\lambda )}{\varepsilon \alpha })\,\mathrm d{\bar{e}}(\lambda )\le \varepsilon \alpha \Vert {\tilde{F}}\Vert _{L^1}\text { for all }\alpha <\alpha _0. \end{aligned}$$
      And since this holds for arbitrary \(\varepsilon >0\), we see that
      $$\begin{aligned} \lim _{\alpha \rightarrow 0}\frac{1}{\alpha }\int _0^{f(\alpha )}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda ) = 0. \end{aligned}$$
    • For the second term in Eq. 44, we remark that Eq. 45 also implies that there exists a constant \(C>0\) with
      $$\begin{aligned} {\bar{e}}(\lambda ) \le C\lambda \text { for all }\lambda >0. \end{aligned}$$
      Thus, we find with the substitution \(z=\frac{{\bar{e}}(\lambda )}{C\alpha }\) that
      $$\begin{aligned} \int _{f(\alpha )}^{\left\| L \right\| ^2}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda ) \le \int _{f(\alpha )}^{\left\| L \right\| ^2}{\tilde{F}}(\tfrac{{\bar{e}}(\lambda )}{C\alpha })\,\mathrm d{\bar{e}}(\lambda )\le C\alpha \int _{\frac{{\bar{e}}(f(\alpha ))}{C\alpha }}^\infty {\tilde{F}}(z)\,\mathrm dz \end{aligned}$$
      for all \(\alpha >0\). According to our choice of f, see Eq. 43, the integral converges to zero for \(\alpha \rightarrow 0\) and we therefore obtain
      $$\begin{aligned} \lim _{\alpha \rightarrow 0}\frac{1}{\alpha }\int _{f(\alpha )}^{\left\| L \right\| ^2}{\tilde{F}}(\tfrac{\lambda }{\alpha })\,\mathrm d{\bar{e}}(\lambda ) = 0. \end{aligned}$$
\(\square \)
The results of this section explain the interplay between the convergence rates of the spectral tail of the minimum norm solution, the noise-free regularisation error, and the best worst-case error: for these different concepts, equivalent rates can be derived. Moreover, these rates also imply rates for the noise-free residual error. In addition to standard regularisation theory, we proved rates for the associated regularisation method defined in Eq. 9.

3 Spectral Decomposition Analysis of Regularising Flows

We now turn to the application of these results to the method in Eq. 2 with continuous functions \(a_k\in C((0,\infty );\mathbb {R})\), \(k=1,\ldots ,N-1\). We hereby consider the solution as a function of the possibly inexact data \({\tilde{y}}\in {\mathcal {Y}}\). Thus, we look for a solution \(\xi :\left[ 0,\infty \right) \times {\mathcal {Y}} \rightarrow {\mathcal {X}}\) of
$$\begin{aligned}&\partial _t^N\xi (t;{\tilde{y}}) + \sum _{k=1}^{N-1} a_k(t)\partial _t^k\xi (t;{\tilde{y}}) = - L^*L \xi (t;{\tilde{y}}) + L^*{\tilde{y}}\text { for all } t \in \left( 0,\infty \right) , \end{aligned}$$
(46a)
$$\begin{aligned}&\partial _t^k\xi (0;{\tilde{y}}) = 0 \qquad \qquad \qquad \text { for all }k\in \left\{ 0,\ldots ,N-1\right\} , \end{aligned}$$
(46b)
such that \(\xi (\cdot ;{\tilde{y}})\) is N times continuously differentiable for every \({\tilde{y}}\).
The following proposition provides an existence and uniqueness result for the solutions of such higher-order flows. In case the coefficients \(a_k\) are in \(C^\infty ([0,\infty );\mathbb {R})\), the result can also be derived more simply from an abstract Picard–Lindelöf theorem, see, for example, [18, Section II.2.1]. However, in our case, \(a_k\) might also have a singularity at the origin, such as in Eq. 5, and the proof becomes more involved.
Proposition 2
Let \(N\in \mathbb {N}\) and \({\tilde{y}}\in {\mathcal {Y}}\) be arbitrary, and let \(A\mapsto {\mathbf {E}}_A\) denote the spectral measure of the operator \(L^*L\).
Assume that the initial value problem
$$\begin{aligned}&\partial _t^N{\tilde{\rho }}(t;\lambda )+\sum _{k=1}^{N-1}a_k(t)\partial _t^k{\tilde{\rho }}(t;\lambda ) =-\lambda {\tilde{\rho }}(t;\lambda )&\text { for all }\lambda \in \left[ 0,\infty \right) ,\;t \in \left( 0,\infty \right) , \end{aligned}$$
(47a)
$$\begin{aligned}&\partial _t^k{\tilde{\rho }}(0;\lambda )=0&\text { for all }\lambda \in \left[ 0,\infty \right) ,\;k\in \left\{ 1,\ldots ,N-1\right\} , \end{aligned}$$
(47b)
$$\begin{aligned}&{\tilde{\rho }}(0;\lambda )=1&\text { for all }\lambda \in \left[ 0,\infty \right) , \end{aligned}$$
(47c)
has a unique solution \({\tilde{\rho }}:[0,\infty )\times [0,\infty )\rightarrow \mathbb {R}\) which is N times partially differentiable with respect to t. Moreover, we assume that \(\partial _t^k{\tilde{\rho }}\in C^1([0,\infty )\times [0,\infty );\mathbb {R})\) for every \(k\in \{0,\ldots ,N\}\).
We define the function \(\rho :[0,\infty )\times (0,\infty )\rightarrow \mathbb {R}\) by
$$\begin{aligned} \rho (t;\lambda ) :=\frac{1-{\tilde{\rho }}(t;\lambda )}{\lambda }. \end{aligned}$$
(48)
Then, the function \(\xi (\cdot ;{\tilde{y}})\), given by
$$\begin{aligned} \xi (t;{\tilde{y}}) = \int _{(0,\left\| L \right\| ^2]}\rho (t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}\text { for every }t\in [0,\infty ), \end{aligned}$$
(49)
is the unique solution of Eq. 46 in the class of N times strongly continuously differentiable functions.
Proof
We split the proof in multiple parts. First, we will show that \(\rho \) and \(\xi \), defined by Eqs. 48 and 49, are sufficiently regular. Then, we conclude from this that \(\xi \) satisfies Eq. 46. And finally, we show that every other solution of Eq. 46 coincides with \(\xi \).
  • We start by showing that the function \(\rho \) defined by Eq. 48 can be extended to a function \(\rho :[0,\infty )\times [0,\infty )\rightarrow \mathbb {R}\) which is N times continuously differentiable with respect to t by setting
    $$\begin{aligned} \rho (t;0) :=-\partial _\lambda {\tilde{\rho }}(t;0). \end{aligned}$$
    (50)
    For this, we only have to check the continuity of all the derivatives at the points (t, 0), \(t\in [0,\infty )\). We observe that the solution of Eq. 47 for \(\lambda =0\) is given by
    $$\begin{aligned} {\tilde{\rho }}(t;0)=1\text { for every }t\in [0,\infty ). \end{aligned}$$
    For the derivatives \(\partial _t^k\rho \), \(k\in \{0,\ldots ,N\}\), we therefore find with the mean value theorem (recall that \(\partial _\lambda \partial _t^k{\tilde{\rho }}=\partial _t^k\partial _\lambda {\tilde{\rho }}\) according to Schwarz’s theorem, see, e.g., [23, Theorem 9.1], since \(\partial _t^\ell {\tilde{\rho }}\in C^1([0,\infty )\times [0,\infty );\mathbb {R})\) for every \(\ell \in \{0,\ldots ,k\}\)) and Eq. 50 that
    $$\begin{aligned} \lim _{({\tilde{t}},{\tilde{\lambda }})\rightarrow (t,0)}&\left( \partial _t^k\rho ({\tilde{t}},{\tilde{\lambda }})-\partial _t^k\rho (t,0)\right) \\&= \lim _{({\tilde{t}},{\tilde{\lambda }})\rightarrow (t,0)} \left( \frac{\partial _t^k{\tilde{\rho }}({\tilde{t}},0)-\partial _t^k{\tilde{\rho }}({\tilde{t}},{\tilde{\lambda }})}{{\tilde{\lambda }}} +\partial _t^k\partial _\lambda {\tilde{\rho }}(t;0)\right) \\&= \lim _{({\tilde{t}},{\hat{\lambda }})\rightarrow (t,0)}\left( \partial _t^k\partial _\lambda {\tilde{\rho }}(t;0) -\partial _\lambda \partial _t^k{\tilde{\rho }}({\tilde{t}},{\hat{\lambda }})\right) = 0, \end{aligned}$$
    which proves that \(\partial _t^k\rho \) is for every \(k\in \{0,\ldots ,N\}\) continuous in \([0,\infty )\times [0,\infty )\).
  • Next, we are going to show that the function \(\xi \) is N times continuously differentiable with respect to t and that its partial derivatives are for every \(k\in \{0,\ldots ,N\}\) given by
    $$\begin{aligned} \partial _t^k\xi (t;{\tilde{y}}) = \int _{(0,\left\| L \right\| ^2]}\partial _t^k\rho (t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}. \end{aligned}$$
    (51)
    To see this, we assume by induction that Eq. 51 holds for \(k=\ell \) for some \(\ell \in \{0,\ldots ,N-1\}\). Then, we get with the Borel measure \(\mu _{L^*{\tilde{y}}}\) on \([0,\infty )\) defined by \(\mu _{L^*{\tilde{y}}}(A)=\Vert {\mathbf {E}}_AL^*{\tilde{y}}\Vert ^2\) that
    $$\begin{aligned} \lim _{h\rightarrow 0}&\left\| \frac{\partial _t^\ell \xi (t+h;{\tilde{y}})-\partial _t^\ell \xi (t;{\tilde{y}})}{h}-\int _{(0,\left\| L \right\| ^2]}\partial _t^{\ell +1}\rho (t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}\right\| ^2 \\&= \lim _{h\rightarrow 0}\left\| \int _{(0,\left\| L \right\| ^2]}\left( \frac{\partial _t^\ell \rho (t+h;\lambda )-\partial _t^\ell \rho (t;\lambda )}{h}-\partial _t^{\ell +1}\rho (t;\lambda )\right) \,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}\right\| ^2 \\&= \lim _{h\rightarrow 0}\int _{(0,\left\| L \right\| ^2]}\left( \frac{\partial _t^\ell \rho (t+h;\lambda )-\partial _t^\ell \rho (t;\lambda )}{h}-\partial _t^{\ell +1}\rho (t;\lambda )\right) ^2\,\mathrm d\mu _{L^*{\tilde{y}}}(\lambda ). \end{aligned}$$
    Now, since \(\partial _t^{\ell +1}\rho \) is continuous, it is in particular bounded on every compact set \([0,T]\times [0,\left\| L \right\| ^2]\), \(T>0\). And since the measure \(\mu _{L^*{\tilde{y}}}\) is finite, Lebesgue’s dominated convergence theorem implies that
    $$\begin{aligned} \lim _{h\rightarrow 0}&\left\| \frac{\partial _t^\ell \xi (t+h;{\tilde{y}})-\partial _t^\ell \xi (t;{\tilde{y}})}{h}\!-\!\int _{(0,\left\| L \right\| ^2]}\partial _t^{\ell +1}\rho (t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}\right\| ^2 \\&\!\!=\! \int _{(0,\left\| L \right\| ^2]}\lim _{h\rightarrow 0}\left( \frac{\partial _t^\ell \rho (t\!+\!h;\lambda )\!-\!\partial _t^\ell \rho (t;\lambda )}{h}\!-\!\partial _t^{\ell +1}\rho (t;\lambda )\right) ^2\,\mathrm d\mu _{L^*{\tilde{y}}}(\lambda )\!=\! 0, \end{aligned}$$
    which proves Eq. 51 for \(k=\ell +1\). Since Eq. 51 holds by definition of \(\xi \) for \(k=0\), this implies by induction that Eq. 51 holds for all \(k\in \{0,\ldots ,N\}\).
    Finally, the continuity of the Nth derivative \(\partial _t^N\xi \) follows in the same way directly from Lebesgue’s dominated convergence theorem:
    $$\begin{aligned} \lim _{{\tilde{t}}\rightarrow t}\left\| \partial _t^N\xi ({\tilde{t}};{\tilde{y}})\!-\!\partial _t^N\xi (t;{\tilde{y}})\right\| ^2 \!=\! \lim _{{\tilde{t}}\rightarrow t}\int _{(0,\left\| L \right\| ^2]}\left( \partial _t^N\rho ({\tilde{t}};\lambda )\!-\!\partial _t^N\rho (t;\lambda )\right) ^2\,\mathrm d\mu _{L^*{\tilde{y}}} \!=\! 0. \end{aligned}$$
  • To prove that \(\xi \) solves Eq. 46, we plug the definition of \(\rho \) from Eq. 48 into Eq. 51 and find
    $$\begin{aligned} \partial _t^N\xi (t;{\tilde{y}})&+\sum _{k=1}^{N-1}a_k(t)\partial _t^k\xi (t;{\tilde{y}}) \\&= -\int _{(0,\left\| L \right\| ^2]}\frac{1}{\lambda }\left( \partial _t^N{\tilde{\rho }}(t;\lambda )+\sum _{k=1}^{N-1}a_k(t)\partial _t^k{\tilde{\rho }}(t;\lambda )\right) \,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}. \end{aligned}$$
    Making use of Eq. 47, we get that \(\xi \) fulfils Eq. 46a:
    $$\begin{aligned} \partial _t^N\xi (t;{\tilde{y}})+\sum _{k=1}^{N-1}a_k(t)\partial _t^k\xi (t;{\tilde{y}})&= \int _{(0,\left\| L \right\| ^2]}{\tilde{\rho }}(t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} \\&= \int _{(0,\left\| L \right\| ^2]}(-\lambda \rho (t;\lambda )+1)\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} \\&= -L^*L\xi (t;{\tilde{y}})+L^*{\tilde{y}}. \end{aligned}$$
    (We remark that we have the relation \({\mathcal {R}}(L^*)\subset {\mathcal {N}}(L)^\perp ={\mathcal {N}}(L^*L)^\perp \) which implies the identity \({\mathbf {E}}_{(0,\Vert L\Vert ^2]}L^*{\tilde{y}}=L^*{\tilde{y}}\).)
    And for the initial conditions, we get, in agreement with Eq. 46b, from Eq. 51 that
    $$\begin{aligned} \partial _t^k\xi (0;{\tilde{y}})&= -\int _{(0,\left\| L \right\| ^2]}\partial _t^k{\tilde{\rho }}(0;\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} = 0,\;k\in \left\{ 1,\ldots ,N-1\right\} ,\text { and} \\ \xi (0;{\tilde{y}})&= \int _{(0,\left\| L \right\| ^2]}\frac{1-{\tilde{\rho }}(0;\lambda )}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} = 0. \end{aligned}$$
  • It remains to show that Eq. 49 defines the only solution of Eq. 46.
    So assume that we have two different solutions of Eq. 46 and call \(\xi _0\) the difference between the two solutions. We choose an arbitrary \(t_0>0\) and write \(\partial _t^k\xi _0(t_0;{\tilde{y}})=\xi ^{(k)}\) for every \(k\in \left\{ 0,\ldots ,N-1\right\} \). Then, \(\xi _0\) is a solution of the initial value problem
    $$\begin{aligned}&\partial _t^N\xi _0(t;{\tilde{y}}) + \sum _{k=1}^{N-1}a_k(t)\partial _t^k\xi _0(t;{\tilde{y}}) = - L^*L \xi _0(t;{\tilde{y}}) \text { for all } t \in \left( 0,\infty \right) \end{aligned}$$
    (52a)
    $$\begin{aligned}&\partial _t^k\xi _0(t_0;{\tilde{y}}) = \xi ^{(k)} \qquad \qquad \text { for all } k\in \left\{ 0,\ldots ,N-1\right\} . \end{aligned}$$
    (52b)
    We know, for example, from [18, Section II.2.1], that Eq. 52 has a unique solution on every interval \([t_1,t_2]\), \(0<t_1<t_0<t_2\). Thus, we can write \(\xi _0\) in the form
    $$\begin{aligned} \xi _0(t;{\tilde{y}}) = \sum _{\ell =0}^{N-1}\int _{[0,\infty )}\rho _\ell (t;\lambda )\,\mathrm d{\mathbf {E}}_\lambda \xi ^{(\ell )} \end{aligned}$$
    with the functions \(\rho _\ell \) solving for every \(\lambda \in [0,\infty )\) the initial value problems
    $$\begin{aligned}&\partial _t^N\rho _\ell (t;\lambda )+\sum _{k=1}^{N-1}a_k(t)\partial _t^k\rho _\ell (t;\lambda )=-\lambda \rho _\ell (t;\lambda )&\text { for all }t \in \left( 0,\infty \right) , \\&\partial _t^k\rho _\ell (t_0;\lambda )=\delta _{k\ell }&\text { for all }k,\ell \in \left\{ 0,\ldots ,N-1\right\} . \end{aligned}$$
    (Since \(a_k\) is continuous on \((0,\infty )\), Lebesgue’s dominated convergence theorem is applicable to every compact set \([t_1,t_2]\times [0,\left\| L \right\| ^2]\), \(0<t_1<t_0<t_2\).)
    Now, we have for every measurable subset \(A\subset [0,\infty )\) and every \(k\in \{0,\ldots ,N-1\}\) that
    $$\begin{aligned} \Vert {\mathbf {E}}_A\partial _t^k\xi _0(t;{\tilde{y}})\Vert ^2 = \sum _{\ell ,m=0}^{N-1}\int _A\partial _t^k\rho _\ell (t;\lambda )\partial _t^k\rho _m(t;\lambda )\,\mathrm d\mu _{\xi ^{(\ell )},\xi ^{(m)}}(\lambda ), \end{aligned}$$
    where the signed measures \(\mu _{\eta _1,\eta _2}\), \(\eta _1,\eta _2\in {\mathcal {X}}\), are defined by \(\mu _{\eta _1,\eta _2}(A)=\left\langle \eta _1,{\mathbf {E}}_A\eta _2\right\rangle \).
    The measures \(\mu _{\xi ^{(\ell )},\xi ^{(m)}}\) with \(\ell \ne m\) are absolutely continuous with respect to \(\mu _{\xi ^{(\ell )},\xi ^{(\ell )}}\) and with respect to \(\mu _{\xi ^{(m)},\xi ^{(m)}}\). Moreover, we can use Lebesgue’s decomposition theorem, see, e.g., [24, Theorem 6.10], to split the measures \(\mu _{\xi ^{(\ell )},\xi ^{(\ell )}}\), \(\ell \in \{0,\ldots ,N-1\}\), into measures \(\mu _j\), \(j\in \{0,\ldots ,J\}\), \(J\le N-1\), which are mutually singular to each other, so, explicitly, we write
    $$\begin{aligned} \mu _{\xi ^{(\ell )},\xi ^{(m)}} = \sum _{j=0}^J f_{j\ell m}\mu _j \end{aligned}$$
    for some measurable functions \(f_{j\ell m}\) with \(f_{j\ell m}=f_{j m\ell }\). Since then
    $$\begin{aligned} 0\le \left\| \sum _{\ell =0}^{N-1}\int _Ag_\ell (\lambda )\,\mathrm d{\mathbf {E}}_\lambda \xi ^{(\ell )}\right\| ^2 = \sum _{j=0}^J\int _A\sum _{\ell ,m=0}^{N-1}f_{j\ell m}(\lambda )g_\ell (\lambda )g_m(\lambda )\,\mathrm d\mu _j(\lambda ) \end{aligned}$$
    has to hold for all functions \(g_\ell \in C([0,\infty );\mathbb {R})\), \(\ell \in \{0,\ldots ,N-1\}\), and all measurable sets \(A\subset [0,\infty )\), the matrices \(F_j(\lambda )=(f_{j\ell m}(\lambda ))_{\ell ,m=0}^{N-1}\) are (after possibly redefining \(f_{j\ell m}\) on sets \(A_{j\ell m}\) with \(\mu _j(A_{j\ell m})=0\)) positive semi-definite. Thus, we have for every measurable set \(A\subset [0,\infty )\) that
    $$\begin{aligned} \Vert {\mathbf {E}}_A\partial _t^k\xi _0(t;{\tilde{y}})\Vert ^2 = \sum _{j=0}^J\int _A\sum _{\ell ,m=0}^{N-1}f_{j\ell m}(\lambda )\partial _t^k\rho _\ell (t;\lambda )\partial _t^k\rho _m(t;\lambda )\,\mathrm d\mu _j(\lambda ), \end{aligned}$$
    where the integrand is a positive semi-definite quadratic form of \(\partial _t^k\rho \), namely \((\partial _t^k\rho )^{\mathrm T}F_j(\partial _t^k\rho )\), where \(\rho =(\rho _\ell )_{\ell =0}^{N-1}\). We can therefore find for every \(j\in \{0,\ldots ,J\}\) and every \(\lambda \) a change of coordinates \(O_j(\lambda )\in \mathrm {SO}_N(\mathbb {R})\) such that the matrix \(O_j^{\mathrm T}(\lambda )F_j(\lambda )O_j(\lambda )=\mathrm {diag}(d_{j\ell }(\lambda ))_{\ell =0}^{N-1}\) is diagonal with non-negative diagonal entries \(d_{j\ell }(\lambda )\). Setting \({\bar{\rho }}_{j\ell }(t;\lambda )=(O_j(\lambda )\rho (t;\lambda ))_\ell \) and \({\bar{\mu }}_{j\ell }=d_{j\ell }\mu _j\), we get
    $$\begin{aligned} \Vert {\mathbf {E}}_A\partial _t^k\xi _0(t;{\tilde{y}})\Vert ^2 = \sum _{j=0}^J\sum _{\ell =0}^{N-1}\int _A\left( \partial _t^k{\bar{\rho }}_{j\ell }(t;\lambda )\right) ^2\,\mathrm d{\bar{\mu }}_{j\ell }(\lambda ). \end{aligned}$$
    (53)
    Since \(\xi _0:[0,\infty )\rightarrow {\mathcal {X}}\) is N times continuously differentiable, it follows from Eq. 53 that
    $$\begin{aligned} \int _0^{t_0}\int _{[0,\infty )}\left( \partial _t^k{\bar{\rho }}_{j\ell }(t;\cdot )\right) ^2\,\mathrm d{\bar{\mu }}_{j\ell }(\lambda )\,\mathrm dt < \infty \text { for every }k\in \{0,\ldots ,N\}, \end{aligned}$$
    and therefore, there exists a set \(\varLambda _{j\ell }\subset [0,\infty )\) with \({\bar{\mu }}_{j\ell }([0,\infty )\setminus \varLambda _{j\ell })=0\) such that
    $$\begin{aligned} \int _0^{t_0}\left( \partial _t^k{\bar{\rho }}_{j\ell }(t;\lambda )\right) ^2\,\mathrm dt < \infty \text { for every }\lambda \in \varLambda _{j\ell }\text { and every }k\in \{0,\ldots ,N\}. \end{aligned}$$
    So, \({\bar{\rho }}_{j\ell }(\cdot ;\lambda )\) is for every \(\lambda \in \varLambda _{j\ell }\) in the Sobolev space \(H^N([0,t_0])\). By the Sobolev embedding theorem, see, e.g., [2, Theorem 5.4], we thus have that \(\partial _t^k{\bar{\rho }}_{j\ell }(\cdot ;\lambda )\) extends for every \(\lambda \in \varLambda _{j\ell }\) and every \(k\in \{0,\ldots ,N-1\}\) continuously to a function on \([0,t_0]\).
    Since \(\xi _0\) is the difference of two solutions of Eq. 46, we have in particular that
    $$\begin{aligned} \lim _{t\rightarrow 0}\Vert \partial _t^k\xi _0(t;{\tilde{y}})\Vert ^2 = 0\text { for every } k\in \{0,\ldots ,N-1\}. \end{aligned}$$
    Thus, Eq. 53 implies that \(\partial _t^k{\bar{\rho }}_{j\ell }(t;\cdot )\rightarrow 0\) in \(L^2([0,\infty ),{\bar{\mu }}_{j\ell })\) with respect to the norm topology as \(t\rightarrow 0\). Because of the continuity of \(\partial _t^k{\bar{\rho }}_{j\ell }(\cdot ;\lambda )\), this means that there exists a set \({\tilde{\varLambda }}_{j\ell }\) with \({\bar{\mu }}_{j\ell }([0,\infty )\setminus {\tilde{\varLambda }}_{j\ell })=0\) such that we have for every \(k\in \{0,\ldots ,N-1\}\):
    $$\begin{aligned} \lim _{t\rightarrow 0}\partial _t^k{\bar{\rho }}_{j\ell }(t;\lambda ) = 0\text { for every }\lambda \in {\tilde{\varLambda }}_{j\ell }. \end{aligned}$$
    But since Eq. 47 has a unique solution, this implies that \({\bar{\rho }}_{j\ell }(t;\lambda )=0\) for all \(t\in [0,\infty )\), \(\lambda \in {\tilde{\varLambda }}_{j\ell }\), and therefore, because of Eq. 53, that \(\xi _0(t;{\tilde{y}})=0\) for every \(t\in [0,\infty )\), which proves the uniqueness of the solution of Eq. 46.
\(\square \)
In the following sections, we want to show for various choices of coefficients \(a_k\) that there exists a mapping \(T:(0,\infty )\rightarrow (0,\infty )\) between the regularisation parameter \(\alpha \) and the time t such that the solution \(\xi \) corresponds to a regularised solution \(x_\alpha \), as defined in Definition 2, via
$$\begin{aligned} \xi (T(\alpha );{\tilde{y}}) = x_\alpha ({\tilde{y}}) \end{aligned}$$
for some appropriate generator \((r_\alpha )_{\alpha >0}\) of a regularisation method as introduced in Definition 1. Since we have by Definition 2 of the regularised solution that
$$\begin{aligned} x_\alpha ({\tilde{y}}) = r_\alpha (L^*L)L^*{\tilde{y}} = \int _{(0,\left\| L \right\| ^2]}r_\alpha (\lambda )\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} \end{aligned}$$
and the solution \(\xi \) is according to Proposition 2 of the form of Eq. 49, this boils down to finding a mapping T such that if we define the functions \(r_\alpha \) by
$$\begin{aligned} r_\alpha (\lambda ) = \rho (T(\alpha );\lambda ), \end{aligned}$$
they generate a regularisation method in the sense of Definition 1.
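For a diagonalisable (here, for simplicity, diagonal) operator, this construction is straightforward to carry out numerically: solve the scalar initial value problems of Eq. 47 for each spectral value, form \(\rho =\frac{1-{\tilde{\rho }}}{\lambda }\), and assemble \(\xi \) as in Eq. 49. The following sketch (ours) does this for \(N=2\) with the constant damping \(a_1(t)=b\) and cross-checks against a direct integration of Eq. 46; coefficients with a singularity at the origin, as in Eq. 5, would require a more careful integrator near \(t=0\). The spectrum, data, and tolerances are arbitrary test choices.

```python
# Sketch (ours): Proposition 2 for a diagonal operator and N = 2 with constant
# damping a_1(t) = b. Solve the scalar IVPs (47) per spectral value, form
# rho = (1 - rho_tilde)/lambda, assemble xi via Eq. 49, and cross-check with a
# direct integration of Eq. 46.
import numpy as np
from scipy.integrate import solve_ivp

b, T = 2.0, 5.0
lam = np.array([0.1, 0.5, 1.0, 2.0])        # spectrum of L*L
L = np.diag(np.sqrt(lam))                   # a diagonal L realising it
y_tilde = np.array([1.0, -0.5, 2.0, 0.3])   # (possibly noisy) data

def rho_tilde(li):
    # scalar IVP (47): rho'' + b*rho' = -lambda*rho, rho(0) = 1, rho'(0) = 0
    f = lambda t, u: [u[1], -b * u[1] - li * u[0]]
    return solve_ivp(f, (0.0, T), [1.0, 0.0], rtol=1e-10, atol=1e-12).y[0, -1]

rho = np.array([(1.0 - rho_tilde(li)) / li for li in lam])
xi_spectral = rho * (L.T @ y_tilde)         # Eq. 49 in the eigenbasis

A = L.T @ L                                 # direct integration of Eq. 46, N = 2
f = lambda t, u: np.concatenate([u[4:], -b * u[4:] - A @ u[:4] + L.T @ y_tilde])
xi_direct = solve_ivp(f, (0.0, T), np.zeros(8), rtol=1e-10, atol=1e-12).y[:4, -1]
print(np.max(np.abs(xi_spectral - xi_direct)))   # agreement up to tolerances
```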

4 Showalter’s Method

Showalter’s method, given by Eq. 3, is the gradient flow for the functional \({\mathcal {J}}\). According to Proposition 2, we rewrite it as the family of first-order ordinary differential equations for the error function \({\tilde{\rho }}\) at the spectral values \(\lambda \) of \(L^*L\), which in this particular case reads
$$\begin{aligned} \begin{aligned} \partial _t\tilde{\rho }(t;\lambda )+\lambda {\tilde{\rho }}(t;\lambda )&=0 \text { for all } \lambda \in \left( 0,\infty \right) ,\;t \in \left( 0,\infty \right) , \\ \tilde{\rho }(0;\lambda )&= 1 \text { for all }\lambda \in \left( 0,\infty \right) . \end{aligned} \end{aligned}$$
(54)
Lemma 13
The solution \(\tilde{\rho }\) of Eq. 54 is given by
$$\begin{aligned} \tilde{\rho }(t;\lambda ) = \mathrm e^{-\lambda t}\text { for all }(t,\lambda ) \in \left[ 0,\infty \right) \times \left( 0,\infty \right) . \end{aligned}$$
(55)
In particular, the solution of Showalter’s method, that is, the solution of Eq. 46 with \(N=1\), is given by
$$\begin{aligned} \xi (t;{\tilde{y}}) = \int _{(0,\left\| L \right\| ^2]}\frac{1-\mathrm e^{-\lambda t}}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}, \end{aligned}$$
(56)
where \(A\mapsto {\mathbf {E}}_A\) denotes the spectral measure of \(L^*L\).
Proof
Clearly, the smooth function \({\tilde{\rho }}\) defined in Eq. 55 is the unique solution of Eq. 54 and the function \(\rho \) defined in Eq. 48 is \(\rho (t;\lambda )=\frac{1-\mathrm e^{-\lambda t}}{\lambda }\), \(t\ge 0\), \(\lambda >0\). So, Proposition 2 gives us the solution Eq. 56. \(\square \)
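The closed form in Eq. 56 is easily confirmed numerically: for a small random matrix L, applying \(\frac{1-\mathrm e^{-\lambda t}}{\lambda }\) to \(L^*{\tilde{y}}\) in an eigenbasis of \(L^*L\) agrees with a direct integration of the gradient flow. A sketch (ours, with arbitrary test data):

```python
# Sketch (ours): confirm Eq. 56 for a random 5x4 matrix L by comparing the
# closed-form spectral solution with a direct integration of xi' = -L*L xi + L* y.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(1)
L = rng.standard_normal((5, 4))
y_tilde = rng.standard_normal(5)
t = 3.0

lam, U = np.linalg.eigh(L.T @ L)            # L*L is a.s. positive definite here
rho = (1.0 - np.exp(-lam * t)) / lam        # Eqs. 48 and 55
xi_closed = U @ (rho * (U.T @ (L.T @ y_tilde)))

f = lambda s, x: -(L.T @ (L @ x)) + L.T @ y_tilde
xi_direct = solve_ivp(f, (0.0, t), np.zeros(4), rtol=1e-11, atol=1e-13).y[:, -1]
print(np.max(np.abs(xi_closed - xi_direct)))    # agreement up to tolerances
```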
Next, we want to show that, identifying \(\alpha =\frac{1}{t}\) as the regularisation parameter, the solution \(\xi (\frac{1}{\alpha };{\tilde{y}})\) is a regularised solution of the equation \(L x=y\) in the sense of Definition 2. For the verification of the property in Definition 1 item 1 of the regularisation method, it is convenient to be able to estimate the function \(1-\mathrm e^{-z}\) by \(\sqrt{z}\).
Lemma 14
There exists a constant \(\sigma _0\in (0,1)\) such that
$$\begin{aligned} 1-\mathrm e^{-z}\le \sigma _0 \sqrt{z}\text { for every }z\ge 0. \end{aligned}$$
(57)
Proof
We consider the function \(f:(0,\infty )\rightarrow (0,\infty )\), \(f(z)=\frac{1-\mathrm e^{-z}}{\sqrt{z}}\). Since we have \(\lim _{z\rightarrow 0}f(z)=0\) and \(\lim _{z\rightarrow \infty }f(z)=0\), f attains its maximum at the only critical point \(z_0>0\) given as the unique solution of the equation
$$\begin{aligned} 0=f'(z)= \frac{\mathrm e^{-z}}{\sqrt{z}}-\frac{1-\mathrm e^{-z}}{2 z^\frac{3}{2}} = \frac{\mathrm e^{-z}}{2 z^\frac{3}{2}}(2z+1-\mathrm e^z),\; z>0, \end{aligned}$$
where the uniqueness follows from the convexity of the exponential function. Since \(2z+1>\mathrm e^z\) at \(z=1\), we know additionally that \(z_0>1\). Therefore, we have in particular
$$\begin{aligned} f(z) \le f(z_0)< 1-\mathrm e^{-z_0} < 1\text { for every }z>0, \end{aligned}$$
which gives Eq. 57 upon setting \(\sigma _0:=1-\mathrm e^{-z_0}\). \(\square \)
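Numerically, the critical point \(z_0\) and the constant \(\sigma _0\) are easily located; a short sketch (ours):

```python
# Sketch (ours): locate z0, the root of 2z + 1 = exp(z) on (1, infty), and sigma0.
import numpy as np
from scipy.optimize import brentq

z0 = brentq(lambda z: 2 * z + 1 - np.exp(z), 1.0, 3.0)
f = lambda z: (1.0 - np.exp(-z)) / np.sqrt(z)
print(f"z0 = {z0:.4f}, sigma0 = {1 - np.exp(-z0):.4f}, f(z0) = {f(z0):.4f}")
# z0 ~ 1.2564, sigma0 ~ 0.7153, and max f = f(z0) ~ 0.6382 < sigma0 < 1
```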
In order to show that Showalter’s method is a regularisation method, we verify now all the assumptions in Definition 1.
Proposition 3
Let \({\tilde{\rho }}\) be the solution of Eq. 54 given in Eq. 55. Then, the functions \((r_\alpha )_{\alpha >0}\) defined by
$$\begin{aligned} r_\alpha (\lambda ) :=\frac{1}{\lambda }\left( 1-{\tilde{\rho }}(\tfrac{1}{\alpha };\lambda )\right) = \frac{1-\mathrm e^{-\frac{\lambda }{\alpha }}}{\lambda }\end{aligned}$$
(58)
generate a regularisation method in the sense of Definition 1.
Proof
We verify that \((r_\alpha )_{\alpha >0}\) satisfies the four conditions from Definition 1.
1.
We clearly have \(r_\alpha (\lambda )\le \frac{1}{\lambda }\le \frac{2}{\lambda }\). To prove the second part of the inequality in Definition 1 item 1, we use Lemma 14 and find
$$\begin{aligned} r_\alpha (\lambda ) \le \frac{\sigma _0}{\sqrt{\alpha \lambda }}, \end{aligned}$$
where \(\sigma _0\in (0,1)\) denotes the constant found in Lemma 14.
 
2.
Moreover, the function \(\tilde{r}_\alpha \), given by \(\tilde{r}_\alpha (\lambda )={\tilde{\rho }}(\frac{1}{\alpha };\lambda )=\mathrm e^{-\frac{\lambda }{\alpha }}\), is non-negative and monotonically decreasing.
 
3.
Since \(\tilde{r}_\alpha \) is monotonically decreasing and \(\alpha \mapsto \tilde{r}_\alpha (\lambda )\) is monotonically increasing, we can choose \(\tilde{R}_\alpha :=\tilde{r}_\alpha \) to fulfil Definition 1 item 3.
 
4.
We have \(\tilde{R}_\alpha (\alpha )=\tilde{r}_\alpha (\alpha )=\mathrm e^{-1}<1\) for every \(\alpha >0\).
 
\(\square \)
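As a sanity check (ours, not needed for the proof), the two bounds from item 1 of the preceding proof can be verified on a grid, using the rounded value of \(\sigma _0\) from Lemma 14:

```python
# Sketch (ours): grid check of r_alpha(l) <= min(1/l, sigma0/sqrt(alpha*l))
# for the Showalter generator of Eq. 58, with sigma0 rounded from Lemma 14.
import numpy as np

sigma0 = 0.7153
r = lambda a, l: (1.0 - np.exp(-l / a)) / l
lam = np.logspace(-8, 2, 1000)
for a in [1e-4, 1e-2, 1.0, 1e2]:
    assert np.all(r(a, lam) <= np.minimum(1.0 / lam, sigma0 / np.sqrt(a * lam)))
print("bounds verified on the grid")
```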
Finally, we check that the common convergence rate functions are compatible with this regularisation method.
Lemma 15
The functions \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) defined in Example 2 are for all \(\mu >0\) compatible with the regularisation method \((r_\alpha )_{\alpha >0}\), defined by Eq. 58, in the sense of Definition 4.
Proof
According to Corollary 2, it is enough to prove that \(\varphi ^{\mathrm H}_\mu \) is for arbitrary \(\mu >0\) compatible with \((r_\alpha )_{\alpha >0}\). To see this, we remark that
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) = \mathrm e^{-2\frac{\lambda }{\alpha }} = F_\mu \left( \frac{\varphi ^{\mathrm H}_\mu (\lambda )}{\varphi ^{\mathrm H}_\mu (\alpha )}\right) \text { with }F_\mu (z)=\exp (-2z^{\frac{1}{\mu }}). \end{aligned}$$
Since \(\int _1^\infty \exp (-2z^{\frac{1}{\mu }})\,\mathrm dz = \mu \int _1^\infty \mathrm e^{-2w}w^{\mu -1}\,\mathrm dw < \infty \) for every \(\mu >0\), \(F_\mu \) is integrable and thus, \(\varphi ^{\mathrm H}_\mu \) is compatible with \((r_\alpha )_{\alpha >0}\). \(\square \)
We have thus shown that we can apply Theorem 1 to the regularisation method which is induced by Eq. 3, that is, the regularisation method generated by the functions \((r_\alpha )_{\alpha >0}\) defined in Eq. 58, and the convergence rate functions \(\varphi ^{\mathrm H}_\mu \) or \(\varphi ^{\mathrm L}_\mu \) for arbitrary \(\mu >0\). This gives us optimal convergence rates under variational source conditions as defined in Eq. 33, for example.
However, to compare with the literature, see [9, Example 4.7], we formulate the result under the slightly stronger standard source condition, see Proposition 1.
Corollary 6
Let \(y\in {\mathcal {R}}(L)\) be given such that the corresponding minimum norm solution \(x^\dag \in {\mathcal {X}}\), fulfilling \(L x^\dag =y\) and \(\Vert x^\dag \Vert =\inf \{\left\| x \right\| \mid L x=y\}\), satisfies for some \(\mu >0\) the source condition
$$\begin{aligned} x^\dagger \in {\mathcal {R}}\big ((L^*L)^{\frac{\mu }{2}}\big ). \end{aligned}$$
(59)
Then, if \(\xi \) is the solution of the initial value problem in Eq. 3,
1.
there exists a constant \(C_1>0\) such that
$$\begin{aligned} \left\| \xi (t;y)-x^\dag \right\| ^2 \le C_1t^{-\mu }\text { for all }t>0; \end{aligned}$$
 
2.
there exists a constant \(C_2>0\) such that
$$\begin{aligned} \inf _{t>0}\left\| \xi (t;{\tilde{y}})-x^\dag \right\| ^2 \le C_2\left\| {\tilde{y}}-y \right\| ^{\frac{2\mu }{\mu +1}}\text { for all }{\tilde{y}}\in {\mathcal {Y}}; \end{aligned}$$
and
 
3.
there exists a constant \(C_3>0\) such that
$$\begin{aligned} \left\| L\xi (t;y)-y \right\| ^2 \le C_3t^{-\mu -1}\text { for all }t>0. \end{aligned}$$
 
Proof
We consider the regularisation method defined by the functions \((r_\alpha )_{\alpha >0}\) from Eq. 58. We have already seen in Lemma 12 and Lemma 15 that the function \(\varphi ^{\mathrm H}_\mu (\alpha ) = \alpha ^\mu \) is G-subhomogeneous in the sense of Eq. 34 with \(G(\gamma )=\gamma ^\mu \) and compatible with the regularisation method given by \((r_\alpha )_{\alpha >0}\).
1.
According to Proposition 1 and Theorem 1 with the convergence rate function \(\varphi =\varphi ^{\mathrm H}_\mu \), the source condition in Eq. 59 implies the existence of a constant \(C_d\) such that
$$\begin{aligned} d(\alpha ) \le C_d\varphi ^{\mathrm H}_\mu (\alpha ) = C_d\alpha ^\mu , \end{aligned}$$
where d is given by Eq. 13 with the regularised solution \(x_\alpha \) defined in Eq. 8, which fulfils according to Eqs. 58 and 56 that
$$\begin{aligned} x_\alpha ({\tilde{y}})=r_\alpha (L^*L)L^*{\tilde{y}}=\int _{(0,\left\| L \right\| ^2]}\frac{1-\mathrm e^{-\frac{\lambda }{\alpha }}}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} = \xi (\tfrac{1}{\alpha };{\tilde{y}}). \end{aligned}$$
(60)
Thus, by definition of d, we have that
$$\begin{aligned} \left\| \xi (t;y)-x^\dag \right\| ^2 = \left\| x_{\frac{1}{t}}(y)-x^\dag \right\| ^2 = d(\tfrac{1}{t}) \le \frac{C_d}{t^\mu }\text { for every }t>0. \end{aligned}$$
 
2.
According to Theorem 1, we also find a constant \(C_{{\tilde{d}}}\) such that
$$\begin{aligned} {\tilde{d}}(\delta ) \le C_{{\tilde{d}}}\varPhi [\varphi ^{\mathrm H}_\mu ](\delta ) = C_{{\tilde{d}}}\delta ^{\frac{2\mu }{\mu +1}}, \end{aligned}$$
where \(\varPhi \) denotes the noise-free to noisy transform defined in Definition 5 and \({\tilde{d}}\) is given by Eq. 14 with the regularised solution \(x_\alpha \) given by Eq. 60. Therefore, we have that
$$\begin{aligned} \inf _{t>0}\left\| \xi (t;{\tilde{y}})-x^\dag \right\| ^2 = \inf _{\alpha >0}\left\| \xi (\tfrac{1}{\alpha };{\tilde{y}})-x^\dag \right\| ^2\le {\tilde{d}}(\left\| {\tilde{y}}-y \right\| ) \le C_{{\tilde{d}}}\left\| {\tilde{y}}-y \right\| ^{\frac{2\mu }{\mu +1}} \end{aligned}$$
for every \({\tilde{y}}\in {\mathcal {Y}}\).
 
3.
Furthermore, Theorem 1 implies that there is a constant \(C_e>0\) such that \(e(\lambda )\le C_e\varphi ^{\mathrm H}_\mu (\lambda )\). In particular, we then have \(\lambda e(\lambda )\le C_e\varphi ^{\mathrm H}_{\mu +1}(\lambda )\). Since \(\varphi ^{\mathrm H}_{\mu +1}\) is by Lemma 15 compatible with \((r_\alpha )_{\alpha >0}\), we can apply Corollary 4 and find a constant \(C>0\) such that the function q, defined in Eq. 15 with the regularised solution \(x_\alpha \) as in Eq. 60, fulfils
$$\begin{aligned} q(\alpha ) \le C\varphi ^{\mathrm H}_{\mu +1}(\alpha )\text { for all }\alpha >0. \end{aligned}$$
Thus, by definition of q, we have
$$\begin{aligned} \left\| L\xi (t;y)-y \right\| ^2 = \left\| L x_{\frac{1}{t}}(y)-y \right\| ^2 = q(\tfrac{1}{t}) \le \frac{C}{t^{\mu +1}}\text { for all }t>0. \end{aligned}$$
 
\(\square \)
We emphasise that for Showalter's method we did not make use of the extended theory involving envelopes of regularisation methods (cf. Definition 2); the results of this section could therefore also have been obtained with the regularisation results from [3].

5 Heavy Ball Dynamics

The heavy ball method is Eq. 2 with \(N=2\) and \(a_1(t)=b\) for some \(b>0\), that is, Eq. 4.
According to Proposition 2, this corresponds to the family of initial value problems, for every \(\lambda > 0\),
$$\begin{aligned} \begin{aligned} \partial _{t t}\tilde{\rho }(t;\lambda ) + b\partial _t\tilde{\rho }(t;\lambda ) +\lambda \tilde{\rho }(t;\lambda )&=0 \text { for all } t \in \left( 0,\infty \right) ,\\ \partial _t\tilde{\rho }(0;\lambda )&=0, \\ \tilde{\rho }(0;\lambda )&=1. \end{aligned} \end{aligned}$$
(61)
Lemma 16
The solution of Eq. 61 is given by
$$\begin{aligned} \tilde{\rho }(t;\lambda ) = {\left\{ \begin{array}{ll} \mathrm e^{-\frac{b t}{2}}\left( \cosh \left( \beta _-(\lambda )\frac{b t}{2}\right) +\frac{1}{\beta _-(\lambda )}\sinh \left( \beta _-(\lambda )\frac{b t}{2}\right) \right) &{}\text {if}\;\lambda \in (0,\frac{b^2}{4}),\\ \mathrm e^{-\frac{b t}{2}}\left( \cos \left( \beta _+(\lambda )\frac{b t}{2}\right) +\frac{1}{\beta _+(\lambda )}\sin \left( \beta _+(\lambda )\frac{b t}{2}\right) \right) &{}\text {if}\;\lambda \in (\frac{b^2}{4},\infty ),\\ \mathrm e^{-\frac{b t}{2}}(1+\frac{b t}{2})&{}\text {if}\;\lambda =\frac{b^2}{4}, \end{array}\right. } \end{aligned}$$
(62)
where
$$\begin{aligned} \beta _-(\lambda )=\sqrt{1-\frac{4\lambda }{b^2}}\text { and }\beta _+(\lambda )=\sqrt{\frac{4\lambda }{b^2}-1}, \end{aligned}$$
(63)
see Fig. 2. In particular, the solution of Eq. 4 is given by
$$\begin{aligned} \xi (t;{\tilde{y}}) = \int _{(0,\Vert L\Vert ^2]}\frac{1-{\tilde{\rho }}(t;\lambda )}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}}, \end{aligned}$$
(64)
where \(A\mapsto {\mathbf {E}}_A\) denotes the spectral measure of \(L^*L\).
Proof
The characteristic equation of Eq. 61 is
$$\begin{aligned} z^2(\lambda )+b z(\lambda )+\lambda = 0 \end{aligned}$$
and has the solutions
$$\begin{aligned} z_1(\lambda )=-\frac{b}{2}-\sqrt{\frac{b^2}{4}-\lambda }\text { and } z_2(\lambda )=-\frac{b}{2}+\sqrt{\frac{b^2}{4}-\lambda }. \end{aligned}$$
Thus, for \(\lambda <\frac{b^2}{4}\), we have the solution
$$\begin{aligned} \tilde{\rho }(t;\lambda ) = \mathrm e^{-\frac{b t}{2}}\left( C_1(\lambda )\cosh \left( t\sqrt{\frac{b^2}{4}-\lambda }\right) +C_2(\lambda )\sinh \left( t\sqrt{\frac{b^2}{4}-\lambda }\right) \right) ; \end{aligned}$$
for \(\lambda >\frac{b^2}{4}\), we get the oscillating solution
$$\begin{aligned} \tilde{\rho }(t;\lambda ) = \mathrm e^{-\frac{b t}{2}}\left( C_1(\lambda )\cos \left( t\sqrt{\lambda -\frac{b^2}{4}}\right) +C_2(\lambda )\sin \left( t\sqrt{\lambda -\frac{b^2}{4}}\right) \right) ; \end{aligned}$$
and for \(\lambda =\frac{b^2}{4}\), we have
$$\begin{aligned} \tilde{\rho }(t;\lambda ) = \mathrm e^{-\frac{b t}{2}}(C_1(\lambda )+C_2(\lambda )t). \end{aligned}$$
Plugging in the initial condition \(\tilde{\rho }(0;\lambda )=1\), we find that \(C_1(\lambda )=1\) for all \(\lambda >0\), and the initial condition \(\partial _t\tilde{\rho }(0;\lambda )=0\) then implies
$$\begin{aligned} C_2(\lambda )\sqrt{\frac{b^2}{4}-\lambda }&=\frac{b}{2}\text { for }\lambda <\frac{b^2}{4}, \\ C_2(\lambda )\sqrt{\lambda -\frac{b^2}{4}}&=\frac{b}{2}\text { for }\lambda >\frac{b^2}{4},\text { and}\\ C_2(\tfrac{b^2}{4})&=\frac{b}{2}. \end{aligned}$$
Moreover, since \({\tilde{\rho }}\) is smooth and the unique solution of Eq. 61, the function \(\xi \) defined in Eq. 64 is by Proposition 2 the unique solution of Eq. 4. \(\square \)
To see that this solution gives rise to a regularisation method as introduced in Definition 1, we first verify that the function \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\), which corresponds to the error function \(\tilde{r}_\alpha \), is non-negative and monotonically decreasing for sufficiently small values of \(\lambda \), as required in Definition 1 item 2.
Lemma 17
The function \(\lambda \mapsto \tilde{\rho }(t;\lambda )\) defined by Eq. 62 is for every \(t\in (0,\infty )\) non-negative and monotonically decreasing on the interval \((0,\frac{b^2}{4}+\frac{\pi ^2}{4t^2})\).
Proof
We prove this separately for \(\lambda \in (0,\tfrac{b^2}{4})\) and for \(\lambda \in (\frac{b^2}{4},\frac{b^2}{4}+\frac{\pi ^2}{4t^2})\).
  • We remark that the function
    $$\begin{aligned} g_\tau :(0,\infty )\rightarrow \mathbb {R},\; g_\tau (\beta )=\cosh (\beta \tau )+\frac{\sinh (\beta \tau )}{\beta }, \end{aligned}$$
    is non-negative and fulfils for arbitrary \(\tau >0\) that
    $$\begin{aligned} g_\tau '(\beta )&= \tau \sinh (\beta \tau )+\frac{\tau \cosh (\beta \tau )}{\beta }-\frac{\sinh (\beta \tau )}{\beta ^2} \\&= \tau \sinh (\beta \tau )+\frac{\cosh (\beta \tau )}{\beta ^2}(\beta \tau -\tanh (\beta \tau )) \ge 0, \end{aligned}$$
    since \(\tanh (z)\le z\) for all \(z\ge 0\). Thus, writing the function \({\tilde{\rho }}\) for \(\lambda \in (0,\frac{b^2}{4})\) with the function \(\beta _-\) given by Eq. 63 in the form
    $$\begin{aligned} \tilde{\rho }(t;\lambda )=\mathrm e^{-\frac{b t}{2}}g_{\frac{b t}{2}}(\beta _-(\lambda )), \end{aligned}$$
    we find that
    $$\begin{aligned} \partial _\lambda \tilde{\rho }(t;\lambda ) = \mathrm e^{-\frac{b t}{2}}g_{\frac{b t}{2}}'(\beta _-(\lambda ))\beta _-'(\lambda ) \le 0, \end{aligned}$$
    since \(\beta _-'(\lambda )=-\frac{2}{b^2\beta _-(\lambda )}\le 0\). Therefore, the function \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\) is non-negative and monotonically decreasing on \((0,\frac{b^2}{4})\).
  • Similarly, we consider for \(\lambda \in (\frac{b^2}{4},\infty )\) the function
    $$\begin{aligned} G_\tau :(0,\infty )\rightarrow \mathbb {R},\; G_\tau (\beta )=\cos (\beta \tau )+\frac{\sin (\beta \tau )}{\beta }, \end{aligned}$$
    for arbitrary \(\tau >0\). Since \(\lim _{\beta \rightarrow 0}G_\tau (\beta )=1+\tau >0\) and since the smallest zero \(\beta _\tau \) of \(G_\tau \) is the smallest non-negative solution of the equation \(\tan (\beta \tau )=-\beta \), implying that \(\beta _\tau \tau \in (\frac{\pi }{2},\pi )\), we have that \(G_\tau (\beta )\ge 0\) for all \(\beta \in (0,\frac{\pi }{2\tau })\subset (0,\beta _\tau )\).
    Moreover, the derivative of \(G_\tau \) satisfies for every \(\beta \in (0,\tfrac{\pi }{2\tau })\) that
    $$\begin{aligned} G_\tau '(\beta )&= -\tau \sin (\beta \tau )+\frac{\tau \cos (\beta \tau )}{\beta }-\frac{\sin (\beta \tau )}{\beta ^2} \\&= -\frac{\cos (\beta \tau )}{\beta ^2}\left( (\beta ^2\tau +1)\tan (\beta \tau )-\beta \tau \right) \le 0, \end{aligned}$$
    since \(\tan (z)\ge z\) for every \(z\ge 0\). Therefore, we find for the function \({\tilde{\rho }}\) on the domain \((0,\infty )\times (\frac{b^2}{4},\infty )\), where it has the form
    $$\begin{aligned} \tilde{\rho }(t;\lambda )=\mathrm e^{-\frac{b t}{2}}G_{\frac{b t}{2}}(\beta _+(\lambda )) \end{aligned}$$
    with \(\beta _+\) given by Eq. 63, that
    $$\begin{aligned} {\tilde{\rho }}(t;\lambda )\ge 0\text { and }\partial _\lambda \tilde{\rho }(t;\lambda ) = \mathrm e^{-\frac{b t}{2}}G_{\frac{b t}{2}}'(\beta _+(\lambda ))\beta _+'(\lambda ) \le 0 \end{aligned}$$
    for \(\beta _+(\lambda )<\frac{\pi }{b t}\), that is, for \(\lambda <\frac{b^2}{4}+\frac{\pi ^2}{4t^2}\), since \(\beta _+'(\lambda )=\frac{2}{b^2\beta _+(\lambda )}\ge 0\).
Because \(\tilde{\rho }\) is continuous, this implies that \(\lambda \mapsto \tilde{\rho }(t;\lambda )\) is for every \(t\in (0,\infty )\) non-negative and monotonically decreasing on \((0,\frac{b^2}{4}+\frac{\pi ^2}{4t^2})\). \(\square \)
In a next step, we introduce the function \({\tilde{P}}(t;\cdot )\) as a correspondence to the upper bound \(\tilde{R}_\alpha \) and show that it fulfils the properties necessary for Definition 1 item 3.
Lemma 18
We define the function
$$\begin{aligned} {\tilde{P}}(t;\lambda ) = {\left\{ \begin{array}{ll} \mathrm e^{-\frac{b t}{2}}\left( \cosh \left( \beta _-(\lambda )\frac{b t}{2}\right) +\frac{1}{\beta _-(\lambda )}\sinh \left( \beta _-(\lambda )\frac{b t}{2}\right) \right) &{}\text {if}\;\lambda \in (0,\frac{b^2}{4}),\\ \mathrm e^{-\frac{b t}{2}}(1+\frac{b t}{2})&{}\text {if}\; \lambda \in [\frac{b^2}{4},\infty ), \end{array}\right. } \end{aligned}$$
(65)
where the function \(\beta _-\) shall be given by Eq. 63.
Then, \({\tilde{P}}\) is an upper bound for the absolute value of the function \({\tilde{\rho }}\) defined by Eq. 62: \({\tilde{P}}\ge |\tilde{\rho }|\).
Proof
Since \(\tilde{\rho }(t;\lambda )={\tilde{P}}(t;\lambda )\) for \(\lambda \le \frac{b^2}{4}\) for every \(t>0\), we only need to consider the case \(\lambda >\frac{b^2}{4}\). Using that \(\left| \cos (z)\right| \le 1\) and \(\left| \sin (z)\right| \le |z|\) for all \(z\in \mathbb {R}\), we find with \(\beta _+\) as in Eq. 63 for every \(\lambda >\frac{b^2}{4}\) and every \(t>0\) that
$$\begin{aligned} |\tilde{\rho }(t;\lambda )|&=\mathrm e^{-\frac{b t}{2}}\left| \cos \left( \beta _+(\lambda )\frac{b t}{2}\right) +\frac{1}{\beta _+(\lambda )}\sin \left( \beta _+(\lambda )\frac{b t}{2}\right) \right| \\&\le \mathrm e^{-\frac{b t}{2}}\left( 1+\frac{b t}{2}\right) = {\tilde{P}}(t;\lambda ). \end{aligned}$$
\(\square \)
Lemma 19
Let \({\tilde{P}}\) be given by Eq. 65. Then, \(\lambda \mapsto {\tilde{P}}(t;\lambda )\) is for every \(t>0\) monotonically decreasing and \(t\mapsto {\tilde{P}}(t;\lambda )\) is for every \(\lambda >0\) strictly decreasing.
Proof
For the derivative of \({\tilde{P}}\) with respect to t, we get
$$\begin{aligned} \partial _t {\tilde{P}}(t;\lambda ) = {\left\{ \begin{array}{ll} \frac{b}{2}\mathrm e^{-\frac{b t}{2}}\left( \beta _-(\lambda )-\frac{1}{\beta _-(\lambda )}\right) \sinh \left( \beta _-(\lambda )\frac{b t}{2}\right) &{}\text {if }\lambda \in (0,\frac{b^2}{4}),\\ -\frac{b^2t}{4}\mathrm e^{-\frac{b t}{2}} &{}\text {if }\lambda \in [\frac{b^2}{4},\infty ), \end{array}\right. } \end{aligned}$$
with \(\beta _-\) defined in Eq. 63; and since \(\beta _-(\lambda )\in (0,1)\) for every \(\lambda \in (0,\frac{b^2}{4})\), we thus have \(\partial _t {\tilde{P}}(t;\lambda )<0\) for every \(t>0\) and every \(\lambda >0\).
Since \({\tilde{P}}(t;\lambda )={\tilde{\rho }}(t;\lambda )\) for \(\lambda \in (0,\frac{b^2}{4}]\), where \({\tilde{\rho }}\) denotes the solution of Eq. 61, given by Eq. 62, we already know from Lemma 17 that \(\lambda \mapsto {\tilde{P}}(t;\lambda )\) is monotonically decreasing on \((0,\frac{b^2}{4}]\). And since \(\lambda \mapsto {\tilde{P}}(t;\lambda )\) is constant on \([\frac{b^2}{4},\infty )\), it is monotonically decreasing on \((0,\infty )\). \(\square \)
To verify later the compatibility of the convergence rate functions \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) introduced in Eqs. 16 and 17, we derive here an appropriate upper bound for \({\tilde{P}}\).
Lemma 20
We have for every \(\varLambda >0\) that the function \({\tilde{P}}\) defined in Eq. 65 can be bounded from above by
$$\begin{aligned} {\tilde{P}}(t;\lambda ) \le \varPsi _\varLambda (\lambda t)\text { for all }t>0,\;\lambda \in (0,\varLambda ], \end{aligned}$$
where
$$\begin{aligned} \varPsi _\varLambda (z) = \max \left\{ 2\mathrm e^{-\frac{z}{b}},\mathrm e^{-\frac{b z}{2\varLambda }}\left( 1+\frac{b z}{2\varLambda }\right) \right\} . \end{aligned}$$
(66)
Proof
We consider the two cases \(\lambda \in (0,\frac{b^2}{4})\) and \(\lambda \in [\frac{b^2}{4},\varLambda ]\) separately.
  • For \(\lambda \in (0,\frac{b^2}{4})\), we use the two inequalities \(\cosh (z)\le \mathrm e^z\) and \(\frac{\sinh (z)}{z}\le \mathrm e^z\) for all \(z\ge 0\), where the latter follows from the fact that \(f(z)=2z\mathrm e^z(\mathrm e^z-\frac{\sinh (z)}{z})=(2z-1)\mathrm e^{2z}+1\) is because of \(f'(z)=4z\mathrm e^{2z}\ge 0\) monotonically increasing on \([0,\infty )\) and thus fulfils \(f(z)\ge f(0)=0\) for every \(z\ge 0\). With this, we find from Eq. 65 that
    $$\begin{aligned} {\tilde{P}}(t;\lambda ) \le 2\exp \left( \left( \sqrt{1-\frac{4\lambda }{b^2}}-1\right) \frac{b t}{2}\right) . \end{aligned}$$
    Since \(\sqrt{1-z}\le 1-\frac{z}{2}\) for all \(z\in (0,1)\), we then obtain
    $$\begin{aligned} {\tilde{P}}(t;\lambda ) \le 2\mathrm e^{-\frac{\lambda t}{b}}\text { for every }t>0,\;\lambda \in (0,\tfrac{b^2}{4}). \end{aligned}$$
  • For \(\lambda \in [\frac{b^2}{4},\varLambda ]\), we use that \(t\mapsto {\tilde{P}}(t;\lambda )\) is according to Lemma 19 for every \(\lambda \in (0,\infty )\) monotonically decreasing and obtain from Eq. 65 that
    $$\begin{aligned} {\tilde{P}}(t;\lambda ) \le {\tilde{P}}\left( \frac{\lambda t}{\varLambda };\lambda \right) = \mathrm e^{-\frac{b\lambda t}{2\varLambda }}\left( 1+\frac{b\lambda t}{2\varLambda }\right) \text { for every }t>0. \end{aligned}$$
\(\square \)
Next, we give an upper bound for the function \(\rho \), \(\rho (t;\lambda ):=\frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda ))\), which allows us to verify the property in Definition 1 item 1 for the corresponding generator \((r_\alpha )_{\alpha >0}\) of the regularisation method.
Lemma 21
Let \({\tilde{\rho }}\) be given by Eq. 62. Then, there exists a constant \(\sigma _1\in (0,1)\) such that
$$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda )) \le \sigma _1{\sqrt{\frac{2t}{b\lambda }}}\text { for all }t>0,\;\lambda >0. \end{aligned}$$
Proof
We consider the cases \(\lambda \in (0,\frac{b^2}{4})\) and \(\lambda \in (\frac{b^2}{4},\infty )\) separately. The estimate for \(\lambda =\frac{b^2}{4}\) then follows directly from the continuity of the function \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\) for every \(t\in [0,\infty )\).
  • For \(\lambda \in (0,\frac{b^2}{4})\), we use that \(\cosh (z)=\mathrm e^z-\sinh (z)\) for every \(z\in \mathbb {R}\) and obtain with the function \(\beta _-\) from Eq. 63 that
    $$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda )) = \frac{1}{\lambda }\left( 1-\mathrm e^{-(1-\beta _-(\lambda ))\frac{b t}{2}}-\left( \frac{1}{\beta _-(\lambda )}-1\right) \mathrm e^{-\frac{b t}{2}}\sinh (\beta _-(\lambda )\tfrac{b t}{2})\right) . \end{aligned}$$
    Since \(\beta _-(\lambda )\in (0,1)\), we can therefore estimate this with the help of Lemma 14 by
    $$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda )) \le \frac{1}{\lambda }\left( 1-\mathrm e^{-(1-\beta _-(\lambda ))\frac{b t}{2}}\right) \le \frac{\sigma _0}{\lambda }\sqrt{1-\beta _-(\lambda )}\sqrt{\frac{b t}{2}}, \end{aligned}$$
    where \(\sigma _0\in (0,1)\) is the constant found in Lemma 14. Since \(\lambda =\frac{b^2}{4}(1-\beta _-^2(\lambda ))\), this means
    $$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda )) \le \frac{\sigma _0}{\sqrt{1+\beta _-(\lambda )}}\sqrt{\frac{2t}{b\lambda }} \le \sigma _0\sqrt{\frac{2t}{b\lambda }}. \end{aligned}$$
  • For \(\lambda \in (\frac{b^2}{4},\infty )\), we remark that
    $$\begin{aligned} \partial _t{\tilde{\rho }}(t;\lambda ) = -\frac{b}{2}\left( \beta _+(\lambda )+\frac{1}{\beta _+(\lambda )}\right) \mathrm e^{-\frac{b t}{2}}\sin \left( \beta _+(\lambda )\frac{b t}{2}\right) , \end{aligned}$$
    where the function \(\beta _+\) is given by Eq. 63. Since the function \([0,\infty )\rightarrow \mathbb {R}\), \(z\mapsto (\mathrm e^{-z}\sin (a z))^2\), \(a>0\), attains its maximal value at its smallest non-negative critical point \(z=\frac{1}{a}\arctan (a)\), we have that
    $$\begin{aligned} \left| \partial _t{\tilde{\rho }}(t;\lambda )\right| \le \frac{b}{2}\left( \beta _+(\lambda )+\frac{1}{\beta _+(\lambda )}\right) \mathrm e^{-\frac{\arctan (\beta _+(\lambda ))}{\beta _+(\lambda )}}\left| \sin (\arctan (\beta _+(\lambda )))\right| . \end{aligned}$$
    Using that \(\sin (z)=\frac{\tan (z)}{\sqrt{1+\tan ^2(z)}}\) for all \(z\in (-\frac{\pi }{2},\frac{\pi }{2})\), this reads
    $$\begin{aligned} \left| \partial _t{\tilde{\rho }}(t;\lambda )\right| \le \frac{b}{2}\sqrt{1+\beta _+^2(\lambda )}\,\mathrm e^{-\frac{\arctan (\beta _+(\lambda ))}{\beta _+(\lambda )}}. \end{aligned}$$
    (67)
    We further realise that the function \(f:(0,\infty )\rightarrow \mathbb {R}\), \(f(z)=\frac{1}{\sqrt{1+z^2}}\mathrm e^{-\frac{\arctan (z)}{z}}\), is monotonically decreasing because of
    $$\begin{aligned} f'(z)&= -\frac{1}{\sqrt{1+z^2}}\mathrm e^{-\frac{\arctan (z)}{z}}\left( \frac{z}{1+z^2}+\frac{1}{z(1+z^2)}-\frac{\arctan (z)}{z^2}\right) \\&= -\frac{1}{z^2\sqrt{1+z^2}}\mathrm e^{-\frac{\arctan (z)}{z}}(z-\arctan (z)) \le 0. \end{aligned}$$
    Thus, \(f(z)\le \lim _{z\rightarrow 0}f(z)=\mathrm e^{-1}\) and Eq. 67 therefore implies that
    $$\begin{aligned} \left| \partial _t{\tilde{\rho }}(t;\lambda )\right| \le \frac{b}{2\mathrm e}(1+\beta _+^2(\lambda )). \end{aligned}$$
    With \(\frac{4}{b^2}\lambda =(1+\beta _+^2(\lambda ))\), the mean value theorem therefore gives us
    $$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda )) = \frac{1}{\lambda }({\tilde{\rho }}(0;\lambda )-{\tilde{\rho }}(t;\lambda )) \le \frac{2t}{\mathrm eb}\text { for all }t>0. \end{aligned}$$
    Since we know from Lemmas 18 and 19 that we can estimate \({\tilde{\rho }}\) with the function \({\tilde{P}}\) from Eq. 65 by
    $$\begin{aligned} |{\tilde{\rho }}(t;\lambda )|\le {\tilde{P}}(t;\lambda )\le {\tilde{P}}(0;\lambda )=1, \end{aligned}$$
    (68)
    we find by using the estimate \(\min \{a,b\}\le \min \{\sqrt{a},\sqrt{b}\}\max \{\sqrt{a},\sqrt{b}\}=\sqrt{ab}\) for all \(a,b>0\) that
    $$\begin{aligned} \frac{1}{\lambda }(1-{\tilde{\rho }}(t;\lambda ))\le \min \left\{ \frac{2}{\lambda },\frac{2t}{\mathrm eb}\right\} \le \sqrt{\frac{2}{\mathrm e}}\sqrt{\frac{2t}{b\lambda }} \text { for all }t>0. \end{aligned}$$
\(\square \)
Finally, we can put together all the estimates to obtain a regularisation method corresponding to the solution \(\xi \) of the heavy ball equation, Eq. 4.
Proposition 4
Let \(\tilde{\rho }\) be the solution of Eq. 61. Then, the functions \((r_\alpha )_{\alpha >0}\),
$$\begin{aligned} r_\alpha (\lambda ) :=\frac{1}{\lambda }(1-\tilde{\rho }(\tfrac{b}{2\alpha };\lambda )), \end{aligned}$$
(69)
define a regularisation method in the sense of Definition 1.
Proof
We verify the four conditions in Definition 1.
1.
We have already seen in Eq. 68 that \(|\tilde{\rho }(t;\lambda )|\le 1\) and thus \(r_\alpha (\lambda )\le \frac{2}{\lambda }\) for every \(\lambda >0\).
Moreover, Lemma 21 implies that there exists a parameter \(\sigma _1\in (0,1)\) such that
$$\begin{aligned} r_\alpha (\lambda ) = \frac{1}{\lambda }(1-\tilde{\rho }(\tfrac{b}{2\alpha };\lambda )) \le \frac{\sigma _1}{\sqrt{\alpha \lambda }}, \end{aligned}$$
which is Eq. 6.
 
2.
The corresponding error function
$$\begin{aligned} \tilde{r}_\alpha :(0,\infty )\rightarrow [-1,1],\;\tilde{r}_\alpha (\lambda )={\tilde{\rho }}(\tfrac{b}{2\alpha };\lambda ), \end{aligned}$$
is according to Lemma 17 non-negative and monotonically decreasing on the interval \((0,\frac{b^2}{4}+\frac{\pi ^2\alpha ^2}{b^2})\). Using that \(x^2+y^2\ge 2xy\) for all \(x,y\in \mathbb {R}\), we find that
$$\begin{aligned} \frac{b^2}{4}+\frac{\pi ^2\alpha ^2}{b^2} \ge 2\sqrt{\frac{\pi ^2\alpha ^2}{b^2}\,\frac{b^2}{4}} = \pi \alpha > \alpha , \end{aligned}$$
which implies that \(\tilde{r}_\alpha \) is for every \(\alpha >0\) non-negative and monotonically decreasing on \((0,\alpha )\).
 
3.
Choosing
$$\begin{aligned} \tilde{R}_\alpha (\lambda ) :={\tilde{P}}(\tfrac{b}{2\alpha };\lambda ) \end{aligned}$$
(70)
with the function \({\tilde{P}}\) from Eq. 65, we know from Lemma 18 that \(\tilde{R}_\alpha (\lambda )\ge \left| \tilde{r}_\alpha (\lambda )\right| \) holds for all \(\lambda >0\) and \(\alpha >0\). Moreover, Lemma 19 tells us that \(\tilde{R}_\alpha \) is for every \(\alpha >0\) monotonically decreasing and that \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is for every \(\lambda >0\) monotonically increasing.
 
4.
To estimate the values \(\tilde{R}_\alpha (\alpha )\) for \(\alpha \) in a neighbourhood of zero, we calculate the limit
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\tilde{R}_\alpha (\alpha )&= \lim _{\alpha \rightarrow 0}{\tilde{P}}(\tfrac{b}{2\alpha };\alpha ) \\&= \lim _{\alpha \rightarrow 0}\mathrm e^{-\frac{b^2}{4\alpha }}\left( \cosh \left( \beta _-(\alpha )\frac{b^2}{4\alpha }\right) +\frac{1}{\beta _-(\alpha )}\sinh \left( \beta _-(\alpha )\frac{b^2}{4\alpha }\right) \right) , \end{aligned}$$
where \(\beta _-\) is given by Eq. 63. Setting \({\tilde{\alpha }}=\frac{4\alpha }{b^2}\) and using that then \(\beta _-(\alpha )=\sqrt{1-\frac{4\alpha }{b^2}}=\sqrt{1-{\tilde{\alpha }}}\), we get that
$$\begin{aligned} \lim _{\alpha \rightarrow 0}\tilde{R}_\alpha (\alpha )&= \lim _{{\tilde{\alpha }}\rightarrow 0}\mathrm e^{-\frac{1}{{\tilde{\alpha }}}}\left( \cosh \left( \frac{\sqrt{1-{\tilde{\alpha }}}}{{\tilde{\alpha }}}\right) +\frac{1}{\sqrt{1-{\tilde{\alpha }}}}\sinh \left( \frac{\sqrt{1-{\tilde{\alpha }}}}{{\tilde{\alpha }}}\right) \right) \\&= \lim _{{\tilde{\alpha }}\rightarrow 0}\frac{1}{2}\left( 1+\frac{1}{\sqrt{1-{\tilde{\alpha }}}}\right) \mathrm e^{\frac{\sqrt{1-{\tilde{\alpha }}}-1}{{\tilde{\alpha }}}} = \mathrm e^{-\frac{1}{2}} < 1. \end{aligned}$$
Thus, there exists for an arbitrarily chosen \({\tilde{\sigma }}_0\in (\mathrm e^{-\frac{1}{2}},1)\) a parameter \({\bar{\alpha }}_0>0\) such that \(\tilde{R}_\alpha (\alpha )\le {\tilde{\sigma }}_0\) for every \(\alpha \in (0,{\bar{\alpha }}_0)\).
Using further that \(t\mapsto {\tilde{P}}(t;\lambda )\) is strictly decreasing, see Lemma 19, we have for every \(\alpha >0\) that
$$\begin{aligned} \tilde{R}_\alpha (\alpha )={\tilde{P}}(\tfrac{b}{2\alpha };\alpha ) < {\tilde{P}}(0;\alpha )=1. \end{aligned}$$
Thus, since \(\alpha \mapsto \tilde{R}_\alpha (\alpha )\) is by definition of \({\tilde{P}}\) in Eq. 65 continuous on \((0,\infty )\), we have for every \({\bar{\alpha }}>0\) that
$$\begin{aligned} \sup _{\alpha \in (0,{\bar{\alpha }}]}\tilde{R}_\alpha (\alpha )&= \max \bigg \{\sup _{\alpha \in (0,{\bar{\alpha }}_0)}\tilde{R}_\alpha (\alpha ),\sup _{\alpha \in [{\bar{\alpha }}_0,{\bar{\alpha }}]}\tilde{R}_\alpha (\alpha )\bigg \} \\&\le \max \bigg \{{\tilde{\sigma }}_0,\max _{\alpha \in [{\bar{\alpha }}_0,{\bar{\alpha }}]}\tilde{R}_\alpha (\alpha )\bigg \} < 1, \end{aligned}$$
which shows Definition 1 item 4.
 
\(\square \)
To be able to apply Theorem 1 for the regularisation method generated by \((r_\alpha )_{\alpha >0}\) from Eq. 69 to the common convergence rates \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \), it remains to show that they are compatible with \((r_\alpha )_{\alpha >0}\).
Lemma 22
The functions \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) defined in Example 2 are for all \(\mu >0\) compatible with the regularisation method \((r_\alpha )_{\alpha >0}\) defined by Eq. 69 in the sense of Definition 4.
Proof
We know from Corollary 2 that we only need to prove the statement for \(\varphi ^{\mathrm H}_\mu \) for every \(\mu >0\). The function \(\tilde{R}_\alpha \) defined in Eq. 70 fulfils according to Lemma 20 for arbitrary \(\varLambda >0\) that
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) = {\tilde{P}}^2\left( \frac{b}{2\alpha };\lambda \right) \le \varPsi _\varLambda ^2\left( \frac{b\lambda }{2\alpha }\right) \le \varPsi _\varLambda ^2\left( \frac{b}{2}\left( \frac{\varphi ^{\mathrm H}_\mu (\lambda )}{\varphi ^{\mathrm H}_\mu (\alpha )}\right) ^{\frac{1}{\mu }}\right) \end{aligned}$$
for every \(\alpha >0\) and \(\lambda \in (0,\varLambda ]\), where \(\varPsi _\varLambda \) is given by Eq. 66. Since \(z\mapsto \varPsi _\varLambda ^2(\frac{b}{2}z^{\frac{1}{\mu }})\) is for every \(\mu >0\) integrable, \(\varphi ^{\mathrm H}_\mu \) is compatible with \((r_\alpha )_{\alpha >0}\). \(\square \)
We can therefore apply Theorem 1 to the regularisation method induced by Eq. 4, which is the regularisation method generated by the functions \((r_\alpha )_{\alpha >0}\) defined in Eq. 69, and the convergence rate functions \(\varphi ^{\mathrm H}_\mu \) or \(\varphi ^{\mathrm L}_\mu \) for arbitrary \(\mu >0\). Thus, although the functions \(t\mapsto {\tilde{\rho }}(t;\lambda )\) and \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\) are not monotonic, we obtain optimal convergence rates of the regularisation method under variational source conditions such as in Eq. 33.
If we formulate it with the stronger standard source condition, see Proposition 1, we can reproduce a result similar to [33, Theorem 5.1].
Corollary 7
Let \(y\in {\mathcal {R}}(L)\) be given such that the corresponding minimum norm solution \(x^\dag \in {\mathcal {X}}\), fulfilling \(L x^\dag =y\) and \(\Vert x^\dag \Vert =\inf \{\left\| x \right\| \mid L x=y\}\), satisfies for some \(\mu >0\) the source condition
$$\begin{aligned} x^\dagger \in {\mathcal {R}}\big ((L^*L)^{\frac{\mu }{2}}\big ). \end{aligned}$$
Then, if \(\xi \) is the solution of the initial value problem in Eq. 4,
1.
there exists a constant \(C_1>0\) such that
$$\begin{aligned} \left\| \xi (t;y)-x^\dag \right\| ^2 \le C_1t^{-\mu }\text { for all }t>0; \end{aligned}$$
 
2.
there exists a constant \(C_2>0\) such that
$$\begin{aligned} \inf _{t>0}\left\| \xi (t;{\tilde{y}})-x^\dag \right\| ^2 \le C_2\left\| {\tilde{y}}-y \right\| ^{\frac{2\mu }{\mu +1}}\text { for all }{\tilde{y}}\in {\mathcal {Y}}; \end{aligned}$$
and
 
3.
there exists a constant \(C_3>0\) such that
$$\begin{aligned} \left\| L\xi (t;y)-y \right\| ^2 \le C_3t^{-\mu -1}\text { for all }t>0. \end{aligned}$$
 
Proof
The proof follows exactly the lines of the proof of Corollary 6, where the compatibility of \(\varphi ^{\mathrm H}_\mu \) is shown in Lemma 22 and we have here the slightly different scaling
$$\begin{aligned} x_\alpha ({\tilde{y}})=r_\alpha (L^*L)L^*{\tilde{y}}=\int _{(0,\left\| L \right\| ^2]}\frac{1-{\tilde{\rho }}(\tfrac{b}{2\alpha };\lambda )}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} = \xi (\tfrac{b}{2\alpha };{\tilde{y}}) \end{aligned}$$
between the regularised solution \(x_\alpha \), defined in Eq. 8 with the regularisation method \((r_\alpha )_{\alpha >0}\) from Eq. 69, and the solutions \(\xi \) of Eq. 4 and \({\tilde{\rho }}\) of Eq. 61. This rescaling, however, does not change the order of the convergence rates. \(\square \)

6 The Vanishing Viscosity Flow

We consider now the dynamical method Eq. 2 for \(N=2\) with the variable coefficient \(a_1(t)=\frac{b}{t}\) for some parameter \(b > 0\), that is, Eq. 5. According to Proposition 2, the solution of Eq. 5 is defined via the spectral integral in Eq. 49 of \(\rho (t;\lambda ) = \frac{1-\tilde{\rho }(t;\lambda )}{\lambda }\), where \(\tilde{\rho }\) solves for every \(\lambda \in (0,\infty )\) the initial value problem
$$\begin{aligned} \begin{aligned} \partial _{t t}{\tilde{\rho }}(t;\lambda )+\frac{b}{t}\partial _t{\tilde{\rho }}(t;\lambda )+\lambda {\tilde{\rho }}(t;\lambda )&=0 \text { for all } t \in \left( 0,\infty \right) ,\\ \partial _t{\tilde{\rho }}(0;\lambda )&=0, \\ {\tilde{\rho }}(0;\lambda )&=1. \end{aligned} \end{aligned}$$
(71)
As already noted in [29, Section 3.2], we obtain a closed form in terms of Bessel functions for the solution of Eq. 71.
Lemma 23
Let \(b, \lambda > 0\). Then, Eq. 71 has the unique solution
$$\begin{aligned} {\tilde{\rho }}(t;\lambda ) = u(t\sqrt{\lambda })\text { with }u(\tau )=\left( \frac{2}{\tau }\right) ^{\frac{1}{2}(b-1)}\varGamma (\tfrac{1}{2}(b+1))J_{\frac{1}{2}(b-1)}(\tau ), \end{aligned}$$
(72)
where \(\varGamma \) is the gamma function and \(J_\nu \) denotes the Bessel function of first kind of order \(\nu \in \mathbb {R}\). See Fig. 3 for a sketch of the graph of the function u.
Proof
We rescale Eq. 71 by switching to the function
$$\begin{aligned} v:[0,\infty )\times (0,\Vert L\Vert ^2]\rightarrow \mathbb {R},\; v(\tau ;\lambda ) = \tau ^\kappa {\tilde{\rho }}(\sigma _\lambda \tau ;\lambda ) \end{aligned}$$
(73)
with some parameters \(\sigma _\lambda \in (0,\infty )\) and \(\kappa \in \mathbb {R}\). The function v thus has the derivatives
$$\begin{aligned} \partial _\tau v(\tau ;\lambda ) = \sigma _\lambda \tau ^\kappa \partial _t{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda )+\kappa \tau ^{\kappa -1}{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda ) \end{aligned}$$
(74)
and
$$\begin{aligned} \partial _{\tau \tau }v(\tau ;\lambda ) = \sigma _\lambda ^2\tau ^\kappa \partial _{t t}{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda )+2\sigma _\lambda \kappa \tau ^{\kappa -1}\partial _t{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda )+\kappa (\kappa -1)\tau ^{\kappa -2}{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda ). \end{aligned}$$
We use Eq. 71 to replace the second derivative of \({\tilde{\rho }}\) and obtain
$$\begin{aligned} \partial _{\tau \tau }v(\tau ;\lambda ) = \sigma _\lambda (2\kappa -b)\tau ^{\kappa -1}\partial _t{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda )+(\kappa (\kappa -1)-\lambda \sigma _\lambda ^2\tau ^2)\tau ^{\kappa -2}{\tilde{\rho }}(\sigma _\lambda \tau ;\lambda ), \end{aligned}$$
which, after writing \(\partial _t{\tilde{\rho }}\) and \({\tilde{\rho }}\) via Eqs. 74 and 73 in terms of the function v, becomes the differential equation
$$\begin{aligned} \tau ^2\partial _{\tau \tau }v(\tau ;\lambda )+(b-2\kappa )\tau \partial _\tau v(\tau ;\lambda )+(\lambda \sigma _\lambda ^2\tau ^2-\kappa (\kappa -1)-\kappa (b-2\kappa ))v(\tau ;\lambda ) = 0 \end{aligned}$$
for the function v. Choosing now \(\kappa =\frac{1}{2}(b-1)\), so that \(b-2\kappa =1\), and \(\sigma _\lambda =\frac{1}{\sqrt{\lambda }}\), we end up with Bessel’s differential equation
$$\begin{aligned} \tau ^2\partial _{\tau \tau }v(\tau ;\lambda )+\tau \partial _\tau v(\tau ;\lambda )+(\tau ^2-\kappa ^2)v(\tau ;\lambda ) = 0, \end{aligned}$$
for which every solution can be written as
$$\begin{aligned} v(\tau ;\lambda ) = {\left\{ \begin{array}{ll} C_{1,\kappa }J_{|\kappa |}(\tau )+C_{2,\kappa }Y_{|\kappa |}(\tau ), &{}\kappa \in \mathbb {Z},\\ C_{1,\kappa }J_{|\kappa |}(\tau )+C_{2,\kappa }J_{-|\kappa |}(\tau ), &{}\kappa \in \mathbb {R}\setminus \mathbb {Z}, \end{array}\right. } \end{aligned}$$
for some constants \(C_{1,\kappa },C_{2,\kappa }\in \mathbb {R}\), where \(J_\nu \) and \(Y_\nu \) denote the Bessel functions of first and second kind of order \(\nu \in \mathbb {R}\), respectively; see, for example, [1, Chapter 9.1].
We can therefore write the solution \({\tilde{\rho }}\) as
$$\begin{aligned} {\tilde{\rho }}(t;\lambda ) = {\left\{ \begin{array}{ll} C_{1,\kappa }(t\sqrt{\lambda })^{-\kappa }J_{|\kappa |}(t\sqrt{\lambda })+C_{2,\kappa }(t\sqrt{\lambda })^{-\kappa }Y_{|\kappa |}(t\sqrt{\lambda }), &{}\kappa \in \mathbb {Z},\\ C_{1,\kappa }(t\sqrt{\lambda })^{-\kappa }J_{|\kappa |}(t\sqrt{\lambda })+C_{2,\kappa }(t\sqrt{\lambda })^{-\kappa }J_{-|\kappa |}(t\sqrt{\lambda }), &{}\kappa \in \mathbb {R}\setminus \mathbb {Z}. \end{array}\right. } \end{aligned}$$
(75)
To determine the constants \(C_{1,\kappa }\) and \(C_{2,\kappa }\) from the initial conditions, we remark that the Bessel functions have for all \(\kappa \in \mathbb {R}\setminus (-\mathbb {N})\) and all \(n\in \mathbb {N}\) asymptotically for \(\tau \rightarrow 0\) the behaviour
$$\begin{aligned} \begin{aligned}&\tau ^{-\kappa }J_{\kappa }(\tau ) = \frac{1}{2^{\kappa }\varGamma (\kappa +1)}+{\mathcal {O}}(\tau ^2), \\&\lim _{\tau \rightarrow 0}\tau ^n Y_n(\tau ) = -\frac{2^n(n-1)!}{\pi },\text { and } \lim _{\tau \rightarrow 0}\frac{Y_0(\tau )}{\log (\tau )} = \frac{2}{\pi }, \end{aligned} \end{aligned}$$
(76)
see, for example, [1, Formulae 9.1.10 and 9.1.11].
We consider the cases \(\kappa \ge 0\) and \(\kappa \in (-\frac{1}{2},0)\) separately.
  • In particular, the relations in Eq. 76 imply that, for the last terms in Eq. 75, we have with \(\tau =t\sqrt{\lambda }\) asymptotically for \(\tau \rightarrow 0\)
    • for \(\kappa =0\):
      $$\begin{aligned} C_{2,0}Y_0(\tau ) = \frac{2}{\pi }C_{2,0}{\mathcal {O}}(\log (\tau )) \end{aligned}$$
      because of the third relation in Eq. 76;
    • for \(\kappa \in \mathbb {N}\):
      $$\begin{aligned} C_{2,\kappa }\tau ^{-\kappa } Y_\kappa (\tau )= C_{2,\kappa }\tau ^{-2\kappa } (\tau ^\kappa Y_\kappa (\tau )) = C_{2,\kappa }\left( -\frac{2^\kappa (\kappa -1)!}{\pi }+o(1)\right) \tau ^{-2\kappa } \end{aligned}$$
      because of the second relation in Eq. 76; and
    • for \(\kappa \in (0,\infty )\setminus \mathbb {N}\):
      $$\begin{aligned} C_{2,\kappa }\tau ^{-\kappa }J_{-\kappa }(\tau )=C_{2,\kappa }\tau ^{-2\kappa } (\tau ^\kappa J_{-\kappa }(\tau ))=C_{2,\kappa }\left( \frac{2^\kappa }{\varGamma (1-\kappa )}+{\mathcal {O}}(\tau ^2)\right) \tau ^{-2\kappa } \end{aligned}$$
      because of the first relation in Eq. 76.
    Thus, the last terms in Eq. 75 diverge for every \(\kappa \ge 0\) as \(t\rightarrow 0\).
    Since the first terms in Eq. 75 converge according to the first relation in Eq. 76 for \(t\rightarrow 0\), the initial condition \({\tilde{\rho }}(0;\lambda )=1\) can only be fulfilled if the coefficients \(C_{2,\kappa }\), \(\kappa \ge 0\), in front of the singular terms are all zero so that we have
    $$\begin{aligned} {\tilde{\rho }}(t;\lambda ) = C_{1,\kappa }(t\sqrt{\lambda })^{-\kappa }J_\kappa (t\sqrt{\lambda })\text { for all }\kappa \ge 0. \end{aligned}$$
    Furthermore, the initial condition \({\tilde{\rho }}(0;\lambda )=1\) implies according to the first relation in Eq. 76 that
    $$\begin{aligned} C_{1,\kappa } = 2^\kappa \varGamma (\kappa +1)\text { for all }\kappa \ge 0, \end{aligned}$$
    which gives the representation of Eq. 72 for the solution \({\tilde{\rho }}\).
    It remains to check that also the initial condition \(\partial _t{\tilde{\rho }}(0;\lambda )=0\) is for all \(\kappa \ge 0\) fulfilled, which again follows directly from the first relation in Eq. 76:
    $$\begin{aligned} \partial _t{\tilde{\rho }}(0;\lambda ) = \lim _{t\rightarrow 0}\frac{1}{t}\left( 2^\kappa \varGamma (\kappa +1)(t\sqrt{\lambda })^{-\kappa }J_\kappa (t\sqrt{\lambda })-1\right) = 0\text { for all }\kappa \ge 0. \end{aligned}$$
  • For \(\kappa \in (-\frac{1}{2},0)\), we have that the first term in \({\tilde{\rho }}(t;\lambda )\) converges for \(t\rightarrow 0\) to 0 because of
    $$\begin{aligned} C_{1,\kappa }(t\sqrt{\lambda })^{|\kappa |}J_{|\kappa |}(t\sqrt{\lambda }) = C_{1,\kappa }(t\sqrt{\lambda })^{2|\kappa |}\left( \frac{1}{2^{|\kappa |}\varGamma (|\kappa |+1)}+{\mathcal {O}}(t^2)\right) , \end{aligned}$$
    which follows from the first relation of Eq. 76. Therefore, the initial condition \({\tilde{\rho }}(0;\lambda )=1\) requires that
    $$\begin{aligned} 1 = {\tilde{\rho }}(0;\lambda ) = C_{2,\kappa }\lim _{t\rightarrow 0}(t\sqrt{\lambda })^{|\kappa |}J_{-|\kappa |}(t\sqrt{\lambda })\text { for all }\kappa \in (-\tfrac{1}{2},0), \end{aligned}$$
    from which we get with the first property in Eq. 76 that
    $$\begin{aligned} C_{2,\kappa } = 2^\kappa \varGamma (\kappa +1). \end{aligned}$$
    To determine the coefficient \(C_{1,\kappa }\), we remark that the first identity in Eq. 76 then gives us for \(t\rightarrow 0\) the asymptotic behaviour
    $$\begin{aligned} {\tilde{\rho }}(t;\lambda ) = 1+C_{1,\kappa }\frac{(t\sqrt{\lambda })^{2|\kappa |}}{2^{|\kappa |}\varGamma (|\kappa |+1)}+{\mathcal {O}}(t^2). \end{aligned}$$
    Therefore, we have for the first derivative at \(t=0\) the expression
    $$\begin{aligned} \partial _t{\tilde{\rho }}(0;\lambda ) = \lim _{t\rightarrow 0}C_{1,\kappa }\frac{2|\kappa |\lambda ^{|\kappa |}}{2^{|\kappa |}\varGamma (|\kappa |+1)}\,\frac{1}{t^{1-2|\kappa |}}. \end{aligned}$$
    To satisfy the initial condition \(\partial _t{\tilde{\rho }}(0;\lambda )=0\), we thus have to choose \(C_{1,\kappa }=0\) for \(\kappa \in (-\frac{1}{2},0)\), which leaves us again with Eq. 72.
\(\square \)
Corollary 8
The unique solution \(\xi :[0,\infty )\times {\mathcal {Y}}\rightarrow {\mathcal {X}}\) of the vanishing viscosity flow, Eq. 5, which is twice continuously differentiable with respect to t, is given by
$$\begin{aligned} \xi (t;{\tilde{y}}) = \int _{(0,\left\| L \right\| ^2]}\frac{1-u(t\sqrt{\lambda })}{\lambda }\,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} \end{aligned}$$
where the function u is defined by Eq. 72.
Proof
We have already seen in Lemma 23 that Eq. 71 has the unique solution \({\tilde{\rho }}\) given by \({\tilde{\rho }}(t;\lambda )=u(t\sqrt{\lambda })\). To apply Proposition 2, it is thus enough to show that \({\tilde{\rho }}\) is smooth.
Since the function u has the representation
$$\begin{aligned} u(\tau ) = v(\tau ^2)\text { with } v({\tilde{\tau }}) = \varGamma (\tfrac{1}{2}(b+1))\sum _{k=0}^\infty \frac{(-\frac{1}{4}{\tilde{\tau }})^k}{k!\varGamma (\frac{1}{2}(b+1)+k)}, \end{aligned}$$
see, for example, [1, Formula 9.1.10], the solution \({\tilde{\rho }}:[0,\infty )\times [0,\infty )\rightarrow \mathbb {R}\) given by Eq. 72 is of the form \({\tilde{\rho }}(t;\lambda )=u(t\sqrt{\lambda })=v(\lambda t^2)\) and is therefore seen to be smooth. Therefore, Proposition 2 yields the claim. \(\square \)
Again, we want to determine a corresponding regularisation method. We start by showing that the function \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\), which corresponds to the error function \(\tilde{r}_\alpha \) of the regularisation method, is non-negative and monotonically decreasing for sufficiently small values of \(\lambda \), as required in Definition 1 item 2.
Lemma 24
Let \(j_{\kappa ,1}\in (0,\infty )\) denote the first positive zero of the Bessel function \(J_\kappa \). Then, the solution \({\tilde{\rho }}\) given in Eq. 72 fulfils
  • for every \(\lambda >0\) that the function \(t\mapsto {\tilde{\rho }}(t;\lambda )\) is strictly decreasing on the interval \((0,\frac{1}{\sqrt{\lambda }}j_{\frac{1}{2}(b-1),1})\) and
  • for every \(t>0\) that the function \(\lambda \mapsto {\tilde{\rho }}(t;\lambda )\) is strictly decreasing on the interval \((0,\frac{1}{t^2}j_{\frac{1}{2}(b-1),1}^2)\).
Proof
Since we can write \({\tilde{\rho }}\) in the form \({\tilde{\rho }}(t;\lambda )=u(t\sqrt{\lambda })\), see Eq. 72, it is enough to show that
$$\begin{aligned} u'(\tau )<0\text { for }\tau \in (0,j_{\frac{1}{2}(b-1),1}). \end{aligned}$$
This property of u follows directly from the representation of the Bessel functions \(J_\kappa \), \(\kappa \in (-\frac{1}{2},\infty )\), as an infinite product, see, for example, [1, Formula 9.5.10]:
$$\begin{aligned} J_\kappa (\tau ) = \frac{\tau ^\kappa }{2^\kappa \varGamma (\kappa +1)}\prod _{\ell =1}^\infty \left( 1-\frac{\tau ^2}{j_{\kappa ,\ell }^2}\right) , \end{aligned}$$
where \(j_{\kappa ,\ell }\) denotes the \(\ell \)th positive zero (sorted in increasing order) of \(J_\kappa \); since this gives
$$\begin{aligned} u(\tau ) = \prod _{\ell =1}^\infty \left( 1-\frac{\tau ^2}{j_{\frac{1}{2}(b-1),\ell }^2}\right) , \end{aligned}$$
which is for \(\tau \in (0,j_{\frac{1}{2}(b-1),1})\) a product of only positive factors. Therefore, we have
$$\begin{aligned} u'(\tau ) = -2\tau \sum _{\ell =1}^\infty \left[ j_{\frac{1}{2}(b-1),\ell }^{-2} \prod _{{\tilde{\ell }}\in \mathbb {N}\setminus \{\ell \}}\left( 1-\frac{\tau ^2}{j_{\frac{1}{2}(b-1),{\tilde{\ell }}}^2}\right) \right] < 0\text { for all } \tau \in (0,j_{\frac{1}{2}(b-1),1}). \end{aligned}$$
\(\square \)
Furthermore, we can construct an upper bound \({\tilde{P}}(t;\lambda )=U(t\sqrt{\lambda })\) of \(|{\tilde{\rho }}(t;\lambda )|\), which corresponds to the envelope value \(\tilde{R}_\alpha (\lambda )\), such that \({\tilde{P}}(t;\cdot )\) is monotonically decreasing. This will give us the condition of Definition 1 item 3 for the function \(\tilde{R}_\alpha \). The additionally derived explicit upper bound for U helps us to show the compatibility of the convergence rate functions \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \).
Lemma 25
Let \({\tilde{\rho }}\) be the solution of Eq. 71 given by Eq. 72. Then, there exist a constant \(C>0\) and a continuous, monotonically decreasing function \(U:[0,\infty )\rightarrow [0,1]\) so that
  • \(|{\tilde{\rho }}(t;\lambda )| \le U(t\sqrt{\lambda })\) for every \(t\ge 0\), \(\lambda >0\),
  • \(U(\tau )<1\) for all \(\tau >0\), and
  • \(U(\tau ) \le C\tau ^{-\frac{b}{2}}\) for all \(\tau >0\).
Proof
We use again the function u defined in Eq. 72 which satisfies \({\tilde{\rho }}(t;\lambda )=u(t\sqrt{\lambda })\). Then, we remark that the energy
$$\begin{aligned} E(\tau ) :={u'}^2(\tau )+u^2(\tau ),\;\tau \ge 0, \end{aligned}$$
fulfils (using Eq. 71 with \(\lambda =1\), \(t=\tau \) and \(u(\tau )={\tilde{\rho }}(\tau ;1)\))
$$\begin{aligned} E'(\tau ) = 2u'(\tau )\left( u''(\tau )+u(\tau )\right) = -\frac{b}{\tau }{u'}^2(\tau ) \le 0. \end{aligned}$$
Since we know from Lemma 24 that \(u'(\tau )=\partial _t{\tilde{\rho }}(\tau ;1)<0\) for \(\tau \in (0,j_{\frac{1}{2}(b-1),1})\), we have that E is strictly decreasing on \((0,j_{\frac{1}{2}(b-1),1})\) so that \(E(j_{\frac{1}{2}(b-1),1}) < E(0)\). For \(\tau \ge j_{\frac{1}{2}(b-1),1}\), we can therefore estimate u by
$$\begin{aligned} u^2(\tau ) \le E(\tau ) \le E(j_{\frac{1}{2}(b-1),1}) < E(0) = 1. \end{aligned}$$
Thus, u is monotonically decreasing on \((0,j_{\frac{1}{2}(b-1),1})\) and uniformly bounded by \(\sqrt{E(j_{\frac{1}{2}(b-1),1})}<1\) on \([j_{\frac{1}{2}(b-1),1},\infty )\). Therefore, we can find a monotonically decreasing function \({\tilde{U}}:[0,\infty )\rightarrow [0,1]\) with
$$\begin{aligned} u(\tau ) \le {\tilde{U}}(\tau )<1\text { for every }\tau >0. \end{aligned}$$
Moreover, it follows from [1, Formula 9.2.1] that there exists a constant \(c>0\) such that
$$\begin{aligned} |J_{\frac{1}{2}(b-1)}(\tau )| \le c\tau ^{-\frac{1}{2}}\text { for all }\tau >0, \end{aligned}$$
which implies according to Eq. 72 with \(C=2^{\frac{1}{2}(b-1)}\varGamma (\frac{1}{2}(b+1))c\) the upper bound
$$\begin{aligned} |u(\tau )|\le C\tau ^{-\frac{b}{2}}\text { for all }\tau >0. \end{aligned}$$
Hence, the function U defined by \(U(\tau )=\min \{{\tilde{U}}(\tau ),C\tau ^{-\frac{b}{2}}\}\) satisfies all the properties. \(\square \)
To verify the condition in Definition 1 item 1 for \(r_\alpha \), we establish here the corresponding lower bound for the function \({\tilde{\rho }}\).
Lemma 26
Let \({\tilde{\rho }}\) be the solution of Eq. 71 given by Eq. 72. Then, there exists a constant \(\tau _b\in (0,j_{\frac{1}{2}(b-1),1}]\) such that
$$\begin{aligned} {\tilde{\rho }}(t;\lambda )\ge 1-\frac{t\sqrt{\lambda }}{2\tau _b}\text { for all }t\ge 0,\;\lambda >0. \end{aligned}$$
(77)
Proof
We define u again by Eq. 72 and choose some arbitrary \(c>0\). Then, the initial conditions \(u(0)=1\) and \(u'(0)=0\) imply that we find a \({\bar{\tau }}>0\) such that \(u(\tau )\ge 1-c\tau \) for all \(\tau \in [0,{\bar{\tau }}]\). Setting now \(\tau _b:=\min \{\frac{1}{2c},\frac{{\bar{\tau }}}{4},j_{\frac{1}{2}(b-1),1}\}\), we have by construction
$$\begin{aligned} u(\tau )\ge 1-c\tau \ge 1-\frac{\tau }{2\tau _b}\text { for all }\tau \in [0,{\bar{\tau }}]. \end{aligned}$$
Moreover, the uniform bound \(\left| u(\tau )\right| <1\) for all \(\tau >0\), shown in Lemma 25, implies that
$$\begin{aligned} u(\tau ) > -1 \ge 1-\frac{2}{{\bar{\tau }}}\tau \ge 1-\frac{\tau }{2\tau _b}\text { for all }\tau \in [{\bar{\tau }},\infty ). \end{aligned}$$
Thus, \({\tilde{\rho }}(t;\lambda )=u(t\sqrt{\lambda })\) yields the claim. \(\square \)
These estimates for \({\tilde{\rho }}\) suffice now to show that the functions \(r_\alpha \) defined by Eq. 78 generate the regularisation method corresponding to the solution \(\xi \) of Eq. 5.
Proposition 5
Let \({\tilde{\rho }}\) be the solution of Eq. 71 given by Eq. 72, \(\tau _b\) be the constant defined in Lemma 26, and set
$$\begin{aligned} r_\alpha (\lambda ) :=\frac{1}{\lambda }\left( 1-{\tilde{\rho }}\left( \frac{\tau _b}{\sqrt{\alpha }};\lambda \right) \right) . \end{aligned}$$
(78)
Then, \((r_\alpha )_{\alpha >0}\) generates a regularisation method in the sense of Definition 1.
Proof
We verify the four conditions in Definition 1.
1.
We know from Lemma 25 that \(|{\tilde{\rho }}|\le 1\), and thus it follows that
$$\begin{aligned} r_\alpha (\lambda ) \le \frac{2}{\lambda }. \end{aligned}$$
Moreover, it follows from Eq. 77 that
$$\begin{aligned} r_\alpha (\lambda ) = \frac{1}{\lambda }\left( 1-{\tilde{\rho }}\left( \frac{\tau _b}{\sqrt{\alpha }};\lambda \right) \right) \le \frac{1}{2\sqrt{\alpha \lambda }}. \end{aligned}$$
 
2.
The error function \({\tilde{r}}_\alpha \) corresponding to the generator \(r_\alpha \) is given by
$$\begin{aligned} \tilde{r}_\alpha (\lambda ) = {\tilde{\rho }}\left( \frac{\tau _b}{\sqrt{\alpha }};\lambda \right) , \end{aligned}$$
which is a monotonically decreasing function on \((0,\frac{1}{\tau _b^2}j_{\frac{1}{2}(b-1),1}^2\alpha )\) according to Lemma 24. Since we have chosen \(\tau _b\in (0,j_{\frac{1}{2}(b-1),1}]\), see Lemma 26, this in particular shows that \(\tilde{r}_\alpha \) is monotonically decreasing on \((0,\alpha )\).
 
3.
Let U be the function constructed in Lemma 25. We define
$$\begin{aligned} \tilde{R}_\alpha (\lambda ) :=U\left( \tau _b\sqrt{\frac{\lambda }{\alpha }}\right) . \end{aligned}$$
(79)
Then, we have by Lemma 25 that \(\lambda \mapsto \tilde{R}_\alpha (\lambda )\) is monotonically decreasing, \(\alpha \mapsto \tilde{R}_\alpha (\lambda )\) is continuous and monotonically increasing and \(\tilde{R}_\alpha \) fulfils
$$\begin{aligned} |\tilde{r}_\alpha (\lambda )| = \left| {\tilde{\rho }}\left( \frac{\tau _b}{\sqrt{\alpha }};\lambda \right) \right| \le U\left( \tau _b\sqrt{\frac{\lambda }{\alpha }}\right) = \tilde{R}_\alpha (\lambda ). \end{aligned}$$
 
4.
We have again by Lemma 25 that
$$\begin{aligned} \tilde{R}_\alpha (\alpha ) = U(\tau _b) < 1\text { for all }\alpha >0. \end{aligned}$$
 
\(\square \)
As before, we also verify that the classical convergence rate functions \(\varphi ^{\mathrm H}_\mu \) and \(\varphi ^{\mathrm L}_\mu \) are compatible with the regularisation method \((r_\alpha )_{\alpha >0}\). In contrast to Showalter's method and the heavy ball method, the compatibility of \(\varphi ^{\mathrm H}_\mu \) only holds up to a certain saturation value of the parameter \(\mu \).
Lemma 27
The functions \(\varphi ^{\mathrm H}_\mu \) for all \(\mu \in (0,\frac{b}{2})\) and the functions \(\varphi ^{\mathrm L}_\mu \) for all \(\mu >0\), as defined in Example 2, are compatible with the regularisation method \((r_\alpha )_{\alpha >0}\) defined by Eq. 78 in the sense of Definition 4.
Proof
As before, it is because of Corollary 2 enough to check this for the functions \(\varphi ^{\mathrm H}_\mu \), \(\mu \in (0,\frac{b}{2})\). The function \(\tilde{R}_\alpha \) defined in Eq. 79 fulfils according to Lemma 25 that there exists a constant \(C>0\) with
$$\begin{aligned} \tilde{R}_\alpha ^2(\lambda ) = U^2\left( \tau _b\sqrt{\frac{\lambda }{\alpha }}\right) \le C^2\tau _b^{-b}\left( \frac{\lambda }{\alpha }\right) ^{-\frac{b}{2}} \le C^2\tau _b^{-b}\left( \frac{\varphi ^{\mathrm H}_\mu (\lambda )}{\varphi ^{\mathrm H}_\mu (\alpha )}\right) ^{-\frac{b}{2\mu }}, \end{aligned}$$
which is Eq. 21 with the compatibility function \(F_\mu (z)=C^2\tau _b^{-b}z^{-\frac{b}{2\mu }}\). It remains to check that \(F_\mu :[1,\infty )\rightarrow \mathbb {R}\) is integrable, which is the case if and only if \(\frac{b}{2\mu }>1\), that is, exactly for \(\mu <\frac{b}{2}\). \(\square \)
We can therefore apply Theorem 1 to the regularisation method generated by the functions \((r_\alpha )_{\alpha >0}\) defined in Eq. 78 and the convergence rate functions \(\varphi ^{\mathrm H}_\mu \), \(\mu \in (0,\frac{b}{2})\), and \(\varphi ^{\mathrm L}_\mu \), \(\mu >0\). Since we have by construction \(x_\alpha ({\tilde{y}})=\xi (\frac{\tau _b}{\sqrt{\alpha }};{\tilde{y}})\), see Eq. 80 below, this yields equivalent characterisations of the convergence rates of the flow \(\xi \) of Eq. 5. As before for Showalter's method and the heavy ball method, we formulate the resulting convergence rates under the stronger, but more commonly used, standard source condition, see Proposition 1.
Corollary 9
Let \(y\in {\mathcal {R}}(L)\) be given such that the corresponding minimum norm solution \(x^\dag \in {\mathcal {X}}\), fulfilling \(L x^\dag =y\) and \(\Vert x^\dag \Vert =\inf \{\left\| x \right\| \mid L x=y\}\), satisfies for some \(\mu \in (0,\frac{b}{2})\) the source condition
$$\begin{aligned} x^\dagger \in {\mathcal {R}}\big ((L^*L)^{\frac{\mu }{2}}\big ). \end{aligned}$$
Then, if \(\xi \) is the solution of the initial value problem in Eq. 5,
1.
there exists a constant \(C_1>0\) such that
$$\begin{aligned} \left\| \xi (t;y)-x^\dag \right\| ^2 \le \frac{C_1}{t^{2\mu }}\text { for all }t>0; \end{aligned}$$
 
2.
there exists a constant \(C_2>0\) such that
$$\begin{aligned} \inf _{t>0}\left\| \xi (t;{\tilde{y}})-x^\dag \right\| ^2 \le C_2\left\| {\tilde{y}}-y \right\| ^{\frac{2\mu }{\mu +1}}\text { for all }{\tilde{y}}\in {\mathcal {Y}}; \end{aligned}$$
and
 
3.
if \(\mu <\frac{b}{2}-1\), there exists a constant \(C_3>0\) such that
$$\begin{aligned} \left\| L\xi (t;y)-y \right\| ^2 \le \frac{C_3}{t^{2(\mu +1)}}\text { for all }t>0. \end{aligned}$$
 
Proof
The proof follows exactly the lines of the proof of Corollary 6, where the compatibility of \(\varphi ^{\mathrm H}_\mu \) is shown in Lemma 27 and we have the different scaling
$$\begin{aligned} \begin{aligned} x_\alpha ({\tilde{y}})&=r_\alpha (L^*L)L^*{\tilde{y}} \\&=\int _{(0,\left\| L \right\| ^2]}\frac{1}{\lambda }\left( 1-{\tilde{\rho }}\left( \frac{\tau _b}{\sqrt{\alpha }};\lambda \right) \right) \,\mathrm d{\mathbf {E}}_\lambda L^*{\tilde{y}} = \xi \left( \frac{\tau _b}{\sqrt{\alpha }};{\tilde{y}}\right) \end{aligned} \end{aligned}$$
(80)
between the regularised solution \(x_\alpha \), defined in Eq. 8 with the regularisation method \((r_\alpha )_{\alpha >0}\) from Eq. 78, and the solutions \(\xi \) of Eq. 5 and \({\tilde{\rho }}\) of Eq. 71. Following the proof of Corollary 6 and using the notation d from Eq. 13 and \({\tilde{d}}\) from Eq. 14, we get
1.
in the case of exact data the convergence rates
$$\begin{aligned}&\left\| \xi (t;y)-x^\dag \right\| ^2 = \left\| x_{\tau _b^2t^{-2}}(y)-x^\dag \right\| ^2 = d\left( \frac{\tau _b^2}{t^2}\right) \text { and} \\&\left\| x_{\tau _b^2t^{-2}}(y)-x^\dag \right\| ^2\le \frac{C_d\tau _b^{2\mu }}{t^{2\mu }}. \end{aligned}$$
 
2.
For perturbed data we get the convergence rate
$$\begin{aligned} \inf _{t>0}\left\| \xi (t;{\tilde{y}})-x^\dag \right\| ^2 = \inf _{\alpha >0}\left\| \xi \left( \frac{\tau _b}{\sqrt{\alpha }};{\tilde{y}}\right) -x^\dag \right\| ^2\le {\tilde{d}}(\left\| {\tilde{y}}-y \right\| ) \le C_{{\tilde{d}}}\left\| {\tilde{y}}-y \right\| ^{\frac{2\mu }{\mu +1}}. \end{aligned}$$
 
3.
Moreover, using that for \(\mu <\frac{b}{2}-1\) also \(\varphi ^{\mathrm H}_{\mu +1}\) is compatible with \((r_\alpha )_{\alpha >0}\), we get from Corollary 4 the convergence rate
$$\begin{aligned} \left\| L\xi (t;y)-y \right\| ^2 = \left\| L x_{\tau _b^2t^{-2}}(y)-y \right\| ^2 = q\left( \frac{\tau _b^2}{t^2}\right) \le C\tau _b^{2(\mu +1)}t^{-2(\mu +1)} \end{aligned}$$
for the noise-free residual error, where q is defined in Eq. 15.
 
\(\square \)
We end this section by a few remarks.
Remark 8
(Comparison of Flows) Comparing the results in Corollary 6, Corollary 7, and Corollary 9, we see that the three methods we have analysed, namely Showalter's method, the heavy ball dynamics, and the vanishing viscosity flow, all give the same convergence rate for noisy data at the respective optimal stopping time. Note, however, that these optimal stopping times differ: the vanishing viscosity flow attains its optimal error earlier, owing to its acceleration property compared with the other two methods, which has been analysed in the literature.
Remark 9
(Saturation of Viscosity Flow) The vanishing viscosity flow suffers from a saturation effect for the convergence rate functions \(\varphi ^{\mathrm H}_\mu \), allowing convergence rates only up to a certain value of \(\mu \). This saturation does not occur for the other two methods, because their error functions decay exponentially at every fixed spectral value.
Remark 10
(Comparison with literature) Equation 71 has been investigated intensively in the more general context of non-smooth, convex functionals \({\mathcal {J}}\) and abstract ordinary differential equations of the form
$$\begin{aligned} \begin{aligned} \xi '' (t) + \frac{b}{t} \xi '(t) + \partial {\mathcal {J}}(\xi (t))&\ni 0 \text { for all } t \in \left( 0,\infty \right) , \\ \xi '(0)&= 0, \\ \xi (0)&= 0, \end{aligned} \end{aligned}$$
(81)
see for instance [4, 7, 29]. Equation 81 corresponds to Eq. 2 with \(N=2\) and \(a_1(t)=\frac{b}{t}\), \(b>0\), for the particular energy functional \({\mathcal {J}}(x)=\frac{1}{2}\left\| L x-y \right\| ^2\).
The authors prove optimality of Eq. 81; however, optimality there refers to a different notion than in our paper:
1.
In the papers referenced above, optimality is considered with respect to all possible smooth and convex functionals \({\mathcal {J}}\), while in our work optimality is considered only with respect to variations of y. The papers [4, 7, 29] consider a finite dimensional setting where \({\mathcal {J}}\) maps a subset of a finite dimensional space \(\mathbb {R}^d\) into the extended reals.
 
2.
The second difference in the optimality results is that we primarily consider the optimal convergence rate of \(\xi (t)-x^\dagger \) for \(t \rightarrow \infty \) and not of \({\mathcal {J}}(\xi (t)) \rightarrow \min _{x\in {\mathcal {X}}}{\mathcal {J}}(x)\); that is, we consider rates in the domain of L, while the referenced papers consider convergence in the image domain. Consequently, we obtain rates for the squared residual (which is the quantity \({\mathcal {J}}(\xi (t))\) of the referenced papers) that are based on optimal rates (in the sense of this paper) for \(\xi (t)-x^\dagger \rightarrow 0\). The presented rates in the image domain are, however, not necessarily optimal.
 
Nevertheless, it is interesting to note that the two cases \(b \ge 3\) and \(0< b < 3\), referred to as the heavy and the low friction case, do not require a different analysis in our paper, in contrast to, for instance, [4]. This is of course not a contradiction, because we consider a different notion of optimality.

7 Conclusions

The paper shows that the dynamical flows provide optimal regularisation methods (in the sense explained in Sect. 2). We proved optimal convergence rates of the solutions of the flows to the minimum norm solution for \(t\rightarrow \infty \), and we also provided convergence rates for the residuals of the regularised solutions.
We observed that the vanishing viscosity method, the heavy ball dynamics, and Showalter's method provide optimal reconstructions at different times. In particular, for a fair numerical comparison of the three methods, one should compare the results of Showalter's method and the heavy ball dynamics, respectively, at time \(t_0^2\) with the result of the vanishing viscosity flow at time \(t_0\).

Acknowledgements

RB acknowledges support from the Austrian Science Fund (FWF) within the project I2419-N32 (Employing Recent Outcomes in Proximal Theory Outside the Comfort Zone). GD is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy “The Berlin Mathematics Research Center MATH+” (EXC-2046/1, project ID: 390685689). PE and OS are supported by the Austrian Science Fund (FWF), with SFB F68, project F6804-N36 (Quantitative Coupled Physics Imaging) and project F6807-N36 (Tomography with Uncertainties). OS acknowledges support from the Austrian Science Fund (FWF) within the national research network Geometry and Simulation, project S11704 (Variational Methods for Imaging on Manifolds), and project I3661-N27 (Novel Error Measures and Source Conditions of Regularization Methods for Inverse Problems).

Open Access

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Literature
1. Abramowitz, M., Stegun, I.: Handbook of Mathematical Functions. Dover, Downers Grove (1972)
2. Adams, R.A.: Sobolev Spaces. No. 65 in Pure and Applied Mathematics. Academic Press (1975)
4.
9. Engl, H.W., Hanke, M., Neubauer, A.: Regularization of Inverse Problems. No. 375 in Mathematics and its Applications. Kluwer Academic Publishers Group (1996)
11. Flemming, J., Hofmann, B.: A new approach to source conditions in regularization with general residual term. Numerical Functional Analysis and Optimization 31(3), 245–284 (2010)
13. Groetsch, C.W.: The Theory of Tikhonov Regularization for Fredholm Equations of the First Kind. Pitman, London (1984)
18. Krein, S.G.: Linear Differential Equations in Banach Space. No. 29 in Translations of Mathematical Monographs. American Mathematical Society (1972)
20. Nesterov, Y.E.: A method for solving the convex programming problem with convergence rate \(O(1/k^2)\). Doklady Akademii Nauk SSSR 269(3), 543–547 (1983)
22. Poljak, B.T.: Some methods of speeding up the convergence of iterative methods. Zhurnal Vychislitelnoi Matematiki i Matematicheskoi Fiziki 4, 791–803 (1964)
23. Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, New York (1976)
24. Rudin, W.: Real and Complex Analysis, 3rd edn. McGraw-Hill, New York (1987)
28. Showalter, D.W., Ben-Israel, A.: Representation and computation of the generalized inverse of a bounded linear operator between Hilbert spaces. Atti della Accademia Nazionale dei Lincei. Rendiconti. Classe di Scienze Fisiche, Matematiche e Naturali 48, 184–194 (1970)
30. Vainikko, G.: Solution Methods for Linear Ill-Posed Problems in Hilbert Spaces. Tartu State University, Tartu (1982)
31. Vainikko, G., Veretennikov, A.: Iteration Procedures in Ill-Posed Problems. Nauka (1986). In Russian
32. Yosida, K.: Functional Analysis. Classics in Mathematics. Springer-Verlag (1995). Reprint of the sixth (1980) edition