In the stiff situation, we consider the long-time behavior of the relative error \(\gamma _n\) in the numerical integration of a linear ordinary differential equation \(y^{\prime }(t)=Ay(t),\quad t\ge 0\), where A is a normal matrix. The numerical solution is obtained by using at any step an approximation of the matrix exponential, e.g. a polynomial or a rational approximation. We study the long-time behavior of \(\gamma _n\) by comparing it to the relative error \(\gamma _n^{\mathrm{long}}\) in the numerical integration of the long-time solution, i.e. the projection of the solution on the eigenspace of the rightmost eigenvalues. The error \( \gamma _n^{\mathrm{long}}\) grows linearly in time, it is small and it remains small in the long-time. We give a condition under which \(\gamma _n\approx \gamma _n^{\mathrm{long}}\), i.e. \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\approx 1\), in the long-time. When this condition does not hold, the ratio \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\) is large for all time. These results describe the long-time behavior of the relative error \(\gamma _n\) in the stiff situation.
where \(R:{\mathscr {D}}\subseteq {\mathbb {C}}\rightarrow {\mathbb {C}}\) is an analytic approximant of the exponential \(\mathrm{e}^{z}\), \(z\in {\mathbb {C}}\). When the numerical solution is obtained by a Runge–Kutta (RK) method, the approximant R is the stability function of the RK method and it is a polynomial or a rational function.
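As a minimal illustration of this recursion (not taken from the paper), the sketch below applies a hypothetical approximant R, chosen here as the order-2 Taylor polynomial, to a hypothetical diagonal (hence normal) matrix A, and computes the normwise relative error of the resulting numerical solution:

```python
import cmath

# Hypothetical example: a diagonal, hence normal, matrix A with eigenvalues
# -1 and -2, and the order-2 Taylor polynomial as the approximant R of e^z.
eigs = [-1.0, -2.0]
y0 = [1.0, 1.0]
h = 0.01

def R(z):
    # order-2 Taylor approximant of e^z
    return 1.0 + z + z * z / 2.0

def numerical_and_exact(n):
    # y_n = R(hA)^n y0 (one application of R(hA) per step)
    # and the exact solution y(t_n) = e^{t_n A} y0
    yn = [R(h * lam) ** n * c for lam, c in zip(eigs, y0)]
    yex = [cmath.exp(n * h * lam) * c for lam, c in zip(eigs, y0)]
    return yn, yex

def gamma(n):
    # normwise relative error of the numerical solution at t_n = n*h
    yn, yex = numerical_and_exact(n)
    num = sum(abs(a - b) ** 2 for a, b in zip(yn, yex)) ** 0.5
    den = sum(abs(b) ** 2 for b in yex) ** 0.5
    return num / den

g100 = gamma(100)  # small, since the stepsize resolves both eigenvalues
```

For a diagonal A the recursion decouples into scalar recursions, which is what makes this sketch so short; the eigenvalues, stepsize and approximant above are illustrative choices only.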
The paper [9] analyzed, in the non-stiff situation, the time behavior of the normwise relative error
in the case of a normal matrix A. It seems to be the first paper in the literature dealing in detail with the time behavior of the relative error of numerical solutions of ODEs. This is quite surprising, because relative errors are generally considered better quality measures of approximations than absolute errors. Indeed, componentwise relative errors are involved in the stepsize control mechanism (see [12]).
The present paper continues the analysis of the error \(\gamma _n\), in the case of a normal A, by considering its long-time behavior in the stiff situation. The next subsection, with all its subsubsections, contains the basic material needed for this analysis. Part of this material was introduced in [9].
1.1 Fundamental notations and notions
1.1.1 Small and large
We stipulate that, for \(a\ge 0\), “a is small” means “\(a\ll 1\)” and “a is large” means “\(a\gg 1\)”.
For \(b\ge 0\) and \(c>0\), \(b\ll c\) means \(\frac{b}{c}\ll 1\).
1.1.2 The notation \(\approx \)
For \(a,b\in {\mathbb {R}}\), \(a\approx b\) means
We say \(a\approx b\) with degree \(\varepsilon \), where \(\varepsilon >0\), if \(\frac{\Vert a-b\Vert _2}{\Vert b\Vert _2}\le \varepsilon \).
1.1.3 The meaning of “it is expected”
In the paper, we often say “it is expected S”, where S is a statement, with the meaning that the statement “not S” is “unlikely”, “unusual” or “extreme”.
Sentences of this form can seem vague, although they convey significant information. However, they are never used in definitions or theorems, which are stated precisely, without any such vagueness. These sentences only serve a better understanding of technical notions and results.
By introducing probability measures on the data, we could make “it is expected S” mathematically precise, but this is beyond the scope of the present paper.
of the normal matrix A, where \(\lambda _{1},\lambda _{2},\ldots , \lambda _{p}\) are the distinct eigenvalues of A, is partitioned by decreasing real part in the subsets \(\varLambda _{1},\varLambda _{2},\ldots , \varLambda _{q}\) (see Fig. 1): we have
The generic situation for the initial value \(y_0\) is \(\varLambda ^*=\varLambda \). In order to use simpler notations, we assume this generic situation.
If it does not hold, then below we have to see \(\varLambda _1,\ldots ,\varLambda _q\) as \(\varLambda _{1}^*,\ldots ,\varLambda _q^*\) without the sets \(\varLambda ^{*}_j\) that are empty. In other words, we see \(\varLambda _1\) as \(\varLambda _{j_1^*}\) where
and so on. Of course, when we do this, the number q of sets in \(\varLambda _1,\ldots ,\varLambda _q\) is no longer equal to the number of possible real parts in the spectrum \(\varLambda \), but it is equal to the number of possible real parts in \(\varLambda ^*\).
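The partition just described can be sketched on a hypothetical spectrum: group the distinct eigenvalues by real part and order the groups by decreasing real part, so that the first group is \(\varLambda _1\), the set of the rightmost eigenvalues. The spectrum below is an illustrative choice, not one from the paper.

```python
# Sketch of the partition of the spectrum into Lambda_1, ..., Lambda_q
# (subsets of equal real part, ordered by decreasing real part).
def partition_by_real_part(spectrum):
    groups = {}
    for lam in spectrum:
        # eigenvalues with (numerically) equal real part go in the same subset
        groups.setdefault(round(lam.real, 12), []).append(lam)
    # decreasing real part: the first subset holds the rightmost eigenvalues
    return [sorted(groups[r], key=lambda z: z.imag)
            for r in sorted(groups, reverse=True)]

spectrum = [-1 + 1j, -1 - 1j, -3 + 1000j, -3 - 1000j, -5 + 0j]
parts = partition_by_real_part(spectrum)  # q = 3 subsets here
```

With this spectrum, `parts[0]` is \(\varLambda _1=\{-1\pm i\}\) and the last subset contains the leftmost eigenvalue \(-5\).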
1.1.6 Rightmost and non-rightmost eigenvalues
The set \(\varLambda _1\) is the set of the rightmost eigenvalues. The set
It is expected \(\vert \beta _2\vert \) non-small.
1.1.8 Dimensionless quantities
We use the dimensionless stepsize \(h\rho _1\), or \(h\rho \), and the dimensionless time \(t\rho _1\), or \(t\rho \), rather than the stepsize h and the time t, respectively, because they are small or large independently of the unit used for time.
In this paper, when we say that a certain quantity is small or large, this quantity is always dimensionless.
The numbers \(\beta _j\) defined above are dimensionless, as well as the errors \(\sigma _i\) now introduced.
1.1.9 The errors \(\sigma _i\)
We assume that the approximant R has order l, where l is a positive integer. This means
with \(C\ne 0\). It is assumed that the domain \({\mathscr {D}}\) of R includes a neighborhood of zero. Moreover, we assume \(h\lambda _i\in {\mathscr {D}}\), \(i=1,\ldots ,p\).
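The displayed formula defining \(\sigma _i\) is not reproduced in this excerpt; a reading consistent with the surrounding text (\(\sigma _i\) as the local relative error of the approximant at \(h\lambda _i\)) is sketched below, with the explicit Euler stability function (order \(l=1\)) as a hypothetical choice of R:

```python
import cmath

# Assumed reading (the paper's displayed definition is elided here):
# sigma_i = (R(h*lambda_i) - e^{h*lambda_i}) / e^{h*lambda_i},
# i.e. the relative error of the approximant at h*lambda_i.
def sigma(R, h, lam):
    ez = cmath.exp(h * lam)
    return (R(h * lam) - ez) / ez

# explicit Euler stability function: order-1 Taylor approximant, so l = 1
R_euler = lambda z: 1 + z

h = 0.01
s = sigma(R_euler, h, -1.0)  # ≈ -h^2/2, consistent with order l = 1
```

For an approximant of order l, this quantity behaves like \(C(h\lambda _i)^{l+1}\) for \(h\lambda _i\rightarrow 0\), which the Euler example shows with \(l=1\) and \(C=-\frac{1}{2}\).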
We can consider \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \) and \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) as local relative errors, and \(E_1\) and E as global relative errors, of the numerical integration. An explanation for this is given below at point 2 of Remarks 1.1 and 1.2.
The right-hand side follows by (1.9). Observe that the generic situation for the matrix A is to have \(\varLambda _1\) constituted by a single real eigenvalue or by a single pair of complex conjugate eigenvalues. In this generic situation, we have \(K_{1}=1\).
We call base situation the situation where \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \) is small.
Here are some observations about the base situation.
In the base situation, it is expected \(E_1\) small, i.e. \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \ll h\rho _1\), and \(h\rho _1\) non-large. Look at (1.10).
We do not say that in the base situation it is expected that \(h\rho _1\) is small. In fact, when R is a high-order approximant, we do not regard the case where \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \) is small and \(h\rho _1\) is not small as “unusual”.
1.1.14 The non-stiff situation and the stiff situation
The base situation is partitioned into two disjoint sub-situations: the non-stiff situation and the stiff situation.
We call non-stiff situation (stiff situation) the sub-situation of the base situation where \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) is small ( \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) is not small), equivalently \(\max \nolimits _{\lambda _{i}\in \varLambda ^{-}}\left| \sigma _{i}\right| \) is small (\(\max \nolimits _{\lambda _{i}\in \varLambda ^{-}}\left| \sigma _{i}\right| \) is not small).
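This classification can be sketched on hypothetical data, again under the assumption that \(\sigma _i\) is the local relative error of R at \(h\lambda _i\) (its displayed definition is not reproduced in this excerpt), and with “small” rendered by a hypothetical numerical tolerance:

```python
import cmath

# Sketch of the classification: base situation requires small errors on the
# rightmost eigenvalues; within it, "stiff" means the errors on the
# non-rightmost eigenvalues (Lambda^-) are not small.
def classify(R, h, rightmost, non_rightmost, small=1e-2):
    err = lambda lam: abs((R(h * lam) - cmath.exp(h * lam)) / cmath.exp(h * lam))
    if max(err(l) for l in rightmost) >= small:
        return "not base"            # not in the base situation
    if max(err(l) for l in non_rightmost) < small:
        return "non-stiff"           # all sigma_i small
    return "stiff"                   # small on Lambda_1, not small on Lambda^-

R2 = lambda z: 1 + z + z * z / 2     # order-2 Taylor approximant
verdict = classify(R2, 0.1, rightmost=[-1.0], non_rightmost=[-100.0])
```

With the stepsize tuned to the rightmost eigenvalue \(-1\), the far-left eigenvalue \(-100\) makes the example stiff; replacing it by \(-2\) would give the non-stiff verdict.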
The non-stiff situation and the stiff situation correspond to what is meant as non-stiff and stiff in the traditional terminology of numerical ODEs. The explanation is given below at point 3 of Remark 1.2.
Here are some observations about the non-stiff and stiff situations.
In the non-stiff situation, it is expected E small, i.e. \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \ll h\rho \), and \(h\rho \) non-large. Look at (1.11).
small. In fact, it is expected \(E_1\) small, and having both \(\max \nolimits _{\lambda _{i}\in \varLambda ^{-}}|\sigma _{i}|\) and \(\max \nolimits _{\lambda _{i}\in \varLambda _1}|\sigma _{i}|\) small with their ratio M not satisfying \(M\ll \frac{1}{E_1}\) appears to be an “extreme” case.
In the stiff situation, it is expected \(h\rho \) non-small. In fact, to have \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) non-small with \(h\rho \) small appears to be “unlikely”.
In the stiff situation, M is large, since it is the ratio of a non-small number to a small number.
The function g is increasing with \(g(0)=0\). We have \(g(c)\approx \frac{c}{2}\) for c small, \(g(1)=0.71828\) and \(g(c)=1\) for \(c=1.2564\).
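The definition of g is given earlier in the paper and is not reproduced in this excerpt; the closed form used in the snippet below is therefore an assumption, chosen because it reproduces all three values quoted in the text, and the snippet checks them numerically:

```python
import math

# Assumed closed form for g (not confirmed by this excerpt): it is increasing,
# g(0+) = 0, and it matches the three values quoted in the text.
def g(c):
    return (math.exp(c) - 1.0 - c) / c

slope = g(1e-3) / 1e-3   # ≈ 1/2, i.e. g(c) ≈ c/2 for small c
g_one = g(1.0)           # = e - 2 = 0.71828...
g_root = g(1.2564)       # ≈ 1
```

Under this assumption, \(g(c)=1\) exactly when \(\mathrm{e}^{c}=2c+1\), whose positive root is \(c\approx 1.2564\), as stated.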
1.2 Analysis of the error \(\gamma _n\)
After having introduced the basic material in the previous subsection, we can proceed with our analysis of the error \(\gamma _n\).
The next theorem (Theorem 4.1 in [9], stated with E instead of \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \)) describes how the error \(\gamma _{n}\) grows in time.
Theorem 1.1
Assume \(0\notin \varLambda \). Fix \(c>0\). For \(t_n\rho \le \frac{c}{E}\), we have
If \(E\ll 1\), then (1.16) says that \(\gamma _{n}\) is small and grows linearly in time up to large times \(t_n\rho \), precisely up to the large time \(\frac{1}{E}\). This result is useful in the non-stiff situation, where it is expected \(E\ll 1\).
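The linear-in-time growth of \(\gamma _n\) can be observed numerically on a hypothetical example (a diagonal normal matrix with an order-2 Taylor approximant and a small dimensionless stepsize, none of which come from the paper): doubling the time doubles the error, as long as the error stays small.

```python
import math

# Numerical illustration of the linear growth in Theorem 1.1 with
# A = diag(-1, -2) (normal) and the order-2 Taylor approximant.
eigs = [-1.0, -2.0]
y0 = [1.0, 1.0]
h = 0.01
R = lambda z: 1 + z + z * z / 2

def gamma(n):
    # normwise relative error at t_n = n*h
    num2 = den2 = 0.0
    for lam, c in zip(eigs, y0):
        exact = math.exp(n * h * lam) * c
        num2 += (R(h * lam) ** n * c - exact) ** 2
        den2 += exact ** 2
    return math.sqrt(num2 / den2)

ratio = gamma(2000) / gamma(1000)  # ≈ 2: the growth is linear in time
```

The check is only meaningful while \(\gamma _n\) is still small; for much larger n the linear regime of the theorem is left.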
Remark 1.1
1.
By taking a small c in the previous theorem, we have
$$\begin{aligned} \frac{t_n\rho E}{K}\lessapprox \gamma _{n}\lessapprox t_n\rho E. \end{aligned}$$
To be more precise, this holds for times \(t_n\rho \le x\), where \(x>0\) is such that \(xE\ll 1\).
This explains why \(\max \limits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) can be considered as the local relative error in the numerical integration of the solution. At \(t_n\rho =1\), we have
This explains why E can be considered as the global relative error in the numerical integration of the solution.
3.
The theorem assumes \(0\notin \varLambda \). If \(\varLambda =\{0\}\), we have \(\gamma _n=0\) for any n. For the case \(0\in \varLambda \) and \(\varLambda \ne \{0\}\), see point 5 of Remark 4.1 in [9].
1.2.1 The long-time solution
Let \(y^{\mathrm{long}}\) be the solution of (1.1) with initial value \( Q_{1}y_{0}\) instead of \(y_0\).
The solution \(y^{\mathrm{long}}\) is the long-time solution of (1.1), since we have \(y(t)\approx y^{\mathrm{long}}(t)\) for \(t\rho _1\) large. In particular, we have \(y(t)\approx y^{\mathrm{long}} (t)\) with degree \(\varepsilon \), where \(\varepsilon >0\), if
(see Theorem 5.1 in [9]). Observe that the left-hand side of (1.17) goes to zero as \(t\rho _1\rightarrow +\infty \).
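The long-time solution can be sketched on a hypothetical diagonal example, where the spectral projection \(Q_1\) simply keeps the components associated with the rightmost eigenvalues; the eigenvalues and initial value below are illustrative choices.

```python
import math

# Sketch: project y0 on the eigenspace of the rightmost eigenvalues.
# For a diagonal A, Q_1 zeroes the components outside that eigenspace.
eigs = [-1.0, -1.0, -4.0]   # rightmost eigenvalue -1 (twice)
y0 = [2.0, -1.0, 3.0]

def y(t):
    return [math.exp(lam * t) * c for lam, c in zip(eigs, y0)]

def y_long(t):
    rmax = max(eigs)
    return [math.exp(lam * t) * (c if lam == rmax else 0.0)
            for lam, c in zip(eigs, y0)]

def rel_dist(t):
    # relative distance between y(t) and y_long(t)
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(y(t), y_long(t))))
    den = math.sqrt(sum(b ** 2 for b in y_long(t)))
    return num / den

d = rel_dist(5.0)  # small: y(t) ≈ y_long(t) once t*rho_1 is large
```

The relative distance decays like \(\mathrm{e}^{-(1-4)t}\cdot\text{const}\) in this example, mirroring the statement that the left-hand side of (1.17) goes to zero as \(t\rho _1\rightarrow +\infty \).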
1.2.2 The error \(\gamma _n^{\mathrm{long}}\)
Let \(\gamma _{n}^{\mathrm{long}}\) be the error \(\gamma _n\) of the long-time solution \(y^{\mathrm{long}}\). The next theorem (Theorem 5.2 in [9], stated with \(E_1\) instead of \(\max \nolimits _{\lambda _{i}\in \varLambda _1}\left| \sigma _{i}\right| \)) describes how the error \(\gamma _{n}^{\mathrm{long}}\) grows in time.
Theorem 1.2
Assume \(0\notin \varLambda _{1}\). Fix \(c>0\). For \(t_n\rho _1\le \frac{c}{E_1}\), we have
If \(E_1\ll 1\), then (1.18) says that \(\gamma _{n}^{\mathrm{long}}\) is small and grows linearly in time up to large times \(t_n\rho _1\), precisely up to the large time \(\frac{1}{E_1}\). This result is useful in the base situation, where it is expected \(E_1\ll 1\).
Remark 1.2
1.
By taking a small c in the previous theorem, we have
This holds for times \(t_n\rho _1\le x\), where \(x>0\) is such that \(xE_1\ll 1\). If \(\varLambda _{1}\) is constituted by a single real eigenvalue or by a single pair of complex conjugate eigenvalues (the generic situation for the matrix A), we have \(K_1=1\) and then
Similarly to point 1 of Remark 1.1, we can explain why \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \) and \(E_1\) can be considered as the local relative error and the global relative error, respectively, in the numerical integration of the long-time solution.
3.
Since \(\max \nolimits _{\lambda _{i}\in \varLambda }\left| \sigma _{i}\right| \) and \(\max \nolimits _{\lambda _{i}\in \varLambda _{1}}\left| \sigma _{i}\right| \) can be considered as local relative errors in the numerical integration of the solution and the long-time solution, respectively, we can say that in the non-stiff situation the local relative error of the solution is small, whereas in the stiff situation the local relative error of the solution is not small, but the local relative error of the long-time solution is small. This agrees with the traditional concepts of non-stiff and stiff.
4.
The theorem assumes \(0\notin \varLambda _1\). If \(\varLambda _1=\{0\}\), we have \(\gamma _n^{\mathrm{long}}=0\) for any n. For the case \(0\in \varLambda _1\) and \(\varLambda _1\ne \{0\}\), see point 5 of Remark 5.2 in [9].
1.2.3 Long-time behavior of \(\gamma _n\)
We want to study the long-time behavior of the error \(\gamma _n\). This is done by comparing it to the error \(\gamma _{n}^{\mathrm{long}}\).
Since in the long-time the solution y becomes the solution \(y^{\mathrm{long}}\) whose error \(\gamma _n\) is just \(\gamma _{n}^{\mathrm{long}}\), it is quite reasonable to have \(\gamma _{n}\approx \gamma _{n}^{\mathrm{long}}\) in the long-time.
Indeed, at point 4 of Remark 5.3 in [9], the following result is stated.
Theorem 1.3
Assume \(q>1\) and \(0\notin \varLambda _1\). Fix \(c>0\) such that \(g(c)<1\), i.e. \(c<1.2564\). For any \(\varepsilon >0\), there exist \(H_0>0\) (independent of \(\varepsilon \)) and \(s\ge 0\) (dependent on \(\varepsilon \)) such that, for \(h\rho \le H_0\) and \(s\le t_n\rho \le \frac{c}{E}\), we have \(\gamma _{n}\approx \gamma _{n}^{\mathrm{long}}\) with degree \(\varepsilon \).
Remark 1.3
The theorem assumes \(q>1\). If \(q=1\), then \(\gamma _{n}=\gamma _{n}^{\mathrm{long}}\) for any n. In addition, it assumes \(0\notin \varLambda _1\). If \(q>1\) and \(\varLambda _1=\left\{ 0\right\} \), then \(\gamma _{n}^{\mathrm{long}}=0\) for any n and it does not make sense to look at \(\gamma _{n}\approx \gamma _{n}^{\mathrm{long}}\), since this would imply \(\gamma _{n}=0\). For the case \(q>1\), \(0\in \varLambda _1\) and \(\varLambda _1\ne \left\{ 0\right\} \), see point 6 of Remark 5.3 in [9].
The previous theorem is of interest in the non-stiff situation, where the condition \(h\rho \le H_0\) is not restrictive. In fact, in the non-stiff situation it is expected \(h\rho \) non-large.
On the other hand, the result is not useful in the stiff situation, since the condition \(h\rho \le H_0\) is restrictive. In fact, in the stiff situation it is expected \(h\rho \) non-small.
1.3 The contents of this paper
The present paper studies the long-time behavior of the relative error \(\gamma _n\) in the stiff situation. As above, this is done by comparing it to \(\gamma _{n}^{\mathrm{long}}\).
In the stiff situation, it is important to have \(\gamma _{n}\approx \gamma _{n}^{\mathrm{long}}\) in the long-time. If this happens, since \(\gamma _{n}^{\mathrm{long}}\) is small up to large times \(t_n\rho _1\), we have the very surprising fact that the error \(\gamma _n\) is small in the long-time, even though the stepsize h is tuned only to obtain a small local relative error for the long-time solution and, as a consequence, the local relative error of the solution is not small.
In other words, when we are interested in the numerical integration of the solution in the long-time, we can start from the beginning with a stepsize suitable for integrating the long-time solution with a small local relative error, which is larger than the stepsize needed to integrate the solution itself with a small local relative error, and in the long-time we will have a small error \(\gamma _n\).
As in [9], we confine our attention to normal matrices. This should not be considered a limitation, since the class of normal matrices is large enough to include important types of matrices and, moreover, the test problem (1.1) with A normal exhibits unexplored and interesting situations in numerical ODEs.
The plan of the paper is as follows.
Section 2 shows two examples of the stiff situation where \(\gamma _{n}\approx \gamma _{n}^{\mathrm{long}}\) fails in the long-time, with \(\gamma _n\) non-small and growing unboundedly.
Section 3 introduces the definition of “\(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time” of our interest.
Section 4 gives the condition for having, in the stiff situation, \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time.
Section 5 shows that when this condition does not hold, we have, in the stiff situation, \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
Section 6 revises the examples of Sect. 2 in the light of the results of Sects. 4 and 5.
Section 7 studies when the condition for having \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) holds independently of the specific non-rightmost spectrum.
This final subsection includes replies to general questions or criticisms that could be raised about the contents of this paper.
Question. What is the motivation of this paper?
Reply. This paper studies the relative error of numerical approximations of ODEs, although confined to linear systems with a normal matrix. Of course, the absolute error and the relative error of the numerical approximations have the same order of convergence with respect to the stepsize h, but they have different time behaviors in the numerical integration of a solution spanning several orders of magnitude.
The motivation for studying the relative error time behavior in numerical ODEs, as the present paper is doing, comes from the following two facts:
as is widely recognized, the relative error is an important measure of the quality of an approximation, often better than the absolute error;
the numerical ODEs community has paid no attention to the time behavior of the relative error of numerical approximations.
In any case, the fact that in the numerical ODEs field the relative error is considered important is attested by the numerical solvers, which accept as an input argument a tolerance on the componentwise relative error. Thus, this paper (similarly to [9] and [10]) tries to fill this gap between theory, where there are no studies on the relative error, and practice, where the relative error is used.
Question. What is the relevance of the results achieved?
Reply. For the numerical ODEs community, it should be of interest to know the time behavior of the relative error of numerical approximations of the ODE (1.1) with A normal. The results achieved describe this time behavior, and their relevance is that they give a new perspective on the numerical integration errors. We can summarize this new perspective in the following points.
In the non-stiff situation, the relative error is small and it grows linearly in time. Moreover, this linear growth is determined in the long-time only by the rightmost eigenvalues.
In the stiff situation, the relative error is not small at the beginning of the numerical integration and it is not guaranteed that in the long-time it will become small, with a linear growth determined only by the rightmost eigenvalues. This happens if and only if a certain condition is satisfied and this condition is a novelty in the numerical ODE theory.
Gauss RK methods, although they are considered stable in the classic numerical linear stability theory (they are A-stable methods), are not suitable for having the above condition satisfied. On the other hand, Radau and Lobatto IIIC RK methods are suitable for having this condition satisfied.
where \(y_{n,i}\) and \(y_i(t_n)\), \(i=1,\ldots ,d\), are the components of \(y_n\) and \(y(t_n)\), should be considered (as in the numerical ODE solvers), not the normwise relative error (1.3).
Reply. In the literature, both normwise relative errors and componentwise relative errors are considered as quality measures of vector approximations (see [2]). The componentwise approach has the advantage of giving information on the precision of the components, but it has the drawback that the components must be nonzero (when some component becomes zero, we need to switch to the absolute error). On the other hand, the normwise approach can still give information about the componentwise relative errors (for example, a large normwise relative error implies that some component has a large relative error) and it also works when some component becomes zero.
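A two-component toy example (hypothetical numbers, not from the paper) makes the last point concrete: the normwise relative error remains defined when a component of the exact vector is zero, exactly where the componentwise relative error would require switching to the absolute error.

```python
import math

# Normwise relative error of an approximation of a vector `exact`.
def normwise_rel_err(approx, exact):
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(approx, exact)))
    return num / math.sqrt(sum(e ** 2 for e in exact))

exact = [1.0, 0.0]     # componentwise relative error undefined for 2nd entry
approx = [1.0, 0.5]
nw = normwise_rel_err(approx, exact)  # 0.5, no division by a zero component
```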
Criticism. Relative errors should not be considered in situations where the exact solution approaches zero, such as those studied in this paper. A rule of thumb in numerical analysis says that one should switch to the absolute error in this situation.
Reply. In mathematical modeling and numerical analysis, there is a threshold in the order of magnitude of quantities (scalars or vectors) under which they are considered zero. Under the threshold, it is important to use the absolute error for approximations, since they are approximations of zero. But, in the case of a solution of (1.1) which is going to zero, and so spans several orders of magnitude, it could be of interest to compute this solution with good precision for the orders of magnitude larger than the threshold. In this situation, the relative error is important.
Of course, the numerical analyst’s point of view is that the threshold is the order of magnitude of the machine epsilon, but in applications this threshold can be larger.
As an example, we can consider the radioactive decay of radionuclides, where the activity a(t) (measured in becquerel (Bq) by a Geiger counter) of a given amount of radionuclide satisfies \(a^{\prime }(t)=-\lambda a(t)\) with \(\lambda >0\). For a decay chain, we have \(a^{\prime }(t)=A a(t)\), where A is a lower bi-diagonal matrix: the so-called Bateman equation. The threshold could be the order of magnitude \(10^2\ \mathrm{Bq/kg}\) of the background radiation. Of course, this threshold becomes a much smaller power of ten by using a unit larger than the becquerel, e.g. the curie. It could be interesting to compute numerically, with good precision, a solution a(t) whose initial value has order of magnitude \(10^6\ \mathrm{Bq/kg}\) (as in a nuclear plant accident). Since the solution becomes small compared to the initial value, using the relative error for the approximations of the solution is better than using the absolute error, as long as the solution is not yet considered zero.
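A sketch with entirely hypothetical numbers (initial activity, decay constant and absolute error are illustrative choices) shows why: over the four orders of magnitude between \(10^6\) and \(10^2\ \mathrm{Bq/kg}\), a fixed absolute error that is negligible at \(t=0\) becomes a large relative error near the threshold.

```python
import math

# Hypothetical data: activity decaying from 1e6 Bq/kg toward the
# 1e2 Bq/kg background threshold, half-life of one time unit.
a0 = 1e6
lam = math.log(2.0)
threshold = 1e2

def a(t):
    return a0 * math.exp(-lam * t)

abs_err = 1.0                           # hypothetical fixed absolute error (Bq/kg)
rel_at_start = abs_err / a(0.0)         # 1e-6: negligible
t_thr = math.log(a0 / threshold) / lam  # time at which a(t) hits the threshold
rel_at_thr = abs_err / a(t_thr)         # 1e-2: four orders of magnitude larger
```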
Another example could be a space discretization of the heat equation, with homogeneous Dirichlet boundary condition, by the method of lines. In this case, the space discrete temperature approaches zero (the border temperature) and under a given threshold in the order of magnitude, say \(10^{-2}\ ^\circ \mathrm{C}\), it can be considered zero. But, over this order of magnitude, the temperature is not zero and it becomes important to use the relative error for time-space approximations, especially when the solution spans over several orders of magnitude due to an initial value with order of magnitude larger than the threshold, for example \(10^2\ ^\circ \mathrm{C}\).
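The method-of-lines matrix of this example can be sketched directly (the grid size below is an illustrative choice): the 1D Laplacian with homogeneous Dirichlet conditions is symmetric, hence normal, and its closed-form eigenvalues spread over many time scales, which is the stiffness at play in this paper.

```python
import math

# 1D heat equation u_t = u_xx on (0, 1), homogeneous Dirichlet conditions,
# discretized at m interior points: y' = A y with A tridiagonal symmetric.
# Eigenvalues of A: -4/dx^2 * sin^2(k*pi*dx/2), k = 1, ..., m.
m = 99
dx = 1.0 / (m + 1)
eigs = [-4.0 / dx ** 2 * math.sin(k * math.pi * dx / 2.0) ** 2
        for k in range(1, m + 1)]

rightmost = max(eigs)                    # ≈ -pi^2: the slow mode
leftmost = min(eigs)                     # ≈ -4/dx^2: the fastest mode
spread = abs(leftmost) / abs(rightmost)  # large ratio of time scales
```

Refining the grid makes `spread` grow like \(dx^{-2}\), so a stepsize tuned to the slow modes inevitably puts the fast modes in the stiff regime.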
We remark that the analysis in this paper also considers the situation where the solution, instead of approaching zero, grows to values that are large with respect to the initial value. In this situation, too, the relative error is important.
Criticism. The paper considers ODEs (1.1) with matrix A normal. Such problems can be diagonalized with a unitary transformation and then one can assume without loss of generality that A is diagonal.
Reply. In the paper, we do not assume from the beginning that A is diagonal, because this would not simplify the exposition. In fact, the analysis presented starts from the fundamental relation (4.6) given below for the relative error, which maintains the same form when A is diagonal. We have such a clean expression for the relative error precisely because of the possibility of reducing to the diagonal case by a unitary transformation. Hence, the assumption that A is diagonal is already implicit once one decides to deal with a normal matrix.
Criticism. Since it is possible to reduce to the diagonal case, it would be sufficient to study the behavior of the numerical scheme at a scalar problem, which is really trivial.
Reply. Although we can reduce to a linear system of uncoupled scalar differential equations, this does not mean that they are fully uncoupled in the numerical scheme, since the same stepsize h is used in all scalar equations. This reflects the fact that the numerical scheme is applied to an ODE (1.1) with a matrix A that is in general non-diagonal, without diagonalizing it in advance. Moreover, the analysis of the present paper requires rightmost and non-rightmost eigenvalues; in other words, we need eigenvalues with different real parts, i.e. an ODE (1.1) with different time scales. The case of a single scalar equation is not considered. Anyway, we can observe that, in the base situation for a scalar equation, the relative error \(\gamma _n=\gamma _n^{\mathrm{long}}\) is expected to be small and to grow linearly in time up to large times.
2 Examples
In this section, we give two examples of stiff situations where the error \(\gamma _{n}\) is not small from the beginning of the numerical integration and grows without approaching, in the long-time, the small error \(\gamma _{n}^{\mathrm{long}}\).
We remind that the stability region of the approximant R (see [5]) is the set
whose eigenvalues are a and b with corresponding eigenvectors \(\left( 1,1\right) \) and \(\left( 1,-1\right) \), respectively. We consider \(a=-1\) and the following three possibilities for b:
(P1)
\(b=-11\);
(P2)
\(b=-13.5\);
(P3)
\(b=-16\).
The initial value is \(y_{0}=\left( 2,-1\right) \), for which we have
For the possibility (P1), we see in Fig. 2, for \(n=0,1,2,\ldots ,N\), the relative errors \(\gamma _{n}\) (solid red line) and \(\gamma _n^{\mathrm{long}}\) (dashed blue line).
Starting from a non-small \(\gamma _1\) (recall that \(\gamma _0=0\)), the error \(\gamma _{n}\) goes down to the small error \(\gamma _n^{\mathrm{long}}\). In the long-time, we have small errors \(\gamma _{n}\), although the stepsize is tuned only for having a small \(\sigma _{1}\), without any concern about \(\sigma _{2}\).
For the possibility (P2), we see in Fig. 3 the same as in Fig. 2. As in (P1), starting from a non-small \(\gamma _1\), the error \(\gamma _n\) goes down to \(\gamma _n^{\mathrm{long}}\), although \(\gamma _n^{\mathrm{long}}\) is reached at a later time than in (P1).
Finally, for the possibility (P3), we see in Fig. 4 the same as in Figs. 2 and 3. Unlike (P1) and (P2), the error \(\gamma _{n}\) does not go down to \(\gamma _n^{\mathrm{long}}\), but continues to grow.
2.1.1 Order star and stability region
Having fixed \(a=-1\), we are interested in understanding for which b, with \(b<a\), we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. This happens in (P1) and (P2), but not in (P3).
Order star and stability region for the Taylor approximant of order five are depicted in Fig. 5.
The values of \(\left| S\left( hb\right) \right| \) and \(\left| R\left( hb\right) \right| \) are:
Observe that \(hb\in {\mathscr {R}}\) for all three possibilities and \(hb\in {\mathscr {S}}^{c}\) only in (P1). In other words, by looking at the negative real axis of Fig. 5, hb lies in the red region for all three possibilities and hb lies in the white finger only in (P1).
In Sect. 4, we will see a condition on hb for having \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. When it does not hold, we have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time. The condition is something between \(hb\in {\mathscr {S}}^c\) (i.e. to stay in the white finger) and \(hb\in {\mathscr {R}}\) (i.e. to stay in the red region): to have \(hb\in {\mathscr {S}}^c\) is sufficient, but not necessary, for this condition on hb, and to have \(hb\in {\mathscr {R}}\) is necessary, but not sufficient.
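The two sets can be probed numerically for the Taylor approximant of order five. Since the stepsize h and the quoted values of \(\left| S\left( hb\right) \right| \) and \(\left| R\left( hb\right) \right| \) are not reproduced in this excerpt, the points probed below are hypothetical; moreover, the sketch reads \({\mathscr {R}}\) as the stability region \(\{z:\left| R(z)\right| \le 1\}\) and \({\mathscr {S}}\) as the order star region \(\{z:\left| R(z)\right| \ge \left| \mathrm{e}^{z}\right| \}\), consistently with how both sets are used here and in Sect. 2.2.1.

```python
import cmath
import math

# Taylor approximant of e^z of order five
def R(z):
    return sum(z ** k / math.factorial(k) for k in range(6))

def in_stability_region(z):      # assumed reading of the set R
    return abs(R(z)) <= 1.0

def in_white_finger(z):          # assumed reading of the set S^c
    return abs(R(z) / cmath.exp(z)) < 1.0

# hypothetical probe points on the negative real axis: close to the origin
# both tests pass; far out on the negative real axis both fail
near_ok = in_stability_region(-1.0) and in_white_finger(-1.0)
far_ok = (not in_stability_region(-6.0)) and (not in_white_finger(-6.0))
```

Moving hb leftward along the negative real axis thus eventually leaves both sets, which is the mechanism behind the three possibilities (P1)–(P3).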
2.2 Same ODE with different approximants
As second example, we consider the ODE (1.1) with the normal matrix
The solution y consists of two decaying oscillations: the fast oscillation \(y-y^{\mathrm{long}}\) decays faster than the slow oscillation \(y^{\mathrm{long}}\) and, in the long-time, only the slow oscillation is present. We have \(y(t)\approx y^{\mathrm{long}}(t)\) if
Assume that the numerical integration of the ODE is accomplished by the fourth-order two-stage Gauss RK method, corresponding to the \((2,2)\)-Padé approximant
Both methods are applied with stepsize \(h=\frac{1}{10}\) over \(N=100\) steps up to \(t_N=Nh=10\). Observe that such a stepsize is not suitable for approximating the fast oscillation.
$$\begin{aligned} \gamma _{n}^{\mathrm{long}}\approx t_n\rho _1 E_1 =\left\{ \begin{array}{l} t_n\cdot 7.86\cdot 10^{-7} \quad\text { for the Gauss RK method}\\ t_n\cdot 5.41\cdot 10^{-5} \quad \text { for the Radau RK method} \end{array}\right. \end{aligned}$$
for \(t_n\le t_N\).
In the upper part of Fig. 6, we see the trajectory \(t_n\mapsto \left( y_1\left( t_{n}\right) ,y_2\left( t_{n}\right) \right) \) in the plane \({\mathbb {R}}^2\) for the first two components of the exact solution \(y(t_n)\), when \(t_{n}\in \left[ 8,10\right] \). In the middle and lower parts, we see the trajectory \(t_n\mapsto \left( y_{n,1} ,y_{n,2}\right) \) for the first two components of the numerical solution \(y_n\), when \(t_{n}\in \left[ 8,10\right] \): the middle part refers to the Gauss RK method and the lower part to the Radau RK method.
For the long-time \(t_{n}\in \left[ 8,10\right] \), where only the slow oscillation \(y^{\mathrm{long}}\) is present, the exact components \(y_1(t_n)\) and \(y_2(t_n)\) are equal and have order of magnitude \(10^{-4}\). The Gauss RK method exhibits numerical components \(y_{n,1}\) and \(y_{n,2}\) of order of magnitude \(10^0\). On the other hand, the Radau RK method exhibits accurate numerical components \(y_{n,1}\) and \(y_{n,2}\), although the stepsize is not suitable for approximating the fast oscillation.
In Fig. 7, we see the error \(\gamma _n\), for \(n=0,1,\ldots ,N\), for both approximants: for the Gauss RK method the error continues to grow and for the Radau RK method it goes down to \(\gamma _n^{\mathrm{long}}\).
2.2.1 Order star and stability region
Having fixed \(\lambda _1=-1+i\) and \(\lambda _3=-3+1000i\), we are interested in understanding for which approximants we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. In our situation, this happens for the Radau RK method, but not for the Gauss RK method.
Order star and stability region for such approximants are shown in Fig. 8.
We have \(h\lambda _{3}\in {\mathscr {R}}\) for both methods, since they are A-stable. On the other hand, we have \(h\lambda _{3}\in {\mathscr {S}}^c\) only for the Radau RK method:
$$\begin{aligned} \left| S\left( h\lambda _3\right) \right| =\left\{ \begin{array}{l} 1.3494\quad \text { for the Gauss RK method}\\ 0.0270\quad \text { for the Radau RK method.} \end{array}\right. \end{aligned}$$
In Sect. 4, we will see a condition on the approximant for having \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. When the condition does not hold, we have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time. To have \(h\lambda _3\in {\mathscr {S}}^c\) (i.e., with reference to Fig. 8, the white region of the approximant contains \(h\lambda _3\)) is sufficient, but not necessary, for this condition on the approximant, and to have \(h\lambda _3\in {\mathscr {R}}\) is necessary, but not sufficient.
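The values of \(\left| S\left( h\lambda _3\right) \right| \) quoted above can be checked directly: the stability function of the two-stage Gauss method is the \((2,2)\)-Padé approximant of \(\mathrm{e}^{z}\) and, assuming the Radau method used is the two-stage Radau IIA method, its stability function is the \((1,2)\)-Padé approximant.

```python
import cmath

# Stability functions: (2,2)-Pade (two-stage Gauss) and, under the assumption
# stated above, (1,2)-Pade (two-stage Radau IIA); S(z) = R(z)/e^z.
def R_gauss(z):
    return (1 + z / 2 + z ** 2 / 12) / (1 - z / 2 + z ** 2 / 12)

def R_radau(z):
    return (1 + z / 3) / (1 - 2 * z / 3 + z ** 2 / 6)

h = 0.1
lam3 = -3 + 1000j
S_gauss = abs(R_gauss(h * lam3) / cmath.exp(h * lam3))  # 1.3494: not in S^c
S_radau = abs(R_radau(h * lam3) / cmath.exp(h * lam3))  # 0.0270: in S^c
```

Both stability functions satisfy \(\left| R\left( h\lambda _3\right) \right| <1\) (A-stability), so the difference between the two methods is visible only in the order-star quantity \(\left| S\left( h\lambda _3\right) \right| \).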
3 The appropriate definition of \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time
In the following, we assume to be in the base situation. Then, it is expected \(E_1\) small and \(h\rho _1\) non-large. To ease the exposition, we assume \(E_1\) small and \(h\rho _1\) non-large.
Since \(E_1\) is small, the error \(\gamma _{n}^{\mathrm{long}}\) grows linearly in time and remains small up to large times \(t_n\rho _1\).
(The number c plays a role similar to that of the number c appearing in Theorems 1.1, 1.2 and 1.3.) As a reference value for c, one can take \(c=1\). For the sake of generality, we do not confine c to this value alone. In all theorems below, it is stated for which \(c>0\) they are valid. However, when the theorems are applied, c is considered non-small, so we have
$$\begin{aligned} \tau \gg 1, \end{aligned}$$
and such that \(g(c)<1\), i.e. \(c<1.2564\), with \(1-g(c)\) non-small.
In order to describe the long-time behavior of the error \(\gamma _n\), we compare it to \(\gamma _n^{\mathrm{long}}\) and we are interested in whether or not \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time.
Here, “in the long-time” does not mean \(t_n\rho _1\rightarrow +\infty \). In fact, it is not of great interest to consider what happens for \(t_n\rho _1\rightarrow +\infty \), since \(\gamma _{n}^{\mathrm{long}}\) becomes non-small for a sufficiently large \(t_n\rho _1\). It is of interest to have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) starting from times \(t_n\rho _1\) such that \(\gamma _{n}^{\mathrm{long}}\) is still small.
So, we introduce the following definition.
Definition 3.1
We say that \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time if, for some \(s\in [0,\tau )\), \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) for \(t_n\rho _1\) in the interval \([s,\tau ]\) and \(\gamma _{n}^{\mathrm{long}}\ll 1\) for \(t_n\rho _1\) up to the beginning of this interval, i.e. for \(t_n\rho _1\in [0,\kappa s]\) with \(\kappa \ge 1\) non-large.
In the definition, we consider times \(t_n\rho _1\) up to \(\tau \). Observe that if \(K_1\) is not large (recall (1.12) and that \(K_1=1\) is the generic situation for the matrix A), then the error \(\gamma _n^{\mathrm{long}}\) is not small for \(t_n\rho _1\) at the end of the interval \([0,\tau ]\).
In fact, for \(t_n\rho _1\in [\kappa \tau ,\tau ]\), where \(\kappa \in (0,1]\) is not small, by Theorem 1.2 we have
where \(e_n\) is such that \(\gamma _n=\gamma _n^{\mathrm{long}}(1+e_n)\).
Remark 3.1
In the previous definition, we also allow monitor functions \(s:(0,a]\times [b,+\infty )\rightarrow [0,+\infty )\), where \(0<a,b<+\infty \). In this case, we have to specify that (3.1) holds for \(\varepsilon \in (0,a]\) and (3.2) holds for \(\varepsilon \in (0,a]\) and \(\tau \ge b\).
3.2 What does the definition with monitor function mean?
Suppose \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time with monitor function s.
Let \(\varepsilon >0\). By (3.2), we see that \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon \) for \(t_n\rho _1\in [s(\varepsilon ,\tau ),\tau ]\). Moreover, by Theorem 1.2, we see that if \(\frac{s(\varepsilon ,\tau )}{\tau }\ll 1\), then
for \(t_n\rho _1\in [0,\kappa s(\varepsilon ,\tau )]\), where \(\kappa \ge 1\) is not large, i.e. \(\gamma _n^{\mathrm{long}}\ll 1\) for \(t_n\rho _1\) up to the beginning of the interval \([s(\varepsilon ,\tau ),\tau ]\).
Regarding the satisfiability of \(\frac{s(\varepsilon ,\tau )}{\tau }\ll 1\), observe that s satisfies (3.1) and we have \(\tau \gg 1\).
4 Analysis of the long-time behavior of \(\gamma _n\)
In the paper [9], an analysis of the long-time behavior of the error \(\gamma _n\) was presented, important for the non-stiff situation. In the present paper, we develop another type of analysis, important for the stiff situation. In this new analysis, the complex numbers \(w_i\) and \(\alpha _i\) introduced below play an important role.
4.1 The numbers \(w_i\)
For any \(\lambda _{i}\in \varLambda ^{-}\), i.e. for any non-rightmost eigenvalue, we introduce the complex number
It is expected \(\vert \alpha \vert \) non-small. In fact, let \(\lambda _i\in \varLambda _j\), with \(j=2,\ldots ,q\), be a non-rightmost eigenvalue such that
with \(\vert e\vert \ll 1\), is “unlikely”. Observe that it is expected \(\vert \beta _j\vert \) non-small.
In the non-stiff situation, it is expected \(\alpha \) negative non-small. In fact, it is expected \(\vert \beta _2\vert \) non-small and, in the non-stiff situation, it is expected that the right-hand side of (4.4) is small and then it is expected \(\vert \alpha -\beta _{2}\vert \) small.
4.3 The basic theorem
The next theorem is, in our new analysis, the analog of Theorem 5.3 in [9] (which was suitable for studying the long-time behavior of \(\gamma _n\) in the non-stiff situation).
Theorem 4.1
Assume \(q>1\) and \(0\notin \varLambda _{1}\). Fix \(c>0\) such that \(g(c)<1\), i.e. \(c<1.2564\).
In the following, we continue to assume \(0\notin \varLambda _1\), but all our conclusions are valid (with easy adaptations) also for the case \( 0\in \varLambda _1\) and \(\varLambda _1\ne \{0\}\).
4.4 A first result
We give a first theorem about \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time with a monitor function.
Theorem 4.2
Assume \(q>1\) and \(0\notin \varLambda _1\). Fix \(c>0\) such that \(g(c)<1\), i.e. \(c<1.2564\).
We have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time with monitor function
By inverting the function f, we obtain the monitor function (4.10). \(\square \)
Remark 4.2
1.
For \(\Vert Q_1\widehat{y}_0\Vert _2\) sufficiently close to 1, we have \(s(\varepsilon )<0\). There are two ways for dealing with this. One is to redefine \(s(\varepsilon )\) as 0 when \(s(\varepsilon )<0\). The other is to use (0, a] as domain of s, where \(s(a)=0\). So, we have \(s(\varepsilon )\ge 0\) for \(\varepsilon \in (0,a]\).
2.
By (4.10), (1.14) and (1.12), one can easily prove Theorem 1.3.
The previous theorem with \(c=1\) gives the following results.
then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) for \(t_n\rho _1\) in the interval \([s,\tau ]\) and \(\gamma _n^{\mathrm{long}}\ll 1\) for \(t_n\rho _1\) up to the beginning of this interval. In particular, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon \) for \(t_n\rho _1\in [s,\tau ]\) and
It is expected that the last three conditions in (4.13) are satisfied. In fact, since \(E_1\ll 1\), they are not satisfied only in “extreme” cases. Moreover, in the non-stiff situation, it is expected that the first condition is satisfied. In fact, since in the non-stiff situation it is expected \(\frac{\max \nolimits _{\lambda _i\in \varLambda ^{-}}\left| \sigma _{i}\right| }{h\rho _1}\ll 1\), the first condition is not satisfied only in “extreme” cases.
So, we can state the following important conclusion.
Conclusion 4.5
Suppose that we are in the non-stiff situation. It is expected that \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time.
where W and \(\alpha \) are defined in (4.1) and (4.2), respectively.
Next theorem shows that, under the condition A, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time with a new monitor function different from (4.10).
Theorem 4.6
Assume \(q>1\) and \(0\notin \varLambda _1\). Fix \(c>0\) such that \(g(c)<1\), i.e. \(c<1.2564\).
If A holds, then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time with monitor function
defined for \(x\ge 1\) and for \(\varepsilon >0\) such that the right-hand side of (4.14) with \(x=1\) is greater than or equal to 1, so that \(s(\varepsilon ,x)\ge 1\) for \(x\ge 1\) and such \(\varepsilon \).
Proof
Suppose that A holds. Let \(\tau \ge 1\) and let \(s\in [1,\tau ]\). For \(t_{n}\rho _1 \in \left[ s,\tau \right] \), in (4.5) of Theorem 4.1 we have
By looking at the proof of the previous theorem, we see that there is also a monitor function \(s(\varepsilon ,x)\) defined for all \(\varepsilon >0\) and \(x>0\). It is obtained by inverting with respect to s the upper bound
of \(\vert e_n\vert \), where \(e_n\) is given in Theorem 4.1. Observe that the inverse exists since \(\frac{\mathrm{e}^{-\min \{\vert \alpha \vert ,\vert \beta _2\vert \}s}}{s}\) is a strictly decreasing function of s. This new monitor function has the advantage that it is no longer necessary to suppose \(\varepsilon \le a\) (where is a given at point 1) above) and \(\tau \ge 1\). However, we prefer to use the old monitor function (4.14) because it has an explicit expression.
The previous theorem with \(c=1\) gives the next important results.
Theorem 4.7
Suppose A holds. Let \(\tau =\frac{1}{E_1}\), let \(k>0\) and let
then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) for \(t_n\rho _1\) in the interval \([\max \{1,s\},\tau ]\) and \(\gamma _n^{\mathrm{long}}\ll 1\) for \(t_n\rho _1\) up to the beginning of this interval. In particular, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon \) for \(t_n\rho _1\in [\max \{1,s\},\tau ]\) and
Theorem 4.6 says that \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon \) for \(t_n\rho _1\in [s,\tau ]\). If \(s<1\), consider \(\overline{k}\), with \(\overline{k}> k\), such that
We have \(s(\overline{\varepsilon },\tau )=1\), where \(\overline{\varepsilon }=\mathrm{e}^{-\overline{k}}\), with \(\overline{\varepsilon }< \varepsilon \). Theorem 4.6 says that \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\overline{\varepsilon }\), and then with degree \(\varepsilon \), for \(t_n\rho _1\in [1,\tau ]\). \(\square \)
then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time.
Proof
Use the previous theorem with \(\varepsilon =\frac{1}{\tau }=E_1\ll 1\). \(\square \)
It is expected that if A holds, then the three conditions in (4.16) are satisfied. In fact, since \(E_1\ll 1\), they are not satisfied only in “extreme” cases. So, it is expected that if A holds then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. Of course, we already know that in the non-stiff situation it is expected \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time, independently of the condition A, just as we know that it is expected that A holds in the non-stiff situation.
So, what is really important is the following conclusion.
Conclusion 4.9
Suppose that we are in the stiff situation. It is expected that if A holds, then \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time.
4.5.1 Order star and stability region
The condition A can be related to the order star and the stability region of the approximant (recall the beginning of Sect. 2).
Theorem 4.10
Let \({\mathscr {S}}^c\) be the complementary set of the order star of the approximant. The condition
$$\begin{aligned} h\lambda _{i}\in {\mathscr {S}}^{c} \text { for any }\lambda _{i}\in \varLambda ^- \end{aligned}$$
implies A.
Proof
Let \(\lambda _{i}\in \varLambda ^{-}\). If \(h\lambda _{i}\in {\mathscr {S}}^{c}\), then \(|w_{i}|<1\). In fact, if \(h\lambda _{i}\in {\mathscr {S}}^{c}\), then
In the next section, we study what happens when A does not hold. Of course, when A does not hold, it is expected that B holds.
5 When the condition A does not hold
The next theorem helps to say when \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\). We exclude the time \(t_n\rho _1=0\), i.e. the index \(n=0\), since for \(n=0\) the ratio \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\) is the indeterminate form \(\frac{0}{0}\).
5.1 Definition of \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time
Here, by “for all time” we do not mean for all times \(t_n\rho _1\), since in our analysis we consider \(t_n\rho _1\) only up to \(\tau \). So, we introduce the following definition.
Definition 5.1
We say that \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time if \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for \(t_n\rho _1\in (0,\tau ]\).
This definition is made more precise by using a monitor function.
Definition 5.2
Let \(F:(0,+\infty )\rightarrow (0,+\infty )\) be a function such that
then \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for \(t_n\rho _1\in (0,\tau ]\). In particular, we have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\ge F(\tau )\) for \(t_n\rho _1\in (0,\tau ]\).
then \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
In the stiff situation, it is expected that if B holds, then (5.5) holds. In fact, \(E_1\ll 1\) and it is expected \(\vert \alpha \vert \) non-small. So, we can state the following important conclusion.
Conclusion 5.5
Suppose that we are in the stiff situation. It is expected that if B holds, then \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
The all-time lower bound (5.4) of the ratio \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\) is proportional to \(\alpha \tau \). At the end of the interval \([0,\tau ]\), this ratio has a lower bound exponential in \(\alpha \tau \).
In fact, by Theorem 5.1, we see that for \(t_n\rho _1\in [\kappa \tau ,\tau ]\), where \(\kappa \in (0,1]\) is not small,
Although it is expected that C does not hold, we study the condition C anyway, since it characterizes the transition between \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time and \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
For the condition C, we need a weak form of \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
Definition 5.3
Let \(S:(0,+\infty )\rightarrow (0,+\infty )\) be a function such that
We say that S-weakly \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time if \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for \(t_n\rho _1\in (0,S(\tau )]\).
Here is the definition with a monitor function.
Definition 5.4
Let \(S,F:(0,+\infty )\rightarrow (0,+\infty )\) be functions such that
then \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for \(t_n\rho _1\in (0,\tau ^{v}]\). In particular, we have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\ge F(\tau )\) for \(t_n\rho _1\in (0,\tau ^{v}]\).
Theorem 5.8
Suppose C holds. Let \(v\in \left( 0,1\right) \). If
then S-weakly \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time, where \(S(x)=x^v,\ x\ge 1\).
Suppose that we are in the stiff situation and that C holds with \(E_1^{1-v}\ll 1\). Then it is expected that (5.8) holds. So, we can state the following conclusion.
Conclusion 5.9
Let \(S(x)=x^v,\ x\ge 1\), with \(v\in (0,1)\) such that \(E_1^{1-v}\ll 1\). Suppose that we are in the stiff situation and that C holds. It is expected that S-weakly \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
6 Examples revisited
Now, we look at the two examples of Sect. 2 in the light of the results of Sects. 4 and 5.
6.1 Same approximant with different ODEs
The conditions A, B and C are \(W<1\), \(W>1\) and \(W=1\), respectively, where
for \(t_n\rho _1\in [0,\kappa s]\), where \(\kappa \ge 1\) is not large. We have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. Observe that the values s agree with Figs. 2 and 3.
In (P3), the condition B holds. The value of the monitor function (5.4) is \(F(\tau )=8.79\cdot 10^5\). We have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\ge F(\tau )\) for \(t_n\rho _1=t_n\in (0,\tau ]\) and so \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
6.1.1 The region \({\mathscr {R}}_{hr_1}\)
Recall Sect. 4.5.2. The condition A can be stated as
The region \({\mathscr {R}}_{-0.2}\) is shown in Fig. 9 (compare with Fig. 5 showing \({\mathscr {R}}_0 =\overset{\circ }{{\mathscr {R}}}\)). The part of \({\mathscr {R}}_{-0.2}\cap (-\infty ,-0.2)\) in the white finger corresponds to the sufficient condition \(hb\in {\mathscr {S}}^{c}\). Out of the white finger, we have an additional range of values for hb guaranteeing the condition A. The border value for b between the conditions A and B, where the condition C holds, is \(b=-15.565\). Observe that we are out of the white finger for \(b<-11.887\) and out of the stability region for \(b<-16.085\).
6.2 Same ODE with different approximants
The conditions A, B and C are \(W<1\), \(W>1\) and \(W=1\), respectively, where
For the Gauss RK method, the condition B holds. The value of the monitor function (5.4) is \(F(\tau )=3.31\cdot 10^5\). We have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\ge F(\tau )\) for \(t_n\rho _1=\sqrt{2}t_n\in (0,\tau ]\) and so \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
For the Radau RK method, the condition A holds. The value of s in (4.15) relevant to \(k=3\) is \(s=9.25\). We have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon =4.98\cdot 10^{-2}\) for \(t_n\rho _1=\sqrt{2}t_n\in [s,\tau ]\) and
for \(t_n\rho _1\in [0,\kappa s]\), where \(\kappa \ge 1\) is not large. We have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. With reference to Fig. 6, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) with degree \(\varepsilon \) for \(t_n\in [8,10]\), i.e. for \(t_n\rho _1=\sqrt{2}t_n\in [11.31,14.14]\).
The region \({\mathscr {R}}_{-0.1}\) for the two methods is shown in Fig. 10. In the left part of the figure, we see that the region for the Gauss RK method does not cover points with large imaginary part on the line
On the other hand, in the right part, we see that the region for the Radau RK method completely includes this line.
7 Independence of the non-rightmost spectrum
In this section, we study when the condition A holds independently of the particular non-rightmost spectrum \(\varLambda ^{-}\).
Here, we consider an analytic approximant R with domain \({\mathscr {D}}\) such that \(\{z\in {\mathbb {C}}:\mathrm{Re}\left( z\right) < \beta _R\}\subseteq {\mathscr {D}}\) for some \(\beta _R\in (0,+\infty ]\), i.e. \({\mathscr {D}}\) includes a left half-plane.
7.1 The property A(x)
We introduce the property \(\mathrm{A}(x)\) of the approximant R.
Definition 7.1
Let \(x<\beta _R\). Let
$$\begin{aligned} \mathrm{A}(x)\ \ \overset{\mathrm{def}}{\Longleftrightarrow } \ \ \mathrm{e}^{-x}\left| R\left( z\right) \right|<1\text { for all }z\in {\mathbb {C}}\text { such that } \mathrm{Re}\left( z\right) <x, \end{aligned}$$
where \(\overset{\mathrm{def}}{\Longleftrightarrow }\) has the meaning of “if and only if” by definition.
The property \(\mathrm{A}(x)\) can be also written as
where \({\mathscr {R}}_{x}\) is the region defined in (4.18). Observe that A(0) is the A-stability property.
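A small numerical sketch of the property \(\mathrm{A}(x)\), using the (1,1) diagonal Padé approximant \(R(z)=(1+z/2)/(1-z/2)\) as a hypothetical test case: A(0) holds (this is A-stability), while A(x) already fails for a small negative x, as Theorem 7.1 below predicts (the specific sample point exhibiting the failure is our choice).

```python
import math, random

R = lambda z: (1 + z/2) / (1 - z/2)   # (1,1) diagonal Pade, a hypothetical test case

# A(0), i.e. A-stability: |R(z)| < 1 for Re z < 0 (spot check by sampling)
random.seed(0)
for _ in range(10000):
    z = complex(random.uniform(-50.0, -1e-6), random.uniform(-50.0, 50.0))
    assert abs(R(z)) < 1

# A(x) fails for a small x != 0: a point of the order star near the origin
# with Re z < x but e^{-x}|R(z)| >= 1
x = -0.01
z = -0.0101 + 0.3j
assert z.real < x and math.exp(-x) * abs(R(z)) >= 1
print("A(0) holds on the sample; A(-0.01) is violated at", z)
```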
The property \(\mathrm{A}(x)\) is important because \(\mathrm{A}(hr_{1})\) implies the condition A for all non-rightmost spectra \(\varLambda ^{-}\).
It is of interest to consider the property \(\mathrm{A}(x)\) for \(\vert x\vert \) non-large. In fact, \(\vert hr_{1}\vert \le h\rho _1\) and we are assuming \(h\rho _1\) non-large.
We have the following negative result.
Theorem 7.1
There exists \(x_0>0\) such that, for \(x<\beta _R\) with \(\vert x\vert \le x_0\) and \(x\ne 0\), \(\mathrm{A}(x)\) is not true.
Proof
Recall that l is the order of the approximant R. In the complex plane, there exists a small disk centered at the origin which consists of \(l+1\) sectors of width \(\frac{\pi }{l+1}\) included in the order star \({\mathscr {S}}\), alternating with \(l+1\) sectors of width \(\frac{\pi }{l+1}\) included in \( {\mathscr {S}}^{c}\). Thus, there exists \(x_0>0\) such that, for \(x<\beta _R\) with \(\vert x\vert \le x_0\) and \(x\ne 0\), the line \(\mathrm{Re}(z)=x\) has a non-empty intersection with the order star \({\mathscr {S}}\). Let w be a point in this intersection. We have \(\mathrm{Re}\left( w\right) =x\) and
Then, due to the continuity of R, there exists \(\varepsilon >0\) such that, for any \(z\in {\mathbb {C}}\) with \(x-\varepsilon \le \mathrm{Re} \left( z\right) \le x\) and \(\mathrm{Im}\left( z\right) =\mathrm{Im} \left( w\right) \), we have
Theorem 7.1 says that, for any \(x<\beta _R\) with \(\vert x\vert \le x_0\) and \(x\ne 0\), there exists \(z\in {\mathbb {C}}\) with \(\mathrm{Re}(z)<x\) such that \(\mathrm{e}^{-x}\vert R(z)\vert \ge 1\). So, for some normal matrix A, we can have a situation where the rightmost real part is \(r_{1}=\frac{x}{h}\) and \(\lambda _{i}=\frac{z}{h}\) is a non-rightmost eigenvalue. For this eigenvalue, we have
and it is expected \(\vert \beta _2 \vert \) non-small.
So, the negative result of Theorem 7.1 is not disastrous. The theorem says that, for any rightmost real part \(r_1\ne 0\) with \(\vert hr_1\vert \le x_0\), there is a situation where we have a non-rightmost eigenvalue \(\lambda _i\) such that \(\vert w_i\vert \ge 1\). But such an eigenvalue could be non-significant and, if this is the case, it is expected that such a situation does not occur.
In Sect. 7.10 below, we will introduce a condition on the approximant under which any non-rightmost eigenvalue \(\lambda _i\) with \(\vert w_i\vert \ge 1\) is non-significant.
7.3 The properties \(\mathrm{A}(x,a)\) and \(\mathrm{B}(x,a)\)
It is expected that any non-rightmost eigenvalue \(\lambda _i\) with \(\left| w_{i}\right| \ge 1\) has \(\vert h\lambda _i\vert \) non-small. In fact, it is expected that \(\lambda _i\) is significant, i.e. it is expected that
is not small, and then it is “unlikely” to have \(\vert h\lambda _i\vert \) small.
Thus, we look at condition A for a non-rightmost spectrum \(\varLambda ^{-}\) with all the eigenvalues \(\lambda _i\) such that \(\vert h\lambda _i\vert \) is not small. In this context, the following two properties of the approximant R are important.
Definition 7.3
Let \(x<\beta _{R}\) and let \(a\ge 0\). Let
$$\begin{aligned}&\mathrm{A}(x,a)\ \ \overset{\mathrm{def}}{\Longleftrightarrow }\ \ \mathrm{e}^{-x}\left| R\left( z\right) \right|<1\text { for all }z\in {\mathbb {C}}\text { such that }\mathrm{Re}\left( z\right)<x \text { and}\ \left| z\right| \ge a\\&\mathrm{B}(x,a)\ \ \overset{\mathrm{def}}{\Longleftrightarrow }\ \ \mathrm{e}^{-x}\left| R\left( z\right) \right| >1\text { for all }z\in {\mathbb {C}}\text { such that }\mathrm{Re}\left( z\right) <x\text { and}\ \left| z\right| \ge a. \end{aligned}$$
The properties \(\mathrm{A}(x,a)\) and \(\mathrm{B}(x,a)\) can be also written as
The properties \(\mathrm{A}(x,a)\) and \(\mathrm{B}(x,a)\) are important because \(\mathrm{A}(hr_{1},a)\) implies the condition A for all non-rightmost spectra \(\varLambda ^{-}\) such that \(h\mu ^{-}\ge a\) and \(\mathrm{B}(hr_{1},a)\) implies the condition B for all non-rightmost spectra \(\varLambda ^{-}\) such that \(h\rho ^{-}\ge a\). Recall that \(\mu ^{-}\) and \(\rho ^{-}\) are defined in (1.5).
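The two properties can be explored by sampling. The following sketch uses as hypothetical test cases the implicit Euler approximant \(R(z)=1/(1-z)\) for \(\mathrm{A}(x,a)\) and the Taylor approximant of degree 2 for \(\mathrm{B}(x,a)\); the values of x and a are our choices, for illustration only.

```python
import math, random

def no_counterexample(R, x, a, violated, trials=100000, box=20.0):
    # search for z with Re z < x and |z| >= a violating the property
    random.seed(3)
    for _ in range(trials):
        z = complex(random.uniform(-box, box), random.uniform(-box, box))
        if z.real < x and abs(z) >= a and violated(math.exp(-x) * abs(R(z))):
            return False
    return True

x = -0.2
# A(x, a) for implicit Euler R(z) = 1/(1-z): holds once a exceeds roughly 0.303
# (cf. the number a(x) of Sect. 7.10)
A_ok = no_counterexample(lambda z: 1 / (1 - z), x, a=0.31,
                         violated=lambda v: v >= 1)
# B(x, a) for the degree-2 Taylor approximant (which has L = +inf): holds for a = 5
B_ok = no_counterexample(lambda z: 1 + z + z**2 / 2, x, a=5.0,
                         violated=lambda v: v <= 1)
print(A_ok, B_ok)   # True True
```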
If \(L>\mathrm{e}^{hr_{1}}\), then for any \(\theta \in \left( 1,\mathrm{e}^{-hr_{1}}L\right) \) and for any non-rightmost spectrum \(\varLambda ^{-}\) satisfying \(h\rho ^{-}\ge a\), where a is given in (7.4) with \(x=hr_1\), the condition B holds with
Observe that, by varying \(\theta \) in \(\left( 1,\mathrm{e}^{-hr_{1}} L\right) \), the lower bound \(\frac{\log \theta }{h\rho _1}\) of \(\alpha \) can be arbitrarily close from below to the positive number
If \(L<\mathrm{e}^{hr_{1}}\), then for any \(\theta \in \left( \mathrm{e}^{-hr_{1}}L,1 \right) \) and for any non-rightmost spectrum \(\varLambda ^{-}\) satisfying \(h\mu ^{-}\ge a\), where a is given in (7.6) with \(x=hr_1\), the condition A holds with
where \(L_{\mathrm{inf}}=\inf \nolimits _{\mathrm{Re}(z)<hr_1}\vert R(z)\vert \).
Proof
Suppose \(h\mu ^{-}\ge a\). For a non-rightmost eigenvalue \(\lambda _i\) such that \(\alpha =\mathrm{Re}(\alpha _i)\), we have \(\vert h\lambda _i\vert \ge a\) and then
Observe that, by varying \(\theta \) in \(\left( \mathrm{e}^{-hr_{1}}L,1 \right) \), the upper bound \(\frac{\log \theta }{h\rho _1}\) of \(\alpha \) can be arbitrarily close from above to the negative number
If, in addition, \(L=L_{\mathrm{inf}}\), then \(\alpha \) is not smaller than this negative number and \(\alpha \) can be arbitrarily close to it.
7.7 Approximants with \(L=+\infty \)
Consider approximants with \(L=+\infty \). Examples of such approximants are Taylor approximants and superdiagonal Padé approximants.
The results in Sect. 7.5 say that the condition B holds for \(h\rho ^{-}\) sufficiently away from zero, as confirmed in the first example of Sect. 2. In particular, B holds for
As \(h\rho ^{-}\rightarrow +\infty \), B holds with \(\alpha \rightarrow +\infty \).
7.8 Approximants with \(L=0\)
Consider approximants with \(L=0\). Examples of such approximants are subdiagonal Padé approximants. Radau and Lobatto IIIC RK methods correspond to the first and second subdiagonal Padé approximants, respectively.
The results in Sect. 7.6 say that the condition A holds for \(h\mu ^{-}\) sufficiently away from zero. In particular, A holds for
As \(h\mu ^{-}\rightarrow +\infty \), A holds with \(\alpha \rightarrow -\infty \). Moreover, Theorem 7.1 says that, for any rightmost real part \(r_1\ne 0\) with \(\vert hr_1\vert \le x_0\), we cannot have that A holds for all \(h\mu ^{-}\).
A-stable approximants with \(L=0\) are called L-stable (see [5]) and they are considered particularly suitable for integrating very stiff ODEs (see [1, 3, 8, 11]). Observe that here we are also considering approximants with \(L=0\) which are not A-stable. Indeed, the A-stability property does not play a crucial role in this context. Among subdiagonal Padé approximants, only the first and second subdiagonal Padé approximants (Radau and Lobatto IIIC methods) are A-stable.
7.9 Approximants with \(L=1\)
Consider approximants with \(L=1\). Examples of approximants with \(L=1\) are diagonal Padé approximants, which are also A-stable. Gauss methods correspond to the diagonal Padé approximants.
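The three values of \(L=\lim \nolimits _{z\rightarrow \infty }\vert R(z)\vert \) discussed in Sects. 7.7–7.9 can be illustrated by evaluating \(\vert R(z)\vert \) far out on the negative real axis, with one representative approximant per family (a sketch; the representatives are our choices):

```python
taylor2 = lambda z: 1 + z + z**2 / 2        # Taylor of degree 2:      L = +inf
ieuler  = lambda z: 1 / (1 - z)             # (0,1) subdiagonal Pade:  L = 0
diag11  = lambda z: (1 + z/2) / (1 - z/2)   # (1,1) diagonal Pade:     L = 1

z = -1e6                                    # far out on the negative real axis
L_values = (abs(taylor2(z)), abs(ieuler(z)), abs(diag11(z)))
print(L_values)   # very large, nearly 0, nearly 1
```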
Suppose \(r_{1}<0\). The results in Sect. 7.5 say that the condition B holds for \(h\rho ^{-}\) sufficiently away from zero, as confirmed in the second example of Sect. 2. In particular, B holds for
For an A-stable approximant, B holds with \(\alpha \le -\frac{r_{1}}{\rho _1}\) and, as \(h\rho ^{-}\rightarrow +\infty \), \(\alpha \rightarrow -\frac{r_{1}}{\rho _1}\).
Suppose \(r_1>0\). The results in Sect. 7.6 say that the condition A holds for \(h\mu ^{-}\) sufficiently away from zero. In particular, A holds for
For an A-stable approximant, A holds with \(\alpha \ge -\frac{r_{1}}{\rho _1}\) and, as \(h\mu ^{-}\rightarrow +\infty \), \(\alpha \rightarrow -\frac{r_{1}}{\rho _1}\).
7.10 Non-significant eigenvalues II
In this subsection we study when any non-rightmost eigenvalue \(\lambda _i\) with \(\vert w_i\vert \ge 1\) is non-significant (see Sect. 7.2).
where \({\mathscr {R}}_{x}^{c}\) is the complementary set of \({\mathscr {R}}_x\).
We have \(\mathrm{A}(x)\) if and only if \({\mathscr {P}}_{x}=\emptyset \). Moreover, for \(a\ge 0\), we have \(\mathrm{A}(x,a)\) if and only if the open disk of radius a centered at the origin includes \({\mathscr {P}}_{x}\).
The importance of the region \({\mathscr {P}}_{x}\) is due to the fact that, for a non-rightmost eigenvalue \(\lambda _{i}\), we have \(|w_{i}|\ge 1\) if and only if \(h\lambda _{i}\in {\mathscr {P}}_{hr_{1}}\).
In other words, a(x) is the infimum of the radii of open disks centered at the origin and including \({\mathscr {P}}_x\).
The importance of the number a(x) is given by the following theorem.
Theorem 7.8
For a non-rightmost eigenvalue \(\lambda _i\) such that \(\vert w_i\vert \ge 1\), we have \(\vert h\lambda _i\vert \le a\left( hr_{1}\right) \).
Proof
The closed disk of radius a(x) centered at the origin includes the region \({\mathscr {P}}_{x}\). The theorem follows by recalling that \({\mathscr {P}}_{x}\) contains the non-rightmost eigenvalues \(\lambda _i\) such that \(\vert w_i\vert \ge 1\). \(\square \)
7.10.3 The theorem on the non-significant eigenvalues
The next theorem says when any non-rightmost eigenvalue \(\lambda _i\) with \(\vert w_i\vert \ge 1\) is non-significant. It involves the behavior of a(x) as \(x\rightarrow 0\).
The theorem now follows by recalling the definition of non-significant eigenvalue. \(\square \)
Remark 7.3
The term \(O(h\rho _1)\) in (7.8) is not larger than \(C h\rho _1\) for \(h\rho _1\le D\), where \(C\ge 0\) and \(D>0\) depend only on the approximant.
By the previous theorem we obtain the following important conclusion.
Conclusion 7.10
Suppose that the approximant satisfies (7.7). It is expected that A holds.
In fact, suppose A does not hold, i.e. there is a non-rightmost eigenvalue \(\lambda _i\) with \(\vert w_i\vert \ge 1\). It is expected that this eigenvalue is significant. On the other hand, if it is significant, then, by the previous theorem, we obtain that (7.8) does not hold and this is “unlikely”.
In the next subsection, we show that the implicit Euler method satisfies (7.7).
7.11 The implicit Euler method
We examine the property \(\mathrm{A}(x)\) and determine the number a(x) for the implicit Euler method, corresponding to the \(\left( 0,1\right) \)-Padé approximant
The region \({\mathscr {R}}_x\), \(x<1\), for this approximant is the exterior of the disk of center 1 and radius \(\mathrm{e}^{-x}\), and the region \({\mathscr {P}}_x\) is the part of the closed disk to the left of the line \(\mathrm{Re}(z)=x\) (see Fig. 11).
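This geometric description can be checked numerically (a sketch; the value \(x=-0.2\) is an arbitrary choice): for \(R(z)=1/(1-z)\), membership in \({\mathscr {R}}_x\), i.e. \(\mathrm{e}^{-x}\vert R(z)\vert <1\), coincides with lying outside the disk of center 1 and radius \(\mathrm{e}^{-x}\).

```python
import math, random

R = lambda z: 1 / (1 - z)   # implicit Euler: the (0,1)-Pade approximant
x = -0.2                    # an arbitrary value x < 1

random.seed(4)
for _ in range(10000):
    z = complex(random.uniform(-5, 5), random.uniform(-5, 5))
    if z == 1:
        continue                                  # the pole of R
    in_Rx = math.exp(-x) * abs(R(z)) < 1          # z in R_x
    outside_disk = abs(1 - z) > math.exp(-x)      # z outside disk(1, e^{-x})
    assert in_Rx == outside_disk
print("R_x is the exterior of the disk of center 1 and radius", math.exp(-x))
```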
Theorem 7.11
Let \(x<1\). For the implicit Euler method, we have \(\mathrm{A}(x)\) if and only if \(x=0\). Moreover, we have
since the complementary set \({\mathscr {R}}_x^c\) of \({\mathscr {R}}_x\) is the closed disk of center 1 and radius \(\mathrm{e}^{-x}\) (see Fig. 11). Thus A(x) is not true.
For the second part, let \(b\ge 0\). An easy computation shows that, for \(z\in {\mathbb {C}}\) such that \(\vert z\vert =b\), we have
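The number a(x) can also be located numerically by scanning the boundary arc of \({\mathscr {P}}_x\). We stress that the closed form compared against below, \(\sqrt{\mathrm{e}^{-2x}+2x-1}\), is our own derivation from the disk geometry (the farthest points of \({\mathscr {P}}_x\) from the origin are the two corners \(x\pm i\sqrt{\mathrm{e}^{-2x}-(1-x)^2}\)), not a formula quoted from the theorem:

```python
import cmath, math

x = -0.2
r = math.exp(-x)               # radius of the disk R_x^c (center 1)
M = 200000                     # boundary-arc sample points

# scan z = 1 + r e^{it} on the circle, keeping the part with Re z <= x,
# i.e. the curved boundary of P_x, and record the largest |z|
best = 0.0
for k in range(M):
    t = 2 * math.pi * k / M
    z = 1 + r * cmath.exp(1j * t)
    if z.real <= x:
        best = max(best, abs(z))

candidate = math.sqrt(math.exp(-2 * x) + 2 * x - 1)   # our derived closed form
print(best, candidate)         # both close to 0.3030
```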
We can conclude that it is expected that A holds for the implicit Euler method.
8 Conclusions
In the stiff situation, we have studied the long-time behavior of the relative error in the numerical integration of the ODE (1.1) with A normal. The numerical integration is accomplished over a mesh of constant stepsize h, by using at any step an analytic approximant R of the exponential: see (1.2). The relative error \(\gamma _n\) of the numerical integration is given in (1.3).
We have defined the long-time solution \(y^{\mathrm{long}}\) as the solution of (1.1) projected on the eigenspace of the rightmost eigenvalues and we have considered the relative error \(\gamma _n^{\mathrm{long}}\) of the numerical integration of \(y^{\mathrm{long}}\). The error \( \gamma _n^{\mathrm{long}}\) grows linearly in time, it is small and it remains small in the long-time.
where \(r_1\) is the real part of the rightmost eigenvalues of A. When A holds, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time. When A does not hold, we have \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time.
Let \(L=\lim \nolimits _{z\rightarrow \infty }\vert R(z)\vert \). In order to have the condition A satisfied, it is better to use approximants with \(L=0\) (for example, Radau and Lobatto IIIC methods). Approximants with \(L=1\) (for example, Gauss methods) do not work well when \(r_1<0\).
The paper [10] analyzes the numerical integration in the stiff situation by looking at a different question. In [10], the interest is in numerical approximations (1.2) of the long-time solution starting from a perturbed initial value. The approximants are analyzed by means of their error growth function \(\varphi _R\) (see [4, 5]) in order to study how they propagate the initial perturbation from the relative error point of view. In this other context, we have a non-large propagation of the initial perturbation if and only if
We have considered the case of A normal. Some numerical experiments, not included here, suggest that, also for non-normal matrices, we have \(\gamma _n\approx \gamma _n^{\mathrm{long}}\) in the long-time when the condition A holds and \(\frac{\gamma _n}{\gamma _n^{\mathrm{long}}}\gg 1\) for all time when A does not hold. In light of this, the results of Sect. 7 become more important, since they are about the condition A.
We conclude by remarking that the findings of this paper are interesting in applications involving differential models described by linear ODEs with \(r_1\ne 0\). In particular, they are interesting when we are integrating an ODE whose solution decreases to small orders of magnitude (case \(r_1<0\)) but is not yet considered zero, or grows to large orders of magnitude (case \(r_1>0\)) but is not yet considered infinite.