nach oben

Calcolo

Erschienen in:

Open Access 01.03.2022

On the convergence rate of the Kačanov scheme for shear-thinning fluids

verfasst von: Pascal Heid, Endre Süli

Erschienen in: Calcolo | Ausgabe 1/2022

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Patentsuche

Aus

Abstract

We explore the convergence rate of the Kačanov iteration scheme for different models of shear-thinning fluids, including Carreau and power-law type explicit quasi-Newtonian constitutive laws. It is shown that the energy difference contracts along the sequence generated by the iteration. In addition, an a posteriori computable contraction factor is proposed, which improves, on finite-dimensional Galerkin spaces, previously derived bounds on the contraction factor in the context of the power-law model. Significantly, this factor is shown to be independent of the choice of the cut-off parameters whose use was proposed in the literature for the Kačanov iteration applied to the power-law model. Our analytical findings are confirmed by a series of numerical experiments.

PH acknowledges the financial support of the Swiss National Science Foundation (SNF), Project No. P2BEP2_191760.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

In this work, we focus on the iterative solution of nonlinear partial differential equations that arise in models of steady flows of incompressible shear-thinning fluids, including models with explicit constitutive relations of Carreau and power-law type. In particular, we consider the following quasi-Newtonian fluid flow problem: find $(\mathbf {u},p)$ such that

$$\begin{aligned} -\nabla \cdot \{\mu ({\varvec{x}},|\underline{e}(\mathbf {u})|^2) \underline{e}(\mathbf {u})\}+\nabla p &= \mathbf {f} \quad \text {in } \Omega , \\ \nabla \cdot \mathbf {u} & =0 \quad \text {in } \Omega , \\ \mathbf {u}&=\mathbf {0} \quad \text {on } \partial \Omega , \end{aligned}$$

(1)

where $\Omega \subset {\mathbb {R}}^d$, $d \in \{2,3\}$, is a bounded Lipschitz domain, the source term $\mathbf {f} \in \mathrm {L}^{2}(\Omega )^d$ is a given external force, $\mathbf {u}$ is the velocity vector, p denotes the pressure, and $\underline{e}(\mathbf {u})$ is the $d \times d$ rate-of-strain tensor defined by

$$\begin{aligned} e_{ij}(\mathbf {u}):=\frac{1}{2} \left( \frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right) ,\qquad i,j=1,\ldots ,d. \end{aligned}$$

here $|\underline{e}(\mathbf {u})|$ denotes the Frobenius norm of $\underline{e}(\mathbf {u})$, and the (real-valued) viscosity coefficient $\mu $ is assumed to satisfy the following structural assumptions:

(A1)

$\mu \in C({\overline{\Omega }} \times {\mathbb {R}}_{\ge 0})$ and it is differentiable in the second variable;

(A2)

There exist constants $m_\mu ,M_\mu >0$ such that

$$\begin{aligned} m_\mu (t-s) \le \mu ({\varvec{x}},t^2)t-\mu ({\varvec{x}},s^2)s \le M_\mu (t-s), \qquad t \ge s \ge 0, \quad {\varvec{x}}\in {\overline{\Omega }}; \end{aligned}$$

(2)

(A3)

$\mu $ is decreasing in the second variable, i.e., $\mu '({\varvec{x}},t) \le 0$ for all $t \ge 0$ and all ${\varvec{x}}\in {\overline{\Omega }}$, where $\mu '$ denotes the derivative of $\mu $ with respect to the variable t.

The assumption (A3) asserts that the viscosity decreases with increasing strain rate, in line with our assumption that the fluid under consideration is shear-thinning. Moreover, (A2) immediately implies that $\mu $ is bounded from above and below; indeed, by setting $s=0$, we obtain

$$\begin{aligned} m_\mu \le \mu ({\varvec{x}},t) \le M_\mu \qquad \text {for all } {\varvec{x}}\in {\overline{\Omega }}, \, t \ge 0. \end{aligned}$$

(3)

The bounds $m_\mu $ and $M_\mu $ are, in general, closely related to the infinite and zero shear viscosity plateau, respectively. In the sequel, the dependence of $\mu $ on ${\varvec{x}}\in \Omega $ will be suppressed.

Upon defining $V:=\{\mathbf {u}\in \mathrm {H}^1_0(\Omega )^d: \nabla \cdot \mathbf {u}=0\}$, the weak formulation of (1) is as follows:

$$\begin{aligned} \text {find} \ \mathbf {u}\in V \ \text {such that} \qquad \int _\Omega \mu (|\underline{e}(\mathbf {u})|^2)\underline{e}(\mathbf {u}):\underline{e}(\mathbf {v})\,\mathsf {d}{\varvec{x}}= \int _\Omega \mathbf {f} \cdot \mathbf {v} \,\mathsf {d}{\varvec{x}}\qquad \text {for all } \mathbf {v}\in V, \end{aligned}$$

(4)

where $\underline{e}(\mathbf {u}):\underline{e}(\mathbf {v})$ denotes the Frobenius inner product of $\underline{e}(\mathbf {u})$ and $\underline{e}(\mathbf {v})$; we refer to [2] Sect. 2 for more details concerning the weak formulation (4). The space V is endowed with the inner product

$$\begin{aligned} (\mathbf {u},\mathbf {v})_V=\int _\Omega \underline{e}(\mathbf {u}):\underline{e}(\mathbf {v}) \,\mathsf {d}{\varvec{x}}, \qquad \mathbf {u},\mathbf {v}\in V, \end{aligned}$$

(5)

and the induced norm ${\left| \left| \left| \mathbf {u} \right| \right| \right| }_\Omega ^2=(\mathbf {u},\mathbf {u})_V$, $\mathbf {u}\in V$. We emphasize that

$$\begin{aligned} \frac{1}{2} \sum _{i=1}^d \int _\Omega |\nabla u_i|^2 \,\mathsf {d}{\varvec{x}}\le \int _\Omega |\underline{e}(\mathbf {u})|^2 \,\mathsf {d}{\varvec{x}}\le \sum _{i=1}^d \int _\Omega |\nabla u_i|^2\,\mathsf {d}{\varvec{x}}\qquad \text {for all } \mathbf {u}\in V, \end{aligned}$$

i.e., the norm ${\left| \left| \left| \cdot \right| \right| \right| }_\Omega $ is equivalent to the standard norm on $\mathrm {H}^1_0(\Omega )^d$; the first inequality is a special case of Korn’s inequality (see, e.g., inequality (1.7) in [17]), while the second can be easily verified by invoking the Cauchy–Schwarz inequality. In particular, V endowed with the inner product of (5) and induced norm ${\left| \left| \left| \cdot \right| \right| \right| }_\Omega $ is a Hilbert space.

The weak form (4) of the boundary-value problem under consideration is known to have a unique solution $\mathbf {u}^\star \in V$, which will be shown, nonetheless, in Sect. 2; moreover, this element $\mathbf {u}^\star \in V$ is the unique minimiser of the energy functional

$$\begin{aligned} \mathsf {E}(\mathbf {u}):= \int _\Omega \varphi \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) \,\mathsf {d}{\varvec{x}}-\int _\Omega \mathbf {f} \cdot \mathbf {u} \,\mathsf {d}{\varvec{x}}, \qquad \mathbf {u}\in V, \end{aligned}$$

(6)

where

$$\begin{aligned} \varphi (s):=\frac{1}{2}\int _0^s \mu (t)\, \mathrm {d}t. \end{aligned}$$

Indeed, a straightforward calculation reveals that, for a given $\mathbf {u}\in V$,

$$\begin{aligned} \mathsf {E}'(\mathbf {u})(\mathbf {v})=\int _\Omega \mu \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) \underline{e}(\mathbf {u}):\underline{e}(\mathbf {v})\,\mathsf {d}{\varvec{x}}- \int _{\Omega } \mathbf {f} \cdot \mathbf {v} \,\mathsf {d}{\varvec{x}}, \qquad \mathbf {v} \in V, \end{aligned}$$

(7)

where $\mathsf {E}'$ denotes the Gâteaux derivative; we refer to [1, Prop. 2.1] for details. In particular, the weak formulation (4) is the Euler–Lagrange equation for the minimisation of $\mathsf {E}$ over V.

A prominent iterative solver for the nonlinear problem (4) is Kačanov’s scheme, which, in simple terms, fixes the nonlinearity at the previous iterate: for a given $\mathbf {u}^n \in V$ find $\mathbf {u}^{n+1} \in V$ such that

$$\begin{aligned} \int _\Omega \mu \left( \left| \underline{e}\left( \mathbf {u}^n\right) \right| ^2\right) \underline{e}(\mathbf {u}^{n+1}):\underline{e}(\mathbf {v})\,\mathsf {d}{\varvec{x}}= \int _\Omega \mathbf {f} \cdot \mathbf {v} \,\mathsf {d}{\varvec{x}}\qquad \text {for all } \mathbf {v}\in V,\quad n=0,1,\ldots , \end{aligned}$$

(8)

where $\mathbf {u}^0 \in V$ is an arbitrary initial guess. Early references concerning this iterative method include [14], where it was used to compute a stationary magnetic field in nonlinear media, and [5], where the convergence of the Kačanov iteration was investigated in the context of Galerkin methods; Fučík, Kratochvíl and Nečas point in their work [5] to pages 369–370 of Michlin’s 1966 monograph [15] for a description of the iterative method introduced by Kačanov in [13], in the context of variational methods for plasticity problems. Kačanov’s iteration scheme has been, by now, carefully examined; see, e.g., the monographs [16, Sect. 4.5] and [21, Sect. 25.14], or the papers [7, 8, 10]. More recently, it was shown in the articles [9] and [4] that the energy $\mathsf {E}$ from (6) contracts along the sequence generated by the Kačanov iteration (8). Indeed, the first of these two papers established the energy contraction for a more general iteration scheme, and the latter focuses on the Kačanov scheme for a ‘relaxed p-Poisson problem’ involving a truncation of the nonlinearity from below and from above using a pair of positive cut-off parameters $\varepsilon _{-}$ and $\varepsilon _{+}$. The derived upper bound on the contraction factor depends on the quotient $\nicefrac {m_\mu }{M_\mu }$ involving $\varepsilon _{-}$ and $\varepsilon _{+}$, and may be extremely close to 1 in certain situations; interestingly, this unfavourable predicted dependence of the contraction factor on the ratio $\nicefrac {m_\mu }{M_\mu }$ has not been observed in numerical experiments. It is this mismatch between the observed behaviour of the method and the rather more pessimistic results of the analysis reported in the literature that motivated the work outlined herein.

We will establish an improved upper bound on the contraction factor of the Kačanov iteration for a general class of shear-thinning fluids. The resulting bound will then be further examined for fluids obeying either the Carreau law or a relaxed power-law, to be specified in the lines below. It will be shown that for (finite-dimensional) Galerkin approximations of the relaxed power-law model it is the power-law exponent, rather than the ratio $\nicefrac {m_\mu }{M_\mu }$, that is responsible for the rate of convergence of the iteration. Specifically, we will show that the contraction factor of the iteration on finite-dimensional spaces is independent of the choice of the lower and upper cut-off parameters featuring in the so-called relaxed Kačanov iteration, where a truncation of the power-law nonlinearity from below and above is carried out by means of these two positive truncation parameters. To the best of our knowledge the proof of such a result was an open question in the literature.

The paper is structured as follows. In Sect. 2 we will show that the weak formulation (4) of the problem under consideration has a unique solution, which, in turn, is the unique minimiser of $\mathsf {E}$ in V. The proof is based on auxiliary results, which will also be decisive for the derivation of the contraction factor in Sect. 3. In Sect. 4 we will perform a series of numerical experiments, which confirm our theoretical results. The paper closes with concluding remarks recorded in Sect. 5.

2 Existence and uniqueness of the solution

We will show in this section that the weak formulation (4) has a unique solution. To this end we define, for given $\mathbf {u}\in V$, the linear operator $\mathsf {A}[\mathbf {u}]: V \rightarrow V^\star $, where $V^\star $ denotes the dual space of V, by

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {w}):= \int _\Omega \mu (|\underline{e}(\mathbf {u})|^2) \underline{e}(\mathbf {v}):\underline{e}(\mathbf {w}) \,\mathsf {d}{\varvec{x}}, \qquad \mathbf {v},\mathbf {w}\in V, \end{aligned}$$

(9)

and the linear form $\ell \in V^\star $ by

$$\begin{aligned} \ell (\mathbf {w}):=\int _\Omega \mathbf {f} \cdot \mathbf {w} \,\mathsf {d}{\varvec{x}}, \qquad \mathbf {w}\in V. \end{aligned}$$

In terms of these, the weak formulation (4) can be restated in the following equivalent form:

$$\begin{aligned} \text {find} \ \mathbf {u}\in V \ \text {such that} \qquad \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {v})=\ell (\mathbf {v}) \qquad \text {for all } \mathbf {v}\in V, \end{aligned}$$

(10)

and the Kačanov iteration (8) takes the form: given $\mathbf {u}^0 \in V$,

$$\begin{aligned} \text {find} \ \mathbf {u}^{n+1} \in V \ \text{ such } \text{ that } \quad \mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {v})=\ell (\mathbf {v}) \qquad \text {for all } \mathbf {v}\in V, \quad n=0,1,\ldots . \end{aligned}$$

(11)

By (7) and the definitions of $\mathsf {A}$ and $\ell $ we further have that

$$\begin{aligned} \mathsf {E}'(\mathbf {u})=\mathsf {A}[\mathbf {u}](\mathbf {u})-\ell . \end{aligned}$$

(12)

We will now show that the operator $\mathbf {u}\mapsto \mathsf {A}[\mathbf {u}](\mathbf {u})$ is Lipschitz continuous and strongly monotone, since, in that case, the theory of strongly monotone operators implies that the weak equation (10) has a unique solution $\mathbf {u}^\star \in V$; see, e.g., [16, Sect. 3.3] or [21, Sect. 25.4]. For the proof of Lipschitz continuity and strong monotonicity of the operator $\mathbf {u}\mapsto \mathsf {A}[\mathbf {u}](\mathbf {u})$ we require the following result, which, as well as its proof, is largely borrowed from [2, Lemma 3.1]. However, we place emphasis on sharp bounds, since these will be crucial for our convergence analysis below, leading to the improved factors appearing in (15) and (16).

Lemma 2.1

Let $\mu $ satisfy the assumptions (A1)–(A3) and define $\xi (t):=\mu (t^2)t$, $t \ge 0$. Then, for any $\underline{\kappa },\underline{\tau } \in {\mathbb {R}}^{d\times d}$, the following inequalities hold:

$$\begin{aligned} \left| \mu (|\underline{\kappa }|^2)\underline{\kappa }-\mu (|\underline{\tau }|^2)\underline{\tau }\right| ^2 \le C(\underline{\kappa },\underline{\tau }) |\underline{\kappa }-\underline{\tau }|^2 \le 3 M_\mu ^2 |\underline{\kappa }-\underline{\tau }|^2 \end{aligned}$$

(13)

and

$$\begin{aligned} (\mu (|\underline{\kappa }|^2)\underline{\kappa }-\mu (|\underline{\tau }|^2)\underline{\tau }):(\underline{\kappa }-\underline{\tau }) \ge c(\underline{\kappa },\underline{\tau })|\underline{\kappa }-\underline{\tau }|^2 \ge m_\mu |\underline{\kappa }-\underline{\tau }|^2, \end{aligned}$$

(14)

where

$$\begin{aligned} C(\underline{\kappa },\underline{\tau })&:=\left( \sup _{t \in (0,1)} \xi '(|\underline{\kappa }|+t(|\underline{\tau }|-|\underline{\kappa }|)) \right) ^2+2\mu (|\underline{\kappa }|^2)\mu (|\underline{\tau }|^2), \end{aligned}$$

(15)

$$\begin{aligned} c(\underline{\kappa },\underline{\tau })&:=\inf _{t \in (0,1)} \xi '(|\underline{\kappa }|+t(|\underline{\tau }|-|\underline{\kappa }|)). \end{aligned}$$

(16)

Proof

We will only prove (14), since the contraction factor will strongly rely on this bound, but not on (13). For the proof of the latter, we refer to [2].

A simple and straightforward calculation reveals that

$$\begin{aligned} (\mu (|\underline{\kappa }|^2)\underline{\kappa }-\mu (|\underline{\tau }|^2)\underline{\tau }):(\underline{\kappa }-\underline{\tau })&= \mu (|\underline{\kappa }|^2)|\underline{\kappa }|^2+\mu (|\underline{\tau }|^2)|\underline{\tau }|^2-\mu (|\underline{\kappa }|^2)\underline{\kappa }:\underline{\tau }-\mu (|\underline{\tau }|^2)\underline{\tau }:\underline{\kappa }\nonumber \\&=\left( \mu (|\underline{\kappa }|^2)|\underline{\kappa }|-\mu (|\underline{\tau }|^2)|\underline{\tau }|\right) (|\underline{\kappa }|-|\underline{\tau }|) \end{aligned}$$

(17)

$$\begin{aligned}&\quad\quad + (\mu (|\underline{\kappa }|^2)+\mu (|\underline{\tau }|^2))(|\underline{\kappa }||\underline{\tau }|-\underline{\kappa }:\underline{\tau }).\end{aligned} $$

(18)

We note that the summand in (17) can be written as $(\xi (|\underline{\kappa }|)-\xi (|\underline{\tau }|))(|\underline{\kappa }|-|\underline{\tau }|)$, since $\xi (t)=\mu (t^2)t$ for $t \ge 0$. Then, the mean value theorem implies that

$$\begin{aligned} (\xi (|\underline{\kappa }|)-\xi (|\underline{\tau }|))(|\underline{\kappa }|-|\underline{\tau }|) \ge \inf _{t \in (0,1)} \xi '(|\underline{\kappa }|+t(|\underline{\tau }|-|\underline{\kappa }|))(|\underline{\kappa }|-|\underline{\tau }|)^2=c(\underline{\kappa },\underline{\tau })(|\underline{\kappa }|-|\underline{\tau }|)^2. \end{aligned}$$

(19)

Furthermore, since $\mu '(t) \le 0$ for all $t \ge 0$ by (A3), and, in turn, $\xi '(t)=\mu (t^2)+2t^2 \mu '(t^2) \le \mu (t^2)$, we find that

$$\begin{aligned}c(\underline{\kappa },\underline{\tau })=\inf _{t \in (0,1)} \xi '(|\underline{\kappa }|+t(|\underline{\tau }|-|\underline{\kappa }|)) \le \min \left\{ \mu \left( \left| \underline{\kappa }\right| ^2\right) ,\mu \left( \left| \underline{\tau }\right| ^2\right) \right\} \!.\end{aligned}$$

Consequently, the summand in (18) can be bounded from below as follows

$$\begin{aligned} \left( \mu \left( \left| \underline{\kappa }\right| ^2\right) +\mu \left( \left| \underline{\tau }\right| ^2\right) \right) (|\underline{\kappa }||\underline{\tau }|-\underline{\kappa }:\underline{\tau }) \ge 2c(\underline{\kappa },\underline{\tau })(|\underline{\kappa }||\underline{\tau }|-\underline{\kappa }:\underline{\tau }), \end{aligned}$$

(20)

since $|\underline{\kappa }||\underline{\tau }|-\underline{\kappa }:\underline{\tau }\ge 0$ by the Cauchy–Schwarz inequality. Hence, using the established bounds (19) and (20) for the summands in (17) and (18), respectively, yields

$$\begin{aligned} \left( \mu \left( \left| \underline{\kappa }\right| ^2\right) \underline{\kappa }-\mu \left( \left| \underline{\tau }\right| ^2\right) \underline{\tau }\right) :(\underline{\kappa }-\underline{\tau })&\ge c(\underline{\kappa },\underline{\tau })\left( (|\underline{\kappa }|-|\underline{\tau }|)^2+2\left( |\underline{\kappa }||\underline{\tau }|-\underline{\kappa }:\underline{\tau }\right) \right) \\&=c(\underline{\kappa },\underline{\tau })\left( |\underline{\kappa }|^2+|\underline{\tau }|^2-2 \underline{\kappa }:\underline{\tau }\right) \\&=c(\underline{\kappa },\underline{\tau })|\underline{\kappa }-\underline{\tau }|^2. \end{aligned}$$

Finally we note that dividing (2) by $(t-s)$, for $t>s$, and taking the limit $s \rightarrow t$ yields that $m_\mu \le \xi '(t) \le M_\mu $ for all $t \ge 0$, i.e., $c(\underline{\kappa },\underline{\tau }) \ge m_\mu $. $\square $

Now we are ready to show that the operator $\mathsf {A}$, cf. (9), is strongly monotone and Lipschitz continuous, which implies the unique solvability of (4).

Proposition 2.2

Let $\mathsf {A}$ be defined as in (9), with $\mu $ satisfying (A1)–(A3).

(a)

For given $\mathbf {u}\in V$, $\mathsf {A}[\mathbf {u}](\cdot )(\cdot )$ is a uniformly bounded and coercive, symmetric bilinear form on $V \times V$. In particular, the following inequalities hold:

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {w}) \le M_\mu {\left| \left| \left| \mathbf {v} \right| \right| \right| }_\Omega {\left| \left| \left| \mathbf {w} \right| \right| \right| }_\Omega \end{aligned}$$

and

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {v}) \ge m_\mu {\left| \left| \left| \mathbf {v} \right| \right| \right| }_\Omega ^2 \end{aligned}$$

(21)

for any $\mathbf {u},\mathbf {v},\mathbf {w}\in V$.

(b)

The mapping $\mathbf {u}\mapsto \mathsf {A}[\mathbf {u}](\mathbf {u})$ is Lipschitz continuous with

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {w})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {w}) \le \sqrt{3} M_\mu {\left| \left| \left| \mathbf {u}-\mathbf {v} \right| \right| \right| }_\Omega {\left| \left| \left| \mathbf {w} \right| \right| \right| }_\Omega , \qquad \mathbf {u},\mathbf {v},\mathbf {w}\in V \end{aligned}$$

(22)

and strongly monotone with

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {u}-\mathbf {v})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {u}-\mathbf {v}) \ge m_\mu {\left| \left| \left| \mathbf {u}-\mathbf {v} \right| \right| \right| }_\Omega ^2, \qquad \mathbf {u},\mathbf {v}\in V. \end{aligned}$$

(23)

Consequently, the problem (4) has a unique solution $\mathbf {u}^\star \in V$.

Proof

Ad (a): By invoking the definition of $\mathsf {A}$, cf. (9), the boundedness of the viscosity coefficient $\mu $, cf. (3), and applying the Cauchy–Schwarz inequality twice, first for the Frobenius inner product and subsequently for the ${\text{L}}^2(\Omega )$-inner product, we obtain

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {w})&= \int _\Omega \mu (|\underline{e}(\mathbf {u})|^2) \underline{e}(\mathbf {v}):\underline{e}(\mathbf {w}) \,\mathsf {d}{\varvec{x}}\\&\le M_\mu \int _\Omega |\underline{e}(\mathbf {v}):\underline{e}(\mathbf {w})| \,\mathsf {d}{\varvec{x}}\\&\le M_\mu \left( \int _\Omega |\underline{e}(\mathbf {v})|^2 \,\mathsf {d}{\varvec{x}}\right) ^{\nicefrac {1}{2}} \left( \int _\Omega |\underline{e}(\mathbf {w})|^2 \,\mathsf {d}{\varvec{x}}\right) ^{\nicefrac {1}{2}} \\&= M_\mu {\left| \left| \left| \mathbf {v} \right| \right| \right| }_\Omega {\left| \left| \left| \mathbf {w} \right| \right| \right| }_\Omega . \end{aligned}$$

Similarly, the inequality (3) implies the uniform coercivity (21).

Ad (b): The definition of $\mathsf {A}$, cf. (9), and the Cauchy–Schwarz inequality yield

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {w})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {w})&=\int _\Omega \left[ \mu (|\underline{e}(\mathbf {u})|^2)\underline{e}(\mathbf {u})-\mu (|\underline{e}(\mathbf {v})|^2)\underline{e}(\mathbf {v})\right] :\underline{e}(\mathbf {w}) \,\mathsf {d}{\varvec{x}}\\&\le \int _\Omega |\mu (|\underline{e}(\mathbf {u})|^2)\underline{e}(\mathbf {u})-\mu (|\underline{e}(\mathbf {v})|^2)\underline{e}(\mathbf {v})||\underline{e}(\mathbf {w})| \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

Hence, by (13) and the linearity of $\underline{e}(\mathbf {\cdot })$, this leads to

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {w})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {w}) \le \sqrt{3} M_\mu \int _\Omega |\underline{e}(\mathbf {u}-\mathbf {v})||\underline{e}(\mathbf {w})| \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

Applying once more the Cauchy–Schwarz inequality implies that

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {w})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {w})&\le \sqrt{3} M_\mu \left( \int _\Omega |\underline{e}(\mathbf {u-v})|^2 \,\mathsf {d}{\varvec{x}}\right) ^{\nicefrac {1}{2}} \left( \int _\Omega |\underline{e}(\mathbf {w})|^2 \,\mathsf {d}{\varvec{x}}\right) ^{\nicefrac {1}{2}} \\&= \sqrt{3} M_\mu {\left| \left| \left| \mathbf {u}-\mathbf {v} \right| \right| \right| }_\Omega {\left| \left| \left| \mathbf {w} \right| \right| \right| }_\Omega . \end{aligned}$$

Similarly, by the definition of $\mathsf {A}$, cf. (9), and (14) we obtain

$$\begin{aligned} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {u}-\mathbf {v})-\mathsf {A}[\mathbf {v}](\mathbf {v})(\mathbf {u}-\mathbf {v})&=\int _\Omega \left( \mu (|\underline{e}(\mathbf {u})|^2)\underline{e}(\mathbf {u})-\mu (|\underline{e}(\mathbf {v})|^2)\underline{e}(\mathbf {v})\right) :(\underline{e}(\mathbf {u})-\underline{e}(\mathbf {v})) \,\mathsf {d}{\varvec{x}}\\&\ge m_\mu \int _\Omega |\underline{e}(\mathbf {u-v})|^2 \,\mathsf {d}{\varvec{x}}\\&= m_\mu {\left| \left| \left| \mathbf {u}-\mathbf {v} \right| \right| \right| }_\Omega ^2. \end{aligned}$$

The existence and uniqueness of a solution to the Eq. (4) now follows from the theory of monotone operators, cf. [16, Sect. 3.3] or [21, Sect. 25.4]. $\square $

Remark 2.3

Since (4) is the Euler–Lagrange equation of the minimisation problem

$$\begin{aligned} \min _{\mathbf {u}\in V} \mathsf {E}(\mathbf {u}), \end{aligned}$$

the above proposition yields that $\mathbf {u}^\star \in V$ is the unique minimiser of the functional $\mathsf {E}$; we emphasise that $\mathsf {E}$ is strictly convex thanks to the strong monotonicity (23), cf. [21, Prop. 25.10]. Moreover, the Kačanov scheme (11) is well defined thanks to Proposition 2.2 (a) and the Lax–Milgram theorem.

3 Energy contraction

In this section, we will show that the energy error, given by $\mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )$, contracts along the sequence $\{\mathbf {u}^n\}$ generated by the Kačanov iteration (11). To this end, we need an auxiliary result.

Lemma 3.1

Let $\mathsf {A}$ and $\mathsf {E}$ be defined as in (9) and (6), respectively, with $\mu $ satisfying (A1)–(A3). Then

$$\begin{aligned} \mathsf {E}(\mathbf {u})-\mathsf {E}(\mathbf {v}) \ge \frac{1}{2} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {u})-\frac{1}{2}\mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {v})-\ell (\mathbf {u})+\ell (\mathbf {v}), \qquad \mathbf {u},\mathbf {v}\in V. \end{aligned}$$

(24)

This result is well-known for the Kačanov iteration in the given setting, and the proof can be found, e.g., in [21, Sect. 25.12] or [16, Sect. 4.5]. However, as it is stated in a slightly different form in those references, and also for the sake of completeness, we will include the proof of this statement nonetheless.

Proof

It can be shown that

$$\begin{aligned}\varphi (t)-\varphi (s) \ge \frac{1}{2} \mu (t) (t-s), \qquad t,s \ge 0,\end{aligned}$$

see, e.g., [10, Sect. 5.1], and therefore

$$\begin{aligned} \int _\Omega \varphi \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) -\varphi \left( \left| \underline{e}(\mathbf {v})\right| ^2\right) \,\mathsf {d}{\varvec{x}}&\ge \frac{1}{2} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) \left( \left| \underline{e}(\mathbf {u})\right| ^2-\left| \underline{e}(\mathbf {v})\right| ^2\right) \,\mathsf {d}{\varvec{x}}\\&= \frac{1}{2} \left( \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {u})-\mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {v})\right) , \end{aligned}$$

for any $\mathbf {u},\mathbf {v}\in V$. Hence, by the definition of $\mathsf {E}$, cf. (6), we find that

$$\begin{aligned} \mathsf {E}(\mathbf {u})-\mathsf {E}(\mathbf {v})&=\int _\Omega \varphi \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) \,\mathsf {d}{\varvec{x}}-\int _\Omega \varphi \left( \left| \underline{e}(\mathbf {v})\right| ^2\right) \,\mathsf {d}{\varvec{x}}-\ell (\mathbf {u})+\ell (\mathbf {v})\\&\ge \frac{1}{2} \mathsf {A}[\mathbf {u}](\mathbf {u})(\mathbf {u})-\frac{1}{2}\mathsf {A}[\mathbf {u}](\mathbf {v})(\mathbf {v})-\ell (\mathbf {u})+\ell (\mathbf {v}), \end{aligned}$$

which completes the proof of the claim. $\square $

Now we are in a position to prove the contraction of the energy along the sequence generated by the Kačanov scheme (11). We note that similar results can be found, e.g., in [9, Thm. 2.1] or [4, Cor. 19].

Theorem 3.2

Assume that (A1)–(A3) hold and let $\mathsf {E}$ be defined as in (6). Then, the energy error contracts along the sequence $\{\mathbf {u}^n\}$ generated by the Kačanov iteration (11) in the sense that

$$\begin{aligned} 0\le \mathsf {E}(\mathbf {u}^{n+1})-\mathsf {E}(\mathbf {u}^\star ) \le q(n) \left( \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )\right) \!, \end{aligned}$$

where

$$\begin{aligned} q(n):=1-\frac{1}{4} \left\{ \mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega } \frac{\mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }{\inf _{t \in (-1,1)}\xi '\left( \max \left\{ 0,\left| \underline{e}(\mathbf {u}^\star )\right| +t\left| \underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )\right| \right\} \right) }\right\} ^{-1}, \end{aligned}$$

(25)

and $\xi (t)=\mu (t^2)t$ for $t \ge 0$.

Proof

We largely proceed along the lines of [4]. However, as we want to improve the contraction factor from that reference and, in addition, remove any unknown constants, some non-trivial modifications are necessary in the second part of the proof.

Let us define the real-valued function $\psi (t):=\mathsf {E}(\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))$, $t \in [0,1]$. Then, by invoking the fundamental theorem of calculus, we obtain

$$\begin{aligned} \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )=\int _0^1 \psi '(t) \, \mathrm {d} t. \end{aligned}$$

We will first show that $\psi '(t), \, t \in [0,1]$, is increasing. A straightforward calculation reveals that

$$\begin{aligned} \psi ''(t)= & {} \int _\Omega 2 \mu '\left( \left| \underline{e}(\mathbf {u}^\star +t( \mathbf {u}^n-\mathbf {u}^\star) )\right| ^2\right) \left( \underline{e}(\mathbf {u}^\star +t ( \mathbf {u}^n-\mathbf {u}^\star ) ):\underline{e}(\mathbf {u}^n-\mathbf {u}^\star )\right) ^2 \,\mathsf {d}{\varvec{x}}\\&+ \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^\star +t( \mathbf {u}^n-\mathbf {u}^\star ) )\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^\star )\right| ^2 \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

By assumption (A3) we have that $\mu '(t) \le 0$ for $t \ge 0$, and thus, by the Cauchy–Schwarz inequality,

$$\begin{aligned} \psi ''(t)&\ge \int _\Omega \left( \mu \left( \left| \underline{e}(\mathbf {u}^\star +t( \mathbf {u}^n-\mathbf {u}^\star ) )\right| ^2\right) + 2 \mu '\left( \left| \underline{e}(\mathbf {u}^\star +t( \mathbf {u}^n-\mathbf {u}^\star ) )\right| ^2\right) \left| \underline{e}(\mathbf {u}^\star +t( \mathbf {u}^n-\mathbf {u}^\star ) )\right| ^2\right) \\&\quad \cdot |\underline{e}(\mathbf {u}^n-\mathbf {u}^\star )|^2 \,\mathsf {d}{\varvec{x}}\\&= \int _\Omega \xi '(|\underline{e}(\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))|)|\underline{e}(\mathbf {u}^n-\mathbf {u}^\star )|^2 \,\mathsf {d}{\varvec{x}}, \end{aligned}$$

where $\xi (t)=\mu (t^2)t$ as before. Finally, since $\xi '(t) \ge m_\mu >0$ for all $t \ge 0$, see the end of the proof of Lemma 2.1, we find that $\psi ''(t) \ge 0$, i.e., $\psi '(t)$ is increasing. As a consequence, we immediately get that

$$\begin{aligned} \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star ) \le \psi '(1)= \mathsf {E}'(\mathbf {u}^n)(\mathbf {u}^n-\mathbf {u}^\star )=\mathsf {A}[\mathbf {u}^n](\mathbf {u}^n)(\mathbf {u}^n-\mathbf {u}^\star ) - \ell (\mathbf {u}^n-\mathbf {u}^\star ). \end{aligned}$$

Moreover, by the definition of the Kačanov scheme (11), we have that

$$\begin{aligned} \mathsf {A}[\mathbf {u}^n]( \mathbf {u}^{n+1}) ( \mathbf {u}^n-\mathbf {u}^\star ) =\ell ( \mathbf {u}^n-\mathbf {u}^\star ) . \end{aligned}$$

Consequently, the above inequality becomes

$$\begin{aligned} \mathsf {E}( \mathbf {u}^n) -\mathsf {E}( \mathbf {u}^\star )&\le \mathsf {A}[\mathbf {u}^n]( \mathbf {u}^n-\mathbf {u}^{n+1}) ( \mathbf {u}^n-\mathbf {u}^\star ) =\int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1}):\underline{e}(\mathbf {u}^n-\mathbf {u}^\star ) \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

Let us recall that, for $a,b \ge 0$, $ab \le \frac{1}{2\gamma }a^2+\frac{\gamma }{2}b^2$ for all $\gamma >0$: Indeed, this holds true as the function $\gamma \mapsto \frac{1}{2\gamma }a^2+\frac{\gamma }{2}b^2$ takes its minimum ab at $\gamma =\nicefrac {a}{b}$. Consequently, we obtain

$$\begin{aligned} \mathsf {E}( \mathbf {u}^n) -\mathsf {E}( \mathbf {u}^\star ) \le \frac{1}{2\gamma } \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) | \underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1})| ^2 \,\mathsf {d}{\varvec{x}}+ \frac{\gamma }{2} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^{\star })\right| ^2 \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

(26)

We will now examine the two summands on the right-hand side above separately.

The first summand can be bounded from above in a similar manner as in the proof of [4, Thm. 18]. First, we note that

$$\begin{aligned} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1})\right| ^2 \,\mathsf {d}{\varvec{x}}&= \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n)\right| ^2 \,\mathsf {d}{\varvec{x}}- \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^{n+1})\right| ^2 \,\mathsf {d}{\varvec{x}}\\&\quad -2 \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \underline{e}(\mathbf {u}^{n+1}):\underline{e}(\mathbf {u}^{n}) \,\mathsf {d}{\varvec{x}}\\&\quad +2 \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \underline{e}(\mathbf {u}^{n+1}):\underline{e}(\mathbf {u}^{n+1}) \,\mathsf {d}{\varvec{x}}, \end{aligned}$$

and thus, by the definition of the operator $\mathsf {A}$, cf. (9),

$$\begin{aligned} \int _\Omega \mu (|\underline{e}(\mathbf {u}^n)|^2)|\underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1})|^2 \,\mathsf {d}{\varvec{x}}&= \mathsf {A}[\mathbf {u}^n](\mathbf {u}^n)(\mathbf {u}^n)-\mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {u}^{n+1}) \\&\quad -2 \mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {u}^n)+2 \mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {u}^{n+1}). \end{aligned}$$

Recall that $\mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {v})=\ell (\mathbf {v})$ for all $\mathbf {v}\in V$, cf. (11), which leads to

$$\begin{aligned} \int _\Omega \mu (|\underline{e}(\mathbf {u}^n)|^2)|\underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1})|^2 \,\mathsf {d}{\varvec{x}}&= \mathsf {A}[\mathbf {u}^n](\mathbf {u}^n)(\mathbf {u}^n)-\mathsf {A}[\mathbf {u}^n](\mathbf {u}^{n+1})(\mathbf {u}^{n+1})-2 \ell (\mathbf {u}^n)+2 \ell (\mathbf {u}^{n+1}), \end{aligned}$$

and hence, by (24),

$$\begin{aligned} \frac{1}{2}\int _\Omega \mu (|\underline{e}(\mathbf {u}^n)|^2)|\underline{e}(\mathbf {u}^n-\mathbf {u}^{n+1})|^2 \,\mathsf {d}{\varvec{x}}\le \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^{n+1}). \end{aligned}$$

(27)

Next, we will take care of the second summand in (26). As was done in [4], we want to bound this summand in terms of the energy difference $\mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )$. However, in order to improve the contraction factor whilst removing all unknown constants, some modifications to the argument presented in [4] are necessary. For $\psi (t)=\mathsf {E}(\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))$, the fundamental theorem of calculus implies that

$$\begin{aligned} \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star ) = \int _0^1 \psi '(t) \, \mathrm {d} t = \int _0^1 \mathsf {E}'(\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))(\mathbf {u}^n-\mathbf {u}^\star ) \, \mathrm {d} t. \end{aligned}$$

Recall that $\mathsf {E}'(\mathbf {u})(\mathbf {v})=\mathsf {A}[\mathbf {u}][\mathbf {u}](\mathbf {v})-\ell (\mathbf {v})$, cf. (12), and, since $\mathbf {u}^\star \in V$ is the unique solution of (10), $\ell (\mathbf {v})=\mathsf {A}[\mathbf {u}^\star ](\mathbf {u}^\star )(\mathbf {v})$ for any $\mathbf {v}\in V$. As a consequence, we have that

Invoking the definition of $\mathsf {A}$, cf. (9), and (14) further implies that

$$\begin{aligned} \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )&\ge \int _0^1 t \int _\Omega c(\underline{e}(\mathbf {u}^\star )+t(\underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )),\underline{e}(\mathbf {u}^\star )) |\underline{e}(\mathbf {u}^n-\mathbf {u}^\star )|^2 \,\mathsf {d}{\varvec{x}}\, \mathrm {d}t. \end{aligned}$$

(28)

Moreover, by the definition of $c(\cdot ,\cdot )$, cf. (16), and a brief argument based on reflection we get

$$\begin{aligned} c(\underline{e}(\mathbf {u}^\star )+s(\underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )),\underline{e}(\mathbf {u}^\star )) \ge \inf _{t \in (-1,1)} \xi ' \left(\max \left\{ 0,|\underline{e}(\mathbf {u}^\star )|+t|\underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )|\right\} \right) \end{aligned}$$

(29)

for all $s \in [0,1]$, where $\xi (t)=\mu (t^2)t$; indeed, it is easily verified that, for any $\underline{\kappa },\underline{\tau } \in {\mathbb {R}}^{d\times d}$, we have $c(\underline{\kappa },\underline{\tau })=c(\underline{\tau },\underline{\kappa })$, and, in turn,

$$\begin{aligned} c(\underline{\kappa }+s(\underline{\tau }-\underline{\kappa }),\underline{\kappa })=\inf _{t \in (0,1)}\xi '((1-t) |\underline{\kappa }|+t|\underline{\kappa }+s(\underline{\tau }-\underline{\kappa })|), \qquad s \in [0,1]. \end{aligned}$$

(30)

The triangle inequality yields that

$$\begin{aligned} |\underline{\kappa }|-ts|\underline{\tau }-\underline{\kappa }| \le (1-t)|\underline{\kappa }|+t|\underline{\kappa }+s(\underline{\tau }-\underline{\kappa })| \le |\underline{\kappa }|+ts|\underline{\tau }-\underline{\kappa }| \qquad \text {for all} \ t \in (0,1), \end{aligned}$$

and thus

$$\begin{aligned} (1-t)|\underline{\kappa }|+t|\underline{\kappa }+s(\underline{\tau }-\underline{\kappa })|=|\underline{\kappa }|+(2r-1)st|\underline{\kappa }-\underline{\tau }| \qquad \text {for some} \ r \in [0,1]. \end{aligned}$$

This further implies that, for any $s \in [0,1]$, we have

$$\begin{aligned} \{(1-t)|\underline{\kappa }|+t|\underline{\kappa }+s(\underline{\tau }-\underline{\kappa })|: t \in (0,1)\} \subseteq \{|\underline{\kappa }|+t |\underline{\kappa }- \underline{\tau }|:t \in (-1,1)\}, \end{aligned}$$

and consequently, in regard of (30),

$$c(\underline{\kappa }+s(\underline{\tau }-\underline{\kappa }),\underline{\kappa }) \ge \inf _{t \in (-1,1)} \xi ' \left(\max \left\{ 0,|\underline{\kappa }|+ts|\underline{\tau }-\underline{\kappa }|\right\} \right)$$

, which immediately implies (29). Combining the equalities (28) and (29) yields

$$\begin{aligned} \mathsf {E}\left( \mathbf {u}^n\right) -\mathsf {E}( \mathbf {u}^\star ) \ge \frac{1}{2} \int _\Omega \inf _{t \in (-1,1)} \xi '\left( \max \left\{ 0,\left| \underline{e}(\mathbf {u}^\star )\right| +t\left| \underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )\right| \right\} \right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^\star )\right| ^2 \,\mathsf {d}{\varvec{x}}. \end{aligned}$$

(31)

Since, in addition,

$$\begin{aligned} \frac{1}{2} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^{\star })\right| ^2 \,\mathsf {d}{\varvec{x}}&= \frac{1}{2} \int _\Omega \frac{\mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }{\inf _{t \in (-1,1)} \xi '\left( \max \left\{ 0,\left| \underline{e}(\mathbf {u}^\star )\right| +t\left| \underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )\right| \right\} \right) } \\&\quad \cdot \inf _{t \in (-1,1)} \xi '\left( \max \left\{ 0,\left| \underline{e}(\mathbf {u}^\star )\right| +t\left| \underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )\right| \right\} \right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^\star )\right| ^2 \,\mathsf {d}{\varvec{x}}, \end{aligned}$$

the lower bound (31) implies that

$$\begin{aligned} \frac{1}{2} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^{\star })\right| ^2 \,\mathsf {d}{\varvec{x}}\le Q(n) \left( \mathsf {E}\left( \mathbf {u}^n\right) -\mathsf {E}\left( \mathbf {u}^\star \right) \right) , \end{aligned}$$

(32)

where

$$\begin{aligned} Q(n):=\mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega } \frac{\mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }{\inf _{t \in (-1,1)} \xi '\left( \max \left\{ 0,\left| \underline{e}(\mathbf {u}^\star )\right| +t\left| \underline{e}(\mathbf {u}^n)-\underline{e}(\mathbf {u}^\star )\right| \right\} \right) }. \end{aligned}$$

Finally, combining (26), (27), and (32) yields

$$\begin{aligned} \gamma (1-\gamma Q(n)) \left( \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )\right) \le \mathsf {E}(\mathbf {u}^n)-\mathsf {E}( \mathbf {u}^{n+1}) , \end{aligned}$$

and, in turn,

$$\begin{aligned} \mathsf {E}( \mathbf {u}^{n+1}) -\mathsf {E}( \mathbf {u}^\star )&=\mathsf {E}( \mathbf {u}^n) -\mathsf {E}( \mathbf {u}^\star ) -\left( \mathsf {E}\left( \mathbf {u}^n\right) -\mathsf {E}\left( \mathbf {u}^{n+1}\right) \right) \le \left( 1-\gamma \left( 1-\gamma Q(n)\right) \right) \left( \mathsf {E}\left( \mathbf {u}^n\right) -\mathsf {E}( \mathbf {u}^\star ) \right) . \end{aligned}$$

It is straightforward to verify that the contraction factor is minimal for $\gamma =\nicefrac {1}{2Q(n)}$, and, in that case, one has that

$$\begin{aligned} 0 \le \mathsf {E}( \mathbf {u}^{n+1}) -\mathsf {E}( \mathbf {u}^\star ) \le \left( 1-\frac{1}{4 Q(n)}\right) \left( \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )\right) \!, \end{aligned}$$

which proves the claim. $\square $

Remark 3.3

Since $m_\mu \le \mu (t) \le M_\mu $ as well as $m_\mu \le \xi '(t) \le M_\mu $ for all $t \ge 0$, we get the following crude uniform bound on the contraction factor:

$$\begin{aligned}q(n) \le \left( 1-\frac{m_\mu }{4 M_\mu }\right) \in [0.75,1), \qquad n \ge 0.\end{aligned}$$

We note that, in the context of the relaxed power-law model, cf. Sect. 3.2, this bound, in principle, coincides with the contraction factor from [4].

We note that the contraction factor (25) is not computable as it involves $\mathbf {u}^\star $, and the uniform upper bound from Remark 3.3 is rather pessimistic. In the following, we will establish an improved computable bound, up to higher order error terms, for the contraction factor on finite-dimensional subspaces.

Theorem 3.4

Assume that (A1)–(A3) hold, let $W \subset V$ be a finite-dimensional subspace, and let $\mathsf {E}$ be defined as in (6) (restricted to W). Then, the energy error contracts along the sequence $\{\mathbf {u}^n\} \subset W$ generated by the Kačanov iteration (11) on W in the sense that

$$\begin{aligned} 0\le \mathsf {E}(\mathbf {u}^{n+1})-\mathsf {E}(\mathbf {u}^\star ) \le q_A(n) \left( \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )\right) +o_W\left({\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right), \end{aligned}$$

where now $\mathbf {u}^\star $ denotes the unique minimiser of $\mathsf {E}$ in W,

$$\begin{aligned} q_A(n):=1-\frac{1}{4} \left\{ \mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega } \frac{\mu (|\underline{e}(\mathbf {u}^n)|^2)}{2 \mu '(|\underline{e}(\mathbf {u}^n)|^2) |\underline{e}(\mathbf {u}^n)|^2+\mu (|\underline{e}(\mathbf {u}^n)|^2)}\right\} ^{-1}, \end{aligned}$$

(33)

and $o_W({\left| \left| \left| \mathbf {u}-\mathbf {u}^n \right| \right| \right| }^2_\Omega )$ denotes a remainder term depending on W.

Proof

This result follows, in principle, from the proof of Theorem 3.2 with a modification of the bound from (32). Consider the map $\omega :W \rightarrow W^\star $ given by $\omega (\mathbf {u}):=\mathsf {A}[\mathbf {u}](\mathbf {u})$ for $\mathbf {u}\in W$. A lengthy, but straightforward calculation reveals that the Gâteaux derivative of $\omega $ exists and is given by

$$\begin{aligned} \omega '(\mathbf {u})(\mathbf {v})(\mathbf {w})=\int _\Omega 2 \mu '\left( \left| \underline{e}(\mathbf {u})\right| ^2\right) (\underline{e}(\mathbf {u}):\underline{e}(\mathbf {v}))(\underline{e}(\mathbf {u}):\underline{e}(\mathbf {\mathbf {w}}))+\mu \left( \left| \underline{e}(\mathbf {u})\right| ^2\right) \left( \underline{e}(\mathbf {v}):\underline{e}(\mathbf {\mathbf {w}})\right) \,\mathsf {d}{\varvec{x}}, \quad \mathbf {v},\mathbf {w}\in W. \end{aligned}$$

Since W is a finite-dimensional space and $\omega :W \rightarrow W^\star $ is Lipschitz continuous by Proposition 2.2, the Gâteaux derivative coincides with the Fréchet derivative, see [19, Prop. 3.5]. By definition of the Fréchet derivative, one has that

$$\begin{aligned} \omega (\mathbf {u})=\omega \left( \mathbf {u}^n\right) +\omega '\left( \mathbf {u}^n\right) \left( \mathbf {u}-\mathbf {u}^n\right) +o_W\left( {\left| \left| \left| \mathbf {u}-\mathbf {u}^n \right| \right| \right| }_\Omega \right) ; \end{aligned}$$

here, $o_W({\left| \left| \left| \mathbf {u}-\mathbf {u}^n \right| \right| \right| }_\Omega )$ denotes a remainder in the dual space $W^\star $. Combining these two observations yields

$$\begin{aligned} \left( \mathsf {A}[\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star )](\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))-\mathsf {A}[\mathbf {u}^\star ](\mathbf {u}^\star )\right) (\mathbf {u}^n-\mathbf {u}^\star ) \\ \begin{aligned}&=\left( \omega (\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))-\omega (\mathbf {u}^\star )\right) (\mathbf {u}^n-\mathbf {u}^\star )\\&=t\omega '(\mathbf {u}^n)(\mathbf {u}^n-\mathbf {u}^\star )(\mathbf {u}^n-\mathbf {u}^\star )+o_W\left({\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right) \\&= t\int _\Omega 2 \mu '(|\underline{e}(\mathbf {u}^n)|^2)(\underline{e}(\mathbf {u}^n):\underline{e}(\mathbf {u}^n-\mathbf {u}^\star ))^2 \\&\quad +\mu (|\underline{e}(\mathbf {u}^n)|^2)|\underline{e}(\mathbf {u}^n-\mathbf {u}^\star )|^2 \,\mathsf {d}{\varvec{x}}+o_W\left({\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right). \end{aligned} \end{aligned}$$

Recall that, by assumption (A3), $\mu '(t) \le 0$ for all $t \ge 0$. Therefore, the Cauchy–Schwarz inequality implies that

$$\begin{aligned} \left( \mathsf {A}[\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star )](\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))\right. & \left. -\mathsf {A}[\mathbf {u}^\star ](\mathbf {u}^\star )\right) (\mathbf {u}^n-\mathbf {u}^\star )\\ & \ge t \int _\Omega \Big \{2 \mu '( | \underline{e}(\mathbf {u}^n)| ^2) | \underline{e}(\mathbf {u}^n)| ^2 +\mu ( | \underline{e}(\mathbf {u}^n)| ^2) \Big \}| \underline{e}( \mathbf {u}^n-\mathbf {u}^\star ) | ^2 \,\mathsf {d}{\varvec{x}}\\ & +o_W\left( {\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right) . \end{aligned}$$

Consequently,

$$\begin{aligned} \mathsf {E}(\mathbf {u}^n)-\mathsf {E}(\mathbf {u}^\star )&= \int _0^1 \left( \mathsf {A}[\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star )](\mathbf {u}^\star +t(\mathbf {u}^n-\mathbf {u}^\star ))-\mathsf {A}[\mathbf {u}^\star ](\mathbf {u}^\star )\right) (\mathbf {u}^n-\mathbf {u}^\star ) \, \mathrm {d} t \\&\ge \frac{1}{2} \int _\Omega \left\{ 2 \mu '\left( |\underline{e}(\mathbf {u}^n)|^2\right) |\underline{e}(\mathbf {u}^n)|^2+\mu \left( |\underline{e}(\mathbf {u}^n)|^2\right) \right\} | \underline{e}( \mathbf {u}^n-\mathbf {u}^\star ) | ^2 \,\mathsf {d}{\varvec{x}}+o_W\left( {\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right) , \end{aligned}$$

and thus

$$\begin{aligned} \frac{1}{2} \int _\Omega \mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n-\mathbf {u}^\star )\right| ^2 \,\mathsf {d}{\varvec{x}}&\le \mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega } \frac{\mu ( | \underline{e}(\mathbf {u}^n)| ^2) }{2 \mu '( | \underline{e}(\mathbf {u}^n)| ^2) \left| \underline{e}(\mathbf {u}^n)\right| ^2+\mu ( | \underline{e}(\mathbf {u}^n)| ^2) }\Big (\left( \mathsf {E}\left( \mathbf {u}^n\right) -\mathsf {E}\left( \mathbf {u}^\star \right) \right) \\&\quad +o_W\left( {\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right) \Big ). \end{aligned}$$

The rest follows as in the proof of Theorem 3.2. We note, however, that the factor of the remainder term $o_W\left({\left| \left| \left| \mathbf {u}^\star -\mathbf {u}^n \right| \right| \right| }_\Omega ^2\right)$ above cancels by the multiplication with $\gamma $ in (26). $\square $

Remark 3.5

We emphasize that the contraction factor $q_A$ from (33) is independent of the finite-dimensional subspace $W \subset V$. However, the remainder term $o_W$ may depend on the choice of the given discrete subspace, as indicated by the subscript.

Finally we remark that the energy error is equivalent to the norm error, i.e., the norm error contracts, up to some uniform constant, along the sequence generated by the Kačanov scheme as well. This equivalence was already established in a similar setting, e.g., in [11, Lem. 2.3] and [6, Lem. 5.1]. The proof can also be found in those references.

Proposition 3.6

Let $\mathsf {E}$ be defined as in (6), with $\mu $ satisfying (A1)–(A3), and let $\mathbf {u}^\star $ be the unique minimiser of $\mathsf {E}$ in V; then,

$$\begin{aligned} \frac{m_\mu }{2}{\left| \left| \left| \mathbf {u}^\star - \mathbf {v} \right| \right| \right| }_\Omega ^2 \le \mathsf {E}(\mathbf {v})-\mathsf {E}(\mathbf {u}^\star ) \le \frac{\sqrt{3}M_\mu }{2} {\left| \left| \left| \mathbf {u}^\star -\mathbf {v} \right| \right| \right| }^2_\Omega \qquad \text {for all } \mathbf {v}\in V. \end{aligned}$$

(34)

An analogous result holds on any finite-dimensional subspace $W \subset V$, with V replaced by W in the assertion above.

3.1 Application to the Carreau model

A widely used model for the flow of incompressible non-Newtonian fluids is the Carreau law, cf. [3]. In that case the viscosity coefficient $\mu $ in (1) is of the form

$$\begin{aligned} \mu (t)=\mu _\infty + (\mu _0-\mu _\infty )(1+ \lambda t)^{\frac{r-2}{2}}, \end{aligned}$$

(35)

where for shear-thinning fluids, $r \in (1,2)$, $\lambda > 0$ is the relaxation time, and $0<\mu _\infty<\mu _0<\infty $ denote the infinite and zero shear rate, respectively. The function $\mu $ from (35) is smooth, decreasing since $r \in (1,2)$, and satisfies the structural assumption (A2), cf. (2), thanks to the following lemma.

Lemma 3.7

Let $r \in (1,2)$, $\lambda >0$, and $0<\mu _\infty< \mu _0 < \infty $. Then, the following inequalities hold:

$$\begin{aligned} \mu _\infty (t-s) \le \mu (t^2)t-\mu (s^2)s \le \mu _0 (t-s), \qquad t \ge s \ge 0. \end{aligned}$$

Proof

Define $\xi (t):=\mu (t^2)t$, $t \ge 0$. The mean value theorem yields

$$\begin{aligned} \inf _{\tau \ge 0} \xi '(\tau ) (t-s) \le \xi (t)-\xi (s) \le \sup _{\tau \ge 0} \xi '(\tau )(t-s), \end{aligned}$$

and thus we need to show that $\mu _\infty =\inf _{\tau \ge 0} \xi '(\tau )$ and $\mu _0=\sup _{\tau \ge 0} \xi '(\tau )$. A straightforward calculation reveals that $\xi ''(\tau ) \ne 0$ for all $\tau \ge 0$, i.e., $\xi '$ has no local extrema in the interval $(0,\infty )$. Since, in addition, $\lim _{\tau \rightarrow 0} \xi '(\tau )=\mu _0$ and $\lim _{\tau \rightarrow \infty } \xi '(\tau )=\mu _\infty $, the lemma is established. $\square $

In particular, we may apply Theorems 3.2 and 3.4 to the Carreau model. In this case, the computable contraction factor from (33) reads as follows, with $\mathbf {u}^n \in W$:

$$\begin{aligned} q_A(n)&:=1-\frac{1}{4}\left( \mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega }\frac{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda |\underline{e}(\mathbf {u}^n)|^2)^{\frac{r-2}{2}}}{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda |\underline{e}(\mathbf {u}^n)|^2)^{-1+\frac{r-2}{2}}(1+\lambda (r-1)|\underline{e}(\mathbf {u}^n)|^2)}\right) ^{-1} \\&=1-\frac{1}{4}\mathop {\mathrm {ess\,inf}}\limits _{{\varvec{x}}\in \Omega }\frac{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda |\underline{e}(\mathbf {u}^n)|^2)^{-1+\frac{r-2}{2}}(1+\lambda (r-1)|\underline{e}(\mathbf {u}^n)|^2)}{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda |\underline{e}(\mathbf {u}^n)|^2)^{\frac{r-2}{2}}}. \end{aligned}$$

Let us further examine this factor. First we note that

$$\begin{aligned}\frac{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda t^2)^{-1+\frac{r-2}{2}}(1+\lambda (r-1)t^2)}{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda t^2)^{\frac{r-2}{2}}} \rightarrow 1 \qquad \text {as } t \rightarrow \infty ,\end{aligned}$$

which is optimal from the point of view of contraction. Consequently, we do not expect a significant deterioration of the convergence rate if the rate-of-strain tensor of the solution, i.e., $\underline{e}(\mathbf {u}^\star )$, is unbounded, cf. Experiment 4.1.1.

Moreover, an elementary calculation reveals that

$$\begin{aligned} \frac{\mu _\infty +(\mu _0-\mu _\infty )(1+\lambda t^2)^{-1+\frac{r-2}{2}}\left( 1+\lambda (r-1)t^2\right) }{\mu _\infty +\left( \mu _0-\mu _\infty \right) (1+\lambda t^2)^{\frac{r-2}{2}}} \ge \frac{1+\lambda (r-1)t^2}{1+\lambda t^2} \qquad \text {for all } t \ge 0, \end{aligned}$$

and therefore

$$\begin{aligned} q_A(n)&=1-\frac{1}{4}\mathop {\mathrm {ess\,inf}}\limits _{{\varvec{x}}\in \Omega }\frac{\mu _\infty +\left( \mu _0-\mu _\infty \right) \left( 1+\lambda \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) ^{-1+\frac{r-2}{2}}\left( 1+\lambda (r-1)\left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }{\mu _\infty +\left( \mu _0-\mu _\infty \right) \left( 1+\lambda \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) ^{\frac{r-2}{2}}}\\&\le 1-\frac{1}{4} \mathop {\mathrm {ess\,inf}}\limits _{{\varvec{x}}\in \Omega } \frac{1+\lambda (r-1) \left| \underline{e}(\mathbf {u}^n)\right| ^2}{1+ \lambda \left| \underline{e}(\mathbf {u}^n)\right| ^2}\\ {}&\le 1-\frac{1}{4}(r-1). \end{aligned}$$

In combination with Remark 3.3, we get

$$\begin{aligned} q_A(n) \le \min \left\{ 1-\frac{1}{4} \frac{\mu _\infty }{\mu _0},1-\frac{1}{4}(r-1)\right\} \!, \end{aligned}$$

(36)

i.e., the convergence rate may only deteriorate drastically if $r\rightarrow 1$ and, in addition, $\nicefrac {\mu _\infty }{\mu _0} \rightarrow 0$.

3.2 Application to the relaxed power-law model

Another prominent model for non-Newtonian fluids, e.g., in polymer processing, is the power-law model, see, e.g., [20, Ch. 3.3]. For this model, the weak formulation (4) of the boundary-value problem under consideration is as follows:

$$\begin{aligned} \text {find } \mathbf {u}\in X \ \text {such that} \qquad \int _\Omega |\underline{e}(\mathbf {u})|^{r-2} \underline{e}(\mathbf {u}):\underline{e}(\mathbf {v})\,\mathsf {d}{\varvec{x}}=\ell (\mathbf {v}) \qquad \text {for all } \mathbf {v}\in X; \end{aligned}$$

(37)

here $X:=\{\mathbf {u}\in \mathrm {W}_0^{1,r}(\Omega )^d: \nabla \cdot \mathbf {u}=0\}$ and $\ell \in X^\star $, where, for shear-thinning fluids, $r \in (1,2)$. In particular, the viscosity coefficient is given by

$$\begin{aligned} \mu (t)=t^{\frac{r-2}{2}}. \end{aligned}$$

Clearly, $\mu :{\mathbb {R}}_{> 0} \rightarrow {\mathbb {R}}_{> 0}$ is neither bounded away from zero nor bounded from above, i.e., (A2) is not satisfied. Therefore, as was proposed in the work [4], we consider a relaxed version of $\mu $: for $0<\varepsilon _{-}<\varepsilon _{+}<\infty $ we define the viscosity coefficient

$$\begin{aligned} \mu _{\varepsilon }(t):= {\left\{ \begin{array}{ll} \varepsilon _{-}^{r-2} &{}\quad \, \text {for}\ 0 \le t < \varepsilon _{-}^2, \\ t^{\frac{r-2}{2}} &{}\quad \, \text {for}\ \varepsilon _{-}^2 \le t \le \varepsilon _{+}^2, \\ \varepsilon _{+}^{r-2} &{}\quad \, \text {for}\ t \ge \varepsilon _{+}^2. \end{array}\right. } \end{aligned}$$

(38)

The function $\mu _\varepsilon $ is decreasing, strictly positive, bounded, globally Lipschitz continuous, and satisfies (A2) with

$$\begin{aligned}(r-1)\varepsilon _{+}^{r-2}(t-s) \le \mu (t^2)t-\mu (s^2)s \le \varepsilon _{-}^{r-2}(t-s), \qquad t \ge s \ge 0;\end{aligned}$$

it is, furthermore, differentiable at all $t \in [0,\infty )\setminus \{\varepsilon _{-}^2,\varepsilon _{+}^2\}$ and has finite left- and right-derivatives at $t=\varepsilon _{-}^2$ and $t=\varepsilon _{+}^2$, respectively. Hence, even though $\mu _{\varepsilon }$ is not continuously differentiable on $[0,\infty )$, Theorem 3.2 can, nevertheless, be applied in the given setting. Moreover, in the generic case when the set $\Omega _S^n:=\{{\varvec{x}}\in \Omega : |\underline{e}(\mathbf {u}^n({\varvec{x}}))| \in \{\varepsilon _{-},\varepsilon _{+}\}\}$, for every $n \ge 0$, has Lebesgue measure zero, the operator $\omega $ from the proof of Theorem 3.4 is Fréchet differentiable at $\mathbf {u}^n \in W$. Thus, in turn, Theorem 3.4 can then also be applied to the relaxed power-law model¹. A simple calculation reveals that the computable contraction factor from (33) can again be bounded; indeed,

$$\begin{aligned} q_A(n) \le 1-\frac{1}{4}(r-1). \end{aligned}$$

(39)

Moreover, one even has that $q_A(n)=1-4^{-1}(r-1)$ if the set $\{{\varvec{x}}\in \Omega : \varepsilon _{-} \le |\underline{e}(\mathbf {u}^n({\varvec{x}}))| \le \varepsilon _{+}\}$ is of positive Lebesgue measure. We further remark that

$$\begin{aligned} \frac{m_\mu }{M_\mu }=\frac{(r-1) \varepsilon _{+}^{r-2}}{\varepsilon _{-}^{r-2}} < (r-1), \end{aligned}$$

since $r \in (1,2)$. This shows that the bound (39) is, for every value $r \in (1,2)$, sharper than the bound from Remark 3.3. Furthermore, this bound predicts that it is the physical parameter r that affects the convergence rate of the iteration, in the finite-dimensional setting at least, rather than the quotient $\nicefrac {\varepsilon _{+}^{r-2}}{\varepsilon _{-}^{r-2}}$ implied by existing bounds on the contraction factor, cf. [4, Cor. 19]. Significantly, the upper bound $(r-1)$ on the contraction factor appearing of the right-hand side of (39) is independent of the relaxation parameters $\varepsilon _{\pm }$. This is of importance as we are interested in the power-law model (37) and we thus need to let $\varepsilon _{-} \rightarrow 0$ and $\varepsilon _{+} \rightarrow \infty $. We note that the existence of a bound independent of $\varepsilon _{\pm }$ on the contraction factor of the relaxed Kačanov iteration applied to the power-law model with $r \in (1,2)$ was stated in the infinite-dimensional case as an open problem in [4, Ex. 20].

We further note that the energy functional $\mathsf {E}_{\varepsilon }$ corresponding to the viscosity from (38) coincides with the energy functional ${\mathcal {J}}_\epsilon $ from [4] up to a constant shift depending on $\varepsilon _{-}$. To be precise, one has that

$$\begin{aligned} \mathsf {E}_{\varepsilon }(\mathbf {u})={\mathcal {J}}_\epsilon (\mathbf {u})+\left( \frac{1}{2}-\frac{1}{r}\right) \varepsilon _{-}^r, \qquad \mathbf {u}\in V, \end{aligned}$$

and thus the results established in [4] may be directly applied in our setting. In particular, this implies that the sequence of unique minimisers $\mathbf {u}^\star _\varepsilon \in V$ of $\mathsf {E}_{\varepsilon }$ converges in $\mathrm {W}^{1,r}_{0}(\Omega )^d$ to the unique minimiser $\mathbf {u}^\star \in X$ of

$$\begin{aligned}\mathsf {E}(\mathbf {u})=\frac{1}{r}\int _\Omega |\underline{e}(\mathbf {u})|^r\,\mathsf {d}{\varvec{x}}-\ell (\mathbf {u}),\end{aligned}$$

cf. [4, Cor. 10].

Remark 3.8

The relaxed power-law model could also be solved by using a (damped) Newton method, cf. [10, Prop. 5.3]. However, it is unclear whether and how the convergence rate will deteriorate as $\varepsilon _{-} \rightarrow 0$ and $\varepsilon _{+} \rightarrow \infty $. For an application of Newton’s method to the power-law model with a different regularisation approach we refer to [12]; however, the convergence rate in relation to the choice of the regularisation parameter $\varepsilon $ is not examined in that work.

Remark 3.9

We emphasise that our analysis does also apply to a variable (measurable) exponent $r:\Omega \rightarrow (1,2)$ for both the relaxed power-law model as well as the Carreau model. Then, in (39) and (36), respectively, we need to replace $1-\nicefrac {1}{4}(r-1)$ by $1-\nicefrac {1}{4}(\mathop {\mathrm {ess\,inf}}\limits _{{\varvec{x}}\in \Omega } r({\varvec{x}})-1)$.

4 Experiments

In this section, we will perform some numerical tests to assess our findings. To this end, we consider the simplified problem

$$\begin{aligned} \text {find} \ u \in \mathrm {H}^1_0(\Omega )\ \text {such that} \qquad \int _\Omega \mu (|\nabla u|^2)\nabla u \cdot \nabla v= \int _\Omega fv \,\mathsf {d}{\varvec{x}}\qquad \text {for all } v \in \mathrm {H}^1_0(\Omega ), \end{aligned}$$

where $\Omega :=(-1,1)^2 \setminus [0,1]\times [-1,0] \subset {\mathbb {R}}^2$ is an L-shaped domain, $f \in \mathrm {L}^2(\Omega )$, and the coefficient $\mu $ either obeys the Carreau law (35) or the relaxed power-law (38). We remark that the theory derived before equally applies to this simpler case. In all our experiments below, we use a conforming P1-finite element discretisation, where the mesh consists of ${\mathcal {O}}(10^6)$ triangles, except where explicitly stated otherwise.

4.1 Error decay in dependence on r

First, we will examine how the convergence rate of the error depends on the exponent $\frac{r-2}{2}$; recall that the norm error is equivalent to the energy error, cf. Proposition 3.6. This will be done for both the Carreau and the relaxed power-law model, for smooth and irregular solutions.

4.1.1 Error decay for the Carreau model

Let the function $\mu $ obey the Carreau law (35), with $\mu _\infty =1$, $\mu _0=100$, $\lambda =2$, and varying values of $r \in (1,2)$. The source term f is chosen so that the unique solution is given by

(a)

The smooth function $u^\star (x,y)=\sin (\pi x)\sin (\pi y)$, where $(x,y) \in {\mathbb {R}}^2$ denote the Euclidean coordinates;

(b)

The function

$$\begin{aligned} u^\star (R,\varphi )=R^{\nicefrac {2}{3}}\sin \left( \nicefrac {2\varphi }{3}\right) (1-R \cos (\varphi ))(1+R \cos (\varphi )) (1- R \sin (\varphi ))(1+R \sin (\varphi ))\cos (\varphi ), \end{aligned}$$

where R and $\varphi $ are polar coordinates, which exhibits a singularity at the origin (0, 0).

In the smooth case (a) the mesh is uniform, and in the singular case (b) the mesh is increasingly refined in the vicinity of the singularity point (0,0). In Fig. 1 we plot the error $\left\| \nabla u^n- \nabla u^\star \right\| _{\mathrm {L}^2(\Omega )}$ against the number of iterative steps n. We can clearly see that the convergence rate deteriorates with decreasing r, as was predicted in Sect. 3. We further note that the irregularity of the solution in (b) does not affect the convergence rate, as was conjectured in Sect. 3.1.

4.1.2 Error decay for the relaxed power-law model

Now consider the relaxed power-law model, cf. (38), with $\varepsilon _{-}=10^{-6}$ and $\varepsilon _{+}=10^6$. As before, the source term f is chosen so that (a) $u^\star $ is smooth, and (b) $u^\star $ exhibits a singularity at the origin (0,0). In Fig. 2, the error $\left\| \nabla u^n- \nabla u^\star \right\| _{\mathrm {L}^2(\Omega )}$ is plotted against the number of iterative steps n. We observe that for the power-law model the dependence of the convergence rate on the exponent is even stronger than for the Carreau model.

4.1.3 Error decay for close to constant viscosity

In the experiments before we had that the ratio of the infinite and zero shear rates was much smaller than $(r-1)$, cf. (36). Now we choose the parameters so that the ‘shear stress’ depends almost linearly on the ‘shear rate’, and we further let the source term f be such that the unique solution of (4) is given by the smooth function $u^\star (x,y)=\sin (\pi x) \sin (\pi y)$. For the Carreau law we set $\mu _\infty =1$, $\mu _0=2$, $\lambda =2$, and take again varying values of $r \in (1,2)$; we emphasize that, in this test, we consider even smaller values of r than in the experiments before. In view of the a posteriori computable contraction factor (36) we expect that the convergence rate will not deteriorate drastically for r close to one, which is confirmed by our numerical experiment, cf. Fig. 3 (left). In the case of the relaxed power-law model, we set $\varepsilon _{-}=1$, $\varepsilon _{+}=2$, and test the same values $r \in (1,2)$ as for the Carreau model. We note that these choices of the relaxation parameters $\varepsilon _{\pm }$ are in practice of no interest, as one is, rather, interested in $\varepsilon _{-} \rightarrow 0$ and $\varepsilon _{+} \rightarrow \infty $. Nonetheless, we still presume that the convergence deteriorates for r close to one by our analysis in Sect. 3.2. This is indeed the case, as illustrated in Fig. 3 (right).

4.2 Error decay in dependence on the zero and infinite shear rates

Next, we will show that, for fixed $r \in (1,2)$, the convergence rate does not essentially deteriorate when the ratio of the infinite and zero shear rates decreases. As in the experiment before, we choose the source term f so that the unique solution is given by the smooth function $u^\star (x,y)=\sin (\pi x)\sin (\pi y)$. For the Carreau model we set $\lambda =2$, $r=1.5$, $\mu _0=10^a$, and $\mu _\infty =10^{-a}$ for $a \in \{1,2,3,4,5\}$. As we can see from Fig. 4 (left), the convergence rate is (almost) independent of a, i.e., the convergence does not deteriorate for a decreasing quotient $\nicefrac {\mu _\infty }{\mu _0}$. For the relaxed power-law model we set $r=1.5$, $\varepsilon _{-}=10^{-a}$, and $\varepsilon _{+}=10^a$ for $a \in \{1,2,3,4,5\}$. Even though the plots differ for the various values of a, the convergence rate is almost the same for all of them; indeed, no significant deterioration of the convergence rate can be observed in Fig. 4 (right) for increasing a.

4.3 Energy decay and the contraction factor

We now focus on the energy decay, and compare the exact contraction factor, cf. (40), the a posteriori computable factor (41), and the worst case factor from Remark 3.3, cf. (42). Again, this will be done for the Carreau and the relaxed power-law models. In our figures below, we plot the energy decay $\mathsf {E}(u^n)-\mathsf {E}(u^\star )$, as well as the aforementioned factors

$$\begin{aligned} q_E(n)&=\frac{\mathsf {E}(u^n)-\mathsf {E}(u^\star )}{\mathsf {E}(u^{n-1})-\mathsf {E}(u^\star )}, \end{aligned}$$

(40)

$$\begin{aligned} q_A(n)&=1-\frac{1}{4} \left\{ \mathop {\mathrm {ess\,sup}}\limits _{{\varvec{x}}\in \Omega } \frac{\mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }{2 \mu '\left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) \left| \underline{e}(\mathbf {u}^n)\right| ^2+\mu \left( \left| \underline{e}(\mathbf {u}^n)\right| ^2\right) }\right\} ^{-1}, \end{aligned}$$

(41)

$$\begin{aligned} q_W(n)&=1-\frac{1}{4} \frac{m_\mu }{M_\mu }, \end{aligned}$$

(42)

against the number of iteration steps n.

4.3.1 Energy contraction for the Carreau model

We consider the Carreau model, cf. (35), for $\mu _\infty =1$, $\mu _0=100$, $\lambda =2$, and $r=1.3$, respectively $r=1.1$. In both cases, we approximate the discrete solution for the source term f from case (a) before by performing seventy steps of the Kačanov iteration (11), and subsequently use this approximation for the determination of the reference energy $\mathsf {E}(u^\star )$; here, $u^\star $ denotes the unique minimiser in the finite element space. We can clearly observe in Fig. 5 that, on the one hand, the a posteriori computable factor $q_A(n)$, cf. (41), is much larger than the actual factor $q_E(n)$, cf. (40). On the other hand, however, the factor $q_A(n)$ clearly still improves the worst case factor $q_W(n)$ from Remark 3.3, cf. (42).

4.3.2 Energy contraction for the relaxed power-law model

Let the coefficient $\mu $ obey the relaxed power-law model with $\varepsilon _{-}=10^{-6}$, $\varepsilon _{+}=10^6$, and $r=1.3$, respectively $r=1.1$. In this experiment, we approximate the discrete solution $u^\star $ by performing fifty and one hundred iteration steps for $r=1.3$ and $r=1.1$, respectively. As before, the a posteriori computable contraction factor $q_A(n)$ is noticeably larger than the exact factor $q_E(n)$, however, this is less marked than before; see Fig. 6. Moreover, it considerably improves the worst case contraction factor $q_W(n) \approx 1-10^{-12}$.

We now repeat this experiment on a coarser mesh consisting of ${\mathcal {O}}(10^5)$ uniform triangles. In Fig. 7 we plot the factors $q_A(n)$, $q_E(n)$, as well as q(n) from (25) against the number of iteration steps. We observe that the (non-computable) factor q(n) from (25) has a similar trend as the exact factor $q_E(n)$, cf. (40), and approximates the computable factor $q_A(n)$, cf. (41), as the number of iteration steps increases.

Finally, we remark that, in the context of fixed point iterations, the contraction factor can be (heuristically) approximated by

$$\begin{aligned} q_H(n):=\min \left\{ 1,\frac{\mathsf {E}(u^n)-\mathsf {E}(u^{n-1})}{\mathsf {E}(u^{n-1})-\mathsf {E}(u^{n-2})}\right\} \end{aligned}$$

(43)

as $n \rightarrow \infty $, see, e.g., [18]; we emphasize that $q_H(n) \ge 0$, for $n \ge 2$, thanks to (27). As can be observed in Fig. 8, the factor $q_H(n)$, cf. (43), does indeed approximate the exact factor $q_E(n)$ from (40) well for sufficiently large n. However, in contrast with the bound $q_A(n)$ from Theorem 3.4, the computable factor $q_H(n)$ does not provide any guaranteed a priori information.

4.4 Energy decay for different mesh sizes

We conclude this section with a comparison of the energy decay for different mesh sizes. For the Carreau model, cf. (35), we set $\mu _\infty =1$, $\mu _0=100$, $\lambda =2$, and $r=1.3$. In the case of the relaxed power-law model, let $\varepsilon _{-}=10^{-6}$, $\varepsilon _{+}=10^6$, and $r=1.3$. In each case we approximated the discrete solution, and, in turn, the corresponding energy by performing one hundred iteration steps. As we can see from Fig. 9, the asymptotic convergence rates (almost) coincide for the different mesh sizes.

5 Conclusion

In this article, we established an a posteriori computable (energy) contraction factor for the Kačanov scheme (on finite-dimensional Galerkin spaces), motivated by applications to quasi-Newtonian fluid flow problems. For the relaxed power-law model, this factor is independent of the relaxation parameters $\varepsilon _{\pm }$; we also demonstrated that it is, instead, the power-law exponent that affects the convergence rate of the iteration. In contrast, existing bounds on the contraction factor of the relaxed Kačanov iteration depend on the relaxation parameters $\varepsilon _{\pm }$ in an unfavourable manner, in the sense that they tend to 1 as $\varepsilon _{-} \rightarrow 0$ and/or $\varepsilon _{+} \rightarrow \infty $. A series of numerical tests have confirmed that our a posteriori computable contraction factor improves, on finite-dimensional Galerkin spaces, existing bounds, and that, as predicted by our analysis, for the power-law model it is in fact the closeness of the power-law exponent $r \in (1,2)$ to 1 that influences the convergence rate of the iteration. However, our experiments revealed that the theoretically derived bound on the contraction factor of the Kačanov scheme is still too pessimistic.

Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Vorheriger Artikel A proximal point like method for solving tensor least-squares problems

Nächster Artikel A lowest-degree quasi-conforming finite element de Rham complex on general quadrilateral grids by piecewise polynomials

Appendix

A Smoothly relaxed power-law model

In this appendix, we will introduce a continuously differentiable approximation of the relaxed power-law viscosity (38). In particular, for $0<\delta<\varepsilon _{-}^2 < \varepsilon _{+}^2$, we want to define a function $\mu _{\delta ,\varepsilon }:{\mathbb {R}}_{\ge 0} \rightarrow {\mathbb {R}}_{\ge 0}$ which

(a)

Is continuously differentiable;

(b)

Coincides with $\mu _{\varepsilon }$, cf. (38), in the domain $[\varepsilon _{-}^2+\delta ,\varepsilon _{+}^2-\delta ]$;

(c)

Is constant on $[0,\varepsilon _{-}^2-\delta ] \cup [\varepsilon _{+}^2+\delta ,\infty )$;

(d)

Converges pointwise to $\mu _\varepsilon $ for $\delta \rightarrow 0$ (Fig. 10).

The idea is to identify quadratic functions $g_{\delta ,\varepsilon }^{\pm }$ on $[\varepsilon _{\pm }^2-\delta ,\varepsilon _{\pm }^2+\delta ]$, respectively, which smoothly connect the constant parts of $\mu _{\varepsilon }$ with the map $t \mapsto t^{\frac{r-2}{2}} = \mu (t)$ on $[\varepsilon _{-}^2+\delta ,\varepsilon _{+}^2-\delta ]$, i.e.,

$$\begin{aligned} \mu _{\delta ,\varepsilon }(t)={\left\{ \begin{array}{ll} g_{\delta ,\varepsilon }^{-}(\varepsilon _{-}^2-\delta ) & 0 \le t \le \varepsilon _{-}^2-\delta \\ g_{\delta ,\varepsilon }^{-}(t) &{} \varepsilon _{-}^2-\delta<t\le \varepsilon _{-}^2+\delta \\ t^{\frac{r-2}{2}} &{} \varepsilon _{-}^2+\delta< t \le \varepsilon _{+}^2-\delta \\ g_{\delta ,\varepsilon }^{+}(t) &{} \varepsilon _{+}^2-\delta <t \le \varepsilon _{+}^2+\delta \\ g_{\delta ,\varepsilon }^{+}(\varepsilon _{+}^2+\delta ) &{} t > \varepsilon _{+}^2+\delta . \end{array}\right. } \end{aligned}$$

A straightforward calculation reveals that the properties (a)–(d) are satisfied for

$$\begin{aligned} g_{\delta ,\varepsilon }^{-}(t)=&\frac{\left( \varepsilon _{-}^2+\delta \right) ^{\frac{r-4}{2}}(r-2)}{8 \delta } t^2-\frac{\left( \varepsilon _{-}^2+\delta \right) ^{\frac{r-4}{2}}\left( \varepsilon _{-}^2- \delta \right) (r-2)}{4 \delta }t \\&\quad -\frac{\left( \varepsilon _{-}^2+ \delta \right) ^{\frac{r-2}{2}}\left( -14 \delta + 2 \varepsilon _{-}^2+3 \delta r-\varepsilon _{-}^2r\right) }{8 \delta } \end{aligned}$$

and

$$\begin{aligned} g_{\delta ,\varepsilon }^{+}(t)=&-\frac{\left( \varepsilon _{+}^2-\delta \right) ^{\frac{r-4}{2}}(r-2)}{8 \delta } t^2+\frac{\left( \varepsilon _{+}^2-\delta \right) ^{\frac{r-4}{2}}\left( \varepsilon _{+}^2+\delta \right) (r-2)}{4 \delta } t \\&\quad -\frac{\left( \varepsilon _{+}^2-\delta \right) ^{\frac{r-2}{2}}\left( -14 \delta - 2 \varepsilon _{+}^2+3 \delta r+\varepsilon _{+}^2r\right) }{8 \delta }. \end{aligned}$$

Nonetheless we will present a continuously differentiable version of the viscosity coefficient (38) in the Appendix.

Baranger, J., Najib, K.: Analyse numérique des écoulements quasi-newtoniens dont la viscosité obéit à la loi puissance ou la loi de Carreau. Numer. Math. 58(1), 35–49 (1990)MathSciNetCrossRef

Barrett, J.W., Liu, W.B.: Finite element error analysis of a quasi-Newtonian flow obeying the Carreau or power law. Numer. Math. 64(4), 433–453 (1993)MathSciNetCrossRef

Carreau, P.J.: Rheological equations from molecular network theories. Trans. Soc. Rheol. 16(1), 99–127 (1972)CrossRef

Diening, L., Fornasier, M., Tomasi, R., Wank, M.: A relaxed Kačanov iteration for the $p$-poisson problem. Numer. Math. 145(1), 1–34 (2020)MathSciNetCrossRef

Fučík, S., Kratochvíl, A., Nečas, J.: Kačanov-Galerkin method. Comment. Math. Univ. Carolinae 14, 651–659 (1973) (MR 365300)MathSciNetMATH

Gantner, G., Haberl, A., Praetorius, D., Stiftner, B.: Rate optimal adaptive FEM with inexact solver for nonlinear operators. IMA J. Numer. Anal. 38(4), 1797–1831 (2018)MathSciNetCrossRef

Garau, E.M., Morin, P., Zuppa, C.: Convergence of an adaptive Kačanov FEM for quasi-linear problems. Appl. Numer. Math. 61(4), 512–529 (2011)MathSciNetCrossRef

Han, W., Jensen, S., Shimansky, I.: The Kačanov method for some nonlinear problems. Appl. Numer. Meth. 24, 57–79 (1997)CrossRef

Heid, P., Praetorius, D., Wihler, T.P.: Energy contraction and optimal convergence of adaptive iterative linearized finite element methods. Comput. Methods Appl. Math. 21(2), 407–422 (2021) (MR 4235817)MathSciNetCrossRef

10.

Heid, P., Wihler, T.P.: Adaptive iterative linearization Galerkin methods for nonlinear problems. Math. Comput. 89(326), 2707–2734 (2020)MathSciNetCrossRef

11.

Heid, P., Wihler, T.P.: On the convergence of adaptive iterative linearized Galerkin methods. Calcolo 57(3), 24 (2020) (MR 4131951)MathSciNetCrossRef

12.

Hirn, A.: Finite element approximation of singular power-law systems. Math. Comput. 82(283), 1247–1268 (2013)MathSciNetCrossRef

13.

Kačanov, L.M.: Variational methods of solution of plasticity problems. J. Appl. Math. Mech. 23, 880–883 (1959)MathSciNetCrossRef

14.

Kačur, J., Nečas, J., Polák, J., Souček, J.: Convergence of a method for solving the magnetostatic field in nonlinear media. Apl. Mat. 13(6), 456–465 (1968)CrossRef

15.

Michlin, S.G.: Čislennaja Realizacija Variacionnych Metodov. Nauka, Izd (1966)

16.

Nečas, J.: Introduction to the Theory of Nonlinear Elliptic Equations. Wiley, New Jersey (1986)MATH

17.

Neff, P., Pauly, D., Witsch, K.-J.: Poincaré meets Korn via Maxwell: extending Korn''s first inequality to incompatible tensor fields. J. Differ. Equ. 258(4), 1267–1302 (2015)CrossRef

18.

Senning, J.R.: Computing and Estimating the Rate of Convergence, 2007, online lecture notes available from: http://fourier.eng.hmc.edu/e176/lectures/rate.pdf

19.

Shapiro, A.: On concepts of directional differentiability. J. Optim. Theory Appl. 66(3), 477–487 (1990) (MR 1080259)MathSciNetCrossRef

20.

Tadmor, Z., Gogos, C.G.: Principles of Polymer Processing. Wiley, EngineeringPro collection, New Jersey (2006)

21.

Zeidler, E.: Nonlinear Functional Analysis and its Applications. Springer, New York, II/B (1990)CrossRef

Titel: On the convergence rate of the Kačanov scheme for shear-thinning fluids
verfasst von: Pascal Heid
Endre Süli
Publikationsdatum: 01.03.2022
Verlag: Springer International Publishing
Erschienen in: Calcolo / Ausgabe 1/2022
Print ISSN: 0008-0624
Elektronische ISSN: 1126-5434
DOI: https://doi.org/10.1007/s10092-021-00444-3

Springer Professional

Abstract

Publisher's Note

1 Introduction

2 Existence and uniqueness of the solution

3 Energy contraction

3.1 Application to the Carreau model

3.2 Application to the relaxed power-law model

4 Experiments

4.1 Error decay in dependence on r

4.1.1 Error decay for the Carreau model

4.1.2 Error decay for the relaxed power-law model

4.1.3 Error decay for close to constant viscosity

4.2 Error decay in dependence on the zero and infinite shear rates

4.3 Energy decay and the contraction factor

4.3.1 Energy contraction for the Carreau model

4.3.2 Energy contraction for the relaxed power-law model

4.4 Energy decay for different mesh sizes

5 Conclusion

Publisher's Note

Appendix

A Smoothly relaxed power-law model

Weitere Artikel der Ausgabe 1/2022

A proximal point like method for solving tensor least-squares problems

Correction to: Analysis of generalised alternating local discontinuous Galerkin method on layer-adapted mesh for singularly perturbed problems

Superconvergence of a modified weak Galerkin method for singularly perturbed two-point elliptic boundary-value problems

A mass and energy conservative fourth-order compact difference scheme for the Klein-Gordon-Dirac equations

Rational interpolation operator with finite Lebesgue constant

Randomised one-step time integration methods for deterministic operator differential equations