Skip to main content
Top

Open Access 18-10-2024

Explicit A Posteriori Error Representation for Variational Problems and Application to TV-Minimization

Authors: Sören Bartels, Alex Kaltenbach

Published in: Foundations of Computational Mathematics

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In this paper, we propose a general approach for explicit a posteriori error representation for convex minimization problems using basic convex duality relations. Exploiting discrete orthogonality relations in the space of element-wise constant vector fields as well as a discrete integration-by-parts formula between the Crouzeix–Raviart and the Raviart–Thomas element, all convex duality relations are transferred to a discrete level, making the explicit a posteriori error representation –initially based on continuous arguments only– practicable from a numerical point of view. In addition, we provide a generalized Marini formula that determines a discrete primal solution in terms of a given discrete dual solution. We benchmark all these concepts via the Rudin–Osher–Fatemi model. This leads to an adaptive algorithm that yields a (quasi-optimal) linear convergence rate.
Notes
Communicated by Endre Süli.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

The numerical analysis of the approximation of variational problems is challenging when these are non-differentiable, degenerate, or involve constraints. In particular, following established concepts for linear elliptic partial differential equations often leads to sub-optimal results only. The framework of convex duality provides an attractive concept to reveal hidden information and structures to obtain quasi-optimal error representation formulas under meaningful regularity conditions. Similar to [45, 46], we first exploit this idea to derive explicit computable a posteriori error estimates for a natural error measure. Then, this general result is applied to a non-differentiable model problem with discontinuous solutions. As a whole, our results, similar to [45, 46], show that the question of developing asymptotically exact a posteriori error estimators is rather a question of identifying optimal error quantities. However, different from [45, 46], we also propose a general approach for making our results practicable from a numerical point of view.
Given a domain \(\Omega \subseteq \mathbb {R}^d\)\({d\in \mathbb {N}}\), a proper, lower semi-continuous, and convex energy density \(\phi :{\mathbb {R}^d}\rightarrow \mathbb {R}\cup \{+\infty \}\), a (Lebesgue) measurable energy density \(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) that is proper, lower semi-continuous, and convex with respect to the second argument, and a Banach space X consisting of functions defined in \(\Omega \), we denote by the minimization of the energy functional \(I:X\rightarrow \mathbb {R}\cup \{+\infty \}\), for every \(v\in X\) defined by
$$\begin{aligned} I(v) {:}{=}\int _{\Omega }{\phi (\nabla v)\,\textrm{d}x} + \int _{\Omega }{\psi (\cdot , v)\,\textrm{d}x}\,, \end{aligned}$$
(1.1)
the primal problem.
Its (Fenchel) dual problem is given via the maximization of the energy functional \(D:Y\rightarrow \mathbb {R}\cup \{-\infty \}\), where Y is a Banach space consisting of vector fields defined in \(\Omega \), for every \(y\in Y\) is defined by
$$\begin{aligned} D(y) {:}{=}-\int _{\Omega }{\phi ^*(y)\,\textrm{d}x} - \int _{\Omega }{\psi ^*(\cdot , \textrm{div}\,y)\,\textrm{d}x}\,. \end{aligned}$$
(1.2)
Here, \(\phi ^*:\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi ^*:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) (with respect to the second argument) denote the (Fenchel) conjugates of \(\phi :{\mathbb {R}^d}\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), respectively. Under rather general conditions (cf. [33, 51]), we have the well-posedness of the primal problem and the dual problem, i.e., the existence of a minimizer \(u\in X\) of (1.1), i.e., a primal solution, and of a maximizer \(z\in Y\) of (1.2), i.e., a dual solution, and the strong duality relation
$$\begin{aligned} \min _{v\in X} I(v) = I(u)= D(z) = \max _{y\in Y} D(y)\,. \end{aligned}$$
(1.3)
Since \(u\hspace{-0.1em}\in \hspace{-0.1em}X\) and \(z\hspace{-0.1em}\in \hspace{-0.1em} Y\) are optimal for (1.1) and (1.2), respectively, it holds that  \(0\hspace{-0.1em}\in \hspace{-0.1em} \partial I(u)\) and \({0\hspace{-0.1em}\in \hspace{-0.1em} \partial D(z)}\). In particular, for every \(v\in X\) and \(y\in Y\), the quantities
$$\begin{aligned} \rho _I^2(v,u)&{:}{=}I(v) - I(u)\,, \end{aligned}$$
(1.4)
$$\begin{aligned} \rho _{-D}^2(y,z)&{:}{=}D(z) - D(y)\,, \end{aligned}$$
(1.5)
are non-negative. They define distances, if (1.1) and (1.2), respectively, are strictly convex, and are called optimal strong convexity measures.
For admissible approximations \(v\in X\) with \(I(v)<+\infty \) and \(y\in Y\) with \(D(y)>-\infty \), given the definitions (1.4) and (1.5), the strong duality relation (1.3) implies the a posteriori error identity
$$\begin{aligned} \begin{aligned} \rho _I^2(v,u) + \rho _{-D}^2(y,z) = I(v) - D({y}) {=}{:}\eta ^2(v,y)\,. \end{aligned} \end{aligned}$$
(1.6)
Hence, the fully computable error estimator \(\eta ^2:X\times Y\rightarrow \mathbb {R}\cup \{+\infty \}\), cf. (1.6), exactly represents the sum of the primal and dual approximation errors, i.e., of (1.4) and (1.5).
The error representation (1.6) can be seen as a generalization of the Prager–Synge result (cf. [19, 20, 43]), which states that for the Poisson problem, i.e., \(\phi \hspace{-0.15em}{:}{=}\hspace{-0.15em} \frac{1}{2}\vert \cdot \vert ^2\hspace{-0.15em}\in \hspace{-0.15em} C^1(\mathbb {R}^d)\)\({\psi \hspace{-0.15em}{:}{=}\hspace{-0.15em} ((t,x)^\top \hspace{-0.15em}\mapsto \hspace{-0.15em} -f(x)t):}\) \(\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), where \(f\in L^2(\Omega )\), \(X{:}{=}W^{1,2}_D(\Omega )\), and \({Y{:}{=}W^2_N(div ;\Omega )}\) (cf. (2.5)), for every \({v\in W^{1,2}_D(\Omega )}\) and \({y\in W^2_N(div ;\Omega )}\) with \(div \,y=-f\) a.e. in \(\Omega \), we have that
$$\begin{aligned} \begin{aligned} \tfrac{1}{2} \Vert \nabla v -\nabla u\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2 + \tfrac{1}{2} \Vert y - z \Vert _{L^2(\Omega ;\mathbb {R}^d)}^2 = \tfrac{1}{2} \Vert \nabla v-y \Vert ^2_{L^2(\Omega ;\mathbb {R}^d)}\,. \end{aligned} \end{aligned}$$
(1.7)
The equation (1.7) has been used by various authors to define error estimators; for a comprehensive list of references, we refer the reader to [18]. Often, local procedures are devised to construct an admissible vector field \(y\in W^2_N(div ;\Omega )\) with \(-div \,y=f\) a.e. in \(\Omega \) from a given function \({v\in W^{1,2}_D(\Omega )}\). While this leads to efficient procedures to obtain accurate error estimators, the arguments cannot be expected to transfer to non-linear problems. Another alternative to computing approximations for the primal and dual problems consists in using finite element methods for which reconstruction formulas are available, e.g., using the discontinuous Crouzeix–Raviart finite element method and the Marini formula in the case of the Poisson problem (cf. [39]).
It has recently been found (cf. [4, 27]) that the discontinuous Crouzeix–Raviart finite element method leads to quasi-optimal a priori error estimates for non-linear and non-differentiable problems, while continuous finite element methods provide only a sub-optimal convergence behavior. In the derivation of those results, a general discrete convex duality theory with Raviart–Thomas vector fields has emerged that also leads to reconstruction formulas in rather general settings. As a consequence, given an approximation \(v\in X\) or \(y\in Y\), respectively, the missing one can be obtained via a simple post-processing procedure. Then, the pair leads to the error representation formula (1.6). It should also be noted that neither \(v\in X\) nor \(y\in Y\) needs to be optimal in a subspace of X or Y. By introducing appropriate residuals, any pair of admissible approximations of \(u\in X\) and \(z\in Y\) can be used. This is particularly important for non-linear problems, i.e., non-quadratic functionals, where an exact solution of discrete problems is neither possible nor rational.
A difficulty in the application of the explicit a posteriori error representation formula (1.6) arises from the condition that \(v\in X\) and \(y\in Y\) need to be admissible for the functionals (1.1) and (1.2). In the case of the Poisson problem, this arises, e.g., via element-wise constant approximations of \(f\in L^2(\Omega )\) that are the images of Raviart–Thomas vector fields under the divergence operator. While data terms can be controlled by introducing appropriate data oscillation terms, structural peculiarities of the energy densities \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) and their (Fenchel) conjugates \(\phi ^*:\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi ^*:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) are often more challenging. We illustrate this by analyzing a non-differentiable problem which leads to a new error analysis and an adaptive refinement procedure for the computationally challenging minimization problem.
For a given function \(g\in L^2(\Omega )\), i.e., the noisy image, and a given parameter \(\alpha >0\), i.e., the fidelity parameter, the Rudin–Osher–Fatemi (ROF) model (cf. [47]) seeks a minimizer \(u\in BV(\Omega )\cap L^2(\Omega )\), i.e., the de-noised image, where \(BV(\Omega )\) denotes the space of functions with bounded variation, for the functional \(I:BV(\Omega )\cap L^2(\Omega )\rightarrow \mathbb {R}\), for every \({v\in BV(\Omega )\cap L^2(\Omega )}\) defined by
$$\begin{aligned} I(v) {:}{=}\vert \textrm{D}v\vert (\Omega ) + \tfrac{\alpha }{2} \Vert v-g\Vert _{L^2(\Omega )}^2 \,, \end{aligned}$$
(1.8)
where \(\vert \textrm{D}(\cdot )\vert (\Omega ):\hspace{-0.1em}BV(\Omega )\hspace{-0.1em}\rightarrow \hspace{-0.1em} [0,+\infty ]\) denotes the total variation functional. The functional (1.8) can be seen as a special case of (1.1) with \(\phi \hspace{-0.15em}= \hspace{-0.15em}|\cdot |\hspace{-0.15em}\in \hspace{-0.15em} C^0(\mathbb {R}^d)\) and \({\psi \hspace{-0.15em}=\hspace{-0.15em}((x,t)^\top \hspace{-0.15em}\mapsto \hspace{-0.15em} \frac{\alpha }{2}(t-g(x))^2):\hspace{-0.15em}\Omega \hspace{-0.15em}\times \hspace{-0.15em} \mathbb {R}\hspace{-0.15em}\rightarrow \hspace{-0.15em} \mathbb {R}}\). The (Fenchel) (pre-)dual problem to the minimization of the functional (1.8) consists in the maximization of the functional \(D:W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\rightarrow \mathbb {R}\cup \{-\infty \}\), for every \({y\in W_N^2(div ;\Omega )} \) \(\cap L^\infty (\Omega ;\mathbb {R}^d)\) defined by
$$\begin{aligned} D(y) {:}{=}-\smash {I_{K_1(0)}^{\Omega }}(y)-\tfrac{1}{2\alpha } \Vert \textrm{div}\, y+\alpha g\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g\Vert _{L^2(\Omega )}^2\,, \end{aligned}$$
(1.9)
where \(\smash {I_{K_1(0)}^{\Omega }}(y){:}{=}0\) if \(\vert y\vert \le 1\) a.e. in \(\Omega \) and \(\smash {I_{K_1(0)}^{\Omega }}(y){:}{=}+\infty \) else. The primal solution \(u\in BV(\Omega )\) \(\cap L^2(\Omega )\), i.e., the unique minimizer of (1.8), and a dual solution \({z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)}\), i.e., a (possibly non-unique) maximizer of (1.9), are (formally) related via (cf. [26, p. 284])
$$\begin{aligned} \begin{aligned} z&\in \left. {\left\{ \begin{array}{ll} \big \{\frac{\nabla u}{|\nabla u|}\big \}& \text { if }|\nabla u|>0\\ K_1(0)& \text { if }|\nabla u|=0 \end{array}\right. }\right\} & \quad \text { a.e. in }\Omega \,,\\ div \, z&= \alpha \, (u-g) & \quad \text { a.e. in }\Omega \,. \end{aligned} \end{aligned}$$
(1.10)
The relations (1.10) determine \(z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\) via \(u\in BV(\Omega )\cap L^2(\Omega )\) and vice versa. A Crouzeix–Raviart finite element approximation of (1.1) is given by the minimization of the regularized, discrete functional \(I_{h,\varepsilon }^{cr}:\mathcal {S}^{1,cr}(\mathcal {T}_h)\rightarrow \mathbb {R}\), \(h,\varepsilon >0\), for every \(v_h\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) defined by
$$\begin{aligned} I_{h,\varepsilon }^{cr}(v_h) {:}{=}\Vert f_\varepsilon (\vert \nabla _{\!h} v_h\vert )\Vert _{L^1(\Omega )} + \tfrac{\alpha }{2} \Vert \Pi _h(v_h-g)\Vert _{L^2(\Omega )}^2 \,. \end{aligned}$$
Here, \(\nabla _{\!h}\) is the element-wise application of the gradient operator and \(f_\varepsilon \!\in \! \smash {C^1(\mathbb {R})}\) is a regularization of the modulus \(\vert \cdot \vert \), and \(\Pi _h\) denotes the (local) \(L^2\)-projection onto element-wise constant functions. A quasi-optimal dual Raviart–Thomas vector field \(z_{h,\varepsilon }^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) can be associated with a minimizing function \(u_{h,\varepsilon }^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) of \(I_{h,\varepsilon }^{cr}:\mathcal {S}^{1,cr}(\mathcal {T}_h)\rightarrow \mathbb {R}\) via the reconstruction formula
$$\begin{aligned} z_{h,\varepsilon }^{rt} = \tfrac{f_\varepsilon '(\vert \nabla _{\!h} u_{h,\varepsilon }^{cr}\vert ) }{\vert \nabla _{\!h} u_{h,\varepsilon }^{cr}\vert }\nabla _{\!h} u_{h,\varepsilon }^{cr} + \alpha \tfrac{\Pi _h (u_{h,\varepsilon }^{cr} -g)}{d}\big ( \textrm{id}_{\mathbb {R}^d}- \Pi _h \textrm{id}_{\mathbb {R}^d}\big )\quad \text { in }\mathcal {R}T^0_N(\mathcal {T}_h)\,. \end{aligned}$$
(1.11)
For canonical choices of \(f_\varepsilon \hspace{-0.15em}\in \hspace{-0.15em}C^1(\mathbb {R})\), e.g., \(f_\varepsilon \hspace{-0.15em}=\hspace{-0.15em}\vert \cdot \vert _\varepsilon \hspace{-0.15em}=\hspace{-0.15em} ((\cdot )^2+\varepsilon ^2)^{1/2}\), it holds that \(\vert \Pi _h z_{h,\varepsilon }^{rt}\vert \hspace{-0.15em}\le \hspace{-0.15em} 1\) a.e. in \(\Omega \), but not \(|z_{h,\varepsilon }^{rt}|\le 1\) a.e. in \(\Omega \). Thus, we employ \(f_\varepsilon = (1-\varepsilon )\, |\cdot |_\varepsilon \), so that \(|f_\varepsilon '(t)|\le 1-\varepsilon \) for all \(t\in \mathbb {R}\). The choice \(\varepsilon \sim h^2\) in (1.11) and an additional projection step onto \(K_1(0)\) provides an accurate approximation \({\overline{z}}_{h,\varepsilon }^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) of \(z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), which satisfies \(|{\overline{z}}_{h,\varepsilon }^{rt}|\le 1\) a.e. in \(\Omega \) and, thus, represents an admissible test function that leads to the definition of an error estimator. The resulting adaptive mesh-refinement procedure leads to significantly improved experimental convergence rates compared to recent related contributions (cf. [9, 11, 13]). More precisely, we report quasi-optimal linear convergence rates which have been obtained only for meshes with quadratic grading towards a sufficiently simple jump set of a piece-wise regular g in [11].
This article is organized as follows: In Sect. 2, we introduce the employed notation and the relevant finite element spaces. In Sect. 3, we propose a general approach for explicit a posteriori error representation for convex minimization problems based on (discrete) convex duality relations. In Sect. 4, we apply the concepts of Sect. 3 to the Rudin–Osher–Fatemi model and propose a regularization scheme. In Sect. 5, we review our theoretical findings via numerical experiments.

2 Preliminaries

2.1 Convex Analysis

For a (real) Banach space X, which is equipped with the norm \(\Vert \cdot \Vert _X:X\rightarrow \mathbb {R}_{\ge 0}\), we denote its corresponding (continuous) dual space by \(X^*\) equipped with the dual norm \(\Vert \cdot \Vert _{X^*}:X^*\rightarrow \mathbb {R}_{\ge 0}\), defined by \(\Vert x^*\Vert _{X^*}{:}{=}\sup _{\Vert x\Vert _X\le 1}{\langle x^*,x\rangle _X}\) for every \(x^*\in X^*\), where \(\langle \cdot ,\cdot \rangle _X:X^*\times X\rightarrow \mathbb {R}\), defined by \(\langle x^*,x\rangle _X{:}{=}x^*(x)\) for every \(x^*\in X^*\) and \(x\in X\), denotes the duality pairing. A functional \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) is called sub-differentiable in \(x\in X\), if \( F(x)<\infty \) and if there exists \(x^*\in X^*\), called sub-gradient, such that for every \( y\in X \), it holds that
$$\begin{aligned} \langle x^*,y-x\rangle _X\le F(y)-F(x)\,. \end{aligned}$$
(2.1)
The sub-differential \(\partial F:X\rightarrow 2^{X^*}\) of a functional \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) for every \( x\in X\) is defined by \({\partial F(x)}{:}{=}\{x^*\in X^*\mid \) (2.1) holds for \(x^*\}\) if \(F(x)<\infty \) and \({\partial F(x)}{:}{=}\emptyset \) else.
For a given functional \(F:X\rightarrow \mathbb {R}\cup \{\pm \infty \}\), we denote its corresponding (Fenchel) conjugate by \(F^*:X^*\rightarrow \mathbb {R}\cup \{\pm \infty \}\), which for every \(x^*\in X^*\) is defined by
$$\begin{aligned} F^*(x^*){:}{=}\sup _{x\in X}{\langle x^*,x\rangle _X-F(x)}\,. \end{aligned}$$
(2.2)
If \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) is a proper, convex, and lower semi-continuous functional, then also its (Fen-chel) conjugate \(F^*:X^*\rightarrow \mathbb {R}\cup \{+\infty \}\) is a proper, convex, and lower semi-continuous functional (cf. [33, p. 17]). Furthermore, for every \(x^*\in X^*\) and \(x\in X\) such that \( F^*(x^*)+F(x)\) is well-defined, i.e., the critical case \(\infty -\infty \) does not occur, the Fenchel–Young inequality
$$\begin{aligned} \langle x^*,x\rangle _X\le F^*(x^*)+F(x) \end{aligned}$$
(2.3)
applies. In particular, for every \(x^*\in X^*\) and \(x\in X\), the Fenchel–Young identity
$$\begin{aligned} x^*\in {\partial F(x)}\quad \Leftrightarrow \quad \langle x^*,x\rangle _X= F^*(x^*)+F(x) \end{aligned}$$
(2.4)
holds.
The following strong convexity measures for functionals play an important role in the derivation of an explicit a posteriori error representation for convex minimization problems in Sect. 3; for further information, we refer the reader to [13, 22, 41, 42].
Definition 2.1
(Brègman distance and symmetric Brègman distance) Let X be a (real) Banach space and \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) proper, i.e., \({dom (F)}{:}{=}\{x\in X\mid F(x)<\infty \}\ne \emptyset \).
(i)
The Brègman distance \(\sigma ^2_F:{dom (F)}\times X\rightarrow [0,+\infty ]\) for every \(x\in {dom (F)}\), \({y\in X}\) is defined by
$$\begin{aligned} \sigma ^2_F(y,x){:}{=}F(y)-F(x)-\sup _{x^*\in {\partial F(x)}}{\langle x^*,y-x\rangle _X}\,, \end{aligned}$$
where we use the convention \(\sup (\emptyset ){:}{=}-\infty \).
 
(ii)
The symmetric Brègman distance \({\sigma ^2_{F,s}}:{dom (F)}^2\rightarrow [0,+\infty ]\) for every \(x,y\in {dom (F)}\) is defined by
$$\begin{aligned} \sigma _{F,s}^2(y,x){:}{=}\sigma _F^2(y,x)+\sigma _F^2(x,y)=\inf _{x^*\in {\partial F(x)};y^*\in {\partial F(y)}}{\langle x^*-y^*,x-y\rangle _X}\,, \end{aligned}$$
where we use the convention \(\inf (\emptyset ){:}{=}+\infty \).
 
Definition 2.2
(Optimal strong convexity measure at a minimizer) Let X be a (real) Banach space and \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) proper. Moreover, let \(x\in X\) be minimal for \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\). Then, the optimal strong convexity measure \(\rho ^2_F:X^2\rightarrow [0,+\infty ]\) at \(x\in X\) for every \(y\in X\) is defined <by
$$\begin{aligned} \rho ^2_F(y,x){:}{=}F(y)-F(x)\ge 0 \,. \end{aligned}$$
Remark 2.3
Let X be a (real) Banach space and \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\) proper. Moreover, let \(x\in X\) be minimal for \(F:X\rightarrow \mathbb {R}\cup \{+\infty \}\). Then, due to \(0\in {\partial F(x)}\), for every \(y\in X\), it holds that
$$\begin{aligned} \sigma ^2_F(y,x)\le \rho ^2_F(y,x)\,. \end{aligned}$$

2.2 Function Spaces

    Throughout the article, we denote by \({\Omega \subseteq \mathbb {R}^d}\)\({d \in \mathbb {N}}\), a bounded simplicial Lipschitz domain, whose (topological) boundary is disjointly divided into a Dirichlet part \(\Gamma _D\) and a Neumann part \(\Gamma _N\), i.e., \(\partial \Omega = \Gamma _D\cup \Gamma _N \) and \(\emptyset = \Gamma _D\cap \Gamma _N\), where either \(\Gamma _D=\emptyset \) or \(\vert \Gamma _D\vert >0\) and either \(\Gamma _N=\emptyset \) or \(\vert \Gamma _N\vert >0\).
For \(p\in \left[ 1,\infty \right] \) and \(\ell \in \mathbb {N}\), we employ the standard notation1
$$\begin{aligned} W^{1,p}_D(\Omega ;\mathbb {R}^\ell )&{:}{=}\big \{v\in L^p(\Omega ;\mathbb {R}^l) & \mid \nabla v\in L^p(\Omega ;\mathbb {R}^{\ell \times d}),\, tr \,v=0\text { in }W^{\smash {1-\frac{1}{p},p}}(\Gamma _D;\mathbb {R}^\ell )\big \}\,,\\ W^{p}_N(div ;\Omega )&{:}{=}\big \{y\in L^p(\Omega ;\mathbb {R}^d) & \mid div \,y\in L^p(\Omega ),\,{\langle tr _n\,y,v\rangle _{W^{\smash {1-\frac{1}{p'},p'}}(\partial \Omega )}=0\text { for all }v\in W^{1,p'}_D(\Omega )}\big \}\,,\nonumber \end{aligned}$$
(2.5)
\(\smash {W^{1,p}(\Omega ;\mathbb {R}^\ell )\hspace{-0.175em}{:}{=}\hspace{-0.175em} W^{1,p}_D(\Omega ;\mathbb {R}^\ell )}\) if \(\smash {\Gamma _D\hspace{-0.175em}=\hspace{-0.175em}\emptyset }\), and \(\smash {W^{p}(div ;\Omega )\hspace{-0.175em}{:}{=}\hspace{-0.175em}W^{p}_N(div ;\Omega )}\) if \(\smash {\Gamma _N\hspace{-0.175em}=\hspace{-0.175em}\emptyset }\), where we denote by \(tr :\hspace{-0.175em}\smash {W^{1,p}(\Omega ;\mathbb {R}^\ell )}\hspace{-0.175em} \rightarrow \hspace{-0.175em}\smash {W^{\smash {1-\frac{1}{p},p}}(\partial \Omega ;\mathbb {R}^\ell )}\) the trace operator and by \( tr _n(\cdot ):\smash {W^p(div ;\Omega )}\hspace{-0.175em}\rightarrow \hspace{-0.175em}\smash {W^{\smash {-\frac{1}{p}},p}(\partial \Omega )}\) the  normal trace operator. In particular, we always omit \(tr (\cdot )\) and \(tr _n(\cdot )\). In addition, we employ the abbreviations \(L^p(\Omega )\hspace{-0.05em} {:}{=}\hspace{-0.05em} L^p(\Omega ;\mathbb {R}^1)\),\({W^{1,p}(\Omega )\hspace{-0.05em}{:}{=}\hspace{-0.05em} W^{1,p}(\Omega ;\mathbb {R}^1)}\), and \({W^{1,p}_D(\Omega )\hspace{-0.05em}{:}{=}\hspace{-0.05em} W^{1,p}_D(\Omega ;\mathbb {R}^1)}\). For (Lebesgue) measurable functions \(u,v:\hspace{-0.1em}\Omega \hspace{-0.1em}\rightarrow \hspace{-0.1em} \mathbb {R}\) and a (Lebesgue) measurable set \({M\hspace{-0.1em}\subseteq \hspace{-0.1em}\Omega }\), we write
$$\begin{aligned} (u,v)_{M}{:}{=}\int _{M}{u\,v\,\textrm{d}x}\,, \end{aligned}$$
whenever the right-hand side is well-defined. Analogously, for (Lebesgue) measurable vector fields \(z,y:\Omega \rightarrow \mathbb {R}^d\) and a (Lebesgue) measurable set \(M\subseteq \Omega \), we write \({(z,y)_{M}{:}{=}\int _{M}{z\cdot y\,\textrm{d}x}}\). Moreover, let \(\vert D (\cdot )\vert (\Omega ) :L^1_{loc }(\Omega ) \rightarrow \mathbb {R}\cup \{+\infty \}\), for every \(v\in L^1_{loc }(\Omega )\) defined by2
$$\begin{aligned} \vert D {v}\vert (\Omega ){:}{=}\sup \big \{-(v,div \,\phi )_{\Omega }\mid \phi \in C_c^\infty (\Omega ;\mathbb {R}^d); \Vert \phi \Vert _{L^\infty (\Omega ;\mathbb {R}^d)}\le 1\big \} \,, \end{aligned}$$
denote the total variation functional. Then, the space of functions with bounded variation is defined by \(BV(\Omega ){:}{=}\{v\in L^1(\Omega )\mid \vert D v\vert (\Omega )<\infty \}\).

2.3 Triangulations

    Throughout the entire paper, we denote by \(\{\mathcal {T}_h\}_{h>0}\), a family of uniformly shape regular and conforming triangulations of \(\Omega \subseteq \mathbb {R}^d\), \(d\in \mathbb {N}\), (cf. [34]). Here, \({h>0}\) refers to the averaged mesh-size, i.e., \(h{:}{=}\left( \vert \Omega \vert / card (\mathcal {N}_h)\right) ^{1/d}\), where \(\mathcal {N}_h\) denotes the set of all vertices of \(\mathcal {T}_h\).
For every element \(T \in \mathcal {T}_h\), we denote by \(\rho _T>0\), the supremum of diameters of inscribed balls. We assume that there exists a constant \(\omega _0\hspace{-0.15em}>\hspace{-0.15em}0\), independent of \(h\hspace{-0.15em}>\hspace{-0.15em}0\), such that \({\max _{T\in \mathcal {T}_h}{h_T}{\rho _T^{-1}}\hspace{-0.15em}\le \hspace{-0.15em} \omega _0}\). The smallest such constant is called the chunkiness of \(\{\mathcal {T}_h\}_{h>0}\). The sets \(\mathcal {S}_h\), \(\mathcal {S}_h^{i}\)\(\mathcal {S}_h^{\partial \Omega }\)\(\mathcal {S}_h^{\Gamma _D}\), and \(\mathcal {S}_h^{\Gamma _N}\), contain the sides, interior sides, boundary sides, Dirichlet boundary sides, and Neumann boundary sides, respectively, of \(\mathcal {T}_h\).
For \(k\in \mathbb {N}\cup \{0\}\) and \(T\in \mathcal {T}_h\), let \(\mathcal {P}_k(T)\) denote the set of polynomials of maximal degree k on T. Then, for \(k\in \mathbb {N}\cup \{0\}\) and \(\ell \in \mathbb {N}\), the sets of discontinuous and continuous element-wise polynomial functions or vector fields, respectively, are defined by
$$\begin{aligned} (\mathcal {L}^k(\mathcal {T}_h))^\ell&{:}{=}\big \{v_h\in L^\infty (\Omega ;\mathbb {R}^\ell )\mid v_h|_T\in (\mathcal {P}_k(T))^\ell \text { for all }T\in \mathcal {T}_h\big \}\,,\\ (\mathcal {S}^k(\mathcal {T}_h))^\ell&{:}{=}(\mathcal {L}^k(\mathcal {T}_h))^\ell \cap C^0(\overline{\Omega };\mathbb {R}^\ell )\,. \end{aligned}$$
For every \(T\in \mathcal {T}_h\) and \(S\in \mathcal {S}_h\), let \(\smash {x_T{:}{=}\frac{1}{d+1}\sum _{z\in \mathcal {N}_h\cap T}}\) and \(\smash {x_S{:}{=}\frac{1}{d}\sum _{z\in \mathcal {N}_h\cap S}}\) denote the barycenters of T and S, respectively. The (local) \(L^2\)-projection operator \(\Pi _h:L^1(\Omega ;\mathbb {R}^\ell )\rightarrow (\mathcal {L}^0(\mathcal {T}_h))^\ell \) onto element-wise constant functions or vector fields, respectively, for every \({v\hspace{-0.2em}\in \hspace{-0.2em} L^1(\Omega ;\mathbb {R}^\ell )} \), is defined by https://static-content.springer.com/image/art%3A10.1007%2Fs10208-024-09676-5/MediaObjects/10208_2024_9676_IEq219_HTML.gif for all \(T\hspace{-0.2em}\in \hspace{-0.2em} \mathcal {T}_h\). The element-wise gradient \({\nabla _{\!h}:\hspace{-0.2em}(\mathcal {L}^1(\mathcal {T}_h))^\ell \hspace{-0.2em}\rightarrow \hspace{-0.2em} (\mathcal {L}^0(\mathcal {T}_h))^{\ell \times d}}\), for every \(v_h\hspace{-0.1em}\in \hspace{-0.1em} (\mathcal {L}^1(\mathcal {T}_h))^\ell \), is defined by \(\nabla _{\!h}v_h|_T\hspace{-0.1em}{:}{=}\hspace{-0.1em}\nabla (v_h|_T)\) for all \({T\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {T}_h}\).
For every \(v_h\in \mathcal {L}^k(\mathcal {T}_h)\) and \(S\in \mathcal {S}_h\), the jump across S is defined by
$$\begin{aligned} \llbracket {v_h}\rrbracket _S{:}{=}{\left\{ \begin{array}{ll} v_h|_{T_+}-v_h|_{T_-}& \text { if }S\in \mathcal {S}_h^{i}\,,\text { where }T_+, T_-\in \mathcal {T}_h\text { satisfy }\partial T_+\cap \partial T_-=S\,,\\ v_h|_T& \text { if }S\in \mathcal {S}_h^{\partial \Omega }\,,\text { where }T\in \mathcal {T}_h\text { satisfies }S\subseteq \partial T\,. \end{array}\right. } \end{aligned}$$
For every \(y_h\in (\mathcal {L}^k(\mathcal {T}_h))^d\) and \(S\in \mathcal {S}_h\), the normal jump across S is defined by
$$\begin{aligned}&\llbracket {y_h\cdot n}\rrbracket _S\\&\quad {:}{=}{\left\{ \begin{array}{ll} y_h|_{T_+}\!\cdot n_{T_+}+y_h|_{T_-}\!\cdot n_{T_-}& \text { if }S\in \mathcal {S}_h^{i}\,,\text { where }T_+, T_-\in \mathcal {T}_h\text { satisfy }\partial T_+\cap \partial T_-=S\,,\\ y_h|_T\cdot n& \text { if }S\in \mathcal {S}_h^{\partial \Omega }\,,\text { where }T\in \mathcal {T}_h\text { satisfies }S\subseteq \partial T\,, \end{array}\right. } \end{aligned}$$
where, for every \(T\in \mathcal {T}_h\), \(\smash {n_T:\partial T\rightarrow \mathbb {S}^{d-1}}\) denotes the outward unit normal vector field to T.

2.3.1 Crouzeix–Raviart Element

The Crouzeix–Raviart finite element space (cf. [28]) consists of element-wise affine functions that are continuous at the barycenters of interior element sides, i.e.,
$$\begin{aligned} \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h){:}{=}\big \{v_h\in \mathcal {L}^1(\mathcal {T}_h)\mid \llbracket {v_h}\rrbracket _S(x_S)=0\text { for all }S\in \mathcal {S}_h^{i}\big \}. \end{aligned}$$
Note that \(\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\subseteq BV(\Omega )\). More precisely, for every \(v_h\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\), we have that \(\textrm{D}v_h=\nabla _{\! h}v_h\otimes \textrm{d}x+\llbracket {v_h}\rrbracket \otimes \textrm{d}s|_{\mathcal {S}_h}\) with \(\nabla _{\! h}v_h\otimes \textrm{d}x\perp \llbracket {v_h}\rrbracket \otimes \textrm{d}s|_{\mathcal {S}_h}\) (cf. [21, Theorem 1.63]), so that (cf. [15])
$$\begin{aligned} \vert \textrm{D}v_h\vert (\Omega )= \Vert \nabla _{\! h}v_h\Vert _{L^1(\Omega ;\mathbb {R}^d)}+\Vert \llbracket {v_h}\rrbracket \Vert _{L^1(\mathcal {S}_h)}\,. \end{aligned}$$
(2.6)
The Crouzeix–Raviart finite element space with homogeneous Dirichlet boundary condition on \(\Gamma _D\) is defined by
$$\begin{aligned} \smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{:}{=}\big \{v_h\in \smash {\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)}\mid v_h(x_S)=0\text { for all }S\in {\mathcal {S}_h^{\Gamma _D}}\big \}\,. \end{aligned}$$
A basis for \(\smash {\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)}\) is given by \(\varphi _S\hspace{-0.1em}\in \hspace{-0.1em} \smash {\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)}\), \(S\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {S}_h\), satisfying the Kronecker property \(\varphi _S(x_{S'})=\delta _{S,S'}\) for all \(S,S'\in \mathcal {S}_h\). A basis for \(\smash {\smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}}\) is given by \({\varphi _S\in \smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}}\), \({S\in {\mathcal {S}_h{\setminus }\mathcal {S}_h^{\Gamma _D}}}\).

2.3.2 Raviart–Thomas Element

The Raviart–Thomas finite element space (cf. [44]) consists of element-wise affine vector fields that have continuous constant normal components on interior element sides, i.e.,
$$\begin{aligned} \mathcal {R}T^0(\mathcal {T}_h){:}{=}\bigg \{ y_h\in (\mathcal {L}^1(\mathcal {T}_h))^d\Bigg | \begin{array}{l} \smash {y_h|_T\cdot n_T=const \text { on }\partial T\text { for all }T\in \mathcal {T}_h\,,}\\ \smash { \llbracket {y_h\cdot n}\rrbracket _S=0\text { on }S\text { for all }S\in \mathcal {S}_h^{i}\,.}\end{array}\bigg \} \end{aligned}$$
Note that \(\mathcal {R}T^{0}_N(\mathcal {T}_h)\subseteq W^\infty _N(div ;\Omega )\). The Raviart–Thomas finite element space with homogeneous normal component boundary condition on \(\Gamma _N\) is defined by
$$\begin{aligned} \smash {\mathcal {R}T^{0}_N(\mathcal {T}_h)}{:}{=}\big \{y_h\in \mathcal {R}T^0(\mathcal {T}_h)\mid y_h\cdot n=0\text { on }\Gamma _N\big \}\,. \end{aligned}$$
A basis for \(\mathcal {R}T^0(\mathcal {T}_h)\) is given by vector fields \(\psi _S\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {R}T^0(\mathcal {T}_h)\), \(S\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {S}_h\), satisfying Kronecker property \(\psi _S|_{S'}\cdot n_{S'}=\delta _{S,S'}\) on \(S'\) for all \(S'\in \mathcal {S}_h\), where \(n_S\) is the unit normal vector on S pointing from \(T_-\) to \(T_+\) if \(T_+\cap T_-=S\in \mathcal {S}_h\). A basis for \(\smash {\mathcal {R}T^{0}_N(\mathcal {T}_h)}\) is given by \(\psi _S\in \smash {\mathcal {R}T^{0}_N(\mathcal {T}_h)}\), \({S\in { \mathcal {S}_h{\setminus }\mathcal {S}_h^{\Gamma _N}}}\).

2.3.3 Discrete Integration-by-Parts Formula

      For every \(v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\) and \({y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}\), it holds the discrete integration-by-parts formula
$$\begin{aligned} (\nabla _{\!h}v_h,\Pi _h y_h)_\Omega =-(\Pi _h v_h,\,div \,y_h)_\Omega \,. \end{aligned}$$
(2.7)
In addition, cf. [12, Section 2.4], if a vector field \(y_h\in (\mathcal {L}^0(\mathcal {T}_h))^d\) satisfies for every \(v_h\in \smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\)
$$\begin{aligned} (y_h,\nabla _{\!h} v_h)_{\Omega }=0\,, \end{aligned}$$
then, choosing \(v_h=\varphi _S\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h) \) for all \(S\in { \mathcal {S}_h{\setminus }\mathcal {S}_h^{\Gamma _D}}\), one finds that \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\). Similarly, if a function \(v_h\in \mathcal {L}^0(\mathcal {T}_h)\) satisfies for every \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\)
$$\begin{aligned} (v_h,div \,y_h)_{\Omega }=0\,, \end{aligned}$$
then, choosing \(y_h\hspace{-0.15em}=\hspace{-0.15em}\psi _S\hspace{-0.15em}\in \hspace{-0.15em}\mathcal {R}T^0_N(\mathcal {T}_h) \) for all \(S\hspace{-0.15em}\in \hspace{-0.15em} { \mathcal {S}_h\hspace{-0.15em} {\setminus }\hspace{-0.15em} \mathcal {S}_h^{\Gamma _N}}\), one finds that \(v_h\hspace{-0.15em}\in \hspace{-0.15em} \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\). In other words, we have the orthogonal (with respect to the inner product \((\cdot ,\cdot )_{\Omega }\)) decompositions
$$\begin{aligned} (\mathcal {L}^0(\mathcal {T}_h))^d&=ker (div |_{\smash {\mathcal {R}T^0_N(\mathcal {T}_h)}})\oplus \nabla _{\!h}(\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)) \,, \end{aligned}$$
(2.8)
$$\begin{aligned} \mathcal {L}^0(\mathcal {T}_h)&=ker (\nabla _{\!h}|_{\smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}})\oplus div \,(\mathcal {R}T^0_N(\mathcal {T}_h)) \,. \end{aligned}$$
(2.9)

3 Exact A Posteriori Error Estimation for Convex Minimization Problems

3.1 Continuous Convex Minimization Problem and Continuous Convex Duality

    Let \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) be a proper, convex, and lower semi-continuous function and let \(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) be a (Lebesgue) measurable function such that for a.e. \(x\in \Omega \), the function \(\psi (x,\cdot ):\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) is proper, convex, and lower semi-continuous. Then, we consider the minimization problem given via the minimization of the functional \(I:W^{1,p}_D(\Omega )\rightarrow \mathbb {R}\cup \{+\infty \}\), for every \(v\in \smash {W^{1,p}_D(\Omega )}\) defined by
$$\begin{aligned} I(v){:}{=}\int _{\Omega }{\phi (\nabla v)\,d x}+\int _{\Omega }{\psi (\cdot ,v)\,d x}\,. \end{aligned}$$
(3.1)
In what follows, we refer to the minimization of \(I :W^{1,p}_D(\Omega ) \rightarrow \mathbb {R} \cup \{+\infty \}\) as the primal problem. A (Fenchel) dual problem to the minimization of (3.1) is given via the maximization of the functional \(D:\smash {L^{p'}(\Omega ;\mathbb {R}^d)}\rightarrow \mathbb {R} \cup \{ -\infty \}\), for every \(y\in L^{p'}(\Omega ;\mathbb {R}^d)\) defined by
$$\begin{aligned} D(y){:}{=}-\int _{\Omega }{\phi ^*( y)\,d x}-F^*(Div \,y)\,, \end{aligned}$$
(3.2)
where the distributional divergence \(Div :L^{p'}(\Omega ;\mathbb {R}^d)\rightarrow (W^{1,p}_D(\Omega ))^*\) for every \(y\in \smash {L^{p'}(\Omega ;\mathbb {R}^d)}\) and \(v\hspace{-0.1em}\in \hspace{-0.1em} \smash {W^{1,p}_D(\Omega )}\) is defined by \(\langle Div \,y,v\rangle _{\smash {W^{1,p}_D(\Omega )}}\hspace{-0.1em}{:}{=}\hspace{-0.1em} -(y,\nabla v)_{\Omega }\) and \(\smash {F^*:\hspace{-0.1em}L^{p'}(\Omega )\hspace{-0.1em}\rightarrow \hspace{-0.1em} \mathbb {R}\hspace{-0.1em}\cup \hspace{-0.1em}\{\pm \infty \}}\) denotes the Fenchel conjugate to \(F:L^p(\Omega )\rightarrow \mathbb {R}\cup \{+\infty \}\), defined by \(F(v){:}{=}\int _{\Omega }{\psi (\cdot ,v)\,d x}\) for all \({ v\in L^p(\Omega )}\). Note that for every \({y\hspace{-0.1em}\in \hspace{-0.1em}\smash {W^{p'}_N(div ;\Omega )}}\), we have that \(\langle Div \,y,v\rangle _{\smash {W^{1,p}_D(\Omega )}}\hspace{-0.1em}=\hspace{-0.1em}(div \,y, v)_{\Omega }\) for all \({v\hspace{-0.1em}\in \hspace{-0.1em} W^{1,p}_D(\Omega )}\) and, thus, the representation
$$\begin{aligned} D(y)=-\int _{\Omega }{\phi ^*( y)\,d x}-\int _{\Omega }{\psi ^*(\cdot ,div \,y)\,d x}\,. \end{aligned}$$
(3.3)
A weak duality relation applies (cf. [33, Proposition 1.1, p. 48]), i.e.,
$$\begin{aligned} \inf _{v\in W^{1,p}_D(\Omega )}{I(v)}\ge \sup _{y\in L^{p'}(\Omega ;\mathbb {R}^d)}{D(y)}\,. \end{aligned}$$
(3.4)
In what follows, we always assume that \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) are such that (3.1) admits at least one minimizer \(u\in W^{1,p}_D(\Omega ) \), called the primal solution, (3.2) at least one maximizer \(z\in L^{p'}(\Omega ;\mathbb {R}^d) \), called the dual solution, and that a strong duality relation applies, i.e.,
$$\begin{aligned} I(u)= D(z)\,. \end{aligned}$$
(3.5)
By the Fenchel–Young inequality (cf. (2.3)), (3.5) is equivalent to the convex optimality relations
$$\begin{aligned} z\cdot \nabla u&=\phi ^*(z)+\phi (\nabla u)\quad a.e. in \,\Omega \,, \end{aligned}$$
(3.6)
$$\begin{aligned} Div \,z&\in \partial F(u)\,. \end{aligned}$$
(3.7)
If \(z\in \smash {W^{p'}_N(div ;\Omega )}\), then the convex optimality relation (3.7) is equivalent to
$$\begin{aligned} div \,z\, u=\psi ^*(\cdot ,div \,z)+\psi (\cdot , u)\quad a.e. in \,\Omega \,. \end{aligned}$$
(3.8)
If \(\phi \in C^1(\mathbb {R}^d)\), then, by the Fenchel–Young identity (cf. (2.4)), (3.6) is equivalent to
$$\begin{aligned} z= D\phi (\nabla u)\quad in \smash {L^{p'}(\Omega ;\mathbb {R}^d)}\,. \end{aligned}$$
(3.9)
Similarly, if \(z\in \smash {W^{p'}_N(div ;\Omega )}\) and \(\psi (x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), then (3.8) is equivalent to
$$\begin{aligned} div \,z=D\psi (\cdot , u)\quad in \smash {L^{p'}(\Omega )}\,. \end{aligned}$$
(3.10)
The convex duality relations (3.4)–(3.10) motivate introducing the primal-dual gap estimator \(\eta ^2:W^{1,p}_D(\Omega )\times L^{p'}(\Omega ;\mathbb {R}^d)\rightarrow [0,+\infty ]\), for every \(v\in W^{1,p}_D(\Omega )\) and \(y\in L^{p'}(\Omega ;\mathbb {R}^d)\) defined by
$$\begin{aligned} \eta ^2(v,y){:}{=}I(v)-D(y)\,. \end{aligned}$$
(3.11)
Note that the sign of the estimator (3.11) is a consequence of the weak duality relation (3.4).
Using the optimal strong convexity measures (cf. Definition 2.2) \(\rho _I^2:(W^{1,p}_D(\Omega ))^2\rightarrow [0,+\infty ]\) of (3.1) at a primal solution \(u\in W^{1,p}_D(\Omega )\) and \(\rho _{-D}^2:\hspace{-0.15em} \smash {(L^{p'}(\Omega ;\mathbb {R}^d))^2}\hspace{-0.15em}\rightarrow \hspace{-0.15em} [0,+\infty ]\) of the negative of (3.2) at a dual solution \(z\in \smash {L^{p'}(\Omega ;\mathbb {R}^d)}\), we arrive at the following explicit a posteriori error representation.
Theorem 3.1
(Explicit (a posteriori) error representation) The following statements apply:
(i)
For every \(v\in W^{1,p}_D(\Omega )\) and \(y\in \smash {L^{p'}(\Omega ;\mathbb {R}^d)}\), we have that
$$\begin{aligned} \smash {\rho ^2_I(v,u)+\rho ^2_{-D}(y,z)=\eta ^2(v,y)}\,. \end{aligned}$$
(3.12)
 
(ii)
For every \(v\in \smash { W^{1,p}_D(\Omega )}\) and \(y\in \smash {W^{p'}_N(div ;\Omega )}\), we have that
$$\begin{aligned} \begin{aligned} \eta ^2(v,y)&= \int _{\Omega }\big \{{\phi (\nabla v)-\nabla v\cdot y+\phi ^*(y)\big \}\,\textrm{d}x}\\&+\int _{\Omega }\big \{{\psi (\cdot , v)- v\,\textrm{div}\,y+\psi ^*(\cdot ,\textrm{div}\,y)\big \}\,\textrm{d}x}\,. \end{aligned} \end{aligned}$$
(3.13)
 
Remark 3.2
(i)
By the Fenchel–Young inequality (2.3), the integrands in the representation (3.13), are non-negative and, thus, suitable as local refinement indicators.
 
(ii)
According to Remark 2.3, from Theorem 3.1(i), for every \(v\in W^{1,p}_D(\Omega )\) and \(y\in L^{p'}(\Omega ;\mathbb {R}^d)\), it follows that \(\sigma _I^2(v,u)+\sigma _{-D}^2(y,z)\le \eta ^2(v,y)\).
 
Proof of Theorem 3.1).
ad (i) Due to \(I(u)=D(z)\) (cf. (3.5)), Definition 2.2, and (3.11), for every \(v\in W^{1,p}_D(\Omega )\) and \(y\in L^{p'}(\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \rho ^2_I(v,u)+\rho ^2_{-D}(y,z)&=I(v)-I(u)+D(z)-D(y)\\&=\eta ^2(v,y)\,. \end{aligned}$$
ad (ii). Using (3.1), (3.3), and integration-by-parts, we conclude that (3.13) applies. \(\quad \square \)
Remark 3.3
(Examples)
(i)
In the p-Dirichlet problem (cf. [30, 31]), i.e., \({\phi {:}{=}\smash {\frac{1}{p}\vert \cdot \vert ^p}\in C^1(\mathbb {R})}\), \(p\hspace{-0.1em}\in \hspace{-0.1em} (1,\infty )\), and \(\psi \hspace{-0.1em}{:}{=}\hspace{-0.1em} (\smash {(t,x)^\top }\mapsto -f(x)t):\Omega \times \mathbb {R}\hspace{-0.1em}\rightarrow \hspace{-0.1em} \mathbb {R}\), where \(f\hspace{-0.1em}\in \hspace{-0.1em} L^{p'}(\Omega )\)cf. [48], for every \(v\in W^{1,p}_D(\Omega )\) and \(y\in W^{p'}_N(div ;\Omega )\), we have that
$$\begin{aligned} \rho ^2_I(v,u)&\sim \Vert F(\nabla v)-F(\nabla u)\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2\,,\\ \quad \rho ^2_{-D}(y,z)\;&\gtrsim \; \Vert F^*(y)-F^*(z)\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2\,, \end{aligned}$$
where \(F,F^*:\hspace{-0.05em}\mathbb {R}^d\hspace{-0.1em}\rightarrow \hspace{-0.1em} \mathbb {R}^d\) for every \(a\hspace{-0.1em}\in \hspace{-0.1em} \mathbb {R}^d\) are defined by \(F(a)\hspace{-0.1em}{:}{=}\hspace{-0.1em}\smash {\vert a\vert ^{\frac{p-2}{2}}}a\) and \({F^*(a)\hspace{-0.1em}{:}{=}\hspace{-0.1em} \smash {\vert a\vert ^{\frac{p'-2}{2}}}a}\). A proof that strong duality, i.e., (3.5), applies can be found in [33, Proposition 5.1, p. 115].
 
(ii)
In the obstacle problem (cf. [7]), i.e., \(\phi {:}{=}\frac{1}{2}\vert \cdot \vert ^2\in C^1(\mathbb {R})\) and \(\psi {:}{=}((t,x)^\top \mapsto -f(x)t +I_{\chi (x)}(t)):\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), where \(f\in L^2(\Omega )\) and \(\chi \in W^{1,2}(\Omega )\) with \(\chi \le 0\) on \(\Gamma _D\)cf. [7], where \(I_{\chi (x)}(t){:}{=}0\) if \(t\ge {\chi (x)}\) and \(I_{\chi (x)}(t){:}{=}+\infty \) else, for every \(v\in W^{1,2}_D(\Omega )\) with \(v\ge \chi \) a.e. in \(\Omega \) and \(y\in L^2(\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \rho ^2_I(v,u)&= \tfrac{1}{2}\Vert \nabla v-\nabla u\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2+\langle -\Lambda ,v-u\rangle _{W^{1,2}_D(\Omega )}\,,\\ \rho ^2_{-D}(y,z)&\ge \tfrac{1}{2}\Vert y-z\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2\,, \end{aligned}$$
where \(\Lambda \in (W^{1,2}_D(\Omega ))^*\) is defined by \(\langle \Lambda ,v\rangle _{W^{1,2}_D(\Omega )}{:}{=}(f,v)_{\Omega }-(\nabla u,\nabla v)_{\Omega }\) for all \(v\in W^{1,2}_D(\Omega )\). A proof that strong duality, i.e., (3.5), applies can be found in [33, Section 2.4, p. 84 ff.].
 
(iii)
In an optimal design problem (cf. [23]), i.e., \(\phi {:}{=}\zeta \circ \vert \,\cdot \vert \in C^1(\mathbb {R})\), where \(\zeta (0){:}{=} 0\), \(\zeta '(t){:}{=}\mu _2 t\) if \(t\in [0,t_1]\), \(\zeta '(t){:}{=}\mu _2 t_1\) if \(t\in [t_1,t_2]\), and \(\zeta '(t){:}{=}\mu _1 t\) if \(t\in [t_2,+\infty )\) for some \(0\!<\!t_1\!<\!t_2\) and \(0\!<\!\mu _1\!<\!\mu _2\) with \(t_1\mu _2\!=\!t_2\!\mu _1\), and \({\psi {:}{=} ((t,x)^\top \mapsto -f(x)t)\!:\! \Omega \!\times \! \mathbb {R}\!\!\rightarrow \!\! \mathbb {R}}\), where \(f\!\!\in \!\! L^2(\Omega )\), cf. [23, Lemma 3.4], for every \(v\in W^{1,p}_D(\Omega )\) and \(y\in W^{p'}_N(div ;\Omega )\), we have that
$$\begin{aligned}&\rho ^2_I(v,u)\ge \tfrac{1}{2\mu }\Vert D\phi (\nabla v)-D\phi (\nabla u)\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2\,,\\&\rho ^2_{-D}(y,z)\ge \tfrac{1}{2\mu }\Vert y-z\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2\,. \end{aligned}$$
A proof that strong duality, i.e., (3.5), applies can be found in [24].
 
(iv)
In the Rudin–Osher–Fatemi (ROF) model (cf. [47]), i.e., \(\phi {:}{=}\vert \cdot \vert \in C^0(\mathbb {R})\) and \(\psi {:}{=}((t,x)^\top \mapsto \frac{\alpha }{2}(t-g(x))^2):\Omega \times \mathbb {R}\rightarrow \mathbb {R}\), where \(g\in L^2(\Omega )\), cf. [3, Lemma 10.2], for every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2(div ;\Omega )\), we have that
$$\begin{aligned}&\rho ^2_I(v,u)\ge \tfrac{\alpha }{2}\Vert v-u\Vert _{L^2(\Omega )}^2\,,\\ &\rho ^2_{-D}(y,z)\ge \tfrac{1}{2\alpha }\Vert div \,y-div \,z\Vert _{L^2(\Omega )}^2\,. \end{aligned}$$
A proof that strong duality, i.e., (3.5), applies can be found in [36, Theorem 2.2].
 
For the a posteriori error estimator (3.13) to be numerically practicable, it is necessary to have a computationally cheap way to obtain sufficiently accurate approximations of the dual solution and of the primal solution. In Sect. 3.2, resorting to (discrete) convex duality relations between a non-conforming Crouzeix–Raviart approximation of the primal problem and a Raviart–Thomas approximation of the dual problem, we arrive at discrete reconstruction formulas, called generalized Marini formula (cf. [4, 39]).

3.2 Discrete Convex Minimization Problem and Discrete Convex Duality

    Let \(\psi _h:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) denote a suitable approximation3 of \( \psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) such that \(\psi _h(\cdot ,t)\in \mathcal {L}^0(\mathcal {T}_h)\) for all \(t\in \mathbb {R}\) and for a.e. \(x\in \Omega \), \(\psi _h(x,\cdot ):\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) is a proper, convex, and lower semi-continuous functional. Then, we consider the minimization problem given via the minimization of the functional \(I_h^{\textit{cr}}:\smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\rightarrow \mathbb {R}\cup \{+\infty \}\), for every \(v_h\in \smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\) defined by
$$\begin{aligned} I_h^{\textit{cr}}(v_h){:}{=}\int _{\Omega }{\phi (\nabla _{\! h} v_h)\,d x}+\int _{\Omega }{\psi _h(\cdot ,\Pi _h v_h)\,d x}\,. \end{aligned}$$
(3.14)
In what follows, we refer to the minimization of \(I_h^{\textit{cr}}:\smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\rightarrow \mathbb {R}\cup \{+\infty \}\) as the discrete primal problem. In [4, 12], it is shown that the corresponding (Fenchel) dual problem to the minimization of (3.14) consists in the maximization of \(D_h^{\textit{rt}}:\mathcal {R}T^0_N(\mathcal {T}_h)\rightarrow \mathbb {R}\cup \{-\infty \}\), for every \({y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}\) defined by
$$\begin{aligned} D_h^{\textit{rt}}(y_h){:}{=}-\int _{\Omega }{\phi ^*(\Pi _h y_h)\,d x}-\int _{\Omega }{\psi _h^*(\cdot ,div \,y_h)\,d x}\,. \end{aligned}$$
(3.15)
A discrete weak duality relation (cf. [4, Proposition 3.1]) implies that
$$\begin{aligned} \inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{I_h^{\textit{cr}}(v_h)} \ge \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{D_h^{\textit{rt}}(y_h)}\,. \end{aligned}$$
(3.16)
We will always assume that \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi _h:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) are such that (3.14) admits at least one minimizer \(u_h^{\textit{cr}}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\), called the discrete primal solution, (3.15) admits at least one maximizer \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\), called the discrete dual solution, and that a discrete strong duality relation applies, i.e.,
$$\begin{aligned} I_h^{\textit{cr}}(u_h^{\textit{cr}})=D_h^{\textit{rt}}(z_h^{\textit{rt}})\,. \end{aligned}$$
(3.17)
By the Fenchel–Young identity (cf. (2.4)), (3.17) is equivalent to the discrete convex optimality relations
$$\begin{aligned} \Pi _h z_h^{\textit{rt}}\cdot \nabla _{\! h} u_h^{\textit{cr}}&=\phi ^*(\Pi _hz_h^{\textit{rt}})+\phi (\nabla _{\! h} u_h^{\textit{cr}}) & \quad \text { a.e. in }\Omega \,, \end{aligned}$$
(3.18)
$$\begin{aligned} div \,z_h^{\textit{rt}}\,\Pi _hu_h^{\textit{cr}}&=\psi _h^*(\cdot , div \,z_h^{\textit{rt}})+\psi _h(\cdot ,\Pi _hu_h^{\textit{cr}}) & \quad \text { a.e. in }\Omega \,. \end{aligned}$$
(3.19)
If \(\phi \in C^1(\mathbb {R}^d)\), then, by the Fenchel–Young identity (cf. (2.4)), (3.18) is equivalent to
$$\begin{aligned} \Pi _h z_h^{\textit{rt}}=D\phi (\nabla _{\! h} u_h^{\textit{cr}})\quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\,, \end{aligned}$$
(3.20)
and if \(\phi ^*\in C^1(\mathbb {R}^d)\), then, by the Fenchel–Young identity (cf. (2.4)), (3.19) is equivalent to
$$\begin{aligned} \nabla _{\! h} u_h^{\textit{cr}}=D\phi ^*(\Pi _h z_h^{\textit{rt}})\quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\,. \end{aligned}$$
(3.21)
Similarly, if \(\psi _h(x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), then (3.19) is equivalent to
$$\begin{aligned} div \,z_h^{\textit{rt}}=D\psi _h(\cdot ,\Pi _hu_h^{\textit{cr}})\quad \text { in }\mathcal {L}^0(\mathcal {T}_h)\,, \end{aligned}$$
(3.22)
and if \(\psi _h^*(x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), then (3.19) is equivalent to
$$\begin{aligned} \Pi _hu_h^{\textit{cr}}=D\psi _h^*(\cdot ,div \,z_h^{\textit{rt}})\quad \text { in }\mathcal {L}^0(\mathcal {T}_h)\,. \end{aligned}$$
(3.23)
The relations (3.20)–(3.23) motivate the following discrete reconstruction formulas for a discrete dual solution \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) from a discrete primal solution \(u_h^{\textit{cr}}\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) and vice versa, called generalized Marini formulas (cf. [4, 39]).
Proposition 3.4
(Generalized Marini formulas) The following statements apply:
(i)
If \(\phi \in C^1(\mathbb {R}^d)\) and \(\psi _h(x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), then, given a minimizer \(u_h^{\textit{cr}}\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) of (3.14), a maximizer \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) of (3.15) is given via
$$\begin{aligned} z_h^{\textit{rt}}= D\phi (\nabla _{\! h} u_h^{\textit{cr}})+\frac{D\psi _h(\cdot , \Pi _hu_h^{\textit{cr}})}{d}\big (id _{\mathbb {R}^d}-\Pi _hid _{\mathbb {R}^d}\big )\quad \text { in }\mathcal {R}T^0_N(\mathcal {T}_h)\,, \end{aligned}$$
(3.24)
a discrete strong duality relation applies, i.e., (3.17).
 
(ii)
If \(\phi ^*\in C^1(\mathbb {R}^d)\) and \(\psi _h^*(x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), then, given a maximizer \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) of (3.15), a minimizer \(u_h^{\textit{cr}}\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) of (3.14) is given via
$$\begin{aligned} u_h^{\textit{cr}} = D\psi _h^*(\cdot ,div \,z_h^{\textit{rt}})+ D\phi ^*(\Pi _h z_h^{\textit{rt}})\cdot \big (id _{\mathbb {R}^d}-\Pi _hid _{\mathbb {R}^d}\big ) \quad \text { in }\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\,, \end{aligned}$$
(3.25)
a discrete strong duality relation applies, i.e., (3.17).
 
Remark 3.5
It is possible to derive reconstruction formulas similar to (3.24) and (3.25) under weaker conditions, e.g., resorting to a regularization argument (cf. Proposition 4.5) or given discrete Lagrange multipliers (cf. [7, Proposition 3.3]).
Proof
ad (i). See [4, Proposition 3.1].
ad (ii). By definition, it holds that \( u_h^{\textit{cr}}\in \mathcal {L}^1(\mathcal {T}_h)\) and the discrete convex optimality relation (3.23) is satisfied. Since \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) is maximal for (3.15) as well as \(\phi ^*\in C^1(\mathbb {R}^d)\) and \(\psi _h^*(x,\cdot )\in C^1(\mathbb {R})\) for a.e. \(x\in \Omega \), for every \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\), we have that
$$\begin{aligned} (D\phi ^*(\Pi _h z_h^{\textit{rt}}),\Pi _hy_h)_{\Omega }+(D\psi _h^*(\cdot ,div \,z_h^{\textit{rt}}),div \,y_h)_{\Omega }=0\,. \end{aligned}$$
(3.26)
In particular, (3.26) implies that \(D\phi ^*(\Pi _h z_h^{\textit{rt}})\hspace{-0.1em}\in \hspace{-0.1em} (ker (div |_{\smash {\mathcal {R}T^0_N(\mathcal {T}_h)}}))^\perp \). Appealing to [27, Lemma 2.4], it holds that \((ker (div |_{\smash {\mathcal {R}T^0_N(\mathcal {T}_h)}}))^\perp =\nabla _{\!h}(\mathcal {S}^{1,cr}_D(\mathcal {T}_h))\). Therefore, there exists \(v_h\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) such that
$$\begin{aligned} \nabla _{\!h} v_h= D\phi ^*(\Pi _h z_h^{\textit{rt}})\quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\,. \end{aligned}$$
(3.27)
Hence, for every \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\), resorting to the discrete integration-by-parts formula (2.7), (3.27), (3.26), and (3.23), we find that
$$\begin{aligned} \begin{aligned} (\Pi _h u_h^{cr}-\Pi _hv_h,div \,y_h)_{\Omega }&=(D\phi ^*(\Pi _h z_h^{\textit{rt}}),\Pi _hy_h)_{\Omega }+(D\psi _h^*(\cdot ,div \,z_h^{\textit{rt}}),div \,y_h)_{\Omega }\\&=0\,. \end{aligned} \end{aligned}$$
In other words, for every \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\), we have that
$$\begin{aligned} \begin{aligned} ( v_h-u_h^{cr},div \,y_h)_{\Omega }= (\Pi _h v_h-\Pi _h u_h^{cr},div \,y_h)_{\Omega }=0\,. \end{aligned} \end{aligned}$$
(3.28)
On the other hand, we have that \(\nabla _{\! h}(u_h^{cr}-v_h)=0\) in \((\mathcal {L}^0(\mathcal {T}_h))^d\), i.e., \(u_h^{cr}-v_h\in \mathcal {L}^0(\mathcal {T}_h)\). Therefore, (3.28) in conjunction with (2.9) implies that \(u_h^{cr}-v_h\in (div \,(\mathcal {R}T^0_N(\mathcal {T}_h)))^{\perp }=ker (\nabla _{\!h}|_{\smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}})\). As a result, due to \(v_h\in \smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\), we conclude that \(u_h^{cr}\in \smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\) with
$$\begin{aligned} \begin{aligned} \nabla _{\! h} u_h^{\textit{cr}}&=D\phi ^*(\Pi _h z_h^{\textit{rt}}) & \quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\,,\\ \Pi _hu_h^{\textit{cr}}&=D\psi _h^*(\cdot ,div \,z_h^{\textit{rt}}) & \quad \text { in }\mathcal {L}^0(\mathcal {T}_h)\,. \end{aligned} \end{aligned}$$
(3.29)
By the Fenchel–Young identity (cf. (2.4)), (3.29) is equivalent to
$$\begin{aligned} \begin{aligned} \Pi _h z_h^{\textit{rt}}\cdot \nabla _{\! h} u_h^{\textit{cr}}&=\phi ^*(\Pi _hz_h^{\textit{rt}})+\phi (\nabla _{\! h} u_h^{\textit{cr}}) & \quad \text { a.e. in }\Omega \,,\\ div \,z_h^{\textit{rt}}\,\Pi _hu_h^{\textit{cr}}&=\psi _h^*(\cdot , div \,z_h^{\textit{rt}})+\psi _h(\cdot ,\Pi _hu_h^{\textit{cr}}) & \quad \text { a.e. in }\Omega \,. \end{aligned} \end{aligned}$$
(3.30)
Eventually, adding (3.30)\(_1\) and (3.30)\(_2\), subsequently, integration with respect to \(x\in \Omega \), resorting to the discrete integration-by-parts formula (2.7), and using the definitions (3.14) and (3.15), we arrive at \(I_h^{\textit{cr}}(u_h^{\textit{cr}})=D_h^{\textit{rt}}(z_h^{\textit{rt}})\), which, according to the discrete weak duality relation (3.16), implies that \(u_h^{\textit{cr}}\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) is minimal for (3.14). \(\square \)

4 Application to the Rudin–Osher–Fatemi (ROF) Model

    In this section, we apply the concepts derived in Sect. 3 to the non-differentiable Rudin–Osher–Fatemi (ROF) model (cf. [47]). The approximation of the ROF model has been investigated by numerous authors: A priori error estimates has been derived in [4, 6, 10, 11, 27]. A posteriori error estimates and adaptivity results can be found in [9, 11, 13, 17, 35].

4.1 The Continuous Rudin–Osher–Fatemi (ROF) Model

    Given a function \(g\in L^2(\Omega )\), i.e., the noisy image, and a constant parameter \(\alpha >0\), i.e., the fidelity parameter, the Rudin–Osher–Fatemi (ROF) model (cf. [47]) is defined via the minimization of the functional \(I:BV(\Omega )\cap L^2(\Omega )\rightarrow \mathbb {R}\), for every \(v\in BV(\Omega )\cap L^2(\Omega )\) defined by
$$\begin{aligned} \smash {I(v){:}{=}\vert D v\vert (\Omega )+\tfrac{\alpha }{2}\Vert v-g\Vert ^2_{L^2(\Omega )}}\,. \end{aligned}$$
(4.1)
In [3, Theorem 10.5 & 10.6], it has been proved that there exists a unique minimizer \(u\hspace{-0.1em}\in \hspace{-0.1em} BV(\Omega )\cap L^2(\Omega )\) of (4.1). Appealing to [36, Theorem 2.2] or [3, Section 10.1.3], the corresponding (Fenchel) (pre-)dual problem to the minimization of (4.1) is given via the maximization of the functional \(D:W^2_N(div ;\Omega ) \cap L^\infty (\Omega ;\mathbb {R}^d)\rightarrow \mathbb {R} \cup \{-\infty \}\), for every \(\smash {y\in W^2_N(div ;\Omega ) \cap L^\infty (\Omega ;\mathbb {R}^d)}\) defined by
$$\begin{aligned} \smash {D(y){:}{=}-\smash {I_{K_1(0)}^{\Omega }}(y) -\tfrac{1}{2\alpha }\Vert div \,y+\alpha \,g\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g\Vert _{L^2(\Omega )}^2} \,, \end{aligned}$$
(4.2)
where \(\smash {I_{K_1(0)}^{\Omega }}:L^\infty (\Omega ;\mathbb {R}^d)\rightarrow \mathbb {R} \cup \{+\infty \}\) is defined by \(\smash {I_{K_1(0)}^{\Omega }}(y) {:}{=}0\) if \(y\in L^\infty (\Omega ;\mathbb {R}^d)\) with \(\vert y\vert \le 1\) a.e. in \(\Omega \) and \(\smash {I_{K_1(0)}^{\Omega }}(y){:}{=}+ \infty \) else. In [36, Theorem 2.2], it is shown that (4.2) admits a maximizer \({z\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)}\) and that a strong duality relation applies, i.e.,
$$\begin{aligned} \smash {I(u)=D(z)}\,. \end{aligned}$$
(4.3)
According to [3, Proposition 10.4], (4.3) is equivalent to the convex optimality relations
$$\begin{aligned} div \,z&=\alpha \, (u-g)\quad \text { in }L^2(\Omega )\,, \end{aligned}$$
(4.4)
$$\begin{aligned} -(u,div \,z)_{\Omega }&=\vert D u\vert (\Omega )\,. \end{aligned}$$
(4.5)
Next, if we introduce, by analogy with Sect. 3, the primal-dual gap estimator \(\eta ^2:(BV(\Omega )\cap L^2(\Omega ))\times (W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d))\rightarrow [0,+\infty ]\), for every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\) defined by
$$\begin{aligned} \smash {\eta ^2(v,y){:}{=}I(v)-D(y)}\,, \end{aligned}$$
(4.6)
then the concepts of Sect. 3 can be applied to the ROF model.
Theorem 4.1
(Explicit (a posteriori) error representation) The following statements apply:
(i)
For every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \smash {\rho ^2_I(v,u)+\rho ^2_{-D}(y,z)=\eta ^2(v,y)}\,. \end{aligned}$$
 
(ii)
For every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \eta ^2(v,y)&= \vert \textrm{D}v\vert (\Omega )+(div \,y,v)_{\Omega }+\smash {I_{K_1(0)}^{\Omega }}(y)\nonumber \\&\quad +\tfrac{1}{2\alpha }\Vert div \,y-\alpha \,(v-g)\Vert _{L^2(\Omega )}^2\,. \end{aligned}$$
(4.7)
 
Proof
ad (i). Due to \(I(u)=D(z)\) (cf. (4.3)), Definition 2.2, and (4.6), for every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \rho ^2_I(v,u)+\rho ^2_{-D}(y,z)&=I(v)-I(u)+D(z)-D(y)\\&=\eta ^2(v,y)\,. \end{aligned}$$
ad (ii). For every \(v\in BV(\Omega )\cap L^2(\Omega )\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), we have that
$$\begin{aligned} \eta ^2(v,y)&=\vert \textrm{D}v\vert (\Omega )+(div \,y,v)_{\Omega }+\smash {I_{K_1(0)}^{\Omega }}(y)+\tfrac{1}{2\alpha }\Vert \alpha \,(v-g)\Vert _{L^2(\Omega )}^2\\&\quad -\tfrac{1}{2\alpha }2(div \,y,\alpha \,v)_{\Omega }+\tfrac{1}{2\alpha }\Vert div \,y+\alpha \, g\Vert _{L^2(\Omega )}^2-\tfrac{\alpha }{2}\Vert g\Vert _{L^2(\Omega )^2}^2\\&=\vert \textrm{D}v\vert (\Omega )+(div \,y,v)_{\Omega }+\smash {I_{K_1(0)}^{\Omega }}(y)+\tfrac{\alpha }{2}\Vert v-g\Vert _{L^2(\Omega )}^2\\&\quad -\tfrac{1}{2\alpha }\Vert div \,y-\alpha \,(v-g)\Vert _{L^2(\Omega )}^2-\tfrac{\alpha }{2}\Vert v-g\Vert _{L^2(\Omega )}^2, \end{aligned}$$
which yields the claimed representation. \(\square \)
Restricting the primal-dual gap estimator (4.6) to sub-classes of \(BV(\Omega )\cap L^2(\Omega )\) and \(W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), respectively, for which a suitable integration-by-parts formula applies, e.g., (2.7), it is possible to derive further representations of the primal-dual gap estimator (4.6), whose integrands are point-wise non-negative and, thus, suitable as local refinement indicators.
Remark 4.2
(Alternative representations of (4.6) and local refinement indicators)
(i)
For every \(v\in W^{1,1}(\Omega ){\cap L^2(\Omega )}\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), it holds that
$$\begin{aligned} \eta ^2(v,y)&=\Vert \nabla v\Vert _{L^1(\Omega ;\mathbb {R}^d)}-(\nabla v,y)_{\Omega }+\smash {I_{K_1(0)}^{\Omega }}(y)\\&+\smash {\tfrac{1}{2\alpha }\Vert div \,y+\alpha \,(v-g)\Vert _{L^2(\Omega )}^2\ge 0\,.} \end{aligned}$$
 
(ii)
For every \(T\hspace{-0.15em}\in \hspace{-0.15em} \mathcal {T}_h\), we define the local refinement indicator \(\eta _T^2:\hspace{-0.15em} (W^{1,1}(\Omega )\hspace{-0.15em}\cap \hspace{-0.15em} {L^2(\Omega )})\hspace{-0.15em}\times \hspace{-0.15em} W^2_N(div ;\Omega ) \) \(\cap L^\infty (\Omega ;\mathbb {R}^d)\rightarrow [0,+\infty ]\) for every \(v\in W^{1,1}(\Omega )\cap {L^2(\Omega )}\) and \(y\in W^2_N(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\) by
$$\begin{aligned} \eta ^2_{T,W}(v,y)&{:}{=}\Vert \nabla v\Vert _{L^1(T;\mathbb {R}^d)}-(\nabla v,y)_T+\smash {I_{K_1(0)}^{\Omega }}(y) \\&\quad +\tfrac{1}{2\alpha }\Vert div \,y+\alpha \,(v-g)\Vert _{L^2(T)}^2\,\ge 0. \end{aligned}$$
 
(iii)
For every \(v_h\hspace{-0.15em}\in \hspace{-0.15em} \mathcal {S}^{1,cr}(\Omega )\) and \(y_h\hspace{-0.15em}\in \hspace{-0.15em} \mathcal {R}T^0_N(\mathcal {T}_h)\), by the representation of the total variation of Crouzeix–Raviart functions (2.6) and the discrete integration-by-parts formula (2.7), it holds that
$$\begin{aligned} \eta ^2(v_h,y_h)&=\Vert \nabla _{\! h} v_h\Vert _{L^1(\Omega ;\mathbb {R}^d)}+\Vert \llbracket {v_h}\rrbracket \Vert _{L^1(\mathcal {S}_h)}-(\nabla _{\! h} v_h,\Pi _h y_h)_{\Omega }\\&\quad +\smash {I_{K_1(0)}^{\Omega }}(y_h)+\tfrac{1}{2\alpha }\Vert div \,y_h+\alpha \,(v_h-g)\Vert _{L^2(\Omega )}^2\ge 0\,. \end{aligned}$$
 
(iv)
For every \(T\in \mathcal {T}_h\), we define the discrete local refinement indicator \(\eta _{T,\textit{CR}}^2:{\mathcal {S}^{1,cr}(\mathcal {T}_h)\times \mathcal {R}T^0_N(\mathcal {T}_h)}\) \(\rightarrow [0,+\infty ]\) for every \(v_h\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) and \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\) by
$$\begin{aligned} \eta ^2_{T,\textit{CR}}(v_h,y_h)&{:}{=}\Vert \nabla v_h\Vert _{L^1(T;\mathbb {R}^d)}+\sum _{S\in \mathcal {S}_h\,:\,S\subseteq T}{\Vert \llbracket {v_h}\rrbracket \Vert _{L^1(S)}}\\&\quad -(\nabla _{\! h} v_h,\Pi _h y_h)_T+\smash {I_{K_1(0)}^{\Omega }}(y_h)\\&+\tfrac{1}{2\alpha }\Vert div \,y_h+\alpha \,(v_h-g)\Vert _{L^2(T)}^2\ge 0\,. \end{aligned}$$
 
We note that the primal-dual gap estimator (4.6) and the representations (4.7) or in Remark 4.2(i) &(ii) are well-known (cf. [9, 11, 13]). However, the combination of (4.6) with the representation of the total variation of Crouzeix–Raviart functions (2.6) and the discrete integration-by-parts formula (2.7) in Remark 4.2(iii) &(iv), to the best of the authors’ knowledge, is new and leads to improved experimental convergence rates of the adaptive mesh-refinement procedure compared to the contributions [9, 11, 13] (cf. Sect. 5), where the error decay rate \(\mathcal {O}(h^{0.6})\) has been measured. More precisely, in Sect. 5, we report linear convergence rates which have been obtained only for meshes with quadratic grading towards a sufficiently simple jump set of a piece-wise regular g in  [11].

4.2 The Discretized Rudin–Osher–Fatemi (ROF) Model

    Given \(g\in L^2(\Omega )\) and \(\alpha >0\), with \(g_h{:}{=}\Pi _hg\in \mathcal {L}^0(\mathcal {T}_h)\), the discretized ROF model, proposed in [27], is given via the minimization of \(I^{cr}_h:\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\rightarrow \mathbb {R}\), for every \(v_h\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\) defined by
$$\begin{aligned} I^{cr}_h(v_h){:}{=}\Vert \nabla _{\!h}v_h\Vert _{L^1(\Omega ;\mathbb {R}^d)}+\tfrac{\alpha }{2}\Vert \Pi _hv_h-\alpha \, g_h\Vert ^2_{L^2(\Omega )}\,. \end{aligned}$$
(4.8)
Note that the functional (4.8) defines a non-conforming approximation of the functional (4.1), as, e.g., jump terms of across interior element sides are not included. This turned out to be essential in the derivation of a quasi-optimal a priori error estimate in [4, 27]. Since the functional (4.8) is proper, strictly convex, weakly coercive, and lower semi-continuous, the direct method in the calculus of variations (cf. [29]) yields the existence of a unique minimizer \(u_h^{cr}\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\), called the discrete primal solution. A (Fenchel) dual problem to the minimization of (4.8) is given via the maximization of the functional \(D_h^{rt}:\mathcal {R}T^0_N(\mathcal {T}_h)\rightarrow \mathbb {R}\cup \{-\infty \}\), for every \({y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}\) defined by
$$\begin{aligned} D_h^{rt}(y_h){:}{=}-\smash {I_{K_1(0)}^{\Omega }}(\Pi _hy_h)-\tfrac{1}{2\alpha }\Vert div \,y_h+\alpha \,g_h\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g_h\Vert _{L^2(\Omega )}^2\,.\\\nonumber \end{aligned}$$
(4.9)
Appealing to Theorem 4.8 (below), there exists a maximizer \(z_h^{rt}\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {R}T^0_N(\mathcal {T}_h)\) of (4.9), which satisfies \(\vert \Pi _h z_h^{rt}\vert \le 1\) a.e. in \(\Omega \), a discrete strong duality relation applies, i.e.,
$$\begin{aligned} I^{cr}_h(u_h^{cr})= D_h^{rt}(z_h^{rt})\,,\\\nonumber \end{aligned}$$
(4.10)
and the discrete convex optimality relations
$$\begin{aligned} div \, z_h^{rt}&=\alpha \, (\Pi _h u_h^{cr}-g_h) & \quad in \mathcal {L}^0(\mathcal {T}_h)\,, \end{aligned}$$
(4.11)
$$\begin{aligned} \Pi _hz_h^{rt}\cdot \nabla _{\!h} u_h^{cr}&=\vert \nabla _{\!h} u_h^{cr}\vert & \quad in \mathcal {L}^0(\mathcal {T}_h)\,.\\\nonumber \end{aligned}$$
(4.12)

4.3 The Regularized, Discretized Rudin–Osher–Fatemi Model

    To approximate a discrete minimizer \(u_h^{cr}\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\) of (4.8), it is common to approximate the modulus function by strictly convex regularizations. In this connection, for every \(\varepsilon \in (0,1)\), we define a special regularization \(f_\varepsilon :\mathbb {R}\rightarrow \mathbb {R}_{\ge 0}\) of the modulus function, for every \(t\in \mathbb {R}\), via
$$\begin{aligned} f_\varepsilon (t){:}{=}(1-\varepsilon )\, \vert t\vert _\varepsilon \,,\qquad \vert t\vert _\varepsilon {:}{=}(t^2+\varepsilon ^2)^{\frac{1}{2}}\,, \end{aligned}$$
(4.13)
where \(\vert \cdot \vert _\varepsilon :\mathbb {R}\rightarrow \mathbb {R}_{\ge 0}\) is commonly referred to as the standard regularization.
Let us collect the most important properties of the regularization (4.13).
Lemma 4.3
For every \(\varepsilon \in (0,1)\), the following statements apply:
(i)
\(f_\varepsilon \in C^1(\mathbb {R})\) with \(f_\varepsilon '(0)=0\).
 
(ii)
For every \(t\in \mathbb {R}\), it holds that \(-\varepsilon \, \vert t\vert -\varepsilon ^2\le f_\varepsilon (t)-\vert t\vert \le \varepsilon \, (1-\vert t\vert )\).
 
(iii)
For every \(t\in \mathbb {R}\), it holds that \(\vert f_\varepsilon '(t)\vert \le 1-\varepsilon \).
 
(iv)
For every \(s\in \mathbb {R}\), it holds that
$$\begin{aligned} f_\varepsilon ^*(s)={\left\{ \begin{array}{ll} \smash {-\varepsilon \,((1-\varepsilon )^2-\vert s\vert ^2)^{\frac{1}{2}}}& \quad \text { if } \vert s\vert \le 1-\varepsilon \\ +\infty & \quad \text { if } \vert s\vert > 1-\varepsilon \end{array}\right. } \end{aligned}$$
 
Remark 4.4
The main reason to consider the regularization \(f_\varepsilon :\mathbb {R}\rightarrow \mathbb {R}_{\ge 0}\) instead of the standard regularization \(\vert \cdot \vert _\varepsilon :\mathbb {R}\rightarrow \mathbb {R}_{\ge 0}\) consists in the property (iii) in Lemma 4.3. This additional slope reduction enables us later to construct a sufficiently accurate, admissible approximation of the dual solution using an additional projection step (cf. Remark 4.6 (below) and Sect. 5 (below)).
Proof
ad (i). The claimed regularity \(f_\varepsilon \in C^1(\mathbb {R})\) is evident. Since for every \(t\in \mathbb {R}\), it holds that
$$\begin{aligned} f_\varepsilon '(t)=(1-\varepsilon )\,\tfrac{t}{(t^2+\varepsilon ^2)^{\frac{1}{2}}}\,, \end{aligned}$$
(4.14)
we have that \(f_\varepsilon '(0)=0\).
ad (ii). For every \(t\in \mathbb {R}\), due to \(0\le \vert t\vert _\varepsilon -\vert t\vert \le \varepsilon \), we have that
$$\begin{aligned} -\varepsilon \, \vert t\vert -\varepsilon ^2&\le -\varepsilon \, \vert t\vert _\varepsilon \\&\le f_\varepsilon (t)-\vert t\vert \\&=\varepsilon -\varepsilon \,\vert t\vert _\varepsilon \\&\le \varepsilon \, (1-\vert t\vert )\,. \end{aligned}$$
ad (iii). Due to (4.14), we have that \(\vert f'_\varepsilon (t)\vert =(1-\varepsilon )\vert t\vert (t^2+\varepsilon ^2)^{-\frac{1}{2}}\le (1-\varepsilon )\vert t\vert \vert t\vert ^{-1}\) for all \({t\in \mathbb {R}}\).
ad (iv). Due to [16, Proposition 13.20(i)], for every \(s\in \mathbb {R}\) and \(\varepsilon \in (0,1)\), we have that
$$\begin{aligned} f_\varepsilon ^*(s)&=((1-\varepsilon )\,\vert \cdot \vert _\varepsilon )^*(s)\\&=(1-\varepsilon )\,(\vert \cdot \vert _\varepsilon )^*\big (\tfrac{s}{1-\varepsilon }\big )\,. \end{aligned}$$
Since for every \(s\in \mathbb {R}\) and \(\varepsilon \in (0,1)\), it holds that
$$\begin{aligned} (\vert \cdot \vert _\varepsilon )^*\big (s)&=\sup _{t\in \mathbb {R}^d}{\big \{s\cdot t-\vert t\vert _\varepsilon \big \}}\\&= {\left\{ \begin{array}{ll} -\varepsilon \,(1-\vert s\vert ^2)^{\frac{1}{2}}& \quad \text { if } \vert s\vert \le 1\\ +\infty & \quad \text { if } \vert s\vert > 1 \end{array}\right. }\,, \end{aligned}$$
where we used, e.g., in the case \(\vert s\vert < 1\), that the supremum is attained at \(t\in \mathbb {R}^d\) if \(s=t\vert t\vert _\varepsilon ^{-1}\)i.e., \(t\!=\!\varepsilon s(1-\vert s\vert ^2)^{-1}\!\), we conclude that the claimed representation of the Fenchel conjugate applies. \(\square \)
Given \(g\in L^2(\Omega )\), \(\alpha > 0\), and an element-wise constant regularization parameter \(\varepsilon _h\in \mathcal {L}^0(\mathcal {T}_h)\) with \(0<\varepsilon _h<1\) a.e. in \(\Omega \), for \({g_h{:}{=}\Pi _hg\in \mathcal {L}^0(\mathcal {T}_h)}\), the regularized, discrete ROF model is given via the minimization of the functional \({I^{cr}_{h,\varepsilon _h}:\hspace{-0.1em}\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\hspace{-0.1em}\rightarrow \hspace{-0.1em} \mathbb {R}}\), for every \(v_h\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\) defined by
$$\begin{aligned} I^{cr}_{h,\varepsilon _h}(v_h){:}{=}\Vert f_{\varepsilon _h}(\vert \nabla _{\!h}v_h\vert )\Vert _{L^1(\Omega )}+\tfrac{\alpha }{2}\Vert \Pi _hv_h-g_h\Vert ^2_{L^2(\Omega )}\,. \end{aligned}$$
(4.15)
Since the functional (4.15) is proper, strictly convex, weakly coercive, and lower semi-continuous, the direct method in the calculus of variations (cf. [29]) yields the existence of a unique minimizer \(u_{h,\varepsilon _h}^{cr}\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\), called the regularized, discrete primal solution. According to \((f_{\varepsilon _h}\circ \vert \cdot \vert )^*=f_{\varepsilon _h}^*\circ \vert \cdot \vert \) (cf. [16, Example 13.7]), the corresponding (Fenchel) dual problem to the minimization of (4.8) is given via the maximization of functional \(D_{h,\varepsilon _h}^{rt}:\mathcal {R}T^0_N(\mathcal {T}_h)\rightarrow \mathbb {R}\cup \{-\infty \}\), for every \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\) defined by
$$\begin{aligned} D_{h,\varepsilon _h}^{rt}(y_h)&{:}{=}-\int _{\Omega }{f_{\varepsilon _h}^*(\vert \Pi _hy_h\vert )\,\textrm{d}x}\nonumber \\&\quad -\tfrac{1}{2\alpha }\Vert div \,y_h+\alpha \,g_h\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g_h\Vert _{L^2(\Omega )}^2\,. \end{aligned}$$
(4.16)
The following proposition clarifies the well-posedness of the dual regularized, discretized ROF model, i.e., the existence of a maximizer of (4.16). It also yields a discrete reconstruction formula for a maximizer of (4.16) from a minimizer of (4.15) and proves discrete strong duality.
Proposition 4.5
The following statements apply:
(i)
A discrete weak duality relation applies, i.e.,
$$\begin{aligned} \inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{I_{h,\varepsilon _h}^{\textit{cr}}(v_h)}\ge \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{D_{h,\varepsilon _h}^{\textit{rt}}(y_h)}\,. \end{aligned}$$
(4.17)
 
(ii)
The discrete flux \(z_h^{\textit{rt}}\in (\mathcal {L}^1(\mathcal {T}_h))^d\), defined via the generalized Marini formula
$$\begin{aligned} z_{h,\varepsilon _h}^{rt}{:}{=}\tfrac{f_{\varepsilon _h}'(\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )}{\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert }\nabla _{\!h} u_{h,\varepsilon _h}^{cr}+\alpha \tfrac{\Pi _h u_{h,\varepsilon _h}^{cr}-g_h}{d}\big (id _{\mathbb {R}^d}-\Pi _hid _{\mathbb {R}^d}\big )\,, \end{aligned}$$
(4.18)
satisfies \(z_{h,\varepsilon _h}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) and the discrete convex optimality relations
$$\begin{aligned} div \,z_{h,\varepsilon _h}^{rt}&=\alpha \,(\Pi _hu_{h,\varepsilon _h}^{cr}-g_h)\qquad \text { in }\mathcal {L}^0(\mathcal {T}_h)\,, \end{aligned}$$
(4.19)
$$\begin{aligned} \Pi _h z_{h,\varepsilon _h}^{rt}&=\tfrac{f_{\varepsilon _h}'(\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert )}{\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert }\nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}} \quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\,. \end{aligned}$$
(4.20)
 
(iii)
The discrete flux \(z_h^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) is a maximizer of (4.16) and discrete strong duality applies, i.e.,
$$\begin{aligned} I^{cr}_{h,\varepsilon _h}(u_{h,\varepsilon _h}^{cr})=D_{h,\varepsilon _h}^{rt}(z_{h,\varepsilon _h}^{rt})\,. \end{aligned}$$
 
By the Fenchel–Young identity (cf. [33, Proposition 5.1, p. 21]), (4.20) is equivalent to
$$\begin{aligned} \Pi _h z_{h,\varepsilon _h}^{rt}\cdot \nabla _{\!h} u_{h,\varepsilon _h}^{cr}&=f_{\varepsilon _h}^*(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert )+f_\varepsilon (\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )\quad \text { in }\mathcal {L}^0(\mathcal {T}_h)\,. \end{aligned}$$
(4.21)
Remark 4.6
Due to Lemma 4.3(iii), we have that \(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert \le 1-\varepsilon _h\) a.e. in \(\Omega \). Therefore, if \(\Vert \Pi _hu_{h,\varepsilon _h}^{cr}-g_h\Vert _{L^\infty (\Omega )}\le c_0\) for some \(c_0>0\), which can be expected by discrete maximum principles, then, choosing \(\varepsilon _h{:}{=}\frac{\alpha c_0}{d}h\), yields that \(\Vert z_{h,\varepsilon _h}^{rt}\Vert _{L^\infty (\Omega ;\mathbb {R}^d)}\le 1\). However, choices like \({\varepsilon _h\sim h}\) let us expect convergence rates not better than \(\mathcal {O}(h^{1/2})\) (cf. Proposition 4.7(i) (below)). To allow for the convergence rate \(\mathcal {O}(h)\), one needs \(\varepsilon _h\sim h^2\). But, in this case, we cannot guarantee that \(\Vert z_{h,\varepsilon _h}^{rt}\Vert _{L^\infty (\Omega ;\mathbb {R}^d)}\le 1\), so that we instead consider the scaled solution \(\overline{z}_{h,\varepsilon _h}^{rt}{:}{=}z_{h,\varepsilon _h}^{rt}(\max \{1,\Vert z_{h,\varepsilon _h}^{rt}\Vert _{L^\infty (\Omega ;\mathbb {R}^d)}\})^{-1}\in \mathcal {R}T^0_N(\mathcal {T}_h)\), which is still a sufficiently accurate approximation of the dual solution, as indicated by the numerical experiments (cf. Sect. 5).
Proof
ad (i). Using element-wise that \(f_{\varepsilon _h}=f_{\varepsilon _h}^{**}\), the definition of the Fenchel conjugate (cf. (2.2)), and the discrete integration-by-parts formula (2.7), we find that
$$\begin{aligned}&\inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{I_{h,\varepsilon _h}^{\textit{cr}}(v_h)}=\inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{\bigg \{ \int _{\Omega }{f_{\varepsilon _h}^{**}(\vert \nabla _{\! h} v_h \vert )\,\textrm{d}x} +\tfrac{\alpha }{2}\Vert \Pi _h v_h-g_h\Vert _{L^2(\Omega )}^2\bigg \}} \\&\qquad = \inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{ \sup _{\overline{y}_h\in \mathcal {L}^0(\mathcal {T}_h)^d}{\bigg \{-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \overline{y}_h \vert )\,\textrm{d}x}+(\overline{y}_h,\nabla _{\! h} v_h)_\Omega +\tfrac{\alpha }{2}\Vert \Pi _h v_h-g_h\Vert _{L^2(\Omega )}^2\bigg \}}}\\&\qquad \ge \inf _{v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}{ \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{\bigg \{-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h y_h \vert )\,\textrm{d}x}-(div \,y_h,\Pi _h v_h)_\Omega +\tfrac{\alpha }{2}\Vert \Pi _h v_h-g_h\Vert _{L^2(\Omega )}^2\bigg \}}}\\&\qquad \ge \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{\bigg \{-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h y_h \vert )\,\textrm{d}x}-\sup _{\overline{v}_h\in \mathcal {L}^0(\mathcal {T}_h)}{\big \{(div \,y_h,\overline{v}_h)_\Omega -\tfrac{\alpha }{2}\Vert \overline{v}_h-g_h\Vert _{L^2(\Omega )}^2\big \}}\bigg \}}\\&\qquad = \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{\bigg \{-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h y_h \vert )\,\textrm{d}x}-\tfrac{1}{2\alpha }\Vert div \,y_h+\alpha \, g_h\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g_h\Vert _{L^2(\Omega )}^2\bigg \}}\\&\qquad = \sup _{y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}{D_{h,\varepsilon _h}^{\textit{rt}}(y_h)}\,, \end{aligned}$$
which is the claimed discrete weak duality relation.
ad (ii). By Lemma 4.3, the minimality of \(u_{h,\varepsilon _h}^{\textit{cr}}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) for (4.15), for every \(v_h\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\), yields that
$$\begin{aligned} \Big (f_{\varepsilon _h}'(\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert )\tfrac{\nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}}{\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert },\nabla _{\! h} v_h\Big )_\Omega +\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h,\Pi _h v_h)_\Omega =0\,. \end{aligned}$$
(4.22)
By definition, the discrete flux \(z_{h,\varepsilon _h}^{\textit{rt}}\in (\mathcal {L}^1(\mathcal {T}_h))^d\), defined by (4.18), satisfies the discrete convex optimality condition (4.20) and \(div \,(z_{h,\varepsilon _h}^{\textit{rt}}|_T)=\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h)|_T\) in T for all \(T\in \mathcal {T}_h\). Choosing \(v_h=1\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) in (4.22), we find that \(\int _{\Omega }{\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h)\,\textrm{d}x}=0\). Hence, since for \(\Gamma _D=\emptyset \) the divergence operator \(div :\mathcal {R}T^0_N(\mathcal {T}_h)\rightarrow \mathcal {L}^0(\mathcal {T}_h)/\mathbb {R}\) is surjective, there exists \(y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)\) such that \(div \, y_h=\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h) \) in \(\mathcal {L}^0(\mathcal {T}_h)\). Then, we have that \(div \,((z_{h,\varepsilon _h}^{\textit{rt}}-y_h)|_T)=0\) in T for all \(T\in \mathcal {T}_h\), i.e., \(z_{h,\varepsilon _h}^{\textit{rt}}-y_h\in (\mathcal {L}^0(\mathcal {T}_h))^d\). In addition, for every \(v_h\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\), it holds that
$$\begin{aligned} \begin{aligned} (\Pi _h y_h,\nabla _{\! h} v_h)_\Omega&=-(div \, y_h,\Pi _h v_h)_\Omega \\ &=-\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h,\Pi _h v_h)_\Omega \\ &=\Big (f_{\varepsilon _h}'(\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert )\tfrac{\nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}}{\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert },\nabla _{\! h} v_h\Big )_\Omega \\&=(\Pi _h z_{h,\varepsilon _h}^{\textit{rt}},\nabla _{\! h} v_h)_\Omega \,. \end{aligned} \end{aligned}$$
In other words, for every \(v_h\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_h)\), it holds that
$$\begin{aligned} (y_h-z_{h,\varepsilon _h}^{\textit{rt}},\nabla _{\! h} v_h)_\Omega =(\Pi _h y_h-\Pi _h z_{h,\varepsilon _h}^{\textit{rt}},\nabla _{\! h} v_h)_\Omega =0\,, \end{aligned}$$
i.e., \(y_h-z_{h,\varepsilon _h}^{\textit{rt}}\in \nabla _{\! h}(\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h))^{\perp }\). By (2.8), we have that \(\nabla _{\! h}(\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h))^{\perp }=ker (div |_{\smash {\mathcal {R}T^0_N(\mathcal {T}_h)}})\subseteq \mathcal {R}T^0_N(\mathcal {T}_h)\). Hence, it holds that \(y_h-z_{h,\varepsilon _h}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\). Due to \({y_h\in \mathcal {R}T^0_N(\mathcal {T}_h)}\), we conclude that \(z_{h,\varepsilon _h}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\). In particular, now from \(div \,(z_{h,\varepsilon _h}^{\textit{rt}}|_T)=\alpha \,(\Pi _hu_{h,\varepsilon _h}^{\textit{cr}}-g_h)|_T\) in T for all \(T\in \mathcal {T}_h\), it follows the discrete optimality condition (4.19).
ad (iii). Using (4.21), (4.19), and the discrete integration-by-parts formula (2.7), we find that
$$\begin{aligned} I_{h,\varepsilon _h}^{\textit{cr}}(u_{h,\varepsilon _h}^{\textit{cr}})&=\Vert f_{\varepsilon _h}(\vert \nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}}\vert )\Vert _{L^1(\Omega )}+\tfrac{\alpha }{2}\Vert \Pi _h u_{h,\varepsilon _h}^{\textit{cr}}-g_h\Vert _{L^2(\Omega )}^2\\&=-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h z_{h,\varepsilon _h}^{\textit{rt}} \vert )\,\textrm{d}x}+(\Pi _h z_{h,\varepsilon _h}^{\textit{rt}},\nabla _{\! h} u_{h,\varepsilon _h}^{\textit{cr}})_\Omega +\tfrac{1}{2\alpha }\Vert div \,z_{h,\varepsilon _h}^{\textit{rt}}\Vert _{L^2(\Omega )}^2\\&=-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h z_{h,\varepsilon _h}^{\textit{rt}} \vert )\,\textrm{d}x}-( div \,z_{h,\varepsilon _h}^{\textit{rt}},\Pi _hu_{h,\varepsilon _h}^{\textit{cr}})_\Omega +\tfrac{1}{2\alpha }\Vert div \,z_{h,\varepsilon _h}^{\textit{rt}}\Vert _{L^2(\Omega )}^2\\&=-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h z_{h,\varepsilon _h}^{\textit{rt}} \vert )\,\textrm{d}x}-\tfrac{1}{\alpha }( div \,z_{h,\varepsilon _h}^{\textit{rt}},div \,z_{h,\varepsilon _h}^{\textit{rt}}+\alpha \, g_h)_\Omega +\tfrac{1}{2\alpha }\Vert div \,z_{h,\varepsilon _h}^{\textit{rt}}\Vert _{L^2(\Omega )}^2\\&=-\int _{\Omega }{f_{\varepsilon _h}^{*}(\vert \Pi _h z_{h,\varepsilon _h}^{\textit{rt}} \vert )\,\textrm{d}x}-\tfrac{1}{2\alpha }\Vert div \,z_{h,\varepsilon _h}^{\textit{rt}}+\alpha \, g_h\Vert _{L^2(\Omega )}^2\\&=D_{h,\varepsilon _h}^{\textit{rt}}(z_{h,\varepsilon _h}^{\textit{rt}})\,, \end{aligned}$$
which is the claimed discrete strong duality relation and, thus, according to the discrete weak duality relation (4.17), proves the maximality of \(z_{h,\varepsilon _h}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) for (4.16). \(\square \)
The following proposition describe the approximative behavior to the regularized, discretized ROF model towards the (unregularized) discretized ROF problem, given uniform convergence (to zero) of the element-wise constant regularization parameter \(\varepsilon _h\in \mathcal {L}^0(\mathcal {T}_h)\). In what follows, in the convergence \(\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0\), the triangulation \(\mathcal {T}_h\) is always fixed.
Proposition 4.7
If \(\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}<1\), then the following statements apply:
(i)
\(\tfrac{\alpha }{2}\Vert \Pi _h u_{h,\varepsilon _h}^{cr}-\Pi _hu_h^{cr}\Vert _{L^2(\Omega )}^2 \le \tfrac{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}\,(\tfrac{\alpha }{2}\,\Vert g\Vert _{L^2(\Omega )}^2+2\,\vert \Omega \vert )\).
 
(ii)
\(div \, z_{h,\varepsilon _h}^{rt}\rightarrow \alpha \,(\Pi _hu_h^{cr}-g_h)\) in \(\mathcal {L}^0(\mathcal {T}_h)\) \((\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\).
 
(iii)
\(f_{\varepsilon _h}^*(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert )\rightarrow 0\) in \(\mathcal {L}^0(\mathcal {T}_h)\) \((\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\).
 
(iv)
\(f_{\varepsilon _h} (\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )\rightarrow \vert \nabla _{\!h} u_h^{cr}\vert \) in \(\mathcal {L}^0(\mathcal {T}_h)\) \((\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\).
 
Proof
ad (i). Using the strong convexity of \(I_h^{cr}:\mathcal {S}^{1,cr}(\mathcal {T}_h)\rightarrow \mathbb {R}\) and Lemma 4.3(ii), we obtain
$$\begin{aligned} \begin{aligned} \tfrac{\alpha }{2}\Vert \Pi _h u_{h,\varepsilon _h}^{cr}-\Pi _hu_h^{cr}\Vert _{L^2(\Omega )}^2&\le I_h^{cr}(u_{h,\varepsilon _h}^{cr})-I_h^{cr}(u_h^{cr})\\ &\le \tfrac{1}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}} I_{h,\varepsilon _h}^{cr}(u_{h,\varepsilon _h}^{cr})+\tfrac{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}^2}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}\vert \Omega \vert -I_h^{cr}(u_h^{cr})\\&\le \tfrac{1}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}} I_{h,\varepsilon _h}^{cr}(u_h^{cr})+\tfrac{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}^2}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}\vert \Omega \vert -I_h^{cr}(u_h^{cr})\\&\le \tfrac{1}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}} ( I_h^{cr}(u_h^{cr}) +2\,\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\,\vert \Omega \vert )-I_h^{cr}(u_h^{cr})\\&= \tfrac{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}} \,(I_h^{cr}(u_h^{cr})+2\,\vert \Omega \vert )\,. \end{aligned} \end{aligned}$$
Since, by the minimality of \(u_h^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) for (4.8) and the \(L^2\)-stability of \(\Pi _h\), it holds that
$$\begin{aligned} I_h^{cr}(u_h^{cr})&\le I_h^{cr}(0)=\tfrac{\alpha }{2}\Vert g_h\Vert _{L^2(\Omega )}^2\le \tfrac{\alpha }{2}\Vert g\Vert _{L^2(\Omega )}^2\,, \end{aligned}$$
(4.23)
from (4.24) we conclude the claimed error estimate.
ad (ii). From claim (i), it follows that
$$\begin{aligned} \Pi _h u_{h,\varepsilon _h}^{cr}\rightarrow \Pi _hu_h^{cr}\quad \text { in } \mathcal {L}^0(\mathcal {T}_h)\quad (\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
(4.24)
Thus, using (4.24), from \(div \, z_{h,\varepsilon _h}^{rt}=\alpha \, ( \Pi _h u_{h,\varepsilon _h}^{cr}-g_h)\) in \(\mathcal {L}^0(\mathcal {T}_h)\) (cf. (4.19)), we conclude that
$$\begin{aligned} div \, z_{h,\varepsilon _h}^{rt}\rightarrow \alpha \,(\Pi _hu_h^{cr}-g_h)\quad in \mathcal {L}^0(\mathcal {T}_h)\quad (\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
ad (iii). Due to \(\Pi _h z_{h,\varepsilon _h}^{rt}=\frac{f_{\varepsilon _h}'(\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )}{\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert } \nabla _{\!h} u_{h,\varepsilon _h}^{cr} \) and Lemma 4.3(iii), we have that
$$\begin{aligned} \vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert&=\vert f_{\varepsilon _h}'(\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )\vert \le 1-\varepsilon _h\quad \text { a.e. in }\Omega \,. \end{aligned}$$
(4.25)
Therefore, using Lemma 4.3(iv) together with (4.25), we conclude that
$$\begin{aligned} \left. \begin{aligned} \vert f_{\varepsilon _h}^*(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert )\vert&= \varepsilon _h\,((1-\varepsilon _h)^2-\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert ^2)^{\frac{1}{2}}\\&\le \varepsilon _h\,(1-\varepsilon _h)\\&\le \varepsilon _h \end{aligned}\quad \right\} \quad \text { a.e. in }\Omega \,, \end{aligned}$$
which implies that \(f_{\varepsilon _h}^*(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert )\rightarrow 0\) in \(\mathcal {L}^0(\mathcal {T}_h)\) \((\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\).
ad (iv). Due to (4.23), \((u_{h,\varepsilon _h}^{cr})_{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}\subseteq \mathcal {S}^{1,cr}(\mathcal {T}_h)\) is bounded. The finite-dimensionality of \(\mathcal {S}^{1,cr}(\mathcal {T}_h)\) and the Bolzano–Weierstraß theorem yield a subsequence \({(u_{h,\varepsilon _h'}^{cr})_{\Vert \varepsilon _h'\Vert _{L^\infty (\Omega )}\rightarrow 0}\subseteq \mathcal {S}^{1,cr}(\mathcal {T}_h)}\) and a function \(\tilde{u}_h^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) such that
$$\begin{aligned} u_{h,\varepsilon _h'}^{cr}\rightarrow \tilde{u}_h^{cr}\quad \text { in } \mathcal {S}^{1,cr}(\mathcal {T}_h)\quad (\Vert \varepsilon _h'\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
(4.26)
From (4.26) it is readily derived that
$$\begin{aligned} f_{\varepsilon _h'} (\vert \nabla _{\!h} u_{h,\varepsilon _h'}^{cr}\vert )\rightarrow \vert \nabla _{\!h} \tilde{u}_h^{cr}\vert \quad \text { in }\mathcal {L}^0(\mathcal {T}_h)\quad (\Vert \varepsilon _h'\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
Consequently, for every \(v_h\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\), we find that
$$\begin{aligned} I_h^{cr}(\tilde{u}_h^{cr})&=\lim _{\Vert \varepsilon _h'\Vert _{L^\infty (\Omega )}\rightarrow 0}{I_{h,\varepsilon _h'}^{cr}(u_{h,\varepsilon _h'}^{cr})}\\&\le \lim _{\Vert \varepsilon _h'\Vert _{L^\infty (\Omega )}\rightarrow 0}{I_{h,\varepsilon _h'}^{cr}(v_h)}\\&=I_h^{cr}(v_h)\,. \end{aligned}$$
Thus, due to the uniqueness of \(u_h^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) as a minimizer of (4.8), we get \(\tilde{u}_h^{cr}=u_h^{cr}\) in \(\mathcal {S}^{1,cr}(\mathcal {T}_h)\). Since this argumentation remains valid for each subsequence of \((u_{h,\varepsilon _h}^{cr})_{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}\subseteq \mathcal {S}^{1,cr}(\mathcal {T}_h)\), the standard subsequence principle implies that \(f_{\varepsilon _h} (\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )\hspace{-0.2em}\rightarrow \hspace{-0.2em}\vert \nabla _{\!h} u_h^{cr}\vert \) in \(\mathcal {L}^0(\mathcal {T}_h)\) \({(\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\hspace{-0.2em}\rightarrow \hspace{-0.2em} 0)}\). \(\square \)
The approximation properties of the regularized, discrete ROF model (4.15) (and (4.16)) towards the (unregularized) discrete ROF model (4.8) (and (4.16)) enable us to transfer the discrete convex duality relations established in Proposition 4.5, which apply mainly due to the differentiability of the regularized, discrete ROF model, to the non-differentiable discrete ROF model. To the best of the authors’ knowledge, the following discrete convex duality relations for the (unregularized) discrete ROF model (4.8) seem to be new.
Theorem 4.8
There exists a vector field \(z_h^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) with \(\vert \Pi _h z_h^{rt}\vert \le 1\) a.e. in \(\Omega \) and the following properties:
(i)
Up to passing to a subsequence, it holds that
$$\begin{aligned} z_{h,\varepsilon _h}^{rt}\rightarrow z_h^{rt}\quad \text { in } \mathcal {R}T^0_N(\mathcal {T}_h)\quad (\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
 
(ii)
There hold the following discrete convex optimality relations:
$$\begin{aligned} \begin{aligned} div \, z_h^{rt}&=\alpha \, (\Pi _h u_h^{cr}-g_h) & \text { in }\mathcal {L}^0(\mathcal {T}_h)\,,\\ \Pi _hz_h^{rt}\cdot \nabla _{\!h} u_h^{cr}&=\vert \nabla _{\!h} u_h^{cr}\vert & \text { in }\mathcal {L}^0(\mathcal {T}_h)\,. \end{aligned} \end{aligned}$$
 
(iii)
The discrete flux \(z_h^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) is maximal for \(D_h^{rt}:\mathcal {R}T^0_N(\mathcal {T}_h)\rightarrow \mathbb {R}\) and discrete strong duality applies, i.e.,
$$\begin{aligned} I_h^{cr}(u_h^{cr})=D_h^{rt}(z_h^{rt})\,. \end{aligned}$$
 
Proof
ad (i). Due to Proposition 4.7(ii) and (4.25), the sequence \((z_{h,\varepsilon _h}^{rt})_{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}\subseteq \mathcal {R}T^0_N(\mathcal {T}_h)\) is bounded. Thus, by the finite-dimensionality of \(\mathcal {R}T^0_N(\mathcal {T}_h)\), the Bolzano–Weierstraß theorem yields a not relabeled subsequence and a vector field \(z_h^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) such that
$$\begin{aligned} z_{h,\varepsilon _h}^{rt}\rightarrow z_h^{rt}\quad \text { in }\mathcal {R}T^0_N(\mathcal {T}_h)\quad (\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
(4.27)
Due to the continuity of \(\Pi _h:L^1(\Omega ;\mathbb {R}^d)\rightarrow (\mathcal {L}^0(\mathcal {T}_h))^d\) and \(\mathcal {R}T^0_N(\mathcal {T}_h)\hookrightarrow L^1(\Omega ;\mathbb {R}^d)\), from (4.27), we obtain
$$\begin{aligned} \Pi _h z_{h,\varepsilon _h}^{rt}\rightarrow \Pi _h z_h^{rt}\quad \text { in }(\mathcal {L}^0(\mathcal {T}_h))^d\quad (\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0)\,. \end{aligned}$$
(4.28)
From \(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert \le 1-\varepsilon _h\) a.e. in \(\Omega \) (cf. (4.25)), and (4.28), we obtain \(\vert \Pi _h z_h^{rt}\vert \le 1\) a.e. in \(\Omega \), i.e.,
$$\begin{aligned} \smash {I_{K_1(0)}^{\Omega }}(\Pi _h z_h^{rt})=0\,. \end{aligned}$$
(4.29)
ad (ii). Using Proposition 4.7, (4.19), and (4.21), we find that
$$\begin{aligned} \left. \begin{aligned} div \,z_h^{rt}&=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}{div \,z_{h,\varepsilon _h}^{rt}}\\ &=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}{\alpha \,(\Pi _hu_{h,\varepsilon _h}^{cr}-g_h)}\\ &=\alpha \,(\Pi _h u_h^{cr}-g_h)\end{aligned}\quad \right\} \quad \text { a.e. in }\Omega \,, \end{aligned}$$
as well as
$$\begin{aligned} \left. \begin{aligned} \Pi _h z_h^{rt}\cdot \nabla _{\!h} u_h^{cr}&=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}{\Pi _h z_{h,\varepsilon _h}^{rt}\cdot \nabla _{\!h} u_{h,\varepsilon _h}^{cr}}\\&=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}\{{f_{\varepsilon _h}^*(\vert \Pi _h z_{h,\varepsilon _h}^{rt}\vert )+f_{\varepsilon _h}(\vert \nabla _{\!h} u_{h,\varepsilon _h}^{cr}\vert )\}}\\&=\vert \nabla _{\!h} u_h^{cr}\vert \end{aligned}\quad \right\} \quad \text { a.e. in }\Omega \,, \end{aligned}$$
i.e., the claimed discrete convex optimality conditions.
ad (iii). Using Proposition 4.7 and (4.29), we find that
$$\begin{aligned} I_h^{cr}(u_h^{cr})&=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}{I_{h,\varepsilon _h}^{cr}(u_{h,\varepsilon _h}^{cr})}\\&=\lim _{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\rightarrow 0}{D_{h,\varepsilon _h}^{rt}(z_{h,\varepsilon _h}^{rt})}\\&=D_h^{rt}(z_h^{rt})\,, \end{aligned}$$
i.e., the claimed discrete strong duality relation. \(\square \)

5 Numerical Experiments

In this section, we review the theoretical findings of Sect. 4 via numerical experiments. To compare approximations to an exact solution, we impose Dirichlet boundary conditions on \({\Gamma _D\!=\!\partial \Omega }\), i.e., for every \(v\in BV(\Omega )\cap L^2(\Omega )\), we have that \(I(v)<\infty \) if \(tr \,v=0\) a.e. on \(\partial \Omega \) and \(I(v)=+\infty \) else, though an existence theory is difficult to establish, in general. However, the concepts derived in Sect. 4 carry over verbatimly with \(\Gamma _N=\emptyset \) provided that the existence of a minimizer is given. All experiments were conducted employing the finite element software package FEniCS (version 2019.1.0, cf. [38]). All graphics were generated using the Matplotlib library (version 3.5.1, cf. [37]) and the Vedo library (version 2023.4.4, cf. [40]). Numerical experiments for the other model problems mentioned in Remark 3.3 can be found in [7, 8].

5.1 Implementation Details Regarding the Optimization Procedure

All computations are based on the regularized, discrete ROF problem (4.15). This is motivated by the fact that according to Proposition 4.7(i), in order to bound the error \({\Vert u-\Pi _h u_h^{cr}\Vert _{L^2(\Omega )}}\), it suffices to determine the error \(\Vert u-\Pi _h u_{h,\varepsilon _h}^{cr}\Vert _{L^2(\Omega )}\). The iterative minimization of (4.15) is realized using a semi-implicit discretized \(L^2\)-gradient flow from [5] (see also [4, Section 5]) modified with a residual stopping criterion guaranteeing the necessary accuracy in the optimization procedure.
Algorithm 5.1
(Semi-implicit discretized \(L^2\)-gradient flow) Let \(g_h, \varepsilon _h\in \mathcal {L}^0(\mathcal {T}_h)\) be such that \(\varepsilon _h>0\) a.e. in \(\Omega \) and \(\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}<1\), \(u^0_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\) and \(\tau ,\varepsilon _{stop}^h> 0\). Then, for every \(k\in \mathbb {N}\):
(i)
Compute the iterate \(\smash {u_h^k\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\) such that for every \(\smash {v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\), it holds that
$$\begin{aligned} (\textrm{d}_{\tau } u_h^k,v_h)_{\Omega }+\Big (\tfrac{f_{\varepsilon _h}'(\vert \nabla _{\!h}u_h^{k-1}\vert )}{\vert \nabla _{\!h}u_h^{k-1}\vert }\nabla _{\!h}u_h^k ,\nabla _{\!h}v_h \Big )_{\Omega }+\alpha \, (\Pi _hu_h^k-g_h,\Pi _hv_h)_{\Omega }=0\,, \end{aligned}$$
(5.1)
where \(\textrm{d}_{\tau } u_h^k{:}{=}\frac{1}{\tau }(u_h^k-u_h^{k-1})\) denotes the backward difference quotient.
 
(ii)
Compute the residual \(\smash {r_h^k\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\) such that for every \(\smash {v_h\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)}\), it holds that
$$\begin{aligned} (r_h^k,v_h)_{\Omega }=\Big (\tfrac{f_{\varepsilon _h}'(\vert \nabla _{\!h}u_h^k\vert )}{\vert \nabla _{\!h}u_h^k\vert } \nabla _{\!h}u_h^k,\nabla _{\!h}v_h \Big )_{\Omega }+\alpha \, (\Pi _hu_h^k-g_h,\Pi _hv_h)_{\Omega }\,. \end{aligned}$$
(5.2)
Stop if \(\Vert r_h^k\Vert _{L^2(\Omega )}\le \varepsilon _{stop}^h\); otherwise, increase \(k\!\rightarrow \! k+1\) and continue with step (i).
 
According to [4, Remark 5.5], the iterates \(u_h^k\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\), \(k\in \mathbb {N}\), the residuals \(r_h^k\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\), \(k\in \mathbb {N}\), generated by Algorithm 5.1, and the minimizer \(u_{h,\varepsilon _h}^{cr}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_h)\) of (4.15) satisfy
$$\begin{aligned} \Vert u_{h,\varepsilon _h}^{cr}-u_h^k\Vert _{L^2(\Omega )}\le 2\,\Vert r_h^k\Vert _{L^2(\Omega )}\,. \end{aligned}$$
(5.3)
In consequence, if we choose as a stopping criterion that \(\Vert r_h^{k^*}\Vert _{L^2(\Omega )}\hspace{-0.1em}\le \hspace{-0.1em}\varepsilon _{\textit{stop}}^h\hspace{-0.1em}{:}{=}\hspace{-0.1em}c_{\textit{stop}}\,h\) for \(k^*\in \mathbb {N}\), where \(c_{\textit{stop}}\hspace{-0.1em}>\hspace{-0.1em}0\) does not depend on \(h\hspace{-0.1em}>\hspace{-0.1em}0\), then, owing to Proposition 4.7(i) and (5.3), we have that
$$\begin{aligned} \Vert \Pi _h(u_h^{cr}-u_h^{k^*})\Vert _{L^2(\Omega )}^2\le \tfrac{\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}{1-\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}\,(2\,\Vert g\Vert _{L^2(\Omega )}^2+\tfrac{8}{\alpha }\,\vert \Omega \vert )+8\,c_{\textit{stop}}^2\,h^2\,. \end{aligned}$$
If \(\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}\le c_{\textit{reg}}\, h^2\), where \(c_{\textit{reg}}\in (0,1)\), then, we arrive at \(\Vert \Pi _h(u_h^{cr}-u_h^{k^*})\Vert _{L^2(\Omega )}=\mathcal {O}(h)\). Thus, to bound the error \(\Vert u-\Pi _hu_h^{cr}\Vert _{L^2(\Omega )} \) experimentally, it is sufficient to compute \(\smash {\Vert u-\Pi _hu_h^{k^*}\Vert _{L^2(\Omega )}}\).
The following proposition proves the well-posedness, stability, and convergence of Algorithm 5.1.
Proposition 5.2
Let the assumptions of Algorithm 5.1 be satisfied and let \(\varepsilon _h\in \mathcal {L}^0(\mathcal {T}_h)\) such that \(\varepsilon _h>0\) a.e. in \(\Omega \) and \(\smash {\Vert \varepsilon _h\Vert _{L^\infty (\Omega )}}<1\). Then, the following statements apply:
(i)
Algorithm 5.1 is well-posed, i.e., for every \(k\hspace{-0.15em}\in \hspace{-0.15em} \mathbb {N}\), given the most-recent iterate \({u_h^{k-1}\hspace{-0.15em}\in \hspace{-0.15em} \smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}}\), there exists a unique iterate \(u_h^k\in \smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\) solving (5.1).
 
(ii)
Algorithm 5.1 is unconditionally strongly stable, i.e., for every \(L\in \mathbb {N}\), it holds that
$$\begin{aligned} I_{h,\varepsilon _h}^{cr}(u_h^L)+\tau \sum _{k=1}^L{\Vert \textrm{d}_{\tau } u_h^k\Vert _{L^2(\Omega )}^2}\le I_{h,\varepsilon _h}^{cr}(u_h^0)\,. \end{aligned}$$
 
(iii)
Algorithm 5.1 terminates after a finite number of steps, i.e., there exists \(k^*\in \mathbb {N}\) such that \(\Vert r_h^{k^*}\Vert _{L^2(\Omega )}\le \varepsilon _{stop}^h\).
 
The proof of Proposition 5.2(ii) is essentially based on the following inequality.
Lemma 5.3
For every \(\varepsilon \in (0,1)\) and \(a,b\in \mathbb {R}^d\), it holds that
$$\begin{aligned} \tfrac{f_\varepsilon '(\vert a\vert )}{\vert a\vert } b\cdot (b-a)\ge f_\varepsilon (\vert b\vert )-f_\varepsilon (\vert a\vert )+\tfrac{1}{2}\tfrac{f_\varepsilon '(\vert a\vert )}{\vert a\vert }\vert b-a\vert ^2\,. \end{aligned}$$
Proof
Follows from [4, Appendix A.2], since \(f_\varepsilon \hspace{-0.1em}\in \hspace{-0.1em} C^1(\mathbb {R}_{\ge 0})\) and \((t\hspace{-0.1em}\mapsto \hspace{-0.1em} f_\varepsilon '(t)/t)\hspace{-0.1em}\in \hspace{-0.1em} C^0(\mathbb {R}_{\ge 0})\) is positive and non-decreasing for all \(\varepsilon \in (0,1)\). \(\square \)
Proof (of Proposition 5.2)
ad (i). Since \(\frac{f_\varepsilon '(t)}{t}\ge 0\) for all \(\varepsilon \in (0,1)\) and \(t\ge 0\), the well-posedness of Algorithm 5.1 is a direct consequence of the Lax–Milgram lemma.
ad (ii). Let \(L\hspace{-0.1em}\in \hspace{-0.1em} \mathbb {N}\) be arbitrary. Then, for every \(k\hspace{-0.1em}\in \hspace{-0.1em} \{1,\dots ,L\}\), choosing \({v_h\hspace{-0.1em}=\hspace{-0.1em}\textrm{d}_{\tau } u_h^k\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\) in (5.1), we find that
$$\begin{aligned} \Vert \textrm{d}_{\tau } u_h^k\Vert _{L^2(\Omega )}^2+\Big (\tfrac{f_{h,\varepsilon _h}'(\vert \nabla _{\!h}u_h^{k-1}\vert )}{\vert \nabla _{\!h}u_h^{k-1}\vert }\nabla _{\!h}u_h^k,\nabla _{\!h} \textrm{d}_{\tau } u_h^k\Big )_{\Omega }+\alpha \, (\Pi _hu_h^k-g_h,\Pi _h \textrm{d}_{\tau } u_h^k)_{\Omega }\,. \end{aligned}$$
(5.4)
According to Lemma 5.3 with \(a=\nabla _{\!h}u_h^{k-1}|_T\in \mathbb {R}^d\) and \(b=\nabla _{\!h} u_h^k|_T\in \mathbb {R}^d\) applied for all \(T\in \mathcal {T}_h\), for every \(k\in \{1,\dots ,L\}\), we have that
$$\begin{aligned} \tfrac{f_{h,\varepsilon _h}'(\vert \nabla _{\!h}u_h^{k-1}\vert )}{\vert \nabla _{\!h}u_h^{k-1}\vert }\nabla _{\!h}u_h^k\cdot \nabla _{\!h} \textrm{d}_{\tau } u_h^k\ge \textrm{d}_{\tau } f_{h,\varepsilon _h}(\vert \nabla _{\!h}u_h^k\vert )\quad \text { a.e. in }\Omega \,. \end{aligned}$$
(5.5)
In addition, since \(\textrm{d}_{\tau } g_h=0\), for every \(k\in \{1,\dots ,L\}\), we have that
$$\begin{aligned} \begin{aligned} (\Pi _hu_h^k-g_h)\Pi _h \textrm{d}_{\tau } u_h^k =(\Pi _hu_h^k-g_h)\textrm{d}_{\tau }(\Pi _h u_h^k-g_h) =\tfrac{\textrm{d}_{\tau }}{2} \vert \Pi _hu_h^k-g_h\vert ^2\,. \end{aligned} \end{aligned}$$
(5.6)
Using (5.5) and (5.6) in (5.4), for every \(k\in \{1,\dots ,L\}\), we arrive at
$$\begin{aligned} \Vert \textrm{d}_{\tau } u_h^k\Vert _{L^2(\Omega )}^2+\textrm{d}_{\tau } I_{h,\varepsilon _h}^{cr}(u_h^k)\le 0\,. \end{aligned}$$
(5.7)
Summation of (5.7) with respect to \(k\hspace{-0.1em}\in \hspace{-0.1em}\{1,\dots ,L\}\), using \(\sum _{k=1}^L{\hspace{-0.1em}\textrm{d}_{\tau } I_{h,\varepsilon _h}^{cr}(u_h^k)}\hspace{-0.1em}=\hspace{-0.1em}I_{h,\varepsilon _h}^{cr}(u_h^L)-\hspace{-0.1em}I_{h,\varepsilon _h}^{cr}(u_h^0)\), yields the claimed stability estimate.
ad (iii). Due to (i), we have that \(\Vert \textrm{d}_{\tau } u_h^k\Vert _{L^2(\Omega )}^2\rightarrow 0\) \((k\rightarrow \infty )\), i.e., by the finite-dimensionality of \(\smash {\mathcal {S}^{1,cr}_D(\mathcal {T}_h)}\) and the equivalence of norms, it holds that
$$\begin{aligned} u_h^k-u_h^{k-1}\rightarrow 0\quad \text { in }\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\quad (k\rightarrow \infty )\,. \end{aligned}$$
(5.8)
In addition, due to (i), we have that \(I_{h,\varepsilon _h}^{cr}(u_h^k)\le I_{h,\varepsilon _h}^{cr}(u_h^0)\), which, using Lemma 4.3, implies that \((u_h^k)_{k\in \mathbb {N}}\subseteq \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) is bounded. Due to the finite-dimensionality of \(\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\), the Bolzano–Weier-straß theorem yields a subsequence \((u_h^{k_l})_{l\in \mathbb {N}}\subseteq \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) and a function \(\tilde{u}_h\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) such that
$$\begin{aligned} u_h^{k_l}\rightarrow \tilde{u}_h\quad \text { in }\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\quad (l\rightarrow \infty )\,. \end{aligned}$$
(5.9)
Due to (5.8), from (5.9), we deduce that
$$\begin{aligned} u_h^{k_l-1}\rightarrow \tilde{u}_h\quad \text { in }\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\quad (l\rightarrow \infty )\,. \end{aligned}$$
(5.10)
As a result, using (5.8)–(5.10), by passing for \(l\rightarrow \infty \) in (5.1), for every \(v_h\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\), we obtain
$$\begin{aligned} \Big (\tfrac{f_{h,\varepsilon _h}'(\vert \nabla _{\!h}\tilde{u}_h\vert )}{\vert \nabla _{\!h}\tilde{u}_h\vert }\nabla _{\!h}\tilde{u}_h ,\nabla _{\!h}v_h \Big )_{\Omega }+\alpha \,(\Pi _h\tilde{u}_h-g_h,\Pi _hv_h)_{\Omega }=0\,, \end{aligned}$$
(5.11)
and, by uniqueness, \(\smash {\tilde{u}_h=u_{h,\varepsilon _h}^{cr}}\). Hence, using (5.8) and (5.11), for every \(v_h\in \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\), we obtain
$$\begin{aligned} \big (r_h^{k_l},v_h\big )_{\Omega }&=\Big (\tfrac{f_{h,\varepsilon _h}'(\vert \nabla _{\!h}u_h^{k_l}\vert )}{\vert \nabla _{\!h}u_h^{k_l}\vert }\nabla _{\!h}u_h^{k_l},\nabla _{\!h}v_h \Big )_{\Omega }+\alpha \,(\Pi _hu_h^{k_l}-g_h,\Pi _hv_h)_{\Omega }\\&\rightarrow \Big (\tfrac{f_{h,\varepsilon _h}'(\vert \nabla _{\!h}u_{h,\varepsilon _h}^{cr}\vert )}{\vert \nabla _{\!h}u_{h,\varepsilon _h}^{cr}\vert }\nabla _{\!h}u_{h,\varepsilon _h}^{cr} ,\nabla _{\!h}v_h \Big )_{\Omega }+\alpha \,(\Pi _hu_{h,\varepsilon _h}^{cr}-g_h,\Pi _hv_h)_{\Omega }=0\quad (l\rightarrow \infty )\,, \end{aligned}$$
i.e., \(r_h^{k_l}\rightharpoonup 0\) in \(\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) \((l\rightarrow \infty )\), and, thus, by the finite-dimensionality of \(\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\), \(r_h^{k_l}\rightarrow 0\) in \(\mathcal {S}^{1,cr}_D(\mathcal {T}_h)\) \((l\rightarrow \infty )\), which implies that \(r_h^{k_l}\rightarrow 0\) in \(L^2(\Omega )\) \((l\rightarrow \infty )\). As this argumentation remains valid for each subsequence of \((r_h^k)_{k\in \mathbb {N}}\subseteq \mathcal {S}^{1,cr}_D(\mathcal {T}_h)\), the standard convergence principle yields that \(r_h^k\rightarrow 0\) in \(L^2(\Omega )\) \((k\rightarrow \infty )\). In particular, there exists \(k^*\in \mathbb {N}\) such that \(\Vert r_h^{k^*}\Vert _{L^2(\Omega )}\le \varepsilon ^h_{\textit{stop}}\). \(\square \)

5.2 Implementation Details Regarding the Adaptive Mesh Refinement Procedure

    Before we present numerical experiments, we briefly outline the details of the implementations regarding the adaptive mesh refinement procedure. In general, we follow the adaptive algorithm (cf. [1, 25, 50]):
Algorithm 5.4
(AFEM) Let \(\varepsilon _{STOP }>0\), \(\theta \in (0,1]\), and \(\mathcal {T}_0\) an initial triangulation of \(\Omega \), and choose \(\varepsilon _0\in \mathcal {L}^0(\mathcal {T}_0)\) such that \(\varepsilon _0>0\) a.e. in \(\Omega \) and \(\Vert \varepsilon _0\Vert _{L^\infty (\Omega )}<1\). Then, for every \(i\in \mathbb {N}\cup \{0\}\):
  • (‘Solve’) Approximate the regularized, discrete primal solution \(u_i^{cr}{:}{=}u_{h_i,\varepsilon _i}^{cr} \in \mathcal {S}^{1,cr}_D(\mathcal {T}_i)\) minimizing (4.15). Post-process \(u_i^{cr}\hspace{-0.1em}\in \hspace{-0.1em} \smash {\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_i)}\) to obtain an approximation \({\overline{u}_i^{cr}\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_i)}\) with \(\overline{u}_i^{cr}=0\) a.e. on \(\partial \Omega \) and a regularized, discrete dual solution \(z_i^{rt}{:}{=}z_{h_i,\varepsilon _i}^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_i)\) maximizing (4.16). Then, define
    $$\begin{aligned} \overline{z}_i^{rt}{:}{=}\tfrac{z_i^{rt}}{\max \{1,\Vert z_i^{rt}\Vert _{L^\infty (\Omega ;\mathbb {R}^d)}\}} \in \mathcal {R}T^0_N(\mathcal {T}_i)\,. \end{aligned}$$
  • (‘Estimate’) Compute the local refinement indicators \(\smash {(\eta ^2_{T,\textit{CR}}(\overline{u}_i^{cr},\overline{z}_i^{rt}))_{T\in \mathcal {T}_i}}\) (cf. Remark 4.2(iv)). If \(\eta ^2(\overline{u}_i^{cr},\overline{z}_i^{rt}) \le \varepsilon _{STOP }\) (cf. Remark 4.2(iii)), then STOP; otherwise, continue with step (’Mark’).
  • (‘Mark’) Choose a minimal (in terms of cardinality) subset \(\mathcal {M}_i\subseteq \mathcal {T}_i\) such that
    $$\begin{aligned} \sum _{T\in \mathcal {M}_i}{\eta _{T,\textit{CR}}^2(\overline{u}_i^{cr},\overline{z}_i^{rt})}\ge \theta ^2\sum _{T\in \mathcal {T}_i}{\eta _{T,\textit{CR}}^2(\overline{u}_i^{cr},\overline{z}_i^{rt})}\,. \end{aligned}$$
  • (‘Refine’)Perform a conforming refinement of \(\mathcal {T}_i\) to obtain \(\mathcal {T}_{i+1}\) such that each \(T\in \mathcal {M}_i\) is refined in \(\mathcal {T}_{i+1}\). Then, construct \(\varepsilon _{i+1}\in \mathcal {L}^0(\mathcal {T}_{i+1})\) such that \(\varepsilon _{i+1}>0\) a.e. in \(\Omega \) and \(\Vert \varepsilon _{i+1}\Vert _{L^\infty (\Omega )}<c_i\,h_{i+1}^2\). Increase \(i\mapsto i+1\) and continue with step (’Solve’).
Remark 5.5
(i)
The regularized, discrete primal solution \(u_i^{cr}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_i)\) in step (‘Solve’) is computed using the semi-implicit discretized \(L^2\)-gradient flow (cf. Algorithm 5.1) for fixed step-size \({\tau =1.0}\), stopping criterion \(\varepsilon _{stop}^{h_i}{:}{=}\frac{h_i}{\sqrt{20}}\), and initial condition \(u_i^0=0\in \mathcal {S}_D^{1,cr}(\mathcal {T}_i)\). According to Proposition 5.2(ii), Algorithm 5.1 is unconditionally strongly stable, so that employing the fixed step-size \({\tau =1.0}\) is a reasonable choice. The stopping criterion \({\varepsilon _{stop}^{h_i}\hspace{-0.1em}{:}{=}\hspace{-0.1em}\frac{h_i}{\sqrt{20}}}\) ensures (cf. the argumentation below Algorithm 5.1) that the final iterate \(u_{h_i}^{k^*}\in \mathcal {S}^{1,cr}_D(\mathcal {T}_i)\) is a sufficiently accurate approximation of the discrete primal solution, in the sense that its accuracy does not violate the best possible linear convergence rate (cf. Remark 5.6 (below)).
 
(ii)
As a post-processed approximation \(\overline{u}_i^{cr}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_i)\) with \(\overline{u}_i^{cr}=0\) a.e. on \(\partial \Omega \), we employ
$$\begin{aligned} \overline{u}_i^{cr}{:}{=}{\left\{ \begin{array}{ll} u_i^{cr}& \text { if }u_i^{cr}=0\text { on a.e. }\partial \Omega \,,\\ I_k^{\partial \Omega } u_i^{cr}& \text { else}\,, \end{array}\right. } \end{aligned}$$
where the operator \(I_i^{\partial \Omega }:\mathcal {S}^{1,\textit{cr}}(\mathcal {T}_i)\rightarrow \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_i)\) for every \(v_{h_i}\in \mathcal {S}^{1,\textit{cr}}(\mathcal {T}_i) \) is defined by
$$\begin{aligned} I_i^{\partial \Omega }v_i{:}{=}\sum _{S\in \mathcal {S}_{h_i}\,:\,S\cap \partial \Omega =\emptyset }{v_{h_i}(x_S)\,\varphi _S}\,. \end{aligned}$$
 
(iii)
Note that the particular choices in (ii) are only due to the imposed homogeneous Dirichlet boundary condition. In the case \(\Gamma _D\hspace{-0.1em}=\hspace{-0.1em}\emptyset \), the choice \(\overline{u}_i^{cr}\hspace{-0.1em}{:}{=}\hspace{-0.1em}u_i^{cr}\hspace{-0.1em}\in \hspace{-0.1em} \mathcal {S}^{1,cr}(\mathcal {T}_i)\) is always admissible.
 
(iv)
If not otherwise specified, we employ the parameter \(\theta =\smash {\frac{1}{2}}\) in (‘Mark’).
 
(v)
To find the set \(\mathcal {M}_i\subseteq \mathcal {T}_i\) in step (‘Mark’), we deploy the Dörfler marking strategy (cf. [32]).
 
(vi)
The (minimal) conforming refinement of \(\mathcal {T}_i\) with respect to \(\mathcal {M}_i\) in step (‘Refine’) is obtained by deploying the black-green-blue-refinement algorithm (cf. [50]).
 
(vii)
For the construction of the adaptively modified regularization parameter \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\) in step (’Refine’), we employ separately the following two cases:
$$\begin{aligned} \varepsilon _i{:}{=}{\left\{ \begin{array}{ll} \tfrac{\alpha }{d} \vert \Pi _{h_{i-1}} u_{i-1}^{cr}-g_{h_i}\vert h_i^2 + h_i^3& (local)\,,\\ h_i^2& (global)\,. \end{array}\right. } \end{aligned}$$
 

5.3 Example with Lipschitz Continuous Dual Solution

    We examine an example from [11]. In this example, we let \(\Omega =(-1,1)^d\), \(d\in \{2,3\}\), \(\Gamma _D=\partial \Omega \), \(r=\smash {\frac{1}{2}}\), \(\alpha =10\), and \(g=\chi _{B_r^d(0)}\in BV(\Omega )\cap L^\infty (\Omega )\). Then, the primal solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) and a dual solution \(z\in W^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), for a.e. \(x\in \Omega \) are defined by
$$\begin{aligned} u(x)&{:}{=}(1-\tfrac{d}{\alpha r})\,g(x)\,,\qquad z(x){:}{=}{\left\{ \begin{array}{ll} -\tfrac{x}{r}& if \vert x\vert < r\,,\\ -\tfrac{rx}{\vert x\vert ^d}& if \vert x\vert \ge r\,. \end{array}\right. } \end{aligned}$$
(5.12)
Note that \(z\in W^{1,\infty }(\Omega ;\mathbb {R}^d)\), so that, according to [4, 27], uniform mesh-refinement (i.e., \(\theta =1\) in Algorithm 5.4) is expected to yield the quasi-optimal convergence rate \(\mathcal {O}(h^{\frac{1}{2}})\).
2D Case. The coarsest triangulation \(\mathcal {T}_0\) of Fig. 1 (initial triangulation of Algorithm 5.4) consists of 16 halved squares. More precisely, Fig. 1 displays the triangulations \(\mathcal {T}_i\)\({i\in \{0,15,20,25\}}\), generated by Algorithm 5.4 using either the adaptively modified \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\) (cf. (local)) or the global choice \(\varepsilon _i{:}{=}h_i^2\) (cf. (global)). For both choices, a refinement towards the circle \(\partial B_r^2(0)\), i.e., the jump set \(J_u\) of the exact solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) (cf. (5.12)) is reported. This behavior is also seen in Fig. 2, where the regularized, discrete primal solution \(u_{15}^{\textit{cr}}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_{15})\), the (local) \(L^2\)-projection onto element-wise constant functions \(\Pi _{h_{15}} u_{15}^{\textit{cr}}\in \mathcal {L}^0(\mathcal {T}_{15})\), and the (local) \(L^2\)-projections onto element-wise affine functions of the moduli of the regularized, discrete dual solution \(z_{15}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})\) and of the projected regularized, discrete dual solution \(\overline{z}_{15}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})\) are plotted. Figure 1, in addition, shows that using the adaptively modified \({\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)}\) (cf. (local)), the refinement is more concentrated at the jump set \(J_u\) of the exact solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\), (cf. (5.12)). However, in Fig. 3 it is seen that (local) does not result in an improved error decay, but an error decay comparable to (global). In addition, Fig. 3 demonstrates that Algorithm 5.4 improves the experimental convergence rate of about \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{4}}}\), where \(N_i{:}{=}dim (\mathcal {S}^{1,cr}_D(\mathcal {T}_i))\), \(i\in \mathbb {N}\), predicted by [4, 27] for uniform mesh-refinement to the quasi-optimal rate \(h_i\sim (N_i)^{\smash {-\frac{1}{2}}}\), \(i\in \mathbb {N}\), (cf. Remark 5.6 (below)). In addition, Fig. 3 indicates the primal-dual gap estimator is reliable and efficient with respect to the error quantity
$$\begin{aligned} \tilde{\rho }^2(\overline{u}_i^{\textit{cr}},\overline{z}_i^{\textit{rt}}){:}{=}\tfrac{\alpha }{2}\Vert \overline{u}_i^{\textit{cr}}-u\Vert ^2_{L^2(\Omega )}+\tfrac{1}{2\alpha }\Vert div \,\overline{z}_i^{\textit{rt}}-div \,z\Vert ^2_{L^2(\Omega )}\,,\quad i\in \mathbb {N}\,, \end{aligned}$$
(5.13)
which, due to Remark 3.3(iv), is a lower bound for sum of the optimal strong convexity measures.
3D Case. The initial triangulation \(\mathcal {T}_0\) of Algorithm 5.4 consists of 27 cubes each divided into six tetrahedrons. Using either the adaptively modified \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\) (cf. (local)) or the global choice \(\varepsilon _i{:}{=}h_i^2\) (cf. (global)), we report similar results to the 2D case: for both choices, a refinement towards the sphere \(\partial B_r^3(0)\), i.e., the jump set \(J_u\) of the exact solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) (cf. (5.12)), is reported, which can be seen in Figure 4, where the regularized, discrete primal solution \({u_{10}^{\textit{cr}}\hspace{-0.15em}\in \hspace{-0.15em}\mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_{10})}\) and the (local) \(L^2\)-projection onto element-wise affine functions of the modulus of the regularized, discrete dual solution \(z_{10}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_{10})\) are plotted. Figure 3 shows that the adaptive Algorithm 5.4 improves the experimental convergence rate of about \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{6}}}\), where \(N_i{:}{=}dim (\mathcal {S}^{1,cr}_D(\mathcal {T}_i))\), \(i\in \mathbb {N}\), predicted by [4, 27] for uniform mesh-refinement to the quasi-optimal rate \(h_i\sim (N_i)^{\smash {-\frac{1}{3}}}\), \(i\in \mathbb {N}\), (cf. Remark 5.6 (below)) (Fig. 5).
Remark 5.6
(A Comment on the optimality of linear convergence rates) In one dimension, the \(L^2\)-best-approximation error of the sign function on quasi-uniform partitions is of order \(\mathcal {O}(\smash {h^{\frac{1}{2}}})\) (cf. [3, Example 10.5]). More generally, using that the intersection \(BV(\Omega ) \cap L^\infty (\Omega )\) is contained in fractional Sobolev spaces \(W^{s,2}(\Omega )\) for all \(s<1/2\), (cf. [49, Lemma 38.1]), one cannot expect a higher convergence rate than \(\mathcal {O}(\smash {h^{\frac{1}{2}}})\) for generic, essentially bounded functions of bounded variation. For triangulations that are graded towards the jump sets of certain discontinuous functions with a quadratic grading strength, i.e., the local mesh-size satisfies \(h_T \sim h^2\) for all elements \(T\in \mathcal {T}_h\) at the discontinuity set, with the average mesh-size \(h\sim card (\mathcal {N}_h)^{\smash {-\frac{1}{d}}}\), a linear convergence rate \(\mathcal {O}(h)\) has been established in [11]. Since our error estimates not only bound squared \(L^2\)-errors but also control squares of \(L^p\)-norms of non-linear error quantities involving derivatives (cf.  [11, Remark 5.4]), a higher convergence rate than linear cannot be expected. In view of these aspects, the linear convergence rate \(\mathcal {O}(h)\) for the devised adaptive strategy is quasi-optimal.

5.4 Example Without Lipschitz Continuous Dual Solution

    We examine an example from [11]. In this example, we let \(\Omega =(-1.5,1.5)^2\), \(\Gamma _D=\partial \Omega \), \(r=\smash {\frac{1}{2}}\), \(\alpha =10\), and \(g=\chi _{B_r^2(r\textrm{e}_1)}-\chi _{B_r^2(-r\textrm{e}_1)}\in BV(\Omega )\cap L^\infty (\Omega )\). Then, the primal solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) and a dual solution \(z\in W^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^2)\), for a.e. \(x\in \Omega \) are defined by
$$\begin{aligned} u(x){:}{=}(1-\tfrac{2}{\alpha r})\,g(x)\,,\qquad z(x){:}{=}{\left\{ \begin{array}{ll} {\mp }\tfrac{x\mp r \textrm{e}_1}{r}& if \vert x{\mp } r \textrm{e}_1\vert < r\,,\\ {\mp }\tfrac{r(x\mp r \textrm{e}_1)}{\vert x{\mp } r \textrm{e}_1\vert ^2}& if \vert x{\mp } r \textrm{e}_1\vert \ge r\,. \end{array}\right. } \end{aligned}$$
(5.14)
Note that \(z\notin W^{1,\infty }(\Omega ;\mathbb {R}^2)\), so that we cannot refer to [4, 27] in order to expect uniform mesh-refinement to yield the convergence rate \(\mathcal {O}(h^{\frac{1}{2}})\). However, since \(z|_{\Omega ^{\pm }}\in W^{1,\infty }(\Omega ^{\pm };\mathbb {R}^2)\), where \(\Omega ^+\hspace{-0.1em}{:}{=}\hspace{-0.1em} \Omega \cap (\mathbb {R}_{>0}\times \mathbb {R})\) and \(\Omega ^-\hspace{-0.1em}{:}{=}\hspace{-0.1em} \Omega \cap (\mathbb {R}_{<0}\times \mathbb {R})\), and since the coarsest triangulation \(\mathcal {T}_0\) of Fig. 6 and, hence, also all resulting refinements \(\mathcal {T}_i\), \(i\in \mathbb {N}\), of \(\mathcal {T}_0\) resolve \(J_z{:}{=}\Omega \cap (\{0\}\times \mathbb {R})\), i.e., the jump set of \(z\in W^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^2)\), in the sense that \(J_z\subseteq \bigcup _{S\in \mathcal {S}_{h_i}}{S}\) for all \(i\in \mathbb {N}\), referring to [6, Theorem 4.5], we can expect uniform mesh-refinement to yield the convergence rate \(\mathcal {O}(h^{\smash {\frac{1}{2}}})\).
The coarsest triangulation \(\mathcal {T}_0\) of Fig. 6 (initial triangulation of Algorithm 5.4) consists of 16 halved squares. More precisely, Fig. 1 displays the triangulations \(\mathcal {T}_i\), \(i\!\in \! \{0,15,20,25\}\), generated by Algorithm 5.4 using either the adaptively modified \(\varepsilon _i\!\in \! \mathcal {L}^0(\mathcal {T}_i)\) (cf. (local)) or the global choice \(\varepsilon _i{:}{=}h_i^2\) (cf. (global)). For both choices, a refinement towards \({\partial B_r^2(r\textrm{e}_1)\cup \partial B_r^2(-r\textrm{e}_1)}\), i.e., the jump set \(J_u\) of the exact solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) (cf. (5.14)) is reported. This behavior is also seen in Fig. 7, where the regularized, discrete primal solution \(u_{15}^{\textit{cr}}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_{15})\), the (local) \(L^2\)-projection onto element-wise constant functions \(\Pi _{h_{15}} u_{15}^{\textit{cr}}\in \mathcal {L}^0(\mathcal {T}_{15})\), and the (local) \(L^2\)-projections onto element-wise affine functions of the moduli of the regularized, discrete dual solution \(z_{15}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})\) and of the scaled regularized, discrete dual solution \({\overline{z}_{15}^{\textit{rt}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})}\) are plotted. Figure 6, in addition, shows that employing the adaptively modified regularization parameter (cf. (local)), the refinement is more concentrated at the jump set \(J_u\) of the exact solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) (cf. (5.14)). However, in Fig. 8 it can be seen that (local) does not result in an improved error decay, but an error decay comparable to (global). In addition, Fig. 8 demonstrates that Algorithm 5.4 improves the experimental convergence rate of about \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{4}}}\), where \(N_i{:}{=}dim (\mathcal {S}^{1,cr}_D(\mathcal {T}_i))\), \(i\in \mathbb {N}\), predicted by [6, Theorem 4.5] for uniform mesh-refinement to the quasi-optimal rate \(h_i\sim (N_i)^{\smash {-\frac{1}{2}}}\), \(i\in \mathbb {N}\), (cf. Remark 5.6). In addition, Fig. 8 indicates the primal-dual error estimator is both reliable and efficient with respect to the error quantity (5.13).

5.5 Example with Lipschitz Continuous Primal Solution and Lipschitz Continuous Dual Solution

    We examine an example from [5]. In this example, we let \(\Omega =(-1.5,1.5)^2\), \(\Gamma _D=\partial \Omega \)\(\alpha =10\), \(s(t){:}{=}\sqrt{3t}\) and \(r(t){:}{=}\frac{1}{2}\sqrt{1-4t}\) for \(t=0.1\), and \(g\in BV(\Omega )\cap L^\infty (\Omega )\) for a.e. \(x\in \Omega \), be defined by
$$\begin{aligned} g(x){:}{=}{\left\{ \begin{array}{ll} 1 +\frac{2-\alpha (s(t)^2+t)}{s(t)}& \text { if }\vert x\vert \le s(t)\,,\\ 1 +\frac{1-\alpha (\vert x\vert ^2+t)}{\vert x\vert }& \text { if }s(t)<\vert x\vert \le r(t)\,,\\ 0& \text { else}\,. \end{array}\right. } \end{aligned}$$
Then, the primal solution \(u\in BV(\Omega )\cap L^\infty (\Omega )\) and a dual solution \(z\in W^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^2)\) with \(\vert z\vert \le 1\) a.e. in \(\Omega \), for a.e. \(x\in \Omega \) are defined by
$$\begin{aligned}&u(x){:}{=}{\left\{ \begin{array}{ll} 1 - \frac{s(t)^2+t}{s(t)}& \text { if }\vert x\vert \le s(t)\,,\\ 1 -\frac{\vert x\vert ^2+t}{\vert x\vert }& \text { if }s(t)<\vert x\vert \le r(t)\,,\\ 0& \text { else}\,, \end{array}\right. }\\&z(x){:}{=}{\left\{ \begin{array}{ll} -\frac{x}{s(t)}& \text { if }\vert x\vert \le s(t)\,,\\ -\frac{x}{\vert x\vert }& \text { if }s(t)<\vert x\vert \le r(t)\,,\\ -\frac{xr(t)}{\vert x\vert ^2}& \text { else}\,. \end{array}\right. } \end{aligned}$$
Note that \(z\in \smash {W^{1,\infty }(\Omega ;\mathbb {R}^2)}\), so that, according to [4, 27], uniform mesh-refinement is expected to yield the quasi-optimal convergence rate \(\mathcal {O}(\smash {h^{\frac{1}{2}}})\).
The coarsest triangulation \(\mathcal {T}_0\) of Fig. 9 (initial triangulation of Algorithm 5.4) consists of 16 halved squares. More precisely, Fig. 9 displays the triangulations \(\mathcal {T}_i\), \(i\!\in \! \{0,5,10,15\}\), generated by Algorithm 5.4 employing either \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\) (cf. (local)) or \(\varepsilon _i{:}{=}h_i^2\) (cf. (global)). For both choices, a refinement mainly towards and on the set \(\{\vert \nabla u\vert >0\}\) is reported. This is also seen in Fig. 10, where the regularized, discrete primal solution \(u_{15}^{{\text {cr}}}\in \mathcal {S}^{1,{\text {cr}}}_D(\mathcal {T}_{10})\), the (local) \(L^2\)-projection onto element-wise constant functions \(\Pi _{h_{10}} u_{10}^{\textit{cr}}\in \mathcal {L}^0(\mathcal {T}_{10})\), and the (local) \(L^2\)-projections onto element-wise affine functions of the modulus of the regularized, discrete dual solution \(z_{10}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{10})\) and of the scaled regularized, discrete dual solution \(\overline{z}_{10}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{10})\) are plotted. Figure 9 shows that employing the adaptively modified regularization parameter (cf. (local)), the refinement takes place at and on the set \(\{\vert \nabla u\vert >0\}\). However, in Fig. 11, again, it can be seen that (local) does not result in an improved error decay, but an error decay comparable to (global). In addition, Fig. 11 demonstrates that Algorithm 5.4 improves the experimental convergence rate of about \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{4}}}\), where \(N_i{:}{=}dim (\mathcal {S}^{1,cr}_D(\mathcal {T}_i))\), \(i\in \mathbb {N}\), predicted by [4, 27] for uniform mesh-refinement to the quasi-optimal rate \(h_i\sim (N_i)^{\smash {-\frac{1}{2}}}\), \(i\in \mathbb {N}\), (cf. Remark 5.6). In addition, Fig. 11 indicates the primal-dual error estimator is both reliable and efficient with respect to the error quantity (5.13).

5.6 Example Without Dirichlet Boundary Condition and Without Exact Solution

    We examine an example from [9, 13]. In this example, we let \(\Omega =(-1,1)^2\), \(r=\smash {\frac{1}{2}}\), \(\Gamma _D=\emptyset \), \(\alpha =100\), and \(g=\chi _{[-r,r]^2}\in BV(\Omega )\cap L^\infty (\Omega )\). Then, the primal solution and the dual solutions are not known. However, according to [27, Section 5.2], given the regularity of \(g\in BV(\Omega )\cap L^\infty (\Omega )\), we can expect the convergence rate \(\mathcal {O}(h^{\frac{1}{4}})\) using uniform mesh refinement.
The coarsest triangulation \(\mathcal {T}_0\) of Fig. 1 (initial triangulation of Algorithm 5.4) consists of 16 halved squares. More precisely, Fig. 12 displays the triangulations \(\mathcal {T}_i\)\({i\in \{0,15,20,25\}}\), generated by Algorithm 5.4 using either the adaptively modified \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\) (cf. (local)), or the global choice \(\varepsilon _i{:}{=}h_i^2\) (cf. (global)). For both choices, a refinement towards the square \(\partial [-r,r]^2\), i.e., the jump set \(J_g\) of the data \(g\in BV(\Omega )\cap L^\infty (\Omega )\) is reported. This behavior is also seen in Fig. 13, where the regularized, discrete primal solution \(u_{15}^{\textit{cr}}\in \mathcal {S}^{1,\textit{cr}}_D(\mathcal {T}_{15})\), the (local) \(L^2\)-projection onto element-wise constant functions \(\Pi _{h_{15}} u_{15}^{\textit{cr}}\in \mathcal {L}^0(\mathcal {T}_{15})\), and the (local) \(L^2\)-projections onto element-wise affine functions of the modulus of the regularized, discrete dual solution \(z_{15}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})\) and of the projected regularized, discrete dual solution \({\overline{z}_{15}^{{\text {rt}}}\in \mathcal {R}T^0_N(\mathcal {T}_{15})}\) are plotted. Figure 12, in addition, shows that using the adaptively modified \(\varepsilon _i\in \mathcal {L}^0(\mathcal {T}_i)\), cf. (local), the refinement is, again, more concentrated at the jump set \(J_g\) of the data \(g\in BV(\Omega )\cap L^\infty (\Omega )\). However, in Fig. 3 it can be seen that (local) does not result in an improved error decay, but an error decay comparable to (global). In addition, Fig. 14 demonstrates that Algorithm 5.4 improves the experimental convergence rate of about \((h_i)^{\smash {\frac{1}{4}}}\sim (N_i)^{\smash {-\frac{1}{8}}}\), where \(N_i{:}{=}dim (\mathcal {S}^{1,cr}_D(\mathcal {T}_i))\), \(i\in \mathbb {N}\), predicted by [27, Section 5.2] for uniform mesh-refinement to the value \((h_i)^{\smash {\frac{2}{5}}}\sim (N_i)^{\smash {-\frac{1}{5}}}\), \(i\in \mathbb {N}\). This, on the one hand, confirms the optimality of the a priori error estimates established in [27, Section 5.2] and, on the other hand, according to [4, 27], let us expect that there exists no Lipschitz continuous dual solution to the given data \(g=\chi _{[-r,r]^2}\in BV(\Omega )\cap L^\infty (\Omega )\). The reported reduced error decay of \((h_i)^{\smash {\frac{2}{5}}}\sim (N_i)^{\smash {-\frac{1}{5}}}\), \(i\in \mathbb {N}\), compared to [13], where an error decay of \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{4}}}\), \(i\in \mathbb {N}\), is reported, might only be pre-asymptotic and due to slight accuracy losses resulting due to the global scaling step. This might be due to potential singularities of a dual solution located at the corners of the square \(\partial [-r,r]^2\), as indicated in Fig. 13. Therefore, it is possible that the error decay \((h_i)^{\smash {\frac{1}{2}}}\sim (N_i)^{\smash {-\frac{1}{4}}}\), \(i\in \mathbb {N}\), in [13] may be reported after surpassing a potential pre-asymptotic regime.

5.7 Numerical Experiments with Application to Image Processing

    In order to benchmark the performance of the proposed numerical scheme (cf. Algorithms 5.1 and 5.4) in a problem related to image processing, we examine a standard example from the field of image processing (cf. Sect. 5.7.1) and a new example (cf. Sect. 5.7.2).

5.7.1 The Cameraman Image

    We examine the cameraman image, which in a similar context has been considered in [13]. In this example, we let \(\Omega = (0,1)^2\), \(\Gamma _D=\emptyset \), \(\alpha =1\textrm{e}{+}4\), and \(g\in BV(\Omega )\cap L^\infty (\Omega )\) a piece-wise constant function taking its values in the interval [0, 1], representing the cameraman image on a uniform triangulation with 66.049 nodes, cf. Fig. 15. The adaptive algorithm (cf. Algorithm 5.4), employed as coarsening strategy, reduces the number of nodes within 30 iteration steps to 25.059 nodes which corresponds to 38.0% of the initial number of nodes, which results in a squared \(L^2\)-error of \(\Vert u_{30}^{cr}-g\Vert _{L^2(\Omega )}^2\approx 2.211\textrm{e}{-}3\). The resulting coarsened image, represented by \({u_{30}^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_{30})}\), is shown in Fig. 15. The underlying grid \(\mathcal {T}_{30}\) shown in Fig. 16 reveals the expected coarsening of the triangulation away from the edges.

5.7.2 The Merle Image

    We examine an image of Merle, the male cat of the second author. In this example, we let \(\Omega =(0,1)^2\), \(\Gamma _D=\emptyset \), \(\alpha =1\textrm{e}{+}4\), and \(g\in BV(\Omega )\cap L^\infty (\Omega )\) a piece-wise constant function taking its values in the interval [0, 1], representing the Merle image on a uniform triangulation with 140.625 nodes, cf. Fig. 17. The adaptive algorithm (cf. Algorithm 5.4), employed as coarsening strategy, reduces the number of nodes within 30 iteration steps to 41.749 nodes which is 30.0% of the initial number of nodes, which results in a squared \(L^2\)-error of \(\Vert u_{30}^{cr}-g\Vert _{L^2(\Omega )}^2\approx 2.162\textrm{e}{-}3\). The resulting coarsened image, represented by \(u_{30}^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_{30})\), is shown in Fig. 17. The underlying grid \(\mathcal {T}_{30}\) shown in Fig. 18 reveals the expected coarsening of the triangulation away from the edges.

5.8 Influence of Noise

    If a given image contains noise, then the fidelity parameter \(\alpha >0\) determines which frequencies are eliminated in the continuous minimization process. The adaptive refinement procedure resolves the remaining noise once this becomes relevant in the discretized problem. As the exact solution may not belong to a suitable approximation class, optimal convergence cannot be expected to occur from that point on. This behavior is illustrated in Fig. 19, in which the error decay in the case of a noisy image \(g=\chi _{B_r^2(0)}+\xi \in L^2(\Omega )\), where \(r=0.5\) and \(\xi \in \mathcal {L}^0(\mathcal {T}^{\text {black}}_8)\) is defined by associating to each element \(T\in \mathcal {T}^{\text {black}}_8\) a random value \(\xi _T\in [-\frac{1}{5},\frac{1}{5})\) sampled from a uniform distribution over the half-open interval \([-\frac{1}{5},\frac{1}{5})\) and \(\mathcal {T}^{\text {black}}_8\) is the triangulation that results from applying eight times uniform refinement to the starting triangulation \(\mathcal {T}_0\) in Sect. 5.5. In addition, Fig. 20 indicates that this could be due to the adaptive Algorithm 5.4 being distracted by the noise \(\xi \in \mathcal {L}^0(\mathcal {T}^{\text {black}}_8)\) from the discontinuity set \(\partial B_r^2(0)\) at a certain point. In particular, note that, in the case (local), in Fig. 1 there was no refinement away from the discontinuity set.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://​creativecommons.​org/​licenses/​by/​4.​0/​.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Footnotes
1
Here, \(W^{\smash {-\frac{1}{p},p}}(\partial \Omega ){:}{=}(W^{\smash {1-\frac{1}{p'},p'}}(\partial \Omega ))^*\).
 
2
Here, \(C_c^\infty (\Omega ;\mathbb {R}^d)\) denotes the space of smooth and in \(\Omega \) compactly supported vector fields.
 
3
We refrain from being too precise concerning what we mean with approximation to allow for more flexibility. Assumptions on both \(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and \(\psi _h:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), \(h>0\), that imply, e.g., \(\Gamma \)-convergence results can be found in [4, Proposition 3.3].
 
Literature
16.
go back to reference H. H. Bauschke and P. L. Combettes, Convex analysis and monotone operator theory in hilbert spaces, in CMS Books in Mathematics, 2011. H. H. Bauschke and P. L. Combettes, Convex analysis and monotone operator theory in hilbert spaces, in CMS Books in Mathematics, 2011.
18.
go back to reference F. Bertrand and D. Boffi, The Prager-Synge theorem in reconstruction based a posteriori error estimation, in 75 years of mathematics of computation, Contemp. Math. 754, Amer. Math. Soc., [Providence], RI, [2020] 2020, pp. 45–67. https://doi.org/10.1090/conm/754/15152. F. Bertrand and D. Boffi, The Prager-Synge theorem in reconstruction based a posteriori error estimation, in 75 years of mathematics of computation, Contemp. Math. 754, Amer. Math. Soc., [Providence], RI, [2020] 2020, pp. 45–67. https://​doi.​org/​10.​1090/​conm/​754/​15152.
24.
go back to reference C. Carstensen, D. Günther, and H. Rabus, Mixed finite element method for a degenerate convex variational problem from topology optimization, SIAM J. Numer. Anal. 50 no. 2 (2012), 522–543 (English). https://doi.org/10.1137/100806837. C. Carstensen, D. Günther, and H. Rabus, Mixed finite element method for a degenerate convex variational problem from topology optimization, SIAM J. Numer. Anal. 50 no. 2 (2012), 522–543 (English). https://​doi.​org/​10.​1137/​100806837.
28.
go back to reference M. Crouzeix and P.-A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations. I, Rev. Française Automat. Informat. Recherche Opérationnelle Sér. Rouge 7 no. R-3 (1973), 33–75. M. Crouzeix and P.-A. Raviart, Conforming and nonconforming finite element methods for solving the stationary Stokes equations. I, Rev. Française Automat. Informat. Recherche Opérationnelle Sér. Rouge 7 no. R-3 (1973), 33–75.
29.
go back to reference B. Dacorogna, Direct methods in the calculus of variations, second ed., Applied Mathematical Sciences 78, Springer, New York, 2008. B. Dacorogna, Direct methods in the calculus of variations, second ed., Applied Mathematical Sciences 78, Springer, New York, 2008.
33.
44.
go back to reference P.-A. Raviart and J. M. Thomas, A mixed finite element method for 2nd order elliptic problems, in Mathematical aspects of finite element methods (Proc. Conf., Consiglio Naz. delle Ricerche (C.N.R.), Rome, 1975), 1977, pp. 292–315. Lecture Notes in Math., Vol. 606. P.-A. Raviart and J. M. Thomas, A mixed finite element method for 2nd order elliptic problems, in Mathematical aspects of finite element methods (Proc. Conf., Consiglio Naz. delle Ricerche (C.N.R.), Rome, 1975), 1977, pp. 292–315. Lecture Notes in Math., Vol. 606.
46.
go back to reference S. I. Repin, A posteriori error estimates for approximate solutions to variational problems with strongly convex functionals, J. Math. Sci. (New York) 97 no. 4 (1999), 4311–4328, Problems of mathematical physics and function theory. https://doi.org/10.1007/BF02365047. S. I. Repin, A posteriori error estimates for approximate solutions to variational problems with strongly convex functionals, J. Math. Sci. (New York) 97 no. 4 (1999), 4311–4328, Problems of mathematical physics and function theory. https://​doi.​org/​10.​1007/​BF02365047.
47.
48.
go back to reference M. Růžička and L. Diening, Non–Newtonian fluids and function spaces, in Nonlinear Analysis, Function Spaces and Applications, Proceedings of NAFSA 2006 Prague, 8, 2007, pp. 95–144. M. Růžička and L. Diening, Non–Newtonian fluids and function spaces, in Nonlinear Analysis, Function Spaces and Applications, Proceedings of NAFSA 2006 Prague, 8, 2007, pp. 95–144.
49.
go back to reference L. Tartar, An introduction to Sobolev spaces and interpolation spaces, Lecture Notes of the Unione Matematica Italiana 3, Springer, Berlin; UMI, Bologna, 2007. L. Tartar, An introduction to Sobolev spaces and interpolation spaces, Lecture Notes of the Unione Matematica Italiana 3, Springer, Berlin; UMI, Bologna, 2007.
Metadata
Title
Explicit A Posteriori Error Representation for Variational Problems and Application to TV-Minimization
Authors
Sören Bartels
Alex Kaltenbach
Publication date
18-10-2024
Publisher
Springer US
Published in
Foundations of Computational Mathematics
Print ISSN: 1615-3375
Electronic ISSN: 1615-3383
DOI
https://doi.org/10.1007/s10208-024-09676-5

Premium Partner