1 Introduction
The numerical analysis of the approximation of variational problems is challenging when these are non-differentiable, degenerate, or involve constraints. In particular, following established concepts for linear elliptic partial differential equations often leads to sub-optimal results only. The framework of convex duality provides an attractive concept to reveal hidden information and structures to obtain quasi-optimal error representation formulas under meaningful regularity conditions. Similar to [
45,
46], we first exploit this idea to derive explicit computable
a posteriori error estimates for a
natural error measure. Then, this general result is applied to a non-differentiable model problem with discontinuous solutions. As a whole, our results, similar to [
45,
46], show that the question of developing asymptotically exact
a posteriori error estimators is rather a question of identifying optimal error quantities. However, different from [
45,
46], we also propose a general approach for making our results practicable from a numerical point of view.
Given a domain
\(\Omega \subseteq \mathbb {R}^d\),
\({d\in \mathbb {N}}\), a proper, lower semi-continuous, and convex energy density
\(\phi :{\mathbb {R}^d}\rightarrow \mathbb {R}\cup \{+\infty \}\), a (Lebesgue) measurable energy density
\(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) that is proper, lower semi-continuous, and convex with respect to the second argument, and a Banach space
X consisting of functions defined in
\(\Omega \), we denote by the minimization of the energy functional
\(I:X\rightarrow \mathbb {R}\cup \{+\infty \}\), for every
\(v\in X\) defined by
$$\begin{aligned} I(v) {:}{=}\int _{\Omega }{\phi (\nabla v)\,\textrm{d}x} + \int _{\Omega }{\psi (\cdot , v)\,\textrm{d}x}\,, \end{aligned}$$
(1.1)
the
primal problem.
Its (Fenchel)
dual problem is given via the maximization of the energy functional
\(D:Y\rightarrow \mathbb {R}\cup \{-\infty \}\), where
Y is a Banach space consisting of vector fields defined in
\(\Omega \), for every
\(y\in Y\) is defined by
$$\begin{aligned} D(y) {:}{=}-\int _{\Omega }{\phi ^*(y)\,\textrm{d}x} - \int _{\Omega }{\psi ^*(\cdot , \textrm{div}\,y)\,\textrm{d}x}\,. \end{aligned}$$
(1.2)
Here,
\(\phi ^*:\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and
\(\psi ^*:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) (with respect to the second argument) denote the (Fenchel) conjugates of
\(\phi :{\mathbb {R}^d}\rightarrow \mathbb {R}\cup \{+\infty \}\) and
\(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), respectively. Under rather general conditions (
cf. [
33,
51]), we have the well-posedness of the primal problem and the dual problem, i.e., the existence of a minimizer
\(u\in X\) of (
1.1), i.e., a
primal solution, and of a maximizer
\(z\in Y\) of (
1.2), i.e., a
dual solution, and the
strong duality relation$$\begin{aligned} \min _{v\in X} I(v) = I(u)= D(z) = \max _{y\in Y} D(y)\,. \end{aligned}$$
(1.3)
Since
\(u\hspace{-0.1em}\in \hspace{-0.1em}X\) and
\(z\hspace{-0.1em}\in \hspace{-0.1em} Y\) are optimal for (
1.1) and (
1.2), respectively, it holds that
\(0\hspace{-0.1em}\in \hspace{-0.1em} \partial I(u)\) and
\({0\hspace{-0.1em}\in \hspace{-0.1em} \partial D(z)}\). In particular, for every
\(v\in X\) and
\(y\in Y\), the quantities
$$\begin{aligned} \rho _I^2(v,u)&{:}{=}I(v) - I(u)\,, \end{aligned}$$
(1.4)
$$\begin{aligned} \rho _{-D}^2(y,z)&{:}{=}D(z) - D(y)\,, \end{aligned}$$
(1.5)
are non-negative. They define distances, if (
1.1) and (
1.2), respectively, are strictly convex, and are called optimal strong convexity measures.
For admissible approximations
\(v\in X\) with
\(I(v)<+\infty \) and
\(y\in Y\) with
\(D(y)>-\infty \), given the definitions (
1.4) and (
1.5), the strong duality relation (
1.3) implies the
a posteriori error identity
$$\begin{aligned} \begin{aligned} \rho _I^2(v,u) + \rho _{-D}^2(y,z) = I(v) - D({y}) {=}{:}\eta ^2(v,y)\,. \end{aligned} \end{aligned}$$
(1.6)
Hence, the fully computable error estimator
\(\eta ^2:X\times Y\rightarrow \mathbb {R}\cup \{+\infty \}\),
cf. (
1.6), exactly represents the sum of the primal and dual approximation errors, i.e., of (
1.4) and (
1.5).
The error representation (
1.6) can be seen as a generalization of the Prager–Synge result (
cf. [
19,
20,
43]), which states that for the Poisson problem, i.e.,
\(\phi \hspace{-0.15em}{:}{=}\hspace{-0.15em} \frac{1}{2}\vert \cdot \vert ^2\hspace{-0.15em}\in \hspace{-0.15em} C^1(\mathbb {R}^d)\),
\({\psi \hspace{-0.15em}{:}{=}\hspace{-0.15em} ((t,x)^\top \hspace{-0.15em}\mapsto \hspace{-0.15em} -f(x)t):}\) \(\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\), where
\(f\in L^2(\Omega )\),
\(X{:}{=}W^{1,2}_D(\Omega )\), and
\({Y{:}{=}W^2_N(div ;\Omega )}\) (
cf. (
2.5)), for every
\({v\in W^{1,2}_D(\Omega )}\) and
\({y\in W^2_N(div ;\Omega )}\) with
\(div \,y=-f\) a.e. in
\(\Omega \), we have that
$$\begin{aligned} \begin{aligned} \tfrac{1}{2} \Vert \nabla v -\nabla u\Vert _{L^2(\Omega ;\mathbb {R}^d)}^2 + \tfrac{1}{2} \Vert y - z \Vert _{L^2(\Omega ;\mathbb {R}^d)}^2 = \tfrac{1}{2} \Vert \nabla v-y \Vert ^2_{L^2(\Omega ;\mathbb {R}^d)}\,. \end{aligned} \end{aligned}$$
(1.7)
The equation (
1.7) has been used by various authors to define error estimators; for a comprehensive list of references, we refer the reader to [
18]. Often, local procedures are devised to construct an admissible vector field
\(y\in W^2_N(div ;\Omega )\) with
\(-div \,y=f\) a.e. in
\(\Omega \) from a given function
\({v\in W^{1,2}_D(\Omega )}\). While this leads to efficient procedures to obtain accurate error estimators, the arguments cannot be expected to transfer to non-linear problems. Another alternative to computing approximations for the primal and dual problems consists in using finite element methods for which reconstruction formulas are available, e.g., using the discontinuous Crouzeix–Raviart finite element method and the Marini formula in the case of the Poisson problem (
cf. [
39]).
It has recently been found (
cf. [
4,
27]) that the discontinuous Crouzeix–Raviart finite element method leads to quasi-optimal a priori error estimates for non-linear and non-differentiable problems, while continuous finite element methods provide only a sub-optimal convergence behavior. In the derivation of those results, a general discrete convex duality theory with Raviart–Thomas vector fields has emerged that also leads to reconstruction formulas in rather general settings. As a consequence, given an approximation
\(v\in X\) or
\(y\in Y\), respectively, the missing one can be obtained via a simple post-processing procedure. Then, the pair leads to the error representation formula (
1.6). It should also be noted that neither
\(v\in X\) nor
\(y\in Y\) needs to be optimal in a subspace of
X or
Y. By introducing appropriate residuals, any pair of admissible approximations of
\(u\in X\) and
\(z\in Y\) can be used. This is particularly important for non-linear problems, i.e., non-quadratic functionals, where an exact solution of discrete problems is neither possible nor rational.
A difficulty in the application of the explicit a posteriori error representation formula (
1.6) arises from the condition that
\(v\in X\) and
\(y\in Y\) need to be admissible for the functionals (
1.1) and (
1.2). In the case of the Poisson problem, this arises, e.g., via element-wise constant approximations of
\(f\in L^2(\Omega )\) that are the images of Raviart–Thomas vector fields under the divergence operator. While data terms can be controlled by introducing appropriate data oscillation terms, structural peculiarities of the energy densities
\(\phi :\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and
\(\psi :\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) and their (Fenchel) conjugates
\(\phi ^*:\mathbb {R}^d\rightarrow \mathbb {R}\cup \{+\infty \}\) and
\(\psi ^*:\Omega \times \mathbb {R}\rightarrow \mathbb {R}\cup \{+\infty \}\) are often more challenging. We illustrate this by analyzing a non-differentiable problem which leads to a new error analysis and an adaptive refinement procedure for the computationally challenging minimization problem.
For a given function
\(g\in L^2(\Omega )\), i.e., the
noisy image, and a given parameter
\(\alpha >0\), i.e., the
fidelity parameter, the Rudin–Osher–Fatemi (ROF) model (
cf. [
47]) seeks a minimizer
\(u\in BV(\Omega )\cap L^2(\Omega )\), i.e., the
de-noised image, where
\(BV(\Omega )\) denotes the space of functions with bounded variation, for the functional
\(I:BV(\Omega )\cap L^2(\Omega )\rightarrow \mathbb {R}\), for every
\({v\in BV(\Omega )\cap L^2(\Omega )}\) defined by
$$\begin{aligned} I(v) {:}{=}\vert \textrm{D}v\vert (\Omega ) + \tfrac{\alpha }{2} \Vert v-g\Vert _{L^2(\Omega )}^2 \,, \end{aligned}$$
(1.8)
where
\(\vert \textrm{D}(\cdot )\vert (\Omega ):\hspace{-0.1em}BV(\Omega )\hspace{-0.1em}\rightarrow \hspace{-0.1em} [0,+\infty ]\) denotes the total variation functional. The functional (
1.8) can be seen as a special case of (
1.1) with
\(\phi \hspace{-0.15em}= \hspace{-0.15em}|\cdot |\hspace{-0.15em}\in \hspace{-0.15em} C^0(\mathbb {R}^d)\) and
\({\psi \hspace{-0.15em}=\hspace{-0.15em}((x,t)^\top \hspace{-0.15em}\mapsto \hspace{-0.15em} \frac{\alpha }{2}(t-g(x))^2):\hspace{-0.15em}\Omega \hspace{-0.15em}\times \hspace{-0.15em} \mathbb {R}\hspace{-0.15em}\rightarrow \hspace{-0.15em} \mathbb {R}}\). The (Fenchel) (pre-)dual problem to the minimization of the functional (
1.8) consists in the maximization of the functional
\(D:W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\rightarrow \mathbb {R}\cup \{-\infty \}\), for every
\({y\in W_N^2(div ;\Omega )} \) \(\cap L^\infty (\Omega ;\mathbb {R}^d)\) defined by
$$\begin{aligned} D(y) {:}{=}-\smash {I_{K_1(0)}^{\Omega }}(y)-\tfrac{1}{2\alpha } \Vert \textrm{div}\, y+\alpha g\Vert _{L^2(\Omega )}^2+\tfrac{\alpha }{2}\Vert g\Vert _{L^2(\Omega )}^2\,, \end{aligned}$$
(1.9)
where
\(\smash {I_{K_1(0)}^{\Omega }}(y){:}{=}0\) if
\(\vert y\vert \le 1\) a.e. in
\(\Omega \) and
\(\smash {I_{K_1(0)}^{\Omega }}(y){:}{=}+\infty \) else. The primal solution
\(u\in BV(\Omega )\) \(\cap L^2(\Omega )\), i.e., the unique minimizer of (
1.8), and a dual solution
\({z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)}\), i.e., a (possibly non-unique) maximizer of (
1.9), are (formally) related via (
cf. [
26, p. 284])
$$\begin{aligned} \begin{aligned} z&\in \left. {\left\{ \begin{array}{ll} \big \{\frac{\nabla u}{|\nabla u|}\big \}& \text { if }|\nabla u|>0\\ K_1(0)& \text { if }|\nabla u|=0 \end{array}\right. }\right\} & \quad \text { a.e. in }\Omega \,,\\ div \, z&= \alpha \, (u-g) & \quad \text { a.e. in }\Omega \,. \end{aligned} \end{aligned}$$
(1.10)
The relations (
1.10) determine
\(z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\) via
\(u\in BV(\Omega )\cap L^2(\Omega )\) and vice versa. A Crouzeix–Raviart finite element approximation of (
1.1) is given by the minimization of the regularized, discrete functional
\(I_{h,\varepsilon }^{cr}:\mathcal {S}^{1,cr}(\mathcal {T}_h)\rightarrow \mathbb {R}\),
\(h,\varepsilon >0\), for every
\(v_h\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) defined by
$$\begin{aligned} I_{h,\varepsilon }^{cr}(v_h) {:}{=}\Vert f_\varepsilon (\vert \nabla _{\!h} v_h\vert )\Vert _{L^1(\Omega )} + \tfrac{\alpha }{2} \Vert \Pi _h(v_h-g)\Vert _{L^2(\Omega )}^2 \,. \end{aligned}$$
Here,
\(\nabla _{\!h}\) is the element-wise application of the gradient operator and
\(f_\varepsilon \!\in \! \smash {C^1(\mathbb {R})}\) is a regularization of the modulus
\(\vert \cdot \vert \), and
\(\Pi _h\) denotes the (local)
\(L^2\)-projection onto element-wise constant functions. A quasi-optimal dual Raviart–Thomas vector field
\(z_{h,\varepsilon }^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) can be associated with a minimizing function
\(u_{h,\varepsilon }^{cr}\in \mathcal {S}^{1,cr}(\mathcal {T}_h)\) of
\(I_{h,\varepsilon }^{cr}:\mathcal {S}^{1,cr}(\mathcal {T}_h)\rightarrow \mathbb {R}\) via the reconstruction formula
$$\begin{aligned} z_{h,\varepsilon }^{rt} = \tfrac{f_\varepsilon '(\vert \nabla _{\!h} u_{h,\varepsilon }^{cr}\vert ) }{\vert \nabla _{\!h} u_{h,\varepsilon }^{cr}\vert }\nabla _{\!h} u_{h,\varepsilon }^{cr} + \alpha \tfrac{\Pi _h (u_{h,\varepsilon }^{cr} -g)}{d}\big ( \textrm{id}_{\mathbb {R}^d}- \Pi _h \textrm{id}_{\mathbb {R}^d}\big )\quad \text { in }\mathcal {R}T^0_N(\mathcal {T}_h)\,. \end{aligned}$$
(1.11)
For canonical choices of
\(f_\varepsilon \hspace{-0.15em}\in \hspace{-0.15em}C^1(\mathbb {R})\),
e.g.,
\(f_\varepsilon \hspace{-0.15em}=\hspace{-0.15em}\vert \cdot \vert _\varepsilon \hspace{-0.15em}=\hspace{-0.15em} ((\cdot )^2+\varepsilon ^2)^{1/2}\), it holds that
\(\vert \Pi _h z_{h,\varepsilon }^{rt}\vert \hspace{-0.15em}\le \hspace{-0.15em} 1\) a.e. in
\(\Omega \), but not
\(|z_{h,\varepsilon }^{rt}|\le 1\) a.e. in
\(\Omega \). Thus, we employ
\(f_\varepsilon = (1-\varepsilon )\, |\cdot |_\varepsilon \), so that
\(|f_\varepsilon '(t)|\le 1-\varepsilon \) for all
\(t\in \mathbb {R}\). The choice
\(\varepsilon \sim h^2\) in (
1.11) and an additional projection step onto
\(K_1(0)\) provides an accurate approximation
\({\overline{z}}_{h,\varepsilon }^{rt}\in \mathcal {R}T^0_N(\mathcal {T}_h)\) of
\(z\in W_N^2(div ;\Omega )\cap L^\infty (\Omega ;\mathbb {R}^d)\), which satisfies
\(|{\overline{z}}_{h,\varepsilon }^{rt}|\le 1\) a.e. in
\(\Omega \) and, thus, represents an admissible test function that leads to the definition of an error estimator. The resulting adaptive mesh-refinement procedure leads to significantly improved experimental convergence rates compared to recent related contributions (
cf. [
9,
11,
13]). More precisely, we report quasi-optimal linear convergence rates which have been obtained only for meshes with quadratic grading towards a sufficiently simple jump set of a piece-wise regular
g in [
11].
This article is organized as follows: In Sect.
2, we introduce the employed notation and the relevant finite element spaces. In Sect.
3, we propose a general approach for explicit
a posteriori error representation for convex minimization problems based on (discrete) convex duality relations. In Sect.
4, we apply the concepts of Sect.
3 to the Rudin–Osher–Fatemi model and propose a regularization scheme. In Sect.
5, we review our theoretical findings via numerical experiments.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.