Top

Jahresbericht der Deutschen Mathematiker-Vereinigung

Published in:

Open Access 06-05-2021 | Survey Article

An Introduction to Finite Element Methods for Inverse Coefficient Problems in Elliptic PDEs

Author: Bastian Harrach

Published in: Jahresbericht der Deutschen Mathematiker-Vereinigung | Issue 3/2021

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Patentsearch

Off

Abstract

Several novel imaging and non-destructive testing technologies are based on reconstructing the spatially dependent coefficient in an elliptic partial differential equation from measurements of its solution(s). In practical applications, the unknown coefficient is often assumed to be piecewise constant on a given pixel partition (corresponding to the desired resolution), and only finitely many measurement can be made. This leads to the problem of inverting a finite-dimensional non-linear forward operator $\mathcal{F}:\ \mathcal{D}(\mathcal{F})\subseteq \mathbb{R}^{n}\to \mathbb{R}^{m}$, where evaluating ℱ requires one or several PDE solutions.

Numerical inversion methods require the implementation of this forward operator and its Jacobian. We show how to efficiently implement both using a standard FEM package and prove convergence of the FEM approximations against their true-solution counterparts. We present simple example codes for Comsol with the Matlab Livelink package, and numerically demonstrate the challenges that arise from non-uniqueness, non-linearity and instability issues. We also discuss monotonicity and convexity properties of the forward operator that arise for symmetric measurement settings.

This text assumes the reader to have a basic knowledge on Finite Element Methods, including the variational formulation of elliptic PDEs, the Lax-Milgram-theorem, and the Céa-Lemma. Section 3 also assumes that the reader is familiar with the concept of Fréchet differentiability.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 Introduction

Many practical reconstruction problems in the field of medical imaging and non-destructive testing lead to inverse coefficient problems in elliptic partial differential equations. This text is meant to be an introductory tutorial for implementing such problems with Finite Element Methods (FEM).

We assume that the unknown coefficient is piecewise-constant on a given resolution, and that finitely many linear measurements of one of several solutions are taken, where different solutions are generated by different linear excitation in the underlying physics model. This leads to the finite-dimensional non-linear inverse problem of determining

$$ \sigma \in \mathbb{R}^{n} \quad \text{ from } \quad \mathcal{F}( \sigma )\in \mathbb{R}^{m} $$

with $n\in \mathbb{N}$ unknowns and $m\in \mathbb{N}$ measurements.

Iterative numerical solution methods for this inverse problem require evaluating ℱ and its derivatives at each iteration step, which means solving the underlying elliptic PDE. In this work, we will demonstrate how FEM-based implementations for ℱ and its Jacobian can be obtained very efficiently from standard FEM-solvers for the considered elliptic PDE. Roughly speaking, the sensitivity of a measurement with respect to changing the coefficient in one pixel can be simply calculated by multiplying FEM-solutions corresponding to the measurement and excitation patterns with so-called pixel stiffness matrices that are obtained from summing up all element stiffness matrices of elements belonging to the pixel where the change occurs. Hence, the FEM-based Jacobian can be obtained without any additional computational burden with just a few lines of extra code. Alternatively, for an even simpler implementation, the pixel stiffness matrices can be easily obtained by subtracting global stiffness matrices without requiring any knowledge about the triangulation details.

This text is meant as a simple-to-read explanation of this approach in a sufficiently general but naturally arising setting. More precisely, we restrict ourselves to coercive and symmetric variational formulations that linearly depend on the unknown coefficients, and to excitations and measurements that correspond to linear functionals. In this setting, we demonstrate how to obtain the Jacobian of the FEM-based forward map with the means of a standard FEM software package such as COMSOL. We also discuss monotonicity and convexity properties arising in symmetric measurement situations that are the basis for recent research on rigorously justified reconstruction methods.

The purpose of this text is of introductory nature, but we proceed in a mathematically rigorous fashion to allow this text to also serve as a reference. We prove differentiability of the true-solution forward operator and its FEM-based approximation, and show convergence of the FEM-approximated quantities to their true-solution counterparts.

Section 2 gives two examples to motivate our general setting: stationary diffusion and Elecrical Impedance Tomography. Section 3 introduces the forward operator using the exact PDE solution and derives its properties. The FEM-approximation of the forward operator and its Jacobian is studied in Sect. 4. In Sect. 5 we show numerical examples and demonstrate some of the major challenges that arise in solving inverse coefficient problems. The COMSOL/MATLAB source codes for all examples are given in the Appendix.

2 Motivation and Examples

2.1 Stationary Diffusion

We consider the stationary diffusion equation

$$ -\nabla \cdot (\sigma \nabla u)=g \quad \text{ in $\Omega $} $$

(1)

with homogeneous Dirichlet boundary condition $u|_{\partial \Omega }=0$ in a Lipschitz bounded domain $\Omega \subset \mathbb{R}^{d}$, $d\in \mathbb{N}$. For $u\in H^{1}(\Omega )$, $\sigma \in L^{\infty }_{+}(\Omega )$ and $g\in L^{2}(\Omega )$ the equation is equivalent to finding $u\in H_{0}^{1}(\Omega )$ with

$$ \int _{\Omega }\sigma \nabla u \cdot \nabla v \,\mathrm{d} x= \int _{\Omega }g v \,\mathrm{d} x\quad \text{ for all } v\in H_{0}^{1}( \Omega ), $$

(2)

and unique solvability follows from the Lax-Milgram theorem.

We are interested in the inverse coefficient problem of determining the diffusivity coefficient $\sigma $ in (1) from measurements of the solution for one or several source terms $g$, cf. [9] for an application in groundwater filtration. In practical applications with finitely many measurements, it is natural to only aim for a certain pixel-based resolution and therefore assume that $\sigma $ is piecewise constant with respect to a partition $\Omega =\bigcup _{i=1}^{n} \mathcal{P}_{i}$, i.e.

$$ \sigma (x)=\sum _{i=1}^{n} \sigma _{i} \chi _{\mathcal{P}_{i}}(x) \quad \text{ for all } x\in \Omega , $$

where the pixels $\mathcal{P}_{i}\subseteq \Omega $ are assumed to be measurable subsets. The left image in Fig. 1 shows a simple example where the unit square $\Omega =(0,1)^{2}$ is divided into $3\times 3$ pixels. In the following, with a slight abuse of notation, we write $\sigma =(\sigma _{1},\ldots ,\sigma _{n})^{T}\in \mathbb{R}^{n}$ for the unknown diffusivity.

The source term $g$ in the diffusion model (1) can be identified with the linear functional on the right hand side of the variational formulation (2)

$$ l\in H^{-1}(\Omega ),\quad l(v):= \int _{\Omega }g v \,\mathrm{d} x, $$

which corresponds to identifying $L^{2}(\Omega )$ with a subset of $H^{-1}(\Omega )$. Accordingly, we consider excitations in the form of linear functionals. Also, to emphasize that the solution depends on the diffusion coefficient and the excitation, we write $u_{\sigma }^{l}$ in the following. The left image in Fig. 2 illustrates the concentration resulting from a constant source term $g=\chi _{D}$, i.e. $l(v)=\int _{D} v\,\mathrm{d} x$, where $D=D_{2}\cup D_{4}\cup D_{5}\cup D_{7}$ is a union of four circular subdomains as sketched in the right image of Fig. 1. The right image in Fig. 2 shows the corresponding plot for $D=D_{1}\cup D_{3}\cup D_{6}\cup D_{8}$. Both images show the solution of (1) with constant diffusion coefficient $\sigma =1$.

Natural models for measuring the solution of (1) also yield to linear functionals. Measuring the total concentration in one of the circular subdomains $D_{j}$ corresponds to measuring $r(u):=\int _{D_{j}} u \,\mathrm{d} x$. Hence, the inverse problem of determining finitely many information about the diffusivity coefficient from finitely many measurements of the concentration (possibly but not necessarily resulting from different excitations) leads to the finite-dimensional inverse problem to

$$\begin{aligned} \text{determine} \quad \sigma \in \mathbb{R}^{n}_{+} \quad \text{ from } \quad \mathcal{F}(\sigma )\in \mathbb{R}^{m}, \end{aligned}$$

where

$$ \mathcal{F}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m}, \quad \mathcal{F}(\sigma ):=(r_{j}(u_{\sigma }^{l_{j}}))_{j=1}^{m}, $$

and $u_{\sigma }^{l_{j}}\in H_{0}^{1}(\Omega )$ solves

$$ \sum _{i=1}^{n} \sigma _{i} b_{i}(u_{\sigma }^{l_{j}},v)=l_{j}(v) \quad \text{ for all } v\in H_{0}^{1}(\Omega ), $$

with given $l_{j},r_{j}\in H^{-1}(\Omega )$, $j=1,\ldots ,m$, and

$$ b_{i}(u,v):=\int _{\mathcal{P}_{i}} \nabla u\cdot \nabla v \,\mathrm{d} x, \quad i=1,\ldots ,n. $$

2.2 Electrical Impedance Tomography (EIT)

We give another example for an application that leads to an inverse elliptic coefficient problem in a similar form as the diffusion example.

EIT aims to image the inner conductivity structure of a subject by current and voltage measurements through electrodes attached to the imaging subject. Let $\Omega \subseteq \mathbb{R}^{d}$, $d\in \{2,3\}$, be a smoothly bounded domain denoting the imaging subject. The electrodes $\mathcal{E}_{k}$, $k=1,\ldots ,K$, are assumed to be open connected subsets of $\partial \Omega $ with disjoint closures.

When currents with strength $J=(J_{1},\ldots ,J_{K})\in \mathbb{R}^{K}$ are driven through the $K$ electrodes (with $\sum _{k=1}^{K} J_{k}=0$), the resulting electric potential $u\in H^{1}(\Omega )$ inside $\Omega $, and the potential $U\in \mathbb{R}^{K}$ on the electrodes, solve

\begin{aligned} \nabla \cdot (σ \nabla u) & = 0 & in Ω, \\ σ \partial_{ν} u & = 0 & on \partial Ω ∖ ⋃_{k = 1}^{K} E_{k}, \\ u + z σ \partial_{ν} & u = const. = : U_{k} & on E_{k}, k = 1, \dots, K, \\ \int_{E_{k}} σ \partial_{ν} u |_{E_{k}} d s & = J_{k} & on E_{k}, k = 1, \dots, K, \end{aligned}

where $\sigma \in L^{\infty }_{+}(\Omega )$ is the conductivity inside $\Omega $, and $z>0$ is the contact impedance of the electrodes.

Under the gauge condition $U\in \mathbb{R}_{\diamond }^{K}:=\{ V\in \mathbb{R}^{K}:\ \sum _{k=1}^{K} V_{k}=0\}$, one can show (see [19]) that this so-called complete electrode model (CEM) for EIT is equivalent to the variational formulation that $(u,U)\in H^{1}(\Omega )\times \mathbb{R}^{K}_{\diamond }$ solves

$$ \int _{\Omega }\sigma \nabla u\cdot \nabla w \,\mathrm{d} x+ \sum _{k=1}^{K} \int _{\mathcal{E}_{k}} \frac{1}{z}(u-U_{k})(w-W_{k})\,\mathrm{d} s =\sum _{k=1}^{K} J_{k} W_{k} $$

(3)

for all $(w,W)\in H^{1}(\Omega )\times \mathbb{R}^{K}_{\diamond }$, and unique solvability follows from the Lax-Milgram theorem.

We assume that $z>0$ is known, and that $\sigma (x)=\sum _{i=1}^{n} \sigma _{i} \chi _{\mathcal{P}_{i}}(x)$ is piecewise constant with respect to a pixel partition $\Omega =\bigcup _{i=1}^{n} \mathcal{P}_{i}$, and write $\sigma =(\sigma _{1},\ldots ,\sigma _{n})\in \mathbb{R}^{n}$ for the unknown conductivity values inside $\Omega $.

The applied current patterns $J=(J_{1},\ldots ,J_{K})\in \mathbb{R}^{K}$ can be identified with the functional

$$ l\in H',\quad l(w,W):=\sum _{k=1}^{K} J_{k} W_{k} \quad \text{ for all } (w,W)\in H:=H^{1}(\Omega )\times \mathbb{R}^{K}_{\diamond }. $$

Likewise, measuring the voltage between the $k_{1}$-th and the $k_{2}$-th electrode corresponds to measuring the linear functional

$$ r\in H', \quad r(u,U):=U_{k_{1}}-U_{k_{2}}, $$

of the solution $(u,U)\in H$ generated by some current pattern.

Hence, the problem of determining the interior conductivity with a fixed finite resolution from finitely many voltage-current measurements in EIT (with CEM) leads to the finite-dimensional inverse problem to

$$\begin{aligned} \text{determine} \quad \sigma \in \mathbb{R}^{n}_{+} \quad \text{ from } \quad \mathcal{F}(\sigma )\in \mathbb{R}^{m}, \end{aligned}$$

where $\mathcal{F}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m}$, $\mathcal{F}(\sigma ):=(r_{j}(u_{\sigma }^{l_{j}},U_{\sigma }^{l_{j}}))_{j=1}^{m}$, and $(u_{\sigma }^{l_{j}},U_{\sigma }^{l_{j}})\in H$ solves

$$\begin{aligned} b_{0}((u_{\sigma }^{l_{j}},U_{\sigma }^{l_{j}}),(w,W))+ \sum _{i=1}^{n} \sigma _{i} \, b_{i}((u_{\sigma }^{l_{j}},U_{\sigma }^{l_{j}}),(w,W)) = l_{j}(w,W) \end{aligned}$$

for all $(w,W)\in H$, with given $l_{j},r_{j}\in H'$, $j=1,\ldots ,m$, and

$$\begin{aligned} b_{0}((u,U),(w,W))&:=\sum _{k=1}^{K} \int _{\mathcal{E}_{k}} \frac{1}{z}(u-U_{k})(w-W_{k})\,\mathrm{d} s, \\ b_{i}((u,U),(w,W))&:=\int _{\mathcal{P}_{i}} \nabla u\cdot \nabla w \,\mathrm{d} x. \end{aligned}$$

Clearly, one could also extend this formulation to cover the case of unknown contact impedances.

3 The True-Solution Setting

The examples in Sect. 2 lead to inverse problems for a finite-dimensional non-linear forward operator $\mathcal{F}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m}$, where evaluations of ℱ require solving an infinite-dimensional linear problem (the PDE). In this section, we will first derive some properties of ℱ for the case that it is defined with the true infinite-dimensional PDE solution. The properties of the operator $F\approx \mathcal{F}$, that is defined with a FEM-approximation of the PDE solution, will be studied in Sect. 4.

3.1 The True-Solution Forward Operator and Its Derivative

We will study problems that appear in the variational formulation of elliptic PDEs with piecewise constant coefficients on a fixed pixel partition, as in the examples in Sect. 2.

The Variational Setting

Let $H$ be a Hilbert space. We consider the problem of finding $u\in H$ that solves

$$ b_{\sigma }(u,v)=l(v), $$

(4)

where $b_{\sigma }:\ H\times H\to \mathbb{R}$ is a bilinear form, and $l\in H'=\mathcal{L}(H,\mathbb{R})$. $b_{\sigma }$ is assumed to linearly depend on $n$ parameters $\sigma =(\sigma _{1},\ldots ,\sigma _{n})\in \mathbb{R}^{n}$ in the following way

$$ b_{\sigma }(u,v)=b_{0}(u,v)+\sum _{i=1}^{n} \sigma _{i} b_{i}(u,v), $$

where $b_{0},\ b_{i}:\ H\times H\to \mathbb{R}$ are bounded, symmetric and positive semidefinite bilinear forms. Writing

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq75_HTML.gif

, we also assume that

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq76_HTML.gif

is bounded and coercive with constants $\beta ,C>0$, i.e.,

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_Equo_HTML.png

Clearly, this yields that for all $\sigma \in \mathbb{R}^{n}_{+}$

$$ C\max \{1,\sigma _{1},\ldots ,\sigma _{n}\} \hspace{0.2ex}\|v\|\hspace{0.2ex}^{2} \geq b_{\sigma }(v,v) \geq \beta \min \{1,\sigma _{1},\ldots ,\sigma _{n}\} \hspace{0.2ex}\|v\|\hspace{0.2ex}^{2} \quad \forall v\in H, $$

(5)

so that $b_{\sigma }$ is symmetric, bounded and coercive. Here and in the following $\mathbb{R}^{n}_{+}$ denotes the set of all $\sigma \in \mathbb{R}^{n}$ with $\sigma >0$ and “>” and “≥” are understood elementwise on $\mathbb{R}^{n}$.

The True-Solution Forward Operator

We now characterize the derivative of the solution of (4) with respect to $\sigma $.

Lemma 1

Let $l\in H'$. The solution operator

$$ \mathcal{S}:\ \mathbb{R}^{n}_{+}\to H, \quad \mathcal{S}(\sigma ):=u_{\sigma }^{l}, \quad \textit{ where $u_{\sigma }^{l}\in H$ solves (4),} $$

is infinitely often Fréchet differentiable. Its first derivative

$$ S':\ \mathbb{R}^{n}_{+}\to \mathcal{L}(\mathbb{R}^{n},H) $$

fulfills that, for all $\sigma \in \mathbb{R}^{n}_{+}$ and $\tau \in \mathbb{R}^{n}$, $S'(\sigma )\tau \in H$ is the unique solution of

$$ b_{\sigma }( S'(\sigma )\tau , w) = -\sum _{i=1}^{n} \tau _{i} b_{i}(u_{\sigma }^{l} , w ) \quad \forall w\in H. $$

Also, for $r\in H'$, $\sigma \in \mathbb{R}^{n}_{+}$, and $\tau \in \mathbb{R}^{n}$,

$$ r(u_{\sigma }^{l})=b_{\sigma }(u_{\sigma }^{l}, u_{\sigma }^{r}) \quad \textit{ and } \quad r\left ( \mathcal{S}'(\sigma )\tau \right )=-\sum _{i=1}^{n} \tau _{i} b_{i}(u_{\sigma }^{l}, u_{\sigma }^{r}). $$

Proof

For $\sigma \in \mathbb{R}^{n}_{+}$, the Riesz theorem yields that there exists a unique operator $\mathcal{B}(\sigma )\in \mathcal{L}(H,H')$ associated to the bilinear form $b_{\sigma }(\cdot ,\cdot )$, i.e.

$$ \langle \mathcal{B}(\sigma ) u,v \rangle _{H'\times H}=b_{\sigma }(u,v) \quad \text{ for all } u,v\in H. $$

Clearly, $\mathcal{B}(\sigma )$ is symmetric and, by the Lax-Milgram theorem, $\mathcal{B}(\sigma )$ is invertible with symmetric inverse $\mathcal{B}(\sigma )^{-1}\in \mathcal{L}(H',H)$. Hence, (4) is uniquely solvable, and the solution operator $\mathcal{S}$ is well-defined.

It is easily checked, that $\mathcal{B}(\sigma )$ is Fréchet differentiable for every $\sigma \in \mathbb{R}^{n}_{+}$, and that its derivative $\mathcal{B}'(\sigma )\in \mathcal{L}(\mathbb{R}^{n},\mathcal{L}(H,H'))$ is given by

$$ \mathcal{B}'(\sigma )\tau =\sum _{i=1}^{n} \tau _{i} \mathcal{B}_{i} \quad \text{ for all } \sigma \in \mathbb{R}^{n}_{+},\ \tau \in \mathbb{R}^{n}, $$

where $\mathcal{B}_{i}\in \mathcal{L}(H,H')$ is the unique operator fulfilling

$$ (\mathcal{B}_{i} u,v)_{H'\times H}=b_{i}(u,v)\quad \text{ for all } u,v \in H. $$

Since $\mathcal{B}'(\sigma )$ does not depend on $\sigma $, this also shows that $\mathcal{B}(\sigma )$ is infinitely often Fréchet differentiable with all second and higher derivatives being zero.

Using the derivative of operator inversion and the product and chain rule for the Fréchet derivative, we thus obtain that $\mathcal{S}(\sigma )$ is infinitely often Fréchet differentiable with

$$\begin{aligned} \mathcal{S}'(\sigma )\tau &=-\mathcal{B}(\sigma )^{-1} (\mathcal{B}'( \sigma )(\tau )) \mathcal{B}(\sigma )^{-1} l=-\sum _{i=1}^{n} \tau _{i} \mathcal{B}(\sigma )^{-1} \mathcal{B}_{i} u_{\sigma }^{l}. \end{aligned}$$

Hence, $v=\mathcal{S}'(\sigma )\tau \in H$ solves

$$\begin{aligned} b_{\sigma }( v, w) &= -\sum _{i=1}^{n} \langle \tau _{i} \mathcal{B}_{i} u_{\sigma }^{l} , w \rangle _{H'\times H} = -\sum _{i=1}^{n} \tau _{i} b_{i}(u_{\sigma }^{l} , w ) \quad \forall w\in H. \end{aligned}$$

Moreover, we obtain for all $r\in H'$, by using the symmetry of $\mathcal{B}(\sigma )$,

$$\begin{aligned} r( u_{\sigma }^{l} )=\langle \mathcal{B}(\sigma ) \mathcal{B}(\sigma )^{-1} r, u_{\sigma }^{l} \rangle _{H'\times H} = b_{\sigma }( u_{\sigma }^{l} , u_{\sigma }^{r}), \end{aligned}$$

and

$$\begin{aligned} r\left ( \mathcal{S}'(\sigma )\tau \right ) &= \langle r, \mathcal{S}'( \sigma )\tau \rangle _{H'\times H} = \langle \mathcal{B}(\sigma ) \mathcal{S}'(\sigma )\tau , \mathcal{B}(\sigma )^{-1} r\rangle _{H' \times H} \\ &= -\sum _{i=1}^{n} \tau _{i} \langle \mathcal{B}_{i}\, u_{\sigma }^{l}, u_{\sigma }^{r},\rangle _{H'\times H} = -\sum _{i=1}^{n} \tau _{i} b_{i}(u_{\sigma }^{l}, u_{\sigma }^{r}), \end{aligned}$$

which finished the proof. □

Corollary 1

Let $l,r\in H'$. Then the mapping

$$ \mathcal{F}_{l,r}:\ \mathbb{R}^{n}_{+}\to \mathbb{R},\quad \mathcal{F}_{l,r}(\sigma ):=r(u_{\sigma }^{l}) $$

fulfills

$$ \mathcal{F}_{l,r}(\sigma )=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{r}) \quad \textit{ for all $\sigma \in \mathbb{R}^{n}_{+}$.} $$

Moreover, $\mathcal{F}_{l,r}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}$ is infinitely often differentiable and its first derivatives fulfill

$$ \frac{\partial }{\partial \sigma _{i}} \mathcal{F}_{l,r}(\sigma )=-b_{i}(u_{\sigma }^{l},u_{\sigma }^{r}). $$

Proof

This follows from Lemma 1. □

3.2 Convexity and Monotonicity for Symmetric Measurements

A special mathematical structure appears for measurements $\mathcal{F}_{l,r}$, when $l$ and $r$ are taken from the same subset of $H'$, and all combinations are used. In the stationary diffusion example this corresponds to using the same subsets of $\Omega $ both for excitations and concentration measurements, in EIT this corresponds to using the same electrodes for voltage and current measurements.

Given a set of $m\in \mathbb{N}$ excitations/measurements $\{l_{1},\ldots ,l_{m}\}\subset H'$, we combine the measurements into a matrix-valued map $\mathcal{F}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m\times m}$

$$ \mathcal{F}(\sigma )=(\mathcal{F}_{j,k}(\sigma ))_{j,k=1,\ldots ,m} \in \mathbb{R}^{m\times m}, \quad \mathcal{F}_{j,k}(\sigma )= \mathcal{F}_{l_{j},l_{k}}(\sigma )=l_{k}(u_{\sigma }^{l_{j}}). $$

As before, we write “≥” for the elementwise order on $\mathbb{R}^{n}$. We also write $\mathbb{S}_{m}\subseteq \mathbb{R}^{m\times m}$ for the subset of symmetric $m\times m$-matrices, and “⪰” for the Loewner order on $\mathbb{S}_{m}$, i.e. $B\succeq A$ denotes that $B-A$ is positive semi-definite.

Lemma 2

$\mathcal{F}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m\times m}$ has the following properties:

(a)

ℱ is infinitely often differentiable.

(b)

For all $\sigma \in \mathbb{R}^{n}_{+}$, $\mathcal{F}(\sigma )\in \mathbb{S}_{m}$ and $\mathcal{F}(\sigma ) \succeq 0$. $\mathcal{F}(\sigma )$ is positive definite if $l_{1},\ldots ,l_{m}\in H'$ are linearly independent.

(c)

ℱ is monotonically non-increasing, i.e.

$$\begin{aligned} \mathcal{F}'(\sigma )\tau \preceq 0 \quad \textit{ for all } \quad \sigma \in \mathbb{R}^{n}_{+},\ 0\leq \tau \in \mathbb{R}^{n}, \end{aligned}$$

(6)

and for all $\sigma ^{(1)},\sigma ^{(2)}\in \mathbb{R}^{n}_{+}$

$$\begin{aligned} \sigma ^{(1)}\leq \sigma ^{(2)} \quad \textit{ implies } \quad \mathcal{F}(\sigma ^{(1)})\succeq \mathcal{F}(\sigma ^{(2)}). \end{aligned}$$

(7)

(d)

ℱ is convex, i.e., for all $\sigma ,\sigma ^{(0)}\in \mathbb{R}^{n}_{+}$,

$$\begin{aligned} \mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(0)}) \succeq \mathcal{F}'(\sigma ^{(0)})(\sigma -\sigma ^{(0)}), \end{aligned}$$

(8)

and, for all $t\in [0,1]$,

$$\begin{aligned} \mathcal{F}((1-t) \sigma ^{(0)}+t \sigma )\preceq (1-t) \mathcal{F}( \sigma ^{(0)})+t \mathcal{F}(\sigma ). \end{aligned}$$

(9)

Proof

Corollary 1 shows that each component of ℱ is infinitely often differentiable so that (a) is proven.

For the rest of the proof let $\sigma \in \mathbb{R}^{n}_{+}$, $g\in \mathbb{R}^{m}$, and set $l:=\sum _{j=1}^{m} g_{j} l_{j}$. By Corollary 1,

$$ l_{k}(u_{\sigma }^{l_{j}})=b_{\sigma }(u_{\sigma }^{l_{j}},u_{\sigma }^{l_{k}})=b_{\sigma }(u_{\sigma }^{l_{k}},u_{\sigma }^{l_{j}})=l_{j}(u_{\sigma }^{l_{k}}), $$

so that $\mathcal{F}(\sigma )$ is a symmetric matrix. Moreover,

$$ g^{T} \mathcal{F}(\sigma ) g=\sum _{j,k=1}^{m} g_{j} l_{k}(u_{\sigma }^{l_{j}}) g_{k} = \sum _{j,k=1}^{m} g_{j} g_{k} b_{\sigma }(u_{\sigma }^{l_{j}},u_{\sigma }^{l_{k}})=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{l})\geq 0, $$

so that $\mathcal{F}(\sigma)\succeq 0$. If $g\neq 0$ and $l_{1},\ldots ,l_{m}\in H'$ are linearly independent then $l\neq 0$, which implies $u_{\sigma }^{l}\neq 0$ and thus $g^{T} \mathcal{F}(\sigma ) g>0$. Hence, (b) is proven.

To prove (c) and (d), we start by using again Corollary 1 and obtain

$$ g^{T} (\mathcal{F}'(\sigma )\tau ) g=-\sum _{i=1}^{n} \tau _{i} b_{i}(u_{\sigma }^{l},u_{\sigma }^{l}) \quad \text{ for all } \tau \in \mathbb{R}^{n}. $$

Since the bilinear forms $b_{i}(\cdot ,\cdot )$ are positive semi-definite, this implies (6).

To prove (8), let $\sigma ^{(0)}\in \mathbb{R}^{n}_{+}$. For brevity we write $u_{0}^{l}:=u_{\sigma _{0}}^{l}$. Using

$$ b_{\sigma }(u_{\sigma }^{l},u_{0}^{l})=l(u_{0}^{l})=b_{\sigma _{0}}(u_{0}^{l},u_{0}^{l}), $$

we obtain that

$$\begin{aligned} 0&\leq b_{\sigma }(u_{\sigma }^{l}-u_{0}^{l},u_{\sigma }^{l}-u_{0}^{l})=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{l})-2b_{\sigma }(u_{\sigma}^{l},u_{0}^{l})+b_{\sigma }(u_{0}^{l},u_{0}^{l}) \\ &=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{l})-2b_{\sigma _{0}}(u_{0}^{l},u_{0}^{l})+b_{\sigma }(u_{0}^{l},u_{0}^{l}) \\ &= g^{T} (\mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(0)})) g +b_{\sigma }(u_{0}^{l},u_{0}^{l})-b_{\sigma _{0}}(u_{0}^{l},u_{0}^{l}) \\ &= g^{T} (\mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(0)}) g + \sum _{i=1}^{n} (\sigma _{i}-\sigma _{i}^{(0)}) b_{i}(u_{0}^{l},u_{0}^{l}). \end{aligned}$$

This shows that

$$ g^{T} (\mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(0)})) g\geq - \sum _{i=1}^{n} (\sigma _{i}-\sigma _{i}^{(0)}) b_{i}(u_{0}^{l},u_{0}^{l})=g^{T} \mathcal{F}'(\sigma ^{(0)})(\sigma -\sigma ^{(0)})g, $$

so that (8) holds. Together with (6) this also implies (7).

(9) follows from (8) by the following standard argument. Let $\sigma ,\sigma ^{(0)}\in \mathbb{R}^{n}_{+}$, $t\in [0,1]$, and set

$$ \sigma ^{(t)}:=t \sigma + (1-t) \sigma ^{(0)}\in \mathbb{R}^{n}_{+}. $$

Using (8) on $\mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(t)})$ and $\mathcal{F}(\sigma ^{(0)})-\mathcal{F}(\sigma ^{(t)})$, we then obtain that

$$\begin{aligned} &{(1-t) \mathcal{F}(\sigma ^{(0)})+t \mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(t)})} \\ &=(1-t) (\mathcal{F}(\sigma ^{(0)})-\mathcal{F}(\sigma ^{(t)})) +t ( \mathcal{F}(\sigma )-\mathcal{F}(\sigma ^{(t)})) \\ &\succeq (1-t)\mathcal{F}'(\sigma ^{(t)})(\sigma ^{(0)}-\sigma ^{(t)}) + t\mathcal{F}'(\sigma ^{(t)})(\sigma -\sigma ^{(t)}) \\ &= \mathcal{F}'(\sigma ^{(t)})((1-t)\sigma ^{(0)}+ t \sigma -\sigma ^{(t)} )=0, \end{aligned}$$

which proves (9). □

4 The FEM Setting

4.1 The FEM-Approximated Forward Operator and Its Derivative

The Finite Element Method

The Finite Element Method numerically approximates the solution of (4) by solving it in a finite-dimensional subspace $V\subset H$, e.g. the subspace of continuous, piecewise linear functions on a fixed triangulation. Let $\Lambda _{1},\ldots ,\Lambda _{N}$ denote a basis of $V$, e.g. the so-called hat functions for linear finite elements. Then the finite-dimensional variational problem

$$ \tilde{u}_{\sigma }^{l}\in V \quad \text{ solves } \quad b_{\sigma }( \tilde{u}_{\sigma }^{l},v)=l(v) \quad \text{ for all } v\in V $$

(10)

is equivalent to

$$ \tilde{u}_{\sigma }^{l}=\sum _{j=1}^{N} \lambda _{j} \Lambda _{j}, \quad \text{ where } \quad B_{\sigma }\lambda =y^{l}, $$

(11)

with $\lambda =(\lambda _{j})_{j=1}^{N}\in \mathbb{R}^{N}$, and the so-called stiffness matrix and load vector

\begin{aligned} B_{σ} & \in R^{N \times N}, & with (j, k) -th entry given by b_{σ} (Λ_{j}, Λ_{k}), \\ y^{l} & \in R^{N}, & with j -th entry given by l (Λ_{j}) . \end{aligned}

It follows from the Lax-Milgram theorem that (10) is uniquely solvable and that $B_{\sigma}$ is a symmetric, positive definite (and thus invertible) matrix. Moreover, the Céa-Lemma yields that the FEM approximation $\tilde{u}_{\sigma }^{l}\in V$ is as good an approximation to the true solution $u_{\sigma }^{l}\in H$ as elements of the finite-dimensional space $V$ can be:

$$ \hspace{0.2ex}\|u_{\sigma}^{l}-\tilde{u}_{\sigma}^{l}\|\hspace{0.2ex}\leq \frac{C_{\sigma }}{\beta _{\sigma }}\inf _{v\in V}\hspace{0.2ex}\|u_{\sigma}^{l}-v\|\hspace{0.2ex}, $$

(12)

where $C_{\sigma }:=C\max \{1,\sigma _{1},\ldots ,\sigma _{n}\}$, and $\beta _{\sigma }:=\beta \min \{1,\sigma _{1},\ldots ,\sigma _{n}\}$ are the continuity and coercivity constants of $b_{\sigma }$, cf. (5).

Pixel Stiffness Matrices

Finite element software packages include triangulation algorithms, assembling routines for the global stiffness matrix $B_{\sigma }$ and the load vector $y^{l}$, and efficient solvers for the linear system $B_{\sigma }\lambda =y^{l}$. For our setting where

$$ b_{\sigma }(u,v)=b_{0}(u,v)+\sum _{i=1}^{n} \sigma _{i} b_{i}(u,v), $$

we will also require the pixel stiffness matrices

$$ B_{i}\in \mathbb{R}^{N\times N}, \quad \text{ with $(j,k)$-th entry given by } \quad b_{i}(\Lambda _{j}, \Lambda _{k}). $$

The assembling of $B_{\sigma }$ is usually done by writing it as a weighted sum of element stiffness matrices. In our setting, it is natural to assume that the pixel partition complies with the FEM triangulation, i.e., that each pixel is a union of triangulation elements. Figure 3 shows a coarser and a finer FEM mesh for the diffusion example, both complying with the pixel partition and with the subdomains that are used for measurements and excitations. Hence, during the assembly of the global stiffness matrix $B_{\sigma }$, the pixel stiffness matrices can usually be obtained without any additional computational cost by the simple intermediate step of first summing up the element matrices for each pixel, and then summing up the pixel stiffness matrices to obtain $B_{\sigma }$. Alternatively, the pixel stiffness matrix $B_{i}$ can be conveniently obtained from global stiffness matrices by the simple identities

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_Equap_HTML.png

where

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq170_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq171_HTML.gif

denote the global stiffness matrix $B_{\sigma }$ for

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq173_HTML.gif

and

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq174_HTML.gif

, respectively, and $e_{i}\in \mathbb{R}^{n}$ is the $i$-th unit vector. Note that this does not require any knowledge of the triangulation details.

The FEM-Approximated Forward Operator

Given $l,r\in H'$, we approximate the true measurement $\mathcal{F}_{l,r}(\sigma )=r(u_{\sigma }^{l})$ by

$$ F_{l,r}(\sigma ):=r(\tilde{u}_{\sigma }^{l}), $$

where $\tilde{u}_{\sigma }^{l}\in V$ is the FEM-approximation to the true solution $u_{\sigma }^{l}\in H$, i.e., the solution of (10).

Lemma 3

Let $l,r\in H'$. Then

$$ F_{l,r}(\sigma )=b_{\sigma }(\tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r}) \quad \textit{ for all $\sigma \in \mathbb{R}^{n}_{+}$.} $$

Moreover, $F_{l,r}:\ \mathbb{R}^{n}_{+}\to \mathbb{R}$ is infinitely often differentiable and its first derivatives fulfill

$$ \frac{\partial }{\partial \sigma _{i}} F_{l,r}(\sigma )=-b_{i}( \tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r}), \quad i=1,\ldots ,n. $$

Proof

This follows by applying Corollary 1 to the Hilbert space $V$. □

From Lemma 3, we obtain a simple FEM-based implementation of the forward operator and its derivative.

Corollary 2

With

\begin{aligned} {\tilde{u}}_{σ}^{l} & = \sum_{j = 1}^{N} λ_{j}^{l} Λ_{j}, & λ^{l} & = {(λ_{j}^{l})}_{j = 1}^{N} \in R^{N}, \\ {\tilde{u}}_{σ}^{r} & = \sum_{j = 1}^{N} λ_{j}^{r} Λ_{j}, & λ^{r} & = {(λ_{j}^{r})}_{j = 1}^{N} \in R^{N}, \end{aligned}

we have that

$$ F_{l,r}(\sigma )=(\lambda ^{l})^{T} B_{\sigma }\lambda ^{r}=(\lambda ^{l})^{T} y^{r}, \quad \textit{ and } \quad \frac{\partial }{\partial \sigma _{i}} F_{l,r}( \sigma )=- (\lambda ^{l})^{T} B_{i} \lambda ^{r}. $$

Proof

This follows from Lemma 3. □

We summarize the consequences of Corollary 2 in Algorithm 1. Using a FEM package that is capable of solving the considered PDE, and that allows access to the stiffness matrix and the load vector, one can simply implement the FEM-approximated forward operator and all its first derivatives by a few lines of extra code. This calculation merely requires solving two linear systems with the stiffness matrix (which is equivalent to two PDE solutions).

Convergence of the FEM-Approximated Forward Operator

The following lemma shows that the FEM-approximated operator and its first derivatives agree with their true-solution counterparts as good as the FEM solution agrees with the true solution. Hence, by the Céa-Lemma (12), $F_{l,r}(\sigma )$ and $\frac{\partial }{\partial \sigma _{i}} F_{l,r}(\sigma )$ will be as good an approximation to $\mathcal{F}_{l,r}(\sigma )$ and $\frac{\partial }{\partial \sigma _{i}} \mathcal{F}_{l,r}(\sigma )$ as the true solutions can be approximated by elements of the finite-dimensional space $V$.

Lemma 4

For all $l,r\in H'$ and $\sigma \in \mathbb{R}^{n}_{+}$, we have that:

$$\begin{aligned} \mathcal{F}_{l,r}(\sigma )-F_{l,r}(\sigma )&=b_{\sigma }(u_{\sigma }^{l}- \tilde{u}_{\sigma }^{l},u_{\sigma }^{r}-\tilde{u}_{\sigma }^{r}), \end{aligned}$$

(13)

$$\begin{aligned} \frac{\partial }{\partial \sigma _{i}} \mathcal{F}_{l,r}(\sigma )- \frac{\partial }{\partial \sigma _{i}} F_{l,r}(\sigma )&=b_{i}( \tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r}-u_{\sigma }^{r})+b_{i}( \tilde{u}_{\sigma }^{l}-u_{\sigma }^{l},u_{\sigma }^{r}). \end{aligned}$$

(14)

Hence, by the Céa-Lemma (12),

$$\begin{aligned} 0\leq \mathcal{F}_{l,r}(\sigma )-F_{l,r}(\sigma )& \leq C_{\sigma }\hspace{0.2ex}\|u_{\sigma}^{l}-\tilde{u}_{\sigma}^{l}\|\hspace{0.2ex} \hspace{0.2ex}\|u_{\sigma}^{r}-\tilde{u}_{\sigma}^{r}\|\hspace{0.2ex} \\ & \leq \frac{C_{\sigma }^{3}}{\beta _{\sigma }^{2}}\inf _{v\in V} \hspace{0.2ex}\|u_{\sigma}^{l}-v\|\hspace{0.2ex} \inf _{v\in V}\hspace{0.2ex}\|u_{\sigma}^{r}-v\|\hspace{0.2ex}, \end{aligned}$$

and

$$\begin{aligned} &{\left |\frac{\partial }{\partial \sigma _{i}} \mathcal{F}_{l,r}(\sigma )-\frac{\partial }{\partial \sigma _{i}} F_{l,r}(\sigma )\right |} \\ & \leq C_{i} \hspace{0.2ex}\|\tilde{u}_{\sigma}^{l}\|\hspace{0.2ex} \hspace{0.2ex}\|\tilde{u}_{\sigma}^{r}-u_{\sigma}^{r}\|\hspace{0.2ex} + C_{i}\hspace{0.2ex}\|u_{\sigma}^{r}\|\hspace{0.2ex} \hspace{0.2ex}\|\tilde{u}_{\sigma}^{l}-u_{\sigma}^{l}\|\hspace{0.2ex} \\ & \leq \frac{C_{i} C_{\sigma }}{\beta _{\sigma }}\left ( \hspace{0.2ex}\|\tilde{u}_{\sigma}^{l}\|\hspace{0.2ex} \inf _{v\in V}\hspace{0.2ex}\|u_{\sigma}^{r}-v\|\hspace{0.2ex}+ \hspace{0.2ex}\| u_{\sigma}^{r}\|\hspace{0.2ex} \inf _{v\in V}\hspace{0.2ex}\|u_{\sigma}^{l}-v\|\hspace{0.2ex}\right ), \end{aligned}$$

where $C_{i}>0$ is the continuity constant of $b_{i}(\cdot ,\cdot )$.

Proof

Using

$$\begin{aligned} b_{\sigma }(\tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r})=l(\tilde{u}_{\sigma }^{r})=b_{\sigma }(u_{\sigma }^{l},\tilde{u}_{\sigma }^{r}),\quad \text{ and } \quad b_{\sigma }(\tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r})=r( \tilde{u}_{\sigma }^{l})=b_{\sigma }(\tilde{u}_{\sigma }^{l},u_{\sigma }^{r}), \end{aligned}$$

we obtain (13) from

$$\begin{aligned} \mathcal{F}_{l,r}(\sigma )-F_{l,r}(\sigma )&=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{r})-b_{\sigma }(\tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r})=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{r}-\tilde{u}_{\sigma }^{r}) \\ &=b_{\sigma }(u_{\sigma }^{l},u_{\sigma }^{r}-\tilde{u}_{\sigma }^{r})-b_{\sigma }(\tilde{u}_{\sigma }^{l},u_{\sigma }^{r}-\tilde{u}_{\sigma }^{r}) \\ &=b_{\sigma }(u_{\sigma }^{l}-\tilde{u}_{\sigma }^{l},u_{\sigma }^{r}-\tilde{u}_{\sigma }^{r}). \end{aligned}$$

Also,

$$\begin{aligned} \frac{\partial }{\partial \sigma _{i}} \mathcal{F}_{l,r}(\sigma )- \frac{\partial }{\partial \sigma _{i}} F_{l,r}(\sigma ) &= b_{i}( \tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r})-b_{i}(u_{\sigma }^{l},u_{\sigma }^{r}) \\ &= b_{i}(\tilde{u}_{\sigma }^{l},\tilde{u}_{\sigma }^{r}-u_{\sigma }^{r})+b_{i}( \tilde{u}_{\sigma }^{l}-u_{\sigma }^{l},u_{\sigma }^{r}), \end{aligned}$$

which shows (14). □

4.2 Convexity and Monotonicity for Symmetric Measurements

As in Sect. 3.2 we now consider the symmetric measurement case, where $l$ and $r$ are taken from the same subset of $H'$ (and all combinations are used). Given a set of $m\in \mathbb{N}$ excitations/measurements $\{l_{1},\ldots ,l_{m}\}\subset H'$, we combine the measurements into a matrix-valued map $F:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m\times m}$

$$ F(\sigma )=(F_{j,k}(\sigma ))_{j,k=1,\ldots ,m}\in \mathbb{R}^{m \times m}, \quad F_{j,k}(\sigma )=F_{l_{j},l_{k}}(\sigma )=l_{k}(u_{\sigma }^{l_{j}}). $$

The entries of $F(\sigma )$ and its first derivatives $\frac{\partial }{\partial \sigma _{i}} F(\sigma )$, $i=1,\ldots ,n$ can be calculated as in Algorithm 1. Let us stress that this approach is particularly efficient in this symmetric case as it requires only $m$ linear system solutions with the stiffness matrix (i.e., the equivalent of $m$ PDE solutions) for calculating all $m^{2}$ entries of $F(\sigma )\in \mathbb{R}^{m\times m}$ and all $n m^{2}$ entries of the $n$ matrices $\frac{\partial }{\partial \sigma _{i}} F(\sigma )\in \mathbb{R}^{m \times m}$.

As in Sect. 3.2, the FEM-approximated forward operator is monotonically non-increasing and convex in the sense of the elementwise order “≥” on $\mathbb{R}^{n}$, and the Loewner order “⪰” on the set of symmetric $m\times m$-matrices.

Lemma 5

$F:\ \mathbb{R}^{n}_{+}\to \mathbb{R}^{m\times m}$ has the following properties:

(a)

$F$ is infinitely often differentiable.

(b)

For all $\sigma \in \mathbb{R}^{n}_{+}$, $F(\sigma )\in \mathbb{S}_{m}$ and $F(\sigma ) \succeq 0$. $F(\sigma )$ is positive definite if $l_{1},\ldots ,l_{m}\in H'$ are linearly independent.

(c)

$F$ is monotonically non-increasing, i.e.

$$\begin{aligned} F'(\sigma )\tau \preceq 0 \quad \textit{ for all } \quad \sigma \in \mathbb{R}^{n}_{+},\ 0\leq \tau \in \mathbb{R}^{n}, \end{aligned}$$

(15)

and for all $\sigma ^{(1)},\sigma ^{(2)}\in \mathbb{R}^{n}_{+}$

$$\begin{aligned} \sigma ^{(1)}\leq \sigma ^{(2)} \quad \textit{ implies } \quad F(\sigma ^{(1)})\succeq F(\sigma ^{(2)}). \end{aligned}$$

(16)

(d)

$F$ is convex, i.e.

$$\begin{aligned} F(\sigma )-F(\sigma ^{(0)}) \succeq F'(\sigma ^{(0)})(\sigma -\sigma ^{(0)}) \quad \textit{ for all } \quad \sigma ,\sigma ^{(0)}\in \mathbb{R}^{n}_{+}, \end{aligned}$$

(17)

and, for all $t\in [0,1]$,

$$\begin{aligned} F((1-t) \sigma ^{(0)}+t \sigma )\preceq (1-t) F( \sigma ^{(0)})+t F(\sigma ). \end{aligned}$$

(18)

(e)

$\mathcal{F}(\sigma )\succeq F(\sigma )$.

Proof

(a)–(d) follow from applying Lemma 2 on the Hilbert space $V$. (e) was proven in Lemma 4. □

5 Numerical Examples and Inverse Problem Challenges

In this section, we will show some numerical results for the stationary diffusion example from Sect. 2.1 and demonstrate some major challenges that arise in solving the inverse coefficient problem of recovering $\hat{\sigma }\in \mathbb{R}^{n}$ from $F(\hat{\sigma })\in \mathbb{R}^{m}$, or from a noisy version $Y^{\delta }\approx F(\hat{\sigma })$. The source codes for the following examples (and also for generating Fig. 1 and 2) are given in the appendix for the reader’s reference.

5.1 Non-uniqueness

Even for $m\geq n$, and a noise-free measurement $\hat{Y}=F(\hat{\sigma })\in \mathbb{R}^{m}$, it is not clear whether the measurements uniquely determine the unknown $\hat{\sigma }\in \mathbb{R}^{n}$. To demonstrate this on a simple one-dimensional example, let us consider the stationary diffusion example with $3\times 3$ pixels and circular excitation/measurement subdomains in each boundary pixel as in Fig. 1. We apply a source term in $D_{1}$ in the lower left pixel, and measure the resulting total concentration in $D_{8}$ in the top right pixel, so that $l=\chi _{1}\in H^{-1}(\Omega )$ and $r=\chi _{8}\in H^{-1}(\Omega )$, where we write $\chi _{j}:=\chi _{D_{j}}$ for the ease of notation. We choose $\sigma =1$ in all pixels except $\mathcal{P}_{i}$, and on $\mathcal{P}_{i}$ we vary the diffusivity in steps of 0.01 up to 3. Figure 4 shows $F_{l,r}(\sigma )$ for all $i=1,\ldots ,9$, in the same order as the pixels, e.g., the lower left image shows $F_{l,r}(\sigma )$ for $\sigma =(\sigma _{1},1,\ldots ,1)$ for varying $\sigma _{1}$.

Intuitively speaking, one can see that rising the diffusivity in the middle pixel increases the measurement since particles can easier diffuse through the middle pixel on their way from the lower left to the top right. Rising the diffusivity in the corner pixels decreases the measurement since particles can easier diffuse to the boundary that is set to zero by the homogeneous Dirichlet condition. In the middle top, left, bottom, and right pixel, rising the diffusivity first increases the measurement since particles can easier find their way from the lower left to the top right. But at some point, this effect is reverted since it also drives particles to the boundary.

This demonstrates that changing a coefficient can have an increasing or decreasing effect on the measurements, and the effect does not have to be monotonic for all parameter values. It also indicates that an exact one-dimensional measurement $\hat{Y}=F_{l,r}(\hat{\sigma })$ might uniquely determine one parameter in $\hat{\sigma }$ in some cases (here: the diffusivity in the middle and corner pixels), but non-uniqueness might occur in other cases (here: in the top, left, bottom, and right pixel).

Let us also stress the following point. A single non-symmetric measurement $F_{l,r}(\sigma )$ with $l\neq r$ might depend non-monotonically on $\sigma $ as demonstrated in Fig. 4. But, by Lemma 5, for all $l,r\in H'$, the symmetric measurements $F_{l,l}(\sigma )$, $F_{r,r}(\sigma )$, and also the matrix-valued measurement

$$ \begin{pmatrix} F_{l,l}(\sigma ) & F_{l,r}(\sigma ) \\ F_{r,l}(\sigma ) & F_{r,r}(\sigma ) \end{pmatrix} \in \mathbb{R}^{2\times 2} $$

are monotonically non-increasing and convex functions of $\sigma $. Note that the monotonicity and convexity properties of the matrix-valued measurement hold with respect to the Loewner order even though the individual non-diagonal matrix elements might not have these properties.

5.2 Non-linearity and Local Minima

The inverse problem of recovering $\hat{\sigma }\in \mathbb{R}^{n}$ from $F(\hat{\sigma })\in \mathbb{R}^{m}$ could be considered as a non-linear root finding problem and (for $n=m$) approached with Newton’s method which is only known to converge locally. However, in practice one usually takes redundant measurements (i.e., $m>n$), and, due to measurement or modelling errors, one cannot expect exact data fit. Hence, a common approach to reconstruct $\hat{\sigma }\in \mathbb{R}^{n}$ from $Y^{\delta }\approx F(\hat{\sigma })\in \mathbb{R}^{m}$ is to minimize a residuum functional, e.g.,

$$ R(\sigma ):=\hspace{0.2ex}\|F(\sigma )-Y^{\delta}\|\hspace{0.2ex}^{2}\to \text{min!} $$

or a sum of a residuum functional together with a regularization term.

For our simple $3\times 3$-pixel example, the left image in Fig. 5 shows a contour plot of $R(\sigma )$ (in a normalized logarithmic scale) for a two-dimensional measurement function $F(\sigma ):= \begin{pmatrix} F_{\chi _{1},\chi _{7}}(\sigma ) & F_{\chi _{1},\chi _{8}}(\sigma )\end{pmatrix}^{T} \in \mathbb{R}^{2}$, exact data $Y^{\delta }=F(\hat{\sigma })$,

$$\begin{aligned} \hat{\sigma }&= \begin{pmatrix} 1 & 1 & 1 & 0.5 & 1 & 0.5 & 1 &1 &1 \end{pmatrix} ^{T} \quad \text{ and } \quad \\ \sigma &= \begin{pmatrix} 1 & 1 & 1 & \sigma _{4} & 1 & \sigma _{6} & 1 &1 &1 \end{pmatrix} ^{T}. \end{aligned}$$

$\sigma _{4}$ and $\sigma _{6}$ are varied in steps of 0.002 up to 0.6. Again, for this choice of unknowns and measurements, we numerically observe non-uniqueness. The results indicate that there is a second point $\tilde{\sigma }\neq \hat{\sigma }$ with $F(\tilde{\sigma })=F(\hat{\sigma })$.

The right image in Fig. 5 shows a plot of $R(\sigma )$ (in a normalized non-logarithmic scale) for varying $\sigma _{4}$ while keeping $\sigma _{6}:=\sigma _{4}$, i.e. the diagonal of the left image in Fig. 5. It indicates that in this case, one parameter in $\hat{\sigma }$ can be uniquely reconstructed from the two-dimensional measurement $F(\hat{\sigma })$. But it also shows that the residuum functional $R(\sigma )$ possesses a local minimum in a wrong point. Since the curse of dimensionality makes it practically infeasible to numerically find the global minimizer of a non-linear functional in more than a few unknowns, this demonstrates that optimization-based approaches also suffer from only local convergence.

5.3 Stability, Error Estimates and Ill-Posedness

Even if $F(\hat{\sigma })$ uniquely determines $\hat{\sigma }$, and if this non-linear problem can be solved without running into a local minimum, the problem might be ill-posed in the sense that $\sigma $ does not depend stably on $F(\sigma )$. In that case, small errors in the measurements might lead to large errors in the reconstruction. The error amplification is often quantified by searching for a Lipschitz stability constant $L>0$ with

$$ \hspace{0.2ex}\|\sigma _{1}-\sigma _{2}\|\hspace{0.2ex}\leq L \hspace{0.2ex}\|F(\sigma _{1})-F(\sigma _{2})\|\hspace{0.2ex} \quad \text{ for all } \sigma _{1}, \sigma _{2}\in \mathbb{R}^{n}. $$

It should be stressed though, that such a stability estimate does not immediately yield an error estimate for noisy measurements $Y^{\delta }\approx F(\hat{\sigma})$, since $Y^{\delta }$ might not lie in the range of $F$.

To estimate the (relative) error amplification, we calculate the condition number of

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq286_HTML.gif

for our $3\times 3$ example, where $F:\ \mathbb{R}^{9}\to \mathbb{R}^{64}$ now depends on all 9 pixel values, and the components of $F$ are given by $F_{l,r}$ with $l$ and $r$ running through all $8\times 8$-combinations of $\chi _{1},\ldots ,\chi _{8}$, i.e., we use all combinations of circular subdomain for applying source terms and for measuring the solution. Note that, Lemma 5 implies that roughly half of the measurements are redundant by symmetry, and that unlike in Sects. 3.2 and 4.2, we simply write the measurements as a long vector in order to have

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq295_HTML.gif

as a matrix.

We repeat the calculation of the condition number of

https://static-content.springer.com/image/art%3A10.1365%2Fs13291-021-00236-2/MediaObjects/13291_2021_236_IEq296_HTML.gif

for analogous settings with $n_{x}\times n_{x}$-pixels, and $(4n_{x}-4)\times (4n_{x}-4)$ measurements on circular subdomains in the boundary pixels, cf. the left image in Fig. 6 for the geometry of the $15\times 15$-pixel case. The right image in Fig. 6 shows the condition number as a function of the total number of pixels $n_{x}^{2}$ for $n_{x}\in \{2,3,\ldots ,15\}$.

Our results indicate that the instability of the considered inverse problem grows roughly exponentially with the number of unknowns which is in par with theoretical results on the similar elliptic inverse coefficient problem of EIT [17].

5.4 Further Reading

There is vast literature on theoretical and numerical inverse coefficient problems, their practical applications, and the regularization of ill-posed problems. Let us single out just a very few results as starting points for further reading that are closely connected to the challenges addressed in this section, and the author’s own research. Arguably the most prominent inverse elliptic coefficient problem is the so-called Calderón problem [7, 8] with applications in EIT, cf. [1, 5] for surveys on EIT and the related field of diffuse optical tomography. For theoretical uniqueness proofs in the infinite-dimensional setting of (intuitively speaking) infinitely many pixels and measurements we refer to [10, 15, 20]. A survey on solving parameter identification problems for PDEs with a focus on sparsity regularization can be found in [13]. The interplay between instability, regularization and FEM discretization is studied in [14]. A result on convexification approaches to obtain globally convergent reconstruction algorithms can be found in [16]. Learning-based approaches for inverse coefficient problems and parametrized PDEs are studied in [4, 6, 18].

Results on uniqueness and Lipschitz stability for finitely many unknowns from infinitely or finitely many measurements have been obtained in, e.g., [2, 3, 11]. Let us stress that, for most problems, it is still an open question, how many (and which) measurements are required to uniquely determine an unknown PDE coefficient with a given resolution, how to explicitly quantify the error amplification, and how to obtain globally convergent reconstruction algorithms. For a relatively simple, but fully non-linear inverse problem of determining a Robin coefficient in an elliptic PDE, these question were recently answered in [12] by exploiting the convexity and monotonicity structure of symmetric measurements from Sects. 3.2 and 4.2.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

previous article Fitting Ideals in Number Theory and Arithmetic

next article Kazuhiro Fujiwara, Fumiharu Kato: “Foundations of Rigid Geometry I”

Our product recommendations

Jahresbericht der Deutschen Mathematiker-Vereinigung

Der „Jahresbericht der Deutschen Mathematiker-Vereinigung (DMV)“ versteht sich als ein Schaufenster für Mathematik. In Übersichtsartikeln und Berichten aus der Forschung soll für möglichst viele LeserInnen verständlich und interessant über aktuelle und wichtige Entwicklungen der Mathematik berichtet werden.

inform now

Appendix: Source Code for COMSOL with MATLAB LiveLink

Listing 1–3 contain auxiliary functions to build and manipulate the FEM model. Figure 1–6 are created by Listing 4–9.

Adler, A., Gaburro, R., Lionheart, W.: Electrical impedance tomography. In: Handbook of Mathematical Methods in Imaging, pp. 701–762. Springer, Berlin (2015) CrossRef

Alberti, G.S., Santacesaria, M.: Calderón’s inverse problem with a finite number of measurements. In: Forum of Mathematics, Sigma, vol. 7. Cambridge University Press, Cambridge (2019) MATH

Alessandrini, G., Vessella, S.: Lipschitz stability for the inverse conductivity problem. Adv. Appl. Math. 35(2), 207–241 (2005) MathSciNetCrossRef

Arridge, S., Maass, P., Öktem, O., Schönlieb, C.-B.: Solving inverse problems using data-driven models. Acta Numer. 28, 1–174 (2019) MathSciNetCrossRef

Arridge, S.R.: Methods in diffuse optical imaging. Philos. Trans. - Royal Soc. A, Math. Phys. Eng. Sci. 369(1955), 4558–4576 (2011) MathSciNetCrossRef

Bhattacharya, K., Hosseini, B., Kovachki, N.B., Stuart, A.M.: Model reduction and neural networks for parametric pdes, arXiv preprint (2020). arXiv:2005.03180

Calderón, A.P.: On an inverse boundary value problem. In: Meyer, W.H., Raupp, M.A. (eds.) Seminar on Numerical Analysis and Its Application to Continuum Physics, pp. 65–73. Brasil. Math. Soc., Rio de Janeiro (1980)

Calderón, A.P.: On an inverse boundary value problem. Comput. Appl. Math. 25(2–3), 133–138 (2006) MathSciNetMATH

Hanke, M.: A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems. Inverse Probl. 13(1), 79 (1997) MathSciNetCrossRef

10.

Harrach, B.: On uniqueness in diffuse optical tomography. Inverse Probl. 25, 055010 (2009) MathSciNetCrossRef

11.

Harrach, B.: Uniqueness and Lipschitz stability in electrical impedance tomography with finitely many electrodes. Inverse Probl. 35(2), 024005 (2019) MathSciNetCrossRef

12.

Harrach, B.: Uniqueness, stability and global convergence for a discrete inverse elliptic Robin transmission problem. Numer. Math. 147, 29–70 (2021) MathSciNetCrossRef

13.

Jin, B., Maass, P.: Sparsity regularization for parameter identification problems. Inverse Probl. 28(12), 123001 (2012) MathSciNetCrossRef

14.

Kaltenbacher, B., Kirchner, A., Vexler, B.: Adaptive discretizations for the choice of a Tikhonov regularization parameter in nonlinear inverse problems. Inverse Probl. 27(12), 125008 (2011) MathSciNetCrossRef

15.

Kenig, C., Salo, M.: Recent progress in the Calderón problem with partial data. Contemp. Math. 615, 193–222 (2014) CrossRef

16.

Klibanov, M.V., Li, J., Zhang, W.: Convexification of electrical impedance tomography with restricted Dirichlet-to-Neumann map data. Inverse Probl. 35, 3 (2019) MathSciNetMATH

17.

Rondi, L.: A remark on a paper by Alessandrini and Vessella. Adv. Appl. Math. 36(1), 67–69 (2006) MathSciNetCrossRef

18.

Seo, J.K., Kim, K.C., Jargal, A., Lee, K., Harrach, B.: A learning-based method for solving ill-posed nonlinear inverse problems: a simulation study of lung EIT. SIAM J. Imaging Sci. 12(3), 1275–1295 (2019) MathSciNetCrossRef

19.

Somersalo, E., Cheney, M., Isaacson, D.: Existence and uniqueness for electrode models for electric current computed tomography. SIAM J. Appl. Math. 52(4), 1023–1040 (1992) MathSciNetCrossRef

20.

Uhlmann, G.: Electrical impedance tomography and Calderón’s problem. Inverse Probl. 25(12), 123011 (2009) CrossRef

Title: An Introduction to Finite Element Methods for Inverse Coefficient Problems in Elliptic PDEs
Author: Bastian Harrach
Publication date: 06-05-2021
Publisher: Springer Berlin Heidelberg
Published in: Jahresbericht der Deutschen Mathematiker-Vereinigung / Issue 3/2021
Print ISSN: 0012-0456
Electronic ISSN: 1869-7135
DOI: https://doi.org/10.1365/s13291-021-00236-2

Springer Professional

An Introduction to Finite Element Methods for Inverse Coefficient Problems in Elliptic PDEs

Abstract

Publisher’s Note

1 Introduction

2 Motivation and Examples

2.1 Stationary Diffusion

2.2 Electrical Impedance Tomography (EIT)

3 The True-Solution Setting

3.1 The True-Solution Forward Operator and Its Derivative

3.2 Convexity and Monotonicity for Symmetric Measurements

4 The FEM Setting

4.1 The FEM-Approximated Forward Operator and Its Derivative

4.2 Convexity and Monotonicity for Symmetric Measurements

5 Numerical Examples and Inverse Problem Challenges

5.1 Non-uniqueness

5.2 Non-linearity and Local Minima

5.3 Stability, Error Estimates and Ill-Posedness

5.4 Further Reading

Publisher’s Note

Our product recommendations

Jahresbericht der Deutschen Mathematiker-Vereinigung

Appendix: Source Code for COMSOL with MATLAB LiveLink

Premium Partner

Springer Professional

Abstract

Publisher’s Note

1 Introduction

2 Motivation and Examples

2.1 Stationary Diffusion

2.2 Electrical Impedance Tomography (EIT)

3 The True-Solution Setting

3.1 The True-Solution Forward Operator and Its Derivative

3.2 Convexity and Monotonicity for Symmetric Measurements

4 The FEM Setting

4.1 The FEM-Approximated Forward Operator and Its Derivative

4.2 Convexity and Monotonicity for Symmetric Measurements

5 Numerical Examples and Inverse Problem Challenges

5.1 Non-uniqueness

5.2 Non-linearity and Local Minima

5.3 Stability, Error Estimates and Ill-Posedness

5.4 Further Reading

Publisher’s Note

Our product recommendations

Jahresbericht der Deutschen Mathematiker-Vereinigung

Appendix: Source Code for COMSOL with MATLAB LiveLink

Other articles of this Issue 3/2021

Kazuhiro Fujiwara, Fumiharu Kato: “Foundations of Rigid Geometry I”

Fitting Ideals in Number Theory and Arithmetic

Correction to: Die Begriffe der Äquivalenz, der Gleichheit und der Identität

Preface

Premium Partner