
Open Access 15.07.2023 | Original Article

Reduced order model approaches for predicting the magnetic polarizability tensor for multiple parameters of interest

Authors: James Elgy, Paul D. Ledger

Published in: Engineering with Computers | Issue 6/2023 | https://doi.org/10.1007/s00366-023-01868-x


Abstract

The magnetic polarizability tensor (MPT) is an economical characterisation of a conducting magnetic object, which can assist with identifying hidden targets in metal detection. The MPT’s coefficients depend on multiple parameters of interest including the object shape, size, electrical conductivity, magnetic permeability, and the frequency of excitation. The computation of the coefficients follows from post-processing an eddy current transmission problem solved numerically using high-order finite elements. To reduce the computational cost of constructing these characterisations for multiple different parameters, we compare three methods by which the MPT can be efficiently calculated for two-dimensional parameter sets, with different levels of code invasiveness. We compare, with numerical examples, a neural network regression of MPT eigenvalues with a projection-based reduced order model (ROM) and a neural network enhanced ROM (POD–NN) for predicting MPT coefficients.
Notes
These authors contributed equally to this work.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Article Highlights:
1. Rapid computations of object characterisations with varying material parameters to assist with metal detection.
2. A comparison between novel neural network and projection enhanced reduced order models for efficient computation.
3. Practical demonstrations of alternative methodologies including comparisons of computational cost.

1 Introduction

In recent years, there has been considerable interest in the characterisation of hidden conducting permeable objects by the magnetic polarizability tensor (MPT) and its applications to metal detection. The complex symmetric rank 2 MPT has been shown to offer an economical method of characterising conducting permeable objects [1–3]; explicit formulae for calculating its 6 independent complex coefficients based on the object size, shape, electrical conductivity, magnetic permeability, and frequency of excitation have been derived [1, 2, 4]; computational procedures have been proposed for its calculation [5]; and apparatus for its measurement is advanced [6–9].
Key metal detection applications include the discrimination between threat and non-threat objects in security screening using walk-through metal detectors [3], whereby the early detection of threat objects (such as knives and firearms or components thereof) has the potential to reduce the likelihood of attacks and improve public safety. Further security applications include distinguishing between metallic clutter (e.g., ring-pulls, coins, shrapnel) and metallic components of hidden anti-personnel mines and unexploded ordnance [10]. Commercial applications, such as food safety screening, improving identification of metallic objects of significance in archaeological searches, and discriminating between real and counterfeit coinage at automated checkouts and vending machines, are also of interest.
An approach for computing the MPT object characterisation as a function of exciting frequency, known as the MPT spectral signature, based on computing full-order model solutions at a small number of snapshot frequencies and using a (projected) proper orthogonal decomposition (POD) based reduced order model (ROM) [11] to predict the solution for other frequencies, has been proposed in [5] and implemented in the MPT-Calculator software. To compute the full order model solutions, the NGSolve high order finite element library [12–15] was used and a \({{\varvec{H}}}(\hbox {curl})\) conforming discretisation on unstructured tetrahedral meshes was employed. The resulting characterisations have been shown to be in excellent agreement with practical measurements [9] for a wide range of object shapes. The MPT-Calculator tool has subsequently been used in combination with exact MPT scalings to generate dictionaries of realistic object characterisations [16], which, in turn, have been used for training machine learning classifiers for identifying possible threat and non–threat objects [17].
To be able to build larger dictionaries of MPT spectral signature object characterisations with increased variability in the object’s material parameters, we introduce and compare three alternative novel ROMs in this paper. First, we extend the ROM presented in [5] to two parameters, namely frequency and permeability. Second, building on the approaches proposed in [18, 19], a regression-based POD is employed, which involves a neural network-based regression of information from the truncated singular value decomposition of the snapshot solution matrix to make predictions for new problem parameters. A recent extensive discussion of neural networks with applications in POD reduced order modelling is given in [20], covering artificial neural networks, physics informed neural networks, and feed-forward neural networks and the differences between them in the context of POD which provides further context to our methodology. Third, a neural network regression of the MPT eigenvalues is developed to predict MPT eigenvalues for new problem parameters. We then compare the accuracy and the computational performance of the three approaches. A further important aspect we consider is code invasiveness. POD approaches require access to the underlying finite element implementation, which may not be possible in many cases, for example using commercial closed–source software. On the other hand, POD–NN requires only access to computed solution vectors and a direct regression of the parameters requires no access to the underlying code.
This paper is organised as follows: Sect. 2 briefly reviews the mathematical formulation of the rank 2 MPT object characterisation of an isolated highly conducting magnetic object in a non–conducting background for the eddy current time–harmonic approximation to the Maxwell system. Section 3 recalls the hp finite element approximations to a transmission problem, which is used for calculating full-order model solutions, and Sect. 4 describes our proposed ROMs to accelerate this computation when evaluating for different material properties. This is followed, in Sect. 5, by numerical examples of the ROM approaches for an object with a known analytical solution for its MPT coefficients, and an object where there is no known analytical solution. Finally, concluding remarks and intended future work are provided in Sect. 6.

2 The Eddy Current Model and the Rank 2 MPT

We briefly recall the problem description from [2, 4, 5]. As illustrated in Fig. 1, our interest lies in characterising a highly conducting magnetic object, \(B_\alpha\), set in an unbounded region of free space \(B_\alpha ^c:= {\mathbb {R}}^3 \setminus \overline{B_\alpha }\), where the overbar denotes the closure. Later, we will also use the overbar to denote the complex conjugate; however, it should be clear from the context as to which definition applies. We write \(B_\alpha = \alpha B + {\varvec{z}}\) so that the object can be described by a unit-sized object B placed at the origin, which is scaled by a size parameter \(\alpha \ll 1\) (measured in m) and translated by \({\varvec{z}}\). At a position \({\varvec{x}}\), the material properties are
$$\begin{aligned} \mu _\alpha ({\varvec{x}}):= {\left\{ \begin{array}{ll} \mu _* & \; {\varvec{x}} \in B_\alpha \\ \mu _0 & \; {\varvec{x}} \in B^c_\alpha \end{array}\right. }, \qquad \sigma _\alpha ({\varvec{x}}):= {\left\{ \begin{array}{ll} \sigma _* & \; {\varvec{x}} \in B_\alpha \\ 0 & \; {\varvec{x}} \in B^c_\alpha \end{array}\right. }, \end{aligned}$$
(1)
where \(\mu\) represents the permeability [Hm\(^{-1}\)] and \(\sigma\) represents the conductivity [Sm\(^{-1}\)], the subscript \(*\) denotes their values inside \(B_\alpha\) and the subscript 0 their values outside. The free space permeability is \(\mu _0:= 4 \pi \times 10^{-7}\) Hm\(^{-1}\) and we introduce the relative permeability \(\mu _r:= \mu _*/\mu _0\) inside the object.
An asymptotic formula has been established for the perturbation in magnetic field \(({\varvec{H}}_\alpha -{\varvec{H}}_0) ({\varvec{x}})\) at positions \({\varvec{x}}\) away from \(B_\alpha\) as \(\alpha \to 0\) when the object is placed in a time-varying low frequency magnetic background field \({\varvec{H}}_0\) generated by an electric current source placed external to the body [2, 21]. This assumes that the eddy current approximation of the time harmonic Maxwell system has been applied, in which displacement currents are neglected, and is appropriate given the highly conducting nature of \(B_\alpha\) and the low angular frequencies \(\omega\) of the excitation. The form of this expansion is
$$\begin{aligned} \left( \left( {\varvec{H}}_\alpha -{\varvec{H}}_0 \right) \left( {\varvec{x}}\right) \right) _i = \left( {\varvec{D}}^2_{{\varvec{x}}} G\left( {\varvec{x}},{\varvec{z}} \right) \right) _{ij} \left( {\mathcal {M}}\right) _{jk} \left( {\varvec{H}}_0\left( {\varvec{z}}\right) \right) _k +({\varvec{R}}({\varvec{x}}))_i, \end{aligned}$$
(2)
with \({\varvec{R}} ({\varvec{x}} )\) denoting the residual, which satisfies \(\vert {\varvec{R}} ({\varvec{x}}) \vert \le C \alpha ^4\) with C being a constant independent of \(\alpha\). In the above, \(G\left( \varvec{x,z} \right) :=1/\left( 4\pi |{\varvec{x}}-{\varvec{z}}|\right)\) is the free space Laplace Green’s function and \({\varvec{D}}^2_{{\varvec{x}}} G\) denotes the Hessian of G. The subscripts i, j, and k denote the component indices and the Einstein summation convention is assumed. The complex symmetric rank 2 tensor \({\mathcal {M}} = \left( {\mathcal {M}} \right) _{jk}{\varvec{e}}_j\otimes {\varvec{e}}_k\) is the MPT and can be decomposed as [1]
$$\begin{aligned} {\mathcal {M}} = {\mathcal {N}}^0 + {\mathcal {R}} + \textrm{i}{\mathcal {I}}=\tilde{{\mathcal {R}}} + \textrm{i}{\mathcal {I}}, \end{aligned}$$
(3)
where \(\textrm{i}:=\sqrt{-1}\), \({\mathcal {N}}^0 (\alpha B, \mu _r)\) denotes its magnetostatic contribution, \(\tilde{{\mathcal {R}}} (\alpha B, \omega ,\sigma _*,\mu _r) = {\mathcal {N}}^0(\alpha B, \mu _r) + {\mathcal {R}} (\alpha B, \omega ,\sigma _*,\mu _r)\) its frequency dependent real part and \({\mathcal {I}}(\alpha B, \omega ,\sigma _*,\mu _r)\) its frequency dependent imaginary part. Their coefficients can be found from
$$\begin{aligned} \left( {\mathcal {N}}^0\right) _{ij}&= \alpha ^3 \delta _{ij} \int _B \left( 1- {\mu _r}^{-1} \right) \textrm{d}\varvec{\xi } \nonumber \\&\quad + \frac{\alpha ^3}{4} \int _{B\cup B^c} {\tilde{\mu }}_r^{-1}\nabla \times \tilde{\varvec{\theta }}_i^{(0)} \cdot \nabla \times \tilde{ \varvec{\theta }}^{(0)}_j \textrm{d} \varvec{\xi }, \end{aligned}$$
(4a)
$$\begin{aligned} \left( {\mathcal {R}}\right) _{ij}&= -\frac{\alpha ^3}{4} \int _{B\cup B^c} {\tilde{\mu }}_r^{-1}\nabla \times \varvec{\theta }^{(1)}_i \cdot \nabla \times \overline{\varvec{\theta }^{(1)}_j} \textrm{d} \varvec{\xi }, \end{aligned}$$
(4b)
$$\begin{aligned} \left( {\mathcal {I}}\right) _{ij}&= \frac{\alpha ^3}{4} \int _B \nu \left( \varvec{\theta }^{(1)}_i + \left( \tilde{\varvec{\theta }}^{(0)}_i + {\varvec{e}}_i\times \varvec{\xi } \right) \right) \nonumber \\&\quad \cdot\overline{\left( \varvec{\theta }^{(1)}_j + \left( \tilde{\varvec{\theta }}^{(0)}_j + {\varvec{e}}_j\times \varvec{\xi } \right) \right) } \textrm{d} \varvec{\xi }. \end{aligned}$$
(4c)
In (4), \(\delta _{ij}\) is the Kronecker delta, \(\nu :=\alpha ^2\sigma _*\mu _0\omega\), and \({\tilde{\mu }}_r(\varvec{\xi })=\mu _r\) inside B and \({\tilde{\mu }}_r(\varvec{\xi })=1\) outside where \(\varvec{\xi }\) is chosen to be measured from an origin inside B. The vector field \(\varvec{\theta }_i = \varvec{\theta }^{(0)}_i + \varvec{\theta }^{(1)}_i =\tilde{\varvec{\theta }}^{(0)}_i + {\varvec{e}}_i \times \varvec{\xi } + \varvec{\theta }^{(1)}_i\) is the solution to the transmission problem [1]:
$$\begin{aligned} \nabla \times \tilde{\mu }_r^{-1} \nabla \times \varvec{\theta }_i - \textrm{i}\nu \varvec{\theta }_i = \textrm{i}\nu {\varvec{e}}_i \times \varvec{\xi } \qquad \text {in } B, \end{aligned}$$
(5a)
$$\begin{aligned} \nabla \cdot \varvec{\theta }_i = 0 \qquad \text {in } B^c= {{\mathbb {R}}}^3 \setminus {\overline{B}}, \end{aligned}$$
(5b)
$$\begin{aligned} \nabla \times \nabla \times \varvec{\theta }_i = {\varvec{0}} \qquad \text {in } B^c, \end{aligned}$$
(5c)
$$\begin{aligned} \left[ \varvec{\theta }_i \times {\varvec{n}} \right] _\Gamma = {\varvec{0}} \qquad \text {on } \Gamma := \partial B, \end{aligned}$$
(5d)
$$\begin{aligned} \left[ {\tilde{\mu }}_r^{-1} \nabla \times \varvec{\theta }_i \times {\varvec{n}} \right] _\Gamma = -2\left[ {\tilde{\mu }}_r^{-1} \right] _\Gamma {\varvec{e}}_i \times {\varvec{n}} \qquad \text {on } \Gamma , \end{aligned}$$
(5e)
$$\begin{aligned} \varvec{\theta }_i\left( \varvec{\xi }\right) = {\mathcal {O}}\left( |\varvec{\xi }|^{-1}\right) \qquad \text {as } |\varvec{\xi }|\rightarrow \infty , \end{aligned}$$
(5f)
where \([ \cdot ]_\Gamma\) denotes the jump over \(\Gamma\) and \({\varvec{n}}\) is the unit outward normal. The above problem can be split to form separate problems for \(\tilde{\varvec{\theta }}_i^{(0)}\) and \(\varvec{\theta }_i^{(1)}\), where the former is independent of \(\omega\) and is a real vector field and the latter is a complex frequency-dependent vector field. Scaling results are available that allow the immediate calculation of \(({\mathcal {M}})_{ij}\) for new values of \(\sigma _*\) and \(\alpha\). The goal of this paper is to compare ROM approaches for rapidly predicting \(({\mathcal {M}})_{ij}\) for different \(\mu _r\) and \(\omega\) [5]. This in turn will aid with creating large dictionaries of object characterisations for training machine learning classifiers, which cannot be achieved through the application of scaling results.
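To make the scaling results concrete: from (4), \(\alpha\) and \(\sigma _*\) enter the coefficients only through the prefactor \(\alpha ^3\) and through \(\nu = \alpha ^2 \sigma _* \mu _0 \omega\) (the \(\tilde{\varvec{\theta }}^{(0)}_i\) problem depends only on \(\mu _r\)), so a signature computed for one \((\alpha , \sigma _*)\) can be re-used exactly for another at fixed \(\mu _r\). A minimal sketch of such a rescaling (our illustration, not code from [5]; `mpt_interp` is a hypothetical interpolant of a previously computed signature):

```python
def rescale_mpt(mpt_interp, omega_new, alpha, sigma, alpha_new, sigma_new):
    """Re-use an MPT signature computed for (alpha, sigma) at fixed mu_r.

    From (4), alpha and sigma_* enter only via the alpha^3 prefactor and
    nu = alpha^2 * sigma_* * mu_0 * omega, so matching nu gives an exact
    rescaling (mu_0 cancels in the ratio).
    """
    # Frequency at which the original object attains the same nu
    omega_equiv = omega_new * (alpha_new**2 * sigma_new) / (alpha**2 * sigma)
    return (alpha_new / alpha) ** 3 * mpt_interp(omega_equiv)
```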

3 Finite Element Approximation

As described in [5], by truncating the unbounded domain sufficiently far from B to create a finite computational domain \(\Omega\), replacing the far field condition (5f) with \({\varvec{n}} \times {\varvec{\theta }}_i = {\varvec{0}}\) on \(\partial \Omega\), circumventing the Coulomb type gauge \(\nabla \cdot \varvec{\theta }_i = 0\) in \(\Omega {\setminus } {\overline{B}}\) with numerical regularisation (by solving a perturbed problem involving a small regularisation parameter \(\varepsilon\)), and employing a higher order \({\varvec{H}}(\text {curl})\) conforming finite element approximation, a discrete approximation to the continuous weak forms for the \(\tilde{\varvec{\theta }}^{(0)}_i\) and \(\varvec{\theta }^{(1)}_i\) problems can be established. For both problems, an unstructured mesh of tetrahedral elements of size h is used to partition \(\Omega\) and order p elements are applied, leading to a linear system of equations of the form
$$\begin{aligned} {\textbf{A}}{\textbf{q}}\left( \varvec{\omega }\right) = {\textbf{r}}, \end{aligned}$$
(6)
where for \(\varvec{\theta }_i^{(1)}\), \({\textbf{A}}\in {\mathbb {C}}^{N_{dof}\times N_{dof}}\) is a large sparse complex symmetric matrix, \({\textbf{q}}\left( \varvec{\omega }\right) \in {\mathbb {C}}^{N_{dof}}\) is a parameter dependent solution with \(\varvec{\omega }\) indicating the list of model parameters to be varied, and \({\textbf{r}}\in {\mathbb {C}}^{N_{dof}}\) is a known forcing vector. The situation is similar for the simpler \(\tilde{\varvec{\theta }}_i^{(0)}\) problem, which involves real matrices. Once the solution to (6) has been established, the discrete approximation to \(\varvec{\theta }^{(1)}_i\) is recovered using
$$\begin{aligned} \varvec{\theta }_i^{(1,hp) }( \varvec{\omega },\varvec{\xi }) = \sum _{u=1}^{N_{dof}} \textrm{q}_u^{(1)}(\varvec{\omega }) {\textbf{N}}_u (\varvec{\xi }), \end{aligned}$$
(7)
where \({\textbf{N}}_u(\varvec{\xi })\) is a typical \({\varvec{H}}(\text {curl})\) conforming basis function and \(N_{dof}\) is the number of degrees of freedom in the finite element approximation. A similar approximation also applies for \(\tilde{\varvec{\theta }}_i^{(0,hp)}\). The approximate computation of \(({\mathcal {M}})_{ij}\) then follows by replacing \({\varvec{\theta }}_i^{(1)}\) with \({\varvec{\theta }}_i^{(1,hp) }\), \(\tilde{\varvec{\theta }}_i^{(0)}\) with \(\tilde{\varvec{\theta }}_i^{(0,hp)}\), and \(B^c\) with \(\Omega {\setminus } {\overline{B}}\) in (4).
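For orientation, the structure of the full-order solve and recovery can be sketched with generic sparse linear algebra; in the actual implementation (6) is assembled with NGSolve's \({\varvec{H}}(\text {curl})\) elements, and `assemble_A`, `assemble_r` below are hypothetical placeholders for that assembly:

```python
import scipy.sparse.linalg as spla

def solve_snapshot(assemble_A, assemble_r, params):
    """Solve the full-order linear system (6) for one parameter set.

    assemble_A / assemble_r stand in for the H(curl) finite element
    assembly; A is a large sparse complex symmetric matrix.
    """
    A = assemble_A(params)          # N_dof x N_dof sparse matrix
    r = assemble_r(params)          # N_dof right-hand side
    q = spla.spsolve(A.tocsc(), r)  # direct solve shown for brevity; the
                                    # full-order code uses preconditioned CG
    return q                        # coefficients q_u in the expansion (7)
```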

4 Reduced Order Approaches

The repeated solution of equation (6) for different model parameters \(\varvec{\omega }\) is computationally expensive. In [5], a projection based POD approach was developed (called PODP) in which the solution of (6) for new \({\textbf{q}}\left( \varvec{\omega }\right)\) is replaced by solving a smaller projected linear system of equations of size \(M \times M\) where \(M\ll N_{dof}\). The reduced system was obtained by extracting the modal behaviour from a small number of solution snapshots and using Galerkin projection. The implementation in [5] was limited to the case where \(\varvec{\omega }=[\omega ]\) and we consider the extension to \(\varvec{\omega }=[\omega , \mu _r]\).
As alternatives to this approach, we also consider two other approaches that are less intrusive to the software as they do not need direct access to (decompositions of) both \({\textbf{A}}\) and \({\textbf{r}}\). In the first of these alternatives, we employ a technique equivalent to POD–NN used by Hesthaven and Ubbiali [19], which is built by performing a neural network regression, leading to an approximation to \({\textbf{q}}\left( \varvec{\omega }\right)\) for new \(\varvec{\omega }\). The approximate \(\tilde{\varvec{\theta }}_i^{(0)}\) and \(\varvec{\theta }_i^{(1)}\) are then obtained from (7) and the MPT coefficients are obtained as before by a simple post-processing. In the second of the alternatives, we consider a direct neural network regression of the MPT coefficients to predict the MPT coefficients for new \(\varvec{\omega }\). PODP and POD–NN share the same off-line stage and have different on-line stages, as described below.

4.1 Off–line stage

In the off-line stage, snapshot solutions \({\textbf{q}}_i^{(0)}({\varvec{\omega }}_n) = {\textbf{q}}_i^{(0)} (\mu _{r,n})\) and \({\textbf{q}}_i^{(1)}(\varvec{\omega }_n)\), corresponding to the finite element solution coefficients for \(\tilde{\varvec{\theta }}^{(0,hp)}_i(\mu _{r,n})\) and \(\varvec{\theta }^{(1,hp)}_i(\varvec{\omega }_n)\), are first obtained for a small number of sets of snapshot parameters \(\varvec{\omega }_n= [\omega _n, \mu _{r,n}]\), \(n=1,\ldots ,N\), by solving systems of the form (6). Based on previous experience in [5], we choose the snapshot parameters \(\varvec{\omega }_n\) to be logarithmically spaced over a two dimensional grid. In addition, the solution snapshots \({\textbf{q}}^{(0)}_i \left( \mu _{r,n} \right)\) were post-processed by applying a Poisson projection [15]:
$$\begin{aligned} \tilde{\varvec{\theta }}_i^{(0,hp)} \rightarrow \nabla \Delta ^{-1} \textrm{div}\left( \tilde{\varvec{\theta }}_i^{(0,hp)}\right) , \end{aligned}$$
(8)
which improves the gauging of \(\tilde{\varvec{\theta }}_i^{(0,hp)}\) without changing \(\nabla \times \tilde{\varvec{\theta }}_i^{(0,hp)}\). The matrices \({\textbf{D}}^{(0)}_i\in {\mathbb {R}}^{N_{dof} \times N}\) and \({\textbf{D}}^{(1)}_i\in {\mathbb {C}}^{N_{dof} \times N}\), \(i=1,2,3\), are then defined as a concatenation of \({\textbf{q}}_i^{(0)}(\varvec{\omega }_n)\) or \({\textbf{q}}_i^{(1)}(\varvec{\omega }_n)\) as
$$\begin{aligned} {\textbf{D}}^{(s)}_i:= \left[ {\textbf{q}}^{(s)}_i \left( \varvec{\omega }_1 \right) , {\textbf{q}}^{(s)}_i \left( \varvec{\omega }_2 \right) , {\textbf{q}}^{(s)}_i \left( \varvec{\omega }_3 \right) ,\ldots , {\textbf{q}}^{(s)}_i \left( \varvec{\omega }_ N \right) \right] . \end{aligned}$$
(9)
For the on-line stages of the PODP and POD–NN approaches, discussed in Sects. 4.2 and 4.3 respectively, the off-line stage continues with a singular value decomposition (SVD) applied to each \({\textbf{D}}^{(s)}_i\) in order to extract modal information
$$\begin{aligned} {\textbf{D}}^{(s)}_i = \mathbf {U\Sigma V}^\textrm{H} \approx {\textbf{U}}^M \mathbf {\Sigma }^M \left( {\textbf{V}}^M\right) ^\textrm{H}, \end{aligned}$$
(10)
where, for simplicity of presentation, we omit the dependence of the SVD matrices and their truncated counterparts on i and (s). In the above, \(\textrm{H}\) denotes the Hermitian (conjugate transpose) and, in the approximation, \(M<N\) corresponds to the level of truncation, which is determined by prescribing a tolerance TOL on the ordered relative singular values contained in the diagonal elements of \(\mathbf {\Sigma }\). This truncation results in a truncated matrix \({\textbf{U}}^M \in {\mathbb {C}}^{N_{dof} \times M}\) containing the first M columns of the unitary matrix \({\textbf{U}}\), a square matrix \(\mathbf {\Sigma }^M \in {\mathbb {R}}^{M\times M}\) containing the truncated singular values on the diagonal and \({\textbf{V}}^M \in {\mathbb {C}}^{N\times M}\), which is obtained by taking the first M columns of the unitary matrix \({\textbf{V}}\). Using (10), we recover an approximation to \({\textbf{q}}_i^{(s)}(\varvec{\omega }_n)\) as follows:
$$\begin{aligned} {\textbf{q}}_i ^{(s)} \left( \varvec{\omega }_n \right) \approx {\textbf{U}}^M \mathbf {\Sigma }^M \left( \left( {\textbf{V}}^M\right) ^\textrm{H}\right) _{:,n}, \end{aligned}$$
(11)
where \(\left( \left( {\textbf{V}}^M\right) ^\textrm{H}\right) _{:,n}\) refers to the nth column of \(\left( {\textbf{V}}^M\right) ^\textrm{H}\). Note that in the case of \(s=0\), the matrices \(\textbf{U}^M\) and \(\textbf{V}^M\) are real and so \(\textrm{H}\) can be replaced with transpose.
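A minimal sketch of the truncation step, implementing (10) and (11) with the truncation rule stated above (our illustration, with NumPy):

```python
import numpy as np

def truncated_svd(D, tol=1e-6):
    """Truncated SVD (10) of a snapshot matrix D in C^{N_dof x N}.

    Modes are kept while the ordered relative singular values exceed tol.
    """
    U, s, Vh = np.linalg.svd(D, full_matrices=False)
    M = int(np.count_nonzero(s / s[0] > tol))   # level of truncation
    return U[:, :M], s[:M], Vh[:M, :]

# Recovery of the n-th snapshot, cf. (11):
#   q_n ~ UM @ (sM[:, None] * VhM)[:, n]
```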

4.2 PODP - Projection based ROM

Following the PODP approach described by Wilson and Ledger in [5], we briefly describe how it can be extended to multiple parameters.
On–line stage In the on-line stage, we solve a small linear system (equation (24) of [5]) of the form
$$\begin{aligned} {\textbf{A}}^M(\varvec{\omega }){\textbf{p}}^M (\varvec{\omega }) = {\textbf{r}}^M (\varvec{\omega }), \end{aligned}$$
(12)
of size \(M\times M\) for \({\textbf{p}}^M (\varvec{\omega }) \in {{\mathbb {C}}}^M\) where \({\textbf{A}}^M (\varvec{\omega }):= \left( {\textbf{U}}^M \right) ^\textrm{H} {\textbf{A}}(\varvec{\omega }) {\textbf{U}}^M\), and \({\textbf{r}}^M:= \left( {\textbf{U}}^M \right) ^\textrm{H} {\textbf{r}}( \varvec{\omega })\) (with reduction to real matrices for \(s=0\)). Once \({\textbf{p}}^M (\varvec{\omega })\) is obtained, we use the approximation \({\textbf{q}}_i^{(s)} \approx {\textbf{U}}^M {\textbf{p}}^M (\varvec{\omega })\) [5]. Repeating this for \(s=0,1\), \(i=1,2,3\) and combining with (7) allows us to approximate \(\tilde{\varvec{\theta }}_i^{(0,hp)}\) and \({\varvec{\theta }}_i^{(1,hp)}\) and, hence, the approximation \(({\mathcal {M}}^{\textrm{PODP}})_{ij}\) for new \(\varvec{\omega }\).
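A sketch of this on-line solve (our illustration; in practice the reduced matrix is assembled from parameter-independent blocks via the splitting (16) of Sect. 4.2.1 rather than projected anew for each \(\varvec{\omega }\)):

```python
import numpy as np

def podp_online(UM, A, r):
    """PODP on-line stage: project the full-order system (6) onto the POD
    basis, solve the M x M reduced system (12), and lift back."""
    AM = UM.conj().T @ (A @ UM)     # A^M = (U^M)^H A U^M
    rM = UM.conj().T @ r            # r^M = (U^M)^H r
    pM = np.linalg.solve(AM, rM)    # small dense solve, M << N_dof
    return UM @ pM                  # q(omega, mu_r) ~ U^M p^M
```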

4.2.1 Error Estimation

A detailed proof of an error estimate \(\Delta (\varvec{\omega })_{ij}\), which provides the upper bound
$$\begin{aligned} \left| ({\mathcal {M}} (\varvec{\omega })^{\textrm{PODP}})_{ij} - ({\mathcal {M}} (\varvec{\omega }))_{ij}\right| \le \Delta (\varvec{\omega })_{ij}, \end{aligned}$$
(13)
on the MPT coefficients obtained by the PODP approximation with respect to \(({\mathcal {M}} (\varvec{\omega }))_{ij}\) obtained using the full-order finite element solve for \(\varvec{\omega }=[\omega ]\) is established in [5]. This naturally extends to the case where \(\varvec{\omega }=[\omega ,\mu _r]\) using
$$\begin{aligned} \Delta (\varvec{\omega })_{ij} = \frac{\alpha ^3}{8\alpha _{LB}} \left( \Vert \hat{{\varvec{r}}}_i\left( \varvec{\omega }\right) \Vert _{Y^{(hp)}}^2 + \Vert \hat{{\varvec{r}}}_j\left( \varvec{\omega }\right) \Vert _{Y^{(hp)}}^2 + \Vert \hat{{\varvec{r}}}_i \left( \varvec{\omega } \right) - \hat{{\varvec{r}}}_j\left( \varvec{\omega }\right) \Vert _{Y^{(hp)}}^2\right) , \end{aligned}$$
(14)
and, in this case, \(\alpha _{LB}\) is the lower bound on a stability constant obtained by taking the smallest eigenvalue from an eigenvalue problem [11, pg 56] for the smallest frequency and inverse permeability of interest. \(Y^{(hp)}\) is as defined in [5] and corresponds to the set of \({\varvec{H}}(\textrm{curl})\) conforming functions in the discretisation.
Similarly to [5], to efficiently compute \(\hat{{\varvec{r}}}_i\left( \varvec{\omega }\right)\) and \(\hat{{\varvec{r}}}_j\left( \varvec{\omega }\right)\), we build a discrete analogue of the residual by first constructing
$$\begin{aligned} {\textbf{G}}^{(i,j)} = \left( {\textbf{W}}^{(i)}\right) ^\textrm{H} {\textbf{M}}_0^{-1} {\textbf{W}}^{(j)}, \end{aligned}$$
(15)
once for all \(\varvec{\omega }\) where \({\textbf{M}}_0\) is the real symmetric mass matrix for the lowest order basis functions. To obtain \({\textbf{W}}^{(i)}\) for \(\varvec{\omega }=[\omega ,\mu _r]\), we consider the splittings
$$\begin{aligned} {\textbf{A}} = {\textbf{B}}^{(0)} + \mu _r^{-1}{\textbf{B}}^{(1)} + \omega {\textbf{C}}^{(1)}, \qquad {\textbf{r}} = \omega {\textbf{r}}^{(1)}, \end{aligned}$$
(16)
where
$$\begin{aligned} \left( {\textbf{B}}^{(0)}\right) _{ij}&= \int _{\Omega \backslash {\overline{B}}} \nabla \times {\textbf{N}}_i \cdot \nabla \times {\textbf{N}}_j \;\textrm{d}\varvec{\xi } + \varepsilon \int _{\Omega \backslash {\overline{B}}} {\textbf{N}}_i\cdot {\textbf{N}}_j \; \textrm{d}\varvec{\xi } , \end{aligned}$$
(17a)
$$\begin{aligned} \left( {\textbf{B}}^{(1)}\right) _{ij}&= \int _{B} \nabla \times {\textbf{N}}_i \cdot \nabla \times {\textbf{N}}_j \;\textrm{d}\varvec{\xi } , \end{aligned}$$
(17b)
$$\begin{aligned} \left( {\textbf{C}}^{(1)}\right) _{ij}&=- \textrm{i}\int _{B} \alpha ^2 \sigma _*\mu _0{\textbf{N}}_i\cdot {\textbf{N}}_j \; \textrm{d}\varvec{\xi }, \end{aligned}$$
(17c)
and \({\textbf{r}}^{(1)}\) is similarly constructed by taking out the factor \(\omega\) from \({\textbf{r}}\). Then, using \({\textbf{U}}^{(M,i)}\) to denote \({\textbf{U}}^M\) for the ith direction and \(\varvec{\theta }^{(1,hp)}_i\), and similarly for \(\mathbf {\Sigma }^{(M,i)}\), we can determine \({\textbf{W}}^{(i)}\) as
$$\begin{aligned} {\textbf{W}}^{(i)}:= {\textbf{P}}_0^p\left( {\textbf{r}}^{(1)}, \; {\textbf{B}}^{(0)}{\textbf{U}}^{(M,i)}, \; {\textbf{B}}^{(1)}{\textbf{U}}^{(M,i)}, \; {\textbf{C}}^{(1)}{\textbf{U}}^{(M,i)} \right) , \end{aligned}$$
(18)
where \({\textbf{P}}_0^p\) is the projection from order p to order 0 \({\varvec{H}}(\text {curl})\) conforming basis functions. It then follows that an efficient evaluation of the contributions to the error estimate is given by
$$\begin{aligned} \Vert \hat{{\varvec{r}}}_i\left( \varvec{\omega }\right) \Vert _{Y^{(hp)}}^2 = \left( {\textbf{w}}^{(i)}\left( \varvec{\omega }\right) \right) ^\textrm{H} {\textbf{G}}^{(i,i)} \, {\textbf{w}}^{(i)}\left( \varvec{\omega }\right) , \end{aligned}$$
(19)
$$\begin{aligned} \Vert \hat{{\varvec{r}}}_i\left( \varvec{\omega }\right) - \hat{{\varvec{r}}}_j\left( \varvec{\omega }\right) \Vert _{Y^{(hp)} }^2 = \Vert \hat{{\varvec{r}}}_i\left( \varvec{\omega }\right) \Vert _{Y^{(hp)} }^2 + \Vert \hat{{\varvec{r}}}_j\left( \varvec{\omega }\right) \Vert _{Y^{(hp)} }^2 - 2\,\textrm{Re}\left( \left( {\textbf{w}}^{(i)}\left( \varvec{\omega }\right) \right) ^\textrm{H} {\textbf{G}}^{(i,j)} \, {\textbf{w}}^{(j)}\left( \varvec{\omega }\right) \right) , \end{aligned}$$
(20)
for each parameter vector, \(\varvec{\omega }\), by updating
$$\begin{aligned} {\textbf{w}}^{(i)}\left( \varvec{\omega }\right) = \begin{bmatrix} \omega \\ -{\textbf{p}}^{(M,i)}\left( \varvec{\omega }\right) \\ -\mu _r^{-1}{\textbf{p}}^{(M,i)}\left( \varvec{\omega }\right) \\ -\omega {\textbf{p}}^{(M,i)}\left( \varvec{\omega }\right) \\ \end{bmatrix}. \end{aligned}$$
(21)
In (18), \(\textbf{r}^{(1)}\) depends on \(\mu _r\), which means that, in our implementation, we construct the matrices iteratively. However, it is also possible to formulate an alternative construction of \(\textbf{W}^{(i)}\) and \(\textbf{w}^{(i)}(\varvec{\omega })\) from (5), provided that the \(\tilde{\varvec{\theta }}_i^{(0)}\) and \(\varvec{\theta }_i^{(1)}\) problems are not split. The MPT coefficients in this case can be computed using the alternative (but equivalent) formulation in [2].
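For illustration, once the matrices \({\textbf{G}}^{(i,j)}\) of (15) have been formed, evaluating the certificate for each new \(\varvec{\omega }\) reduces to a few small dense products; a sketch (variable names are ours):

```python
import numpy as np

def delta_ij(Gii, Gjj, Gij, wi, wj, alpha, alpha_LB):
    """Evaluate the PODP error certificate Delta(omega)_ij of (14).

    Gii, Gjj, Gij are G^(i,i), G^(j,j), G^(i,j) from (15), precomputed once;
    wi, wj are the vectors w^(i), w^(j) of (21) updated for the current
    (omega, mu_r).
    """
    ri2 = np.real(wi.conj() @ Gii @ wi)    # ||r_i||^2, cf. (19)
    rj2 = np.real(wj.conj() @ Gjj @ wj)    # ||r_j||^2
    rij2 = ri2 + rj2 - 2.0 * np.real(wi.conj() @ Gij @ wj)   # cf. (20)
    return alpha**3 / (8.0 * alpha_LB) * (ri2 + rj2 + rij2)
```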

4.3 POD–NN - Neural Network Enhanced ROM

We follow an approach that is equivalent to the POD–NN approach described in [18, 19].
On–line stage In POD-NN, the approximate solution for new parameters \(\varvec{\omega }\) is taken as
$$\begin{aligned} {\textbf{q}}_i ^{(s)} \left( \varvec{\omega } \right) \approx {\textbf{U}}^M {\varvec{R}} \left( \varvec{\omega }, {{\varvec{c}}} \left( \mathbf {\Sigma }^M \left( {\textbf{V}}^M\right) ^\textrm{H} \right) \right) , \end{aligned}$$
(22)
and the mth component of \({{\varvec{R}}}\), \(R_m \left( \varvec{\omega }, {{\varvec{c}}}_m \right)\), is a prescribed function (e.g. a polynomial or some other smoothly varying differentiable function) whose coefficients \({\varvec{c}}\) are found from solving the minimisation problem
$$\begin{aligned} \text {min}_{ {\varvec{c}}} \frac{1}{N} \sum _{n=1}^{N} \left| \left( {\varvec{\Sigma }}^M ({\varvec{V}}^M)^{\textrm{H}}\right) _{:, n} - {\varvec{R}} \left( \varvec{\omega }_n, {{\varvec{c}}} \right) \right| ^2. \end{aligned}$$
(23)
Due to their superior interpolation properties, a neural network is used for this regression and, hence, the name POD–NN. Our implementation differs from [19] in that we train the network based on columns of \({\varvec{\Sigma }}^M (\textbf{V}^M)^{\textrm{H}}\) rather than columns of \((\textbf{U}^M)^{\textrm{H}} \textbf{D}\), but note that by (10) \({\varvec{\Sigma }}^M (\textbf{V}^M)^{\textrm{H}} \approx (\textbf{U}^M)^{\textrm{H}} \textbf{D}\) and so the two approaches are equivalent.
For the specific details of the implementation, which involves training individual networks for each direction \(i=1,2,3\) and \(s=0,1\), we refer to [18]. Combining with (7) allows us to approximate \(\tilde{\varvec{\theta }}_i^{(0,hp)}\) and \(\varvec{\theta }_i^{(1,hp)}\) and, hence, the approximation \(({\mathcal {M}}(\varvec{\omega })^{\mathrm{POD-NN}})_{ij}\) for new \(\varvec{\omega }\).
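A minimal sketch of this regression for one direction i and one of \(s=0,1\), using Scikit-Learn (see Sect. 4.5); the architecture shown (two hidden layers of 8 neurons, tanh) anticipates the choice made in Sect. 5.1.2, and `omega_n`, `mu_r_n`, `UM`, `sM`, `VhM` are assumed to hold the snapshot parameters and the TSVD factors from (10):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Inputs: snapshot parameters; targets: columns of Sigma^M (V^M)^H,
# split into real and imaginary parts as in (23).
X = np.column_stack([np.log10(omega_n), mu_r_n])   # N x 2
SVh = (sM[:, None] * VhM).T                        # N x M reduced coordinates
Y = np.hstack([SVh.real, SVh.imag])                # N x 2M
scaler = StandardScaler().fit(Y)                   # zero mean, unit std (Sect. 4.5)
net = MLPRegressor(hidden_layer_sizes=(8, 8), activation='tanh',
                   solver='lbfgs', tol=1e-10, max_iter=5000)
net.fit(X, scaler.transform(Y))

def q_podnn(omega, mu_r):
    """Evaluate (22): regress the reduced coordinates and lift with U^M."""
    y = scaler.inverse_transform(net.predict([[np.log10(omega), mu_r]]))[0]
    R = y[:sM.size] + 1j * y[sM.size:]
    return UM @ R
```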

4.4 NNR - Neural Network Regression of MPT coefficients

As a final alternative, we consider a neural network regression technique for predicting \(({\mathcal M}(\varvec{\omega })^{\mathrm{{NNR}}})_{ij}\) for new parameters \({\varvec{\omega }}\). This is the least invasive of the techniques considered as the prediction can be obtained by curve-fitting. However, it lacks the physical insights that are gained from using the modal information in the other POD-based approaches and so we expect it to be less accurate for the same parameter snapshots. To fix ideas, let \(({\mathcal {M}}(\varvec{\omega }_n))_{ij} = y_{\textrm{r},n} + \textrm{i}y_{\textrm{i},n}\) and introduce the dictionaries of ordered pairs
$$\begin{aligned} {{\textbf{D}}}_{\textrm{r}}&:= \left\{ \left( \varvec{\omega }_1 ,y_{\textrm{r}, 1} \right) ,\left( \varvec{\omega }_2 ,y_{\textrm{r}, 2} \right) , \ldots , \left( \varvec{\omega }_N, y_{\textrm{r}, N} \right) \right\} , \end{aligned}$$
(24a)
$$\begin{aligned} {{\textbf{D}}}_{\textrm{i}}&:= \left\{ \left( \varvec{\omega }_1 ,y_{\textrm{i}, 1 }\right) , \left( \varvec{\omega }_2 ,y_{\textrm{i}, 2} \right) , \ldots , \left( \varvec{\omega }_N ,y_{\textrm{i}, N} \right) \right\} . \end{aligned}$$
(24b)
By splitting each of the dictionaries into training and testing parts using a \(15\%\) reserve such that \(N^{train}<N\), we then apply feed-forward networks [22] that are trained to minimise the functionals
$$\begin{aligned} \min _{{{\varvec{m}}}_{\textrm{r}}} \frac{1}{N^{train}} \sum _{n=1}^{N^{train}} |y_{\textrm{r}, n} - {\tilde{y}}_\textrm{r}(\varvec{\omega }_n, {\varvec{m}}_{\textrm{r}} )|^2, \qquad \min _{{{\varvec{m}}}_{\textrm{i}} } \frac{1}{N^{train}} \sum _{n=1}^{N^{train}} |y_{\textrm{i}, n} - {\tilde{y}}_\textrm{i}(\varvec{\omega }_n, {\varvec{m}}_{\textrm{i}} )|^2 \end{aligned}$$
(25)
for the model parameters \({{\varvec{m}}}_{\textrm{r}}\) and \({{\varvec{m}}}_{\textrm{i}}\) in order to predict \(({\mathcal M}(\varvec{\omega })^{\textrm{NNR}})_{ij}={\tilde{y}}_{\textrm{r}}(\varvec{\omega }, {\varvec{m}}_{\textrm{r}}) + \textrm{i}{\tilde{y}}_{\textrm{i}}(\varvec{\omega }, {\varvec{m}}_{\textrm{i}} )\). This can equivalently be used to directly build a regression of the MPT eigenvalues.
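A sketch of NNR for a single coefficient, again with Scikit-Learn; `omega_n`, `mu_r_n` and `y_r` are assumed to hold the snapshot parameters and the real parts \(y_{\textrm{r},n}\) (the imaginary part is fitted by a second, identical network):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X = np.column_stack([np.log10(omega_n), mu_r_n])   # snapshot parameters
# 15% reserve for testing, as described above
X_tr, X_te, y_tr, y_te = train_test_split(X, y_r, test_size=0.15)
net_r = MLPRegressor(hidden_layer_sizes=(8, 8), activation='tanh',
                     solver='lbfgs', tol=1e-10, max_iter=5000).fit(X_tr, y_tr)
# Hold-out relative error of the fitted real part
err = np.linalg.norm(net_r.predict(X_te) - y_te) / np.linalg.norm(y_te)
```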
Fig. 2 shows a flowchart summarising the 3 different methods for computing the MPT coefficients with associated equation numbers.

4.5 Software

Our practical implementations1 build on the MPT-Calculator software initially developed by Wilson and Ledger [5], which uses the NGSolve library (version 6.2.2204) for the finite element computations [12–15]. For our neural-network computations, we use the Scikit-Learn library (version 1.1.2) and consider the tanh and sigmoid activation functions. Prior to training, we scale the training data so that it has mean 0 and standard deviation 1. We use a quasi-Newton type limited memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) method [23–25] to determine the model parameters \({{\varvec{m}}}_{\textrm{r}}\) and \({{\varvec{m}}}_{\textrm{i}}\). To determine the hyperparameters of the feed-forward networks, we employ a grid-based search: we consider \(\ell = 1, 2\), and 3 hidden layers, each with either \(t=2, 4, 8, 16, 32\), or 64 neurons, and either the tanh or logistic activation functions.
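A sketch of this grid-based search with Scikit-Learn (our illustration; MPT-Calculator's actual training loop may differ):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Grid over 1-3 hidden layers of 2-64 neurons and two activation functions
grid = {
    'mlpregressor__hidden_layer_sizes':
        [(t,) * l for l in (1, 2, 3) for t in (2, 4, 8, 16, 32, 64)],
    'mlpregressor__activation': ['tanh', 'logistic'],
}
pipe = make_pipeline(StandardScaler(),
                     MLPRegressor(solver='lbfgs', tol=1e-10, max_iter=5000))
search = GridSearchCV(pipe, grid, cv=5)
# search.fit(X, Y) then selects the best architecture by cross-validation
```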

5 Numerical Examples

In this section, we first compare the PODP, POD–NN and NNR approaches for the MPT characterisation of a conducting permeable sphere for different \(\omega\) and \(\mu _r\) and then show the predictive capability of the approach for a geometry that does not have an analytical solution.

5.1 Conducting Permeable Sphere

This subsection discusses the case where \(B_\alpha\) is a conducting permeable sphere with radius \(\alpha =0.01\) m, conductivity \(\sigma _*=10^6\) S/m, relative permeability \(1 \le \mu _r\le 50\) and exciting frequency \(10^1 \le \omega \le 10^5\) rad/s. An analytical solution is available for \(({{\mathcal {M}}})_{ij}\) for this geometry [26] in the form \(({{\mathcal {M}}})_{ij} = m (\alpha B, \omega , \sigma _*, \mu _r)\delta _{ij}\), which shows that \({{\mathcal {M}}}\) is a multiple of the identity in this case and, hence, \(\tilde{\varvec{\theta }}_1^{(0)}= \tilde{\varvec{\theta }}_2^{(0)}=\tilde{\varvec{\theta }}_3^{(0)}\) and \(\varvec{\theta }_1^{(1)}= \varvec{\theta }_2^{(1)}=\varvec{\theta }_3^{(1)}\) for the continuous problem.
To compute snapshot solutions for this geometry, we consider B to be a unit radius sphere centred at the origin and set up a finite computational domain \(\Omega\) in the form of a sphere of radius 200 units. The domain is discretised by a quasi-uniform tetrahedral mesh of element size 0.2 units and 57 698 elements as illustrated in Fig. 3. The curved surface of \(\Gamma\) is approximated by 5th order polynomials.

5.1.1 Full Order Model Solutions

Before considering the accuracy of the reduced order model approaches, we first consider the accuracy of the full-order model for computing \(({{\mathcal {M}}}^{hp})_{ij}\) using different uniform polynomial orders \(p=0,1,2,3,4\) for \(\tilde{\varvec{\theta }}_i^{(0,hp)}\) and \(\varvec{\theta }_i^{(1,hp)}\). In Fig. 4, we show the convergence of the first eigenvalues \(\lambda _1 ( \tilde{{\mathcal {R}}}^{hp})\) and \(\lambda _1 ( {{\mathcal {I}}}^{hp})\) obtained using the full order model to \(\lambda _1( \tilde{{\mathcal {R}}})\) and \(\lambda _1( {{\mathcal {I}} })\) for \(10 \le \omega \le 10^5\) rad/s and \(\mu _r =1, 20\). We observe a rapid convergence of eigenvalues to the exact solution. The convergence of the other eigenvalues is similar.
While the solutions obtained from p-refinement are accurate, for each choice of parameters \(\varvec{\omega }= [\omega , \mu _r]\) a preconditioned conjugate gradient solver is applied to solve (6), which requires repeated matrix–vector products involving the sparse matrix \(\textbf{A}\) with nz non-zero entries. Repeated solution of the linear system (6) for a large number of evaluation parameters is expensive.

5.1.2 Reduced Order Model Solutions

Off–line stage
Unless otherwise stated, we consider \(N=16^2\) snapshot full order model solutions using the aforementioned discretisation and uniform order \(p=2\) elements, with the ROM then evaluated over a \(K=32^2\) grid of evaluation parameters. This choice of N was identified by considering the minimum number of snapshots for which the PODP, POD–NN and NNR all gave reliable results with a relative root mean squared error (rRMSE) \(e \le 10^{-2}\) where e is
$$\begin{aligned} e = \frac{\sqrt{\sum _{k=1}^{K} |z^\textrm{APP}(\varvec{\omega }_k) - z^{hp}(\varvec{\omega }_k)|^2}}{\sqrt{\sum _{k=1}^{K} |z^{hp}(\varvec{\omega }_k)|^2 }} \end{aligned}$$
(26)
Here, \(z^{hp}({\varvec{\omega }}):= \lambda _1 (\tilde{{\mathcal {R}}}( \varvec{\omega })^{hp}) + \textrm{i} \lambda _1 ({\mathcal {I}}( \varvec{\omega })^{hp})\) and \(z^{\textrm{APP}}(\varvec{\omega }):= \lambda _1 (\tilde{{\mathcal {R}}}( \varvec{\omega })^{\textrm{APP}}) + \textrm{i} \lambda _1 ({\mathcal {I}}( \varvec{\omega })^{\textrm{APP}})\) for the \(k=1, 2, \ldots , K\) evaluation parameters, where \(\textrm{APP}\) denotes either PODP, POD–NN or NNR.
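In code, e reduces to a ratio of vector norms, since the identical \(1/K\) factors cancel; a one-line helper (our illustration):

```python
import numpy as np

def rrmse(z_app, z_hp):
    """Relative root mean squared error e of (26) over the K evaluations."""
    return np.linalg.norm(z_app - z_hp) / np.linalg.norm(z_hp)
```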
The snapshots are computed for logarithmically spaced parameters \(\omega _n=10^{{\tilde{\omega }}_n}\), where \({\tilde{\omega }}_n\) is drawn from a 16 sample linearly spaced distribution between 1 and 5, and \(\mu _{r,n}=b^{{\tilde{\mu }}_{r,n}}\), where \({\tilde{\mu }}_{r,n}\) is obtained from the linear distribution between 0 and \(\log _b(50)\). The base b is chosen to be \(50^{1/5}\).
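A sketch of this snapshot grid (note that \(\log _b(50)=5\) for \(b=50^{1/5}\)):

```python
import numpy as np

omega_n = 10.0 ** np.linspace(1, 5, 16)   # 10^1 ... 10^5 rad/s
b = 50.0 ** (1.0 / 5.0)
mu_r_n = b ** np.linspace(0, 5, 16)       # log_b(50) = 5, so 1 ... 50
# The N = 16^2 snapshot parameters are all combinations of the two lists
omega_grid, mu_grid = np.meshgrid(omega_n, mu_r_n)
snapshot_params = np.column_stack([omega_grid.ravel(), mu_grid.ravel()])
```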
Then, once (9) and (10) have been applied, we can extract the modal information from the SVD. In the case of \({\textbf{D}}^{(0)}_i\), we obtain the decay of the singular values shown in Fig. 5, where the improvement in the decay of the singular values by including (8) is clear. Although not shown, in the case of \({\textbf{D}}^{(1)}_i\), the projection (8) is not appropriate, since the projection would remove the gradient fields needed inside the object in this case. Henceforth, we select \(TOL=10^{-6}\) for the truncated SVDs (TSVDs) of \({\textbf{D}}^{(0)}_i\) and \({\textbf{D}}^{(1)}_i\). This corresponds to \(M=3\) modes per \(\tilde{\varvec{\theta }}_i^{(0)}\) and \(M=20\) modes per \(\varvec{\theta }_i^{(1)}\).
Online stage–PODP
Applying the approach described in Sect. 4.2 leads to the results shown in Fig. 6 for \(\lambda _1( \tilde{{\mathcal {R}}}^{\textrm{PODP}})\), and \(\lambda _1( {{\mathcal {I}}}^\textrm{PODP})\) for the case of \(1 \le \mu _r\le 50\) and \(10^1 \le \omega \le 10^5\) rad/s. This figure shows that the reduced order model is in excellent agreement with the full order model solutions at the snapshot values and the prediction for other parameters follows the expected trends. The behaviour of the other eigenvalues is similar.
In order to certify the PODP method, the approach from Sect. 4.2.1 is applied, leading to the results for \((\tilde{{\mathcal {R}}}^{\textrm{PODP}} \pm \Delta )\) shown in Fig. 7 for the cases where \(N=6^2\) and \(N=16^2\). In this figure, we observe that the certification reduces to the full order model solutions at the snapshot values and shows that PODP is highly reliable for a large range of \(\omega\) and \(\mu _r\) values in both cases. The larger error bounds for large \(\omega\) and \(\mu _r\) using \(N=6^2\) indicate that the solution is less reliable in this case, although the comparison with the snapshot values shows it is accurate. We emphasise that, despite the effectivity index of the upper bound being large in this region, this certification can be computed at only a small additional cost during the online stage of the ROM and, hence, still provides useful information to assess our confidence in the PODP prediction. The confidence in the prediction can be improved by increasing from \(N=6^2\) to \(N=16^2\), as the figure shows. Alternatively, we can improve the confidence in the prediction by adding additional snapshots corresponding to the locations where \((\Delta )_{ij}\) is large. Similar behaviour is observed for \(({{\mathcal {I}}}^{\textrm{PODP}} \pm \Delta )\) and for the other eigenvalues for this problem. It was also observed when the PODP approach was applied for a single parameter in [5].
Online stage POD–NN
In this section, we consider the results obtained by applying the POD–NN described in Sect. 4.3. The neural networks used for POD–NN were obtained using a cross-validated grid search method resulting in a choice of 2 hidden layers, each with 8 neurons and a tanh activation function. We also observed that there was no significant change in accuracy when including a \(L_2\) regularisation term in the training of the network and performance was degraded when considering more than \(N=16^2\) snapshots. Of additional note, the performance of the neural network was found to be stable, satisfying a training tolerance of \(10^{-10}\) over at least \(2 \le \ell \le 3\) layers and \(2^3 \le t \le 2^6\) neurons.
For our implementation of POD–NN, a new network is trained and evaluated for each direction, \(s=0,1\) and i, with the real and imaginary parts of the reduced coordinates \({\varvec{\Sigma }}^M ({\textbf{V}}^M)^{\textrm{H}}\) being concatenated so that 6 training sets with N ordered pairs \((\textbf{x}^{(n)}, \textbf{y}^{(n)})\), where \(\textbf{x}^{(n)}\in {{\mathbb {R}}}^2\) and \(\textbf{y}^{(n)} \in {{\mathbb {R}}}^{2M}\), are formed. These training sets are used to train 6 different networks. Once this has been completed, the predictions shown in Fig. 8 for \(\lambda _1( \tilde{{\mathcal {R}}}^{\mathrm{POD-NN}})\) and \(\lambda _1( {{\mathcal {I}}}^{\mathrm{POD-NN}} )\) for the case of \(1 \le \mu _r\le 50\) and \(10^1 \le \omega \le 10^5\) rad/s are obtained. Like the PODP case, the results show that this ROM is also in excellent agreement with the full-order model solutions at the snapshot values and the prediction for other parameters follows the expected trends. The behaviour for the other eigenvalues is similar.
NNR regression
For simplicity of presentation, and ease of comparison, for NNR we consider the same neural network architecture that was previously used for the POD–NN scheme. Applying the approach from Sect. 4.4 leads to the results for \(\lambda _1( \tilde{{\mathcal {R}}}^{\textrm{NNR}} )\) and \(\lambda _1( {{\mathcal {I}}}^{\textrm{NNR}} )\) for the case of \(1 \le \mu _r\le 50\) and \(10^1 \le \omega \le 10^5\) rad/s shown in Fig. 9. As with the PODP and POD–NN methods, the NNR method shows good visual agreement with the snapshot values and the prediction follows the expected trends. In addition, evaluating the rRMSE between the model and the testing samples gives \(e<0.01\). The results for the other eigenvalues are similar.

5.1.3 Methodology Comparison

Our focus in Sect. 5.1.2 was on a relatively fine off-line stage using \(N=16^2\) snapshot solutions, where it was observed that the PODP, POD–NN and NNR approaches all produced accurate results for \(\lambda _1( \tilde{{\mathcal {R}}} )\) and \(\lambda _1( {{\mathcal {I}}} )\) when evaluated for parameters \(1 \le \mu _r\le 50\) and \(10^1 \le \omega \le 10^5\) rad/s. We now wish to examine the performance of each method by evaluating the relative error e defined in (26) over a grid of \(K = 32\times 32\) evaluation points that are different from the snapshot parameters. The snapshots are generated using logarithmically spaced parameters corresponding to \(N= (2)^2, (2^2)^2, (2^3)^2, (2^4)^2 =4,16,64,256\) snapshots, and e is evaluated at combinations of \(\mu _r\) and \(\omega\) corresponding to a 32 by 32 grid over the range \(10^1 \le \omega \le 10^5\) rad/s and \(1 \le \mu _r \le 50\).
Figure 10 shows e when evaluated for \(1 \le \mu _r \le 50\), \(10^1 \le \omega \le 10^5\) rad/s in a 32 by 32 grid as a function of N for the different methods. We can see that the PODP scheme performs significantly better than the POD–NN and NNR interpolants and, in particular, the PODP scheme leads to an approximation that is more than 4 orders of magnitude more accurate. Nonetheless, if \(N\ge (8)^2\) snapshots are used, POD–NN leads to a solution with error \(e \le 0.01\). The behaviour for other combinations of \(\mu _r\) and \(\omega\) is similar. During the training process, a cross-validated grid search approach was used to find optimal hyperparameters. Nevertheless, better performance may be obtained by using a smaller tolerance, performing an optimisation over a wider range of possible hyperparameters, or changing the activation function.
Next, we consider a comparison between the computational time using a sequential methodology for the different methods in Fig. 11. We show the wall clock time taken to compute the snapshot solutions corresponding to \({\textbf{q}}\left( \varvec{\omega }_n\right)\) and present the timings required to obtain \({\lambda _1}(\tilde{{\mathcal {R}}} (\varvec{\omega })^{\textrm{APP}})\) and \(\lambda _1({{\mathcal {I}}} (\varvec{\omega })^{\textrm{APP}})\) at \(K=32^2\) different choices of \(\varvec{\omega }\) corresponding to the aforementioned evaluation values of \(\omega\) and \(\mu _r\). These timings include the time required to optimise the hyper-parameters using a cross validated grid-based search considering \(\ell =1,2\) or 3 layer networks with \(t=1,2,4,8,16\), or 64 neurons. We also consider logistic or tanh activation functions and a small regularisation term of 0, \(10^{-5}\) or \(10^{-7}\). A more expansive set of hyperparameters, a smaller training tolerance, or a different search strategy may result in a significantly increased time.
The figure shows a significant acceleration in computational time for the NNR method, which is due to the need to only compute \({\lambda _1}(\tilde{{\mathcal {R}}} (\varvec{\omega }_n)^{hp})\) and \(\lambda _1({{\mathcal {I}}} (\varvec{\omega }_n)^{hp} )\) for each set of snapshot parameters \(\varvec{\omega }_n\), whereas the PODP and POD–NN methods require the evaluation of \({\lambda _1}(\tilde{{\mathcal {R}}} (\varvec{\omega })^{\textrm{APP}})\) and \(\lambda _1({{\mathcal {I}}} (\varvec{\omega })^{\textrm{APP}})\) at the evaluation points \(\varvec{\omega }\). In each case, the PODP, POD–NN, and NNR methods perform significantly faster than the corresponding full order solution for these evaluation parameters, which takes a wall clock time of 45965 s (approximately 13 h). PODP and POD–NN both share the same off-line stage and construction of the reduced order model. A detailed description of the computational costs associated with POD is provided in [11, pg. 21-29]. Training the POD–NN and NNR neural networks depends heavily on the network architecture and tolerances, but typically relies on efficient quasi-Newton optimisations. The LBFGS method, which we have used here, is well suited to large dimensional problems given its linear computational cost; see [27, pg. 224-233].
Timings were performed using a 6 core Intel i5 10600 CPU and 64 GB of RAM, where the multiple cores of this machine were used to compute the snapshots in parallel, but not for the timings in Fig. 11. From the figures, we see that computing and postprocessing the snapshots constitutes the majority of the computation time, and reducing this cost is a key benefit of the NNR method. Furthermore, training (including the optimisation to find the hyperparameters) and evaluating the ROM and neural networks is extremely quick for the relatively shallow neural networks considered.
From Figs. 10 and 11, we see that the PODP method is significantly more accurate than the NNR and POD–NN methods and of a similar computational expense to the POD–NN method; however, this is at the expense of significant code intervention. For each method, many of the calculations are trivially parallelisable and a parallel implementation using a machine with multiple cores can be used to reduce the computational cost. In the case of NNR, which is the least invasive to implement, a parallel implementation of the calculation of \({\lambda _1}(\tilde{{\mathcal {R}}} (\varvec{\omega }_n)^{hp})\) and \(\lambda _1({{\mathcal {I}}} (\varvec{\omega }_n)^{hp})\) at all snapshots \(\varvec{\omega }_n\) would reduce the time, but, given that the largest contribution to the processing time is the computation of the snapshot solutions, which is shared across all three methods, additional parallelism is less effective than in the PODP and POD–NN cases. In the case of PODP, which is the most invasive to implement, we see it has the smallest growth in computational time for this example, with solving the smaller linear system being quick and post-processing taking a constant time. However, if the size M of the ROM needed becomes large, the memory usage and computational time of this approach may increase significantly. The POD–NN method does not require access to \({\textbf{A}}\) and \({\textbf{r}}\) and is ideally suited to closed code bases.

5.2 Hammer Head Example

A metal claw head hammer is considered as a more complex example, modelled as 440A stainless steel with \(\sigma _* = 1.7 \times 10^6\) S/m [28, pg 894], and the non-dimensional object B is chosen to be such that \(\alpha =0.001\) m. Similarly to the sphere example, the object B for this case is surrounded by a large non-conducting region \([-1000, 1000]^3\) units and the mesh consists of \(51\,095\) unstructured tetrahedral elements. The maximum mesh size inside the object is \(h=5\) units and the distribution of the elements on the surface of B is illustrated in Fig. 12. In practice, many magnetic materials, such as steel, have a non-linear \({{\varvec{B}}}-{{\varvec{H}}}\) constitutive behaviour and, thus, \(\mu _r\) depends on \({{\varvec{H}}}\) and may vary by several orders of magnitude, although, for the field strengths involved in metal detection, the linear relationship \({\varvec{B}}=\mu _r \mu _0 {{\varvec{H}}}\) typically holds. Nonetheless, it is not always straightforward to find the correct \(\mu _r\) for the characterisation. Using the ROM with two parameters allows us to explore the effects of increased \(\mu _r\) at reduced computation time.
Given the success of the PODP approach in the previous section, we also apply this approach for the hammer-head example.
Off–line solutions
By performing a p–convergence study on the mesh of 51 095 unstructured tetrahedral elements, we found that using uniform order \(p=2\) elements is sufficient to obtain converged results for \(({\mathcal {M}})_{ij}\) at the snapshot parameters. Then, using this discretisation, \(N=16^2\) full order model solution snapshots were generated corresponding to logarithmically spaced snapshot parameters, in a similar way as described in Sect. 5.1.2. Similarly to Sect. 5.1.2, TOL was set at \(10^{-6}\) for both the \(\tilde{\varvec{\theta }}^{(0)}\) and \(\varvec{\theta }^{(1)}\) problems.
On–line solutions
Applying the approach described in Sect. 4.2 leads to the results shown in Fig. 13 for \(\lambda _1( \tilde{{\mathcal {R}}}^{\textrm{PODP}})\), and \(\lambda _1( {{\mathcal {I}}}^\textrm{PODP})\) for the case of \(1 \le \mu _r\le 50\) and \(10^1 \le \omega \le 10^5\) rad/s. This figure shows that the ROM is in excellent agreement with the full-order model solutions at the snapshot values and the prediction for other parameters follows the expected trends. The behaviour for the other eigenvalues is similar.
Similarly, the estimated error certificates for this object show that the ROM and full-order solutions are in good agreement, as illustrated by Fig. 14, where \((\tilde{{\mathcal {R}}}\pm \Delta )_{1,1}\) and \(({\mathcal {I}}\pm \Delta )_{1,1}\) are shown. Similar performance is observed for the other coefficients.

6 Conclusion

In this article, we have discussed alternative approaches to efficiently computing the complex MPT coefficients for different objects over a two-dimensional range of material properties. We have extended the PODP ROM discussed in [5] to two-dimensional parameter sets and compared it with the less invasive neural network-based regression techniques POD–NN and NNR.
A series of numerical examples are provided for the cases of a conducting permeable sphere, and a hammer head modelled as 440A stainless steel, over a wide frequency range and considering a range of permeabilities. Our results have shown that the PODP method performs most accurately; however, if such high accuracy is not required, then the POD–NN method provides a less intrusive alternative. While the on-line stage of POD–NN is quick, the training and optimisation of hyper-parameters can add significant costs to the off-line stage depending on the optimisation strategy and architectures of the neural networks considered. Due to the inclusion of a more effective gauging for the \(\varvec{\theta }^{(0)}\) problem, we achieve faster decay of the singular values, and a more accurate ROM, than the one presented in [5].
Future work involves applying the presented approaches to generate a large dictionary of MPT spectral signatures for different magnetic objects and improving the methodology for resolving the fields in the thin skin depths of objects with very high \(\mu _r\). In addition, it would be of interest, particularly in scrap metal sorting, to investigate whether the fuller object description allowed by considering changes in \(\mu _r\) can be applied to the estimation of an object's material properties.

Acknowledgements

The authors gratefully acknowledge the financial support received from EPSRC in the form of grant EP/V009028/1.

Statements and Declarations

Financial or non-financial interests

We gratefully acknowledge full funding received from the UK EPSRC under grant EP/V009028/1. The authors have no relevant financial or non-financial interests to disclose.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Footnotes
1. The github repository https://github.com/MPT-Calculator/MPT-Calculator is publicly available.
References
9. Özdeger T (2022) Advances in Techniques for the Characterisation of Targets in Metal Detection and Ultrawide Band Electromagnetic Screening Applications. PhD thesis, The University of Manchester
14. Schöberl J (2014) C++11 implementation of finite elements in NGSolve. Technical Report 30, Institute for Analysis and Scientific Computing, Vienna University of Technology, Vienna, Austria
18. Miah S, Sooriyakanthan Y, Ledger PD, Gil AJ, Mallett M (2023) Reduced order modelling using neural networks for predictive modelling of 3D-magneto-mechanical problems with application to magnetic resonance imaging scanners. Engineering with Computers, Accepted
24. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-Learn: machine learning in Python. J Mach Learn Res 12:2825–2830
26. Wait JR (1951) A conducting sphere in a time varying magnetic field. Geophysics 16(4):666–672
28. Mitchell BS (2004) An Introduction to Materials Engineering and Science for Chemical and Materials Engineers. Wiley, Hoboken