This chapter deepens the treatment of the Levenberg-Marquardt method and its application in the ensemble subspace for minimizing cost functions. It presents a detailed derivation of the method and highlights its advantages in stabilizing iterations compared with the Gauss-Newton method. The text also introduces the subspace EnRML algorithm and discusses its practical implementation and the use of ensemble covariances. A key focus is on the effects of measurement error correlations and the benefits of applying the subspace inversion method. The chapter closes with a discussion of the Ensemble Smoother and the Ensemble Smoother with Multiple Data Assimilation (ESMDA) methods, offering insights into their implementation and their advantages in handling nonlinear problems. The text also examines the practical aspects of choosing optimal step lengths and the importance of accounting for measurement error correlations to avoid overfitting and ensemble collapse.
AI-Generated
This summary of the chapter's content was generated with the help of AI.
Abstract
This chapter derives Gauss-Newton and Levenberg-Marquardt methods for RML sampling that searches for the solution in the ensemble subspace. The approach does not introduce any further assumptions since ensemble methods implicitly confine the solution to the space spanned by the prior ensemble. The advantage of the ensemble subspace formulation is that it avoids the inversion of huge low-rank covariance matrices and solves the EnRML formulation precisely without introducing any further approximations. The subspace EnRML algorithm can also compute the posterior ensemble subspace solution for the ES and ESMDA methods.
6.1 Levenberg-Marquardt in the Ensemble Subspace
Evensen et al. (2019) formulated the Gauss-Newton method for minimizing the cost function in Eq. (5.19). Here, we take a slightly different approach and derive the Levenberg-Marquardt method instead. Levenberg-Marquardt modifies the Hessian by adding a term \(\lambda {\textbf{I}}_N\), which can stabilize the iteration in some cases.
The Jacobian (gradient) of the cost function \(\nabla _{{\textbf{w}}} \mathcal {J}({\textbf{w}}_{j}) \in \mathcal {R}^{N\times 1}\) is
and an approximate Hessian (gradient of the Jacobian) \(\nabla _{\textbf{w}}\nabla _{\textbf{w}}\mathcal {J}({\textbf{W}}) \in \mathcal {R}^{N\times N}\) becomes
where we have now introduced \(\lambda ^{i}\) as an additional parameter. Thus, when \(\lambda \) is large, the Hessian approaches \(\lambda ^i {\textbf{I}}_N\), and we obtain the gradient-descent method with a small step length of \(\gamma ^i/\lambda ^i\). On the other hand, when \(\lambda =0\), we revert to the Gauss-Newton method. Levenberg-Marquardt will usually converge more slowly than the standard Gauss-Newton method, but when Gauss-Newton has convergence issues, using Levenberg-Marquardt may aid the convergence. In practice, Levenberg-Marquardt interpolates between Gauss-Newton and gradient descent.
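To make the interplay between the two limits concrete, the following is a minimal, generic Levenberg-Marquardt step for an ordinary least-squares problem. It is not the ensemble-subspace formulation derived above; `lm_step`, `J`, `r`, and `lam` are illustrative stand-ins.

```python
import numpy as np

def lm_step(J, r, lam):
    """One Levenberg-Marquardt step for a residual vector r with Jacobian J.

    Solves (J^T J + lam * I) dx = -J^T r.  With lam = 0 this is the
    Gauss-Newton step; for large lam it approaches a short gradient-descent
    step of length ~ 1/lam, which is what stabilizes difficult iterations.
    """
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + lam * np.eye(n), -J.T @ r)
```

Increasing `lam` monotonically shrinks the step, which is the damping effect referred to in the text.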
Now using the corollaries from Eqs. (3.5) and (3.6), we can write the iteration in Eq. (6.4) in the standard form
The tricky term in Eq. (6.5), which corresponds to the one mentioned in relation to Eq. (4.5), is the product \({{\textbf{G}}_j^i}{\textbf{A}}\). Evensen et al. (2019) showed that we can write
which relates the ensemble anomalies at iteration i to the initial anomalies \({\textbf{A}}= {\textbf{A}}^{i}{\boldsymbol{\Omega }^{i}}^{-1}\), and where we have defined the ensemble of weights
Note that we cannot use Eq. (6.10) when \(n < N-1\), i.e., when the state dimension is less than the ensemble size minus one. We then need to retain the projection \({{\textbf{A}}^{i}}^\dagger {\textbf{A}}^{i}\) and use Eq. (6.9) rather than Eq. (6.10). Evensen et al. (2019) derived the proofs of this result, and we refer to that paper for the details. This result is also complementary to the use of the expression \({\textbf{Y}}{\textbf{A}}^\dagger {\textbf{A}}{\textbf{Y}}^\textrm{T}\) to represent \({\textbf{C}}_{{\textit{yy}}}\) in Eq. (5.13) when \(n < N-1\), as derived by Evensen (2019).
We can now write the iteration of Eq. (6.5) in matrix form as
where we have used \({\textbf{W}}^i/\sqrt{N-1} = \boldsymbol{\Pi }{\textbf{W}}^i\), which follows from Eq. (6.13) using \({\textbf{S}}^\textrm{T}/\sqrt{N-1}=\boldsymbol{\Pi }{\textbf{S}}^\textrm{T}\). Thus, we can compute the final update in Eq. (6.14) at a cost of \(n N^2\) operations. The updated ensemble is a linear combination of the prior ensemble members, so the prior ensemble space contains the updated ensemble of solutions.
Algorithm 1 details the Gauss-Newton implementation of the subspace EnRML algorithm. We excluded the Levenberg-Marquardt parameter \(\lambda \) as we have never needed to use it in the ensemble subspace implementation. The algorithm takes as inputs the prior ensemble and the perturbed measurements and runs an ensemble of model simulations to evaluate \({\textbf{g}}\bigl ({\textbf{Z}}^{i}\bigr )\). Thus, the algorithm is generic, and we can use it for any model or problem configuration. In Sect. 6.5, we discuss a practical and efficient implementation for inverting the expression \(\Big ({\textbf{S}}^{i}{{\textbf{S}}^{i}}^\textrm{T}+ \overline{{\textbf{C}}}_{{\textit{dd}}}\Big )^{-1}\), where we replace the full measurement error covariance matrix with the ensemble representation, \({\textbf{C}}_{{\textit{dd}}}\approx \overline{{\textbf{C}}}_{{\textit{dd}}}= {\textbf{E}}{\textbf{E}}^\textrm{T}\).
Choosing an optimal value for the step length \(\gamma ^i\) is generally not straightforward. Evensen et al. (2019) proposed a scheme with \(\gamma ^i\) decreasing from 0.6 to 0.1. For linear problems, an optimal step length of \(\gamma ^i=1.0\) gives us the solution in one step. However, for nonlinear problems where we apply an averaged model sensitivity, we must use a conservative value of \(\gamma \) that leads to stable solutions for all the realizations. We typically start with a value \(\gamma ^i=0.5\), and if the sum of the cost functions in Eq. (5.19) increases from one iteration to the next, we divide the step length by two and rerun that iteration. Note that we only need to store the \({\textbf{W}}\) and the predicted measurements \(\boldsymbol{\Upsilon }\) from the previous iteration and then recompute the updated \({\textbf{W}}^i\) from Eq. (6.13). A key lesson is that one should rerun an iteration with a smaller value of \(\gamma ^i\) if the mean cost function value increases over an iteration.
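The halve-and-rerun logic can be sketched as a small driver loop. Here `update` and `cost` are hypothetical callables standing in for the model-dependent step of Eq. (6.13) and the cost evaluation of Eq. (5.19); the names and defaults are illustrative, not from the source.

```python
def enrml_step_control(update, cost, W, gamma=0.5, max_iter=50, min_gamma=1e-3):
    """Step-length control sketch for the subspace EnRML iteration.

    `update(W, gamma)` performs one iteration step and returns the new
    weight matrix; `cost(W)` evaluates the ensemble-mean cost function.
    If the cost increases, the step length is halved and the same
    iteration is rerun from the stored previous state.
    """
    J = cost(W)
    for _ in range(max_iter):
        W_new = update(W, gamma)    # candidate step with current gamma
        J_new = cost(W_new)
        if J_new > J:               # cost increased: halve gamma, redo step
            gamma /= 2.0
            if gamma < min_gamma:
                break
            continue                # rerun from the stored previous W
        W, J = W_new, J_new         # accept the step
    return W
```

On a simple quadratic cost this driver converges as expected; in the real algorithm, `update` would also rerun the ensemble of model simulations.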
6.3 Ensemble Smoother
Interestingly, it is possible to compute the EnKF or ES solution from the first subspace EnRML iteration. Using the ensemble matrices from the previous chapter, we can rewrite the ES update Eq. (5.13) as
If we set the step length \(\gamma =1.0\) and the relaxation parameter \(\lambda =0\) in the Levenberg-Marquardt iteration of (6.13), the first step becomes just the update Eq. (6.15), as the prior value \({\textbf{W}}^{(0)}=0\). Hence, we can use Algorithm 1 to compute both the EnRML and the ES solutions.
6.4 Ensemble Smoother with Multiple Data Assimilation
The Ensemble Smoother with Multiple Data Assimilation (ESMDA) method proposed by Emerick and Reynolds (2013) uses ES updates for each recursive update step in the MDA algorithm from Sect. 3.6. Since the ES update is an ensemble that approximately samples the posterior distribution, we can use the ES-updated ensemble as the prior for the next update step. Furthermore, since we can use Algorithm 1 to compute the ES solution, we can also recursively compute the solution for the ESMDA method.
We use the algorithm as we do for the ES solution, but we repeat the procedure \(n_\alpha \) times. As the steps are independent, we must resample the perturbed measurements from \(\mathcal {N}({\textbf{d}}, \alpha _i {\textbf{C}}_{{\textit{dd}}})\) for each recursive call to the algorithm. Thus, each recursive step uses an effective measurement error variance increased by a factor \(\alpha _i\), and we compute a sequence of small update steps that gradually introduce the information from the measurements. Hence, ESMDA computes many short linear steps instead of one long one, reducing the impact of nonlinearity when using the linear ES update equation. Note that ESMDA with one step corresponds to the ES estimate.
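The recursion can be sketched directly with a standard stochastic ES update inside a loop. This is a generic illustration rather than the subspace Algorithm 1: the function `esmda` and its interface are illustrative, and the update uses plain ensemble covariances with the measurement error covariance inflated by \(\alpha_i\) and freshly resampled perturbations at every step, as described above.

```python
import numpy as np

def esmda(x, g, d, Cdd, alphas, rng):
    """ESMDA sketch: n x N ensemble x, forward model g, m-vector of data d.

    Each of the len(alphas) steps is an ES update with the measurement
    error covariance inflated to alpha * Cdd and with freshly resampled
    measurement perturbations; sum(1 / alpha_i) must equal 1.
    """
    N = x.shape[1]
    for a in alphas:
        Y = g(x)                                  # predicted measurements
        E = rng.multivariate_normal(np.zeros(len(d)), a * Cdd, size=N).T
        A = x - x.mean(axis=1, keepdims=True)     # state anomalies
        YA = Y - Y.mean(axis=1, keepdims=True)    # predicted-data anomalies
        Cxy = A @ YA.T / (N - 1)
        Cyy = YA @ YA.T / (N - 1)
        x = x + Cxy @ np.linalg.solve(Cyy + a * Cdd, d[:, None] + E - Y)
    return x
```

For a linear-Gaussian scalar example (identity model, prior \(\mathcal{N}(0,1)\), one observation \(d=1\) with unit error variance), the two-step recursion with \(\alpha = (2, 2)\) reproduces the exact posterior \(\mathcal{N}(0.5, 0.5)\) up to sampling error, consistent with ESMDA's equivalence to ES in the linear case.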
There is a need for a better theoretical understanding and basis for choosing the optimal sequence of weights. Using uniform weights, i.e., setting all \(\alpha _i = n_\alpha \), which satisfies Eq. (3.17), is standard. However, some works have provided evidence that a geometrically decreasing set of weights gives superior results (Rafiee and Reynolds 2017; Evensen 2018; Emerick 2019).
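For reference, a small helper can generate either uniform or geometrically decreasing coefficients that satisfy the constraint \(\sum_i 1/\alpha_i = 1\) of Eq. (3.17) exactly; the `ratio` parameter is an illustrative knob, not a quantity from the source.

```python
import numpy as np

def esmda_alphas(n_alpha, ratio=1.0):
    """Inflation coefficients alpha_i with sum(1/alpha_i) = 1 (Eq. 3.17).

    ratio = 1.0 gives the standard uniform weights alpha_i = n_alpha;
    ratio > 1.0 gives a geometrically decreasing sequence (largest first),
    as advocated by, e.g., Rafiee and Reynolds (2017).
    """
    raw = ratio ** np.arange(n_alpha)[::-1]   # decreasing geometric sequence
    # scale so that the constraint sum(1/alpha_i) = 1 holds exactly
    return raw * np.sum(1.0 / raw)
```

Scaling by \(\sum_i 1/\mathrm{raw}_i\) enforces the constraint for any positive base sequence, so the same helper covers both weighting strategies.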
ESMDA has gained popularity due to its ease of implementation and successful use in many applications. By the convergence of ESMDA, we mean the number of update steps needed before a further decrease in step length (i.e., adding more steps) no longer changes the final solution. The required number of steps depends on the model's nonlinearity and typically ranges from 4 to 16 in practical applications. We note that in ESMDA, as the number of steps increases, the magnitude of the measurement perturbations also increases. Thus, one can imagine cases where the perturbed measurements take unphysical values, causing the algorithm to break down. Emerick (2018) resolved this issue using a square-root formulation for the update calculation, although the paper's objective was to reduce sampling errors.
Chapter 9 will provide simple scalar examples illustrating some properties of the ESMDA method, and ESMDA is also the method used in the reservoir cases in Chap. 13.
6.5 Ensemble Subspace Inversion
In Algorithm 1, we have represented the measurement error covariance matrix by an ensemble of measurement perturbations, \({\textbf{E}}\). In this case, we compute the inversion using an ensemble subspace scheme proposed by Evensen (2004). See also the more recent discussions by Evensen (2009, 2021) and Evensen et al. (2019). The scheme projects the measurement error perturbations onto the ensemble subspace and computes the pseudo inverse using the following factorization
and the identity matrix \({\textbf{I}}_N \in \mathcal {R}^{N\times N}\). We must specify a truncation value in the singular-value decomposition in Eq. (6.23) to avoid dividing by zero when computing the pseudo inverse \(\boldsymbol{\Sigma }^\dagger \). The truncation value is not critical, and a typical truncation accounts for around 99% of the variance represented by the largest singular values.
The eigenvalue decomposition in Eq. (6.21) is of the matrix product in (6.20). Note that this eigenvalue decomposition is most efficiently computed by a singular-value decomposition of the product \(\boldsymbol{\Sigma }^\dagger {\textbf{U}}^\textrm{T}{\textbf{E}}\). The left singular vectors will then equal the eigenvectors in \({\textbf{Q}}\), and the squares of the singular values will equal the eigenvalues in \(\boldsymbol{\Lambda }\). Thus, the inversion becomes
The main advantage of this algorithm is that it allows for computing the inverse at a cost that is linear in the number of measurements, \(\mathcal {O}(mN^2)\). The algorithm represents the measurement error covariances using the measurement perturbation matrix \({\textbf{E}}\). Simulating measurement perturbations with given statistics is usually easier than constructing a complete error covariance matrix. The disadvantage is that using a finite ensemble to represent the measurement error covariance matrix introduces additional sampling errors. However, Evensen (2021) demonstrated that, by using a larger ensemble to represent \({\textbf{E}}\) in Eq. (6.18), one can reduce the associated sampling errors to a negligible magnitude with little additional computational cost, since \({\textbf{E}}\) only occurs in the matrix multiplication \({\textbf{U}}^\textrm{T}{\textbf{E}}\) in Eq. (6.20).
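Under the stated assumptions, the scheme can be sketched as follows. The factorization steps mirror the description above (a truncated SVD of the scaled anomalies \(\mathbf{S}\), projection of \(\mathbf{E}\) onto the ensemble subspace via \(\boldsymbol{\Sigma}^{\dagger}\mathbf{U}^{\mathrm{T}}\mathbf{E}\), then a second SVD of that product), but the function names and the exact return convention are illustrative, not the book's code.

```python
import numpy as np

def subspace_inverse(S, E, trunc=0.99):
    """Approximate (S S^T + E E^T)^{-1} via the ensemble-subspace scheme.

    S : m x N matrix of scaled predicted-measurement anomalies
    E : m x Ne matrix of measurement perturbations
    trunc : fraction of variance retained in the truncated SVD of S
    Returns Z and Lam such that the pseudo inverse is
    Z @ diag(1 / (1 + Lam)) @ Z.T.
    """
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    # keep the leading singular values holding `trunc` of the variance
    energy = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(energy, trunc) + 1)
    U, s = U[:, :k], s[:k]
    # project the measurement perturbations onto the ensemble subspace
    X0 = (U.T @ E) / s[:, None]            # Sigma^+ U^T E
    Q, sq, _ = np.linalg.svd(X0, full_matrices=False)
    Lam = sq**2                             # eigenvalues of X0 X0^T
    Z = U @ (Q / s[:, None])                # U Sigma^{+T} Q
    return Z, Lam
```

When the measurement dimension is small enough that no truncation occurs, the factorization reproduces the exact inverse of \(\mathbf{S}\mathbf{S}^{\mathrm{T}} + \mathbf{E}\mathbf{E}^{\mathrm{T}}\); in general it is an approximation in the ensemble subspace.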
6.6 EnKF Analysis with Independent Measurements
Independent measurements are measurements with uncorrelated measurement errors. Most operational ensemble-based assimilation schemes apply an assumption of uncorrelated measurement errors and use a diagonal \({\textbf{C}}_{{\textit{dd}}}= {\textbf{I}}_m\); see, e.g., the reviews on data assimilation in the geosciences (Carrassi et al. 2018), weather prediction (Houtekamer and Zhang 2016), and petroleum applications (Aanonsen et al. 2009). Data assimilation practitioners employ this assumption for two reasons. First, the measurement error covariances are often unknown, so one avoids modeling them by setting them to zero. Additionally, the assumption of a diagonal \({\textbf{C}}_{{\textit{dd}}}\) somewhat simplifies the update scheme in Eq. (5.12). With \({\textbf{C}}_{{\textit{dd}}}= {\textbf{I}}_m\), Eq. (6.15) becomes
This modification reduces the size of the matrix inversion from \(m\times m\) in Eq. (6.25) to \(N\times N\) in Eq. (6.26). See also the discussion on this implementation in Evensen et al. (2019, Sect. 3.2).
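The size reduction rests on a standard push-through identity. Since Eqs. (6.25) and (6.26) are not reproduced here, the following is a generic numerical check of that identity under the assumption \(\mathbf{C}_{dd} = \mathbf{I}_m\), showing that the \(m \times m\) and \(N \times N\) forms give the same gain.

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.standard_normal((200, 20))   # m = 200 measurements, N = 20 members
m, N = S.shape

# m x m inversion versus the equivalent N x N inversion:
#   (S S^T + I_m)^{-1} S  ==  S (S^T S + I_N)^{-1}
K_big = np.linalg.solve(S @ S.T + np.eye(m), S)
K_small = S @ np.linalg.inv(S.T @ S + np.eye(N))
```

With \(m \gg N\), the right-hand form inverts a far smaller matrix, which is the computational saving referred to above.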
Finally, the subspace inversion from the previous section is as fast as the “exact” inversion in Eq. (6.26), even for a diagonal measurement error covariance matrix. The authors' experience is that the exact inversion in Eq. (6.26) is less stable than the subspace inversion from the previous section. Evensen et al. (2024), therefore, used the subspace inversion for all experiments.
Fig. 6.1
Impact of measurement dependency on analysis update. The upper row presents the results for a case with uncorrelated measurement errors, while the second row gives the results when using measurements with correlated errors. The third row is similar to the second but uses four times as many observations. The left plots show the results for the posterior ensemble means, while the panels to the right provide the associated error variance estimates. The line labels EnKF and ICA denote the standard EnKF update and an inconsistent update, as is explained in the text. The measurement error bars indicate the plus-minus two standard deviations of the measurement errors
Evensen (2021) discussed the impact of conditioning on measurements with correlated errors, and we will now revisit and extend an example from this paper. Figure 6.1 shows the updates from three EnKF experiments, all using 2000 model realizations to minimize sampling errors. The upper plots represent a case where we condition the prior ensemble on 50 independent measurements with error standard deviations of 0.5. We have plotted the measurements with error bars indicating plus and minus two standard deviations. We obtained an excellent EnKF result that agreed with the measurements given their errors and recovered the reference solution well. The posterior variance varies around 0.125, a significant reduction compared to the prior variance of one.
The experiment shown in the second row is similar to the one in the first row. However, we assume correlated measurement errors with a decorrelation length of 40 (grid indexes). From the EnKF posterior variance shown in the right plot, we notice that introducing measurement error correlations reduces the EnKF-update’s strength, and the posterior variance now varies around 0.2. The importance of specifying and accounting for any measurement error correlations becomes evident from the experiment displayed in the third row of Fig. 6.1. This experiment differs from the second row by conditioning on 200 dependent measurements instead of 50. In the EnKF-analysis update, we avoid any overfitting of the observations, and the posterior variance is nearly identical to the experiment with 50 dependent measurements. Thus, because of the dependencies between neighboring observations, increasing the measurement density does not lead to the assimilation of more information or a more substantial update.
An interesting result is the ICA case (ICA denotes inconsistent analysis), where we have conditioned on measurements with correlated errors but neglected these correlations when computing the update, i.e., we are using a diagonal measurement error covariance matrix. In this case, we obtain an underestimated posterior variance similar to the example with uncorrelated measurement errors. In the example with 200 correlated measurements, the variance underestimation is even more pronounced, and we see the potential for ensemble collapse when increasing the number of measurements.
The ICA cases represent the typical approach currently used in most ensemble methods applied by history-matching practitioners. Neglecting the measurement error correlations leads to overfitting and, eventually, so-called ensemble collapse. For example, when conditioning the model on a time series of rate data, one must select the sampling frequency of the data in time: should we use weekly, monthly, or some other frequency? And does an increased sampling frequency introduce additional information to the conditioning process? From the example above, it is clear that increasing the sampling frequency of dependent data without representing the measurement error correlations is inconsistent and leads to underestimating the posterior ensemble variance. We will return to this discussion in Sect. 13.6.
For the practical computation and representation of measurement error correlations, it is simpler to simulate a time series of correlated errors and store these in the measurement error perturbation matrix \({\textbf{E}}\) used to define the ensemble measurement error covariance in Eq. (5.7). Also, in these examples, we used 10N realizations of the measurement perturbations to reduce the sampling errors.
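A minimal way to build such an \(\mathbf{E}\) is to draw samples from a chosen correlation model. The exponential correlation model and the helper below are illustrative choices for a sketch, not the book's specific implementation.

```python
import numpy as np

def correlated_perturbations(m, N, std, decorr_len, rng):
    """Sample an m x N matrix E of correlated measurement errors (a sketch).

    Uses an exponential correlation model rho(h) = exp(-|h| / decorr_len)
    on a 1-D grid of m measurement indexes.  Scaling the result by
    1 / sqrt(N - 1) then gives the ensemble representation of the
    measurement error covariance, as in Eq. (5.7).
    """
    h = np.abs(np.arange(m)[:, None] - np.arange(m)[None, :])
    C = std**2 * np.exp(-h / decorr_len)
    L = np.linalg.cholesky(C)    # exact factor of the target covariance
    return L @ rng.standard_normal((m, N))
```

With a large number of perturbation realizations (the text uses \(10N\)), the sample covariance of the columns closely matches the prescribed variance and decorrelation length.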
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.