Optimization of mesh hierarchies in multilevel Monte Carlo samplers

Haji-Ali, Abdul-Lateef; Nobile, Fabio; von Schwerin, Erik; Tempone, Raúl

doi:10.1007/s40072-015-0049-7

576 Accesses
17 Citations
2 Altmetric
Explore all metrics

Abstract

We perform a general optimization of the parameters in the multilevel Monte Carlo (MLMC) discretization hierarchy based on uniform discretization methods with general approximation orders and computational costs. We optimize hierarchies with geometric and non-geometric sequences of mesh sizes and show that geometric hierarchies, when optimized, are nearly optimal and have the same asymptotic computational complexity as non-geometric optimal hierarchies. We discuss how enforcing constraints on parameters of MLMC hierarchies affects the optimality of these hierarchies. These constraints include an upper and a lower bound on the mesh size or enforcing that the number of samples and the number of discretization elements are integers. We also discuss the optimal tolerance splitting between the bias and the statistical error contributions and its asymptotic behavior. To provide numerical grounds for our theoretical results, we apply these optimized hierarchies together with the Continuation MLMC Algorithm (Collier et al., BIT Numer Math 55(2):399–432, 2015). The first example considers a three-dimensional elliptic partial differential equation with random inputs. Its space discretization is based on continuous piecewise trilinear finite elements and the corresponding linear system is solved by either a direct or an iterative solver. The second example considers a one-dimensional Itô stochastic differential equation discretized by a Milstein scheme.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Article Open access 13 June 2022

Topology optimization of multi-scale structures: a review

Article Open access 08 March 2021

Recent advances and applications of surrogate models for finite element method computations: a review

Article 17 July 2022

Notes

We consider uniform meshes, but the extension to certain non-uniform meshes is immediate; see Remark 2.2 in Sect. 2.2.

References

Amestoy, P.R., Duff, I.S., L’Excellent, J.Y., Koster, J.: A fully asynchronous multifrontal solver using distributed dynamic scheduling. SIAM J. Matrix Anal. Appl. 23, 15–41 (2001). doi:10.1137/S0895479899358194
Article MathSciNet MATH Google Scholar
Babuška, I., Nobile, F., Tempone, R.: A stochastic collocation method for elliptic partial differential equations with random input data. SIAM Rev. 52(2), 317–355 (2010)
Article MathSciNet MATH Google Scholar
Balay, S., Gropp, W.D., McInnes, L.C., Smith, B.F.: Efficient management of parallelism in object oriented numerical software libraries. In: Arge, E., Bruaset, A.M., Langtangen, H.P. (eds.) Modern Software Tools in Scientific Computing, pp. 163–202. Birkhäuser, Boston (1997)
Chapter Google Scholar
Balay, S., Brown, J., Buschelman, K., Gropp, W.D., Kaushik, D., Knepley, M.G., McInnes, L.C., Smith, B.F., Zhang, H.: PETSc Web page (2013). http://www.mcs.anl.gov/petsc
Barth, A., Schwab, C., Zollinger, N.: Multi-level Monte Carlo finite element method for elliptic PDEs with stochastic coefficients. Numer. Math. 119(1), 123–161 (2011)
Article MathSciNet MATH Google Scholar
Barth, A., Lang, A., Schwab, C.: Multilevel Monte Carlo method for parabolic stochastic partial differential equations. BIT Numer. Math. 53(1), 3–27 (2013)
Article MathSciNet MATH Google Scholar
Bayer, C., Hoel, H., von Schwerin, E., Tempone, R.: On nonasymptotic optimal stopping criteria in Monte Carlo simulations. SIAM J. Sci. Comput. 36(2), A869–A885 (2014). doi:10.1137/130911433
Article MathSciNet MATH Google Scholar
Charrier, J., Scheichl, R., Teckentrup, A.: Finite element error analysis of elliptic PDEs with random coefficients and its application to multilevel Monte Carlo methods. SIAM J. Numer. Anal. 51(1), 322–352 (2013)
Article MathSciNet MATH Google Scholar
Cliffe, K., Giles, M., Scheichl, R., Teckentrup, A.: Multilevel Monte Carlo methods and applications to elliptic PDEs with random coefficients. Comput. Vis. Sci. 14(1), 3–15 (2011)
Article MathSciNet MATH Google Scholar
Collier, N., Dalcin, L., Calo, V.: PetIGA: High-performance isogeometric analysis. arxiv (1305.4452) (2013). arXiv:1305.4452
Collier, N., Haji-Ali, A.L., Nobile, F., von Schwerin, E., Tempone, R.: A continuation multilevel Monte Carlo algorithm. BIT Numer. Math. 55(2), 399–432 (2015). doi:10.1007/s10543-014-0511-3
Giles, M.: Improved multilevel Monte Carlo convergence using the Milstein scheme. Monte Carlo and Quasi-Monte Carlo Methods 2006, pp. 343–358. Springer, Berlin (2008)
Chapter Google Scholar
Giles, M.: Multilevel Monte Carlo path simulation. Oper. Res. 56(3), 607–617 (2008)
Article MathSciNet MATH Google Scholar
Giles, M., Reisinger, C.: Stochastic finite differences and multilevel Monte Carlo for a class of SPDEs in finance. SIAM J. Financ. Math. 3(1), 572–592 (2012)
Article MathSciNet MATH Google Scholar
Glasserman, P.: Monte Carlo methods in financial engineering. Stochastic Modelling and Applied Probability. Applications of Mathematics. Springer, New York (2004)
Google Scholar
Heinrich, S.: Monte Carlo complexity of global solution of integral equations. J. Complex. 14(2), 151–175 (1998)
Article MathSciNet MATH Google Scholar
Heinrich, S., Sindambiwe, E.: Monte Carlo complexity of parametric integration. J. Complex. 15(3), 317–341 (1999)
Article MathSciNet MATH Google Scholar
Hoel, H., von Schwerin, E., Szepessy, A., Tempone, R.: Adaptive multilevel Monte Carlo simulation. In: Engquist, B., Runborg, O., Tsai, Y.H. (eds.) Numerical Analysis of Multiscale Computations. Lecture Notes in Computational Science and Engineering, vol. 82, pp. 217–234. Springer, Berlin (2012)
Chapter Google Scholar
Hoel, H., von Schwerin, E., Szepessy, A., Tempone, R.: Implementation and analysis of an adaptive multilevel Monte Carlo algorithm. Monte Carlo Methods Appl. 20(1), 1–41 (2014)
Article MathSciNet MATH Google Scholar
Jouini, E., Cvitanić, J., Musiela, M. (eds.): Option pricing, interest rates and risk management. Handbooks in Mathematical Finance. Cambridge University Press, Cambridge (2001)
Karatzas, I., Shreve, S.E.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics, vol. 113, 2nd edn. Springer, New York (1991)
MATH Google Scholar
Kebaier, A.: Statistical Romberg extrapolation: a new variance reduction method and applications to options pricing. Ann. Appl. Probab. 14(4), 2681–2705 (2005)
Article MathSciNet MATH Google Scholar
Milstein, G.N., Tretyakov, M.V.: Stochastic numerics for mathematical physics. Springer, New York (2004)
Book MATH Google Scholar
Moon, K.S., Szepessy, A., Tempone, R., Zouraris, G.E.: Convergence rates for adaptive weak approximation of stochastic differential equations. Stoch. Anal. Appl. 23(3), 511–558 (2005)
Article MathSciNet MATH Google Scholar
Moraes, A., Tempone, R., Vilanova, P.: Multilevel hybrid chernoff tau-leap. BIT Numer. Math. (2015). doi:10.1007/s10543-015-0556-y
Øksendal, B.: Stochastic Differential Equations. Universitext, 5th edn. Springer, Berlin (1998)
Book MATH Google Scholar
Teckentrup, A., Scheichl, R., Giles, M., Ullmann, E.: Further analysis of multilevel Monte Carlo methods for elliptic PDEs with random coefficients. Numer. Math. 125(3), 569–600 (2013)
Article MathSciNet MATH Google Scholar
Tesei, F., Nobile, F.: A multi level Monte Carlo method with control variate for elliptic pdes with log-normal coefficients. Technical report (2014)
Xia, Y., Giles, M.: Multilevel path simulation for jump-diffusion SDEs. In: Plaskota, L., Woźniakowski, H. (eds.) Monte Carlo and Quasi-Monte Carlo Methods 2010, pp. 695–708. Springer, Berlin (2012)
Chapter Google Scholar

Download references

Acknowledgments

R. Tempone is a member of the Research Center on Uncertainty Quantification (SRI-UQ), division of Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) at King Abdullah University of Science and Technology (KAUST). The authors would like to recognize the support of the following KAUST and University of Texas at Austin AEA projects: Round 2, “Predictability and Uncertainty Quantification for Models of Porous Media”, and Round 3, “Uncertainty quantification for predictive modeling of the dissolution of porous and fractured media”. F. Nobile has been partially supported by the Swiss National Science Foundation under the Project No. 140574 “Efficient numerical methods for flow and transport phenomena in heterogeneous random porous media” and by the Center for ADvanced MOdeling Science (CADMOS). E. von Schwerin has been partially supported by the aforementioned SRI-UQ and CADMOS. We would also like to acknowledge the following open source software packages that made this work possible: PETSc [4], PetIGA [10].

Author information

Authors and Affiliations

Applied Mathematics and Computational Sciences, KAUST, Thuwal, Saudi Arabia
Abdul-Lateef Haji-Ali & Raúl Tempone
MATHICSE-CSQI, EPF de Lausanne, Lausanne, Switzerland
Fabio Nobile
Department of Mathematical Sciences, University of Delaware, Newark, USA
Erik von Schwerin

Authors

Abdul-Lateef Haji-Ali
View author publications
You can also search for this author in PubMed Google Scholar
Fabio Nobile
View author publications
You can also search for this author in PubMed Google Scholar
Erik von Schwerin
View author publications
You can also search for this author in PubMed Google Scholar
Raúl Tempone
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Abdul-Lateef Haji-Ali or Erik von Schwerin.

Appendix 1: Derivations and proofs

1.1 Appendix 1.1: Optimal hierarchies given $h_0$, $\theta $, and L

Here we solve Problem 2.1 of Sect. 2.2 for the optimal hierarchy for any fixed value of L. We initially treat the parameter $\theta $ as given, postponing its optimization until later, and proceed in two steps to find the optimal $\{M_\ell \}_{\ell =0}^L$ and $\{h_\ell \}_{\ell =0}^L$. Assuming general work estimates $\{W_\ell \}_{\ell =0}^L$ in (2.4) and general variance estimates of $\{V_\ell \}_{\ell =0}^L$, we assume equality in (2.9b) and introduce the Lagrange multiplier $\lambda $ to obtain the Lagrangian

$$\begin{aligned} {{\mathcal {L}}}\left( \{M_\ell \}_{\ell =0}^L,\lambda \right)&= \sum _{\ell =0}^{L} M_\ell W_\ell +\lambda \left\{ \sum _{\ell =0}^L \frac{V_\ell }{M_\ell }- \left( \theta \frac{{\mathrm {TOL}}}{C_\alpha }\right) ^2 \right\} . \end{aligned}$$

The requirement that the variation of the Lagrangian with respect to $M_\ell $ is zero, gives $M_\ell = \sqrt{\lambda \frac{V_\ell }{W_\ell }}.$ Solving for $\lambda $ in the variance constraint (2.9b) and substituting back leads to (2.13). Substituting this optimal $M_\ell $ in the total work (2.4) yields

$$\begin{aligned} W({\mathbf {H}})&= \left( \frac{C_\alpha }{ \theta {\mathrm {TOL}}} \right) ^2 \left( \sum _{\ell =0}^L \sqrt{W_\ell V_\ell } \right) ^2. \end{aligned}$$

(5.1)

We proceed to find the optimal $\{h_\ell \}_{\ell =0}^L$ under the particular models (2.12). The total work (5.1) is minimized when

$$\begin{aligned} \sum _{\ell =0}^L \sqrt{W_\ell V_\ell }&= \sqrt{\frac{{V_0}}{h_0^{d\gamma }}} + \sqrt{Q_S}\sum _{\ell =1}^L \sqrt{\frac{h_{\ell -1}^{q_2}}{h_\ell ^{d\gamma }}}, \end{aligned}$$

(5.2)

is minimized. Here the finest mesh, $h_L$, is given by the bias constraint (2.12b) as

$$\begin{aligned} h_L&= \left( \frac{(1-\theta )\,{\mathrm {TOL}}}{Q_W} \right) ^{\frac{1}{q_1}}, \end{aligned}$$

(5.3)

independently of the multilevel construction. Now, treat the coarsest mesh, $h_0$, as given and find the optimal $h_1,\ldots ,h_{L-1}$ that minimize

$$\begin{aligned} \frac{1}{\sqrt{Q_S}}\sum _{\ell =1}^L \sqrt{W_\ell V_\ell }&= \sum _{\ell =1}^L \sqrt{\frac{h_{\ell -1}^{q_2}}{h_\ell ^{d\gamma }}}. \end{aligned}$$

(5.4)

The requirement that the derivative of this sum with respect to $h_\ell $ equals zero, for $\ell =1,\ldots ,L-1$, leads to the optimality condition

$$\begin{aligned} q_2 h_\ell ^{\left( \frac{q_2+d\gamma }{2}\right) }&= d\gamma h_{\ell -1}^{\left( \frac{q_2}{2}\right) } h_{\ell +1}^{\left( \frac{d\gamma }{2}\right) }, \end{aligned}$$

which after taking the logarithm and using $\chi $ defined in (2.14), leads to

$$\begin{aligned} - \log {\left( h_{\ell +1}\right) } + \left( 1+\chi \right) \log {\left( h_\ell \right) } - \chi \log {\left( h_{\ell -1}\right) }&= -\frac{2}{d\gamma } \log {\left( \chi \right) }. \end{aligned}$$

(5.5)

This is a second order linear difference whose solution depends on $\chi $.

1.1.1 Appendix 1.1.1: For $\chi =1$

This section provides proofs of Theorem 2.1, Lemma 2.1, and Corollary 2.1. The solution of the difference Eq. (5.5) for the case $\chi =1$ is the geometric sequence

$$\begin{aligned} h_\ell = h_0 {\beta } ^{-\ell },\,\, \text {with}\, {\beta } =\left( \frac{h_0}{h_L}\right) ^{1/L}. \end{aligned}$$

(5.6)

In other words, all $h_\ell $ are defined in terms of $h_0$ and $h_L$, where the latter is determined by $\theta $ through (5.3) and we solve for the former by setting the derivative of (5.2) with respect to $h_0$ equal to zero. This optimality condition becomes (for $q_2=d\gamma $)

$$\begin{aligned} h_1 = \left( \frac{Q_S}{V_0}\right) ^{\frac{1}{q_2}} h_0^2. \end{aligned}$$

Combining this expression with (5.6) for $\ell =1$ and solving for $h_0$ yields

$$\begin{aligned} h_0 = h_L^\frac{1}{L+1} \left( \frac{V_0}{Q_S}\right) ^\frac{L}{q_2 (L+1)}. \end{aligned}$$

(5.7)

Substituting this expression and (5.3) in the expression for $ {\beta } $ in (5.6) we obtain (2.15c). Moreover, substituting (2.15c) and (5.6) and (2.10) and (2.5) in (2.13) yields (2.15b). Next, we substitute (2.15b) and (2.15a) in (2.6) to obtain the optimal work for $q_2=d\gamma $

$$\begin{aligned} W = \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}} \right) ^2 \left( \sqrt{V_0} h_0^\frac{-q_2}{2} + \sqrt{Q_S} {\beta } ^\frac{q_2}{2} L \right) ^2. \end{aligned}$$

(5.8)

Using (5.6) and (5.7), we obtain

$$\begin{aligned} W = \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}} \right) ^2 h_L^\frac{-q_2}{L+1} V_0^\frac{1}{L+1} Q_S^\frac{L}{L+1}\left( 1+L\right) ^2 . \end{aligned}$$

(5.9)

Substituting for $h_L$ from (5.3)

$$\begin{aligned} W = \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}} \right) ^2 \left( \frac{Q_W}{(1-\theta )\,{\mathrm {TOL}}} \right) ^{\frac{1}{ \eta (L+1)}} V_0^\frac{1}{L+1} Q_S^\frac{L}{L+1}\left( 1+L\right) ^2. \end{aligned}$$

(5.10)

Optimizing for $\theta $ yields (2.15d). Substituting back gives the work as a function of L

$$\begin{aligned} W(L) \!=\! C_\alpha ^2 {\mathrm {TOL}}^{-2(1+{{\mathfrak {e}}}(L))} Q_W^{2{{\mathfrak {e}}}(L)} V_0^{2 \eta {{\mathfrak {e}}}(L)} Q_S^{-2 \eta {{\mathfrak {e}}}(L)} Q_S \left( \frac{1}{2\eta }\right) ^2 \left( 1 + \frac{1}{{{\mathfrak {e}}}(L)}\right) ^{2(1+{{\mathfrak {e}}}(L))}, \end{aligned}$$

(5.11)

where ${{\mathfrak {e}}}(L) = \frac{1}{2\eta (L+1)}$. Treating L as a continuous variable and differentiating with respect to L yields

$$\begin{aligned} W'(L) = 2 W(L) {{\mathfrak {e}}}'(L) \left( C -y + \log (1+y) \right) , \end{aligned}$$

(5.12)

where ${y=2\eta (L+1)} > 0$ and ${C = \log \left( {\mathrm {TOL}}^{-1} Q_W V_0^\eta Q_S^{-\eta } \right) }$. Setting (5.12) to zero gives the equation

$$\begin{aligned} y - \log (1+y) = C. \end{aligned}$$

(5.13)

It follows that $\lim _{C \rightarrow \infty } \frac{y}{C} = 1$ which leads to (2.17) for the value of L and (2.18) for the work. Since $y>0$, Eq. (5.13) implies

$$\begin{aligned} 1<y C^{-1} \end{aligned}$$

(5.14)

for any $C>0$. Furthermore, for any $y>0$, it holds

$$\begin{aligned} \left( \frac{\exp (1)-1}{\exp (1)}\right) (1+y) - C \le 1+y - \log (1+y) - C, \end{aligned}$$

which together with (5.13) gives

$$\begin{aligned} y C^{-1} \le \frac{\exp (1)}{\exp (1)-1} + \frac{1}{(\exp (1)-1)C}, \end{aligned}$$

(5.15)

for any $C>0$. Inequalities (5.14) and (5.15) are (2.16).

1.1.2 Appendix 1.1.2: For $\chi \ne 1$

This section provides proofs of Theorem 2.2, Lemma 2.2, and Corollary 2.2. The solution of the difference Eq. (5.5) for the case $\chi \ne 1$ is

$$\begin{aligned} h_\ell&= h_0^{\left( \frac{\chi ^\ell -\chi ^L}{1-\chi ^L}\right) } h_L^{\left( \frac{1-\chi ^\ell }{1-\chi ^L}\right) } \chi ^{-\frac{2}{d\gamma }\left( \frac{L(1-\chi ^\ell ) -\ell (1-\chi ^L)}{(1-\chi )(1-\chi ^L)}\right) }. \end{aligned}$$

(5.16)

We now distinguish between two different cases for $h_0$: either we are free to choose the optimal $h_0\in {\mathbb {R}}_+$, or we have an upper bound on the coarsest mesh $h_0$. The first, idealized, situation will allow us to obtain explicit expressions for the optimal splitting parameter $\theta $ and the asymptotic work, and we start by considering this case. We return to the other case at the end of this section.

Unconstrained optimization of $h_0$ We take $h_1,\ldots ,h_L$ given by (5.16) and (5.3) and set the derivative of (5.2) with respect to $h_0$ equal to zero. This optimality condition becomes (after some straightforward simplifications)

$$\begin{aligned} -\frac{d\gamma }{2}\frac{\sqrt{{V_0}}}{h_0^{1+d\gamma /2}} +\frac{q_2}{2}\sqrt{Q_S}\frac{h_0^{q_2/2-1}}{h_1^{d\gamma /2}}&= 0, \end{aligned}$$

which, since all parameters are positive, is equivalent to

$$\begin{aligned} h_1&= \left( \frac{\chi ^2 Q_S}{{V_0}} \right) ^{\frac{1}{d\gamma }} h_0^{1+\chi }. \end{aligned}$$

Combining this expression for $h_1$ with the one in (5.16) and solving for $h_0$ gives

$$\begin{aligned} h_0&= h_L^{\frac{1-\chi }{1-\chi ^{L+1}}} \left( \frac{{V_0}}{Q_S}\right) ^{\frac{1}{d\gamma } \frac{1-\chi ^L}{1-\chi ^{L+1}}} \chi ^{-\frac{2}{d\gamma }\frac{1}{1-\chi } \left( L\frac{1-\chi }{1-\chi ^{L+1}}-\chi \frac{1-\chi ^L}{1-\chi ^{L+1}} \right) }, \end{aligned}$$

which after substituting back into (5.16) and using (5.3) yields (2.19a). Finally substituting these optimal mesh sizes into (2.13) yields (2.19b).

Optimal splitting parameter $\theta $ Now the sequences $\{h_\ell \}_{\ell =0}^L$ and $\{M_\ell \}_{\ell =0}^L$ are determined in terms of the still not optimized L and $\theta $ as well as measurable model parameters. The work per level in (2.6) becomes

$$\begin{aligned} \frac{M_\ell }{h_\ell ^{d\gamma }}&= \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}}\right) ^2 \left( \frac{Q_W}{(1-\theta )\,{\mathrm {TOL}}} \right) ^{\frac{1}{\eta }\frac{1-\chi }{1-\chi ^{L+1}}} V_0 \left( \frac{Q_S}{V_0}\right) ^{\left\{ \frac{1-\chi ^L}{1-\chi ^{L+1}} \right\} } \\&\quad \times \, \frac{1-\chi ^{L+1}}{1-\chi } \chi ^{\left\{ -\frac{2\chi }{1-\chi }\frac{1-\chi ^L}{1-\chi ^{L+1}} +L\frac{1+\chi ^{L+1}}{1-\chi ^{L+1}}\right\} } \chi ^{-\ell }. \end{aligned}$$

Since the only $\ell $-dependent factor in the right hand side is the last one, $\chi ^{-\ell }$, and using $\sum _{\ell =0}^L\chi ^{-\ell }=\chi ^{-L}(1-\chi ^{L+1})/(1-\chi )$, the total work in (2.6) becomes

$$\begin{aligned} W{(L,\theta ,{\mathrm {TOL}})}&= w_1\left( L,{\mathrm {TOL}}\right) \, w_2(L) \, f\left( L,\theta \right) \left( \frac{1-\chi ^{L+1}}{1-\chi }\right) ^2, \end{aligned}$$

(5.17)

with

$$\begin{aligned} w_1\left( L,{\mathrm {TOL}}\right)&= {\mathrm {TOL}}^{-\left( 2+\frac{1}{\eta }\frac{1-\chi }{1-\chi ^{L+1}}\right) }, \end{aligned}$$

(5.18a)

$$\begin{aligned} w_2(L)&= C_\alpha ^2 \, V_0 \left( \frac{Q_S}{V_0}\right) ^{\left\{ \frac{1-\chi ^L}{1-\chi ^{L+1}} \right\} } Q_W^{\left\{ \frac{1}{\eta }\frac{1-\chi }{1-\chi ^{L+1}}\right\} } \nonumber \\&\quad \times \, \chi ^{\left\{ -\frac{2\chi }{1-\chi }\frac{1-\chi ^L}{1-\chi ^{L+1}} +2L\frac{\chi ^{L+1}}{1-\chi ^{L+1}}\right\} }, \end{aligned}$$

(5.18b)

$$\begin{aligned} f\left( L,\theta \right)&= \frac{1}{ \theta ^2 \left( 1-\theta \right) ^{\frac{1}{\eta }\frac{1-\chi }{1-\chi ^{L+1}}} }. \end{aligned}$$

(5.18c)

Thus given the value of L the dependence on the splitting parameter $\theta $ is straightforward, and the minimal work for a given L is obtained with the minimizer of (5.18c), namely (2.19c). With this optimal splitting parameter $\theta $ in (5.17) the total work as a function of the yet to be determined parameter L and the tolerance is

$$\begin{aligned} W{\left( L,{\mathrm {TOL}}\right) }&= w_1(L,{\mathrm {TOL}}) \, w_2(L) \, w_3 (L), \end{aligned}$$

(5.19)

with

$$\begin{aligned} w_3(L)&= \left( \frac{1}{2\eta }\right) ^2 \left( 1+2\eta \frac{1-\chi ^{L+1}}{1-\chi } \right) ^{2\left( 1+\frac{1}{2\eta }\frac{1-\chi }{1-\chi ^{L+1}}\right) }. \end{aligned}$$

(5.20)

Optimal number of levels

The optimal integer L seems impossible to find analytically. In practical computations we instead perform an extensive search over a small range of integer values. In the analysis below we treat L as a real parameter to obtain the bounds (2.20) that delimit the range of integer values that must be tested, and allow a complexity analysis as ${\mathrm {TOL}}\rightarrow 0$ without an exactly determined L.

Treating L as a real parameter, we differentiate the work (5.19) with respect to L to obtain

$$\begin{aligned} \frac{\partial W}{\partial L}&= \frac{\partial w_1}{\partial L} \, w_2 \, w_3 + w_1 \, \frac{\partial w_2}{\partial L} \, w_3 + w_1 \, w_2 \, \frac{\partial w_3}{\partial L}, \end{aligned}$$

where, introducing the shorthand

$$\begin{aligned} \xi (L) = 2\eta \frac{1-\chi ^{L+1}}{1-\chi }\quad \text {for } L\in [0,\infty ), \end{aligned}$$

(5.21)

and using the constants $c_1$ and $c_2$ in (2.21) we write

$$\begin{aligned} \frac{\partial w_1}{\partial L}&= w_1(L,{\mathrm {TOL}})\,\frac{\log {(\chi )}\,\chi ^{L+1}}{1-\chi ^{L+1}} \frac{2}{\xi (L)} \log {\left( {\mathrm {TOL}}^{-1}\right) }, \end{aligned}$$

(5.22a)

$$\begin{aligned} \frac{\partial w_2}{\partial L}&= w_2(L)\, \frac{\log {(\chi )}\,\chi ^{L+1}}{1-\chi ^{L+1}} \frac{2}{\xi (L)} \left( c_1 + -c_2(L+1) + \xi (L) \right) , \end{aligned}$$

(5.22b)

$$\begin{aligned} \frac{\partial w_3}{\partial L}&= w_3(L)\,\frac{\log {(\chi )}\,\chi ^{L+1}}{1-\chi ^{L+1}} \frac{2}{\xi (L)} \big ( \log {\left( 1+\xi (L)\right) }-\xi (L) \big ), \end{aligned}$$

(5.22c)

so that

$$\begin{aligned} \frac{\partial W}{\partial L}(L,{\mathrm {TOL}})&= u(L,{\mathrm {TOL}})v(L,{\mathrm {TOL}}), \end{aligned}$$

(5.23)

with

$$\begin{aligned} u(L,{\mathrm {TOL}})&= W{(L,{\mathrm {TOL}})}\, \frac{\log {(\chi )}\,\chi ^{L+1}}{1-\chi ^{L+1}} \frac{2}{\xi (L)}, \\ v(L,{\mathrm {TOL}})&= \log {\left( {\mathrm {TOL}}^{-1}\right) } + c_1 + -c_2 (L+1) + \log {\left( 1+\xi (L)\right) }. \end{aligned}$$

Clearly $u(L,{\mathrm {TOL}})<0$ for all $\chi \in {\mathbb {R}}_+{\setminus }\{1\}$ so the sign of $\partial W/\partial L$ is the opposite of the sign of $v(L,{\mathrm {TOL}})$. For a fixed $\chi \in {\mathbb {R}}_+{\setminus }\{1\}$ we have

$$\begin{aligned} v(L,{\mathrm {TOL}}) > 0&\Leftrightarrow L+1 < \frac{1}{c_2} \left( \log {\left( {\mathrm {TOL}}^{-1}\right) }+c_1+\log { \left( 1+\xi (L)\right) }\right) , \end{aligned}$$

and, since $\xi (L)\ge \xi (0)=2\eta $,

$$\begin{aligned} L+1 < \frac{1}{c_2} \left( \log {\left( {\mathrm {TOL}}^{-1}\right) }+c_1+\log {\left( 1+2\eta \right) }\right)&\Rightarrow v(L,{\mathrm {TOL}}) > 0 \Leftrightarrow \frac{\partial W}{\partial L} < 0. \end{aligned}$$

(5.24)

For the opposite inequality,

$$\begin{aligned} v(L,{\mathrm {TOL}}) < 0&\Leftrightarrow L+1 > \frac{1}{c_2} \left( \log {\left( {\mathrm {TOL}}^{-1}\right) }+c_1+\log {\left( 1+\xi (L) \right) }\right) , \end{aligned}$$

we distinguish between the cases $0<\chi <1$ and $1<\chi $. When $0<\chi <1$ we have the upper bound $\xi (L)<\frac{2\eta }{1-\chi }$ and consequently

$$\begin{aligned} L+1 > \frac{1}{c_2} \left( \log {\left( {\mathrm {TOL}}^{-1}\right) }+c_1+\log {\left( 1+ \frac{2\eta }{1-\chi }\right) }\right)&\Rightarrow \frac{\partial W}{\partial L} > 0,&\chi \in (0,1). \end{aligned}$$

(5.25)

In contrast $\xi (L)$ is unbounded when $1<\chi $ but, since the definitions of $\chi $ and $\eta $ and the relation between strong and weak convergence orders implies that $2\eta \ge \chi $, we have

$$\begin{aligned} \log {\left( 1+\xi (L)\right) }&< \log {\left( \frac{2\eta }{\chi -1}\right) } + (L+1)\log {\chi }, \end{aligned}$$

and

$$\begin{aligned} c_2&\ge \frac{\chi }{\chi -1}\log {\chi }, \end{aligned}$$

which gives the bound

$$\begin{aligned} \frac{1}{c_2} \log {\left( 1+\xi (L)\right) }&< \frac{\chi -1}{\chi }(L+1) + \frac{1}{c_2} \log {\left( \frac{2\eta }{\chi -1}\right) }. \end{aligned}$$

Hence

$$\begin{aligned} L+1 - \frac{1}{c_2} \log {\left( 1+\xi (L)\right) }&> \frac{L+1}{\chi } - \frac{1}{c_2} \log {\left( \frac{2\eta }{\chi -1}\right) }, \end{aligned}$$

and it follows that

$$\begin{aligned} L+1 > \frac{\chi }{c_2} \left( \log {\left( {\mathrm {TOL}}^{-1}\right) }+c_1+\log {\left( \frac{2\eta }{\chi -1}\right) }\right)&\Rightarrow \frac{\partial W}{\partial L} > 0,&\chi \in (1,\infty ). \end{aligned}$$

(5.26)

Combining (5.24) with (5.25) and (5.26), we obtain the bounds (2.20).

Optimal hierarchies with an upper bound on $h_0$ Practical computations will impose an upper limit on the mesh sizes, $h_0\le h_{{\mathrm {max}}}$. If the mesh sizes (2.19a) violate such a bound, we must modify our analysis slightly. We now consider $h_0$ given as one of the coarsest mesh sizes that can be realized in the given discretization, and analyze the case $L\ge 1$. Using the optimal mesh sizes (5.16) yields

$$\begin{aligned} \sqrt{\frac{h_{\ell -1}^{q_2}}{h_\ell ^{d\gamma }}}&= h_0^{\frac{d\gamma }{2}\chi ^L\frac{1-\chi }{1-\chi ^L}} h_L^{-\frac{d\gamma }{2}\frac{1-\chi }{1-\chi ^L}} \chi ^{\left( \frac{L}{1-\chi ^L}-\frac{\chi }{1-\chi } -\ell \right) }, \end{aligned}$$

where the only $\ell $-dependent factor in the right hand side is the last one, $\chi ^{-\ell }$, so that the sum in (5.4) is

$$\begin{aligned} \sum _{\ell =1}^L \sqrt{\frac{h_{\ell -1}^{q_2}}{h_\ell ^{d\gamma }}}&= \left( \frac{h_0^{\left( \chi ^L\right) }}{h_L}\right) ^{\frac{d\gamma }{2}\frac{1-\chi }{1-\chi ^L}} \chi ^{\left( \frac{L\chi ^L}{1-\chi ^L}-\frac{\chi }{1-\chi }\right) } \frac{1-\chi ^L}{1-\chi }. \end{aligned}$$

In this sum only $h_L$ depends on $\theta $ through (5.3). Keeping L fixed we wish to minimize the total work, which by (5.1)–(5.2) is

$$\begin{aligned} W({\mathbf {H}})&= \left( \frac{C_\alpha }{ \theta {\mathrm {TOL}}} \right) ^2 \left( \sqrt{\frac{{V_0}}{h_0^{d\gamma }}} + \sqrt{Q_S} \left( \frac{h_0^{\left( \chi ^L\right) }}{h_L}\right) ^ {\frac{d\gamma }{2}\frac{1-\chi }{1-\chi ^L}} \chi ^{\left( \frac{L\chi ^L}{1-\chi ^L}-\frac{\chi }{1-\chi }\right) } \frac{1-\chi ^L}{1-\chi } \right) ^2, \end{aligned}$$

with respect to $\theta $. Letting

$$\begin{aligned} \varDelta&= \frac{1}{2\eta }\frac{1-\chi }{1-\chi ^L}, \end{aligned}$$

and

$$\begin{aligned} C&= \sqrt{\frac{Q_S}{{V_0}}} h_0^{\frac{d\gamma }{2}\frac{1-\chi ^{L+1}}{1-\chi ^L}} \chi ^{\left( \frac{L\chi ^L}{1-\chi ^L}-\frac{\chi }{1-\chi }\right) } \frac{1-\chi ^L}{1-\chi } \left( \frac{Q_W}{{\mathrm {TOL}}}\right) ^\varDelta , \end{aligned}$$

we obtain

$$\begin{aligned} W({\mathbf {H}})&\propto \tilde{f}\left( \theta ,L,h_0\right) = \frac{1}{\theta ^2}\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta } \right) ^2, \end{aligned}$$

with the optimality condition

$$\begin{aligned} \frac{\partial \tilde{f}}{\partial \theta }&= \frac{2}{\theta ^2}\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta }\right) \left( \frac{C\varDelta }{\left( 1-\theta \right) ^{\varDelta +1}} -\frac{1}{\theta }\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta }\right) \right) = 0, \end{aligned}$$

where

$$\begin{aligned} \frac{2}{\theta ^2}\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta }\right)&> 0. \end{aligned}$$

In this case when $h_0$ is constrained we no longer have an explicit expression for the optimal $\theta $. However, using

$$\begin{aligned} \frac{C\varDelta }{\left( 1-\theta \right) ^{\varDelta +1}} -\frac{1}{\theta }\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta }\right)&< \frac{C}{\left( 1-\theta \right) ^\varDelta } \left( \frac{\varDelta }{1-\theta }-\frac{1}{\theta }\right) , \end{aligned}$$

and that

$$\begin{aligned} \frac{\varDelta }{1-\theta }-\frac{1}{\theta } = 0&\Leftrightarrow \theta = \frac{1}{1+\varDelta }, \end{aligned}$$

we conclude that the optimal $\theta $ satisfies

$$\begin{aligned} \frac{1}{1+\varDelta }&\le \theta . \end{aligned}$$

(5.27)

Similarly, from the inequality

$$\begin{aligned} \frac{C\varDelta }{\left( 1-\theta \right) ^{\varDelta +1}} -\frac{1}{\theta }\left( 1+\frac{C}{\left( 1-\theta \right) ^\varDelta }\right)&> \frac{1}{\left( 1-\theta \right) ^\varDelta } \left( \frac{C\varDelta }{1-\theta }-\frac{1+C}{\theta }\right) , \end{aligned}$$

and the relation

$$\begin{aligned} \frac{C\varDelta }{1-\theta }-\frac{1+C}{\theta } = 0&\Leftrightarrow \theta = \frac{1+C}{1+C+\varDelta }, \end{aligned}$$

we obtain an upper bound for $\theta $, namely

$$\begin{aligned} \theta&\le \frac{1+C}{1+C+C\varDelta }. \end{aligned}$$

(5.28)

Finally, combining (5.27) and (5.28) we have the following bounds for the optimal $\theta $:

$$\begin{aligned} \left( 1+\frac{1}{2\eta }\frac{1-\chi }{1-\chi ^L}\right) ^{-1} \le \theta \le \left( 1+\frac{1}{2\eta }\frac{1-\chi }{1-\chi ^L}\frac{C}{1+C}\right) ^{-1}, \end{aligned}$$

(5.29)

where the upper bound has a non-trivial dependence on ${\mathrm {TOL}}$ and L through C.

1.2 Appendix 1.2: Heuristic optimization of geometric hierarchies

This section motivates the results in Sect. 2.3 and Corollary 2.3 where we optimized geometric hierarchies defined by $h_\ell = h_0 {\beta } ^{-\ell }$ for given $h_0$ and $ {\beta } > 1$. In this case, the work and variance models are in (2.28) and L is must satisfy the bias constraint

$$\begin{aligned} L&\ge \frac{\log {(h_0)} - \frac{1}{q_1}\log {\left( \frac{(1-\theta ) {\mathrm {TOL}}}{Q_W}\right) } }{\log ( {\beta } )}. \end{aligned}$$

(5.30)

We distinguish between two cases:

• $\chi =1$: Or equivalently $q_2 = d\gamma $. In this case, the total work defined in (5.1) simplifies to

$$\begin{aligned} W= \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}}\right) ^2 \left( \sqrt{V_0} h_0^\frac{ -q_2}{2} + L \sqrt{Q_S} \beta ^{\frac{q_2}{2}} \right) ^2, \end{aligned}$$

(5.31)

We make the simplification of treating L as a real parameter and substitute the lower bound of (5.30) in (5.31) and optimize with respect to $ {\beta } $ to get $ {\beta } =\exp \left( \frac{2}{q_2}\right) $. Substituting this choice and (2.30), the total work satisfies

$$\begin{aligned} \frac{W}{{\mathrm {TOL}}^{-2} \left( \log {{\mathrm {TOL}}} \right) ^2} \rightarrow \theta ^{-2} C_\alpha ^2 Q_S \exp (2) \left( \frac{1}{2\eta }\right) ^2, \qquad \text {as } {\mathrm {TOL}}\rightarrow 0. \end{aligned}$$

Optimizing for $\theta $ suggests that $\theta \rightarrow 1$ as ${\mathrm {TOL}}\rightarrow 0$ and (2.18) follows.

• $\chi \ne 1$: In this case, the total work defined in (5.1) simplifies to

$$\begin{aligned} W= \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}} \right) ^2 h_0^{d\gamma (\chi -1)} \left( \sqrt{V_0} h_0^\frac{-q_2}{2} + \sqrt{Q_S} \frac{ \left( 1- {\beta } ^\frac{L(d\gamma -q_2)}{2} \right) }{ {\beta } ^{-\frac{d\gamma }{2}} - {\beta } ^{-\frac{q_2}{2}}} \right) ^2, \end{aligned}$$

(5.32)

for a given $L, h_0$ and $\theta $. Again, we make the simplification of treating L as a real parameter and substitute the lower bound (5.30) to obtain

$$\begin{aligned} {\beta } ^\frac{L(d\gamma -q_2)}{2} = \left( \frac{(1-\theta ){\mathrm {TOL}}}{Q_W} \right) ^\frac{\chi -1}{2\eta } h_0^\frac{d\gamma (1-\chi )}{2}, \end{aligned}$$

for any $ {\beta } $. Substituting back in (5.32) and optimizing with respect to $ {\beta } $ to minimize the work gives (2.29). Substituting this optimal $ {\beta } $ in (5.32) yields

$$\begin{aligned} W= \left( \frac{C_\alpha }{\theta {\mathrm {TOL}}} \right) ^2 h_0^{d\gamma (\chi -1)} \left( \sqrt{V_0} h_0^\frac{-q_2}{2} + \sqrt{Q_S} \frac{\chi ^\frac{\chi }{\chi -1}}{\chi -1} \left( 1 - \chi ^{-L}\right) \right) ^2. \end{aligned}$$

(5.33)

Asymptotically, using (2.30) as ${\mathrm {TOL}}\rightarrow 0$ yields (2.23) with the following constants

$$\begin{aligned} C_1&= (1-\theta )^\frac{\chi -1}{\eta }\theta ^{-2} C_\alpha ^2 Q_W^\frac{1-\chi }{\eta } Q_S \left( \frac{\chi ^\frac{\chi }{\chi -1}}{\chi -1} \right) ^2, \end{aligned}$$

(5.34a)

$$\begin{aligned} C_2&= \theta ^{-2} C_\alpha ^2 h_0^{d\gamma (\chi -1)} \left( \sqrt{V_0} h_0^\frac{-q_2}{2} + \sqrt{Q_S} \frac{\chi ^\frac{\chi }{\chi -1}}{\chi -1} \right) ^2. \end{aligned}$$

(5.34b)

Optimizing these constants with respect to $\theta $ yields (2.25) and substituting this and (2.32) back yields (2.24a) and (2.31) for $C_1$ and $C_2$, respectively. This, as Remark 2.6 mentions, shows that the asymptotic computational complexities of optimal non-geometric and geometric hierarchies are the same.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Haji-Ali, AL., Nobile, F., von Schwerin, E. et al. Optimization of mesh hierarchies in multilevel Monte Carlo samplers. Stoch PDE: Anal Comp 4, 76–112 (2016). https://doi.org/10.1007/s40072-015-0049-7

Download citation

Received: 10 March 2014
Published: 02 June 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s40072-015-0049-7

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Optimization of mesh hierarchies in multilevel Monte Carlo samplers

Abstract

Access this article

Similar content being viewed by others

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Topology optimization of multi-scale structures: a review

Recent advances and applications of surrogate models for finite element method computations: a review

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Appendix 1: Derivations and proofs

1.1 Appendix 1.1: Optimal hierarchies given \(h_0\), \(\theta \), and L

1.1.1 Appendix 1.1.1: For \(\chi =1\)

1.1.2 Appendix 1.1.2: For \(\chi \ne 1\)

1.2 Appendix 1.2: Heuristic optimization of geometric hierarchies

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

Optimization of mesh hierarchies in multilevel Monte Carlo samplers

Abstract

Access this article

Similar content being viewed by others

Eighty Years of the Finite Element Method: Birth, Evolution, and Future

Topology optimization of multi-scale structures: a review

Recent advances and applications of surrogate models for finite element method computations: a review

Notes

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding authors

Appendix 1: Derivations and proofs

Appendix 1: Derivations and proofs

1.1 Appendix 1.1: Optimal hierarchies given \(h_0\), \(\theta \), and L

1.1.1 Appendix 1.1.1: For \(\chi =1\)

1.1.2 Appendix 1.1.2: For \(\chi \ne 1\)

1.2 Appendix 1.2: Heuristic optimization of geometric hierarchies

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation