2022 | Original Paper | Book chapter

Scalable Control Variates for Monte Carlo Methods Via Stochastic Optimization

Authors: Shijing Si, Chris J. Oates, Andrew B. Duncan, Lawrence Carin, François-Xavier Briol

Published in: Monte Carlo and Quasi-Monte Carlo Methods

Publisher: Springer International Publishing

Abstract

Control variates are a well-established tool to reduce the variance of Monte Carlo estimators. However, for large-scale problems including high-dimensional and large-sample settings, their advantages can be outweighed by a substantial computational cost. This paper considers control variates based on Stein operators, presenting a framework that encompasses and generalizes existing approaches that use polynomials, kernels and neural networks. A learning strategy based on minimizing a variational objective through stochastic optimization is proposed, leading to scalable and effective control variates. Novel theoretical results are presented to provide insight into the variance reduction that can be achieved, and an empirical assessment, including applications to Bayesian inference, is provided in support.
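
To make the idea in the abstract concrete, here is a minimal sketch of a Stein-based control variate fitted by minibatch stochastic gradient descent. Everything specific in it is an assumption chosen for illustration, not taken from the chapter: the target is a standard normal, the Stein operator is the one-dimensional Langevin operator \(\mathcal{L}u = u' - xu\), the function class is the polynomial basis \(u(x) \in \{1, x\}\), the objective is a plain mean squared residual, and the names `stein_features`, `fit_control_variate` and `estimates` are hypothetical. For brevity the same samples are used to fit the control variate and to form the estimate.

```python
import numpy as np

def stein_features(x):
    """Langevin Stein operator L u = u' - x u for an assumed N(0,1) target,
    applied to the basis functions u(x) = 1 and u(x) = x."""
    return np.stack([-x, 1.0 - x**2], axis=1)

def fit_control_variate(x, fx, steps=3000, batch=32, lr=0.02, seed=0):
    """Minibatch SGD on the mean squared residual (f - c - theta . L u)^2."""
    rng = np.random.default_rng(seed)
    phi = stein_features(x)
    theta, c = np.zeros(phi.shape[1]), 0.0
    for _ in range(steps):
        idx = rng.integers(0, len(x), size=batch)   # random minibatch
        resid = fx[idx] - c - phi[idx] @ theta
        # Gradient descent step on the squared residual w.r.t. theta and c.
        theta += lr * 2.0 * (phi[idx].T @ resid) / batch
        c += lr * 2.0 * resid.mean()
    return theta, c

def estimates(n=2000, seed=1):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    fx = x + x**2                      # toy integrand; exact mean under N(0,1) is 1
    theta, _ = fit_control_variate(x, fx)
    g = stein_features(x) @ theta      # control variate with zero mean under N(0,1)
    return fx.mean(), (fx - g).mean(), fx.var(), (fx - g).var()

if __name__ == "__main__":
    plain, cv, var_plain, var_cv = estimates()
    print("plain MC estimate:", plain)
    print("control variate estimate:", cv)
    print("sample variance before/after:", var_plain, var_cv)
```

In this toy case the integrand lies exactly in the span of a constant and the Stein features, so the fitted control variate can remove essentially all of the variance. Realistic integrands will not be this favourable, and richer function classes such as kernels or neural networks are then the natural candidates.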

Footnotes
1
To simplify presentation in the paper, we always assume \(\mathcal {U}\) is a maximal set of functions for which \(\mathcal {L}u\) is well-defined and \(\Pi [\mathcal {L}u] = 0\).
 
2
We emphasize that MC can be evaluated at negligible cost and we are not advocating that our methods should be preferred for this task.
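
Footnote 1 requires \(\Pi[\mathcal{L}u] = 0\) for every \(u \in \mathcal{U}\). As an illustration only, assume a one-dimensional target with smooth density \(\pi\) and take the Langevin Stein operator, a common choice that this page does not itself specify. The identity then follows from integration by parts, provided \(u\pi\) vanishes at \(\pm\infty\):

```latex
\begin{align*}
\mathcal{L}u(x) &= u'(x) + u(x)\,\frac{d}{dx}\log\pi(x), \\
\Pi[\mathcal{L}u] &= \int_{-\infty}^{\infty} \bigl(u'(x)\,\pi(x) + u(x)\,\pi'(x)\bigr)\,dx
 = \int_{-\infty}^{\infty} \frac{d}{dx}\bigl(u(x)\,\pi(x)\bigr)\,dx = 0 .
\end{align*}
```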
 
Metadata
Title
Scalable Control Variates for Monte Carlo Methods Via Stochastic Optimization
Authors
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
François-Xavier Briol
Copyright year
2022
DOI
https://doi.org/10.1007/978-3-030-98319-2_10
