2020 | OriginalPaper | Chapter

Neural Control Variates for Monte Carlo Variance Reduction

Authors : Ruosi Wan, Mingjun Zhong, Haoyi Xiong, Zhanxing Zhu

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

In statistics and machine learning, intractable integrals are often approximated with unbiased Monte Carlo estimators, but in many applications the variance of the estimate is high. Control variates are a well-known technique for reducing this variance. They are typically constructed from predefined parametric functions or polynomials, fitted using samples drawn from the relevant distributions. Instead, we propose to construct control variates by learning neural networks, which can handle complex test functions. In many applications, obtaining a large number of samples for Monte Carlo estimation is expensive, so training a neural network with the original loss function can lead to severe overfitting; this issue has not been reported in the existing literature on control variates with neural networks. We therefore introduce constrained control variates with neural networks to alleviate the overfitting. We apply the proposed control variates to both toy and real data problems, including a synthetic data problem, Bayesian model evidence evaluation, and Bayesian neural networks. Experimental results demonstrate that our method achieves significant variance reduction compared to other methods.
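As background for the abstract, the classical (non-neural) control variate idea it builds on can be sketched as follows. This is a hypothetical illustration, not the paper's method: we estimate \(\mathbb{E}[\exp(X)]\) for \(X \sim \mathcal{N}(0,1)\), whose exact value is \(e^{1/2}\), using \(g(X)=X\) (known mean 0) as the control variate.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)

f = np.exp(x)  # integrand evaluations; E[exp(X)] = exp(1/2)
g = x          # control variate with known expectation E[g(X)] = 0

# The coefficient c* = Cov(f, g) / Var(g) minimises the variance of f - c*g.
c = np.cov(f, g)[0, 1] / np.var(g)

plain_estimate = f.mean()          # vanilla Monte Carlo estimator
cv_estimate = (f - c * g).mean()   # control-variate estimator, same expectation
```

Both estimators are unbiased, but `f - c * g` has strictly smaller sample variance whenever `f` and `g` are correlated; the paper's contribution is to replace the fixed function `g` with a learned neural network.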


Footnotes
1
In the following, we refer to the trial function \(Q(\varvec{\theta })\) as the constant- or linear-type trial function \(\varPhi (\varvec{\theta })\).
 
Metadata
Title
Neural Control Variates for Monte Carlo Variance Reduction
Authors
Ruosi Wan
Mingjun Zhong
Haoyi Xiong
Zhanxing Zhu
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-46147-8_32
