Published in: Soft Computing 6/2020

29.07.2019 | Methodologies and Application

Improving predictive uncertainty estimation using Dropout–Hamiltonian Monte Carlo

Authors: Sergio Hernández, Diego Vergara, Matías Valdenegro-Toro, Felipe Jorquera


Abstract

Estimating predictive uncertainty is crucial for many computer vision tasks, from image classification to autonomous driving systems. Hamiltonian Monte Carlo (HMC) is a sampling method for performing Bayesian inference. Dropout regularization, on the other hand, has been proposed as an approximate model averaging technique that tends to improve generalization in large-scale models such as deep neural networks. Although HMC provides convergence guarantees for most standard Bayesian models, it does not handle the discrete parameters arising from Dropout regularization. In this paper, we present a robust methodology for improving predictive uncertainty in classification problems, based on Dropout and HMC. Even though Dropout induces a non-smooth energy function with no such convergence guarantees, the resulting discretization of the Hamiltonian proves empirically successful. The proposed method allows effective estimation of the predictive accuracy and provides better generalization for difficult test examples.
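
For readers unfamiliar with HMC, the following is a minimal sketch of the plain algorithm (leapfrog integration followed by a Metropolis accept/reject step). It is not the Dropout-HMC method of the paper: the target here is a one-dimensional standard Gaussian with a smooth energy, chosen purely as an assumption for illustration, and the names `hmc_sample`, `u`, `grad_u`, `step_size` and `n_leapfrog` are hypothetical.

```python
import math
import random


def hmc_sample(u, grad_u, q0, n_samples, step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a density proportional to exp(-u(q)) for scalar q.

    Illustrative plain HMC only; no Dropout masks are involved.
    """
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the auxiliary momentum
        q_new, p_new = q, p

        # Leapfrog discretization of the Hamiltonian dynamics:
        # half step in momentum, alternating full steps, final half step.
        p_new -= 0.5 * step_size * grad_u(q_new)
        for i in range(n_leapfrog):
            q_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new -= step_size * grad_u(q_new)
        p_new -= 0.5 * step_size * grad_u(q_new)

        # Metropolis correction for the discretization error.
        h_old = u(q) + 0.5 * p * p
        h_new = u(q_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


# Toy target (an assumption for this sketch): standard Gaussian, u(q) = q^2 / 2.
samples = hmc_sample(lambda q: 0.5 * q * q, lambda q: q, q0=0.0, n_samples=4000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

With a smooth energy such as this one, the sample mean and variance should approach 0 and 1. The abstract's point is that Dropout makes the energy non-smooth, so the guarantees behind this basic scheme no longer apply directly.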


Metadata
Title
Improving predictive uncertainty estimation using Dropout–Hamiltonian Monte Carlo
Authors
Sergio Hernández
Diego Vergara
Matías Valdenegro-Toro
Felipe Jorquera
Publication date
29.07.2019
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 6/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-019-04195-w
