Soft Computing

29 July 2019 | Methodologies and Application

Improving predictive uncertainty estimation using Dropout–Hamiltonian Monte Carlo

Authors: Sergio Hernández, Diego Vergara, Matías Valdenegro-Toro, Felipe Jorquera

Published in: Soft Computing | Issue 6/2020

Abstract

Estimating predictive uncertainty is crucial for many computer vision tasks, from image classification to autonomous driving systems. Hamiltonian Monte Carlo (HMC) is a sampling method for performing Bayesian inference. Dropout regularization, on the other hand, has been proposed as an approximate model averaging technique that tends to improve generalization in large-scale models such as deep neural networks. Although HMC provides convergence guarantees for most standard Bayesian models, it does not handle the discrete parameters arising from Dropout regularization. In this paper, we present a robust methodology for improving predictive uncertainty in classification problems, based on Dropout and HMC. Even though Dropout induces a non-smooth energy function with no such convergence guarantees, the resulting discretization of the Hamiltonian proves empirically successful. The proposed method makes it possible to estimate predictive accuracy effectively and provides better generalization for difficult test examples.
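To make the combination of Dropout and HMC more concrete, the following is a minimal sketch and not the authors' algorithm, whose details are not given in the abstract. It assumes a toy Bayesian logistic regression with a standard Gaussian prior, draws one Bernoulli dropout mask over the input features per HMC trajectory, and keeps that mask fixed during the leapfrog integration and the Metropolis test. All names (`energy_and_grad`, `dropout_hmc`, `keep_prob`, `X_toy`, and so on) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def energy_and_grad(w, X, y, mask, prior_var=1.0):
    # Energy U(w) = negative log posterior (up to a constant) of a logistic
    # regression whose inputs are multiplied by a fixed dropout mask.
    U = sum(wi * wi for wi in w) / (2.0 * prior_var)
    grad = [wi / prior_var for wi in w]
    for xi, yi in zip(X, y):
        xm = [m * v for m, v in zip(mask, xi)]
        z = sum(wi * v for wi, v in zip(w, xm))
        # log(1 + exp(z)) - y * z, written in a stable form.
        U += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
        p = sigmoid(z)
        for j, v in enumerate(xm):
            grad[j] += (p - yi) * v
    return U, grad


def leapfrog(w, r, X, y, mask, step, n_steps):
    # Standard leapfrog integrator; the mask stays fixed along the path.
    w, r = list(w), list(r)
    _, g = energy_and_grad(w, X, y, mask)
    for _ in range(n_steps):
        r = [ri - 0.5 * step * gi for ri, gi in zip(r, g)]
        w = [wi + step * ri for wi, ri in zip(w, r)]
        _, g = energy_and_grad(w, X, y, mask)
        r = [ri - 0.5 * step * gi for ri, gi in zip(r, g)]
    return w, r


def hamiltonian(w, r, X, y, mask):
    U, _ = energy_and_grad(w, X, y, mask)
    return U + 0.5 * sum(ri * ri for ri in r)


def dropout_hmc(X, y, n_samples=200, keep_prob=0.8, step=0.05,
                n_steps=10, seed=0):
    # Assumption of this sketch: one new dropout mask per trajectory.
    rng = random.Random(seed)
    d = len(X[0])
    w = [0.0] * d
    samples, accepted = [], 0
    for _ in range(n_samples):
        mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in range(d)]
        r = [rng.gauss(0.0, 1.0) for _ in range(d)]
        h0 = hamiltonian(w, r, X, y, mask)
        w_new, r_new = leapfrog(w, r, X, y, mask, step, n_steps)
        h1 = hamiltonian(w_new, r_new, X, y, mask)
        # Metropolis test on the change in total energy.
        if rng.random() < math.exp(min(0.0, h0 - h1)):
            w, accepted = w_new, accepted + 1
        samples.append(list(w))
    return samples, accepted / n_samples


def predict_proba(samples, x):
    # Predictive mean and variance of P(y = 1 | x) over the weight samples.
    probs = [sigmoid(sum(wi * v for wi, v in zip(w, x))) for w in samples]
    mean = sum(probs) / len(probs)
    var = sum((p - mean) ** 2 for p in probs) / len(probs)
    return mean, var


# Toy data (hypothetical): a bias column plus one feature.
X_toy = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y_toy = [0, 0, 0, 1, 1, 1]

samples, acceptance_rate = dropout_hmc(X_toy, y_toy)
mean, var = predict_proba(samples, [1.0, 1.5])
print(acceptance_rate, mean, var)
```

The spread of the per-sample probabilities returned by `predict_proba` is one simple way to read off predictive uncertainty. Whether holding the mask fixed per trajectory matches the discretization used in the paper is an assumption of this sketch.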

Title: Improving predictive uncertainty estimation using Dropout–Hamiltonian Monte Carlo
Authors: Sergio Hernández, Diego Vergara, Matías Valdenegro-Toro, Felipe Jorquera
Publication date: 29 July 2019
Publisher: Springer Berlin Heidelberg
Published in: Soft Computing, Issue 6/2020
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI: https://doi.org/10.1007/s00500-019-04195-w
