2021 | OriginalPaper | Chapter

Self-bounding Majority Vote Learning Algorithms by the Direct Minimization of a Tight PAC-Bayesian C-Bound

Authors: Paul Viallard, Pascal Germain, Amaury Habrard, Emilie Morvant

Published in: Machine Learning and Knowledge Discovery in Databases. Research Track

Publisher: Springer International Publishing

Abstract

In the PAC-Bayesian literature, the C-Bound refers to an insightful relation between the risk of a majority vote classifier (under the zero-one loss) and the first two moments of its margin (i.e., the expected margin and the voters’ diversity). Until now, learning algorithms developed in this framework have minimized the empirical version of the C-Bound, instead of explicit PAC-Bayesian generalization bounds. In this paper, by directly optimizing PAC-Bayesian guarantees on the C-Bound, we derive self-bounding majority vote learning algorithms. Moreover, our gradient-descent-based algorithms are scalable and lead to accurate predictors paired with non-vacuous guarantees.
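For reference, the C-Bound in question can be stated as follows (restated from [15, 20]; the notation below is ours and may differ slightly from the paper's). For binary labels \(y \in \{-1, +1\}\), a distribution \(Q\) over voters \(h\), and the margin \(M_Q(x,y) = \mathbf{E}_{h \sim Q}[y\, h(x)]\), whenever the expected margin is positive,

\[ R_D(\mathrm{MV}_Q) \;\le\; 1 - \frac{\big(\mathbf{E}_{(x,y) \sim D}[M_Q(x,y)]\big)^2}{\mathbf{E}_{(x,y) \sim D}[M_Q(x,y)^2]}. \]

The numerator squares the first moment of the margin and the denominator is its second moment, which is how the expected margin and the voters’ diversity both enter the bound.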


Footnotes
1
The C-Bound was introduced by Breiman in the context of Random Forests [6].
 
2
update-\(\mathcal{Q}\) is a generic update function: it can be, for example, a standard gradient descent (GD) update, or the update rule of another algorithm such as Adam [18] or COCOB [27].
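As a rough illustration only (a minimal sketch, not the paper's algorithm): the loop below updates a posterior \(\mathcal{Q}\), parameterized by logits and kept on the simplex via a softmax, with Adam playing the role of update-\(\mathcal{Q}\). The toy data, the variable names, and the use of the empirical C-Bound as objective are our assumptions.

import torch

# Toy setup: H[j, i] is the +1/-1 prediction of voter j on example i.
n_voters, n_examples = 5, 100
H = torch.sign(torch.randn(n_voters, n_examples))
y = torch.sign(torch.randn(n_examples))

logits = torch.zeros(n_voters, requires_grad=True)
optimizer = torch.optim.Adam([logits], lr=0.05)  # or torch.optim.SGD for plain GD

for _ in range(200):
    Q = torch.softmax(logits, dim=0)   # posterior Q stays on the probability simplex
    margin = (Q @ H) * y               # per-example margin E_{h~Q}[y h(x)]
    mu1, mu2 = margin.mean(), (margin ** 2).mean()
    cbound = 1.0 - mu1 ** 2 / mu2      # empirical C-Bound: 1 - mu1^2 / mu2
    optimizer.zero_grad()
    cbound.backward()
    optimizer.step()                   # this step plays the role of update-Q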
 
3
The reader can refer to [4] for an introduction to interior-point methods.
 
4
For example, when using CVXPY [9], which relies on Disciplined Convex Programming (DCP) [16], maximizing a non-concave function is not possible.
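A minimal illustration of this DCP restriction (our example, not taken from the paper):

import cvxpy as cp

x = cp.Variable()
# square(x) is convex, so maximizing it violates the DCP rules:
problem = cp.Problem(cp.Maximize(cp.square(x)))
print(problem.is_dcp())  # False; calling problem.solve() raises cvxpy.error.DCPError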
 
5
Experiments are done with PyTorch [29] and CVXPY [9]. The source code is available at https://github.com/paulviallard/ECML21-PB-CBound.
 
6
Algorithm 2r is similar to Algorithm 2, but without the numerator of the C-Bound (i.e., the disagreement). More details are given in the Supplemental.
 
7
An overview of the datasets is presented in the Supplemental.
 
Literature
3. Boser, B., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: COLT (1992)
4. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
9. Diamond, S., Boyd, S.: CVXPY: a Python-embedded modeling language for convex optimization. J. Mach. Learn. Res. 17(1), 2909–2913 (2016)
11. Dziugaite, G.K., Roy, D.: Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data. In: UAI (2017)
12. Freund, Y.: Self bounding learning algorithms. In: COLT (1998)
13. Freund, Y., Schapire, R.: Experiments with a new boosting algorithm. In: ICML (1996)
14. Germain, P., Lacasse, A., Laviolette, F., Marchand, M.: PAC-Bayesian learning of linear classifiers. In: ICML (2009)
15. Germain, P., Lacasse, A., Laviolette, F., Marchand, M., Roy, J.: Risk bounds for the majority vote: from a PAC-Bayesian analysis to a learning algorithm. J. Mach. Learn. Res. (2015)
17. Kervadec, H., Dolz, J., Yuan, J., Desrosiers, C., Granger, E., Ayed, I.B.: Constrained deep networks: Lagrangian optimization via log-barrier extensions. CoRR abs/1904.04205 (2019)
18. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
19. Kuncheva, L.: Combining Pattern Classifiers: Methods and Algorithms. Wiley, Hoboken (2014)
20. Lacasse, A., Laviolette, F., Marchand, M., Germain, P., Usunier, N.: PAC-Bayes bounds for the risk of the majority vote and the variance of the Gibbs classifier. In: NIPS (2006)
21. Langford, J., Shawe-Taylor, J.: PAC-Bayes & margins. In: NIPS (2002)
23. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards deep learning models resistant to adversarial attacks. In: ICLR (2018)
24. Masegosa, A., Lorenzen, S.S., Igel, C., Seldin, Y.: Second order PAC-Bayesian bounds for the weighted majority vote. In: NeurIPS (2020)
27. Orabona, F., Tommasi, T.: Training deep networks without learning rates through coin betting. In: NIPS (2017)
28. Parrado-Hernández, E., Ambroladze, A., Shawe-Taylor, J., Sun, S.: PAC-Bayes bounds with data dependent priors. J. Mach. Learn. Res. 13(1), 3507–3531 (2012)
29. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: NeurIPS (2019)
30. Reeb, D., Doerr, A., Gerwinn, S., Rakitsch, B.: Learning Gaussian processes by minimizing PAC-Bayesian generalization bounds. In: NeurIPS (2018)
31. Roy, J., Laviolette, F., Marchand, M.: From PAC-Bayes bounds to quadratic programs for majority votes. In: ICML (2011)
32. Roy, J., Marchand, M., Laviolette, F.: A column generation bound minimization approach with PAC-Bayesian generalization guarantees. In: AISTATS (2016)
33. Seeger, M.: PAC-Bayesian generalisation error bounds for Gaussian process classification. J. Mach. Learn. Res. 3, 233–269 (2002)
34. Shawe-Taylor, J., Williamson, R.: A PAC analysis of a Bayesian estimator. In: COLT (1997)
Metadata
Title
Self-bounding Majority Vote Learning Algorithms by the Direct Minimization of a Tight PAC-Bayesian C-Bound
Authors
Paul Viallard
Pascal Germain
Amaury Habrard
Emilie Morvant
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-86520-7_11