2017 | Original Paper | Book Chapter

Bayesian Nonlinear Support Vector Machines for Big Data

Authors: Florian Wenzel, Théo Galy-Fajou, Matthäus Deutsch, Marius Kloft

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates and automatic hyperparameter search.
Code related to this chapter is available at: https://doi.org/10.6084/m9.figshare.5443627

Footnotes
1
Note that frequentist approaches can also lead to other forms of uncertainty estimates, e.g., in the form of confidence intervals. But since the classic SVM does not have a probabilistic formulation, these uncertainty estimates cannot be computed directly.
 
2
This follows directly since \(K_{mm}\) and \(A^{-\frac{1}{2}}\) are positive definite.
 
3
The RBF kernel is defined as \(k(x_1,x_2,\theta )=\exp \left( -\frac{||x_1-x_2||}{\theta ^2}\right) \), where \(\theta \) is the length scale parameter.
 
4
For a comparison with the stochastic variational inference version of GPC, see Sect. 5.3.
 
5
The length scale parameter tuning is not included in the training time. We found \(\theta = 5.0\) by our proposed automatic tuning approach.
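
The kernel formula in footnote 3 can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names `rbf_kernel` and `kernel_matrix` are hypothetical, and the code follows the footnote literally, using the unsquared Euclidean distance divided by \(\theta ^2\). Many references define the RBF kernel with a squared distance instead, so check the convention before reusing this.

```python
import math
from typing import List, Sequence


def rbf_kernel(x1: Sequence[float], x2: Sequence[float], theta: float) -> float:
    """Return exp(-||x1 - x2|| / theta^2), as written in footnote 3.

    Hypothetical helper for illustration; theta is the length scale.
    """
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same dimension")
    if theta == 0:
        raise ValueError("theta must be non-zero")
    # Euclidean distance (not squared), following the footnote literally.
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))
    return math.exp(-distance / theta ** 2)


def kernel_matrix(points: Sequence[Sequence[float]], theta: float) -> List[List[float]]:
    """Return the matrix of pairwise kernel values for a list of points."""
    return [[rbf_kernel(p, q, theta) for q in points] for p in points]


if __name__ == "__main__":
    # theta = 5.0 is the length scale quoted in footnote 5; the points are made up.
    pts = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
    for row in kernel_matrix(pts, theta=5.0):
        print(["%.4f" % value for value in row])
```

Identical points give a kernel value of 1, and the value decays toward 0 as the points move apart, with larger \(\theta \) giving a slower decay.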
 
Metadata
Title
Bayesian Nonlinear Support Vector Machines for Big Data
Authors
Florian Wenzel
Théo Galy-Fajou
Matthäus Deutsch
Marius Kloft
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-71249-9_19