Skip to main content
Erschienen in: Neural Computing and Applications 24/2020

22.11.2018 | WSOM 2017

Rademacher complexity of margin multi-category classifiers

verfasst von: Yann Guermeur

Erschienen in: Neural Computing and Applications | Ausgabe 24/2020

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

One of the main open problems of the theory of margin multi-category pattern classification is the characterization of the optimal dependence of the confidence interval of a guaranteed risk on the three basic parameters which are the sample size m, the number C of categories and the scale parameter \(\gamma\). This is especially the case when working under minimal learnability hypotheses. The starting point is a basic supremum inequality whose capacity measure depends on the choice of the margin loss function. Then, transitions are made, from capacity measure to capacity measure. At some level, a structural result performs the transition from the multi-class case to the bi-class one. In this article, we highlight the advantages and drawbacks inherent to the three major options for this decomposition: using Rademacher complexities, covering numbers or scale-sensitive combinatorial dimensions.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Alon N, Ben-David S, Cesa-Bianchi N, Haussler D (1997) Scale-sensitive dimensions, uniform convergence, and learnability. J ACM 44(4):615–631MathSciNetCrossRef Alon N, Ben-David S, Cesa-Bianchi N, Haussler D (1997) Scale-sensitive dimensions, uniform convergence, and learnability. J ACM 44(4):615–631MathSciNetCrossRef
2.
Zurück zum Zitat Anthony M, Bartlett P (1999) Neural network learning: theoretical foundations. Cambridge University Press, CambridgeCrossRef Anthony M, Bartlett P (1999) Neural network learning: theoretical foundations. Cambridge University Press, CambridgeCrossRef
3.
Zurück zum Zitat Bartlett P (1998) The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Trans Inf Theory 44(2):525–536MathSciNetCrossRef Bartlett P (1998) The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Trans Inf Theory 44(2):525–536MathSciNetCrossRef
4.
Zurück zum Zitat Bartlett P, Shawe-Taylor J (1999) Generalization performance of support vector machines and other pattern classifiers. In: Schölkopf B, Burges C, Smola A (eds) Advances in kernel methods–support vector learning. The MIT Press, Cambridge, MA, pp 43–54 Bartlett P, Shawe-Taylor J (1999) Generalization performance of support vector machines and other pattern classifiers. In: Schölkopf B, Burges C, Smola A (eds) Advances in kernel methods–support vector learning. The MIT Press, Cambridge, MA, pp 43–54
5.
Zurück zum Zitat Berlinet A, Thomas-Agnan C (2004) Reproducing kernel Hilbert spaces in probability and statistics. Kluwer Academic Publishers, BostonCrossRef Berlinet A, Thomas-Agnan C (2004) Reproducing kernel Hilbert spaces in probability and statistics. Kluwer Academic Publishers, BostonCrossRef
6.
Zurück zum Zitat Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297MATH Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297MATH
7.
Zurück zum Zitat Daniely A, Sabato S, Ben-David S, Shalev-Shwartz S (2011) Multiclass learnability and the ERM principle. In: COLT’11, pp 207–232 Daniely A, Sabato S, Ben-David S, Shalev-Shwartz S (2011) Multiclass learnability and the ERM principle. In: COLT’11, pp 207–232
8.
Zurück zum Zitat Doǧan U, Glasmachers T, Igel C (2016) A unified view on multi-class support vector classification. J Mach Learn Res 17(45):1–32MathSciNetMATH Doǧan U, Glasmachers T, Igel C (2016) A unified view on multi-class support vector classification. J Mach Learn Res 17(45):1–32MathSciNetMATH
9.
Zurück zum Zitat Dudley R, Giné E, Zinn J (1991) Uniform and universal Glivenko–Cantelli classes. J Theor Probab 4(3):485–510MathSciNetCrossRef Dudley R, Giné E, Zinn J (1991) Uniform and universal Glivenko–Cantelli classes. J Theor Probab 4(3):485–510MathSciNetCrossRef
10.
Zurück zum Zitat Guermeur Y (2007) VC theory of large margin multi-category classifiers. J Mach Learn Res 8:2551–2594MathSciNetMATH Guermeur Y (2007) VC theory of large margin multi-category classifiers. J Mach Learn Res 8:2551–2594MathSciNetMATH
11.
Zurück zum Zitat Guermeur Y (2012) A generic model of multi-class support vector machine. Int J Intell Inf Database Syst 6(6):555–577 Guermeur Y (2012) A generic model of multi-class support vector machine. Int J Intell Inf Database Syst 6(6):555–577
12.
Zurück zum Zitat Guermeur Y (2017) \(\text{ L }_p\)-norm Sauer–Shelah lemma for margin multi-category classifiers. J Comput Syst Sci 89:450–473MathSciNetCrossRef Guermeur Y (2017) \(\text{ L }_p\)-norm Sauer–Shelah lemma for margin multi-category classifiers. J Comput Syst Sci 89:450–473MathSciNetCrossRef
14.
Zurück zum Zitat Kearns M, Schapire R (1994) Efficient distribution-free learning of probabilistic concepts. J Comput Syst Sci 48(3):464–497MathSciNetCrossRef Kearns M, Schapire R (1994) Efficient distribution-free learning of probabilistic concepts. J Comput Syst Sci 48(3):464–497MathSciNetCrossRef
15.
Zurück zum Zitat Kearns M, Schapire R, Sellie L (1994) Toward efficient agnostic learning. Mach Learn 17(2–3):115–141MATH Kearns M, Schapire R, Sellie L (1994) Toward efficient agnostic learning. Mach Learn 17(2–3):115–141MATH
16.
Zurück zum Zitat Kolmogorov A, Tihomirov V (1961) \(\epsilon\)-entropy and \(\epsilon\)-capacity of sets in functional spaces. Am Math Soc Transl Ser 2 17:277–364MathSciNet Kolmogorov A, Tihomirov V (1961) \(\epsilon\)-entropy and \(\epsilon\)-capacity of sets in functional spaces. Am Math Soc Transl Ser 2 17:277–364MathSciNet
17.
Zurück zum Zitat Kontorovich A, Weiss R (2014) Maximum margin multiclass nearest neighbors. In: ICML’14 Kontorovich A, Weiss R (2014) Maximum margin multiclass nearest neighbors. In: ICML’14
18.
Zurück zum Zitat Kuznetsov V, Mohri M, Syed U (2014) Multi-class deep boosting. In: NIPS 27, pp 2501–2509 Kuznetsov V, Mohri M, Syed U (2014) Multi-class deep boosting. In: NIPS 27, pp 2501–2509
19.
Zurück zum Zitat Lei Y, Doǧan U, Binder A, Kloft M (2015) Multi-class SVMs: from tighter data-dependent generalization bounds to novel algorithms. NIPS 28:2026–2034 Lei Y, Doǧan U, Binder A, Kloft M (2015) Multi-class SVMs: from tighter data-dependent generalization bounds to novel algorithms. NIPS 28:2026–2034
20.
Zurück zum Zitat Maurer A (2016) A vector-contraction inequality for Rademacher complexities. In: ALT’16, pp 3–17 Maurer A (2016) A vector-contraction inequality for Rademacher complexities. In: ALT’16, pp 3–17
21.
Zurück zum Zitat Mendelson S (2003) A few notes on statistical learning theory. In: Mendelson S, Smola A (eds) Advanced lectures on machine learning. Springer, Berlin, pp 1–40CrossRef Mendelson S (2003) A few notes on statistical learning theory. In: Mendelson S, Smola A (eds) Advanced lectures on machine learning. Springer, Berlin, pp 1–40CrossRef
23.
Zurück zum Zitat Mohri M, Rostamizadeh A, Talwalkar A (2012) Foundations of machine learning. The MIT Press, Cambridge, MAMATH Mohri M, Rostamizadeh A, Talwalkar A (2012) Foundations of machine learning. The MIT Press, Cambridge, MAMATH
24.
Zurück zum Zitat Pollard D (1984) Convergence of stochastic processes. Springer, New YorkCrossRef Pollard D (1984) Convergence of stochastic processes. Springer, New YorkCrossRef
25.
Zurück zum Zitat Talagrand M (2014) Upper and lower bounds for stochastic processes: modern methods and classical problems. Springer, BerlinCrossRef Talagrand M (2014) Upper and lower bounds for stochastic processes: modern methods and classical problems. Springer, BerlinCrossRef
26.
Zurück zum Zitat van der Vaart A, Wellner J (1996) Weak convergence and empirical processes, with applications to statistics. Springer series in statistics. Springer, New YorkCrossRef van der Vaart A, Wellner J (1996) Weak convergence and empirical processes, with applications to statistics. Springer series in statistics. Springer, New YorkCrossRef
27.
Zurück zum Zitat Vapnik V (1998) Statistical learning theory. Wiley, New YorkMATH Vapnik V (1998) Statistical learning theory. Wiley, New YorkMATH
28.
Zurück zum Zitat Vapnik V, Chervonenkis A (1971) On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab Appl XVI(2):264–280CrossRef Vapnik V, Chervonenkis A (1971) On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab Appl XVI(2):264–280CrossRef
29.
Zurück zum Zitat Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Casdagli M, Eubank S (eds) Nonlinear modeling and forecasting, SFI studies in the sciences of complexity, vol XII. Addison-Wesley, Boston, pp 95–112 Wahba G (1992) Multivariate function and operator estimation, based on smoothing splines and reproducing kernels. In: Casdagli M, Eubank S (eds) Nonlinear modeling and forecasting, SFI studies in the sciences of complexity, vol XII. Addison-Wesley, Boston, pp 95–112
30.
Zurück zum Zitat Zhang T (2004) Statistical analysis of some multi-category large margin classification methods. J Mach Learn Res 5:1225–1251MathSciNetMATH Zhang T (2004) Statistical analysis of some multi-category large margin classification methods. J Mach Learn Res 5:1225–1251MathSciNetMATH
Metadaten
Titel
Rademacher complexity of margin multi-category classifiers
verfasst von
Yann Guermeur
Publikationsdatum
22.11.2018
Verlag
Springer London
Erschienen in
Neural Computing and Applications / Ausgabe 24/2020
Print ISSN: 0941-0643
Elektronische ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3873-7

Weitere Artikel der Ausgabe 24/2020

Neural Computing and Applications 24/2020 Zur Ausgabe

Editorial

Editorial

Premium Partner