Published in: International Journal of Machine Learning and Cybernetics 6/2019

05-06-2018 | Original Article

Statistical learning with group invariance: problem, method and consistency

Authors: Weixia Xu, Dingjiang Huang, Shuigeng Zhou

Abstract

Statistical learning theory (SLT) provides the theoretical basis for many machine learning algorithms (e.g., SVMs and kernel methods). Invariance, a popular type of prior knowledge in pattern analysis, has been widely incorporated into statistical learning algorithms to improve learning performance. Though successful in some applications, existing invariance learning algorithms are task-specific and lack a solid theoretical basis, including a consistency analysis. In this paper, we first formulate the problem of statistical learning with group invariance (group invariance learning for short), which exploits group invariance to provide a unifying framework for existing invariance learning algorithms in pattern analysis. We then introduce the group invariance empirical risk minimization (GIERM) method, which solves the group invariance learning problem by incorporating the group action on the original data into empirical risk minimization (ERM). Finally, we investigate the consistency of the GIERM method in detail. Our theoretical results comprise three theorems, covering the necessary and sufficient conditions for consistency, uniform two-sided convergence, and uniform one-sided convergence of the group invariance learning process based on the GIERM method.
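
The abstract does not spell out the GIERM objective, so the following is only a minimal sketch of one plausible reading: average the empirical loss over the sample and over its copies transformed by every element of a finite group. Everything concrete here is an assumption made for illustration and is not taken from the paper: the two-element group {+1, -1} acting on real inputs by multiplication, the interval-complement hypothesis class, the 0-1 loss, the toy data, and all function names.

```python
import itertools
import numpy as np

# Hypothetical finite group G = {+1, -1} acting on real inputs by multiplication.
GROUP = (1.0, -1.0)

def predict(a, b, x):
    """Hypothesis h_{a,b}: label 1 outside the interval [a, b], else 0."""
    return ((x < a) | (x > b)).astype(int)

def empirical_risk(a, b, x, y):
    """Plain ERM objective: mean 0-1 loss on the given sample."""
    return float(np.mean(predict(a, b, x) != y))

def group_empirical_risk(a, b, x, y, group=GROUP):
    """Illustrative group-averaged objective (an assumption, not the paper's
    definition): mean 0-1 loss over the sample and all its transformed copies."""
    return float(np.mean([empirical_risk(a, b, g * x, y) for g in group]))

def minimize(risk, x, y, grid):
    """Exhaustive search over (a, b) with a <= b; ties go to the first pair found."""
    candidates = [(a, b) for a, b in itertools.product(grid, grid) if a <= b]
    return min(candidates, key=lambda ab: risk(ab[0], ab[1], x, y))

# Toy sample drawn only from x >= 0; the label 1[|x| > 1] is invariant under G.
x_train = np.array([0.2, 0.5, 0.9, 1.3, 1.8, 2.5])
y_train = (np.abs(x_train) > 1).astype(int)
grid = np.linspace(-3.0, 3.0, 13)  # candidate endpoints, step 0.5

erm_fit = minimize(empirical_risk, x_train, y_train, grid)
gierm_fit = minimize(group_empirical_risk, x_train, y_train, grid)

# Held-out points on both sides of the origin.
x_test = np.array([-2.0, -1.5, -0.5, 0.5, 1.5, 2.0])
y_test = (np.abs(x_test) > 1).astype(int)

print("ERM fit:", erm_fit, "test error:", empirical_risk(*erm_fit, x_test, y_test))
print("Group-averaged fit:", gierm_fit, "test error:", empirical_risk(*gierm_fit, x_test, y_test))
```

In this toy setting the training sample says nothing about negative inputs, so plain ERM leaves the left endpoint unconstrained, while the group-averaged objective pins it down through the transformed copies. The sketch says nothing about the consistency conditions that the paper's theorems address.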

Metadata
Title
Statistical learning with group invariance: problem, method and consistency
Authors
Weixia Xu
Dingjiang Huang
Shuigeng Zhou
Publication date
05-06-2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-018-0829-2
