Published in: International Journal of Machine Learning and Cybernetics 12/2019

17.05.2019 | Original Article

A multiclass boosting algorithm to labeled and unlabeled data

Author: Jafar Tanha

Abstract

This article addresses semi-supervised learning, in which a classifier is learned from both labeled and unlabeled data, and focuses on the multiclass semi-supervised classification problem. To solve this problem, we propose a new multiclass loss function based on new codewords. The loss function combines the classifier predictions, which are based on the labeled data, with the pairwise similarity between labeled and unlabeled examples; its main goal is to minimize the inconsistency between the two. It consists of two terms: the first is the multiclass margin cost on the labeled data, and the second is a regularization term that minimizes the pseudo-margin cost on the unlabeled data. From the proposed risk function we then derive a new multiclass boosting algorithm, called GMSB, which also uses a set of optimal similarity functions for a given dataset. Experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
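
The two-term structure described in the abstract can be illustrated with a minimal sketch. The abstract does not state GMSB's actual codewords, cost function, or weighting, so everything concrete below is an assumption made for illustration: the zero-sum codewords, the exponential margin cost, the trade-off weight `lam`, and all function names are hypothetical and are not taken from the paper.

```python
import math

def simplex_codewords(num_classes):
    # Assumed codewords: one-hot vector minus its mean, so each sums to zero.
    k = num_classes
    return [[(1.0 if i == j else 0.0) - 1.0 / k for j in range(k)]
            for i in range(k)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def margin_cost(score, label, codewords):
    # Assumed exponential cost of the margins between `label` and every other class.
    target = dot(score, codewords[label])
    return sum(math.exp(-0.5 * (target - dot(score, code)))
               for k, code in enumerate(codewords) if k != label)

def combined_risk(scores_l, labels, scores_u, similarity, codewords, lam=1.0):
    # First term: multiclass margin cost on the labeled examples.
    labeled = sum(margin_cost(f, y, codewords)
                  for f, y in zip(scores_l, labels))
    # Second term: pseudo-margin cost on the unlabeled examples, weighted by
    # similarity[i][j] between labeled example i and unlabeled example j.
    unlabeled = sum(similarity[i][j] * margin_cost(f_u, y, codewords)
                    for i, y in enumerate(labels)
                    for j, f_u in enumerate(scores_u))
    return labeled + lam * unlabeled

# Toy usage: one unlabeled point that is similar only to the class-0 example.
codes = simplex_codewords(3)
scores_l, labels = [codes[0], codes[1]], [0, 1]
similarity = [[1.0], [0.0]]
consistent = combined_risk(scores_l, labels, [codes[0]], similarity, codes)
inconsistent = combined_risk(scores_l, labels, [codes[1]], similarity, codes)
print(consistent, inconsistent)
```

In this toy setup the risk is lower when the prediction for the unlabeled point agrees with the label of the example it is similar to, which is the kind of consistency between predictions and pairwise similarity that the abstract describes. A boosting procedure would add weak learners that reduce such a risk step by step; that part is not sketched here.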

Metadata
Title
A multiclass boosting algorithm to labeled and unlabeled data
Author
Jafar Tanha
Publication date
17.05.2019
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 12/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-019-00951-4
