International Journal of Machine Learning and Cybernetics

17.05.2019 | Original Article

A multiclass boosting algorithm to labeled and unlabeled data

Author: Jafar Tanha

Published in: International Journal of Machine Learning and Cybernetics | Issue 12/2019

Abstract

This article addresses semi-supervised learning, that is, learning from both labeled and unlabeled data, and in particular the multiclass semi-supervised classification problem. To solve this problem, we propose a new multiclass loss function based on new codewords. The loss function combines the classifier predictions, which are based on the labeled data, with the pairwise similarity between labeled and unlabeled examples; its main goal is to minimize the inconsistency between the two. It consists of two terms. The first term is the multiclass margin cost on the labeled data. The second term is a regularization term that minimizes the cost of the pseudo-margin on the unlabeled data. From the proposed risk function we then derive a new multiclass boosting algorithm, called GMSB. The derived algorithm also uses a set of optimal similarity functions for a given dataset. Experiments on a number of UCI and real-world biological, text, and image datasets show that GMSB outperforms state-of-the-art boosting methods for multiclass semi-supervised learning.
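The two-term structure described in the abstract can be illustrated with a small sketch. This is not the GMSB loss itself: the abstract does not specify the codewords, the margin cost, or the similarity functions, so the sketch assumes SAMME-style codewords, an exponential cost, and an RBF similarity. The names `codewords`, `rbf_similarity`, `semi_supervised_loss`, and the weight `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def codewords(num_classes):
    """Assumed SAMME-style codewords: 1 for the true class, -1/(K-1) elsewhere.

    Requires num_classes >= 2. The paper's own codewords are not given in the abstract.
    """
    K = num_classes
    Y = np.full((K, K), -1.0 / (K - 1))
    np.fill_diagonal(Y, 1.0)
    return Y


def rbf_similarity(A, B, sigma=1.0):
    """Pairwise RBF similarity between rows of A and rows of B (an assumed choice)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def semi_supervised_loss(F_lab, y_lab, F_unl, S, lam=1.0):
    """Labeled margin cost plus a similarity-weighted pseudo-margin penalty.

    F_lab: (n_l, K) classifier outputs on labeled examples
    y_lab: (n_l,)   integer class labels
    F_unl: (n_u, K) classifier outputs on unlabeled examples
    S:     (n_l, n_u) similarity between labeled and unlabeled examples
    lam:   hypothetical weight of the unlabeled term
    """
    K = F_lab.shape[1]
    Y = codewords(K)[y_lab]                       # codeword of each labeled example
    # Term 1: exponential margin cost on the labeled data.
    labeled_margin = (Y * F_lab).sum(axis=1)      # <y_i, f(x_i)>
    labeled_term = np.exp(-labeled_margin / K).sum()
    # Term 2: pseudo-margin of each unlabeled example with respect to the label
    # of each labeled example, weighted by how similar the two examples are.
    pseudo_margin = Y @ F_unl.T                   # <y_i, f(x_j)>, shape (n_l, n_u)
    unlabeled_term = (S * np.exp(-pseudo_margin / K)).sum()
    return labeled_term + lam * unlabeled_term


# Usage: two labeled points, each with one nearby unlabeled point.
X_lab = np.array([[0.0, 0.0], [5.0, 5.0]])
y_lab = np.array([0, 1])
X_unl = np.array([[0.1, 0.0], [5.0, 5.1]])
C = codewords(3)
S = rbf_similarity(X_lab, X_unl)
loss_consistent = semi_supervised_loss(C[y_lab], y_lab, C[[0, 1]], S)
loss_inconsistent = semi_supervised_loss(C[y_lab], y_lab, C[[1, 0]], S)
print(loss_consistent < loss_inconsistent)
```

Under these assumptions, the second term grows when an unlabeled example that is highly similar to a labeled one receives a prediction that disagrees with that label, which is the kind of inconsistency the abstract says the loss is meant to penalize.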

Title: A multiclass boosting algorithm to labeled and unlabeled data
Author: Jafar Tanha
Publication date: 17.05.2019
Publisher: Springer Berlin Heidelberg
Published in: International Journal of Machine Learning and Cybernetics / Issue 12/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI: https://doi.org/10.1007/s13042-019-00951-4
