
11-06-2021

Improve Semi-supervised Learning with Metric Learning Clusters and Auxiliary Fake Samples

Authors: Wei Zhou, Cheng Lian, Zhigang Zeng, Bingrong Xu, Yixin Su

Published in: Neural Processing Letters | Issue 5/2021


Abstract

Because it is very expensive to collect a large number of labeled samples to train deep neural networks in certain fields, research on semi-supervised learning (SSL) has become increasingly important in recent years. Many consistency regularization-based methods have been proposed for SSL tasks, such as the \(\Pi \) model and mean teacher. In this paper, we first show through an experiment that traditional consistency-based methods suffer from the following two problems: (1) as the number of unlabeled samples increases, the accuracy of these methods grows very slowly, which means they cannot make full use of unlabeled samples; (2) when the number of labeled samples is very small, the performance of these methods is very poor. Based on these two findings, we propose two methods, metric learning clustering (MLC) and auxiliary fake samples, to alleviate these problems. The proposed methods achieve state-of-the-art results on SSL benchmarks. Using MLC, the error rates are 10.20% on CIFAR-10 with 4000 labels, 38.44% on CIFAR-100 with 10,000 labels, and 4.24% on SVHN with 1000 labels. For MNIST, the auxiliary fake samples method achieves strong results in cases with very few labels.
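For readers unfamiliar with the consistency regularization baseline the abstract refers to, the following is a minimal, hypothetical sketch of a \(\Pi \)-model-style consistency loss in PyTorch. It is not the authors' MLC or auxiliary-fake-samples implementation; the toy network, the Gaussian input noise, the MSE consistency term, and the weighting factor are all illustrative assumptions.

```python
# Minimal sketch of consistency regularization (Pi-model style), not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, noise_std=0.1):
    """Penalize prediction differences between two random perturbations of the same input."""
    view_1 = x_unlabeled + noise_std * torch.randn_like(x_unlabeled)
    view_2 = x_unlabeled + noise_std * torch.randn_like(x_unlabeled)
    p_1 = F.softmax(model(view_1), dim=1)
    with torch.no_grad():               # treat the second view's prediction as a fixed target
        p_2 = F.softmax(model(view_2), dim=1)
    return F.mse_loss(p_1, p_2)

# Toy usage: a small classifier on flattened 28x28 inputs (MNIST-sized), random data for shape only.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
x_l, y_l = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))   # labeled batch
x_u = torch.randn(128, 1, 28, 28)                                    # unlabeled batch
loss = F.cross_entropy(model(x_l), y_l) + 1.0 * consistency_loss(model, x_u)
loss.backward()
```

The supervised cross-entropy term uses only the labeled batch, while the consistency term exploits the unlabeled batch; the paper's observation is that this second term alone benefits slowly from additional unlabeled data and breaks down when labels are extremely scarce.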


Metadata
Title
Improve Semi-supervised Learning with Metric Learning Clusters and Auxiliary Fake Samples
Authors
Wei Zhou
Cheng Lian
Zhigang Zeng
Bingrong Xu
Yixin Su
Publication date
11-06-2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 5/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-021-10556-0
