

Midpoint Regularization: From High Uncertainty Training Labels to Conservative Classification Decisions

Abstract

Label Smoothing (LS) improves model generalization by penalizing models for generating overconfident output distributions. For each training sample, the LS strategy smooths the one-hot encoded training signal by distributing part of its probability mass over the non-ground-truth classes. We extend this technique by considering example pairs, a method coined PLS. PLS first creates midpoint samples by averaging random sample pairs, and then learns a smoothing distribution for each of these midpoint samples during training, yielding midpoints with high-uncertainty labels. We empirically show that PLS significantly outperforms LS, achieving up to 30% relative reduction in classification error. We also visualize that PLS produces very low winning softmax scores for both in-distribution and out-of-distribution samples.
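To make the midpoint construction concrete, below is a minimal PyTorch sketch written under stated assumptions, not the paper's implementation. The equal-weight averaging of inputs follows the abstract directly; the target shown is a simplified stand-in that splits label mass between the two parent classes and spreads the rest uniformly, whereas the paper learns the smoothing distribution for each midpoint during training. The helper name make_midpoints and the parameter eps are hypothetical.

```python
import torch
import torch.nn.functional as F

def make_midpoints(x1, y1, x2, y2, num_classes, eps=0.9):
    """Hypothetical sketch of midpoint-sample construction.

    The equal-weight input average follows the abstract. The target is a
    simplified placeholder: the paper *learns* the smoothing distribution
    for each midpoint during training, which this sketch does not do.
    """
    # Midpoint sample: equal-weight average of the paired inputs.
    x_mid = 0.5 * (x1 + x2)

    # One-hot encode the two parent labels.
    y1_hot = F.one_hot(y1, num_classes).float()
    y2_hot = F.one_hot(y2, num_classes).float()

    # High-uncertainty target (assumption): keep (1 - eps) of the mass
    # split between the two parent classes, and spread the remaining eps
    # uniformly over all classes, so the label distribution sums to 1.
    y_mid = (1.0 - eps) * 0.5 * (y1_hot + y2_hot) + eps / num_classes
    return x_mid, y_mid
```

A large eps here reflects the "high uncertainty" character of the midpoint labels; the true per-midpoint distribution in the paper is learned rather than fixed.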


Footnotes
1
For efficiency purposes, we implement this by randomly selecting a sample from the same mini-batch during training.
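As a hedged illustration of this in-batch pairing, the sketch below pairs every sample with a randomly permuted partner from the same mini-batch; the function name and surrounding usage are assumptions, not the paper's code.

```python
import torch

def pair_within_batch(x, y):
    """Pair each sample with a random partner drawn from the same batch.

    Mirrors the footnote's efficiency trick: rather than sampling pairs
    from the whole dataset, permute the current mini-batch. Names are
    illustrative only.
    """
    perm = torch.randperm(x.size(0), device=x.device)
    return x[perm], y[perm]
```

Combined with the earlier midpoint sketch, a training step would call, e.g., `x2, y2 = pair_within_batch(x, y)` followed by `x_mid, y_mid = make_midpoints(x, y, x2, y2, num_classes)`.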
 
Metadata
Title
Midpoint Regularization: From High Uncertainty Training Labels to Conservative Classification Decisions
Author
Hongyu Guo
Copyright Year
2021
DOI
https://doi.org/10.1007/978-3-030-86520-7_12
