
2020 | Original Paper | Book Chapter

Impact of Noisy Labels in Learning Techniques: A Survey

Authors: Nitika Nigam, Tanima Dutta, Hari Prabhat Gupta

Published in: Advances in Data and Information Sciences

Publisher: Springer Singapore


Abstract

Noisy labels are a central problem in classification. Label noise can arise from insufficient information, encoding or communication errors, or data-entry mistakes by experts and non-experts alike, and it can substantially degrade a model's performance and accuracy. In real-world datasets, such as images collected from Flickr, the likelihood of noisy labels is high. Early approaches improved performance by identifying, correcting, or eliminating noisy instances. A variety of machine learning algorithms have since been applied to mitigate label noise, and recent studies address the problem with deep learning models. This survey provides a brief introduction to solutions for learning with noisy labels.
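The identification step mentioned above can be illustrated with a minimal sketch: flag an instance as possibly mislabeled when the majority of its nearest neighbors carry a different label. The dataset, function name, and majority threshold below are illustrative assumptions, not taken from the survey.

```python
# Minimal sketch of noise identification by neighborhood voting:
# a point is flagged as possibly mislabeled when the majority of its
# k nearest neighbors disagree with its own label.
from math import dist

def knn_noise_filter(points, labels, k=3):
    """Return indices of points whose k nearest neighbors mostly disagree."""
    flagged = []
    for i, p in enumerate(points):
        # k nearest neighbors of p (excluding p itself), by Euclidean distance
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )[:k]
        disagree = sum(1 for j in neighbors if labels[j] != labels[i])
        if disagree > k // 2:  # majority of neighbors carry a different label
            flagged.append(i)
    return flagged

# Two well-separated clusters; index 3 is deliberately mislabeled.
points = [(0, 0), (0, 1), (1, 1), (1, 0), (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 1, 1, 1, 1, 1]  # point 3 sits in cluster 0 but is labeled 1
print(knn_noise_filter(points, labels))  # -> [3]
```

Once flagged, such instances can either be relabeled (correction) or dropped from the training set (elimination), the two remedies discussed in the survey.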


Metadata
Title
Impact of Noisy Labels in Learning Techniques: A Survey
Authors
Nitika Nigam
Tanima Dutta
Hari Prabhat Gupta
Copyright Year
2020
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-15-0694-9_38
