
2025 | OriginalPaper | Chapter

On Adversarial Training with Incorrect Labels

Authors: Benjamin Zi Hao Zhao, Junda Lu, Xiaowei Zhou, Dinusha Vatsalan, Muhammad Ikram, Mohamed Ali Kaafar

Published in: Web Information Systems Engineering – WISE 2024

Publisher: Springer Nature Singapore


Abstract

In this work, we study adversarial training in the presence of incorrectly labeled data. Specifically, we examine the predictive performance of an adversarially trained Machine Learning (ML) model when it is trained on clean data versus when the labels of the training data and adversarial examples are erroneous. Such erroneous labels may arise organically from a flawed labeling process, or maliciously, akin to a poisoning attack.
We extensively investigate the effect of incorrect labels on model accuracy and robustness, varying 1) when incorrect labels are applied during the adversarial training process, 2) the extent of data impacted by incorrect labels (a poisoning rate), 3) the consistency of the incorrect labels, applied either randomly or with a constant mapping, 4) the model architecture used for classification, and 5) the training settings, through an ablation study of pretraining, adversarial initialization, and adversarial training strength. We further observe that these behaviors generalize across multiple datasets.
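
To make variations 2) and 3) concrete, the minimal sketch below corrupts a label vector at a chosen poisoning rate, either with random replacement classes or with a constant class-to-class mapping. The function name, arguments, and exact behavior are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def corrupt_labels(labels, rate, num_classes, mapping=None, seed=0):
    """Flip a `rate` fraction of `labels` to incorrect classes.

    mapping=None  -> random incorrect labels      (variation 3, "random")
    mapping=dict  -> constant class-to-class map  (variation 3, "constant")
    Hypothetical sketch; names and behavior are assumptions.
    """
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    n_poison = int(rate * len(noisy))             # variation 2: poisoning rate
    poisoned = rng.choice(len(noisy), size=n_poison, replace=False)
    for i in poisoned:
        if mapping is not None:
            noisy[i] = mapping[int(noisy[i])]     # constant mapping
        else:
            wrong = rng.integers(num_classes - 1) # uniform over wrong classes
            noisy[i] = wrong if wrong < noisy[i] else wrong + 1
    return noisy

# Example: poison 20% of labels with the constant shift y -> (y + 1) mod 10.
y = np.random.randint(0, 10, size=1000)
y_noisy = corrupt_labels(y, rate=0.2, num_classes=10,
                         mapping={c: (c + 1) % 10 for c in range(10)})
```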
A label may be changed to an incorrect one before the model is trained, within the training dataset itself, or during adversarial sample curation, where annotators mislabel the sourced adversarial examples. Interestingly, our results indicate that this flawed adversarial training process may counter-intuitively function as data augmentation, yielding improved adversarial robustness of the model.
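
For reference, here is a minimal sketch of the kind of PGD-based adversarial training loop studied here (in the style of Madry et al.), assuming PyTorch: the adversarial example simply inherits whatever label the training set carries, correct or not. The attack parameters and loop structure are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Untargeted L-inf PGD; note that `y` may itself be an incorrect label."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def adv_train_epoch(model, loader, optimizer, device="cuda"):
    """One epoch of adversarial training on (possibly noisy) labels. Sketch only."""
    model.train()
    for x, y in loader:                    # y may contain incorrect labels
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)    # adversarial example keeps its label y
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```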


Metadata
Title
On Adversarial Training with Incorrect Labels
Authors
Benjamin Zi Hao Zhao
Junda Lu
Xiaowei Zhou
Dinusha Vatsalan
Muhammad Ikram
Mohamed Ali Kaafar
Copyright Year
2025
Publisher
Springer Nature Singapore
DOI
https://doi.org/10.1007/978-981-96-0573-6_9
