Published in: International Journal of Speech Technology | Issue 3/2022

02.06.2021

RETRACTED ARTICLE: Detecting adversarial attacks on audio-visual speech recognition using deep learning method

Author: Rabie A. Ramadan

Abstract

Deep learning techniques have made significant progress on machine learning tasks in many fields, yet deep learning models remain highly vulnerable to adversarial attacks. However, adversarial detection methods for audio-visual (AV) streaming data have received little attention. This research proposes an effective adversarial detection method that exploits the temporal correlation among distinct AV streams using a Deep Convolutional Neural Network (DCNN). The proposed method detects adversarial attacks against two audio-visual recognition models trained on the Lip Reading in the Wild (LRW) and GRID lip-reading datasets. Experimental results indicate that the proposed strategy identifies adversarial attacks more effectively than Supervised Kernel Machines, Combined Neural Networks, and Band Feature Selection methods. The precision, recall, accuracy, and F1-score of the proposed system are 88.10%, 89.30%, 95.60%, and 0.96, respectively, well ahead of the existing systems.
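
The abstract does not reproduce the paper's architecture, but the following PyTorch sketch illustrates the general shape of a DCNN detector of the kind it describes: an audio branch over MFCC features and a video branch over mouth-region frames are fused into a single benign-vs-adversarial classifier. The class name AVAdversarialDetector, all layer sizes, and the input shapes (13 MFCC coefficients, 29-frame 64×64 grayscale crops, roughly LRW-style clips) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class AVAdversarialDetector(nn.Module):
    """Hypothetical sketch of a DCNN that fuses audio and video
    features and scores a clip as benign vs. adversarial.
    Layer sizes are illustrative, not taken from the paper."""

    def __init__(self, n_mfcc=13):
        super().__init__()
        # Audio branch: 1-D convolution over the MFCC time axis.
        self.audio = nn.Sequential(
            nn.Conv1d(n_mfcc, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # Video branch: 3-D convolution over grayscale mouth crops,
        # capturing the temporal structure across frames.
        self.video = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        # Fusion head: concatenated embeddings -> single logit.
        self.head = nn.Sequential(
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, audio, video):
        # audio: (batch, n_mfcc, time); video: (batch, 1, frames, H, W)
        a = self.audio(audio).flatten(1)             # (batch, 32)
        v = self.video(video).flatten(1)             # (batch, 32)
        return self.head(torch.cat([a, v], dim=1))   # raw logit

# Smoke test with random tensors shaped like a 29-frame clip.
model = AVAdversarialDetector()
logit = model(torch.randn(2, 13, 100), torch.randn(2, 1, 29, 64, 64))
print(logit.shape)  # torch.Size([2, 1])
```

In a setup like this, the model would be trained with a binary cross-entropy loss (nn.BCEWithLogitsLoss) on pairs of benign and adversarially perturbed clips; the adaptive pooling layers let the same network accept clips of varying duration and resolution.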

Metadata
Title
RETRACTED ARTICLE: Detecting adversarial attacks on audio-visual speech recognition using deep learning method
Author
Rabie A. Ramadan
Publication date
02.06.2021
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 3/2022
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-021-09859-3
