2019 | OriginalPaper | Book chapter

Arabic Speech Recognition with Deep Learning: A Review

Written by: Wajdan Algihab, Noura Alawwad, Anfal Aldawish, Sarah AlHumoud

Published in: Social Computing and Social Media. Design, Human Behavior and Analytics

Publisher: Springer International Publishing

Abstract

Automatic speech recognition is the area of research concerned with enabling machines to accept vocal input from humans and interpret it with the highest probability of correctness. Several techniques exist for implementing speech recognition models; one of the emerging ones uses neural networks with deep learning. Arabic is among the most widely spoken languages, yet it has received comparatively little attention in speech recognition research. This paper is a brief review of the available studies on Arabic speech recognition. It also sheds some light on the services and toolkits available for developing Arabic speech recognition systems.

Title: Arabic Speech Recognition with Deep Learning: A Review
Written by: Wajdan Algihab, Noura Alawwad, Anfal Aldawish, Sarah AlHumoud
Publisher: Springer International Publishing
Book: Social Computing and Social Media. Design, Human Behavior and Analytics
Print ISBN: 978-3-030-21901-7
Electronic ISBN: 978-3-030-21902-4
Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-21902-4_2
