10.04.2024

Survey on Arabic speech emotion recognition

Authors: Latifa Iben Nasr, Abir Masmoudi, Lamia Hadrich Belguith

Published in: International Journal of Speech Technology

Abstract

Emotions are a fundamental aspect of evaluating user satisfaction and collecting customer feedback, both in human interactions and in human–computer interaction (HCI) technologies. Moreover, as human beings, we possess a distinctive capacity for communication through spoken language. Speech Emotion Recognition (SER) has recently garnered substantial interest and gained significant traction within Natural Language Processing (NLP). Its primary objective is to identify emotions, such as sadness, neutrality, and anger, from speech audio using a diverse array of classifiers. This paper conducts a comprehensive critical analysis of existing Arabic SER studies, examines the performance and limitations of these previous works, and highlights current promising trends for improving speech emotion recognition methods. To the best of our knowledge, this research is a pioneering contribution to the SER field as the first review of existing Arabic studies.
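The pipeline the abstract describes — extract acoustic features from speech audio, then feed them to a classifier — can be sketched in a few lines. Everything below is an illustrative toy, not the paper's method: the "utterances" are synthetic noisy tones, the features are simple hand-rolled prosodic measures (log energy, zero-crossing rate, spectral centroid), and the two emotion labels are invented. Real SER systems typically use richer features such as MFCCs, but the structure — features in, labels out via a classifier such as an SVM — is the same.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
SR = 16000  # sample rate in Hz

def synth_utterance(pitch_hz, amp):
    """Toy stand-in for a recorded utterance: a noisy tone whose
    pitch and amplitude loosely mimic arousal differences."""
    t = np.arange(SR) / SR
    return amp * np.sin(2 * np.pi * pitch_hz * t) + 0.05 * rng.standard_normal(SR)

def features(x):
    """Simple prosodic-style features (real systems use MFCCs etc.)."""
    energy = np.log(np.mean(x ** 2) + 1e-12)            # log frame energy
    zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2      # zero-crossing rate
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / SR)
    centroid = np.sum(freqs * spec) / np.sum(spec)      # spectral centroid
    return [energy, zcr, centroid]

# Simulate two emotion classes: "angry" (higher pitch/energy) vs "neutral".
X, y = [], []
for _ in range(40):
    X.append(features(synth_utterance(rng.uniform(220, 300), 1.0))); y.append("angry")
    X.append(features(synth_utterance(rng.uniform(100, 160), 0.3))); y.append("neutral")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

Because the synthetic classes differ strongly in energy, the classifier separates them easily; the point is the pipeline shape, not the score.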


Metadata
Title
Survey on Arabic speech emotion recognition
Authors
Latifa Iben Nasr
Abir Masmoudi
Lamia Hadrich Belguith
Publication date
10.04.2024
Publisher
Springer US
Published in
International Journal of Speech Technology
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-024-10088-7