2020 | Original Paper | Book Chapter

Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network

Authors: Sharif Noor Zisad, Mohammad Shahadat Hossain, Karl Andersson

Published in: Brain Informatics

Publisher: Springer International Publishing

Abstract

Detecting emotions from speech is an emerging research field in the area of human information processing. Expressing emotion is very difficult for a person with a neurological disorder. A Speech Emotion Recognition (SER) system may therefore help by enabling barrier-free communication. Various research has been carried out in the area of SER. The main objective of this research is to develop a system that can recognize emotion from the speech of a person with a neurological disorder. Since the convolutional neural network (CNN) is an effective method, it was chosen to build the system. The system uses tonal properties such as mel-frequency cepstral coefficients (MFCCs). The RAVDESS audio speech and song databases are used for training and testing. In addition, a custom local dataset was developed to support further training and testing. The performance of the proposed system is compared with traditional machine learning models as well as with pre-trained CNN models, including VGG16 and VGG19. The results demonstrate that the CNN model proposed in this research performed better than the mentioned machine learning techniques. The system can classify eight emotions of a person with a neurological disorder: calm, angry, fearful, disgust, happy, surprise, neutral, and sad.
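
The abstract does not say how the eight emotion labels were read from the RAVDESS files. As a minimal sketch, assuming the public RAVDESS naming convention (seven hyphen-separated two-digit fields, the second giving the vocal channel and the third the emotion), labels could be derived from file names as follows. The function name `parse_ravdess_filename` and the returned dictionary layout are hypothetical and not taken from the chapter.

```python
import os

# Emotion codes from the third field of a RAVDESS file name.
# Label spellings follow the abstract ("surprise"); RAVDESS itself says "surprised".
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprise",
}

# Vocal channel codes from the second field.
VOCAL_CHANNELS = {"01": "speech", "02": "song"}


def parse_ravdess_filename(path: str) -> dict:
    """Extract vocal channel, emotion label and actor ID from a RAVDESS file name.

    Hypothetical helper: assumes names such as '03-01-06-01-02-01-12.wav'.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    fields = stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected RAVDESS file name: {path!r}")
    # Field order: modality, vocal channel, emotion, intensity,
    # statement, repetition, actor.
    _modality, channel, emotion, _intensity, _statement, _repetition, actor = fields
    return {
        "channel": VOCAL_CHANNELS[channel],
        "emotion": RAVDESS_EMOTIONS[emotion],
        "actor": int(actor),
    }


if __name__ == "__main__":
    print(parse_ravdess_filename("03-01-06-01-02-01-12.wav"))
```

The resulting labels would then be paired with MFCC features extracted from each audio file before training a classifier. The feature extraction and model are outside this sketch.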

Metadata
Title
Speech Emotion Recognition in Neurological Disorders Using Convolutional Neural Network
Authors
Sharif Noor Zisad
Mohammad Shahadat Hossain
Karl Andersson
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-59277-6_26