Published: 22.12.2023

An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm

Authors: Radhika Subramanian, Prasanth Aruchamy

Published in: Circuits, Systems, and Signal Processing | Issue 4/2024

Abstract

At present, several communication channels are used to express human emotions; among them, speech remains the most prevalent means of communicating with people effectively and efficiently. Speech emotion recognition (SER) plays a significant role in several signal processing applications. However, selecting appropriate features and reliable classifiers remains a challenge when identifying the emotions expressed in Indian regional languages. In this work, a novel SER framework is proposed to classify different speech emotions. First, the framework applies a preprocessing phase to alleviate the background noise and artifacts present in the input speech signal. Two new speech attributes related to energy and phase are then integrated with state-of-the-art attributes to examine speech emotion characteristics. A threshold-based feature selection (TFS) algorithm is introduced to determine the optimal features using a statistical approach. A Tamil Emotional dataset, representing an Indian regional language, has been created to evaluate the proposed framework with standard machine learning and deep learning classifiers. The proposed TFS technique proves well suited to Indian regional languages, achieving a superior accuracy of 97.96% on the Tamil dataset compared with the Indian English and Malayalam datasets.
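The abstract does not specify the statistical criterion or threshold used by the TFS algorithm. As an illustration only, the Python sketch below assumes a per-feature ANOVA F-score and retains the features whose score exceeds a chosen threshold; the function name, the threshold value, and the synthetic data are hypothetical and stand in for the extracted energy, phase, and conventional speech attributes.

    # Minimal sketch of a threshold-based feature selection step (illustrative only).
    # Each candidate feature is scored with a statistical test across emotion classes,
    # and only features whose score exceeds the threshold are kept.
    import numpy as np
    from sklearn.feature_selection import f_classif

    def threshold_feature_selection(X, y, threshold):
        """Return indices of features whose ANOVA F-score exceeds `threshold`.

        X : (n_samples, n_features) matrix of speech features (e.g., energy, phase, MFCCs)
        y : (n_samples,) array of emotion labels
        """
        f_scores, _ = f_classif(X, y)       # per-feature F-statistic over the emotion classes
        return np.where(f_scores > threshold)[0]

    # Usage with synthetic data standing in for real extracted speech features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 40))          # 200 utterances, 40 candidate features
    y = rng.integers(0, 4, size=200)        # 4 emotion classes
    keep = threshold_feature_selection(X, y, threshold=1.5)
    X_reduced = X[:, keep]                  # reduced feature matrix passed to the classifier

The choice of the F-score here is an assumption made for the sketch; any per-feature statistic (e.g., mutual information) could be thresholded in the same way.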

Metadata
Title
An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm
Authors
Radhika Subramanian
Prasanth Aruchamy
Publication date
22.12.2023
Publisher
Springer US
Published in
Circuits, Systems, and Signal Processing / Issue 4/2024
Print ISSN: 0278-081X
Electronic ISSN: 1531-5878
DOI
https://doi.org/10.1007/s00034-023-02571-4