2024 | OriginalPaper | Chapter

Multi-featured Speech Emotion Recognition Using Extended Convolutional Neural Network

Authors: Arun Kumar Dubey, Yogita Arora, Neha Gupta, Sarita Yadav, Achin Jain, Devansh Verma

Published in: Advanced Computing

Publisher: Springer Nature Switzerland

Abstract

Research on emotions expressed through speech signals, a field known as Speech Emotion Recognition (SER), has grown significantly in recent years. SER has potential across many applications and serves as a bridge for improving Human-Computer Interaction. However, persistent challenges, such as reduced model accuracy in noisy environments, remain substantial obstacles. To address the scarcity of robust data for SER, we adopted three data augmentation techniques: noise injection, stretching, and pitch modification. Unlike recent literature, we combined multiple audio features: Mel-Frequency Cepstral Coefficients (MFCCs), mel spectrograms, zero crossing rate, root mean square, and chroma. This paper employs Convolutional Neural Networks (CNNs) as the foundation for emotion classification. We use two well-established datasets: the Toronto Emotional Speech Set (TESS) and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). Our proposed model achieves an accuracy of 72% on the RAVDESS dataset and 96.62% on the TESS dataset. These results surpass those of existing models that have been customized for each specific dataset.
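As a rough illustration of two ingredients named in the abstract, the sketch below injects noise into a signal and computes frame-wise zero crossing rate and root mean square. It is not the authors' pipeline: the synthetic sine input, the 20 dB signal-to-noise ratio, the frame and hop lengths, and all function names are assumptions chosen for the example. Stretching, pitch modification, MFCCs, mel spectrograms, and chroma are omitted because they are usually computed with a dedicated audio library.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Inject white Gaussian noise at a target signal-to-noise ratio (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def frame_signal(signal, frame_length, hop_length):
    """Split a 1-D signal into overlapping frames (no padding at the edges)."""
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    starts = hop_length * np.arange(n_frames)[:, None]
    offsets = np.arange(frame_length)[None, :]
    return signal[starts + offsets]


def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose sign differs."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)


def root_mean_square(frames):
    """Root-mean-square energy of each frame."""
    return np.sqrt(np.mean(frames ** 2, axis=1))


def extract_features(signal, frame_length=2048, hop_length=512):
    """Average the frame-wise features into one small vector: [zcr, rms]."""
    frames = frame_signal(signal, frame_length, hop_length)
    return np.array([zero_crossing_rate(frames).mean(),
                     root_mean_square(frames).mean()])


# Hypothetical input: one second of a 220 Hz sine wave sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 220 * t)

rng = np.random.default_rng(0)
noisy = add_noise(clean, snr_db=20, rng=rng)  # augmented copy of the signal

clean_features = extract_features(clean)
noisy_features = extract_features(noisy)
```

In a setup like the one the abstract describes, each augmented copy would be passed through the same feature extraction as the original recording, so the classifier sees both clean and perturbed versions of every utterance.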

Metadata

Title: Multi-featured Speech Emotion Recognition Using Extended Convolutional Neural Network
Authors: Arun Kumar Dubey, Yogita Arora, Neha Gupta, Sarita Yadav, Achin Jain, Devansh Verma
Copyright Year: 2024
DOI: https://doi.org/10.1007/978-3-031-56700-1_26