Multimodal Emotion Recognition based on Face and Speech using Deep Convolution Neural Network and Long Short Term Memory

  • 26-04-2025

Abstract

The article presents an approach to multimodal emotion recognition (MER) that combines facial-expression and speech modalities. It targets the limited reliability and robustness of unimodal emotion recognition systems. The proposed method uses a deep convolutional neural network (DCNN) together with a long short-term memory (LSTM) network to improve feature representation and capture temporal dependencies, yielding a more accurate and reliable emotion recognition system. The article also introduces a novel feature selection algorithm that combines particle swarm optimization (PSO), multi-attribute utility theory (MAUT), and the Archimedes optimization algorithm (AoA) to select the most salient features from the facial and speech data, which improves accuracy and reduces computational complexity. A detailed analysis of the experimental results reports that the proposed method outperforms traditional techniques. The article also discusses potential applications of the MER system in online learning platforms, customer care centers, and robotics. It concludes with future directions for improving the system, including the use of high-resolution facial images and the development of visualization models to make the system more explainable.
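
To make the feature selection idea more concrete, the following is a minimal sketch of one ingredient only: a plain binary PSO with a sigmoid transfer function that searches for a 0/1 feature mask. It is not the authors' combined PSO/MAUT/AoA algorithm, whose details are not given in this abstract. The function name `binary_pso_select`, the `RELEVANCE` scores, the per-feature penalty, and all hyperparameters are assumptions chosen for illustration. The toy fitness, a weighted sum that trades feature relevance against subset size, merely stands in for a real criterion such as validation accuracy.

```python
import math
import random

# Hypothetical per-feature relevance scores (illustrative values only).
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02, 0.6, 0.01]
PENALTY = 0.3  # assumed cost charged for each selected feature


def toy_fitness(mask):
    """Weighted utility: total relevance of selected features minus a size penalty."""
    gain = sum(r for r, bit in zip(RELEVANCE, mask) if bit)
    return gain - PENALTY * sum(mask)


def binary_pso_select(fitness, n_features, n_particles=20, n_iters=50,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over 0/1 masks with a basic binary PSO.

    Returns (best_mask, best_score).
    """
    rng = random.Random(seed)
    # Positions are binary masks; velocities are real-valued.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid responsive
                vel[i][d] = v
                # Sigmoid transfer: velocity sets the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            score = fitness(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


if __name__ == "__main__":
    mask, score = binary_pso_select(toy_fitness, len(RELEVANCE))
    print("selected feature indices:", [i for i, b in enumerate(mask) if b])
    print("fitness:", score)
```

In a real pipeline the fitness call would train or evaluate a classifier on the masked facial and speech features, so each evaluation is far more expensive than in this toy setting and the swarm size and iteration count would need tuning accordingly.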

Title
Multimodal Emotion Recognition based on Face and Speech using Deep Convolution Neural Network and Long Short Term Memory
Authors
Shwetkranti Taware
Anuradha D. Thakare
Publication date
26-04-2025
Publisher
Springer US
Published in
Circuits, Systems, and Signal Processing / Issue 9/2025
Print ISSN: 0278-081X
Electronic ISSN: 1531-5878
DOI
https://doi.org/10.1007/s00034-025-03080-2