Published in: Soft Computing 16/2023

4 June 2023 | Data analytics and machine learning

Facial expression recognition through multi-level features extraction and fusion

Authors: Yuanlun Xie, Wenhong Tian, Hengxin Zhang, Tingsong Ma

Abstract

Recent studies show that deep learning has great potential for facial expression recognition (FER), and the task is attracting growing research attention. Many existing methods achieve good results on facial expression images captured in laboratory environments. FER in the wild remains challenging, however, because in-the-wild facial expression images are more complex and diverse than laboratory ones. In this paper, we propose a new FER method based on multi-level feature extraction and fusion. Unlike existing feature extraction networks, which use a single convolution kernel scale, our feature extraction module uses several convolutional kernel scales and outputs multi-level features as the result of the whole feature extraction network. Rather than using these multi-level features directly, we propose a feature fusion module with global and local attention that adaptively fuses the different-level features in pairs, in a top-down manner, to construct a new facial expression feature. To mitigate the overfitting caused by data imbalance, we employ label smoothing and L2 regularization to further guide training. Extensive experiments show that our method achieves accuracies of 88.08% on RAF-DB, 88.11% on FERPlus, and 59.38% on AffectNet, which are very competitive results. Moreover, our multi-level feature fusion approach improves the performance of traditional convolutional backbone networks by 0.64–1.13% on FER tasks.
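
The abstract names label smoothing and L2 regularization as its remedies for overfitting but does not give their exact form. The following is a minimal sketch of how the two terms are commonly combined into one training loss, assuming uniform label smoothing and a plain squared-weight penalty. The function names, the smoothing factor `epsilon`, and the coefficient `weight_decay` are illustrative assumptions, not values or code from the paper.

```python
import math


def smooth_labels(true_class, num_classes, epsilon=0.1):
    """Uniform label smoothing (assumed variant): spread epsilon evenly over
    all classes and put the remaining 1 - epsilon on the true class."""
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[true_class] += 1.0 - epsilon
    return target


def log_softmax(logits):
    """Numerically stable log-softmax over a list of raw class scores."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def smoothed_cross_entropy(logits, true_class, epsilon=0.1):
    """Cross-entropy between the smoothed target and the softmax prediction."""
    target = smooth_labels(true_class, len(logits), epsilon)
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, log_probs))


def l2_penalty(weights, weight_decay=1e-4):
    """L2 regularization term: half the coefficient times the squared norm."""
    return 0.5 * weight_decay * sum(w * w for w in weights)


def total_loss(logits, true_class, weights, epsilon=0.1, weight_decay=1e-4):
    """Hypothetical combined objective: smoothed cross-entropy plus L2 term."""
    return smoothed_cross_entropy(logits, true_class, epsilon) + l2_penalty(
        weights, weight_decay
    )


if __name__ == "__main__":
    # Toy example: seven expression classes, the true class is index 2.
    example_logits = [0.2, -1.0, 3.1, 0.0, 0.5, -0.3, 1.2]
    example_weights = [0.5, -0.25, 1.0]
    print(smooth_labels(2, 7))
    print(total_loss(example_logits, 2, example_weights))
```

With `epsilon=0` the smoothed loss reduces to ordinary cross-entropy, so the smoothing factor controls how strongly over-confident predictions are penalized. In practice, deep learning frameworks typically expose both terms as built-in options of the loss function and the optimizer.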

Metadata
Title
Facial expression recognition through multi-level features extraction and fusion
Authors
Yuanlun Xie
Wenhong Tian
Hengxin Zhang
Tingsong Ma
Publication date
4 June 2023
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 16/2023
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-023-08531-z
