
2022 | OriginalPaper | Chapter

Leveraging Transfer Learning for Effective Recognition of Emotions from Images: A Review

Authors : Devangi Purkayastha, D. Malathi

Published in: Cyber Security, Privacy and Networking

Publisher: Springer Nature Singapore


Abstract

Emotions constitute an integral part of interpersonal communication and of comprehending human behavior. Reliable analysis and interpretation of facial expressions are therefore essential for gaining deeper insight into human behavior. Although facial emotion recognition (FER) has been studied extensively to improve human–computer interaction, machine performance still falls short of human interpretation. While humans have an innate capability to identify emotions from facial expressions, the task is challenging for computer systems due to intra-class variations. Most recent works perform well on datasets captured under controlled conditions but degrade on datasets with variations in lighting, shadows, facial orientation, noise, and partial faces. Despite the strong performance of existing works, significant room for improvement remains. This paper emphasizes automatic FER on a single image for real-time emotion recognition using transfer learning. Since natural images suffer from problems of resolution, pose, and noise, this study proposes a deep learning approach based on transfer learning from a pre-trained VGG-16 network, which significantly reduces training time and effort while achieving commendable improvement over previously proposed techniques and models on the FER-2013 dataset. The main contribution of this paper is to study and demonstrate the efficacy of multiple state-of-the-art models under transfer learning, and to determine which best classifies an input image into one of the seven basic emotions: happy, sad, surprise, angry, disgust, fear, and neutral. The analysis shows that the VGG-16 model outperforms ResNet-50, DenseNet-121, EfficientNet-B2, and others, attaining a training accuracy of about 85% and a validation accuracy as high as 67% in just 15 epochs, with significantly lower training time.
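The approach the abstract describes, a frozen pre-trained VGG-16 backbone with a small trainable classification head for the seven FER-2013 emotion classes, can be sketched in Keras roughly as follows. The head sizes (Dense 256, dropout 0.5) are illustrative assumptions, not taken from the paper; `weights=None` keeps the sketch runnable offline, whereas actual transfer learning would pass `weights="imagenet"`. FER-2013 images are 48×48 grayscale, so they would be stacked to 3 channels before feeding the network.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

INPUT_SHAPE = (48, 48, 3)  # FER-2013 frames, grayscale replicated to 3 channels
NUM_CLASSES = 7            # happy, sad, surprise, angry, disgust, fear, neutral

def build_fer_model(weights=None):
    """Transfer-learning head on a VGG-16 backbone.

    Pass weights="imagenet" for the pre-trained backbone;
    weights=None keeps this sketch self-contained and offline."""
    base = VGG16(weights=weights, include_top=False, input_shape=INPUT_SHAPE)
    base.trainable = False  # freeze convolutional features; train only the head
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu"),   # illustrative head size
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_fer_model()
# A dummy forward pass: one 48x48x3 image -> 7 class probabilities
probs = model.predict(np.zeros((1, *INPUT_SHAPE), dtype="float32"), verbose=0)
```

Freezing the backbone is what yields the reduced training time the abstract reports: only the head's parameters are updated, so far fewer gradients are computed per epoch.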


DOI: https://doi.org/10.1007/978-981-16-8664-1_2