Towards robust automatic affective classification of images using facial expressions for practical applications

Zhang, Ligang; Tjondronegoro, Dian; Chandran, Vinod; Eggink, Jana

doi:10.1007/s11042-015-2497-5

Published: 27 February 2015

Volume 75, pages 4669–4695, (2016)

Multimedia Tools and Applications

Ligang Zhang¹,², Dian Tjondronegoro², Vinod Chandran² & Jana Eggink³

Abstract

Affect is an important feature of multimedia content and conveys valuable information for multimedia indexing and retrieval. Most existing studies of affective content analysis are limited to low-level features or mid-level representations, and are generally criticized for their inability to bridge the gap between low-level features and high-level human affective perception. The facial expressions of subjects in images carry important semantic information that can substantially influence human affective perception, but they have seldom been investigated for affective classification of facial images in practical applications. This paper presents an automatic image emotion detector (IED) for affective classification of practical (non-laboratory) data using facial expressions, where many “real-world” challenges are present, including variations in pose, illumination, and size. The proposed method is novel: its framework is designed specifically to overcome these challenges using multi-view versions of face and fiducial point detectors, and a combination of point-based texture and geometry. Several key parameters of the relevant algorithms are compared to find the settings that give high accuracy and fast computation. A comprehensive set of experiments with existing and new datasets shows that the method is effective despite pose variations, fast, appropriate for large-scale data, and as accurate as the state-of-the-art method on laboratory-based data. The proposed method was also applied to affective classification of images from the British Broadcasting Corporation (BBC) in a task typical of a practical application, providing some valuable insights.
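The abstract does not specify which texture descriptor, geometric measure, or parameters the IED uses, so the following is only an illustrative sketch of what "a combination of point-based texture and geometry" could look like. It assumes a grayscale image stored as a list of rows, fiducial points given as (row, column) pairs that lie at least `half + 1` pixels from the image border, a basic 8-neighbour local binary pattern (LBP) histogram as the texture descriptor, and scale-normalised pairwise distances as the geometry descriptor. All function names and the `half` parameter are hypothetical and not taken from the paper.

```python
import math
from itertools import combinations

# Neighbour offsets (row, col), clockwise from the top-left pixel.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c): one bit per neighbour >= centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def texture_features(img, points, half=1):
    """Concatenate a normalised 256-bin LBP histogram of the patch around each point."""
    feats = []
    patch_area = (2 * half + 1) ** 2
    for r, c in points:
        hist = [0.0] * 256
        for rr in range(r - half, r + half + 1):
            for cc in range(c - half, c + half + 1):
                hist[lbp_code(img, rr, cc)] += 1.0
        feats.extend(h / patch_area for h in hist)
    return feats


def geometry_features(points):
    """Pairwise point distances divided by the longest one (assumed scale normalisation)."""
    dists = [math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in combinations(points, 2)]
    longest = max(dists, default=0.0) or 1.0
    return [d / longest for d in dists]


def combined_features(img, points, half=1):
    """Point-based texture followed by geometry, as one flat feature vector."""
    return texture_features(img, points, half) + geometry_features(points)
```

A vector built this way could be passed to any standard classifier. Dividing the distances by the longest one is just one simple way to make the geometry part insensitive to face size; it is an assumption of this sketch, not the normalisation reported by the authors.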

Notes

http://www.youtube.com/yt/press/statistics.html.
The dataset is not freely available. Academic institutions interested in working with it should contact jana.eggink@bbc.co.uk; license agreements might be available for collaborative work between the BBC and individual universities.

Acknowledgments

This work is funded by the British Broadcasting Corporation, the Australian Smart Services CRC, and the National Natural Science Foundation of China (Grant Nos. 61402362 and 61402363).

Author information

Authors and Affiliations

Faculty of Computer Science and Engineering, Xi’an University of Technology, Xi’an, 710048, China
Ligang Zhang
Science and Engineering Faculty, Queensland University of Technology, Brisbane, 4000, Australia
Ligang Zhang, Dian Tjondronegoro & Vinod Chandran
Research & Development, British Broadcasting Corporation (BBC), London, UK
Jana Eggink

Corresponding author

Correspondence to Ligang Zhang.

About this article

Cite this article

Zhang, L., Tjondronegoro, D., Chandran, V. et al. Towards robust automatic affective classification of images using facial expressions for practical applications. Multimed Tools Appl 75, 4669–4695 (2016). https://doi.org/10.1007/s11042-015-2497-5

Received: 01 September 2014
Revised: 31 December 2014
Accepted: 04 February 2015
Published: 27 February 2015
Issue Date: April 2016
DOI: https://doi.org/10.1007/s11042-015-2497-5
