research-article

Emotion Recognition in the Wild with Feature Fusion and Multiple Kernel Learning

Authors:
JunKai Chen, The Hong Kong Polytechnic University, Hong Kong, China
Zenghai Chen, The Hong Kong Polytechnic University, Hong Kong, China
Zheru Chi, The Hong Kong Polytechnic University, Hong Kong, China
Hong Fu, The Hong Kong Polytechnic University, Chu Hai College of Higher Education, Hong Kong, China

ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction, November 2014, Pages 508–513
https://doi.org/10.1145/2663204.2666277

Published: 12 November 2014

ABSTRACT

This paper presents our approach to the second Emotion Recognition in the Wild Challenge. We propose a new feature descriptor, Histogram of Oriented Gradients from Three Orthogonal Planes (HOG_TOP), to represent facial expressions. We also explore the properties of visual and audio features, and adopt Multiple Kernel Learning (MKL) to find an optimal feature fusion. An SVM with multiple kernels is trained for facial expression classification. Experimental results show that our method achieves promising performance: the overall classification accuracies on the validation set and test set are 40.21% and 45.21%, respectively.
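The abstract does not spell out how HOG_TOP is built or which kernels are fused, so the following is only an illustrative sketch under stated assumptions. It assumes, by analogy with LBP-TOP, that gradient-orientation histograms are accumulated over the XY, XT and YT planes of an aligned face clip and then concatenated. It also assumes one RBF kernel per feature type, combined with non-negative weights. All function names, the bin count, the kernel choice and the fixed weights are hypothetical; an actual MKL solver would learn the weights jointly with the SVM rather than fixing them by hand.

```python
import math

def orientation_histogram(planes, bins=8):
    """Accumulate a magnitude-weighted histogram of unsigned gradient
    orientations over a list of 2D planes (each a list of rows)."""
    hist = [0.0] * bins
    for plane in planes:
        for i in range(1, len(plane) - 1):
            for j in range(1, len(plane[0]) - 1):
                # Central differences along the two axes of this plane.
                g_col = plane[i][j + 1] - plane[i][j - 1]
                g_row = plane[i + 1][j] - plane[i - 1][j]
                mag = math.hypot(g_col, g_row)
                if mag == 0.0:
                    continue
                # Fold the direction into [0, pi) so opposite gradients share a bin.
                angle = math.atan2(g_row, g_col) % math.pi
                hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    total = sum(hist)
    return [v / total for v in hist] if total > 0 else hist

def hog_top(volume, bins=8):
    """Concatenate orientation histograms from the XY, XT and YT planes.
    volume[t][y][x] holds grayscale intensities (assumed layout)."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    xy = [volume[t] for t in range(T)]                         # one plane per frame
    xt = [[volume[t][y] for t in range(T)] for y in range(H)]  # one plane per row y
    yt = [[[volume[t][y][x] for y in range(H)] for t in range(T)]
          for x in range(W)]                                   # one plane per column x
    return (orientation_histogram(xy, bins)
            + orientation_histogram(xt, bins)
            + orientation_histogram(yt, bins))

def rbf_gram(features, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||a - b||^2)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [[math.exp(-gamma * sqdist(a, b)) for b in features] for a in features]

def combine_kernels(grams, weights):
    """Convex combination sum_m beta_m * K_m of per-feature Gram matrices.
    The weights are fixed here; MKL would learn them (assumption)."""
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    n = len(grams[0])
    return [[sum(w / total * g[i][j] for w, g in zip(weights, grams))
             for j in range(n)] for i in range(n)]
```

In this sketch, a clip would be reduced to one `hog_top` vector, audio features would form a second vector per clip, and `combine_kernels([rbf_gram(visual, g1), rbf_gram(audio, g2)], [b1, b2])` would yield a single Gram matrix. That matrix could then be passed to any SVM that accepts a precomputed kernel.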

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision tasks

Published in
ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction
November 2014
558 pages
ISBN: 9781450328852
DOI: 10.1145/2663204
General Chairs: Albert Ali Salah (Boğaziçi University, Turkey), Jeffrey Cohn (University of Pittsburgh, USA), Björn Schuller (University of Passau, Germany and Imperial College London, UK)
Program Chairs: Oya Aran (Idiap Research Institute, Switzerland), Louis-Philippe Morency (University of Southern California, USA), Philip R. Cohen (Adapx, USA)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
emotion recognition
feature fusion
hog_top
multiple kernel learning
support vector machine
Acceptance Rates
ICMI '14 Paper Acceptance Rate: 51 of 127 submissions, 40%
Overall Acceptance Rate: 453 of 1,080 submissions, 42%
