ABSTRACT
This paper presents our approach to the second Emotion Recognition in the Wild Challenge. We propose a new feature descriptor, Histogram of Oriented Gradients from Three Orthogonal Planes (HOG_TOP), to represent facial expressions. We also explore the properties of visual and audio features and adopt Multiple Kernel Learning (MKL) to find an optimal feature fusion. An SVM with multiple kernels is trained for facial expression classification. Experimental results show that our method achieves promising performance: the overall classification accuracies on the validation set and test set are 40.21% and 45.21%, respectively.
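To make the idea behind the descriptor concrete, the following is a minimal sketch of an orientation histogram computed on three orthogonal planes of a video volume. It is an illustration under stated assumptions, not the paper's implementation: the function names `plane_histogram` and `hog_top`, the choice of 9 unsigned orientation bins, the magnitude weighting, and the per-plane L1 normalization are all hypothetical choices made here, and the sketch omits any division of the face region into spatial blocks.

```python
import numpy as np


def plane_histogram(g_a, g_b, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations in one plane.

    g_a and g_b are the gradient components along the two axes of the plane.
    n_bins=9 is an assumed value, not taken from the paper.
    """
    magnitude = np.hypot(g_a, g_b)
    # Fold signed angles from arctan2 into the unsigned range [0, pi].
    orientation = np.mod(np.arctan2(g_b, g_a), np.pi)
    hist, _ = np.histogram(
        orientation, bins=n_bins, range=(0.0, np.pi), weights=magnitude
    )
    total = hist.sum()
    # L1-normalize; leave an all-zero histogram untouched.
    return hist / total if total > 0 else hist


def hog_top(volume, n_bins=9):
    """Concatenate orientation histograms from the XY, XT and YT planes.

    volume is a grayscale clip of shape (T, H, W) with at least two
    samples along every axis.
    """
    volume = np.asarray(volume, dtype=np.float64)
    # Finite-difference gradients along time, rows (y) and columns (x).
    g_t, g_y, g_x = np.gradient(volume)
    return np.concatenate([
        plane_histogram(g_x, g_y, n_bins),  # XY plane: spatial appearance
        plane_histogram(g_x, g_t, n_bins),  # XT plane: horizontal motion
        plane_histogram(g_y, g_t, n_bins),  # YT plane: vertical motion
    ])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((4, 8, 8))  # synthetic clip: 4 frames of 8x8 pixels
    print(hog_top(clip).shape)
```

A descriptor of this kind could then be used to build one kernel per feature type, with the kernel weights learned by MKL as the abstract describes; that fusion step is not shown here.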
Index Terms
- Emotion Recognition in the Wild with Feature Fusion and Multiple Kernel Learning