DOI: 10.1145/2663204.2666277
research-article

Emotion Recognition in the Wild with Feature Fusion and Multiple Kernel Learning

Published: 12 November 2014

ABSTRACT

This paper presents our approach to the second Emotion Recognition in The Wild Challenge. We propose a new feature descriptor, Histogram of Oriented Gradients from Three Orthogonal Planes (HOG_TOP), to represent facial expressions. We also explore the properties of visual and audio features, and adopt Multiple Kernel Learning (MKL) to find an optimal feature fusion. An SVM with multiple kernels is trained for facial expression classification. Experimental results show that the method achieves promising performance: the overall classification accuracies on the validation set and test set are 40.21% and 45.21%, respectively.
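
The abstract names two ingredients, a gradient-orientation descriptor computed on three orthogonal planes of a video volume and a multi-kernel fusion of visual and audio features, without giving their details. The following is a minimal sketch of one plausible reading, not the authors' implementation. Everything in it is an assumption: the function names (`orientation_histogram`, `hog_top`, `rbf_kernel`, `combine_kernels`), the 9-bin unsigned orientation histogram, the absence of spatial cells or blocks, the RBF kernels, and the fixed kernel weights. An MKL solver would learn the weights rather than fix them.

```python
import numpy as np

def orientation_histogram(plane_stack, n_bins=9):
    """Magnitude-weighted histogram of gradient orientations over a stack of 2D slices."""
    # Axis 0 indexes the slices; gradients are taken along the two in-plane axes.
    g_rows, g_cols = np.gradient(plane_stack.astype(np.float64), axis=(1, 2))
    magnitude = np.hypot(g_rows, g_cols)
    # Unsigned orientation folded into [0, pi] (assumed, as in classic HOG).
    angle = np.mod(np.arctan2(g_rows, g_cols), np.pi)
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

def hog_top(volume, n_bins=9):
    """Concatenate histograms from the XY, XT and YT planes of a (T, H, W) volume."""
    xy = volume                            # slices indexed by t, each (H, W)
    xt = np.transpose(volume, (1, 0, 2))   # slices indexed by y, each (T, W)
    yt = np.transpose(volume, (2, 0, 1))   # slices indexed by x, each (T, H)
    return np.concatenate([orientation_histogram(p, n_bins) for p in (xy, xt, yt)])

def rbf_kernel(features, gamma):
    """Gram matrix of an RBF kernel on row-wise feature vectors."""
    sq = np.sum(features ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * features @ features.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination of Gram matrices; weights are fixed here, not learned."""
    weights = np.asarray(weights, dtype=np.float64)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * k for w, k in zip(weights, kernels))

# Toy data: 4 random clips of shape (T, H, W) and stand-in audio vectors.
rng = np.random.default_rng(0)
clips = rng.random((4, 6, 16, 16))
visual = np.stack([hog_top(c) for c in clips])
audio = rng.random((4, 5))
fused = combine_kernels([rbf_kernel(visual, 1.0), rbf_kernel(audio, 1.0)], [0.7, 0.3])
```

The fused Gram matrix could then be passed to any SVM implementation that accepts a precomputed kernel. Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combination step stays well-defined whatever weights an MKL procedure selects.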


    • Published in

      ICMI '14: Proceedings of the 16th International Conference on Multimodal Interaction
      November 2014
      558 pages
      ISBN: 9781450328852
      DOI: 10.1145/2663204

      Copyright © 2014 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      ICMI '14 Paper Acceptance Rate: 51 of 127 submissions, 40%. Overall Acceptance Rate: 453 of 1,080 submissions, 42%.
