
2016 | Original Paper | Book Chapter

MEC 2016: The Multimodal Emotion Recognition Challenge of CCPR 2016

Authors: Ya Li, Jianhua Tao, Björn Schuller, Shiguang Shan, Dongmei Jiang, Jia Jia

Published in: Pattern Recognition

Publisher: Springer Singapore


Abstract

Emotion recognition is a significant research field in pattern recognition and artificial intelligence. The Multimodal Emotion Recognition Challenge (MEC) is part of the 2016 Chinese Conference on Pattern Recognition (CCPR). The goal of this competition is to compare multimedia processing and machine learning methods for multimodal emotion recognition. The challenge also aims to provide a common benchmark data set, to bring together the audio and video emotion recognition communities, and to promote research in multimodal emotion recognition. The data used in this challenge come from the Chinese Natural Audio-Visual Emotion Database (CHEAVD), which was selected from Chinese movies and TV programs. The discrete emotion labels were annotated by four experienced assistants. Three sub-challenges are defined: audio, video, and multimodal emotion recognition. This paper introduces the baseline audio and visual features, as well as the recognition results obtained with Random Forests.
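To make the baseline setup concrete, the following is a minimal sketch of an audio-only Random Forest baseline of the kind the abstract describes, assuming per-clip acoustic features (e.g., openSMILE functionals) have already been exported to CSV. The file names, column layout, and hyperparameters are illustrative assumptions, not the official challenge configuration.

```python
# Minimal sketch of a Random Forest emotion-recognition baseline.
# Assumes precomputed per-clip acoustic features; "train_audio_features.csv",
# "val_audio_features.csv", the "emotion" column, and all hyperparameters
# are hypothetical placeholders for illustration only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score

# One row per clip: feature columns plus a discrete "emotion" label.
train = pd.read_csv("train_audio_features.csv")
val = pd.read_csv("val_audio_features.csv")

X_train = train.drop(columns=["emotion"]).to_numpy()
y_train = train["emotion"].to_numpy()
X_val = val.drop(columns=["emotion"]).to_numpy()
y_val = val["emotion"].to_numpy()

# Random Forest classifier, as named in the abstract; these settings are guesses.
clf = RandomForestClassifier(n_estimators=300, random_state=0, n_jobs=-1)
clf.fit(X_train, y_train)

pred = clf.predict(X_val)
print("accuracy:", accuracy_score(y_val, pred))
# Macro-averaged precision weights all emotion classes equally, which is useful
# for the imbalanced class distributions typical of naturalistic corpora.
print("macro precision:", precision_score(y_val, pred, average="macro", zero_division=0))
```

The same pattern extends to the video and multimodal sub-challenges by swapping in visual features or concatenating the audio and visual feature vectors before training.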


Metadata
Title: MEC 2016: The Multimodal Emotion Recognition Challenge of CCPR 2016
Authors: Ya Li, Jianhua Tao, Björn Schuller, Shiguang Shan, Dongmei Jiang, Jia Jia
Copyright Year: 2016
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-10-3005-5_55
