2017 | Original Paper | Book Chapter

19. Multimodal Affect Recognition in the Context of Human-Computer Interaction for Companion-Systems

Authors: Friedhelm Schwenker, Ronald Böck, Martin Schels, Sascha Meudt, Ingo Siegert, Michael Glodek, Markus Kächele, Miriam Schmidt-Wack, Patrick Thiam, Andreas Wendemuth, Gerald Krell

Published in: Companion Technology

Publisher: Springer International Publishing


Abstract

In general, humans interact with each other using multiple modalities; the main channels are speech, facial expressions, and gestures. Bio-physiological data such as biopotentials can also convey valuable information that helps to interpret the communication more precisely. A Companion-System can exploit these modalities to achieve efficient human-computer interaction (HCI). To do so, the multiple sources need to be analyzed and combined in technical systems. So far, however, only a few studies have dealt with the fusion of three or even more such modalities. This chapter addresses the processing steps necessary to develop a multimodal system that applies such fusion approaches.
We present ATLAS and ikannotate, two tools designed for the pre-analysis of multimodal data streams and the labeling of their relevant parts. ATLAS can display raw data, extracted features, and even the outputs of pre-trained classifier modules; in addition, it integrates annotation, transcription, and an active-learning module. ikannotate can be used directly for transcription and for guided, step-wise emotional annotation of multimodal data. It supports the three most widely used annotation paradigms, namely basic emotions, the Geneva Emotion Wheel, and Self-Assessment Manikins (SAMs). Furthermore, annotators using ikannotate can assign an uncertainty to samples.
Classifier architectures must realize a fusion system in which the multiple modalities are combined. A large number of machine learning approaches were evaluated, including data-, feature-, score-, and decision-level fusion schemes, as well as temporal fusion architectures and partially supervised learning (a minimal decision-level fusion sketch is given after the abstract).
The proposed methods are evaluated either on multimodal benchmark corpora or on the datasets of the Transregional Collaborative Research Centre SFB/TRR 62, i.e., the Last Minute Corpus and the EmoRec dataset. Furthermore, we present results achieved in international challenges.
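To make the decision-level fusion scheme mentioned above concrete, the following is a minimal Python sketch, not the chapter's actual implementation: the modality names (audio, video, biopotentials), the fusion weights, and the posterior values are purely illustrative assumptions.

import numpy as np

def decision_level_fusion(posteriors, weights=None):
    # Combine per-modality class posteriors (n_modalities x n_classes)
    # by a weighted average, then renormalize to a distribution.
    posteriors = np.asarray(posteriors, dtype=float)
    if weights is None:
        weights = np.ones(len(posteriors))  # uniform weighting by default
    fused = np.average(posteriors, axis=0, weights=weights)
    return fused / fused.sum()

# Hypothetical posteriors over two affective classes from three modalities;
# the weights below are assumptions, not values reported in the chapter.
audio = [0.7, 0.3]
video = [0.6, 0.4]
bio = [0.4, 0.6]
fused = decision_level_fusion([audio, video, bio], weights=[0.5, 0.3, 0.2])
print(fused, "-> predicted class:", int(fused.argmax()))

Weighted averaging of class posteriors is among the simplest decision-level combiners; feature-level fusion would instead concatenate the per-modality feature vectors before a single classifier is trained.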


Footnotes
1
The g-mean was chosen because of the strong imbalance between the two classes.
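For the two-class case, the g-mean is commonly defined as the geometric mean of the class-wise recalls; this standard definition is an assumption here, since the chapter body carrying this footnote is not shown:

\[ \text{g-mean} = \sqrt{\text{recall}_{\text{class 1}} \cdot \text{recall}_{\text{class 2}}} \]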
 
Metadata
Title
Multimodal Affect Recognition in the Context of Human-Computer Interaction for Companion-Systems
Authors
Friedhelm Schwenker
Ronald Böck
Martin Schels
Sascha Meudt
Ingo Siegert
Michael Glodek
Markus Kächele
Miriam Schmidt-Wack
Patrick Thiam
Andreas Wendemuth
Gerald Krell
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-43665-4_19