Research article
DOI: 10.1145/3136755.3136807

Do you speak to a human or a virtual agent? Automatic analysis of user’s social cues during mediated communication

Published: 03 November 2017

ABSTRACT

While several studies have shown that virtual agents can elicit natural and social behaviors from users, few have compared these social reactions to those expressed during a human-human mediated communication. In this paper, we explore the social cues expressed by a user during a mediated communication with either an embodied conversational agent or another human. For this purpose, we exploit a machine learning method to identify the facial and head social cues characteristic of each interaction type and to construct a model that automatically determines whether the user is interacting with a virtual agent or with another human. The results show that users do not, in fact, express the same facial and head movements when communicating with a virtual agent as when communicating with another user. Based on these results, we propose to use such a machine learning model to automatically measure the capability of a virtual agent to elicit social behavior in the user comparable to that of a human-human interaction. The resulting model can automatically detect whether the user is communicating with a virtual or a real interlocutor by looking only at the user’s face and head during one second.
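The abstract does not spell out the pipeline, but the setup it describes (per-frame facial and head cues pooled over one-second windows and fed to a binary classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature set (action-unit intensities, head rotation angles), the 25 fps frame rate, the pooling statistics, and the choice of a random forest classifier are all assumptions, and the data are random placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Assumed setup: each one-second window holds per-frame facial/head cues
# (e.g., action-unit intensities, head pitch/yaw/roll) sampled at 25 fps.
FPS, N_CUES = 25, 20
rng = np.random.default_rng(0)

def pool_window(frames):
    """Summarize a (n_frames, n_cues) window with simple statistics."""
    return np.concatenate([frames.mean(axis=0),
                           frames.std(axis=0),
                           frames.max(axis=0) - frames.min(axis=0)])

# Random placeholder data standing in for real extracted windows;
# label 0 = talking to a virtual agent, 1 = talking to another human.
windows = rng.normal(size=(400, FPS, N_CUES))
labels = rng.integers(0, 2, size=400)

X = np.array([pool_window(w) for w in windows])
clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```

A forest classifier also exposes per-feature importances (`clf.feature_importances_` after fitting), which fits the paper's other goal of identifying which facial and head cues distinguish the two interaction types; on the random placeholder data above, accuracy should hover around chance.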


      Published in

      ICMI '17: Proceedings of the 19th ACM International Conference on Multimodal Interaction
      November 2017
      676 pages
      ISBN: 9781450355438
      DOI: 10.1145/3136755

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 3 November 2017


      Acceptance Rates

      ICMI '17 paper acceptance rate: 65 of 149 submissions (44%)
      Overall acceptance rate: 453 of 1,080 submissions (42%)
