ABSTRACT
While several studies have shown that virtual agents can elicit natural social behaviors from users, few have compared these social reactions to those expressed during human-human mediated communication. In this paper, we explore the social cues expressed by a user during mediated communication with either an embodied conversational agent or another human. For this purpose, we use a machine learning method to identify the facial and head social cues characteristic of each interaction type and to build a model that automatically determines whether the user is interacting with a virtual agent or another human. The results show that users do not express the same facial and head movements when communicating with a virtual agent as when communicating with another user. Based on these results, we propose using such a machine learning model to automatically measure a virtual agent's capability to elicit social behavior in the user comparable to a human-human interaction. The resulting model can automatically detect whether the user is communicating with a virtual or a real interlocutor from only one second of the user's face and head behavior.
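To make the pipeline concrete, below is a minimal sketch (not the authors' code) of the kind of classifier the abstract describes: per-frame facial and head cues are aggregated into one-second windows and fed to a supervised model that predicts the interlocutor type. The frame rate, the choice of a Random Forest, the window statistics, and the synthetic stand-in data are all illustrative assumptions; only the one-second window and the human-vs-agent labels come from the abstract.

```python
# Hedged sketch, assuming per-frame facial action-unit intensities and
# head-pose angles have already been extracted with a face tracker.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FPS = 25          # assumed frame rate
WINDOW = FPS      # one-second windows, as in the abstract

def windowed_features(frames: np.ndarray) -> np.ndarray:
    """Aggregate per-frame cues (n_frames x n_cues) into per-window
    statistics: mean and std over each one-second window."""
    n_windows = len(frames) // WINDOW
    clipped = frames[: n_windows * WINDOW].reshape(n_windows, WINDOW, -1)
    return np.concatenate([clipped.mean(axis=1), clipped.std(axis=1)], axis=1)

# Toy stand-ins for real recordings: rows are frames, columns are cues
# (AU intensities, head pitch/yaw/roll, ...). Labels: 0 = user talking
# to a human, 1 = user talking to a virtual agent.
rng = np.random.default_rng(0)
X_human = windowed_features(rng.normal(0.0, 1.0, size=(2500, 12)))
X_agent = windowed_features(rng.normal(0.3, 1.0, size=(2500, 12)))
X = np.vstack([X_human, X_agent])
y = np.array([0] * len(X_human) + [1] * len(X_agent))

# One prediction per one-second window of face and head behavior.
clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

In a setup like this, each one-second window yields a single human-vs-agent prediction, which is what lets the model decide from "only one second" of observed behavior.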