ABSTRACT
While several studies have shown that virtual agents can elicit natural social behaviors from users, few have compared these social reactions to those expressed during human-human mediated communication. In this paper, we explore the social cues expressed by a user during mediated communication with either an embodied conversational agent or another human. For this purpose, we use a machine learning method to identify the facial and head social cues characteristic of each interaction type and to build a model that automatically determines whether the user is interacting with a virtual agent or another human. The results show that users do not express the same facial and head movements when communicating with a virtual agent as when communicating with another user. Based on these results, we propose using such a machine learning model to automatically measure a virtual agent's capability to elicit social behavior in the user comparable to a human-human interaction. The resulting model can automatically detect whether the user is communicating with a virtual or a real interlocutor from only one second of the user's face and head behavior.
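To make the pipeline concrete, below is a minimal sketch (not the authors' code) of the kind of classifier the abstract describes: per-frame facial and head cues are aggregated into one-second windows and fed to a supervised model that predicts the interlocutor type. The frame rate, the choice of a Random Forest, the window statistics, and the synthetic stand-in data are all illustrative assumptions; only the one-second window and the human-vs-agent labels come from the abstract.

```python
# Hedged sketch, assuming per-frame facial action-unit intensities and
# head-pose angles have already been extracted with a face tracker.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

FPS = 25          # assumed frame rate
WINDOW = FPS      # one-second windows, as in the abstract

def windowed_features(frames: np.ndarray) -> np.ndarray:
    """Aggregate per-frame cues (n_frames x n_cues) into per-window
    statistics: mean and std over each one-second window."""
    n_windows = len(frames) // WINDOW
    clipped = frames[: n_windows * WINDOW].reshape(n_windows, WINDOW, -1)
    return np.concatenate([clipped.mean(axis=1), clipped.std(axis=1)], axis=1)

# Toy stand-ins for real recordings: rows are frames, columns are cues
# (AU intensities, head pitch/yaw/roll, ...). Labels: 0 = user talking
# to a human, 1 = user talking to a virtual agent.
rng = np.random.default_rng(0)
X_human = windowed_features(rng.normal(0.0, 1.0, size=(2500, 12)))
X_agent = windowed_features(rng.normal(0.3, 1.0, size=(2500, 12)))
X = np.vstack([X_human, X_agent])
y = np.array([0] * len(X_human) + [1] * len(X_agent))

# One prediction per one-second window of face and head behavior.
clf = RandomForestClassifier(n_estimators=300, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

In a setup like this, each one-second window yields a single human-vs-agent prediction, which is what lets the model decide from "only one second" of observed behavior.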