ABSTRACT
In face-to-face conversations, speakers continuously check whether the listener is engaged and change their conversational strategy if the listener is not fully engaged. With the goal of building a conversational agent that can adaptively control conversations with the user, this study analyzes the user's gaze behaviors and proposes a method for estimating, from gaze transition 3-gram patterns, whether the user is engaged in the conversation. First, we conduct a Wizard-of-Oz experiment to collect users' gaze behaviors. Based on an analysis of the gaze data, we propose an engagement estimation method that detects the user's disengagement gaze patterns. The algorithm is implemented as a real-time engagement-judgment mechanism and incorporated into the multimodal dialogue manager of a conversational agent. The agent estimates the user's conversational engagement and generates probing questions when the user is distracted from the conversation. Finally, we conduct an evaluation experiment with the proposed engagement-sensitive agent and demonstrate that the engagement estimation function improves the user's impression of the agent and of the interaction with it. In addition, probing performed with proper timing was found to have a positive effect on the user's verbal and nonverbal behaviors in communication with the conversational agent.
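The core idea sketched in the abstract can be illustrated with a minimal example: label each gaze sample with a target category, slide a window of three over the label sequence to obtain transition 3-grams, and flag disengagement when patterns dominated by looking away become frequent. The label set, the disengagement pattern set, and the threshold below are all hypothetical placeholders, not the ones used in the paper.

```python
# Hypothetical gaze-target labels (the paper's actual label set may differ):
# "T" = agent's face, "O" = shared object/screen, "X" = elsewhere (looking away)
gaze_sequence = ["T", "O", "X", "X", "O", "X", "X", "X", "T"]

def gaze_3grams(seq):
    """Slide a window of length 3 over the gaze-target sequence."""
    return [tuple(seq[i:i + 3]) for i in range(len(seq) - 2)]

# Illustrative disengagement patterns: transitions dominated by "X".
DISENGAGED = {("X", "X", "X"), ("O", "X", "X"), ("X", "X", "O")}

def is_disengaged(seq, threshold=0.4):
    """Flag disengagement when the fraction of 3-grams matching a
    disengagement pattern reaches the (assumed) threshold."""
    grams = gaze_3grams(seq)
    if not grams:
        return False
    hits = sum(1 for g in grams if g in DISENGAGED)
    return hits / len(grams) >= threshold

print(is_disengaged(gaze_sequence))  # → True (4 of 7 trigrams match)
```

In a real-time setting, such a check would run over a sliding buffer of recent gaze labels, and a positive result would trigger the dialogue manager to insert a probing question.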