ABSTRACT
We describe a machine learning approach that allows an open-world spoken dialog system to learn to predict engagement intentions in situ, from interaction. The proposed approach requires no developer supervision: it leverages spatiotemporal and attentional features, automatically extracted from a visual analysis of people entering the system's proximity, to produce models attuned to the characteristics of the environment the system is deployed in. Experimental results indicate that a system using the proposed approach can learn to recognize engagement intentions at low false positive rates (e.g. 2--4%) up to 3--4 seconds prior to the actual moment of engagement.
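The in-situ learning idea described above can be illustrated with a minimal sketch: an online predictor is updated from self-labeled outcomes, where the label arrives for free once the system observes whether a tracked person actually engaged. This is not the paper's implementation; the learner (online logistic regression), the feature names, and all numbers below are illustrative assumptions.

```python
import math

class EngagementPredictor:
    """Hedged sketch: online logistic regression over per-frame features
    of a tracked person (illustrative, not the paper's actual model)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # one weight per feature
        self.b = 0.0                 # bias term
        self.lr = lr                 # learning rate

    def predict(self, x):
        # Probability that the tracked person intends to engage.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, engaged):
        # Self-supervised update once the outcome is observed:
        # engaged = 1 if the person actually initiated an interaction, else 0.
        # No developer-provided labels are needed.
        err = self.predict(x) - engaged
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err

# Hypothetical spatiotemporal/attentional features per video frame:
# [proximity (1/distance), approach speed, facing-the-system indicator].
model = EngagementPredictor(n_features=3)
for x, label in [([0.9, 0.8, 1.0], 1), ([0.1, 0.0, 0.0], 0)] * 200:
    model.update(x, label)
```

Because labels are harvested from observed engagement outcomes in the deployment environment, the model adapts to that environment's traffic patterns, which is the property the abstract highlights.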
Learning to predict engagement with a spoken dialog system in open-world settings