ABSTRACT
We introduce a humanoid robot bartender that is capable of dealing with multiple customers in a dynamic, multi-party social setting. The robot system incorporates state-of-the-art components for computer vision, linguistic processing, state management, high-level reasoning, and robot control. In a user evaluation, 31 participants interacted with the bartender in a range of social situations. Most customers successfully obtained a drink from the bartender in all scenarios, and the factors that had the greatest impact on subjective satisfaction were task success and dialogue efficiency.
Index Terms
- Two people walk into a bar: dynamic multi-party social interaction with a robot agent