DOI: 10.1145/2522848.2522879
research-article

"How can I help you": comparing engagement classification strategies for a robot bartender

Published: 09 December 2013

ABSTRACT

A robot agent existing in the physical world must be able to understand the social states of the human users it interacts with in order to respond appropriately. We compared two implemented methods for estimating the engagement state of customers for a robot bartender based on low-level sensor data: a rule-based version derived from the analysis of human behaviour in real bars, and a trained version using supervised learning on a labelled multimodal corpus. We first compared the two implementations using cross-validation on real sensor data and found that nearly all trained classifier types significantly outperformed the rule-based classifier. We also carried out feature selection to see which sensor features were the most informative for the classification task, and found that the positions of the head and hands were relevant, but that the torso orientation was not. Finally, we performed a user study comparing the ability of the two classifiers to detect the intended user engagement of actual customers of the robot bartender; this study found that the trained classifier was faster at detecting initial intended user engagement, but that the rule-based classifier was more stable.
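
The comparison described in the abstract, a hand-written rule evaluated against a trained classifier under cross-validation, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two features (distance to the bar and head angle), the rule thresholds, the synthetic data generator, and the use of a nearest-centroid model in place of the classifier types the authors evaluated.

```python
import random

# Hypothetical features (assumptions, not from the paper): distance to the
# bar in metres and absolute head angle away from the bartender in degrees.


def make_synthetic_data(n, seed=0):
    """Generate toy (features, label) pairs; label 1 = seeking engagement."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        if label == 1:
            distance = rng.gauss(0.4, 0.2)
            head_angle = rng.gauss(10.0, 10.0)
        else:
            distance = rng.gauss(1.2, 0.4)
            head_angle = rng.gauss(45.0, 20.0)
        data.append(((abs(distance), abs(head_angle)), label))
    return data


def rule_based(features):
    """Hand-written rule with made-up thresholds (illustrative only)."""
    distance, head_angle = features
    return 1 if distance < 0.5 and head_angle < 15.0 else 0


def train_nearest_centroid(train):
    """Fit one mean vector per class and return a predict function."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }

    def predict(features):
        def sq_dist(centroid):
            return sum((f - c) ** 2 for f, c in zip(features, centroid))

        # Choose the class whose centroid is closest in squared distance.
        return min(centroids, key=lambda label: sq_dist(centroids[label]))

    return predict


def k_fold_indices(n, k):
    """Split range(n) into k interleaved folds that together cover 0..n-1."""
    return [list(range(i, n, k)) for i in range(k)]


def accuracy(predict, examples):
    return sum(predict(f) == y for f, y in examples) / len(examples)


def cross_validate(data, k=10):
    """Return mean per-fold accuracy for the rule and the trained model."""
    folds = k_fold_indices(len(data), k)
    rule_scores, trained_scores = [], []
    for fold in folds:
        held_out = set(fold)
        test = [data[i] for i in fold]
        train = [data[i] for i in range(len(data)) if i not in held_out]
        # The rule needs no training; the model is refit on each training split.
        rule_scores.append(accuracy(rule_based, test))
        trained_scores.append(accuracy(train_nearest_centroid(train), test))
    return sum(rule_scores) / len(folds), sum(trained_scores) / len(folds)


if __name__ == "__main__":
    rule_acc, trained_acc = cross_validate(make_synthetic_data(200))
    print(f"rule-based: {rule_acc:.3f}, trained: {trained_acc:.3f}")
```

The numbers this prints reflect only the synthetic generator above and say nothing about the study's results. The point of the sketch is the evaluation shape: the rule is scored on each held-out fold unchanged, while the trained model is refit on the remaining folds each time.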

    • Published in

      ICMI '13: Proceedings of the 15th ACM on International conference on multimodal interaction
      December 2013
      630 pages
      ISBN: 9781450321297
      DOI: 10.1145/2522848

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      ICMI '13 Paper Acceptance Rate: 49 of 133 submissions, 37%
      Overall Acceptance Rate: 453 of 1,080 submissions, 42%
