Research Article
DOI: 10.1145/3242969.3264970

Data Driven Non-Verbal Behavior Generation for Humanoid Robots

Published: 02 October 2018

ABSTRACT

Social robots need non-verbal behavior to make an interaction pleasant and efficient. Most models for generating non-verbal behavior are rule-based: they can produce only a limited set of motions and are tuned to a particular scenario. In contrast, data-driven systems are flexible and easily adjustable. We therefore aim to learn a data-driven model for generating non-verbal behavior (in the form of a 3D motion sequence) for humanoid robots.

Our approach is based on a popular and powerful deep generative model: the Variational Autoencoder (VAE). The input to our model will be multi-modal, and we will iteratively increase its complexity: first it will use only the speech signal, then also the text transcription, and finally the non-verbal behavior of the conversation partner. We will evaluate our system on virtual avatars as well as on two humanoid robots with different embodiments: NAO and Furhat. Our model will be easily adaptable to a novel domain by providing application-specific training data.
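As a rough illustration of this kind of architecture, the sketch below shows a speech-conditioned VAE in PyTorch. It is not the paper's implementation: the framework, layer sizes, feature dimensions, and all names (SpeechConditionedVAE, vae_loss, speech_dim, pose_dim) are assumptions chosen for illustration only.

```python
# Minimal sketch of a speech-conditioned VAE for motion generation.
# All dimensions and names are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class SpeechConditionedVAE(nn.Module):
    def __init__(self, speech_dim=26, pose_dim=45, latent_dim=32, hidden=256):
        super().__init__()
        # Encoder: pose frame + speech features -> latent mean and log-variance.
        self.encoder = nn.Sequential(
            nn.Linear(pose_dim + speech_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),
        )
        # Decoder: latent sample + speech features -> reconstructed pose frame.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + speech_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),
        )

    def forward(self, pose, speech):
        mu, logvar = self.encoder(torch.cat([pose, speech], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(torch.cat([z, speech], dim=-1))
        return recon, mu, logvar

def vae_loss(recon, pose, mu, logvar, beta=1.0):
    # Reconstruction error plus KL divergence to the standard normal prior.
    rec = nn.functional.mse_loss(recon, pose)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl

# Toy usage with random stand-in data (real training would use recorded
# speech-and-motion pairs).
model = SpeechConditionedVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
pose = torch.randn(8, 45)    # batch of 3D pose frames (placeholder)
speech = torch.randn(8, 26)  # batch of speech features, e.g. MFCCs (placeholder)

opt.zero_grad()
recon, mu, logvar = model(pose, speech)
loss = vae_loss(recon, pose, mu, logvar)
loss.backward()
opt.step()
```

At generation time, a latent vector sampled from the prior and concatenated with new speech features would be decoded into a pose; the additional modalities described above (text transcription, partner behavior) would enter the same way, by extending the condition vector.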


Published in
ICMI '18: Proceedings of the 20th ACM International Conference on Multimodal Interaction
October 2018, 687 pages
ISBN: 9781450356923
DOI: 10.1145/3242969

            Copyright © 2018 ACM


Publisher
Association for Computing Machinery, New York, NY, United States



Acceptance Rates
ICMI '18 paper acceptance rate: 63 of 149 submissions, 42%. Overall acceptance rate: 453 of 1,080 submissions, 42%.
