ABSTRACT
Social robots need non-verbal behavior to make an interaction pleasant and efficient. Most models for generating non-verbal behavior are rule-based: they can produce only a limited set of motions and are tuned to a particular scenario. Data-driven systems, in contrast, are flexible and easily adjustable. We therefore aim to learn a data-driven model that generates non-verbal behavior, in the form of a 3D motion sequence, for humanoid robots.
Our approach is based on a popular and powerful deep generative model: the Variational Autoencoder (VAE). The input to our model will be multi-modal, and we will iteratively increase its complexity: first it will use only the speech signal, then also the text transcription, and finally the non-verbal behavior of the conversation partner. We will evaluate our system on virtual avatars as well as on two humanoid robots with different embodiments: NAO and Furhat. Our model can be easily adapted to a novel domain by providing application-specific training data.
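To make the pipeline concrete, the mapping described above can be sketched as a VAE-style forward pass: encode a speech-feature frame to the parameters of a latent Gaussian, sample a latent vector with the reparameterization trick, and decode it to one frame of 3D joint positions. This is a minimal illustrative sketch, not the paper's implementation; all dimensions and weight initializations (SPEECH_DIM, LATENT_DIM, MOTION_DIM, the untrained linear maps) are assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper):
# 26 speech features (e.g. MFCCs), 8 latent dims, 15 joints x 3 coordinates.
SPEECH_DIM, LATENT_DIM, MOTION_DIM = 26, 8, 45

# Randomly initialised linear encoder/decoder weights stand in for
# trained networks; a real model would learn these from data.
W_mu = rng.normal(scale=0.1, size=(LATENT_DIM, SPEECH_DIM))
W_logvar = rng.normal(scale=0.1, size=(LATENT_DIM, SPEECH_DIM))
W_dec = rng.normal(scale=0.1, size=(MOTION_DIM, LATENT_DIM))

def encode(speech):
    """Map a speech-feature frame to the latent Gaussian's mean and log-variance."""
    return W_mu @ speech, W_logvar @ speech

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps, keeping the sampling step differentiable."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z):
    """Map a latent sample to one frame of 3D joint positions."""
    return W_dec @ z

speech_frame = rng.normal(size=SPEECH_DIM)
mu, logvar = encode(speech_frame)
motion_frame = decode(reparameterize(mu, logvar))
print(motion_frame.shape)
```

Further modalities (text transcription, the partner's motion) would enter as additional encoder inputs concatenated with the speech features.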