2021 | OriginalPaper | Chapter

Towards a Natural Human-Robot Interaction in an Industrial Environment

Authors: Ander González-Docasal, Cristina Aceta, Haritz Arzelus, Aitor Álvarez, Izaskun Fernández, Johan Kildal

Published in: Conversational Dialogue Systems for the Next Decade

Publisher: Springer Singapore

Abstract

Modern industry has adopted robots as part of its processes. In many scenarios, these machines collaborate with humans on specific tasks in a shared environment, or simply guide them in a natural, safe and efficient way. Our approach improves on previous work on a multi-modal human-robot interaction system by incorporating different audio acquisition and speech recognition modules for more natural communication. The semantic interpreter, aided by a knowledge manager, parses the resulting transcription and, using contextual information, selects the order that the operator has uttered and sends it to the robot for execution. This setup is evaluated both quantitatively and qualitatively with a large set of end users in a real manufacturing scenario set in a laboratory environment. The results show that the system behaves robustly and that end users considered the assignment manageable, while the system overall was received with a high level of trust and usability.
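
The interpretation step described in the abstract (transcription in, context-dependent order out) can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary, the `Context` and `Order` types, and the `interpret` function are all hypothetical names chosen for illustration, and a simple keyword lookup stands in for the semantic interpreter and knowledge manager.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary; the chapter's real lexicon and orders are not given here.
KNOWN_ACTIONS = {"pick": "PICK", "take": "PICK", "place": "PLACE", "put": "PLACE", "stop": "STOP"}
KNOWN_OBJECTS = {"screwdriver", "box", "piece"}


@dataclass
class Context:
    """Contextual information kept between utterances (assumed structure)."""
    last_object: Optional[str] = None


@dataclass
class Order:
    """Order that would be sent to the robot (assumed structure)."""
    action: str
    target: Optional[str]


def interpret(transcription: str, context: Context) -> Optional[Order]:
    """Map a speech transcription to an order, using context to resolve 'it'."""
    tokens = transcription.lower().split()
    # First token that names a known action decides the order type.
    action = next((KNOWN_ACTIONS[t] for t in tokens if t in KNOWN_ACTIONS), None)
    if action is None:
        return None  # nothing the robot can execute
    target = next((t for t in tokens if t in KNOWN_OBJECTS), None)
    if target is None and "it" in tokens:
        # Fall back on the object mentioned in an earlier utterance.
        target = context.last_object
    if target is not None:
        context.last_object = target
    return Order(action, target)
```

Under these assumptions, calling `interpret("please pick the screwdriver", ctx)` followed by `interpret("now place it", ctx)` resolves "it" to the screwdriver from the stored context, which is the kind of contextual selection the abstract refers to.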

Metadata
DOI
https://doi.org/10.1007/978-981-15-8395-7_18