
2013 | OriginalPaper | Chapter

Binaural Systems in Robotics

Authors: S. Argentieri, A. Portello, M. Bernard, P. Danès, B. Gas

Published in: The Technology of Binaural Listening

Publisher: Springer Berlin Heidelberg


Abstract

Audition is often described by physiologists as the most important sense in humans, owing to its essential role in communication and socialization. Surprisingly, interest in this modality for robotics arose only in the 2000s, brought to the fore by cognitive robotics and human–robot interaction. Since then, numerous contributions have been made to the field of robot audition, ranging from sound localization to scene analysis. Binaural approaches were investigated first, then abandoned because of mixed results. Nevertheless, recent years have witnessed renewed interest in binaural active audition, that is, in the opportunities and challenges opened up by coupling binaural sensing with robot motion. This chapter presents a comprehensive review of the state of the art in binaural approaches to robot audition. Although the literature on binaural audition and, more generally, on acoustics and signal processing is a fundamental source of knowledge, the tasks, constraints, and environments of robotics raise original issues. These issues are reviewed first, followed by the most prominent contributions, platforms, and projects. Two lines of research in binaural active audition, conducted by the present authors, are then outlined, one of which is closely connected to the psychology of perception.

Literature
1.
go back to reference J. Aloimonos, I. Weiss, and A. Bandyopadhyay. Active vision. Intl. J. Computer Vision, 1:333–356, 1988. J. Aloimonos, I. Weiss, and A. Bandyopadhyay. Active vision. Intl. J. Computer Vision, 1:333–356, 1988.
2.
go back to reference S. Argentieri and P. Danès. Broadband variations of the MUSIC high-resolution method for sound source localization in robotics. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2007, pages 2009–2014, 2007. S. Argentieri and P. Danès. Broadband variations of the MUSIC high-resolution method for sound source localization in robotics. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2007, pages 2009–2014, 2007.
3.
go back to reference E. Arnaud, H. Christensen, Y.-C. Lu, J. Barker, V. Khalidov, M. Hansard, B. Holveck, H. Mathieu, R. Narasimha, E. Taillant, F. Forbes, and R. Horaud. The CAVA corpus: Synchronised stereoscopic and binaural datasets with head movements. In ACM/IEEE Intl. Conf. Multimodal, Interfaces, ICMI’08, 2008. E. Arnaud, H. Christensen, Y.-C. Lu, J. Barker, V. Khalidov, M. Hansard, B. Holveck, H. Mathieu, R. Narasimha, E. Taillant, F. Forbes, and R. Horaud. The CAVA corpus: Synchronised stereoscopic and binaural datasets with head movements. In ACM/IEEE Intl. Conf. Multimodal, Interfaces, ICMI’08, 2008.
4.
go back to reference M. Aytekin, C. Moss, and J. Simon. A sensorimotor approach to sound localization. Neural Computation, 20:603–635, 2008. M. Aytekin, C. Moss, and J. Simon. A sensorimotor approach to sound localization. Neural Computation, 20:603–635, 2008.
5.
go back to reference P. Azad, T. Gockel, R. Dillmann. Computer Vision: Principles and Practice. Elektor, Electronics, 2008. P. Azad, T. Gockel, R. Dillmann. Computer Vision: Principles and Practice. Elektor, Electronics, 2008.
6.
go back to reference R. Bajcsy. Active perception. Proc. of the IEEE, 76:966–1005, 1988. R. Bajcsy. Active perception. Proc. of the IEEE, 76:966–1005, 1988.
7.
go back to reference Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques and Software. Artech House, 1993. Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques and Software. Artech House, 1993.
8.
go back to reference M. Bernard, S. N’Guyen, P. Pirim, B. Gas, and J.-A. Meyer. Phonotaxis behavior in the artificial rat Psikharpax. In Intl. Symp. Robotics and Intelligent Sensors, IRIS’2010, pages 118–122, Nagoya, Japan, 2010. M. Bernard, S. N’Guyen, P. Pirim, B. Gas, and J.-A. Meyer. Phonotaxis behavior in the artificial rat Psikharpax. In Intl. Symp. Robotics and Intelligent Sensors, IRIS’2010, pages 118–122, Nagoya, Japan, 2010.
9.
go back to reference M. Bernard, P. Pirim, A. de Cheveigné, and B. Gas. Sensorimotor learning of sound localization from an auditory evoked behavior. In IEEE Intl. Conf. Robotics and Automation, ICRA’2012, pages 91–96, St. Paul, MN, 2012. M. Bernard, P. Pirim, A. de Cheveigné, and B. Gas. Sensorimotor learning of sound localization from an auditory evoked behavior. In IEEE Intl. Conf. Robotics and Automation, ICRA’2012, pages 91–96, St. Paul, MN, 2012.
10.
go back to reference J. Blauert, D. Kolossa, K. Obermayer, and K. Adiloglu. Further challenges and the road ahead. In J. Blauert, editor, The technology of binaural listening, chapter 18. Springer, Berlin-Heidelberg-New York NY, 2013. J. Blauert, D. Kolossa, K. Obermayer, and K. Adiloglu. Further challenges and the road ahead. In J. Blauert, editor, The technology of binaural listening, chapter 18. Springer, Berlin-Heidelberg-New York NY, 2013.
11.
go back to reference W. Brimijoin, D. Mc Shefferty, and M. Akeroyd. Undirected head movements of listeners with asymmetrical hearing impairment during a speech-in-noise task.Hearing Research, 283:162–8, 2012. W. Brimijoin, D. Mc Shefferty, and M. Akeroyd. Undirected head movements of listeners with asymmetrical hearing impairment during a speech-in-noise task.Hearing Research, 283:162–8, 2012.
12.
go back to reference R. Brooks, C. Breazeal, N. Marjanović, B. Scassellati, and M. Williamson. The Cog project: Building a humanoid robot. In C. Nehaniv, editor, Computations for Metaphors, Analogy, and Agents, volume 1562 of LNCS, pages 52–87. Springer, 1999. R. Brooks, C. Breazeal, N. Marjanović, B. Scassellati, and M. Williamson. The Cog project: Building a humanoid robot. In C. Nehaniv, editor, Computations for Metaphors, Analogy, and Agents, volume 1562 of LNCS, pages 52–87. Springer, 1999.
13.
go back to reference Y. Chen and Y. Rui. Real-time speaker tracking using particle filter sensor fusion. Proc. of the IEEE, 920:485–494, 2004. Y. Chen and Y. Rui. Real-time speaker tracking using particle filter sensor fusion. Proc. of the IEEE, 920:485–494, 2004.
14.
go back to reference H. Christensen and J. Barker. Using location cues to track speaker changes from mobile binaural microphones. In Interspeech 2009, Brighton, UK, 2009. H. Christensen and J. Barker. Using location cues to track speaker changes from mobile binaural microphones. In Interspeech 2009, Brighton, UK, 2009.
15.
go back to reference H. Christensen, J. Barker, Y.-C. Lu, J. Xavier, R. Caseiro, and H. Arafajo. POPeye: Real-time binaural sound-source localisation on an audio-visual robot head. In Conf. Natural Computing and Intelligent Robotics, 2009. H. Christensen, J. Barker, Y.-C. Lu, J. Xavier, R. Caseiro, and H. Arafajo. POPeye: Real-time binaural sound-source localisation on an audio-visual robot head. In Conf. Natural Computing and Intelligent Robotics, 2009.
17.
go back to reference M. Cooke, Y. Lu, Y. Lu, and R. Horaud. Active hearing, active speaking. In Intl. Symp. Auditory and Audiological Res., 2007. M. Cooke, Y. Lu, Y. Lu, and R. Horaud. Active hearing, active speaking. In Intl. Symp. Auditory and Audiological Res., 2007.
18.
go back to reference M. Cooke, A. Morris, and P. Green. Recognizing occluded speech. In Proceedings of the ESCA Tutorial and Res.arch Worksh. Auditory Basis of Speech Perception, pages 297–300, Keele University, United Kingdom, 1996. M. Cooke, A. Morris, and P. Green. Recognizing occluded speech. In Proceedings of the ESCA Tutorial and Res.arch Worksh. Auditory Basis of Speech Perception, pages 297–300, Keele University, United Kingdom, 1996.
19.
go back to reference M. Cooke, A. Morris, and P. Green. Missing data techniques for robust speech recognition. In Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP’1997, pages 863–866, Munich, Germany, 1997. M. Cooke, A. Morris, and P. Green. Missing data techniques for robust speech recognition. In Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP’1997, pages 863–866, Munich, Germany, 1997.
20.
go back to reference B. Cornelis, M. Moonen, and J. Wouters. Binaural voice activity detection for MWF-based noise reduction in binaural hearing aids. In European Signal Processing Conf., EUSIPCO’2011, pages Barcelona, Spain, 2011. B. Cornelis, M. Moonen, and J. Wouters. Binaural voice activity detection for MWF-based noise reduction in binaural hearing aids. In European Signal Processing Conf., EUSIPCO’2011, pages Barcelona, Spain, 2011.
21.
go back to reference P. Danès and J. Bonnal. Information-theoretic detection of broadband sources in a coherent beamspace MUSIC scheme. In IEEE/RSJ Intl. Conf. Intell. Robots and Systems, IROS’2010, pages 1976–1981, Taipei, Taiwan, 2010. P. Danès and J. Bonnal. Information-theoretic detection of broadband sources in a coherent beamspace MUSIC scheme. In IEEE/RSJ Intl. Conf. Intell. Robots and Systems, IROS’2010, pages 1976–1981, Taipei, Taiwan, 2010.
22.
go back to reference A. Deleforge and R. Horaud. Learning the direction of a sound source using head motions and spectral features. Technical Report 7529, INRIA, 2011. A. Deleforge and R. Horaud. Learning the direction of a sound source using head motions and spectral features. Technical Report 7529, INRIA, 2011.
23.
go back to reference A. Deleforge and R. Horaud. The Cocktail-Party robot: Sound source separation and localisation with an active binaural head. In IEEE/ACM Intl. Conf. Human Robot Interaction, HRI’2012, Boston, MA, 2012. A. Deleforge and R. Horaud. The Cocktail-Party robot: Sound source separation and localisation with an active binaural head. In IEEE/ACM Intl. Conf. Human Robot Interaction, HRI’2012, Boston, MA, 2012.
24.
go back to reference J. Gibson. The Ecological Approach to Visual Perception. Erlbaum, 1982. J. Gibson. The Ecological Approach to Visual Perception. Erlbaum, 1982.
25.
go back to reference M. Giuliani, C. Lenz, T. Müller, M. Rickert, and A. Knoll. Design principles for safety in human-robot interaction. Intl. J. Social Robotics, 2:253–274, 2010. M. Giuliani, C. Lenz, T. Müller, M. Rickert, and A. Knoll. Design principles for safety in human-robot interaction. Intl. J. Social Robotics, 2:253–274, 2010.
26.
go back to reference A. Handzel, S. Andersson, M. Gebremichael, and P. Krishnaprasad. A biomimetic apparatus for sound-source localization. In IEEE Conf. Decision and Control, CDC’2003, volume 6, pages 5879–5884, Maui, HI, 2003. A. Handzel, S. Andersson, M. Gebremichael, and P. Krishnaprasad. A biomimetic apparatus for sound-source localization. In IEEE Conf. Decision and Control, CDC’2003, volume 6, pages 5879–5884, Maui, HI, 2003.
27.
go back to reference A. Handzel and P. Krishnaprasad. Biomimetic sound-source localization. IEEE Sensors J., 2:607–616, 2002. A. Handzel and P. Krishnaprasad. Biomimetic sound-source localization. IEEE Sensors J., 2:607–616, 2002.
28.
go back to reference S. Hashimoto, S. Narita, H. Kasahara, A. Takanishi, S. Sugano, K. Shirai, T. Kobayashi, H. Takanobu, T. Kurata, K. Fujiwara, T. Matsuno, T. Kawasaki, K. Hoashi. Humanoid robot-development of an information assistant robot, Hadaly. In IEEE Intl. Worksh. Robot and Human, Communication, RO-MAN’1997, pages 106–111, 1997. S. Hashimoto, S. Narita, H. Kasahara, A. Takanishi, S. Sugano, K. Shirai, T. Kobayashi, H. Takanobu, T. Kurata, K. Fujiwara, T. Matsuno, T. Kawasaki, K. Hoashi. Humanoid robot-development of an information assistant robot, Hadaly. In IEEE Intl. Worksh. Robot and Human, Communication, RO-MAN’1997, pages 106–111, 1997.
29.
go back to reference J. Hörnstein, M. Lopes, J. Santos-victor, and F. Lacerda. Sound localization for humanoid robots - building audio-motor maps based on the HRTF. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2006, pages 1170–1176, Beijing, China, 2006. J. Hörnstein, M. Lopes, J. Santos-victor, and F. Lacerda. Sound localization for humanoid robots - building audio-motor maps based on the HRTF. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2006, pages 1170–1176, Beijing, China, 2006.
30.
go back to reference J. Huang, T. Supaongprapa, I. Terakura, F. Wang, N. Ohnishi, and N. Sugie. A model-based sound localization system and its application to robot navigation. Robotics and Autonomous Syst., 270:199–209, 1999. J. Huang, T. Supaongprapa, I. Terakura, F. Wang, N. Ohnishi, and N. Sugie. A model-based sound localization system and its application to robot navigation. Robotics and Autonomous Syst., 270:199–209, 1999.
31.
go back to reference G. Ince, K. Nakadai, T. Rodemann, Y. Hasegawa, H. Tsujino, and J. Imura. Ego noise suppression of a robot using template subtraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2009, pages 199–204, Saint Louis, MO, 2009. G. Ince, K. Nakadai, T. Rodemann, Y. Hasegawa, H. Tsujino, and J. Imura. Ego noise suppression of a robot using template subtraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2009, pages 199–204, Saint Louis, MO, 2009.
32.
go back to reference G. Ince, K. Nakadai, T. Rodemann, J. Imura, K. Nakamura, and H. Nakajima. Incremental learning for ego noise estimation of a robot. InIEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2011, pages 131–136, San Francisco, CA, 2011. G. Ince, K. Nakadai, T. Rodemann, J. Imura, K. Nakamura, and H. Nakajima. Incremental learning for ego noise estimation of a robot. InIEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2011, pages 131–136, San Francisco, CA, 2011.
33.
go back to reference G. Ince, K. Nakadai, T. Rodemann, H. Tsujino, and J. Imura. Multi-talker speech recognition under ego-motion noise using missing feature theory. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2010, pages 982–987, Taipei, Taiwan, 2010. G. Ince, K. Nakadai, T. Rodemann, H. Tsujino, and J. Imura. Multi-talker speech recognition under ego-motion noise using missing feature theory. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2010, pages 982–987, Taipei, Taiwan, 2010.
34.
go back to reference R. Irie. Multimodal sensory integration for localization in a humanoid robot. In IJCAI Worksh. Computational Auditory Scene Analysis, pages 54–58, Nagoya, Aichi, Japan, 1997. R. Irie. Multimodal sensory integration for localization in a humanoid robot. In IJCAI Worksh. Computational Auditory Scene Analysis, pages 54–58, Nagoya, Aichi, Japan, 1997.
35.
go back to reference A. Ito, T. Kanayama, M. Suzuki, and S. Makino. Internal noise suppression for speech recognition by small robots. In Interspeech’2005, pages 2685–2688, Lisbon, Portugal, 2005. A. Ito, T. Kanayama, M. Suzuki, and S. Makino. Internal noise suppression for speech recognition by small robots. In Interspeech’2005, pages 2685–2688, Lisbon, Portugal, 2005.
36.
go back to reference M. Ji, S. Kim, H. Kim, K. Kwak, and Y. Cho. Reliable speaker identification using multiple microphones in ubiquitous robot companion environment. In IEEE Intl. Conf. Robot & Human Interactive Communication, RO-MAN’2007, pages 673–677, Jeju Island, Korea, 2007. M. Ji, S. Kim, H. Kim, K. Kwak, and Y. Cho. Reliable speaker identification using multiple microphones in ubiquitous robot companion environment. In IEEE Intl. Conf. Robot & Human Interactive Communication, RO-MAN’2007, pages 673–677, Jeju Island, Korea, 2007.
37.
go back to reference H.-D. Kim, J. Kim, K. Komatani, T. Ogata, and H. Okuno. Target speech detection and separation for humanoid robots in sparse dialogue with noisy home environments. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2008, pages 1705–1711, Nice, France, 2008. H.-D. Kim, J. Kim, K. Komatani, T. Ogata, and H. Okuno. Target speech detection and separation for humanoid robots in sparse dialogue with noisy home environments. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2008, pages 1705–1711, Nice, France, 2008.
38.
go back to reference C. Knapp and G. Carter. The generalized correlation method for estimation of time delay. IEEE Trans. Acoustics, Speech and, Signal Processing, 24:320–327, 1976. C. Knapp and G. Carter. The generalized correlation method for estimation of time delay. IEEE Trans. Acoustics, Speech and, Signal Processing, 24:320–327, 1976.
39.
go back to reference C. Knapp and G. Carter. Time delay estimation in the presence of relative motion. In IEEE Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP’1977, pages 280–283, Storrs, CT, 1977. C. Knapp and G. Carter. Time delay estimation in the presence of relative motion. In IEEE Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP’1977, pages 280–283, Storrs, CT, 1977.
40.
go back to reference Y. Kubota, M. Yoshida, K. Komatani, T. Ogata, and H. Okuno. Design and implementation of a 3D auditory scene visualizer: Towards auditory awareness with face tracking. In IEEE Intl. Symp. Multimedia, ISM’2008, pages 468–476, Berkeley, CA, 2008. Y. Kubota, M. Yoshida, K. Komatani, T. Ogata, and H. Okuno. Design and implementation of a 3D auditory scene visualizer: Towards auditory awareness with face tracking. In IEEE Intl. Symp. Multimedia, ISM’2008, pages 468–476, Berkeley, CA, 2008.
41.
go back to reference M. Kumon and Y. Noda. Active soft pinnae for robots. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2011, pages 112–117, San Francisco, CA, 2011. M. Kumon and Y. Noda. Active soft pinnae for robots. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2011, pages 112–117, San Francisco, CA, 2011.
42.
go back to reference M. Kumon, R. Shimoda, and Z. Iwai. Audio servo for robotic systems with pinnae. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2005, pages 885–890, Edmonton, Canada, 2005. M. Kumon, R. Shimoda, and Z. Iwai. Audio servo for robotic systems with pinnae. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2005, pages 885–890, Edmonton, Canada, 2005.
43.
go back to reference S. Kurotaki, N. Suzuki, K. Nakadai, H. Okuno, and H. Amano. Implementation of active direction-pass filter on dynamically reconfigurable processor. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2005, pages 3175–3180, Edmonton, Canada, 2005. S. Kurotaki, N. Suzuki, K. Nakadai, H. Okuno, and H. Amano. Implementation of active direction-pass filter on dynamically reconfigurable processor. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2005, pages 3175–3180, Edmonton, Canada, 2005.
44.
go back to reference Q. Lin, E. E. Jan, and J. Flanagan. Microphone arrays and speaker identification. IEEE Trans. Speech and Audio Processing, 2:622–629, 1994. Q. Lin, E. E. Jan, and J. Flanagan. Microphone arrays and speaker identification. IEEE Trans. Speech and Audio Processing, 2:622–629, 1994.
45.
go back to reference R. Lippmann and B. A. Carlson. Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise. In Eurospeech’1997, pages 863–866, Rhodos, Greece, 1997. R. Lippmann and B. A. Carlson. Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise. In Eurospeech’1997, pages 863–866, Rhodos, Greece, 1997.
46.
go back to reference Y.-C. Lu and M. Cooke. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners. Speech Comm., 53:622–642, 2011. Y.-C. Lu and M. Cooke. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners. Speech Comm., 53:622–642, 2011.
47.
go back to reference V. Lunati, J. Manhès, and P. Danès. A versatile system-on-a-programmable-chip for array processing and binaural robot audition. InIEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2012, pages 998–1003, Vilamoura, Portugal, 2012. V. Lunati, J. Manhès, and P. Danès. A versatile system-on-a-programmable-chip for array processing and binaural robot audition. InIEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2012, pages 998–1003, Vilamoura, Portugal, 2012.
48.
go back to reference D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Feeeman, W.H., 1982. D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. Feeeman, W.H., 1982.
49.
go back to reference E. Martinson and B. Fransen. Dynamically reconfigurable microphone arrays. In IEEE Intl. Conf. Robotics and Automation, ICRA’2011, pages 5636–5641, Shangai, China, 2011. E. Martinson and B. Fransen. Dynamically reconfigurable microphone arrays. In IEEE Intl. Conf. Robotics and Automation, ICRA’2011, pages 5636–5641, Shangai, China, 2011.
50.
go back to reference Y. Matsusaka, T. Tojo, S. Kubota, K. Furukawa, D. Tamiya, K. Hayata, Y. Nakano, and T. Kobayashi. Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -. In Eurospeech’1999, pages 1723–1726, Budapest, Hungary, 1999. Y. Matsusaka, T. Tojo, S. Kubota, K. Furukawa, D. Tamiya, K. Hayata, Y. Nakano, and T. Kobayashi. Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -. In Eurospeech’1999, pages 1723–1726, Budapest, Hungary, 1999.
51.
go back to reference T. May, S. van de Par, and A. Kohlrausch. Binaural localization and detection of speakers in complex acoustic scenes. In J. Blauert, editor, The Technology of Binaural Listening, chapter 15. Springer, Berlin-Heidelberg-New York NY, 2013. T. May, S. van de Par, and A. Kohlrausch. Binaural localization and detection of speakers in complex acoustic scenes. In J. Blauert, editor, The Technology of Binaural Listening, chapter 15. Springer, Berlin-Heidelberg-New York NY, 2013.
52.
go back to reference F. Michaud, C. Côté, D. Létourneau, Y. Brosseau, J.-M. Valin, E. Beaudry, C. Raïevsky, A. Ponchon, P. Moisan, P. Lepage, Y. Morin, F. Gagnon, P. Giguère, M.-A. Roux, S. Caron, P. Frenette, and F. Kabanza. Spartacus attending the 2005 AAAI conference.Autonomous Robots, 22:369–383, 2007. F. Michaud, C. Côté, D. Létourneau, Y. Brosseau, J.-M. Valin, E. Beaudry, C. Raïevsky, A. Ponchon, P. Moisan, P. Lepage, Y. Morin, F. Gagnon, P. Giguère, M.-A. Roux, S. Caron, P. Frenette, and F. Kabanza. Spartacus attending the 2005 AAAI conference.Autonomous Robots, 22:369–383, 2007.
53.
go back to reference K. Nakadai, T. Lourens, H. Okuno, and H. Kitano. Active audition for humanoids. In Nat. Conf. Artificial Intelligence, AAAI-2000, pages 832–839, Austin, TX, 2000. K. Nakadai, T. Lourens, H. Okuno, and H. Kitano. Active audition for humanoids. In Nat. Conf. Artificial Intelligence, AAAI-2000, pages 832–839, Austin, TX, 2000.
54.
go back to reference K. Nakadai, T. Matsui, H. Okuno, and H. Kitano. Active audition system and humanoid exterior design. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2000, pages 1453–1461, Takamatsu, Japan, 2000. K. Nakadai, T. Matsui, H. Okuno, and H. Kitano. Active audition system and humanoid exterior design. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2000, pages 1453–1461, Takamatsu, Japan, 2000.
55.
go back to reference K. Nakadai, D. Matsuura, H. Okuno, and H. Kitano. Applying scattering theory to robot audition system: Robust sound source localization and extraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2003, pages 1147–1152, Las Vegas, NV, 2003. K. Nakadai, D. Matsuura, H. Okuno, and H. Kitano. Applying scattering theory to robot audition system: Robust sound source localization and extraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2003, pages 1147–1152, Las Vegas, NV, 2003.
56.
go back to reference K. Nakadai, H. Okuno, and H. Kitano. Epipolar geometry based sound localization and extraction for humanoid audition. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2001, volume 3, pages 1395–1401, Maui, HI, 2001. K. Nakadai, H. Okuno, and H. Kitano. Epipolar geometry based sound localization and extraction for humanoid audition. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2001, volume 3, pages 1395–1401, Maui, HI, 2001.
57.
go back to reference K. Nakadai, H. Okuno, and H. Kitano. Auditory fovea based speech separation and its application to dialog system. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems, IROS’2002, volume 2, pages 1320–1325, Lausanne, Switzerland, 2002. K. Nakadai, H. Okuno, and H. Kitano. Auditory fovea based speech separation and its application to dialog system. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems, IROS’2002, volume 2, pages 1320–1325, Lausanne, Switzerland, 2002.
58.
go back to reference K. Nakadai, H. Okuno, and H. Kitano. Robot recognizes three simultaneous speech by active audition. In IEEE Intl. Conf. Robotics and Automation, ICRA’2003, volume 1, pages 398–405, Taipei, Taiwan, 2003. K. Nakadai, H. Okuno, and H. Kitano. Robot recognizes three simultaneous speech by active audition. In IEEE Intl. Conf. Robotics and Automation, ICRA’2003, volume 1, pages 398–405, Taipei, Taiwan, 2003.
59.
go back to reference H. Nakajima, K. Kikuchi, T. Daigo, Y. Kaneda, K. Nakadai, and Y. Hasegawa. Real-time sound source orientation estimation using a 96 channel microphone array. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2009, pages 676–683, Saint Louis, MO, 2009. H. Nakajima, K. Kikuchi, T. Daigo, Y. Kaneda, K. Nakadai, and Y. Hasegawa. Real-time sound source orientation estimation using a 96 channel microphone array. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2009, pages 676–683, Saint Louis, MO, 2009.
60.
go back to reference H. Nakashima and T. Mukai. 3D sound source localization system based on learning of binaural hearing. In IEEE Intl. Conf. Systems, Man and Cybernetics, SMC’2005, pages 3534–3539, Nagoya, Japan, 2005. H. Nakashima and T. Mukai. 3D sound source localization system based on learning of binaural hearing. In IEEE Intl. Conf. Systems, Man and Cybernetics, SMC’2005, pages 3534–3539, Nagoya, Japan, 2005.
61.
go back to reference E. Nemer, R. Goubran, and S. Mahmoud. Robust voice activity detection using higher-order statistics in the LPC residual domain. IEEE Trans. Speech and Audio Processing, 9:217–231, 2001. E. Nemer, R. Goubran, and S. Mahmoud. Robust voice activity detection using higher-order statistics in the LPC residual domain. IEEE Trans. Speech and Audio Processing, 9:217–231, 2001.
62.
go back to reference Y. Nishimura, M. Nakano, K. Nakadai, H. Tsujino, and M. Ishizuka. Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR. In ISCA Tutorial and Research Worksh. Statistical and Perceptual Audition, Pittsburgh, PA, 2006. Y. Nishimura, M. Nakano, K. Nakadai, H. Tsujino, and M. Ishizuka. Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR. In ISCA Tutorial and Research Worksh. Statistical and Perceptual Audition, Pittsburgh, PA, 2006.
63.
go back to reference H. Okuno, T. Ogata, K. Komatani, and K. Nakadai. Computational auditory scene analysis and its application to robot audition. In IEEE Intl. Conf. Informatics Res. for Development of Knowledge Society Infrastructure, ICKS’2004, pages 73–80, 2004. H. Okuno, T. Ogata, K. Komatani, and K. Nakadai. Computational auditory scene analysis and its application to robot audition. In IEEE Intl. Conf. Informatics Res. for Development of Knowledge Society Infrastructure, ICKS’2004, pages 73–80, 2004.
64.
go back to reference J. O’Regan. How to build a robot that is conscious and feels. Minds and Machines, pages 117–136, 2012. J. O’Regan. How to build a robot that is conscious and feels. Minds and Machines, pages 117–136, 2012.
65.
go back to reference J. O’Regan and A. Noë. A sensorimotor account of vision and visual consciousness. Behavioral and brain sciences, 24:939–1031, 2001. J. O’Regan and A. Noë. A sensorimotor account of vision and visual consciousness. Behavioral and brain sciences, 24:939–1031, 2001.
66.
go back to reference D. Philipona and J. K. O’Regan. Is there something out there? inferring space from sensorimotor dependencies. Neural Computation, 15:2029–2049, 2001. D. Philipona and J. K. O’Regan. Is there something out there? inferring space from sensorimotor dependencies. Neural Computation, 15:2029–2049, 2001.
67.
go back to reference B. Pierce, T. Kuratate, A. Maejima, S. Morishima, Y. Matsusaka, M. Durkovic, K. Diepold, and G. Cheng. Development of an integrated multi-modal communication robotic face. In IEEE Worksh. Advanced Robotics and its Social Impacts, RSO’2012, pages 101–102, Munich, Germany, 2012. B. Pierce, T. Kuratate, A. Maejima, S. Morishima, Y. Matsusaka, M. Durkovic, K. Diepold, and G. Cheng. Development of an integrated multi-modal communication robotic face. In IEEE Worksh. Advanced Robotics and its Social Impacts, RSO’2012, pages 101–102, Munich, Germany, 2012.
68.
go back to reference H. Poincaré. L’espace et la géométrie. Revue de Métaphysique et de Morale, pages 631–646, 1895. H. Poincaré. L’espace et la géométrie. Revue de Métaphysique et de Morale, pages 631–646, 1895.
69.
go back to reference A. Portello, P. Danès, and S. Argentieri. Active binaural localization of intermittent moving sources in the presence of false meaurements. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2012, pages 3294–3299, Vilamoura, Portugal, 2012. A. Portello, P. Danès, and S. Argentieri. Active binaural localization of intermittent moving sources in the presence of false meaurements. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2012, pages 3294–3299, Vilamoura, Portugal, 2012.
70.
go back to reference R. Prasad, H. Saruwatari, and K. Shikano. Enhancement of speech signals separated from their convolutive mixture by FDICA algorithm. Digital Signal Processing, 19:127–133, 2009. R. Prasad, H. Saruwatari, and K. Shikano. Enhancement of speech signals separated from their convolutive mixture by FDICA algorithm. Digital Signal Processing, 19:127–133, 2009.
71.
go back to reference L. Rabiner and M. Sambur. An algorithm for determining the endpoints of isolated utterances. The Bell System Techn. J., 54:297–315, 1975. L. Rabiner and M. Sambur. An algorithm for determining the endpoints of isolated utterances. The Bell System Techn. J., 54:297–315, 1975.
72.
go back to reference B. Raj, R. Singh, and R. Stern. Inference of missing spectrographic features for robust speech recognition. In Intl. Conf. Spoken Language Processing, Sydney, Australia, 1998. B. Raj, R. Singh, and R. Stern. Inference of missing spectrographic features for robust speech recognition. In Intl. Conf. Spoken Language Processing, Sydney, Australia, 1998.
73.
B. Raj and R. M. Stern. Missing-feature approaches in speech recognition. IEEE Signal Processing Mag., 22:101–116, 2005.
74.
T. Rodemann. A study on distance estimation in binaural sound localization. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2010, pages 425–430, Taipei, Taiwan, 2010.
75.
D. Rosenthal and H. Okuno, editors. Computational Auditory Scene Analysis. Lawrence Erlbaum Associates, 1997.
76.
A. Saxena and A. Ng. Learning sound location from a single microphone. In IEEE Intl. Conf. Robotics and Automation, ICRA’2009, pages 1737–1742, Kobe, Japan, 2009.
77.
S. Schulz and T. Herfet. Humanoid separation of speech sources in reverberant environments. In Intl. Symp. Communications, Control and Signal Processing, ISCCSP’2008, pages 377–382, Brownsville, TX, 2008.
78.
M. L. Seltzer, B. Raj, and R. Stern. A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition. Speech Comm., 43:379–393, 2004.
79.
A. Skaf and P. Danès. Optimal positioning of a binaural sensor on a humanoid head for sound source localization. In IEEE Intl. Conf. Humanoid Robots, Humanoids’2011, pages 165–170, Bled, Slovenia, 2011.
80.
D. Sodoyer, B. Rivet, L. Girin, C. Savariaux, J.-L. Schwartz, and C. Jutten. A study of lip movements during spontaneous dialog and its application to voice activity detection. J. Acoust. Soc. Am., 125:1184–1196, 2009.
81.
M. Stamm and M. Altinsoy. Employing binaural-proprioceptive interaction in human machine interfaces. In J. Blauert, editor, The technology of binaural listening, chapter 17. Springer, Berlin-Heidelberg-New York NY, 2013.
82.
R. Takeda, S. Yamamoto, K. Komatani, T. Ogata, and H. Okuno. Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2006, pages 878–885, Beijing, China, 2006.
83.
K. Tanaka, M. Abe, and S. Ando. A novel mechanical cochlea “fishbone” with dual sensor/actuator characteristics. IEEE/ASME Trans. Mechatronics, 3:98–105, 1998.
84.
J. Valin, J. Rouat, and F. Michaud. Enhanced robot audition based on microphone array source separation with post-filter. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2004, pages 2123–2128, Sendai, Japan, 2004.
85.
H. Van Trees. Optimum Array Processing (Detection, Estimation, and Modulation Theory, Part IV). Wiley-Interscience, 2002.
86.
D. Ward, E. Lehmann, and R. Williamson. Particle filtering algorithms for tracking an acoustic source in a reverberant environment. IEEE Trans. Speech and Audio Processing, 11:826–836, 2003.
87.
E. Weinstein and A. Weiss. Fundamental limitations in passive time delay estimation - Part II: Wideband systems. IEEE Trans. Acoustics, Speech and Signal Processing, pages 1064–1078, 1984.
88.
A. Weiss and E. Weinstein. Fundamental limitations in passive time delay estimation - Part I: Narrowband systems. IEEE Trans. Acoustics, Speech and Signal Processing, pages 472–486, 1983.
89.
R. Weiss, M. Mandel, and D. Ellis. Combining localization cues and source model constraints for binaural source separation. Speech Comm., 53:606–621, 2011.
90.
R. Woodworth and H. Schlosberg. Experimental Psychology. Holt, Rinehart and Winston, 3rd edition, 1971.
91.
T. Yoshida and K. Nakadai. Two-layered audio-visual speech recognition for robots in noisy environments. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2010, pages 988–993, 2010.
92.
K. Youssef, S. Argentieri, and J. Zarader. From monaural to binaural speaker recognition for humanoid robots. In IEEE/RAS Intl. Conf. Humanoid Robots, Humanoids’2010, pages 580–586, Nashville, TN, 2010.
93.
K. Youssef, S. Argentieri, and J.-L. Zarader. A binaural sound source localization method using auditive cues and vision. In IEEE Intl. Conf. Acoustics, Speech and Signal Processing, ICASSP’2012, pages 217–220, Kyoto, Japan, 2012.
94.
K. Youssef, S. Argentieri, and J.-L. Zarader. Towards a systematic study of binaural cues. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS’2012, pages 1004–1009, Vilamoura, Portugal, 2012.
95.
K. Youssef, B. Breteau, S. Argentieri, J.-L. Zarader, and Z. Wang. Approaches for automatic speaker recognition in a binaural humanoid context. In Eur. Symp. Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN’2011, pages 411–416, Bruges, Belgium, 2011.
Metadata
Title
Binaural Systems in Robotics
Authors
S. Argentieri
A. Portello
M. Bernard
P. Danès
B. Gas
Copyright Year
2013
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-642-37762-4_9