
2020 | OriginalPaper | Chapter

A Virtual Testbed for Binaural Agents

Abstract

Current developments in modeling the auditory system increasingly include cognitive functions, such as dynamic auditory scene analysis. This qualifies these systems as auditory front-ends for autonomous agents. Such agents can, for example, be mobile robotic systems, that is, they can move around in their environments, explore them, and develop internal models of them. They can thereby monitor their environments and become active when potentially hazardous events occur. For example, in a search-and-rescue (SAR) scenario, the agents could identify and save persons in dangerous situations. This chapter describes a virtual testbed for such systems that was developed in the EU project Two!Ears (www.twoears.eu). There, in simulated scenarios, the agents have to localize and identify potential victims and then rescue them according to dynamic SAR plans. The actions are predominantly based on binaural cues, derived from the two ear signals of head-and-torso simulators (dummy heads) on carriages that can actively move about in the scenes to be explored. Such a simulation system can provide a tool to monitor and evaluate the cognitive processes of autonomous systems while they are dynamically executing assigned tasks.
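
To illustrate what deriving binaural cues from two ear signals can look like, the following is a minimal sketch that estimates an interaural time difference (ITD) by cross-correlation and an interaural level difference (ILD) as an energy ratio. It is not the Two!Ears implementation: the function name `interaural_cues`, its parameters, the sign conventions, and the toy signals are all assumptions made for illustration.

```python
import math


def interaural_cues(left, right, sample_rate, max_lag):
    """Estimate ITD (seconds) and ILD (dB) from two equal-length ear signals.

    Hypothetical helper for illustration only. Assumed conventions:
    positive ITD means the left ear leads, positive ILD means the left
    ear is louder. Both signals must contain non-zero energy.
    """
    if len(left) != len(right):
        raise ValueError("ear signals must have equal length")
    n = len(left)

    def xcorr(lag):
        # Sum of left[i] * right[i + lag] over the overlapping samples.
        start, stop = max(0, -lag), min(n, n - lag)
        return sum(left[i] * right[i + lag] for i in range(start, stop))

    # Lag (in samples) that maximizes the cross-correlation.
    best_lag = max(range(-max_lag, max_lag + 1), key=xcorr)
    itd = best_lag / sample_rate

    # ILD as the left/right energy ratio in decibels.
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    ild = 10.0 * math.log10(energy_left / energy_right)
    return itd, ild


# Toy input (assumed): a short pulse that reaches the right ear
# 3 samples later and at half the amplitude.
pulse = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0]
left = pulse + [0.0] * 10
right = [0.0] * 3 + [0.5 * x for x in pulse] + [0.0] * 7
itd, ild = interaural_cues(left, right, sample_rate=44100, max_lag=8)
```

With this toy input, the estimated delay corresponds to 3 samples at 44.1 kHz and the level difference to an amplitude ratio of 2. A practical front-end would typically compute such cues per frequency band and per time frame rather than over a whole signal.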


Footnotes
1
The German term “Gestalt” describes an entity where the whole is perceived as more than the sum of its parts.
 
2
Head-related impulse responses (HRIRs) are the inverse Fourier transforms of head-related transfer functions (HRTFs).
 
3
In the current project, the “SoundScape Renderer” (SSR) of Geier and Spors (2012) has been chosen for this purpose; see www.spatialaudio.net/ssr/ [last accessed: August 18, 2019].
 
4
Sound reflections from the walls were not considered, as they did not appear to be of importance for the current localization task. Should this become necessary for tests in more complex scenarios, a precedence-effect processor would have to be implemented, such as the one described by Braasch (2020), this volume.
 
5
If, at a later point in time, a sufficient amount of experimental data from the “real” agent is available, the emulated identity labels can be replaced by real ones. Identity classes could even be compiled automatically from experimental data.
 
Literature
Blauert, J. 1997. Spatial Hearing: The Psychophysics of Human Sound Localization, 2nd ed. Cambridge, MA: The MIT Press (expanded and revised edition of Räumliches Hören, S. Hirzel, Stuttgart, 1974).
Blauert, J., and G. Brown. 2020. Reflexive and reflective auditory feedback. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 3–31. Cham, Switzerland: Springer and ASA Press.
Braasch, J. 2020. Binaural modeling from an evolving-habitat perspective. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 251–286. Cham, Switzerland: Springer and ASA Press.
Braasch, J., S. Clapp, A. Parks, T. Pastore, and N. Xiang. 2013. A binaural model that analyses aural spaces and stereophonic reproduction systems by utilizing head movements. In The Technology of Binaural Listening, ed. J. Blauert, 201–223. Springer and ASA Press.
Braasch, J., A. Parks, and N. Xiang. 2011. Utilizing head movements in the binaural assessment of room acoustics and analysis of complex sound source scenarios. The Journal of the Acoustical Society of America 129: 2486.
Bregman, A. 1990. Auditory Scene Analysis: The Perceptual Organization of Sound. Cambridge, MA: The MIT Press.
Cohen-L’hyver, B., S. Argentieri, and B. Gas. 2015. Modulating the auditory Turn-to-Reflex on the basis of multimodal feedback loops: The Dynamic Weighting Model. In IEEE Robio 2016, International Conference on Robotics and Biomimetics.
Cohen-L’hyver, B., S. Argentieri, and B. Gas. 2020. Audition as a trigger of head movements. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 697–731. Cham, Switzerland: Springer and ASA Press.
Dalal, N., and B. Triggs. 2005. Histograms of oriented gradients for human detection. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 886–893.
Fabre-Thorpe, M. 2003. Visual categorization: Accessing abstraction in non-human primates. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences 358: 1215–1223.
Frintrop, S., E. Rome, and H.I. Christensen. 2010. Computational visual attention systems and their cognitive foundations: A survey. ACM Transactions on Applied Perception 7 (1): 6:1–6:39.
Geier, M., and S. Spors. 2012. Spatial audio reproduction with the soundscape renderer. In 27th Tonmeistertagung, VDT International Convention.
Goodfellow, I., Y. Bengio, and A. Courville. 2016. Deep Learning. Cambridge, MA, and London: The MIT Press.
Hörnstein, J., M. Lopes, J. Santos-Victor, and F. Lacerda. 2006. Sound localization for humanoid robots–Building audio-motor maps based on the HRTF. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 1170–1176.
Itti, L., and P. Baldi. 2009. Bayesian surprise attracts human attention. Vision Research 49 (10): 1295–1306.
Jekosch, U. 2005. Assigning of meaning to sounds: Semiotics in the context of product-sound design. In Communication Acoustics, ed. J. Blauert, 193–221. Springer.
Kitano, H., H.G. Okuno, K. Nakadai, T. Sabisch, and T. Matsui. 2000. Design and architecture of SIG the humanoid: An experimental platform for integrated perception in RoboCup humanoid challenge. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 181–190.
Kuehn, B., B. Schauerte, K. Kroschel, and R. Stiefelhagen. 2012. Multimodal saliency-based attention: A lazy robot’s approach. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 807–814.
Ma, N., G.J. Brown, and T. May. 2015. Robust localisation of multiple speakers exploiting deep neural networks and head movements. In Proceedings of Interspeech15, 2679–2683.
Metta, G., G. Sandini, D. Vernon, L. Natale, and F. Nori. 2008. The iCub humanoid robot: An open platform for research in embodied cognition. In Proceedings of 8th Workshop Performance Metrics for Intelligent Systems, 50–56.
Nakadai, K., T. Lourens, H.G. Okuno, and H. Kitano. 2000. Active audition for humanoid. In Proceedings 17th National Conference on Artificial Intelligence and 12th Conference on Innovative Applications of Artificial Intelligence, 832–839.
Okuno, H.G., K. Nakadai, K. Hidai, H. Mizoguchi, and H. Kitano. 2001. Human-robot interaction through real-time auditory and visual multiple-talker tracking. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 1402–1409.
Pastore, T., Y. Zhou, and A. Yost. 2020. Cross-modal and cognitive processes in sound localization. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 315–350. Cham, Switzerland: Springer and ASA Press.
Plinge, A., M.H. Hennecke, and G.A. Fink. 2012. Reverberation-robust online multi-speaker tracking by using a microphone array and CASA processing. In International Workshop on Acoustic Signal Enhancement (IWAENC).
Raake, A., and J. Blauert. 2013. Comprehensive modeling of the formation process of sound-quality. In 5th International Workshop Quality of Multimedia Experience (QoMEX, Klagenfurt), 76–81.
Ruesch, J., M. Lopes, A. Bernardino, J. Hörnstein, J. Santos-Victor, and R. Pfeifer. 2008. Multimodal saliency-based bottom-up attention a framework for the humanoid robot iCub. In IEEE International Conference on Robotics and Automation, 962–967.
Schauerte, B., B. Kühn, K. Kroschel, and R. Stiefelhagen. 2011. Multimodal saliency-based attention for object-based scene analysis. In IEEE International Conference on Intelligent Robots and Systems, 1173–1179.
Schauerte, B., and R. Stiefelhagen. 2013. “Wow!” Bayesian surprise for salient acoustic event detection. In IEEE International Conference on Acoustics, Speech and Signal Processing, 6402–6406.
Schymura, C., and D. Kolossa. 2020. Blackboard systems for modeling binaural understanding. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 91–111. Cham, Switzerland: Springer and ASA Press.
Sutojo, S., S. Van de Par, J. Thiemann, and A. Kohlrausch. 2020. Auditory Gestalt rules and their application. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 33–59. Cham, Switzerland: Springer and ASA Press.
Sutton, R. 2018. Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA: The MIT Press.
von der Malsburg, C. 1999. The what and why of binding: The modeler’s perspective. Neuron 24: 95–104.
Walther, T., and B. Cohen-L’hyver. 2014. Multimodal feedback in auditory-based active scene exploration. In Proceedings of Forum Acusticum. Kraków, Poland.
Metadata
Title
A Virtual Testbed for Binaural Agents
Author
Jens Blauert
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-00386-9_17
