2020 | OriginalPaper | Chapter

A Virtual Testbed for Binaural Agents

Author : Jens Blauert

Published in: The Technology of Binaural Understanding

Publisher: Springer International Publishing

Abstract

Current developments in modeling the auditory system lead to increasing inclusion of cognitive functions, such as dynamic auditory scene analysis. This qualifies these systems as auditory front-ends for autonomous agents. Such agents can, for example, be mobile robotic systems, that is, they can move around in their environments, explore them, and develop internal models of them. Thereby, they can monitor their environments and become active in cases where potentially hazardous things happen. For example, in a Search-&-Rescue (SAR) scenario, the agents could identify and save persons in dangerous situations. In this chapter, a virtual testbed for such systems is described that was developed in the EU project Two!Ears (www.twoears.eu). There, in simulated scenarios, the agents have to localize and identify potential victims and, consequently, rescue them according to dynamic SAR plans. The actions are predominantly based on binaural cues, derived from the two ear signals of head-and-torso simulators (dummy heads) on carriages that can actively move about in the scenes to be explored. Such a simulation system can provide a tool to monitor and evaluate the cognitive processes of autonomous systems while these are dynamically executing assigned tasks.

The German term “Gestalt” describes an entity where the whole is perceived as more than the sum of its parts.

Head-related impulse responses (HRIRs) are the inverse Fourier transforms of head-related transfer functions (HRTFs); conversely, HRTFs are the Fourier transforms of HRIRs.

In the current project the “SoundScape Renderer” (SSR) of Geier and Spors (2012) has been chosen for this purpose—see www.spatialaudio.net/ssr/ [last accessed: August 18, 2019].

Sound reflections from the walls were not considered, as they did not appear to be important for the current localization task. Should they become necessary for tests in more complex scenarios, a precedence-effect processor would have to be implemented, such as the one described by Braasch (2020), this volume.

If at a later point in time a sufficient amount of experimental data from the “real” agent is available, the emulated identity labels can be replaced by real ones. Identity classes could even be compiled automatically from experimental data.
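
The notes above mention that HRIRs and HRTFs form a Fourier-transform pair. The following minimal sketch illustrates that relation with a naive discrete Fourier transform. It is not taken from the chapter: the toy impulse response (a pure two-sample delay with gain 0.5) and the helper names `dft` and `idft` are assumptions chosen only for illustration.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform (time domain -> frequency domain)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform (frequency domain -> time domain)."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Hypothetical toy HRIR: a pure delay of 2 samples with gain 0.5.
# Measured HRIRs are much longer and direction dependent.
hrir = [0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]

# HRTF = Fourier transform of the HRIR.
hrtf = dft(hrir)

# HRIR = inverse Fourier transform of the HRTF.
recovered = [value.real for value in idft(hrtf)]

# A pure delay has a flat magnitude response; only the phase varies with frequency.
magnitudes = [abs(h) for h in hrtf]
```

For this toy response, the magnitude is the same in every frequency bin and the delay appears only in the phase, while `recovered` reproduces the original impulse response up to rounding error.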

Adream. 2014. Lab. for analysis and architecture of systems, F–Toulouse. https://www.laas.fr/public/en/adream. Last accessed 18 Aug 2019.

Blauert, J. 1997. Spatial Hearing—The Psychophysics of Human Sound Localization, 2nd ed. Cambridge, MA: The MIT-Press (expanded and revised edition of Räumliches Hören, S. Hirzel, Stuttgart, 1974).

Blauert, J., and G. Brown. 2020. Reflexive and reflective auditory feedback. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 3–31. Cham, Switzerland: Springer and ASA Press.

Blender Foundation. 2014. Blender-3D open source animation suite. http://www.blender.org/. Last accessed 18 Aug 2019.

Braasch, J. 2020. Binaural modeling from an evolving-habitat perspective. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 251–286. Cham, Switzerland: Springer and ASA Press.

Braasch, J., S. Clapp, A. Parks, T. Pastore, and N. Xiang. 2013. A binaural model that analyses aural spaces and stereophonic reproduction systems by utilizing head movements. In The Technology of Binaural Listening, ed. J. Blauert, 201–223. Springer and ASA Press.

Braasch, J., A. Parks, and N. Xiang. 2011. Utilizing head movements in the binaural assessment of room acoustics and analysis of complex sound source scenarios. The Journal of the Acoustical Society of America 129: 2486.

Bregman, A. 1990. Auditory Scene Analysis—The Perceptual Organization of Sound. Cambridge, MA: The MIT Press.

Cohen-L’hyver, B., S. Argentieri, and B. Gas. 2015. Modulating the auditory Turn-to-Reflex on the basis of multimodal feedback loops: The Dynamic Weighting Model. In IEEE Robio 2016—International Conference on Robotics and Biomimetics.

Cohen-L’hyver, B., S. Argentieri, and B. Gas. 2020. Audition as a trigger of head movements. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 697–731. Cham, Switzerland: Springer and ASA Press.

Dalal, N., and B. Triggs. 2005. Histograms of oriented gradients for human detection. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, 886–893.

Ears. 2014. Embodied audition for robots. https://robot-ears.eu/. Last accessed 18 Aug 2019.

Fabre-Thorpe, M. 2003. Visual categorization: Accessing abstraction in non-human primates. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences 358: 1215–1223.

Frintrop, S., E. Rome, and H.I. Christensen. 2010. Computational visual attention systems and their cognitive foundations: A survey. ACM Transactions on Applied Perception 7 (1): 6:1–6:39.

Geier, M., and S. Spors. 2012. Spatial audio reproduction with the soundscape renderer. In 27th Tonmeistertagung—VDT International Convention.

Goodfellow, I., Y. Bengio, and A. Courville. 2016. Deep Learning. Cambridge, MA; GB, London: The MIT Press.

Hörnstein, J., M. Lopes, J. Santos-Victor, and F. Lacerda. 2006. Sound localization for humanoid robots–Building audio-motor maps based on the HRTF. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 1170–1176.

Itti, L., and P. Baldi. 2009. Bayesian surprise attracts human attention. Vision Research 49 (10): 1295–1306.

Jekosch, U. 2005. Assigning of meaning to sounds—Semiotics in the context of product-sound design. In Communication Acoustics, ed. J. Blauert, 193–221. Springer.

Kitano, H., H.G. Okuno, K. Nakadai, T. Sabisch, and T. Matsui. 2000. Design and architecture of SIG the humanoid: An experimental platform for integrated perception in RoboCup humanoid challenge. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 181–190.

Kuehn, B., B. Schauerte, K. Kroschel, and R. Stiefelhagen. 2012. Multimodal saliency-based attention: A lazy robot’s approach. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 807–814.

Ma, N., G.J. Brown, and T. May. 2015. Robust localisation of multiple speakers exploiting deep neural networks and head movements. In Proceedings of Interspeech15, 2679–2683.

Metta, G., G. Sandini, D. Vernon, L. Natale, and F. Nori. 2008. The iCub humanoid robot: An open platform for research in embodied cognition. In Proceedings of 8th Workshop Performance Metrics for Intelligent Systems, 50–56.

Nakadai, K., T. Lourens, H.G. Okuno, and H. Kitano. 2000. Active audition for humanoid. In Proceedings 17th National Conference on Artificial Intelligence and 12th Conference on Innovative Applications of Artificial Intelligence, 832–839.

Okuno, H.G., K. Nakadai, K. Hidai, H. Mizoguchi, and H. Kitano. 2001. Human-robot interaction through real-time auditory and visual multiple-talker tracking. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 1402–1409.

Pastore, T., Y. Zhou, and A. Yost. 2020. Cross-modal and cognitive processes in sound localization. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 315–350. Cham, Switzerland: Springer and ASA Press.

Plinge, A., M.H. Hennecke, and G.A. Fink. 2012. Reverberation-robust online multi-speaker tracking by using a microphone array and CASA processing. In International Workshop on Acoustic Signal Enhancement (IWAENC).

Premakumar, P. 2016. A* (A star) search path planning tutorial. https://de.mathworks.com/matlabcentral/fileexchange/26248-a-a-star-search-for-path-planning-tutorial. Last accessed 18 Aug 2019.

Raake, A., and J. Blauert. 2013. Comprehensive modeling of the formation process of sound-quality. In 5th International Workshop Quality of Multimedia Experience (QoMEX, Klagenfurt), 76–81.

Ruesch, J., M. Lopes, A. Bernardino, J. Hörnstein, J. Santos-Victor, and R. Pfeifer. 2008. Multimodal saliency-based bottom-up attention a framework for the humanoid robot iCub. In IEEE International Conference on Robotics and Automation, 962–967.

Schauerte, B., B. Kühn, K. Kroschel, and R. Stiefelhagen. 2011. Multimodal saliency-based attention for object-based scene analysis. In IEEE International Conference on Intelligent Robots and Systems, 1173–1179.

Schauerte, B., and R. Stiefelhagen. 2013. “Wow!” Bayesian surprise for salient acoustic event detection. In IEEE International Conference on Acoustics, Speech and Signal Processing, 6402–6406.

Schymura, C., and D. Kolossa. 2020. Blackboard systems for modeling binaural understanding. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 91–111. Cham, Switzerland: Springer and ASA Press.

Sutojo, S., S. Van de Par, J. Thiemann, and A. Kohlrausch. 2020. Auditory Gestalt rules and their application. In The Technology of Binaural Understanding, eds. J. Blauert and J. Braasch, 33–59. Cham, Switzerland: Springer and ASA Press.

Sutton, R. 2018. Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA: The MIT Press.

Two!Ears. 2015. Specification of feedback loops and implementation progress. In Two!Ears Publications, ed. J. Blauert and T. Walther, Chap. Project deliverables, item d4.2, pp. 56–61, https://doi.org/10.5281/zenodo.2595224.

Two!Ears. 2016. Final integration-&-evaluation. In Two!Ears Publications, ed. J. Blauert and T. Walther, Chap. Project deliverables, item d4.3. https://doi.org/10.5281/zenodo.2591202.

von der Malsburg, C. 1999. The what and why of binding: The modeler’s perspective. Neuron 24: 95–104.

Walther, T., and B. Cohen-L’hyver. 2014. Multimodal feedback in auditory-based active scene exploration. In Proceedings of Forum Acusticum. Kraków, Poland.

Wang, D., and G.E. Brown. 2006. Computational auditory scene analysis: Principles, algorithms, and applications. IEEE Xplore. https://ieeexplore.ieee.org/document/4429320. Last accessed 18 Aug 2019.

WillowGarage. 2014. Open source computer vision library. https://ieeexplore.ieee.org/document/4429320. Last accessed 18 Aug 2019.

Title: A Virtual Testbed for Binaural Agents
Author: Jens Blauert
Publisher: Springer International Publishing
Book: The Technology of Binaural Understanding
Print ISBN: 978-3-030-00385-2

Electronic ISBN: 978-3-030-00386-9

Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-00386-9_17
