Published in: Autonomous Robots | 31-07-2019

Motion planning for robot audition

Authors: Quan V. Nguyen, Francis Colas, Emmanuel Vincent, François Charpillet

Published in: Autonomous Robots | Issue 8/2019

Abstract

Robot audition refers to a range of hearing capabilities that help robots explore and understand their environment. Among them, sound source localization is the problem of estimating the location of a sound source given measurements of its angle of arrival with respect to a microphone array mounted on the robot. In addition, robot motion can help quickly resolve the front-back ambiguity inherent in a linear microphone array. In this article, we focus on the problem of exploiting robot motion to improve the estimation of the location of an intermittent and possibly moving source in a noisy and reverberant environment. We first propose a robust extended mixture Kalman filtering framework for jointly estimating the source location and its activity over time. Building on this framework, we then propose a long-term robot motion planning algorithm based on Monte Carlo tree search to find an optimal robot trajectory according to two alternative criteria: the Shannon entropy or the standard deviation of the estimated belief on the source location. These criteria are integrated over time using a discount factor. Experimental results show the robustness of the proposed estimation framework to false angle-of-arrival measurements within \(\pm 20^{\circ}\) and to a 10% false source activity detection rate. The proposed robot motion planning technique achieves an average localization error 48.7% smaller than that of a one-step-ahead method. In addition, we compare the correlation between the estimation error and the two criteria, and investigate the effect of the discount factor on the performance of the proposed motion planning algorithm.
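
As a rough illustration of how criteria of this kind can be integrated over time with a discount factor, the Python sketch below scores two candidate robot trajectories by a discounted sum of either the Shannon entropy or a standard-deviation measure of the predicted belief. It is not the authors' implementation. It assumes a single 2-D Gaussian belief rather than the mixture belief described in the abstract, takes the standard-deviation criterion to be the square root of the covariance trace, weights step t by gamma**t starting at t = 0, and uses made-up covariance values. All function names are hypothetical.

```python
import math


def gaussian_entropy_2d(cov):
    """Differential Shannon entropy (nats) of a 2-D Gaussian with covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # H = 0.5 * ln((2*pi*e)^2 * det(cov)) for a 2-D Gaussian
    return 0.5 * math.log((2 * math.pi * math.e) ** 2 * det)


def position_std(cov):
    """Scalar spread of the belief: sqrt of the covariance trace (assumed definition)."""
    return math.sqrt(cov[0][0] + cov[1][1])


def discounted_cost(covs, gamma, criterion):
    """Sum of criterion(cov_t) weighted by gamma**t over the planning horizon."""
    return sum((gamma ** t) * criterion(cov) for t, cov in enumerate(covs))


if __name__ == "__main__":
    # Hypothetical predicted belief covariances along two candidate trajectories.
    traj_a = [[[4.0, 0.0], [0.0, 4.0]], [[2.0, 0.0], [0.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]]
    traj_b = [[[4.0, 0.0], [0.0, 4.0]], [[3.0, 0.0], [0.0, 3.0]], [[0.5, 0.0], [0.0, 0.5]]]
    for gamma in (0.5, 0.9):
        for name, criterion in (("entropy", gaussian_entropy_2d), ("std", position_std)):
            cost_a = discounted_cost(traj_a, gamma, criterion)
            cost_b = discounted_cost(traj_b, gamma, criterion)
            best = "A" if cost_a < cost_b else "B"
            print(f"gamma={gamma} {name}: A={cost_a:.3f} B={cost_b:.3f} -> prefer {best}")
```

In this sketch a small discount factor emphasizes early uncertainty reduction, while a value close to 1 gives more weight to the belief reached at the end of the horizon. A tree search planner would evaluate a cost of this kind for the trajectories it simulates.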

Title: Motion planning for robot audition
Authors: Quan V. Nguyen, Francis Colas, Emmanuel Vincent, François Charpillet
Publication date: 31-07-2019
Publisher: Springer US
Published in: Autonomous Robots / Issue 8/2019
Print ISSN: 0929-5593
Electronic ISSN: 1573-7527
DOI: https://doi.org/10.1007/s10514-019-09880-1
