2020 | OriginalPaper | Chapter

HRTF Representation with Convolutional Auto-encoder

Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Dengshi Li

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

The head-related transfer function (HRTF) can be regarded as a filter that describes how a sound from an arbitrary spatial direction travels to the listener’s eardrums. HRTFs can be used to synthesize vivid virtual 3D sound that appears to come from any spatial location, which gives them an important role in 3D audio technology. However, the complexity and variation of the auditory cues inherent in HRTFs make it difficult to build an accurate mathematical model with conventional methods. In this paper, we propose an HRTF representation model based on a convolutional auto-encoder (CAE), a type of auto-encoder that contains convolutional layers in the encoder and deconvolution layers in the decoder. Experimental evaluation on the ARI HRTF database shows that the proposed model provides very good results on dimensionality reduction of HRTFs.
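
To make the encoder/decoder structure described in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' model. It assumes a single strided convolution as the encoder and a single transposed convolution ("deconvolution") as the decoder, with fixed, untrained weights. The 8-bin input, the kernel values, the stride, and the function names `conv1d` and `conv_transpose1d` are all hypothetical choices for illustration; a real CAE would stack several such layers with nonlinearities and learn the kernels from HRTF data.

```python
def conv1d(signal, kernel, stride):
    """Strided 1-D convolution without padding (encoder-style downsampling)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def conv_transpose1d(code, kernel, stride):
    """Transposed 1-D convolution without padding (decoder-style upsampling)."""
    k = len(kernel)
    # Output length for no padding: (len(code) - 1) * stride + kernel size.
    out = [0.0] * ((len(code) - 1) * stride + k)
    for i, value in enumerate(code):
        for j in range(k):
            out[i * stride + j] += value * kernel[j]
    return out


# Hypothetical 8-bin magnitude response; measured HRTFs have many more bins.
hrtf = [1.0, 2.0, 3.0, 4.0, 4.0, 3.0, 2.0, 1.0]
kernel = [0.5, 0.5]  # assumed fixed weights; a trained CAE learns these

code = conv1d(hrtf, kernel, stride=2)              # compressed representation
recon = conv_transpose1d(code, kernel, stride=2)   # back to the input length

print(len(hrtf), len(code), len(recon))
```

Here the encoder halves the number of values and the decoder maps the code back to the original length. With untrained weights the reconstruction is only a coarse approximation of the input; training would adjust the kernels to minimize the reconstruction error.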

Title: HRTF Representation with Convolutional Auto-encoder
Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Dengshi Li
Publisher: Springer International Publishing
Book: MultiMedia Modeling
Print ISBN: 978-3-030-37730-4

Electronic ISBN: 978-3-030-37731-1

Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-37731-1_49
