2020 | OriginalPaper | Book chapter

HRTF Representation with Convolutional Auto-encoder

Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Dengshi Li

Published in: MultiMedia Modeling

Publisher: Springer International Publishing

Abstract

The head-related transfer function (HRTF) can be regarded as a filter that describes how a sound from an arbitrary spatial direction is transferred to the listener’s eardrums. HRTFs can be used to synthesize vivid virtual 3D sound that appears to come from any spatial location, which gives them an important role in 3D audio technology. However, the complexity and variation of the auditory cues inherent in HRTFs make it difficult to build an accurate mathematical model with conventional methods. In this paper, we propose an HRTF representation model based on a convolutional auto-encoder (CAE), a type of auto-encoder that contains convolutional layers in the encoder and deconvolution layers in the decoder. Experimental evaluation on the ARI HRTF database shows that the proposed model performs well on dimensionality reduction of HRTFs.
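The encoder/decoder idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the kernel sizes, stride, fixed (untrained) weights, and the 16-bin toy input are all assumptions chosen only to show how a strided convolution shrinks a response and how a transposed convolution ("deconvolution") expands it back to the original length.

```python
# Illustrative sketch only: layer sizes, kernels, and the toy input are
# assumptions, not the architecture or data from the paper.

def conv1d(signal, kernel, stride):
    """Strided 1D convolution (valid padding): the encoder's building block."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(0, len(signal) - k + 1, stride)
    ]


def conv_transpose1d(code, kernel, stride):
    """Transposed 1D convolution ("deconvolution"): the decoder's building block."""
    k = len(kernel)
    out = [0.0] * ((len(code) - 1) * stride + k)
    for i, value in enumerate(code):
        for j in range(k):
            out[i * stride + j] += value * kernel[j]
    return out


# Hypothetical 16-bin magnitude response standing in for one HRTF.
hrtf = [float(n % 4) for n in range(16)]

# Untrained fixed kernels; a real CAE would learn these weights.
encoder_kernel = [0.25, 0.25, 0.25, 0.25]
decoder_kernel = [1.0, 1.0, 1.0, 1.0]

code = conv1d(hrtf, encoder_kernel, stride=4)                       # 16 values -> 4
reconstruction = conv_transpose1d(code, decoder_kernel, stride=4)   # 4 values -> 16

print(len(hrtf), len(code), len(reconstruction))
```

With these toy settings the 16-value input is reduced to a 4-value code and expanded back to 16 values. In a trained CAE the kernels would be learned by minimizing the reconstruction error, and the low-dimensional code would serve as the HRTF representation.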

Title: HRTF Representation with Convolutional Auto-encoder
Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Dengshi Li
Publisher: Springer International Publishing
Book: MultiMedia Modeling
Print ISBN: 978-3-030-37730-4
Electronic ISBN: 978-3-030-37731-1
Copyright year: 2020
DOI: https://doi.org/10.1007/978-3-030-37731-1_49
