
2020 | OriginalPaper | Book chapter

HRTF Representation with Convolutional Auto-encoder

Authors: Wei Chen, Ruimin Hu, Xiaochen Wang, Dengshi Li

Published in: MultiMedia Modeling

Publisher: Springer International Publishing


Abstract

The head-related transfer function (HRTF) can be regarded as a filter that describes how a sound from an arbitrary spatial direction is transferred to the listener’s eardrums. HRTFs can be used to synthesize vivid virtual 3D sound that seems to come from any spatial location, which gives them an important role in 3D audio technology. However, the complexity and variation of the auditory cues inherent in HRTFs make it difficult to build an accurate mathematical model with conventional methods. In this paper, we put forward an HRTF representation model based on a convolutional auto-encoder (CAE), a type of auto-encoder that contains convolutional layers in the encoder and deconvolution layers in the decoder. The experimental evaluation on the ARI HRTF database shows that the proposed model provides very good results on dimensionality reduction of HRTFs.
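To make the encoder/decoder structure described above concrete, the following is a minimal sketch of how strided convolutions can shrink a response to a short code and how transposed convolutions (deconvolutions) can restore its length. Everything specific here is an assumption for illustration: the kernel values, the stride of 2, the two-layer depth, the 16-bin toy input, and the function names `conv1d`, `deconv1d`, `encode`, and `decode` are hypothetical and are not taken from the paper. A real CAE would learn its kernels by minimizing reconstruction error and would include nonlinearities.

```python
# Illustrative sketch only: kernel values, sizes, and strides are assumptions,
# not the architecture or weights used in the paper.

AVG_KERNEL = [0.5, 0.5]       # hypothetical fixed encoder kernel (untrained)
UPSAMPLE_KERNEL = [1.0, 1.0]  # hypothetical fixed decoder kernel (untrained)


def conv1d(x, kernel, stride):
    """Strided 1-D convolution without padding (cross-correlation form)."""
    k = len(kernel)
    n_out = (len(x) - k) // stride + 1
    return [
        sum(x[i * stride + j] * kernel[j] for j in range(k))
        for i in range(n_out)
    ]


def deconv1d(z, kernel, stride):
    """Transposed 1-D convolution: each input sample scatters a scaled kernel."""
    k = len(kernel)
    out = [0.0] * ((len(z) - 1) * stride + k)
    for i, value in enumerate(z):
        for j in range(k):
            out[i * stride + j] += value * kernel[j]
    return out


def encode(x):
    """Two strided convolutional layers, each halving the length."""
    hidden = conv1d(x, AVG_KERNEL, stride=2)
    return conv1d(hidden, AVG_KERNEL, stride=2)


def decode(z):
    """Two transposed convolutional layers, each doubling the length."""
    hidden = deconv1d(z, UPSAMPLE_KERNEL, stride=2)
    return deconv1d(hidden, UPSAMPLE_KERNEL, stride=2)


# Toy stand-in for a 16-bin HRTF magnitude response (made-up values).
hrtf = [float(i % 4) for i in range(16)]
code = encode(hrtf)    # low-dimensional representation
recon = decode(code)   # same length as the input again

print(len(hrtf), len(code), len(recon))
```

With these assumed settings the 16-sample input is reduced to a 4-value code and expanded back to 16 samples. Because the kernels are fixed averages rather than trained weights, the reconstruction only recovers the local mean of the input; the sketch shows the dimensionality flow of such a model, not its accuracy.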

Metadata
Title
HRTF Representation with Convolutional Auto-encoder
Authors
Wei Chen
Ruimin Hu
Xiaochen Wang
Dengshi Li
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-37731-1_49
