2017 | OriginalPaper | Book chapter

4. Ambient Spectrum Estimation-Based Primary Ambient Extraction

Written by: JianJun He

Published in: Spatial Audio Reproduction with Primary Ambient Extraction

Publisher: Springer Singapore

Abstract

The diversity of today’s playback systems requires a flexible, efficient, and immersive reproduction of sound scenes in digital media. Spatial audio reproduction based on primary ambient extraction (PAE) fulfills this objective, where accurate extraction of primary and ambient components from sound mixtures in channel-based audio is crucial. Severe extraction error was found in existing PAE approaches when dealing with sound mixtures that contain a relatively strong ambient component, a commonly encountered case in the sound scenes of digital media. In this paper, we propose a novel ambient spectrum estimation (ASE) framework to improve the performance of PAE. The ASE framework exploits the equal magnitude of the uncorrelated ambient components in two channels of a stereo signal and reformulates the PAE problem into the problem of estimating either ambient phase or magnitude. In particular, we take advantage of the sparse characteristic of the primary components to derive sparse solutions for ASE-based PAE, together with an approximate solution that can significantly reduce the computational cost. Our objective and subjective experimental results demonstrate that the proposed ASE approaches significantly outperform existing approaches, especially when the ambient component is relatively strong.
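
The abstract's central idea, that equal ambient magnitudes reduce PAE to estimating the ambient phase, can be illustrated with a minimal sketch. It assumes a stereo mixing model that the abstract does not state: in each time-frequency bin, X0 = P + A0 and X1 = k*P + A1, where P is the primary component, k is a known primary panning factor, and |A0| = |A1|. Under these assumptions the difference X1 - k*X0 = A1 - k*A0 no longer contains the primary component, so once the two ambient phases are given, the shared ambient magnitude follows directly. The function names and the value of k are hypothetical, and the true phases are supplied here instead of being estimated, so this is not the chapter's sparse ASE solution.

```python
import numpy as np


def ambient_from_phases(x0, x1, k, theta0, theta1):
    """Recover equal-magnitude ambient spectra from given ambient phases.

    Assumed model (not stated in the abstract): x0 = p + a0, x1 = k*p + a1,
    with |a0| == |a1| in every bin. Requires k != 1 or theta0 != theta1,
    otherwise the denominator below is zero.
    """
    d = x1 - k * x0                                     # primary cancels: d = a1 - k*a0
    u = np.exp(1j * theta1) - k * np.exp(1j * theta0)   # same difference at unit magnitude
    r = np.abs(d) / np.abs(u)                           # shared ambient magnitude per bin
    return r * np.exp(1j * theta0), r * np.exp(1j * theta1)


def extract_primary(x0, a0):
    """Primary component in channel 0: the mixture minus its ambient part."""
    return x0 - a0


def make_example(n_bins=8, k=2.0, seed=0):
    """Synthesize one frame of a stereo mixture that follows the assumed model."""
    rng = np.random.default_rng(seed)
    p = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
    r = rng.uniform(0.5, 1.5, n_bins)                   # common ambient magnitude
    theta0 = rng.uniform(-np.pi, np.pi, n_bins)         # ambient phase, channel 0
    theta1 = rng.uniform(-np.pi, np.pi, n_bins)         # ambient phase, channel 1
    a0, a1 = r * np.exp(1j * theta0), r * np.exp(1j * theta1)
    return p + a0, k * p + a1, k, theta0, theta1, p
```

With the true phases from make_example, extract_primary(x0, a0) reproduces p up to rounding error. In practice the phases are unknown, which is why the problem becomes one of phase (or magnitude) estimation.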

Footnotes
1
The source code and demo tracks are available at http://jhe007.wix.com/main#!ambient-phase-estimation/cied.

Metadata
Title
Ambient Spectrum Estimation-Based Primary Ambient Extraction
Written by
JianJun He
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-1551-9_4
