2017 | OriginalPaper | Chapter

3. Linear Estimation-Based Primary Ambient Extraction

Author: JianJun He

Published in: Spatial Audio Reproduction with Primary Ambient Extraction

Publisher: Springer Singapore

Abstract

Audio signals for moving pictures and video games are often linear combinations of primary and ambient components. In spatial audio analysis–synthesis, these mixed signals are usually decomposed into primary and ambient components to facilitate flexible spatial rendering and enhancement. Existing approaches such as principal component analysis (PCA) and least squares (LS) are widely used to perform this decomposition from stereo signals. However, the performance of these approaches in primary ambient extraction (PAE) has not been well studied, and no comparative analysis among the existing approaches has been carried out so far. In this paper, we generalize the existing approaches into a linear estimation framework. Under this framework, we propose a series of performance measures to identify the components that contribute to the extraction error. Based on the generalized linear estimation framework and our proposed performance measures, a comparative study and experimental testing of the linear estimation-based PAE approaches including existing PCA, LS, and three proposed variant LS approaches are presented.
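As a rough illustration of the PCA-based decomposition mentioned above, the following minimal sketch splits a stereo frame into primary and ambient parts. It is not the author's implementation. It assumes the commonly used stereo model x0 = p0 + a0, x1 = k·p0 + a1, where the primary component is amplitude-panned by a factor k and the ambient components are uncorrelated with it and with each other. The function name `pca_pae` and the test signal are hypothetical.

```python
import math
import random


def pca_pae(x0, x1):
    """Split a zero-mean stereo frame into primary and ambient parts via 2x2 PCA.

    Returns (p0, p1, a0, a1): primary and ambient estimates per channel.
    """
    n = len(x0)
    # Zero-lag auto- and cross-correlations of the two channels.
    r00 = sum(a * a for a in x0) / n
    r11 = sum(b * b for b in x1) / n
    r01 = sum(a * b for a, b in zip(x0, x1)) / n

    # Largest eigenvalue of [[r00, r01], [r01, r11]] in closed form.
    lam = 0.5 * (r00 + r11 + math.sqrt((r00 - r11) ** 2 + 4.0 * r01 ** 2))

    # Matching eigenvector (r01, lam - r00), normalised to unit length.
    u0, u1 = r01, lam - r00
    norm = math.hypot(u0, u1)
    if norm == 0.0:
        # Uncorrelated channels with channel 0 at least as strong: use its axis.
        u0, u1 = 1.0, 0.0
    else:
        u0, u1 = u0 / norm, u1 / norm

    p0, p1, a0, a1 = [], [], [], []
    for s0, s1 in zip(x0, x1):
        c = u0 * s0 + u1 * s1      # coordinate along the principal direction
        p0.append(c * u0)          # primary estimate: projection onto u
        p1.append(c * u1)
        a0.append(s0 - c * u0)     # ambient estimate: the residual
        a1.append(s1 - c * u1)
    return p0, p1, a0, a1


if __name__ == "__main__":
    # Hypothetical test frame: primary panned with k = 2 plus weak ambience.
    random.seed(0)
    primary = [random.gauss(0.0, 1.0) for _ in range(1000)]
    left = [p + random.gauss(0.0, 0.1) for p in primary]
    right = [2.0 * p + random.gauss(0.0, 0.1) for p in primary]
    p0, p1, a0, a1 = pca_pae(left, right)
    print("primary energy:", sum(v * v for v in p0 + p1))
    print("ambient energy:", sum(v * v for v in a0 + a1))
```

In this sketch the primary estimate is the projection of each stereo sample onto the dominant eigenvector of the channel correlation matrix, and the ambient estimate is whatever remains. The two estimates therefore always sum back to the input, which is one reason the extraction error of such linear estimators can be analysed component by component.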

Metadata

Title: Linear Estimation-Based Primary Ambient Extraction
Author: JianJun He
Copyright Year: 2017
Publisher: Springer Singapore
DOI: https://doi.org/10.1007/978-981-10-1551-9_3