2017 | OriginalPaper | Book chapter

On the Use of Latent Mixing Filters in Audio Source Separation

Written by: Laurent Girin, Roland Badeau

Published in: Latent Variable Analysis and Signal Separation

Publisher: Springer International Publishing

Abstract

In this paper, we consider the underdetermined convolutive audio source separation (UCASS) problem. In the short-time Fourier transform (STFT) domain, we treat both the source signals and the mixing filters as latent random variables, and we propose to estimate each source image, i.e. each individual source-filter product, by its posterior mean. Although this is a fairly straightforward application of Bayesian estimation theory, to our knowledge there exists no similar study in the UCASS context. We discuss the interest of this estimator in this context and compare it with the conventional Wiener filter in a semi-oracle configuration.
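
To make the baseline concrete, the conventional Wiener estimate of the source images can be sketched for a single time-frequency bin. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-microphone, three-source setup, the fixed mixing vectors, the source variances, the white sensor noise and all function names are hypothetical values chosen for the example. With fixed filters and known variances, the estimate of image j is v_j a_j a_j^H Sigma_x^{-1} x, where Sigma_x is the mixture covariance.

```python
# Toy multichannel Wiener filter for one time-frequency bin (illustrative only).
# Assumptions (not from the paper): 2 microphones, 3 sources, known fixed
# mixing vectors a_j, known source variances v_j, white sensor noise.

def outer(a):
    """Rank-1 matrix a a^H for a length-2 complex vector."""
    return [[a[i] * a[k].conjugate() for k in range(2)] for i in range(2)]

def mat_add(A, B):
    return [[A[i][k] + B[i][k] for k in range(2)] for i in range(2)]

def mat_scale(c, A):
    return [[c * A[i][k] for k in range(2)] for i in range(2)]

def mat_inv(A):
    """Inverse of a 2x2 complex matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def mat_vec(A, x):
    return [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]

def wiener_source_images(x, filters, variances, noise_var):
    """Return the Wiener estimate of each source image y_j = a_j s_j."""
    # Mixture covariance: sum_j v_j a_j a_j^H + noise_var * I
    sigma_x = [[noise_var, 0j], [0j, noise_var]]
    for a, v in zip(filters, variances):
        sigma_x = mat_add(sigma_x, mat_scale(v, outer(a)))
    whitened = mat_vec(mat_inv(sigma_x), x)  # Sigma_x^{-1} x
    images = [mat_vec(mat_scale(v, outer(a)), whitened)
              for a, v in zip(filters, variances)]
    return images, whitened

# Hypothetical example values
filters = [[1 + 0j, 0.5 + 0.2j], [0.7 - 0.1j, 1 + 0j], [0.9 + 0.3j, 0.9 - 0.3j]]
variances = [1.0, 0.5, 2.0]
noise_var = 0.01
x = [0.8 - 0.4j, 1.1 + 0.3j]

images, whitened = wiener_source_images(x, filters, variances, noise_var)
```

By construction, the three image estimates plus the noise estimate `noise_var * whitened` sum back to the mixture `x`, which gives a quick sanity check on the implementation.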

Footnotes
1
A source image is defined as the multichannel version of the source signal, as recorded at the microphones [5].
 
2
Considering the filters as latent variables enables us to make them depend on the time frame \(\ell \) at no additional cost compared to frame-independent latent filters \(\mathbf {A}_{f}\), given that both models have the same set of parameters. This also comes at a much lower cost than the parametric case. However, this does not necessarily mean that we have “trajectories” of filters, as for the moving sources or moving sensors in [7, 9]. It simply allows the realization of the filters to differ from frame to frame, e.g. to model slight movements of the sources around their mean position. In the following, \(\mathbf {a}_{j,f\ell }\) is assumed wide-sense stationary (WSS) along \(\ell \), hence its mean and covariance matrix do not depend on \(\ell \).
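
The point that frame-dependent realizations share one set of parameters can be illustrated with a small sampling sketch. It assumes, purely for brevity, a diagonal covariance (independent channels) and independent draws across frames; the function name, the mean vector and the variances are hypothetical and not taken from the paper.

```python
import random

def sample_frame_filters(mean, variances, num_frames, seed=0):
    """Draw one filter vector per frame from a proper complex Gaussian.

    Assumption: diagonal covariance (independent channels) and independent
    frames, so the only parameters are `mean` and `variances`, shared by
    all frames.
    """
    rng = random.Random(seed)
    frames = []
    for _ in range(num_frames):
        vec = []
        for mu, var in zip(mean, variances):
            # Circular noise: real and imaginary parts each carry var / 2
            std = (var / 2) ** 0.5
            vec.append(mu + complex(rng.gauss(0, std), rng.gauss(0, std)))
        frames.append(vec)
    return frames

# Hypothetical parameters for a 2-channel filter at one frequency bin
mean = [1 + 0j, 0.6 - 0.2j]
variances = [0.01, 0.01]
frame_filters = sample_frame_filters(mean, variances, num_frames=4000)

# The per-frame realizations differ, but their average approaches the mean
empirical_mean = [sum(f[i] for f in frame_filters) / len(frame_filters)
                  for i in range(2)]
```

Each frame gets its own realization, yet the number of parameters is the same as for a single frame-independent latent filter.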
 
3
The proper complex Gaussian distribution is defined as \(\mathcal {N}_c(\mathbf {x};{\varvec{\mu }},{\varvec{\varSigma }}) = {|\pi {\varvec{\varSigma }}|^{-1}} \exp \big ( - [\mathbf {x}-{\varvec{\mu }}]^{\text {H}}{\varvec{\varSigma }}^{-1} [\mathbf {x}-{\varvec{\mu }}] \big )\), where |.| denotes the matrix determinant [14].
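
As a sanity check on the definition above, the density can be evaluated directly. The sketch below is restricted to 2-dimensional vectors so that the determinant and inverse can be written out by hand; the function name and the example values are assumptions made for illustration.

```python
import math

def proper_complex_gaussian_pdf(x, mu, sigma):
    """Density N_c(x; mu, Sigma) for 2-dimensional complex vectors."""
    # The determinant of a Hermitian positive-definite matrix is real and positive
    det = (sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]).real
    inv = [[sigma[1][1] / det, -sigma[0][1] / det],
           [-sigma[1][0] / det, sigma[0][0] / det]]
    d = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form [x - mu]^H Sigma^{-1} [x - mu]
    quad = sum(d[i].conjugate() * inv[i][k] * d[k]
               for i in range(2) for k in range(2)).real
    # |pi Sigma| = pi^2 |Sigma| in dimension 2
    return math.exp(-quad) / (math.pi ** 2 * det)

# Hypothetical Hermitian covariance and evaluation point
sigma = [[2, 1j], [-1j, 2]]
density = proper_complex_gaussian_pdf([0.5 + 0.5j, -0.2j], [0j, 0j], sigma)
```

Note the absence of the factor 1/2 in the exponent and the power of \(\pi \) equal to the dimension, which distinguish this form from the real Gaussian density.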
 
6
So far, no statistical test could be performed on a large set of mixtures to assess the significance of the results, because of the huge computational cost of the Metropolis algorithm.
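
To give a feel for why sampling-based posterior means are expensive, here is a toy random-walk Metropolis sketch. It is not the authors' sampler: it assumes a single real-valued observation x = a*s + noise with Gaussian priors on the latent filter a and source s, and every name, prior value and step size is hypothetical. Even this scalar toy needs thousands of iterations for one estimate, whereas the Wiener estimate with the filter fixed to its mean is a closed-form gain.

```python
import math
import random

def log_posterior(a, s, x, mu_a, var_a, var_s, var_b):
    """Unnormalised log p(a, s | x) for x = a*s + noise (real scalars)."""
    return (-(x - a * s) ** 2 / (2 * var_b)
            - (a - mu_a) ** 2 / (2 * var_a)
            - s ** 2 / (2 * var_s))

def metropolis_posterior_mean(x, mu_a=1.0, var_a=0.01, var_s=1.0, var_b=0.01,
                              num_iter=20000, burn_in=2000, step=0.1, seed=0):
    """Estimate E[a*s | x] with a random-walk Metropolis chain over (a, s)."""
    rng = random.Random(seed)
    a, s = mu_a, x / mu_a  # start near a plausible solution
    logp = log_posterior(a, s, x, mu_a, var_a, var_s, var_b)
    total, kept = 0.0, 0
    for it in range(num_iter):
        a_new = a + rng.gauss(0, step)
        s_new = s + rng.gauss(0, step)
        logp_new = log_posterior(a_new, s_new, x, mu_a, var_a, var_s, var_b)
        # Accept with probability min(1, p_new / p_old)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            a, s, logp = a_new, s_new, logp_new
        if it >= burn_in:
            total += a * s  # accumulate the source-filter product
            kept += 1
    return total / kept

def wiener_estimate(x, mu_a=1.0, var_s=1.0, var_b=0.01):
    """Wiener estimate of the image a*s with the filter fixed to its mean."""
    gain = mu_a ** 2 * var_s / (mu_a ** 2 * var_s + var_b)
    return gain * x

posterior_mean = metropolis_posterior_mean(1.0)
wiener = wiener_estimate(1.0)
```

In a real separation task the chain would have to be run for every time-frequency bin and with vector-valued filters, which is consistent with the cost mentioned in the footnote.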
 
References
1.
Vincent, E., Jafari, M., Abdallah, S., Plumbley, M., Davies, M.: Probabilistic modeling paradigms for audio source separation. In: Machine Audition: Principles, Algorithms and Systems, pp. 162–185 (2010)
2.
Ozerov, A., Févotte, C.: Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18(3), 550–563 (2010)
3.
Duong, N., Vincent, E., Gribonval, R.: Under-determined reverberant audio source separation using a full-rank spatial covariance model. IEEE Trans. Audio Speech Lang. Process. 18(7), 1830–1840 (2010)
4.
Ozerov, A., Vincent, E., Bimbot, F.: A general flexible framework for the handling of prior information in audio source separation. IEEE Trans. Audio Speech Lang. Process. 20(4), 1118–1133 (2012)
5.
Sturmel, N., Liutkus, A., Pinel, J., Girin, L., Marchand, S., Richard, G., Badeau, R., Daudet, L.: Linear mixing models for active listening of music productions in realistic studio conditions. In: Convention of the Audio Engineering Society (AES). Budapest, Hungary (2012)
6.
Duong, N., Vincent, E., Gribonval, R.: Spatial location priors for Gaussian model based reverberant audio source separation. EURASIP J. Adv. Signal Process. 149, 2013 (2013)
7.
Higuchi, T., Takamune, N., Tomohiko, N., Kameoka, H.: Underdetermined blind separation and tracking of moving sources based on DOA-HMM. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2014)
8.
Leglaive, S., Badeau, R., Richard, G.: Multichannel audio source separation with probabilistic reverberant modeling. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2015)
9.
Kounades-Bastian, D., Girin, L., Alameda-Pineda, X., Gannot, S., Horaud, R.: A variational EM algorithm for the separation of moving sound sources. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2015)
10.
Smidl, V., Quinn, A.: The Variational Bayes Method in Signal Processing. Springer, Berlin (2006)
11.
Cemgil, A., Févotte, C., Godsill, S.: Variational and stochastic inference for Bayesian source separation. Digit. Signal Process. 17, 891–913 (2007)
12.
Liang, F., Liu, C., Carroll, R.: Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples. Wiley, New York (2010)
13.
Gannot, S., Moonen, M.: On the application of the unscented Kalman filter to speech processing. In: IEEE International Workshop on Acoustic Echo and Noise Control (IWAENC), Kyoto, Japan, p. 811 (2003)
14.
Neeser, F., Massey, J.: Proper complex random processes with applications to information theory. IEEE Trans. Inf. Theory 39(4), 1293–1302 (1993)
15.
16.
Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., Dahlgren, N.L., Zue, V.: TIMIT acoustic-phonetic continuous speech corpus. In: Linguistic Data Consortium, Philadelphia (1993)
17.
Vincent, E., Sawada, H., Bofill, P., Makino, S., Rosca, J.P.: First stereo audio source separation evaluation campaign: data, algorithms and results. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds.) ICA 2007. LNCS, vol. 4666, pp. 552–559. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74494-8_69
Metadata
Title
On the Use of Latent Mixing Filters in Audio Source Separation
Written by
Laurent Girin
Roland Badeau
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-53547-0_22