2017 | Original Paper | Book Chapter

On the Use of Latent Mixing Filters in Audio Source Separation

Authors: Laurent Girin, Roland Badeau

Published in: Latent Variable Analysis and Signal Separation

Publisher: Springer International Publishing

Abstract

In this paper, we consider the underdetermined convolutive audio source separation (UCASS) problem. In the short-time Fourier transform (STFT) domain, we treat both the source signals and the mixing filters as latent random variables, and we propose to estimate each source image, i.e. each individual source-filter product, by its posterior mean. Although this is a fairly straightforward application of Bayesian estimation theory, to our knowledge no similar study exists in the UCASS context. We discuss the relevance of this estimator in this context and compare it with the conventional Wiener filter in a semi-oracle configuration.
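For orientation, the conventional Wiener baseline mentioned in the abstract can be sketched for a single time-frequency bin. This is a minimal illustration and not the authors' implementation: it assumes fixed, known mixing vectors, a rank-1 image covariance per source, and white sensor noise. The function name `wiener_source_images` and all parameter names are hypothetical.

```python
import numpy as np

def wiener_source_images(x, filters, source_vars, noise_var):
    """Wiener estimates of the source images a_j * s_j in one TF bin.

    x: (I,) mixture vector; filters: (J, I) fixed mixing vectors a_j (assumed known);
    source_vars: (J,) source variances v_j; noise_var: sensor noise variance.
    """
    x = np.asarray(x, dtype=complex)
    filters = np.asarray(filters, dtype=complex)
    source_vars = np.asarray(source_vars, dtype=float)
    # Rank-1 image covariances v_j a_j a_j^H, shape (J, I, I)
    outer = np.einsum("ji,jk->jik", filters, filters.conj())
    image_covs = source_vars[:, None, None] * outer
    # Mixture covariance: sum of image covariances plus white noise
    mix_cov = image_covs.sum(axis=0) + noise_var * np.eye(x.size)
    # Each estimate is Sigma_j Sigma_x^{-1} x; solve once, reuse for all sources
    whitened = np.linalg.solve(mix_cov, x)
    return image_covs @ whitened  # shape (J, I)
```

The sketch works with more sources than microphones (J > I), which is the underdetermined case. The latent-filter estimator studied in the chapter differs in that the filters themselves are random, so the posterior mean of the product generally has no such closed form.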

A source image is defined as the multichannel version of the source signal, as recorded at the microphones [5].

Considering the filters as latent variables enables us to make them depend on the time frame \(\ell \) at no additional cost compared to frame-independent latent filters \(\mathbf {A}_{f}\), given that both models have the same set of parameters. This also comes at a much lower cost than the parametric case. However, this does not necessarily mean that we have “trajectories” of filters, as for the moving sources or moving sensors in [7, 9]. It simply allows the realization of the filters to differ from frame to frame, e.g. to model slight movements of the sources around their mean position. In the following, \(\mathbf {a}_{j,f\ell }\) is assumed wide-sense stationary (WSS) along \(\ell \), hence its mean and covariance matrix do not depend on \(\ell \).
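The idea of frame-dependent realizations with frame-independent parameters can be illustrated with a small simulation. This is only a sketch under assumptions not stated in this excerpt: the filters are drawn independently per frame from a proper complex Gaussian, and the numerical values in the usage note are hypothetical. The helper names are likewise illustrative.

```python
import numpy as np

def sample_proper_complex_gaussian(rng, mu, sigma, n_draws):
    """Draw n_draws row vectors from N_c(mu, Sigma) using a Cholesky factor of Sigma."""
    mu = np.asarray(mu, dtype=complex)
    chol = np.linalg.cholesky(np.asarray(sigma, dtype=complex))
    # Standard proper complex noise: real and imaginary parts each have variance 1/2
    z = (rng.standard_normal((n_draws, mu.size))
         + 1j * rng.standard_normal((n_draws, mu.size))) / np.sqrt(2.0)
    # Row-vector form of w = L z, so that E[w w^H] = L L^H = Sigma
    return mu + z @ chol.T

def simulate_source_images(n_frames, mu_a, sigma_a, source_var, seed=0):
    """One filter realization per frame; mean and covariance are shared by all frames."""
    rng = np.random.default_rng(seed)
    filters = sample_proper_complex_gaussian(rng, mu_a, sigma_a, n_frames)          # (frames, mics)
    source = sample_proper_complex_gaussian(rng, [0.0], [[source_var]], n_frames)   # (frames, 1)
    return filters, filters * source  # source image = filter times source, per frame
```

For example, `simulate_source_images(2000, [1.0, 0.5 - 0.5j], 0.01 * np.eye(2), 1.0)` yields filters that fluctuate slightly around their mean in every frame, while the model still has only one mean vector and one covariance matrix to describe them.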

The proper complex Gaussian distribution is defined as \(\mathcal {N}_c(\mathbf {x};{\varvec{\mu }},{\varvec{\varSigma }}) = {|\pi {\varvec{\varSigma }}|^{-1}} \exp \big ( - [\mathbf {x}-{\varvec{\mu }}]^{\text {H}}{\varvec{\varSigma }}^{-1} [\mathbf {x}-{\varvec{\mu }}] \big )\), where |.| denotes the matrix determinant [14].
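The density above can be evaluated numerically in the log domain. The following is a minimal sketch that assumes \({\varvec{\varSigma }}\) is Hermitian positive definite; the function name is hypothetical. It uses \(\log |\pi {\varvec{\varSigma }}| = n \log \pi + \log |{\varvec{\varSigma }}|\) for an \(n \times n\) matrix.

```python
import numpy as np

def proper_complex_gaussian_logpdf(x, mu, sigma):
    """Log of N_c(x; mu, Sigma) for a Hermitian positive-definite Sigma."""
    x = np.asarray(x, dtype=complex)
    mu = np.asarray(mu, dtype=complex)
    sigma = np.asarray(sigma, dtype=complex)
    d = x - mu
    # log|Sigma| via slogdet, which is numerically safer than log(det(Sigma))
    _, logdet = np.linalg.slogdet(sigma)
    # Quadratic form d^H Sigma^{-1} d; vdot conjugates its first argument
    quad = np.real(np.vdot(d, np.linalg.solve(sigma, d)))
    return -(d.size * np.log(np.pi) + logdet) - quad
```

Note that, unlike the real Gaussian, there is no factor 1/2 in the exponent and no square root on the determinant.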

Matlab code and data are available at: www.gipsa-lab.grenoble-inp.fr/~laurent.girin/demo/lva2017.zip.

www.audiolabs-erlangen.de/fau/professor/habets/software/rir-generator.

So far, no statistical test could be performed on a large set of mixtures to assess the significance of the results, because of the high computational cost of the Metropolis algorithm.

1. Vincent, E., Jafari, M., Abdallah, S., Plumbley, M., Davies, M.: Probabilistic modeling paradigms for audio source separation. In: Machine Audition: Principles, Algorithms and Systems, pp. 162–185 (2010)

2. Ozerov, A., Févotte, C.: Multichannel nonnegative matrix factorization in convolutive mixtures for audio source separation. IEEE Trans. Audio Speech Lang. Process. 18(3), 550–563 (2010)

3. Duong, N., Vincent, E., Gribonval, R.: Under-determined reverberant audio source separation using a full-rank spatial covariance model. IEEE Trans. Audio Speech Lang. Process. 18(7), 1830–1840 (2010)

4. Ozerov, A., Vincent, E., Bimbot, F.: A general flexible framework for the handling of prior information in audio source separation. IEEE Trans. Audio Speech Lang. Process. 20(4), 1118–1133 (2012)

5. Sturmel, N., Liutkus, A., Pinel, J., Girin, L., Marchand, S., Richard, G., Badeau, R., Daudet, L.: Linear mixing models for active listening of music productions in realistic studio conditions. In: Convention of the Audio Engineering Society (AES). Budapest, Hungary (2012)

6. Duong, N., Vincent, E., Gribonval, R.: Spatial location priors for Gaussian model based reverberant audio source separation. EURASIP J. Adv. Signal Process. 149, 2013 (2013)

7. Higuchi, T., Takamune, N., Tomohiko, N., Kameoka, H.: Underdetermined blind separation and tracking of moving sources based on DOA-HMM. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2014)

8. Leglaive, S., Badeau, R., Richard, G.: Multichannel audio source separation with probabilistic reverberant modeling. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2015)

9. Kounades-Bastian, D., Girin, L., Alameda-Pineda, X., Gannot, S., Horaud, R.: A variational EM algorithm for the separation of moving sound sources. In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) (2015)

10. Smidl, V., Quinn, A.: The Variational Bayes Method in Signal Processing. Springer, Berlin (2006)

11. Cemgil, A., Févotte, C., Godsill, S.: Variational and stochastic inference for Bayesian source separation. Digit. Signal Proc. 2007(17), 891–913 (2007)

12. Liang, F., Liu, C., Carroll, R.: Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples. Wiley, New York (2010)

13. Gannot, S., Moonen, M.: On the application of the unscented Kalman filter to speech processing. In: IEEE International Workshop on Acoustic Echo and Noise Control (IWAENC), Kyoto, Japan, p. 811 (2003)

14. Neeser, F., Massey, J.: Proper complex random processes with applications to information theory. IEEE Trans. Info. Theory 39(4), 1293–1302 (1993)

15. Horn, R., Johnson, C.: Matrix Analysis. Cambridge University Press, Cambridge (1985)

16. Garofolo, J.S., Lamel, L.F., Fisher, W.M., Fiscus, J.G., Pallett, D.S., Dahlgren, N.L., Zue, V.: TIMIT acoustic-phonetic continuous speech corpus. In: Linguistic Data Consortium, Philadelphia (1993)

17. Vincent, E., Sawada, H., Bofill, P., Makino, S., Rosca, J.P.: First stereo audio source separation evaluation campaign: data, algorithms and results. In: Davies, M.E., James, C.J., Abdallah, S.A., Plumbley, M.D. (eds.) ICA 2007. LNCS, vol. 4666, pp. 552–559. Springer, Heidelberg (2007). doi:10.1007/978-3-540-74494-8_69

Title: On the Use of Latent Mixing Filters in Audio Source Separation
Authors: Laurent Girin, Roland Badeau
Publisher: Springer International Publishing
Book: Latent Variable Analysis and Signal Separation
Print ISBN: 978-3-319-53546-3
Electronic ISBN: 978-3-319-53547-0
Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-53547-0_22
