2018 | OriginalPaper | Book chapter

1. Single-Channel Audio Source Separation with NMF: Divergences, Constraints and Algorithms

Authors: Cédric Févotte, Emmanuel Vincent, Alexey Ozerov

Published in: Audio Source Separation

Publisher: Springer International Publishing

Abstract

Spectral decomposition by nonnegative matrix factorisation (NMF) has become state-of-the-art practice in many audio signal processing tasks, such as source separation, enhancement or transcription. This chapter reviews the fundamentals of NMF-based audio decomposition, in unsupervised and informed settings. We formulate NMF as an optimisation problem and discuss the choice of the measure of fit. We present the standard majorisation-minimisation strategy to address optimisation for NMF with the common \(\beta \)-divergence, a family of measures of fit that takes the quadratic cost, the generalised Kullback-Leibler divergence and the Itakura-Saito divergence as special cases. We discuss the reconstruction of time-domain components from the spectral factorisation and present common variants of NMF-based spectral decomposition: supervised and informed settings, regularised versions, temporal models.
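As a rough illustration of the optimisation described in the abstract, the sketch below runs multiplicative updates of the majorisation-minimisation type for NMF with the \(\beta \)-divergence, where \(\beta = 2\), \(1\) and \(0\) give the quadratic cost, the generalised Kullback-Leibler divergence and the Itakura-Saito divergence. It is not taken from the chapter: the function names (`beta_divergence`, `nmf_beta`, `soft_masks`), the random initialisation, the small constant `eps` and the Wiener-like masking step are assumptions made for this example, and the input `V` is assumed to be a strictly positive spectrogram (frequency bins by time frames).

```python
import numpy as np


def beta_divergence(V, V_hat, beta):
    """Sum of element-wise beta-divergences d_beta(V | V_hat); V, V_hat > 0."""
    if beta == 0:  # Itakura-Saito divergence
        R = V / V_hat
        return np.sum(R - np.log(R) - 1.0)
    if beta == 1:  # generalised Kullback-Leibler divergence
        return np.sum(V * np.log(V / V_hat) - V + V_hat)
    # General case; beta = 2 gives half the squared Euclidean distance.
    num = V**beta + (beta - 1.0) * V_hat**beta - beta * V * V_hat**(beta - 1.0)
    return np.sum(num / (beta * (beta - 1.0)))


def nmf_beta(V, K, beta=1.0, n_iter=100, eps=1e-12, seed=0):
    """Approximate V (F x N) by W @ H with W (F x K) and H (K x N), all nonnegative."""
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, K)) + eps  # illustrative random initialisation
    H = rng.random((K, N)) + eps
    # Exponent applied to the multiplicative ratio in the MM-type update.
    if beta < 1:
        gamma = 1.0 / (2.0 - beta)
    elif beta > 2:
        gamma = 1.0 / (beta - 1.0)
    else:
        gamma = 1.0
    costs = []
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= ((W.T @ (V * V_hat**(beta - 2.0)))
              / (W.T @ V_hat**(beta - 1.0) + eps)) ** gamma
        V_hat = W @ H + eps
        W *= (((V * V_hat**(beta - 2.0)) @ H.T)
              / (V_hat**(beta - 1.0) @ H.T + eps)) ** gamma
        costs.append(beta_divergence(V, W @ H + eps, beta))
    return W, H, costs


def soft_masks(W, H, eps=1e-12):
    """Wiener-like masks of shape (K, F, N): the share of each rank-one
    component w_k h_k in the model W @ H (an assumed reconstruction convention)."""
    components = np.einsum("fk,kn->kfn", W, H)
    return components / (W @ H + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((64, 100)) + 0.1  # toy stand-in for a power spectrogram
    W, H, costs = nmf_beta(V, K=5, beta=0.0, n_iter=50)
    print("first cost:", costs[0], "last cost:", costs[-1])
    print("mask array shape:", soft_masks(W, H).shape)
```

Under these assumptions, a time-domain component would be obtained by multiplying each mask with the complex short-time Fourier transform of the mixture and inverting the transform; that step is omitted here because it depends on the chosen transform implementation.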

Metadata
Title
Single-Channel Audio Source Separation with NMF: Divergences, Constraints and Algorithms
Authors
Cédric Févotte
Emmanuel Vincent
Alexey Ozerov
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-73031-8_1
