2018 | OriginalPaper | Chapter

2. Separation of Known Sources Using Non-negative Spectrogram Factorisation

Authors: Tuomas Virtanen, Tom Barker

Published in: Audio Source Separation

Publisher: Springer International Publishing

Abstract

This chapter presents non-negative spectrogram factorisation (NMF) techniques that can be used to separate sources when source-specific training material is available in advance. We first present the basic NMF formulation for sound mixtures, followed by criteria and algorithms for estimating the model parameters. We then introduce selected methods for training the NMF source models using vector quantisation, convexity constraints, archetypal analysis, or discriminative methods. We also explain how the learned dictionaries can be adapted to deal with mismatches between the training data and the usage scenario. We further show how semi-supervised learning can be used to deal with unknown noise sources within a mixture. Finally, we introduce a coupled NMF method that can model a large temporal context while retaining low algorithmic latency.
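
As a rough illustration of the supervised setting the abstract describes, the following minimal sketch separates a mixture spectrogram using source dictionaries that are assumed to have been trained beforehand. It is not taken from the chapter: the function names `estimate_activations` and `separate` are hypothetical, the generalised Kullback-Leibler divergence with multiplicative updates is one common choice among several, and the soft-mask reconstruction is an assumption about how the estimated source models are applied to the mixture.

```python
import numpy as np

def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that V ~= W @ H.

    The dictionary W is held fixed (assumed pre-trained); only H is updated,
    using multiplicative updates for the generalised KL divergence.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps   # positive initialisation
    col_sums = W.sum(axis=0)[:, None] + eps          # W^T 1, one value per atom
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)                    # element-wise V / model
        H *= (W.T @ ratio) / col_sums                # keeps H non-negative
    return H

def separate(V, dictionaries, n_iter=200, eps=1e-12):
    """Split a non-negative mixture spectrogram V (frequency x time) into
    one estimate per source, given a list of per-source dictionaries."""
    W = np.hstack(dictionaries)                      # concatenate all atoms
    H = estimate_activations(V, W, n_iter=n_iter, eps=eps)
    model = W @ H + eps                              # model of the full mixture
    estimates = []
    start = 0
    for W_s in dictionaries:
        stop = start + W_s.shape[1]
        source_model = W_s @ H[start:stop]           # this source's share of the model
        estimates.append(V * source_model / model)   # soft mask applied to the mixture
        start = stop
    return estimates
```

In practice `V` would typically be the magnitude spectrogram of the mixture, and time-domain signals would be recovered by applying the same masks to the complex spectrogram before inversion. Dictionary training, dictionary adaptation, and the semi-supervised and coupled variants mentioned in the abstract are outside the scope of this sketch.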

Title: Separation of Known Sources Using Non-negative Spectrogram Factorisation
Authors: Tuomas Virtanen, Tom Barker
Publisher: Springer International Publishing
Book: Audio Source Separation
Print ISBN: 978-3-319-73030-1

Electronic ISBN: 978-3-319-73031-8

Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-73031-8_2