2018 | Original Paper | Book Chapter

6. Determined Blind Source Separation with Independent Low-Rank Matrix Analysis

Authors: Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari

Published in: Audio Source Separation

Publisher: Springer International Publishing

Abstract

In this chapter, we address the determined blind source separation problem and introduce a new effective method of unifying independent vector analysis (IVA) and nonnegative matrix factorization (NMF). IVA is a state-of-the-art technique that utilizes the statistical independence between source vectors. However, since the source model in IVA is based on a spherically symmetric multivariate distribution, IVA cannot utilize the characteristics of specific spectral structures such as various sounds appearing in music signals. To solve this problem, we introduce NMF as the source model in IVA to capture the spectral structures. Since this approach is a natural extension of the source model from a vector to a low-rank matrix represented by NMF, the new method is called independent low-rank matrix analysis (ILRMA). We also reveal the relationship between IVA, ILRMA, and multichannel NMF (MNMF), namely, IVA and ILRMA are identical to a special case of MNMF, which employs a rank-1 spatial model. Experimental results show the efficacy of ILRMA compared with IVA and MNMF in terms of separation accuracy and convergence speed.
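The low-rank source model mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the power spectrogram of one source is approximated by a nonnegative product of a basis matrix and an activation matrix, fitted under the Itakura-Saito criterion. The function names, parameters, and the square-root multiplicative update rule are assumptions made for illustration; the abstract does not state the update rules, and this sketch covers only the NMF source model, not the demixing-matrix estimation that ILRMA performs jointly.

```python
import numpy as np


def is_cost(P, T, V):
    """Itakura-Saito-type cost (up to constants) between P and the model T @ V."""
    R = T @ V
    return float(np.sum(P / R + np.log(R)))


def fit_low_rank_variance(P, n_bases=2, n_iter=50, seed=0, eps=1e-12):
    """Fit a rank-`n_bases` nonnegative model T @ V to a power spectrogram P.

    P has shape (frequency bins, time frames) and must be strictly positive.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = P.shape
    T = rng.uniform(0.1, 1.0, (n_freq, n_bases))    # spectral bases
    V = rng.uniform(0.1, 1.0, (n_bases, n_frames))  # time activations
    costs = [is_cost(P, T, V)]
    for _ in range(n_iter):
        # Multiplicative update of the bases with the model R held fixed.
        R = T @ V
        T *= np.sqrt(((P / R**2) @ V.T) / ((1.0 / R) @ V.T))
        T = np.maximum(T, eps)
        # Multiplicative update of the activations with the refreshed model.
        R = T @ V
        V *= np.sqrt((T.T @ (P / R**2)) / (T.T @ (1.0 / R)))
        V = np.maximum(V, eps)
        costs.append(is_cost(P, T, V))
    return T, V, costs


if __name__ == "__main__":
    # Synthetic power spectrogram whose expected value is exactly rank 2.
    rng = np.random.default_rng(1)
    true_T = rng.uniform(0.1, 1.0, (16, 2))
    true_V = rng.uniform(0.1, 1.0, (2, 40))
    P = np.maximum(rng.exponential(true_T @ true_V), 1e-9)
    T, V, costs = fit_low_rank_variance(P, n_bases=2, n_iter=30)
    print("initial cost:", costs[0], "final cost:", costs[-1])
```

In this sketch each column of `T` plays the role of a recurring spectral pattern and each row of `V` its activation over time, which is the kind of structure a spherically symmetric vector model cannot represent.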


Title: Determined Blind Source Separation with Independent Low-Rank Matrix Analysis
Authors: Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari
Publisher: Springer International Publishing
Book: Audio Source Separation
Print ISBN: 978-3-319-73030-1
Electronic ISBN: 978-3-319-73031-8
Copyright year: 2018
DOI: https://doi.org/10.1007/978-3-319-73031-8_6
