Abstract
Speech signals were separated in the frequency domain using the discrete wavelet transform (DWT) and independent component analysis (ICA). First, the mixed speech signals were decomposed into frequency subbands by DWT, and the subbands were separated using ICA in each wavelet domain. Next, the permutation and scaling problems of frequency-domain blind source separation (BSS) were solved by exploiting the correlation between adjacent bins of the speech signals. Finally, the source signals were reconstructed from single branches. Experiments were carried out with 2 sources and 6 microphones, using speech signals at a sampling rate of 40 kHz. The microphones were aligned, with the 2 sources placed in front of them on the left and right. The separated speech, from one male and one female speaker, lasted 2.5 s. The results show that the new method outperforms the single ICA method, improving the signal-to-noise ratio by approximately 1 dB.
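The permutation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion used, so the code below assumes a generic approach: the amplitude envelopes of the same source in adjacent bands are positively correlated, and each band is reordered to maximize that correlation with the band before it. The names `pearson` and `align_permutations` and the toy envelopes are hypothetical and are not taken from the article. The scaling ambiguity is not addressed here.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant envelope carries no correlation information
    return cov / sqrt(vx * vy)


def align_permutations(bands):
    """Reorder the outputs of each band to match the band before it.

    bands[k][i] is the amplitude envelope of separated output i in band k.
    Returns (aligned_bands, chosen_permutations). This is an assumed,
    generic envelope-correlation criterion, not the article's exact method.
    """
    aligned = [list(bands[0])]
    chosen = [tuple(range(len(bands[0])))]
    for band in bands[1:]:
        ref = aligned[-1]
        # Try every ordering and keep the one most correlated with the
        # already-aligned neighbouring band.
        best = max(
            permutations(range(len(band))),
            key=lambda p: sum(pearson(ref[i], band[p[i]]) for i in range(len(p))),
        )
        aligned.append([band[j] for j in best])
        chosen.append(best)
    return aligned, chosen


# Toy envelopes: source A is active early, source B is active late.
env_a = [1.0, 0.9, 0.8, 0.1, 0.1, 0.0]
env_b = [0.0, 0.1, 0.2, 0.9, 1.0, 0.8]
bands = [
    [env_a, env_b],                                        # band 0: order (A, B)
    [[0.5 * v for v in env_b], [0.5 * v for v in env_a]],  # band 1: swapped
    [[2.0 * v for v in env_a], [2.0 * v for v in env_b]],  # band 2: order (A, B)
]
aligned, perms = align_permutations(bands)
print(perms)
```

The exhaustive search over orderings is practical only for a small number of sources, such as the two-source case described in the abstract. Because each band is aligned to its already-aligned neighbour, one wrong decision can propagate to later bands, which is a known weakness of purely adjacent-band criteria.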
Additional information
Supported by Tianjin Municipal Science and Technology Commission (No.09JCYBJC02200).
WU Xiao, born in 1979, male, doctoral student.
About this article
Cite this article
Wu, X., He, J., Jin, S. et al. Blind separation of speech signals based on wavelet transform and independent component analysis. Trans. Tianjin Univ. 16, 123–128 (2010). https://doi.org/10.1007/s12209-010-0022-5
DOI: https://doi.org/10.1007/s12209-010-0022-5