
International Journal of Speech Technology

Published: 15-07-2019

Blind speech dereverberation using sparse decomposition and multi-channel linear prediction

Authors: Leila Mousavi, Farbod Razzazi, Afrooz Haghbin

Published in: International Journal of Speech Technology | Issue 3/2019


Abstract

In this study, a blind speech dereverberation method is proposed for a noiseless single-input multiple-output (SIMO) acoustic channel. The method is based on multichannel linear prediction (MCLP) in the STFT domain, assuming sparsity in both the residual speech and the channel coefficients. Accordingly, the proposed dereverberation algorithm assumes that both the residual speech signal and the linear prediction coefficients are sparse. The resulting problem is solved by convex optimization using ADMM and CVX. The proposed model was compared with state-of-the-art methods that use l_p-norm optimization criteria. Simulations were run for different room models with various reverberation times, numbers of microphones and parameter settings. The results show that the proposed method outperforms the compared methods in terms of speech dereverberation assessment criteria.
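The idea in the abstract can be illustrated with a minimal sketch for a single STFT frequency bin. It is not the authors' formulation: it assumes a plain l1 penalty on both the residual and the prediction coefficients, i.e. it minimises ||d - X g||_1 + lam * ||g||_1, and solves this with a generic two-block ADMM. The function names and the parameters `delay`, `order`, `lam`, `rho` and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(x, t):
    """Complex soft-thresholding: shrink each magnitude by t, keep the phase."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0)
    return scale * x


def build_delayed_matrix(Y, delay, order):
    """Stack delayed frames of every channel for one frequency bin.

    Y has shape (n_frames, n_mics) and needs n_frames > delay + order - 1.
    Column m * order + k holds channel m delayed by (delay + k) frames.
    """
    n_frames, n_mics = Y.shape
    X = np.zeros((n_frames, n_mics * order), dtype=complex)
    for m in range(n_mics):
        for k in range(order):
            shift = delay + k
            X[shift:, m * order + k] = Y[: n_frames - shift, m]
    return X


def sparse_mclp_admm(Y, delay=2, order=10, lam=0.1, rho=1.0, n_iter=300):
    """Minimise ||d - X g||_1 + lam * ||g||_1 with two-block ADMM.

    d is the reference channel Y[:, 0]. The splitting uses r = d - X g and
    z = g, with scaled dual variables u and v (an illustrative choice).
    Returns the prediction residual d - X z and the sparse coefficients z.
    """
    Y = np.asarray(Y, dtype=complex)
    d = Y[:, 0]
    X = build_delayed_matrix(Y, delay, order)
    n, p = X.shape
    XH = X.conj().T
    A = XH @ X + np.eye(p)  # system matrix of the quadratic g-update
    r = d.copy()
    z = np.zeros(p, dtype=complex)
    u = np.zeros(n, dtype=complex)
    v = np.zeros(p, dtype=complex)
    for _ in range(n_iter):
        # g-update: least-squares step coupling both constraints
        g = np.linalg.solve(A, XH @ (d - r - u) + (z - v))
        Xg = X @ g
        # r- and z-updates: proximal steps of the two l1 terms
        r = soft_threshold(d - Xg - u, 1.0 / rho)
        z = soft_threshold(g + v, lam / rho)
        # dual updates for X g + r = d and g = z
        u = u + Xg + r - d
        v = v + g - z
    return d - X @ z, z
```

In a full system one would presumably call `sparse_mclp_admm` independently for each frequency bin of the multi-microphone STFT and resynthesise the time signal from the returned residuals. The prediction delay keeps the early part of the speech out of the predictor, so the residual is treated as the dereverberated estimate.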



Title: Blind speech dereverberation using sparse decomposition and multi-channel linear prediction
Authors: Leila Mousavi, Farbod Razzazi, Afrooz Haghbin
Publication date: 15-07-2019
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 3/2019
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-019-09620-x
