International Journal of Speech Technology

13 August 2018

A new efficient backward BSS crosstalk-resistant algorithm for automatic blind speech quality enhancement

Authors: Mohamed Djendi, Meriem Zoulikha

Published in: International Journal of Speech Technology | Issue 4/2018

Abstract

Over the last 10 years, several noise reduction (NR) algorithms have been proposed for combination with blind source separation techniques to separate speech and noise signals from blind noisy observations. Most of these techniques rely on voice activity detector (VAD) systems to reach the optimal solution. In this paper, we propose a new backward blind source separation (BBSS) structure that uses the input correlation properties to provide: (i) high convergence rates and good tracking capabilities, since acoustic environments imply long and time-variant noise paths, and (ii) low misalignment and robustness against variations in noise type and against double-talk. The proposed algorithm enhances noisy speech signals automatically and does not need any VAD system to separate speech and noise signals. Results on several objective criteria show that the proposed algorithm performs well in comparison with state-of-the-art algorithms.
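
The abstract does not give the update equations, so the following is not the authors' algorithm. It is only a minimal sketch of a generic two-channel backward (recursive) separation structure with normalized LMS cross-filter updates, in the spirit of symmetric adaptive decorrelation, to illustrate how such a structure can run without a VAD. The function name `backward_bss_nlms`, the parameters `num_taps`, `mu`, and `eps`, and the assumption that each cross-coupling path has at least one sample of delay are all hypothetical choices made for illustration.

```python
import numpy as np


def backward_bss_nlms(p1, p2, num_taps=16, mu=0.1, eps=1e-8):
    """Generic two-channel backward separation sketch (not the paper's method).

    p1: mixture dominated by speech; p2: mixture dominated by noise.
    Returns (u1, u2), the running estimates of speech and noise.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    if p1.shape != p2.shape or p1.ndim != 1:
        raise ValueError("p1 and p2 must be 1-D arrays of equal length")

    n_samples = p1.shape[0]
    w21 = np.zeros(num_taps)      # assumed model of noise leaking into channel 1
    w12 = np.zeros(num_taps)      # assumed model of speech leaking into channel 2
    u1 = np.zeros(n_samples)
    u2 = np.zeros(n_samples)
    u1_past = np.zeros(num_taps)  # [u1(n-1), ..., u1(n-num_taps)]
    u2_past = np.zeros(num_taps)  # [u2(n-1), ..., u2(n-num_taps)]

    for n in range(n_samples):
        # Backward structure: each output subtracts the filtered past
        # of the OTHER output, so the two filters form a feedback loop.
        u1[n] = p1[n] - w21 @ u2_past
        u2[n] = p2[n] - w12 @ u1_past

        # Decorrelation-style updates, normalized by regressor energy.
        # Both filters adapt at every sample; no speech/noise decision is used.
        w21 += mu * u1[n] * u2_past / (u2_past @ u2_past + eps)
        w12 += mu * u2[n] * u1_past / (u1_past @ u1_past + eps)

        # Shift the newest outputs into the delay lines.
        u1_past = np.roll(u1_past, 1)
        u1_past[0] = u1[n]
        u2_past = np.roll(u2_past, 1)
        u2_past[0] = u2[n]

    return u1, u2
```

Calling `backward_bss_nlms(p1, p2)` on two equal-length microphone signals returns the two output signals. With `mu=0.0` the filters never adapt and the outputs equal the inputs, which is a convenient sanity check. A recursive structure of this kind can become unstable when the cross-coupling is strong or the step size is large, so `mu` would need tuning for any real recording.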

Title: A new efficient backward BSS crosstalk-resistant algorithm for automatic blind speech quality enhancement
Authors: Mohamed Djendi, Meriem Zoulikha
Publication date: 13 August 2018
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 4/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-018-9544-3
