Published in: International Journal of Speech Technology 4/2018

13 August 2018

A new efficient backward BSS crosstalk-resistant algorithm for automatic blind speech quality enhancement

Authors: Mohamed Djendi, Meriem Zoulikha


Abstract

Over the last 10 years, several noise reduction (NR) algorithms have been proposed for combination with blind source separation techniques to separate speech and noise signals from blind noisy observations. Most of these techniques rely on voice activity detector (VAD) systems to reach the optimal solution. In this paper, we propose a new backward blind source separation (BBSS) structure that uses the input correlation properties to provide: (i) high convergence rates and good tracking capabilities, since acoustic environments imply long and time-variant noise paths, and (ii) low misalignment and robustness against variations in noise type and against double-talk. The proposed algorithm enhances noisy speech signals automatically and does not need any VAD system to separate speech and noise signals. The results obtained on several objective criteria show that the proposed algorithm performs well in comparison with state-of-the-art algorithms.
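The abstract does not state the update equations, so the following is only a minimal sketch of a generic two-channel backward (recursive) separation structure with NLMS-style decorrelation updates. It is not the algorithm proposed in the paper. The function name `backward_bss_nlms`, the filter length `taps`, the step size `mu`, the regularizer `eps` and the one-sample delay in the feedback paths are all assumptions made for illustration. Both cross-filters adapt at every sample, which illustrates what operating without a VAD means in such a structure.

```python
def backward_bss_nlms(p1, p2, taps=4, mu=0.05, eps=1e-8):
    """Generic two-channel backward BSS sketch (assumed model, not the paper's).

    Each output is one observation minus a filtered version of the OTHER
    output's past samples:
        u1(n) = p1(n) - w21 . [u2(n-1), ..., u2(n-taps)]
        u2(n) = p2(n) - w12 . [u1(n-1), ..., u1(n-taps)]
    """
    w21 = [0.0] * taps      # removes the noise component leaking into channel 1
    w12 = [0.0] * taps      # removes the speech component leaking into channel 2
    past_u1 = [0.0] * taps  # previous outputs, most recent first
    past_u2 = [0.0] * taps
    u1_out, u2_out = [], []

    for x1, x2 in zip(p1, p2):
        # Backward (feedback) structure: filters are driven by past outputs.
        u1 = x1 - sum(w * u for w, u in zip(w21, past_u2))
        u2 = x2 - sum(w * u for w, u in zip(w12, past_u1))

        # Normalized updates that push each output to be uncorrelated with
        # the other output's past. Both run continuously, with no VAD gating.
        norm2 = sum(u * u for u in past_u2) + eps
        norm1 = sum(u * u for u in past_u1) + eps
        w21 = [w + mu * u1 * u / norm2 for w, u in zip(w21, past_u2)]
        w12 = [w + mu * u2 * u / norm1 for w, u in zip(w12, past_u1)]

        # Shift the output histories.
        past_u1 = [u1] + past_u1[:-1]
        past_u2 = [u2] + past_u2[:-1]
        u1_out.append(u1)
        u2_out.append(u2)

    return u1_out, u2_out, w21, w12
```

In this sketch `u1_out` plays the role of the enhanced speech estimate and `u2_out` the noise estimate. A call such as `backward_bss_nlms(mic1, mic2, taps=64)` would process two equally long sample lists, where `mic1` and `mic2` are hypothetical microphone recordings.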


Metadata
Title
A new efficient backward BSS crosstalk-resistant algorithm for automatic blind speech quality enhancement
Authors
Mohamed Djendi
Meriem Zoulikha
Publication date
13 August 2018
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-018-9544-3
