Published in: International Journal of Speech Technology, Issue 3/2018

12 July 2018

A new speech signal denoising algorithm using common vector approach

Written by: Erol Seke, Kemal Özkan

Abstract

Speech denoising can improve speech intelligibility and listening comfort in voice communication and recognition applications in noisy environments. It can also be used to enhance old recordings. Most speech enhancement methods are intrusive and lose part of the signal component while removing noise. In this paper, we propose a method based on the common vector approach (CVA) for reducing these losses in single-channel enhancement algorithms. In the proposed technique, overlapping frames of speech samples are grouped into classes according to their similarity, and the common and difference vectors of each class are separated using CVA. Since the noise component is uncorrelated and is therefore presumably concentrated in the difference part, the difference vectors are denoised using a conventional denoising technique, and the sample frames are reconstructed by combining the common part with the denoised difference parts. This operation leaves the common vector untouched and therefore tends to yield an improvement even for heavily noise-corrupted data. Compared with the state of the art, highly promising results are obtained in terms of several speech quality measures.
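The split into common and difference parts described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the textbook CVA case in which a class holds fewer frames than samples per frame, and the helper names (`cva_split`, `soft_threshold`, `denoise_class`) are hypothetical. The abstract does not say which denoiser is applied to the difference vectors, so soft thresholding is used here purely as a stand-in. Grouping frames by similarity and overlap-add reconstruction are left out.

```python
import numpy as np


def cva_split(frames):
    """Split a class of frames into a common vector and difference vectors.

    frames: array of shape (m, n), m frames of n samples, assuming 2 <= m <= n.
    Returns (common, diffs) with common of shape (n,) and diffs of shape (m, n),
    such that frames[i] is approximately common + diffs[i].
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] < 2:
        raise ValueError("need at least two frames in a class")
    # The difference subspace is spanned by b_i = a_i - a_1 for i = 2..m.
    b = (frames[1:] - frames[0]).T  # shape (n, m - 1)
    # Orthonormal basis of that subspace; SVD tolerates dependent frames.
    u, s, _ = np.linalg.svd(b, full_matrices=False)
    rank = int(np.sum(s > 1e-10 * s.max()))
    u = u[:, :rank]
    # Difference part: projection of each frame onto the difference subspace.
    diffs = frames @ u @ u.T
    # Common part: the remainder, identical for every frame in the class.
    common = frames[0] - diffs[0]
    return common, diffs


def soft_threshold(x, t):
    """Stand-in denoiser (an assumption, not taken from the paper)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def denoise_class(frames, threshold):
    """Denoise only the difference parts and recombine with the common vector."""
    common, diffs = cva_split(frames)
    return common + soft_threshold(diffs, threshold)
```

Because only the difference vectors pass through the denoiser, the common vector is reproduced unchanged in every reconstructed frame, which is the property the abstract relies on.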

Metadata
Title
A new speech signal denoising algorithm using common vector approach
Written by
Erol Seke
Kemal Özkan
Publication date
12 July 2018
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 3/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-018-9529-2
