Published in: International Journal of Speech Technology 3/2018

12-07-2018

A new speech signal denoising algorithm using common vector approach

Authors: Erol Seke, Kemal Özkan

Abstract

Speech denoising can improve the intelligibility of speech and listening comfort in voice communication and recognition applications in noisy environments. It can also be used to enhance old recordings. Most speech enhancement methods are intrusive and lose part of the signal component while removing noise. In this paper, we propose a method based on the common vector approach (CVA) for reducing such losses in single-channel enhancement algorithms. In the proposed technique, overlapping frames of speech samples are grouped into classes according to their similarity, and the common and difference vectors of each class are separated using CVA. Since the noise component is uncorrelated and is therefore presumed to be concentrated in the difference part, the difference vectors are denoised with a conventional denoising technique, and the sample frames are reconstructed by combining the common vector with the denoised difference parts. This operation leaves the common vector unaffected, which helps secure some improvement even for heavily noisy data. Compared with state-of-the-art methods, highly promising results are obtained in terms of several speech quality measures.
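
The common/difference separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard insufficient-data form of CVA (fewer frames in a class than samples per frame), in which the difference subspace is spanned by the differences between each frame and a reference frame. The names `cva_split`, `soft_threshold`, and `denoise_class` are hypothetical, and the soft-threshold denoiser is only a placeholder for whatever denoising technique is applied to the difference parts. Framing the signal and grouping frames into classes by similarity are left out.

```python
import numpy as np


def cva_split(frames, tol=1e-10):
    """Split one class of frames into a common vector and difference parts.

    frames: array of shape (m, n), m frames of n samples each (m >= 2).
    Returns (common, diff_parts) such that common + diff_parts[i] == frames[i].
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two frames")

    ref = frames[0]
    # Columns are the difference vectors a_i - a_0 that span the difference subspace.
    diffs = (frames[1:] - ref).T

    # Orthonormal basis of the difference subspace (SVD copes with rank deficiency).
    u, s, _ = np.linalg.svd(diffs, full_matrices=False)
    if s.size and s[0] > 0:
        basis = u[:, s > tol * s[0]]
    else:
        basis = u[:, :0]  # all frames identical: empty difference subspace

    # Difference part of each frame: its projection onto the difference subspace.
    diff_parts = frames @ basis @ basis.T
    # The remainder is the same for every frame of the class: the common vector.
    common = ref - diff_parts[0]
    return common, diff_parts


def soft_threshold(x, t):
    """Placeholder denoiser (an assumption, not the technique used in the paper)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def denoise_class(frames, denoiser):
    """Denoise only the difference parts and add the untouched common vector back."""
    common, diff_parts = cva_split(frames)
    cleaned = np.array([denoiser(d) for d in diff_parts])
    return common + cleaned
```

For example, `denoise_class(frames, lambda d: soft_threshold(d, 0.1))` returns an array with the same shape as `frames`. Because the denoiser only ever sees the difference parts, the common vector passes through unchanged, which is the property the abstract relies on.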


Metadata
Title: A new speech signal denoising algorithm using common vector approach
Authors: Erol Seke, Kemal Özkan
Publication date: 12-07-2018
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 3/2018
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-018-9529-2
