Published in: International Journal of Speech Technology 4/2016

16-08-2016

A wavelet-based transform method for quality improvement in noisy speech patterns of Arabic language

Authors: Sachin Singh, A. M. Mutawa


Abstract

This paper addresses single-channel enhancement of noisy Arabic speech signals at low (negative) SNR. To this end, a binary mask thresholding function based on the coiflet5 mother wavelet transform is proposed. Its effectiveness is compared with the Wiener method, spectral subtraction, log-MMSE, test-PSC and p-mmse in the presence of babble, pink, white, F-16 and Volvo car interior noise. The noisy input speech signals are processed at input SNR levels ranging from −5 to −25 dB. Performance is evaluated using PESQ, SNR and the cepstral distance measure. The results are very encouraging and show that the proposed method is more helpful for Arabic speech enhancement than the other existing methods considered.
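The abstract does not spell out the thresholding rule, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: a one-level Haar step applied recursively stands in for the coiflet5 transform (to keep the example dependency-free), the threshold is the universal threshold with a median-based noise estimate, a sine wave stands in for speech, and the noise is white Gaussian. All function names are hypothetical. With PyWavelets installed, the decomposition would more plausibly be obtained with `pywt.wavedec(x, "coif5", level=...)` and inverted with `pywt.waverec`.

```python
import math
import random
import statistics

def snr_db(clean, estimate):
    """SNR in dB of `estimate` measured against the `clean` reference."""
    signal_power = sum(c * c for c in clean)
    error_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / error_power)

def mix_at_snr(clean, noise, target_db):
    """Scale `noise` so that clean + noise has the requested input SNR."""
    p_clean = sum(c * c for c in clean)
    p_noise = sum(n * n for n in noise)
    gain = math.sqrt(p_clean / (p_noise * 10.0 ** (target_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]

def haar_forward(x):
    """One-level orthonormal Haar DWT (assumed stand-in for coiflet5)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / s, (a - d) / s))
    return out

def binary_mask(coeffs, threshold):
    """Mask value 1 keeps a coefficient (|c| > threshold), 0 discards it."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]

def enhance(noisy, levels=4):
    """Decompose, mask the detail coefficients, reconstruct.

    len(noisy) must be divisible by 2 ** levels.
    """
    approx, details = list(noisy), []
    for _ in range(levels):
        approx, d = haar_forward(approx)
        details.append(d)
    # Assumed rule: noise level from the finest details, universal threshold.
    sigma = statistics.median(abs(c) for c in details[0]) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(len(noisy)))
    for d in reversed(details):
        approx = haar_inverse(approx, binary_mask(d, threshold))
    return approx

random.seed(0)
clean = [math.sin(2.0 * math.pi * n / 64.0) for n in range(1024)]
noise = [random.gauss(0.0, 1.0) for _ in range(1024)]
noisy = mix_at_snr(clean, noise, -5.0)   # upper end of the -5 to -25 dB range
enhanced = enhance(noisy)
print(f"input SNR  {snr_db(clean, noisy):.2f} dB")
print(f"output SNR {snr_db(clean, enhanced):.2f} dB")
```

The same `snr_db` helper serves both to verify the mixing level and to score the enhanced signal. PESQ and cepstral distance, the other two measures named in the abstract, need dedicated implementations and are not sketched here.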


Metadata
Title
A wavelet-based transform method for quality improvement in noisy speech patterns of Arabic language
Authors
Sachin Singh
A. M. Mutawa
Publication date
16-08-2016
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-016-9359-z
