Published in: Wireless Personal Communications 4/2017

21 March 2017

A Wavelet Packet Based Approach for Speech Enhancement Using Modulation Channel Selection

Authors: Sachin Singh, Manoj Tripathy, R. S. Anand

Abstract

This paper proposes a wavelet packet based speech enhancement system for noise reduction. The method uses modulation channel selection as the thresholding function for de-noising. A three-level wavelet packet decomposition yields eight sub-bands, and all sub-bands are passed to the threshold function for noise suppression. The novel modulation channel selection is based on computing the true signal-to-noise ratio (SNR) and applying a local SNR threshold of −7 dB. The method targets noise suppression in single-channel speech patterns. Its performance is evaluated with both objective and subjective measures and compared with spectral subtraction, mband, mmse, test-psc, idbm, klt, and pklt. The proposed method gives the highest intelligibility and quality among the compared methods. MATLAB 7.14 is used for simulation.
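
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the wavelet, the frame length, or how the modulation channels are formed, so the sketch assumes Haar filters, frames of 16 coefficients, and a selection rule applied directly to the wavelet packet coefficients. The function names `haar_wp_decompose`, `haar_wp_reconstruct`, and `select_channels` are hypothetical. Because the rule relies on the true SNR, the clean speech and the noise are assumed to be available separately, as they would be in a simulation.

```python
import numpy as np


def haar_wp_decompose(x, levels=3):
    """Full Haar wavelet packet tree; returns 2**levels sub-band arrays."""
    x = np.asarray(x, dtype=float)
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    bands = [x]
    for _ in range(levels):
        nxt = []
        for b in bands:
            even, odd = b[0::2], b[1::2]
            nxt.append((even + odd) / np.sqrt(2))  # low-pass branch
            nxt.append((even - odd) / np.sqrt(2))  # high-pass branch
        bands = nxt
    return bands


def haar_wp_reconstruct(bands):
    """Inverse of haar_wp_decompose (perfect reconstruction)."""
    bands = list(bands)
    while len(bands) > 1:
        nxt = []
        for lo, hi in zip(bands[0::2], bands[1::2]):
            out = np.empty(2 * len(lo))
            out[0::2] = (lo + hi) / np.sqrt(2)
            out[1::2] = (lo - hi) / np.sqrt(2)
            nxt.append(out)
        bands = nxt
    return bands[0]


def select_channels(clean, noise, levels=3, frame=16, thr_db=-7.0):
    """Keep noisy coefficients only in frames whose true local SNR exceeds thr_db.

    Assumption: the frame length and the per-sub-band framing are illustrative
    choices; only the -7 dB threshold and the 3-level, 8-band tree come from
    the abstract.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    noisy_bands = haar_wp_decompose(clean + noise, levels)
    clean_bands = haar_wp_decompose(clean, levels)
    noise_bands = haar_wp_decompose(noise, levels)
    eps = 1e-12  # guards against division by zero and log of zero
    kept = []
    for y, s, n in zip(noisy_bands, clean_bands, noise_bands):
        out = np.zeros_like(y)
        for start in range(0, len(y), frame):
            sl = slice(start, start + frame)
            snr_db = 10.0 * np.log10((np.sum(s[sl] ** 2) + eps)
                                     / (np.sum(n[sl] ** 2) + eps))
            if snr_db > thr_db:
                out[sl] = y[sl]  # channel retained
        kept.append(out)
    return haar_wp_reconstruct(kept)


# Example: a 440 Hz tone in white noise (synthetic data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(1024) / 8000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noise = 0.5 * rng.standard_normal(t.size)
enhanced = select_channels(clean, noise)
```

The selection is a hard keep-or-discard decision per frame, so sub-band regions dominated by noise are zeroed while regions above the threshold pass through unchanged.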


Metadata
Publisher
Springer US
Print ISSN: 0929-6212
Electronic ISSN: 1572-834X
DOI
https://doi.org/10.1007/s11277-017-4094-6
