Published in: International Journal of Speech Technology 4/2012

1 December 2012

Enhancement of noisy speech based on a custom thresholding function with a statistically determined threshold

Written by: Tahsina Farah Sanam, Celia Shahnaz



Abstract

The performance of thresholding-based speech enhancement methods depends largely on an accurate estimate of the threshold value and on the choice of thresholding function. This paper presents a speech enhancement method in which a custom thresholding function is proposed and applied to the Wavelet Packet (WP) coefficients of the noisy speech. The thresholding function can switch between modified hard and semisoft thresholding functions, depending on a parameter that decides the signal characteristics under consideration. The threshold is determined from a statistical model of the Teager energy operated WP coefficients of the noisy speech. Extensive simulations indicate that the threshold thus obtained, in conjunction with the custom thresholding function, is very effective in reducing not only white noise but also colored noise in the noisy speech, yielding enhanced speech with better quality and intelligibility. Several standard objective measures and subjective evaluations, including informal listening tests, show that the proposed method outperforms recent state-of-the-art thresholding-based approaches to noisy speech enhancement from high to low SNR levels.
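The abstract does not give the formulas, so the following is only a minimal sketch of the textbook building blocks it names, not the authors' custom function or their statistically determined threshold. It shows the discrete Teager energy operator, plain hard thresholding, and the standard two-threshold semisoft rule. The function names, the endpoint handling, the `use_hard` switch, and the thresholds `lam1` and `lam2` are all illustrative assumptions.

```python
from typing import List, Sequence


def teager_energy(x: Sequence[float]) -> List[float]:
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two endpoints have no two-sided neighbours; copying the nearest
    interior value is an assumption made only for this sketch.
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three samples")
    psi = [x[i] * x[i] - x[i - 1] * x[i + 1] for i in range(1, n - 1)]
    return [psi[0]] + psi + [psi[-1]]


def hard_threshold(c: float, lam: float) -> float:
    """Standard hard thresholding: keep c if |c| > lam, else zero it."""
    return c if abs(c) > lam else 0.0


def semisoft_threshold(c: float, lam1: float, lam2: float) -> float:
    """Standard semisoft (firm) thresholding with 0 <= lam1 < lam2.

    Zero below lam1, unchanged above lam2, linear ramp in between.
    """
    if not 0 <= lam1 < lam2:
        raise ValueError("require 0 <= lam1 < lam2")
    a = abs(c)
    if a <= lam1:
        return 0.0
    if a > lam2:
        return c
    sign = 1.0 if c > 0 else -1.0
    return sign * lam2 * (a - lam1) / (lam2 - lam1)


def switched_threshold(coeffs: Sequence[float], lam1: float, lam2: float,
                       use_hard: bool) -> List[float]:
    """Hypothetical switch between the two rules for one subband.

    In the paper a parameter tied to the signal characteristics makes
    this choice; here it is a plain boolean flag (an assumption).
    """
    if use_hard:
        return [hard_threshold(c, lam2) for c in coeffs]
    return [semisoft_threshold(c, lam1, lam2) for c in coeffs]
```

In this sketch the coefficients of one WP subband would be passed to `switched_threshold`, with `lam1` and `lam2` supplied by whatever threshold estimator is in use. The semisoft rule avoids the abrupt jump of hard thresholding at the threshold while leaving large coefficients untouched.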

Metadata
Title
Enhancement of noisy speech based on a custom thresholding function with a statistically determined threshold
Authors
Tahsina Farah Sanam
Celia Shahnaz
Publication date
1 December 2012
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2012
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-012-9144-6
