Speech Enhancement Using Perceptual Wavelet Packet Decomposition and Teager Energy Operator

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

It has been shown in the literature that the perceptual wavelet packet decomposition (PWPD) and the Teager energy operator (TEO) are useful for various speech processing systems and for speech enhancement applications, respectively. Using the PWPD and the TEO, this paper presents an improved wavelet-based speech enhancement method. The main advantage of the proposed method is that it avoids the over-thresholding of speech segments that commonly occurs in conventional wavelet-based speech enhancement schemes. As a consequence, the quality of the enhanced speech is substantially better than that obtained with conventional approaches. In addition, the proposed method requires neither a complicated estimation of the noise level nor any knowledge of the SNR. Experimental results on speech signals corrupted by additive noise and by real noise demonstrate that the proposed method is capable of outperforming conventional noise cancellation schemes.
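
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of the standard discrete Teager energy operator (in Kaiser's form, psi[n] = x[n]^2 - x[n-1]*x[n+1]) and of ordinary soft thresholding of wavelet coefficients. It is not the method proposed in the paper: the perceptual wavelet packet tree and the way the TEO drives the threshold are not described in this preview, so they are not reproduced here. The function names `teager_energy` and `soft_threshold`, and the choice to copy the endpoint values, are assumptions made for illustration.

```python
import math

def teager_energy(x):
    """Discrete Teager energy, psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The operator needs one neighbour on each side, so the two endpoint
    values are copied from the nearest interior sample (an assumption
    made here to keep the output the same length as the input).
    """
    n = len(x)
    if n < 3:
        raise ValueError("need at least three samples")
    interior = [x[i] * x[i] - x[i - 1] * x[i + 1] for i in range(1, n - 1)]
    return [interior[0]] + interior + [interior[-1]]

def soft_threshold(coeffs, thr):
    """Soft thresholding: shrink each magnitude by thr and zero
    every coefficient whose magnitude does not exceed thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]
```

For a pure sinusoid A*cos(W*n), `teager_energy` returns the constant A^2 * sin(W)^2, which is why the operator is often read as tracking both the amplitude and the frequency of a signal. A fixed `thr` applied to every coefficient is the kind of uniform shrinkage that can over-threshold low-energy speech segments, the problem the abstract says the proposed method avoids.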

Cite this article

Chen, SH., Wang, JF. Speech Enhancement Using Perceptual Wavelet Packet Decomposition and Teager Energy Operator. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 36, 125–139 (2004). https://doi.org/10.1023/B:VLSI.0000015092.19005.62
