Published in: International Journal of Speech Technology 2/2013

01-06-2013

The optimized wavelet filters for speech compression

Authors: A. Kumar, G. K. Singh, G. Rajesh, K. Ranjeet



Abstract

This paper proposes optimized wavelet filters for speech compression. The filter coefficients are derived with window techniques, such as the Kaiser and Blackman windows, via simple linear optimization. When used for speech compression, the proposed filters not only give a better compression ratio but also yield good fidelity parameters compared with other wavelet filters. The performance of existing wavelet filters and the proposed filters is compared in terms of compression ratio (CR), signal-to-noise ratio (SNR), peak signal-to-noise ratio (PSNR), and normalized root-mean-square error (NRMSE) at different thresholding levels. The simulation results included in the paper show the improved performance of the proposed filters for speech signal processing.
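The four measures named in the abstract can be illustrated with a minimal sketch. It does not use the paper's optimized filters, which are not given here: a one-level Haar wavelet with hard thresholding stands in for them. The metric definitions are common textbook forms and are assumptions, not the authors' exact formulas. In particular, CR is assumed to be the number of samples divided by the number of retained coefficients. All function names are hypothetical.

```python
import numpy as np

def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length signal (stand-in filter)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed even and odd samples."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def compress(x, threshold):
    """Hard-threshold the coefficients and reconstruct the signal."""
    approx, detail = haar_dwt(x)
    coeffs = np.concatenate([approx, detail])
    kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
    n = approx.size
    return haar_idwt(kept[:n], kept[n:]), kept

def fidelity(x, y, kept):
    """Return CR, SNR (dB), PSNR (dB), and NRMSE under the assumed definitions."""
    x = np.asarray(x, dtype=float)
    err_energy = np.sum((x - y) ** 2)
    # Assumed CR: original samples per retained (nonzero) coefficient.
    cr = x.size / max(np.count_nonzero(kept), 1)
    if err_energy == 0.0:
        return cr, np.inf, np.inf, 0.0
    snr = 10.0 * np.log10(np.sum(x ** 2) / err_energy)
    # PSNR as peak power over mean squared error.
    psnr = 10.0 * np.log10(x.size * np.max(np.abs(x)) ** 2 / err_energy)
    nrmse = np.sqrt(err_energy / np.sum((x - np.mean(x)) ** 2))
    return cr, snr, psnr, nrmse

# Synthetic test tone in place of a speech recording (assumption).
t = np.arange(256) / 256.0
signal = np.sin(2.0 * np.pi * 5.0 * t)
reconstructed, kept = compress(signal, threshold=0.1)
cr, snr, psnr, nrmse = fidelity(signal, reconstructed, kept)
print(f"CR={cr:.2f}  SNR={snr:.2f} dB  PSNR={psnr:.2f} dB  NRMSE={nrmse:.4f}")
```

Raising `threshold` discards more coefficients, so CR rises while SNR and PSNR fall and NRMSE grows. This is the trade-off the paper evaluates at different thresholding levels.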


Metadata
Title
The optimized wavelet filters for speech compression
Authors
A. Kumar
G. K. Singh
G. Rajesh
K. Ranjeet
Publication date
01-06-2013
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 2/2013
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-012-9173-1
