Published in: International Journal of Speech Technology 3/2016

06-06-2016

Wavelet energy based voice activity detection and adaptive thresholding for efficient speech coding

Authors: Shijo M. Joseph, Anto P. Babu


Abstract

Over the last five decades, extensive research in speech compression has produced a variety of speech coding techniques, and the search for more efficient coders continues worldwide. In this paper we propose an alternative method for speech coding. The proposed technique uses the discrete wavelet transform to decompose the signal and uses wavelet energy to distinguish active voice regions from silence regions in the speech signal. Depending on the status of each region, the system chooses a different thresholding strategy, which leads to better compression without loss of speech intelligibility. The proposed method is evaluated in terms of qualitative and quantitative parameters. We also propose an alternative to mean opinion score (MOS) values, hereafter referred to as the System Recognition Rate.
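The pipeline outlined in the abstract (decompose a frame with a wavelet transform, use wavelet energy to label the frame as voice or silence, then threshold the coefficients according to that label) can be illustrated with a minimal sketch. The abstract does not give the wavelet, decomposition depth, energy criterion, or threshold rules, so everything specific here is an assumption made for illustration: a one-level Haar transform, a fixed per-sample energy floor (`energy_floor`), and hard thresholding at a fraction of the frame's peak coefficient (`active_frac`, `silence_frac`). It is not the authors' implementation.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(frame):
    """One-level Haar DWT of an even-length frame -> (approx, detail)."""
    approx = [(frame[i] + frame[i + 1]) / SQRT2 for i in range(0, len(frame), 2)]
    detail = [(frame[i] - frame[i + 1]) / SQRT2 for i in range(0, len(frame), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def wavelet_energy(approx, detail):
    """Total energy of the wavelet coefficients (sum of squares)."""
    return sum(c * c for c in approx) + sum(c * c for c in detail)

def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude is below thr."""
    return [c if abs(c) >= thr else 0.0 for c in coeffs]

def code_frame(frame, energy_floor=0.01, active_frac=0.05, silence_frac=0.5):
    """Label a frame as voice/silence and threshold it accordingly.

    All three parameters are hypothetical values chosen for this sketch:
    - energy_floor: per-sample wavelet energy above which a frame is 'active'
    - active_frac:  gentle threshold (fraction of peak) for active frames
    - silence_frac: aggressive threshold (fraction of peak) for silence frames
    """
    approx, detail = haar_dwt(frame)
    is_active = wavelet_energy(approx, detail) / len(frame) > energy_floor
    peak = max(abs(c) for c in approx + detail)
    thr = (active_frac if is_active else silence_frac) * peak
    return is_active, hard_threshold(approx, thr), hard_threshold(detail, thr)

if __name__ == "__main__":
    # A loud tone stands in for speech, a very quiet one for silence.
    voiced = [0.5 * math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
    quiet = [0.001 * math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
    for name, frame in (("voiced", voiced), ("quiet", quiet)):
        active, a, d = code_frame(frame)
        zeros = sum(1 for c in a + d if c == 0.0)
        print(name, "active" if active else "silence", "zeroed coefficients:", zeros)
```

Coefficients set to zero are the ones an entropy coder or run-length stage could then store cheaply, which is where the compression gain would come from in a scheme of this kind. A practical coder would likely use a multi-level decomposition and a threshold adapted to the signal rather than the fixed constants assumed here.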


Metadata
Title
Wavelet energy based voice activity detection and adaptive thresholding for efficient speech coding
Authors
Shijo M. Joseph
Anto P. Babu
Publication date
06-06-2016
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 3/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-014-9240-x
