Simultaneous speech coding and de-noising in a dictionary based quantized CS framework

Authors: Vinitha Ramdas, Sai Subrahmanyam R. K. Gorthi, Deepak Mishra

Published in: International Journal of Speech Technology | Issue 3/2016

Published: 28.05.2016

Abstract

Speech compression, or speech coding, is essential for communicating speech signals effectively in resource-limited scenarios, and researchers have been working toward ever lower transmission bit rates (BR) without much compromise in speech quality. Medium-BR hybrid speech coding schemes have attracted much interest in recent years, most of them based on CELP, the basic medium bit-rate coding scheme. In this work, we provide insight into the capabilities of compressive sensing (CS) in speech processing and propose a novel idea in the quantized framework. The paper demonstrates three major aspects: (1) inherent de-noising of noisy speech by the CS-based coder along with compression; (2) quantization of CS measurements to achieve medium transmission bit rates; and (3) enhancement of the coder's quality and compression performance through better sparse representations of speech using dictionaries. The results indicate that the proposed scheme offers better compression than basic Gaussian codebook CELP. The CS scheme has the added advantage of inherent noise suppression and is more robust to background noise than parameter-extraction-based medium bit-rate speech coding systems.
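
To make the idea of quantized CS measurements concrete, the following is a minimal generic sketch and not the authors' coder. It assumes a synthetic frame that is exactly sparse in a fixed DCT basis (standing in for a speech dictionary), a Gaussian measurement matrix, a plain uniform scalar quantizer, and orthogonal matching pursuit for recovery. All function names and parameter values (`n`, `m`, `k`, `bits`) are illustrative assumptions; the abstract does not specify any of them.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II synthesis basis; each column is one atom."""
    t = np.arange(n)[:, None]
    f = np.arange(n)[None, :]
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * f / (2 * n))
    psi[:, 0] /= np.sqrt(2.0)
    return psi

def uniform_quantize(y, bits):
    """Uniform scalar quantizer spanning the range of y (assumed design)."""
    lo, hi = y.min(), y.max()
    step = (hi - lo) / (2 ** bits - 1)
    idx = np.round((y - lo) / step).astype(int)  # integer codes to transmit
    return lo + idx * step, idx

def omp(A, y, k):
    """Orthogonal matching pursuit: greedy k-sparse fit of y ~ A @ s."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s

rng = np.random.default_rng(0)
n, m, k, bits = 256, 128, 6, 8  # illustrative sizes, not from the paper

# Synthetic frame that is exactly k-sparse in the DCT basis.
s_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
s_true[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
psi = dct_basis(n)
x = psi @ s_true

# Encoder: random projection followed by quantization.
phi = rng.standard_normal((m, n)) / np.sqrt(m)
y_q, idx = uniform_quantize(phi @ x, bits)

# Decoder: sparse recovery from the quantized measurements.
s_hat = omp(phi @ psi, y_q, k)
x_hat = psi @ s_hat
rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"payload: {m * bits} bits per {n}-sample frame, relative error {rel_err:.4f}")
```

In this toy setting the payload per frame is `m * bits` bits plus the two range values of the quantizer. Real speech is only approximately sparse, so a learned dictionary and a recovery method that tolerates noise would be needed in place of the fixed DCT basis and fixed sparsity `k` assumed here.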

Title: Simultaneous speech coding and de-noising in a dictionary based quantized CS framework
Authors: Vinitha Ramdas, Sai Subrahmanyam R. K. Gorthi, Deepak Mishra
Publication date: 28.05.2016
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 3/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-016-9345-5
