Published in: International Journal of Speech Technology 4/2016

8 October 2016

Simulation and overall comparative evaluation of performance between different techniques for high band feature extraction based on artificial bandwidth extension of speech over proposed global system for mobile full rate narrow band coder

Author: Ninad S. Bhatt

Abstract

This paper presents a novel approach to computing and simulating high-band (HB) feature extraction based on linear predictive coding (LPC) and mel-frequency cepstral coefficient (MFCC) techniques. The HB features are embedded into the encoded bitstream of the proposed global system for mobile (GSM) full rate (FR) 06.10 coder using joint source coding and data hiding before transmission to the receiving terminal. At the receiver, the HB features are extracted to reproduce the HB portion of the speech; for this purpose, different excitation extension techniques are applied, and their results are evaluated in terms of quality (intelligibility and naturalness) and bandwidth. A MATLAB-based e-test bench is created to implement the proposed artificial bandwidth extension (ABE) coder, and a series of simulations is carried out to gain insight into its performance using subjective [mean opinion score (MOS)] and objective [perceptual evaluation of speech quality (PESQ)] analysis. The results of both analyses indicate that the proposed ABE coder outperforms the proposed GSM FR narrow band (NB) coder (legacy GSM FR). Within the ABE coder, MFCC parameterization yields higher speech intelligibility than LPC-based parameterization, as shown by its slightly better PESQ and MOS scores.
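
To make the LPC-based parameterization mentioned in the abstract more concrete, the following is a minimal sketch of how LPC coefficients and a gain could be computed for one frame of a high-band signal, using the standard autocorrelation method with the Levinson-Durbin recursion. It is written in Python rather than the MATLAB used in the paper, and it is not the author's implementation: the function names, the predictor order of 8, the Hamming window, the 16 kHz sampling rate, and the 20 ms frame length are all assumptions chosen for illustration.

```python
import math


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) with a[0] = 1, so that the prediction is
    x[n] ~ -(a[1]*x[n-1] + ... + a[order]*x[n-order]),
    and err is the residual prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: keep the remaining taps at zero
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def hb_lpc_features(frame, order=8):
    """Hypothetical helper: LPC coefficients and gain for one HB frame.

    The order and the Hamming window are illustrative assumptions,
    not values taken from the paper.
    """
    n = len(frame)
    windowed = [
        s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    r = autocorrelation(windowed, order)
    a, err = levinson_durbin(r, order)
    gain = math.sqrt(max(err, 0.0))
    return a, gain


# Example: one 20 ms frame at an assumed 16 kHz sampling rate, 5 kHz test tone.
fs = 16000
frame = [math.sin(2.0 * math.pi * 5000.0 * t / fs) for t in range(320)]
coeffs, gain = hb_lpc_features(frame)
```

In a scheme like the one the abstract describes, a compact set of such coefficients plus a gain would be the kind of HB side information that is quantized and hidden in the narrowband bitstream. How the paper quantizes and embeds its features is not stated in the abstract, so that step is left out here.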

Metadata
Title
Simulation and overall comparative evaluation of performance between different techniques for high band feature extraction based on artificial bandwidth extension of speech over proposed global system for mobile full rate narrow band coder
Author
Ninad S. Bhatt
Publication date
8 October 2016
Publisher
Springer US
Published in
International Journal of Speech Technology / Issue 4/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI
https://doi.org/10.1007/s10772-016-9378-9
