International Journal of Speech Technology

13-01-2017

Bandwidth extension of telephone speech using magnitude spectrum data hiding

Authors: Prasad Nizampatnam, Kishore Kumar Tappeta

Published in: International Journal of Speech Technology | Issue 1/2017

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude spectrum data hiding technique to provide a perceptually better wideband speech signal. Code-excited linear prediction (CELP) parameters are extracted from a downsampled, frequency-shifted version of the high-frequency components of the speech signal that lie above the NB range. These parameters are spread using pseudo-noise codes and embedded in the low-amplitude, high-frequency regions of the magnitude spectrum of the NB speech signal. At the receiving end, the embedded information is extracted to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating listening tests and log spectral distortion tests show that the reconstructed wideband signal has better speech quality than some existing speech bandwidth extension methods that employ data hiding.
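The embedding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The frame length, the choice of hidden bins, the `BASE_LEVEL` and `DELTA` values, the random ±1 sequence standing in for a pseudo-noise code, and the function names `make_pn_code`, `embed` and `extract` are all assumptions made for illustration. The payload is four arbitrary bits rather than real CELP parameters, and 16-bit rounding stands in for channel quantization.

```python
import numpy as np

FRAME_LEN = 256              # samples per frame (assumed)
HIDE_BINS = slice(96, 128)   # rfft bins 96..127, about 3.0-4.0 kHz at 8 kHz sampling (assumed)
CHIPS_PER_BIT = 8            # spreading factor (assumed)
BASE_LEVEL = 0.05            # magnitude floor written into the hidden bins (assumed)
DELTA = 0.02                 # chip amplitude around the floor (assumed)

def make_pn_code(length, seed=1):
    """Random +/-1 sequence used here as a stand-in for a pseudo-noise code."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=length)

def embed(frame, bits, pn):
    """Spread each bit with the PN code and write the chips into the hidden magnitude bins."""
    spectrum = np.fft.rfft(frame)
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)
    symbols = 2.0 * np.asarray(bits) - 1.0                   # map 0/1 to -1/+1
    chips = np.repeat(symbols, pn.size) * np.tile(pn, symbols.size)
    if chips.size != magnitude[HIDE_BINS].size:
        raise ValueError("payload does not fill the hidden band exactly")
    magnitude[HIDE_BINS] = BASE_LEVEL + DELTA * chips        # phase is left untouched
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=frame.size)

def extract(frame, n_bits, pn):
    """Despread: correlate the hidden magnitude bins with the PN code, one row per bit."""
    magnitude = np.abs(np.fft.rfft(frame))
    chips = (magnitude[HIDE_BINS] - BASE_LEVEL).reshape(n_bits, pn.size)
    return (chips @ pn > 0).astype(int).tolist()

# Synthetic "narrowband" frame: two low-frequency tones, 8 kHz sampling.
t = np.arange(FRAME_LEN) / 8000.0
frame = 0.4 * np.sin(2 * np.pi * 500 * t) + 0.2 * np.sin(2 * np.pi * 1500 * t)

bits = [1, 0, 1, 1]
pn = make_pn_code(CHIPS_PER_BIT)
marked = embed(frame, bits, pn)
quantized = np.round(marked * 32767) / 32767   # 16-bit rounding as a simple quantization stand-in
recovered = extract(quantized, len(bits), pn)
print(recovered)  # -> [1, 0, 1, 1]
```

Because each bit is spread over several bins, the receiver sums several chips per decision, so small per-bin errors tend to average out. That averaging is what gives spread-spectrum embedding its tolerance to quantization error in this toy setting.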

Title: Bandwidth extension of telephone speech using magnitude spectrum data hiding
Authors: Prasad Nizampatnam, Kishore Kumar Tappeta
Publication date: 13-01-2017
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 1/2017
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-016-9393-x
