Published in: International Journal of Speech Technology 1/2017

13-01-2017

Bandwidth extension of telephone speech using magnitude spectrum data hiding

Authors: Prasad Nizampatnam, Kishore Kumar Tappeta

Abstract

Public telephone systems transmit speech over a limited frequency range of about 300–3400 Hz, called narrowband (NB), which significantly reduces the quality and intelligibility of speech. This paper proposes a novel, fully backward-compatible method for bandwidth extension of NB speech. The method uses a magnitude spectrum data hiding technique to provide a perceptually better wideband speech signal. Code-excited linear prediction (CELP) parameters are extracted from a downsampled, frequency-shifted version of the high-frequency speech components that lie above the NB range. These parameters are spread using pseudo-noise codes and embedded in the low-amplitude, high-frequency regions of the magnitude spectrum of the NB speech signal. The embedded information is extracted at the receiving end to reconstruct the wideband speech signal. Theoretical and simulation analyses show that the proposed method is robust to quantization and channel noise. Comparison category rating (CCR) listening tests and log spectral distortion (LSD) tests show that the reconstructed wideband signal achieves much better speech quality than some existing speech bandwidth extension methods that employ data hiding.
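The embedding idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, the choice of hiding bins, the 8-chip spreading code, the magnitude levels `BASE` and `DELTA`, and the 1/1024 quantization step are all assumptions chosen for clarity, and a two-bit payload stands in for the CELP parameters. Each bit is spread over several high-frequency bins of the magnitude spectrum and recovered at the receiver by correlating with the same code.

```python
import cmath
import math

# All constants below are illustrative assumptions, not values from the paper.
N = 128                            # frame length in samples
HIDE_BINS = range(48, 64)          # roughly 3.0-3.9 kHz at 8 kHz sampling
PN = [1, -1, 1, 1, -1, -1, 1, -1]  # balanced +/-1 spreading code (sums to 0)
BASE, DELTA = 1.0, 0.5             # carrier magnitude and chip depth


def dft(x):
    """Naive discrete Fourier transform (fine for one short frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def embed(frame, bits):
    """Spread each bit over len(PN) bins and write it into the magnitudes."""
    X = dft(frame)
    chips = [(1 if b else -1) * c for b in bits for c in PN]
    for k, chip in zip(HIDE_BINS, chips):
        # Replace the magnitude, keep the original phase of the bin.
        X[k] = cmath.rect(BASE + DELTA * chip, cmath.phase(X[k]))
        X[N - k] = X[k].conjugate()  # mirror bin keeps the time signal real
    return idft(X)


def extract(frame, n_bits):
    """Despread by correlating hidden-bin magnitudes with the PN code."""
    mags = [abs(v) for v in dft(frame)]
    bins = list(HIDE_BINS)
    bits = []
    for i in range(n_bits):
        seg = bins[i * len(PN):(i + 1) * len(PN)]
        corr = sum(mags[k] * c for k, c in zip(seg, PN))
        bits.append(1 if corr > 0 else 0)
    return bits


def quantize(frame, step=1 / 1024):
    """Uniform quantizer standing in for channel quantization noise."""
    return [round(s / step) * step for s in frame]


# Synthetic "narrowband" host frame: two tones well below the hiding bins.
host = [math.sin(2 * math.pi * 500 * t / 8000)
        + 0.5 * math.sin(2 * math.pi * 1200 * t / 8000) for t in range(N)]
payload = [1, 0]
marked = quantize(embed(host, payload))
recovered = extract(marked, len(payload))
print(recovered)  # [1, 0]
```

Because the assumed code is balanced, the constant `BASE` term cancels in the correlation, so the receiver in this sketch does not need to know the carrier level. The despreading sum also averages out small per-bin errors, which gives an intuition for the robustness to quantization noise claimed in the abstract.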


Metadata
Title: Bandwidth extension of telephone speech using magnitude spectrum data hiding
Authors: Prasad Nizampatnam, Kishore Kumar Tappeta
Publication date: 13-01-2017
Publisher: Springer US
Published in: International Journal of Speech Technology, Issue 1/2017
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-016-9393-x
