
2016 | OriginalPaper | Chapter

A Novel Error Mitigation Scheme Based on Replacement Vectors and FEC Codes for Speech Recovery in Loss-Prone Channels

Authors: Domingo López-Oller, Angel M. Gomez, José L. Pérez-Córdoba

Published in: Advances in Speech and Language Technologies for Iberian Languages

Publisher: Springer International Publishing


Abstract

In this paper, we propose an error mitigation scheme that combines two approaches: a replacement super vector technique, which provides replacements to reconstruct both the LPC coefficients and the excitation signal during bursts of lost packets, and a Forward Error Correction (FEC) technique, which minimizes error propagation after the last lost frame. The FEC code is embedded into the bitstream to avoid any bitrate increase and to keep the codec working in a compliant way on clean transmissions. The success of our recovery technique relies heavily on the quantization of the speech parameters (LPC coefficients and the excitation signal), especially for the excitation signal, where a modified version of the well-known Linde-Buzo-Gray (LBG) algorithm is applied. The performance of our proposal is evaluated on the AMR codec in terms of speech quality using the PESQ algorithm. Our proposal achieves a noticeable improvement over the standard AMR legacy codec under adverse channel conditions, without incurring high computational costs or delays during the decoding stage and without consuming any additional bitrate.
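For readers unfamiliar with the LBG algorithm mentioned in the abstract, the following is a minimal sketch of the standard splitting-based procedure for training a vector-quantizer codebook. It is not the authors' modified version, whose details are not given on this page. The function names (`nearest`, `centroid`, `lbg`), the additive perturbation `eps`, the fixed iteration count, and the toy training set are all assumptions chosen for illustration, and the codebook size is assumed to be a power of two.

```python
def nearest(codebook, v):
    """Index of the codeword closest to v (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(training, size, eps=0.01, iters=20):
    """Standard LBG sketch: start from the global centroid, then
    repeatedly split every codeword and refine with Lloyd iterations.
    `size` is assumed to be a power of two (illustrative restriction)."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split each codeword into two slightly perturbed copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign every training vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in training:
                cells[nearest(codebook, v)].append(v)
            # Move each codeword to the centroid of its cell;
            # keep the old codeword if the cell is empty.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


if __name__ == "__main__":
    # Toy data: two well-separated clusters of 2-D vectors (hypothetical).
    training = [[0, 0], [0, 1], [1, 0], [1, 1],
                [10, 10], [10, 11], [11, 10], [11, 11]]
    print(sorted(lbg(training, 2)))  # [[0.5, 0.5], [10.5, 10.5]]
```

In a concealment setting, a codebook of this kind would let the decoder map the last correctly received parameters to a codeword index via `nearest`. How the paper associates replacement vectors with such indices is not described in the abstract, so that step is left out here.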


Metadata
Title
A Novel Error Mitigation Scheme Based on Replacement Vectors and FEC Codes for Speech Recovery in Loss-Prone Channels
Authors
Domingo López-Oller
Angel M. Gomez
José L. Pérez-Córdoba
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-49169-1_5
