International Journal of Speech Technology

Published: 29-08-2016

Low complexity forward error correction for CELP-type speech coding over erasure channel transmission

Authors: Nadir Benamirouche, Bachir Boudraa, Domingo López-Oller, José L. Pérez-Córdoba

Published in: International Journal of Speech Technology | Issue 4/2016

Abstract

A well-known problem of Code-Excited Linear Prediction (CELP)-type codecs is their vulnerability to frame erasure. When a frame is erased, the inter-frame dependency introduced by Long-Term Prediction desynchronizes the Adaptive Codebook (ACB), which in turn propagates errors through the correctly received frames that follow. In this paper, we propose a media-specific Forward Error Correction (FEC) method that uses a Pitch-Pulse Codebook (PPCB)-based approach to model the ACB contribution for voiced frames (frame onsets), which are detected under a Zero Crossing Rate constraint. At the encoder, the PPCB uses a single pulse, optimized by the Multipulse Maximum Likelihood Quantization algorithm, to model the pitch-like contribution; the quantized version of that pulse is sent as FEC information to resynchronize the ACB at the decoder after a frame erasure. This approach noticeably improves synthesized speech quality under adverse channel conditions at low computational complexity, while the legacy bit rate of the codec is kept unchanged.
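The abstract states that voiced frames are selected under a Zero Crossing Rate constraint but does not give the decision rule. The following is a minimal sketch of a generic ZCR-based voiced/unvoiced check, not the authors' method: the function names, the 0.1 threshold, the 8 kHz sampling rate, and the 160-sample frame length are all illustrative assumptions.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))

def is_voiced(frame, zcr_threshold=0.1):
    """Treat a frame as voiced when its ZCR is below the threshold.

    The threshold is a hypothetical value chosen for illustration;
    a real codec would tune it (and likely combine it with energy).
    """
    return zero_crossing_rate(frame) < zcr_threshold

# Assumed test signals: 20 ms frames at an assumed 8 kHz sampling rate.
fs = 8000
t = np.arange(160) / fs
voiced_like = np.sin(2 * np.pi * 100 * t)   # low-pitch periodic tone
rng = np.random.default_rng(0)
noise_like = rng.standard_normal(160)        # noise-like, unvoiced stand-in

print(is_voiced(voiced_like), is_voiced(noise_like))
```

A periodic, low-frequency frame changes sign only a few times, whereas a noise-like frame changes sign on roughly half of its sample pairs, which is why a low ZCR is commonly used as a cheap indicator of voicing.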

Title: Low complexity forward error correction for CELP-type speech coding over erasure channel transmission
Authors: Nadir Benamirouche, Bachir Boudraa, Domingo López-Oller, José L. Pérez-Córdoba
Publication date: 29-08-2016
Publisher: Springer US
Published in: International Journal of Speech Technology / Issue 4/2016
Print ISSN: 1381-2416
Electronic ISSN: 1572-8110
DOI: https://doi.org/10.1007/s10772-016-9365-1
