2015 | OriginalPaper | Book chapter

3. Scalable and Multi-Rate Speech Coding for Voice-over-Internet Protocol (VoIP) Networks

Authors: Tokunbo Ogunfunmi, Koji Seto

Published in: Speech and Audio Processing for Coding, Enhancement and Recognition

Publisher: Springer New York

Abstract

Speech remains a popular and effective means of transmitting information from one person to another, and speech signals form the basic method of human communication. The information conveyed is verbal, or auditory. The methods used for speech coding are extensive and continuously evolving.

Speech coding can be defined as the means by which the information-bearing speech signal is coded to remove redundancy, thereby reducing transmission bandwidth requirements, improving storage efficiency, and making possible myriad other applications that rely on speech coding techniques.

The medium of speech transmission has also changed over the years. Currently, a large percentage of speech is communicated over channels that use Internet protocols. Voice-over-Internet Protocol (VoIP) channels present challenges that must be overcome to enable error-free, robust speech communication.

Bit-streams that are multi-rate and scalable offer several advantages for time-varying VoIP channels. In this chapter, we present methods for scalable, multi-rate speech coding for VoIP channels.

Title: Scalable and Multi-Rate Speech Coding for Voice-over-Internet Protocol (VoIP) Networks
Authors: Tokunbo Ogunfunmi, Koji Seto
Publisher: Springer New York
Book: Speech and Audio Processing for Coding, Enhancement and Recognition
Print ISBN: 978-1-4939-1455-5
Electronic ISBN: 978-1-4939-1456-2
Copyright year: 2015
DOI: https://doi.org/10.1007/978-1-4939-1456-2_3
