2015 | OriginalPaper | Book chapter

2. Challenges in Speech Coding Research

Written by: Jerry D. Gibson

Published in: Speech and Audio Processing for Coding, Enhancement and Recognition

Publisher: Springer New York

Abstract

Speech and audio coding underlie many of the products and services that we have come to rely on and enjoy today. In this chapter, we discuss speech and audio coding, including a concise background summary, key coding methods, and the latest standards, with an eye toward current limitations and possible future research directions.

Metadata
Title
Challenges in Speech Coding Research
Written by
Jerry D. Gibson
Copyright year
2015
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4939-1456-2_2
