2017 | OriginalPaper | Book chapter

2. Speech Production and Modelling

Written by: Tom Bäckström

Published in: Speech Coding

Publisher: Springer International Publishing

Abstract

Humans produce speech by pushing air out of the lungs. The airflow sets the vocal folds into oscillation and becomes turbulent at constrictions in the vocal tract. The resulting flow waveform is then shaped by the resonances of the vocal tract. Together, these features form the characteristic properties of phones. For efficient coding, we must model them with as few parameters as possible without altering the perceptual impression.
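The idea of an excitation shaped by vocal-tract resonances can be illustrated with a minimal source-filter sketch. This is not the chapter's own model: the impulse-train excitation, the all-pole filter, the function names, and the formant frequencies and bandwidths below are all illustrative assumptions. The point is only that a vowel-like sound can be described by a handful of parameters (one pitch value plus a frequency and a bandwidth per resonance).

```python
import numpy as np


def resonances_to_all_pole(formants_hz, bandwidths_hz, fs):
    """Build denominator coefficients a[0..2M] of an all-pole filter
    with one complex-conjugate pole pair per resonance."""
    a = np.array([1.0])
    for f, bw in zip(formants_hz, bandwidths_hz):
        r = np.exp(-np.pi * bw / fs)   # pole radius, set by the bandwidth
        theta = 2.0 * np.pi * f / fs   # pole angle, set by the centre frequency
        section = [1.0, -2.0 * r * np.cos(theta), r * r]
        a = np.convolve(a, section)    # cascade the second-order sections
    return a


def all_pole_filter(x, a):
    """Direct-form recursion y[n] = x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, min(len(a), n + 1)):
            acc -= a[k] * y[n - k]
        y[n] = acc
    return y


def synthesize_vowel(f0_hz=100.0,
                     formants_hz=(700.0, 1200.0, 2600.0),   # assumed values
                     bandwidths_hz=(80.0, 90.0, 120.0),     # assumed values
                     fs=8000, duration_s=0.1):
    """Impulse train at the pitch period (a crude stand-in for the
    glottal flow) passed through the resonance filter."""
    n = int(round(fs * duration_s))
    period = int(round(fs / f0_hz))
    excitation = np.zeros(n)
    excitation[::period] = 1.0
    a = resonances_to_all_pole(formants_hz, bandwidths_hz, fs)
    return all_pole_filter(excitation, a), a


if __name__ == "__main__":
    y, a = synthesize_vowel()
    print(len(y), len(a))
```

With the default arguments, seven numbers (the pitch plus three frequency and bandwidth pairs) determine the whole signal. A real coder would need a more realistic excitation, including a noise component for turbulent sounds.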

This is a representative list of vowels, but in no way complete. For example, diphthongs have been omitted, since for our purposes they can be modelled as a transition between two vowels.

Title: Speech Production and Modelling
Written by: Tom Bäckström
Publisher: Springer International Publishing
Book: Speech Coding
Print ISBN: 978-3-319-50202-1

Electronic ISBN: 978-3-319-50204-5

Copyright year: 2017
DOI: https://doi.org/10.1007/978-3-319-50204-5_2
