2008 | OriginalPaper | Book chapter

20. Rule-Based Speech Synthesis

Written by: Rolf Carlson, Prof., Björn Granström, Prof.

Published in: Springer Handbook of Speech Processing

Publisher: Springer Berlin Heidelberg

Abstract

In this chapter, we review some of the issues in rule-based synthesis and specifically discuss formant synthesis. Formant synthesis and the theory behind it have played an important role both in the scientific progress in understanding how humans talk and in the development of the first speech technology applications. Its flexibility and small footprint make the approach still of interest and a valuable complement to the currently dominant methods based on concatenative data-driven synthesis. As already mentioned in the overview by Schroeter (Chap. 19), we also see a new trend to combine the rule-based and data-driven approaches. Formant features from a database can be used both to optimize a rule-based formant synthesis system and to optimize the search for good units in a concatenative system.

20.1.

G. Fant: Acoustic Theory of Speech Production (Mouton, The Hague 1960)

20.2.

G. Fant: Speech Acoustics and Phonetics, Selected Writings Series: Text, Speech and Language Technology, Vol. 24 (Springer, Berlin, Heidelberg 2006)

20.3.

K. Stevens: Acoustic Phonetics (MIT Press, Cambridge 1999)

20.4.

J. Holmes, I.G. Mattingly, J.N. Shearme: Speech synthesis by rule, Lang. Speech 7, 127-143 (1964)

20.5.

I.G. Mattingly: Synthesis by rule as a tool for phonological research, Lang. Speech 14(1), 47-56 (1971)

20.6.

J.L. Flanagan: Speech Analysis, Synthesis and Perception (Springer, Berlin, Heidelberg 1972)

20.7.

D. K. Klatt: Structure of a phonological rule component for a synthesis-by-rule program, IEEE Trans. ASSP-24 (1976)

20.8.

J. Allen, M.S. Hunnicutt, D. Klatt: From Text to Speech. The MITalk System (Cambridge University Press, Cambridge 1987)

20.9.

R. Carlson, B. Granström: A text-to-speech system based entirely on rules, Proc. ICASSP 76, Philadelphia (1976)

20.10.

Y. Sagisaka: Speech synthesis from text, IEEE Commun. Mag. 28(1), 35-41 (1990)

20.11.

T. Dutoit: An Introduction to Text-to-Speech Synthesis (Kluwer Academic, Dordrecht 1997)

20.12.

R. Carlson, B. Granström: Speech synthesis. In: The Handbook of Phonetic Science, ed. by W.J. Hardcastle, J. Laver (Blackwell, Oxford 1997) pp. 768-788

20.13.

D.K. Klatt: Review of text-to-speech conversion for English, J. Acoust. Soc. Am. 82(3), 737-793 (1987)

20.14.

W. Lawrence: The synthesis of speech from signals which have a low information rate. In: Communication Theory, ed. by W. Jackson (Butterworths, London 1953) pp. 460-469

20.15.

G. Fant: Speech Communication Research, Ing. Vetenskaps Akad. Stockholm 24, 331-337 (1953)

20.16.

D.K. Klatt: Software for a cascade/parallel formant synthesizer, J. Acoust. Soc. Am. 67, 971 (1980)

20.17.

J. Holmes: Formant synthesizers, cascade or parallel, Speech Commun. 2, 251-273 (1983)

20.18.

K. Stevens, C. Bickley: Constraints among parameters simplify control of Klatt formant synthesizer, J. Phonetics 19(1) (1991)

20.19.

R. Carlson, B. Granström, I. Karlsson: Experiments with voice modelling in speech synthesis, Speech Commun. 10, 481-489 (1991)

20.20.

D. Klatt: The Klattalk text-to-speech conversion system, Proc. ICASSP 82, 1589-1592 (1982)

20.21.

D. Klatt: DecTalk userʼs manual, Digital Equipment Corporation (1990)

20.22.

J. Liljencrants: The OVE III speech synthesizer, IEEE Trans. Audio Electroac. 16(1), 137-140 (1968)

20.23.

R. Carlson, B. Granström, S. Hunnicutt: A multi-language text-to-speech module, Proc. ICASSP 82 82(3), 1604-1607 (1982)

20.24.

R. Carlson, B. Granström, S. Hunnicutt: Multilingual text-to-speech development and applications. In: Advances in speech, hearing and language processing, ed. by A.W. Ainsworth (JAI, London 1991)

20.25.

H.M. Hanson, K.N. Stevens: A quasiarticulatory approach to controlling acoustic source parameters in a Klatt-type formant synthesizer using HLsyn, J. Acoust. Soc. Am. 112, 1158-1182 (2002)

20.26.

K. Stevens: Toward Formant Synthesis with Articulatory Controls, Proceedings of IEEE Workshop on Speech Synthesis (2002)

20.27.

R. Ogden, S. Hawkins, J. House, M. Huckvale, J. Local, P. Carter, J. Dankovicová, S. Heid: ProSynth: An integrated prosodic approach to device-independent natural-sounding speech synthesis, Comput. Speech Lang. 14, 177-210 (2000)

20.28.

S. Heid, S. Hawkins: Synthesizing systematic variation at boundaries between vowels and obstruents. In: Proceedings of the XIVth International Congress of Phonetic Sciences, Vol. 1, ed. by J.J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, A.C. Bailey (University of California, Berkeley 1999) pp. 511-514

20.29.

C. Gobl, J. Karlsson: Male and female voice source dynamics. In: Vocal Fold Physiology: Acoustic, Perceptual, and Physiological Aspects of Voice Mechanisms, ed. by J. Gauffin, B. Hammarberg (Singular, San Diego 1991)

20.30.

T. V. Ananthapadmanabha: Acoustic analysis of voice source dynamics, STL-QPSR 2(3) 1-24 (1984)

20.31.

P. Hedelin: A glottal LPC-vocoder, Proc. IEEE, San Diego, 1.6.1-1.6.4 (1984)

20.32.

J. Holmes: Influence of the glottal waveform on the naturalness of speech from a parallel formant synthesizer, IEEE Trans. Audio Electroac. AU-21, 298-305 (1973)

20.33.

D.K. Klatt, L. Klatt: Analysis, synthesis, and perception of voice quality variations among female and male talkers, J. Acoust. Soc. Am. 87, 820-857 (1990)

20.34.

A.E. Rosenberg: Effect of glottal pulse shape on the quality of natural vowels, J. Acoust. Soc. Am. 53, 1632-1645 (1971)

20.35.

M. Rothenberg, R. Carlson, B. Granström, J. Lindqvist-Gauffin: A three-parameter voice source for speech synthesis, Proc. of Speech Communication Seminar, Stockholm 1974; in Speech Communication, Vol. 2 (Almqvist and Wiksell, Stockholm 1975) pp. 235-243

20.36.

R. Carlson, B. Granström: A Phonetically Oriented Programming Language for Rule Description of Speech. In: Speech Communication, Vol. 2, ed. by G. Fant (Almqvist Wiksell, Uppsala 1975) pp. 245-253

20.37.

P. Alku: An automatic inverse filtering method for the analysis of glottal waveforms, Dissertation (Helsinki University of Technology, Helsinki 1992)

20.38.

C. Gobl, A. Ní Chasaide: Acoustic characteristics of voice quality, Speech Commun. 11, 481-490 (1992)

20.39.

C. Gobl, A. Ní Chasaide: The role of voice quality in communicating emotion, mood and attitude, Speech Commun. 40, 189-212 (2003)

20.40.

I. Karlsson: Modelling speaking styles in female speech synthesis, Speech Commun. 11, 491-497 (1992)

20.41.

G. Fant, J. Liljencrants, Q. Lin: A four parameter model of glottal flow, Speech Transmission Laboratory Quarterly and Status Report STL-QPSR No 4 (1985)

20.42.

C. Bickley, K. Stevens: Effects of the vocal tract constriction on the glottal source: Experimental and modelling studies, J. Phon. 14, 373-382 (1986)

20.43.

K.N. Stevens: Airflow and turbulence noise for fricative and stop consonants: Static considerations, J. Acoust. Soc. Am. 50(4), 1180-1192 (1971)

20.44.

C.H. Shadle: The Aerodynamics of Speech. In: Handbook of Phonetics, ed. by W.J. Hardcastle, J. Laver (Blackwell, Oxford 1997) pp. 33-64

20.45.

P. Badin, G. Fant: Fricative modeling: some essentials, Proc. Europ. Conf. Speech Technol. Paris (1989)

20.46.

H.C. van Leeuwen, E. te Lindert: Speech Maker: A flexible and general framework for text-to-speech synthesis, and its application to Dutch, Comput. Speech Lang. 7(2), 149-168 (1993)

20.47.

P.C. Delattre, A.M. Liberman, F.S. Cooper: Acoustic loci and transitional cues for consonants, J. Acoust. Soc. Am. 27, 769-773 (1955)

20.48.

J. Liljencrants: Speech synthesizer control by smoothed step functions, STL-QPSR 1969(4), 43-50 (1969)

20.49.

D.H. Klatt: Synthesis of stop consonants in initial position, J. Acoust. Soc. Am. Suppl. 147, S93 (1970)

20.50.

N. Umeda: Linguistic rules for text-to-speech synthesis, Proc. IEEE 64(4), 443-451 (1976)

20.51.

D.K. Klatt: Synthesis by rule of segmental durations in English sentences. In: Frontiers in Speech Communication Research, ed. by B. Lindblom, S. Öhman (Academic, New York 1979)

20.52.

G. Bailly, R. Laboissière, J. L. Schwartz: Formant trajectories as audible gestures: an alternative for speech synthesis, J. Phon. 19(1), 9-23 (1991)

20.53.

B. Lindblom: Explaining phonetic variation: A sketch of the H and H theory. In: Speech Production Modeling, ed. by Hardcastle, Marchal (Kluwer Academic, Dordrecht 1990)

20.54.

R. J. J. H. van Son, L. Pols: Comparing formant movements in fast and normal rate speech, Proc. Europ. Conf. on Speech Commun. Technol. 89 (1989)

20.55.

A. Slater, S. Hawkins: Effects of stress and vowel context on velar stops in British English, ICSLP 92 (Proc. 1992 Int. Conf. Spoken Language Processing) 1, 57-60 (1992)

20.56.

N. Chomsky, M. Halle: Sound pattern of English (Harper and Row, New York 1968)

20.57.

J.B. Pierrehumbert: The Phonetics of English Intonation (IULC, Bloomington 1987)

20.58.

S. R. Hertz: Streams, phones, and transitions: toward a new phonological and phonetic model of formant timing, J. Phon. 19(1) (1991)

20.59.

S.R. Hertz, J. Kadin, K.J. Karplus: The Delta rule development system for speech synthesis from text, Proc. IEEE 73(11), 1589-1601 (1985)

20.60.

S. Lazzaretto, L. Nebbia: SCYLA: Speech compiler for your language, Proc. European Conf on Speech Comm and Technology, Edinburgh 1, 381-384 (1987)

20.61.

K. Ceder, B. Lyberg: Yet another rule compiler for text-to-speech conversion? Proc. ICSLP92, Banff, Canada, pp. 1151-1154 (1992)

20.62.

H. C. van Leeuwen, E. te Lindert: Speechmaker, text-to-speech synthesis based on a multilevel, synchronized data structure, Proc. ICASSP-91 (1991)

20.63.

R. Carlson, B. Granström: Data-driven multimodal synthesis, Issues Speech Commun. 47(1-2), 182-193 (2005)

20.64.

W. J. Holmes, D. J. B. Pearce: Automatic derivation of segment models for synthesis-by-rule. Proc ESCA Workshop on Speech Synthesis, Autrans, France (1990)

20.65.

G. Peterson, W. Wang, E. Sivertsen: Segmentation techniques in speech synthesis, J. Acoust. Soc. Am. 32, 639-703 (1958)

20.66.

N.R. Dixon, H.D. Maxey: Terminal Analog Synthesis of Continuous Speech Using the Diphone Method of Segment Assembly, IEEE Trans. Audio Electroac. AU-16, 40-50 (1968)

20.67.

J.P. Olive: Rule synthesis of speech from dyadic units, Proc. ICASSP 77, 568-570 (1977)

20.68.

R. H. Mannell: Formant diphone parameter extraction utilising a labeled single speaker database. In: Proc. ICSLP-98 (1998)

20.69.

H. Mori, T. Ohtsuka, H. Kasuya: A data-driven approach to source-formant type text-to-speech system, ICSLP 2002, 2365-2368 (2002)

20.70.

S. Hertz: Integration of Rule-Based Formant Synthesis and Waveform Concatenation: A Hybrid Approach to Text-to-Speech Synthesis, In: Proc. IEEE 2002 Workshop on Speech Synthesis, 11-13, Santa Monica (2002)

20.71.

D. Talkin: Looking at Speech. In: Speech Technology, 74-77 (1989)

20.72.

A. Acero: Formant analysis and synthesis using hidden Markov models, Proc. Eurospeech 99, 1047-1050 (1999)

20.73.

M. Lee, J. van Santen, B. Möbius, J. Olive: Formant tracking using context-dependent phonemic information, IEEE TSAP 13(5), 741-750 (2005)

20.74.

A.-M. Öster: The use of a synthesis-by-rule system in a study of deaf speech, STL-QPSR 1/ 1985, 95-107 (1985)

20.75.

B. Granström, A.-M. Öster: Speech synthesis for hearing impaired persons - in research, training and communication, STL/QPSR 2-3/ 94, 93-111 (1994)

20.76.

A. Kain, X. Niu, J. Hosom, J. Miao, J. van Santen: Formant Re-synthesis of Dysarthric Speech. Proceedings of IEEE Workshop on Speech Synthesis (2004)

20.77.

I. Murray, J. Arnott, N. Alm, A. Newell: A communication system for the disabled with emotional synthetic speech produced by rule, Procs. Eurospeech 91(1), 311-314 (1991)

20.78.

P. A. Cudd, S. Hunnicutt, J. Arthur, B. Granström, S. Aguilera, B. Waernulf, P. Dalsgaard, G. Wilson: Voices, attitudes and emotions in speech synthesis. In: Proc. of 2nd TIDE Congress on The European Context for Assistive Technology, ed. by I. Placencia Porrero, P. Puig de la Bellacasa (IOS Press Ohmsha, Paris, Amsterdam 1995) pp. 344-347

20.79.

J. Cahn: The generation of affect in synthesized speech, J. Am. Voice I/O Soc. 8 (1990)

Title: Rule-Based Speech Synthesis
Written by: Rolf Carlson, Prof., Björn Granström, Prof.
Publisher: Springer Berlin Heidelberg
Book: Springer Handbook of Speech Processing
Print ISBN: 978-3-540-49125-5
Electronic ISBN: 978-3-540-49127-9
Copyright year: 2008
DOI: https://doi.org/10.1007/978-3-540-49127-9_20