Skip to main content
Top

2006 | OriginalPaper | Chapter

An HMM-Based Approach to Flexible Speech Synthesis

Author : Keiichi Tokuda

Published in: Chinese Spoken Language Processing

Publisher: Springer Berlin Heidelberg

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

The increasing availability of large speech databases makes it possible to construct speech synthesis systems, which are referred to as corpus-based, data-driven, speaker-driven, or trainable approach, by applying statistical learning algorithms. These systems, which can be automatically trained, not only generate natural and high quality synthetic speech but also can reproduce voice characteristics of the original speaker. This talk presents one of these approaches, HMM-based speech synthesis. The basic idea of the approach is very simple: just train HMMs (hidden Markov models) and generate speech directly from them. To realize such a speech synthesis system, however, we need some tricks: algorithms for speech parameter generation from HMMs, and a mel-cepstrum based vocoding technique are reviewed, and an approach to simultaneous modeling of phonetic and prosodic parameters (spectrum, F0, and duration) is also presented. The main feature of the system is the use of dynamic feature: by inclusion of dynamic coefficients in the feature vector, the speech parameter sequence generated in synthesis is constrained to be realistic, as defined by the parameters of the HMMs. The attraction of this approach is that voice characteristics of synthesized speech can easily be changed by transforming HMM parameters. Actually, it has been shown that we can change voice characteristics of synthetic speech by applying a speaker adaptation technique which has been used in speech recognition systems. The relationship between the HMM-based approach and other concatenative speech synthesis approaches is also discussed. In the talk, not only the technical description but also recent results and demos will be presented.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Metadata
Title
An HMM-Based Approach to Flexible Speech Synthesis
Author
Keiichi Tokuda
Copyright Year
2006
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/11939993_5

Premium Partner