Abstract
Statistical parametric speech synthesis, based on hidden Markov model-like models, has become competitive with established concatenative techniques over the last few years. This paper offers a non-mathematical introduction to this method of speech synthesis. It is intended to be complementary to the wide range of excellent technical publications already available. Rather than offering a comprehensive literature review, it gives a small number of carefully chosen references that are good starting points for further reading.
References
Jurafsky D, Martin J H 2009 Speech and language processing, 2nd edition (Upper Saddle River, New Jersey, USA: Prentice Hall)
Kawahara H, Masuda-Katsuse I, de Cheveigné A 1999 Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based f0 extraction: Possible role of a repetitive structure in sounds, Speech Commun. 27(3–4): 187–207
Taylor P 2009 Text-to-speech synthesis (Cambridge: Cambridge University Press)
Zen H, Tokuda K 2009 TechWare: HMM-based speech synthesis resources, IEEE Signal Processing Magazine
Zen H, Tokuda K, Black A W 2009 Statistical parametric speech synthesis, Speech Commun. 51(11): 1039–1064
Zen H, Tokuda K, Kitamura T 2007 Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences, Comput. Speech Lang. 21(1): 153–173
Cite this article
King, S. An introduction to statistical parametric speech synthesis. Sadhana 36, 837–852 (2011). https://doi.org/10.1007/s12046-011-0048-y
DOI: https://doi.org/10.1007/s12046-011-0048-y