2017 | OriginalPaper | Chapter

Quality Improvement of Vietnamese HMM-Based Speech Synthesis System Based on Decomposition of Naturalness and Intelligibility Using Non-negative Matrix Factorization

Authors: Anh-Tuan Dinh, Thanh-Son Phan, Masato Akagi

Published in: Advances in Information and Communication Technology

Publisher: Springer International Publishing

Abstract

Hidden Markov model (HMM)-based synthesized speech is intelligible but not natural, especially under limited-data conditions. The goal of this study is to improve naturalness without compromising acceptable intelligibility by decomposing the naturalness and intelligibility of synthesized speech using a novel asymmetric bilinear model involving non-negative matrix factorization (NMF). Subjective evaluations carried out on Vietnamese data confirmed that the achieved synthesis quality is higher than that of other methods under limited-data conditions. The F0 contour is important for naturalness and intelligibility, especially in Vietnamese, and the proposed method is capable of modifying an over-smoothed F0 contour without destroying tonal information.
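
For readers unfamiliar with NMF, the following is a minimal sketch of the generic factorization step only, using the standard multiplicative update rules for the Frobenius-norm objective. It is not the chapter's asymmetric bilinear model, which the abstract does not specify; the function names `nmf` and `frobenius_error`, the matrix shapes, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix V (n x m) as W @ H.

    W is (n x rank) and H is (rank x m), both kept non-negative by
    multiplicative updates. This is a generic sketch, not the
    decomposition used in the chapter.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialization (hypothetical choice).
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Update H with W fixed; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update W with the new H fixed.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def frobenius_error(V, W, H):
    """Reconstruction error ||V - W H||_F."""
    return float(np.linalg.norm(V - W @ H))


if __name__ == "__main__":
    # Toy non-negative data with an exact rank-2 structure (illustrative only).
    rng = np.random.default_rng(1)
    V = rng.random((6, 2)) @ rng.random((2, 8))
    W, H = nmf(V, rank=2)
    print("reconstruction error:", frobenius_error(V, W, H))
```

In a speech setting, `V` would typically hold non-negative feature vectors (one per column), but which features the authors factorize, and how the factors map to naturalness and intelligibility, cannot be recovered from the abstract alone.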

Title: Quality Improvement of Vietnamese HMM-Based Speech Synthesis System Based on Decomposition of Naturalness and Intelligibility Using Non-negative Matrix Factorization
Authors: Anh-Tuan Dinh, Thanh-Son Phan, Masato Akagi
Publisher: Springer International Publishing
Book: Advances in Information and Communication Technology
Print ISBN: 978-3-319-49072-4

Electronic ISBN: 978-3-319-49073-1

Copyright Year: 2017
DOI: https://doi.org/10.1007/978-3-319-49073-1_53
