01 July 2019 | Issue 4/2019

Automatic Control and Computer Sciences 4/2019

LSTM-Based Robust Voicing Decision Applied to DNN-Based Speech Synthesis

Journal:
Automatic Control and Computer Sciences > Issue 4/2019
Authors:
R. Pradeep, M. Kiran Reddy, K. Sreenivasa Rao

Abstract

The quality of statistical parametric speech synthesis (SPSS) depends on accurate voiced/unvoiced classification: errors in the voicing decision can significantly degrade speech quality. This paper proposes a robust voicing detection method for SPSS based on the power spectrum and a long short-term memory (LSTM) network. The performance of the proposed method is evaluated on the CMU Arctic, Keele, and MIR-1K databases. Its effectiveness is further analyzed in deep neural network (DNN)-based SPSS. The results show that the proposed method classifies voiced and unvoiced speech segments better, which significantly improves the speech quality.
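
The abstract names the power spectrum as the input representation but gives no implementation details. As a rough illustration only, the following sketch computes frame-wise power spectra of the kind that could be fed, frame by frame, to a sequence classifier such as an LSTM. The function name `frame_power_spectrum`, the 16 kHz sampling rate, the 25 ms frame length, the 5 ms hop, the Hann window, and the FFT size are all assumptions made for this example and are not taken from the paper; the LSTM stage itself is not shown.

```python
import numpy as np


def frame_power_spectrum(signal, frame_len=400, hop=80, n_fft=512):
    """Return per-frame power spectra, shape (n_frames, n_fft // 2 + 1).

    All default values are illustrative assumptions (25 ms frames and
    5 ms hop at 16 kHz), not settings reported in the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad very short inputs so that at least one frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Power spectrum = squared magnitude of the one-sided FFT of each frame.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spectrum) ** 2


# Synthetic demo signals (assumed, not real speech): a 200 Hz tone as a
# crude stand-in for voiced speech and low-level noise for unvoiced speech.
sr = 16000
t = np.arange(sr // 10) / sr
voiced_like = np.sin(2 * np.pi * 200 * t)
noise_like = 0.1 * np.random.default_rng(0).standard_normal(t.shape)

ps_voiced = frame_power_spectrum(voiced_like)
ps_noise = frame_power_spectrum(noise_like)
peak_hz = ps_voiced.mean(axis=0).argmax() * sr / 512
```

In this toy setup the tone concentrates its energy in a narrow band near 200 Hz, while the noise spreads energy across all bins. A trained classifier would have to learn such distinctions from labeled speech, which this sketch does not attempt.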
