Voice conversion based on state-space model for modelling spectral trajectory
A novel voice conversion (VC) method using a state-space model (SSM) is presented. The SSM, which has not previously been applied to VC, has the advantage of explicitly modelling the trajectory of the spectral parameters. It is therefore expected to outperform the conventional Gaussian mixture model (GMM)-based method, which converts speech frame by frame and ignores the correlation between adjacent frames. Experiments using both objective and subjective measurements show that the proposed SSM-based method significantly outperforms the traditional GMM-based technique in terms of both speech quality and conversion accuracy for speaker individuality.
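The contrast between frame-by-frame processing and trajectory modelling can be illustrated with a toy example. The sketch below is not the method proposed here: it assumes a scalar random-walk state-space model with a standard Kalman filter, and the names `frame_by_frame`, `kalman_filter`, and `roughness`, the noise variances `q` and `r`, and the synthetic one-dimensional track are all hypothetical choices made only for illustration.

```python
import math


def frame_by_frame(source):
    # Each output frame depends only on the current input frame.
    # An identity mapping stands in for a per-frame conversion function.
    return [x for x in source]


def kalman_filter(observations, q=0.01, r=0.25):
    """Scalar Kalman filter for the assumed model
    x_t = x_{t-1} + w_t,  y_t = x_t + v_t,
    with process noise variance q and observation noise variance r."""
    estimate = observations[0]  # initialise the state at the first frame
    variance = r                # initial uncertainty equals observation noise
    filtered = [estimate]
    for y in observations[1:]:
        # Predict: a random-walk state keeps its mean and gains variance q.
        predicted_variance = variance + q
        # Update: blend the prediction with the new observation.
        gain = predicted_variance / (predicted_variance + r)
        estimate = estimate + gain * (y - estimate)
        variance = (1.0 - gain) * predicted_variance
        filtered.append(estimate)
    return filtered


def roughness(track):
    # Sum of squared differences between adjacent frames.
    return sum((b - a) ** 2 for a, b in zip(track, track[1:]))


# Synthetic "spectral parameter" track: a slow sine plus alternating jitter.
clean = [math.sin(2 * math.pi * t / 100) for t in range(100)]
noisy = [c + (0.5 if t % 2 == 0 else -0.5) for t, c in enumerate(clean)]

per_frame = frame_by_frame(noisy)
trajectory = kalman_filter(noisy)
```

Comparing `roughness(per_frame)` with `roughness(trajectory)` shows the intended effect: because the state carries information from one frame to the next, the filtered track varies far less between adjacent frames than the per-frame output does.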