2011 | OriginalPaper | Chapter
Morphological Analysis Based Part-of-Speech Tagging for Uyghur Speech Synthesis
Authors: Guljamal Mamateli, Askar Rozi, Gulnar Ali, Askar Hamdulla
Published in: Knowledge Engineering and Management
Publisher: Springer Berlin Heidelberg
The accuracy of part-of-speech tagging is critical to the downstream sub-tasks of the front-end text analysis model of a text-to-speech system. Uyghur is an agglutinative language in which a large number of words are formed by attaching suffixes to a stem (or root). Because an unlimited number of newly formed and derived words can arise in Uyghur, conventional Uyghur part-of-speech tagging methods, which train on and predict the part of speech of whole words directly, suffer from large tag sets and frequent out-of-vocabulary words. To address this problem, this paper proposes training on the part of speech of stems and predicting the part of speech of a word mainly from its stem. A bi-gram language model is used to locate the boundary between stem and affix in each word, and a hidden Markov model is used to train and predict the part of speech of the stem. Finally, a rule-based adjustment corrects the part of speech of a word when it changes because a suffix is attached to the stem. Experimental results show that the proposed method clearly reduces the part-of-speech tagging error rate compared with the conventional part-of-speech tagging method.
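The three stages named in the abstract (bi-gram segmentation, hidden Markov model tagging of stems, rule-based adjustment) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation. Every probability, the two-tag set, the Latin-script example morphemes, the smoothing floor, and the function names `segment`, `viterbi`, and `tag_sentence` are hypothetical and do not come from the chapter.

```python
import math

FLOOR = 1e-6  # assumed probability for unseen events (crude smoothing)

# Toy morpheme bi-gram model P(next | previous); all values are invented.
BIGRAM = {
    ("<w>", "kitab"): 0.4, ("<w>", "oqu"): 0.3, ("<w>", "yaz"): 0.3,
    ("kitab", "lar"): 0.5, ("kitab", "</w>"): 0.5,
    ("oqu", "di"): 0.4, ("oqu", "ghuchi"): 0.3, ("oqu", "</w>"): 0.3,
    ("yaz", "di"): 0.5, ("yaz", "</w>"): 0.5,
}

# Toy HMM over stem tags (N = noun, V = verb); all values are invented.
TAGS = ("N", "V")
START = {"N": 0.6, "V": 0.4}
TRANS = {("N", "N"): 0.3, ("N", "V"): 0.7, ("V", "N"): 0.6, ("V", "V"): 0.4}
EMIT = {("N", "kitab"): 0.9, ("V", "oqu"): 0.6, ("V", "yaz"): 0.4}

# Hypothetical adjustment rule: (stem tag, suffix) -> tag of the whole word.
RULES = {("V", "ghuchi"): "N"}


def logp(table, key):
    """Log-probability lookup with a floor for unseen keys."""
    return math.log(table.get(key, FLOOR))


def segment(word):
    """Pick the stem/suffix split with the highest bi-gram score."""
    candidates = []
    for i in range(1, len(word) + 1):
        stem, suffix = word[:i], word[i:]
        # A bare stem is scored as being followed by the end-of-word marker.
        score = logp(BIGRAM, ("<w>", stem)) + logp(BIGRAM, (stem, suffix or "</w>"))
        candidates.append((score, stem, suffix))
    _, stem, suffix = max(candidates)
    return stem, suffix


def viterbi(stems):
    """Most likely tag sequence for a list of stems under the toy HMM."""
    if not stems:
        return []
    best = {t: (math.log(START[t]) + logp(EMIT, (t, stems[0])), [t]) for t in TAGS}
    for stem in stems[1:]:
        step = {}
        for t in TAGS:
            # Best predecessor for tag t, by accumulated score plus transition.
            score, path = max(
                (best[p][0] + math.log(TRANS[(p, t)]), best[p][1]) for p in TAGS
            )
            step[t] = (score + logp(EMIT, (t, stem)), path + [t])
        best = step
    return max(best.values())[1]


def tag_sentence(words):
    """Segment each word, tag the stems, then apply the adjustment rules."""
    parts = [segment(w) for w in words]
    stem_tags = viterbi([stem for stem, _ in parts])
    return [RULES.get((tag, suffix), tag) for tag, (_, suffix) in zip(stem_tags, parts)]
```

For example, `tag_sentence(["oqughuchi", "kitab", "oqudi"])` segments the first word into the stem `oqu` and suffix `ghuchi`, tags the stem as a verb, and then lets the rule table relabel the whole word as a noun. Because the HMM only ever sees stems, its emission table stays small even when the number of suffixed word forms grows, which is the motivation the abstract gives for tagging by stem.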