2011 | OriginalPaper | Chapter

Text-To-Speech Synthesis System for Punjabi Language

Authors: Parminder Singh, Gurpreet Singh Lehal

Published in: Information Systems for Indian Languages

Publisher: Springer Berlin Heidelberg

A Text-To-Speech (TTS) synthesis system has been developed for Punjabi text written in the Gurmukhi script, using the concatenative method. Syllables have been reported to be a good choice of speech unit for the speech databases of many languages. Since Punjabi is a syllabic language, the syllable has been selected as the basic speech unit for this TTS system, which preserves within-unit co-articulation effects. The working of the system can be divided into two modules: the Online Process and the Offline Process. The Online process is responsible for pre-processing the input text, schwa deletion, syllabification, and then searching for the syllables in the speech database. Pre-processing involves the expansion of abbreviations, numeric figures, special symbols, etc.

Schwa deletion is an important step in developing a high-quality TTS system. Phonetically, schwa is a very short neutral vowel sound and, like all vowels, its precise quality varies depending on the adjacent consonants. When words are uttered, not every schwa following a consonant is pronounced. To determine the proper pronunciation of words, it is necessary to identify which schwas are to be deleted and which are to be retained. Grammar rules, inflectional rules and the morphotactics of the language play an important role in identifying the schwas to be deleted. A rule-based schwa deletion algorithm has been developed for Punjabi, with an accuracy of about 98.27%.

Syllabification of the words of the input text is also challenging, because defining a syllable in a language is a complex task. Many theories in phonetics and phonology are available to define a syllable. In phonetics, syllables are defined on the basis of articulation, whereas in the phonological approach they are defined by the different sequences of phonemes. In every language, certain sequences of phonemes are recognized. In Punjabi, seven types of syllables are recognized: V, VC, CV, VCC, CVC, CVCC and CCVC (where V and C represent a vowel and a consonant respectively), which combine in turn to produce words. A syllabification algorithm for Punjabi has been developed with an accuracy of about 96.7%; it works on the output of the schwa deletion algorithm.
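To illustrate what the seven syllable types imply, the sketch below splits a consonant/vowel skeleton (a string of the letters C and V) into a sequence of allowed types. It is not the chapter's syllabification algorithm, whose rules are not given in this abstract. The function name `segment_skeleton`, the preference for the fewest syllables, and the assumption that each phoneme has already been mapped to C or V after schwa deletion are all hypothetical choices made for this example.

```python
from functools import lru_cache
from typing import List, Optional

# The seven syllable types named in the abstract.
SYLLABLE_TYPES = ("V", "VC", "CV", "VCC", "CVC", "CVCC", "CCVC")


def segment_skeleton(skeleton: str) -> Optional[List[str]]:
    """Split a C/V skeleton into allowed syllable types, or return None.

    Assumption of this sketch (not a rule from the chapter): among all
    valid segmentations, one with the fewest syllables is returned.
    """

    @lru_cache(maxsize=None)
    def best(start: int):
        # Reached the end of the skeleton: nothing left to segment.
        if start == len(skeleton):
            return ()
        candidates = []
        for pattern in SYLLABLE_TYPES:
            if skeleton.startswith(pattern, start):
                rest = best(start + len(pattern))
                if rest is not None:
                    candidates.append((pattern,) + rest)
        # No allowed type fits here, so this branch is a dead end.
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return None if result is None else list(result)
```

A skeleton such as `CVCC` comes back as a single syllable, while a skeleton that no combination of the seven types can cover, such as `CC`, yields `None`. A real syllabifier would need language-specific rules to choose between competing segmentations.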

The Offline process of this TTS system involved the development of the Punjabi speech database. To minimize the size of the speech database, an effort has been made to select a minimal set of syllables covering almost the whole Punjabi word set. To accomplish this, all Punjabi syllables have been statistically analyzed on a Punjabi corpus of more than 104 million words. The analysis yielded important results that help to select a relatively small set of the most frequently occurring syllables (about the first ten thousand syllables, 0.86% of the 1156740 total available syllables), having a cumulative frequency of occurrence of less than 99.81%. The developed Punjabi speech database stores the starting and end positions of the selected syllable sounds, carefully labeled in a wave file of recorded words. As a syllable's sound varies with its position in the word (starting, middle or end), separate entries for these three positions have been made in the database for each syllable. An algorithm based on the set covering problem has been developed for selecting the minimum number of words containing the selected syllables; these words are recorded in the sound file in which the syllable positions are marked.
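The word-selection step can be pictured with the standard greedy heuristic for set cover, sketched below. The abstract only says the authors' algorithm is based on the set covering problem, so this is an assumed stand-in, not their method. The greedy heuristic gives a small cover but not necessarily the minimum one. The function name `select_recording_words` and the data layout (a dictionary from each candidate word to the set of units it contains, where a unit could be a syllable paired with its position) are hypothetical.

```python
def select_recording_words(word_units, required):
    """Greedy set cover: repeatedly pick the word covering the most
    still-uncovered units.

    word_units: dict mapping a word to the set of units it contains
                (for example, (syllable, position) pairs).
    required:   iterable of units that must be covered.
    Returns (chosen_words, units_left_uncovered).
    """
    uncovered = set(required)
    chosen = []
    while uncovered:
        # Sorting the words makes tie-breaking deterministic.
        best_word = max(
            sorted(word_units),
            key=lambda w: len(uncovered & set(word_units[w])),
        )
        gained = uncovered & set(word_units[best_word])
        if not gained:
            break  # no remaining word covers anything still needed
        chosen.append(best_word)
        uncovered -= gained
    return chosen, uncovered
```

With placeholder data such as `{"w1": {"a", "b"}, "w2": {"b", "c"}, "w3": {"a", "b", "c"}}` and required units `{"a", "b", "c"}`, the function picks the single word `w3`. Any unit that appears in no candidate word is reported back in the second return value instead of being silently dropped.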

The syllables of the input text are first searched in the speech database for the corresponding syllable-sound positions in the recorded wave file, and these syllable sounds are then concatenated. The synthesized Punjabi sound is normalised to remove the discontinuities at the concatenation points, producing a smooth, natural sound. This TTS system produces good-quality sound for the Punjabi language.
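A minimal sketch of the lookup-and-concatenate step follows. The abstract does not describe the normalisation method, so the short linear crossfade used here is an assumed, generic way of smoothing a joint, not the authors' technique. The helper names `extract_segment` and `crossfade_concat`, the representation of audio as plain lists of float samples, and the `overlap` parameter are all hypothetical.

```python
def extract_segment(samples, start, end):
    """Cut one syllable sound out of a recording, given the start and
    end sample positions stored for it."""
    return list(samples[start:end])


def crossfade_concat(segments, overlap):
    """Join segments, blending `overlap` samples linearly at each joint.

    Assumption: a linear crossfade stands in for the normalisation step,
    whose details the abstract does not give.
    """
    out = list(segments[0]) if segments else []
    for seg in segments[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(seg))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[base + i] = (1 - w) * out[base + i] + w * seg[i]
        out.extend(seg[n:])
    return out
```

With `overlap` set to 0 the function reduces to plain concatenation. A larger overlap shortens the output by that many samples per joint and replaces the abrupt jump between segments with a gradual transition.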
