2011 | OriginalPaper | Book chapter
Small-Word Pronunciation Modeling for Arabic Speech Recognition: A Data-Driven Approach
Authors: Dia AbuZeina, Wasfi Al-khatib, Moustafa Elshafei
Published in: Information Retrieval Technology
Publisher: Springer Berlin Heidelberg
Incorrect recognition of adjacent small words is one of the obstacles to improving the performance of automatic continuous speech recognition systems. Pronunciation variation in the phonemes of adjacent words introduces ambiguity into the triphones of the acoustic model and adds confusion for the speech recognition decoder. Small words are more likely to be affected by this ambiguity than longer words. In this paper, we present a data-driven approach to modeling the small-words problem. The proposed method identifies adjacent small words in the corpus transcription and generates compound words from them. The unique compound words are then added to the expanded pronunciation dictionary, as well as to the language model as a new sentence. Results show a significant improvement of 2.16% in the word error rate compared to that of the baseline speech corpus of Modern Standard Arabic broadcast news.
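The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions, since the abstract does not specify them: the definition of "small" (here a character-length threshold, `MAX_SMALL_LEN`), the underscore used to join words, the choice to build a compound pronunciation by concatenating the phone sequences of its parts, and the reading of "as a new sentence" as one new language-model sentence per compound. All function names, the transliterated tokens, and the phone symbols are hypothetical.

```python
from typing import Dict, List, Set, Tuple

MAX_SMALL_LEN = 3  # assumption: the abstract does not define "small"
JOINER = "_"       # assumption: separator used to form a compound token


def is_small(word: str, max_len: int = MAX_SMALL_LEN) -> bool:
    """Treat a word as 'small' if it has at most max_len characters."""
    return len(word) <= max_len


def find_small_pairs(sentences: List[List[str]],
                     max_len: int = MAX_SMALL_LEN) -> Set[Tuple[str, str]]:
    """Collect the unique pairs of adjacent small words in the transcription."""
    pairs: Set[Tuple[str, str]] = set()
    for words in sentences:
        for left, right in zip(words, words[1:]):
            if is_small(left, max_len) and is_small(right, max_len):
                pairs.add((left, right))
    return pairs


def expand_dictionary(lexicon: Dict[str, List[str]],
                      pairs: Set[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return a copy of the lexicon with one entry added per compound word.

    Assumption: the compound pronunciation is the concatenation of the
    pronunciations of its two parts.
    """
    expanded = dict(lexicon)
    for left, right in sorted(pairs):
        if left in lexicon and right in lexicon:
            expanded[left + JOINER + right] = lexicon[left] + lexicon[right]
    return expanded


def language_model_sentences(pairs: Set[Tuple[str, str]]) -> List[str]:
    """List each unique compound as its own new language-model sentence."""
    return [left + JOINER + right for left, right in sorted(pairs)]
```

For example, calling `find_small_pairs` on a tokenized transcription and passing the result to `expand_dictionary` and `language_model_sentences` yields the extra dictionary entries and training sentences. A real system would also need to decide how the compounds are split back into their component words when scoring recognition output, which the abstract does not cover.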