2011 | OriginalPaper | Chapter

Part of Speech Tagging Approach to Designing Compound Words for Arabic Continuous Speech Recognition Systems

Authors : Dia AbuZeina, Moustafa Elshafei, Wasfi Al-Khatib

Published in: Informatics Engineering and Information Science

Publisher: Springer Berlin Heidelberg

Misrecognition of small words is one of the factors that lead to suboptimal performance in automatic continuous speech recognition systems. In general, small words generate far more errors than long words. Compounding some words (small or long) into longer words therefore benefits speech recognition decoders. In this paper, we present a novel approach to artificially generating compound words using part-of-speech tagging. For this purpose, we consider two cases in which Arabic words are usually pronounced together without any intervening silence: a noun followed by an adjective, and a preposition followed by any other word. To collect the candidate compound words, we use the Stanford Arabic tagger to tag all words in our baseline transcription corpus. Using Sphinx 3, we test the proposed method on a 5.4-hour speech corpus of Modern Standard Arabic. The results show a significant improvement: the word error rate is reduced by 2.39%.
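The two tag patterns named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the corpus has already been tagged and is available as lists of (word, tag) pairs. The tag sets (`NN`, `DTNN`, `JJ`, `DTJJ`, `IN`, and so on), the function names, the underscore joiner, and the transliterated sample words are all assumptions made for illustration. The abstract does not say how candidates are filtered or merged, so the counting and greedy merging below are placeholders for that step.

```python
from collections import Counter

# Assumed tag sets; the tags actually emitted by the tagger may differ.
NOUN_TAGS = {"NN", "NNS", "NNP", "DTNN", "DTNNS"}
ADJ_TAGS = {"JJ", "DTJJ"}
PREP_TAGS = {"IN"}


def collect_compound_candidates(tagged_sentences):
    """Count adjacent word pairs matching either pattern from the abstract:
    noun followed by adjective, or preposition followed by any word."""
    counts = Counter()
    for sentence in tagged_sentences:
        # Walk over every pair of neighbouring (word, tag) tokens.
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            noun_then_adj = t1 in NOUN_TAGS and t2 in ADJ_TAGS
            prep_then_any = t1 in PREP_TAGS
            if noun_then_adj or prep_then_any:
                counts[(w1, w2)] += 1
    return counts


def merge_compounds(words, compounds, joiner="_"):
    """Greedily rewrite a word sequence, left to right, joining any
    adjacent pair that appears in the chosen compound set."""
    merged = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in compounds:
            merged.append(words[i] + joiner + words[i + 1])
            i += 2  # both words were consumed by the compound
        else:
            merged.append(words[i])
            i += 1
    return merged


# Hypothetical transliterated sample: preposition, noun, adjective.
tagged = [
    [("fy", "IN"), ("Albyt", "DTNN"), ("Alkbyr", "DTJJ")],
    [("fy", "IN"), ("Albyt", "DTNN")],
]
candidates = collect_compound_candidates(tagged)
rewritten = merge_compounds(["fy", "Albyt", "Alkbyr"], {("fy", "Albyt")})
```

In a pipeline of this kind, the merged tokens would presumably be added to the pronunciation dictionary and the language model training text before decoding. That step is not shown here because the abstract does not describe it.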

Title: Part of Speech Tagging Approach to Designing Compound Words for Arabic Continuous Speech Recognition Systems
Authors: Dia AbuZeina, Moustafa Elshafei, Wasfi Al-Khatib
Publisher: Springer Berlin Heidelberg
Book: Informatics Engineering and Information Science
Print ISBN: 978-3-642-25482-6
Electronic ISBN: 978-3-642-25483-3
Copyright Year: 2011
DOI: https://doi.org/10.1007/978-3-642-25483-3_27
