2011 | OriginalPaper | Chapter

Part of Speech Tagging Approach to Designing Compound Words for Arabic Continuous Speech Recognition Systems

Authors: Dia AbuZeina, Moustafa Elshafei, Wasfi Al-Khatib

Published in: Informatics Engineering and Information Science

Publisher: Springer Berlin Heidelberg

Misrecognition of small words is one of the factors that lead to suboptimal performance in automatic continuous speech recognition systems. In general, small words produce far more errors than long words. Compounding some words (small or long) into longer words therefore helps speech recognition decoders. In this paper, we present a novel approach to artificially generating compound words using part-of-speech tagging. For this purpose, we consider two cases in which Arabic words are usually pronounced together without any intervening silence: a noun followed by an adjective, and a preposition followed by any other word. To collect the candidate compound words, we use the Stanford Arabic tagger to tag all words in our baseline transcription corpus. Using Sphinx 3, we test the proposed method on a 5.4-hour speech corpus of Modern Standard Arabic. The results show a significant improvement, with the word error rate reduced by 2.39%.
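The candidate-collection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names (`NN`, `DTNN`, `JJ`, `DTJJ`, `IN`, and so on) are assumed to follow the Penn-style tag set commonly produced by the Stanford Arabic tagger, the function names are hypothetical, and both the frequency threshold and the underscore used to join words are illustrative choices that the abstract does not specify. The input is taken to be already tagged text, with placeholder transliterated tokens standing in for real corpus words.

```python
from collections import Counter

# Assumed tag names (Penn-style); adjust to the tag set your tagger emits.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS", "DTNN", "DTNNS", "DTNNP", "DTNNPS"}
ADJ_TAGS = {"JJ", "DTJJ"}
PREP_TAGS = {"IN"}


def collect_compound_candidates(tagged_sentences):
    """Count adjacent word pairs that match either of the two cases:
    a noun followed by an adjective, or a preposition followed by any word."""
    counts = Counter()
    for sentence in tagged_sentences:
        for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
            noun_adj = t1 in NOUN_TAGS and t2 in ADJ_TAGS
            prep_any = t1 in PREP_TAGS
            if noun_adj or prep_any:
                counts[(w1, w2)] += 1
    return counts


def merge_compounds(words, compounds, joiner="_"):
    """Greedily rewrite a word sequence left to right, joining selected pairs.
    The joiner is an illustrative choice, not taken from the paper."""
    merged = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in compounds:
            merged.append(words[i] + joiner + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


# Hypothetical tagged transcriptions (placeholder transliterated tokens).
tagged = [
    [("fy", "IN"), ("Albyt", "DTNN"), ("Alkbyr", "DTJJ")],
    [("ktAb", "NN"), ("jdyd", "JJ"), ("fy", "IN"), ("Albyt", "DTNN")],
]

candidates = collect_compound_candidates(tagged)

# Assumed selection rule: keep pairs seen at least twice.
selected = {pair for pair, n in candidates.items() if n >= 2}

rewritten = merge_compounds(["fy", "Albyt", "Alkbyr"], selected)
print(rewritten)
```

In a full pipeline the merged tokens would also need entries in the pronunciation dictionary and the language model would be retrained on the rewritten transcriptions; those steps are outside the scope of this sketch.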

Metadata
Title: Part of Speech Tagging Approach to Designing Compound Words for Arabic Continuous Speech Recognition Systems
Authors: Dia AbuZeina, Moustafa Elshafei, Wasfi Al-Khatib
Copyright Year: 2011
Publisher: Springer Berlin Heidelberg
DOI: https://doi.org/10.1007/978-3-642-25483-3_27
