28-07-2020 | Original Paper | Issue 3/2020

International Journal on Document Analysis and Recognition (IJDAR) 3/2020

A benchmark for unconstrained online handwritten Uyghur word recognition

Authors:
Wujiahemaiti Simayi, Mayire Ibrahim, Xu-Yao Zhang, Cheng-Lin Liu, Askar Hamdulla
Important notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abstract

Despite interesting results from several research groups, no public database or baseline study for Uyghur online handwriting recognition is yet available for comparison. To fill this gap, we present a database of Uyghur online handwritten words and carry out the first benchmark experiments on it. The database contains 125,020 samples of 2030 words collected from 393 writers. To reflect the characteristics of the Uyghur lexicon, two out-of-vocabulary datasets are provided specifically for evaluation. We carry out unconstrained handwritten word recognition experiments on the database using recurrent neural networks as the base model. Recognition results are obtained using connectionist temporal classification, without lexicon search or an external language model. Concatenated and averaged bidirectional recurrent layers are compared for better generalization. Based on the Unicode representation of Uyghur, we compare models that use different alphabets, built from either character types or character forms. To improve generalization, we propose a 1D convolutional model, which uses 1D convolutional layers for sequence feature extraction. In our experiments, the proposed 1D convolutional model and its variations surpassed the recurrent base model on out-of-vocabulary words by a clear margin. A character accurate rate (CAR) of 83.23% was obtained when out-of-vocabulary samples were used for testing. The highest recognition rate, 94.95% CAR, was obtained when the test set shares the same lexicon as the training set. The experiments in this paper can serve as baseline references for future studies using this database.
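As a rough illustration of what recognition "without lexicon search" and scoring by CAR can look like, the sketch below decodes per-frame label scores with best-path (greedy) connectionist temporal classification decoding and scores the result with an edit-distance-based character rate. It is not the authors' code. The function names, the blank index of 0, and the CAR formula (one minus total edit errors divided by total reference characters) are assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (not stated in the abstract)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path decoding: take the top label per frame, merge repeats, drop blanks."""
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit a label only when it is not blank and differs from the previous frame.
        if label != blank and label != previous:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def character_accurate_rate(refs: Sequence[Sequence],
                            hyps: Sequence[Sequence]) -> float:
    """Assumed definition: 1 - (total edit errors / total reference characters)."""
    total_chars = sum(len(r) for r in refs)
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 1.0 - total_errors / total_chars


# Five frames over three classes (blank, label 1, label 2).
scores = [[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.9, 0.05, 0.05],
          [0.1, 0.7, 0.2], [0.1, 0.2, 0.7]]
hypothesis = ctc_greedy_decode(scores)
```

Because no lexicon constrains the output, the decoder can emit any label sequence, which is what allows out-of-vocabulary words to be recognized at all. A character-level measure such as CAR then gives partial credit for nearly correct words.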
