Published in: Neural Computing and Applications 3/2016

1 April 2016 | Original Article

Evaluation of cursive and non-cursive scripts using recurrent neural networks

Authors: Saad Bin Ahmed, Saeeda Naz, Muhammad Imran Razzak, Shiekh Faisal Rashid, Muhammad Zeeshan Afzal, Thomas M. Breuel

Abstract

Character recognition has been widely used since its inception in applications that involve processing scanned or camera-captured documents. Languages are written in many scripts, which can broadly be divided into cursive and non-cursive scripts. Recurrent neural networks (RNNs) have been shown to obtain state-of-the-art results for optical character recognition (OCR). We present a thorough investigation of the performance of RNNs on cursive and non-cursive scripts. We employ bidirectional long short-term memory (BLSTM) networks, a variant of the standard RNN. The output layer of the architecture is a connectionist temporal classification (CTC) layer, which performs the sequence alignment: it takes the LSTM activations as input and aligns the target labels with the inputs. The results obtained at the character level for both cursive Urdu and non-cursive English scripts are significant and suggest that the BLSTM technique is potentially more useful than existing OCR algorithms.
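To make the role of the CTC output layer more concrete, the following is a minimal sketch of best-path (greedy) CTC decoding. It is not the authors' implementation. It assumes the recurrent layers have already produced one score vector per input frame, that label index 0 is the CTC blank, and that the remaining indices map onto a toy alphabet. The function name `ctc_greedy_decode`, the constant `BLANK`, and the example scores are all hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for every frame.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]

    decoded: List[str] = []
    previous = BLANK
    for label in best_path:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != BLANK:
            decoded.append(alphabet[label - 1])  # index 0 is blank, so shift by one
        previous = label
    return "".join(decoded)


# Toy per-frame scores over (blank, 'a', 'b', 'c'); the values are made up.
frames = [
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.80, 0.05, 0.05],  # a
    [0.20, 0.70, 0.05, 0.05],  # a (repeat, merged with the previous frame)
    [0.60, 0.20, 0.10, 0.10],  # blank (separates the two 'a' characters)
    [0.10, 0.70, 0.10, 0.10],  # a
    [0.10, 0.10, 0.70, 0.10],  # b
    [0.10, 0.20, 0.60, 0.10],  # b (repeat)
]

print(ctc_greedy_decode(frames, "abc"))  # prints "aab"
```

The blank label is what lets the decoder distinguish a genuinely doubled character from one character that spans several frames, which is why no explicit character segmentation of the text line is needed. During training, CTC instead sums over all frame-level paths that collapse to the target label sequence; that part is omitted from this sketch.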

Metadata
Title
Evaluation of cursive and non-cursive scripts using recurrent neural networks
Authors
Saad Bin Ahmed
Saeeda Naz
Muhammad Imran Razzak
Shiekh Faisal Rashid
Muhammad Zeeshan Afzal
Thomas M. Breuel
Publication date
1 April 2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 3/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-015-1881-4
