2017 | OriginalPaper | Book chapter

Handwritten Mixed-Script Recognition System: A Comprehensive Approach

Authors: Pawan Kumar Singh, Supratim Das, Ram Sarkar, Mita Nasipuri

Published in: Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications

Publisher: Springer Singapore

Abstract

Most researchers around the world focus on developing monolingual Optical Character Recognition (OCR) systems. In a multilingual country like India, however, a single document page commonly contains words written in more than one script. Applying OCR to such documents therefore requires a script identification module as a prerequisite. This paper reports a complete script recognition system for handwritten mixed-script documents. The document pages are first segmented into their constituent text lines and words. Script recognition is then performed at the word level using texture-based features. The technique is applied to 100 mixed-script document pages written in Bangla or Devanagari text mixed with English words. The encouraging outcomes may motivate more researchers to work in the multilingual handwriting recognition domain.
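The page-to-line-to-word flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which segmentation algorithm or which texture features the authors use, so everything below is an assumption made for illustration: projection profiles stand in for the segmentation step, and ink density plus pixel-transition rates stand in for the texture descriptor. The function names (`runs`, `segment_words`, `texture_features`) and the `word_gap` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # binary image: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (top, bottom, left, right), end-exclusive


def runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return [start, end) ranges where profile > 0, merging ranges that are
    separated by fewer than min_gap empty positions."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def segment_words(page: Image, word_gap: int = 2) -> List[Box]:
    """Assumed stand-in: split rows into text lines, then columns into words."""
    words: List[Box] = []
    row_profile = [sum(row) for row in page]
    for top, bottom in runs(row_profile):
        line = page[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile, min_gap=word_gap):
            words.append((top, bottom, left, right))
    return words


def texture_features(page: Image, box: Box) -> List[float]:
    """Toy texture descriptor (not the paper's): ink density and the rate of
    horizontal and vertical ink/background transitions inside the word box."""
    top, bottom, left, right = box
    patch = [row[left:right] for row in page[top:bottom]]
    h, w = len(patch), len(patch[0])
    area = h * w
    ink = sum(map(sum, patch)) / area
    h_trans = sum(r[i] != r[i + 1] for r in patch for i in range(w - 1)) / area
    v_trans = sum(
        patch[j][i] != patch[j + 1][i] for j in range(h - 1) for i in range(w)
    ) / area
    return [ink, h_trans, v_trans]


# Tiny synthetic "page": two words on the first line, one on the second.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
]
word_boxes = segment_words(page)
feature_vectors = [texture_features(page, box) for box in word_boxes]
```

In a full system, each feature vector would be passed to a trained classifier that labels the word as Bangla, Devanagari, or English; that step is omitted here because the abstract does not name the classifier.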

Metadata
Title
Handwritten Mixed-Script Recognition System: A Comprehensive Approach
Authors
Pawan Kumar Singh
Supratim Das
Ram Sarkar
Mita Nasipuri
Copyright year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-3153-3_78
