Published in: International Journal on Document Analysis and Recognition (IJDAR) 1-2/2018

30.04.2018 | Original Paper

Recognition-based character segmentation for multi-level writing style

Authors: Papangkorn Inkeaw, Jakramate Bootkrajang, Phasit Charoenkwan, Sanparith Marukatat, Shinn-Ying Ho, Jeerayut Chaijaruwanich


Abstract

Character segmentation is an important task in optical character recognition (OCR), and the quality of any OCR system depends heavily on its character segmentation algorithm. Although various character segmentation methods have been proposed to date, existing methods cannot satisfactorily segment characters belonging to some complex writing styles, such as the Lanna Dhamma characters. In this paper, a new character segmentation method named graph partitioning-based character segmentation is proposed to address the problem. The proposed method can deal with multi-level writing styles as well as touching and broken characters, and it can be regarded as a generalization of existing approaches to multi-level writing styles. The proposed method consists of three phases. In the first phase, a newly devised over-segmentation technique based on the morphological skeleton is used to obtain redundant fragments of a word image. In the second phase, the fragments are used to form a segmentation hypotheses graph. In the last phase, the hypotheses graph is partitioned into subgraphs, each corresponding to a segmented character, using a partitioning algorithm developed specifically for character segmentation. Experimental results on handwritten Lanna Dhamma character datasets showed that the proposed method achieved a high correct segmentation rate and outperformed existing methods for the Lanna Dhamma alphabet.
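The last two phases can be illustrated with a small Python sketch. The abstract does not specify how fragments are connected or how the graph is partitioned, so everything here is an assumption made for illustration: fragments are given as hypothetical bounding boxes (the skeleton-based over-segmentation is skipped), edges link boxes that lie within a small pixel gap, the recognizer confidences in `toy_scores` are invented, and the partitioning is a brute-force search over connected groupings rather than the authors' algorithm.

```python
from itertools import combinations

def build_hypotheses_graph(boxes, gap=1):
    """Link fragments whose (x0, y0, x1, y1) boxes lie within `gap` pixels."""
    adj = {i: set() for i in range(len(boxes))}
    for i, j in combinations(range(len(boxes)), 2):
        a, b = boxes[i], boxes[j]
        dx = max(a[0] - b[2], b[0] - a[2], 0)  # horizontal gap, 0 if overlapping
        dy = max(a[1] - b[3], b[1] - a[3], 0)  # vertical gap, 0 if overlapping
        if max(dx, dy) <= gap:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def is_connected(block, adj):
    """True if the fragments in `block` form a connected subgraph."""
    nodes = set(block)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u] & nodes:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def partitions(items):
    """Yield every set partition of `items` as a list of tuples."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            remaining = [x for x in rest if x not in others]
            for tail in partitions(remaining):
                yield [(first,) + others] + tail

def best_partition(adj, score):
    """Exhaustively pick the connected partition with the highest total score."""
    best, best_val = None, float("-inf")
    for part in partitions(sorted(adj)):
        if all(is_connected(block, adj) for block in part):
            val = sum(score(block) for block in part)
            if val > best_val:
                best, best_val = part, val
    return best, best_val

# Toy word: fragments 0 and 1 are halves of a broken character;
# fragment 3 sits on a lower writing level directly beneath fragment 2.
boxes = [(0, 0, 4, 10), (5, 0, 9, 10), (20, 0, 28, 10), (20, 11, 28, 14)]

# Invented recognizer confidences for each candidate group of fragments.
toy_scores = {
    frozenset({0}): 0.3, frozenset({1}): 0.2, frozenset({0, 1}): 0.9,
    frozenset({2}): 0.8, frozenset({3}): 0.7, frozenset({2, 3}): 0.6,
}

def score(block):
    return toy_scores.get(frozenset(block), 0.0)

graph = build_hypotheses_graph(boxes)
segmentation, total = best_partition(graph, score)
print(segmentation)
```

In this toy case the search merges the two halves of the broken character and keeps the upper and lower marks as separate characters. Exhaustive enumeration grows very quickly with the number of fragments, so it is only practical for a handful of them; summing the scores is also a simplification that tends to favor more, smaller groups.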

Metadata
Title
Recognition-based character segmentation for multi-level writing style
Authors
Papangkorn Inkeaw
Jakramate Bootkrajang
Phasit Charoenkwan
Sanparith Marukatat
Shinn-Ying Ho
Jeerayut Chaijaruwanich
Publication date
30.04.2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 1-2/2018
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-018-0302-5
