Published in: Pattern Analysis and Applications 4/2011

1 November 2011 | Theoretical Advances

Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents

Authors: Alireza Alaei, P. Nagabhushan, Umapada Pal


Abstract

The most important and difficult task in text document analysis is accurate line segmentation, particularly when the document consists of unconstrained handwritten text. This work proposes a painting scheme for that purpose. Motivated by the observation that handwritten Persian text poses some of the most critical challenges for text-line segmentation, the method was devised through an extensive study of cursive Persian script; the resulting algorithm is nevertheless applicable to handwritten text in any language or script. The text block is decomposed vertically into parallel pipe structures called strips. Each row of each strip is painted with a single gray intensity, namely the average gray value of all pixels in that row of the strip. The painted strips are then converted into a two-tone painting, which is smoothed. The white and black spaces in each strip of the smoothed image are analyzed to obtain a short line of separation, termed a Piece-wise Potential Separating Line (PPSL), between two consecutive black spaces. The PPSLs are concatenated to produce the segmentation of the text lines. Additional procedures handle certain anomalies that may occur. The scheme is validated by extensive experimentation. On 52 pages of Persian text documents containing 823 lines in total, the proposed algorithm achieves 92.35% correct line segmentation. The algorithm was also tested on two further datasets of 152 and 200 handwritten text pages in different languages. Comparison with various approaches from the recent literature demonstrates the efficiency and script independence of the proposed algorithm.
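The painting steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed threshold, and the midpoint rule for placing a separator are assumptions made for illustration, and the smoothing and anomaly-handling stages mentioned in the abstract are omitted.

```python
import numpy as np


def paint_strips(gray, strip_width):
    """Paint each row of each vertical strip with that row-strip's mean gray value."""
    height, width = gray.shape
    painted = np.empty((height, width), dtype=float)
    for x0 in range(0, width, strip_width):
        strip = gray[:, x0:x0 + strip_width]
        # One mean per row, broadcast across the full width of the strip.
        painted[:, x0:x0 + strip_width] = strip.mean(axis=1, keepdims=True)
    return painted


def binarize(painted, threshold):
    """Two-tone painting: True marks dark (text) row-strips.

    A fixed threshold is an assumption of this sketch.
    """
    return painted < threshold


def separators_in_strip(column):
    """Rows midway through each white gap between two consecutive black runs.

    `column` is one pixel column of the two-tone painting, taken from a single
    strip. The midpoint rule is a hypothetical stand-in for the PPSL placement.
    """
    black_rows = np.flatnonzero(column)
    if black_rows.size == 0:
        return []
    # A jump larger than 1 between consecutive black rows marks a white gap.
    gaps = np.flatnonzero(np.diff(black_rows) > 1)
    return [int((black_rows[i] + black_rows[i + 1]) // 2) for i in gaps]
```

For a grayscale page `img`, calling `separators_in_strip(binarize(paint_strips(img, w), t)[:, x])` for one column `x` in each strip yields candidate separator rows per strip; joining them across neighbouring strips corresponds to the concatenation step described above.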

Metadata
Title
Piece-wise painting technique for line segmentation of unconstrained handwritten text: a specific study with Persian text documents
Authors
Alireza Alaei
P. Nagabhushan
Umapada Pal
Publication date
1 November 2011
Publisher
Springer-Verlag
Published in
Pattern Analysis and Applications / Issue 4/2011
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-011-0226-x
