Published in: International Journal on Document Analysis and Recognition (IJDAR) 2/2015

1 June 2015 | Special Issue Paper

A new ring radius transform-based thinning method for multi-oriented video characters

Authors: Yirui Wu, Palaiahnakote Shivakumara, Wang Wei, Tong Lu, Umapada Pal

Abstract

Thinning that preserves the visual topology of characters in video is a challenging problem in document analysis and video text analysis because of low resolution and complex backgrounds. This paper explores the ring radius transform (RRT) to generate a radius map from the Canny edges of each input image and, from that map, to obtain the medial axis. Each radius value in the map is the distance from a pixel to the nearest edge pixel on the contours. On the radius map, the method introduces a new way of identifying the medial axis (the middle pixels between two strokes) for characters of arbitrary orientation. Iterative-maximal-growing is then proposed to connect missing medial axis pixels at junctions and intersections. Next, a histogram of the color information of the medial axes is built and clustered to eliminate false medial axis segments. Finally, the method restores the shape of the character from the radius values of the medial axis pixels so that it can be recognized with Google's open-source OCR engine (Tesseract). The method has been tested on video, natural scene and handwritten characters from ICDAR 2013, SVT, arbitrarily oriented data from MSRA-TD500, multi-script character data and MPEG7 object data to evaluate its performance at both the thinning level and the recognition level. Experimental comparisons with state-of-the-art methods show that the proposed method is generic and outperforms existing methods in terms of skeleton extraction, preservation of visual topology and recognition rate. The method is also robust to characters of arbitrary orientation.
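The radius map and the shape restoration described in the abstract can be illustrated with a small sketch. This is not the authors' RRT implementation: it assumes a clean binary stroke mask in place of Canny edges on a video frame, treats the mask boundary as the contour, and picks medial axis candidates with a plain local-maximum test. The function names (`contour_pixels`, `radius_map`, `ridge_candidates`, `restore_shape`) are hypothetical, and the iterative-maximal-growing and color-clustering steps are left out.

```python
import math

def contour_pixels(mask):
    """Foreground pixels with a background (or out-of-image) 4-neighbour.
    Assumption: this stands in for the Canny edges used in the paper."""
    h, w = len(mask), len(mask[0])
    def is_bg(y, x):
        return not (0 <= y < h and 0 <= x < w) or not mask[y][x]
    return [(y, x) for y in range(h) for x in range(w)
            if mask[y][x] and any(is_bg(y + dy, x + dx)
                                  for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)))]

def radius_map(mask):
    """Brute-force distance from each foreground pixel to the nearest
    contour pixel; background pixels get 0.0."""
    edges = contour_pixels(mask)
    return [[min(math.hypot(y - ey, x - ex) for ey, ex in edges) if mask[y][x] else 0.0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def ridge_candidates(radius):
    """Pixels whose radius is positive and not exceeded in the 3x3 window.
    Assumption: a simple stand-in for the paper's medial axis criterion."""
    h, w = len(radius), len(radius[0])
    axis = []
    for y in range(h):
        for x in range(w):
            r = radius[y][x]
            if r <= 0:
                continue
            window = [radius[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            if r >= max(window):
                axis.append((y, x))
    return axis

def restore_shape(axis, radius):
    """Union of discs centred on the axis pixels, each with that pixel's radius."""
    h, w = len(radius), len(radius[0])
    out = [[False] * w for _ in range(h)]
    for ay, ax in axis:
        r = radius[ay][ax]
        for y in range(h):
            for x in range(w):
                if math.hypot(y - ay, x - ax) <= r:
                    out[y][x] = True
    return out

# Toy input: a horizontal bar (rows 1..5, columns 1..9) in a 7 x 11 image.
mask = [[1 <= y <= 5 and 1 <= x <= 9 for x in range(11)] for y in range(7)]
radius = radius_map(mask)
axis = ridge_candidates(radius)
restored = restore_shape(axis, radius)
```

On this toy bar the candidates fall on the centre row, and the restored shape is the bar with rounded corners, because the discs cannot reach into the square corners. The brute-force distance search is quadratic in the number of pixels; a real pipeline would more likely use a library distance transform.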

Metadata
Title
A new ring radius transform-based thinning method for multi-oriented video characters
Authors
Yirui Wu
Palaiahnakote Shivakumara
Wang Wei
Tong Lu
Umapada Pal
Publication date
1 June 2015
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Document Analysis and Recognition (IJDAR) / Issue 2/2015
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI
https://doi.org/10.1007/s10032-015-0238-y
