International Journal on Document Analysis and Recognition (IJDAR)

1 June 2015 | Special Issue Paper

A new ring radius transform-based thinning method for multi-oriented video characters

Authors: Yirui Wu, Palaiahnakote Shivakumara, Wang Wei, Tong Lu, Umapada Pal

Published in: International Journal on Document Analysis and Recognition (IJDAR) | Issue 2/2015

Abstract

Thinning that preserves the visual topology of characters in video is a challenging problem in document analysis and video text analysis because of low resolution and complex backgrounds. This paper explores the ring radius transform (RRT) to generate a radius map from the Canny edges of each input image and thereby obtain its medial axis. Each radius value in the map is the distance from a pixel to the nearest edge pixel on the contours. From the radius map, the method introduces a novel way of identifying the medial axis (middle pixels between two strokes) for characters of arbitrary orientation. Iterative-maximal-growing is then proposed to connect missing medial axis pixels at junctions and intersections. Next, a histogram of the color information of the medial axes is computed and clustered to eliminate false medial axis segments. Finally, the method restores the shape of the character from the radius values of the medial axis pixels so that it can be recognized with Google's open-source OCR engine (Tesseract). The method has been tested on video, natural scene and handwritten characters from ICDAR 2013 and SVT, arbitrary-oriented data from MSRA-TD500, multi-script character data and MPEG7 object data to evaluate its performance at both the thinning level and the recognition level. Experimental comparisons with state-of-the-art methods show that the proposed method is generic and outperforms existing methods in obtaining skeletons, preserving visual topology and recognition rate. The method is also robust to characters of arbitrary orientation.
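
As a rough illustration of the radius map idea described in the abstract, the following minimal Python sketch computes, for every pixel, the distance to the nearest edge pixel, picks medial axis candidates, and restores the shape from the radius values. It is not the authors' implementation. The brute-force distance search, the ridge test in `medial_axis_candidates`, the disc-based `restore_shape`, and the `toy_edges` input are all assumptions made for illustration. The abstract does not give enough detail to sketch iterative-maximal-growing or the color-histogram clustering, so both are omitted, and a real pipeline would start from Canny edges of an actual image rather than a hand-made edge map.

```python
import math

# Four scan directions: horizontal, vertical and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def radius_map(edges):
    """Distance from every pixel to its nearest edge pixel (brute force)."""
    h, w = len(edges), len(edges[0])
    edge_pts = [(r, c) for r in range(h) for c in range(w) if edges[r][c]]
    if not edge_pts:
        raise ValueError("edge map contains no edge pixels")
    return [[min(math.hypot(r - er, c - ec) for er, ec in edge_pts)
             for c in range(w)]
            for r in range(h)]


def medial_axis_candidates(radius):
    """Assumed ridge test (not taken from the paper): keep a pixel if its
    radius is a local maximum between its two neighbours along at least
    one scan direction."""
    h, w = len(radius), len(radius[0])
    axis = set()
    for r in range(h):
        for c in range(w):
            here = radius[r][c]
            if here == 0:
                continue  # edge pixels are never on the medial axis
            for dr, dc in DIRECTIONS:
                r0, c0, r1, c1 = r - dr, c - dc, r + dr, c + dc
                inside = 0 <= r0 < h and 0 <= c0 < w and 0 <= r1 < h and 0 <= c1 < w
                if not inside:
                    continue  # need both neighbours inside the image
                # Strict on one side so a flat two-pixel plateau is not marked twice.
                if here > radius[r0][c0] and here >= radius[r1][c1]:
                    axis.add((r, c))
                    break
    return axis


def restore_shape(axis, radius, h, w):
    """Union of discs: each axis pixel paints a disc of its own radius."""
    return [[any(math.hypot(r - ar, c - ac) <= radius[ar][ac] for ar, ac in axis)
             for c in range(w)]
            for r in range(h)]


def toy_edges(h=7, w=5, top=1, bottom=5):
    """Hypothetical toy input: one horizontal stroke bounded by two edge rows."""
    return [[r in (top, bottom) for _ in range(w)] for r in range(h)]
```

Calling `radius_map(toy_edges())`, then `medial_axis_candidates` on the result, and finally `restore_shape` walks through the three stages on the toy stroke. The brute-force search is quadratic in the number of pixels, so it is only meant for small examples; a distance transform from an image-processing library would be the usual choice on real frames.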

Title: A new ring radius transform-based thinning method for multi-oriented video characters
Authors: Yirui Wu
Palaiahnakote Shivakumara
Wang Wei
Tong Lu
Umapada Pal
Publication date: 1 June 2015
Publisher: Springer Berlin Heidelberg
Published in: International Journal on Document Analysis and Recognition (IJDAR) / Issue 2/2015
Print ISSN: 1433-2833
Electronic ISSN: 1433-2825
DOI: https://doi.org/10.1007/s10032-015-0238-y
