Machine Vision and Applications

01-08-2016 | Original Paper

Portable and fast text detection

Authors: L. Zini, F. Odone

Published in: Machine Vision and Applications | Issue 6/2016

Abstract

In this paper, we describe an efficient pipeline for real-time text detection to be implemented on different architectures, with particular reference to smart phones. The text detection pipeline is based on a rather standard segmentation followed by a classification of each segmented connected component. Segmentation is performed by a linear implementation of MSER, state-of-the-art for text detection, where we control the overall computational cost of the method by computing a set of descriptive features as segmentation goes on. Classification is carried out by a cascade of SVM classifiers, where each layer captures different levels of complexity by means of an appropriate choice of descriptive features and kernel functions. Each detected text element, or character, is finally merged into lines of text and words. Further on, each element can be fed to a multi-class classifier that performs character recognition—this functionality is currently under development. We report experiments aiming at assessing the appropriateness of the text detection procedure, in terms of both performance and speed, when running on both x86 and ARM processors.
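The cascade idea in the abstract can be illustrated with a minimal sketch: each candidate connected component passes through stages ordered from cheap to expensive, and is discarded as soon as one stage rejects it. This is not the authors' implementation. The feature names (`aspect_ratio`, `fill_ratio`, `stroke_width_std`), the weights, and the thresholds are all hypothetical, and a plain linear decision function stands in for the SVM classifiers and kernels used in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# A connected component is represented here as a plain dict of precomputed
# measurements. The keys are hypothetical, not taken from the paper.
Component = dict


@dataclass
class Stage:
    """One cascade layer: a feature extractor, a scoring function, a threshold."""
    name: str
    features: Callable[[Component], Sequence[float]]
    score: Callable[[Sequence[float]], float]
    threshold: float = 0.0


def run_cascade(component: Component, stages: List[Stage]) -> bool:
    """Return True only if every stage accepts the component.

    Stages are assumed to be ordered from cheapest to most expensive, so a
    rejected component never pays for the features of later stages.
    """
    for stage in stages:
        if stage.score(stage.features(component)) < stage.threshold:
            return False  # early rejection: later stages are skipped
    return True


def linear_score(weights: Sequence[float], bias: float):
    """Build a linear decision function w.x + b (a stand-in for an SVM)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * v for w, v in zip(weights, x)) + bias
    return score


# Illustrative stages with made-up weights; a real system would learn them.
stages = [
    Stage("geometry",
          features=lambda c: [c["aspect_ratio"], c["fill_ratio"]],
          score=linear_score([-1.0, 2.0], 0.5)),
    Stage("stroke",
          features=lambda c: [c["stroke_width_std"]],
          score=linear_score([-1.0], 0.6)),
]

candidates = [
    # Roughly letter-shaped component
    {"aspect_ratio": 0.8, "fill_ratio": 0.5, "stroke_width_std": 0.2},
    # Long thin bar, rejected by the first (cheap) stage
    {"aspect_ratio": 9.0, "fill_ratio": 0.9, "stroke_width_std": 0.1},
]
kept = [c for c in candidates if run_cascade(c, stages)]
```

With these made-up values the first candidate passes both stages and the second is rejected at the geometry stage, so the stroke feature is never evaluated for it. That early exit is what keeps the average cost per component low on constrained hardware.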

http://www.dubout.ch/en/coding.html.

Source code of the Text Segmentation, the libERtxt library, is available for download at https://bitbucket.org/slipguru.

Source code of the general-purpose optimized classification library libMsC is available for download at https://bitbucket.org/slipguru.

http://dag.cvc.uab.es/icdar2013competition.

Dataset acquired for the project VIT—Vision for Innovative Transport—VII FP EU—SP4 Capacities Research for SMEs—n. 222199 http://www.vitproject.eu.

http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/.

https://developer.qualcomm.com/.

http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

Eigen 3 http://eigen.tuxfamily.org.

GLASSENSE is a regional project developed within the SI4Life Ligurian Regional Hub (Research and Innovation, Life Sciences) http://www.si4life.com/.

Title: Portable and fast text detection
Authors: L. Zini, F. Odone
Publication date: 01-08-2016
Publisher: Springer Berlin Heidelberg
Published in: Machine Vision and Applications / Issue 6/2016
Print ISSN: 0932-8092
Electronic ISSN: 1432-1769
DOI: https://doi.org/10.1007/s00138-016-0778-2
