Published in: Machine Vision and Applications 6/2016

01-08-2016 | Original Paper

Portable and fast text detection

Authors: L. Zini, F. Odone

Abstract

In this paper, we describe an efficient pipeline for real-time text detection to be implemented on different architectures, with particular reference to smart phones. The text detection pipeline is based on a rather standard segmentation followed by a classification of each segmented connected component. Segmentation is performed by a linear implementation of MSER, state-of-the-art for text detection, where we control the overall computational cost of the method by computing a set of descriptive features as segmentation goes on. Classification is carried out by a cascade of SVM classifiers, where each layer captures different levels of complexity by means of an appropriate choice of descriptive features and kernel functions. Each detected text element, or character, is finally merged into lines of text and words. Further on, each element can be fed to a multi-class classifier that performs character recognition—this functionality is currently under development. We report experiments aiming at assessing the appropriateness of the text detection procedure, in terms of both performance and speed, when running on both x86 and ARM processors.
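
The cascade idea in the abstract can be illustrated with a minimal sketch: inexpensive layers run first and reject most non-text components, so costlier layers only see the survivors. This is not the authors' implementation. The two features (`aspect`, `fill`), the weights, the "support vectors", and all names below are hypothetical, and the hand-written decision functions merely stand in for trained SVMs with a linear and an RBF kernel.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Component:
    """One segmented connected component with two illustrative features."""
    label: str
    aspect: float  # bounding-box width / height (assumed feature)
    fill: float    # region area / bounding-box area (assumed feature)


@dataclass
class Stage:
    name: str
    score: Callable[[Component], float]  # decision function of this layer
    threshold: float = 0.0               # accept when score >= threshold


def linear_score(c: Component) -> float:
    # Cheap first layer: a linear decision function with hand-picked weights.
    return 1.0 * c.fill - 0.2 * c.aspect + 0.1


def rbf_score(c: Component) -> float:
    # Costlier second layer: an RBF-kernel decision function,
    # sum_i alpha_i * exp(-gamma * ||x - s_i||^2), with made-up support vectors.
    gamma = 1.0
    support = [((0.6, 0.5), 1.0), ((4.0, 0.95), -1.0)]
    return sum(
        alpha * math.exp(-gamma * ((c.aspect - s[0]) ** 2 + (c.fill - s[1]) ** 2))
        for s, alpha in support
    )


def run_cascade(components: Sequence[Component],
                stages: Sequence[Stage]) -> List[Component]:
    # all() short-circuits, so later stages are skipped as soon as
    # one stage rejects the component.
    return [c for c in components
            if all(st.score(c) >= st.threshold for st in stages)]


STAGES = [Stage("linear", linear_score), Stage("rbf", rbf_score)]
COMPONENTS = [
    Component("A", aspect=0.7, fill=0.5),   # passes both layers
    Component("B", aspect=8.0, fill=0.9),   # rejected by the linear layer
    Component("C", aspect=4.0, fill=0.95),  # rejected by the RBF layer
]

if __name__ == "__main__":
    print([c.label for c in run_cascade(COMPONENTS, STAGES)])
```

Ordering the stages from cheapest to most expensive is what keeps the average cost per component low, since most candidates never reach the final layer.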

Footnotes
2
Source code of the Text Segmentation, the libERtxt library, is available for download at https://bitbucket.org/slipguru.
3
Source code of the general-purpose optimized classification library libMsC is available for download at https://bitbucket.org/slipguru.
5
Dataset acquired for the project VIT (Vision for Innovative Transport), VII FP EU, SP4 Capacities Research for SMEs, n. 222199, http://www.vitproject.eu.
10
GLASSENSE is a regional project developed within the SI4Life Ligurian Regional Hub (Research and Innovation, Live Sciences), http://www.si4life.com/.
Metadata
Title
Portable and fast text detection
Authors
L. Zini
F. Odone
Publication date
01-08-2016
Publisher
Springer Berlin Heidelberg
Published in
Machine Vision and Applications / Issue 6/2016
Print ISSN: 0932-8092
Electronic ISSN: 1432-1769
DOI
https://doi.org/10.1007/s00138-016-0778-2
