Machine Intelligence Research

01-10-2014 | Regular paper

Robust Text Detection in Natural Scenes Using Text Geometry and Visual Appearance

Authors: Sheng-Ye Yan, Xin-Xing Xu, Qing-Shan Liu

Published in: Machine Intelligence Research | Issue 5/2014

Abstract

This paper proposes a new two-phase approach to robust text detection that integrates visual appearance with geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform (RSWT) feature is proposed that recovers the stroke width more reliably by additionally considering the crossing of two strokes and the continuity of the letter border. In the second phase, a classification scheme based on visual appearance features rejects false alarms while preserving the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning (DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective on multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach, which outperforms state-of-the-art approaches by up to 4.4% in terms of F-measure.
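The F-measure used to report the result combines precision and recall, which is why a second phase that removes false alarms without losing recall raises the score. The sketch below is illustrative only: it assumes the common weighted harmonic-mean definition with equal weighting, the function name f_measure and the parameter alpha are hypothetical, and the precision and recall values are made up rather than taken from the paper.

```python
def f_measure(precision: float, recall: float, alpha: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    With alpha = 0.5 (an assumption here) this reduces to the usual
    F1 score, 2 * P * R / (P + R).
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


# Hypothetical numbers, not results from the paper.
# Phase 1 (geometric rules): many candidates, so recall is favored.
after_phase_1 = f_measure(precision=0.50, recall=0.60)

# Phase 2 (appearance classifier): false alarms rejected, recall kept.
after_phase_2 = f_measure(precision=0.70, recall=0.60)

print(f"F-measure after phase 1: {after_phase_1:.3f}")
print(f"F-measure after phase 2: {after_phase_2:.3f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, raising precision at a fixed recall always increases the F-measure, which matches the motivation for splitting detection into a recall-oriented phase and a precision-oriented phase.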

Title: Robust Text Detection in Natural Scenes Using Text Geometry and Visual Appearance
Authors: Sheng-Ye Yan, Xin-Xing Xu, Qing-Shan Liu
Publication date: 01-10-2014
Publisher: Springer-Verlag
Published in: Machine Intelligence Research / Issue 5/2014
Print ISSN: 2731-538X
Electronic ISSN: 2731-5398
DOI: https://doi.org/10.1007/s11633-014-0833-2
