2015 | OriginalPaper | Chapter

A Machine Learning Approach to Hypothesis Decoding in Scene Text Recognition

Authors: Jindřich Libovický, Lukáš Neumann, Pavel Pecina, Jiří Matas

Published in: Computer Vision - ACCV 2014 Workshops

Publisher: Springer International Publishing

Abstract

Scene Text Recognition (STR) is the task of localizing and transcribing textual information captured in real-world images. As its accuracy increases, STR becomes a new source of textual data for standard Natural Language Processing tasks, and it poses new problems because of the specific nature of scene text. In this paper, we learn a string hypothesis decoding procedure in an STR pipeline using structured prediction methods that have proved useful in Automatic Speech Recognition and Machine Translation. The model allows a wide range of typographical and language features to be employed in the decoding process. The proposed method is evaluated on a standard dataset and improves both character and word recognition performance over the baseline.

We used the current version of TextSpotter available at http://www.textspotter.org.

Title: A Machine Learning Approach to Hypothesis Decoding in Scene Text Recognition
Authors: Jindřich Libovický, Lukáš Neumann, Pavel Pecina, Jiří Matas
Publisher: Springer International Publishing
Book: Computer Vision - ACCV 2014 Workshops
Print ISBN: 978-3-319-16630-8
Electronic ISBN: 978-3-319-16631-5
Copyright Year: 2015
DOI: https://doi.org/10.1007/978-3-319-16631-5_13
