2015 | OriginalPaper | Chapter

A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information

Authors : Guangyu Gao, He Zhang, Hongting Chen

Published in: Advances in Multimedia Information Processing -- PCM 2015

Publisher: Springer International Publishing

Abstract

Video text is important semantic information that provides precise and meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is investigated by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row based extraction maintains the color consistency of characters and backgrounds, while a single-character image has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
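The idea of generating several K-means binarizations and letting OCR feedback pick one can be illustrated with a minimal sketch. This is not the authors' implementation: clustering on grayscale intensities, the choice of k = 2 and 3, and the names `kmeans_1d`, `candidate_binaries`, `pick_best`, and the `ocr_score` callback (a hypothetical stand-in for whatever confidence an OCR engine reports) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Image = List[List[int]]  # grayscale rows, values 0..255 (assumed input format)


def nearest_index(value: float, centers: Sequence[float]) -> int:
    """Index of the center closest to value (first one wins on ties)."""
    return min(range(len(centers)), key=lambda c: abs(value - centers[c]))


def kmeans_1d(values: Sequence[float], k: int, iters: int = 20) -> List[float]:
    """Plain Lloyd's K-means on scalar intensities; returns sorted centers."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centers evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            buckets[nearest_index(v, centers)].append(v)
        # An empty cluster keeps its previous center.
        updated = [sum(b) / len(b) if b else centers[i]
                   for i, b in enumerate(buckets)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def candidate_binaries(img: Image, ks: Sequence[int] = (2, 3)) -> List[Image]:
    """One binary image per (k, cluster) pair, treating that cluster as text."""
    pixels = [p for row in img for p in row]
    candidates = []
    for k in ks:
        centers = kmeans_1d(pixels, k)
        for text_cluster in range(k):
            candidates.append(
                [[1 if nearest_index(p, centers) == text_cluster else 0
                  for p in row] for row in img])
    return candidates


def pick_best(candidates: List[Image],
              ocr_score: Callable[[Image], float]) -> Image:
    """Keep the candidate the (hypothetical) OCR scorer rates highest."""
    return max(candidates, key=ocr_score)
```

In practice `ocr_score` would wrap a call to an OCR engine and return its recognition confidence for the candidate image; the sketch leaves that part to the caller so that no particular OCR API is assumed.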

Title: A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information
Authors: Guangyu Gao, He Zhang, Hongting Chen
Publisher: Springer International Publishing
Book: Advances in Multimedia Information Processing -- PCM 2015
Print ISBN: 978-3-319-24074-9
Electronic ISBN: 978-3-319-24075-6
Copyright Year: 2015
DOI: https://doi.org/10.1007/978-3-319-24075-6_49
