2015 | OriginalPaper | Chapter

A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information

Authors: Guangyu Gao, He Zhang, Hongting Chen

Published in: Advances in Multimedia Information Processing -- PCM 2015

Publisher: Springer International Publishing

Abstract

Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches perform video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, has not been handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row based extraction maintains the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
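
The idea of generating several K-means binarizations and letting OCR feedback pick the best one can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the features, the values of k, or the selection rule, so the grayscale input, the choice `ks=(2, 3, 4)`, and the names `kmeans_1d`, `candidate_binaries`, `best_binary`, and `ocr_score` are all hypothetical. In particular, `ocr_score` stands in for a call to a real OCR engine that returns a recognition confidence.

```python
def kmeans_1d(values, k, iters=20):
    """Lloyd's k-means on a flat list of intensities; returns one label per value."""
    ordered = sorted(values)
    # Deterministic init (an assumption): k centers spread over the sorted intensities.
    centers = [ordered[(len(ordered) - 1) * j // max(k - 1, 1)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: nearest center for every pixel.
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        # Update step: move each non-empty cluster's center to its mean.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def candidate_binaries(image, ks=(2, 3, 4)):
    """Yield one binary mask per (k, cluster), treating that cluster as text."""
    width = len(image[0])
    flat = [v for row in image for v in row]
    for k in ks:
        labels = kmeans_1d(flat, k)
        for j in range(k):
            hits = [lab == j for lab in labels]
            # Skip empty clusters and clusters that cover the whole image.
            if any(hits) and not all(hits):
                yield [hits[r:r + width] for r in range(0, len(hits), width)]


def best_binary(image, ocr_score, ks=(2, 3, 4)):
    """Return the candidate mask with the highest score from ocr_score (hypothetical)."""
    best_mask, best = None, float("-inf")
    for mask in candidate_binaries(image, ks):
        score = ocr_score(mask)
        if score > best:
            best_mask, best = mask, score
    return best_mask, best
```

The same `best_binary` call could be applied to a whole text row and to each segmented character image, keeping whichever result the OCR feedback rates higher; how the paper combines the two is not specified in the abstract.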

Metadata
Title: A Robust Video Text Extraction and Recognition Approach Using OCR Feedback Information
Authors: Guangyu Gao, He Zhang, Hongting Chen
Copyright Year: 2015
DOI: https://doi.org/10.1007/978-3-319-24075-6_49