2015 | OriginalPaper | Book chapter

A New Multi-spectral Fusion Method for Degraded Video Text Frame Enhancement

Authors: Yangbing Weng, Palaiahnakote Shivakumara, Tong Lu, Liang Kim Meng, Hon Hock Woon

Published in: Advances in Multimedia Information Processing -- PCM 2015

Publisher: Springer International Publishing

Abstract

Text detection and recognition in degraded video is complex and challenging because of lighting effects and sensor and motion blurring. This paper presents a new method that derives multi-spectral images from each input video frame by studying non-linear intensity values in the Gray, R, G and B color spaces to increase the contrast of text pixels, which yields four multi-spectral images, one per color space. We then propose multiple fusion criteria for the four multi-spectral images to enhance text information in degraded video frames. A median operation combines the results of the multiple fusion criteria into a single image, which we name fusion-1. We further apply k-means clustering to the fused images obtained by the multiple fusion criteria to classify text clusters, which results in binary images. The same median operation then fuses these binary images into a single image, which we name fusion-2. We evaluate the enhanced images at fusion-1 and fusion-2 using quality measures such as Mean Square Error, Peak Signal to Noise Ratio and Structural Symmetry. Furthermore, the enhanced images are validated through text detection and recognition accuracies in video frames to show the effectiveness of the enhancement.
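
The abstract does not spell out the non-linear intensity mapping or the individual fusion criteria, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a power-law contrast stretch (the hypothetical `stretch` function with an illustrative `gamma`), fuses the Gray, R, G and B planes directly with a pixel-wise median in the spirit of fusion-1, and computes Mean Square Error and Peak Signal to Noise Ratio from their standard definitions. All function names are illustrative, and the k-means step that leads to fusion-2 is omitted.

```python
import math
from statistics import median


def to_gray(pixel):
    """Luma-weighted gray value of an (R, G, B) pixel (BT.601 weights)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def channel_planes(frame):
    """Split a frame (rows of (R, G, B) tuples) into Gray, R, G and B planes."""
    gray = [[to_gray(p) for p in row] for row in frame]
    red = [[p[0] for p in row] for row in frame]
    green = [[p[1] for p in row] for row in frame]
    blue = [[p[2] for p in row] for row in frame]
    return [gray, red, green, blue]


def stretch(plane, gamma=2.0):
    """Assumed non-linear mapping: power law on intensities in [0, 255]."""
    return [[255.0 * (v / 255.0) ** gamma for v in row] for row in plane]


def median_fuse(planes):
    """Pixel-wise median across equally sized planes."""
    rows, cols = len(planes[0]), len(planes[0][0])
    return [
        [median(p[i][j] for p in planes) for j in range(cols)]
        for i in range(rows)
    ]


def mse(a, b):
    """Mean Square Error between two equally sized planes."""
    n = len(a) * len(a[0])
    return sum(
        (x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)
    ) / n


def psnr(a, b, peak=255.0):
    """Peak Signal to Noise Ratio in dB; infinite for identical planes."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    # Tiny 2x2 frame with made-up pixel values, for illustration only.
    frame = [
        [(200, 180, 160), (20, 30, 40)],
        [(25, 25, 25), (210, 220, 230)],
    ]
    planes = channel_planes(frame)
    fused = median_fuse([stretch(p) for p in planes])
    print("MSE vs. gray plane:", mse(planes[0], fused))
    print("PSNR vs. gray plane:", psnr(planes[0], fused))
```

With four input planes the median is the mean of the two middle values, so a single outlying channel at a pixel does not dominate the fused result. That robustness is one plausible reason to prefer a median over a plain average here.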

Metadata
Title
A New Multi-spectral Fusion Method for Degraded Video Text Frame Enhancement
Authors
Yangbing Weng
Palaiahnakote Shivakumara
Tong Lu
Liang Kim Meng
Hon Hock Woon
Copyright year
2015
DOI
https://doi.org/10.1007/978-3-319-24075-6_48