2019 | OriginalPaper | Book chapter

Feature Pooling in Scene Character Recognition: A Comprehensive Study

Authors: Zhong Zhang, Hong Wang, Shuang Liu, Yunxue Shao

Published in: Communications, Signal Processing, and Systems

Publisher: Springer Singapore

Abstract

This paper studies feature pooling methods for scene character recognition. We examine three families of pooling methods: average (sum) pooling, max pooling, and weight-based pooling. For each family, we introduce representative methods, analyze their merits and drawbacks, and discuss open problems. Finally, we compare the methods on the ICDAR2003 and Chars74k datasets.
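
As a rough illustration of the three pooling families named in the abstract, the following minimal sketch pools a small matrix of coding coefficients (one row per local descriptor, one column per codeword) into a single feature vector. It is not the chapter's implementation: the function names, the toy `codes` matrix, and the per-descriptor `weights` are assumptions made for illustration, and the weighted variant shown is only one plain form of weighting.

```python
from typing import List, Sequence

# One row of coding coefficients per local descriptor (hypothetical toy format).
Codes = Sequence[Sequence[float]]


def sum_pool(codes: Codes) -> List[float]:
    """Sum the responses of each codeword over all local descriptors."""
    return [sum(col) for col in zip(*codes)]


def average_pool(codes: Codes) -> List[float]:
    """Average pooling: sum pooling divided by the number of descriptors."""
    n = len(codes)
    return [s / n for s in sum_pool(codes)]


def max_pool(codes: Codes) -> List[float]:
    """Keep only the strongest response of each codeword."""
    return [max(col) for col in zip(*codes)]


def weighted_pool(codes: Codes, weights: Sequence[float]) -> List[float]:
    """Weighted sum with one weight per descriptor (the weights are assumed)."""
    if len(weights) != len(codes):
        raise ValueError("need exactly one weight per descriptor")
    return [sum(w * v for w, v in zip(weights, col)) for col in zip(*codes)]


# Toy input: 3 local descriptors encoded over a 4-word codebook.
codes = [
    [0.0, 0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
]
weights = [0.5, 0.25, 0.25]  # hypothetical per-descriptor importance

print("sum:     ", sum_pool(codes))
print("average: ", average_pool(codes))
print("max:     ", max_pool(codes))
print("weighted:", weighted_pool(codes, weights))
```

In this sketch, sum and average pooling differ only by a constant factor, max pooling discards how often a codeword fires, and the weighted variant reduces to sum pooling when every weight equals 1.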

Metadata
Title
Feature Pooling in Scene Character Recognition: A Comprehensive Study
Authors
Zhong Zhang
Hong Wang
Shuang Liu
Yunxue Shao
Copyright year
2019
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-6571-2_262