2020 | OriginalPaper | Book chapter

Regularized Pooling

Written by: Takato Otsuzuki, Hideaki Hayashi, Yuchen Zheng, Seiichi Uchida

Published in: Artificial Neural Networks and Machine Learning – ICANN 2020

Publisher: Springer International Publishing

Abstract

In convolutional neural networks (CNNs), pooling operations play important roles such as dimensionality reduction and deformation compensation. In general, max pooling, which is the most widely used operation for local pooling, is performed independently for each kernel. However, the deformation may be spatially smooth over the neighboring kernels. This means that max pooling is too flexible to compensate for actual deformations. In other words, its excessive flexibility risks canceling the essential spatial differences between classes. In this paper, we propose regularized pooling, which enables the value selection direction in the pooling operation to be spatially smooth across adjacent kernels so as to compensate only for actual deformations. The results of experiments on handwritten character images and texture images showed that regularized pooling not only improves recognition accuracy but also accelerates the convergence of learning compared with conventional pooling operations.
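
The idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, only one possible reading of the abstract: for each kernel, record where the maximum sits, average those positions over neighboring kernels, round the averaged position, and pick the value found there. The function name `regularized_pool`, the smoothing radius `r`, the use of top-left offsets, and the lack of padding are all assumptions made for illustration.

```python
import numpy as np


def round_half_away(x):
    """Round to the nearest integer, with .5 rounded away from zero."""
    return np.sign(x) * np.floor(np.abs(x) + 0.5)


def regularized_pool(fmap, k=2, s=2, r=1):
    """Hypothetical sketch of pooling with spatially smoothed selection.

    fmap : 2-D feature map (single channel)
    k, s : kernel size and stride (no padding is applied in this sketch)
    r    : assumed smoothing radius, measured in kernels; r=0 disables smoothing
    """
    H, W = fmap.shape
    I = (H - k) // s + 1
    J = (W - k) // s + 1

    # Step 1: per-kernel argmax position, as an offset from the window's top-left.
    dy = np.zeros((I, J))
    dx = np.zeros((I, J))
    for i in range(I):
        for j in range(J):
            win = fmap[i * s:i * s + k, j * s:j * s + k]
            dy[i, j], dx[i, j] = np.unravel_index(np.argmax(win), win.shape)

    # Step 2: average the offsets over adjacent kernels, round, and select.
    out = np.empty((I, J), dtype=fmap.dtype)
    for i in range(I):
        for j in range(J):
            i0, i1 = max(0, i - r), min(I, i + r + 1)
            j0, j1 = max(0, j - r), min(J, j + r + 1)
            a = int(round_half_away(dy[i0:i1, j0:j1].mean()))
            b = int(round_half_away(dx[i0:i1, j0:j1].mean()))
            out[i, j] = fmap[i * s + a, j * s + b]
    return out
```

With `r=0` each kernel keeps its own argmax, so the sketch reduces to ordinary max pooling. With `r>0` a kernel whose argmax disagrees with its neighbors is pulled toward their selection position, which is the "spatially smooth" behavior the abstract describes.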

Footnotes
1
To be specific, given a convolutional feature map of size \(H \times W\) as input, \(I = \lfloor (H - 1)/s \rfloor + 1\) and \(J = \lfloor (W - 1)/s \rfloor + 1\) if we add a proper size of padding to the input.
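
As a quick check of the formula in this footnote, the output size can be computed with integer division. The helper name `pooled_size` is hypothetical, and the sketch assumes, as the footnote does, that suitable padding has been added to the input.

```python
def pooled_size(H: int, W: int, s: int) -> tuple[int, int]:
    """Return (I, J) = (floor((H - 1) / s) + 1, floor((W - 1) / s) + 1)."""
    I = (H - 1) // s + 1
    J = (W - 1) // s + 1
    return I, J
```

For positive `H` and `s` this equals the ceiling of `H / s`, so a 28 x 28 map with stride 2 gives 14 x 14, and a 7 x 7 map with stride 2 gives 4 x 4.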
 
2
If the fraction part is exactly 0.5, it is rounded away from zero.
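
The rounding rule in this footnote differs from Python's built-in `round`, which rounds exact halves to the nearest even integer. A minimal sketch of round-half-away-from-zero follows; the function name is an assumption, not taken from the paper.

```python
import math


def round_half_away_from_zero(x: float) -> int:
    """Round x to the nearest integer; a fraction of exactly 0.5 moves away from zero."""
    # Work on the magnitude, then restore the sign of the input.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))
```

For example, 2.5 becomes 3 and -2.5 becomes -3 under this rule, whereas the built-in `round(2.5)` returns 2.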
 
Metadata
Title
Regularized Pooling
Authors
Takato Otsuzuki
Hideaki Hayashi
Yuchen Zheng
Seiichi Uchida
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-61616-8_20