Published in: Neural Computing and Applications 3/2020

03-07-2019 | Review Article

Interpretation of intelligence in CNN-pooling processes: a methodological survey

Authors: Nadeem Akhtar, U. Ragavendran

Abstract

A convolutional neural network architecture comprises several components, including convolution and pooling layers. Pooling is a crucial component placed after the convolution layer. It plays a vital role in visual recognition, detection and segmentation tasks, where it helps address concerns such as overfitting, computation time and recognition accuracy. The elementary pooling process downsamples the feature map by partitioning it into subregions. The partitioning and downsampling are defined by the pooling hyperparameters, namely stride and filter size. Downsampling discards irrelevant information and selects the defined global feature. The most commonly used global feature selection methods are average and max pooling. These methods degrade when the main element has a higher or lower intensity than the nonsignificant elements. They also suffer from issues related to the location and order of the selected global feature, and are therefore not suitable for every situation. Numerous researchers have proposed pooling variants to overcome these concerns. This article presents the state of the art in global feature selection for the pooling process, organized into four categories: value, probability, rank and transformed domain. The value-based and probability-based methods use criteria such as the manner of downsampling, the kernel size, the input and output feature maps, the location of pooling, the number of stages and random selection based on probability values. The rank-based methods assign ranks and weights to the activations, and the feature is selected according to the defined criteria. The transformed-domain pooling methods transform the image into another domain, such as the wavelet or frequency domain, before pooling the features.
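The two baseline methods named in the abstract, max and average pooling, can be illustrated with a minimal NumPy sketch. This is not code from the article: the function name `pool2d`, its parameters `size` and `stride`, and the sample feature map are assumptions chosen for illustration, and the sketch handles only a single 2-D feature map without padding.

```python
import numpy as np


def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2-D feature map, reducing each size x size window to one value.

    `size` is the filter size and `stride` the step between windows
    (the two pooling hyperparameters). No padding is applied.
    """
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    reduce = np.max if mode == "max" else np.mean

    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=float)

    for i in range(out_h):
        for j in range(out_w):
            # Extract the subregion covered by the filter at this position.
            r, c = i * stride, j * stride
            window = feature_map[r:r + size, c:c + size]
            out[i, j] = reduce(window)
    return out


# Hypothetical 4x4 feature map; the 9 is a single strong activation.
fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 1],
                 [0, 2, 9, 0],
                 [1, 1, 0, 0]], dtype=float)

max_out = pool2d(fmap, size=2, stride=2, mode="max")
avg_out = pool2d(fmap, size=2, stride=2, mode="avg")
print(max_out)  # [[6. 2.] [2. 9.]]
print(avg_out)  # [[3.5 1.] [1. 2.25]]
```

In the bottom-right window, max pooling keeps the strong activation (9) while average pooling dilutes it to 2.25 with the surrounding zeros. This is one simple instance of the intensity-related weakness the abstract attributes to these two methods.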

Metadata
Title
Interpretation of intelligence in CNN-pooling processes: a methodological survey
Authors
Nadeem Akhtar
U. Ragavendran
Publication date
03-07-2019
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 3/2020
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-019-04296-5
