
2016 | Original Paper | Book Chapter

Design of Kernels in Convolutional Neural Networks for Image Classification

Authors: Zhun Sun, Mete Ozay, Takayuki Okatani

Published in: Computer Vision – ECCV 2016

Publisher: Springer International Publishing


Abstract

Despite the effectiveness of convolutional neural networks (CNNs) for image classification, our understanding of how the shape of convolution kernels affects the learned representations is limited. In this work, we explore and exploit the relationship between the shape of the kernels that define receptive fields (RFs) in CNNs and the learning of feature representations for image classification. To this end, we present a feature visualization method that produces pixel-wise classification score maps of the learned features. Motivated by our experimental results, together with observations reported in the literature on the modeling of visual systems, we propose a novel design of kernel shapes for learning representations in CNNs.
In the experiments, the proposed models outperform state-of-the-art methods on the CIFAR-10/100 datasets [1] for image classification. On the ILSVRC-2012 dataset [2], they also achieve outstanding classification performance compared to a base CNN model that requires more parameters and computation time. Additionally, we examine the region of interest (ROI) of different models in the classification task and analyze the robustness of the proposed method to occluded images. Our results indicate the effectiveness of the proposed approach.
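
The abstract does not specify how the proposed kernel shapes are realized in practice. One common way to experiment with non-square receptive fields in a standard deep-learning framework is to fix a binary mask over an ordinary square kernel; the sketch below (PyTorch) is a minimal illustration of that idea under this assumption, with an arbitrary plus-shaped mask, and is not the authors' implementation.

```python
# Hypothetical sketch: a convolution whose receptive field is restricted by a
# fixed binary mask over a standard square kernel. The specific kernel shapes
# proposed in the paper are not given in this abstract; the plus-shaped mask
# below is only an illustrative choice.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShapedConv2d(nn.Module):
    """Convolution with some kernel entries held at zero by a binary mask."""

    def __init__(self, in_channels, out_channels, mask):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_channels, in_channels, *mask.shape) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_channels))
        # The mask is a fixed constant, not a learned parameter.
        self.register_buffer("mask", mask.float())

    def forward(self, x):
        # Zero out the masked kernel entries on every forward pass, so the
        # corresponding weights never contribute and receive no effective update.
        w = self.weight * self.mask
        return F.conv2d(x, w, self.bias, padding=self.mask.shape[-1] // 2)


if __name__ == "__main__":
    # A plus-shaped 3x3 receptive field (corner entries removed), for illustration.
    mask = torch.tensor([[0, 1, 0],
                         [1, 1, 1],
                         [0, 1, 0]])
    conv = ShapedConv2d(in_channels=3, out_channels=16, mask=mask)
    out = conv(torch.randn(1, 3, 32, 32))   # CIFAR-sized input
    print(out.shape)                          # torch.Size([1, 16, 32, 32])
```

Masking in this way keeps the standard convolution kernels of the framework while changing only which entries of the square kernel are active, which is one plausible way to compare different RF shapes at equal implementation cost.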


Footnotes
2. In addition to the BASE and the QH-BASE models, we also employ a “third-party” model, namely VGG [4], to generate the occluded images.
References
1. Krizhevsky, A.: Learning multiple layers of features from tiny images (2009)
2. Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. (IJCV) 115, 1–42 (2015)
3. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Pereira, F., Burges, C., Bottou, L., Weinberger, K. (eds.) Advances in Neural Information Processing Systems, vol. 25, pp. 1097–1105. Curran Associates, Inc., Red Hook (2012)
4. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: Proceedings of ICLR (2015)
5. Hubel, D.H., Wiesel, T.N.: Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160(1), 106–154 (1962)
6. Mutch, J., Lowe, D.: Object class recognition and localization using sparse features with limited receptive fields. Int. J. Comput. Vis. 80(1), 45–57 (2008)
7. Simoncelli, E.P., Olshausen, B.A.: Natural image statistics and neural representation. Ann. Rev. Neurosci. 24, 1193–1216 (2001)
8. Liu, Y.S., Stevens, C.F., Sharpee, T.O.: Predictable irregularities in retinal receptive fields. PNAS 106(38), 16499–16504 (2009)
9. Kerr, D., Coleman, S., McGinnity, T., Wu, Q., Clogenson, M.: A novel approach to robot vision using a hexagonal grid and spiking neural networks. In: The International Joint Conference on Neural Networks (IJCNN), pp. 1–7, June 2012
10. Mersereau, R.: The processing of hexagonally sampled two-dimensional signals. Proc. IEEE 67(6), 930–949 (1979)
11. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
12. Han, S., Pool, J., Tran, J., Dally, W.J.: Learning both weights and connections for efficient neural networks. CoRR abs/1506.02626 (2015)
13. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 3rd edn. Prentice-Hall Inc., Upper Saddle River (2006)
14. Klette, R., Rosenfeld, A.: Digital Geometry: Geometric Methods for Digital Picture Analysis. Morgan Kaufmann, San Francisco (2004)
15. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, New York (2004)
16. Simonyan, K., Vedaldi, A., Zisserman, A.: Deep inside convolutional networks: visualising image classification models and saliency maps. In: Proceedings of the International Conference on Learning Representations (ICLR) (2014)
17. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint (2014). arXiv:1408.5093
18. Springenberg, J., Dosovitskiy, A., Brox, T., Riedmiller, M.: Striving for simplicity: the all convolutional net. In: ICLR (workshop track) (2015)
19. Lin, M., Chen, Q., Yan, S.: Network in network. In: Proceedings of ICLR (2014)
20. Lee, C.Y., Xie, S., Gallagher, P., Zhang, Z., Tu, Z.: Deeply-supervised nets. ArXiv e-prints, September 2014
21. Liang, M., Hu, X.: Recurrent convolutional neural network for object recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015
22. Rippel, O., Snoek, J., Adams, R.P.: Spectral representations for convolutional neural networks. ArXiv e-prints, June 2015
23. Graham, B.: Fractional max-pooling. CoRR abs/1412.6071 (2014)
24. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
Metadata
Title
Design of Kernels in Convolutional Neural Networks for Image Classification
Authors
Zhun Sun
Mete Ozay
Takayuki Okatani
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-46478-7_4