
Improving the Performance of Convolutional Neural Networks for Image Classification

Author: Davar Giveki

Published in: Optical Memory and Neural Networks | Issue 1/2021


Abstract

As a high-performance approach to various image processing tasks, deep convolutional neural networks (CNNs) have achieved impressive results and attracted considerable attention in recent years. However, object classification on small datasets, where only a limited number of training images is available, remains an open problem. In this paper, we investigate a new method for effectively extracting semantic image features. The proposed method, which is based on CNNs, boosts object classification performance on small datasets. To this end, we combine image segmentation with CNNs: our main goal is to increase classification accuracy by first detecting and then extracting the main object of each image. Because training a CNN from scratch on a small dataset does not yield high performance, given the millions of parameters to be learned, we adopt a transfer learning strategy. We first determine the main object of an image and then extract it; the extracted main object is used to tune the weights of the CNN during training. In this study, we employ a CNN pretrained on the ImageNet dataset to obtain a mid-level image representation. Our experiments on the Caltech-101 object dataset show that the proposed method substantially outperforms other state-of-the-art methods.
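
The pipeline the abstract describes has two stages: segment each image to detect and crop its main object, then use those crops to fine-tune a CNN pretrained on ImageNet. The sketch below illustrates only the fine-tuning stage. It is a minimal PyTorch example under stated assumptions (VGG-16 as the backbone, a hypothetical caltech101_crops directory of pre-cropped main-object images, illustrative hyperparameters), not the author's exact implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing expected by the pretrained backbone.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical dataset of main-object crops produced by the paper's
# segmentation step; one subfolder per Caltech-101 category.
train_set = datasets.ImageFolder("caltech101_crops/train", preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Load a CNN pretrained on ImageNet (VGG-16 assumed here) and swap
# the 1000-way ImageNet head for a 101-way Caltech-101 head.
model = models.vgg16(pretrained=True)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 101)

# Fine-tune with a small learning rate so the pretrained weights are
# only gently adapted by the extracted main-object images.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

For the mid-level representation mentioned in the abstract, activations of an intermediate fully connected layer of the fine-tuned network can be read out in place of the final class scores, which is the usual way a pretrained CNN is used as a generic feature extractor.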


Metadata
Title
Improving the Performance of Convolutional Neural Networks for Image Classification
Author
Davar Giveki
Publication date
01-01-2021
Publisher
Pleiades Publishing
Published in
Optical Memory and Neural Networks / Issue 1/2021
Print ISSN: 1060-992X
Electronic ISSN: 1934-7898
DOI
https://doi.org/10.3103/S1060992X21010100
