Published in: Pattern Recognition and Image Analysis 3/2020

01-07-2020 | MATHEMATICAL THEORY OF IMAGES AND SIGNALS REPRESENTING, PROCESSING, ANALYSIS, RECOGNITION, AND UNDERSTANDING

Red Green Blue Depth Image Classification Using Pre-Trained Deep Convolutional Neural Network

Authors: N. Kumar, N. Kaur, D. Gupta


Abstract

Image classification is one of the central challenges in computer vision, and it serves as a foundation for other tasks such as image captioning, object detection, and image colorization. Convolutional neural networks (CNNs) can perform image classification on a wide variety of datasets. As technology advances, cameras capture richer information such as depth, so it is worthwhile to incorporate depth information into CNNs to improve image classification. In this paper, a pre-trained GoogLeNet is adapted to the Washington RGB-D (RGB-Depth) dataset. GoogLeNet is also evaluated on depth data, where it achieves a reasonable classification rate on the RGB-D dataset. In addition, the paper analyzes how pre-processing (resizing) of images and batch size affect the classification accuracy of the model.
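
The two factors named at the end of the abstract, image resizing and batch size, can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a 224 x 224 network input (the size GoogLeNet is commonly used with), nearest-neighbour resizing, and a depth map that is scaled to 0..255 and repeated across three channels so that an RGB-pretrained network could accept it. All function names and the toy values are hypothetical.

```python
from math import ceil

INPUT_SIZE = 224  # assumed input resolution; not stated in the abstract


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list of numbers."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def depth_to_three_channels(depth):
    """Scale a depth map to 0..255 and repeat it across three channels.

    Channel replication is an illustrative assumption, not the paper's
    documented encoding.
    """
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a constant map
    scaled = [[round(255 * (v - lo) / span) for v in row] for row in depth]
    # Channel-first layout (3 x H x W), each channel an independent copy.
    return [[row[:] for row in scaled] for _ in range(3)]


def batches(items, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def steps_per_epoch(n_samples, batch_size):
    """Number of mini-batches needed to see every sample once."""
    return ceil(n_samples / batch_size)


# Toy 2 x 2 depth map with made-up values.
depth = [[500, 800], [1200, 1500]]
x = depth_to_three_channels(resize_nearest(depth, INPUT_SIZE, INPUT_SIZE))
```

Here `x` is a 3 x 224 x 224 nested list. Changing `INPUT_SIZE` or the `batch_size` argument is the kind of variation whose effect on accuracy the abstract says the paper studies.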


Metadata
Title
Red Green Blue Depth Image Classification Using Pre-Trained Deep Convolutional Neural Network
Authors
N. Kumar
N. Kaur
D. Gupta
Publication date
01-07-2020
Publisher
Pleiades Publishing
Published in
Pattern Recognition and Image Analysis / Issue 3/2020
Print ISSN: 1054-6618
Electronic ISSN: 1555-6212
DOI
https://doi.org/10.1134/S1054661820030153
