Published in: Pattern Recognition and Image Analysis 3/2020

01 July 2020 | MATHEMATICAL THEORY OF IMAGES AND SIGNALS REPRESENTING, PROCESSING, ANALYSIS, RECOGNITION, AND UNDERSTANDING

Red Green Blue Depth Image Classification Using Pre-Trained Deep Convolutional Neural Network

Authors: N. Kumar, N. Kaur, D. Gupta



Abstract

Image classification is one of the central challenges in computer vision and serves as a foundation for other tasks such as image captioning, object detection, and image colorization. Convolutional neural networks (CNNs) can classify images across a wide variety of datasets. As camera technology advances, sensors increasingly capture additional information such as depth. Incorporating depth information into CNNs is therefore a natural step toward better image classification. This paper adapts a pre-trained GoogLeNet to the Washington RGB-D (RGB-Depth) dataset. GoogLeNet is also evaluated on the depth data, where it achieves a reasonable classification rate. In addition, the paper analyzes how pre-processing (resizing) of the images and the batch size affect the classification accuracy of the model.
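
The resizing and batching steps mentioned in the abstract might look roughly like the sketch below. It is an illustration only, not the authors' pipeline: the function names, the nearest-neighbour resizing, the min-max scaling of depth to 0..255, and the repetition of depth across three channels are all assumptions. The 224 x 224 target reflects the input size commonly used with GoogLeNet.

```python
# Hypothetical sketch: prepare a single-channel depth map for a network
# that expects 224x224 three-channel input, then split samples into batches.
# None of these helpers come from the paper; they are illustrative only.

TARGET = 224  # side length commonly used for GoogLeNet input


def resize_nearest(image, height, width):
    """Resize a 2-D list of numbers with nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // height][c * src_w // width] for c in range(width)]
        for r in range(height)
    ]


def depth_to_three_channels(depth):
    """Scale depth values to 0..255 and repeat them across three channels."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a constant map
    return [
        [[round(255 * (v - lo) / span)] * 3 for v in row]
        for row in depth
    ]


def batches(items, batch_size):
    """Yield consecutive mini-batches; the last one may be smaller."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    toy_depth = [[0, 500], [1000, 2000]]  # made-up depth readings
    resized = resize_nearest(toy_depth, TARGET, TARGET)
    pixels = depth_to_three_channels(resized)
    print(len(pixels), len(pixels[0]))          # spatial size after resizing
    print([len(b) for b in batches(list(range(10)), 4)])  # batch sizes
```

Changing `TARGET` or the `batch_size` argument is the kind of variation whose effect on accuracy the paper reports studying; the sketch itself makes no claim about which settings work best.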


Metadata
Title
Red Green Blue Depth Image Classification Using Pre-Trained Deep Convolutional Neural Network
Authors
N. Kumar
N. Kaur
D. Gupta
Publication date
01 July 2020
Publisher
Pleiades Publishing
Published in
Pattern Recognition and Image Analysis / Issue 3/2020
Print ISSN: 1054-6618
Electronic ISSN: 1555-6212
DOI
https://doi.org/10.1134/S1054661820030153
