
2021 | OriginalPaper | Chapter

34. Deep Learning for Robot Vision

Authors : Mamilla Keerthikeshwar, S. Anto

Published in: Intelligent Manufacturing and Energy Sustainability

Publisher: Springer Singapore


Abstract

Deep learning is a class of machine learning used for high-level tasks such as image recognition. It has been applied to pattern recognition across a vast range of areas, such as handmade crafts, to extract data through learned procedures, and it has recently gained great significance in robot vision. In this paper, we show how neural networks play a vital role in robot vision. Image segmentation, the initial step, is used to preprocess images and videos. Multilayered artificial neural networks have many further applications, for example in drug detection and military settings. The main objective of this paper is to review how deep learning algorithms and deep networks can be used in various areas of robot vision. Several established, publicly available deep learning algorithms are used to perform this comparative study, which should give a clear insight into building vision systems with deep learning.
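To make the preprocessing step concrete: as a minimal illustrative sketch (not the authors' pipeline), the snippet below segments an image into foreground and background with classical Otsu thresholding in NumPy. In a deep learning vision system this stage would instead be handled by a trained encoder-decoder network, but the input/output contract is the same: an image in, a per-pixel label mask out.

```python
import numpy as np

def otsu_threshold(image: np.ndarray) -> int:
    """Return the intensity threshold (0-255) that maximizes
    the between-class variance of foreground vs. background."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()  # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # background mean
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1   # foreground mean
        var = w0 * w1 * (mu0 - mu1) ** 2                  # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def segment(image: np.ndarray) -> np.ndarray:
    """Binary segmentation mask: 1 = foreground, 0 = background."""
    return (image >= otsu_threshold(image)).astype(np.uint8)
```

A deep segmentation model would replace `segment` with a forward pass through a convolutional network that predicts one label per pixel; the surrounding robot-vision code consuming the mask is unchanged.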


Metadata
Title
Deep Learning for Robot Vision
Authors
Mamilla Keerthikeshwar
S. Anto
Copyright Year
2021
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-33-4443-3_34
