
2022 | OriginalPaper | Chapter

Transformer Based Feature Pyramid Network for Transparent Objects Grasp

Authors: Jiawei Zhang, Houde Liu, Chongkun Xia

Published in: Intelligent Robotics and Applications

Publisher: Springer International Publishing


Abstract

Transparent objects such as glass bottles and plastic cups are common in daily life, yet few methods grasp them reliably because of their unique optical properties. Compounding the difficulty, there is no existing dataset for transparent-object grasping. To address this problem, we propose an efficient dataset-construction pipeline that labels grasp poses for transparent objects. Using the Blender physics engine, the pipeline can generate numerous photo-realistic images with labeled grasp poses in a short time. We also propose TTG-Net, a transformer-based feature pyramid network for generating planar grasp poses: a feature pyramid network with residual modules extracts features, and a transformer encoder refines them to capture better global information. TTG-Net is trained entirely on the virtual dataset generated by our pipeline and achieves 80.4% validation accuracy on that dataset. To demonstrate its effectiveness on real-world data, we also test TTG-Net on photos randomly captured in our lab, where it reaches 73.4% accuracy, showing remarkable sim2real generalization. We further evaluate other mainstream methods on our dataset; TTG-Net shows better generalization ability.
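The abstract refers to planar grasp poses. In the grasp-detection literature, a planar grasp is commonly parameterized as an image-plane center (x, y), a rotation angle θ, and the gripper opening width (plus a jaw height), which together define an oriented grasp rectangle. The sketch below illustrates this standard rectangle representation; it is an assumption for illustration, as the chapter's exact parameterization is not given on this page, and the function name `grasp_rectangle` is hypothetical:

```python
import math

def grasp_rectangle(x, y, theta, width, height):
    """Return the 4 corners of a planar grasp rectangle.

    (x, y): grasp center in image coordinates; theta: in-plane rotation
    in radians; width: gripper opening; height: jaw size.
    """
    c, s = math.cos(theta), math.sin(theta)
    # local corners of an axis-aligned rectangle centered at the origin
    half = [(-width / 2, -height / 2), (width / 2, -height / 2),
            (width / 2, height / 2), (-width / 2, height / 2)]
    # rotate each local corner by theta, then translate to the center
    return [(x + c * dx - s * dy, y + s * dx + c * dy) for dx, dy in half]

# an unrotated 40x20 grasp centered at (100, 50)
corners = grasp_rectangle(100.0, 50.0, 0.0, 40.0, 20.0)
```

A network such as TTG-Net regresses these parameters (often as per-pixel quality, angle, and width maps) from an RGB image.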


Metadata
Title
Transformer Based Feature Pyramid Network for Transparent Objects Grasp
Authors
Jiawei Zhang
Houde Liu
Chongkun Xia
Copyright Year
2022
DOI
https://doi.org/10.1007/978-3-031-13822-5_37